Today, like the past few days, we have had some downtime. Apparently some script kids are enjoying themselves by targeting our server (and others). Sorry for the inconvenience.
Most of these ‘attacks’ are targeted at the database, but some are more ddos-like and can be mitigated by using a CDN. Some other Lemmy servers are using Cloudflare, so we know that works. Therefore we have chosen Cloudflare as CDN / DDOS protection platform for now. We will look into other options, but we needed something to be implemented asap.
For the other attacks, we are using them to investigate and implement measures like rate limiting etc.
Yes it is. Suddenly your database exists in more than one location, which is extremely difficult to do with reasonable performance.
Going from 3 to 100 is trivial. Going from one to any number greater than one is the challenge.
Define “horrible”?
When Lemmy, or any server side software is running on a single server, you generally upgrade the hardware before moving to multiple servers (because upgrading is cheaper). When that stops working, and you need to move to another server, it’s possible everything in the database that matters (possibly the entire database) will be in L4 cache in the CPU - not even in RAM a lot of it will be in the CPU.
When you move to multiple servers, suddenly a lot of frequent database operations are on another server, which you can only reach over a network connection. Even the fastest network connection is dog slow compared to L4 cache and it doesn’t really matter how well written your code is, if you haven’t done extensive testing in production with real world users (and actively malicious bots) placing your systems under high load, you will have to make substantial changes to deal with a database that is suddenly hundreds of millions of times slower.
The database might still be able to handle the same number of queries per second, but each individual query will take a lot longer, which will have unpredictable results.
The other problem is you need to make sure all of your servers have the same content. Being part of the Fediverse though, Lemmy probably already has a pretty good architecture for that.
Friend…you have zero idea what you’re talking about. Database existing in multiple locations? What in the hell are you even talking about? Single db instance, multiple app servers, and single LB. You are absolutely not experienced with this type of work, and need to just stop because you’re making an ass out of yourself with these wild ideas that have no basis in practical deployments. Stop embarrassing yourself.
What if your application has to know a state? Say for certain write requests, only one instance is allowed to process those as it needs a cache that it can somewhat consistently rely on?
(Granted, I wouldn’t know why something like Lemmy needs that. But we had that problem at work, and it was a pain to solve while also supporting multiple app instances.)
In that case, I’d use a message queue. Rabbitmq, or I use Pulsar at work - multiple subscribers (using the same subscription name) to one queue of messages that need to be processed. One worker picks it up, processes it, and marks the message as processed. The worker either passes it into a different queue for further processing, or persists it to the DB.
The nice thing with this is when using the Pulsar paradigm, you can have multiple subscriptions to the same message queue, each one carrying its own state as to which messages are processed or not. So say I get one message from an external system, have one system that is processing it right now, and need to add a second system. In that case I just use a different subscription name for the second system, and it works independently of the first with no issues.
Distributed lock of any form would work. Memcache, redis, etcd, read access mechanism in an MQ…etc. Only one process would work on whatever it as a time. Simple.