Suboptimal ways to respond to a public security incident

andrew@lemmy.stuart.fun · edit-2 2 years ago

TragicNotCute@lemmy.world · 2 years ago

IMO it’s not a good idea to be discussing attack vectors publicly when a number of other instances are unpatched and the exploit has been in the wild for less than a day.

I agree that admins need to work together, but discussing it in public on Lemmy so soon after the attack isn’t the way. There exists a Matrix channel for admins, that’s where this type of thing should go.

entropicshart@lemmy.world · edit-2 2 years ago

When a vulnerability at this level happens and a patch is created, visibility is exactly what you need.

It is the reason CVE sites exist and why so many organizations have their own (e.g. Atlassian, Salesforce/Tableau).

It is also why those CVEs will be on the front page of sites like https://news.ycombinator.com to ensure folks are aware and taking precautions.

Organizations that do not report or highlight such critical vulnerabilities are only hurting their users.

TragicNotCute@lemmy.world · 2 years ago

It is common practice to notify affected parties privately and then give full details to the public after the threat is largely neutralized. Expecting public disclosure with technical details on how to perform the attack in less than 24 hours goes against established industry norms.

Dark Arc@lemmy.world · 2 years ago

That only stands true when the issue is not being actively exploited.

Puzzle_Sluts_4Ever@lemmy.world · edit-2 2 years ago

Removed by mod

2 years ago

I strongly disagree with some of your points.

Yes, the vulnerability is out there. Maybe the root cause actually introduced a LOT of vulnerabilities. The fix is being pushed at a frantic pace. To expect the devs to take time out of the mad rush to notify those impacted to do a proper writeup is just insanity.

It’s not insanity. It’s called incident management and it’s something the development team needs to build a proper procedure around, given the expanded scope of this project. I agree that the devs working on identifying, mitigating, and fixing the vulnerability should not be expected to also handle the communication. They need to designate someone for that role.

A 0-day was actively being exploited in the wild. There was confusion, misinformation, and a general lack of information.

You need to:

Indicate that you are aware of an ongoing problem and are working to identify it. This lets people know there is an issue and that you’re aware of it. You can do this without giving specific details on how to replicate the exploit. This includes server admins publicly acknowledging that they are aware of the issue and will provide updates when they have them, to alleviate the concerns of their user base.
Once a mitigation is known, you publish that, in as many channels as you need to get that information out to the people who need it, so that server admins are aware of what they need to do to reduce their risk.
Once a fix is in place, you publish that, same as above.
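
To make the three stages concrete, here is a rough sketch of how a pinned advisory post could be templated. The `Stage` and `Advisory` names and the wording of the rendered text are made up for illustration; this is not anything the Lemmy project uses.

```python
from dataclasses import dataclass
from enum import Enum


class Stage(Enum):
    ACKNOWLEDGED = "acknowledged"  # problem is known, exploit details withheld
    MITIGATION = "mitigation"      # a workaround is known
    FIXED = "fixed"                # a patched release is available


@dataclass
class Advisory:
    title: str
    stage: Stage
    action: str = ""  # what admins should do; empty while still investigating

    def render(self) -> str:
        """Build the text of the pinned post for the current stage."""
        lines = [f"[SECURITY] {self.title}", f"Status: {self.stage.value}"]
        if self.stage is Stage.ACKNOWLEDGED:
            # No technical detail yet, only confirmation that work is underway.
            lines.append("We are aware of the issue and investigating. Updates will be edited into this post.")
        else:
            lines.append(f"Action for admins: {self.action}")
        return "\n".join(lines)


if __name__ == "__main__":
    print(Advisory("Active exploit affecting lemmy-ui", Stage.ACKNOWLEDGED).render())
```

The same post gets edited as the stage moves forward, so there is one place to look instead of a dozen threads.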

The way I see it? This (hopefully) got fixed pretty much instantly and there is active work to get the fix applied by the people who need to apply it. That is what should be done.

And how do you know this since it’s not been communicated? Most of the information I (as a person running a lemmy server) have been able to glean is from random threads spread across random communities.

Give it a week or two to see how they handle the public disclosure side of things.

A couple of weeks for a postmortem. Sure. A couple of weeks for an active, in the wild, 0-day, to officially communicate that the problem exists and how to mitigate/patch it. Absolutely not. I still don’t see a security alert on the GitHub telling me I should be updating to <insert version> to patch an active exploit and it’s been how many hours now?

Puzzle_Sluts_4Ever@lemmy.world · edit-2 2 years ago

Removed by mod

2 years ago

Is the project small? Yes.

Did it explode in popularity leaving the devs overwhelmed? Certainly.

Do I expect them to strictly follow established ITIL incident management? No.

Do I expect them to communicate in a consistent way when an incident happens? Yes.

I agree the primary developers should be left to fixing the problems but there are enough active members of that project that someone could have handled communication in a more concise and official way. I don’t consider random posts in asklemmy or selfhosting by random users just guessing to be a substitute for that.

If the project is going to persist and grow it needs to get better at that. Pointing it out isn’t shitposting.

Puzzle_Sluts_4Ever@lemmy.world · edit-2 2 years ago

Removed by mod

fuser@quex.cc · 2 years ago

Whilst I differ somewhat on sharing information on the exploit (knowing something about what was going on allowed some instance admins to take evasive steps), I agree with you completely that there could be a better channel for coordinating communication. I imagine a lot of the discussion went on via Matrix. Under the circumstances the response wasn’t so bad given the complete lack of formal organization, but yes, it definitely could be improved. You sound quite well-versed in how to handle security/critical incidents. Maybe consider contacting the devs and offering them some help in this area?

2 years ago

I don’t think I’m asking for a lot. A post on !lemmy@lemmy.ml xposted to !lemmy_support@lemmy.ml that gets pinned to the top. Edit the post when relevant information comes out. Release a security advisory on github as soon as you have enough info to warrant one and keep it up-to-date as well.

I’m not asking for the troubleshooting to happen out in the open.

you sound quite well-versed in how to handle security/critical incidents. Maybe consider contacting the devs and offering them some help in this area?

I know enough. I’m certainly not an infosec guy; I’m just a sysadmin who’s been doing this long enough to know what should be done. At least partly due to this, there are currently 400 open issues just in lemmy-ui on GitHub. Right now I think the best most of us can do is wait for the dust to settle.

fuser@quex.cc · edit-2 2 years ago

Right, but lemmy.ml is really just one of a thousand-plus instances. We need something instance-independent, or a way to propagate info that doesn’t rely on any single point of failure or on Lemmy as the communication channel. What happens when lemmy.ml is down, or if no instances are able to post due to a concerted DoS?

It’s impossible to stop anyone randomly posting stuff on Lemmy. Attackers can post misinformation as well, especially if they compromise admin accounts. Who are we gonna trust in the midst of the next incident? The account posting most prolifically about the UI exploit in progress was using a burner account that had just been created to post about it. I’m sure there were good reasons for wanting to be anonymous when discussing the work of unknown malicious actors, but it made me think twice about what was being posted at the time.

Goodie@lemmy.world · 2 years ago

Your typical dev is not a technical writer, and shouldn’t be doing the proper write-up.

If you feel (and it seems you do) that this skill is missing from the Lemmy team, perhaps you should volunteer some time.

andrew@lemmy.stuart.fun · edit-2 2 years ago

If this was not a zero day being actively exploited then you would be 100% correct. As it is currently being exploited and a fix is available, visibility is significantly more important than anything else or else the long tail of upgrades is going to be a lot longer.

Keep in mind a list of federated instances and their version is available at the bottom of every lemmy instance (at /instances), so this is a really easy chain to follow and try to exploit.

The discovery was largely discussed in the lemmy-dev Matrix channel, fixes published on github, and also discussed on a dozen alternate lemmy servers. This is not an issue you can really keep quiet any longer, so ideally now you move along to the shout it from the mountaintop stage.

hawkwind@lemmy.management · 2 years ago

FYI for anyone looking to deface more instances, that list is only updated every 24 hours. Depending on when it last ran on your home instance, the info could be out of date.

andrew@lemmy.stuart.fun · 2 years ago

I think it also only shows backend version, not frontend, so it won’t reflect this fix.
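
If you are checking version strings against the patched tag, the release-candidate suffix needs a bit of care. Below is a minimal sketch, assuming versions look like `0.18.1` or `0.18.2-rc.1` (the `rc1` spelling without the dot is accepted too). The function names are my own, and as noted above the /instances list only reports the backend version, so this is mostly useful for checking your own lemmy-ui tag.

```python
import re


def parse_version(version: str) -> tuple:
    """Turn '0.18.2-rc.1' into a tuple that sorts in release order."""
    m = re.fullmatch(r"(\d+)\.(\d+)\.(\d+)(?:-rc\.?(\d+))?", version.strip())
    if not m:
        raise ValueError(f"unrecognised version: {version!r}")
    major, minor, patch = (int(m.group(i)) for i in (1, 2, 3))
    # A final release sorts after all of its release candidates.
    rc = int(m.group(4)) if m.group(4) is not None else float("inf")
    return (major, minor, patch, rc)


def at_least(version: str, minimum: str) -> bool:
    """True if `version` is the same as or newer than `minimum`."""
    return parse_version(version) >= parse_version(minimum)


if __name__ == "__main__":
    for v in ("0.18.1", "0.18.2-rc.1", "0.18.2"):
        print(v, at_least(v, "0.18.2-rc.1"))
```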

xantoxis@lemmy.one · 2 years ago

OK, as long as all the well-meaning people stop discussing it, nobody will ever find out about it.

Son, this is not it.

Meow.tar.gz@lemmy.goblackcat.com · 2 years ago

This is my take on it. I am running Lemmy in Docker using the dessalines image. I hope that there will be an update come this afternoon.

andrew@lemmy.stuart.fun · 2 years ago

There’s already an update available, but it’s for lemmy-ui not lemmy. Just update the tag to 0.18.2-rc.1 and you’ll have this fix.
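
For a docker-compose setup, here is a rough sketch of bumping just the UI tag in the compose file text. It assumes the image line looks like `image: dessalines/lemmy-ui:<tag>` with no quotes around it; the helper name is made up, and your file may be laid out differently.

```python
import re


def bump_ui_tag(compose_text: str, new_tag: str, image: str = "dessalines/lemmy-ui") -> str:
    """Return compose_text with the tag of `image` replaced by `new_tag`.

    Only unquoted lines of the form 'image: <image>:<tag>' are touched,
    so the backend image (dessalines/lemmy) keeps its current tag.
    """
    pattern = re.compile(rf"(image:\s*{re.escape(image)}):\S+")
    return pattern.sub(lambda m: f"{m.group(1)}:{new_tag}", compose_text)


if __name__ == "__main__":
    sample = (
        "services:\n"
        "  lemmy:\n"
        "    image: dessalines/lemmy:0.18.1\n"
        "  lemmy-ui:\n"
        "    image: dessalines/lemmy-ui:0.18.1\n"
    )
    print(bump_ui_tag(sample, "0.18.2-rc.1"))
```

After changing the file, running `docker compose pull` followed by `docker compose up -d` in the same directory should normally fetch the new image and recreate the UI container.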

Meow.tar.gz@lemmy.goblackcat.com · 2 years ago

Yep, that’s the plan! Thanks for letting me know. Lemmy is awesome and I am having so much fun with it. I expect it only to get better as the days and weeks progress.

roboadmin@lemmy.robotra.sh · 2 years ago

This is probably a dumb question, but I used the Ansible install for Lemmy and just did a git pull and --become again, and the UI wasn’t updated, so I assume 0.18.2 isn’t in release yet (which is fine). Is there documentation on updating the UI? I see where it’s showing in the docker-compose.yml file, but I am uncertain what to do after changing it there (or if that’s the right place to change it).

robotrash@lemmy.robotra.sh · 2 years ago

This is probably a dumb question, but I used the Ansible install for Lemmy and just did a git pull and --become again, and the UI wasn’t updated, so I assume 0.18.2 isn’t in release yet (which is fine). Is there documentation on updating the UI? I see where it’s showing in the docker-compose.yml file, but I am uncertain what to do after changing it there (or if that’s the right place to change it).

twitterfluechtling@lemmy.pathoris.de · 2 years ago

According to https://github.com/LemmyNet/lemmy/commits/main, the bug was fixed with https://github.com/LemmyNet/lemmy/commit/00f9f79a44887869dcdc3fe5bd1dabbbdc080cec and is part of release 0.18.1, right? I usually wouldn’t recommend installing a release candidate, except for testing, but since this is still 0.X anyway…

Demigodrick@lemmy.zip · 2 years ago

There is already an update: 0.18.2-rc.1.

You can apply it now.

Meow.tar.gz@lemmy.goblackcat.com · 2 years ago

I will have to wait until I can get home from work. Work does deep packet inspection and blocks SSH. I’ve tried doing SSH on port 993, one I know for a fact is open because I get my email that way on my phone and I still get a connection refused. Bunch of fascists!

Midas@ymmel.nl · 2 years ago

Woah seriously

Meow.tar.gz@lemmy.goblackcat.com · 2 years ago

Yep, seriously!

BlueBockser@programming.dev · 2 years ago

You could try using a VPN or some other kind of proxy which wraps your SSH traffic to prevent packet inspection. Then it should look like normal UDP traffic ;)

Meow.tar.gz@lemmy.goblackcat.com · 2 years ago

They block UDP traffic too. I tried some things like that.

redcalcium@c.calciumlabs.com · 2 years ago

Given how strict it is, I assume your company implements some sort of certification such as ISO 27001 and really sticks to its guns on it? So can you, like, not use your company’s wifi on your phone if it’s heavily monitored? Or is the cell reception poor at your office?

Meow.tar.gz@lemmy.goblackcat.com · 2 years ago

I use my employer’s guest wifi. Right now I can’t afford adding the hotspot to my plan.

redcalcium@c.calciumlabs.com · 2 years ago

No need to enable hotspot on your phone. Just install Termux from f-droid if you’re on android, or Prompt if you’re on iOS, and use SSH directly from your phone.

tko@tkohhh.social · 2 years ago

Where is this Matrix Channel? Is it private? How can I get access as an instance admin?

ColonelPanic@lemmy.ml · 2 years ago

https://matrix.to/#/#lemmy-support-general:discuss.online

andrew@lemmy.stuart.fun · 2 years ago

Presumably that channel (I hadn’t seen it there) and https://matrix.to/#/#lemmydev:matrix.org, which is actually where I was watching for the fix.

Kresten · 2 years ago

It isn’t private, I think there’s a link on the GitHub.

hawkwind@lemmy.management · 2 years ago

If the only criterion to be in a private channel for admins is being an admin, there’s no use making it private. ;) Unless you’re just looking to filter out bad actors who don’t want to take 5 min and $5 to make an instance.

Kresten · 2 years ago

I have an account that’s in there without being an administrator.

andrew@lemmy.stuart.fun · 2 years ago

Yep, it’s public on github.

Kresten · 2 years ago

I was actually speaking of the matrix channel

popemichael@lemmy.world · 2 years ago

It’s strange that they would try to bury this information.

The number 1 tool against future hacks like this is education.

exu@feditown.com · 2 years ago

From what I found digging through some posts, this exploit only works if your instance uses custom emoji. Federated custom emoji are apparently harmless.

andrew@lemmy.stuart.fun · 2 years ago

Yes, if you have no custom emoji on your instance, you should not be vulnerable. A valid workaround before the fix is also to just remove all custom emoji, from what I’ve also read.
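
If you want to check this for your own instance, here is a minimal sketch. It assumes the site info endpoint is `/api/v3/site` and that the response carries a `custom_emojis` list; both are my reading of the 0.18 API, so verify against your instance before relying on it. The function names are hypothetical.

```python
import json
import urllib.request


def has_custom_emoji(site: dict) -> bool:
    """True if the parsed site response lists at least one local custom emoji."""
    # "custom_emojis" is the assumed field name; a missing key counts as none.
    return bool(site.get("custom_emojis"))


def fetch_site(base_url: str) -> dict:
    """Fetch and parse the site info for an instance (assumed endpoint)."""
    url = f"{base_url.rstrip('/')}/api/v3/site"
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Replace with your own instance URL.
    print(has_custom_emoji(fetch_site("https://example.invalid")))
```

A `False` result only says no local custom emoji are configured; it does not replace applying the fix.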

Netto Hikari@social.fossware.space · 2 years ago

I’m not sure what to think about that instance. I saw some weird stuff in the mod protocol recently, if I remember correctly… Like some drama going on, etc.

fox@lemmy.fakecake.org · 2 years ago

unfortunately there’s no images for 0.18.2-anything yet :(

andrew@lemmy.stuart.fun · edit-2 2 years ago

There are, via dessalines’ repo. It’s for lemmy-ui only, at 0.18.2-rc.1.

I have it running already on my instance, and have for 93min now.

fox@lemmy.fakecake.org · 2 years ago

thanks, I guess I missed it. gonna update ASAP just in case, even though I’m the only user of my instance.

Slashzero@hakbox.social · edit-2 2 years ago

Yes, there is: 0.18.2-rc.1, which has the hot fix, but will also require a DB query to “fix” the modlog once upgraded.

binwiederhier@discuss.ntfy.sh · 2 years ago

Excuse my ignorance, but where can I find details on this issue, and does it affect only 0.18.1, or also 0.18.0?

Netto Hikari@social.fossware.space · 2 years ago

For amd64, Lemmy dev Dessalines pushes images to his Docker Hub repo usually right after a new version comes out.

Since they don’t release arm64 builds anymore, I build them regularly and push them to my repo, which can be found here.

Excel@lemmy.megumin.org · edit-2 2 years ago

Multi-platform images are kept up-to-date on this 3rd party repo: https://github.com/ubergeek77/lemmy-docker-multiarch

demesisx@programming.dev · 2 years ago

Which leads me to ask: why are we still using Docker images as a MAJOR part of our infrastructure when superior alternatives exist? The Docker aspect made me realize how hacked together the codebase actually is.

Zetaphor@zemmy.cc · 2 years ago

Just because it’s not using your personal preference of containerization doesn’t qualify it as being “hacked together”. Docker is a perfectly acceptable solution for what Lemmy is.

demesisx@programming.dev · 2 years ago

That’s just like your opinion, man.

Zetaphor@zemmy.cc · 2 years ago

Yes, that’s my point.

MigratingtoLemmy@lemmy.world · 2 years ago

I will always espouse containers for critical workloads as they provide much better orchestration, especially during deployment. If your complaint is specifically against docker, I agree, we should be using k8s

demesisx@programming.dev · 2 years ago

I disagree.

IMO, we should be using Nix and OCI.

andrew@lemmy.stuart.fun · 2 years ago

When someone says docker in the context of images today, they’re already talking about the OCI format.

The Quuuuuill@slrpnk.net · 2 years ago

OCI uses Dockerfiles and runs Docker images, as Docker images are just KVM images, which is what OCI runs. Nix is absolute overkill for the orchestration of a web server workload and would be better for managing the container host (whatever you’re running Kubernetes or Docker Swarm on).

I don’t really know how to put this, but nearly every single web service you encounter and interact with is built using a Dockerfile, just like Lemmy is doing. If you’re going to disqualify Lemmy as a viable platform based on it having a Dockerfile, I’ve got bad news.

towerful@programming.dev · 2 years ago

I thought KVM was virtualisation, as in separate kernels.
And I thought containers shared the host’s kernel. Essentially an “overlay OS”.

So, a KVM guest could virtualise different hardware and CPU architectures.
Whereas a container can only use what the host has.

dannoffs@lemmy.sdf.org · 2 years ago

Lmao

mcmxci@mimiclem.me · 2 years ago

What are these ‘superior’ alternatives?

Midas@ymmel.nl · 2 years ago

That’s a ridiculous take

ebits21@lemmy.ca · 2 years ago

What’s so bad about using docker? Serious question.