This is lemmy.world after 4 weeks:
Considering this is going to be around a 5-user instance at most I think I’ll be good for a while. Thanks!
I’m running 50 users right now, subbed to A LOT of communities, seeing DB growth of about 100 MB per day.
That seems high when you extrapolate that to 10000 users, like a larger instance might have.
It’s all about how many communities your user(s) subscribe to since your instance basically acts as a mirror for those.
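To put rough numbers on that, here is a minimal sketch comparing the naive per-user extrapolation of the 100 MB/day figure above with counting what the instance actually mirrors. The function names and the community names are made up for illustration, and it assumes growth tracks the set of distinct subscribed communities, which is what the comments here suggest rather than something measured.

```python
def linear_extrapolation(mb_per_day, current_users, target_users):
    """Naive estimate: assume every user adds the same daily growth."""
    return mb_per_day / current_users * target_users


def mirrored_communities(subscriptions_by_user):
    """Count distinct communities; overlapping subscriptions are mirrored once."""
    distinct = set()
    for subs in subscriptions_by_user:
        distinct.update(subs)
    return len(distinct)


# 100 MB/day at 50 users, scaled straight up to 10000 users.
naive_mb_per_day = linear_extrapolation(100, 50, 10_000)

# Hypothetical users with overlapping subscriptions.
users = [
    {"selfhosted", "linux"},
    {"linux", "technology"},
    {"selfhosted", "technology"},
]
total_subscriptions = sum(len(subs) for subs in users)
distinct_communities = mirrored_communities(users)

print(naive_mb_per_day)      # 20000.0
print(total_subscriptions)   # 6
print(distinct_communities)  # 3
```

The naive number (about 20 GB/day) is only an upper bound under this assumption: six subscriptions in the toy example still mean just three communities to replicate.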
My instance has been running for 23 days, and I am pretty much the only active local user:
So if you’re the only user (let’s assume for ease) then, that represents all the updates (posts, comments, votes) from each community that you are subscribed to?
Yeah, and I purposely subscribe to (or sometimes have a dedicated “federation helper bot” account I run subscribe to) most of the most popular communities on the most popular instances so I can get a decent sampling of what’s going on in the fediverse on the “All” feed. So I assume my storage usage is maybe a bit higher than what an “average” single-user instance may be…
lmao same here. I have a spare account that I use to sub to everything worth subbing to. I haven’t automated it yet though.
Ooh, that’s a really good idea, I need a federation helper bot/account when I start self-hosting a Lemmy instance!
Yeah it’s not automated or anything, I just pop an incognito window and use it when there is a community I think is worth seeing sometimes in “All” (or just for archiving purposes) but don’t want to clutter “Subscribed”. I may make something to auto-subscribe to communities meeting some criteria or something at some point in the future…
Do you also post stuff? I mean my instance is only about an hour old, but I’ve subscribed to some communities, yet I don’t see the picture service consuming the S3 storage I’ve configured
Lemmy caches every thumbnail of every post for like a month or something using Pictrs, so that storage will eventually hit a sort of equilibrium and start growing much more slowly (only reflecting post/thumbnail volume during the cache time).
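A quick sketch of why that levels off, assuming a constant thumbnail volume per day and a fixed retention window. The 30-day window and the 50 MB/day rate are placeholder assumptions, not values taken from Lemmy or Pictrs.

```python
def cached_thumbnail_mb(day, mb_per_day, retention_days):
    """Size of the thumbnail cache on a given day.

    Thumbnails older than retention_days are assumed to be purged, so only
    the most recent retention_days worth of thumbnails is on disk.
    """
    return mb_per_day * min(day, retention_days)


MB_PER_DAY = 50       # assumed daily thumbnail volume
RETENTION_DAYS = 30   # assumed cache lifetime ("like a month or something")

for day in (10, 30, 90):
    print(day, cached_thumbnail_mb(day, MB_PER_DAY, RETENTION_DAYS))
# 10 500
# 30 1500
# 90 1500
```

In this model the cache stops growing once the first thumbnails start expiring. If post volume keeps rising, the plateau rises with it, which would match the "growing much more slowly" description.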
Between profile images, community banners/icons, post images etc. there are probably a few dozen images that will be sticking around for the long haul at the moment.
Your instance only caches thumbnails, so it won’t take much space. The full images are served from the remote instance. So you basically only store whatever your users upload.
It won’t scale linearly. A lot of those users will be subscribed to subs the instance is already replicating. It would only be new subs that would add to the growth.
And only active subs. And even then, it’s just text and tiny thumbnails.
Question if you know: does a lemmy instance have to be publicly accessible to work? Like, if I make an instance on my homelab can the instance “fetch” content and serve it faster locally? Could I reply to a post and have others see it? Etc
wondering this also! wouldn’t it require a domain for your account though?
Now I wonder how viable it would be to support video hosting. The answer is almost certainly “God no!”
It is viable through other hostings
Honestly, less than I thought!
Interesting, I thought it would be waaayyy more
At the end of the day the vast majority of what needs to be saved is text. If media content is embedded, the server just has to save the path to the file, not the file itself.
Yeah lemmy seems to use just about nothing for data storage.
Wow, that is surprisingly not bad given the size of the instance!
Feels like this will benefit from some sort of fuzzy deduplication in the pictrs storage. I bet there are a lot of similar pics in there. E.g. if one pic or a gif is very similar to another, say just different quality or size, or compression, it should keep only one copy. It might already do this for the same files uploaded by different people as those can be compared trivially via hashing, but I doubt it does similarity-based deduplication.
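For anyone curious what the difference looks like, here is a minimal sketch of exact-hash matching versus a simple perceptual "average hash". This is not how pictrs works (I haven't checked its code); the helper names are made up, and it assumes each image has already been decoded, converted to grayscale and downscaled to a tiny grid, which a real implementation would do with an imaging library.

```python
import hashlib


def exact_key(pixels):
    """Exact dedup key: SHA-256 of the raw bytes. Any changed byte changes it."""
    flat = bytes(p for row in pixels for p in row)
    return hashlib.sha256(flat).hexdigest()


def average_hash(pixels):
    """Perceptual hash: one bit per pixel, set if the pixel is above the mean."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for p in flat:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming(a, b):
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


# Toy 4x4 grayscale grids (values 0-255), invented for illustration.
original = [[10, 200, 10, 200], [200, 10, 200, 10],
            [10, 200, 10, 200], [200, 10, 200, 10]]
# Same picture with small per-pixel noise, as recompression might add.
recompressed = [[14, 195, 12, 203], [198, 8, 205, 13],
                [11, 197, 9, 201], [204, 12, 196, 7]]
# A genuinely different picture.
different = [[200, 200, 200, 200], [200, 200, 200, 200],
             [10, 10, 10, 10], [10, 10, 10, 10]]

print(exact_key(original) == exact_key(recompressed))                  # False
print(hamming(average_hash(original), average_hash(recompressed)))     # 0
print(hamming(average_hash(original), average_hash(different)))        # 8
```

Exact hashing treats the recompressed copy as a new file, while the perceptual hashes are identical. A fuzzy dedup would treat images within some small Hamming distance as duplicates, and picking that threshold without merging distinct images is the hard part.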
That’s not super terrible given the size of lemmy.world.