I was wondering if the nature of decentralization would negatively affect SEO, since people can access the same post from many different instances.
https://lemmy.ml/robots.txt , https://lemmy.world/robots.txt , etc. don’t seem to disallow posts, so the text-based content should be easy to index, at least for these instances.
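If anyone wants to check this without eyeballing the files, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt body and the `/post/12345` path below are made up for illustration (the real files on those instances may look different), and `is_crawlable` is just a helper name I picked.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt body, for illustration only. The real files served
# by lemmy.ml or lemmy.world may differ.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /login
Disallow: /settings
Disallow: /create_post
"""


def is_crawlable(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # The post path is an assumed example, not a real post.
    print(is_crawlable(SAMPLE_ROBOTS_TXT, "https://lemmy.ml/post/12345"))
    print(is_crawlable(SAMPLE_ROBOTS_TXT, "https://lemmy.ml/login"))
```

To test a live instance you would download its robots.txt yourself and pass the text in. Keep in mind that being allowed by robots.txt only means a crawler may fetch the page, not that it will rank.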
Related news: Google is getting a lot worse because of the Reddit blackouts.
An earlier post pointed out: federated sites seem like they will suffer against central content in an SEO world, regardless of whether they are technically indexable.
I wonder if Lemmy should have an SEO-friendly federated site… .com domain, robots.txt and everything else…
How long does it usually take for Google to index websites? Because I tried the string
lemmy site:lemmy.ml after:2023-06-15
and only one post turned up for me, and it was Memes
… the current state of affairs does not seem promising 😔 And when I tried another instance with the same keywords,
lemmy site:kbin.social after:2023-06-15
nothing turned up at all.
I wonder though, will search engines adapt to Lemmy and its fediverse system? Or will search engines die? Or will we see dedicated search engines to search through the fediverse?
How long does it usually take for Google to index websites?
Anything from a couple of hours to more than a week. I don’t think having a “real-time feed” through Google is important, though. Other than World Cup scores, their results were never about speed.
Google brings it up
The second link for me when searching for Lemmy on Google is the link to the “Join Lemmy” website. Surprisingly, Brave Search, which has seemingly no search bubbles or accounts, shows the same.
Sorry, I didn’t make the post very clear. I was referring to individual posts, when people search for a specific issue/discussion on Lemmy.
I was googling about Lemmy instances and got several front page results from the Self Hosted community.
It’ll be a sad day when it reaches first place on Google. Kilmister earned it, but as it is said: a person dies twice. Once when they die, and once again when nobody remembers their name.
I tried searching for the title of this post verbatim and it isn’t in Google’s results, period.
That could just be because of the lag between when the post is created and when the Google crawler finds/indexes the page.
Oh, good point. Yes, probably? We cannot simply assume search engines know that all of these point to the same content:
- https://slrpnk.net/c/technology
- https://feddit.de/c/technology@slrpnk.net
- https://sopuli.xyz/c/technology@slrpnk.net
- https://beehaw.org/c/technology@slrpnk.net
Or even worse, due to defederation, they may not all point to the exact same content.
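To make the aliasing concrete, here is a rough sketch of how one could map each of those addresses back to a single "home" address. It assumes the `/c/<name>@<home-host>` path pattern visible in the list above, and `home_community_url` is a hypothetical helper, not something Lemmy or any search engine provides. It only normalizes the address, so it says nothing about whether the content behind each copy is actually identical.

```python
from urllib.parse import urlsplit


def home_community_url(url: str) -> str:
    """Map a community URL seen from any instance to its home-instance URL.

    Assumes paths of the form /c/<name> (already on the home instance) or
    /c/<name>@<home-host> (viewed through another instance).
    """
    parts = urlsplit(url)
    segments = parts.path.strip("/").split("/")
    if len(segments) != 2 or segments[0] != "c":
        raise ValueError(f"not a community URL: {url}")
    name, _, home_host = segments[1].partition("@")
    # No "@host" suffix means the URL is already on the home instance.
    host = home_host or parts.netloc
    return f"https://{host}/c/{name}"


if __name__ == "__main__":
    urls = [
        "https://slrpnk.net/c/technology",
        "https://feddit.de/c/technology@slrpnk.net",
        "https://sopuli.xyz/c/technology@slrpnk.net",
        "https://beehaw.org/c/technology@slrpnk.net",
    ]
    # All four collapse to one address.
    print({home_community_url(u) for u in urls})
```

A crawler would need some hint like this to treat the four addresses as one source. Without it, each looks like a separate page.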
Without further investment from either Lemmy’s or the search engines’ side, they are probably seen as distinct sources, not aggregated. That makes each individually less relevant and less likely to show up.
Also note that none of the addresses above contain ‘lemmy’. How would users search for content on Lemmy in these cases? You can’t do “technology site:lemmy”, can you?
But I can say, Lemmy content is visible. Haven’t seen it on the first page of Ecosia yet, but on page 2 or 3.
Maybe you could use site:lemmy.ml? Because they federate with most instances, they’re likely to have most of Lemmy’s content.
Add to that, I’ll bet search engines that find identical content scattered across different sites rank it lower than content at just one site.
This is relatively simple to solve from a technology perspective. You just add a canonical URL tag (a rel="canonical" link element in the page head) to pages on federated sites, pointing at the source URL. It’d be trivial to implement, provided the authoritative URL is known.
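As a minimal sketch of what that would involve, the snippet below builds the tag a federated copy could put in its page head. The function name `canonical_link_tag` and the example post URL are hypothetical, and this is not how Lemmy's frontend is actually implemented, just an illustration of the idea.

```python
from html import escape


def canonical_link_tag(source_url: str) -> str:
    """Build a <link rel="canonical"> element pointing at the source URL."""
    # Escape the URL so characters like & and " cannot break the attribute.
    return f'<link rel="canonical" href="{escape(source_url, quote=True)}">'


if __name__ == "__main__":
    # Hypothetical URL of the post on its home instance.
    print(canonical_link_tag("https://slrpnk.net/post/1"))
```

Every instance showing the post would emit the same tag, so a crawler that honors it can fold the copies into one result. Search engines generally treat the canonical tag as a hint rather than a rule, so this helps but doesn't guarantee anything.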
Fair enough, at least the Lemmy website and GitHub page show up in the search.
I believe there are engines that specifically index (search) the Fediverse. Searx is one that can do this but it might depend on the instance.