Researchers published a massive database of more than 2 billion Discord messages that they say they scraped using Discord’s public API. The data was pulled from 3,167 servers and covers posts made between 2015 and 2024, the entire time Discord has been active.

Though the researchers claim they’ve anonymized the data, it’s hard to imagine anyone is comfortable with almost a decade of their Discord messages sitting in a public JSON file online. Separately, another programmer released a Discord tool called “Searchcord,” built on a different data set, that shows non-anonymized chat histories.

  • Melvin_Ferd@lemmy.world
    16 hours ago

    Yeah, a lot of this push is about ushering in new laws to prevent data scraping.

    Propaganda spreads easily through fake accounts, but how do we detect large-scale operations if they’re constantly creating and deleting accounts or trying to blend in with the rest of us? We’d need access to massive data sets to mine for patterns and expose coordinated behavior.

    But the powers that benefit from shaping the narrative are the same ones pushing the idea that all scraping is bad. They want people to hate it, so they can justify laws that lock down access. That’s the end game.