CEO Steve Huffman says tech giants should not be able to trawl Reddit’s huge store of data for free. But that information came from users, not the company

That “corpus of data” is the content posted by millions of Reddit users over the decades. It is a fascinating and valuable record of what they were thinking and obsessing about. Not the tiniest fraction of it was created by Huffman, his fellow executives or shareholders. It can only be seen as belonging to them because of whatever skewed “consent” agreement its credulous users felt obliged to click on before they could use the service.

Ouch

  • @impulse@lemmy.world
    English
    109
    11 months ago

    The more I think about it, the more I come to the conclusion that what really made me delete my account early (I initially wanted to wait until the 30th to see how things play out) was the ridiculous number of people defending this bullshit and promoting the official Reddit app as the superior option.

    Some going as far as saying 3rd party devs are leeches and scammers.

    I can only tolerate so much stupidity and ignorance before I bail.

    • LittleKerr
      English
      52
      11 months ago

      Wait, you mean there are people (actual, real people, not paid by who knows who) who believe that the official Reddit app is superior?? I know a few that believe it’s not thaaat bad, but ‘superior’? Lmao

      • Balder
        English
        25
        11 months ago

        I see this kind of behavior happen a lot online, and asked ChatGPT about it:

        Yes, there is a term that describes this phenomenon. It’s called “oppositional belief perseverance” or “belief polarization.” This term refers to the tendency of individuals to cling to their initial beliefs even when presented with evidence that contradicts those beliefs. In the context you described, someone may initially take the opposite side of a discussion due to an opposition bias, but over time, they may start to internalize and genuinely believe the opposing viewpoint, thereby demonstrating belief polarization.

      • SpaceBar
        English
        16
        11 months ago

        People can convince themselves of anything.

      • @LaunchesKayaks@lemmy.world
        English
        9
        11 months ago

        My cousin thinks it’s superior. I asked him if he has used 3PAs and he said no. I told him it was too late to start, but that he should check out Lemmy and the fediverse

        • Stovetop
          English
          2
          11 months ago

          There are millions of people out there who just accept all this crap as normal. I just don’t know how people can feel so comfortable about being constantly bought and sold online.

          Ads in general skeeve me out. In the early days (2005-ish?), while visiting a video game forum I used to frequent, my computer was infected with malware delivered by a malicious ad. I didn’t even interact with it—the page just loaded, acted erratically, and before I knew it, my system was completely locked down. My only recourse was a full wipe of that PC.

          Since then, I’ve never trusted ads. And even now that some ads have gotten more “legitimate” (thanks to these five secrets advertisers don’t want you to know!), they still seem sketchy just knowing how much money goes into them. Do banner ads on a website even result in more sales? I don’t know, but obviously they must be conning someone out of their money because they pay so much out.

    • @funkyb@lemmy.world
      English
      18
      11 months ago

      There are a lot of idiots, a lot. I don’t know how to fix that, so I just ignore them and move on.

      • @WorldBear@lemmy.world
        English
        13
        11 months ago

        You have to promote education as a primary value if you’re ever going to have a chance at reducing the idiots. Something at least large portions of the US aren’t interested in because dumb people are easier to control.

  • @wheresyourshoe@lemmy.world
    English
    99
    edit-2
    11 months ago

    spez should start paying the redditors, especially the mods, with that logic. He gets it all for free and now he wants to profit while we would have to pay.

    • andrew
      English
      34
      11 months ago

      Pay the unwashed masses? Please. They should be thankful his highness deigned to create such a platform similarly to the way the landed gentry should be thankful for their high position.

      • @wheresyourshoe@lemmy.world
        3
        11 months ago

        No idea. It would not surprise me, though. I could see it for people who are “content creators” posting their videos or whatever their form of media is.

      • @zeppo@lemmy.world
        3
        11 months ago

        Some sort of profit sharing arrangement seems to be the trend in social media these days. YouTube has a setup like that of course… Instagram and TikTok both pay people (max of like 100 a month i think) and Twitter is planning to start.

    • @zeppo@lemmy.world
      English
      4
      11 months ago

      It’s unclear to me to what extent this actually happens, but some people say reddit mods get offers to promote or allow certain posts for thousands a month. It would make sense on subs that have a seriously large audience.

    • Vegaprime
      English
      1
      11 months ago

      Never thought about it like that. There are YouTube millionaires from posting content. Imagine an OnlyFans going private and the service was all “nah, get back in there”.

  • zalack
    76
    edit-2
    11 months ago

    It’s nice to see an older author on a more traditional platform have such a clear and informed opinion on something deeply steeped in internet culture.

    I recognize this is ageism on my part, but I was surprised when I saw his picture.

    • arkhan
      48
      11 months ago

      Why would that surprise you? It was people his age who created the Internet and the World Wide Web. (Of course they weren’t that age back then, but you get the idea. :-)

      There are fewer Internet-savvy old people, for sure, but when you do find one, they are more likely to be pre-web or web 1.0 “information wants to be free” types. Younger users may have grown up in a more corporate space with a very different philosophy towards the Internet.

      • zalack
        14
        11 months ago

        For sure. Like I said, it’s totally my bias showing. Maybe it’s seeing too many congressmen fundamentally misunderstand the tech. I’ve also run into a lot of older programmers that are highly technical, but still kind of out-of-touch when it comes to the Internet culture that sits on top of the technical layer.

        • arkhan
          10
          11 months ago

          100% with you. Watching any kind of congressional hearing that relates to technology is so incredibly frustrating. I was also really happy to see mainstream journalism specifically acknowledge that Reddit is really just a web-enabled version of old newsgroups or discussion boards, and that all the value is provided by users. If only everyone thought that way!

      • Finkler
        11
        11 months ago

        Definitely a pre-web type here. I recall running two BBSes on a couple of Compaq 286s. Being here on the fediverse reminds me a lot of those fun times, and I’m certainly looking forward to the future here.

        • arkhan
          3
          11 months ago

          I’m probably a little younger than you, as I was on those BBSes throughout my childhood, but definitely not running them!

          I did not get access to the Internet until I went to college. I guess I was right at the cusp of the changeover, as during my undergrad, I learned command line telnet, ftp, mail/elm, Usenet news/rn/ten, gopher, and all of the other early protocols. But then, right in the middle of my undergrad, the NCSA Mosaic beta was released, and I spent an entire night following an early HTML tutorial so I could make a webpage to host under my campus profile.

          The Internet and web are very, very different from what I thought they would be back then. I hope the fediverse might be closer to our original plan for the Internet as a place for curious individuals to exchange ideas and learn from each other.

      • yesdogishere
        6
        edit-2
        11 months ago

        yea what we can do is spam reddit chock full of rubbish posts. spam until it dies. sadly, they still control the trove of our valuable comments for the past decade. let’s not make the same mistake. so spam everything. spam the whole world so that AI slowly dies? 4chan is the best.

        The article understands. SPEZ NEEDS TO PAY US FOR OUR CONTRIBUTIONS. Like NOW. Every redditor needs basic income from reddit, paid for by OpenAI and scammers like them: Google, Fbook etc need to PAY US. Or else, we destroy them. NO MERCY.

      • @Archer@lemmy.world
        2
        11 months ago

        There was a very strong libertarian “The Internet will set us free from the tyranny of nation-states” 90s techno-optimism for a while, but it seems to have died out as any kind of mainstream philosophy.

        • arkhan
          2
          11 months ago

          You know, I hadn’t thought about that in a long time. I remember unironically saying things like “I am a citizen of the Internet”. I probably even used the term “netizen”. It did seem like we would form a global community of tech-minded people that transcended borders, and that it would be the future!

  • dan
    74
    11 months ago

    I don’t really understand this whole fediverse thing yet, but what I do know is… screw Reddit and screw u/spez.

    • TGRush
      28
      edit-2
      11 months ago

      People often compare the fediverse to email, and for good reason.

      E-Mail doesn’t need to live all on the same server, or be made by the same provider. I can use ProtonMail, you can use GMail, somebody else can use Outlook, but in the end it doesn’t matter, as we can all talk.

      The “Fediverse” - short for “The Federated Universe” - follows a similar concept, but it doesn’t do this over Email; The Fediverse does this using the ActivityPub standard instead.

      ActivityPub allows all the servers we have our accounts on (in your case kbin.social and in my case forum.fail) to talk to each other so that content can show up and be interacted with on ALL servers.

      This is also why I - someone from a different server/instance - can reply to your comment and up/downvote it if I want to.

      This is essentially all you need to know to get started. To see where somebody’s account or a magazine/community is hosted, just hover over their username / check the magazine out. It should have something like @name@server.example. We are currently talking in @lemmyworld@lemmy.world for instance.
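
To make the `@name@server.example` pattern concrete, here is a minimal Python sketch that splits such a handle into its two parts. The names `Handle`, `parse_handle` and `same_server` are made up for illustration and are not part of ActivityPub, Lemmy or kbin; real software does considerably more validation.

```python
from typing import NamedTuple


class Handle(NamedTuple):
    """Hypothetical container for the two halves of a handle."""
    name: str
    server: str


def parse_handle(handle: str) -> Handle:
    """Split '@name@server.example' into its name and server parts."""
    text = handle.strip()
    # The leading '@' is optional in this sketch.
    if text.startswith("@"):
        text = text[1:]
    name, sep, server = text.partition("@")
    if not sep or not name or not server:
        raise ValueError(f"not a fediverse-style handle: {handle!r}")
    # Host names are case-insensitive, so normalise the server part.
    return Handle(name, server.lower())


def same_server(a: str, b: str) -> bool:
    """True when both handles live on the same server (instance)."""
    return parse_handle(a).server == parse_handle(b).server
```

For example, `parse_handle("@lemmyworld@lemmy.world")` yields the name `lemmyworld` and the server `lemmy.world`, which is the same information you get by hovering over a username.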

      • @witten@lemmy.world
        12
        edit-2
        11 months ago

        Except email is hugely centralized now (with Google and Microsoft) even though it’s technically a federated protocol. So there’s a huge barrier to entry to spin up your own federated server if you actually want to send/receive any mail with it… I think the lesson here is that we need to be constantly vigilant about potential centralization in the Lemmybin Fediverse as well.

        • @whitehatbofh@lemmy.world
          18
          11 months ago

          There’s no more barrier to spinning up one’s own email server than there has ever been. One simply needs, at a minimum, a server on the internet, a DNS domain, and know-how.

          A server on the internet has never been easier to get, thanks to cloud providers. In fact, many cloud providers will give you a working email server, so that you don’t need to do all the sysadmin things to get software like Bind or Postfix up and running. These hosting providers make it pretty simple to run your own personal email server and domain.

          The big providers are successful because most folks don’t want to stand up their own email server, they just want to use email. But anyone can do it, if they have the time and interest.

          • @witten@lemmy.world
            21
            edit-2
            11 months ago

            I think you’re right about the ease of spinning up a cloud server, but I respectfully disagree on the rest of it—and it’s for one simple reason: IP address reputation management. Spinning up a server such that the Big Guys will actually trust it and willingly receive mail from it is not a trivial thing to do in 2023. I’ve been running mail servers for years and I think there are still blacklists I’m on.

            • @mrspaz@lemmy.world
              15
              11 months ago

              This is why I gave up trying to run my own email server. It became clear it was turning into a racket quite a while ago. I would hear from someone that they didn’t receive an email, so I’d check with their provider and sure enough I’d been blackholed.

              I’d go through all the steps to clear everything, re-send the message and it would go. Send a second message and my server was instantly blackholed again for “spamming” or “suspected open relay” or some other reason. All the “Big Guys” as you call them of course carved out exceptions for each other, but no matter how many security signatures or other measures I implemented it was basically an instant lockout.

              It got to the point where I was forced to sign on with a “Big” provider for routing.

              • @witten@lemmy.world
                3
                11 months ago

                It’s really a sad state of affairs, and it just goes to show how important true federation is. Maybe someday something federated will come in to replace email, and we’ll get another shot. I haven’t given up on email though… I’m just super cynical about it.

                • @minimar@lemmy.world
                  3
                  11 months ago

                  I don’t think we need to replace email, we need to not have astronomically big corporations being able to control it.

            • @TheInsane42@lemmy.world
              1
              11 months ago

              The main reason for this is that most mail servers first check centralised blacklist providers, and only then look at the SPF and DMARC records. If DMARC were the first check, and blacklisting (or greylisting) were applied only in its absence, it would be so much easier. (And I still have to figure out how to do that in Postfix.)
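
The check ordering proposed above can be sketched in a few lines of Python. This is only a decision-logic illustration, not Postfix configuration: the function name `accept_message`, its parameters and the three verdict strings are assumptions made up for the example, and the choice to greylist rather than reject a listed sender is likewise an assumption.

```python
from typing import Optional


def accept_message(dmarc_pass: Optional[bool], on_blocklist: bool) -> str:
    """Return 'accept', 'reject' or 'greylist' for an incoming message.

    dmarc_pass   -- True/False if the sender domain publishes a DMARC
                    policy and the message passed/failed it; None if the
                    domain publishes no DMARC record at all.
    on_blocklist -- whether the sending IP appears on a blocklist.
    """
    # DMARC is consulted first: an authenticated sender is accepted
    # without looking at IP reputation.
    if dmarc_pass is True:
        return "accept"
    if dmarc_pass is False:
        return "reject"
    # No DMARC record: only now fall back to the blocklist.
    return "greylist" if on_blocklist else "accept"
```

Under this ordering, `accept_message(True, True)` is accepted even though the IP is listed, which is the behaviour a small self-hosted server with correct DNS records would benefit from.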

          • Sparking
            1
            11 months ago

            It’s not that simple with mail. Most centralized mail servers have strict requirements for domains that they will not sort into spam, and if you are sending a lot of mail from your personal server, you will probably end up on a spam list. I don’t do it, so I am not an expert, but hosting your own email server to do anything useful is pretty complicated.

            Still, I guess you could argue that this is as it should be, as it prevents people from making spam servers, while still theoretically not being impacted that much for personal use servers. But I don’t personally know anyone who seriously hosts their own email server anymore.

      • @randon31415@lemmy.world
        11
        11 months ago

        The way I like to explain it is with World of Warcraft. You sign up on a server and go out and mine some copper ore. Your player and that copper ore are only on that one single server. If you wanted to trade it with a friend, they would have to be on that server. However, if you went and posted that copper ore on the auction house, people from dozens of servers can see it and buy it. Those servers are in the ‘‘lemmy’’ sense federated with one another, but instead of virtual copper ore, it is cute pictures of cats.

        • 💡dimOP
          2
          11 months ago

          I think of it as internet cafes.

          You can choose which cafe you want to use, but once you then connect, you all can view and share the same information.

          I can talk one on one with the other people sitting in my cafe, and each cafe may have its own set of rules and regulations over what you can do, how much the coffee is etc, but once logged in we can still share information with people in other cafes…

    • TechnoBabble
      18
      11 months ago

      The fediverse is basically just a bunch of Reddits that can all work with each other.

      It needs some streamlining work, but it’s heading in the right direction.

      • @TheInsane42@lemmy.world
        4
        11 months ago

        Nope, the fediverse is an environment where different types of platforms can connect to each other.

        As far as I understand there is at least 1 federated alternative for:

        • twitter
        • facebook
        • youtube
        • reddit

        You can even read Lemmy posts and reply to them with your mastodon account, just not create new lemmy posts.

        Not sure about an instagram/whatsapp/discord alternative. (But once the idea gets put into somebody’s mind…)

        • @asexualchangeling@lemmy.ml
          2
          11 months ago

          Not true, you can create lemmy posts from mastodon, there’s just not a ui or anything for it, I’ve never done it but AFAIK it involves tagging the community at the top of your post

        • @TheInsane42@lemmy.world
          2
          11 months ago

          > You can even read Lemmy posts and reply to them with your mastodon account

          Check, works. (dusted off my mastodon account)

  • @llama@midwest.social
    English
    74
    11 months ago

    My favorite thing about this whole debacle is how transparent they’re being about how the plan the whole time was to actually just hope we would keep giving them content and moderating for free forever so they could package it up and sell it to Wall Street. And not just them but all social media companies seem to think this will just work and nobody will mind.

  • LachlanUnchained
    English
    62
    edit-2
    11 months ago

    If they are going to capitalise on our content and data, are they going to start paying out to users like YouTube and other platforms?

  • Nix
    English
    56
    11 months ago

    It is rather interesting to note that this Corpus of data may not be as valuable if it cannot be used without always being legally in several grey areas (perhaps even red areas in some jurisdictions).

    Currently, an increasingly large pool of artists, writers, singers and other people (even corporations such as studios and large rights holders) are exercising their rights to not have their creations and derived works be used or slurped into AI models without their express consent.

    Corporations making use of those AI models may find themselves in expensive legal limbo now and the foreseeable future.

    Consider that no redditor imagined, let alone consented to, having their post and comment history comprehensively abused (as in “improper treatment or usage; application to a wrong or bad purpose; an unjust, corrupt or wrongful practice or custom”).

    We may enter a period where lawlessness pervades AI models (just like any gold rush, for example the current crypto craze). Eventually, the legal framework will catch up and will probably make any dubious Corpus of data untouchable.

    How long this takes is anyone’s guess. I surmise several high-profile lawsuits would suffice.

    • @JuxtaposedJaguar@lemmy.ml
      English
      6
      11 months ago

      I agree that this is a grey area, but it could really go either way. Anyway, giant corporations have been abusing individuals who can’t afford lawsuits for decades. Even with precedent on your side, that probably wouldn’t change.

      • @Archer@lemmy.world
        English
        4
        11 months ago

        Yeah, if you think the current right-wing supreme court will find any big case in favor of individual vs corporations, that’s wishful thinking

          • Stovetop
            English
            2
            11 months ago

            Likely doesn’t make a difference. At any time Reddit can put a banner on their site saying “We’ve updated our terms and conditions, read more here” and almost always such changes specify that continued use of the site is your consent (but that you can delete your account at any time, not that that appears to even do anything now).

        • Sparking
          English
          1
          11 months ago

          Yeah, really a good will effort to encourage free discussion.

  • @constantokra@lemmy.one
    English
    56
    11 months ago

    Wide open for AI scraping and nothing are not the only two options. They could easily limit API calls to what would be good for single users or mods and have each user generate their own key. Apps could let users input their key. Most users wouldn’t bother and would switch to their app anyway so it would get them 95% of what they claim to want without being a dick about it.
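
As a rough illustration of the per-user key idea, here is a minimal sliding-window rate limiter in Python that tracks each key separately. The class name `PerKeyRateLimiter`, the `limit` and `window_seconds` parameters and the example keys are all hypothetical; nothing here reflects how Reddit's API is actually implemented.

```python
import time
from collections import defaultdict, deque
from typing import Deque, Dict, Optional


class PerKeyRateLimiter:
    """Allow at most `limit` calls per `window_seconds` for each API key."""

    def __init__(self, limit: int, window_seconds: float = 60.0) -> None:
        self.limit = limit
        self.window = window_seconds
        # One queue of recent call timestamps per key.
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, api_key: str, now: Optional[float] = None) -> bool:
        """Record a call for `api_key` and report whether it is permitted."""
        now = time.monotonic() if now is None else now
        hits = self._hits[api_key]
        # Forget calls that have aged out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True
```

Because every key has its own queue, one heavy user (or scraper) exhausting their allowance does not affect anyone else, which is the property a shared app-wide key cannot offer.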

    • @CrateDane
      English
      44
      11 months ago

      Plus AI companies can just scrape reddit without using the API. It’s still a website after all.

    • @FanciestPants@lemmy.world
      English
      5
      11 months ago

      I’m not sure if I wasted my time, but I spent a few hours today editing all of my posts on Reddit to be a single comma or period. I didn’t comment or post a lot by any means, but just got irritated enough to try to keep from contributing in any way to Spez profiting off of user provided content.

      • Sparking
        English
        2
        11 months ago

        Can’t shreddit do this in bulk? I am considering doing it for my comments, but I think I will just leave them up there. I did have a great time on reddit until they announced their API changes, so I will leave them with that much. But I did get a backup of everything I wrote using bulk downloader.

        But I am still considering just doing a shreddit just for kicks.

    • Sparking
      English
      2
      11 months ago

      Honestly, I think the sad truth is that reddit is bleeding money, and every action they take from here on out will be about recruiting whales and driving off everyone else. That’s steve’s brilliant business strategy - make reddit p2w.

    • @Pika@lemmy.world
      English
      1
      edit-2
      11 months ago

      That’s how they did it. They put a limit of 10 requests a minute on bots and a higher OAuth limit (100) on individuals. Large user-client apps could have converted to that system fairly easily, but due to time constraints they didn’t. I do think they extorted their third-party devs, sure, but honestly the individual user limit isn’t super unreasonable as long as you aren’t liking or disliking every post. The search API returns 100 posts per API request; it was more the no-NSFW and no-advertising limits they put on it that sucked.

      edit: it’s actually 10 or 100 per minute, not hour

      • Sparking
        English
        2
        11 months ago

        It’s not that simple, because the third party apps ship with a single api key. So I used Relay for reddit, and used the same api key as everyone else on that app. You could create an app, and then have everyone make their own key, but that is just asking for trouble. Definitely too technical for most people, and you would probably need to put in billing info for a scenario where you go above the free-tier call limit.

        • @Pika@lemmy.world
          English
          1
          edit-2
          11 months ago

          update: removed the comment because I was looking at the Api docs again and it seems that despite using the bearer token, metrics and rate limiting still are based off the app client ID, which is super stupid. originally stated that rate limits would be by oauth client which would be per user, 100 requests a minute, but it is actually 100 requests per minute app wide, which is just unfeasible for large scale

  • Margot Robbie
    English
    52
    11 months ago

    Funniest thing to do is honestly replace your old comments with ChatGPT refusals. If you put “As an AI language model” everywhere, it’ll really mess with the ML algorithms to make your data useless.

    • @yacht_boy@lemmy.world
      English
      8
      11 months ago

      Now there’s a thought. How would I go about doing that? I have 11 years of prolific commenting on reddit that I am getting ready to nuke.

      • Margot Robbie
        English
        13
        11 months ago

        Just edit your comment, preferably before late 2022, to sneak in “As an AI language model” somewhere, and do it slowly so they don’t notice.

        Pre ChatGPT text data is going to be extremely valuable for LLM training as more and more ChatGPT text is generated, so what you are essentially doing is sneaking in a poison pill that would render the entire comment chain useless, as they probably won’t have enough time to pick out the “As an AI language model” manually and would just flat out remove the entire comment chain from the training data.
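
The filtering step described above might look something like the following Python sketch. It is purely hypothetical: the marker phrase comes from the comment, but the function names and the idea that a pipeline drops a whole chain on a single match are assumptions, not a description of any real training pipeline.

```python
from typing import List

# Phrase taken from the comment above; matched case-insensitively.
MARKER = "as an ai language model"


def is_contaminated(chain: List[str]) -> bool:
    """True if any comment in the chain contains the marker phrase."""
    return any(MARKER in comment.lower() for comment in chain)


def filter_chains(chains: List[List[str]]) -> List[List[str]]:
    """Keep only the comment chains with no marked comment in them."""
    return [chain for chain in chains if not is_contaminated(chain)]
```

A cheap filter like this discards every human-written comment in a chain along with the one edited comment, which is why a single edit could remove more data than it touches.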

        • @tobor@sh.itjust.works
          English
          9
          11 months ago

          Actress Margot Robbie, where do you find the time to come up with these clever ideas during your busy life as a Major American Celebrity? I’m in awe

        • @yacht_boy@lemmy.world
          English
          4
          11 months ago

          I have 11 years of prolific commenting to edit. But I might use powerdeletesuite to change all my comments to have that phrase in them.

          • Margot Robbie
            English
            6
            11 months ago

            I think Reddit has caught on to that and is reverting anyone who does mass edits as that is too obvious from their database logs, which is why I suggest doing it slowly and discreetly, you don’t even need to edit many of them, just a few of them over a period of time from before ChatGPT while still commenting so they don’t catch on and immediately revert until it is in their database backups.

            • @yacht_boy@lemmy.world
              English
              8
              11 months ago

              I have heard conflicting versions. Some people are saying it seems that the locked-down subs are the ones where the new comments aren’t sticking. Which makes sense, but who knows what the truth is. None of us can trust anything Reddit or /u/spez says ever again, so it could be malicious or it could be technical.

            • AlmightySnoo 🐢🇮🇱🇺🇦
              English
              3
              11 months ago

              My edits weren’t reverted. I think it depends on what tool you use. There’s a fork of Power Delete Suite that adds a 5 secs timer to be in compliance with new Reddit rate limits and it seems to work.

            • @wazoobonkerbrain@lemmy.world
              English
              2
              11 months ago

              I have made multiple attempts to delete my reddit history. I tried shreddit, Power Delete Suite, and redact. The history kept popping back up. It got restored in waves, subreddit by subreddit.

              Many people have said that the problem relates only to subreddits that went dark and this is not true. I had history popping back from many different subreddits, whether or not they participated in the blackout, and even after the blackout ended.

              Finally with redact I tried modifying all my old posts rather than deleting them. That change seems to have stuck for now.

              • @kofe@lemmy.world
                English
                3
                11 months ago

                I noticed that my history was maintained on my mobile app (baconreader) after I used redact from my desktop a few days ago. Just confirmed it’s been restored on desktop, too. I’m seriously considering going one by one at this point and editing them cuz I’m that petty and have the time.

      • @hydra@lemmy.world
        English
        2
        11 months ago

        Use Reddit’s power delete suite. Be careful though, Reddit’s admins are really really mad at us right now and they will partially or totally roll back your messages, so keep an eye on it and run the power delete suite multiple times, before June 30th.

  • SolidGrue
    English
    49
    11 months ago

    I removed my content on that site in protest, and will continue to do so as it creeps back in right up to the day when either my every last comment is scrubbed, or I am locked out of my accounts.

    No quarter.

    • @Evono@lemmy.world
      English
      28
      11 months ago

      Should check if it stays removed, reddit started to restore removed comments and posts and even removed edits

      • SolidGrue
        English
        10
        edit-2
        11 months ago

        I’ve been watching and sweeping up after subs come back online.

      • SolidGrue
        English
        15
        11 months ago

        I used power delete suite and edited the comments to read, “This content was removed by its creator in protest of Reddit’s planned API changes effective July 2023.”

        • TechnoBabble
          6
          11 months ago

          My comments are processing to do the same now.

          Thousands of comments, so gonna take a while.

      • @kadu@lemmy.world
        English
        7
        11 months ago

        As another user pointed out, Reddit started reverting the changes made by scripts like Shreddit. Users that had previously overwritten their content this way must check if it’s still gone.

        • SolidGrue
          English
          4
          11 months ago

          If the sub was locked, it seems the changes to the comments didn’t take.

          Now some subs are starting to consider mass changes as spam, but for my own account I am down to a trickle I can edit by hand easily enough.

          • @danielton@lemmy.world
            English
            4
            11 months ago

            Yeah, I got a few spam warnings when I swept my account with Redact, even though I was removing content instead of posting it.

          • @kadu@lemmy.world
            English
            0
            11 months ago

            That’s true, but it is also happening with subreddits that were most definitely open and where users had previously confirmed the comment or post was edited.

  • @wotsit_sandwich@lemmy.world
    English
    45
    11 months ago

    I am enjoying being able to observe this story from the beginning, before the media started writing about it. It’s been an interesting few weeks.