• @driving_crooner@lemmy.eco.br
    56
    7 months ago

    They are not talking about the training process. To combat racial bias on the training process, they insert words on the prompt, like for example “racially ambiguous”. For some reason, this time the AI weighted the inserted prompt so heavily that it made Homer from the Caribbean.

    • 520
      -22
      7 months ago

      They are not talking about the training process

      They literally say they do this “to combat the racial bias in its training data”

      to combat racial bias on the training process, they insert words on the prompt, like for example “racially ambiguous”.

      And like I said, this makes no fucking sense.

      If your training process, specifically your training data, has biases, inserting key words does not fix that issue. It literally does nothing to actually combat it. It might hide issues if the model has had sufficient training to do the job with the inserted key words, but that is neither a fix nor a way of combating the issue. It is a cheap hack that does not address the underlying training issues.

      • Primarily0617
        43
        7 months ago

        but that is not a fix

        congratulations you stumbled upon the reason this is a bad idea all by yourself

        all it took was a bit of actually-reading-the-original-post

        • 520
          -20
          7 months ago

          ?

          My position was always that this is a bad idea.

          • Primarily0617
            40
            7 months ago

            the point of the original post is that artificially fixing a bias in training data post-training is a bad idea because it ends up in weird scenarios like this one

            your comment is saying that the original post is dumb and betrays a lack of knowledge because artificially fixing a bias in training data post-training would obviously only result in weird scenarios like this one

            i don’t know what your aim is here

          • Gamma
            English
            14
            7 months ago

            You started your initial rant based on a misunderstanding of what was actually said. Stumbling into the correct answer != knowing what you’re reacting to

      • @lars@programming.dev
        28
        7 months ago

        Yes. The training data has a bias, and they are using a cheap hack (prompt manipulation) to try to patch it.

      • @phx@lemmy.ca
        13
        7 months ago

        Any training data almost certainly has biases. For a while, if you asked for pictures of people eating waffles or fried chicken, they’d very likely be black.

        Most of the pictures I tried of kid-type characters were blue-eyed.

        Then people review the output and say “hey, this might still be racist”, so they tweak things to “diversify” the output. This is likely the result of that, where they’ve “fixed” one “problem” and created another.

        Behold, Homer in brownface. D’oh!

      • @jacksilver@lemmy.world
        14 months ago

        So the issue is not that they don’t have diverse training data; the issue is that not all things get equal representation. As a result, their trained model will be biased toward producing a white person when you ask generically for a “person”. To prevent it from always spitting out a white person when someone prompts the model for a generic person, they inject additional words into the prompt, like “racially ambiguous”. This occasionally encourages or forces more diversity in the results. The issue is that these models are too complex for these kinds of approaches to work seamlessly.
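
To make the mechanism concrete, here is a minimal sketch of what "injecting words into the prompt" could look like. Everything in it is an assumption for illustration: the function name `inject_modifier`, the `rate` parameter, and the modifier list are hypothetical, and nothing in the thread says how any real service actually implements this. The only modifier taken from the discussion is "racially ambiguous".

```python
import random

# Hypothetical modifier list; the thread only mentions "racially ambiguous".
MODIFIERS = ["racially ambiguous"]


def inject_modifier(prompt, rate=0.5, rng=None):
    """Occasionally append a modifier to the user's prompt.

    `rate` is an assumed probability of injection, not a known setting.
    The wrapper never inspects the prompt, so it cannot tell a generic
    "person" from a specific, well-known character.
    """
    if rng is None:
        rng = random.Random()
    # rng.random() is in [0.0, 1.0), so rate=1.0 always injects
    # and rate=0.0 never does.
    if rng.random() < rate:
        return f"{prompt}, {rng.choice(MODIFIERS)}"
    return prompt


# The image model only ever sees the rewritten text.
print(inject_modifier("a person eating breakfast", rate=1.0))
print(inject_modifier("Homer Simpson", rate=1.0))
```

Because the modifier is appended blindly, a prompt naming a specific character gets the same treatment as a generic one, which is consistent with the Homer result discussed above. The training data and the model weights are untouched, so the underlying bias is still there.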