This report identifies vulnerabilities in GPT-4, o1, and o3 models that allow disallowed content generation, revealing weaknesses in current alignment mechanisms.
Am I the only one who feels it's a bit strange to have such safeguards in an AI model? I know most models are only accessible online, but some models are available to download and run locally, right? So what prevents me from just doing that if I wanted to get around the safeguards? I guess maybe they're just doing it so that they can't somehow be held legally responsible for anything the AI model might say?
The idea is that they're marketable worker replacements.
If you have a call center you want to switch to AI, it's easy enough to make the models pull up relevant info. It's much harder to stop them from being misused.
If your call center gets slammed for using racial slurs, that's an issue.
Remember, they're trying to sell AI as a drop-in worker replacement.