  • Yeah, it really is fascinating. It follows some sort of recipe to try to solve the problem, as if it’s trained to work a bit like an automatic algebra system.

    I think they employed a lot of people to write generators for variants of common logical puzzles, e.g. river crossings with varying boat capacities and constraints, generating both the puzzle and the corresponding step-by-step solution with “reasoning” and a re-print of the state of the items at every step, and all that.

    It seems to me that their thinking is that successive parroting can amount to reasoning, if it’s parroting well enough. I don’t think it can. They have this one-path approach, where it just tries doing steps and representing state, always trying the same thing.

    What they need for this problem is a different kind of step: reduction (the duck cannot be left unsupervised -> the duck must be taken with me on every trip -> rewrite the problem without the duck and with boat capacity reduced by 1 -> solve -> rewrite the solution with “take the duck with you” added to every trip).
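
    That transform is mechanical enough to write down. A minimal sketch in Python (hypothetical names, nobody’s actual machinery; solve() stands in for whatever solves the reduced puzzle and returns a list of trips, each trip a set of carried items):

    ```python
    # Hypothetical sketch of the reduction step: pull the always-carried item
    # out of the puzzle, solve the smaller puzzle with one less seat, then
    # splice the item back into every trip.
    def solve_with_reduction(items, boat_capacity, always_carry, solve):
        reduced = [i for i in items if i != always_carry]
        trips = solve(reduced, boat_capacity - 1)  # one seat stays reserved
        return [trip | {always_carry} for trip in trips]
    ```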

    But if they add this, then there are two possible kinds of step it can take at every point, and this thing is far too slow to brute-force the right sequence. They may get it to solve my duck variant, but at the expense of making it fail a lot of other variants.

    The other problem is that even seemingly elementary reasoning involves a great many applications of basic axioms. This is what doomed symbol-manipulation “AI” in the past, and this is what is dooming it now.


  • Not really. Here’s the chain-of-word-vomit that led to the answers:

    https://pastebin.com/HQUExXkX

    Note that in its “impossible” answer it correctly echoes that you can take one other item with you, and does not bring the duck back (while the old overfitted GPT-4 obsessively brought items back). In the duck + 3 vegetables variant, a correct answer does appear in the wordvomit, but not being an AI enthusiast, it can’t actually pick that answer out (a problem it shares with the monkeys on typewriters).

    I’d say it clearly isn’t ignoring the prompt or differences from the original river crossings. It just can’t actually reason, and the problem requires a modicum of reasoning, much as unloading groceries from a car does.


  • “It’s a failure mode that comes from pattern matching without actual reasoning.”

    Exactly. Also, looking at its chain-of-wordvomit (which apparently I can’t share other than by cutting and pasting it somewhere), I don’t think this is the same as GPT-4 overfitting to the original river crossing and always bringing items back needlessly.

    Note also that in one example it discusses moving the duck and another item across the river (so “up to two other items” works); it is not ignoring the prompt, and it isn’t even trying to bring anything back. And its answer (calling it impossible) has nothing to do with the original.

    In the other one it does bring items back; it tries different orders and even finds an order that actually works (with two unnecessary moves), but because it isn’t an AI fanboy reading tea leaves, it still gives the wrong answer.

    Here’s the full logs:

    https://pastebin.com/HQUExXkX

    Content warning: AI wordvomit so bad that it gets folded away and hidden in a Google tool.


  • Yeah, exactly. There’s no trick to it at all, unlike the original puzzle.

    I also tested OpenAI’s offerings a few months back with similarly nonsensical results: https://awful.systems/post/1769506

    The all-vegetables, no-duck variant is solved correctly now, but I doubt that is due to improved reasoning as such; I think they may have augmented the training data with some variants of the river crossing. It is one of the best-known puzzles, and various people have been posting hilarious bot failures with variants of it, so it wouldn’t be unexpected for their training-data augmentation to include river-crossing variants.

    Of course, there are very many ways in which the puzzle can be modified, and their augmentation would only cover the obvious ones, like varying which items can be left with which items, or the number of spots on the boat. Any variant of that shape falls to plain state-space search; see the sketch below.
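
    For contrast, a brute-force breadth-first search solves the whole family mechanically, with no “reasoning” at all. A rough sketch (parameter names are made up: forbidden lists pairs that must not be left alone together, capacity is boat seats besides the farmer):

    ```python
    from collections import deque
    from itertools import combinations

    def solve(items, forbidden, capacity):
        # A state is (left bank, right bank, farmer side), farmer 0 = left.
        def safe(bank, farmer_here):
            return farmer_here or not any({a, b} <= bank for a, b in forbidden)

        start = (frozenset(items), frozenset(), 0)
        queue, seen = deque([(start, [])]), {start}
        while queue:
            (left, right, side), path = queue.popleft()
            if not left and side == 1:           # everything (and farmer) across
                return path
            here = left if side == 0 else right
            for k in range(capacity + 1):        # carry 0..capacity items
                for cargo in map(frozenset, combinations(sorted(here), k)):
                    new_left = left - cargo if side == 0 else left | cargo
                    new_right = right | cargo if side == 0 else right - cargo
                    # the bank the farmer leaves behind must stay safe
                    if not safe(new_left, side == 1) or not safe(new_right, side == 0):
                        continue
                    state = (new_left, new_right, 1 - side)
                    if state not in seen:
                        seen.add(state)
                        move = (sorted(cargo), "->" if side == 0 else "<-")
                        queue.append((state, path + [move]))
        return None                              # provably impossible

    # The classic wolf/goat/cabbage puzzle, one seat in the boat:
    print(solve({"wolf", "goat", "cabbage"},
                [("wolf", "goat"), ("goat", "cabbage")], 1))
    ```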


  • Full-time AI grift jobs would of course be forever closed to any AI whistleblower. There are still plenty of other jobs.

    I have participated in the hiring process, and I can tell you that at your typical huge corporation the recruiters / HR are too inept to notice that you are a whistleblower, and they don’t give a shit anyway. And among the rank and file who will actually google you, plenty of people dislike AI.

    At the rank and file level, the only folks who actually give a shit who you are are people who will have to work with you. Not the background check provider, not the recruiter.


  • “Using tools from physics to create something that is popular but unrelated to physics is enough for the Nobel Prize in Physics?”

    If only; it’s not even that! Neither Boltzmann machines nor Hopfield networks led to anything used in the modern spam- and deepfake-generating AI, nor in image-recognition AI, or the like. This is the kind of stuff that struggles to get above 60% accuracy on MNIST (handwritten digits).
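
    (For anyone who hasn’t met one: a Hopfield network is a little associative memory. A toy sketch at a scale that flatters it, illustrative only:)

    ```python
    import numpy as np

    # Toy Hopfield network: store two ±1 patterns with the Hebbian
    # outer-product rule, then recall one from a corrupted cue.
    def train(patterns):
        n = patterns.shape[1]
        W = sum(np.outer(p, p) for p in patterns).astype(float)
        np.fill_diagonal(W, 0)              # no self-connections
        return W / n

    def recall(W, state, steps=10):
        state = state.astype(float)
        for _ in range(steps):              # synchronous sign updates
            new = np.sign(W @ state)
            new[new == 0] = 1.0
            if np.array_equal(new, state):  # settled into a stored pattern
                break
            state = new
        return state

    patterns = np.array([[1, -1, 1, -1, 1, -1, 1, -1],
                         [1, 1, 1, 1, -1, -1, -1, -1]])
    W = train(patterns)
    cue = patterns[0].copy()
    cue[:2] *= -1                           # corrupt two bits
    print(recall(W, cue))                   # falls back to patterns[0]
    ```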

    Hinton went on to do different work based on backpropagation and gradient descent, on newer computers than the people who came up with those methods long before him ever had, and he got the Turing Award for that. It’s a wee bit controversial, because of the whole “people did it before, but on worse computers, and so they didn’t get any award” thing. But at least that award is for work on the path leading to modern AI, and not for work on the vast list of things that just didn’t work, where it’s extremely hard to explain why you would even think they would work in the first place.


  • AI peddlers just love any “critique” that presumes the AI is great at something.

    Safety concern that LLMs would go Skynet? Say no more, I hear you, and I’ll bring it up first thing in Congress.

    Safety concern that terrorists might use it to make bombs? Say no more! I agree that the AI is so great for making bombs! We’ll restrict it to keep people safe!

    It sounds too horny, you say? Yeah, good point, I love it. Our technology is better than sex itself! We’ll keep it SFW to keep mankind from going extinct due to robosexuality!


  • I love the “criti-hype”. AI peddlers absolutely love any concerns that imply that the AI is really good at something.

    Safety concern that LLMs would go Skynet? Say no more, I hear you, and I’ll bring it up in Congress!

    Safety concern that terrorists might use it to make bombs? Say no more! I agree that the AI is so great for making bombs! We’ll restrict it to keep people safe!

    Sexual roleplay? Yeah, good point, I love it. Our technology is better than sex itself! We’ll restrict it to keep mankind from falling into the sin of robosexuality and going extinct! I mean, of course, you can’t restrict something like that, but we’ll try, at least until we release a hornybot.

    But raise any concern about language modeling being fundamentally not the right tool for some job (do you want to cite a paper, or do you want to sample from the underlying probability distribution?), and it’s “hey, hey, how’s about we talk about the Skynet thing instead?”