LLMs are solving the MCAT, the bar exam, the SAT, etc. like they’re nothing. At this point their performance is superhuman. However, they’ll often trip on super simple common-sense questions, and they’ll struggle with creative thinking.

Is this literally proof that standardized tests are not a good measure of intelligence?

  • FaceDeer@fedia.io
    4 months ago

    It kind of bothers me that we work hard on making AIs intelligent, and then when one actually starts performing well we go “oh, the test must be bad, let’s change it to make sure the AI still scores poorly compared to humans.” I agree that tests are generally bad, but this makes one of the biases we build into them obvious.

    • bstix
      4 months ago

      Eventually we will be too dumb to tell if it is smarter than us regardless of the tests that we invent.