This is the paper for an MIT study. Three groups of participants were tasked with writing an essay. One of the groups was allowed to use an LLM. These were the results:
The participants' mental activity was also checked repeatedly via EEG. As per the paper's abstract:
EEG revealed significant differences in brain connectivity: Brain-only participants exhibited the strongest, most distributed networks; Search Engine users showed moderate engagement; and LLM users displayed the weakest connectivity. Cognitive activity scaled down in relation to external tool use.
Oh neat, a new subclass of brainrot has been observed and defined.
I’ve been saying this for years:
One way to win the Turing test is to make every human so much more stupid that it becomes easier for an ‘AI’ to pass the test.
Now that’s out of the box thinking!
Unfortunately it is a massively negative-sum solution to the game-theoretic problem of passing a Turing test, but it is a solution.
Roughly on par with winning an arcade fighting game by actually punching your local opponent irl.
Very interesting, emphasis mine:
"findings support the view that external support tools restructure not only task performance but also the underlying cognitive architecture. The Brain-only group leveraged broad, distributed neural networks for internally generated content; the Search Engine group relied on hybrid strategies of visual information management and regulatory control; and the LLM group optimized for procedural integration of AI-generated suggestions.
These distinctions carry significant implications for cognitive load theory, the extended mind hypothesis [102], and educational practice. As reliance on AI tools increases, careful attention must be paid to how such systems affect neurocognitive development, especially the potential trade-offs between external support and internal synthesis."
The focus on agency and ownership is also very interesting: regardless of the scored outcome, or of how one might think the work changed them (or not), do they themselves feel it is their work?
do they themselves feel it is their work?
Why should they care? They’ve gone into this artificial situation for money and they’re just copying/writing words to get paid. It’s similar to school in that sense. Pointless work in order to please people in power.
Maybe the actual problem is sorting and filtering people based on pointless essays. It selects for the most privileged and obedient.
The biggest flaw in this study is that the LLM group wasn’t ~~allowed~~ explicitly permitted to edit their essays and was explicitly forbidden from altering the parameters. Of course brain activity looks low if you just copy-paste a bot’s output without thinking. That’s not “using a tool”; that’s outsourcing cognition. If you don’t bother to review, iterate, or humanize the AI’s output, then yeah… it’s a self-fulfilling prophecy: no thinking in, no thinking out.
In any real academic setting, “fire-and-forget” turns into “fuck around and find out” pretty quick.
LLMs aren’t the problem; they’re tools. Even journal authors use them. Blaming the tech instead of the lazy-ass operator is like saying:
These people got swole by hand-sawing wood, but this pudgy fucker used a power saw to cut 20 pieces faster; clearly he’s doing it wrong.
No, he’s just using better tools. The problem is if he can’t build a chair afterward.
I hate to have to be the one to tell you this, but… just copy-pasting is how a lot of people use LLMs.
Yep, and they fuck themselves over academically because lecturers notice how their time spent in online-learning platforms doesn’t match their assessment submissions.
Students inevitably get questioned about their content, only for the lecturer to discover they don’t know shit, because they cheated. Had the student actually used it properly, they might know enough about the content to scrape by.
In any case, I’ve seen this happen five times lol. One of them was because my lecturer asked one of my classmates what ‘frivolous’ and ‘multifaceted’ meant, and she fumbled before saying she’d used a thesaurus.
She was then asked in plain speech what she had intended to say, and ended up with an “I don’t know” - boom. Academic integrity compromised, an investigation into her Learnline metrics, and cross-referencing of her work from two years earlier. Termination of her course followed two weeks later.
Most students use it; the lecturers know this. The difference is whether people use it as a tool, or a replacement.
In any case, essays are supposed to be a metric of knowledge and evidence of independent research. In practice? A good essay really only reflects one thing - the student is good at writing essays. I know people in early childhood education who suffered through university, who have more intuition and emotional intelligence than people who got by on academic prowess.
The biggest flaw in this study is that the LLM group wasn’t allowed to edit their essays
I didn’t read the whole thing, only skimmed through the protocol. The only thing I spotted was:
“participants were instructed to pick a topic among the proposed prompts, and then to produce an essay based on the topic’s assignment within a 20 minutes time limit. Depending on the participant’s group assignment, the participants received additional instructions to follow: those in the LLM group (Group 1) were restricted to using only ChatGPT, and explicitly prohibited from visiting any websites or other LLM bots. The ChatGPT account was provided to them. They were instructed not to change any settings or delete any conversations.”
which I don’t interpret as no editing. Can you please share where you found that out?
The biggest flaw in this study is that the LLM group wasn’t allowed to edit their essays
Lol, oops, I got poo brain right now. I inferred they couldn’t edit because the methodology doesn’t say whether revisions were allowed.
What is clear is that they weren’t permitted to edit the prompt or add personalization details, which seems to imply the researchers weren’t interested in understanding how a participant might use it in a real setting, just in passive output. This alone undermines the premise.
It makes it hard to assess whether the observed cognitive deficiency was due to LLM assistance, or the method by which it was applied.
The extent of our understanding of the methodology is that they couldn’t delete chats. If participants were only permitted a one-shot generation per prompt, then there’s something wrong.
But just as concerning is the fact that it isn’t explicitly stated.