Research Findings:
- reCAPTCHA v2 is not effective in preventing bots and fraud, despite its intended purpose
- reCAPTCHA v2 can be defeated by bots 70-100% of the time
- reCAPTCHA v3, the latest version, is also vulnerable to attacks and has been beaten 97% of the time
- reCAPTCHA interactions impose a significant cost on users, with an estimated 819 million hours of human time spent on reCAPTCHA over 13 years, which corresponds to at least $6.1 billion USD in wages
- Google has potentially profited $888 billion from cookies [created by reCAPTCHA sessions] and $8.75–32.3 billion per sale of their total labeled data set
- Google should bear the cost of detecting bots, rather than shifting it to users
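As a rough sanity check on the time and wage figures above, the two numbers together imply an hourly rate. This is a minimal sketch that uses only the 819 million hours, 13 years, and $6.1 billion quoted in the list. The variable and function names are illustrative, and it does not reproduce the paper's actual wage model.

```python
# Figures quoted in the summary above; everything else is derived from them.
HOURS_SPENT = 819e6      # human hours spent on reCAPTCHA over the period
WAGE_COST_USD = 6.1e9    # lower-bound wage cost quoted in the summary
YEARS = 13               # length of the period in years


def implied_hourly_wage(total_cost, total_hours):
    """Hourly rate implied by a total cost and a total number of hours."""
    return total_cost / total_hours


hourly = implied_hourly_wage(WAGE_COST_USD, HOURS_SPENT)
hours_per_year = HOURS_SPENT / YEARS

print(f"Implied hourly wage: ${hourly:.2f}")
print(f"Average hours per year: {hours_per_year / 1e6:.0f} million")
```

With these inputs the implied rate works out to roughly $7.45 per hour, spread over an average of about 63 million hours per year.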
“The conclusion can be extended that the true purpose of reCAPTCHA v2 is a free image-labeling labor and tracking cookie farm for advertising and data profit masquerading as a security service,” the paper declares.
In a statement provided to The Register after this story was filed, a Google spokesperson said: “reCAPTCHA user data is not used for any other purpose than to improve the reCAPTCHA service, which the terms of service make clear. Further, a majority of our user base have moved to reCAPTCHA v3, which improves fraud detection with invisible scoring. Even if a site were still on the previous generation of the product, reCAPTCHA v2 visual challenge images are all pre-labeled and user input plays no role in image labeling.”
It’s also worth noting that Google has always been extremely open about the fact that they use reCAPTCHA for that purpose. It’s never been a secret.
Their service to website owners is a meaningful reduction in the effectiveness of bots in places where bots are harmful. The website’s service to you is the content that reCAPTCHA is being used to protect. The stuff that has reCAPTCHA on it is stuff like games where there’s a competitive advantage, search engines where there’s a meaningful cost to heavy bot use, and login pages where there’s a real security cost to mass bot use. I use a VPN, which increases the rate of captchas a lot, and I personally think it’s a pretty reasonable way to do things.