Once upon a time in the 18th century, a fantastic chess-playing machine known as the Mechanical Turk was exhibited around the world, stunning audiences with its ability to beat skilled players and heads of state like Napoleon Bonaparte. Years later it transpired that the machine’s extraordinary feats were only possible because a human was hiding inside the machine, making all the moves.
Today, a similar phenomenon goes on behind the scenes of artificial intelligence development: Humans label much of the data used to train AI models, and they often babysit those models in the wild too, meaning our modern-day machinery isn’t as fully automated as we think. Yet now comes a twist in the tale: AI systems can produce content so humanlike that some of those behind-the-scenes humans are training new AI with old AI.
AI models are often described as black boxes, so what happens when one black box teaches another? The resulting system becomes even harder to scrutinize, and any biases it inherits become more entrenched.
A new study from academics at Switzerland’s EPFL suggested that workers on Amazon.com Inc.’s MTurk — a crowdsourcing job platform named after the original Mechanical Turk — have started using ChatGPT and other large language models to automate their work. The researchers estimated that 33% to 46% of them were using the AI tools when carrying out their tasks.
Normally, companies and academics hire MTurk workers because of their ability to do things that computers cannot, like label an image, rate an ad or answer survey questions. Their work is often used to train algorithms to do things like recognize photos or read receipts.
Nearly all tasks on MTurk pay tiny amounts. West Virginia-based Sherry Stanley, who was an MTurk worker for more than seven years up until recently, said she’d seen requesters offer to pay just 50 cents for three paragraphs of written work. Turkers can hike up their hourly takings from $3 to around $30 if they use specialized software to speed up their tasks.
The problem with using ChatGPT, though, is that it isn’t just streamlining the work, it’s doing it.
There are several implications. For example, this behavior impacts the 250,000 or so people, mostly in the US, who are estimated to be working on the MTurk platform. “Scam workers can just exploit the whole system,” says Stanley. “And then the good workers are the ones that suffer the consequences.”
Companies that hire Turkers pay them based on the number of tasks they complete and the quality of their work. If some are producing work faster thanks to software that mimics their human abilities, that puts greater pressure on MTurk workers to increase their speed and output overall, something other professionals are likely to experience too with the advent of generative AI.
Another consequence is skewed results for academic researchers who use MTurk to carry out studies, and for companies that hire Turkers to help train AI systems. If less human input goes into those processes, then the algorithms and scientific studies that use crowdsourcing will get a more warped reflection of reality.
“Human data is hugely important,” says Veniamin Veselovsky, an author on the EPFL research paper. “Psychology, computational social science, sociology all depend on it to better understand ‘us.’”
If more crowd workers use ChatGPT, they’ll also add to the growing volume of synthetic, AI-derived content on the web. Large language models developed by companies like OpenAI and Google are poised to play a larger role in our so-called information ecosystem, adding to the growing amounts of synthetic data that companies are producing to teach AI models.
Overall, that’ll make the internet a potentially more confusing place to learn about the world. Between the bots on Twitter and AI-generated ads, it’s becoming harder to find content on the web that comes from real, live humans. That shift threatens to reinforce prejudices known to have been baked into some language models and AI systems.
“It opens up a series of ethical questions,” says Veselovsky. “These models can represent specific viewpoints, opinions, and ideologies. This may lead to a lack of diversity in the models we are training.”
In other words, if biased AI systems are training other AI systems, we’ll find ourselves caught in a loop of dodgy information whose origins become harder and harder to decipher. The humans who are working behind the scenes of AI are integral to its development, but it would be good if they could stay human for as long as possible.