AI Agents Are Terrible Freelance Workers

By Khalid Nasir
October 29, 2025

Even the best artificial intelligence agents are pretty hopeless at online freelance work, according to an experiment that challenges the idea of AI replacing office workers en masse.

The Remote Labor Index, a new benchmark developed by researchers at data annotation company Scale AI and the Center for AI Safety (CAIS), a nonprofit, measures the ability of frontier AI models to automate economically valuable work.

The researchers gave several leading AI agents a range of simulated freelance work and found that even the best could perform less than 3 percent of the work, earning $1,810 out of a possible $143,991. The researchers looked at several tools and found the most capable to be Manus from a Chinese startup of the same name, followed by Grok from xAI, Claude from Anthropic, ChatGPT from OpenAI, and Gemini from Google.

“I should hope this gives much more accurate impressions as to what’s going on with AI capabilities,” says Dan Hendrycks, director of CAIS. He adds that while some agents have improved significantly over the past year or so, that does not mean this will continue at the same rate.

Impressive AI advances have led to speculation about AI soon surpassing human intelligence and replacing vast numbers of workers. In March, Dario Amodei, CEO of Anthropic, suggested that 90 percent of coding work would be automated within a matter of months.

Previous waves of AI have inspired misplaced predictions about job displacement, for instance regarding the imminent replacement of radiologists with AI algorithms.

The researchers generated a range of freelance tasks through verified Upwork workers. The tasks span a range of work including graphic design, video editing, game development, and administrative chores like scraping data. They combined a description of each job with a directory of files needed to perform the work and an example of a finished project produced by a human.

Hendrycks says that while AI models have gotten better at coding, math, and logical reasoning in recent years, they still struggle to use different tools and to perform complex tasks that involve numerous steps. “They don’t have long-term memory storage and can’t do continual learning from experiences. They can’t pick up skills on the job like humans,” he says.

The analysis offers a counterpoint to a benchmark of economic work offered in September by OpenAI called GDPval, which purports to measure economically valuable work. According to GDPval, frontier AI models such as GPT-5 are approaching human abilities on 220 tasks across a range of office jobs. OpenAI did not provide a comment.

