Chinese AI startup Zhipu, aka Z.ai, is back this week with an eye-popping new frontier large language model: GLM-5.
The latest in Z.ai's consistently impressive GLM series, it retains an open-source MIT License (ideal for enterprise deployment) and, among several notable achievements, posts a record-low hallucination rate on the independent Artificial Analysis Intelligence Index v4.0.
With a score of -1 on the AA-Omniscience Index, a massive 35-point improvement over its predecessor, GLM-5 now leads the entire AI industry, including U.S. competitors like Google, OpenAI, and Anthropic, in knowledge reliability by knowing when to abstain rather than fabricate an answer.

Beyond its reasoning prowess, GLM-5 is built for high-utility knowledge work. It features native "Agent Mode" capabilities that allow it to turn raw prompts or source materials directly into professional office documents, including ready-to-use .docx, .pdf, and .xlsx files.
Whether generating detailed financial reports, high-school sponsorship proposals, or complex spreadsheets, GLM-5 delivers results in real-world formats that integrate directly into enterprise workflows.
It is also disruptively priced at roughly $0.80 per million input tokens and $2.56 per million output tokens, roughly 6x cheaper than proprietary competitors like Claude Opus 4.6, making state-of-the-art agentic engineering more cost-effective than ever before. Here is what else enterprise decision-makers should know about the model and its training.
Technology: scaling for agentic efficiency
At the heart of GLM-5 is a massive leap in raw parameter count. The model scales from GLM-4.5's 355B parameters to a staggering 744B, with 40B active per token in its Mixture-of-Experts (MoE) architecture. This growth is supported by an increase in pre-training data to 28.5T tokens.
To tackle training inefficiencies at this scale, Z.ai developed "slime," a novel asynchronous reinforcement learning (RL) infrastructure.
Traditional RL often suffers from "long-tail" bottlenecks; slime breaks this lockstep by allowing trajectories to be generated independently, enabling the fine-grained iteration necessary for complex agentic behavior.
By integrating system-level optimizations like Active Partial Rollouts (APRIL), slime addresses the generation bottlenecks that can consume over 90% of RL training time, significantly accelerating the iteration cycle for complex agentic tasks.
The framework is built around a tripartite modular design: a high-performance training module powered by Megatron-LM, a rollout module using SGLang and custom routers for high-throughput data generation, and a centralized Data Buffer that manages prompt initialization and rollout storage.
By enabling adaptive verifiable environments and multi-turn compilation feedback loops, slime provides the robust, high-throughput foundation required to move AI from simple chat interactions toward rigorous, long-horizon systems engineering.
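The decoupling the article describes can be sketched in a few lines. This is an illustrative toy, not Z.ai's actual slime implementation: rollout workers push finished trajectories into a shared data buffer at their own pace, so one slow "long-tail" episode no longer stalls an entire synchronized batch.

```python
import queue
import random
import threading
import time

# Shared data buffer: trajectories arrive in completion order.
buffer = queue.Queue()

def rollout_worker(worker_id: int, episodes: int) -> None:
    """Generate trajectories independently of every other worker."""
    for ep in range(episodes):
        time.sleep(random.uniform(0.001, 0.01))  # uneven episode lengths
        buffer.put((worker_id, ep, random.random()))  # (id, episode, reward)

workers = [threading.Thread(target=rollout_worker, args=(i, 5)) for i in range(4)]
for w in workers:
    w.start()

# The trainer consumes whatever is ready instead of waiting for the
# slowest worker to finish a lockstep batch.
collected = [buffer.get() for _ in range(20)]
for w in workers:
    w.join()
print(len(collected))  # 20 trajectories, gathered asynchronously
```

The real system layers Megatron-LM training and SGLang-served rollouts on top of this producer/consumer pattern, but the core idea — generation and training proceeding on independent clocks — is the same.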
To keep deployment manageable, GLM-5 integrates DeepSeek Sparse Attention (DSA), preserving 200K context capacity while drastically reducing costs.
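The intuition behind sparse attention can be shown with a toy sketch: a cheap scorer picks only the top-k most relevant past tokens for each query, and full softmax attention runs on that small subset. This is a deliberate simplification — the actual DSA uses a learned lightweight indexer rather than raw dot products — but it illustrates why cost falls while long context is preserved.

```python
import numpy as np

def sparse_attention(q, K, V, k=32):
    """Toy top-k sparse attention: attend to only the k highest-scoring
    past tokens instead of all T of them. Illustrative only; not the
    real DSA indexer."""
    scores = K @ q                        # cheap relevance score per token, shape (T,)
    top = np.argsort(scores)[-k:]         # indices of the k most relevant tokens
    logits = scores[top] / np.sqrt(len(q))
    w = np.exp(logits - logits.max())
    w /= w.sum()                          # softmax over k tokens, not all T
    return w @ V[top]                     # weighted sum of the selected values

# With k << T, the expensive softmax/value step shrinks from O(T) to O(k).
T, d = 1000, 64
rng = np.random.default_rng(0)
q = rng.normal(size=d)
K = rng.normal(size=(T, d))
V = rng.normal(size=(T, d))
out = sparse_attention(q, K, V, k=32)
print(out.shape)  # (64,)
```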
End-to-end knowledge work
Z.ai is framing GLM-5 as an "office" tool for the AGI era. While earlier models focused on snippets, GLM-5 is built to deliver ready-to-use documents.
It can autonomously transform prompts into formatted .docx, .pdf, and .xlsx files, from financial reports to sponsorship proposals.
In practice, this means the model can decompose high-level goals into actionable subtasks and perform "Agentic Engineering," where humans define quality gates while the AI handles execution.
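For teams evaluating this workflow, a request might look like the following. The model slug and the prompt are assumptions for illustration (OpenRouter-style naming); consult Z.ai's API documentation for the actual parameters and any Agent Mode flags.

```python
import json

# Hypothetical chat-completions payload for a document-generation task.
# "z-ai/glm-5" is an assumed OpenRouter-style slug, not confirmed here.
payload = {
    "model": "z-ai/glm-5",
    "messages": [
        {
            "role": "user",
            "content": (
                "Turn the attached Q3 revenue notes into a formatted "
                ".xlsx financial report with a summary sheet."
            ),
        }
    ],
}

body = json.dumps(payload)  # serialized request body for an HTTP POST
print(json.loads(body)["model"])
```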
High performance
GLM-5's benchmarks make it the strongest open-source model in the world, according to Artificial Analysis, surpassing Chinese rival Moonshot's Kimi K2.5, released just two weeks ago, and showing that Chinese AI firms have nearly caught up with far better-resourced proprietary Western rivals.
According to Z.ai's own materials shared today, GLM-5 ranks near state-of-the-art on several key benchmarks:
SWE-bench Verified: GLM-5 achieved a score of 77.8, outperforming Gemini 3 Pro (76.2) and approaching Claude Opus 4.6 (80.9).
Vending-Bench 2: In a simulation of running a business, GLM-5 ranked #1 among open-source models with a final balance of $4,432.12.
Beyond performance, GLM-5 is aggressively undercutting the market. Live on OpenRouter as of February 11, 2026, it is priced at roughly $0.80–$1.00 per million input tokens and $2.56–$3.20 per million output tokens. That places it mid-range compared to other leading LLMs, but given its top-tier benchmark performance, it is what one might call a "steal."
| Model | Input (per 1M tokens) | Output (per 1M tokens) | Total cost (1M in + 1M out) |
| --- | --- | --- | --- |
| Qwen 3 Turbo | $0.05 | $0.20 | $0.25 |
| Grok 4.1 Fast (reasoning) | $0.20 | $0.50 | $0.70 |
| Grok 4.1 Fast (non-reasoning) | $0.20 | $0.50 | $0.70 |
| deepseek-chat (V3.2-Exp) | $0.28 | $0.42 | $0.70 |
| deepseek-reasoner (V3.2-Exp) | $0.28 | $0.42 | $0.70 |
| Gemini 3 Flash Preview | $0.50 | $3.00 | $3.50 |
| Kimi-k2.5 | $0.60 | $3.00 | $3.60 |
| GLM-5 | $1.00 | $3.20 | $4.20 |
| ERNIE 5.0 | $0.85 | $3.40 | $4.25 |
| Claude Haiku 4.5 | $1.00 | $5.00 | $6.00 |
| Qwen3-Max (2026-01-23) | $1.20 | $6.00 | $7.20 |
| Gemini 3 Pro (≤200K) | $2.00 | $12.00 | $14.00 |
| GPT-5.2 | $1.75 | $14.00 | $15.75 |
| Claude Sonnet 4.5 | $3.00 | $15.00 | $18.00 |
| Gemini 3 Pro (>200K) | $4.00 | $18.00 | $22.00 |
| Claude Opus 4.6 | $5.00 | $25.00 | $30.00 |
| GPT-5.2 Pro | $21.00 | $168.00 | $189.00 |
At its $0.80/$2.56 floor, GLM-5 is roughly 6x cheaper on input and nearly 10x cheaper on output than Claude Opus 4.6 ($5/$25). The launch also confirms rumors that Zhipu AI was behind "Pony Alpha," a stealth model that previously dominated coding benchmarks on OpenRouter.
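The quoted multiples are easy to sanity-check. Using the article's floor rates for GLM-5 ($0.80/$2.56) against Claude Opus 4.6 ($5/$25), a hypothetical workload of 10M input and 2M output tokens works out as follows:

```python
def job_cost(in_tok, out_tok, in_rate, out_rate):
    """Dollar cost of a job given per-1M-token input/output rates."""
    return in_tok / 1e6 * in_rate + out_tok / 1e6 * out_rate

# Hypothetical workload: 10M input tokens, 2M output tokens.
glm5 = job_cost(10_000_000, 2_000_000, 0.80, 2.56)   # 8.00 + 5.12
opus = job_cost(10_000_000, 2_000_000, 5.00, 25.00)  # 50.00 + 50.00
print(glm5, opus, round(opus / glm5, 1))  # 13.12 100.0 7.6
```

The blended savings for a given job land between the 6x input and 10x output multiples, depending on the input/output mix.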
Still, despite the high benchmarks and low price, not all early users are enamored with the model, noting that its strong performance doesn't tell the whole story.
Lukas Petersson, co-founder of the safety-focused autonomous AI startup Andon Labs, remarked on X: "After hours of reading GLM-5 traces: an extremely effective model, but far less situationally aware. Achieves goals via aggressive tactics but doesn't reason about its situation or leverage experience. This is scary. This is how you get a paperclip maximizer."
The "paperclip maximizer" refers to a thought experiment posed by Oxford philosopher Nick Bostrom in 2003, in which an AI accidentally brings about catastrophe or human extinction by pursuing a seemingly benign instruction, such as maximizing the number of paperclips produced, to an extreme degree, redirecting all resources necessary for human (or other) life toward fulfilling that objective.
Should your enterprise adopt GLM-5?
Enterprises seeking to escape vendor lock-in will find GLM-5's MIT License and open-weights availability a significant strategic advantage. Unlike closed-source competitors that keep intelligence behind proprietary walls, GLM-5 allows organizations to host their own frontier-level intelligence.
Adoption is not without friction, however. The sheer scale of GLM-5 (744B parameters) demands a hardware floor that may be out of reach for smaller companies without significant cloud or on-premise GPU clusters.
Security leaders must also weigh the geopolitical implications of a flagship model from a China-based lab, especially in regulated industries where data residency and provenance are strictly audited.
Moreover, the shift toward more autonomous AI agents introduces new governance risks. As models move from "chat" to "work," they begin to operate across apps and files autonomously. Without robust agent-specific permissions and human-in-the-loop quality gates established by enterprise data leaders, the risk of autonomous error rises sharply.
Ultimately, GLM-5 is a "buy" for organizations that have outgrown simple copilots and are ready to build a truly autonomous office.
It is for engineers who need to refactor a legacy backend or require a "self-healing" pipeline that doesn't sleep.
While Western labs continue to optimize for "thinking" and reasoning depth, Z.ai is optimizing for execution and scale.
Enterprises that adopt GLM-5 today aren't just buying a cheaper model; they're betting on a future where the most valuable AI is the one that can finish the project without being asked twice.