
Archer® Proves Purpose-Built AI Beats General-Purpose LLMs on Regulatory Change Management: 95% Verified Accuracy, 80x Faster, 92% Lower Cost


In a head-to-head benchmark, 35% of the answers a leading general-purpose LLM rated high confidence on regulatory dates were wrong. Archer Evolv™ shipped zero errors.

OVERLAND PARK, Kan.: For enterprises deploying AI in compliance, a wrong date is a missed deadline. The more dangerous failure is a wrong answer the model returns with high confidence, one that flows silently into a compliance calendar and is only discovered after the window has passed. Archer® today released results showing purpose-built AI beats a general-purpose large language model (LLM) on regulatory work, and it’s not close. The head-to-head test compared Archer’s purpose-built, vertical-specific AI and proprietary data sets against a leading general-purpose LLM on a core compliance task: determining the publication, effective and comment-close dates of regulatory documents across six jurisdictions.

General-purpose models are a genuine breakthrough, and this is no referendum on their quality. The question Archer set out to answer is narrower and more practical: what it takes to make a specific, high-stakes determination reliable, fast and affordable at scale. A vertical, domain-focused process, grounded in an expert-verified knowledge base, wins on all three at once.

Accuracy: 90% fewer wrong answers

On the same 55 documents, the general-purpose LLM failed to return a correct date 56% of the time: 25% of its answers were wrong but returned as valid, and 31% of requests failed or timed out. Confidence made it worse, not better. Of the answers it rated high confidence, 35% were still wrong. With Archer Evolv, more than 95% of determinations are verified outright, and the rest are routed to an expert before use. Not a single wrong date reached production. Nothing ships unverified.

| Outcome on the sample documents | Generic LLM process | Archer Evolv |
| --- | --- | --- |
| Correct | 44% | 95% verified, 5% expert-checked |
| Wrong, returned as valid | 25% | 0% |
| Failed or timed out | 31% | 0% |

A model's own confidence is not a control. Of the answers the general-purpose LLM rated high confidence, 35% were wrong. Closing that accuracy gap is the precondition for deploying agentic AI responsibly, because an autonomous operator is only as trustworthy as the determinations beneath it. Verified, source-traceable, expert-governed answers make it possible to safely deploy AI agents across an enterprise. This is the core of AI governance, and the layer Archer is built to provide.

"In compliance, an answer that is fast and cheap, but wrong, is worthless, and an answer you cannot trace is a liability," said Kayvan Alikhani, Chief Product and Technology Officer of Archer. "Archer's purpose-built AI verified more than 95% of determinations in real time. That is the foundation that lets enterprises scale AI agents without losing control of the outcome."

Speed: verified answers in real time

The general-purpose process averaged about four seconds per response within a five-second timeout. Archer Evolv served a verified date in roughly five-hundredths of a second (0.05 seconds), about 80 times faster on repeat lookups. For AI agents and analysts working at the pace of a regulatory calendar, that is the difference between keeping up and becoming the bottleneck.
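The speed figure can be checked with a few lines of arithmetic. The sketch below uses only the two latencies quoted above. The 6,000-lookup workload is borrowed from the cost example in this release, and the assumption that lookups run one after another is ours, made purely for illustration.

```python
# Latencies quoted in the release (seconds per response).
LLM_SECONDS = 4.0      # general-purpose process, average
EVOLV_SECONDS = 0.05   # Archer Evolv, repeat lookup

# Ratio behind the "about 80 times faster" claim.
speedup = LLM_SECONDS / EVOLV_SECONDS

# Illustrative workload (assumption): 6,000 lookups served sequentially.
LOOKUPS = 6000
llm_total_hours = LOOKUPS * LLM_SECONDS / 3600
evolv_total_minutes = LOOKUPS * EVOLV_SECONDS / 60

print(f"speedup: about {speedup:.0f}x")
print(f"general-purpose process: about {llm_total_hours:.1f} hours")
print(f"persisted lookups: about {evolv_total_minutes:.0f} minutes")
```

Under those assumptions the same workload takes hours on the on-demand path and minutes on the persisted path. Real deployments would run requests in parallel, so the totals are a rough guide only.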

Cost: a persistent, verified knowledge base, not on-demand inference

A general-purpose process recomputes the answer on every request, with no memory of what it found before. Archer Evolv computes once at ingestion, verifies the result into a scalable, expert-governed knowledge base, and persists it for every future lookup at a fraction of the cost and latency. When a regulation is amended, Evolv catches the change proactively, re-verifies, and versions the updated answer. Nothing served is ever stale. For a 500-document corpus with 12 lookups per document per month, an on-demand process makes 6,000 determinations a month, against only 500 computed once at ingestion. Archer Evolv avoids roughly 92% of the inference calls, a structural saving that widens as volume grows.
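The compute-once pattern described above can be sketched as a small store that runs the expensive determination at ingestion and serves every later lookup from memory. This is a minimal illustration, not Archer's implementation: `VerifiedDateStore`, `placeholder_determine` and the document IDs are hypothetical names, and amendment handling and expert review are left out.

```python
from typing import Callable, Dict


class VerifiedDateStore:
    """Hypothetical store: determine once at ingestion, then serve from memory."""

    def __init__(self, determine: Callable[[str], str]) -> None:
        self._determine = determine
        self._answers: Dict[str, str] = {}
        self.inference_calls = 0

    def ingest(self, doc_id: str) -> None:
        # The only place the expensive determination runs.
        self._answers[doc_id] = self._determine(doc_id)
        self.inference_calls += 1

    def lookup(self, doc_id: str) -> str:
        # Served from the persisted answer, with no new inference call.
        return self._answers[doc_id]


def placeholder_determine(doc_id: str) -> str:
    # Stand-in for a model call plus verification (assumption).
    return "2025-01-01"


DOCUMENTS = 500           # corpus size from the release
LOOKUPS_PER_DOCUMENT = 12 # monthly lookups per document from the release

store = VerifiedDateStore(placeholder_determine)
doc_ids = [f"doc-{i}" for i in range(DOCUMENTS)]
for doc_id in doc_ids:
    store.ingest(doc_id)

served = 0
for doc_id in doc_ids:
    for _ in range(LOOKUPS_PER_DOCUMENT):
        store.lookup(doc_id)
        served += 1

on_demand_calls = DOCUMENTS * LOOKUPS_PER_DOCUMENT
avoided = 1 - store.inference_calls / on_demand_calls
print(f"{served} lookups served with {store.inference_calls} inference calls")
print(f"inference calls avoided: {avoided:.0%}")
```

The saving in one month is 1 - 500/6,000, or roughly 92%. Assuming no amendments force a re-verification, the 500 ingestion calls stay fixed while on-demand calls keep accumulating each month, which is why the gap widens with volume.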

Context is what makes this possible

Archer Evolv's advantage traces to context: before any AI runs, it assesses the organization's jurisdictions, products, business units, risks and regulatory themes, so every determination is grounded in what is relevant to that enterprise. This is the difference between an answer and a defensible answer. The more agents a company deploys, the more valuable that foundation becomes, because every agent inherits the same verified, source-traceable grounding rather than re-deriving the world from scratch.

"The companies that win the next decade of SaaS will pair domain-specific AI with proprietary, vertical-specific context the foundation models cannot replicate," said Bill Diaz, Chief Executive Officer of Archer. "That is the moat, and it compounds. This test is the proof."

The full methodology, source data and case study are available on Archer’s thought leadership website, compliance.ai/evolv_assets/case-01-evolv-vs-raw-llm.html. To see Archer Evolv in action, visit www.archerirm.com.

About Archer

Archer powers how the world's leading enterprises govern risk, compliance, and regulatory change. More than 1,300 organizations run on Archer, including half the Fortune 500 and 37 of the top 50 global banks. A new regulatory change lands somewhere in the world every six minutes, and agentic AI is outpacing most teams' ability to govern it. Archer's purpose-built AI is grounded in the deepest regulatory data and domain expertise in GRC, so every result traces back to its source, and every decision can be defended. Archer delivers solutions across the full range of GRC, including regulatory change management, AI risk management, regulatory intelligence, third-party risk, and IT and security risk. Learn more at www.archerirm.com.

Source: Business Wire
