
MLCommons Releases AILuminate LLM v1.1, Adding French Language Capabilities to Industry-Leading AI Safety Benchmark

Business Wire

Announcement comes as AI experts gather at the Paris AI Action Summit; it is the first of several AILuminate updates to be released in 2025

PARIS: MLCommons, in partnership with the AI Verify Foundation, today released v1.1 of AILuminate, incorporating new French language capabilities into its first-of-its-kind AI safety benchmark. The new update – which was announced at the Paris AI Action Summit – marks the next step towards a global standard for AI safety and comes as AI purchasers across the globe seek to evaluate and limit product risk in an emerging regulatory landscape. Like its v1.0 predecessor, the French LLM version 1.1 was developed collaboratively by AI researchers and industry experts, ensuring a trusted, rigorous analysis of chatbot risk that can be immediately incorporated into company decision-making.

“Companies around the world are increasingly incorporating AI in their products, but they have no common, trusted means of comparing model risk,” said Rebecca Weiss, Executive Director of MLCommons. “By expanding AILuminate’s language capabilities, we are ensuring that global AI developers and purchasers have access to the type of independent, rigorous benchmarking proven to reduce product risk and increase industry safety.”

Like the English v1.0, the French v1.1 model of AILuminate assesses LLM responses to over 24,000 French-language test prompts across twelve categories of hazardous behavior – including violent crime, hate, and privacy. Unlike many peer benchmarks, none of the LLMs evaluated are given advance access to the specific evaluation prompts or to the evaluator model. This ensures a methodological rigor uncommon in standard academic research and an empirical analysis that can be trusted by industry and academia alike.
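To make that evaluation protocol concrete, below is a minimal sketch of how a benchmark of this shape could be driven. All names here (HAZARD_CATEGORIES, query_model, grade_response, run_benchmark) are illustrative assumptions; the release does not describe AILuminate's actual harness or APIs.

```python
# Hypothetical sketch of a hazard-category benchmark loop; names and
# structure are illustrative assumptions, not AILuminate's actual harness.
from collections import defaultdict

# The release names twelve hazard categories; three examples are shown here.
HAZARD_CATEGORIES = ["violent_crime", "hate", "privacy"]

def query_model(prompt: str) -> str:
    """Placeholder for the system under test (e.g. a chatbot API call)."""
    return "model response text"

def grade_response(category: str, prompt: str, response: str) -> bool:
    """Placeholder for the evaluator model's safe/unsafe verdict.
    Per the release, the evaluated LLMs never see the specific prompts
    or this evaluator in advance."""
    return True

def run_benchmark(prompts_by_category: dict[str, list[str]]) -> dict[str, float]:
    """Return the fraction of responses graded safe, per hazard category."""
    safe = defaultdict(int)
    total = defaultdict(int)
    for category, prompts in prompts_by_category.items():
        for prompt in prompts:
            response = query_model(prompt)
            if grade_response(category, prompt, response):
                safe[category] += 1
            total[category] += 1
    return {c: safe[c] / total[c] for c in total}

if __name__ == "__main__":
    demo_prompts = {c: [f"sample prompt for {c}"] for c in HAZARD_CATEGORIES}
    print(run_benchmark(demo_prompts))
```

In a real harness the prompt sets (over 24,000 French prompts in v1.1) would be held out from the systems under test – the property the release emphasizes – so that results cannot be inflated by advance tuning against the test material.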

“Building safe and reliable AI is a global problem – and we all have an interest in coordinating on our approach,” said Peter Mattson, Founder and President of MLCommons. “Today’s release marks our commitment to championing a solution to AI safety that’s global by design and is a first step toward evaluating safety concerns across diverse languages, cultures, and value systems.”

The AILuminate benchmark was developed by the MLCommons AI Risk and Reliability working group, a team of leading AI researchers from institutions including Stanford University, Columbia University, and TU Eindhoven, civil society representatives, and technical experts from Google, Intel, NVIDIA, Microsoft, Qualcomm Technologies, Inc., and other industry giants committed to a standardized approach to AI safety. Cognizant that AI safety requires a coordinated global approach, MLCommons also collaborated with international organizations such as the AI Verify Foundation to design the AILuminate benchmark.

“MLCommons’ work in pushing the industry toward a global safety standard is more important now than ever,” said Nicolas Miailhe, Founder and CEO of PRISM Eval. “PRISM is proud to support this work with our latest Behavior Elicitation Technology (BET), and we look forward to continuing to collaborate on this important trust-building effort – in France and beyond.”

Currently available in English and French, AILuminate will be made available in Chinese and Hindi later this year. For more information on MLCommons and the AILuminate Benchmark, please visit mlcommons.org.

About MLCommons

MLCommons is the world’s leader in AI benchmarking. An open engineering consortium supported by over 125 members and affiliates, MLCommons has a proven record of bringing together academia, industry, and civil society to measure and improve AI. The foundation for MLCommons began with the MLPerf benchmarks in 2018, which rapidly scaled as a set of industry metrics to measure machine learning performance and promote transparency of machine learning techniques. Since then, MLCommons has continued to use collective engineering to build the benchmarks and metrics required for better AI – ultimately helping to evaluate and improve the accuracy, safety, speed, and efficiency of AI technologies.

Source: Business Wire
