
MangoBoost Sets New Benchmark for Multi-Node LLM Training on AMD GPUs in MLPerf Training v5.0


Business Wire

BELLEVUE, Wash.: MangoBoost, a provider of cutting-edge system solutions for maximizing compute efficiency and scalability, has validated the scalability and efficiency of large-scale AI training on AMD Instinct™ MI300X GPUs through its MLPerf Training v5.0 submission. Tailored for enterprise data centers prioritizing performance, flexibility, and cost-efficiency, this milestone demonstrates that state-of-the-art LLM training is now viable beyond traditional vendor-locked GPU platforms.

Using 32 AMD Instinct™ MI300X GPUs across four nodes, MangoBoost fine-tuned the Llama2-70B-LoRA model in just 10.91 minutes, setting the fastest multi-node MLPerf benchmark on AMD GPUs to date. The system achieved near-linear scaling efficiency (95–100%), proving that MangoBoost’s stack can support practical and scalable LLM training in production environments.
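For readers who want to relate the 10.91-minute result to the quoted 95–100% range, the sketch below uses one common definition of scaling efficiency: achieved speedup divided by ideal speedup. The release does not state how the efficiency figure was measured, so this definition is an assumption, and the function names are hypothetical. The only number taken from the text is the 4-node time; the single-node times it prints are implied by the assumed definition, not reported results.

```python
def scaling_efficiency(base_minutes, base_nodes, scaled_minutes, scaled_nodes):
    """Achieved speedup divided by ideal speedup (assumed definition)."""
    if min(base_minutes, scaled_minutes) <= 0:
        raise ValueError("training times must be positive")
    if min(base_nodes, scaled_nodes) <= 0:
        raise ValueError("node counts must be positive")
    speedup = base_minutes / scaled_minutes   # how much faster the larger run is
    ideal = scaled_nodes / base_nodes         # perfect linear scaling
    return speedup / ideal


def implied_base_minutes(scaled_minutes, scaled_nodes, efficiency, base_nodes=1):
    """Invert the definition: the baseline time a given efficiency would imply."""
    return efficiency * scaled_minutes * scaled_nodes / base_nodes


FOUR_NODE_MINUTES = 10.91  # from the release: 4 nodes, 32 GPUs

if __name__ == "__main__":
    for eff in (0.95, 1.00):  # the range quoted in the release
        t1 = implied_base_minutes(FOUR_NODE_MINUTES, 4, eff)
        print(f"{eff:.0%} efficiency implies a single-node time of about {t1:.2f} min")
```

Under this definition, an efficiency close to 1.0 means that quadrupling the node count cuts training time by nearly a factor of four.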

Scalability and Efficiency for Enterprise Data Centers

The result showcases more than just benchmark success—it underscores how enterprises can reliably scale LLM training across clusters without network bottlenecks or rigid infrastructure dependencies.

  • Mango LLMBoost™: A full-featured MLOps software platform for large language models, supporting model parallelism, automatic tuning, batch scheduling, and advanced memory management.
  • Mango GPUBoost™ RoCEv2 RDMA: Inter-GPU communication hardware optimized for low-latency, high-throughput node-to-node communication, sustaining line-rate performance across thousands of concurrent QPs.

These technologies together deliver predictable and efficient multi-node training, ideal for organizations operating their own AI infrastructure or deploying on public cloud.

Industry-First MLPerf Training on AMD MI300X GPUs

This is the first-ever MLPerf Training submission on AMD GPUs spanning multiple nodes. MangoBoost’s platform demonstrated robust performance with a 4-node, 32-GPU cluster and confirmed compatibility with additional model sizes and structures—including Llama2-7B and Llama3.1-8B—in internal benchmarks. These results validate the generalizability of MangoBoost’s platform beyond benchmarks to diverse production-scale use cases.

"I'm excited to see MangoBoost's first MLPerf Training results, pairing their LLMBoost AI Enterprise MLOps software with their RoCEv2-based GPUBoost DPU hardware to unlock the full power of AMD GPUs, demonstrated by their scalable performance from a single-node MI300X to 2- and 4-node MI300X results on Llama2-70B LoRA. Their results underscore that a well-optimized software stack is critical to fully harness the capabilities of modern AI accelerators." — David Kanter, Founder, Head of MLPerf, MLCommons

Vendor-Neutral AI Infrastructure Enabled by AMD Collaboration

The achievement was made possible through deep collaboration with AMD and seamless integration with the ROCm™ software ecosystem, enabling full utilization of MI300X’s compute, memory bandwidth, and capacity. Enterprises are now empowered to choose infrastructure based on business needs—not vendor constraints.

"We congratulate MangoBoost on their MLPerf 5.0 training results on AMD GPUs and are excited to continue our collaboration with them to unleash the full power of AMD GPUs. In this MLPerf Training submission, MangoBoost has achieved a key milestone in demonstrating training results on AMD GPUs across 4 nodes (32 GPUs). This showcases how the AMD Instinct™ MI300X GPUs and ROCm™ software stack synergize with MangoBoost's LLMBoost™ AI Enterprise software and GPUBoost™ RoCEv2 NIC."
Meena Arunachalam, Fellow, AI Performance Design Engineering, AMD

"At MangoBoost, we’ve shown that software-hardware co-optimization enables scalable, efficient LLM training without vendor lock-in. Our MLPerf result is a key milestone proving our technology is ready for enterprise-scale AI training with superior efficiency and flexibility," said CEO Jangwoo Kim.

MangoBoost continues to develop innovations in communication optimization, hybrid parallelism, topology-aware scheduling, and domain-specific acceleration to further scale performance in distributed AI workloads.

About MangoBoost

MangoBoost is a provider of cutting-edge, full-stack system solutions for maximizing compute efficiency and scalability. At the heart of the solutions is the MangoBoost Data Processing Unit (DPU), which ensures full compatibility with general-purpose GPUs, accelerators, and storage devices, enabling cost-efficient, standardized AI infrastructure. Founded in 2022 on a decade of research, MangoBoost is rapidly expanding its operations in the U.S., Canada, and Korea.

Source: Business Wire
