
Protege Launches DataLab to Make AI Data a Scientific Discipline

New research institution brings rigor and standards to the AI data layer, launching with participation from five of the world’s leading AI companies

NEW YORK: Protege, an AI data platform providing trusted, real-world data at scale, today announced DataLab at Protege, a new research institution advancing the science of AI data. Built to support leading AI labs and global technology companies operating at the frontier of AI, DataLab helps AI researchers and pioneers navigate challenges and ambiguity in data quality, selection, representation, complexity, methodology, and safety for AI research.

DataLab’s team of in-house experts and researchers innovates to produce, repackage, and surface novel training and evaluation datasets from data produced in the real world. At launch, a majority of the “Magnificent 7” AI companies and major frontier AI labs are collaborating with DataLab across various AI training and evaluation data projects.

DataLab launches at a time when AI development is increasingly shaped by data limitations. As models grow more advanced, progress depends not only on model size and compute, but also on access to high-quality, well-curated training data. Built with the same scientific ambition as a frontier model lab, DataLab brings discipline to dataset design, construction, and evaluation, establishing clear quality standards and reproducible methodologies that translate into more reliable systems and measurable performance gains.

“We understand the three core pillars driving AI: models, chips, and data. We are convinced that with the right datasets (the third, underdeveloped pillar) you can push the entire frontier forward,” said Bobby Samuels, CEO of Protege. “We created DataLab to treat data as infrastructure, not exhaust. If we want more capable, reliable systems, we need standards, reproducibility, and real scientific discipline at the data layer.”

DataLab operates across three core areas:

  • Scientific partnerships: DataLab engages directly with leading AI researchers to navigate frontier-level technical discussions and identify commercially viable pathways.
  • Building high-value datasets and data products: Through deep methodological discipline, exposure to commercial data applications, and rigorous processes, DataLab develops new product opportunities that originate from the lab.
  • Leading AI data research: DataLab maintains an active presence within the broader academic community by publishing cutting-edge data research, designing evaluations and benchmarks, and identifying gaps in today’s training and evaluation data.

Led by Engy Ziedan, Co-Founder and Chief Scientific Officer at Protege, DataLab brings together machine learning researchers, economists, and domain experts with deep experience in evaluation, dataset design, and applied AI systems. Built to operate alongside both frontier AI research institutions and the world’s leading technology companies, the team combines academic rigor with applied expertise to raise the standards of the AI data layer.

“The strength of DataLab is its ability to integrate perspectives that are often siloed,” said Ziedan. “Advancing AI requires more than larger models or more data alone. It requires thinking at the margin, where we weigh the marginal value of a datapoint on learning and the opportunity cost of choosing the wrong dataset. This requires disciplined dataset design, careful evaluation, and a deep understanding of real-world complexity. Our team is structured to deliver exactly that.”

Since its launch, DataLab has released multimodal healthcare benchmark datasets designed to reflect diagnostic ambiguity and longitudinal clinical context, co-designed MedScribe and Medcode, two multimodal benchmarks for healthcare, and is collaborating with frontier AI organizations on high-stakes data challenges ranging from advanced cancers to agentic task selection to audio de-identification to international healthcare representation.

“Data quality has become the defining constraint in frontier AI development, yet investment and innovation have lagged,” said Nikhil Basu Trivedi, Co-Founder and General Partner at Footwork. “That changes with DataLab at Protege, which brings the same level of rigor and expertise to AI data that we have for AI chips and models. DataLab experts are doing the essential AI data infrastructure work and research that moves the AI frontier forward.”

As AI systems move from research environments into high-stakes, real-world use cases, the strength of the data foundation becomes decisive. DataLab is inviting collaboration from frontier labs, academic researchers, and domain experts committed to raising the standards for how AI data is built, validated, and measured.

To learn more about DataLab, visit datalab.withprotege.ai.

About Protege:

Protege is an AI data platform designed to unlock real-world data at scale. By enabling high-quality, cross-domain data networks, Protege helps AI teams overcome the most critical bottleneck in AI development and deploy more capable, reliable models across industries such as healthcare, media, audio, motion capture, and beyond. Learn more at www.withprotege.ai.

Source: Business Wire
