InnovationOpenLab

Thunderbit Launches High-Fidelity Web Data API, MCP Server, and CLI


SAN FRANCISCO: Thunderbit, an AI web data platform with over 100,000 users, today launched its developer API, Model Context Protocol (MCP) server, and CLI, giving developers new ways to turn complex, long-tail websites into clean Markdown or structured data for AI agents, RAG pipelines, and automation workflows.

At the center of the launch is Thunderbit Distill, an adaptive HTML-to-Markdown engine designed for high-fidelity conversion across complex web pages. In internal HTML-to-Markdown evaluations, Distill scored 0.87 ROUGE-L and produced cleaner, more complete Markdown across product pages, pricing tables, directories, search results, reviews, and other page types, without requiring site-specific rules.
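For readers unfamiliar with the metric, ROUGE-L scores a candidate text against a reference by the length of their longest common subsequence (LCS). The sketch below is a minimal illustration only: the release does not say which ROUGE-L variant, tokenizer, or reference set Thunderbit used, so the whitespace tokenization, the balanced F-measure (`beta=1.0`), and the function names are all assumptions.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(reference, candidate, beta=1.0):
    """ROUGE-L F-measure over whitespace tokens (tokenization is an assumption)."""
    ref = reference.split()
    cand = candidate.split()
    if not ref or not cand:
        return 0.0
    lcs = lcs_length(ref, cand)
    if lcs == 0:
        return 0.0
    recall = lcs / len(ref)       # share of the reference that was recovered
    precision = lcs / len(cand)   # share of the candidate that is on target
    return (1 + beta**2) * precision * recall / (recall + beta**2 * precision)
```

Under this definition, a score near 1.0 means the converted Markdown preserves nearly all reference tokens in order with little extra material, while leftover navigation or ad text lowers precision and dropped content lowers recall.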

Thunderbit uses AI models rather than fixed parsing rules to identify meaningful page content, then cleans navigation, scripts, ads, and boilerplate so LLMs and databases receive less noisy input.

Thunderbit also introduced Extract, which returns structured JSON or CSV from a URL using a developer-defined schema. Together, Distill and Extract support Markdown for AI agents, RAG, knowledge bases, and content ingestion, or structured data for databases, spreadsheets, enrichment jobs, and internal tools.
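As a rough illustration of what schema-driven output implies downstream, the sketch below checks a JSON record against a developer-defined schema and flattens matching records to CSV. It does not call Thunderbit's API: the schema format, the field names, and the sample payload are hypothetical, and the actual request and response shapes are defined in Thunderbit's documentation.

```python
import csv
import io
import json

# Hypothetical schema: field name -> expected Python type.
# This is NOT Thunderbit's schema format, only an illustration.
PRODUCT_SCHEMA = {"name": str, "price": float, "in_stock": bool}


def validate(record, schema):
    """Return a list of problems; an empty list means the record fits the schema."""
    problems = []
    for field, expected in schema.items():
        if field not in record:
            problems.append(f"missing field: {field}")
        elif not isinstance(record[field], expected):
            problems.append(f"{field}: expected {expected.__name__}")
    return problems


def to_csv(records, schema):
    """Serialize records to CSV text, using the schema's field order as columns."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(schema), lineterminator="\n")
    writer.writeheader()
    for record in records:
        writer.writerow({field: record[field] for field in schema})
    return buf.getvalue()


# Made-up payload standing in for structured output from an extraction call.
payload = json.loads('[{"name": "Widget", "price": 9.5, "in_stock": true}]')
valid = [r for r in payload if not validate(r, PRODUCT_SCHEMA)]
print(to_csv(valid, PRODUCT_SCHEMA))
```

Validating against the same schema that was sent with the request keeps malformed rows out of a database or spreadsheet regardless of how the extraction was performed.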

"AI agents are only as useful as the web data they can actually reach," said Shuai Guan, Co-founder and CEO of Thunderbit. "We built Thunderbit to turn changing web pages into data that software can use reliably."

Traditional scraping pipelines often rely on CSS selectors, XPath, or site-specific parsing rules that can break when layouts change. Thunderbit is built to understand page semantics and adapt to changing structure, helping developers get cleaner, more complete output without maintaining custom scrapers for every site.
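The brittleness described above can be reproduced in a few lines. The sketch below uses Python's standard `html.parser` with a fixed class-name lookup standing in for a CSS selector; the class `ClassTextExtractor`, the helper `extract_by_class`, and both HTML snippets are made up for illustration and say nothing about how Thunderbit works internally.

```python
from html.parser import HTMLParser


class ClassTextExtractor(HTMLParser):
    """Collect the text of the first element whose class list contains target_class.

    Simplification: assumes the matched element contains no void tags such as <br>.
    """

    def __init__(self, target_class):
        super().__init__()
        self.target_class = target_class
        self._depth = 0      # > 0 while inside the matched element
        self._done = False   # True once the first match has been closed
        self.text = None

    def handle_starttag(self, tag, attrs):
        if self._done:
            return
        if self._depth:
            self._depth += 1
            return
        classes = (dict(attrs).get("class") or "").split()
        if self.target_class in classes:
            self._depth = 1
            self.text = ""

    def handle_endtag(self, tag):
        if self._depth:
            self._depth -= 1
            if self._depth == 0:
                self._done = True

    def handle_data(self, data):
        if self._depth:
            self.text += data


def extract_by_class(html, target_class):
    """Return the stripped text of the first matching element, or None if absent."""
    parser = ClassTextExtractor(target_class)
    parser.feed(html)
    parser.close()
    return parser.text.strip() if parser.text is not None else None


# The same product before and after a hypothetical redesign.
OLD_LAYOUT = '<div class="product"><span class="price">$19.99</span></div>'
NEW_LAYOUT = '<div class="product"><p class="amount"><b>$19.99</b></p></div>'

print(extract_by_class(OLD_LAYOUT, "price"))  # the rule matches the old markup
print(extract_by_class(NEW_LAYOUT, "price"))  # the same rule finds nothing
```

The price is still on the page after the redesign, but the rule keyed to the `price` class no longer finds it, and someone has to notice the failure and rewrite the rule. That maintenance cost is what the release says semantic extraction is meant to avoid.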

The launch extends Thunderbit beyond its no-code Chrome extension and web app, which are used by sales, ecommerce, research, and operations teams to extract tens of millions of pages every month. Developers can now bring the same adaptive extraction engine into AI applications, automated workflows, and internal systems.

Thunderbit's developer API, MCP server, CLI, and documentation are available today at https://thunderbit.com/docs. Free credits are available for new users.

About Thunderbit

Thunderbit is an AI web data platform used by over 100,000 users to extract structured data from websites, PDFs, and images. Products include a no-code Chrome extension and developer tools for AI workflows, web data extraction, and automation. Learn more at https://thunderbit.com.

Source: Business Wire
