InnovationOpenLab

ISC 2026: KAYTUS Launches Rack-Scale KSManage Ultra for AI Factories



KSManage Ultra delivers full-stack visibility across GPUs, racks, and data centers, integrating in-band and out-of-band system management to address performance bottlenecks and maximize AI Factory operational efficiency.

FRANKFURT, Germany: KAYTUS, a leading provider in AI infrastructure and liquid cooling solutions, has launched KSManage Ultra at ISC 2026, a next-generation intelligent infrastructure management platform purpose-built for AI Factories. Designed for the latest high-density AI racks, KSManage Ultra enables unified, intelligent management of key rack-level components, including compute trays, switch trays, power distribution units (PDUs), and cooling distribution units (CDUs). Through end-to-end visibility, performance-level diagnostics, and automated operations, the platform transforms highly coupled AI infrastructure from fragmented oversight into integrated, system-level operations, helping enterprises build more efficient, reliable, and sustainable AI infrastructure.

Three Key Challenges Facing Traditional AI Operations

Compared with traditional data centers, AI data centers face significantly greater operational complexity, including rack-scale AI system management, intricate network topologies, challenging fault isolation, and liquid-cooling safety requirements. As a result, traditional operations approaches are increasingly constrained by three key challenges:

First, management complexity is soaring: the basic unit of an AI Factory is no longer an individual server, but a resource-coupled, high-density AI rack. A single rack integrates multiple deeply coordinated subsystems, including computing, networking, power supply, and liquid cooling. Compared with traditional 4U 8-GPU deployments, NVL72 rack-scale systems integrate nearly one hundred accelerators and thousands of high-speed interconnects. In a 100 kW-class rack, power density can be 2–3 times higher¹, while thermal management becomes significantly more complex, involving coolant distribution, CDUs, flow rates, and related safety controls. As AI Factories continue to scale, operational complexity rises sharply, and fluctuations in any single component can affect the performance and stability of the entire rack.

Second, fault identification has moved beyond the hardware layer. AI training and inference workloads are highly sensitive to performance fluctuations, and hidden anomalies can significantly reduce operational efficiency. Unlike traditional downtime-related failures, performance degradation in AI systems often occurs silently. Because these performance issues are closely linked to underlying hardware and infrastructure conditions, identifying the true root cause can be difficult when relying on isolated data from either the workload or infrastructure side alone.

Third, AI Factories face a growing operational efficiency crisis as deployments scale. Traditional device-by-device onboarding is inefficient, slowing deployment and increasing the risk of configuration inconsistencies. At the same time, conventional configuration methods are time-consuming and error-prone. With multiple device types integrated within each AI rack, even minor configuration deviations can lead to cluster-wide performance degradation or service interruptions.

KAYTUS Builds an Integrated Intelligent Operations Platform for AI Factories

Against this backdrop, traditional operations models that depend on manual processes or fragmented tools often result in delayed deployment, difficult troubleshooting, and inconsistent configurations, limiting the development and large-scale adoption of AI applications. To help simplify the operation and management of AI data centers, KAYTUS has introduced KSManage Ultra. The platform delivers integrated management across the full infrastructure stack, spanning components, nodes, racks, clusters, and the data center level, by connecting in-band and out-of-band management paths and correlating IT infrastructure status with physical infrastructure conditions. It represents the shift from reactive operations to proactive alerting, helping customers build intelligent operations capabilities for monitoring, diagnosis, fault isolation, and full recovery in complex AI Factory environments.

Single-Pane Global Visibility into AI Data Center Operational Status

KSManage Ultra delivers full-stack unified management across both traditional infrastructure and advanced AI rack systems. The platform provides centralized management for GPUs, CPUs, memory, high-speed switching modules, management networks, power shelves, CDUs, liquid cooling systems, racks, and cluster resources. By breaking down management boundaries between IT and physical infrastructure, as well as between individual components and full racks, KSManage Ultra creates a multi-level resource view spanning components, nodes, racks, clusters, and the entire data center.

Through a unified platform, customers can avoid repeatedly switching between multiple systems and quickly assess resource health, rack availability, and cluster readiness for efficient production deployment and operation.

Integrated In-Band and Out-of-Band System Management for Proactive Remediation

KSManage Ultra consolidates in-band data, including operating systems, drivers, applications, and performance, with out-of-band data, such as BMC, firmware, power, temperature, and hardware logs, together with infrastructure data into a single unified management system. It enables correlation analysis across operating status, hardware health, link topology, power supply, and liquid cooling conditions, shifting operations from reactive response to proactive alerting. When the system detects GPU anomalies, degraded link quality, liquid cooling fluctuations, or declining node health, it can proactively identify at-risk nodes and guide customers to isolate, maintain, or reconfigure resources, helping prevent faulty nodes from entering critical task runs.
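One way to picture the correlation described above is as a join of in-band and out-of-band readings on a shared node identifier. The following is a minimal Python sketch under stated assumptions, not KSManage Ultra's implementation: the record fields (`step_time_ms`, `gpu_temp_c`, `psu_ok`), the thresholds, and the function name `correlate` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class InBand:
    """Hypothetical workload-side sample (OS, driver, application)."""
    node: str
    step_time_ms: float  # duration of one training step as seen by the job


@dataclass
class OutOfBand:
    """Hypothetical BMC-side sample (temperature, power)."""
    node: str
    gpu_temp_c: float
    psu_ok: bool


def correlate(in_band, out_of_band, baseline_ms, slow_factor=1.2, hot_c=85.0):
    """Return {node: [reasons]} for slow nodes, pairing each slowdown
    with any hardware signal reported for the same node."""
    oob_by_node = {sample.node: sample for sample in out_of_band}
    at_risk = {}
    for sample in in_band:
        if sample.step_time_ms <= baseline_ms * slow_factor:
            continue  # workload is within the assumed tolerance
        hw = oob_by_node.get(sample.node)
        reasons = []
        if hw is None:
            reasons.append("no out-of-band data")
        else:
            if hw.gpu_temp_c >= hot_c:
                reasons.append("high GPU temperature")
            if not hw.psu_ok:
                reasons.append("power supply fault")
        at_risk[sample.node] = reasons or ["slow, no hardware signal found"]
    return at_risk
```

Neither data source alone would explain a silent slowdown here: the in-band side only shows that a step is slow, and the out-of-band side only shows that a GPU is hot. The join is what links the two.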

Using liquid cooling monitoring as an example, KSManage Ultra supports three-level leak detection at the node, rack, and loop levels. Once a leakage risk is detected, the platform can coordinate safety shutdown, solenoid valve closure, and node isolation, while also triggering email alerts, work order generation, and closed-loop remediation. This helps customers build system-level proactive operations capabilities for AI rack systems.
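The coordinated leak response can be sketched as a lookup from detection level to an ordered list of actions. This is an illustration only: the action names are taken from the paragraph above, but which level triggers which action, the ordering, and every identifier in the code (`LeakLevel`, `RESPONSE_PLAN`, `plan_leak_response`) are assumptions rather than documented platform behavior.

```python
from enum import Enum


class LeakLevel(Enum):
    NODE = "node"
    RACK = "rack"
    LOOP = "loop"


# Assumed mapping: a node-level leak isolates nodes only, while rack- and
# loop-level leaks also close the solenoid valve feeding the affected scope.
RESPONSE_PLAN = {
    LeakLevel.NODE: ["isolate_node", "safety_shutdown"],
    LeakLevel.RACK: ["close_solenoid_valve", "isolate_node", "safety_shutdown"],
    LeakLevel.LOOP: ["close_solenoid_valve", "isolate_node", "safety_shutdown"],
}
NOTIFY = ["send_email_alert", "create_work_order"]


def plan_leak_response(level, scope, nodes):
    """Return ordered (action, target) steps for a detected leak.

    scope is a label for the affected node, rack, or loop;
    nodes lists the nodes to isolate and shut down.
    """
    steps = []
    for action in RESPONSE_PLAN[level]:
        if action == "close_solenoid_valve":
            steps.append((action, scope))  # one valve action per scope
        else:
            steps.extend((action, node) for node in nodes)
    steps.extend((action, scope) for action in NOTIFY)  # notify last
    return steps
```

Returning a plan instead of executing it keeps the sketch testable and leaves room for an operator approval step before any shutdown.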

Real-Time Agile System Health Monitoring and Compute Power Resource Allocation

Designed for multi-rack deployment scenarios, KSManage Ultra provides resource health identification and fault isolation capabilities. The platform continuously evaluates node and rack health based on indicators such as GPU status, memory and PCIe status, network link quality, firmware consistency, liquid cooling conditions, and power supply status. When abnormal nodes or high-risk components are detected, the system can apply intelligent tagging, analyze the potential impact scope, and initiate isolation actions, helping prevent faulty nodes from entering critical task runs.
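As a rough illustration of indicator-based health evaluation, the sketch below tags each node as ready, at-risk, or to be isolated. The indicator names mirror the list above, but the split into critical and advisory groups, the three tags, and the function names are hypothetical choices made for this example; the release does not describe the platform's actual scoring.

```python
# Assumed policy: a failed critical indicator isolates the node,
# a failed advisory indicator only tags it as at-risk.
CRITICAL = {"gpu", "liquid_cooling", "power"}
ADVISORY = {"memory_pcie", "network_link", "firmware"}


def classify_node(indicators):
    """indicators maps indicator name -> True if healthy.

    A missing indicator is treated as unhealthy, so unknown state
    never counts as ready.
    """
    failed = {name for name in CRITICAL | ADVISORY if not indicators.get(name, False)}
    if failed & CRITICAL:
        return "isolate"
    if failed:
        return "at_risk"
    return "ready"


def rack_summary(nodes):
    """Group node names by tag for one rack; nodes maps name -> indicators."""
    summary = {"ready": [], "at_risk": [], "isolate": []}
    for name in sorted(nodes):
        summary[classify_node(nodes[name])].append(name)
    return summary
```

A scheduler could then draw training nodes only from the `ready` list, which is one way to keep faulty nodes out of critical task runs.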

KSManage Ultra helps customers establish a clear view of available resources, including which nodes should be removed from service, which racks remain suitable for combined use, which resources are ready for training and inference workloads, and which resources should enter maintenance procedures. As a result, customers can move beyond reactive repairs after failures occur and continuously maintain a stable, healthy compute zone, improving AI Factory business continuity and resource utilization.

Minute-Level Onboarding, Configuration, and Full-Stack Automated Operations

KSManage Ultra supports one-click batch scanning and automatic node addition. By intelligently identifying device serial numbers and IP addresses, the platform automatically builds topology mappings between nodes and racks, reducing single-rack onboarding time from the traditional 50 minutes to less than 3 minutes. KSManage Ultra supports one-click batch stress testing at L10 and L11 levels, reducing fault root-cause localization from hours to minutes. The platform also enables rack-level automated initialization and configuration, including driver installation, hardware configuration, and software deployment, all of which can be delivered in batches based on templates. By significantly improving operational efficiency while helping maintain consistent hardware environments across the same cluster, KSManage Ultra effectively reduces the risk of performance fluctuations or task failures caused by configuration drift.
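Configuration drift, as mentioned above, can be detected by comparing each node's reported settings against the template it was provisioned from. The following is a minimal sketch assuming configurations are flat key-value mappings; the keys, the placeholder version strings in the usage example, and the function name `find_drift` are hypothetical.

```python
def find_drift(template, nodes):
    """Return {node: {key: (expected, actual)}} for settings that differ.

    template maps setting name -> expected value.
    nodes maps node name -> that node's reported settings.
    A setting missing on a node is reported with actual=None.
    """
    drift = {}
    for node, config in nodes.items():
        diffs = {
            key: (expected, config.get(key))
            for key, expected in template.items()
            if config.get(key) != expected
        }
        if diffs:
            drift[node] = diffs
    return drift


# Usage with placeholder values (not real driver or firmware versions):
template = {"gpu_driver": "v2", "bmc_fw": "v5"}
reported = {
    "n1": {"gpu_driver": "v2", "bmc_fw": "v5"},
    "n2": {"gpu_driver": "v1", "bmc_fw": "v5"},
}
print(find_drift(template, reported))  # n2 differs on gpu_driver
```

Checking against a template, not against the most common value in the cluster, means a fleet that has drifted uniformly is still flagged.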

As a comprehensive unified platform for AI Factories, KSManage Ultra features an open and highly compatible architecture. Through open APIs, it integrates with upper-layer systems such as scheduling platforms and CMDBs, while also providing unified management of lower-layer heterogeneous devices, including servers, networking equipment, power infrastructure, and cooling systems. This enables centralized management across the entire data center environment. KSManage Ultra is designed to help enterprises achieve unified management and intelligent operations for heterogeneous infrastructure, providing a solid foundation for stable and efficient operation of AI Factories.

Source:

1. Traditional HGX H100/H200 4U 8-GPU servers typically support 4 to 8 units per 42U rack, resulting in rack-level power consumption of approximately 40 to 80 kW. In contrast, GB200 NVL72 racks can exceed 120 kW, driving a roughly 2x to 3x increase in power density.
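The arithmetic behind the footnote can be checked with a few lines of Python. The figures come from the footnote itself; treating 120 kW as a floor (the footnote says racks "can exceed" it) and using the midpoint of the baseline range as one comparison point are assumptions made here, since the footnote does not say which baseline its 2x to 3x range uses.

```python
# Figures from the footnote.
baseline_low_kw = 40.0    # 4 servers per 42U rack
baseline_high_kw = 80.0   # 8 servers per 42U rack
nvl72_floor_kw = 120.0    # "can exceed 120 kW", so treated as a lower bound

# Implied power per 8-GPU server at both ends of the baseline range.
per_server_kw = (baseline_low_kw / 4, baseline_high_kw / 8)

# Assumed comparison point: midpoint of the baseline range.
baseline_mid_kw = (baseline_low_kw + baseline_high_kw) / 2

ratios = {
    "vs 40 kW rack": nvl72_floor_kw / baseline_low_kw,
    "vs 60 kW rack (midpoint)": nvl72_floor_kw / baseline_mid_kw,
    "vs 80 kW rack": nvl72_floor_kw / baseline_high_kw,
}

for label, ratio in ratios.items():
    print(f"{label}: {ratio:.1f}x")
```

At exactly 120 kW the ratio is 3.0x against a 40 kW rack and 2.0x against the 60 kW midpoint. Against a fully populated 80 kW rack it is 1.5x, so a 2x ratio there would require an NVL72 rack drawing about 160 kW.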

About KAYTUS

KAYTUS is a leading provider of AI infrastructure and liquid cooling solutions, delivering a diverse range of innovative, open, and eco-friendly products for cloud, AI, edge computing, and other emerging applications. With a customer-centric approach, KAYTUS is agile and responsive to user needs through its adaptable business model. Discover more at KAYTUS.com and follow us on LinkedIn and X.

Source: Business Wire
