Tag: AI Hypercomputer

AI Hypercomputer Infrastructure Networking Official Blog June 1, 2026

How we evolved Google’s global and data center networks for the AI era - Google’s data center and global networks can distribute AI workloads across campuses, to create a massive-scale, pooled hypercomputing resource.

AI Hypercomputer Official Blog TPU May 18, 2026

Cluster-level reliability for trillion-parameter models on TPUs - Rather than instance-level reliability, Google’s cluster reliability framework measures performance of the “superpod” to enable frontier AI research.

AI Hypercomputer Event Google Kubernetes Engine Infrastructure Official Blog April 6, 2026

Top Infrastructure and GKE Sessions at Cloud Next '26 - Don't miss the top Cloud Next '26 sessions on infrastructure. Learn about AI Hypercomputer, GKE's future, enterprise migration, agentic AI, and cost-effective scale.

AI AI Hypercomputer LLM April 6, 2026

Boost Training Goodput: How Continuous Checkpointing Optimizes Reliability in Orbax and MaxText - The newly introduced continuous checkpointing feature in Orbax and MaxText is designed to optimize the balance between reliability and performance during model training, addressing issues with conventional fixed-frequency checkpointing. Unlike fixed intervals—which can either compromise reliability or bottleneck performance—continuous checkpointing maximizes I/O bandwidth and minimizes failure risk by asynchronously initiating a new save operation only after the previous one successfully completes. Benchmarks demonstrate that this approach significantly reduces checkpoint intervals and results in substantial resource conservation, especially in large-scale training jobs where mean-time-between-failure (MTBF) is short.
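
The scheduling rule in the summary above (start a new asynchronous save only once the previous one has finished) can be illustrated with a small simulation. This is a minimal sketch in plain Python, not the Orbax or MaxText API: the function name `continuous_checkpoint_steps` and its parameters `step_time` and `save_time` are hypothetical, and simulated time stands in for real I/O.

```python
def continuous_checkpoint_steps(num_steps, step_time, save_time):
    """Simulate continuous checkpointing and return the steps where a save starts.

    Hypothetical model (an assumption, not the Orbax implementation):
    each training step takes `step_time` seconds, each asynchronous save
    takes `save_time` seconds, and training never blocks on a save.
    """
    started = []
    now = 0.0          # simulated wall-clock time
    busy_until = 0.0   # time at which the in-flight save completes
    for step in range(1, num_steps + 1):
        now += step_time
        # Start a new save only if the previous one has already completed.
        if now >= busy_until:
            started.append(step)
            busy_until = now + save_time
    return started


def max_gap(steps):
    """Largest number of steps between consecutive checkpoints (work at risk)."""
    return max(b - a for a, b in zip(steps, steps[1:]))


# With 1 s steps and 2.5 s saves, a save starts as soon as the writer is free.
steps = continuous_checkpoint_steps(num_steps=10, step_time=1.0, save_time=2.5)
print(steps)           # [1, 4, 7, 10]
print(max_gap(steps))  # 3
```

In this toy model the checkpoint interval adapts to how long a save takes, so no fixed frequency has to be tuned by hand. Faster storage shortens the interval automatically, and slower storage lengthens it without stalling training.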

AI AI Hypercomputer GPU Official Blog Jan. 26, 2026

Scaling WideEP Mixture-of-Experts inference with Google Cloud A4X (GB200) and NVIDIA Dynamo - A new reference architecture for mixture-of-experts (MoE) workloads uses AI Hypercomputer with A4X machines, NVIDIA GB200 NVL72 and NVIDIA Dynamo.

AI Hypercomputer Official Blog Oct. 27, 2025

What's new with the AI Hypercomputer? vLLM on TPU, and more - New ways to simplify AI infrastructure deployment, improve performance, and optimize your costs.

AI AI Hypercomputer Official Blog Partners Sept. 15, 2025

Fast and efficient AI inference with new NVIDIA Dynamo recipe on AI Hypercomputer - A recipe for disaggregated inferencing with NVIDIA Dynamo on AI Hypercomputer provides better performance and cost while meeting latency needs.

AI Hypercomputer Google Kubernetes Engine Official Blog Sept. 15, 2025

Scaling high-performance inference cost-effectively - GKE Inference Gateway, now GA, provides faster, more efficient inference serving, while GKE Inference Quickstart helps select the best infrastructure.

AI Hypercomputer Official Blog Aug. 11, 2025

Announcements for AI Hypercomputer: The latest infrastructure news for ML practitioners - AI Hypercomputer recently got enhancements to Dynamic Workload Scheduler, updates to MaxText and MaxDiffusion, and support for Managed Lustre.

AI Hypercomputer Official Blog Aug. 4, 2025

Understanding Calendar mode for Dynamic Workload Scheduler: Reserve ML GPUs and TPUs - Dynamic Workload Scheduler Calendar mode provides up to 90 days of reserved GPU and TPU capacity for your ML workloads without long-term commitments.

AI Hypercomputer Official Blog TPU July 21, 2025

Announcing a new monitoring library to optimize TPU performance - A new monitoring library for Cloud TPUs provides observability and diagnostic tools to help you assess and improve the efficiency of your workloads.

AI Hypercomputer Generative AI Official Blog June 9, 2025

Accelerate your gen AI: Deploy Llama4 & DeepSeek on AI Hypercomputer with new recipes - Learn about new recipes on GitHub for deploying the latest Llama4 and DeepSeek models on the AI Hypercomputer platform.

AI Hypercomputer LLM Official Blog May 26, 2025

Introducing the next generation of AI inference, powered by llm-d - We’re making inference easier and more cost-effective with llm-d, an open-source, Kubernetes-native, distributed and disaggregated inference platform.

AI Hypercomputer Official Blog May 19, 2025

AI Hypercomputer developer experience enhancements from Q1 25: build faster, scale bigger - The article discusses recent enhancements to Google Cloud's AI Hypercomputer, designed to improve the AI developer experience. These enhancements include Pathways on Cloud for interactive scaling, Xprofiler for performance analysis, optimized container images for popular AI frameworks, and recipes for boosting GPU training efficiency.

AI Hypercomputer LLM Official Blog May 12, 2025

From LLMs to image generation: Accelerate inference workloads with AI Hypercomputer - Google Cloud is enhancing its AI Hypercomputer with new inference capabilities, including the Ironwood TPU, vLLM support for TPUs, and GKE Inference Gateway and Quickstart. JetStream, Google's inference engine, now integrates Pathways for lower latency and supports multi-host inference, while MaxDiffusion delivers improved image generation performance on TPUs. MLPerf™ Inference v5.0 results highlight the powerful inference performance of A3 Ultra (NVIDIA H200) and A4 (NVIDIA HGX B200) VMs.

AI AI Hypercomputer Official Blog April 14, 2025

High performance storage innovations for your AI workloads - Google Cloud introduces high-performance storage innovations to optimize AI workloads. Rapid Storage offers sub-millisecond latency and high throughput, while Anywhere Cache improves read-storage latency by up to 70%. Google Cloud Managed Lustre provides a fully managed parallel file system with sub-millisecond latency and high throughput. Storage Intelligence analyzes object metadata to generate storage insights and optimize costs.

AI Hypercomputer Official Blog TPU April 14, 2025

Introducing Ironwood TPUs and new innovations in AI Hypercomputer - Google Cloud introduces new innovations in AI Hypercomputer, including Ironwood TPUs, enhanced networking, and software capabilities for training and inference. Ironwood TPUs offer 5x more peak compute capacity and 6x the high-bandwidth memory capacity compared to the previous generation.

AI Hypercomputer Official Blog March 10, 2025

Guide: Our top four AI Hypercomputer use cases, reference architectures and tutorials - AI Hypercomputer, a fully integrated supercomputing architecture for AI workloads, offers various use cases with reference architectures and tutorials. It enables affordable inference with JAX, GKE, and NVIDIA Triton Inference Server, especially when paired with Spot VMs for significant cost savings.

Contact

Zdenko Hrček
Třebanická 183
Prague, Czech Republic
Phone: +420 777 283 075
Email: [email protected]
