Welcome to issue #501, May 4th, 2026

News

50+ fully managed MCP servers now available for Google Cloud services - At Google Cloud Next ‘26, we announced Google-managed MCP servers are available for everyone, featuring 50+ MCP servers across general availability (GA) and preview (with more along the way).

BigQuery Data Analytics Earth Engine GIS Official Blog

Mapping a smarter future with BigQuery and Google Earth AI models and datasets - New Google Earth AI models and datasets on BigQuery and Gemini Enterprise Agent Platform help to understand our planet and its communities.

AI Gemini LLM

Building with Gemini Embedding 2: Agentic multimodal RAG and beyond - Google has announced the general availability of Gemini Embedding 2, a unified model that maps text, images, video, audio, and documents into a single semantic space. This model allows developers to process interleaved multimodal inputs in a single request, significantly improving performance for tasks like agentic RAG, visual search, and content moderation. By supporting over 100 languages and offering features like task-specific prefixes and Matryoshka dimensionality reduction, the model provides a highly efficient and accurate foundation for building complex AI agents.
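
As a rough illustration of the Matryoshka dimensionality reduction mentioned above: embeddings trained this way are meant to stay useful when only their leading components are kept. The sketch below shows the generic client-side pattern (truncate, then re-normalize to unit length) on made-up 8-dimensional vectors. The function names and sample values are hypothetical, and this does not call the Gemini API or reflect its actual parameters.

```python
import math


def truncate_embedding(vec, dims):
    """Keep the first `dims` components and rescale the result to unit length."""
    if not 0 < dims <= len(vec):
        raise ValueError("dims must be between 1 and len(vec)")
    head = vec[:dims]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in head]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical vectors standing in for real model output.
doc = [0.9, 0.1, 0.3, 0.2, 0.05, 0.02, 0.01, 0.01]
query = [0.8, 0.2, 0.4, 0.1, 0.03, 0.01, 0.02, 0.01]

full_score = cosine(doc, query)
short_score = cosine(truncate_embedding(doc, 4), truncate_embedding(query, 4))
print(f"full: {full_score:.4f}  truncated to 4 dims: {short_score:.4f}")
```

Shorter vectors cut storage and search cost; how much ranking quality is kept at a given size depends on the model and should be measured on real data.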

Cloud Storage PyTorch

Speeding Up AI: Bringing Google Colossus to PyTorch via GCSFS and Rapid Bucket - Google Cloud has introduced a high-performance integration that connects Rapid Storage directly to PyTorch via the fsspec interface to eliminate AI training bottlenecks. By utilizing Google’s Colossus architecture and bidirectional gRPC streaming, the solution offers up to 15 TiB/s aggregate throughput and significant reductions in latency.

Articles, Tutorials

Infrastructure, Networking, Security, Kubernetes

CISO Official Blog

Cloud CISO Perspectives: At Next ‘26, why we’re multicloud and multi-AI - Following our news at Google Cloud Next, COO Francis deSouza talks multicloud, multi-AI cybersecurity in the agentic enterprise era.

TPU

Google Cloud TPU Architecture Versions Explained: From v1 to the Eighth Generation - A guide to Cloud TPU generations, what changed between them, and how to choose the right one for your workload.

Security

New to Google SecOps: Everything Counts (in Windowed Amounts) - This article details the various time windowing options available in Google Security Operations (SecOps), focusing on tumbling, hop, and sliding windows. It explains the unique characteristics and applications of each window type for effectively analyzing and visualizing event data. Understanding these windowing techniques is crucial for advanced security analysis and detection strategies within Google Cloud.
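
To make the difference between tumbling and hop windows concrete, here is a minimal sketch using the generic stream-processing definitions: tumbling windows sit back to back, while hop windows of the same size start at a shorter interval and therefore overlap. The function names and event timestamps are hypothetical, and the exact semantics in Google SecOps may differ. Sliding windows are left out because they are typically anchored to individual events rather than to a fixed grid.

```python
def hop_windows(start, end, size, hop):
    """Full windows of length `size` starting every `hop` units within [start, end]."""
    if size <= 0 or hop <= 0:
        raise ValueError("size and hop must be positive")
    windows = []
    t = start
    while t + size <= end:
        windows.append((t, t + size))
        t += hop
    return windows


def tumbling_windows(start, end, size):
    """Non-overlapping windows: a hop window whose hop equals its size."""
    return hop_windows(start, end, size, hop=size)


def count_per_window(event_times, windows):
    """Count events falling in each half-open window [lo, hi)."""
    return [sum(1 for t in event_times if lo <= t < hi) for lo, hi in windows]


# Hypothetical event timestamps, in minutes since the start of the hour.
events = [5, 12, 25, 31, 38, 55]

tumbling = tumbling_windows(0, 60, size=20)
hopping = hop_windows(0, 60, size=20, hop=10)

print(list(zip(tumbling, count_per_window(events, tumbling))))
print(list(zip(hopping, count_per_window(events, hopping))))
```

With hop windows a single event is counted in every window that covers it, which is why a burst that straddles a tumbling boundary can still show up in full in one of the overlapping windows.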

DevOps Google Kubernetes Engine Kubernetes

Instant-On Scaling: Eliminating Node Provisioning Delays in GKE with Capacity Buffers - Google Cloud's new GKE Active Buffer feature eliminates node provisioning delays by maintaining a pre-provisioned buffer of warm capacity. This allows latency-sensitive workloads to scale instantly, ensuring rapid response times and a seamless user experience.

App Development, Serverless, Databases, DevOps

Databases Gemini CLI

Your Databases Finally Speak Human - Announcing agent skills for Google Data Cloud — across Gemini CLI, Claude Code, Codex, and Antigravity.

Cloud Build DevOps Java

How We Reduced Cloud Build Time by 60% Using Maven Caching - This article demonstrates how to significantly reduce Google Cloud Build times for Java/Maven applications, which are often slowed by repeated dependency downloads. It outlines a method to achieve a 60% build speed improvement by creating a custom Docker image with pre-cached Maven dependencies, incurring minimal cost.

Cloud SQL

Demystifying max connections limit in Cloud SQL for PostgreSQL - Google Cloud SQL for PostgreSQL sets a default `max_connections` limit based on instance size, but this isn't a hard limit and can be customized. Manually setting this value overrides automatic adjustments during instance resizing, requiring users to manage it carefully while considering memory constraints.

DevOps SRE

When DevSecOps Met SRE: How We Hunted Down a GCP Security Incident and Made Our System Bulletproof - A real-world story of applied SRE principles, shift-left security, and blameless postmortems on Google Cloud Platform.

Cloud Firestore

Firestore levels up: Bringing the power of search and JOINs to NoSQL - Google Cloud Firestore has significantly expanded its query capabilities with the general availability of pipeline operations, introducing robust features like full text search, geospatial queries, and advanced subqueries for data joining and aggregation. These powerful additions, alongside new in-database data manipulation language (DML) operations, empower developers to build more complex and efficient applications directly within Firestore, bringing it closer to the functionality of traditional relational databases.

Data Analytics Databases GCP Experience Official Blog

UKG unlocks real-time workforce intelligence at scale with the Agentic Data Cloud - Learn how UKG scales their global workforce intelligence with Google’s Agentic Data Cloud.

BigQuery Cloud Asset Inventory DevOps

Reading a Google Cloud Organization Like a Database: Asset Inventory to BigQuery - How Cloud Asset Inventory plus BigQuery turn any Google Cloud organization into a queryable dataset for assessment, audit, and IAM analysis.

Big Data, Analytics, ML&AI

BigQuery FinOps Terraform

Enforcing SELECT * Restrictions in BigQuery - A Smart Policy-Tag Trick to Protect Your Cloud Bill.

Serverless Spark

Lakehouse Demystified — Part 3: Just enough about Spark notebooks powered by Google’s Managed Service for Apache Spark — serverless interactive sessions - This article details Google Cloud's Managed Service for Apache Spark's serverless interactive sessions, which power Spark notebooks for interactive data analysis and engineering. It highlights how this managed service provides an auto-scaling environment for notebooks, eliminating operational toil and accelerating development.

LLM TPU

Serve and Inference Gemma 4 on TPU - This article outlines how to efficiently serve Google's Gemma 4 multimodal model on Tensor Processing Units (TPUs) within Google Cloud. It details a step-by-step process using vLLM, a high-performance inference engine, to maximize hardware utilization and achieve rapid, sub-second inference for complex AI workflows.

Gemini Gemini CLI

How I used Gemini CLI to orchestrate a complex RAG migration - This article details how Gemini CLI, paired with the Conductor extension, orchestrated a complex RAG migration by applying AI to manage the entire project, not just code. This approach leverages spec-driven development, AI-driven Test-Driven Development (TDD), and human-in-the-loop collaboration to ensure consistent and high-quality cloud implementations.

LLM TPU

Beyond the Basics: 4.5x Performance with Disaggregated Serving on TPUs - This article demonstrates a significant performance boost for large language models (LLMs) on Google Cloud TPUs through a disaggregated serving architecture. By separating the prefill and decode phases and utilizing the GKE Inference Gateway, throughput increased 4.5x, from 3,000 to over 14,000 tokens/sec.

GPU TPU

vLLM on Google Cloud TPU: A Model Size vs Chip Cheat Sheet (With Interactive Tool) - This article provides guidance on efficiently running vLLM for large language model inference on Google Cloud TPUs. It helps users select the optimal TPU configuration by considering model size, HBM requirements, and cost, aiming to prevent issues like out-of-memory errors or overpaying for resources.
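
The sizing question above starts with simple arithmetic: model weights need roughly (number of parameters) x (bytes per parameter) of accelerator memory, before any KV cache or activations. The sketch below is a back-of-the-envelope check only. The function names, the 80% headroom factor, and the 16 GiB per-chip HBM figure are illustrative assumptions, not the article's tool or a real TPU specification.

```python
def weight_memory_gib(num_params, bytes_per_param=2):
    """Memory for the weights alone, in GiB (bf16/fp16 use 2 bytes per parameter)."""
    return num_params * bytes_per_param / 2**30


def weights_fit(num_params, hbm_gib_per_chip, num_chips,
                bytes_per_param=2, headroom=0.8):
    """True if the weights fit in `headroom` of total HBM.

    The remaining share is left for KV cache and activations; 0.8 is an
    assumed rule of thumb, not a measured value.
    """
    budget = hbm_gib_per_chip * num_chips * headroom
    return weight_memory_gib(num_params, bytes_per_param) <= budget


# An 8-billion-parameter model in bf16, on hypothetical 16 GiB chips.
params = 8e9
print(f"weights: {weight_memory_gib(params):.1f} GiB")
print("fits on 1 chip: ", weights_fit(params, hbm_gib_per_chip=16, num_chips=1))
print("fits on 2 chips:", weights_fit(params, hbm_gib_per_chip=16, num_chips=2))
```

Real deployments also need to budget for KV cache growth with batch size and context length, so an estimate like this is a lower bound on the memory required.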

AI Paywall

Google Just Shipped 13 Agent Skills. I Plugged Them Into Gemini CLI and Watched Code Quality Jump. - Google has launched 13 "Agent Skills" for its Gemini CLI and other AI agents, dramatically improving the quality and accuracy of generated code and architectural advice. These skills dynamically provide specialized, up-to-date documentation using an open-source format developed by Anthropic, allowing AI models to overcome reliance on outdated training data and deliver significantly more correct solutions.

Generative AI TPU

TPU vs. GPU: The Shift from General Purpose to Pure Performance - The article details the critical shift in AI hardware from general-purpose GPUs to Google's purpose-built Tensor Processing Units (TPUs) as generative AI scales. It highlights how TPUs offer superior power efficiency, sustained throughput, and a lower total cost of ownership, especially for production AI inference workloads, compared to GPUs. Leveraging innovations like systolic arrays and the JAX/MaxText software stack, TPUs provide significant economic advantages, enabling companies to dramatically reduce their AI operational costs.

LLM Model Armor

How to wear Model Armor 2: Integrating with ADK and LangChain - Secure your AI agents. Learn to interpret Model Armor API responses and implement direct security hooks in both LangChain and Google ADK.

Various

Official Blog Partners Startups

The founder’s AI foundation: The top announcements for startups from Next ‘26 - Three dozen AI, infrastructure, security, and sales announcements showing the latest from Google Cloud that helps founders ship AI products faster and easier.

Official Blog Public Sector

Welcome to the agentic era: Public sector highlights and reflections from Next ‘26 - Discover Google Cloud Next '26 highlights for the public sector. Learn how Gemini Enterprise, agentic platforms, and AI drive government missions.

Slides, Videos, Audio

Security Podcast - #274 AI, Zero Trust and Secure by Design Walk into a Bar...

Releases

Dataproc Serverless - New Managed Service for Apache Spark (formerly Dataproc on Compute Engine) subminor cluster image versions: 2.1.113-debian11, 2.1.113-rocky8, 2.1.113-ubuntu20, 2.1.113-ubuntu20-arm, 2.2.81-debian12, 2.2.81-rocky9, 2.2.81-ubuntu22, 2.2.81-ubuntu22-arm, 2.3.29-debian12, 2.3.29-ml-ubuntu22, 2.3.29-rocky9, 2.3.29-ubuntu22, 2.3.29-ubuntu22-arm

Billing - The AI Cost Summary Agent is now available in Preview. You can now use the AI Cost Summary Agent to analyze your AI costs and gain critical insights into your AI-related spend. The agent analyzes spending related to Gemini usage, including Gemini API and Vertex AI. This feature is available as a widget on the Billing Overview page for your Cloud Billing account. For more information, see Analyze your AI spend with the AI Cost Summary Agent.

Load Balancing - A new quota system governing the configuration size of Application Load Balancers is now available in Preview. This update increases the individual URL map size limit from 64 KB and 128 KB to 1 MB. For more information, see URL map size and quota units. Key aspects of this feature include: (1) Complexity-based quota: quota units reflect URL map complexity (number of rules, hostnames, and path matchers). (2) Scoped measurement: quota is measured and enforced on a per-project, per-region, or per-VPC basis, depending on the Application Load Balancer type. (3) Active consumption: only URL maps currently referenced by forwarding rules contribute to quota usage. (4) New URL map size limit: projects enabled for the new quota have a URL map size limit of 1 MB for global and regional external and internal Application Load Balancers. Classic Application Load Balancers remain restricted to 64 KB. For more information on increasing your limit or to participate in the preview, please contact Google Cloud Support. Separately, backend Cloud Storage buckets are available for regional external Application Load Balancers and regional internal Application Load Balancers. This feature is generally available. For more information, see: Set up a regional external Application Load Balancer with Cloud Storage buckets; Set up a regional internal Application Load Balancer with Cloud Storage buckets; Set up a regional external Application Load Balancer with Cloud Storage buckets in a Shared VPC environment; Set up a regional internal Application Load Balancer with Cloud Storage buckets in a Shared VPC environment.

Chronicle Security Operations - Check release page for new parsers.

GKE new features - Google Kubernetes Engine now offers support for AI zones. To learn more, see AI zones.

Service Mesh - Managed Cloud Service Mesh using the TRAFFIC_DIRECTOR implementation in the regular channel now supports a limited implementation of the EnvoyFilter API. To learn about the supported fields, extensions, and how to use EnvoyFilter for features like local rate limiting see Data plane extensibility with EnvoyFilter. To troubleshoot any issue while configuring, see Resolving data plane extensibility issues.

API Gateway - New validations on paths in API configurations: API Gateway now enforces stricter syntax validations on templated paths when you create new API configurations and gateways. See path templating syntax rules and limits for more information.

Dataproc - New Managed Service for Apache Spark (formerly Dataproc on Compute Engine) subminor cluster image versions: 2.1.113-debian11, 2.1.113-rocky8, 2.1.113-ubuntu20, 2.1.113-ubuntu20-arm, 2.2.81-debian12, 2.2.81-rocky9, 2.2.81-ubuntu22, 2.2.81-ubuntu22-arm, 2.3.29-debian12, 2.3.29-ml-ubuntu22, 2.3.29-rocky9, 2.3.29-ubuntu22, 2.3.29-ubuntu22-arm

Cloud TPU - Generally available: Cloud TPU now offers TPU availability in AI zones. To learn more, see About AI zones.

Cloud Storage - Cloud Storage now offers support for AI zones. To learn more, see AI zones.

Cloud Trace - Cloud Trace is a service covered by the Cloud Observability (Monitoring, Logging, Trace) Service Level Agreement (SLA). Google Cloud Observability has expanded the supported locations for observability buckets, which store your trace data, to include the following: australia-southeast1, europe-central2, europe-north1, europe-southwest1, europe-west2, europe-west10, europe-west12, me-central2, northamerica-northeast1, us-east4. For a list of supported locations, see Locations for observability buckets.

Compute Engine - Generally available: Compute Engine now offers support for AI zones. To learn more, see AI zones. Preview: In an autoscaled managed instance group (MIG), you can monitor individual autoscaling events and view details to understand the reasons behind each autoscaling decision. For more information, see Monitor autoscaling events. Generally available: Compute Engine has enabled support of Spot VMs in Google Cloud Dedicated universes. Spot VMs are available for C3, M3, and A3 machine series. Use Spot VMs for workloads that can withstand preemption to receive a discount of up to 60% off the on-demand price. For the latest pricing information, see the pricing page. For information about how Spot VMs work, see Spot VMs and Create and use Spot VMs.

Workstation - The preconfigured base images include a notification when the running_timeout for the workstation is close to being reached.

BigQuery - You can now create materialized views over active change data capture (CDC) enabled tables. This feature is generally available (GA). You can now use the PARTITION BY clause of the CREATE VECTOR INDEX statement to partition TreeAH vector indexes. Partitioning enables partition pruning and can decrease I/O costs. This feature is Generally Available. Strict act-as mode is enforced globally for all Dataform repositories, requiring the use of a custom service account or user credentials for running Dataform workflows, BigQuery pipelines, notebooks, and data preparations. You can now use the VECTOR_INDEX.STATISTICS function to calculate how much an indexed table's data has drifted between when a vector index was created and the present. If table data has changed enough to require a vector index rebuild, you can use the ALTER VECTOR INDEX REBUILD statement to rebuild the vector index without downtime. These features are generally available (GA). Starting May 7, 2026, new transfer configurations that transfer data from Google Ads using the BigQuery Data Transfer Service will require Multi-factor authentication (MFA) for individual user authentication. For more information, see May 7, 2026.

Cloud Asset Inventory - The following resource types are publicly available through the ExportAssets, ListAssets, BatchGetAssetsHistory, QueryAssets, Feed, SearchAllResources, and SearchAllIamPolicies APIs. App Lifecycle Manager: saasservicemgmt.googleapis.com/Saas, saasservicemgmt.googleapis.com/Tenant, saasservicemgmt.googleapis.com/UnitKind, saasservicemgmt.googleapis.com/Unit, saasservicemgmt.googleapis.com/Release. Backup and DR: backupdr.googleapis.com/BackupPlanRevision. Parallelstore: parallelstore.googleapis.com/Instance. Vertex AI: aiplatform.googleapis.com/DeploymentResourcePool.

AlloyDB - When the initial user or password is unspecified during cluster creation, a locked postgres role with null password is created.

Dataplex - Dataproc and Google Cloud Serverless for Apache Spark are now unified under the Managed Service for Apache Spark brand. This change consolidates our managed Spark deployment options into a single umbrella brand that includes the full breadth of our Spark capabilities. No existing functionality is being removed as part of this change, and there is no impact to the Dataproc API, metastore, client library, CLI, or IAM names.

Dataform - You can use custom constraints with Organization Policy to provide more granular control over specific fields for the Folder and TeamFolder resources. For more information, see Create custom organization policy constraints. This feature is generally available (GA). Strict act-as mode is enforced globally for all Dataform repositories, requiring the use of a custom service account or user credentials for running Dataform workflows, BigQuery pipelines, notebooks, and data preparations.

Cloud Interconnect - Managed traffic classification for Cloud Interconnect is available in Preview. This feature automates the assignment of differentiated services field codepoint (DSCP) bits in your outgoing packets. For more information, see Configure managed traffic classification.

Database Migration Service - Database Migration Service for heterogeneous migrations to Cloud SQL for PostgreSQL and AlloyDB for PostgreSQL now supports PostgreSQL version 18. For more information, see Supported source and destination databases.

Resource Manager - Generally Available: The Resource Manager remote MCP server is now generally available. The remote MCP server lets AI agents dynamically search for and identify all Google Cloud projects that you have the necessary permissions to access. This ensures that agents have the correct identifiers (such as project ID, project number, and lifecycle state) before attempting more specific resource configurations. For more information, see Use the Resource Manager remote MCP server.

Bigtable - You can use Bigtable agent skills to let AI agents assist with Bigtable-related tasks, such as schema design, generating SQL queries, and infrastructure management.

CDN - Google Kubernetes Engine (GKE) Gateway supports Cloud CDN to help you cache content closer to your users, improve application latency, and reduce origin load. Using GKE Gateway APIs, you can configure, manage, and fine-tune caching configurations for different segments of your traffic. This feature is Generally Available. For more information, see Configure Cloud CDN for Gateway.

VPC Service Controls - VPC Service Controls feature: Support for using IAM roles in ingress and egress rules to allow access to resources protected by a service perimeter is generally available. This feature includes the following updates: You can use the `gcloud access-context-manager supported-permissions describe` command to check the support status of an IAM role. You can use the `gcloud access-context-manager supported-permissions list` command to retrieve the complete list of all supported permissions. For more information, see Configure IAM roles in ingress and egress rules.

Cloud Logging - Custom data retention periods are supported for the _Default log bucket and for user-defined log buckets in Google Cloud Dedicated universes. For more information, see Configure custom retention.

Cloud SQL SQL Server - If a specific active query is blocked or running much longer than expected, it can block other dependent queries. Cloud SQL for SQL Server offers an optional feature that lets you view and terminate blocking queries. For more information, see Blocked active queries (Preview).

Cloud SQL Postgres - Cloud SQL has made the following enhancements to expand the list of eligible Cloud SQL Enterprise Plus edition instances that support planned operations with near-zero downtime. Instances with connector enforcement enabled are eligible for planned operations with near-zero downtime. Instances that use private services access with a non-RFC 1918 IP address are eligible for planned operations with near-zero downtime.

Cloud Architecture Center - (New guide) Build trusted AI agents with Google Maps Platform: A high-level architecture to build trustworthy and effective AI agents by grounding them in real-world maps and calendar data.

Chronicle SOAR - Release 6.3.83 is now available for all regions. Enhanced "Time to respond" options for multi-choice questions: Google SecOps now provides more granular control over playbook execution when the "time to respond" for a MultiChoiceQuestion step is exceeded. When configuring a multi-choice question, you can now choose to proceed with one of the predefined answer branches or to create a dedicated branch to handle this scenario. For more information, see Add a multi-choice question flow. Release 6.3.84 is being rolled out to the first phase of regions as listed here. This release contains internal and customer bug fixes.

Contact

Zdenko Hrček
Třebanická 183
Prague, Czech Republic
Phone: +420 777 283 075

Google Cloud Platform Newsletter

Check Archive for older issues