Tag: LLM

Chronicle Gemini LLM July 14, 2025

Implementing Custom AI Investigators in Vertex AI for Google SecOps - This article details how to leverage Gemini AI models within Google Cloud's Vertex AI and SecOps platform to automate alert triage.

Generative AI LLM July 14, 2025

Gen AI Evaluation Service — Model-Based Metrics - The article discusses model-based metrics within Vertex AI's Gen AI Evaluation Service, highlighting the use of LLMs as judges for evaluating model outputs. It covers built-in metrics for translation, fluency, safety, and more, as well as custom metrics, including RAG triad implementation.

ADK Generative AI LLM Official Blog July 7, 2025

How to build a simple multi-agentic system using Google’s ADK - Learn how to build a simple multi-agentic system in just a few steps using Google’s ADK – Agent Development Kit.

ADK Generative AI LLM Official Blog July 7, 2025

A guide to converting ADK agents with MCP to the A2A framework - Transform standalone Google ADK agents into collaborative A2A components. This guide shows how to unlock multi-agent potential and interoperability.

AI LLM Official Blog June 30, 2025

You dream it, Veo creates it: Veo 3 is now available for everyone in public preview on Vertex AI - Veo 3 text-to-video is now available for all Google Cloud customers in public preview on Vertex AI. Learn more about Veo 3 and try it on Vertex AI Media Studio today.

AI Data Science LLM Paywall June 30, 2025

Talk to Your Docs: Building a Scalable RAG App on Google Cloud - Build a modular, secure, and scalable RAG system with Python, LangChain, and Google Cloud’s serverless stack.

Gemini LLM Official Blog Vertex AI June 30, 2025

How to fine-tune Gemini 2.5 using videos via Vertex AI - Gemini 2.5 is making it possible to fine-tune video outputs on Vertex AI. Read more to learn how to conduct truly effective tuning experiments using the Vertex AI tuning service.

GCP Certification LLM June 23, 2025

Want to become Google Cloud GenAI Leader? Your Guide to AI Mastery - The article provides a comprehensive guide to the Generative AI certificate on Google Cloud, covering fundamental concepts, models, the ML lifecycle, and Google Cloud's GenAI offerings like Gemini and Vertex AI.

LLM June 23, 2025

Using HTTP endpoints as tools with MCP Toolbox for Databases - Turn external endpoints into MCP compatible tools using MCP Toolbox.

AI LLM Official Blog June 16, 2025

Cloud CISO Perspectives: How Google secures AI Agents - To help mitigate potential agentic AI risks, we need to invest in a new field of study focused specifically on securing agent systems.

LLM Vertex AI June 9, 2025

Step-by-Step: Serving PyTorch Models with a Custom Handler on Vertex AI - The article provides a step-by-step guide on deploying PyTorch models with custom handlers on Google Cloud's Vertex AI.

Generative AI LLM Official Blog May 26, 2025

Mistral AI's Le Chat Enterprise and Mistral OCR 25.05 model are available on Google Cloud - Announcing the availability of Mistral AI’s Le Chat Enterprise and Mistral OCR 25.05 model on Google Cloud.

Generative AI LLM Official Blog May 26, 2025

Announcing Anthropic’s Claude Opus 4 and Claude Sonnet 4 on Vertex AI - Today, we're announcing the expansion of our Model Garden collection with the addition of two new models from Anthropic: Claude Opus 4 and Claude Sonnet 4.

AI Hypercomputer LLM Official Blog May 26, 2025

Introducing the next generation of AI inference, powered by llm-d - We’re making inference easier and more cost-effective with llm-d, an open-source, Kubernetes-native, distributed and disaggregated inference platform.

Generative AI LLM Official Blog Vertex AI May 19, 2025

Evaluate your gen media models with multimodal evaluation on Vertex AI - Google Cloud introduces Gecko on Vertex AI, a new rubric-based autorater for evaluating generative AI models, addressing the challenge of assessing the quality of images and videos produced by models like Lyria, Imagen, and Veo.

AI Hypercomputer LLM Official Blog May 12, 2025

From LLMs to image generation: Accelerate inference workloads with AI Hypercomputer - Google Cloud is enhancing its AI Hypercomputer with new inference capabilities, including the Ironwood TPU, vLLM support for TPUs, and GKE Inference Gateway and Quickstart. JetStream, Google's inference engine, now integrates Pathways for lower latency and supports multi-host inference, while MaxDiffusion delivers improved image generation performance on TPUs. MLPerf™ Inference v5.0 results highlight the powerful inference performance of A3 Ultra (NVIDIA H200) and A4 (NVIDIA HGX B200) VMs.

Gemini LLM Vertex AI May 5, 2025

Vertex AI Batch Generation - Vertex AI Batch Generation offers a 50% cost reduction for Gemini models by enabling parallel processing of multiple multimodal requests as asynchronous jobs. It supports data from Cloud Storage (in JSONL format) and BigQuery, allowing for processing of text, images, audio, and video. The tool offers customizable generation parameters and can be integrated with other tools like Google Search.

Gemini Generative AI LLM May 5, 2025

DeepEval adds native support for Gemini as an LLM Judge - DeepEval, an open-source evaluation framework for LLMs, now natively supports Gemini models via the unified Google GenAI SDK. This integration allows users to utilize Gemini as an LLM Judge within DeepEval, either through the command line or directly in code, on both Vertex AI and Google AI. The new integration simplifies the process of evaluating LLM outputs, offering more flexibility and options for developers.

LLM Official Blog May 5, 2025

Create chatbots that speak different languages with Gemini, Gemma, Translation LLM, and Model Context Protocol - The article introduces an architecture leveraging Google's Gemma, Translation LLM, and Gemini models, orchestrated via Model Context Protocol (MCP), to build multilingual chatbots. This approach uses specialized AI models for tasks like translation and complex reasoning, improving efficiency and maintainability. A GitHub repository is provided to illustrate the architecture, highlighting its adaptability and ease of management for various applications.

Gemini LLM Paywall April 28, 2025

Chatting With Dataproc Clusters Using An AI Agent - A Dataproc AI agent built using Google’s Agent Development Kit (ADK).

Databases Generative AI LLM Official Blog April 28, 2025

Google Cloud Database and LangChain integrations now support Go, Java, and JavaScript - Google Cloud Database and LangChain integrations now support Go, Java, and JavaScript. Developers can now create intricate workflows and easily interchange underlying components as needed to align with specific use cases.

AI Kubernetes LLM April 28, 2025

Inference Gateway: Intelligent Load Balancing for LLMs on GKE - GKE Inference Gateway, an extension of the standard GKE Gateway controller, optimizes routing and load balancing for generative AI workloads on Google Kubernetes Engine (GKE). It addresses the unique challenges of stateful LLM inference, such as cache affinity and queue length awareness, to maximize cache reuse and minimize prefill operations.

Firebase Generative AI LLM April 21, 2025

Extending Your AI Application with Genkit MCP - This article introduces Genkit MCP, a framework for connecting large language models to external data and tools. It explains how to integrate Genkit MCP with Genkit, Google Firebase's AI orchestration framework, using the genkitx-mcp plugin.

BigQuery LLM April 21, 2025

Forecasting the Future with BigQueryML TimesFM: A Game-Changer in Time Series Analysis - TimesFM, a foundation model built on Google's Large Language Model (LLM) architecture, revolutionizes time series forecasting by offering zero-shot performance and eliminating the need for dataset-specific training. Integrated into BigQuery ML as the AI.FORECAST function, it empowers data analysts to generate sophisticated forecasts using familiar SQL syntax.

LLM Security April 21, 2025

Shielding Your AI Models: A Dive into Google Cloud Model Armor for Securing LLMs - An overview of Google Cloud Model Armor.

Gemini LLM Official Blog April 14, 2025

Gemini 2.5 brings enhanced reasoning to enterprise use cases - Gemini 2.5, Google's latest AI model, offers enhanced reasoning capabilities for enterprise use cases. Gemini 2.5 Pro excels in complex tasks requiring deep reasoning and coding expertise, while Gemini 2.5 Flash prioritizes speed, low latency, and cost-efficiency for high-volume applications. Vertex AI Model Optimizer helps choose the best model for specific needs, and the Live API enables real-time interactions with streaming audio, video, and text.

LLM Official Blog Vertex AI April 14, 2025

Vertex AI is now the only platform with generative media models across video, image, speech, and music - Google Cloud's Vertex AI platform now offers Lyria, a text-to-music model, making it the only platform with generative media models across video, image, speech, and music.

Gemini Generative AI LLM April 14, 2025

Building AI Agents with Google’s Agent Development Kit (ADK) as MCP Client — A Deep Dive (Full Code) - This article explores how to use the open-source Agent Development Kit (ADK) to build an agent setup assistant using the Model Context Protocol (MCP) with Google Gemini LLM as an MCP Client.

AI Kubernetes LLM Official Blog April 7, 2025

Google, Bytedance, and Red Hat make Kubernetes generative AI inference aware - Google, ByteDance, and Red Hat have collaborated to enhance Kubernetes for generative AI inference. New capabilities include LLM-aware routing, an inference performance project for benchmarking, and Dynamic Resource Allocation for efficient scheduling of accelerators.

AI Cloud Run LLM Serverless March 31, 2025

Deploy your first LLM on GCP: Gemma with Cloud Run (Serverless & GPU-powered) - Deploy your own powerful language model (LLM) in the cloud using Google Cloud Run GPU. With Ollama and Cloud Run, you can easily run LLMs like Gemma without the need for a powerful local GPU. This serverless solution allows you to pay only when your model is in use, making it cost-effective for occasional or on-demand workloads.

Cloud Firestore Generative AI LLM March 31, 2025

RAG with a PDF using LlamaIndex and SimpleVectorStore on Vertex AI - A sample on how to set up a PDF-based RAG pipeline with LlamaIndex and Vertex AI is presented. The process involves loading the PDF into documents, splitting the documents into chunks, creating vector embeddings, and querying the index with a Vertex AI model.

AI LLM Security March 31, 2025

Leveraging GCP Model Armor for Robust LLM and Agentic AI Security - Model Armor, a conceptual framework for securing AI models, addresses vulnerabilities in agentic AI systems, such as prompt injection, data poisoning, model extraction, and unintended consequences.

Cloud Spanner Databases Generative AI LLM Official Blog March 24, 2025

Build GraphRAG applications using Spanner Graph and LangChain - Spanner Graph and LangChain streamline GraphRAG development by combining Spanner Graph's enterprise-grade reliability, scalability, and distributed graph processing with LangChain's versatile tools. This enables rapid prototyping of intelligent applications and unlocks valuable data insights. GraphRAG outperforms conventional RAG by providing richer, more informative answers through graph traversals and context retrieval.

Cloud Run LLM Official Blog March 17, 2025

How to deploy serverless AI with Gemma 3 on Cloud Run - Gemma 3, a family of lightweight, open AI models, is now available on Cloud Run. Gemma 3 is engineered for exceptional performance with lower memory footprints, making it ideal for cost-effective inference workloads.

Cloud Run Generative AI LLM March 17, 2025

Ollama on Cloud Run with GPU in less than 20 seconds!

Gemini LLM Official Blog March 10, 2025

Use Gemini 2.0 to speed up document extraction and lower costs - Gemini 2.0, a powerful AI tool, can help businesses speed up document extraction and lower costs. This article presents a multi-step approach to document extraction using Gemini 2.0 and structured, externalized rules. This method offers advantages like modular extraction, externalized rule management, and easy integration with existing data pipelines.

LLM Official Blog Vertex AI March 3, 2025

Evaluate gen AI models with Vertex AI evaluation service and LLM comparator - Vertex AI evaluation service and LLM Comparator are tools that help evaluate and compare generative AI models. Vertex AI evaluation service allows users to define custom metrics and perform pairwise evaluations, while LLM Comparator provides human-in-the-loop evaluation capabilities with visualizations and insights.

Generative AI LLM Official Blog March 3, 2025

Announcing Claude 3.7 Sonnet, Anthropic’s first hybrid reasoning model, is available on Vertex AI - Anthropic's Claude 3.7 Sonnet, the first hybrid reasoning model, is now available on Vertex AI. It combines rapid responses with step-by-step reasoning visible to users and is optimized for real-world use cases.

AI Generative AI Kubernetes LLM March 3, 2025

Streamline your LangChain deployments with LangServe - LangServe is a helpful tool designed to simplify the deployment of LangChain applications as REST APIs.

Generative AI LLM Vertex AI Feb. 3, 2025

Running DeepSeek: From Open Source Model to Production-Ready API on Google Cloud — VertexAI - This guide breaks down the end-to-end deployment of the 7B parameter language model DeepSeek, tackling performance, cost optimization, and best practices to make it efficient, responsive, and cloud-native on Google Cloud Vertex AI.

AI LLM Feb. 3, 2025

How Generative AI Transforms Enterprise Data Insights with Google Gemini and Teradata - GenAI tools like Google Gemini and Teradata Vantage are transforming the way businesses analyze vast amounts of unstructured data.

Generative AI LLM Jan. 27, 2025

Improve the RAG pipeline with RAG triad metrics - This article discusses how to improve the performance of a Retrieval-Augmented Generation (RAG) pipeline using RAG triad metrics (answer relevancy, faithfulness, and contextual relevancy).

AI Cloud Run LLM Jan. 27, 2025

Building GenAI Chat App (Part 1): How to Deploy Gemma 2 on Cloud Run Utilizing Ollama - This article provides a step-by-step guide on deploying Gemma 2, an open-source large language model, on Google Cloud Run using Ollama.

AI BigQuery LLM Jan. 20, 2025

What is an agent, and does your data need one? - This blog introduces the idea of agents and explores the opportunities (and challenges) they bring to the world of data.

Generative AI LLM Jan. 13, 2025

Evaluating RAG pipelines - This article goes through different approaches to evaluating RAG pipelines and what metrics to use.

Gemini LLM Dec. 16, 2024

Building product recommendation bot using Gemini — Part 2— Voice, Gemini 2.0 announcement - In this part we cover giving our bot a voice input (using Speech-to-Text or LLM) and voice output (using Text-to-Speech or new Gemini 2.0).

Gemini Generative AI LLM Dec. 16, 2024

Introducing Gemini 2.0: our new AI model for the agentic era - An Introduction to Gemini 2.0 Flash and other AI-related projects.

Google Kubernetes Engine Kubernetes LLM Dec. 9, 2024

Deploying vLLM on Google Cloud: A Guide to Scalable Open LLM Inference - This guide explores deploying a production-ready LLM inference service on Google Cloud Platform using vLLM. It includes a step-by-step deployment guide, configuration considerations, and production best practices for memory management, request handling, Kubernetes infrastructure setup, and security.

Cloud Run Gemini LLM Dec. 9, 2024

Deploying LlamaIndex Workflows to Cloud Run with Llama Deploy - This guide provides a comprehensive walkthrough of deploying custom LLM workflows on Google Cloud Run with Llama Deploy. It covers containerization, building an interactive Flask app, and empowering users to deploy and scale AI solutions with ease. The full code for the sample application is available in the provided repository.

Cloud Firestore Cloud Run LLM Vertex AI Dec. 9, 2024

Deploying AI Agents on Google Cloud Platform - Deploying AI agents with large language models (LLMs) can be challenging, but this article demonstrates how to do it cost-effectively on Google Cloud Platform using LangChain and LangGraph. The technology stack includes Firestore for the vector store, Vertex AI for text embedding and the LLM, Cloud Run for deployment, Cloud Functions for preprocessing, and Cloud SQL for persistence.

BigQuery LLM Dec. 2, 2024

Text-to-SQL with Gemini and BigQuery: Using LlamaIndex to Simplify Dynamic Prompt Generation - This article demonstrates how to build a text-to-SQL application using LlamaIndex, Gemini, and Google BigQuery. It addresses common challenges like handling dynamic business context, multiple tables, and dynamic prompts. Real-world applications include business intelligence dashboards, customer support tools, data exploration, and data engineering.

Gemini LLM Monitoring OpenTelemetry Dec. 2, 2024

Tracing with Langtrace and Gemini - Langtrace is an open-source observability tool that helps you improve your Large Language Model (LLM) apps by collecting and analyzing traces. It has an SDK to collect traces from LLM APIs, Vector Databases, and LLM-based Frameworks. The traces are OpenTelemetry compatible and can be exported to Langtrace or any other observability stack.

LLM Official Blog Vertex AI Nov. 25, 2024

Announcing Mistral AI’s Large-Instruct-2411 on Vertex AI - Google Cloud has announced the availability of Mistral AI's newest model, Mistral-Large-Instruct-2411, on Vertex AI Model Garden. This advanced dense large language model (LLM) has 123B parameters and offers strong reasoning, knowledge, and coding capabilities.

API LLM Official Blog Nov. 25, 2024

Don't let resource exhaustion leave your users hanging: A guide to handling 429 errors - This article explores strategies to handle 429 resource exhaustion errors when using large language models (LLMs) in production. It discusses three practical approaches: backoff and retry mechanisms, dynamic shared quota, and provisioned throughput.
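
The first of the three approaches, backoff and retry, can be sketched in a few lines. This is a minimal illustration rather than the article's code: `ResourceExhaustedError` and `call_with_backoff` are hypothetical names, and the exception stands in for whatever 429 error your client library raises.

```python
import random
import time


class ResourceExhaustedError(Exception):
    """Hypothetical stand-in for the 429 error raised by a client library."""


def call_with_backoff(fn, max_attempts=5, base_delay=1.0, max_delay=32.0,
                      sleep=time.sleep):
    """Call fn(), retrying on ResourceExhaustedError with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except ResourceExhaustedError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            # Double the ceiling on each retry, capped at max_delay.
            ceiling = min(max_delay, base_delay * 2 ** attempt)
            # "Full jitter": wait a random time below the ceiling so that
            # many clients do not retry in lockstep.
            sleep(random.uniform(0, ceiling))
```

Passing `sleep` as a parameter is a design choice for testability; in production the default `time.sleep` applies.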

Gemini LLM Official Blog Vertex AI Nov. 18, 2024

Use AI to build AI: Save time on prompt design with AI-powered prompt writing - Vertex AI introduces AI-powered prompt writing tools to streamline prompt engineering for generative AI models. The Generate prompt feature creates comprehensive prompts from simple objectives, while the Refine prompt feature provides AI-powered suggestions for prompt improvement based on user feedback.

Cloud Run Generative AI LLM Official Blog Nov. 18, 2024

How to deploy Llama 3.2-1B-Instruct model with Google Cloud Run GPU - Deploy the Meta Llama 3.2 1B Instruct model on Cloud Run with NVIDIA GPUs. This guide provides step-by-step instructions for local model testing, deployment, and cold start improvements using Cloud Storage FUSE.

AI LLM Official Blog Threat Intelligence Nov. 18, 2024

Pirates in the Data Sea: AI Enhancing Your Adversarial Emulation - This blog post discusses how artificial intelligence (AI) and large language models (LLMs) can be used to enhance adversarial emulation and improve cybersecurity.

LLM Security Nov. 18, 2024

LLM Guard and Vertex AI - LLM Guard is a comprehensive security toolkit for LLMs, offering input and output scanners for sanitization, harmful language detection, data leakage prevention, and more. It integrates with Vertex AI, allowing users to securely interact with LLMs and protect sensitive information. LLM Guard also includes anonymize and de-anonymize scanners to ensure personal data is not shared with the LLM.

LLM Vertex AI Web3 Nov. 11, 2024

Talk to Your Cronos Data: AI Agent based User Experiences for Blockchain Insights - This article explores the use of AI agents to simplify complex blockchain data and enhance its accessibility. By leveraging Google Cloud's Vertex AI and Agent Builder, users can interact with on-chain data using natural language queries, making blockchain insights readily available to a broader audience.

LLM Vertex AI Nov. 4, 2024

PPT Query Tool with Google’s LLMs - This blog post introduces a pipeline that allows users to quickly retrieve relevant slides from content-heavy slide decks using Google's Vertex AI Text Embedding model and cosine similarity. The pipeline involves converting PowerPoint slides to images, generating embeddings for semantic search, storing embeddings and metadata in BigQuery, processing user queries, and retrieving top matches with summarization.
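
The retrieval step, ranking slide embeddings by cosine similarity to the query embedding, might look roughly like the sketch below. It assumes embeddings are already available as plain lists of floats; `cosine_similarity` and `top_matches` are illustrative names, and the post itself stores embeddings in BigQuery rather than in a Python dict.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def top_matches(query_vec, slide_vecs, k=3):
    """Return the ids of the k slides most similar to the query."""
    scored = [(cosine_similarity(query_vec, vec), slide_id)
              for slide_id, vec in slide_vecs.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [slide_id for _, slide_id in scored[:k]]
```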

Gemini GitHub LLM Official Blog Nov. 4, 2024

Gemini models are coming to GitHub Copilot - GitHub Copilot, a popular AI coding assistant, is partnering with Google Cloud to bring Gemini models to its platform. Developers will soon be able to use Gemini 1.5 Pro, which excels in code generation, analysis, and optimization, within GitHub Copilot.

LLM Official Blog Partners Vertex AI Oct. 28, 2024

Announcing Anthropic’s upgraded Claude 3.5 Sonnet on Vertex AI - Anthropic's upgraded Claude 3.5 Sonnet model is now generally available on Vertex AI, featuring a new "computer use" capability in public beta. This means you can direct the model to generate computer actions, like keystrokes and mouse clicks, allowing it to interact with your user interface (UI).

Generative AI LLM Oct. 28, 2024

Building a Scalable Pipeline for Continuous Document Indexing to Power RAG based Q&A - Continuous Document Indexing and Q&A Solution with Google Cloud, Redis, and LangChain.

Generative AI LLM Machine Learning Vertex AI Oct. 28, 2024

Model Alignment Through Automatic Prompt Updates From User Feedback - Google researchers have developed a technique to automatically improve prompts for language models based on user-provided feedback. The technique is available as an open-source Python library and through a user interface in Vertex AI Studio, making it easy for prompt developers to use. Prompt refinement with this model alignment technique has been shown to boost the quality of prompts and save significant time during prompt design.

Generative AI LLM Machine Learning Oct. 28, 2024

Designing Cognitive Architectures: Agentic Workflow Patterns from Scratch - This article explores 8 advanced agentic workflow patterns that enhance the capabilities of AI systems using Large Language Models (LLMs) and AI agents.

Generative AI LLM Oct. 14, 2024

Multi-Agent interactions with Autogen and Gemini — Part 2 : Terminating Conversations - This is the second part of a series exploring multi-agent conversations using the Autogen framework. In this part, we focus on terminating conversations based on specific feedback from one of the agents. We introduce the `is_termination_msg` condition in the CFP Writer agent, which checks for a specific message (e.g., "Looks good") in the response from the CFP Reviewer agent. If this message is detected, the conversation is terminated. We also modify the System Message for the CFP Reviewer to suggest mentioning "looks good" when there are no significant improvements. The final code is provided, and the complete repository is available on GitHub.
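
A termination check of the kind described could be sketched as follows. This is an assumption-laden illustration, not the article's code: it presumes the callable receives a message dict with a `"content"` key, and it matches the phrase case-insensitively.

```python
def is_termination_msg(message):
    """Return True when the reviewer's reply signals approval."""
    # Content can be missing or None (for example on tool-call messages),
    # so fall back to an empty string before searching.
    content = message.get("content") or ""
    return "looks good" in content.lower()
```

A predicate like this would be supplied as the `is_termination_msg` argument when constructing the CFP Writer agent.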

Generative AI LLM Machine Learning Oct. 14, 2024

Building ReAct Agents from Scratch: A Hands-On Guide using Gemini

Generative AI LLM Official Blog Oct. 7, 2024

When to use supervised fine-tuning for Gemini - This article delves into what SFT (Supervised Fine-Tuning) is, when to embrace SFT, and how it compares to other methods for optimizing your model's output.

AI Cloud Firestore LLM Oct. 7, 2024

Persisting LLM chat history to Firestore - LangChain and Firestore can be used to build LLM powered chat applications. LangChain's RunnableWithMessageHistory helps manage chat message history, while Firestore's FirestoreChatMessageHistory stores messages to a Firestore collection. This allows for persistent chat history and more meaningful conversations. Future posts will explore advanced use cases for Firestore in LLM applications.

Generative AI LLM Sept. 30, 2024

Fine-Tuning LLM: Crafting Personalized Ad Copy with Tuned-Gemini Model - This article presents a proof of concept for fine-tuning large language models for specific marketing needs. The authors fine-tuned Google's Gemini-1.0-pro-002 model on a dataset of Facebook ads from Planned Parenthood Action, resulting in a specialized AI capable of generating ad copy that closely mimics the organization's style and messaging.

Generative AI LLM Official Blog Partners Vertex AI Sept. 30, 2024

Meta's Llama 3.2 is now available on Google Cloud - Meta's new generation of multimodal models, Llama 3.2, is now available on Google Cloud's Vertex AI Model Garden. Llama 3.2 includes new vision and lightweight models designed for edge devices, enabling more private and personalized AI experiences.

AlloyDB LLM Sept. 23, 2024

AI on your Laptop with AlloyDB Omni and Ollama. - AlloyDB Omni, the downloadable edition of the PostgreSQL-compatible AlloyDB database, can be integrated with Ollama, an open-source tool for running large language models locally, to generate embeddings from user inputs stored in databases. The process involves setting up Ollama on the local laptop, integrating AlloyDB Omni with Vertex AI, setting up input and output transformation, and loading the local model into AlloyDB Omni.

Gemini LLM Official Blog Sept. 16, 2024

Test it out: an online shopping demo experience with Gemini and RAG - An online shopping demo showcases how Gemini, a large language model, can enhance the shopping experience by providing personalized recommendations. Retrieval-Augmented Generation (RAG) improves the accuracy of Gemini's responses by incorporating relevant data from an external database, ensuring that recommendations are based on actual products in the store's inventory.

Cloud Functions LLM Sept. 16, 2024

Using Google Workflows and Cloud Functions to build an LLM-based app - This article describes the architecture of Biglang.app, a language-learning tool that leverages LLMs to enable users to create exercises from any content they like.

LLM Official Blog Vertex AI Vertex AI Agent Builder Sept. 16, 2024

Next-gen search and RAG with Vertex AI - Vertex AI offers a comprehensive suite of tools and services to build next-gen search applications. It provides out-of-the-box solutions for building semantic and hybrid search applications, as well as DIY APIs for developers who want to construct their own end-to-end RAG solutions. Vertex AI Search can be used to tackle analytical queries, and it integrates with other Google Cloud capabilities such as Vertex AI Agent Builder and BigQuery data canvas.

BigQuery Data Analytics Generative AI LLM Official Blog Partners Sept. 9, 2024

BigQuery and Anthropic’s Claude: A powerful combination for data-driven insights - Google Cloud has integrated Anthropic's Claude models with BigQuery, allowing organizations to leverage advanced AI capabilities directly within their data platform. This integration enables tasks like text generation, summarization, translation, and more, to be performed directly on data in BigQuery.

AI Generative AI LLM Official Blog Sept. 2, 2024

Magic partners with Google Cloud to train frontier-scale LLMs - Magic, a generative AI startup, has partnered with Google Cloud to build two new cloud-based supercomputers to support its mission of developing code assistants with a context window reaching 100 million tokens.

AI LLM Machine Learning Vertex AI Sept. 2, 2024

Vertex AI Function Calling - LLMs are turning into reasoning engines using capabilities like web search and calling external APIs.

Generative AI Google Kubernetes Engine LLM Official Blog Vertex AI Aug. 26, 2024

Choosing between self-hosted GKE and managed Vertex AI to host AI models - A comparison of managed Vertex AI solutions with self-hosted options on Google Kubernetes Engine for deploying Large Language Model (LLM) and Gen AI applications on Google Cloud Platform.

Google Kubernetes Engine GPU LLM Official Blog Aug. 26, 2024

Maximize your LLM serving throughput for GPUs on GKE — a practical guide - This blog post contains recommendations that can help you maximize your serving throughput on NVIDIA GPUs on GKE. Combining these recommendations with the performance benchmarking tool will enable you to make data-driven decisions when setting up your inference stack on GKE.

LLM Official Blog Partners Vertex AI Aug. 26, 2024

Announcing the Jamba 1.5 Model Family from AI21 Labs on Vertex AI - The Jamba 1.5 Model Family from AI21 Labs is now available on Vertex AI Model Garden. They excel in handling key enterprise use cases such as summarizing and analyzing lengthy documents, powering RAG-based solutions, and a wide range of applications that demand both high-quality output and efficiency.

Gemini LLM Aug. 26, 2024

Semantic Kernel and Gemini - Semantic Kernel is an open-source development kit from Microsoft that lets you easily build AI agents and integrate the latest AI models into your applications. This blog post demonstrates how to use Semantic Kernel to build a chat application with Gemini.

LLM Official Blog Security Aug. 26, 2024

Testing your LLMs differently: Security updates from our latest Cyber Snapshot Report - Security teams should update how they assess LLMs and adapt their existing security methodologies accordingly. LLMs' ability to accept non-structured prompts can expose security weaknesses and lead to exploitation, such as sensitive information disclosure. Incorporating probabilistic testing can help provide better evaluation and protection against prompt injection, excessive agency, and overreliance.

Generative AI LLM Vertex AI Aug. 19, 2024

DeepEval and Vertex AI - DeepEval is an open-source evaluation framework for Large Language Models (LLMs) that allows "unit testing" LLM outputs. It can be configured to work with Vertex AI, Google's machine learning platform.

Gemini LLM Vertex AI Aug. 5, 2024

Beyond temperature: Tuning LLM output with top-k and top-p - What are top-k and top-p? What do they mean, how do they work, and how can they be tuned?
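
As a rough illustration of the two parameters, the sketch below filters a toy next-token distribution: top-k keeps the k most probable tokens, then top-p keeps the smallest prefix of those whose cumulative probability reaches p. The function name and the order of the two filters are assumptions made for this example; real decoders operate on logits and may differ in detail.

```python
def top_k_top_p_filter(probs, top_k, top_p):
    """Filter a {token: probability} dict with top-k, then top-p."""
    # Top-k: keep only the k most probable tokens.
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:top_k]

    # Top-p (nucleus): keep tokens until their cumulative mass reaches top_p.
    kept = []
    cumulative = 0.0
    for token, p in ranked:
        kept.append((token, p))
        cumulative += p
        if cumulative >= top_p:
            break

    # Renormalize so the surviving probabilities sum to 1 before sampling.
    total = sum(p for _, p in kept)
    return {token: p / total for token, p in kept}
```

Lower values of either parameter shrink the candidate set and make output less varied; higher values admit more unlikely tokens.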

Generative AI LLM Official Blog Aug. 5, 2024

Enhancing LLM quality and interpretability with the Vertex AI Gen AI Evaluation Service - The Vertex AI Gen AI Evaluation Service helps developers improve the quality and interpretability of large language models (LLMs) by generating diverse responses, automating the selection of the best response, and providing quality metrics and explanations. This workflow can be applied to any modality or use case, including text, images, and audio, and can be parallelized to minimize latency.

Cloud SQL Generative AI LLM Aug. 5, 2024

Retrieval Augmented Generation (RAG) with Cloud SQL for MySQL - How to build a Generative AI application with your Cloud SQL for MySQL database.

Gemini Generative AI Java LLM Aug. 5, 2024

Sentiment analysis with few-shot prompting - This article explores sentiment analysis using few-shot prompting with Gemini and LangChain4j. It demonstrates three approaches: using a big string of inputs/outputs, a low-level list of messages, and an AiServices abstraction.

Generative AI LLM Machine Learning July 29, 2024

Running Google’s Gemma2 LLM locally with LangchainJS & Ollama - This article explores running Google’s powerful Gemma2 LLM locally using JavaScript, LangchainJS & Ollama.

AI LLM July 29, 2024

Run Google’s Gemma 2 model on a single GPU with Ollama: A Step-by-Step Tutorial - Have you ever wished you could run powerful Large Language Models like those from Google on a single GPU?

AI Generative AI LLM Official Blog Vertex AI July 29, 2024

Meta’s Llama 3.1 is now available on Google Cloud - Meta's Llama 3.1, including a 405B model, is now available on Vertex AI Model Garden. Access the 405B model via Model-as-a-Service in preview or use the 8B and 70B models for self-service fine-tuning.

LLM Official Blog TPU Vertex AI July 29, 2024

Hex-LLM: High-efficiency large language model serving on TPUs in Vertex AI Model Garden - Hex-LLM, a high-efficiency large language model (LLM) serving framework designed for Google's Cloud TPU hardware, is now available in Vertex AI Model Garden. Hex-LLM combines state-of-the-art LLM serving technologies with in-house optimizations tailored for XLA/TPU, delivering competitive performance with high throughput and low latency.

AI Generative AI LLM Vertex AI July 29, 2024

The Chronicles of Llama: The new Llama 3.1 405b on Vertex AI! - This notebook shows how to get started with the new Llama 3.1 405b on Vertex AI.

Generative AI LLM July 29, 2024

Portable Training Data Generation for Supervised Fine-tuning: A Reverse RAG approach! - This blog introduces an automated approach to fine-tune large language models (LLMs) using a "Reverse RAG" method. The key idea is to generate question-and-answer pairs from raw information using an arbiter model, and then use these pairs as training data for fine-tuning. This approach can significantly streamline the fine-tuning process and improve the performance of LLMs on specific tasks.

Gemini LLM Machine Learning July 22, 2024

Is a Zero Temperature Deterministic? - Learn more about a crucial LLM model parameter, and how to configure it on Gemini Pro with Vertex AI.

LLM Vertex AI July 22, 2024

Control LLM output with response type and schema - Vertex AI introduces two new features, response_mime_type and response_schema, to control the output format of large language models (LLMs).

Billing Generative AI LLM July 22, 2024

Control LLM costs with context caching - Context caching is a cost-saving technique for large language models (LLMs) with extensive context windows. The cached content can be reused in subsequent prompts, and cached input tokens are billed at a reduced rate.

LLM July 15, 2024

Caching Out with Gemini: Making AI Chat Less Taxing (on Your Wallet) - Context caching, a feature of Google's Gemini API, optimizes AI chat interactions by storing frequently used data and reducing repetitive requests. It saves computational costs and enhances efficiency, particularly for chatbots with extensive backstories, video analysis, large document processing, and code analysis. By caching content like PDFs and videos, users can ask questions based on the cached data, leading to more streamlined and cost-effective AI conversations.

Generative AI Google Kubernetes Engine Kubernetes LLM Machine Learning July 15, 2024

Distributed OpenSource LLM Fine-Tuning with LLaMA-Factory on GKE - This blog post explores distributed fine-tuning for LLMs using the open-source tool LLaMA-Factory on Google Kubernetes Engine. LLaMA-Factory lets researchers and developers take pre-trained LLaMA models and efficiently fine-tune them on their own datasets.

AI Generative AI LLM July 8, 2024

Quizaic — A Generative AI Case Study - A continuation of the series on Quizaic, an application that uses generative AI to create and play high-quality trivia quizzes. This article explores how best to assess the accuracy of the AI-generated quizzes.

Generative AI LLM Official Blog Vertex AI July 8, 2024

How to evaluate the impact of LLMs on business outcomes - The Vertex Gen AI Evaluation Service provides a toolkit with quality-controlled and explainable methods to evaluate the impact of large language models (LLMs) on business outcomes. It offers online and offline evaluations, auto-logging in Vertex AI Experiments, and pre-built pipeline components for production monitoring.

BigQuery Data Analytics Generative AI LLM Official Blog July 8, 2024

Prompting best practices for BigQuery data canvas - Tips for improving natural-language-to-SQL and chart queries in BigQuery data canvas.

Gemini LLM Python July 1, 2024

How to prompt Gemini asynchronously using Python on Google Cloud - How to send all your prompts at the same time and collect the answers, rather than sending them one by one, using Python.

LLM July 1, 2024

Building a Custom Classification API on Google Cloud: A Technical Deep Dive - Unlock the potential of LLMs with a custom API that streamlines content classification for many real-world applications.

Generative AI LLM June 24, 2024

Quizaic — A Generative AI Case Study - Part 3: Prompting and Image Generation.

Generative AI LLM Official Blog Vertex AI June 24, 2024

Announcing Anthropic’s Claude 3.5 Sonnet on Vertex AI, providing more choice for enterprises - Anthropic's newly released model, Claude 3.5 Sonnet, is now generally available on Google Cloud's Vertex AI platform. With advanced capabilities in reasoning, knowledge, math, and coding, it can power various applications, including coding assistance, customer support, data analysis, and visual processing. Enterprises can leverage Vertex AI's enterprise-grade infrastructure, tooling, and security to build and deploy production-grade generative AI applications.

AI Generative AI LLM Networking Official Blog June 24, 2024

Exploring Google Cloud networking enhancements for generative AI applications - Google Cloud offers new networking capabilities to optimize traffic for generative AI applications. These capabilities include Cross-Cloud Network for accelerated AI training and inference, Model as a Service Endpoint for secure and reliable access to AI models, custom AI-aware load balancing for minimized inference latency, optimized traffic distribution for AI inference applications, and Service Extensions for enhanced gen AI serving.

LLM Machine Learning June 17, 2024

Implementing Semantic Caching: A Step-by-Step Guide to Faster, Cost-Effective GenAI Workflows - This article is a focused, in-depth exploration of semantic caching, its intricate implementation process, its relationship to LLMs, and its strategic positioning within the broader AI landscape.

BigQuery Data Analytics LLM Official Blog June 17, 2024

Exploring synthetic data generation with BigQuery DataFrames and LLMs - BigQuery DataFrames enables the generation of synthetic data directly within BigQuery, eliminating the need for third-party solutions or data movement. It integrates seamlessly with Vertex AI, allowing users to leverage advanced language models like Gemini to generate code that produces synthetic data based on specified schemas or existing table structures. This approach addresses data privacy concerns and accelerates AI development by providing a scalable and cost-efficient platform for synthetic data generation.

BigQuery dbt Generative AI LLM Terraform Vertex AI June 10, 2024

Productionise genAI directly in dbt - Using Vertex AI in dbt.

AI Flutter LLM Machine Learning June 10, 2024

Quizaic — A Generative AI Case Study - Quizaic is a demo application that uses generative AI to create high-quality trivia quizzes and manage the interactive quiz-playing experience. The app is built using Google Cloud Platform, AI, Flutter, Machine Learning, and LLM.

BigQuery LLM Official Blog June 10, 2024

Getting started with retrieval augmented generation on BigQuery with LangChain - The blog demonstrates how to build a simple RAG pipeline using BigQuery and LangChain, and highlights the benefits of using BigQuery Vector Search, which is optimized for large-scale analytical workloads and offers features like scalability, serverless operation, and fine-grained access control.

LLM Official Blog Vertex AI June 3, 2024

Vertex AI's Grounding with Google Search: how to use it and why - Vertex AI's Grounding with Google Search helps improve the factuality and freshness of large language model (LLM) responses by grounding them in trusted Google Search world knowledge and public facts. It addresses the limitations of LLMs, such as hallucinations, staleness, lack of citation, and limited relation to private data. With Grounding, LLMs can provide more reliable and trustworthy responses, especially for questions that require recent or factual information.

BigQuery Gemini LLM May 27, 2024

Unlocking Multimodal AI with Google Gemini, Embeddings, Vertex Search, and RAG: A Practical Guide with BigQuery - Google's latest AI innovations, including Gemini, embeddings, Vertex Search, and Retrieval Augmented Generation (RAG), are revolutionizing how we interact with and extract insights from data. By leveraging these concepts with BigQuery, users can unlock powerful AI capabilities such as image tagging, vector search, and retrieval augmented generation. This enables enhanced image discovery, improved user experience, efficient scalability, and the generation of creative ideas and insights. The combination of these technologies opens up a world of possibilities for building recommendation systems, question-answering bots, and interactive multimodal experiences.

BigQuery Data Analytics LLM Official Blog May 27, 2024

Unlocking enhanced LLM capabilities with RAG in BigQuery - Now you can build smarter AI applications from right inside your data warehouse.

AI LLM Official Blog May 20, 2024

To tune or not to tune? A guide to leveraging your data with LLMs

LLM Official Blog Translation API May 13, 2024

LLMs, AI Studio, higher quality, oh my! Our latest Translation AI advancements - Announcing a new generative model for Google Cloud’s Translation API.

BigQuery Gemini LLM Official Blog May 6, 2024

Simplifying data modeling and schema generation in BigQuery using multi-modal LLMs - Now you can pass multi-modal input to Gemini to create data models for your data warehouse.

AI LLM April 29, 2024

LLM Infini-attention with linear complexity - Introducing Google’s Infini-attention to increase LLM attention windows and reduce quadratic complexity.

Cloud Spanner Generative AI LLM April 29, 2024

LLM in your favorite Transactional Database: Spanner - Build a Patent Search App with Spanner, Vector Search & Gemini 1.0 Pro!

Gemini Generative AI LLM April 29, 2024

Gemini has entered the chat: building an LLM-powered Discord bot - Take your first steps into the world of Generative AI by building a Discord bot that uses Gemini to talk with other users.

LLM Official Blog Vertex AI April 22, 2024

Meta Llama 3 Available Today on Google Cloud Vertex AI - The Meta Llama 3 model is available on Vertex AI Model Garden.

BigQuery LLM Official Blog April 22, 2024

Introducing LLM fine-tuning and evaluation in BigQuery - Supervised fine-tuning via BigQuery uses a dataset which has examples of input text (the prompt) and the expected ideal output text (the label), and fine-tunes the model to mimic the behavior or task implied from these examples.

AI Google Kubernetes Engine LLM April 22, 2024

GKE Orchestration: Deploy your Gemma LLM - Deploying Gemma, a lightweight open model, on GKE.

Generative AI LLM April 8, 2024

Shh, It’s Free: But Let’s Not Tell Google! Exploring Gemini’s Multimodal Capabilities on Vertex AI - Consider this your backdoor pass into a free club, where the only membership requirement is your curiosity.

AI Gemini LLM Python April 1, 2024

Crafting Bespoke Output Formats with Gemini API - Proposes a method that uses question phrasing and API calls to craft bespoke output, enabling seamless integration with user applications.

Generative AI LLM Machine Learning April 1, 2024

Demystifying Generative AI for Enterprise Developers - Guide to kickstart your Enterprise GenAI journey.

Google Kubernetes Engine Kubernetes LLM April 1, 2024

GKE + Gemma + Ollama: The Power Trio for Flexible LLM Deployment - Deploying Gemma on GKE.

BigQuery Generative AI LLM March 25, 2024

In-Place LLM Insights: BigQuery & Gemini for Structured & Unstructured Data Analytics

BigQuery LLM March 11, 2024

Apply GenAI on Dataset in Data Mesh with HandOns experiment (GCP BigQuery)

Cloud Memorystore LLM Official Blog March 11, 2024

Memorystore for Redis vector search and LangChain integrations for gen AI - An example of how to combine Memorystore for Redis with LangChain to create a chatbot that answers questions about movies.

AI LLM Official Blog March 11, 2024

Domain-specific AI apps: A three-step design pattern for specializing LLMs - This article embarks on a journey through the key advantages of domain-specific LLMs.

LLM Machine Learning Vertex AI Feb. 26, 2024

Making AI more Open and Accessible to Cloud Developers with Gemma on Vertex AI - Gemma is a family of open, lightweight, and easy-to-use models developed by Google DeepMind.

LLM Python Feb. 26, 2024

Using and Finetuning Google’s State-of-the-Art Open Source Model Gemma-2B - This article describes how to use and fine-tune the Gemma model.

Google Kubernetes Engine Infrastructure Kubernetes LLM Feb. 19, 2024

Serving Open Source LLMs on GKE using vLLM framework - This post shows how to serve open-source LLMs (Mistral 7B, Llama 2, etc.) on NVIDIA GPUs (for example, L4 and Tesla T4) running on Google Kubernetes Engine (GKE).

AI LLM Machine Learning Official Blog Feb. 19, 2024

Your RAGs powered by Google Search technology, part 2 - A deeper look at the technologies that are essential for building a successful RAG system to ground large language models (LLMs) in your applications.

LLM Official Blog Feb. 19, 2024

Your RAGs powered by Google Search technology, part 1 - Exploring the key features that power Google-quality retrieval in LLM and RAG-based applications.

AI Data Science LLM Machine Learning Feb. 19, 2024

BigQuery Data Analyses With Gemini LLM - The Gemini-Pro LLM model is now available in BigQuery ML. Here’s how to use it.

Cloud Dataflow LLM Official Blog Feb. 11, 2024

Leveraging streaming analytics for actionable insights with gen AI and Dataflow - In this blog post, we showcase how to get real-time LLM insights in an easy and scalable way using Dataflow.

Cloud Workstations Generative AI LLM Official Blog Feb. 11, 2024

No GPU? No problem. localllm lets you develop gen AI apps on local CPUs - In this post, we introduce you to a novel solution that allows developers to harness the power of LLMs locally on CPU and memory, right within Cloud Workstations, Google Cloud’s fully managed development environment.

Generative AI LLM Official Blog Feb. 5, 2024

Build enterprise gen AI apps with Google Cloud databases - An overview of databases on GCP that can be used to store and query vector embeddings.

AlloyDB LLM Official Blog Translation API Jan. 22, 2024

How to create a multilingual chatbot that queries AlloyDB with Langchain, Streamlit, LLMs, and Google Translate

AI BigQuery LLM Machine Learning Official Blog Jan. 22, 2024

Integrating BigQuery data into your LangChain application - See how to integrate your BigQuery data into LLM solutions.

LLM Vertex AI Jan. 15, 2024

Large Language Models (LLMs) in Google Cloud with VertexAI - From concept to code: Everything you need to know to start building an application with GenAI’s LLMs.

LLM Vertex AI Dec. 18, 2023

Fine Tuning of LLM’S in GCP Vertex AI - This article delves into the fine-tuning of LLMs: why it is required, how to do it, and the results that can be achieved through it.

Generative AI LLM Machine Learning Python Dec. 18, 2023

Google Imagen (through GCP Vertex AI Studio) as fashion design assistant - In this article, we will explore how generative AI can assist fashion designers in generating new ideas and designs using Google’s suite of generative models for text and image generation.

BigQuery LLM Dec. 18, 2023

BigQuery Meets LLM: Unlocking New Frontiers in AI-Driven Data Analytics - Unlocking a level of understanding that was previously unimaginable.

API Colab LLM Dec. 11, 2023

Fine-tune and deploy an LLM on Google Colab Notebook with QLoRA and VertexAI - An example of fine-tuning the Mistral AI 7B model on your data using QLoRA and deploying it to a Vertex AI endpoint, in a Google Colab notebook.

Generative AI LLM Official Blog Dec. 3, 2023

Introducing sample GenAI Databases Retrieval App – augment your LLMs with Google Cloud databases

LLM Machine Learning Vertex AI Nov. 27, 2023

Vertex AI Model Garden - Vertex AI Model Garden is a collection of pre-built foundation models, task-specific models, and Google ML APIs.

LLM Official Blog Vertex AI Oct. 30, 2023

Serving open-source large language models efficiently on Vertex AI Model Garden - An updated LLM-efficient serving solution that improves serving throughput in Vertex AI.

Useful Links

Contact

Zdenko Hrček
Třebanická 183
Prague, Czech Republic
Phone: +420 777 283 075
Email: [email protected]
