Tag: Data Science

BigQuery Data Science April 8, 2024

Google enabled History based Optimization for Queries - How to save Time and Costs through automated History based Optimization.

BigQuery Data Science April 1, 2024

Google made Query Search Indexes generally available - Realize Query Optimization with Creating and Searching Indexes.

Cloud Dataproc Data Science April 1, 2024

Spark Performance Tuning for BigQuery APIs - Dealing with challenges when using Spark for NLP processing.

Cloud Build Cloud Run Data Science Machine Learning Python March 25, 2024

Deploy A Production-Ready Streamlit App with Cloud Run and Cloud Build - How to ship containerized applications on a serverless architecture through a CI/CD pipeline.

BigQuery Data Science March 4, 2024

Google made Cubes generally available for BigQuery - Combining classical Data Structures with Column based Data Warehouses.

BigQuery Data Science March 4, 2024

Google just launched Time Series and Range Functions for BigQuery - How to perform Time Series Analysis with GoogleSQL.

BigQuery Data Science Feb. 19, 2024

Adding Nested Columns with Confidence in BigQuery - A Step-by-Step Guide to Safely Expanding Your BigQuery Tables.

AI Data Science LLM Machine Learning Feb. 19, 2024

BigQuery Data Analyses With Gemini LLM - The Gemini-Pro LLM model is now available in BigQuery ML. Here’s how to use it.

BigQuery Data Science Feb. 11, 2024

Google launches Entity Resolution for BigQuery - An Introduction to Entity Resolution — How to share Data more easily.

Data Science DevOps Python Feb. 5, 2024

Host and monetize your Streamlit app cost-effectively - Hosting and monetizing a Streamlit app for many users can be quite tricky, but it is not as impossible as it looks.

BigQuery Data Science Public Datasets Jan. 29, 2024

How to Use the Google Trends Open Dataset on BigQuery - Example of accessing Google Trends from public datasets in BigQuery.

BigQuery Data Science Machine Learning Jan. 29, 2024

Mastering Feature Preprocessing in BigQuery ML: A Comprehensive Guide - BigQuery ML’s Impact on Data Analytics.

BigQuery Data Science Machine Learning Jan. 22, 2024

How to Low-Pass Filter in Google BigQuery - This article shows how to implement a low-pass filter in SQL / BigQuery that can come in handy when improving ML features.

BigQuery Data Science Jan. 8, 2024

Using the Substring Function in BigQuery - Working with Strings and Text Data in BigQuery SQL.

Big Data BigQuery Data Science Jan. 8, 2024

How Google BigQuery becomes an even more powerful Data Lakehouse - Recap 2023: What were the major Updates and what can we expect in 2024?

Data Science Machine Learning Vertex AI Jan. 1, 2024

Empowering data science exploration with Vertex AI Workbench on Google Cloud - In this article, we will explore two methods for creating Python notebooks in the newly-released Vertex AI Workbench instances service.

BigQuery Data Science Dec. 18, 2023

Be Careful When Using “NOT IN” in SQL - Plus 3 simple solutions to make sure you’re not caught out.

BigQuery Data Science Nov. 27, 2023

Using the TF_IDF Function in BigQuery - How to evaluate how relevant a Term is to a Tokenized Document.

BigQuery Data Science Machine Learning Nov. 13, 2023

Google launched Bag of Words for BigQuery & BigQuery ML - How you can now do Text Analysis easily.

BigQuery Data Science Machine Learning Oct. 30, 2023

How to Avoid Five Common Mistakes in Google BigQuery / SQL - While working with BigQuery for years, I observed 5 mistakes that are commonly made, even by experienced Data Scientists.

BigQuery Data Science Oct. 23, 2023

How Google attacks Apache Hive Data Warehouse - After Snowflake, Google is now also aiming at Apache Hive.

BigQuery Cloud Dataproc Data Science dbt Python Oct. 16, 2023

Choosing the right tool while building your Data Platform: DBT vs. Spark (By example)

BigQuery Data Science Oct. 16, 2023

Finally, Data Cube Aggregation Can Work Directly in Google BigQuery - Syntax Support for Grouping by Cubes Now Available in Google BigQuery since October 2023.

Data Science Machine Learning Python Vertex AI Oct. 9, 2023

Multimodal Prediction Model Using Google Vertex AI - Multimodal Embeddings are a powerful tool to use for ML applications, with many use cases.

Data Science Machine Learning Vertex AI Sept. 4, 2023

5 Useful Tips To Master Vertex AI Model Registry (With Code Examples) - Useful tips to work with Vertex AI Model registry.

Data Science Python Aug. 21, 2023

Python Logs Aren’t Code. They’re A Communication Tool. - Embrace logs as communication plus 3 non-negotiables you must include for a functional, transparent data pipeline.

Billing Data Science GCP Experience Python Aug. 14, 2023

Taking 5 Minutes To Make These Tweaks Reduced My GCP VM Costs From $110 to $30 A Month - Understand Google Cloud Platform pricing, VM configurations and virtual environments to save > $700 a year.

BigQuery Data Science Python Aug. 7, 2023

Introducing BQFlow ETL - BQFlow is a Python library that moves data between Google APIs and BigQuery with minimal overhead and configuration.

Data Science Machine Learning July 31, 2023

MLOps With Kubeflow Pipelines (Part 2) - Accelerating Machine Learning Operations with Kubeflow Pipelines.

BigQuery Data Science July 24, 2023

How to Fix Missing Dates for Time Series Analysis - Learn how to use TVFs in BigQuery to effortlessly generate date ranges for your time series analysis.

BigQuery Data Science July 24, 2023

A Guide to Using Window Functions - Create running totals, moving averages, and rankings with ease in BigQuery.

Data Science Kubeflow Machine Learning July 17, 2023

MLOps With Kubeflow Pipelines (Part 1) - Accelerating Machine Learning Operations with Kubeflow Pipelines.

Data Science Kubeflow Machine Learning July 10, 2023

Kubeflow Pipelines: Orchestrating Machine Learning Workflows With Ease - Everything you need to know about Kubeflow Pipelines for Machine Learning Pipelines.

BigQuery Data Science Machine Learning July 10, 2023

5 Useful Tips To Change Your BigQuery Experience - Sharing the game-changing tips I wish someone told me 5 years ago.

BigQuery Data Science Java Scala July 3, 2023

Google launches Java and Scala Procedures for BigQuery - Using stored procedures for Apache Spark with Java or Scala.

Data Science GCP Experience Migration July 3, 2023

Onboard large data science teams to GCP from on-prem cloud - Learn how to onboard large data science teams to Google Cloud (GCP) from an on-prem cloud.

AI Data Science GCP Experience July 3, 2023

Unveiling the first generation data architecture of a newspaper - This article describes how NZZ, Switzerland’s German-speaking newspaper of record, developed and improved its first data cloud architecture powering various data products. Use-case driven, iterative, and modular.

Data Science Vertex AI June 26, 2023

Google PaLM API: Generative Models for Code Generation - VertexAI API for GenAI.

Data Science Vertex AI June 19, 2023

Google Generative AI Transformations - Using Generative AI for simple ETL.

AI Data Science Terraform June 12, 2023

Deploying Google Cloud Dataproc with Terraform — what, why and how - A brief overview of Dataproc and how it can be deployed with Terraform.

BigQuery Data Science Machine Learning Python May 1, 2023

Analysing 260.000 Text Documents - An end-to-end NLP project to find trends and discover topics.

BigQuery Data Science April 24, 2023

Optimize Google Cloud BigQuery and Control Cost - I unknowingly blew $3,000 in 6 hours on one query in Google Cloud BigQuery. Here’s why and 3 easy steps to optimize the cost.

BigQuery Data Science GIS April 24, 2023

Unleashing the Power of Geospatial Data with DBSCAN Clustering in BigQuery - One of the most powerful tools for analyzing geospatial data is DBSCAN clustering, which can be used to identify patterns and relationships…

Big Data BigQuery Data Science Python April 17, 2023

Simplify Data Science Workflows on BigQuery with Fugue and Python - Speed Up Iteration and Cut Computation Cost.

Data Science Python April 10, 2023

Creating a Google Looker Studio Dashboard — Populating it with data - A step-by-step guide on building an automated process to update a Looker Studio dashboard.

BigQuery Data Science March 27, 2023

BigQuery Schema Design 101 — And What To Watch Out For - Understand these BigQuery SQL nuances to create table schemas that result in fewer errors and fewer headaches.

Data Science DevOps Terraform March 6, 2023

Deploying Airbyte with Terraform on GCP - Create a reproducible codebase to deploy Airbyte in any GCP project with a single command.

BigQuery Data Science Dataform Feb. 27, 2023

Time series anomaly detection with BigQuery ML and Dataform - For the time series data, let’s see how you can check for anomalies without actually looking into the data.

Cloud Bigtable Data Science Feb. 20, 2023

A Crash Course in Google Bigtable - An overview of Bigtable.

Apache Beam Data Science Feb. 13, 2023

The top 15 methods to know in Apache Beam to transform your data - Learning to transform your data in a pipeline.

BigQuery Billing Data Science Feb. 6, 2023

FinOps: Four Ways to Reduce Your BigQuery Storage Cost - Don’t overlook the cloud storage cost.

Data Science Machine Learning Feb. 6, 2023

How we deployed a simple wildlife monitoring system on Google Cloud - Our journey in designing and building an ML system for Smart Parks.

BigQuery Data Science Jan. 30, 2023

4 Useful BigQuery SQL Functions You May Not Know - Some not so well known BigQuery functions.

BigQuery Data Science Jan. 16, 2023

Load Data into BigQuery Using Python-File Format Benchmark - In this post, you will be introduced to several methods showing how to load data with the most popular formats into BigQuery and analyze performance.

BigQuery Data Analytics Data Science Jan. 9, 2023

Reproducible Random Split in BigQuery SQL (For Beginners) - A complete tutorial to randomly split a dataset into multiple groups.

BigQuery Data Science Jan. 2, 2023

Random Sampling Tips for Google BigQuery - Examples of how to get data samples in BigQuery.

Big Data BigQuery Data Science Dec. 26, 2022

How I use BigQuery Analytic Functions as a Data Scientist - Practical examples on how to use advanced SQL to do analyses in BigQuery.

BigQuery Data Science GIS Dec. 26, 2022

Loading Geographic Multiband Raster Data in BigQuery - Goal: Load Raster Data in BigQuery using Dataflow with GeoBeam or GDAL core libraries.

BigQuery Data Science Nov. 27, 2022

BigQuery SQL Procedural Language to Simplify Data Engineering - Basic SQL Procedural Language statements in BigQuery.

BigQuery Data Science dbt GCP Experience Nov. 14, 2022

BQ+DBT: 5 proven practices to scale your analytics infrastructure effectively without exploding your cloud costs - Sharing learnings and distilled techniques used to manage analytics infrastructure.

BigQuery Data Science Python Oct. 10, 2022

3 BigQuery SQL Tricks to Undo Your Screw Ups - Messing up while writing and running SQL is inevitable; your recovery determines whether this is a hiccup or an apocalypse.

BigQuery Billing Data Science Oct. 3, 2022

7 Cost Optimization Practices for BigQuery - Things you can do to keep the cost of BigQuery lower.

BigQuery Data Science Data Studio Aug. 8, 2022

Looking for Power User Journeys in E-commerce - Using BigQuery to understand the user journey on an e-commerce website.

Data Science GIS Aug. 8, 2022

Importing GIS Data into BigQuery - Have you been wondering how to import data that is geolocated into BigQuery? Well, wonder no more.

BigQuery Data Science Machine Learning Aug. 1, 2022

Google rolls out BigLake and integrates Analytics Hub and BigQueryML - How Google makes its Data Platform more powerful with 3 awesome Updates.

BigQuery Data Science Aug. 1, 2022

6 BigQuery SQL Functions Every User Should Know - Check if your database has them too.

BigQuery Data Science July 25, 2022

How to use variables in BigQuery using SQL — Part 1 - A step towards flexibility and reusability using parameters and variables.

BigQuery Data Science July 25, 2022

Pivot and Unpivot Functions in BigQuery For Better Data Manipulation - A detailed tutorial.

Data Science Machine Learning Vertex AI July 25, 2022

Machine Learning Batch Prediction Architecture Using Vertex AI - Batch prediction architecture implemented with Vertex AI.

Data Science July 25, 2022

Data Contracts — The Mesh glue - A practical definition and implementation guidelines.

Big Data BigQuery Data Science July 11, 2022

Awesome new Feature: Change History in Google BigQuery - Using The Append Change history TVF in BigQuery.

BigQuery Data Science Looker July 4, 2022

Tell Me, BigQuery: What is Trending on Google? - Exploring and enriching the international BigQuery Google Trends dataset with Looker.

BigQuery Data Science Machine Learning July 4, 2022

How to Split and Sample a Dataset in BigQuery Using SQL - Easily segment your data into training, validation, and test sets.

Data Science Jupyter Notebook July 4, 2022

Trick: Almost-Free Jupyter Notebooks on the Cloud! - A cheaper alternative to Vertex AI Workbench managed notebooks.

BigQuery Data Science June 27, 2022

BigQuery now supporting Query Queues - Using Query Queues for Concurrency in Google BigQuery.

BigQuery Data Science June 27, 2022

Median, Mode, and Average Order Value in BigQuery using SQL - Learn about your customers’ ordering habits and choose the best strategy for increasing the value of your orders.

BigQuery Data Catalog Data Science Security June 27, 2022

Dynamic Data Masking on BigQuery - This article describes how to use dynamic data masking in BigQuery.

Data Science Machine Learning PyTorch Vertex AI June 27, 2022

Training a PyTorch Model on GCP Vertex AI - Training models with managed notebooks or custom training jobs.

BigQuery Cloud KMS Data Science June 20, 2022

Google improves Data Security in its Data Warehouse BigQuery - Using column level SQL encryption with Cloud KMS keys.

BigQuery Data Science June 13, 2022

Improved Storage Read API Quotas in Google BigQuery - How Google empowers its Data Warehouse even more.

BigQuery Data Science June 6, 2022

How to Use Partitions and Clusters in BigQuery Using SQL - Optimize your costs and speed up your queries.

Big Data BigQuery Data Science June 6, 2022

A Senior’s Guide to Kickstart your BigQuery Journey - Missing basics you need to know when using BigQuery.

BigQuery Data Catalog Data Science June 6, 2022

Google improves Data Security in BigQuery - Using Column based Data Masking in BigQuery and Data Catalog.

BigQuery Data Science May 30, 2022

Analyze and plot 5.5M records in 20s with BigQuery and Ploomber - Develop scalable pipelines on Google Cloud using open-source software.

BigQuery Data Science Machine Learning May 23, 2022

Predict Transactions On Your Website Using BigQuery ML - Train a model on Google Analytics data.

BigQuery Data Science Looker May 16, 2022

Looker and BigQuery ML: create control charts for your KPIs - Or how to monitor your actual vs target KPI values on a highly-dimensional dataset.

Data Analytics Data Science May 16, 2022

Accelerating Cloud Migration with Data Mesh Solution Patterns - The article introduces the notion of a data gravity well to describe a consistent pattern: applications and use-cases will be developed closer to where the data they need can be accessed.

BigQuery Data Science May 9, 2022

4 Ways BigQuery Metadata Can Help You - Get data about tables, jobs, and more.

BigQuery Data Science May 9, 2022

14 Best Practices to Tune BigQuery SQL Performance - With big data, querying is no longer just about writing the “correct” syntax, it needs to be cost-effective and fast, too. Here is how…

BigQuery Data Science May 9, 2022

Using Collation in Google BigQuery - How to Compare and Sort Strings easily with SQL.

BigQuery Data Science May 2, 2022

More Options to Restore your Data in Google BigQuery - How to use the Time Travel Function in BigQuery.

Big Data BigQuery Data Analytics Data Science April 18, 2022

Google Data Cloud Summit 2022: Recap - An overview of the many new updates coming to Google Cloud Platform!

BigQuery Data Science Firebase April 11, 2022

Know More About Your App Users Through BigQuery - A more customized approach to event analytics beyond Firebase and Google Analytics.

Data Science GCP Experience Machine Learning April 11, 2022

Data Apps: From Local to Live in 10 Minutes - This post explains how the Talabat Machine Learning Ops team built this simple yet elegant pipeline that brings their Machine Learning models and analyses live in a few minutes with the least possible effort required by Data Scientists.

Data Science GCP Experience Machine Learning April 11, 2022

Enabling data science on Google Cloud Platform at Adevinta - Empowering data scientists to develop an end-to-end machine learning platform on Google Cloud Platform.

BigQuery Data Science Machine Learning April 4, 2022

Rapid Batch Inference in Google Cloud - How we tweaked Google’s new SQL-based ML Framework to scale our inference stack and accelerate our product roadmap.

BigQuery Data Science March 28, 2022

BigQuery UDFs Complete Guide - Everything you need to know about Google Cloud BigQuery’s User-Defined Functions.

Data Science March 21, 2022

10 Resources to Learn Data Science on Google Cloud - Top resources to learn one of the most in demand skills for data scientists.

Data Science March 21, 2022

Governing your data in Google Cloud - Key data governance solutions you can find in Google Cloud.

Data Science Official Blog Feb. 7, 2022

Intro to data science on Google Cloud - Overview of the data science workflow on Google Cloud, from data engineering, data analysis, model development, ML engineering and orchestration.

BigQuery Data Science Machine Learning Jan. 31, 2022

Using Explainable AI in BigQuery ML - Google BigQuery now supports Explainable Artificial Intelligence for your Models.

BigQuery Data Science Kaggle Machine Learning Jan. 24, 2022

End-to-End BigQuery Machine Learning - Use Google Cloud BigQuery to compete in a Kaggle competition.

BigQuery Data Science Jan. 24, 2022

Levenshtein distance as a remedy for sequential data - Calculating Levenshtein distance in BigQuery.

Data Science Machine Learning Vertex AI Jan. 24, 2022

Tokenizing sensitive data to train models using VertexAI

BigQuery Data Science Data Studio Jan. 17, 2022

A Simple Way to Segment Customers Using Google BigQuery and Data Studio - A guide to RFM Segmentation and visualization of the resulting segments.

BigQuery Data Science Jan. 3, 2022

How to access Historical Data using Time Travel in BigQuery - Restore and Analyze deleted or changed data.

BigQuery Data Science Dec. 6, 2021

Incremental Data Ingestion in BigQuery using MERGE - Example of incremental pipeline in BigQuery.

BigQuery Data Science Machine Learning Nov. 15, 2021

Creating a Machine Learning Model with SQL - Build an ML model using SQL on Google BigQuery.

BigQuery Data Science GIS Nov. 8, 2021

Spatial Binning with Google BigQuery - Binning geographical coordinates into square tiles with Google BigQuery.

BigQuery Data Science Public Datasets Visualization Nov. 1, 2021

Bike Share Chicago, Case study - The purpose of the exercise is to analyze the usage of the bike sharing data in Chicago and to increase annual memberships.

BigQuery Data Science Machine Learning Python Oct. 11, 2021

BigQuery fetching + multiprocessing - Does multiprocessing improve the fetching speed of BigQuery API requests?

Big Data BigQuery Data Science Oct. 4, 2021

Mathematical Functions you should know in BigQuery - How to Work with Numbers in BigQuery.

BigQuery Data Science Data Studio Oct. 4, 2021

Campaign Comparison Dashboard - Comparing different campaigns in the same dashboard using BigQuery and Data Studio.

Cloud Dataprep Data Science Sept. 27, 2021

Dataprep is all you need for a data preparation job on GCP - An example of using Dataprep for data cleaning.

BigQuery Data Science Sept. 20, 2021

How to Deal with NULL Values in Standard SQL - This article explains how to handle null values in BigQuery.

BigQuery Data Science Machine Learning Python Sept. 20, 2021

The fastest way to fetch BigQuery tables - A benchmark of the fastest methods used to fetch tables from BigQuery. Also introducing bqfetch: an easy-to-use tool for fast fetching.

BigQuery Data Analytics Data Science Aug. 30, 2021

Best Practices when working with Google’s BigQuery - How to optimize Usage and Costs.

BigQuery Data Analytics Data Science Aug. 30, 2021

Working with Strings in BigQuery - What you have to know when working with String Functions.

AI Platform Cloud Resource Manager Data Science Jupyter Notebook Aug. 30, 2021

Managing Scripts on AI Platform with GCP Cloud Source Repository - A tutorial to share the steps to manage and share scripts via GCP Cloud Source Repository.

BigQuery Data Science Jupyter Notebook Python Aug. 23, 2021

How Data Scientists Can Increase Their Productivity With the Aid of Data Engineers Solutions Using BigQuery, Google Colab and Python - This article presents a set of Python solutions used by Data Engineers that will increase the productivity of Data Scientists who need to use Google BigQuery in daily operations and just want it to work.

Data Science Machine Learning Vertex AI Aug. 9, 2021

What does Vertex AI do? - This post covers the common tasks in a typical machine learning workflow, and how Vertex AI brings together all the tools you need to achieve them under one unified user interface.

BigQuery Data Science Machine Learning Aug. 2, 2021

Anomalies detection using River - From a proof of concept to predicting millions of transactions.

Compute Engine Data Science Machine Learning Python Aug. 2, 2021

Remote development with PyCharm and Google Cloud - A Data Scientist's guide to setting up remote development with PyCharm and GCP.

Cloud Dataproc Data Science Aug. 2, 2021

Creating a Dataproc cluster: considerations, gotchas & resources - This article discusses focus areas users should consider in their efforts to successfully create a reliable, reproducible, and consistent cluster.

Beginner BigQuery Data Science July 12, 2021

Working with Times and Dates in BigQuery - Common operations with dates in BigQuery.

BigQuery Data Science Python July 5, 2021

3 ways to query BigQuery in Python - SQLAlchemy, Python Client for Google BigQuery, and bq command-line tool.

BigQuery Data Science Python July 5, 2021

Build Robust Google BigQuery Pipelines with Python: Part II - BigQuery STRUCT in Python.

BigQuery Data Science SAP July 5, 2021

SAP Data Analytics in the Google Cloud - How to combine SAP with the Google Cloud Platform.

Big Data BigQuery Data Science Machine Learning June 28, 2021

Machine Learning with Google’s BigQuery - How to easily create and deploy ML Models with SQL.

BigQuery Data Science GIS June 22, 2021

A Primer on JavaScript UDFs for Spatial Analysis in BigQuery - Succinct guide to writing JavaScript User-Defined Functions for Geospatial Operations in BigQuery.

Airflow Cloud Dataproc Data Science June 14, 2021

Apache Airflow + GCP Dataproc via DataProcSparkOperator - Doing integration with Cloud Dataproc and exploring DataProcSparkOperator running Airflow.

Cloud Natural Language API Data Science Python June 14, 2021

How to categorise text in a Pandas dataframe using Google’s Natural Language API - Using the Cloud Natural Language API for text categorization.

Big Data BigQuery Data Science Public Datasets June 7, 2021

Working with OpenStreetMap Data - Analyzing OpenStreetMap data in BigQuery public dataset.

BigQuery Data Science GIS Python May 31, 2021

Transforming GeoJSON’s Geometric Features into BigQuery’s Polygon Format with Simple Python Script - Bridging the geometric data available in GeoJson.io into analytics use cases.

BigQuery Data Science May 24, 2021

Analyzing Disneyland Paris visitors reviews for parks and hotels — Part 1 - Exploratory Data Analysis and Topic Modeling on more than 120k TripAdvisor reviews using BigQuery.

Data Science Machine Learning Vertex AI May 24, 2021

Serverless Machine Learning Pipelines with Vertex AI: An Introduction - Vertex AI in the context of MLOps.

Cloud Functions Cloud Run Data Science Jupyter Notebook Serverless May 17, 2021

Executing Jupyter Notebooks on serverless GCP products - Example of deploying and executing Jupyter notebook on serverless Google Cloud products.

Data Science GCP Certification May 10, 2021

How I Passed the GCP Professional ML Engineer Certification - A study plan to pass the ML Engineer certification exam.

BigQuery Data Science Machine Learning Tutorial April 19, 2021

How to Train a Model to Predict Next-Day Rain with Google BigQuery ML - Demonstrating BigQuery ML with predicting rain based on 10-year weather observations.

Data Science GCP Certification Machine Learning April 19, 2021

Google Cloud Professional Machine Learning Engineer Study Guide - Resources to help with preparation for Machine Learning certification.

Data Science Docker R Tutorial April 19, 2021

Dockerizing and deploying a Shiny dashboard on Google Cloud - A step-by-step guide to bringing Shiny to the cloud.

BigQuery Data Science Looker April 19, 2021

Data Patterns in a Multi-cloud Future - Different data patterns that you can adopt for your data science, data engineering, and BI tasks with data spread across multiple clouds.

BigQuery Data Science Machine Learning April 12, 2021

Super-fast Machine Learning to Production with BigQuery ML - How to use BigQuery ML to deploy your models in no time, and focus on what really matters.

Data Analytics Data Science Event April 5, 2021

Data Cloud Summit - May 26, 2021 - Join us to learn how leading companies are powering innovation with our data solutions. Attend sessions, demos, and live Q&As to discover how data can help you make smarter business decisions and solve your organization’s most complex challenges.

AI Platform Data Science Machine Learning March 29, 2021

How to train and deploy a Vaex model pipeline on Google Cloud Platform - No-pipeline deployments with Vaex.

BigQuery Data Science Machine Learning March 15, 2021

Building a K-means Clustering Model for Population A/B Testing with BigQuery - How you can use Google’s data warehouse to build homogeneous groups of individuals.

BigQuery Data Analytics Data Science March 8, 2021

Google BigQuery Date & Time Quick Reference Guide - Common date and time operations in BigQuery.

Big Data BigQuery Data Science March 1, 2021

BigQuery Hack: Flexible Queries For Any Number of Columns - How can we use BigQuery to handle tables with many columns? Here’s how using scripting and table metadata.

BigQuery Data Science March 1, 2021

From cURL to Automated Workflow - Creating a pipeline to load financial data into BigQuery and do Analytics.

BigQuery Data Science Feb. 22, 2021

Use a BigQuery Stored Procedure to Extract Table DDL - A SQL script to obtain DDL statements for BigQuery tables.

BigQuery Data Science Machine Learning Feb. 15, 2021

How to Build Machine Learning Model using BigQuery - Machine Learning and Predictive Modelling using SQL.

BigQuery Data Science Feb. 15, 2021

Why You Should Use superQuery for SQL - Benefits of using the Google BigQuery IDE.

AI Platform Data Science Machine Learning Feb. 15, 2021

Deploy a Machine Learning model on Google AI Platform - How to easily create a cloud service to query your trained ML model.

BigQuery Data Science Machine Learning Feb. 8, 2021

BigQuery Anomaly Detection using Kmeans Clustering from BQ ML - Find rogue transactions the smart way using BigQuery ML.

BigQuery Data Science Feb. 8, 2021

Using BigQuery Arrays to Analyze my Netflix Viewing History - How to do Advanced Data Cleaning in SQL.

Data Science Machine Learning Feb. 8, 2021

How to Translate and Dub Videos with Machine Learning - Use speech-to-text, translation, and text-to-speech to automatically translate and dub videos.

Beginner BigQuery Data Science Public Datasets Jan. 25, 2021

How to Work with Nested Data in BigQuery - Example of querying nested columns in BigQuery on Open Street Map dataset.

API BigQuery Data Science Jan. 18, 2021

Read/Write From Any Google API To/From BigQuery In 1 Minute Using BQ Flow - Use BQ Flow to transfer data between any Google API (Campaign Manager, Adwords API, Display Video) and.

Big Data BigQuery Data Science Jan. 18, 2021

BigQuery Hack: 1000x More Efficient Aggregation Using Materialized View - Learn how to supercharge your aggregation queries using Materialized View.

BigQuery Cloud AutoML Data Science Machine Learning Jan. 18, 2021

Comparing Custom Model Development With GCP BQML and AutoML Tables - Comparing Custom Model Development on Python Jupyter notebook with Google Cloud Platform BigQuery Machine Learning and AutoML Tables (beta).

AI Platform Notebooks Big Data Data Science GPU Jan. 18, 2021

An Accelerated Big Data Workflow for the Data Analyst - Explore and analyze 1B loan records with RAPIDS & Nvidia A100 GPUs on Cloud AI Platform.

BigQuery Data Science Python Jan. 4, 2021

A gentle introduction to the 5 Google Cloud BigQuery APIs - An overview of BigQuery APIs / client libraries.

BigQuery Data Science Machine Learning Jan. 4, 2021

K-Means Clustering in Google BigQuery ML - A complete guide on the most popular and practical clustering technique natively in Google BigQuery (database+ML).

Cloud Dataproc Data Science Machine Learning Jan. 4, 2021

All you need to know about Google Cloud Dataproc? - Managed Hadoop & Spark #GCPSketchnote.

AI Platform Notebooks Beginner Data Science Jupyter Notebook Dec. 28, 2020

How To Use Google AI Platform Notebooks For Your Data Science Team - Getting started with Google AI Platform Notebooks.

Apache Beam BigQuery Cloud Dataflow Data Science Dataflow Jupyter Notebook Machine Learning Python Dec. 21, 2020

Getting started with Machine Learning on GCP — Part 2: Making data clean and usable - Creating Beam/Dataflow pipeline in Jupyter Notebook.

Data Science Machine Learning Python TensorFlow Dec. 21, 2020

A machine learning pipeline with TensorFlow Estimators and Google Cloud Platform - TensorFlow on GCP — a way to industrialise complex machine learning pipelines.

BigQuery Data Science Data Studio Dec. 21, 2020

Create a real time Dashboard on covid-19 in France with GCP - Using public API to create Covid dashboard in Data Studio.

BigQuery Data Science Dec. 14, 2020

Time series analytics with BigQuery part 2 - The second in a series of posts on implementing time series analytics in BigQuery, this time defining sliding windows and session windows.

BigQuery Data Science Dec. 14, 2020

5 BigQuery SQL performance tips for modern data scientists - SQL tuning tips and advice to help reduce BigQuery costs. Start 2021 off on the right foot!

AI Platform Data Science Machine Learning Nov. 30, 2020

Google Cloud AI Platform: Human Data labeling-as-a-Service Part 2 - Exploring Google’s (human) Data Labelling Service for Advanced Video Labelling.

AI Platform Cloud AutoML Data Science Kaggle Nov. 30, 2020

Kaggle: Man vs Machine - Using AI Platform to identify healthy plants in Kaggle competition.

Billing Data Science Nov. 30, 2020

Isolating trends in public cloud costs using time-series analysis - AWS or Google Cloud costs can be often somewhat confusing and it’s hard to “cut through the noise” to see what really matters.

AI AI Platform Data Science Nov. 22, 2020

Google Cloud AI Platform Unified - Launched on 16 Nov 2020, AI Platform Unified caught us by surprise. Learn exactly what’s been “unified”.

Data Science GCP Certification Nov. 22, 2020

How to pass Google Cloud Platform — Professional Data engineer exams - Preparing for Professional Data Engineer exam.

Cloud Healthcare API Data Science Machine Learning Nov. 22, 2020

Google Cloud Healthcare API - Learn how this can accelerate AI solutions to benefit modern medicine.

BigQuery Data Science Nov. 16, 2020

Time series analytics with BigQuery - Techniques for tumbles, fills, and interpolation.

App Engine Cloud Run Data Science Firebase Python Nov. 16, 2020

Deploying a Python Dash app on App Engine with a Flask/Cloud Run backend and Firebase auth - Learn how to deploy a beautiful dashboard using Python and Dash on GCP. Then add user authentication with Firebase.

Data Science Machine Learning TPU Nov. 16, 2020

Running BERT on Google Cloud Platform With TPU - Use Google Cloud and TPU to Build a Deep Learning Environment.

AI Data Science Machine Learning Nov. 9, 2020

Google Cloud AI Platform: Hyper-Accessible AI & Machine Learning - In this first article of the series, we present an overview of Google AI Platform, exploring the services available to modern data science.

Data Science Security Nov. 2, 2020

Understanding Data Encryption in Google Cloud - GCP Comics #4: Encryption to secure your data in cloud.

Apache Beam Cloud Dataflow Data Science Oct. 26, 2020

Dataflow and Apache Beam, the Result of a Learning Process Since MapReduce - An overview of Apache Beam and Cloud Dataflow.

BigQuery Data Science Oct. 19, 2020

Explore Public Datasets with Google BigQuery and DataStudio - Exploring and Reporting Massive Datasets Right Inside Your Web-browser — With an example of COVID-19 Dataset.

Data Science GCP Certification Oct. 19, 2020

How To Pass Google Cloud Professional Data Engineer Exam without IT background. - Passing Data Engineer certification exam with non-IT background.

AI Platform Prediction Data Science Machine Learning TensorFlow Oct. 12, 2020

Lightweight yet scalable TensorFlow workflow on Google Cloud - My superpower toolkit: TFRecorder, TensorFlow Cloud, AI Platform Predictions and Weights & Biases.

Cloud Build Data Science Looker Machine Learning Oct. 5, 2020

Operationalizing BigQuery ML through Cloud Build and Looker - Implementing MLOps with BigQuery ML, Cloud Build and Looker.

AI Platform Cloud SQL Data Science Sept. 28, 2020

Accessing Cloud SQL Data from AI Platform using Python - This article talks about a workaround to access data in Cloud SQL DB from the AI Platform.

BigQuery Data Science GIS Sept. 21, 2020

A beginner’s Guide to Google’s BigQuery GIS - Get started for free with Google BigQuery GIS using this step-by-step tutorial.

Big Data BigQuery Data Science Aug. 31, 2020

Google Cloud for Genomics - Building a scalable, reproducible, and secure data processing pipeline on the cloud.

Data Science Aug. 31, 2020

How I passed the Google Professional Data Engineer Exam in 2020 - In 8 days. Quick learner’s guide for those who don’t have time to read the manuals. August 2020.

Data Science Machine Learning Aug. 31, 2020

Managing Your Machine Learning Experiments with MLflow - Deploying MLflow server on GCP.

Data Science Machine Learning Aug. 17, 2020

Scalable Machine Learning with Dask on Google Cloud - A great addition to your arsenal of data science tools, Dask provides you advanced parallelism for computation at scale.

BigQuery Data Science Aug. 10, 2020

Yet another way to generate fake datasets in BigQuery - Wrapping faker.js with a Javascript UDF.

BigQuery Data Science Public Datasets July 27, 2020

Data Science 101 for Startups- Aggregation in SQL — Part 2 - Using aggregation SQL functions on BigQuery public dataset.

Cloud Dataproc Data Science Jupyter Notebook Tutorial July 27, 2020

Getting Started with Jupyter + Spark on the Cloud in 2020 - Spinning up Spark clusters with Jupyter on Cloud Dataproc.

Data Science Machine Learning July 27, 2020

Building a Data Platform to Enable Analytics and AI-Driven Innovation - Build a Data Mesh & Set up MLOps.

BigQuery Data Science Python July 20, 2020

BigQuery + Python for Production Data Science - Accessing BigQuery using Pandas, PySpark, and OS/Python.

BigQuery Data Science Public Datasets July 20, 2020

Data Science 101 for Startups- Aggregation in SQL - Aggregation concepts illustrated with examples from BigQuery.

BigQuery Data Science Machine Learning Public Datasets July 13, 2020

Stack Overflow in 2023: Predicting with ARIMA and BigQuery - Predicting the top Stack Overflow tags with ARIMA model in BigQuery.

Data Science Machine Learning Tutorial July 13, 2020

Building Image Detection with Google Cloud AutoML - Building "snack classifier" with AutoML Vision.

BigQuery Data Science July 6, 2020

Get started with BigQuery and dbt, the easy way - Find here the quickest way to get started with dbt and BigQuery using only free offerings from Google Cloud.

AI Platform Data Science July 6, 2020

Using GCP’s AI Platform to Predict Customer Churn - Developing a classification model to address customer churn.

BigQuery Cloud Functions Data Science Python July 6, 2020

Part 2: Building a Simple ETL Pipeline with Python and Google Cloud Functions — MySQL to BigQuery - Extracting data from a MySQL database and loading into Google BigQuery using Google Cloud Functions.

BigQuery Data Science Machine Learning July 6, 2020

Visualizing Pitcher Clusters: A Next OnAir Digital Experience - Analyzing baseball pitchers.

BigQuery Data Science July 3, 2020

How to handle Google Analytics data in BigQuery - Tips and tricks for tackling Sharded Tables and ARRAYs in BigQuery tables.

Data Science Machine Learning Python TensorFlow July 3, 2020

Model with TensorFlow and Serve on Google Cloud Platform - Serving TensorFlow Models on a scalable cloud platform.

BigQuery Data Science June 29, 2020

BigQuery: Creating Nested Data with SQL - Working with SQL on nested data in BigQuery can be very performant. But what if your data comes in flat tables like CSV’s?

BigQuery Data Science June 29, 2020

Easy pivot() in BigQuery, finally - Using dynamic SQL and stored procedures to pivot in BigQuery.

BigQuery Data Science June 29, 2020

Custom cohort size using Range Bucket in SQL. - Using RANGE_BUCKET command in BigQuery.

BigQuery Data Science Public Datasets June 15, 2020

Intro to BigQuery and its Free Data Sets - A quick introduction on how to access and query Google’s BigQuery using their free public datasets.

BigQuery Data Science June 8, 2020

Zero to Differential Privacy in 5 minutes on Google BigQuery - Differential Privacy presents a framework for asking statistical questions about a dataset while provably maintaining the privacy of the entities within that dataset.

BigQuery Data Science June 1, 2020

The Best Way to Generate Indices in BigQuery - Using GENERATE_ARRAY for Histograms and More.

AI Platform Notebooks Big Data Data Science Machine Learning June 1, 2020

Hands-on Big Data Analysis on GCP Using AI Platform Notebooks - Example of working with AI Platform Notebooks.

Cloud Composer Compute Engine Data Science May 18, 2020

Airflow on GCP (May 2020) - This is a complete guide to install Apache Airflow on a Google Cloud Platform Virtual Machine from scratch.

Big Data Data Catalog Data Science May 18, 2020

Google Cloud Data Catalog — Integrate Your On-Prem RDBMS Metadata - Code samples with a practical approach on how to ingest metadata from on-premise Relational Databases into Google Cloud Data Catalog.

Data Science Machine Learning Serverless May 11, 2020

13 Most Common Google Cloud Reference Architectures - Summary of #13DaysOfGCP architecture Twitter series.

BigQuery Data Science April 27, 2020

How to UNPIVOT multiple columns into tidy pairs with SQL and BigQuery - This post is for anyone dealing with time series in CSVs with one new column for each day.

BigQuery Data Science Data Studio Visualization April 27, 2020

Empowering Apple Mobility Trends Reports with BigQuery and Data Studio - Analyzing Apple's mobility data using BigQuery and Data Studio.

BigQuery Cloud Dataproc Data Science Jupyter Notebook March 16, 2020

Apache Spark and Jupyter Notebooks made easy with Dataproc component gateway - Make use of the new Dataproc optional components and component gateway features to easily use Jupyter Notebooks.

BigQuery Data Science Public Datasets March 16, 2020

Data analysis with SQL and BigQuery on New york city bikes data. - Starting with New York biking open data analysis.

Data Science Jupyter Notebook Machine Learning March 16, 2020

Setting Up Jupyter on Google Cloud - A scriptable list of command lines to deploy Jupyter in Google Cloud, securely and cost-effectively, with added exercises.

Beginner Cloud Composer Cloud Dataproc Data Science March 9, 2020

A gentle introduction to Data Workflows with Apache Airflow and Apache Spark - A tutorial on using Cloud Composer (Airflow) to launch Spark jobs on Cloud Dataproc.

AI Platform AI Platform Notebooks Data Science March 9, 2020

Reducing Startup Time For Notebooks With Custom Containers - Have you ever tried to use Cloud AI Platform Notebooks with huge containers?

BigQuery Data Science March 2, 2020

What do party schools and energy efficiency have in common? - Using BigQuery to analyze public data on building energy use.

AI Platform Data Science Docker Machine Learning Python March 2, 2020

Serverless machine learning using Docker - Running containers in Google AI Platform.

Data Science Serverless March 2, 2020

Introducing Serverless Orchestration with Houston - Serverless workflow control on Google Cloud Platform.

BigQuery Data Science Data Studio Feb. 24, 2020

Reddit AmItheAsshole is nicer to women than to men — a SQL proof? - Analyzing Reddit posts with BigQuery and visualizing in Data Studio.

Compute Engine Data Science Feb. 24, 2020

Jupyter Notebook on Google Compute Engine with HTTPS - Setting up Jupyter to run on Google Compute Engine and be accessed via HTTPS.

Data Science Machine Learning Feb. 17, 2020

All things GCP: Machine Learning Decision pyramid - Understand which Google Cloud tools match your needs best.

Apache Beam BigQuery Data Science Jan. 27, 2020

Fastai batch prediction on a BigQuery table - From this article, you will get to know how to perform a batch prediction on a BigQuery table using a fastai model.

BigQuery Data Science Data Studio Jan. 27, 2020

Interactive: The top 2019 Wikipedia pages - Going deeper into Wikipedia's most popular pages for 2019 with BigQuery and Data Studio.

BigQuery Data Science Jan. 27, 2020

Inequality: How to draw a Lorenz curve with SQL, BigQuery, and Data Studio - Analyzing the popularity of Wikipedia pages based on a public dataset.

AI Platform Data Science Machine Learning Python Jan. 20, 2020

Using Scikit-learn on Google Cloud Platform - Training Scikit-learn models on GCP’s AI Platform.

Data Science Dec. 16, 2019

This is how you put the data in Data Science! - Google's search engine for Datasets.

BigQuery Data Science Dec. 9, 2019

Advent of code: SQL + BigQuery - Solving the Advent of Code challenges with SQL and BigQuery.

Data Science Machine Learning Dec. 2, 2019

Get started or improve your Machine Learning of structured data using AutoML Tables! (Part 1) - Challenges we are trying to solve and part 2 will go…

AI Platform Data Science Machine Learning Python Nov. 25, 2019

Predicting Taxi fares in NYC using Google Cloud AI Platform (Billion + rows) Part 3 - The objective of this series of articles is to create a Machine Learning model that is able to estimate taxi fares in NYC before the ride commences.

Big Data BigQuery Data Science GCP Experience Nov. 18, 2019

Batch Processing Pipelines for Better Data Analysis - An overview of how Gojek is using batch processing to generate useful insights from our data warehouse.

Big Data BigQuery Data Science Nov. 18, 2019

BigQuery workflow from the Jupyter notebook - In this article, you will get to know how to create and schedule the BigQuery workflow using the Jupyter Lab and the Cloud Composer.

App Engine BigQuery Data Science Python Nov. 18, 2019

Python / Pandas & BigQuery in 7 minutes - Using BigQuery in Django app.

Data Catalog Data Science Nov. 18, 2019

Boosting the Data Governance journey with Google Cloud Data Catalog - Thoughts on data discovery and metadata management in Google Cloud.

Data Science Kubernetes Machine Learning Nov. 18, 2019

MiniKF is now available on the GCP Marketplace - MiniKF is the fastest and easiest way to get started with Kubeflow. With just a few clicks, you are ready for experimentation, and for running complete Kubeflow Pipelines.

BigQuery Data Science Nov. 11, 2019

Anomaly Detection With SQL - Demonstrating SQL anomaly detection on a public dataset in BigQuery.

BigQuery Data Science Machine Learning Nov. 11, 2019

ML Design Pattern #5: Repeatable sampling - Use a well-distributed column to split your data into train/valid/test sets.

BigQuery Data Science Data Studio Nov. 4, 2019

Analyzing the crisis with reddit and BigQuery: 2019 Chilean protests - Analyzing and visualizing data from Reddit with BigQuery and Data Studio.

Big Data BigQuery Data Science Nov. 4, 2019

Let the kids into the library - An opinionated attempt at building a data driven company in the cloud.

Beginner Data Science Machine Learning Nov. 4, 2019

Using a cluster in the cloud for Data Science projects in 4 simple steps - Tutorial on how to set up Jupyter notebook on GCP.

Big Data BigQuery Data Science Python Oct. 28, 2019

How to get into BigQuery analysis on Kaggle with Python? - Exploring ways to use BigQuery in Kaggle.

Big Data Data Science Oct. 28, 2019

A gentle introduction to Apache Druid in Google Cloud Platform - The article describes how to set up and use Apache Druid on GCP.

Data Science Machine Learning TensorFlow Oct. 28, 2019

Predicting Taxi fares in NYC using Google Cloud AI Platform (Billion + rows) Part 2 - Using data from BigQuery to create a TensorFlow model for predicting taxi fares in NYC.

BigQuery Data Science Sept. 30, 2019

10 top tips: Unleash your BigQuery superpowers - If BigQuery were a superhero, what kind of superpowers would it have?

BigQuery Data Science Sept. 16, 2019

Loading MySQL backup files into BigQuery — straight from Cloud SQL - Loading Cloud SQL MySQL backup data into BigQuery.

BigQuery Data Science Python Sept. 2, 2019

Slow BigQuery results no more - How using the BigQuery Storage API speeds up retrieving results from BigQuery.

BigQuery Data Science Data Studio Aug. 19, 2019

Don’t Double Park in Brooklyn - Analyzing New York's open data about state vehicle registration using BigQuery and Data Studio.

AI Data Science Machine Learning Aug. 19, 2019

How to Upgrade Colab with More Compute - Learn how to use Google Cloud Platform’s Deep Learning VMs to power up your Colab environment, on this episode of AI Adventures.

Data Science Aug. 12, 2019

4 Data Studio tricks - UX and UI tips for Data Studio.

BigQuery Data Science July 29, 2019

BigQuery: SQL on Nested Data - Examples of working with nested data in BigQuery.

BigQuery Data Science Machine Learning July 29, 2019

Clustering 4,000 Stack Overflow tags with BigQuery k-means - Using BigQuery ML to cluster tags from StackOverflow.

Big Data BigQuery Data Science Java July 15, 2019

Beast: Moving Data from Kafka to BigQuery - GOJEK’s open source solution for moving data from Kafka to Google BigQuery.

Data Science DevOps Kubernetes Machine Learning July 15, 2019

Automated Model Retraining with Kubeflow Pipelines - How to implement a reproducible ML workflow that adapts to new data.

BigQuery Data Science July 8, 2019

New in BigQuery: Persistent UDFs - Using new functionality of saving User Defined Functions in BigQuery.

BigQuery Data Science Python July 8, 2019

BigQuery and Public Datasets. Overview for Data Analysts - In this article we’ll briefly explore what is BigQuery and how a data analyst can access and use it through various interfaces with…

BigQuery Data Science July 8, 2019

An open source Python package for moving HelpScout data into Google BigQuery - This article is written for business analysts, data scientists and engineers that need to integrate Help Scout data into their Google BigQuery pipeline, and have hands-on experience dealing with Python, APIs and SQL databases.

Big Data Data Analytics Data Catalog Data Science July 8, 2019

Google Cloud Data Catalog hands-on guide: templates & tags with Python - This quickstart guide brings a practitioner approach to Data Catalog, covering Templates & Tags management using the Python client library.

BigQuery Data Science Data Studio June 24, 2019

From College to the Pros with Google Cloud Platform (Part 1) - Gathering and analyzing NBA player stats.

Data Science GCP Certification June 24, 2019

10 Days to Become a Google Cloud Certified Professional Data Engineer - Overview of resources used for Data Engineer exam preparation.

Data Science R June 24, 2019

From College to the Pros with Google Cloud Platform (Part 2) - The second part of NBA players analysis.

Cloud Dataproc Data Science June 17, 2019

Scale out RAPIDS on Google Cloud Dataproc - Scaling GPU data jobs on Cloud Dataproc.

BigQuery Data Science GCP Experience June 17, 2019

Analytics at lightspeed with Google BigQuery - The article describes how Aditya Birla Group created a digital platform on GCP to manage the travel of their employees.

Data Science June 17, 2019

Setup Julia with Jupyter notebook on Google Cloud Platform - Tutorial on how to set up and use Julia on Jupyter notebooks hosted on GCP.

Data Science Security June 10, 2019

How to use cloud storage to securely load data into Neo4j - Methods for loading data into a remote Neo4j Instance — Part 2

Apache Beam Cloud Dataflow Data Science Python May 13, 2019

Let’s Build a Streaming Data Pipeline - Creating Apache Beam / DataFlow pipeline to parse web server logs.

Data Science GCP Certification May 13, 2019

Passing the (new) Google Professional Data Engineer exam within 7 weeks - Experience of preparing and taking Data Engineer certification.

Data Science April 15, 2019

How to get started with Google Colab and Kaggle - Example of using Colab for Kaggle competitions.

AI Data Science Machine Learning April 8, 2019

GCP Notebook Executor v0.1.2 - Executing long running Jupyter Notebook jobs on GCP.

BigQuery Cloud Dataflow Cloud Dataprep Data Science Machine Learning TensorFlow April 8, 2019

End-to-end churn prediction on Google Cloud Platform - Overview of a GCP architecture for customer churn prediction, comprising data acquisition, data wrangling, modeling, model deployment, and a business use case.

 

Contact

Zdenko Hrček
Třebanická 183
Prague, Czech Republic
Phone: +420 777 283 075
Email: [email protected]