Welcome to issue #181, March 16th, 2020

With coronavirus touching everybody's lives, it's encouraging to see how people and companies are supporting each other and how technology and cloud have a big impact in these times.

On the GCP side, we have several CI/CD pipeline examples as well as the new AI Platform Pipelines.

 

News

AI Machine Learning

Introducing Cloud AI Platform Pipelines - AI Platform Pipelines provides enterprise-ready infrastructure for deploying and running structured ML workflows, and pipeline tools for building, debugging, and sharing pipelines and components.

Compute Engine Official Blog Storage

Local SSDs + VMs = love at first (tera)byte - In Compute Engine you can now attach 6TB and 9TB local SSDs to virtual machines (VMs) for higher throughput and IOPS per VM.

AWS BigQuery Data Analytics Official Blog

Modern analytics made easy with new Redshift, S3 migration tools - Data warehouse migrations to Google Cloud’s BigQuery are now easier with new tools to move your Redshift and S3 workloads.

Anthos Cloud Marketplace Data Analytics Official Blog

Get the Flink Operator for Kubernetes in Anthos on Marketplace - The open source Apache Flink for Kubernetes operator is now available in the Google Cloud Marketplace to deploy in your own cluster.

Compute Engine Machine Learning Official Blog

Compute Engine instance creation made easy with machine images - Compute Engine machine images capture everything you need to create a new instance.

Articles, Tutorials

Infrastructure, Networking, Security, Kubernetes

Official Blog SRE

Finding a problem at the bottom of the Google stack - See a real-world example of how Google’s SRE practices can identify and help fix issues, even at the bottom of the hardware stack.

CI Cloud Build Kubernetes Python

Parameterised Kubernetes deployments without Helm via GCP Cloud Build - Setting up Cloud Build for Google Kubernetes Engine CI/CD.

CI Cloud Build Kubernetes

CI/CD at Ai Incube - An explanation of the CI/CD system Ai Incube runs on GCP.

Billing

Managing Billing Permissions in Google Cloud - The article goes through basic steps to set up a billing account, permissions, etc.

Cloud Identity Aware Proxy Security

Identity-Aware Proxy for On-Prem applications - Using Identity Aware Proxy to secure internal systems at home.

IAM Security Tutorial

Improving Security with Impersonation - The article describes the impersonation of service accounts and how to set it up.

App Development, Serverless, Databases, DevOps

Cloud Pub/Sub Cloud Storage Data Loss Prevention API Security

Automating Cloud Storage Data Classification: Setup Cloud Storage and Pub/Sub - Automation of data classification in Cloud Storage for security and organizational purposes using Data Loss Prevention API.

Cloud Firestore Cloud Functions Cloud Vision API Firebase Javascript

How to Create an Image Translation Web App in 25 Lines of Code - Simple OCR web application using Firebase and other GCP products.

Cloud Functions Cloud SQL Terraform

How to use Terraform to schedule backups for your Google Cloud SQL database - Using Terraform to schedule daily backups of a Cloud SQL database to a Cloud Storage bucket.

App Engine Azure DevOps

Deploy GCP App Engine from Azure Release Pipelines - An interesting combination: deploying an App Engine app from Azure DevOps pipelines.

Dialogflow Serverless

A Healthy Dialogflow Part I: A Case for Google Cloud Platform’s NLP Service - A proof of concept for a health-related chatbot implemented with Dialogflow.

Knative Serverless

Knative Eventing Delivery Methods - The article explains delivery methods in Knative.

Big Data, Analytics, ML&AI

CI Cloud Data Fusion Data Analytics DevOps

CI/CD and Change Management for Pipelines — Part 1 - Process and steps to take in order to implement a CI/CD pipeline for CDAP (Cloud Data Fusion).

Monitoring Official Blog SRE

Use SRE principles to monitor pipelines with Cloud Monitoring dashboards - Try SRE principles and the four golden signals as the metrics to build a monitoring dashboard for your data pipelines.

BigQuery Data Analytics

How to query and calculate GA App + Web event data in BigQuery - An in-depth look at the new Firebase Google Analytics way of measurement and how to use the raw data in BigQuery.

BigQuery Cloud Dataproc Data Science Jupyter Notebook

Apache Spark and Jupyter Notebooks made easy with Dataproc component gateway - Make use of the new Dataproc optional components and component gateway features to easily use Jupyter Notebooks.

Data Science Jupyter Notebook Machine Learning

Setting Up Jupyter on Google Cloud - A scriptable list of command lines to deploy Jupyter in Google Cloud, securely and cost-effectively, with added exercises.

Big Data BigQuery Public Datasets

Processing 10TB of Wikipedia Page Views - Part 1 - Processing and uploading Wikipedia page views into BigQuery.

BigQuery Cloud Functions Stackdriver

Sending data from BigQuery to Intercom using Google Cloud Functions - Using Stackdriver triggering to send notifications when a BigQuery table is updated.

BigQuery GIS

Yet another GeoJson to Ndjson converter - You might have already seen many ways to convert GeoJson files to something BigQuery can understand. Let’s invent one more wheel!

BigQuery Data Science Public Datasets

Data analysis with SQL and BigQuery on New york city bikes data. - Starting with New York biking open data analysis.

BigQuery Machine Learning

Building an end to end Machine Learning Pipeline in Bigquery - Building a simple end to end BigQuery Machine Learning pipeline using open-source framework Dataform.

Machine Learning TPU Tutorial

Get started with PyTorch, Cloud TPUs, and Colab - Running Machine Learning With PyTorch on TPUs in Colab.

Apache Beam Cloud Dataflow TensorFlow

TensorFlow Extended (TFX): Using Apache Beam for large scale data processing - Using Apache Beam (Cloud Dataflow) for TensorFlow Extended pipelines.

Various

GCP Certification

How to pass GCP Professional Data Engineer exam in 2 months - Sharing the experience of preparing for and passing the GCP Professional Data Engineer certification.

GCP Certification

My journey of GCP Professional Data Engineer (2020)! - A journey to the GCP Professional Data Engineer exam.

Slides, Videos, Audio

GCP Podcast - #211 Digital Services with xMatters.

Kubernetes Podcast - #94 gRPC, with Richard Belleville.

 

Releases

BigQuery Transfer - BigQuery Data Transfer Service now supports the Finland region. BigQuery Data Transfer Service now supports the Zürich region.

Cloud Build - The Create trigger page on the Cloud Console has been updated.

Cloud Composer - You can now control access to the Airflow web server, either allowing access from any IP address (default), or specifying which IP ranges have access.

Config Connector - ComputeHealthCheck's location field now supports supplying a region. Fixed an issue with deleting StorageBucketAccessControl when the ServiceAccount did not exist: https://github.com/GoogleCloudPlatform/k8s-config-connector/issues/39. With the exception of role-bindings, all system components for namespaced mode have moved into the cnrm-system namespace. Note: to upgrade namespaced mode for this release, you must completely uninstall and reinstall. Added a version annotation to the Config Connector manifests.

Data Catalog - Support for custom entries is now in beta.

Dataproc - Added the following flags to the gcloud dataproc clusters create and gcloud dataproc workflow-templates set-managed-cluster commands: --num-secondary-workers, --num-secondary-worker-local-ssds, --secondary-worker-boot-disk-size, --secondary-worker-boot-disk-type, --secondary-worker-accelerator. The following flags of the same two commands have been deprecated: --num-preemptible-workers, --num-preemptible-worker-local-ssds, --preemptible-worker-boot-disk-size, --preemptible-worker-boot-disk-type, --preemptible-worker-accelerator. Use the new flags listed above in place of these deprecated flags.
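
For scripts that still pass the deprecated Dataproc flags, the rename could be sketched in Python roughly as follows. This is only an illustration: the one-to-one pairing of old and new flags is assumed from their matching names, and the helper migrate_flags is hypothetical, not part of gcloud.

```python
# Assumed pairing of deprecated flags to their replacements, inferred from
# the matching names in the release note (not an official mapping table).
DEPRECATED_TO_NEW = {
    "--num-preemptible-workers": "--num-secondary-workers",
    "--num-preemptible-worker-local-ssds": "--num-secondary-worker-local-ssds",
    "--preemptible-worker-boot-disk-size": "--secondary-worker-boot-disk-size",
    "--preemptible-worker-boot-disk-type": "--secondary-worker-boot-disk-type",
    "--preemptible-worker-accelerator": "--secondary-worker-accelerator",
}


def migrate_flags(args):
    """Return a copy of args with deprecated flag names replaced.

    Handles both "--flag=value" and bare "--flag" forms; anything that is
    not a deprecated flag is passed through unchanged.
    """
    migrated = []
    for arg in args:
        name, sep, value = arg.partition("=")
        migrated.append(DEPRECATED_TO_NEW.get(name, name) + sep + value)
    return migrated


if __name__ == "__main__":
    old_args = ["--num-preemptible-workers=2", "--region=us-central1"]
    print(migrate_flags(old_args))
```

The helper only rewrites argument lists; it does not call gcloud, so the result can be reviewed before it is used in a real command.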

Datastore - Support for us-west3 (Salt Lake City) and asia-northeast3 (Seoul).

Dialogflow - On March 16, 2020, the Inline Editor will use Cloud Functions instead of Cloud Functions for Firebase. Event names are now limited to 50 characters.

Cloud Firestore - Support for us-west3 (Salt Lake City) and asia-northeast3 (Seoul).

Stackdriver Logging - Cloud Logging Agent for Windows version 1-11 is now available. Logs Viewer (Preview) now contains a histogram panel.

Managed Microsoft AD - VPC Service Controls integration is now in beta.

Cloud Vision API - OCR model upgrades: the text_detection and document_text_detection models have been upgraded to newer versions.

VPC Service Controls - Beta stage support for: Managed Service for Microsoft Active Directory.

AI Platform - Runtime version 2.1 for AI Platform Prediction is now available. If you deploy a model version for online prediction that uses runtime version 2.1 with a GPU, AI Platform Prediction uses TensorFlow 2.0.0 (instead of TensorFlow 2.1.0) to serve predictions. Runtime version 2.1 for AI Platform Training is now available. Runtime version 2.1 includes scikit-learn 0.22 rather than 0.22.1. When you train with runtime version 2.1 or later, AI Platform Training uses the chief task name to represent the master VM in the TF_CONFIG environment variable.
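
Training code that checks its role through TF_CONFIG may need to accept both task names after this change. A minimal sketch, assuming the usual TF_CONFIG JSON layout with a "task" object that has a "type" field; the function name coordinator_task_type and the example host names are hypothetical:

```python
import json


def coordinator_task_type(tf_config_json):
    """Return "chief" or "master" if this task coordinates training, else None.

    Accepts both names so the same code works before and after
    runtime version 2.1.
    """
    config = json.loads(tf_config_json) if tf_config_json else {}
    task_type = config.get("task", {}).get("type")
    return task_type if task_type in ("chief", "master") else None


# Hypothetical TF_CONFIG value, for illustration only.
example = json.dumps({
    "cluster": {"chief": ["host0:2222"], "worker": ["host1:2222"]},
    "task": {"type": "chief", "index": 0},
})

if __name__ == "__main__":
    print(coordinator_task_type(example))
```

In a real job the argument would typically come from os.environ.get("TF_CONFIG", "").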

Dialogflow Enterprise - On March 16, 2020, the Inline Editor will use Cloud Functions instead of Cloud Functions for Firebase. Event names are now limited to 50 characters.

Secret Manager - Secret Manager is generally available.

 

Contact

Zdenko Hrček
Třebanická 183
Prague, Czech Republic
Phone: +420 777 283 075