Get hands-on experience designing and building data processing systems on Google Cloud. This course uses lectures, demos, and hands-on labs to show you how to design data processing systems, build end-to-end data pipelines, and analyze data. This course covers structured, unstructured, and streaming data.

Products:

BigQuery
Bigtable
Cloud Storage
Cloud SQL
Spanner
Dataproc
Dataflow
Cloud Data Fusion
Cloud Composer
Pub/Sub

We can organize this training at your preferred date and location.

Prerequisites

Participants should have:

Prior Google Cloud experience using Cloud Shell and accessing products from the Google Cloud console.
Basic proficiency with a common query language such as SQL.
Experience with data modeling and ETL (extract, transform, load) activities.
Experience developing applications using a common programming language such as Python.

Target audience

This course is designed for:

Data engineers
Database administrators
System administrators

What You Will Learn

By the end of this course, learners will be able to:

Design and build data processing systems on Google Cloud.
Process batch and streaming data by implementing autoscaling data pipelines on Dataflow.
Derive business insights from extremely large datasets using BigQuery.
Leverage unstructured data using Spark and ML APIs on Dataproc.
Enable instant insights from streaming data.

Training Outline

Module 01: Data engineering tasks and components

The role of a data engineer
Data sources versus data sinks
Data formats
Storage solution options on Google Cloud
Metadata management options on Google Cloud
Share datasets using Analytics Hub

Module 02: Data replication and migration

Replication and migration architecture
The gcloud command line tool
Moving datasets
Datastream

Module 03: The extract and load data pipeline pattern

Extract and load architecture
The bq command line tool
BigQuery Data Transfer Service
BigLake

Module 04: The extract, load, and transform data pipeline pattern

Extract, load, and transform (ELT) architecture
SQL scripting and scheduling with BigQuery
Dataform

Module 05: The extract, transform, and load data pipeline pattern

Extract, transform, and load (ETL) architecture
Google Cloud GUI tools for ETL data pipelines
Batch data processing using Dataproc
Streaming data processing options
Bigtable and data pipelines

Module 06: Automation techniques

Automation patterns and options for pipelines
Cloud Scheduler and Workflows
Cloud Composer
Cloud Run functions
Eventarc

Module 07: Introduction to data engineering

Data engineer’s role
Data engineering challenges
Introduction to BigQuery
Data lakes and data warehouses
Transactional databases versus data warehouses
Effective partnership with other data teams
Management of data access and governance
Building of production-ready pipelines
Google Cloud customer case study

Module 08: Build a data lake

Introduction to data lakes
Data storage and ETL options on Google Cloud
Building of a data lake using Cloud Storage
Secure Cloud Storage
Store all sorts of data types
Cloud SQL as your OLTP system

Module 09: Build a data warehouse

The modern data warehouse
Introduction to BigQuery
Get started with BigQuery
Loading of data into BigQuery
Exploration of schemas
Schema design
Nested and repeated fields
Optimization with partitioning and clustering

Module 10: Introduction to building batch data pipelines

EL, ELT, ETL
Quality considerations
Ways of executing operations in BigQuery
Shortcomings
ETL to solve data quality issues

Module 11: Execute Spark on Dataproc

The Hadoop ecosystem
Run Hadoop on Dataproc
Cloud Storage instead of HDFS
Optimize Dataproc

Module 12: Serverless data processing with Dataflow

Introduction to Dataflow
Reasons why customers value Dataflow
Dataflow pipelines
Aggregating with GroupByKey and Combine
Side inputs and windows
Dataflow templates

Module 13: Manage data pipelines with Cloud Data Fusion and Cloud Composer

Build batch data pipelines visually with Cloud Data Fusion
Components
Overview
Building a pipeline
Exploring data using Wrangler
Orchestrate work between Google Cloud services with Cloud Composer
Apache Airflow environment
DAGs and operators
Workflow scheduling
Monitoring and logging

Module 14: Serverless messaging with Pub/Sub

Introduction to Pub/Sub
Pub/Sub push versus pull
Publishing with Pub/Sub code

Module 16: Dataflow streaming features

Streaming data challenges
Dataflow windowing

Module 17: High-throughput BigQuery and Bigtable streaming features

Streaming into BigQuery and visualizing results
High-throughput streaming with Bigtable
Optimizing Bigtable performance

Module 18: Advanced BigQuery functionality and performance

Analytic window functions
GIS functions
Performance considerations

Exams and assessments

There is no specific certification related to this course.

Hands-on learning

There are practical labs in this course.

Why Choose Us

Experience Data Engineering on Google Cloud through Bilginç IT Academy's live and interactive virtual classroom environment, accessible from your home, office, or any location. Connect with expert trainers in real time and bring the energy of classroom learning into the digital experience.

Live Instructor-Led Sessions: Join scheduled training sessions with your instructor and fellow delegates in real time.
Interactive Learning Experience: Take part in discussions, practical exercises, group activities, and Q&A sessions throughout the course.
Expert Trainer Network: Learn from experienced trainers with strong industry backgrounds and practical field expertise.
Over 30 Years of Training Expertise: Benefit from Bilginç IT Academy's long-standing experience in delivering professional training since 1995.
Flexible and Scalable Delivery: Access live virtual classrooms worldwide with flexible planning options for individual and corporate training needs.

Experience Data Engineering on Google Cloud in a focused classroom environment designed for high engagement and effective learning. Bilginç IT Academy's carefully selected training venues provide a professional setting where delegates can interact directly with expert trainers and peers.

Experienced Trainers: Learn from specialists with extensive field experience and real-world knowledge.
Professional Training Venues: Attend courses in comfortable, well-equipped classrooms designed to support effective learning.
Focused Classroom Experience: Benefit from limited class sizes that encourage discussion, interaction, and personalized support.
Quality-Driven Learning: Develop practical skills through structured, up-to-date, and professionally designed training content.

Meet your team's training needs with Bilginç IT Academy's onsite Data Engineering on Google Cloud solution, delivered at your office or preferred location. Align your team's development with your business goals through a training experience tailored to your organization.

Tailored Course Content: Adapt the training program to your organization's projects, team structure, and specific business requirements.
Time and Cost Efficiency: Reduce travel, accommodation, and operational costs while maximizing the value of your training investment.
Team-Focused Learning: Help your employees develop around the same knowledge base and strengthen collaboration across your organization.
Simplified Planning and Tracking: Manage the training process, participant development, and organizational requirements with greater control.
