Building Batch Data Pipelines on Google Cloud Training in Germany

  • Learn via: Classroom
  • Duration: 1 Day
  • Level: Intermediate
  • Price: From €1,365+VAT
We can host this training at your preferred location. Contact us!

Data pipelines typically fall under one of the Extract-Load (EL), Extract-Load-Transform (ELT) or Extract-Transform-Load (ETL) paradigms. This course describes which paradigm to use, and when, for batch data. It also covers several Google Cloud technologies for data transformation, including BigQuery, executing Spark on Dataproc, pipeline graphs in Cloud Data Fusion and serverless data processing with Dataflow. Learners get hands-on experience building data pipeline components on Google Cloud using Qwiklabs.

Introduction to Building Batch Data Pipelines

This module reviews the different methods of data loading (EL, ELT and ETL) and when to use each.

  • Module introduction
  • EL, ELT, ETL
  • Quality considerations
  • How to carry out operations in BigQuery
  • Shortcomings
  • ETL to solve data quality issues
  • QUIZ: Introduction to Building Batch Data Pipelines

Executing Spark on Dataproc

This module shows how to run Hadoop on Dataproc, how to leverage Cloud Storage, and how to optimize your Dataproc jobs.

  • Module introduction
  • The Hadoop ecosystem
  • Running Hadoop on Dataproc
  • Cloud Storage instead of HDFS
  • Optimizing Dataproc
  • Optimizing Dataproc storage
  • Optimizing Dataproc templates and autoscaling
  • Optimizing Dataproc monitoring
  • Lab Intro: Running Apache Spark jobs on Dataproc
  • LAB: Running Apache Spark jobs on Cloud Dataproc: This lab focuses on running Apache Spark jobs on Cloud Dataproc.
  • Summary
  • QUIZ

Serverless Data Processing with Dataflow

This module covers using Dataflow to build your data processing pipelines.

  • Module introduction
  • Introduction to Dataflow
  • Why customers value Dataflow
  • Building Dataflow pipelines in code
  • Key considerations when designing pipelines
  • Transforming data with PTransforms
  • Lab Intro: Building a Simple Dataflow Pipeline
  • LAB: A Simple Dataflow Pipeline (Python) 2.5: In this lab, you learn how to write a simple Dataflow pipeline and run it both locally and on the cloud.
  • LAB: Serverless Data Analysis with Dataflow: A Simple Dataflow Pipeline (Java): In this lab you will open a Dataflow project, use pipeline filtering, and execute the pipeline locally and on the cloud using Java.
  • Aggregate with GroupByKey and Combine
  • Lab Intro: MapReduce in Beam
  • LAB: MapReduce in Beam (Python) 2.5: In this lab, you learn how to use pipeline options and carry out Map and Reduce operations in Dataflow.
  • LAB: Serverless Data Analysis with Beam: MapReduce in Beam (Java): In this lab you will identify Map and Reduce operations, execute the pipeline, and use command-line parameters.
  • Side inputs and windows of data
  • Lab Intro: Practicing Pipeline Side Inputs
  • LAB: Serverless Data Analysis with Dataflow: Side Inputs (Python): In this lab you will try out a BigQuery query, explore the pipeline code, and execute the pipeline using Python.
  • LAB: Serverless Data Analysis with Dataflow: Side Inputs (Java): In this lab you will try out a BigQuery query, explore the pipeline code, and execute the pipeline using Java.
  • Creating and re-using pipeline templates
  • Summary
  • QUIZ

Manage Data Pipelines with Cloud Data Fusion and Cloud Composer

This module shows how to manage data pipelines with Cloud Data Fusion and Cloud Composer.

  • Module introduction
  • Introduction to Cloud Data Fusion
  • Components of Cloud Data Fusion
  • Cloud Data Fusion UI
  • Build a pipeline
  • Explore data using Wrangler
  • Lab Intro: Building and executing a pipeline graph in Cloud Data Fusion
  • LAB: Building and Executing a Pipeline Graph with Data Fusion 2.5: This tutorial shows you how to use the Wrangler and Data Pipeline features in Cloud Data Fusion to clean, transform, and process taxi trip data for further analysis.
  • Orchestrate work between Google Cloud services with Cloud Composer
  • Apache Airflow environment
  • DAGs and Operators
  • Workflow scheduling
  • Monitoring and Logging
  • Lab Intro: An Introduction to Cloud Composer
  • LAB: An Introduction to Cloud Composer 2.5: In this lab, you create a Cloud Composer environment using the GCP Console. You then use the Airflow web interface to run a workflow that verifies a data file, creates and runs an Apache Hadoop wordcount job on a Dataproc cluster, and deletes the cluster.
  • QUIZ



Contact us for more details about our training courses and for all other enquiries!

Upcoming Trainings

Join our public courses at our facilities in Germany. Private classes can be organized at the location of your preference, according to your schedule.

Classroom / Virtual Classroom
23 November 2024
Berlin, Hamburg, Munich
1 Day
Classroom / Virtual Classroom
24 January 2025
Berlin, Hamburg, Munich
1 Day
Classroom / Virtual Classroom
27 January 2025
Berlin, Hamburg, Munich
1 Day
Classroom / Virtual Classroom
03 February 2025
Berlin, Hamburg, Munich
1 Day
Classroom / Virtual Classroom
12 February 2025
Berlin, Hamburg, Munich
1 Day
Building Batch Data Pipelines on Google Cloud Training Course in Germany

The Federal Republic of Germany is the second most populous country in Europe and is located in Central Europe. The official language of the country is German. Germany is one of the richest countries in the world. The main exports of the country include motor vehicles and iron and steel products.

Here are some fun facts about Germany:
The Brothers Grimm, the famous fairy tale writers, came from Germany and published many well-known stories such as Cinderella, Snow White, and Sleeping Beauty.
Germany is home to the largest theme park in Europe, the Europa-Park.
The famous composer Ludwig van Beethoven was born in Germany.
The Autobahn, the German highway system, is known for having no general speed limit.


Berlin was divided by the Berlin Wall from 1961 to 1989. Known for its street art, Berlin has many colorful murals and graffiti throughout the city. Berlin is also home to many famous museums, such as the Pergamon Museum on Museum Island. Many clubs and bars in this big city stay open until the early hours of the morning.

Another popular city is Munich, which is famous for its Oktoberfest beer festival that attracts millions of visitors every year. Munich is also home to many historic buildings, including Nymphenburg Palace and the Marienplatz town square.

The country's capital and largest city is Berlin; however, Frankfurt is considered to be the business and financial center of Germany. It is home to the Frankfurt Stock Exchange, the European Central Bank, and many other financial institutions. Because of its central location within Europe and its status as a major financial hub, Frankfurt is often referred to as "Mainhattan," a play on the River Main, on which the city lies, and its association with the Manhattan financial district in New York City.

Frankfurt is also a major transportation hub, with the largest airport in Germany and one of the largest in Europe, Frankfurt Airport. Additionally, it is a popular destination for tourists, with its historic city center, beautiful parks, and vibrant cultural scene.

Some of the top German technology companies, such as Siemens AG, Bosch, SAP SE, Deutsche Telekom, Daimler AG and Volkswagen, have business centers in Frankfurt. The country has a strong tradition of engineering and innovation, and is home to many other world-class technology companies and research institutions.

Tailored to meet the specific needs of Germany, Bilginç IT Academy combines cutting-edge training methodologies with our comprehensive range of Certification Exam preparation courses and accredited corporate training programs. Experience a transformative approach to IT training that will redefine your expectations.