This four-day instructor-led class gives participants a hands-on introduction to designing and building data processing systems on Google Cloud Platform. Through a combination of presentations, demos, and hands-on labs, participants will learn how to design data processing systems, build end-to-end data pipelines, analyze data, and carry out machine learning. The course covers structured, unstructured, and streaming data.
This class is intended for experienced developers who are responsible for managing big data transformations, including:
Extracting, loading, transforming, cleaning, and validating data
Designing pipelines and architectures for data processing
Creating and maintaining machine learning and statistical models
Querying datasets, visualizing query results, and creating reports
To get the most out of this course, participants should have:
Completed the Google Cloud Fundamentals: Big Data & Machine Learning course, or equivalent experience
Basic proficiency with a common query language such as SQL
Experience with data modeling and extract, transform, load (ETL) activities
Experience developing applications using a common programming language such as Python
Familiarity with machine learning and/or statistics
To get the most out of this course, participants should have:
Completed the Google Cloud Fundamentals: Core Infrastructure (GCPFCI) course, or equivalent experience
Basic proficiency with a common query language such as SQL
Experience with data modeling and extract, transform, load (ETL) activities
Experience developing applications using a common programming language such as Python
Familiarity with basic statistics
This course teaches participants the following skills:
Design and build data processing systems on Google Cloud Platform
Leverage unstructured data using Spark and ML APIs on Cloud Dataproc
Process batch and streaming data by implementing autoscaling data pipelines on Cloud Dataflow
Derive business insights from extremely large datasets using Google BigQuery
Train, evaluate, and predict with machine learning models using TensorFlow and Cloud ML
Enable instant insights from streaming data
Module 1: Introduction to Data Engineering
Module 2: Building a Data Lake
Module 3: Building a Data Warehouse
Module 4: Introduction to Building Batch Data Pipelines
Module 5: Executing Spark on Cloud Dataproc
Module 6: Serverless Data Processing with Cloud Dataflow
Module 7: Manage Data Pipelines with Cloud Data Fusion and Cloud Composer
Module 8: Introduction to Processing Streaming Data
Module 9: Serverless Messaging with Cloud Pub/Sub
Module 10: Cloud Dataflow Streaming Features
Module 11: High-Throughput BigQuery and Bigtable Streaming Features
Module 12: Advanced BigQuery Functionality and Performance
Module 13: Introduction to Analytics and AI
Module 14: Prebuilt ML model APIs for Unstructured Data
Module 15: Big Data Analytics with Cloud AI Platform Notebooks
Module 16: Production ML Pipelines with Kubeflow
Module 17: Custom Model building with SQL in BigQuery ML
Module 18: Custom Model building with Cloud AutoML
Join our public courses at our facilities in the United States of America. Private training classes can be arranged at a location of your choice, according to your schedule.