Introduction
Introduction to Apache Hadoop and the Hadoop Ecosystem
- Apache Hadoop Overview
 - Data Storage and Ingest
 - Data Processing
 - Data Analysis and Exploration
 - Other Ecosystem Tools
 - Introduction to the Hands-On Exercises
 - Apache Hadoop File Storage
 - Problems with Traditional
 
Large-Scale Systems
- HDFS Architecture
 - Using HDFS
 - Apache Hadoop File Formats
 
Data Processing on an Apache Hadoop Cluster
- YARN Architecture
 - Working With YARN
 
Importing Relational Data with Apache Sqoop
- Apache Sqoop Overview
 - Importing Data
 - Importing File Options
 - Exporting Data
 
Apache Spark Basics
- What is Apache Spark?
 - Using the Spark Shell
 - RDDs (Resilient Distributed Datasets)
 - Functional Programming in Spark
 
Working with RDDs
- Creating RDDs
 - Other General RDD Operations
 
Aggregating Data with Pair RDDs
- Key-Value Pair RDDs
 - Map-Reduce
 - Other Pair RDD Operations
 
Writing and Running Apache Spark Applications
- Spark Applications vs. Spark Shell
 - Creating the SparkContext
 - Building a Spark Application
 
(Scala and Java)
- Running a Spark Application
 - The Spark Application Web UI
 
Configuring Apache Spark Applications
- Configuring Spark Properties
 - Logging
 
Parallel Processing in Apache Spark
- Review: Apache Spark on a Cluster
 - RDD Partitions
 - Partitioning of File-Based RDDs
 - HDFS and Data Locality
 - Executing Parallel Operations
 - Stages and Tasks
 
RDD Persistence
- RDD Lineage
 - RDD Persistence Overview
 - Distributed Persistence
 
Common Patterns in Apache Spark
Data Processing
- Common Apache Spark Use Cases
 - Iterative Algorithms in Apache Spark
 - Machine Learning
 - Example: k-means
 
DataFrames and Spark SQL
- Apache Spark SQL and the SQL Context
 - Creating DataFrames
 - Transforming and Querying DataFrames
 - Saving DataFrames
 - DataFrames and RDDs
 - Comparing Apache Spark SQL, Impala, and Hive-on-Spark
 - Apache Spark SQL in Spark 2.x
 
Message Processing with Apache Kafka
- What is Apache Kafka?
 - Apache Kafka Overview
 - Scaling Apache Kafka
 - Apache Kafka Cluster Architecture
 - Apache Kafka Command Line Tools
 
Capturing Data with Apache Flume
- What is Apache Flume?
 - Basic Flume Architecture
 - Flume Sources
 - Flume Sinks
 - Flume Channels
 - Flume Configuration
 
Integrating Apache Flume and Apache Kafka
- Overview
 - Use Cases
 - Configuration
 
Apache Spark Streaming:
Introduction to DStreams
- Apache Spark Streaming Overview
 - Example: Streaming Request Count
 - DStreams
 - Developing Streaming Applications
 
Apache Spark Streaming:
Processing Multiple Batches
- Multi-Batch Operations
 - Time Slicing
 - State Operations
 - Sliding Window Operations
 
Apache Spark Streaming: Data Sources
- Streaming Data Source Overview
 - Apache Flume and Apache Kafka
 
Data Sources
- Example: Using a Kafka Direct Data Source
 
Conclusion
                             
                            
MINISTRY OF INTERNAL AFFAIRS
Rated the training 5 stars.
Turkcell
Rated the training 5 stars.
Turkcell
Rated the training 5 stars.
Turkcell
Rated the training 5 stars.
Turkcell
Rated the training 5 stars.
Amadeus
Rated the training 5 stars.
Amadeus
Rated the training 5 stars.
Amadeus
Rated the training 5 stars.
Amadeus
Rated the training 5 stars.
Amadeus
Rated the training 5 stars.