This course is intended for systems administrators who will be responsible for the design, installation, configuration, and management of the Hortonworks Data Platform (HDP). The course provides in-depth knowledge and experience in using Apache Ambari as the operational management platform for HDP. This course presumes no prior knowledge or experience with Hadoop. Download the data sheet above for the full course details and objectives.
This course is the updated version of the previous HDP Operations: Hadoop Administration I class.
Prerequisites
Attendees must have experience working in a Linux environment with standard Linux system commands and should be able to read and execute basic Linux shell scripts. In addition, it is recommended that students have some operational experience in data center practices.
Who Should Attend
The target audience for this course includes Linux administrators and system operators responsible for installing, configuring and managing an HDP cluster.
Outline
Introduction to Big Data, Hadoop and the Hortonworks Data Platform
- Describe Apache Hadoop
- List Hadoop Cluster Management Choices
- Identify Hadoop Cluster Deployment Options
- Perform an Interactive HDP Installation using Apache Ambari
- Manage Users, Groups and Permissions
- Summarize Operations of the Web UI Tool
- Perform HDFS Shell Operations
LABS
- Setting Up the Environment
- Installing HDP
- Managing Ambari Users and Groups
- Managing Hadoop Services
- Using HDFS Storage
- Using WebHDFS
- Using HDFS Access Control Lists
Managing HDFS Storage, Rack Awareness, HDFS Snapshots and HDFS Centralized Cache
- Describe HDFS Architecture and Operation
- Manage HDFS using Ambari Web, NameNode and DataNode UIs
- Manage HDFS using Command-line Tools
- Summarize the Purpose and Benefits of Rack Awareness
- Summarize Hadoop Backup Considerations
- Summarize the Purpose and Operation of HDFS Centralized Caching
- Identify HDFS NFS Gateway Use Cases
- Install and Configure an HDFS NFS Gateway
LABS
- Managing HDFS Storage
- Managing HDFS Quotas
- Configuring Rack Awareness
- Managing HDFS Snapshots
- Using DistCP
- Configuring HDFS Storage Policies
- Configuring HDFS Centralized Cache
- Configuring an NFS Gateway
An Introduction to YARN
- Describe YARN Resource Management
- Summarize YARN Architecture and Operation
- Identify and Use YARN Management Options
- Understand the Basics of Running Simple YARN Applications
- Configure and Manage YARN Queues
- Summarize the Purpose and Operation of YARN Node Labels
- Run Test Jobs to Confirm Node Label Behavior
LABS
- Managing YARN Using Ambari
- Managing YARN Using CLI
- Running Sample YARN Applications
- Setting Up for Capacity Scheduler
- Managing YARN Containers and Queues
- Managing YARN ACLs and User Limits
- Working with YARN Node Labels
High Availability with HDP, Deploying HDP with Blueprints, and the HDP Upgrade Process
- Summarize the Purpose of NameNode HA
- Configure NameNode HA Using Ambari
- Describe the Features and Benefits of the Apache Ambari Dashboard
- Recall the Types and Methods of Upgrades Available in HDP
- Describe the Upgrade Process, Restrictions and Pre-upgrade Checklist
- Perform an Upgrade Using the Apache Ambari Web UI
LABS
- Configuring NameNode HA
- Configuring Resource Manager HA
- Adding, Decommissioning and Re-commissioning a Worker Node
- Configuring Ambari Alerts
- Deploying an HDP Cluster Using Ambari Blueprints
- Performing an HDP Upgrade – Express