Module 1: Cluster Components
• Describe HPCM
• Describe clusters
• Describe flat clusters and distributed (hierarchical) clusters
• Define management, data and rack networks
• List cluster components
– Admin node
– Compute nodes
– Rack leader nodes
– ICE compute nodes
– Chassis management controllers (CMCs)
– Management interfaces (iLO, LO100i, BMC)
– Management Ethernet switches
– Fabric switches and switch blades
– Smart PDUs
– Storage
Module 2: HPCM GUI
• Locate system groups
• Locate network groups
• Review image groups
• Use custom groups
• Manage nodes
– SSH to nodes
– Shut down node
– Power off node
– Reboot node
– Illuminate the locate beacon to identify the chassis
• Run tasks on multiple systems
• Monitor the cluster
– Run commands to apply load on nodes
– View left chart to display CPU load
– View right chart to display memory usage
– Focus on single node
– Focus on group of nodes
– Switch to bar graph or table view
– Turn metrics off and on
• Use Ganglia to monitor cluster
• Use Nagios to monitor cluster
• Manage cluster power
– Monitor node power consumption
– Monitor job power consumption
– Monitor system power consumption
Module 3: Operating System Boot Modes and Root Filesystem Modes
• Describe disk mode
• Describe PXE mode
• Describe disk, NFS, and tmpfs root filesystem modes
Module 4: Running Commands to Gather Information on the Cluster
• Describe pdsh
• Use pdsh commands to interrogate the cluster
• Describe pdsh output, including interleaved stdout and stderr
• Review predefined pdsh node groups
• Add a pdsh node group
• Run a command with the new pdsh node group
• Run a command using the exclude node group format
• Format output from pdsh commands with dshbak -c
• Copy a file to all compute nodes, all leader nodes, and all ICE compute nodes
• Retrieve a file from all compute nodes, all leader nodes, and all ICE compute nodes
• Work with HPCM connector to job scheduler
• Search central log repository
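
Illustration for the pdsh output topics in Module 4: pdsh prefixes each output line with the node name, and lines from different nodes arrive interleaved. The sketch below is a simplified Python imitation of the grouping that dshbak -c performs, not the tool itself. The function name coalesce, the node names n1 to n3, and the sample output are all hypothetical, and unlike the real tool it does not compress node names into ranges.

```python
from collections import defaultdict


def coalesce(pdsh_output: str) -> str:
    """Group nodes that produced identical output (simplified dshbak -c)."""
    # Collect each node's lines in arrival order; pdsh emits "node: text".
    per_node = defaultdict(list)
    for line in pdsh_output.splitlines():
        if not line.strip():
            continue
        node, _, text = line.partition(":")
        per_node[node.strip()].append(text[1:] if text.startswith(" ") else text)

    # Nodes whose full output is identical share one group.
    groups = defaultdict(list)
    for node, lines in per_node.items():
        groups["\n".join(lines)].append(node)

    # Emit one block per distinct output, ordered by node name
    # (plain string sort, so "n10" sorts before "n2").
    blocks = []
    for text, nodes in sorted(groups.items(), key=lambda kv: sorted(kv[1])):
        header = ",".join(sorted(nodes))
        blocks.append(f"----------------\n{header}\n----------------\n{text}")
    return "\n".join(blocks)


# Hypothetical interleaved output, as if from running uname on three nodes.
sample = """n2: Linux
n1: Linux
n3: Darwin
n1: x86_64
n2: x86_64
n3: arm64
"""

print(coalesce(sample))
```

On a real cluster the equivalent step is piping a pdsh command into dshbak -c; the sketch only shows why coalesced output is easier to read than the raw interleaved stream.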