Building an HP Insight CMU Linux Cluster
WW Hyperscale R&D Division, ISS
June 2013
© Copyright 2012 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice.

Agenda
• Insight CMU Brief Overview
• Installing Insight CMU
• Adding cluster nodes to Insight CMU
• Installing a Linux OS
  • Autoinstalling a node
  • Backing up and cloning an image
  • Creating and deploying a diskless image
• Configure Insight CMU Monitoring
• Monitoring options: collectl & iLO4

If you are clustering Linux servers, then you are building a supercomputer. Make sure it acts like one.

HP Insight Cluster Management Utility (CMU)
Hyperscale cluster lifecycle management software
Provision
• Simplified discovery, firmware audits
• Industry-standard bare-metal installation
• Fast and scalable cloning
• Diskless support
Monitor
• ‘At a glance’ view of the entire system; zoom to component
• Customizable
• Lightweight and efficient
• Instant view (2D)
• Time view (3D)
Control
• GUI and CLI options
• Easy, frictionless control of remote servers
• Scalable auditing and management
• 10+ years in deployment, including Top500 sites with 1000s of nodes
• Built for and designed around Linux – scales to 4,000 nodes per cluster
• HP supported, available as a factory-integrated cluster option

Typical Cluster Topology
(diagram: head node connected to the CMU site network, console network, and management network)

Moonshot Network Topology
(diagram: CMU head node on the site network and CMU management network; Moonshot chassis connected to the CMU management network, the iLO CM network, and a 2nd network)

Installing Insight CMU

Cluster head node
• Install a standard Linux distribution
• Copy the Linux distribution ISO to the head node as a local repository

Cluster head node
• Configure the static site network, cluster management network, and iLO network
• For this cluster, the iLO network (172.22.x.x) and the management network (172.20.x.x) are the same

Install Oracle Java on the cluster head node
• For CMU v7.1, this must be Java version 1.6 update 26 or newer

Install Insight CMU
• Copy the Insight CMU RPM and the Insight CMU license to the cluster head node
• Use ‘yum’ to install Insight CMU so that all dependencies are resolved
  • Manually install ‘/usr/lib64/libXdmcp.so.6’ first to resolve a missing dependency in the Insight CMU v7.1 RPM
• Install the Insight CMU license file in /opt/cmu/etc/cmu.lic

Configure Insight CMU
• Run ‘/opt/cmu/bin/cmu_mgt_config -c’ to configure the cluster head node for use by Insight CMU
• This tool performs various checks and configures the cluster head node as a PXE-boot server on the management network.
• PXE-boot support is required to remotely install a Linux OS onto a set of compute nodes.
• If there are any errors, correct them and rerun ‘/opt/cmu/bin/cmu_mgt_config -c’.
• If you don’t plan on using CMU to install an OS, you can skip this step.
• Make sure that password-less ssh for root works between all nodes! (A minimal sketch follows below.)
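The deck does not prescribe a particular way to distribute the root ssh keys; a minimal sketch using standard OpenSSH tools (the node names node1–node4 are placeholders for your compute nodes) is:

[root@cmu1 ~]# test -f /root/.ssh/id_rsa || ssh-keygen -t rsa -N "" -f /root/.ssh/id_rsa
    (generate a root key pair on the head node if one does not already exist)
[root@cmu1 ~]# for n in node1 node2 node3 node4; do ssh-copy-id -i /root/.ssh/id_rsa.pub root@$n; done
    (push the public key to each compute node; you will be prompted for each root password once)
[root@cmu1 ~]# ssh node1 hostname
    (verify that no password prompt appears)

Insight CMU also ships a helper for this, /opt/cmu/tools/copy_ssh_keys.exp, which is shown later in this deck.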
Start Insight CMU
• Before starting Insight CMU, run ‘unset_audit’ to de-activate ‘audit’ mode
• ‘audit’ mode is used when Insight CMU is installed in a High-Availability configuration

Launch the Insight CMU webpage and GUI
• Make sure your desktop/laptop has a recent version of Oracle Java installed
• Enter the head node IP address in your browser
• Click to launch the Insight CMU GUI
• Click ‘yes’ to accept installing the OpenGL library support needed by Insight CMU

Insight CMU GUI
• Cluster resources organized in groups
• Main Monitoring Display
• Timeview

Cluster Administrators need an X-Server
• Anyone can launch the Insight CMU GUI for monitoring.
• Administrative tasks are disabled by default.
• For Administrators, the Insight CMU v7.1 GUI will execute administrative tasks within an xterm. This xterm is launched from the cluster head node and displays on your desktop/laptop.
• Administrators need an X-server running on their desktop/laptop that can accept and display these Insight CMU xterm windows.
• Desktops/laptops running a Linux or Apple OS come with X-server software.
  • Configure the X-server to accept X-display requests from the cluster head node.
• Desktops/laptops running Windows need to install an X-server.
  • Insight CMU recommends Xming – a free, lightweight X-server available on the web.
  • Cygwin, ReflectionX, and other X-server software work fine too.

Cluster Administrators need an X-Server
• Snapshot of Xming Xlaunch screen #3
• Disable Access Control to allow the cluster head node to display xterms on the local desktop/laptop when requested by the GUI

Logging into the Insight CMU GUI
• Administrators log in here with the cluster head node ‘root’ user credentials

Add cluster nodes to Insight CMU

Cluster Node Preparation
Each compute node should be pre-configured with:
1. Consecutive static iLO IP addresses
   • The iLO NIC is set to DHCP by default. Insight CMU requires iLO static IP addressing to avoid DHCP and to assign node order (see ‘scanning nodes’).
2. A common iLO username and password
3. Boot order: network PXE-boot before hard disk boot
   • Required for remotely installing an OS on the nodes.
4. Virtual Serial Port attached to COM1
   • The Embedded Serial Port is assigned to COM1 by default (for connecting a display to the server).
   • Insight CMU recommends assigning the Virtual Serial Port to COM1 so that Linux kernel console activity can be viewed remotely via the iLO.
   • If the Virtual Serial Port is assigned to COM2, then add ‘console=ttyS1’ to the Linux kernel arguments to view console activity remotely via the iLO (‘console=ttyS0’ is the default Linux kernel argument setting).

Cluster Node Preparation
• Other recommended BIOS settings (not required by Insight CMU):
  – Drive Write Cache: Enabled (SATA disks only; this setting is ‘Disabled’ by default)
  – HP Power Savings Mode: ‘High Performance’ for HPC workloads, or ‘Dynamic’ for power savings
  – Intel Hyper-Threading: ‘enabled’ or ‘disabled’ depending on your workload
  – Turbo Boost: ‘enabled’ or ‘disabled’ depending on your workload

Adding cluster nodes to Insight CMU
• Node information needed by Insight CMU:
  • Hostname, IP Address, Netmask, MAC Address, Management Card Type, Management Card IP, and Architecture (plus cartridge ID and node ID for each node in a Moonshot Chassis).
  • The Hostname, IP Address, Netmask, and MAC Address are for the Management Network (information for additional networks for each node is configured elsewhere).
  • Management Card Type can be ‘none’ (only ‘ssh’-based power control is available; MAC addresses cannot be scanned; no serial console access).
• A MAC address is needed for installing an OS:
  • Insight CMU provides a ‘Scan Nodes’ tool for querying the MAC Address from the iLO
  • Node information can also be entered manually
  • If Insight CMU is not used to install an OS, enter a fake MAC Address (e.g. ‘AA-BB-CC-DD-EE-FF’)

Managing Nodes via the Insight CMU GUI
• Launch the Node Management Window
• Add a node manually
• Scan nodes via iLO

Scanning nodes with the Insight CMU GUI
• Management Card Type options are ‘ILO’, ‘lo100i’, and ‘ILOCM’ (for Moonshot Chassis). IPMI is also available.
• Hostname syntax: ‘%i’ provides node numbering. Other format keys are available for Moonshot servers.
• Scanning starts with the first iLO IP address. The first node hostname and node IP address are assigned to the server with that iLO IP address. Then both IP addresses and the node hostname are incremented, and the process continues until all nodes are scanned.

Scanning nodes with the Insight CMU GUI
• Enter the common iLO username and password. Insight CMU will cache these values.

Managing iLO access with Insight CMU
• Insight CMU stores the iLO access credentials
• If you need to update or change these values, use the Insight CMU CLI

Scanning nodes with the Insight CMU GUI
• Check to ensure the scanned MAC addresses are valid. If there are errors, check access to the iLO.
• If the node information looks good, then add these nodes to the Insight CMU database.
• There is also an option that replaces all of the existing nodes with the scanned nodes.

Adding Nodes with Insight CMU commands
• Configure iLO credentials with ‘/opt/cmu/cmucli’ (see previous slide).
• Add a node with ‘/opt/cmu/bin/cmu_add_node’
  • Run ‘/opt/cmu/bin/cmu_add_node -h’ for details
• Scan nodes with ‘/opt/cmu/bin/cmu_scan_macs’
  • Run ‘/opt/cmu/bin/cmu_scan_macs -h’ for details
• Add or scan nodes with the Insight CMU CLI (a short interactive sketch appears at the end of this section)
  • Run ‘/opt/cmu/cmucli’ to enter the Insight CMU CLI
  • Type ‘help’ at the ‘cmu>’ prompt to see the list of commands
  • Type ‘help "<command>"’ to view the command syntax (the double quotes are important!)
  • ‘add_node’ to add a node; ‘scan_macs’ to scan nodes

Configure network topology in Insight CMU
• Nodes in a Network Entity share a common “leaf-level” network switch
• Launch the Network Entity Management Window
• Create a Network Entity (e.g. ‘rack1’)
• Select the Network Entity
• Select nodes from the left-hand side and add them to the Network Entity (right-hand side)
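As a minimal illustration of the CLI workflow described above (the annotations in parentheses are descriptive placeholders, not actual Insight CMU output):

[root@cmu1 ~]# /opt/cmu/cmucli
cmu> help
    (lists all available commands, including ‘add_node’ and ‘scan_macs’)
cmu> help "add_node"
    (shows the syntax for adding a node – the double quotes are important)
cmu> help "scan_macs"
    (shows the syntax for scanning MAC addresses via the iLO)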
Installing a Linux OS

Installing a Linux OS with Insight CMU
• Insight CMU provides ‘autoinstall’ support
  • Remotely installs a Linux OS on a “bare-metal” server (for initial OS installation)
  • Remotely installs a new Linux distribution on a server (for upgrading to a newer Linux distribution)
• Once one server is installed with Linux, Insight CMU provides 2 methods of distributing that image to the other cluster nodes:
  1. Insight CMU ‘backup and clone’ archives the “golden node” image on the cluster head node (backup), and then copies or clones it to the other cluster nodes
  2. Insight CMU diskless support copies the “golden node” image to one or more NFS servers and boots the other cluster nodes with support for mounting that “golden node” image via NFS

Insight CMU AutoInstall Support
• Enable autoinstall support in the Insight CMU GUI (do this once and restart the GUI)
• Export the repository via NFS (do this for each Linux distribution)

A sample AutoInstall Script (kickstart, 1 of 3)
[root@cmu1 ~]# cat centos63.cfg
#
# General config options
#
install
nfs --server=CMU_MGT_IP --dir=CMU_REPOSITORY_PATH
lang en_US.UTF-8
keyboard us
skipx
#
# Network setup
#
network --onboot=yes --bootproto=static --ip=CMU_CN_IP --netmask=CMU_CN_NETMASK --hostname=CMU_CN_HOSTNAME
#
# Security and time
#
rootpw changeME

A sample AutoInstall Script (kickstart, 2 of 3)
firewall --disabled
authconfig --enableshadow --enablemd5
selinux --disabled
timezone --utc America/New_York
#
# Disk partition information
#
bootloader --location=mbr
zerombr
clearpart --all --initlabel
part /boot --fstype ext2 --size=1024 --ondisk=sda --asprimary
part swap --size=1024 --ondisk=sda --asprimary
part / --fstype ext4 --size=1 --ondisk=sda --asprimary --grow
#
# Reboot after all packages have been installed
#
reboot

A sample AutoInstall Script (kickstart, 3 of 3)
#
# Packages
#
%packages
@system-admin-tools
@core
@base
@network-server
@development
expect
tcl
openssl
nfs-utils
emacs
openssl-devel
%end
[root@cmu1 ~]#

Creating an Insight CMU AutoInstall Group
• Note that the partitions defined in the sample AutoInstall script are formatted with a native filesystem format
  • CMU cannot back up or clone a Logical Volume (LVM) or software RAID filesystem
  • Kickstart uses LVM by default, so make sure you specify a native format like ext4
• Launch the “New AutoInstall Logical Group” Window

Creating an Insight CMU AutoInstall Group
• Give the new group a name
• Configure the location of the image repository
• Configure the path to the autoinstall script

Add Nodes to an Insight CMU Logical Group
• Launch ‘Logical Group Management’.
• Select the Logical Group by name.
• Select nodes from the left-hand side and use the arrow to move them to the right-hand side.
• Nodes are now members of the group, but they are not active in the group until Insight CMU successfully installs the group image on them.

AutoInstalling a node with Insight CMU
• Select a node and right-click to launch the remote management menu. Select ‘AutoInstall’.

AutoInstalling a node with Insight CMU
• Select the AutoInstall Group and click ‘OK’. Click ‘OK’ to begin.

AutoInstalling a node with Insight CMU
• Monitor progress and watch for errors.
• You can also select ‘Virtual Serial Port Connection’ from the remote management menu and watch the autoinstall process via the serial console.
• The node ‘ping’ status will eventually turn green when it is powered up and connected to the network.

Successful autoinstall of a node via Insight CMU
• When the autoinstall process has completed, the node will reboot into the newly installed image.

Building a Golden Node with Insight CMU
• At this point the autoinstalled node is ready for a software stack. Users can log into the node and install their software, set up user accounts, create mountpoints, etc.
• Install the Insight CMU Monitoring Agent

Install the CMU Monitoring Agent
• To set up password-less ssh keys for the root account (required for monitoring), run the following command from the cluster head node:
  /opt/cmu/tools/copy_ssh_keys.exp <nodename> [root password]
  • If you don’t provide the root password, you will be prompted for it
• Install the CMU Monitoring Agent from the Insight CMU GUI
  • Select the node, right-click, and select ‘Update->Install CMU Monitoring Client’

[optional] Add local users to the Cluster Head Node
• Add the user account
• Confirm the local home directory is created
• Export /home to the rest of the cluster to create a shared /home
• Note the userID and groupID

[optional] Add local users to the Golden Node
• Log into the Golden Node
• Mount /home from the cluster head node
• Create the same user with the existing /home
• Confirm the user has the same userID and groupID (or fix it)

[optional] Install Collectl
• Collectl is an open-source tool for gathering performance data from a server

[optional] Integrate Collectl with Insight CMU
• Configure collectl on the Golden Node to provide metrics to Insight CMU:
  • Configure the appropriate subsystems to monitor
  • Export the metrics as key/value pairs
  • Run collectl in “server” mode
  • Configure a 5-second metric polling interval

[optional] Configure a yum repository

[optional] Install dependencies for HP conrep
• Configure Insight CMU to use the correct conrep binary
  • Configure the CMU_BIOS_SETTINGS_TOOL setting in /opt/cmu/etc/cmuserver.conf
  • Insight CMU v7.1 contains the latest conrep binary in /opt/cmu/ntbt/rp/opt/hp/hp-scripting-tools/bin/conrep

[optional] Install dependencies for HP conrep
• Select the Golden Node, right-click, and choose ‘AMS->Show BIOS Settings’
• This will likely fail the first time on newer OS versions, because the HP conrep tool for extracting BIOS settings is a 32-bit binary and newer OS versions no longer install 32-bit support by default.

[optional] Install dependencies for HP conrep
• To fix the conrep dependencies, log into the golden node, run ‘conrep -h’ to identify each missing dependency, and then install it
• Insight CMU installs conrep on each node in /opt/cmu/tmp/conrep

[optional] Installing SLURM
• SLURM is an open-source workload scheduler
• Build and install SLURM on the cluster head node with /home/slurm/slurm.conf as the shared central config file (everything else in /usr/local/)
• Install SLURM on the Golden Node
  • Configure keys and /etc/init.d/slurm
  • Configure slurm.conf with nodes, logdirs, scheduler, etc.
• Details TBD

[optional] Installing Hadoop
• Hadoop is a framework for processing unstructured data
• Download and install Hadoop on the cluster head node
  • Configure the files in the conf/ directory (all data goes on other disks – NOT on the local OS disk)
  • Do NOT configure HDFS yet! Configure HDFS after cloning.
• Install Hadoop on the Golden Node
  • Copy the config files in conf/ from the cluster head node to the Golden Node
• Details TBD

Insight CMU Backup Support
• Create a new Logical Group for the image to be backed up

Creating a disk-based image group
• Give the Logical Group a name
• Configure the device from which the image is to be taken (‘sda’ is most common; check by running ‘mount’ or ‘df’ on the golden node – see the quick check below)
• Optional: add nodes to this group from an existing group. These nodes will be “not active” in this group until CMU successfully installs this image on them.
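A quick way to confirm the OS disk device and partition layout on the golden node before creating the group (the hostname node1 is a placeholder; the partition names match the sample kickstart script, your layout may differ):

[root@node1 ~]# df -h / /boot
    (with the sample kickstart layout, / is on /dev/sda3 and /boot is on /dev/sda1)
[root@node1 ~]# lsblk /dev/sda
    (shows sda1 (/boot), sda2 (swap), and sda3 (/), so ‘sda’ is the device to back up)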
Backing up a Golden Node with Insight CMU
• Make sure nodes are added to the Logical Group before backing up or cloning
  • See ‘Add Nodes to an Insight CMU Logical Group’ for details
• Back up the Golden Node
  • Select the Golden Node, right-click, and select ‘Backup’

Backing up a Golden Node with Insight CMU
• Select the Logical Group.
• Select the root directory partition. Run ‘df’ or ‘mount’ on the golden node to determine this (‘sda3’ is the root partition from the sample AutoInstall script: /boot is ‘sda1’ and swap is ‘sda2’).
• Make sure that this information is correct before proceeding.

Backing up a Golden Node with Insight CMU
• Monitor the backup progress
  • CMU reboots the golden node into a CMU diskless environment and archives the partitions on the given disk
  • Watch for and address any errors that may come up

Backing up a Golden Node with Insight CMU
• When the backup is finished, the golden node is rebooted back into its OS

Contents of an Insight CMU Archived Image
• The archived image is in ‘/opt/cmu/image/<logical_group>/’
  • /boot (sda1)
  • / (root: sda3)
  • reconf.sh is the post-cloning script for this image

reconf.sh: the Insight CMU post-cloning script
• This script is where all per-node post-cloning activities are crafted
  • Insight CMU provides pre-set per-node environment variables for use by this script (see the script for details)
  • Configuring additional networks is the most common per-node post-cloning task
• The following example configures IP over InfiniBand (IPoIB) by appending the last two octets of each node’s management network IP address to the IB subnet (192.168.x.x):

### Insert your custom reconfiguration scripts here
## reconfiguring ib0
IBCONF=${CMU_RCFG_PATH}/etc/sysconfig/network-scripts/ifcfg-ib0
IBSUBNET=192.168
TMPFILE=/tmp/cmu-tmp-$$
grep -v IPADDR ${IBCONF} > ${TMPFILE}
IPSUFFIX=`echo $CMU_RCFG_IP | awk -F. '{print $3 "." $4}'`
echo IPADDR=${IBSUBNET}.${IPSUFFIX} >> ${TMPFILE}
mv ${TMPFILE} ${IBCONF}

Modifying an Insight CMU Archived Image
• An archived image can be unpacked so that changes can be applied:
  /opt/cmu/bin/cmu_image_open -i <logical group>
  • Useful for updating the archived image without having to perform a complete backup.
  • Run ‘chroot image_mountpoint’ to “enter” the image and make changes as if the image were the local filesystem (type ‘exit’ when done).
  • If the unpacked image is corrupted, simply delete the ‘image_mountpoint’ directory and unpack another copy.
• When modifications are completed, repack the archived image:
  /opt/cmu/bin/cmu_image_commit -I <logical group>
  • The original archive is preserved. The modified archive is ready for cloning.
• (A combined open/chroot/commit session sketch appears after the cloning slides below.)

Cloning an Insight CMU Archived Image
• Select the set of nodes to clone
• Select cloning

Cloning an Insight CMU Archived Image
• Select the image to clone
• Confirm this information before proceeding

Cloning an Insight CMU Archived Image
• The first node in each Network Entity will boot up, partition and format the local disk, and then receive the archive. Then it will wait for the rest of the nodes.
• The rest of the nodes will boot up, partition and format their disks, and then receive the archive from the first node.

Cloning an Insight CMU Archived Image
• When every node has the archived image, they all unpack it, run the post-configuration script, and then reboot.
• The first node waits for the other nodes to finish cloning and reboot before it reboots.
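Putting the image-modification commands above together, a session for adding a package to an archived image might look like the following. The logical group name ‘compute_ext4’ and the package choice are illustrative; the ‘image_mountpoint’ directory name and the -i/-I flags are as shown on the earlier slide, and package installs inside the chroot assume the image has working repository access.

[root@cmu1 ~]# /opt/cmu/bin/cmu_image_open -i compute_ext4
    (unpacks the archive for the ‘compute_ext4’ logical group into the image_mountpoint directory)
[root@cmu1 ~]# chroot image_mountpoint
[root@cmu1 /]# yum install -y screen
    (an example change made “inside” the image)
[root@cmu1 /]# exit
[root@cmu1 ~]# /opt/cmu/bin/cmu_image_commit -I compute_ext4
    (repacks the modified archive; the original archive is preserved)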
[optional] Configure HDFS for Hadoop
• When the cloned nodes come back up, you can format and mount the other disks, configure HDFS, and start Hadoop
• Use Insight CMU management tools such as Multi-Window Broadcast and/or Pdsh with CMU_Diff to format the other disks in parallel.
• Details TBD

Diskless Topology
(diagram: CMU head node, which also acts as an NFS server, additional NFS servers, and CMU diskless compute nodes)

Insight CMU Diskless Support
• Enable Insight CMU diskless support
  • Enable diskless support in /opt/cmu/etc/cmuserver.conf and restart the Insight CMU GUI
  • Install the system-config-netboot-cmd-cmu RPM from the Insight CMU ISO on the cluster head node

Insight CMU Diskless Support
• Enable tftp diskless support
  • Add the /tftpboot directory to the tftp configuration for system-config-netboot diskless support

Insight CMU Diskless Support
• Make sure that the Golden Node image is ready to become a diskless image
  • Add the required packages for Insight CMU diskless support
  • Ensure all software is installed and configured on the Golden Node

Workaround: Skipping the Diskless OS Check
• Insight CMU v7.1 diskless support has been qualified for use with CentOS 6.3 via the exception process, but the Insight CMU v7.1 code still guards against using unqualified Linux OS distributions
• The workaround is to replace the code that performs the OS check with a dummy script

Creating a Diskless Image from a Golden Node
• Create a new Diskless Logical Group

Creating a Diskless Image from a Golden Node
• Give the new diskless group a name
• Click the ‘diskless’ check box
• Provide the Golden Node
• Click on ‘Get Kernel List’ to get a list of the available kernels from the Golden Node
• Select the kernel to be used as the diskless kernel
• Don’t add clients from another group to this group.
• Click ‘OK’ when finished.

Preparing an Insight CMU Diskless Image
• The Diskless Image is stored in /opt/cmu/image/<logical group> and contains:
  • A script run after the image is created
  • A script run after each node is added to the group
  • A directory containing the read-only ‘root’ filesystem image
  • A directory containing the per-node read-write filesystems

Preparing an Insight CMU Diskless Image
• Make any cluster-wide changes to the read-only root filesystem
  • Example: add NFS mountpoints to etc/fstab (this file is modified for diskless support) – a sketch follows below
• Script these changes in reconf-diskless-image.sh so that they can be made automatically if the image is recreated
• The diskless logical group can be deleted and re-created to reload the image and re-run the local reconf-diskless-image.sh
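For example, a cluster-wide NFS mount of the shared /home could be added from reconf-diskless-image.sh roughly like this. The NFS server address (172.20.0.1, a head-node address on the 172.20.x.x management network) and export path are assumptions for illustration, and the relative path assumes the script operates from the image directory; otherwise use the full path under /opt/cmu/image/<logical group>.

# Append an NFS mount for the shared /home to the read-only root image's fstab.
# 172.20.0.1:/home is an assumed head-node export; replace with your NFS server and path.
echo "172.20.0.1:/home  /home  nfs  defaults  0 0" >> etc/fstab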
Adding Nodes to an Insight CMU Diskless Image
• Identify and configure per-node read-write files and directories in snapshot/files.custom
  • e.g. …/ifcfg-ethX or …/ifcfg-ib0 files for additional networks (the full filesystem pathname is required)
  • The snapshot/files file is provided by Insight CMU and should remain unedited
• Configure any per-node changes in reconf-diskless-snapshot.sh
  • e.g. changing IPADDR in the additional network files
  • Insight CMU per-node environment variables are available, similar to the reconf.sh file for post-cloning
• When ready, add the nodes to the diskless logical group

Adding Nodes to an Insight CMU Diskless Image
• Select nodes from the left-hand side and add them to the group
• A snapshot directory for each node is created and configured
• Nodes are not active in this group until they are booted into this image

Adding Nodes to an Insight CMU Diskless Image
• The per-node snapshot directories are where the per-node read-write files and directories are stored and edited
  • The files and directories listed in the snapshot/files and snapshot/files.custom files are mounted from the per-node snapshot directory over the appropriate locations in the read-only root filesystem image when each node boots up
  • e.g. when a node writes to /var/log/messages, that file exists on the NFS server in /opt/cmu/image/<diskless_group>/snapshot/<nodename>/var/log/messages

Booting Diskless Nodes with Insight CMU
• Select the nodes to boot diskless; right-click and select ‘Boot’
• Select ‘network’, then select the logical group, and click ‘OK’

Booting Diskless Nodes with Insight CMU
• The nodes are configured in the DHCP server and then rebooted

Insight CMU Diskless Filesystem

Configure Insight CMU Monitoring

Insight CMU Monitoring

Insight CMU Default Metrics and Alerts

Insight CMU ActionAndAlertsFile.txt
• /opt/cmu/etc/ActionAndAlertsFile.txt
  • A single file for configuring all metrics, alerts, and reactions
  • Edit the file on the cluster head node
  • One line per metric / alert / reaction
• Action format:
  <name> <description> <interval> numerical Instantaneous|MeanOverTime <max> <unit> <action command>
  <name> <description> <interval> string Instantaneous|MeanOverTime <unit> <action command>
• Alert format:
  <name> <message> <level> <interval> <threshold> <comparison_operator> <unit> <action command>
• Reaction format:
  <alert_name> <message> ReactOnRaise|ReactAlways <command(s) to run>

Insight CMU ActionAndAlertsFile.txt
• Metric example: this is how Insight CMU obtains the CPU load from each server
  • Metric name: ‘cpuload’
  • Description of the ‘cpuload’ metric
  • Max expected value: 100%
  • Metric unit: ‘%’
  • Interval: gather ‘cpuload’ every 5 seconds (set to ‘2’ for every 10 seconds; set to ‘12’ for every 60 seconds)
  • Command to run to obtain the ‘cpuload’ metric
  • Metric value is a number
  • MeanOverTime: sum this value with the previous value and divide by the elapsed seconds (based on the interval)

Restart Insight CMU Monitoring
• Restart Insight CMU monitoring after making changes to the ActionAndAlertsFile.txt file
• Stop and then start the Insight CMU Monitoring Engine

[optional] Configure Insight CMU to gather Collectl data
• The Insight CMU ActionAndAlertsFile.txt file supports gathering metrics from collectl
  • In the ‘Action Command’ field, add the keyword COLLECTL followed by the collectl metric names (simple arithmetic is supported)
  • Example: this is one way to gather the CPU load using collectl:
    cpuload "% cpu load (normalized)" 1 numerical Instantaneous 100 % COLLECTL 100 - (cputotals.idle)
• The Insight CMU ActionAndAlertsFile.txt file includes pre-configured collectl-based metrics
  • Comment out the Insight CMU native metrics and uncomment the collectl-based metrics to enable gathering metrics from collectl

[optional] Configure Insight CMU to gather Collectl data
• Run collectl to get a list of the collectl metric names for use in the ActionAndAlertsFile.txt
• Don’t forget to include the subsystems of interest!
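One way to see the metric names on the golden node is to take a single collectl sample with the key/value (lexpr) export enabled. This is a sketch assuming collectl’s lexpr export module is available; the subsystem list (-scdmn for cpu, disk, memory, network) and the node name are only examples.

[root@node1 ~]# collectl -scdmn --export lexpr -i1 -c1
    (prints one sample of name/value pairs such as cputotals.idle, meminfo.free,
     disktotals.reads, and nettotals.kbin – these are the names that go after the
     COLLECTL keyword in ActionAndAlertsFile.txt)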
[optional] Configure Insight CMU with iLO4 Metrics
• iLO4 in the HP ProLiant Gen8 servers can be configured to provide server metric data via a public SNMP port
• Insight CMU can gather this server data directly from iLO4 for monitoring and analysis
• Hardware metric data is obtained “out-of-band” to avoid disruptions in the OS layer
• This process employs Insight CMU “EXTENDED” monitoring
  • “Extended” monitoring provides support for processes outside of Insight CMU to gather metrics and submit them to Insight CMU for display and analysis
  • /opt/cmu/bin/cmu_submit_extended_metrics is the command that submits metrics to Insight CMU
  • Submitted metrics must be pre-configured in the Insight CMU ActionAndAlertsFile.txt file with the EXTENDED keyword

[optional] Configure Insight CMU with iLO4 Metrics
• The Insight CMU iLO4 support requires a file containing a list of the servers with iLO4
  • If all servers in the Insight CMU cluster contain iLO4, use /opt/cmu/bin/cmu_show_nodes to create a nodefile.
• Run ‘/opt/cmu/bin/cmu_config_ams -f <nodefile>’ to configure iLO4 with a public SNMP port and to enable Agentless Monitoring
• Run ‘/opt/cmu/bin/cmu_config_ams -c’ to add Agentless Monitoring support to the Insight CMU GUI via the Insight CMU Custom Menu support
  • Insight CMU Custom Menu support is configured in /opt/cmu/etc/cmu_custom_menu

[optional] Gather and Review iLO4 Metrics
• Now that the iLO4 SNMP port is enabled, you can get a dump of all of the available information from iLO4 and view it using the new “AMS” custom menu commands in the Insight CMU GUI
• Restart the Insight CMU GUI for the custom menu changes to take effect

[optional] Gather and Review iLO4 Metrics
• Select the nodes, right-click, and select ‘AMS->Get/Refresh SNMP Data’ to get a data dump
• When the data download finishes, a pop-up window will appear

[optional] Gather and Review iLO4 Metrics
• The iLO4 metric data is stored in /opt/cmu/tmp/snmp_node_data/<nodename>.raw
• CMU reads the MIBs stored in /opt/cmu/snmp_mibs/, adds human-readable text to each SNMP OID (and value, where applicable), and stores the results in /opt/cmu/tmp/snmp_node_data/<nodename>.txt

[optional] Gather and Review iLO4 Metrics
• Data can be viewed in the GUI by selecting the node(s), right-clicking, and selecting ‘AMS->View/Compare SNMP Data’
• The data is filtered through CMU_Diff, in case multiple nodes were selected, to highlight any differences

[optional] Configure Insight CMU with iLO4 Metrics
• /opt/cmu/bin/cmu_get_ams_metrics is the program that gathers a pre-configured list of SNMP metrics from the iLO4 management cards of a given list of nodes and submits the data to Insight CMU
• The default pre-configured list of SNMP metrics is in the /opt/cmu/etc/cmu_ams_metrics file

[optional] Integrate Insight CMU with iLO4 Metrics
• The /opt/cmu/bin/cmu_get_ams_metrics program requires the snmpget command

[optional] Integrate Insight CMU with iLO4 Metrics
• Test the /opt/cmu/bin/cmu_get_ams_metrics command with the configured SNMP metrics from /opt/cmu/etc/cmu_ams_metrics using a subset of nodes
• The ‘-d’ option displays the results on the screen instead of submitting them to Insight CMU
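A test run on a small subset of nodes might look like the following. The nodefile path is a placeholder, and only the ‘-d’ option is documented on these slides; how the node list is passed is not shown here, so check the command’s own help for its actual arguments.

[root@cmu1 ~]# /opt/cmu/bin/cmu_show_nodes > /tmp/ams_testnodes
    (then edit /tmp/ams_testnodes down to a small subset of iLO4 nodes)
[root@cmu1 ~]# /opt/cmu/bin/cmu_get_ams_metrics -d <node-list arguments per the command's help>
    (‘-d’ prints the gathered iLO4 metrics to the screen instead of submitting them to Insight CMU)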
[optional] Integrate Insight CMU with iLO4 Metrics
• Configure the EXTENDED iLO4 metrics in the ActionAndAlertsFile.txt file
• Commands configured after the EXTENDED keyword will be executed by Insight CMU at the configured interval
  • The interval is set to ‘4’ to poll iLO4 every 20 seconds
  • Configure only the first EXTENDED metric with the program that will gather the other EXTENDED metrics

[optional] Integrate Insight CMU with iLO4 Metrics
• Restart Insight CMU monitoring and the GUI to view the new metrics

[optional] Check BIOS Versions with Insight CMU
• Select the nodes, right-click, and choose ‘AMS->Show BIOS Version’
• The BIOS Vendor, Version, and Release Date for each node is obtained, organized, and displayed in a concise format

[optional] Check BIOS Settings with Insight CMU
• Select the nodes, right-click, and choose ‘AMS->Show BIOS Settings’
• The output of conrep is saved to a file in /opt/cmu/tmp/conrep on each node, and then the file is displayed via CMU_Diff to highlight any differences.

More information available at: http://www.hp.com/go/cmu

Thank you!

© Copyright 2012 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice.