vRealize Operations Management Pack for Storage Devices 6.0.1
vRealize Operations Management Pack for Storage Devices 6.0.1 Guide

This document supports the version of each product listed and supports all subsequent versions until the document is replaced by a new edition. To check for more recent editions of this document, see http://www.vmware.com/support/pubs.

EN-001750-00

You can find the most up-to-date technical documentation on the VMware Web site at: http://www.vmware.com/support/

The VMware Web site also provides the latest product updates. If you have comments about this documentation, submit your feedback to: [email protected]

Copyright © 2015 VMware, Inc. All rights reserved. Copyright and trademark information.

VMware, Inc.
3401 Hillview Ave.
Palo Alto, CA 94304
www.vmware.com

Contents

About This Book
1 Introduction to the vRealize Operations Management Pack for Storage Devices
2 Installing and Configuring the Management Pack
    Install the Management Pack
    Add Credentials and Fabric Servers for the Management Pack for Storage Devices
    Verify That Your Management Pack for Storage Devices Is Connected and Collecting Data
3 Management Pack for Storage Devices Dashboards
4 Management Pack for Storage Devices Alert Definitions
5 Management Pack for Storage Devices Metrics
    Statistical Metrics of the Management Pack for Storage Devices
    Object Properties
6 User Scenario: An Object in Your Storage Network Has a Performance Problem
    User Scenario: Use the Troubleshooting Dashboard to Locate the Real Problem Source
    User Scenario: Examine Metrics to Find the Root Cause
Index

About This Book

The vRealize Operations Management Pack for Storage Devices Guide provides you with a complete view of your entire storage topology from your host, through your storage network, and out to the storage array.
With this solution, you can use vRealize Operations Manager to monitor and troubleshoot capacity and performance problems on different components of your storage area network.

Intended Audience

The information in this guide is intended for storage administrators, data center architects, and IT operators.

VMware Technical Publications Glossary

VMware Technical Publications provides a glossary of terms that might be unfamiliar to you. For definitions of terms as they are used in VMware technical documentation, go to http://www.vmware.com/support/pubs.

1 Introduction to the vRealize Operations Management Pack for Storage Devices

The Management Pack for Storage Devices provides you with a complete view of your entire storage topology from your host, through your storage network, and out to the storage array. With this solution, you can use vRealize Operations Manager to monitor and troubleshoot capacity and performance problems on different components of your storage area network.

The Management Pack for Storage Devices has the following features:

- Provides an end-to-end view of topology, statistics, and events at every affected level of the storage area network.
- Enables a vSphere administrator working with a virtual environment to isolate problems caused by elements in the physical storage stack, such as the Host Bus Adapter (HBA), storage switches, and array. The vSphere administrator uses this information to hand off the problem to the domain administrator for further analysis.
- Captures and analyzes information for throughput and latency on the HBA and mount objects. Captures throughput on the switch ports.
- Captures IOPS and queue depth at the HBA and switch ports for read and write components.
- Discovers the storage switches and applies the credentials required to collect data from the objects in the storage network.
  It uses the Common Information Model (CIM) to exchange information with objects managed by the following management systems:
  - Cisco Data Center Network Manager (DCNM)
  - Brocade Network Advisor (BNA)

2 Installing and Configuring the Management Pack

When you install a management pack, you install the PAK file that you downloaded as part of the solution, and you configure the management pack instance.

This chapter includes the following topics:

- "Install the Management Pack"
- "Add Credentials and Fabric Servers for the Management Pack for Storage Devices"
- "Verify That Your Management Pack for Storage Devices Is Connected and Collecting Data"

Install the Management Pack

If your environment includes multiple vCenter Server systems on the same storage network, you must add fabric switch CIM servers for each adapter instance. The adapter instance provides you with a holistic view of the storage network connected to the vCenter Server system. The adapter discovers storage devices only for ESXi 5.1 Update 2 and later.

NOTE Any adapter credentials you add are shared with other adapter administrators and vRealize Operations Manager collector hosts. Other administrators might use these credentials to configure a new adapter instance or to move an adapter instance to a new host.

Prerequisites

Verify that the following conditions are met:

- You can provide credentials with sufficient privileges to connect and collect data from the vCenter Server host. If the user account has limited access to objects in the vCenter, you can only collect data from objects for which you have permission. Credentials must include at least read permissions.
- The CIM agent is enabled on all storage network switches from which you are collecting data, and you can provide credentials with sufficient privileges to connect and collect data.
  If the user account has limited access to objects in the storage environment, you can only collect data from objects for which you have permission. Credentials must include at least read permissions.
- All storage network switches with the CIM agent enabled are added to DCNM or BNA solutions.
- The fabric CIM service supports switch profile and fabric profile. The supported versions for the Fabric Profile are 1.1 to 1.6.
- A vSphere API for Storage Awareness (VASA) provider is registered as a storage provider with the vCenter Server host for your adapter instance. VASA providers gather information from arrays that support the environment and provide the data to the vCenter Server system. If the VASA provider is not registered with the vCenter Server host, the Management Pack for Storage Devices cannot discover objects in the array. The minimum supported version is VASA 1.0. See information regarding storage providers in the ESXi and vCenter Server 5.5 Documentation.
- Download and install the Management Pack for Storage Devices.

Procedure

1. Log in to the vRealize Operations Manager user interface with administrator privileges.
2. In the left pane of vRealize Operations Manager, click the Administration icon and click Solutions.
3. On the Solutions tab, click the plus sign.
4. On the Solutions tab, select Management Pack for Storage Devices and click Configure.
5. Browse to locate the temporary folder and select the PAK file. For example, managementpack_name-buildnumber.pak.
6. Click Upload. The upload might take several minutes.
7. Read and accept the EULA, and click Next. Installation details appear in the window during the process.
8. Ensure that you install the PAK file on the master node.
9. When the installation is complete, click Finish.
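The version limits stated in the prerequisites (Fabric Profile 1.1 to 1.6, VASA 1.0 or later) can be checked before you begin the installation. The following is a minimal sketch in Python, not part of the management pack. The function names are hypothetical, and the sketch assumes you have already obtained the version strings from your fabric CIM server and VASA provider by other means.

```python
# Hypothetical pre-installation check for the version prerequisites.
# Only the version limits come from this guide; everything else is illustrative.

FABRIC_PROFILE_MIN = (1, 1)   # lowest supported Fabric Profile version
FABRIC_PROFILE_MAX = (1, 6)   # highest supported Fabric Profile version
VASA_MIN = (1, 0)             # minimum supported VASA version


def parse_version(text):
    """Convert a dotted version string such as '1.4' to a (major, minor) tuple."""
    parts = [int(part) for part in text.strip().split(".")]
    parts = (parts + [0])[:2]  # pad a bare major version, ignore patch levels
    return tuple(parts)


def fabric_profile_supported(version):
    """Return True if the Fabric Profile version is within 1.1 to 1.6."""
    return FABRIC_PROFILE_MIN <= parse_version(version) <= FABRIC_PROFILE_MAX


def vasa_supported(version):
    """Return True if the VASA provider version is 1.0 or later."""
    return parse_version(version) >= VASA_MIN


if __name__ == "__main__":
    print(fabric_profile_supported("1.4"))
    print(vasa_supported("2.0"))
```

The check compares only the major and minor components, which is an assumption about how the supported ranges should be read.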
What to do next

Add credentials and add the fabric servers.

Add Credentials and Fabric Servers for the Management Pack for Storage Devices

When configuring an adapter instance for the Management Pack for Storage Devices, you add credentials for a vCenter Server.

Procedure

1. In the vCenter Server text box, enter the FQDN or IP address of the vCenter Server instance to which you are connecting. The vCenter Server FQDN or IP address must be reachable from all nodes in the vRealize Operations Manager cluster.
2. To add credentials on the Manage Solution page, click the plus sign.
   a. In the Credential name text box, enter the name by which you are identifying the configured credentials.
   b. Type the User name and Password for the vCenter Server instance.
   c. Click OK.
   You configured credentials to connect to a vCenter Server instance.
3. Click Test Connection to validate the connection with your vCenter Server instance.
4. Accept the vCenter Server security certificate.
5. Enter the SNMP Community Strings for your SNMP v2 enabled IP Network Switches.
6. To add fabric servers, click Add Fabric Server.
   a. Click the plus sign.
   b. Enter a name and description for the fabric server.
   c. Enter the fabric switch CIM server IP or URL of the host on which the CIM server is installed. The Management Pack for Storage Devices supports CIM servers that manage Fibre Channel switches only. It does not support CIM servers that manage arrays.
      Option | Action
      Brocade fabric | Enter the IP address.
      Cisco fabric | Enter the service URL in the following format: https://<IP/Hostname>:<Port>/<Implementation Namespace>, for example, https://1.2.3.4:5989/cimv2.
   d. To add credentials, click the plus sign.
   e. In the Credential name text box, enter the name by which you are identifying the configured credentials.
   f. Enter the user name and password for the Brocade or Cisco Fabric Switch CIM server.
      For the Cisco Fabric Switch CIM server, the user name and password are for the Cisco Data Center Network Manager (DCNM) and not the Cisco switch.
   g. Accept all the security certificates from the Fabric Switch CIM Server.
   h. Click OK.
7. Click Save Settings.

The adapter is added to the Adapter Instance list and is active.

What to do next

To verify that the adapter is configured and collecting data from the objects in the storage topology, wait a few collection cycles, then view application-related data.

- Environment Overview Inventory. Verify that all the objects related to the Storage Devices Instance are listed. Objects should be in the collecting state and receiving data.
- Dashboards. Verify that the Storage Components Heatmap, Storage Components Usage, and Storage Troubleshooting dashboards are added to the default dashboards.

Verify That Your Management Pack for Storage Devices Is Connected and Collecting Data

You configured an adapter instance of the Management Pack for Storage Devices with credentials for a vCenter Server. Now you want to verify that your adapter instance can retrieve information from objects in the host, switch, and array domains of your environment.

To view the object types, from the left pane of the vRealize Operations Manager user interface, click Administration > Environment Overview > Adapter Instances > Storage device Instances > <User_Created_Instance>.

Table 2‑1. Object Types that the Management Pack for Storage Devices Discovers
Object Type | Description
Active Zone Set | A collection of all active zones in the storage network.
ESX pNic | The physical Network Interface Cards (NICs) on the ESXi host.
Fabric | In the Brocade storage network, fabric is a collection of interconnected switches. There is typically a one-to-one relationship between the fabric and the SAN.
  In the Cisco storage network, fabric is a virtual SAN, which is a collection of one or more logical switches. There is typically a one-to-many relationship between the SAN and the fabric.
Fabric Switch CIM Server | CIM server that manages the switches. It provides the user name, password, and URL information that the Management Pack for Storage Devices uses to discover storage network objects such as fabric, switches, and switch ports.
Host Adapter | The Fibre Channel (FC) or Fibre Channel over Ethernet (FCoE) host adapters in the host system.
IP Switch | The Simple Network Management Protocol (SNMP) v2 enabled IP network switches.
IP Switchport | The switch ports on your SNMP v2 enabled IP network switches.
Logical Switch | A logical switch abstraction for the Cisco storage network. It is a collection of one or more switch ports that are accessible to one another.
Magnetic Disks | Magnetic disks on the ESXi hosts.
Mount | A storage LUN abstraction that is available on the host system.
NFS Client | Network File System (NFS) clients on the ESXi hosts.
NFS Servers | NFS servers registered with vCenter via VASA.
No Existing Object Container | A container that includes all the objects that are no longer discovered in the storage devices topology.
SAN (Storage Area Network) | In the Brocade storage network, the SAN is a collection of fabrics. There is typically a one-to-one relationship between the SAN and the fabrics. In the Cisco storage network, the SAN is a collection of one or more Virtual Storage Area Networks (VSAN).
Solid State Device | Solid State Devices (SSDs) on the ESXi hosts.
Storage Array | The storage array registered to the VASA provider in the vCenter for the adapter instance.
Storage Devices Instance | The Management Pack for Storage Devices instance.
Storage LUN | The storage LUN of a storage array.
Storage Processor | The storage controller on the storage array.
Switch | The switch on the storage network that connects the host to the storage array.
Switch Port | The port on a switch.
Target Port | The target port of a storage array.
VirtualSAN Cluster | VirtualSAN clusters in your data center.
VirtualSAN Datastore | VirtualSAN data stores in your data center.
Zone | A collection of target ports or host adapters in the storage network.

Procedure

1. In the left pane of vRealize Operations Manager, click the Administration tab and click Environment Overview.
2. In the list of tags, expand Adapter Instances and expand Storage Devices Instance.
3. Select the adapter instance name to display the list of objects discovered by your adapter instance.
4. Slide the display bar to the right to view the object status.
   Object Status | Description
   Collection State | If green, the object is connected.
   Collection Status | If green, the adapter is retrieving data from the object.
5. Deselect the adapter instance name and expand the Object Types tag.
   Each Object Type name appears with the number of objects of that type in your environment.
6. Verify that you have at least one object in each of the network areas that the Management Pack for Storage Devices discovers.

What to do next

If objects are missing or not transmitting data, you might need to investigate further.

- If an object is not connected or not transmitting data, search for the object and check for related alerts.
- If the adapter does not discover at least one object on the host system, in the storage network, and in the storage array, verify that your environment is configured properly. Common configuration problems such as invalid credentials, Fabric Switch CIM servers not reachable, or VASA providers not registered in the vCenter Server instance are raised as alerts for the Management Pack for Storage Devices instance.
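The final verification step, at least one discovered object on the host system, in the storage network, and in the storage array, can be expressed as a simple check over the object counts shown under the Object Types tag. The following is a minimal sketch in Python. The grouping of object types into the three areas is an assumption made for illustration and is not defined by this guide. The function name and the input dictionary are hypothetical, and the counts are assumed to be copied manually from the user interface.

```python
# Hypothetical helper: report which areas of the storage topology have no
# discovered objects. The area-to-object-type grouping below is an assumption.

AREA_OBJECT_TYPES = {
    "host": ["Host Adapter", "Mount", "ESX pNic"],
    "storage network": ["Fabric", "Switch", "Switch Port"],
    "storage array": ["Storage Array", "Storage LUN",
                      "Storage Processor", "Target Port"],
}


def missing_areas(object_counts):
    """Return the areas for which no objects were discovered.

    object_counts maps an object type name (as listed in Table 2-1)
    to the number of objects of that type in the environment.
    """
    missing = []
    for area, type_names in AREA_OBJECT_TYPES.items():
        total = sum(object_counts.get(name, 0) for name in type_names)
        if total == 0:
            missing.append(area)
    return missing


if __name__ == "__main__":
    counts = {"Host Adapter": 4, "Mount": 12, "Switch": 2, "Switch Port": 48}
    for area in missing_areas(counts):
        print("No objects discovered in area:", area)
```

An empty result means every area has at least one object. A non-empty result points to the area whose configuration, such as credentials, CIM server reachability, or VASA provider registration, should be checked first.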
3 Management Pack for Storage Devices Dashboards

When the Management Pack for Storage Devices is installed, it adds preconfigured dashboards and alert definitions that you use to monitor and troubleshoot the components in your storage area network.

Table 3‑1. Management Pack for Storage Devices Dashboards
Dashboard for protocols such as FC-FCoE, iSCSI, and NFS | Purpose
Troubleshooting | View end-to-end storage topology, health, and alerts on storage components.
Component Usage | Provides an overview of the most frequently used objects on the storage network and tracks the error count on these objects.
Heatmap | Provides an overview of object performance against its predefined limits.

4 Management Pack for Storage Devices Alert Definitions

When the Management Pack for Storage Devices is installed, it adds preconfigured dashboards and alert definitions that you use to monitor and troubleshoot the components in your storage area network.

vRealize Operations Manager generates an alert if a problem occurs with the components in the storage area network that the Management Pack for Storage Devices is monitoring.

Table 4‑1. Management Pack for Storage Devices Alert Definitions

Alert (ID name): Adapter Configuration failed (AdapterNotConfigured)
Level: Critical
Affected Object Type: Storage Devices Adapter Instance
Description: Triggered when the Management Pack for Storage Devices instance is incorrectly configured because of invalid credentials or inaccessible Fabric CIM or vCenter Server instances. Clears when the adapter is reconfigured and can successfully communicate with data sources, such as Fabric CIM or vCenter Server.

Alert (ID name): All Paths Down on some Mounts (APD)
Level: Critical
Affected Object Type: VMWARE: Host System
Description: Triggered by an All Paths Down (APD) situation on one or more mounts on the host system.
This generic alert applies when the root cause for APD cannot be attributed to a zone configuration change, host adapter link failure, or switch port failure. Clears when the APD situation is resolved on the host system.

Alert (ID name): All Paths Down on some Mounts because HBA is Link down (APDHostAdapterLinkDown)
Level: Critical
Affected Object Type: VMWARE: Host System
Description: Triggered by an APD situation on one or more mounts on the host system because of a problem with the host adapter. Clears when the APD situation is resolved on the host system.

Alert (ID name): All Paths Down On Some Mounts because the physical NIC status is down (APDPnicDown)
Level: Critical
Affected Object Type: VMWARE: Host System
Description: Triggered by an APD situation on one or more mounts on the host system because of a problem with the physical Network Interface Card (NIC) connection to the host adapter. Clears when the APD situation is resolved on the host system.

Alert (ID name): All Paths Down On Some Mounts Because Switch Ports connected to the HBAs is Down (APDInitiatorSwitchPortDown)
Level: Critical
Affected Object Type: VMWARE: Host System
Description: Triggered by an APD situation on one or more mounts on the host system because of a problem with the switch port connection to the host adapter. Clears when the APD situation is resolved on the host system.

Alert (ID name): All Paths Down On Some Mounts Because Switch Port connected to the Target is Down (APDTargetSwitchPortDown)
Level: Critical
Affected Object Type: VMWARE: Host System
Description: Triggered by an APD situation on one or more mounts on the host system because of a problem with the switch port connection to the target port on the array side. Clears when the APD situation is resolved on the host system.

Alert (ID name): All Paths Down on some mounts because HBAs are zoned out (APDZoneConfiguration)
Level: Critical
Affected Object Type: VMWARE: Host System
Description: Triggered by an APD situation on one or more mounts on the host system.
This alert indicates a loss of connectivity between the host adapters on the host side and target ports on the array side because of zone configuration changes. Clears when the APD situation is resolved on the host system.

Alert (ID name): Congestion due to high network bandwidth usage (NetworkBandwidthCongestion)
Level: Critical
Affected Object Type: EsxPnic
Description: Triggered when the physical NIC bandwidth is over-utilized. If applicable, an additional symptom that checks whether the physical NIC capacity is less than 10 GB is also evaluated. Clears when the physical NIC bandwidth usage returns to normal.

Alert (ID name): Datacenter has some HBAs that are not operating at maximum supported speed (HostAdapterSpeedNotOptimal)
Level: Info
Affected Object Type: VMWARE: Datacenter
Description: Triggered if a connected HBA is not operating at its maximum supported speed. Clears when the HBA returns to its maximum operating speed.

Alert (ID name): Fabric CIM server certificate is not trusted (FabricCIMServerUntrusted)
Level: Critical
Affected Object Type: Fabric CIM Server
Description: Triggered if the Management Pack for Storage Devices cannot authenticate the fabric CIM server certificate. Clears when the adapter can authenticate the fabric CIM server certificate.

Alert (ID name): Fabric CIM Server Configuration failed (FabricCredentialsWrong)
Level: Critical
Affected Object Type: Fabric Switch CIM Server
Description: Triggered if the Management Pack for Storage Devices cannot communicate with the fabric switch CIM server using the login credentials or URL information provided. Clears when the adapter can communicate with the fabric switch CIM server.

Alert (ID name): HBA Link Failure (HostAdapterLinkDown)
Level: Critical
Affected Object Type: Host Adapter
Description: Triggered when the HBA status changes from Online to LINK DOWN. Clears when the HBA status returns to Online.

Alert (ID name): HBA Speed dropped (HostAdapterSpeedDropped)
Level: Critical
Affected Object Type: Host Adapter
Description: Triggered when the operating speed of the host adapter drops compared to a previous reading in the data collection cycle. Not automatically cleared. A user must manually clear the alert.

Alert (ID name): High Virtual LUN Resets Or Aborts (HighLunAbortsOrLunResets)
Level: Critical
Affected Object Type: Mount
Description: Triggered if 50 or more virtual LUN resets or aborts occur per minute on the mount object during two consecutive collection cycles. Clears when the LUN resets or aborts drop to fewer than 50 per minute for two consecutive collection cycles.

Alert (ID name): IO Latency is above normal (MountIoLatencyHigh)
Level: Critical
Affected Object Type: Mount
Description: Triggered if the HostSpecificMetrics|IOLatency metric exceeds the dynamic threshold. vRealize Operations Manager sets the dynamic threshold based on the historic data values of this metric. Clears when the metric values fall below the dynamic threshold.

Alert (ID name): Multiple ports on the Switch are down (SwitchPortsDown)
Level: Critical
Affected Object Type: Switch
Description: Triggered when 50 percent or more switch ports become disabled or display the Enabled but offline status. Clears when the percentage of unavailable switch ports falls below 50 percent.

Alert (ID name): No VASA providers registered to vCenter. Cannot discover array objects (NoVasaProvidersRegistered)
Level: Warning
Affected Object Type: Storage Devices Adapter Instance
Description: Triggered when no VASA providers are registered to the vCenter Server. Clears when at least one VASA provider is registered to the vCenter Server.

Alert (ID name): Number of active paths is low (PathHealthRatioLow)
Level: Automatic
Affected Object Type: Mount
Description: Triggered when the HostSpecificMetrics|PathHealthRatio metric for the mount object falls below thresholds. vRealize Operations Manager sets the alert level based on the value of pathHealthRatio:
- 75 <= pathHealthRatio < 100 indicates Immediate
- 50 <= pathHealthRatio < 75 indicates Warning
- pathHealthRatio < 50 indicates Critical
Clears when the pathHealthRatio value becomes 100 for all the host system instanced metric groups of the mount object.

Alert (ID name): Permanent Device Loss on a Mount/Storage LUN (PDL)
Level: Critical
Affected Object Type: VMWARE: Host System
Description: Triggered when one or more mounts on the host system are unavailable. This situation occurs if the underlying storage LUN is unmasked or removed from the host system.
- If the storage LUN was unmasked without first detaching it from the host system, the alert clears when the mount object becomes available.
- If the storage LUN was permanently removed, the alert does not automatically clear. A user must manually clear the alert.

Alert (ID name): Some NFS volumes are not reachable from the host because the physical NIC may be down (APDOnNFSClientBecauseOfPnic)
Level: Critical
Affected Object Type: VMWARE: Host System
Description: Triggered by an APD situation on one or more Network File System (NFS) volumes on the host system because the connected physical NIC status is Down. Clears when the APD situation is resolved on the host system.

Alert (ID name): Some NFS volumes are not reachable from the host because the switch port connecting to the physical NIC on the host may be down (APDOnNFSClientBecauseOfSwitchPort)
Level: Critical
Affected Object Type: VMWARE: Host System
Description: Triggered by an APD situation on one or more NFS volumes on the host system because the switch port connected to the physical NIC is Disabled. Clears when the APD situation is resolved on the host system.

Alert (ID name): Some NFS volumes are unreachable from the host (APDOnNFSClientGeneric)
Level: Critical
Affected Object Type: VMWARE: Host System
Description: Triggered by an APD situation on one or more NFS volumes on the host system. Clears when the APD situation is resolved on the host system.

Alert (ID name): Storage LUN Thin Provisioned Capacity At Critical Levels (ThinProvisionedThresholdExceeded)
Level: Critical
Affected Object Type: Storage LUN
Description: Triggered when the thin provisioning capacity on the storage LUN reaches a critical level. Thresholds for thin provisioning are set in the vCenter Server.
Clears when the vCenter Server sends the clear event to the Management Pack for Storage Devices.

Alert (ID name): Storage LUN Thin Provisioned Capacity Threshold At Warning Level (ThinProvisionedThresholdInitialized)
Level: Warning
Affected Object Type: Storage LUN
Description: Triggered when the thin provisioning capacity on the storage LUN reaches a warning level. The thresholds for thin provisioning are set in the vCenter Server. Clears when the vCenter Server sends the clear event to the Management Pack for Storage Devices.

Alert (ID name): Switch Port Down Failure (SwitchPortLinkDown)
Level: Critical
Affected Object Type: Switch Port
Description: Triggered when the status of the switch port changes to Disabled or Enabled but Offline. Clears when the switch port status returns to Enabled.

5 Management Pack for Storage Devices Metrics

The Management Pack for Storage Devices collects metrics for objects. It also displays object properties in the vRealize Operations Manager user interface.

This chapter includes the following topics:

- "Statistical Metrics of the Management Pack for Storage Devices"
- "Object Properties"

Statistical Metrics of the Management Pack for Storage Devices

The Management Pack for Storage Devices collects statistical metrics of objects. You can view these metrics from the vRealize Operations Manager user interface.

From the vRealize Operations Manager user interface, click Environment > Storage Devices > Troubleshooting > All Metrics tab.

Disk Group Metrics

The Management Pack for Storage Devices collects metrics about disk groups.

Table 5‑1.
Disk Group Metrics
Metric Name | User Interface Name
DerivedStatistics.DiskGroupCapacity | Disk Group Capacity
DerivedStatistics.DiskGroupDemand | Disk Group Demand
DerivedStatistics.DiskGroupUsage | Disk Group Usage
VsanStatistics.Disk Capacity (GB) | Disk Capacity (GB)
VsanStatistics.Disk Capacity Used (%) | Disk Capacity Used (%)
VsanStatistics.Capacity Used (GB) | Capacity Used (GB)

ESXi Physical NIC Metrics

The Management Pack for Storage Devices collects metrics about the physical Network Interface Cards (NICs) on the ESXi host.

Table 5‑2. ESXi Physical NIC Metrics
Metric Name | User Interface Name
esx_pnic_statistics.PacketsReceived | Packets Received
esx_pnic_statistics.PacketsTransmitted | Packets Transmitted
esx_pnic_statistics.Received | Received
esx_pnic_statistics.Transmitted | Transmitted
esx_pnic_statistics.DroppedReceived | Dropped Received
esx_pnic_statistics.DroppedTransmitted | Dropped Transmitted
esx_pnic_statistics.BytesReceived | Bytes Received
esx_pnic_statistics.BytesTransmitted | Bytes Transmitted
esx_pnic_statistics.BroadcastReceived | Broadcast Received
esx_pnic_statistics.BroadcastTransmitted | Broadcast Transmitted
DerivedStatistics.Capacity | Capacity (MBps)
DerivedStatistics.Demand | Demand (MBps)
DerivedStatistics.Usage | Usage (MBps)
DerivedStatistics.UsageCapacityPercent | Used Capacity Percent

Host Adapter Metrics

The Management Pack for Storage Devices collects metrics about the host adapters in the host system.

Table 5‑3.
Host Adapter Metrics
Metric Name | User Interface Name
AdapterErrorStatistics.FailedCommands | Failed Commands
AdapterErrorStatistics.FailedBlocksRead | Failed Blocks Read Per Second
AdapterErrorStatistics.FailedBlocksWritten | Failed Blocks Written Per Second
AdapterErrorStatistics.FailedReadOperations | Failed Read Operations Per Second
AdapterErrorStatistics.FailedWriteOperations | Failed Write Operations Per Second
AdapterErrorStatistics.LipCount | Lip Count
AdapterErrorStatistics.ErrorFrames | Error Frames Per Second
AdapterErrorCountPerMin.LinkFailureCount | Link Failure Count
AdapterErrorCountPerMin.LossOfSignalCount | Loss Of Signal Count
AdapterErrorCountPerMin.InvalidTxWordCount | Invalid Tx Word Count
AdapterErrorCountPerMin.InvalidCRCCount | Invalid CRC Count
ReadStatistics.BlocksRead | Blocks Read Per Second
ReadStatistics.ReadOperations | Read Operations Per Second
WriteStatistics.BlocksWritten | Blocks Written Per Second
WriteStatistics.WriteOperations | Write Operations Per Second
TaskManagementStatistics.Aborts | Aborts Per Minute
TaskManagementStatistics.LunResets | Lun Resets Per Minute
FCStatistics.TxFrames | Tx Frames Per Second
FCStatistics.RxFrames | Rx Frames Per Second
DerivedStatistics.CongestionRatio | Congestion Ratio
DerivedStatistics.HostAdapterCapacity | Host Adapter Capacity (MBps)
DerivedStatistics.HostAdapterDemand | Host Adapter Demand (MBps)
DerivedStatistics.HostAdapterUsage | Host Adapter Usage (MBps)
DerivedStatistics.HostAdapterInferredLatency | Input/Output latency observed on the host adapter.
DerivedStatistics.IOPs | Aggregation of the read and write operations per second.

AdapterErrorStatistics and AdapterErrorCountPerMin metrics are displayed only in Fibre Channel (FC) and Fibre Channel over Ethernet (FCoE) environments.
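The two remarks that close the host adapter table can be illustrated with a short sketch in Python. It assumes that "aggregation" in the DerivedStatistics.IOPs description means the sum of the read and write operations per second, which the guide does not state explicitly. The function names, the sample dictionary, and the protocol labels are hypothetical and do not represent how the management pack itself computes or filters metrics.

```python
# Hypothetical illustration of two statements from the host adapter table:
#  1. DerivedStatistics.IOPs aggregates read and write operations per second
#     (assumed here to be a simple sum).
#  2. AdapterErrorStatistics and AdapterErrorCountPerMin metrics are shown
#     only in FC and FCoE environments.

FC_ONLY_GROUPS = ("AdapterErrorStatistics", "AdapterErrorCountPerMin")


def derived_iops(sample):
    """Sum read and write operations per second from one metric sample."""
    return (sample["ReadStatistics.ReadOperations"]
            + sample["WriteStatistics.WriteOperations"])


def displayed_metrics(sample, protocol):
    """Drop the FC-only metric groups unless the protocol is FC or FCoE."""
    if protocol in ("FC", "FCoE"):
        return dict(sample)
    return {name: value for name, value in sample.items()
            if name.split(".")[0] not in FC_ONLY_GROUPS}


if __name__ == "__main__":
    sample = {
        "ReadStatistics.ReadOperations": 1200.0,
        "WriteStatistics.WriteOperations": 800.0,
        "AdapterErrorStatistics.ErrorFrames": 0.0,
    }
    print(derived_iops(sample))
    print(sorted(displayed_metrics(sample, "iSCSI")))
```

The metric names used as dictionary keys are taken from the table above. Everything else is a sketch under the stated assumptions.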

Magnetic Disk, Solid State Device, and Disk Group Metrics

The Management Pack for Storage Devices collects metrics about the magnetic disks, solid state devices, and disk groups on the ESXi hosts.

Table 5‑4. Magnetic Disk, Solid State Device, and Disk Group Metrics
Metric Name | User Interface Name
busResets | Bus Resets
capacity.usage | Capacity Usage
commandsAborted | Commands Aborted
deviceLatency | Device Latency
deviceReadLatency | Device Read Latency
deviceWriteLatency | Device Write Latency
maxQueueDepth | Max Queue Depth
numberRead | Number Read
numberReadAveraged | Number Read Averaged
numberWrite | Number Write
numberWriteAveraged | Number Write Averaged
read | Read
scsiReservationConflicts | SCSI Reservation Conflicts
totalLatency | Total Latency
totalReadLatency | Total Read Latency
totalWriteLatency | Total Write Latency
usage | Usage
write | Write
HDDErrors | HDD Errors
IOPS | IOPS

Magnetic Disk and Solid State Device SMART Metrics

The Management Pack for Storage Devices collects SMART metrics about magnetic disks and solid state devices on the ESXi hosts.

Table 5‑5.
Magnetic Disk and Solid State Device SMART Metrics
Metric Name | User Interface Name
scsi_smart_statistics.HealthStatus.Value | Health Status Value
scsi_smart_statistics.MediaWearoutIndicator.Value | Media Wearout Indicator Value
scsi_smart_statistics.MediaWearoutIndicator.Threshold | Media Wearout Indicator Threshold
csi_smart_statistics.WriteErrorCount.Value | Write Error Count Value
scsi_smart_statistics.WriteErrorCount.Threshold | Write Error Count Threshold
scsi_smart_statistics.ReadErrorCount.Value | Read Error Count Value
scsi_smart_statistics.ReadErrorCount.Threshold | Read Error Count Threshold
scsi_smart_statistics.PoweronHours.Value | Power on Hours value
scsi_smart_statistics.ReallocatedSectors.Value | Re-allocated sectors value
scsi_smart_statistics.ReallocatedSectors.Threshold | Re-allocated sectors threshold
scsi_smart_statistics.RawReadErrorRate.Value | Raw Read Error Rate Value
scsi_smart_statistics.RawReadErrorRate.Threshold | Raw Read Error Rate Threshold
scsi_smart_statistics.DriveTemperature.Value | Drive Temperature Value
scsi_smart_statistics.DriveTemperature.Worst | Drive Temperature Worst
scsi_smart_statistics.DriveTemperature.Threshold | Drive Temperature Threshold
scsi_smart_statistics.DriveRatedMaxTemperature.Value | Drive Rated Max Temperature Value
scsi_smart_statistics.DriveRatedMaxTemperature.Threshold | Drive Rated Max Temperature Threshold
scsi_smart_statistics.WriteSectorsTOTCount.Value | Write Sectors TOT Count Value
csi_smart_statistics.WriteSectorsTOTCount.Threshold | Write Sectors TOT Count Threshold
scsi_smart_statistics.ReadSectorsTOTCount.Value | Read Sectors TOT Count Value
scsi_smart_statistics.ReadSectorsTOTCount.Threshold | Read Sectors TOT Count Threshold
scsi_smart_statistics.InitialBadBlockCount.Value | Initial Bad Block Count Value
scsi_smart_statistics.InitialBadBlockCount.Threshold | Initial Bad Block Count Threshold

Mount Metrics

The Management Pack for Storage Devices collects metrics about a storage LUN abstraction that is available on the host system.
Table 5‑6. Mount Metrics

Metric Name                                  User Interface Name
DeviceErrorStatistics.FailedCommands         Failed Commands
DeviceErrorStatistics.FailedReadOperations   Failed Read Operations
DeviceErrorStatustics.FailedWriteOperations  Failed Write Operations
DeviceErrorStatustics.FailedBlocksWritten    Failed Blocks Written
ReadStatistics.BlocksRead                    Blocks Read
ReadStatistics.ReadOperations                Read Operations
WriteStatistics.BlocksWritten                Blocks Written
WriteStatistics.WriteOperations              Write Operations
TaskManagementStatistics.Aborts              Task Management Statistics
TaskManagementStatistics.LunResets           LUN Resets
DerivedStatistics.CongestionRatio            Ratio of the active commands to the queued commands on the mount.
HostSpecificMetrics.PathHealthRatio          Path Health Ratio
DerivedStatistics.MountCapacity              Mount Throughput Capacity (MBps)
DerivedStatistics.MountDemand                Mount Throughput Demand (MBps)
DerivedStatistics.MountUsage                 Mount Throughput Usage (MBps)
DerivedStatistics.MountInferredLatency       Mount Inferred Latency
DerivedStatistics.MountAggregatedErrors      Mount errors
DerivedStatistics.IOPS                       IOPS

NFS Volume Metrics

The Management Pack for Storage Devices collects metrics about Network File System (NFS) volumes.

Table 5‑7.
NFS Volume Metrics

Metric Name                                      User Interface Name
DerivedStatistics.ThroughputCapacity             Throughput (MBps)
DerivedStatistics.ThroughputUsage                Current Throughput (MBps)
DerivedStatistics.InferredLatency                Inferred Latency
NfsVolumePerformanceMetrics.datastoreIops        Aggregate IOPS
NfsVolumePerformanceMetrics.datastoreWriteIops   Writes per second
NfsVolumePerformanceMetrics.read                 Read Rate
NfsVolumePerformanceMetrics.write                Write Rate
NfsVolumePerformanceMetrics.totalReadLatency     Total Read Latency
NfsVolumePerformanceMetrics.datastoreReadBytes   Bytes Read
NfsVolumePerformanceMetrics.datastoreWriteBytes  Bytes Written
NfsVolumePerformanceMetrics.datastoreReadIops    Reads per second
NfsVolumePerformanceMetrics.totalWriteLatency    Total Write Latency

Storage LUN Metrics

The Management Pack for Storage Devices collects metrics about the storage LUN of a storage array.

Table 5‑8. Storage LUN Metrics

Metric Name      User Interface Name
lun.PercentUsed  Percent Used
lun.LunCapacity  LUN Capacity

Switch Port Metrics

The Management Pack for Storage Devices collects metrics about the port on a switch.

Table 5‑9.
Switch Port Metrics

Metric Name                              User Interface Name
switch_port_statistics.AddressErrors     Address Errors
switch_port_statistics.BytesReceived     Bytes Received
switch_port_statistics.BytesTransmitted  Bytes Transmitted
switch_port_statistics.CRCErrors         CRC Errors
switch_port_statistics.DelimiterErrors   Delimiter Errors
switch_port_rate_statistics.RxFrameRate  Rx Frame Rate
switch_port_rate_statistics.RxRate       Rx Rate
switch_port_rate_statistics.TxFrameRate  Tx Frame Rate
switch_port_rate_statistics.TxRate       Tx Rate
DerivedStatistics.SwitchPortCapacity     Switch Port Capacity (MBps)
DerivedStatistics.SwitchPortUsage        Switch Port Capacity
DerivedStatistics.AggregatedErrors       NA

Virtual SAN Data Store (Performance Metrics)

The Management Pack for Storage Devices collects performance metrics about the Virtual SAN data stores in your datacenter.

Table 5‑10. Virtual SAN Data Store (Performance Metrics)

Metric Name                                       User Interface Name
PerformanceStatistics.opIngressCongestionCount    opIngressCongestionCount
PerformanceStatistics.readCachedBytes             Total Read Cached Bytes
PerformanceStatistics.readCachedCount             Read Cached Count
PerformanceStatistics.writeCongestionSum          Write Congestion Sum
PerformanceStatistics.recoveryWriteCount          Total Recovery Write Count
PerformanceStatistics.readCount                   Total Read Count
PerformanceStatistics.recoveryWriteBytes          Recovery Write Bytes
PerformanceStatistics.writeBytes                  Total Write Bytes
PerformanceStatistics.recoveryWriteCongestionSum  Recovery Write Congestion Sum
PerformanceStatistics.prepareLimitRetryCount      Prepare Limit Retry Count
PerformanceStatistics.readCachedLatencyMaxUs      Maximum Read Cached Latency (ms)
PerformanceStatistics.readCongestionSum           Total Read Congestion
PerformanceStatistics.ioCount                     Total Input Output Count
PerformanceStatistics.objIngressCongestionCount   Object Ingress Congestion Count
PerformanceStatistics.recoveryWriteLatencyMaxUs   Maximum Recovery Write Latency (ms)
PerformanceStatistics.writeLatencyMaxUs           Maximum Write Latency (ms)
PerformanceStatistics.writeCount                  Total Write Count
PerformanceStatistics.readLatencyMaxUs            Maximum Read Latency (ms)
PerformanceStatistics.readCachedCongestionSum     Total Read Cached Congestion

Virtual SAN Data Store Metrics

The Management Pack for Storage Devices collects metrics about the Virtual SAN data stores in your datacenter.

Table 5‑11. Virtual SAN Data Store Metrics

Metric Name                                     User Interface Name
DerivedStatistics.totalCapacity                 Total Capacity (MB)
DerivedStatistics.provisionedCapacity           Demand (MB)
DerivedStatistics.usedCapacity                  Usage (MB)
DerivedStatistics.latency                       Inferred Latency
DerivedStatistics.throughput                    Throughput (MBps)
DerivedStatistics.maxThroughput                 Maximum Throughput (MBps)
DerivedStatistics.avgTotalCapacityPerDiskGroup  Average Capacity Per Disk Group (GB)
DerivedStatistics.avgUsedCapacityPerDiskGroup   Average Used Capacity Per Disk Group (GB)
DerivedStatistics.avgThroughputPerDiskGroup     Average Throughput Per Disk Group (MBps)
DerivedStatistics.avgMaxThroughputPerDiskGroup  Average Max Throughput Per Disk Group (MBps)
HostSpecificMetrics.readCachedLatency           Read Cached Latency (ms)
HostSpecificMetrics.readLatency                 Read Latency (ms)
HostSpecificMetrics.recoveryWriteLatency        Recovery Write Latency (ms)
HostSpecificMetrics.writeLatency                Write Latency (ms)
HostSpecificMetrics.PerHostThroughput           Throughput (KBps)
HostSpecificMetrics.MaxPerHostThroughput        Maximum Throughput (KBps)

Virtual SAN Host Metrics

The Management Pack for Storage Devices collects metrics about the Virtual SAN host in your datacenter.

Table 5‑12.
Virtual SAN Host Metrics

Metric Name                             User Interface Name
VsanStatistics.ComponentUsage           Component usage (%)
VsanStatistics.capacity                 Capacity (GB)
VsanStatistics.totalUsedCapacity        Capacity Used (GB)
DerivedStatistics.ThroughputCapacity    Throughput (MBps)
DerivedStatistics.ThroughputUsage       Current Throughput (MBps)
VsanStatistics.failedCommands           Failed Commands
VsanStatistics.latency.totalTime        Latency total time
VsanStatistics.capacityUsed             Disk Capacity Used (%)
VsanStatistics.capacity                 Disk Capacity (GB)
VsanStatistics.Read Cache Hit Rate (%)  Read Cache Hit Rate (%)

Object Properties

The Management Pack for Storage Devices displays object properties. To view them in the vRealize Operations Manager user interface, click Environment > Storage Devices > Analysis tab.

Disk Group Properties

The Management Pack for Storage Devices displays the disk group object properties.

Table 5‑13. Disk Group Properties

Name      Type    Display Name  Description
HostUuid  String  Host UUID     The UUID of the host.

ESXi Physical Network Interface Card Properties

The Management Pack for Storage Devices displays the properties of the ESXi Physical Network Interface Card (NIC) object.

Table 5‑14. ESXi Physical NIC Properties

Name         Type    Display Name  Description
Link Speed   long    Link Speed    The speed of the link.
Link State   string  Link State    Indicates the state of the link.
PCI Device   string  PCI Device    Indicates the type of PCI used.
Driver Name  string  Driver Name   Indicates the name of the driver.

Host Adapter Properties

The Management Pack for Storage Devices displays the properties of the host adapter.

Table 5‑15. Host Adapter Properties

Name                 Type    Display Name           Description
FirmwareVersion      string  Firmware Version       The host bus adapter (HBA) firmware version.
DriverName           string  Driver Name            The HBA driver name in ESXi.
DriverVersion        string  Driver Version         The HBA driver version in ESXi.
PortSpeed            string  Port Speed             The configured port speed.
SupportedPortSpeeds  string  Supported Port Speeds  The list of supported port speeds.
Vendor               string  Vendor                 The vendor's name for the HBA.
Model                string  Model                  The model name of the HBA.
HostPortName         string  Host Port Name         The World Wide Port Name (WWPN) of the HBA.
HostNodeName         string  Host Node Name         The World Wide Node Name (WWNN) of the HBA.
HardwareVersion      string  Hardware Version       The hardware version of the HBA.
OptionROMVersion     string  Option ROM Version     The option ROM version of the HBA.
MaxFrameSize         string  Max Frame Size         The maximum size of a frame.
PortType             string  Port Type              The port type, such as N port or F port.
FabricName           string  Fabric Name            The WWPN of the fabric to which the HBA connects.
Name                 string  Name                   The HBA name in ESXi.

IP Switch Properties

The Management Pack for Storage Devices displays the properties of the IP switch objects.

Table 5‑16. IP Switch Properties

Name          Type    Display Name  Description
SNMP Enabled  string  SNMP Enabled  Indicates if the IP switch is SNMP enabled.

IP Switch Port Properties

The Management Pack for Storage Devices displays the properties of the IP switch port object.

Table 5‑17. IP Switch Port Properties

Name        Type    Display Name  Description
Port State  string  Port State    Indicates the state of the port.

Storage Array Properties

The Management Pack for Storage Devices displays the properties of the storage array object.

Table 5‑18. Storage Array Properties

Name                           Type    Display Name                      Description
Fimware                        string  Storage Array Firmware            The firmware version of the array.
ModelId                        string  Storage Array Model Id            The model number.
VendorId                       string  Vendor ID                         The vendor identifier.
SupportedBlockInterfaces       string  Supported Block Interfaces        The list of supported block interfaces, such as Fibre Channel (FC) and iSCSI.
SupportedFileSystemInterfaces  string  Supported File System Interfaces  The list of supported file system interfaces, such as Network File System (NFS) and Common Internet File System (CIFS).
Name                           string  Array Name                        The array name.
UUID                           string  SMS Array Uuid                    The UUID of the array.
AlternateNames                 string  Alternate Names                   -
SupportedProfile               string  Supported Profile                 The list of supported VASA profiles.

Magnetic Disk Properties

The Management Pack for Storage Devices displays the properties of the magnetic disk object.

Table 5‑19. Magnetic Disk Properties

Name                   Type    Display Name           Description
Name                   string  Name                   The name of the magnetic disk.
Size                   string  Size                   The size of the magnetic disk.
Vendor                 string  Vendor                 The name of the manufacturer of the magnetic disk.
is Virtual SAN Entity  string  is Virtual SAN Entity  -
QueueDepth             string  Queue Depth            The number of pending input/output requests.

Mount Properties

The Management Pack for Storage Devices displays the properties of the mount object.

Table 5‑20. Mount Properties

Name    Type    Display Name      Description
Vendor  string  Vendor            The vendor name of the mount device.
Model   string  Model             The model number of the mount device.
Size    long    Mount Size in KB  The capacity of the mount device.
Name    string  Name              The device identifier of the mount device.

Solid State Device Properties

The Management Pack for Storage Devices displays the properties of the solid state device object.

Table 5‑21. Solid State Device Properties

Name                   Type    Display Name           Description
Name                   string  Name                   The name of the solid state device.
Size                   string  Size                   The size of the solid state device.
Vendor                 string  Vendor                 The name of the manufacturer of the solid state device.
is Virtual SAN Entity  string  is Virtual SAN Entity  -
QueueDepth             string  QueueDepth             The number of pending input/output requests.

Switch Properties

The Management Pack for Storage Devices displays the properties of the switch object.

Table 5‑22. Switch Properties

Name                        Type    Display Name                 Description
Manufacturer                string  Manufacturer                 The name of the manufacturer of the switch.
Model                       string  Model                        The model of the switch.
PartNumber                  string  Part Number                  The part number of the switch.
SerialNumber                string  Serial Number                The serial number of the switch.
ManufacturerSerialNumber    string  Manufacturer Serial Number   The manufacturer's serial number for the switch.
VersionString               string  Version String               The version number.
ClassificationDescriptions  string  Classification Descriptions  The version class (operating system and firmware).

Switch Port Properties

The Management Pack for Storage Devices displays the properties of the switch port object.

Table 5‑23. Switch Port Properties

Name              Type    Display Name       Description
Speed             long    Speed              The configured speed.
MaxSpeed          long    Max Speed          The maximum speed.
PortType          string  Port Type          The type of port.
PortNumber        string  Port Number        The port number.
PermanentAddress  string  Permanent Address  The WWPN of the switch port.
EnabledState      string  Enabled State      Indicates if the port is enabled.

6 User Scenario: An Object in Your Storage Network Has a Performance Problem

A user reports that a virtual machine is running slowly. She is not sure how long the problem has existed, but she needs to process data overnight and wants to know if she can start the task before she leaves for the day.

As the system administrator, you realize that the problem might not be with the virtual machine itself.
Rather than investigate the state of only the virtual machine, you begin by reviewing the dashboard that shows the state of all storage network objects associated with the virtual machine.

Procedure

1  See the topic called User Scenario: Use the Troubleshooting Dashboard to Locate the Real Problem Source
   As the system administrator for the data center, you must locate the user's virtual machine in the network storage topology so that you can determine if the problem is with the host system or somewhere else in the storage network.
2  See the topic called User Scenario: Examine Metrics to Find the Root Cause
   To determine if the virtual machine performance problem is temporary or indicates a more serious problem in the storage network, you create metric charts to identify the root cause of the problem.

This chapter includes the following topics:
- "User Scenario: Use the Troubleshooting Dashboard to Locate the Real Problem Source"
- "User Scenario: Examine Metrics to Find the Root Cause"

User Scenario: Use the Troubleshooting Dashboard to Locate the Real Problem Source

As the system administrator for the data center, you must locate the user's virtual machine in the network storage topology so that you can determine if the problem is with the host system or somewhere else in the storage network.

The fastest way to inspect all of the components connected to the virtual machine is to look at the Troubleshooting dashboard. Start by locating the problematic object.

Procedure

1  Click the Troubleshooting dashboard.
   The dashboard displays badges for all the objects that the Management Pack for Storage Devices discovered in the storage area network.
2  In the group of virtual machines displayed, point to each badge icon to see the object name.
3  Identify the virtual machine that the user reported as slow and select the badge.
   This action highlights all the objects in the storage area network path from the host system to the storage mount that are associated with the virtual machine.
4  Review highlighted objects that appear with a red, orange, or yellow condition.
5  Select an object to inspect problems or alerts reported in the widgets on the right.
   Problems such as low throughput do not generate alerts but can affect performance.

You identified storage components associated with the virtual machine that have problems. You can now select one of the components to examine faults, anomalies, and metrics on that object and determine if its problems are the root cause of problems seen on the virtual machine.

User Scenario: Examine Metrics to Find the Root Cause

To determine if the virtual machine performance problem is temporary or indicates a more serious problem in the storage network, you create metric charts to identify the root cause of the problem.

You used the Troubleshooting dashboard to find an object networked to the problematic virtual machine that is not operating optimally. Now you want to investigate the performance of that object over a period of time.

Procedure

1  If you are on the Troubleshooting dashboard with the problematic virtual machine selected, select a highlighted object that appears with a red, orange, or yellow condition.
2  Double-click the object to access collected metrics.
   The left pane displays the object name and the related objects. The main pane displays the Summary tab.
3  Click the Troubleshooting tab, and click the All Metrics tab.
   Throughput, latency, and error metrics are the most useful troubleshooting metrics.
   a  Select the error statistics and double-click an entry to determine if errors occurred over time.
      Errors can indicate when the component had trouble accessing the network.
   b  Select the derived statistics and double-click the usage metric to see a historical graph of the component use over time.
      Slow throughput can occur during times of high workload.
   c  Select the derived statistics and double-click the latency metric to see a historical graph of the time required to process I/O requests.
      High latency can indicate that the component had trouble receiving data from the network.

What to do next

If the error statistics show no problems and derived statistics show a cyclical use pattern on the component, you can call the user and explain that the problem might be temporary: when I/O traffic throughout the network drops, the virtual machine becomes more responsive.

If the error statistics show no problems and derived statistics show consistently low use of the component but the latency is suddenly high, another component on the network might be causing connectivity to time out. Check for a faulty switch or other faulty component on the network.
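
The guidance under "What to do next" in the Examine Metrics scenario can be sketched as a small decision function. This is only an illustration, not part of the management pack: the function name triage, the list-based inputs, and every threshold (low_usage_pct, high_usage_pct, spike_factor) are hypothetical, and the guide does not define numeric limits for "cyclical", "consistently low", or "suddenly high". The sketch assumes you have copied error counts, usage percentages, and latency samples from the All Metrics charts into plain Python lists.

```python
from statistics import mean


def triage(error_counts, usage_pct, latency_ms,
           low_usage_pct=30.0, high_usage_pct=70.0, spike_factor=3.0):
    """Suggest a next step from samples read off the All Metrics charts.

    All thresholds are illustrative assumptions, not product defaults.
    """
    # Any recorded error means the error statistics need a closer look first.
    if any(count > 0 for count in error_counts):
        return "errors"
    # Without usage data and at least two latency samples, nothing can be inferred.
    if not usage_pct or len(latency_ms) < 2:
        return "inconclusive"

    peak, trough = max(usage_pct), min(usage_pct)
    # Assumption: use that swings between busy and idle counts as "cyclical".
    if peak >= high_usage_pct and trough <= low_usage_pct:
        return "temporary"

    # Compare the newest latency sample with the average of the earlier ones.
    baseline = mean(latency_ms[:-1])
    latest = latency_ms[-1]
    spiked = latest > 0 and latest >= spike_factor * baseline
    # Consistently low use plus a latency spike points at another component.
    if peak <= low_usage_pct and spiked:
        return "check-network"
    return "inconclusive"


print(triage([0, 0, 0], [20, 85, 25, 90, 15], [5, 6, 5, 6, 5]))  # temporary
print(triage([0, 0, 0], [10, 12, 9, 11], [5, 5, 6, 40]))          # check-network
```

A result of "temporary" corresponds to telling the user that responsiveness should return when I/O traffic drops, and "check-network" corresponds to looking for a faulty switch or other faulty component. Any other result means the samples do not match either pattern described in the guide and need manual review.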