Summary View Metrics
Metric | Description |
---|---|
Unhealthy Services | Aggregate number of all CDH services in a Bad or Concerning state (Unknown and Disabled are not included) |
Applications Running | Number of YARN applications currently executing |
Cluster Memory Allocated | Percent of available memory allocated across all NodeManager instances |
HDFS Max Node I/O | Highest I/O level in bytes on any node in the HDFS system |
HDFS Disk Usage | Percentage of HDFS storage capacity in use |
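
As an illustration of how the Unhealthy Services count treats each state, the sketch below tallies only services that are Bad or Concerning and skips Unknown and Disabled ones. The function name, the dictionary layout, and the sample data are all hypothetical; this is not the product's API, only a minimal restatement of the rule in the table.

```python
# States that count toward "Unhealthy Services" according to the table.
UNHEALTHY_STATES = {"Bad", "Concerning"}


def count_unhealthy(service_states):
    """Return how many services are in a Bad or Concerning state.

    `service_states` is an assumed mapping of service name -> state string.
    Unknown and Disabled services are deliberately not counted.
    """
    return sum(1 for state in service_states.values() if state in UNHEALTHY_STATES)


# Made-up sample data, for illustration only.
services = {
    "HDFS": "Good",
    "YARN": "Concerning",
    "HBase": "Bad",
    "Hive": "Unknown",
    "Oozie": "Disabled",
}

unhealthy = count_unhealthy(services)
print(unhealthy)
```

With this sample, only YARN and HBase contribute to the count.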
YARN Metrics
Metric | Description |
---|---|
Applications Failed | Number of YARN applications that failed to execute successfully |
Applications Completed | Number of YARN applications that executed successfully |
Applications Running | Number of YARN applications currently executing |
Cluster Memory Allocated | Percent of available memory allocated across all NodeManager instances |
Cluster Memory Reserved | Percent of available memory reserved across all NodeManager instances |
Cluster Memory Skew | Comparison of the largest memory allocation on any single NodeManager to the average allocation across all NodeManagers |
Containers Allocated | Number of YARN containers currently allocated across the cluster |
Containers Pending | Number of YARN containers currently pending across the cluster |
Containers Reserved | Number of YARN containers currently reserved across the cluster |
NodeManagers | Number of NodeManagers in a bad (critical), concerning (degraded), and good state. Unknown and disabled states are displayed only when one or more NodeManagers are in those states |
ResourceManager Up Since | Timestamp when the ResourceManager service started |
ResourceManager Heap | Percentage of heap space used in the ResourceManager JVM |
ResourceManager UI | Opens the web interface for the service in a new window |
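
The memory metrics in this table can be read as simple aggregates over per-NodeManager figures. The sketch below is one plausible interpretation, not the product's documented formula: it assumes Cluster Memory Allocated is total allocated memory divided by total available memory, and that Cluster Memory Skew is the ratio of the largest per-node allocation to the mean allocation. The function names and sample values are hypothetical.

```python
def memory_allocated_percent(allocated_mb, available_mb):
    """Percent of available memory allocated across all NodeManagers.

    Both arguments are assumed to be lists with one entry (in MB) per node.
    """
    total_available = sum(available_mb)
    if total_available == 0:
        return 0.0
    return 100.0 * sum(allocated_mb) / total_available


def memory_skew(allocated_mb):
    """Assumed skew definition: largest allocation divided by the mean.

    A value of 1.0 means allocation is perfectly even across nodes.
    """
    if not allocated_mb:
        return 0.0
    mean = sum(allocated_mb) / len(allocated_mb)
    if mean == 0:
        return 0.0
    return max(allocated_mb) / mean


# Made-up figures for three NodeManagers, in MB.
allocated = [6144, 2048, 4096]
available = [8192, 8192, 8192]

percent = memory_allocated_percent(allocated, available)
skew = memory_skew(allocated)
print(percent, skew)
```

Under these assumptions, a skew well above 1.0 would suggest that one NodeManager is carrying a disproportionate share of the allocated memory.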
HDFS Metrics
Metric | Description |
---|---|
Capacity Usage | Used space as a percentage of overall storage capacity |
Datanodes | Number of datanodes in a bad (critical), concerning (degraded), and good state. Unknown and disabled states are displayed only when one or more datanodes are in those states |
Files + Directories Total | Total number of files and directories in HDFS |
Namenode Up since | Timestamp when the namenode service started |
Namenode Heap | Percentage of heap space used in the namenode JVM |
Namenode UI | Opens the web interface for the service in a new window |
HBase Metrics
Metric | Description |
---|---|
Load Average | Average region load per region server |
Region Servers | Number of region servers in a bad (critical), concerning (degraded), and good state. Unknown and disabled states are displayed only when one or more region servers are in those states |
Master Server Up since | Timestamp when the master server started |
Master Server Heap | Percentage of heap space used in the master server JVM |
Master Server UI | Opens the web interface for the service in a new window |
Other Services Metrics
Metric | Description |
---|---|
Bad | Services in a critical state |
Concerning | Services in a degraded state |
Disabled | Services in a disabled state (displayed only when one or more services are in this state) |
Good | Services in a good state |
Unknown | Services in an unknown state (displayed only when one or more services are in this state) |
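
The display rule for these states can be sketched as follows. The example assumes that Bad, Concerning, and Good are always listed (even at zero), while Disabled and Unknown appear only when at least one service is in that state, which is how the descriptions above read. The function name and input format are hypothetical.

```python
from collections import Counter

# Assumed to be shown at all times, even when the count is zero.
ALWAYS_SHOWN = ("Bad", "Concerning", "Good")
# Shown only when one or more services are in the state.
SHOWN_WHEN_PRESENT = ("Disabled", "Unknown")


def summarize_states(states):
    """Build the state -> count mapping that would be displayed.

    `states` is an assumed iterable of state strings, one per service.
    """
    counts = Counter(states)
    summary = {name: counts.get(name, 0) for name in ALWAYS_SHOWN}
    for name in SHOWN_WHEN_PRESENT:
        if counts.get(name, 0) > 0:
            summary[name] = counts[name]
    return summary


# Made-up sample: four services, one of them in an unknown state.
summary = summarize_states(["Good", "Good", "Bad", "Unknown"])
print(summary)
```

In this sample, Disabled is omitted from the result because no service is in that state.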