Host Metrics Sumo Logic App

The Host Metrics app lets you monitor the performance and resource utilization of the hosts and processes that your mission-critical applications depend on. Preconfigured dashboards provide insight into CPU, memory, network, file descriptors, page faults, and TCP connections. The app uses the Sumo Logic Installed Collector to collect host metrics data.
Collecting Metrics for the Host Metrics App
This procedure explains how to collect metrics from a host machine and ingest them into Sumo Logic for metrics visualization.
Configure a Collector
Configure an Installed Collector. Collectors can be installed on Linux, Windows, or macOS hosts.
Configure a Source
- Configure a Host Metrics Source. Choose Add Source and select Host Metrics as the source type.
- Configure the Source Fields as follows:
- Name. Required. Description is optional. The source name is stored in a searchable field called _sourceName.
- Source Host. Enter the host name of the machine from which the metrics will be collected.
- Source Category. Required. The Source Category metadata field is a fundamental building block to organize and label Sources. For details see Best Practices.
- Scan Interval. Select the frequency for the Source to scan for host metrics data. Selecting a short interval will increase the message volume and could cause your deployment to incur additional charges. The default is 1 minute.
- Metrics. Select check boxes for the metrics to collect. By default, all CPU and memory metrics are collected. Select the top level check box to select all metrics in that category. A blue checkmark icon indicates that the category is selected. To select individual metrics, click the right-facing arrow to expand the category and select the individual metrics. The category icon then changes to indicate that only some of its metrics are selected.
- Click Save.
Metric Types
Available metrics include:
- CPU
- Memory
- TCP
- Network
- Disk
The metrics that are collected are described in Host Metrics for Installed Collectors.
Host metrics are gathered by the open-source SIGAR library.
The following tables list the available host metrics.
CPU Metrics
Metric | Units | Description |
CPU_User | % | Total system cpu user time |
CPU_Sys | % | Total system cpu kernel time |
CPU_Nice | % | Total system cpu nice time |
CPU_Idle | % | Total system cpu idle time |
CPU_IOWait | % | Total system cpu IO wait time |
CPU_Irq | % | Total system cpu time servicing interrupts |
CPU_SoftIrq | % | Total system cpu time servicing softirqs |
CPU_Stolen | % | Total system cpu involuntary wait time |
CPU_LoadAvg_1min* | Average | System load average for past 1 minute |
CPU_LoadAvg_5min* | Average | System load average for past 5 minutes |
CPU_LoadAvg_15min* | Average | System load average for past 15 minutes |
CPU_Total | % | Total system CPU usage time |
* Load averages are not available on the Windows platform.
Memory Metrics
Metric | Units | Description |
Mem_Total | Bytes | Total amount of physical RAM |
Mem_Free | Bytes | The amount of physical RAM left unused by the system |
Mem_Used | Bytes | Total used system memory, calculated as Mem_Total - Mem_Free. This metric includes the space allocated in buffers and in the page cache, which can make it appear that a larger portion of physical RAM is being consumed than is actually in use. See Mem_ActualUsed. |
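Mem_ActualFree | Bytes | Actual total free system memory, calculated as Mem_Free + Buffers + Cached, where Buffers and Cached are the amounts of physical RAM held in buffers and in the page cache. |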
Mem_ActualUsed | Bytes | Actual total used system memory, calculated as Mem_Total - Mem_ActualFree. This metric better represents the amount of physical RAM in use than Mem_Used. |
Mem_UsedPercent | % | Percent total used system memory, calculated as (Mem_Total - Mem_ActualFree) / Mem_Total |
Mem_FreePercent | % | Percent total free system memory |
Mem_PhysicalRam | Bytes | System random access memory |
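The relationship between the memory metrics in the table can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the collector's implementation: the function names and sample values are hypothetical, and multiplying the ratio by 100 to express it as a percentage is an assumption based on the unit listed for Mem_UsedPercent.

```python
def actual_used(mem_total: int, mem_actual_free: int) -> int:
    """Mem_ActualUsed in bytes: Mem_Total - Mem_ActualFree."""
    return mem_total - mem_actual_free


def used_percent(mem_total: int, mem_actual_free: int) -> float:
    """Mem_UsedPercent: (Mem_Total - Mem_ActualFree) / Mem_Total, as a percentage.

    Scaling by 100 is an assumption; the table gives only the ratio.
    """
    if mem_total <= 0:
        raise ValueError("mem_total must be positive")
    return 100.0 * (mem_total - mem_actual_free) / mem_total


if __name__ == "__main__":
    total = 16 * 1024**3       # hypothetical host with 16 GiB of RAM
    actual_free = 4 * 1024**3  # hypothetical 4 GiB free, counting buffers and cache
    print(actual_used(total, actual_free))   # 12884901888
    print(used_percent(total, actual_free))  # 75.0
```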
TCP Metrics
Metric | Units | Description |
TCP_InboundTotal | Count | TCP inbound connection count |
TCP_OutboundTotal | Count | TCP outbound connection count |
TCP_Established | Count | TCP established connection count |
TCP_Listen | Count | TCP listen connection count |
TCP_Idle | Count | TCP idle connection count |
TCP_Closing | Count | TCP closing connection count |
TCP_CloseWait | Count | TCP close_wait connection count |
TCP_Close | Count | TCP close connection count |
TCP_TimeWait | Count | TCP time_wait connection count |
Networking Metrics
Networking metrics have two additional dimensions:
- Interface: Name of the network interface (example: eth0)
- Description: Description of the network interface (example: Dual Band Wireless-AC 8265)
Networking metrics are cumulative, so you can use the rate operator to display these metrics as a rate per second. For example: metric=Net_InBytes Interface=eth0 | rate
Metric | Units | Description |
Net_InPackets | Packets | Number of received packets |
Net_OutPackets | Packets | Number of sent packets |
Net_InBytes | Bytes | Number of received bytes |
Net_OutBytes | Bytes | Number of sent bytes |
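To see why a cumulative counter needs a rate calculation before it is useful on a chart, the following Python sketch derives per-second rates from successive counter samples. It illustrates the general idea only and is not a description of how the Sumo Logic rate operator is implemented; the sample data, the function name, and the choice to skip counter resets are all assumptions.

```python
from typing import List, Tuple

# (timestamp in seconds, cumulative counter value such as Net_InBytes)
Sample = Tuple[float, int]


def per_second_rates(samples: List[Sample]) -> List[Tuple[float, float]]:
    """Return (timestamp, rate per second) for each pair of consecutive samples."""
    rates = []
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        elapsed = t1 - t0
        if elapsed <= 0 or v1 < v0:
            # Skip out-of-order samples and counter resets (for example, after a reboot).
            continue
        rates.append((t1, (v1 - v0) / elapsed))
    return rates


if __name__ == "__main__":
    # Hypothetical samples taken at a 60-second scan interval.
    samples = [(0, 1000), (60, 7000), (120, 7600)]
    print(per_second_rates(samples))  # [(60, 100.0), (120, 10.0)]
```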
Disk Metrics
Disk metrics have two additional dimensions:
- DevName: Device name, such as the mount name (example: udev)
- DirName: Directory name, such as the mount directory (example: /dev)
Disk_Reads, Disk_Writes, Disk_ReadBytes, and Disk_WriteBytes are cumulative, so you can use the rate operator to display these metrics as a rate per second. For example: metric=Disk_WriteBytes | rate
Metric | Units | Description |
Disk_Reads | Operations | Number of physical disk reads |
Disk_ReadBytes | Bytes | Number of physical disk bytes read |
Disk_Writes | Operations | Number of physical disk writes |
Disk_WriteBytes | Bytes | Number of physical disk bytes written |
Disk_Queue | Operations | Number of disk queue operations |
Disk_InodesAvailable* | Nodes | Number of free file nodes |
Disk_Used | Bytes | Total used bytes on filesystem |
Disk_UsedPercent | % | Percentage of filesystem space used |
Disk_Available | Bytes | Total available bytes on filesystem |
* Disk_InodesAvailable is not available on the Windows platform.
Time Intervals
The time interval determines how frequently the Source is scanned for metrics data. Sumo Logic supports pre-specified time intervals (10 seconds, 15 seconds, 30 seconds, 1 minute, and 5 minutes).
You can also specify a time interval in JSON by using the interval parameter, as follows:
"interval" : 60000
The JSON parameter is in milliseconds. We recommend 60 seconds (60000 ms) or longer granularity. Specifying a shorter interval will increase the message volume and could cause your deployment to incur additional charges.
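Because the interval parameter is expressed in milliseconds while the preset intervals are listed in seconds and minutes, a small conversion helper can reduce mistakes. The Python sketch below is illustrative only: the function and constant names are hypothetical, and it produces just the interval fragment, not a complete Source configuration.

```python
import json

# Recommended minimum from the text above: 60 seconds (60000 ms).
RECOMMENDED_MIN_MS = 60_000

# Pre-specified intervals from the text above, in seconds.
PRESET_SECONDS = (10, 15, 30, 60, 300)


def interval_fragment(seconds: int) -> str:
    """Return the JSON fragment for an interval given in seconds."""
    if seconds <= 0:
        raise ValueError("interval must be positive")
    return json.dumps({"interval": seconds * 1000})


def below_recommended(seconds: int) -> bool:
    """True if the interval is shorter than the recommended 60 seconds."""
    return seconds * 1000 < RECOMMENDED_MIN_MS


if __name__ == "__main__":
    for seconds in PRESET_SECONDS:
        note = " (shorter than recommended)" if below_recommended(seconds) else ""
        print(interval_fragment(seconds) + note)
```

Intervals flagged as shorter than recommended are still valid; the flag only marks those that may increase message volume.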
AWS Metadata
Collectors running on AWS EC2 instances can optionally collect AWS Metadata such as EC2 tags to make it easier to search for Host Metrics. For more information, see AWS Metadata Source for Metrics.
Only one AWS Metadata Source for Metrics is required to collect EC2 tags from multiple hosts.
Installing the Host Metrics App
Now that you have configured Host Metrics, install the Sumo Logic App for Host Metrics to take advantage of the preconfigured searches and dashboards to analyze your Host Metrics data.
To install the app:
- From the Sumo Logic navigation, select App Catalog.
- In the Search Apps field, search for and then select your app.
- Optionally, you can scroll down to preview the dashboards included with the app. Then, click Install App (sometimes this button says Add Integration).
Note: If your app has multiple versions, you'll need to select the version of the service you're using before installation.
- On the next configuration page, under Select Data Source for your App, complete the following fields:
- Data Source. Select one of the following options:
- Choose Source Category and select a source category from the list; or
- Choose Enter a Custom Data Filter, and enter a custom source category beginning with an underscore. For example, _sourceCategory=MyCategory.
- Folder Name. You can retain the existing name or enter a custom name of your choice for the app.
- All Folders (optional). Default location is the Personal folder in your Library. If desired, you can choose a different location and/or click New Folder to add it to a new folder.
- Click Next.
- Look for the dialog confirming that your app was installed successfully.
Once an app is installed, it appears in your Personal folder or in the folder that you specified. From there, you can share it with other users in your organization. Dashboard panels automatically start to fill with data that matches the time range query and was received after the panel was created. Results won't be available immediately, but within about 20 minutes you'll see completed graphs and maps.
Viewing Host Metrics Dashboards
Overview

Overall Average CPU Idle. Displays the CPU idle time averaged across all hosts in a line chart on a timeline for the last hour. You can modify the list of hosts using the provided filters.
Overall Average CPU Load (1m, 5m, 15m). Shows the CPU load averages for the past 1, 5, and 15 minutes, averaged across all hosts, in a line chart on a timeline for the last hour.
Total Free System Memory per Host. Provides information on the total free system memory per host in a line chart on a timeline for the last hour.
Total Used, Less Buffers and Cached Memory per Host. Displays the total memory used less buffers and cached memory per host in a line chart on a timeline for the last hour.
Disk Used Bytes per Host. Shows the disk used bytes per host in a line chart on a timeline for the last hour.
Disk Available Bytes per Host. Provides the disk available bytes per host in a line chart on a timeline for the last hour.
Network InBytes Rate per Host. Displays the rate of network InBytes per host in a line chart on a timeline for the last hour.
Network OutBytes Rate per Host. Shows the rate of network OutBytes per host in a line chart on a timeline for the last hour.
CPU

CPU User Time per Host. Displays the CPU user time per host in a line chart on a timeline for the last hour.
Overall Average CPU User Time. Shows the CPU user time averaged across all hosts in a line chart on a timeline for the last hour.
CPU System Time per Host. Provides details on CPU system time per host in a line chart on a timeline for the last hour.
Overall Average CPU System Time. Displays the CPU system time averaged across all hosts in a line chart on a timeline for the last hour.
CPU 1 min Average Load per Host. Shows the CPU 1 minute average load per host in a line chart on a timeline for the last hour.
Overall Average CPU Load (1m, 5m, 15m). Provides the CPU load averages for the past 1, 5, and 15 minutes, averaged across all hosts, in a line chart on a timeline for the last hour.
CPU Idle Time per Host. Displays the CPU idle time per host in a line chart on a timeline for the last hour.
Overall Average CPU Idle Time. Shows the CPU idle time averaged across all hosts in a line chart on a timeline for the last hour.
CPU IO Wait Time per Host. Displays the CPU IO wait time per host in a line chart on a timeline for the last hour.
Disk

Disk Used Bytes per Host. Displays disk used bytes per host in a line chart on a timeline for the last hour.
Disk Available Bytes per Host. Shows disk available bytes per host in a line chart on a timeline for the last hour.
Disk Read Rate per Host. Provides details on disk read rate per host in a line chart on a timeline for the last hour.
Disk Read Byte Rate per Host. Displays disk read byte rate per host in a line chart on a timeline for the last hour.
Disk Write Rate per Host. Shows disk write rate per host in a line chart on a timeline for the last hour.
Disk Write Byte Rate per Host. Provides details on disk write byte rate per host in a line chart on a timeline for the last hour.
Memory

Total Memory per Host. Displays total memory per host in a line chart on a timeline for the last hour.
Percent Memory Used per Host. Shows percent memory used per host in a line chart on a timeline for the last hour.
Total Free, Buffers, and Cached Memory per Host. Provides details on the total free, buffers, and cached memory per host (from a metric called ActualFree) in a line chart on a timeline for the last hour.
Total Used, Less Buffers, and Cached Memory per Host. Displays the total used memory, less buffers and cached memory, per host (from a metric called ActualUsed) in a line chart on a timeline for the last hour.
Total Free Memory per Host. Shows the amount of total free memory per host available in a line chart on a timeline for the last hour.
Total Used System Memory per Host. Provides details on the total used system memory per host in a line chart on a timeline for the last hour.
Network

Network InPacket Rate per Host. Displays network InPacket rate per host in a line chart on a timeline for the last hour.
Network OutPacket Rate per Host. Shows network OutPacket rate per host in a line chart on a timeline for the last hour.
Network InByte Rate per Host. Provides details on network InByte rate per host in a line chart on a timeline for the last hour.
Network OutByte Rate per Host. Displays network OutByte rate per host in a line chart on a timeline for the last hour.
TCP

Inbound Connections per Host. Displays inbound connections per host in a line chart on a timeline for the last hour.
Outbound Connections per Host. Shows outbound connections per host in a line chart on a timeline for the last hour.
Listen Connections per Host. Provides details on listen connections per host in a line chart on a timeline for the last hour.
Established Connections per Host. Displays established connections per host in a line chart on a timeline for the last hour.
CloseWait Connections per Host. Shows CloseWait connections per host in a line chart on a timeline for the last hour.
TimeWait Connections per Host. Provides details on TimeWait connections per host in a line chart on a timeline for the last hour.
Filters
The supported filters are:
_sourceCategory
_sourceHost
_source
_collector
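The filters listed above use the same key=value form as the metrics query examples earlier in this document. As a rough sketch, a query string that combines a metric with one or more of these filters could be assembled as follows. The helper name and the sample category are hypothetical, and the sketch does not quote or escape values, so it assumes values without spaces.

```python
from typing import Dict, Optional


def build_query(metric: str,
                filters: Optional[Dict[str, str]] = None,
                rate: bool = False) -> str:
    """Compose a metrics query string from a metric name and key=value filters."""
    parts = [f"metric={metric}"]
    # Filters are appended in insertion order, for example _sourceCategory or Interface.
    parts.extend(f"{key}={value}" for key, value in (filters or {}).items())
    query = " ".join(parts)
    # Cumulative metrics can be piped through the rate operator.
    return f"{query} | rate" if rate else query


if __name__ == "__main__":
    print(build_query("Net_InBytes", {"Interface": "eth0"}, rate=True))
    # metric=Net_InBytes Interface=eth0 | rate
    print(build_query("CPU_Idle", {"_sourceCategory": "prod/hosts"}))
    # metric=CPU_Idle _sourceCategory=prod/hosts
```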