Host Metrics Sumo Logic App

The Host Metrics app lets you monitor the performance and resource utilization of the hosts and processes that your mission-critical applications depend on. Preconfigured dashboards provide insight into CPU, memory, network, file descriptors, page faults, and TCP connections. This app uses the Sumo Logic Installed Collector to collect host metrics data.

Collecting Metrics for the Host Metrics App

This procedure explains how to collect metrics from a host machine and ingest them into Sumo Logic for metrics visualization.

Configure a Collector

Configure an Installed Collector. Collectors can be installed on Linux, Windows, or macOS hosts.

Configure a Source

  1. Configure a Host Metrics Source. Choose Add Source and select Host Metrics as the source type.
  2. Configure the Source Fields as follows:
    1. Name. Required. Description is optional. The source name is stored in a searchable field called _sourceName.
    2. Source Host. Enter the host name of the machine from which the metrics will be collected.
    3. Source Category. Required. The Source Category metadata field is a fundamental building block to organize and label Sources. For details see Best Practices.
    4. Scan Interval. Select how frequently the Source scans for host metrics data. Selecting a short interval will increase the message volume and could cause your deployment to incur additional charges. The default is 1 minute.
    5. Metrics. Select check boxes for the metrics to collect. By default, all CPU and memory metrics are collected. Select the top-level check box to select all metrics in that category. A blue checkmark icon indicates that the category is selected. To select individual metrics, click the right-facing arrow to expand the category and select the individual metrics. The category icon then changes to a blue minus icon.
  3. Click Save.

Metric Types

Available metrics include:

  • CPU
  • Memory
  • TCP
  • Network
  • Disk

The metrics that are collected are described in Host Metrics for Installed Collectors.

Host metrics are gathered by the open-source SIGAR library.

The following tables list the available host metrics.

CPU Metrics

Metric | Units | Description
CPU_User | % | Total system CPU user time
CPU_Sys | % | Total system CPU kernel time
CPU_Nice | % | Total system CPU nice time
CPU_Idle | % | Total system CPU idle time
CPU_IOWait | % | Total system CPU IO wait time
CPU_Irq | % | Total system CPU time servicing interrupts
CPU_SoftIrq | % | Total system CPU time servicing softirqs
CPU_Stolen | % | Total system CPU involuntary wait time
CPU_LoadAvg_1min* | Average | System load average for past 1 minute
CPU_LoadAvg_5min* | Average | System load average for past 5 minutes
CPU_LoadAvg_15min* | Average | System load average for past 15 minutes
CPU_Total | % | Total system CPU usage time

* Load averages are not available on Windows platforms.

Memory Metrics

Metric | Units | Description
Mem_Total | Bytes | Total amount of physical RAM
Mem_Free | Bytes | The amount of physical RAM left unused by the system
Mem_Used | Bytes | Total used system memory, calculated as: Mem_Total - Mem_Free. This metric includes the space allocated in buffers and in the page cache, which can make it appear that a larger portion of physical RAM is being consumed than is actually in use. See Mem_ActualUsed below.
Mem_ActualFree | Bytes | Actual total free system memory, calculated as: Mem_Free + Buffers + Cached, where Buffers is the amount of physical RAM used for file buffers and Cached is the amount of physical RAM used as cache memory.
Mem_ActualUsed | Bytes | Actual total used system memory, calculated as: Mem_Total - Mem_ActualFree. This metric better represents the amount of physical RAM in use than Mem_Used.
Mem_UsedPercent | % | Percent total used system memory, calculated as: (Mem_Total - Mem_ActualFree) / Mem_Total
Mem_FreePercent | % | Percent total free system memory
Mem_PhysicalRam | Bytes | System random access memory
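
The derived memory metrics in the table can be reproduced from four raw byte counts. The following Python sketch is only an illustration of those formulas, not the collector's implementation. The function name and the sample values are hypothetical, and it assumes that Mem_UsedPercent is the table's ratio scaled by 100.

```python
def derived_memory_metrics(mem_total, mem_free, buffers, cached):
    """Apply the formulas from the memory metrics table to raw byte counts."""
    mem_used = mem_total - mem_free                  # includes buffers and page cache
    mem_actual_free = mem_free + buffers + cached    # free memory plus reclaimable memory
    mem_actual_used = mem_total - mem_actual_free    # memory genuinely in use
    # Assumption: the ratio in the table is multiplied by 100 to report a percentage.
    mem_used_percent = 100.0 * mem_actual_used / mem_total
    return {
        "Mem_Used": mem_used,
        "Mem_ActualFree": mem_actual_free,
        "Mem_ActualUsed": mem_actual_used,
        "Mem_UsedPercent": mem_used_percent,
    }


GIB = 1024 ** 3  # bytes per GiB

# Hypothetical host: 16 GiB of RAM, 2 GiB free, 1 GiB in buffers, 5 GiB cached.
sample = derived_memory_metrics(
    mem_total=16 * GIB, mem_free=2 * GIB, buffers=1 * GIB, cached=5 * GIB
)
print(sample)
```

For this hypothetical host, Mem_Used reports 14 GiB while Mem_ActualUsed reports 8 GiB, which is why Mem_ActualUsed is the better indicator of memory pressure.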

TCP Metrics

Metric | Units | Description
TCP_InboundTotal | Count | TCP inbound connection count
TCP_OutboundTotal | Count | TCP outbound connection count
TCP_Established | Count | TCP established connection count
TCP_Listen | Count | TCP listen connection count
TCP_Idle | Count | TCP idle connection count
TCP_Closing | Count | TCP closing connection count
TCP_CloseWait | Count | TCP close_wait connection count
TCP_Close | Count | TCP close connection count
TCP_TimeWait | Count | TCP time_wait connection count

Networking Metrics

Networking metrics have two additional dimensions:

  • Interface: Name of the network interface (example: eth0)
  • Description: Description of the network interface (example: Dual Band Wireless-AC 8265)

Networking metrics are cumulative, so you can use the rate operator to display these metrics as a rate per second. For example: metric=Net_InBytes Interface=eth0 | rate.

Metric | Units | Description
Net_InPackets | Packets | Number of received packets
Net_OutPackets | Packets | Number of sent packets
Net_InBytes | Bytes | Number of received bytes
Net_OutBytes | Bytes | Number of sent bytes
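
To see why a cumulative metric is usually viewed as a rate, consider how a per-second rate can be derived from successive counter samples. The Python sketch below illustrates the idea only. It is not how the Sumo Logic rate operator is implemented, the function name and sample data are hypothetical, and it does not handle counter resets.

```python
def per_second_rates(samples):
    """Convert (timestamp_seconds, cumulative_value) samples into per-second rates.

    Samples are assumed to be sorted by timestamp.
    """
    rates = []
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        elapsed = t1 - t0
        if elapsed <= 0:
            continue  # skip duplicate or out-of-order samples
        rates.append((t1, (v1 - v0) / elapsed))
    return rates


# Hypothetical Net_InBytes readings taken at a 60-second scan interval.
net_in_bytes = [(0, 1_000_000), (60, 1_600_000), (120, 1_900_000)]
rates = per_second_rates(net_in_bytes)
print(rates)  # [(60, 10000.0), (120, 5000.0)]
```

The raw counter only ever grows, so plotting it directly hides changes in traffic. The derived rate shows that throughput halved in the second interval.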

Disk Metrics

Disk metrics have two additional dimensions:

  • DevName: Device name, such as the mount name (example: udev)
  • DirName: Directory name, such as the mount directory (example: /dev)

Disk_Reads, Disk_Writes, Disk_ReadBytes, and Disk_WriteBytes are cumulative, so you can use the rate operator to display these metrics as a rate per second. For example: metric=Disk_WriteBytes | rate.

Metric | Units | Description
Disk_Reads | Operations | Number of physical disk reads
Disk_ReadBytes | Bytes | Number of physical disk bytes read
Disk_Writes | Operations | Number of physical disk writes
Disk_WriteBytes | Bytes | Number of physical disk bytes written
Disk_Queue | Operations | Number of disk queue operations
Disk_InodesAvailable* | Nodes | Number of free file nodes
Disk_Used | Bytes | Total used bytes on filesystem
Disk_UsedPercent | % | Percentage of filesystem space used
Disk_Available | Bytes | Total available bytes on filesystem

* Disk_InodesAvailable is not available on Windows platforms.

Time Intervals

The time interval determines how frequently the Source is scanned for metrics data. Sumo Logic supports pre-specified time intervals (10 seconds, 15 seconds, 30 seconds, 1 minute, and 5 minutes).

You can also specify a time interval in JSON by using the interval parameter, as follows:

"interval" : 60000

The JSON parameter is in milliseconds. We recommend 60 seconds (60000 ms) or longer granularity. Specifying a shorter interval will increase the message volume and could cause your deployment to incur additional charges.
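
Because the preset intervals are expressed in seconds or minutes while the JSON parameter is in milliseconds, a small conversion helper can prevent unit mistakes. The Python sketch below is a hypothetical helper, not part of any Sumo Logic tooling. It emits only the "interval" fragment shown above and flags values shorter than the recommended 60 seconds. Any other fields of the source configuration are outside its scope.

```python
import json

RECOMMENDED_MIN_MS = 60_000  # 60 seconds, the recommended minimum granularity


def interval_fragment(seconds):
    """Return the "interval" JSON fragment in milliseconds and a warning flag.

    The flag is True when the interval is shorter than the recommended minimum.
    """
    interval_ms = round(seconds * 1000)
    if interval_ms <= 0:
        raise ValueError("interval must be positive")
    return {"interval": interval_ms}, interval_ms < RECOMMENDED_MIN_MS


fragment, below_recommended = interval_fragment(60)
print(json.dumps(fragment))   # {"interval": 60000}
print(below_recommended)      # False
```

Calling interval_fragment(15) returns an interval of 15000 with the flag set to True, a reminder that shorter intervals increase message volume.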

AWS Metadata

Collectors running on AWS EC2 instances can optionally collect AWS Metadata such as EC2 tags to make it easier to search for Host Metrics. For more information, see AWS Metadata Source for Metrics.

Only one AWS Metadata Source for Metrics is required to collect EC2 tags from multiple hosts.

Installing the Host Metrics App

Now that you have configured Host Metrics, install the Sumo Logic App for Host Metrics to take advantage of the preconfigured searches and dashboards to analyze your Host Metrics data.

To install the app:

  1. From the Sumo Logic navigation, select App Catalog.
  2. In the Search Apps field, search for and then select your app.
  3. Optionally, you can scroll down to preview the dashboards included with the app. Then, click Install App (sometimes this button says Add Integration).
    Note: If your app has multiple versions, you'll need to select the version of the service you're using before installation.

  4. On the next configuration page, under Select Data Source for your App, complete the following fields:
    • Data Source. Select one of the following options:
      • Choose Source Category and select a source category from the list; or
      • Choose Enter a Custom Data Filter, and enter a custom source category beginning with an underscore. For example, _sourceCategory=MyCategory.
    • Folder Name. You can retain the existing name or enter a custom name of your choice for the app.
    • All Folders (optional). Default location is the Personal folder in your Library. If desired, you can choose a different location and/or click New Folder to add it to a new folder.
  5. Click Next.
  6. Look for the dialog confirming that your app was installed successfully.

Once an app is installed, it appears in your Personal folder or the folder that you specified. From there, you can share it with other users in your organization. Dashboard panels start to fill automatically with data that matches the panel's time range query and was received after the panel was created. Results won't be available immediately, but within about 20 minutes you'll see completed graphs and maps.

Viewing Host Metrics Dashboards

Overview

Overall Average CPU Idle. Displays the CPU idle time averaged across all hosts in a line chart on a timeline for the last hour. You can modify the list of hosts using the provided filters.

Overall Average CPU Load (1m, 5m, 15m). Shows the 1-minute, 5-minute, and 15-minute CPU load averages, averaged across all hosts, in a line chart on a timeline for the last hour.

Total Free System Memory per Host. Provides information on the total free system memory per host in a line chart on a timeline for the last hour.

Total Used, Less Buffers and Cached Memory per Host. Displays the total memory used less buffers and cached memory per host in a line chart on a timeline for the last hour.

Disk Used Bytes per Host. Shows the disk used bytes per host in a line chart on a timeline for the last hour.

Disk Available Bytes per Host. Provides the disk available bytes per host in a line chart on a timeline for the last hour.

Network InBytes Rate per Host. Displays the rate of network InBytes per host in a line chart on a timeline for the last hour.

Network OutBytes Rate per Host. Shows the rate of network OutBytes per host in a line chart on a timeline for the last hour.

CPU

CPU User Time per Host. Displays the CPU user time per host in a line chart on a timeline for the last hour.

Overall Average CPU User Time. Shows the CPU user time averaged across all hosts in a line chart on a timeline for the last hour.

CPU System Time per Host. Provides details on CPU system time per host in a line chart on a timeline for the last hour.

Overall Average CPU System Time. Displays the CPU system time averaged across all hosts in a line chart on a timeline for the last hour.

CPU 1 min Average Load per Host. Shows the CPU 1 minute average load per host in a line chart on a timeline for the last hour.

Overall Average CPU Load (1m, 5m, 15m). Provides the 1-minute, 5-minute, and 15-minute CPU load averages, averaged across all hosts, in a line chart on a timeline for the last hour.

CPU Idle Time per Host. Displays the CPU idle time per host in a line chart on a timeline for the last hour.

Overall Average CPU Idle Time. Shows the CPU idle time averaged across all hosts in a line chart on a timeline for the last hour.

CPU IO Wait Time per Host. Displays the CPU IO wait time per host in a line chart on a timeline for the last hour.

Disk

Disk Used Bytes per Host. Displays disk used bytes per host in a line chart on a timeline for the last hour.

Disk Available Bytes per Host. Shows disk available bytes per host in a line chart on a timeline for the last hour.

Disk Read Rate per Host. Provides details on disk read rate per host in a line chart on a timeline for the last hour.

Disk Read Byte Rate per Host. Displays disk read byte rate per host in a line chart on a timeline for the last hour.

Disk Write Rate per Host. Shows disk write rate per host in a line chart on a timeline for the last hour.

Disk Write Byte Rate per Host. Provides details on disk write byte rate per host in a line chart on a timeline for the last hour.

Memory

Total Memory per Host. Displays total memory per host in a line chart on a timeline for the last hour.

Percent Memory Used per Host. Shows percent memory used per host in a line chart on a timeline for the last hour.

Total Free, Buffers, and Cached Memory per Host. Provides details on the total free, buffers, and cached memory per host (from the Mem_ActualFree metric) in a line chart on a timeline for the last hour.

Total Used, Less Buffers, and Cached Memory per Host. Displays the total memory used, less buffers and cached memory, per host (from the Mem_ActualUsed metric) in a line chart on a timeline for the last hour.

Total Free Memory per Host. Shows the total free memory available per host in a line chart on a timeline for the last hour.

Total Used System Memory per Host. Provides details on the total used system memory per host in a line chart on a timeline for the last hour.

Network

Network InPacket Rate per Host. Displays network InPacket rate per host in a line chart on a timeline for the last hour.

Network OutPacket Rate per Host. Shows network OutPacket rate per host in a line chart on a timeline for the last hour.

Network InByte Rate per Host. Provides details on network InByte rate per host in a line chart on a timeline for the last hour.

Network OutByte Rate per Host. Displays network OutByte rate per host in a line chart on a timeline for the last hour.

TCP

Inbound Connections per Host. Displays inbound connections per host in a line chart on a timeline for the last hour.

Outbound Connections per Host. Shows outbound connections per host in a line chart on a timeline for the last hour.

Listen Connections per Host. Provides details on listen connections per host in a line chart on a timeline for the last hour.

Established Connections per Host. Displays established connections per host in a line chart on a timeline for the last hour.

CloseWait Connections per Host. Shows CloseWait connections per host in a line chart on a timeline for the last hour.

TimeWait Connections per Host. Provides details on TimeWait connections per host in a line chart on a timeline for the last hour.

Filters

The supported filters are:

  • _sourceCategory
  • _sourceHost
  • _source
  • _collector

Copyright © 2023 by Sumo Logic, Inc.