
Install the Host and Process Metrics app, Alerts, and view the Dashboards

This page provides instructions for installing the Sumo Logic app and alerts for hosts and processes, and describes each of the app dashboards. These instructions assume you have already set up collection as described in the Collect Metrics from Host and Processes App page.

Pre-Packaged Alerts

Sumo Logic provides out-of-the-box alerts, available through Sumo Logic monitors, to help you monitor your hosts and processes. These alerts are built on metrics and logs datasets and include preset thresholds based on industry best practices and recommendations.

For details on the individual alerts, please see this page.

Installing Alerts

  • To install these alerts, you need to have the Manage Monitors role capability.
  • Alerts can be installed either by importing a JSON file or by using a Terraform script.

Method 1: Install the alerts by importing a JSON file

  1. Download the JSON file describing all the monitors. 

    1. The alerts in the JSON file are based on Sumo Logic searches that have no scope filters, so they apply to all hosts whose data has been collected by following the instructions in the previous sections. To restrict these alerts to specific hosts or environments, update the JSON file by replacing the text $$hostmetrics_data_source with <your sourceCategory>.

      SourceCategory examples: 

      1. For alerts applicable only to a specific cluster of hosts, your custom filter could be: ‘_sourceCategory=yourclustername/metrics’.

      2. For alerts applicable to all hosts that start with ec2hosts-prod, your custom filter could be: ‘_sourceCategory=ec2hosts-prod*/metrics’.

      3. For alerts applicable to a specific cluster within a production environment, your custom filter could be: ‘_sourceCategory=prod/yourclustername/metrics’.

  2. Go to Manage Data > Alerts > Monitors.

  3. Click Add.

  4. Click Import to import monitors from the JSON above.
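
The placeholder replacement described in step 1 can also be scripted rather than done by hand. The following is a minimal sketch, assuming the downloaded file contains the literal text $$hostmetrics_data_source inside JSON string values. The function name scope_monitors and the one-monitor sample are hypothetical stand-ins for illustration, not part of the Sumo Logic package.

```python
import json

PLACEHOLDER = "$$hostmetrics_data_source"


def scope_monitors(raw_json: str, source_filter: str) -> str:
    """Return the monitor JSON text with the placeholder replaced by a filter."""
    if PLACEHOLDER not in raw_json:
        raise ValueError(f"placeholder {PLACEHOLDER!r} not found")
    # Escape the filter the way a JSON string body requires (quotes, backslashes).
    escaped = json.dumps(source_filter)[1:-1]
    scoped = raw_json.replace(PLACEHOLDER, escaped)
    json.loads(scoped)  # fail early if the result is no longer valid JSON
    return scoped


# Hypothetical stand-in for the downloaded file; the real export is larger
# and its field names may differ.
sample = json.dumps({"name": "demo", "query": "$$hostmetrics_data_source metric=CPU_Total"})
scoped = scope_monitors(sample, "_sourceCategory=prod/yourclustername/metrics")
print(scoped)
```

To use this on the real file, read its contents into a string, pass it through scope_monitors, and write the result to a new file before importing it.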

Method 2: Install the alerts using a Terraform script

Generate a Sumo Logic access key and ID

Generate an access key and access ID for a user that has the Manage Monitors role capability in Sumo Logic using these instructions. Please identify which deployment your Sumo Logic account is in, using this link.

Download the Sumo Logic Terraform package for Host and Process alerts

The alerts package is available in the Sumo Logic GitHub repository. You can either download it through the “git clone” command or as a zip file. 

Alert Configuration 

After the package has been extracted, navigate to the package directory terraform-sumologic-sumo-logic-monitor/monitor_packages/host_process_metrics/

Edit the host_and_processes.auto.tfvars file and add the Sumo Logic access key, access ID, and deployment that you obtained in the Generate a Sumo Logic access key and ID step.

access_id   = "<SUMOLOGIC ACCESS ID>"

access_key  = "<SUMOLOGIC ACCESS KEY>"

environment = "<SUMOLOGIC DEPLOYMENT>"

Update the variable ‘host_and_processes_data_source’ with your source category:

  1. SourceCategory examples: 

    1. For alerts applicable only to a specific cluster of hosts, your custom filter could be: ‘_sourceCategory=yourclustername/metrics’.

    2. For alerts applicable to all hosts that start with ec2hosts-prod, your custom filter could be: ‘_sourceCategory=ec2hosts-prod*/metrics’.

    3. For alerts applicable to a specific cluster within a production environment, your custom filter could be: ‘_sourceCategory=prod/yourclustername/metrics’.

All monitors are disabled by default on installation. If you would like to enable all the monitors, set the parameter monitors_disabled to false in this file.

By default, the monitors are configured in a monitor folder called “Host and Process Metrics”. To change the name of the folder, update the monitor folder name in this file.

If you would like the alerts to send email or connection notifications, configure these in the file host_process_metrics_notifications.auto.tfvars. For configuration examples, refer to the next section.
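
Before running Terraform, it can help to confirm that the three credential settings in host_and_processes.auto.tfvars are no longer placeholders. The sketch below is one way to do that, assuming the file uses flat key = "value" lines as shown above. The helper name unfilled_settings and the sample values (including the deployment code us2) are illustrative assumptions, and the regular expression is not a full HCL parser.

```python
import re

REQUIRED = ("access_id", "access_key", "environment")

# Matches flat lines such as: access_id = "abc". Not a full HCL parser.
LINE = re.compile(r'^\s*(\w+)\s*=\s*"(.*)"\s*$')


def unfilled_settings(tfvars_text: str) -> list[str]:
    """Return required keys that are missing, empty, or still placeholders."""
    values = {}
    for line in tfvars_text.splitlines():
        match = LINE.match(line)
        if match:
            values[match.group(1)] = match.group(2)
    return [
        key for key in REQUIRED
        if not values.get(key) or values[key].startswith("<")
    ]


# Illustrative file contents; access_id is deliberately left as a placeholder.
sample = '''
access_id   = "<SUMOLOGIC ACCESS ID>"
access_key  = "example-key"
environment = "us2"
'''
print(unfilled_settings(sample))
```

An empty list means all three settings have been filled in.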

Email and Connection Notification Configuration Examples

To configure notifications, modify the host_process_metrics_notifications.auto.tfvars file and fill in the connection_notifications and email_notifications sections. See the examples for PagerDuty and email notifications below. See this document for creating payloads with other connection types.

PagerDuty Connection Example:
connection_notifications = [
    {
      connection_type       = "PagerDuty",
      connection_id         = "<CONNECTION_ID>",
      payload_override      = "{\"service_key\": \"your_pagerduty_api_integration_key\",\"event_type\": \"trigger\",\"description\": \"Alert: Triggered {{TriggerType}} for Monitor {{Name}}\",\"client\": \"Sumo Logic\",\"client_url\": \"{{QueryUrl}}\"}",
      run_for_trigger_types = ["Critical", "ResolvedCritical"]
    },
    {
      connection_type       = "Webhook",
      connection_id         = "<CONNECTION_ID>",
      payload_override      = "",
      run_for_trigger_types = ["Critical", "ResolvedCritical"]
    }
  ]

Replace <CONNECTION_ID> with the connection ID of the webhook connection. You can retrieve the webhook connection ID by calling the Monitors API.
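
The payload_override value in the PagerDuty example is a JSON document stored inside a double-quoted string, so every inner quote is escaped. Writing those escapes by hand is error-prone. The sketch below generates the value instead, assuming the payload contains no ${ or %{ sequences, which Terraform would treat as template syntax. The variable names are illustrative.

```python
import json

# Same fields as the PagerDuty example above.
payload = {
    "service_key": "your_pagerduty_api_integration_key",
    "event_type": "trigger",
    "description": "Alert: Triggered {{TriggerType}} for Monitor {{Name}}",
    "client": "Sumo Logic",
    "client_url": "{{QueryUrl}}",
}

# First dumps: the JSON document that the connection sends.
payload_json = json.dumps(payload)
# Second dumps: wrap that document in a double-quoted string with \" escapes,
# which is the form used for payload_override in the example above.
hcl_value = json.dumps(payload_json)

print(f"payload_override = {hcl_value}")
```

The printed line can be pasted into the connection_notifications entry. Decoding the string twice with json.loads returns the original dictionary, which is a quick way to check that the escaping is intact.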

Email Notifications Example:
email_notifications = [
    {
      connection_type       = "Email",
      recipients            = ["abc@example.com"],
      subject               = "Monitor Alert: {{TriggerType}} on {{Name}}",
      time_zone             = "PST",
      message_body          = "Triggered {{TriggerType}} Alert on {{Name}}: {{QueryURL}}",
      run_for_trigger_types = ["Critical", "ResolvedCritical"]
    }
  ]

Install the Alerts

  1. Navigate to the package directory terraform-sumologic-sumo-logic-monitor/monitor_packages/host_process_metrics/ and run terraform init. This initializes Terraform and downloads the required components.
  2. Run terraform plan to view the monitors that Terraform will create or modify.
  3. Run terraform apply.
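
The three steps above can be wrapped in a small script so that a failed step stops the sequence. This is a minimal sketch, assuming the terraform binary is on the PATH and the package was extracted to the directory named in step 1. The function name run_steps and its dry_run flag are hypothetical conveniences, not part of the package.

```python
import subprocess

PACKAGE_DIR = "terraform-sumologic-sumo-logic-monitor/monitor_packages/host_process_metrics"

# The three commands from the steps above, in order.
STEPS = [
    ["terraform", "init"],
    ["terraform", "plan"],
    ["terraform", "apply"],
]


def run_steps(workdir: str = PACKAGE_DIR, dry_run: bool = True) -> list[str]:
    """Run each Terraform step in order, stopping at the first failure."""
    executed = []
    for cmd in STEPS:
        executed.append(" ".join(cmd))
        if not dry_run:
            # check=True raises CalledProcessError on a non-zero exit code,
            # so apply never runs after a failed init or plan.
            subprocess.run(cmd, cwd=workdir, check=True)
    return executed


print(run_steps())  # dry run: lists the commands without executing them
```

Call run_steps(dry_run=False) to execute the commands. Terraform still prompts for confirmation during apply, because the script does not pass any auto-approval flag.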

Post Installation

If you haven’t enabled alerts or configured notifications through the Terraform procedure outlined above, we highly recommend enabling alerts of interest and configuring each enabled alert to send notifications to other people or services. This is detailed in Step 4 of this document.

Install the app

This section demonstrates how to install the Host and Process Metrics App.

Now that you have set up log and metric collection for the Host and Process Metrics App, you can install the Sumo Logic App for Host and Processes and use the preconfigured searches and dashboards.

To install the app, do the following:

Locate and install the app you need from the App Catalog. If you want to see a preview of the dashboards included with the app before installing, click Preview Dashboards.

  1. From the App Catalog, search for and select the app. 

  2. Select the version of the service you're using and click Add to Library.

  3. To install the app, complete the following fields.

    1. App Name. You can retain the existing name, or enter a name of your choice for the app.

    2. Data Source. Select either of these options for the data source.

      • Choose Source Category, and select a source category from the list.

      • Choose Enter a Custom Data Filter, and enter a custom source category beginning with an underscore. Example: (_sourceCategory=MyCategory).

    3. Advanced. Select the Location in Library (the default is the Personal folder in the library), or click New Folder to add a new folder.

  4. Click Add to Library.

Once an app is installed, it will appear in your Personal folder, or another folder that you specified. From here, you can share it with your organization. 

Panels will start to fill automatically. It's important to note that each panel slowly fills with data matching the time range query and received since the panel was created. Results won't immediately be available, but with a bit of time, you'll see full graphs and maps.

Filters with template variables   

Template variables provide dynamic dashboards that can rescope data on the fly. As you apply variables to troubleshoot through your dashboard, you view dynamic changes to the data for a quicker resolution to the root cause. For more information, see the Filter with template variables help page.

Dashboards

Host Metrics - Overview Dashboard

The Host Metrics - Overview dashboard gives you an at-a-glance view of the key metrics like CPU, memory, disk, network, and TCP connections of all your hosts. You can drill down from this dashboard to the Host Metrics - CPU/Disk/Memory/Network/TCP dashboard by using the honeycombs or line charts in all the panels.

Use this dashboard to:

  • Identify hosts with high CPU, disk, or memory utilization, and identify anomalies over time.

Host Metrics - CPU

The Host Metrics - CPU dashboard provides a detailed analysis based on CPU metrics. You can drill down from this dashboard to the Process Metrics - Details dashboard by using the honeycombs or line charts in all the panels.

Use this dashboard to:

  • Identify hosts and processes with high CPU utilization.
  • Examine CPU usage by type and identify anomalies over time.

Host Metrics - Disk

The Host Metrics - Disk dashboard provides detailed information on disk utilization and disk IO operations. You can drill down from this dashboard to the Process Metrics - Details dashboard by using the honeycombs or line charts in all the panels.

Use this dashboard to:

  • Identify hosts with high disk utilization and disk IO operations.
  • Monitor abnormal spikes in read/write rates.
  • Compare disk throughput across storage devices of a host.

Host Metrics - Memory

The Host Metrics - Memory dashboard provides detailed information on host memory usage, memory distribution, and swap space utilization. You can drill down from this dashboard to the Process Metrics - Details dashboard by using the honeycombs or line charts in all the panels.

Use this dashboard to:

  • Identify hosts with high memory utilization.
  • Examine memory distribution (free, buffered-cache, used, total) for a given host. 
  • Monitor abnormal spikes in memory and swap utilization.

Host Metrics - Network

The Host Metrics - Network dashboard provides detailed information on host network errors, throughput, and packets sent and received.

Use this dashboard to:

  • Determine top hosts with network errors and dropped packets. 
  • Monitor abnormal spikes in incoming/outgoing packets and bytes sent and received.
  • Use dashboard filters to compare throughput across the interface of a host.

Host Metrics - TCP

The Host Metrics - TCP dashboard provides detailed information on inbound, outbound, open, and established TCP connections.

Use this dashboard to:

  • Identify abnormal spikes in inbound, outbound, open, or established connections.

Process Metrics - Overview Dashboard

The Process Metrics - Overview dashboard gives you an at-a-glance view of all the processes by open file descriptors, CPU usage, memory usage, disk read/write operations, and thread count. You can drill down from this dashboard to the Process Metrics - Details dashboard by using the honeycombs or line charts in all the panels.

Use this dashboard to:

  • Identify top processes by CPU, memory usage, and open file descriptors.
  • Determine the longest-running processes and the users that have spawned the most processes.

Process Metrics - Details Dashboard

The Process Metrics - Details dashboard gives you a detailed view of key process related metrics such as CPU and memory utilization, disk read/write throughput, and major/minor page faults.

Use this dashboard to:

  • Determine the number of open file descriptors in all hosts. If the number of open file descriptors reaches the maximum file descriptor limit, it can cause IOException errors.
  • Identify anomalies in CPU usage, memory usage, major/minor page faults, and reads/writes over time.
  • Troubleshoot memory leaks using the resident set memory trend chart.

Process Metrics - Trends Dashboard

The Process Metrics - Trends dashboard gives you insight into the state of your processes over time.

Use this dashboard to:

  • Analyze the current state of all the processes (sleeping, dead, idle, stopped, total, paging).
  • Identify anomalies over time in the number of threads, zombie processes, and total processes.