
Windows - OpenTelemetry Collector


The Sumo Logic App for Windows allows you to monitor the performance and resource utilization of hosts and processes that your mission-critical applications are dependent upon. In addition to that, our Windows App provides insight into your Windows system's operation and events so that you can better manage and maintain your environment.

The Windows App, which is based on the Windows event log format, consists of predefined searches and dashboards that give you real-time visibility into Security Status, System Activity, Updates, User Activity, and Applications. The dashboards also provide insight into CPU, memory, network, file descriptors, page faults, and TCP connections.


Fields Created in Sumo Logic for Windows

The following fields are created when the Windows App is installed, if they are not already present.

  • sumo.datasource. Has a fixed value of windows.

Log Types

The Windows App assumes events are coming from the Windows Event Log receiver in JSON format. It does not work with third-party logs.

Standard Windows event channels include:

  • Security
  • System
  • Application

Collection configuration and app installation

To set up data collection and install the app, select the app from the App Catalog and click Install App. Then follow the steps below.

Step 1: Set up Collector

note

If you want to use an existing OpenTelemetry Collector, you can skip this step by selecting the Use an existing Collector option.

To create a new Collector:

  1. Select the Add a new Collector option.
  2. Select the platform where you want to install the Sumo Logic OpenTelemetry Collector.

This will generate a command that you can execute in the machine environment you need to monitor. Once executed, it will install the Sumo Logic OpenTelemetry Collector.


Step 2: Configure integration

In this step, you will configure the YAML file required for Windows event log and metrics collection.

Any custom fields can be tagged along with the data in this step.

Enable process metric collection (Optional)

By default, the collector will not send process metrics to Sumo Logic. This is because the number of processes running on a host can be very large, which would result in a significant increase in Data Points per Minute (DPM).

Click the Enable process metric collection checkbox to collect process-level metrics (https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/receiver/hostmetricsreceiver/internal/scraper/processscraper/documentation.md).

  • Name of process. Add the list of process names.
  • Include/Exclude the above pattern. Specifies whether to include or exclude metrics for the processes listed above.
  • Match type for process name. Select whether the process names given should be matched strictly against the host machine's processes or treated as regular expressions.
note

If the process list needs to be edited in the future, you can edit it manually in the OTel config YAML by adding or removing entries in the names list under the process scraper.

process:
  include:
    names: [ <process name1>, <process name2> ... ]
    match_type: <strict|regexp>
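The difference between the two match types can be illustrated with a short Python sketch. This is not the collector's implementation; the function name `select_processes`, the `mode` parameter, and the example process names are hypothetical, and the sketch assumes that regexp patterns are searched anywhere in the process name unless anchored.

```python
import re


def select_processes(running, names, match_type="strict", mode="include"):
    """Return the running process names whose metrics would be kept.

    Illustrative only: mimics the idea of strict vs. regexp matching,
    not the collector's actual code.
    """
    if match_type == "strict":
        # Strict: the process name must equal one of the listed names.
        def matches(proc):
            return proc in names
    elif match_type == "regexp":
        # Regexp: each listed name is treated as a regular expression
        # (assumed unanchored, so it may match anywhere in the name).
        patterns = [re.compile(name) for name in names]

        def matches(proc):
            return any(p.search(proc) for p in patterns)
    else:
        raise ValueError(f"unknown match_type: {match_type}")

    keep_on_match = mode == "include"
    return [proc for proc in running if matches(proc) == keep_on_match]


# Hypothetical process list for illustration.
running = ["svchost.exe", "sqlservr.exe", "chrome.exe"]
print(select_processes(running, ["svchost.exe"], "strict"))
print(select_processes(running, [r"^s.*\.exe$"], "regexp"))
```

With strict matching only exact names are selected, while a single regular expression can cover a family of processes, which is usually the more compact choice when process names vary by version or instance.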

Click the Download YAML File button to download the YAML file.

Step 3: Send logs to Sumo

Once you have downloaded the YAML file as described in the previous step, follow the steps below.

  1. Copy the YAML file to the C:\ProgramData\Sumo Logic\OpenTelemetry Collector\config\conf.d folder on the machine that needs to be monitored.
  2. Restart the collector using:
    Restart-Service -Name OtelcolSumo

After successfully executing the above command, Sumo Logic will start receiving data from your host machine.
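If you need to repeat these two steps across several hosts, they can be scripted. The following is a minimal Python sketch, not an official tool: the function name `deploy_config` and its parameters are hypothetical, it assumes PowerShell is available on the PATH, and it only attempts the restart when run on Windows.

```python
import platform
import shutil
import subprocess
from pathlib import Path

# conf.d folder named in step 1 above.
CONF_DIR = Path(r"C:\ProgramData\Sumo Logic\OpenTelemetry Collector\config\conf.d")


def deploy_config(yaml_file, conf_dir=CONF_DIR, restart=True):
    """Copy the downloaded YAML file into conf.d, then restart the collector."""
    destination = Path(conf_dir) / Path(yaml_file).name
    shutil.copy(yaml_file, destination)

    # Same command as step 2; skipped on non-Windows hosts.
    if restart and platform.system() == "Windows":
        subprocess.run(
            ["powershell", "-Command", "Restart-Service -Name OtelcolSumo"],
            check=True,
        )
    return destination
```

Restarting the service typically requires an elevated (administrator) session, so run the script accordingly.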

Click Next. This will install the app (dashboards and monitors) to your Sumo Logic Org.

Dashboard panels will start to fill automatically. Each panel fills with data that matches the time range query and was received since the panel was created. Results won't be available immediately, but within 20 minutes you'll see full graphs and maps.

Sample Metrics Message

{
"queryId":"A",
"_source":"windows-otel-metric",
"_metricId":"tYzy7VHWrdxuGHOkPRT5pA",
"_sourceName":"Http Input",
"os.type":"windows",
"sumo.datasource":"windows",
"direction":"transmit",
"_sourceCategory":"Labs/windows-otel",
"_contentType":"Carbon2",
"host.name":"EC2AMAZ-T30T53R.ec2.internal",
"metric":"system.network.io",
"_collectorId":"000000000CEC8ECC",
"_sourceId":"0000000044DB46EF",
"unit":"By",
"_collector":"Labs - windows-otel",
"device":"Loopback_Pseudo-Interface_1",
"max":289495780,
"min":0,
"avg":229918329.73,
"sum":3448774946,
"latest":289485558,
"count":15
}
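The aggregate fields in this sample are related to one another: avg is sum divided by count. The short Python check below illustrates this, assuming avg is rounded to two decimal places as it appears in the sample. The helper name `mean_from_aggregates` is hypothetical.

```python
# Aggregate fields copied from the sample metrics message above.
sample = {
    "max": 289495780,
    "min": 0,
    "avg": 229918329.73,
    "sum": 3448774946,
    "latest": 289485558,
    "count": 15,
}


def mean_from_aggregates(point):
    """Recompute the average from sum and count, rounded to two decimals."""
    return round(point["sum"] / point["count"], 2)


print(mean_from_aggregates(sample) == sample["avg"])
```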

Sample Queries

This sample metrics query is from the Host Metric - CPU dashboard > CPU User Time panel.

Metrics Query String
sumo.datasource=windows host.name={{host.name}} cpu=cpu0  metric=system.cpu.utilization state=user | avg by host.name

This sample log query is from the Windows - Overview dashboard > System Restarts panel.

Log Query String
%"sumo.datasource"=windows  "\"channel\":\"Security\""
| json "event_id", "computer", "message", "channel" as event_id_obj, host.name, msg_summary, channel nodrop 
| json field=event_id_obj "id" as event_id
| parse regex field=msg_summary "(?<msg_summary>.*\.*)" nodrop
| where event_id = "4608" and channel = "Security" and host.name matches "{{host.name}}"
| count as Restarts

Sample Logs

{
"record_id":"6316",
"channel":"Application",
"event_data":"",
"task":"0",
"provider":"{\"name\":\"Microsoft-Windows-Security-SPP\",\"guid\":\"{E23B33B0-C8C9-472C-A5F9-F2BDFEA0F156}\",\"event_source\":\"Software Protection Platform Service\"}",
"system_time":"2023-01-20T15:22:02+0000816Z",
"computer":"EC2AMAZ-T30T53R",
"opcode":"0",
"keywords":"Classic",
"message":"Offline downlevel migration succeeded.",
"event_id":"{\"id\":\"16394\",\"qualifiers\":\"49152\"}",
"level":"Information"
}
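In the sample log, event_id and provider are JSON objects encoded as strings, which is why the log query extracts event_id in two steps. The Python sketch below mirrors that two-step parse and the restart filter (event ID 4608 on the Security channel). It is illustrative only: the function names are hypothetical, and the Security record is a made-up example, not real collected data.

```python
import json


def extract_event(raw):
    """Parse a raw log line and unwrap the string-encoded event_id object."""
    record = json.loads(raw)
    # Second parse: event_id is itself a JSON string such as {"id": "..."}.
    event_id = json.loads(record["event_id"])["id"]
    return {
        "event_id": event_id,
        "channel": record["channel"],
        "host": record["computer"],
        "message": record["message"],
    }


def count_restarts(raw_records):
    """Count events with ID 4608 on the Security channel, as the query does."""
    events = (extract_event(raw) for raw in raw_records)
    return sum(
        1 for e in events
        if e["event_id"] == "4608" and e["channel"] == "Security"
    )


# Trimmed copy of the sample log above.
application_event = json.dumps({
    "channel": "Application",
    "computer": "EC2AMAZ-T30T53R",
    "message": "Offline downlevel migration succeeded.",
    "event_id": json.dumps({"id": "16394", "qualifiers": "49152"}),
})

# Hypothetical Security event, invented for illustration.
security_event = json.dumps({
    "channel": "Security",
    "computer": "EC2AMAZ-T30T53R",
    "message": "Windows is starting up.",
    "event_id": json.dumps({"id": "4608", "qualifiers": ""}),
})

print(count_restarts([application_event, security_event]))
```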

Viewing Windows Event Log-Based Dashboards

Windows - Overview

The Windows - Overview dashboard provides insights into fatal or warning messages, policy changes, system restarts, and changes to administrative groups.

Use this dashboard to:

  • Monitor systems experiencing fatal errors, warnings, and system restarts.
  • View system login attempts. 
  • Monitor policy changes performed on the system.
  • Monitor services installed on the systems.
  • Monitor the number of changes performed on the Administrative groups.

Windows - Default

The Windows - Default dashboard provides information about the start and stop operations for Windows services, Windows events, operations events, and Errors and Warnings.

Use this dashboard to:

  • Monitor services being stopped and started on the system.
  • Monitor event types (channels) collected from the system.
  • Monitor log level (error, warning) trend on the systems.
  • Monitor operations performed on the system like restarts, user creation, group creation, and firewall changes.

Windows - Login Status

The Windows - Login Status dashboard provides information about successful and failed logins, successful Remote Desktop Protocol (RDP) reconnects, and failed login outliers.

Use this dashboard to:

  • Monitor successful and failed logins by the user and track their locations with successful and failed login attempts.
  • Monitor RDP reconnect events.
  • Track failed login outliers to identify suspicious login activity.

Windows - Event Errors

The Windows - Event Errors dashboard provides insights into error keyword trends and outliers.

Use this dashboard to:

  • Monitor various errors in the systems.
  • Monitor error trends and outliers to confirm they are within acceptable limits and to decide on next steps.

Windows - Application

The Windows - Application dashboard provides detailed information about install, uninstall, and event trends.

Use this dashboard to:

  • Monitor the installation and uninstallation of applications on the system.
  • Monitor log levels (error, warning, information) through trends and quick snapshots.
  • Monitor various application-specific events happening through recent messages.

Windows - Host Metric Based Dashboards 

Host Metrics - Overview

The Host Metrics - Overview dashboard gives you an at-a-glance view of the key metrics like CPU, memory, disk, network, and TCP connections of all your hosts. You can drill down from this dashboard to the Host Metrics - CPU/Disk/Memory/Network/TCP dashboard by using the honeycombs or line charts in all the panels.

Use this dashboard to:

  • Identify hosts with high CPU, disk, and memory utilization, and identify anomalies over time.

Host Metrics - CPU

The Host Metrics - CPU dashboard provides a detailed analysis based on CPU metrics. You can drill down from this dashboard to the Process Metrics - Details dashboard by using the honeycombs or line charts in all the panels.

Use this dashboard to:

  • Identify hosts and processes with high CPU utilization.
  • Examine CPU usage by type and identify anomalies over time.

Host Metrics - Disk

The Host Metrics - Disk dashboard provides detailed information about disk utilization and disk IO operations. You can drill down from this dashboard to the Process Metrics - Details dashboard by using the honeycombs or line charts in all the panels.

Use this dashboard to:

  • Identify hosts with high disk utilization and disk IO operations.
  • Monitor abnormal spikes in read/write rates.
  • Compare disk throughput across storage devices of a host.

Host Metrics - Memory

The Host Metrics - Memory dashboard provides detailed information on host memory usage, memory distribution, and swap space utilization. You can drill down from this dashboard to the Process Metrics - Details dashboard by using the honeycombs or line charts in all the panels.

Use this dashboard to:

  • Identify hosts with high memory utilization.
  • Examine memory distribution (free, buffered-cache, used, total) for a given host.
  • Monitor abnormal spikes in memory and swap utilization.

Host Metrics - Network

The Host Metrics - Network dashboard provides detailed information on host network errors, throughput, and packets sent and received.

Use this dashboard to:

  • Determine top hosts with network errors and dropped packets.
  • Monitor abnormal spikes in incoming/outgoing packets and bytes sent and received.
  • Use dashboard filters to compare throughput across the interface of a host.

Host Metrics - TCP

The Host Metrics - TCP dashboard provides detailed information around inbound, outbound, open, and established TCP connections.

Use this dashboard to:

  • Identify abnormal spikes in inbound, outbound, open, or established connections.

Process Metrics - Overview

The Process Metrics - Overview dashboard gives you an at-a-glance view of all the processes by open file descriptors, CPU usage, memory usage, disk read/write operations, and thread count.

Use this dashboard to:

  • View the process-wise distribution of CPU and memory usage.
  • View process-wise read/write operations.

Process Metrics - Details

The Process Metrics - Details dashboard gives you a detailed view of key process-related metrics such as CPU and memory utilization, disk read/write throughput, and major/minor page faults.

Use this dashboard to:

  • Determine the number of open file descriptors in all hosts. If the number of open file descriptors reaches the maximum file descriptor limits, it can cause IOException errors.
  • Identify anomalies in CPU usage, memory usage, major/minor page faults and reads/writes over time.
  • Troubleshoot memory leaks using the resident set memory trend chart.

Copyright © 2023 by Sumo Logic, Inc.