Collect logs and metrics for VMware ULM

This page shows you how to configure a system to collect data, install a Collector, and collect logs and metrics for the VMware ULM.

The Sumo Logic App for VMware ULM collects logs and metrics from your VMware cloud computing virtualization platform, then displays the data in predefined dashboards. The app enables you to monitor vCenter, ESXi host, and VM metrics and events.

Step 1: Set up a server, host, or VM to collect data

You can use either of the following two methods to set up a server to collect data for the VMware ULM App. This section provides instructions for both:

  • Use the Sumo Logic VM template to create a VM with the Sumo Logic scripts pre-installed and pre-configured. The Sumo Logic VM for log and event collection is an appliance (an Ubuntu virtual machine) that includes the vSphere SDK for Python (pyvmomi) and the Sumo Logic scripts.
  • Install the Sumo Logic scripts for events and metrics on a vCenter server, or on another host with access to the vCenter APIs.

Setting up a Sumo Logic VM to collect data

This section walks you through the process of setting up a Sumo Logic VM that has the necessary scripts to collect data for VMware pre-installed and pre-configured.

To set up a Sumo Logic VM with pre-installed scripts, do the following:
  1. Download the VM template (.zip file) from this link.
  2. Choose File > Deploy Template.
  3. Log in using the credentials sumoadmin/sumoadmin.
  4. Continue with step 6 of the following section to configure the script path (SCRIPT_PATH).

Installing Sumo Logic scripts on a vCenter server, another host, or VM

This section walks you through the process of installing the Sumo Logic scripts for events and metrics on a vCenter server, or on another host with access to the vCenter APIs. It also explains how to configure the path used to run the scripts, whether on a vCenter server, host, or VM.

To install and configure the Sumo Logic scripts, do the following:
  1. On the server, host, or VM, create a directory in which to put the Sumo Logic Scripts for VMware. We recommend that you name the directory /var/log/vmware, or something similar.

  2. Download the Sumo Logic VMware scripts, sumo-vsphere-ulm.zip, into the directory you just created.

  3. Install Python version 3.6 or later.

  4. Install pyvmomi 6.7.1: pip install pyvmomi==6.7.1

  5. Verify that the user account that will run the Sumo Logic VMware scripts has full read/write/execute permissions on the directories where the contents of sumo-vsphere-ulm.zip are extracted.
     

  6. Edit the cron_vcenter_events.sh script, changing the SCRIPT_PATH variable to the absolute path where the script resides, as in the example below.
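
For example, if you extracted the scripts into /var/log/vmware as suggested in step 1, the edited line would look similar to the following (the exact layout of your copy of the script may differ):

SCRIPT_PATH=/var/log/vmware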
     

Step 2: Download and install the Collector

An Installed Collector is a Java agent that receives logs and metrics from its Sources and then encrypts, compresses, and sends the data to the Sumo service. The Collector runs as a service and starts automatically after installing or rebooting.

To install a Collector for logs and metrics collection, refer to this link for installation instructions.

Step 3: Collect logs and metrics for the VMware ULM App

This section explains how to set up a vCenter server, host, or VM to collect logs and metrics for the Sumo Logic App for VMware ULM.

A. Collecting event messages

An event is an action that triggers an event message on a vCenter Server. Event messages are not logged, but are instead stored in the vCenter Server database. The Sumo Logic Collector for VMware retrieves these messages using the vSphere Python SDK (pyvmomi).
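
The bundled events.py script performs this retrieval for you, so no extra coding is required. Purely to illustrate the mechanism, a minimal pyvmomi sketch for reading recent events from vCenter might look like the following (the hostname and credentials are placeholders, and certificate verification is disabled only to keep the example short):

import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

# Connect to vCenter; certificate verification is skipped for brevity only.
context = ssl._create_unverified_context()
si = SmartConnect(host="vcenter.example.com", user="readonly@vsphere.local",
                  pwd="password", sslContext=context)

# Events live in the vCenter database and are exposed through the EventManager.
event_manager = si.RetrieveContent().eventManager
for event in event_manager.QueryEvents(vim.event.EventFilterSpec()):
    print(event.createdTime, type(event).__name__, event.fullFormattedMessage)

Disconnect(si)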

This procedure includes two tasks: configuring a Syslog Source for the Collector, and configuring the event logs to be collected.

To configure a syslog source for the Collector, do the following:
  1. Go to Manage Data > Collection > Collection, and click Add Source.
  2. Select Syslog for the Source type.

    [Screenshot: Syslog Source configuration (VMwareULM_SyslogSource.png)]
  3. Enter a Name to display for this Source. Source name metadata is stored in a searchable field called _sourceName.
  4. For Protocol choose TCP.
  5. Enter the correct Port number (for your Collector) for the Source to listen to, such as 1514.
  6. For Source Category, we recommend using vcenter_events.
  7. Under Advanced, set the following options:
    • Select Extract timestamp information from log file entries.
    • Select Ignore time zone from log file and instead use, and then choose UTC from the menu (as shown below).

    [Screenshot: Advanced options for the Syslog Source (VMwareULM_SyslogSource-Advanced.png)]

  8. Click Save.
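
Optionally, you can confirm that the new Source is listening by sending a test line over TCP from the machine that will run the scripts. In the following one-liner, collector-host is a placeholder for your Collector's address and 1514 is the port chosen in step 5; the test message should then appear in a search scoped to the Source Category you configured (for example, _sourceCategory=vcenter_events):

python -c "import socket; s = socket.create_connection(('collector-host', 1514)); s.sendall(b'vmware syslog source test\n'); s.close()"
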
To configure logs to be collected, do the following:
  1. Go to the directory for the Sumo Logic scripts, and run the events.py script, which queries the vCenter Server for events, with the following command:
python events.py -s [vcenterserver] -u [username] -p [password] -f output.txt
  2. Create a cron job to periodically run the cron_vcenter_events.sh script at the desired time interval, as in the example below.
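
For example, a crontab entry like the following would run the script every five minutes; the path and the interval are assumptions, so adjust them to match where you installed the scripts and how often you want to collect events:

*/5 * * * * /var/log/vmware/cron_vcenter_events.sh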

B. Collecting performance metrics

Collecting performance metrics involves using scripts that call the vCenter performance APIs to extract performance statistics.

Performance data collection for ESXi servers associated with a vCenter server works by getting data from each ESXi server in parallel, using multiple threads. The number of threads depends on the amount of data you are collecting and the frequency of the collection.

The number of threads can be controlled using a property THREADSIZE_POOL in the sumo.json config file. You can also control the number of objects processed by a single thread using the property BATCH_MORLIST_SIZE. The following is a description of all the configuration properties.

  • BATCH_MORLIST_SIZE: Number of objects processed simultaneously by a single thread when retrieving performance data. Default: 50.
  • THREADSIZE_POOL: Number of threads. Default: 5.
  • SSL_VERIFY: Set to True if using SSL. Default: False.
  • SSL_CAPATH: Absolute path to the certificate; required when SSL_VERIFY is True.
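
For illustration only, a sumo.json that keeps the defaults and enables certificate verification could contain entries like the following. The certificate path is a placeholder, your bundled file may contain additional settings, and the value types (booleans versus strings) should mirror whatever the bundled file already uses:

{
    "THREADSIZE_POOL": 5,
    "BATCH_MORLIST_SIZE": 50,
    "SSL_VERIFY": true,
    "SSL_CAPATH": "/etc/ssl/certs/vcenter-ca.pem"
}
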
To collect performance metrics, do the following:
  1. Follow the instructions to configure a Streaming Metrics Source.

    [Screenshot: Streaming Metrics Source configuration (VMwareULM_CollectPerformanceMetrics_dialog.png)]
  2. Edit the properties in the bundled sumo.json properties file, as necessary.
  3. Go to the directory for the Sumo Logic scripts, and run the esx_perf_metrics_6_5.py script, which queries the vCenter Server for performance statistics, with the following command:
python esx_perf_metrics_6_5.py -u [username] -p [password] -s [vcenter server] -t [target server] -to [target port] -cf [config filename]
  4. In Sumo Logic, verify that metrics are being captured.
  5. When you are satisfied with the batch and thread configurations, create a cron job to periodically run the cron_vcenter_metrics.sh script at the desired time interval (see the example below).
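
As with event collection, a crontab entry along these lines would run the metrics script every five minutes (the path and interval are assumptions to adjust for your environment):

*/5 * * * * /var/log/vmware/cron_vcenter_metrics.sh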

C. Collecting historical events

By default, the first time events.py is called, events from the past 24 hours are collected. Each time the script is called, it writes the timestamp of the last read event in a file named .timelog_events for the next call to pick up.

To collect events older than the past 24 hours, before setting up the cron job for cron_vcenter_events.sh, run the script as follows:

python events.py --server <vcenter server> --target <syslog host> --targetPort <syslog host port> --bT <time in UTC>

Step 4: Encrypt passwords

The scripts support symmetric authenticated cryptography, also known as secret key authentication, using the Python Fernet implementation from the cryptography package.

To use encryption, first generate a key from the Python command line:

>>> from cryptography.fernet import Fernet
>>> print(Fernet.generate_key())
b'xgb8NJ3ZYPJbzX6vWHySZbLd73bKWPsGMKoSnry7hL4='

Then encrypt the password, also from the Python command line:

>>> from cryptography.fernet import Fernet
>>> key = b'xgb8NJ3ZYPJbzX6vWHySZbLd73bKWPsGMKoSnry7hL4='
>>> s = Fernet(key)
>>> text = s.encrypt(b"secretpassword")
>>> print(text)
b'gAAAAABb6asvlRfxEj_ZQTKOyrqnGNMbfo_kpxrqv4DCO6TorS4FmKFzrepe0_xtiMT67ZT6OOf5bfrVZXNnUDFNlwPWrpFSfg=='
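
Before plugging these values into the scripts, you can confirm that the token round-trips by decrypting it in the same Python session; it should return the original password:

>>> print(s.decrypt(text))
b'secretpassword'
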
Modify the scripts to include the encrypted password and the key, as in the following examples.

Example for Metrics:

python esx_perf_metrics_6_5.py -u [username] -pK 'xgb8NJ3ZYPJbzX6vWHySZbLd73bKWPsGMKoSnry7hL4=' -p 'gAAAAABb6asvlRfxEj_ZQTKOyrqnGNMbfo_kpxrqv4DCO6TorS4FmKFzrepe0_xtiMT67ZT6OOf5bfrVZXNnUDFNlwPWrpFSfg==' -s 192.168.20.121 -t 127.0.0.1 -to 2003 -cf sumo.json -pE True

Example for Events:

python events.py -s 192.168.20.121 -u [username] -f outfile -pK 'xgb8NJ3ZYPJbzX6vWHySZbLd73bKWPsGMKoSnry7hL4=' -p 'gAAAAABb6asvlRfxEj_ZQTKOyrqnGNMbfo_kpxrqv4DCO6TorS4FmKFzrepe0_xtiMT67ZT6OOf5bfrVZXNnUDFNlwPWrpFSfg==' -pE True

Sample log message

2018-11-15 17:39:09.569 +0530 ,,, message=Error detected for sumo-win2k8-a-4 on xx1.sumolabs.com 
in Production1-West: Agent can't send heartbeats.msg size: 612, sendto() returned: Operation not 
permitted.,,,eventType=<class 'pyVmomi.VmomiSupport.vim.event.GeneralVmErrorEvent'>,,,
vm=ubuntu16.04-b-4,,,host=8df.sumolabs.com,,,datacenter=Production3-East,,,
computeResource=esx1.sumolabscluster.com,,,key=3553,,,chainId=3269

Sample query

The following query is from the vSphere Errors Trend panel of the vCenter Errors - Analysis Dashboard.

_sourceCategory = Labs/VMWare6.5 and ("error" or "fail" or "critical")
| parse "message=*,,," as err_msg
| parse "host=*,,," as esx_host
| parse "eventType=*,,," as event_type
| parse "vm=*,,," as vm nodrop
| parse "computeResource=*,,," as cluster
| where esx_host matches {{esx_host}} and cluster matches {{cluster}} and event_type matches {{event_type}}
| timeslice 1h
| count(err_msg) as err_count by _timeslice
| compare with timeshift 1d 7

Troubleshooting

  • The scripts need read and write access to their working directory in order to generate logs and maintain timestamp files.

  • Python must be installed, as the scripts are written in Python.

  • The scripts generate logs for each run under the configured working directory. Review these logs if problems arise.

  • If the Collector is not running but the scripts are, the metrics and events will be lost. In that case, once the Collector is running again, update the timestamps in the .timelog_events and .timelog_metrics files to the required start time. This allows the scripts to retrieve the older data; after they do, they continue with normal processing.