Sumo Logic

Collect Kafka Logs and Metrics for Non-Kubernetes environments

This page provides instructions for configuring log and metric collection for the Sumo Logic App for Kafka in a non-Kubernetes environment.

We use Telegraf for Kafka metric collection and the Sumo Logic Installed Collector for collecting Kafka logs. The diagram below illustrates the components of the Kafka collection in a non-Kubernetes environment. Telegraf runs on the same system as Kafka. It uses the Kafka Jolokia input plugin to obtain Kafka metrics and the Sumo Logic output plugin to send the metrics to Sumo Logic. Kafka logs are sent to a Sumo Logic Local File Source on an Installed Collector.

This section provides instructions for configuring metrics and log collection for the Sumo Logic App for Kafka. Follow the steps below to set up collection for a given Broker in your Kafka cluster:

  1. Configure Metrics Collection

    1. Configure a Hosted Collector

    2. Configure an HTTP Logs and Metrics Source

    3. Install Telegraf 

    4. Download and set up Jolokia

    5. Configure the Jolokia Input Plugin

    6. Restart Telegraf

  2. Configure Logs Collection

    1. Configure logging in Kafka

    2. Configure Sumo Logic Installed Collector

Step 1: Configure Collection of Kafka Metrics

  1. Configure a Hosted Collector

To create a new Sumo Logic hosted collector, perform the steps in the Configure a Hosted Collector section of the Sumo Logic documentation.

  2. Configure an HTTP Logs and Metrics Source

Create a new HTTP Logs and Metrics Source in the hosted collector created above by following these instructions. Make a note of the HTTP Source URL.

  3. Install Telegraf

Follow the steps in this document to install Telegraf on each Kafka Broker node.

  4. Download and set up Jolokia on each Kafka Broker node

To collect metrics, Telegraf uses the Jolokia input plugin to get data from Kafka and the Sumo Logic output plugin to send that data to Sumo Logic.

  • Download the latest version of the Jolokia JVM-Agent from Jolokia.
  • Rename the downloaded JAR file to jolokia-agent.jar.
  • Save jolokia-agent.jar on your Kafka server in /opt/kafka/libs.
  • Configure Kafka to use Jolokia:
  1. Add the following to the environment used to start the Kafka broker:

export JMX_PORT=9999
export KAFKA_JMX_OPTS="-javaagent:/opt/kafka/libs/jolokia-agent.jar=port=8778,host=$RMI_HOSTNAME -Djava.rmi.server.hostname=$RMI_HOSTNAME -Dcom.sun.management.jmxremote.rmi.port=$JMX_PORT"
  2. Restart the Kafka service.

  3. Verify that you can access Jolokia on port 8778 using the following command:

curl http://KAFKA_SERVER_IP_ADDRESS:8778/jolokia/
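
The verification step above can be scripted so it is easy to repeat on every broker. The following is a minimal sketch, not an official Sumo Logic script: the function names jolokia_url and check_jolokia are hypothetical, and it assumes the agent listens on port 8778 as configured above. It queries the Jolokia version endpoint, which the agent answers with a small JSON document.

```shell
# Build a Jolokia URL from a host, a port (default 8778) and an optional request path.
jolokia_url() {
  host="$1"
  port="${2:-8778}"
  path="${3:-}"
  printf 'http://%s:%s/jolokia/%s\n' "$host" "$port" "$path"
}

# Ask the agent for its version; --fail makes curl exit non-zero on HTTP errors,
# and --max-time keeps the check from hanging when the port is unreachable.
check_jolokia() {
  curl --silent --fail --max-time 5 "$(jolokia_url "$1" "${2:-8778}" version)"
}

# Print the base URL used in the manual check; replace the placeholder host.
jolokia_url KAFKA_SERVER_IP_ADDRESS 8778
```

Running check_jolokia with the broker's address should print a JSON response; a non-zero exit status suggests the agent is not loaded or the port is blocked.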
  5. Configure the Jolokia Input Plugin

Create or modify the telegraf.conf file in /etc/telegraf/telegraf.d and copy and paste the text from this file.  

Please enter values for the following parameters (marked with CHANGE_ME) in the downloaded file:

  • In the input plugins section, which is [[inputs.jolokia2_agent]]:

    • urls - The URL of the Kafka server. This can be a comma-separated list to connect to multiple Kafka servers. Please see this doc for more information on additional parameters for configuring the Jolokia input plugin for Telegraf.

    • In the tags sections (3 in total), which include [inputs.jolokia2_agent.tags] and [inputs.disk.tags]:

      • environment - This is the deployment environment where the Kafka cluster identified by the value of urls parameter resides. For example: dev, prod or qa. While this value is optional we highly recommend setting it. 

      • messaging_cluster - Enter a name to identify this Kafka cluster. This cluster name will be shown in the Sumo Logic dashboards. 

  • In the output plugins section

    • url - This is the HTTP Source URL you noted when configuring the HTTP Logs and Metrics Source (step 2). Please see this doc for more information on additional parameters for configuring the Sumo Logic Telegraf output plugin.

The Telegraf configuration also sets the values listed below. Do not modify them, because changing them will prevent the Sumo Logic apps from functioning correctly.

  • data_format: “prometheus” - In the output plugins section. Metrics are sent to Sumo Logic in the Prometheus format.
  • component: “messaging” - In the input plugins sections. Sumo Logic apps use this value to identify application components.
  • messaging_system: “kafka” - In the input plugins sections. This value identifies the messaging system.

Here is an example telegraf.conf file. 

For all other parameters, please see this doc, which describes the properties that can be configured globally in the Telegraf agent.
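
To make the parameters described above concrete, here is a minimal sketch of how such a configuration might be laid out. It is not the file referenced above: it defines a single illustrative metric (borrowed from Telegraf's own jolokia2 Kafka example), and the staging directory and the file name kafka.conf are assumptions. The script writes to a local staging directory so the result can be reviewed before it is copied to /etc/telegraf/telegraf.d.

```shell
# Staging location is an assumption; copy the file to /etc/telegraf/telegraf.d after review.
STAGING_DIR="${STAGING_DIR:-./telegraf.d}"
mkdir -p "$STAGING_DIR"

# Quoted heredoc delimiter: "$1." below must reach the file literally.
cat > "$STAGING_DIR/kafka.conf" <<'EOF'
[[inputs.jolokia2_agent]]
  # Jolokia endpoint of the Kafka broker
  urls = ["http://CHANGE_ME:8778/jolokia"]
  name_prefix = "kafka_"

  [inputs.jolokia2_agent.tags]
    environment = "CHANGE_ME"        # for example dev, prod or qa
    component = "messaging"          # do not modify
    messaging_system = "kafka"       # do not modify
    messaging_cluster = "CHANGE_ME"  # name shown in the dashboards

  # One illustrative metric only; a real configuration defines many more
  [[inputs.jolokia2_agent.metric]]
    name = "controller"
    mbean = "kafka.controller:name=*,type=*"
    field_prefix = "$1."

[[inputs.disk]]
  [inputs.disk.tags]
    environment = "CHANGE_ME"
    component = "messaging"
    messaging_system = "kafka"
    messaging_cluster = "CHANGE_ME"

[[outputs.sumologic]]
  # HTTP Source URL noted when creating the HTTP Logs and Metrics Source
  url = "CHANGE_ME"
  data_format = "prometheus"         # do not modify
EOF

echo "wrote $STAGING_DIR/kafka.conf"
```

Every CHANGE_ME placeholder has to be replaced before Telegraf can use the file.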

  6. Restart Telegraf

Once you have finalized your telegraf.conf file, start or reload the Telegraf service using the instructions in the Telegraf documentation.

At this point, Kafka metrics should start flowing into Sumo Logic.
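
Before restarting, it can help to let Telegraf run its inputs once and print what it collects. The sketch below assumes a systemd-managed host, the default configuration paths, and a hypothetical function name restart_telegraf; adapt it to however Telegraf was installed on your brokers.

```shell
# Dry-run the configuration, then restart the service only if the dry run succeeds.
restart_telegraf() {
  conf="${1:-/etc/telegraf/telegraf.conf}"
  conf_dir="${2:-/etc/telegraf/telegraf.d}"

  if ! command -v telegraf >/dev/null 2>&1; then
    echo "telegraf not found on PATH" >&2
    return 1
  fi

  # --test gathers metrics once and prints them without sending them to any output.
  telegraf --config "$conf" --config-directory "$conf_dir" --test || return 1

  # Assumes systemd; use your init system's equivalent otherwise.
  sudo systemctl restart telegraf
}

# Usage (not executed here):
#   restart_telegraf /etc/telegraf/telegraf.conf /etc/telegraf/telegraf.d
```

If the dry run prints no metrics with the kafka_ prefix, recheck the Jolokia URL before restarting.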

Step 2: Configure Collection of Kafka Logs on each Kafka Broker node

This section provides instructions for configuring log collection for Kafka running in a non-Kubernetes environment for the Sumo Logic App for Kafka.

By default, Kafka logs are stored in a log file. Follow the instructions below to set up log collection:

  1. Configure logging on each Kafka Broker Node
  2. Configure an Installed Collector
  3. Configure a Source

Perform the steps outlined below for each Kafka Broker node.

  • By default Kafka logs (server.log and controller.log) are stored in the directory: /opt/Kafka/kafka_<VERSION>/logs

To add an Installed collector, perform the steps as defined on the page Configure an Installed Collector.

To add a Local File Source for Kafka, do the following:

  1. Add a Local File Source in the installed collector configured in the previous step.

  2. Configure the Local File Source fields as follows:

  • Name. (Required)
  • Description. (Optional)
  • File Path (Required). Enter the path to your server.log and controller.log. The files are typically located in /opt/Kafka/kafka_<VERSION>/logs/*.log. 
  • Source Host. Sumo Logic uses the hostname assigned by the OS unless you enter a different host name.
  • Source Category. Enter any string to tag the output collected from this Source, such as Kafka/Logs. (The Source Category metadata field is a fundamental building block to organize and label Sources. For details see Best Practices.)
  • Fields. Set the following fields. For more information on fields please see this document:
    • component = messaging
    • messaging_system = kafka
    • messaging_cluster = <Your_KAFKA_Cluster_Name>
    • environment = <Environment_Name>, such as Dev, QA or Prod.

  3. Configure the Advanced section:

  • Enable Timestamp Parsing. Select Extract timestamp information from log file entries.
  • Time Zone. Choose the option, Ignore time zone from log file and instead use, and then select your Kafka Server’s time zone.
  • Timestamp Format. The timestamp format is automatically detected.
  • Encoding. Select UTF-8 (Default).
  • Enable Multiline Processing. Select Detect messages spanning multiple lines.
    • Select Infer Boundaries - Detect message boundaries automatically
  4. Click Save.

At this point, Kafka logs should start flowing into Sumo Logic.
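
If you manage Installed Collector sources through a local JSON file rather than the UI, the settings above could be expressed roughly as follows. This is a sketch under assumptions: the JSON keys reflect the Sumo Logic JSON source format as best recalled and should be checked against the Sumo Logic source reference, and the staging directory, file name, source name, and the time zone value are placeholders.

```shell
# Staging location and file name are assumptions; review before handing it to the collector.
SOURCES_DIR="${SOURCES_DIR:-./sumo-sources}"
mkdir -p "$SOURCES_DIR"

cat > "$SOURCES_DIR/kafka-logs.json" <<'EOF'
{
  "api.version": "v1",
  "sources": [
    {
      "sourceType": "LocalFile",
      "name": "kafka-logs",
      "pathExpression": "/opt/Kafka/kafka_<VERSION>/logs/*.log",
      "category": "Kafka/Logs",
      "automaticDateParsing": true,
      "forceTimeZone": true,
      "timeZone": "Etc/UTC",
      "encoding": "UTF-8",
      "multilineProcessingEnabled": true,
      "useAutolineMatching": true,
      "fields": {
        "component": "messaging",
        "messaging_system": "kafka",
        "messaging_cluster": "CHANGE_ME",
        "environment": "CHANGE_ME"
      }
    }
  ]
}
EOF

echo "wrote $SOURCES_DIR/kafka-logs.json"
```

Replace <VERSION>, the CHANGE_ME values, and the time zone with the values for your Kafka server.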

Sample Log Messages

[2021-03-10 20:12:28,742] INFO [KafkaServer id=0] started (kafka.server.KafkaServer)

Query Sample

This sample query is from the Logs panel of the Kafka - Logs dashboard.

Query String

messaging_cluster=* messaging_system="kafka"
| json auto maxdepth 1 nodrop
| if (isEmpty(log), _raw, log) as kafka_log_message
| parse field=kafka_log_message "[*] * *" as date_time,severity,msg
| where severity in ("ERROR", "FATAL")
| count by date_time, severity, msg
| sort by date_time
| limit 10
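
To see what the parse expression "[*] * *" extracts, the same split can be approximated locally against the sample log message. This is only an illustration using sed and awk, not how Sumo Logic evaluates the query; the function names parse_kafka_log and only_errors are hypothetical.

```shell
# Split a Kafka log line into date_time|severity|msg, mirroring "[*] * *".
parse_kafka_log() {
  printf '%s\n' "$1" | sed -n -E 's/^\[([^]]*)\] ([^ ]+) (.*)$/\1|\2|\3/p'
}

# Keep only ERROR and FATAL records, as the query's where clause does.
only_errors() {
  awk -F'|' '$2 == "ERROR" || $2 == "FATAL"'
}

sample='[2021-03-10 20:12:28,742] INFO [KafkaServer id=0] started (kafka.server.KafkaServer)'
parse_kafka_log "$sample"
# prints: 2021-03-10 20:12:28,742|INFO|[KafkaServer id=0] started (kafka.server.KafkaServer)

# The INFO sample is filtered out, so this prints nothing.
parse_kafka_log "$sample" | only_errors
```

Because the sample message has severity INFO, the dashboard query would not return it; only ERROR and FATAL lines reach the count.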