
Collect Logs and Metrics for Kubernetes environments

This page provides instructions for configuring log and metric collection for the Sumo Logic App for Kafka in Kubernetes environments.

In a Kubernetes environment, we use the Telegraf Operator, which is packaged with our Kubernetes collection; you can learn more about it here. The diagram below illustrates how data is collected from Kafka in Kubernetes environments. In the architecture shown below, four services make up the metric collection pipeline: Telegraf, Prometheus, FluentD, and FluentBit.

The first service in the pipeline is Telegraf, which collects metrics from Kafka. Note that we run Telegraf in each pod we want to collect metrics from as a sidecar deployment; in other words, Telegraf runs in the same pod as the containers it monitors. Telegraf uses the Jolokia input plugin to obtain metrics. (For simplicity, the diagram doesn't show the input plugins.) The injection of the Telegraf sidecar container is done by the Telegraf Operator. FluentBit collects logs written to standard output and forwards them to FluentD, which in turn sends all log and metric data to a Sumo Logic HTTP Source.

Follow the instructions below to set up the metric collection:

  1. Configure Metrics Collection

    1. Set up Kubernetes collection with the Telegraf Operator.

    2. Add annotations on your Kafka pods.

    3. Configure your Kafka pods to use the Jolokia Telegraf input plugin.

  2. Configure Logs Collection

    1. Configure logging in Kafka.

    2. Add labels on your Kafka pods to capture logs from standard output.

    3. Collect Kafka logs from a log file.

Step 1: Configure Metrics Collection

Follow the steps below to collect metrics from a Kubernetes environment:

  1. Set up Kubernetes collection with the Telegraf Operator.
    Ensure that you are monitoring your Kubernetes clusters with the Telegraf Operator enabled. If you are not, follow these instructions to do so.

  2. Add annotations on your Kafka pods.
    On your Kafka pods, add the annotations listed in this file.

Enter values for the following parameters (marked with CHANGE_ME) in the downloaded file; a sketch of the assembled annotation block follows the parameter descriptions below:

  • telegraf.influxdata.com/inputs - The configuration for the Telegraf input plugins.

    • In the input plugins section:

      • urls - The URL of the Kafka server. Because Telegraf runs as a sidecar, the URL should always point to localhost. This can be a comma-separated list to connect to multiple Kafka servers.

    • In the tags sections (3 total), which are [inputs.jolokia2_agent.tags], [inputs.diskio.tags], and [inputs.disk.tags]:

      • environment - This is the deployment environment where the Kafka cluster identified by the value of urls resides. For example: dev, prod, or qa. While this value is optional, we highly recommend setting it.

      • messaging_cluster - Enter a name to identify this Kafka cluster. This cluster name will be shown in the Sumo Logic dashboards.

Here is an explanation of additional values set by this configuration. Please do not modify these values; changing them will cause the Sumo Logic apps to not function correctly.

  • telegraf.influxdata.com/class: sumologic-prometheus - This instructs the Telegraf Operator which output to use. This should not be changed.

  • prometheus.io/scrape: "true" - This ensures our Prometheus plugin will scrape the metrics.

  • prometheus.io/port: "9273" - This tells Prometheus which port to scrape metrics from. This should not be changed.

  • telegraf.influxdata.com/inputs

    • In the tags sections ([inputs.jolokia2_agent.tags], [inputs.diskio.tags], and [inputs.disk.tags]):

      • component: "messaging" - This value is used by Sumo Logic apps to identify application components.

      • messaging_system: "kafka" - This value identifies the messaging system.
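To illustrate how these pieces fit together, here is a minimal sketch of the annotation block. This is not the authoritative downloaded file, which also defines the specific JMX metrics to gather; the Jolokia URL assumes the default agent port 8778 configured later in this guide, and the CHANGE_ME values are placeholders:

annotations:
  telegraf.influxdata.com/class: sumologic-prometheus
  prometheus.io/scrape: "true"
  prometheus.io/port: "9273"
  telegraf.influxdata.com/inputs: |+
    [[inputs.jolokia2_agent]]
      # Telegraf runs as a sidecar, so the URL always points to localhost
      urls = ["http://localhost:8778/jolokia"]
      [inputs.jolokia2_agent.tags]
        environment = "prod-CHANGE_ME"
        component = "messaging"
        messaging_system = "kafka"
        messaging_cluster = "kafka_prod_cluster01-CHANGE_ME"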

For information on all other properties that can be configured for the Telegraf agent globally, please see this doc.

For more information on configuring the Jolokia input plugin for Telegraf, please see this doc.

  3. Configure your Kafka pods to use the Jolokia Telegraf input plugin.
    The Jolokia agent needs to be available to the Kafka pods. Starting with Kubernetes 1.10.0, you can store a binary file in a configMap, which makes it easy to load the Jolokia JAR file and make it available to your pods:

    1. Download the latest version of the Jolokia JVM-Agent from Jolokia.

    2. Rename the file to jolokia.jar.

    3. Create a configMap named jolokia from the binary file:

       kubectl create configmap jolokia --from-file=jolokia.jar

    4. Modify your Kafka pod definition to include a volume (type configMap) and a corresponding volumeMounts entry. Finally, update the env (environment variable) section to start Jolokia, and apply the updated Kafka pod definition:


spec:
  volumes:
    # expose the jolokia configMap created above as a volume
    - name: jolokia
      configMap:
        name: jolokia
  containers:
    - name: XYZ
      image: XYZ
      env:
      # start the Jolokia agent inside the Kafka JVM
      - name: KAFKA_OPTS
        value: "-javaagent:/opt/jolokia/jolokia.jar=port=8778,host=0.0.0.0"
      volumeMounts:
        # mount jolokia.jar into the container at /opt/jolokia
        - mountPath: "/opt/jolokia"
          name: jolokia

Verification Step: You can exec into the Kafka pod and run the following checks to make sure Telegraf (and Jolokia) can scrape metrics from your Kafka pod:

  • Make sure jolokia.jar exists in the /opt/jolokia/ directory of the Kafka pod.

  • Make sure the Kafka JVM was started with the option -javaagent:/opt/jolokia/jolokia.jar=port=8778,host=0.0.0.0.
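For example, the checks might look like the following from the command line. This is a sketch only: it assumes the pod is named kafka-0, that curl is available in the containers, and that the Telegraf sidecar container is named telegraf; adjust these to your deployment:

kubectl exec -it kafka-0 -- ls /opt/jolokia/jolokia.jar
kubectl exec -it kafka-0 -- curl http://localhost:8778/jolokia/version
kubectl exec -it kafka-0 -c telegraf -- curl http://localhost:9273/metrics

The first command confirms the JAR is mounted, the second confirms the Jolokia agent is responding on port 8778, and the third confirms Telegraf is exposing Prometheus metrics on port 9273.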

Here is an example of what a complete Pod definition file looks like.
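As a rough sketch of how the pieces combine (the pod name, container name, and image below are illustrative placeholders; metadata carries the annotations from step 2, and spec carries the Jolokia volume and KAFKA_OPTS from step 3):

apiVersion: v1
kind: Pod
metadata:
  name: kafka-0
  annotations:
    telegraf.influxdata.com/class: sumologic-prometheus
    prometheus.io/scrape: "true"
    prometheus.io/port: "9273"
    # plus the telegraf.influxdata.com/inputs annotation from step 2
spec:
  volumes:
    - name: jolokia
      configMap:
        name: jolokia
  containers:
    - name: kafka
      image: CHANGE_ME   # your Kafka image
      env:
        - name: KAFKA_OPTS
          value: "-javaagent:/opt/jolokia/jolokia.jar=port=8778,host=0.0.0.0"
      volumeMounts:
        - mountPath: "/opt/jolokia"
          name: jolokia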

Once this has been done, the Sumo Logic Kubernetes collection will automatically start collecting metrics from the pods having the labels and annotations defined in the previous step. Verify metrics are flowing into Sumo Logic by running the following metrics query:

component="messaging" and messaging_system="kafka"

Step 2: Configure Logs Collection

This section explains the steps to collect Kafka logs from a Kubernetes environment.

If your Kafka helm chart/pod writes logs to standard output, follow the steps below to collect them:

  1. Apply the following labels to your Kafka pods:
     labels:
        environment: "prod-CHANGE_ME"
        component: "messaging"
        messaging_system: "kafka"
        messaging_cluster: "kafka_prod_cluster01-CHANGE_ME"

Enter values for the following parameters (marked with CHANGE_ME above):

  • environment - This is the deployment environment where the Kafka cluster resides. For example: dev, prod, or qa. While this value is optional, we highly recommend setting it.

  • messaging_cluster - Enter a name to identify this Kafka cluster. This cluster name will be shown in the Sumo Logic dashboards.

Here is an explanation of additional values set by this configuration. Please do not modify them; changing them will cause the Sumo Logic apps to not function correctly.

  • component: "messaging" - This value is used by Sumo Logic apps to identify application components.

  • messaging_system: "kafka" - This value identifies the messaging system.

For all other parameters, please see this doc for more properties that can be configured in the Telegraf agent globally.

The Sumologic-Kubernetes-Collection will automatically capture the logs from stdout and send them to Sumo Logic. For more information on deploying Sumologic-Kubernetes-Collection, please see this page.

If your Kafka helm chart/pod writes its logs to log files, you can use a sidecar to send those log files to standard output. To do this:

  1. Determine the location of the Kafka log file on Kubernetes. This can be determined from the helm chart configuration.

  2. Install the Sumo Logic tailing sidecar operator.

  3. Add the following annotation in addition to the existing annotations:

annotations:
  tailing-sidecar: sidecarconfig;<mount>:<path_of_kafka_log_file>/<kafka_log_file_name>

Example:
annotations:
  tailing-sidecar: sidecarconfig;data:/opt/Kafka/kafka_<VERSION>/logs/server.log

  4. Verify that the Kafka pods are running and the annotations are applied by using the command: kubectl describe pod <Kafka_pod_name>
  5. The Sumo Logic Kubernetes collection will automatically start collecting logs from the pods having the annotations defined above.

Labels created in Kubernetes environments are automatically prefixed with pod_labels. To normalize these fields for our app to work, we need to create a Field Extraction Rule for Messaging Application Components, if one has not already been created. To do so:

  1. Go to Manage Data > Logs > Field Extraction Rules.

  2. Click the + Add button on the top right of the table.

  3. In the form that appears, enter the following options:

  • Rule Name. Enter the name as App Component Observability - Messaging.
  • Applied At. Choose Ingest Time.
  • Scope. Select Specific Data.
    • Scope: Enter the following keyword search expression:
pod_labels_environment=* pod_labels_component=messaging
pod_labels_messaging_system=kafka pod_labels_messaging_cluster=*
  • Parse Expression. Enter the following parse expression:
if (!isEmpty(pod_labels_environment), pod_labels_environment, "") as environment
| pod_labels_component as component
| pod_labels_messaging_system as messaging_system
| pod_labels_messaging_cluster as messaging_cluster
  4. Click Save to create the rule.

  5. Verify logs are flowing into Sumo Logic by running the following logs query:

component="messaging" and messaging_system="kafka"

Sample Log Messages

{"timestamp":1617392000686,"log":"[2021-04-02 19:33:20,598] INFO [KafkaServer id=0] started (kafka.server.KafkaServer)","stream":"stdout","time":"2021-04-02T19:33:20.599066311Z"}

Query Sample

This sample query is from the Logs panel of the Kafka - Logs dashboard.

Query String

messaging_cluster=* messaging_system="kafka"
| json auto maxdepth 1 nodrop
| if (isEmpty(log), _raw, log) as kafka_log_message
| parse field=kafka_log_message "[*] * *" as date_time, severity, msg
| where severity in ("ERROR", "FATAL")
| count by date_time, severity, msg
| sort by date_time
| limit 10