
Cassandra - Classic Collector


The Cassandra app is a unified logs and metrics app that helps you monitor the availability, performance, health, and resource utilization of your Cassandra clusters. Preconfigured dashboards provide insight into cluster health, resource utilization, cache/Gossip/Memtable statistics, compaction, garbage collection, thread pools, and write paths.

Log types and Metrics

The app supports logs and metrics from the open-source version of Cassandra. The app is tested on Cassandra version 3.11.10.

Cassandra has three main logs: system.log, debug.log, and gc.log, which hold general logging messages, debugging logging messages, and Java garbage collection logs, respectively.

These logs by default live in ${CASSANDRA_HOME}/logs, but most Linux distributions relocate logs to /var/log/cassandra. Operators can tune this location as well as what levels are logged using the provided logback.xml file. For more details on Cassandra logs, see this link.
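As a rough illustration of the default layout described above, the following Python sketch builds the expected paths of the three log files. The function name `default_log_paths` and the fallback behavior are assumptions made for this example; your logback.xml may place the files elsewhere.

```python
from pathlib import PurePosixPath

# The three main Cassandra logs named in this section.
LOG_FILES = ("system.log", "debug.log", "gc.log")


def default_log_paths(cassandra_home=None):
    """Return the expected log file paths for the default locations.

    Assumption: when a Cassandra home directory is given, logs live in
    <cassandra_home>/logs; otherwise the common Linux location
    /var/log/cassandra is used.
    """
    if cassandra_home:
        base = PurePosixPath(cassandra_home) / "logs"
    else:
        base = PurePosixPath("/var/log/cassandra")
    return [str(base / name) for name in LOG_FILES]


print(default_log_paths("/opt/cassandra"))
print(default_log_paths())
```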

The Sumo Logic app for Cassandra supports metrics generated by the Jolokia2 plugin for Telegraf. The app assumes Prometheus-format metrics.

Collecting logs and metrics for Cassandra

This section provides instructions for configuring log and metric collection for the Sumo Logic app for Cassandra.

Configure Collection for Cassandra

In Kubernetes environments, we use the Telegraf Operator, which is packaged with our Kubernetes collection. You can learn more about it here.

The diagram below illustrates how data is collected from Cassandra in a Kubernetes environment. Four services make up the metric collection pipeline: Telegraf, Telegraf Operator, Prometheus, and Sumo Logic Distribution for OpenTelemetry Collector.


[Diagram: Cassandra data collection in a Kubernetes environment]

The first service in the metrics pipeline is Telegraf, which collects metrics from Cassandra. Telegraf runs as a sidecar deployment in each pod you want to collect metrics from; that is, it runs in the same pod as the containers it monitors. Telegraf uses the Jolokia2 input plugin to obtain metrics. For simplicity, the diagram doesn’t show the input plugins. The Telegraf Operator injects the Telegraf sidecar container. Prometheus pulls metrics from Telegraf and sends them to the Sumo Logic Distribution for OpenTelemetry Collector, which enriches metadata and sends metrics to Sumo Logic.

In the logs pipeline, Sumo Logic Distribution for OpenTelemetry Collector collects logs written to standard out and forwards them to another instance of Sumo Logic Distribution for OpenTelemetry Collector, which enriches metadata and sends logs to Sumo Logic.

Prerequisites

These instructions assume that you're using the latest Helm chart version. If not, upgrade using the instructions here.

Configure Metrics Collection

Follow the steps listed below to collect Cassandra metrics from a Kubernetes environment.

  1. Set up your Kubernetes Collection with the Telegraf Operator.
  2. On your Cassandra Pods, add the following annotations:
annotations:
  telegraf.influxdata.com/class: sumologic-prometheus
  prometheus.io/scrape: "true"
  prometheus.io/port: "9273"
  telegraf.influxdata.com/inputs: |+
    [[inputs.jolokia2_agent]]
      urls = ["http://localhost:8778/jolokia"]
      name_prefix = "cassandra_java_"
      [inputs.jolokia2_agent.tags]
        environment = "prod"
        component = "database"
        db_system = "cassandra"
        db_cluster = "cassandra_on_premise"
        dc = "IDC1"
      [[inputs.jolokia2_agent.metric]]
        name = "Memory"
        mbean = "java.lang:type=Memory"
      [[inputs.jolokia2_agent.metric]]
        name = "GarbageCollector"
        mbean = "java.lang:name=*,type=GarbageCollector"
        tag_keys = ["name"]
        field_prefix = "$1_"
      [[inputs.jolokia2_agent.metric]]
        name = "OperatingSystem"
        mbean = "java.lang:type=OperatingSystem"
        paths = ["FreePhysicalMemorySize", "AvailableProcessors", "SystemCpuLoad", "TotalPhysicalMemorySize", "TotalSwapSpaceSize", "SystemLoadAverage"]
    [[inputs.jolokia2_agent]]
      urls = ["http://localhost:8778/jolokia"]
      name_prefix = "cassandra_"
      [inputs.jolokia2_agent.tags]
        environment = "ENV_TO_BE_CHANGED"
        component = "database"
        db_system = "cassandra"
        db_cluster = "cassandra_on_premise"
        db_cluster_address = "ENV_TO_BE_CHANGED"
        db_cluster_port = "ENV_TO_BE_CHANGED"
        dc = "IDC1"
      [[inputs.jolokia2_agent.metric]]
        name = "TableMetrics"
        mbean = "org.apache.cassandra.metrics:name=*,scope=*,keyspace=*,type=Table"
        tag_keys = ["name", "scope", "keyspace"]
        field_prefix = "$1_"
      [[inputs.jolokia2_agent.metric]]
        name = "DroppedMessageMetrics"
        mbean = "org.apache.cassandra.metrics:name=*,scope=*,type=DroppedMessage"
        tag_keys = ["name", "scope"]
        field_prefix = "$1_"
      [[inputs.jolokia2_agent.metric]]
        name = "ClientMetrics"
        mbean = "org.apache.cassandra.metrics:type=Client,name=*"
        tag_keys = ["name"]
        field_prefix = "$1_"
      [[inputs.jolokia2_agent.metric]]
        name = "ThreadPoolMetrics"
        mbean = "org.apache.cassandra.metrics:type=ThreadPools,path=*,scope=*,name=*"
        tag_keys = ["name", "scope", "path"]
        field_prefix = "$1_"
      [[inputs.jolokia2_agent.metric]]
        name = "CacheMetrics"
        mbean = "org.apache.cassandra.metrics:type=Cache,scope=*,name=*"
        tag_keys = ["name", "scope"]
        field_prefix = "$1_"
      [[inputs.jolokia2_agent.metric]]
        name = "CommitLogMetrics"
        mbean = "org.apache.cassandra.metrics:type=CommitLog,name=*"
        tag_keys = ["name"]
        field_prefix = "$1_"

Enter values for the following parameters (marked ENV_TO_BE_CHANGED above):

  • telegraf.influxdata.com/inputs. This contains the required configuration for the Telegraf Cassandra Input plugin. Please refer to this doc for more information on configuring the Cassandra input plugin for Telegraf. As Telegraf will be run as a sidecar, the host should always be localhost.
    • In the input plugins section ([[inputs.jolokia2_agent]]):
      • urls - The URL to the Cassandra server. This can be a comma-separated list to connect to multiple Cassandra servers. Please see this doc for more information on additional parameters for configuring the Cassandra input plugin for Telegraf.
    • In the tags section ([inputs.jolokia2_agent.tags]):
      • environment. This is the deployment environment where the Cassandra cluster identified by the value of urls resides. For example: dev, prod, or qa. While this value is optional, we highly recommend setting it.
      • db_cluster. Enter a name to identify this Cassandra cluster. This cluster name will be shown in the Sumo Logic dashboards.
      • db_cluster_address. Enter the cluster hostname or IP address that is used by the application to connect to the database. It could also be the load balancer or proxy endpoint.
      • db_cluster_port. Enter the database port. If not provided, a default port will be used.
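To make the substitution step concrete, here is a minimal Python sketch that fills in the tags section of the Telegraf configuration and refuses to return a result that still contains the ENV_TO_BE_CHANGED marker. The helper name `render_tags`, the template string, and the example values are assumptions for illustration only; they are not part of the Sumo Logic tooling.

```python
PLACEHOLDER = "ENV_TO_BE_CHANGED"

# Tags section as shown in the annotation above (second jolokia2_agent block).
TAGS_TEMPLATE = """[inputs.jolokia2_agent.tags]
  environment = "ENV_TO_BE_CHANGED"
  component = "database"
  db_system = "cassandra"
  db_cluster = "cassandra_on_premise"
  db_cluster_address = "ENV_TO_BE_CHANGED"
  db_cluster_port = "ENV_TO_BE_CHANGED"
"""


def render_tags(environment, db_cluster, db_cluster_address, db_cluster_port):
    """Return the tags section with the user-supplied values filled in."""
    values = {
        "environment": environment,
        "db_cluster": db_cluster,
        "db_cluster_address": db_cluster_address,
        "db_cluster_port": db_cluster_port,
    }
    lines = []
    for line in TAGS_TEMPLATE.splitlines():
        key = line.split("=", 1)[0].strip()
        if key in values:
            # Only the user-editable keys are rewritten; component and
            # db_system are left exactly as provided.
            line = f'  {key} = "{values[key]}"'
        lines.append(line)
    rendered = "\n".join(lines) + "\n"
    if PLACEHOLDER in rendered:
        raise ValueError("a placeholder value was left unchanged")
    return rendered


print(render_tags("prod", "cassandra_prod", "cassandra-prod.sumologic.com", "3306"))
```

Keeping the fixed tags out of the `values` mapping means the helper cannot accidentally change them.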

Do not modify the following values set by this configuration, as doing so will cause the Sumo Logic app to not function correctly.

  • telegraf.influxdata.com/class: sumologic-prometheus. This instructs the Telegraf operator what output to use. This should not be changed.
  • prometheus.io/scrape: "true". This ensures our Prometheus will scrape the metrics.
  • prometheus.io/port: "9273". This tells Prometheus which port to scrape. This should not be changed.
  • telegraf.influxdata.com/inputs
    • In the tags section ([inputs.jolokia2_agent.tags]):
      • component: “database” - This value is used by Sumo Logic apps to identify application components.
      • db_system: “cassandra” - This value identifies the database system.
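One way to guard against accidental changes to these fixed values is a small check over the pod annotations before they are applied. The Python sketch below is illustrative only: the function name `find_annotation_problems` is hypothetical, and it assumes the annotations are available as a plain dictionary of strings.

```python
# Fixed values listed above that the app depends on.
REQUIRED_ANNOTATIONS = {
    "telegraf.influxdata.com/class": "sumologic-prometheus",
    "prometheus.io/scrape": "true",
    "prometheus.io/port": "9273",
}

# Fixed tags, written without spaces so that spacing differences are ignored.
REQUIRED_TAGS = ('component="database"', 'db_system="cassandra"')


def find_annotation_problems(annotations):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    for key, expected in REQUIRED_ANNOTATIONS.items():
        actual = annotations.get(key)
        if actual != expected:
            problems.append(f"{key}: expected {expected!r}, found {actual!r}")
    inputs = annotations.get("telegraf.influxdata.com/inputs", "")
    compact = inputs.replace(" ", "")
    for tag in REQUIRED_TAGS:
        if tag not in compact:
            problems.append(f"telegraf.influxdata.com/inputs: missing tag {tag}")
    return problems


example = {
    "telegraf.influxdata.com/class": "sumologic-prometheus",
    "prometheus.io/scrape": "true",
    "prometheus.io/port": "9273",
    "telegraf.influxdata.com/inputs": 'component = "database"\ndb_system = "cassandra"\n',
}
print(find_annotation_problems(example))
```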
note

db_cluster_address and db_cluster_port should reflect the exact DB client configuration in your application, especially if you instrument it with OT tracing. The values of these fields should match exactly the connection string used by the database client (reported as values for net.peer.name and net.peer.port metadata fields).

For example, if your application uses “cassandra-prod.sumologic.com:3306” as the connection string, the field values should be set as follows: db_cluster_address=cassandra-prod.sumologic.com db_cluster_port=3306.

If your application connects directly to a given Cassandra node, rather than the whole cluster, use the application connection string to override the value of the “host” field in the Telegraf configuration: host=cassandra-prod.sumologic.com.

Pivoting to Tracing data from Entity Inspector is possible only for “Cassandra address” Entities.
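The connection-string example in this note can be sketched in Python as follows. The helper name `split_connection_string` is hypothetical, and the sketch assumes a plain host:port string (no scheme, no IPv6 literal, no multiple contact points).

```python
def split_connection_string(connection_string):
    """Split 'host:port' into the two field values used by the app."""
    host, separator, port = connection_string.rpartition(":")
    if not separator or not host or not port.isdigit():
        raise ValueError(f"expected 'host:port', got {connection_string!r}")
    return {"db_cluster_address": host, "db_cluster_port": port}


print(split_connection_string("cassandra-prod.sumologic.com:3306"))
```

The port is kept as a string because tag and label values are strings.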

See this doc for more parameters that can be configured in the Telegraf agent globally.

  3. Sumo Logic Kubernetes collection will automatically start collecting metrics from the pods that have the labels and annotations defined in the previous step.
  4. Verify metrics in Sumo Logic.
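Before checking in Sumo Logic, it can help to confirm that the Telegraf sidecar is exposing Cassandra metrics at all. The Python sketch below filters Prometheus-format text for metric names that start with the cassandra_ prefix configured above. It assumes you have already saved the text served by the sidecar on port 9273; the function name and the sample metric names are hypothetical and used only for illustration.

```python
def cassandra_metric_names(exposition_text, prefix="cassandra_"):
    """Return the sorted, de-duplicated metric names that start with prefix."""
    names = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        # The metric name ends at the first "{" (labels) or space (value).
        name = line.split("{", 1)[0].split(" ", 1)[0]
        if name.startswith(prefix):
            names.add(name)
    return sorted(names)


# Hypothetical sample; real metric names depend on your Telegraf configuration.
SAMPLE = """\
# TYPE cassandra_java_Memory_HeapMemoryUsage_used untyped
cassandra_java_Memory_HeapMemoryUsage_used{db_cluster="cassandra_on_premise"} 123456
cassandra_CacheMetrics_Value{name="HitRate",scope="KeyCache"} 0.91
go_goroutines 12
"""

print(cassandra_metric_names(SAMPLE))
```

An empty result suggests that the Jolokia endpoint or the input plugin configuration needs attention before looking further down the pipeline.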

Configure Logs Collection

This section explains the steps to collect Cassandra logs from a Kubernetes environment.

  1. Add labels on your Cassandra pods to capture logs from standard output on Kubernetes.

    1. Apply the following labels to the Cassandra pods:
    environment: "<for example, prod or stag>"
    component: "database"
    db_system: "cassandra"
    db_cluster: "<Your_Cassandra_Cluster_Name>"  # Enter Default if you do not have one.
    db_cluster_address: "<your cluster's hostname, IP address, or service endpoint>"
    db_cluster_port: "<database port>"

    Please enter values for the following parameters:

    • environment. This is the deployment environment where the Cassandra cluster resides. For example: dev, prod, or qa. While this value is optional, we highly recommend setting it.
    • db_cluster. Enter a name to identify the Cassandra cluster. The cluster name will be shown in the Sumo Logic dashboards.
    • db_cluster_address. Enter the cluster hostname or IP address that is used by the application to connect to the database. It could also be the load balancer or proxy endpoint.
    • db_cluster_port. Enter the database port. If not provided, a default port will be used.

    Do not modify the following values, as doing so will cause the Sumo Logic apps to not function correctly.

    • component: “database”. This value is used by Sumo Logic apps to identify application components.
    • db_system: “cassandra”. This value identifies the database system.
note

db_cluster_address and db_cluster_port should reflect the exact DB client configuration in your application, especially if you instrument it with OT tracing. The values of these fields should match exactly the connection string used by the database client (reported as values for net.peer.name and net.peer.port metadata fields).

For example, if your application uses “cassandra-prod.sumologic.com:3306” as the connection string, the field values should be set as follows: db_cluster_address=cassandra-prod.sumologic.com db_cluster_port=3306.

If your application connects directly to a given Cassandra node, rather than the whole cluster, use the application connection string to override the value of the “host” field in the Telegraf configuration: host=cassandra-prod.sumologic.com.

Pivoting to Tracing data from Entity Inspector is possible only for “Cassandra address” Entities.

See this doc for more parameters that can be configured in the Telegraf agent globally.

  2. (Optional) Collect Cassandra logs from a log file on Kubernetes.
    1. Determine the location of the Cassandra log file on Kubernetes. This can be determined from the Cassandra logback.xml for your Cassandra cluster along with the mounts on the Cassandra pods.
    2. Install the Sumo Logic tailing sidecar operator.
    3. Add the following annotation in addition to the existing annotations.
    annotations:
      tailing-sidecar: sidecarconfig;<mount>:<path_of_Cassandra_log_file>/<Cassandra_log_file_name>

Example:

    annotations:
      tailing-sidecar: sidecarconfig;data:/opt/bitnami/cassandra/logs/cassandra.log

    4. Make sure that the Cassandra pods are running and that the annotations are applied by using the command:
    kubectl describe pod <Cassandra_pod_name>
    5. Sumo Logic Kubernetes collection will automatically start collecting logs from the pods that have the annotations defined above.
    6. Verify logs in Sumo Logic.
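The tailing-sidecar annotation format shown above can be assembled programmatically. The following Python sketch is a minimal illustration; the function name `tailing_sidecar_annotation` and its parameter names are assumptions, and the values mirror the example in this section.

```python
def tailing_sidecar_annotation(mount, log_dir, log_file):
    """Build the tailing-sidecar annotation for one Cassandra log file."""
    # Join directory and file name with exactly one slash and no spaces.
    path = f"{log_dir.rstrip('/')}/{log_file}"
    return {"tailing-sidecar": f"sidecarconfig;{mount}:{path}"}


print(tailing_sidecar_annotation("data", "/opt/bitnami/cassandra/logs", "cassandra.log"))
```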


Field Extraction Rule (FER) to normalize the fields in Kubernetes environments. Labels created in Kubernetes environments are automatically prefixed with pod_labels. To normalize these for our app to work, an FER named AppObservabilityCassandraDatabaseFER is automatically created for Database Application Components.
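To illustrate the effect of this normalization, the Python sketch below copies pod_labels_-prefixed fields to their unprefixed names. It is only a model of the outcome, not the actual rule: the real FER is defined in Sumo Logic, and the function name and sample record here are hypothetical.

```python
PREFIX = "pod_labels_"

# Fields the app expects in unprefixed form.
FIELDS = (
    "component",
    "environment",
    "db_system",
    "db_cluster",
    "db_cluster_address",
    "db_cluster_port",
)


def normalize_fields(record):
    """Return a copy of record with prefixed label fields also set unprefixed."""
    normalized = dict(record)
    for name in FIELDS:
        prefixed = PREFIX + name
        # Keep an existing unprefixed value; otherwise copy the prefixed one.
        if prefixed in record and name not in normalized:
            normalized[name] = record[prefixed]
    return normalized


sample = {"pod": "cassandra-0", "pod_labels_db_cluster": "cassandra_prod"}
print(normalize_fields(sample))
```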


Installing the Cassandra app

note

This step is not needed if you are using the Application Components Solution Terraform script.

To install the app, do the following:

note

Next-Gen App: To install or update the app, you must be an account administrator or a user with Manage Apps, Manage Monitors, Manage Fields, Manage Metric Rules, and Manage Collectors capabilities, depending on the content types included in the app.

  1. Select App Catalog.
  2. In the 🔎 Search Apps field, run a search for your desired app, then select it.
  3. Click Install App.
    note

    Sometimes this button says Add Integration.

  4. Click Next in the Setup Data section.
  5. In the Configure section of your respective app, complete the following fields.
    1. Is K8S deployment involved. Specify whether the resources being monitored are partially or fully deployed on Kubernetes (K8s).
  6. Click Next. You will be redirected to the Preview & Done section.

Post-installation

Once your app is installed, it will appear in your Installed Apps folder, and dashboard panels will start to fill automatically.

Each panel gradually fills with data that matches its time range query and has been received since the panel was created. Results will not be available immediately, but full graphs and charts will appear over time.

As part of the app installation process, the following fields will be created by default:

  • component
  • environment
  • db_system
  • db_cluster
  • pod
  • db_cluster_address
  • db_cluster_port

Additionally, if you're using Cassandra in the Kubernetes environment, the following additional fields will be created by default during the app installation process:

  • pod_labels_component
  • pod_labels_environment
  • pod_labels_db_system
  • pod_labels_db_cluster
  • pod_labels_db_cluster_address
  • pod_labels_db_cluster_port

For information on setting up fields, see Fields.

Viewing Cassandra dashboards

All dashboards have a set of filters that you can apply to the entire dashboard. Use these filters to drill down and examine the data to a granular level.

  • You can change the time range for a dashboard or panel by selecting a predefined interval from a drop-down list, choosing a recently used time range, or specifying custom dates and times. Learn more.
  • You can use template variables to drill down and examine the data on a granular level. For more information, see Filtering Dashboards with Template Variables.
  • Most Next-Gen apps let you provide the scope at installation time. The scope consists of a key (_sourceCategory by default) and a default value for this key. Based on your input, the app dashboards will be parameterized with a dashboard variable, allowing you to change the dataset queried by all panels. This eliminates the need to create multiple copies of the same dashboard with different queries.

Overview

The Cassandra (Classic) - Overview dashboard provides an at-a-glance view of Cassandra backend and frontend HTTP error codes percentage, visitor location, URLs, and clients causing errors.

Use this dashboard to:

  • Identify Frontend and Backend Sessions percentage usage to understand active sessions. This can help you increase the session limit.
  • Gain insights into the location where traffic originates, by region. This can help you allocate compute resources to different regions according to their needs.
  • Gain insights into the client, server responses on the server. This helps you identify errors in the server.
  • Gain insights into network traffic for the frontend and backend systems of your server.

Cache Stats

The Cassandra (Classic) - Cache Stats dashboard provides insight into the database cache status, schedule, and items.

Use this dashboard to:

  • Monitor Cache performance.
  • Identify Cache usage statistics.

Errors and Warnings

The Cassandra (Classic) - Errors and Warnings dashboard provides details of the database errors and warnings.

Use this dashboard to:

  • Review errors and warnings generated by the server.
  • Review the Threads errors and warning events.

Gossip

The Cassandra (Classic) - Gossip dashboard provides details about communication between Cassandra nodes.

Use this dashboard to:

  • Determine nodes with errors resulting in failures.
  • Review the node activity and pending tasks.

Memtable

The Cassandra (Classic) - Memtable dashboard provides insights into memtable statistics.

Use this dashboard to:

  • Review flush activity and memtable status.

Resource Usage

The Cassandra (Classic) - Resource Usage dashboard provides details of resource utilization across Cassandra clusters.

Use this dashboard to:

  • Identify resource utilization. This can help you determine whether resources are over- or under-allocated.

Compactions

The Cassandra (Classic) - Compactions dashboard provides details of compactions.

Use this dashboard to:

  • Review pending/completed compactions and flushes.

Garbage Collection

The Cassandra (Classic) - Garbage Collection dashboard shows key Garbage Collector statistics like the duration of the last GC run, objects collected, threads used, and memory cleared in the last GC run.

Use this dashboard to:

  • Understand the garbage collection time. If the time keeps increasing, CPU usage may increase as well.
  • Understand the amount of memory cleared by garbage collectors across memory pools and its impact on the Heap memory.

Read Path

The Cassandra (Classic) - Read Path dashboard shows read operation statistics.

Use this dashboard to:

  • Gather insights into read operations, cache statistics, tombstones, and SSTables summary.
  • Review thread pool and memtable usage for read operations.

Thread Pool

The Cassandra (Classic) - Thread Pool dashboard shows thread pool statistics.

Use this dashboard to:

  • Review thread pool usage and statistics for different kinds of operations.

Write Path

The Cassandra (Classic) - Write Path dashboard shows write operation statistics.

Use this dashboard to:

  • Gather insights into write operations, cache statistics, tombstones, and SSTables summary.
  • Review thread pool and memtable usage for write operations.

Create monitors for the Cassandra app

From your App Catalog:

  1. From the Sumo Logic navigation, select App Catalog.
  2. In the Search Apps field, search for and then select your app.
  3. Make sure the app is installed.
  4. Navigate to the What's Included tab and scroll down to the Monitors section.
  5. Click Create next to the pre-configured monitors. In the create monitors window, adjust the trigger conditions and notification settings based on your requirements.
  6. Scroll down to Monitor Details.
  7. Under Location, click New Folder.
    note

    By default, the monitor is saved in the root folder. To make maintenance easier, create a new folder in the location of your choice.

  8. Enter Folder Name. Folder Description is optional.
    tip

    Including the app version in the folder name helps you track versions for future updates.

  9. Click Create. Once the folder is created, click Save.

Cassandra Alerts

| Alert Name | Alert Description | Alert Condition | Recover Condition |
|:--|:--|:--|:--|
| Cassandra (Classic) - Increase in Authentication Failures | This alert fires when there is an increase of Cassandra authentication failures. | > 5 | <= 5 |
| Cassandra (Classic) - Cache Hit Rate below 85 Percent | This alert fires when the cache key hit rate is below 85%. | < 85 | >= 85 |
| Cassandra (Classic) - High Commitlog Pending Tasks | This alert fires when there are more than 15 Commitlog tasks that are pending. | > 15 | <= 15 |
| Cassandra (Classic) - High Number of Compaction Executor Blocked Tasks | This alert fires when there are more than 15 compaction executor tasks blocked for more than 5 minutes. | > 15 | <= 15 |
| Cassandra (Classic) - Compaction Task Pending | This alert fires when there are many Cassandra compaction tasks that are pending. You might need to increase I/O capacity by adding nodes to the cluster. | > 100 | <= 100 |
| Cassandra (Classic) - High Number of Flush Writer Blocked Tasks | This alert fires when there is a high number of flush writer tasks which are blocked. | > 15 | <= 15 |
| Cassandra (Classic) - Many Compaction Tasks Are Pending | Many Cassandra compaction tasks are pending. | > 100 | <= 100 |
| Cassandra (Classic) - Node Down | This alert fires when one or more Cassandra nodes are down. | > 0 | <= 0 |
| Cassandra (Classic) - Blocked Repair Tasks | This alert fires when the repair tasks are blocked. | > 2 | <= 2 |
| Cassandra (Classic) - Repair Tasks Pending | This alert fires when repair tasks are pending. | > 2 | <= 2 |
| Cassandra (Classic) - High Tombstone Scanning | This alert fires when tombstone scanning is very high (>1000 99th Percentile) in queries. | > 1000 | <= 1000 |
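The alert and recover conditions above are simple threshold comparisons. The Python sketch below models a few of them to show how the two conditions complement each other. The `Threshold` class is a hypothetical illustration; it does not reflect how Sumo Logic monitors are implemented, and it ignores evaluation windows such as the 5-minute duration mentioned for blocked compaction tasks.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    name: str
    limit: float
    fires_above: bool = True  # False means the alert fires below the limit.

    def is_firing(self, value):
        """True when value meets the alert condition, False when recovered."""
        if self.fires_above:
            return value > self.limit
        return value < self.limit


# A subset of the alerts listed above.
THRESHOLDS = [
    Threshold("Cache Hit Rate below 85 Percent", 85, fires_above=False),
    Threshold("High Commitlog Pending Tasks", 15),
    Threshold("Node Down", 0),
]

for threshold in THRESHOLDS:
    print(threshold.name, threshold.is_firing(20))
```

Because each recover condition is the exact complement of its alert condition, a value equal to the limit always counts as recovered.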

Copyright © 2025 by Sumo Logic, Inc.