Skip to main content

Best Practices and Advanced Configuration

Overriding chart resource names with fullnameOverride​

You can use the fullnameOverride properties of this chart and of its subcharts to override the created resource names.

See the chart's README for all the available fullnameOverride properties.

Here's an example of using the fullnameOverride properties with the components that are enabled by default:

fullnameOverride: sl

kube-prometheus-stack:
fullnameOverride: kps

kube-state-metrics:
fullnameOverride: ksm

prometheus-node-exporter:
fullnameOverride: pne

After installing the chart, the resources in the cluster will have names similar to the following:

$ kubectl -n <namespace> get pods
NAME READY STATUS RESTARTS AGE
kps-operator-6fdb4b955-zqsdh 1/1 Running 0 3m4s
ksm-779757976f-4d262 1/1 Running 0 3m4s
pne-hfk88 1/1 Running 0 3m4s
prometheus-kps-prometheus-0 2/2 Running 0 3m4s
sl-otelcol-events-0 1/1 Running 0 3m4s
sl-otelcol-instrumentation-0 1/1 Running 0 3m4s
sl-otelcol-instrumentation-1 1/1 Running 0 3m4s
sl-otelcol-instrumentation-2 1/1 Running 0 3m4s
sl-otelcol-logs-0 1/1 Running 0 3m4s
sl-otelcol-logs-1 1/1 Running 0 3m4s
sl-otelcol-logs-2 1/1 Running 0 3m4s
sl-otelcol-logs-collector-24z56 1/1 Running 0 3m4s
sl-otelcol-metrics-0 1/1 Running 0 3m4s
sl-otelcol-metrics-1 1/1 Running 0 3m4s
sl-otelcol-metrics-2 1/1 Running 0 3m4s
sl-remote-write-proxy-7f79766ff7-pvchj 1/1 Running 0 3m4s
sl-remote-write-proxy-7f79766ff7-qkk8n 1/1 Running 0 3m4s
sl-remote-write-proxy-7f79766ff7-vsjbj 1/1 Running 0 3m4s
sl-traces-gateway-cdc7d9bbc-wq9sl 1/1 Running 0 3m4s
sl-traces-sampler-94656c48d-cslsb 1/1 Running 0 3m4s
note

When changing the fullnameOverride property for an already installed chart with the helm upgrade command, you need to restart the Prometheus pods for the changed names of the Otelcol pods to be picked up:

helm -n <namespace> upgrade <release_name> sumologic/sumologic --values changed-fullnameoverride.yaml
kubectl -n <namespace> rollout restart statefulset <prometheus_statefulset_name>

OpenTelemetry Collector Autoscaling​

Autoscaling for logs metadata, metrics metadata, metrics collector, otelcol instrumentation, and traces gateway is enabled by default.

Disabling autoscaling for all components can be done by overriding the sumologic.autoscaling.enabled parameter.

sumologic:
autoscaling:
enabled: false

Furthermore, you can disable autoscaling for individual components. Keep in mind that the autoscaling.enabled parameter always takes precedence over the global sumologic.autoscaling.enabled parameter. Use the following configuration to accomplish this:

  • Disable autoscaling for log metadata enrichment
    metadata:
    logs:
    autoscaling:
    enabled: false
  • Disable autoscaling for metrics metadata enrichment
    metadata:
    metrics:
    autoscaling:
    enabled: false
  • Disable autoscaling for metrics collector
    sumologic:
    metrics:
    collector:
    otelcol:
    autoscaling:
    enabled: false
  • Disable autoscaling for otelcol instrumentation
    otelcolInstrumentation:
    autoscaling:
    enabled: false
  • Disable autoscaling for traces gateway
    tracesGateway:
    autoscaling:
    enabled: false

It's also possible to adjust other autoscaling configuration options, like the maximum number of replicas or the average CPU utilization. Refer to the chart readme and default values.yaml for details.

If metrics-server is not already installed in your cluster, it is required to enable metrics-server.

## Configure metrics-server
## ref: https://github.com/bitnami/charts/tree/master/bitnami/metrics-server/values.yaml
metrics-server:
enabled: true

Cleaning unused PVCs​

When autoscaling is enabled, new Persistent Volume Claims (PVCs) will be created for new pods. However, Kubernetes doesn't support removing unused PVCs after downscaling a StatefulSet yet, which means that they will be reused when the StatefulSet gets upscaled again. This creates problems in EKS when new instance of the pod is created in a different availability zone than the initial one, as PVCs cannot be mounted across different AZs.

To solve this problem, pvcCleaner can be used to remove unused PVCs:

  • For log metadata enrichment
    pvcCleaner:
    logs:
    enabled: true
  • For metrics metadata enrichment
    pvcCleaner:
    metrics:
    enabled: true
    This will create cronJobs that will run the pvcCleaner script regularly. The schedule is by default set to 15 minutes, but it can be overridden:
    pvcCleaner:
    job:
    schedule: "*/10 * * * *"
    The schedule is written in Cron format.
note

It is recommended to enable pvcCleaner only for the types of telemetry which are autoscaled to avoid creating unnecessary cronJobs.

OpenTelemetry Collector Persistent Buffer​

The logs and metrics metadata enrichment pods store data in a persistent file-based buffer to prevent data loss during restarts. This feature is enabled by default and can be disabled using the metadata.persistence.enabled: false property.

The persistence of the event collection pods is controlled by another property: sumologic.events.persistence.enabled and is also enabled by default.

The persistent storage is implemented from the Kubernetes perspective by adding a volumeClaimTemplate to the statefulset, which in effect creates a PersistentVolumeClaim resource for each pod of the statefulset.

For logs and metrics, use the following properties to configure the PVC templates:

  • metadata.persistence.accessMode: ReadWriteOnce defines the access mode used in the PersistentVolumeClaim resources.
  • metadata.persistence.pvcLabels: {} allows to set additional labels on the PersistentVolumeClaim resources
  • metadata.persistence.size: 10Gi defines the requested volume size
  • metadata.persistence.storageClass: null defines the storage class name to use in the volume claim template. This property is not set by default, which means that the storageClassName will not be set in the PersistentVolumeClaim resources and the default storage class will be used (assuming the DefaultStorageClass admission controller is enabled on the Kubernetes API server).

For events, use the following properties:

  • sumologic.events.persistence.persistentVolume.accessMode: ReadWriteOnce
  • sumologic.events.persistence.persistentVolume.pvcLabels: {}
  • sumologic.events.persistence.persistentVolume.storageClass: null
  • sumologic.events.persistence.size: 10Gi

Collect logs from additional files on the Node​

To collect logs from additional files on Node, it is necessary to:

  • mount log file to make it accessible by log collector pod
  • create an additional pipeline in both the log collector and metadata enrichment service to transfer your data
  • open a new port for this pipeline to enable sending data from the log collector to the metadata enrichment service

The following configuration can be used for these purposes:

otellogs:
config:
merge:
receivers:
filelog/extrafiles:
include: [/var/log/extrafiles/extrafile.log]
exporters:
otlphttp/extrafiles:
endpoint: http://${LOGS_METADATA_SVC}.${NAMESPACE}.svc.cluster.local.:4319
service:
pipelines:
logs/extrafiles:
receivers: [filelog/extrafiles]
exporters: [otlphttp/extrafiles]
daemonset:
extraVolumes:
- name: extrafiles-mount
hostPath:
path: PATH_TO_YOUR_LOG_FILE
type: File
extraVolumeMounts:
- name: extrafiles-mount
mountPath: /var/log/extrafiles/extrafile.log
readOnly: true

metadata:
logs:
config:
merge:
receivers:
otlp/extrafiles:
protocols:
http:
endpoint: 0.0.0.0:4319
service:
pipelines:
logs/extrafiles:
receivers: [otlp/extrafiles]
processors:
- memory_limiter
- batch
exporters: [sumologic]
statefulset:
extraPorts:
- name: otlphttp2
containerPort: 4319
protocol: TCP

In the example above, two internally defined processors were used in metadata pipeline: batch and memory limiter. If you need to change the parameters of these processors in any way, you can define your own and use them in this pipeline.

Removing attributes from systemd logs​

If you want to remove some attributes from Systemd logs, like for example PRIORITY and SYSLOG_FACILITY, you can do it the following way:

sumologic:
logs:
systemd:
otelcol:
extraProcessors:
- transform/cleanup_systemd:
log_statements:
- context: log
statements:
- delete_key(body, "PRIORITY")
- delete_key(body, "SYSLOG_FACILITY")

Modify the Log Level for Falco​

To modify the default log level for Falco, edit the following section in the user-values.yaml file. Available log levels can be found in Falco's documentation here: https://falco.org/docs/configuration/.

falco:
## Set the enabled flag to false to disable falco.
enabled: true
falco:
json_output: true
log_level: debug

Overriding metadata using annotations​

You can use Kubernetes annotations to override some metadata and settings per pod or per namespace.

Pod annotations take precedence over namespace annotations.

  • sumologic.com/sourceCategory overrides the value of the sumologic.logs.container.sourceCategory property
  • sumologic.com/sourceCategoryPrefix overrides the value of the sumologic.logs.container.sourceCategoryPrefix property
  • sumologic.com/sourceCategoryReplaceDash overrides the value of the sumologic.logs.container.sourceCategoryReplaceDash property
  • sumologic.com/sourceName overrides the value of the sumologic.logs.container.sourceName property

The following example uses pod annotations:

apiVersion: v1
kind: ReplicationController
metadata:
name: nginx
spec:
replicas: 1
selector:
app: mywebsite
template:
metadata:
name: nginx
labels:
app: mywebsite
annotations:
sumologic.com/format: "text"
sumologic.com/sourceCategory: "mywebsite/nginx"
sumologic.com/sourceName: "mywebsite_nginx"
spec:
containers:
- name: nginx
image: nginx
ports:
- containerPort: 80

The following example uses namespace annotations:

apiVersion: v1
kind: Namespace
metadata:
annotations:
sumologic.com/sourceCategory: "namespaceSourceCategory"
sumologic.com/sourceCategoryPrefix: "namespace-Source-Category-Prefix"
sumologic.com/sourceCategoryReplaceDash: "#"
sumologic.com/sourceHost: "namespaceSourceHost"
sumologic.com/sourceName: "namespaceSourceName"
name: my-namespace

Overriding source category with pod annotations​

The following example shows how to customize the source category for data from a specific deployment. The resulting value of _sourceCategory field will be my-component:

apiVersion: v1
kind: Deployment
metadata:
name: my-component-deployment
spec:
replicas: 1
selector:
app: my-component
template:
metadata:
annotations:
sumologic.com/sourceCategory: "my-component"
sumologic.com/sourceCategoryPrefix: ""
sumologic.com/sourceCategoryReplaceDash: "-"
labels:
app: my-component
name: my-component
spec:
containers:
- name: my-component
image: my-image

The sumologic.com/sourceCategory annotation defines the source category for the data.

The empty sumologic.com/sourceCategoryPrefix annotation removes the default prefix added to the source category.

The sumologic.com/sourceCategoryReplaceDash annotation with value - prevents the dash in the source category from being replaced with another character.

Excluding data using annotations​

You can use the sumologic.com/exclude annotation to exclude data from Sumo. This data is sent to the metadata enrichment service, but not to Sumo. You can exclude data per pod using pod annotations or per namespace using namespace annotations. However, pod annotations take precedence over namespace annotations.

The following example uses pod annotations:

apiVersion: v1
kind: ReplicationController
metadata:
name: nginx
spec:
replicas: 1
selector:
app: mywebsite
template:
metadata:
name: nginx
labels:
app: mywebsite
annotations:
sumologic.com/exclude: "true"
spec:
containers:
- name: nginx
image: nginx
ports:
- containerPort: 80

The following example uses namespace annotations:

apiVersion: v1
kind: Namespace
metadata:
annotations:
sumologic.com/exclude: "true"
name: my-namespace

Including subsets of excluded data​

If you excluded a whole namespace, but still need one or few pods to be still included for shipping to Sumo, you can use the sumologic.com/include annotation to include it. It takes precedence over the exclusion described above.

apiVersion: v1
kind: ReplicationController
metadata:
name: nginx
spec:
replicas: 1
selector:
app: mywebsite
template:
metadata:
name: nginx
labels:
app: mywebsite
annotations:
sumologic.com/format: "text"
sumologic.com/sourceCategory: "mywebsite/nginx"
sumologic.com/sourceName: "mywebsite_nginx"
sumologic.com/include: "true"
spec:
containers:
- name: nginx
image: nginx
ports:
- containerPort: 80

Templating Kubernetes metadata​

The following Kubernetes metadata is available for string templating:

String templateDescription
%{namespace}Namespace name
%{pod}Full pod name (e.g., travel-products-4136654265-zpovl)
%{pod_name}Friendly pod name (e.g., travel-products)
%{pod_id}The pod's uid (a UUID)
%{container}Container name
%{source_host}Host
%{label:foo}The value of label foo

Missing labels​

Unlike the other templates, labels are not guaranteed to exist, so missing labels interpolate as "undefined".

For example, if you have only the label app: travel but you define SOURCE_NAME="%{label:app}@%{label:version}", the source name will appear as travel@undefined.

Disable logs, metrics, or falco​

If you want to disable the collection of logs, metrics, or falco, make the below changes respectively in the user-values.yaml file and run the helm upgrade command.

parametervaluefunction
sumologic.logs.enabledfalsedisable logs collection
sumologic.metrics.enabledfalsedisable metrics collection
falco.enabledfalsedisable falco

Changing scrape interval for OpenTelemetry metrics collection​

You can adjust the default metric scrapeInterval (30s) if you are using OpenTelemetry for collection (default). This is the recommended value which ensures that all of Sumo Logic dashboards are filled up with proper data.

To change it, you can use following configuration:

sumologic:
metrics:
collector:
otelcol:
scrapeInterval: 30s

Changing scrape interval for Prometheus metrics collection​

You can adjust the default metric scrapeInterval (30s) if you are using Prometheus for collection instead of OpenTelemetry (default). This is the recommended value which ensures that all of Sumo Logic dashboards are filled up with proper data.

To change it, you can use following configuration:

kube-prometheus-stack: # For user-values.yaml
prometheus:
prometheusSpec:
scrapeInterval: "1m"

Get logs not available on stdout​

When logs from a pod are not available on stdout, Tailing Sidecar Operator can help with collecting them using standard logging pipeline. To tail logs using Tailing Sidecar Operator the file with those logs needs to be accessible through a volume mounted to sidecar container.

Providing that the file with logs is accessible through volume, to enable tailing of logs using Tailing Sidecar Operator:

  • Enable Tailing Sidecar Operator by modifying user-values.yaml:

    tailing-sidecar-operator:
    enabled: true
  • Add annotation to pod from which you want to tail logs in the following format:

    metadata:
    annotations:
    tailing-sidecar: <sidecar-name-0>:<volume-name-0>:<path-to-tail-0>;<sidecar-name-1>:<volume-name-1>:<path-to-tail-1>

Example of using Tailing Sidecar Operator is described in the blog post.

Using custom Kubernetes API server address​

In order to change API server address, the following configurations can be used.

metadata:
logs:
statefulset:
extraEnvVars:
- name: KUBERNETES_SERVICE_HOST
value: my-custom-k8s.api
- name: KUBERNETES_SERVICE_PORT
value: "12345"
metrics:
statefulset:
extraEnvVars:
- name: KUBERNETES_SERVICE_HOST
value: my-custom-k8s.api
- name: KUBERNETES_SERVICE_PORT
value: "12345"

OpenTelemetry Collector queueing and batching​

OpenTelemetry comes with several parameters related to queue management.

For batch processor:

  • send_batch_size defines the number of items (logs, metrics, traces) in one batch before it's sent further down the pipeline.
  • timeout defines time after which the batch is sent regardless of the size (can be lower than send_batch_size).
  • send_batch_max_size is an upper limit of the batch size.

We could say that send_batch_size is a soft limit and send_batch_max_size is a hard limit of the batch size.

In Helm Chart's default configuration, send_batch_max_size is set to 2 * send_batch_size. It is necessary to consider changing the value of send_batch_max_size whenever send_batch_size is changed. More information can be found in sumologic-otel-collector's documentation.

For sumologic exporter:

  • max_request_body_size defines maximum size of requests to sumologic before compression.

  • timeout defines connection timeout. It is recommended to adjust this value in relation to max_request_body_size.

  • sending_queue.num_consumers is the number of consumers that dequeue batches. It translates to maximum number of parallel connections to the sumologic backend.

  • sending_queue.queue_size is capacity of the queue in terms of batches (batches can vary between 1 and send_batch_max_size)

note

As the effective value of sending_queue.queue_size depends on current traffic, there is no way to figure out optimal PVC size in relation to sending_queue.queue_size. Due to this, we recommend setting sending_queue.queue_size to high value in order to use maximum resources of PVC.

The above, in connection with PVC monitoring, can lead to constant alerts (e.g., KubePersistentVolumeFillingUp), because once filled in PVC never reduces its fill.

Compaction​

The OpenTelemetry Collector doesn't have a compaction mechanism. Local storage can only grow - it can reuse disk space that has already been allocated, but not free it. This leads to a situation where the database file can grow a lot (due to a spike in data traffic) but after some time only small piece of the file will be used for data storage (until next spike).

Examples​

Here are some useful examples and calculations for queue and batch parameters.

For the calculations below we made an assumption that a single metric data point is around 1 kilobyte in size, including metadata. This assumption is based on the average data we ingest. Persistent storage doesn't compress data so we assume that single metric data point takes 1 kilobyte on disk as well.

number_of_instances represents number of sumologic-otelcol-metrics instances.

Outage with huge metrics spike​

Let's consider a huge metrics spike in your network while connection to the Sumologic is down. Huge load means that batch processor is going to push batches due to send_batch_max_size instead of timeout. The reliability of the system can be calculated using the following formulas:

  • If limited by queue_size: number_of_instances*send_batch_max_size*sending_queue.queue_size/load_in_DPM minutes.
  • If limited by PVC size: number_of_instances*PVC_size/(1KB*load_in_DPM) minutes.

Outage with low DPM load​

Let's consider a low but constant load in your network while connection to the Sumologic is down. Low load means that batch processor is going to push batches due to timeout instead of send_batch_max_size. The reliability of the system can be calculated using the following formulas:

  • If limited by queue_size: number_of_instances*timeout[min]*sending_queue.queue_size/load_in_DPM minutes.
  • If limited by PVC size: number_of_instances*PVC_size/(1KB*load_in_DPM) minutes.

Example configuration​

Here is an example configuration to change send_batch_size, send_batch_max_size, and timeout for logs metadata otelcol, metrics metadata otelcol and logs collector otelcol.

metadata:
logs:
config:
merge:
processors:
batch:
timeout: 10s
send_batch_size: 1_024
send_batch_max_size: 2_048
metrics:
config:
merge:
processors:
batch:
timeout: 10s
send_batch_size: 1_024
send_batch_max_size: 2_048

otellogs:
config:
merge:
processors:
batch:
timeout: 10s
send_batch_size: 1_024
send_batch_max_size: 2_048

Below there is example configuration to change sending_queue for metrics metadata otelcol, logs metadata otelcol and logs collector otelcol

metadata:
logs:
config:
merge:
exporters:
sumologic/containers:
sending_queue:
queue_size: 25000
sumologic/systemd:
sending_queue:
queue_size: 25000

metrics:
config:
merge:
exporters:
sumologic/apiserver:
sending_queue:
queue_size: 25000
sumologic/control_plane:
sending_queue:
queue_size: 25000
sumologic/controller:
sending_queue:
queue_size: 25000
sumologic/default:
sending_queue:
queue_size: 25000
sumologic/kubelet:
sending_queue:
queue_size: 25000
sumologic/node:
sending_queue:
queue_size: 25000
sumologic/scheduler:
sending_queue:
queue_size: 25000
sumologic/state:
sending_queue:
queue_size: 25000

otellogs:
config:
merge:
exporters:
otlphttp:
sending_queue:
queue_size: 20

Assigning Pod to particular Node​

Using NodeSelectors​

Kubernetes offers a feature of assigning specific pod to node. Such kind of control is sometimes useful, whenever you want to ensure that pod will end up on specific node according your requirements like operating system or connected devices.

Binding pods to linux nodes​

Using this feature we can bind them to linux nodes. In order to do that nodeSelector has to be used. Node selector can be changed via additional parameter in user-values.yaml, see an example for PVC Cleaner below:

pvcCleaner:
job:
nodeSelector:
kubernetes.io/os: linux

You can also specify a global nodeSelector option that will be used for all components except for the subcharts. The option for a component has a higher priority than the global one.

To specify the global option, use key sumologic.nodeSelector:

sumologic:
nodeSelector:
kubernetes.io/os: linux

metadata:
metrics:
statefulset:
nodeSelector:
kubernetes.io/os: metricOS

The above config will set:

  • nodeSelector.kubernetes.io/os to metricOS for metadata metrics statefulset
  • nothing for subcharts
  • nodeSelector.kubernetes.io/os to linux for other components

See the section about global customizations for information about setting nodeSelectors for subchart resources.

Setting different resources on different nodes for logs collector​

All of the log collector Pods have the same resource settings, being members of the same DaemonSet. However, sometimes it's necessary to vary this based on the Node - for example when some large Nodes can contain applications generating a lot of log data.

It's possible to set different resource quotas for different Nodes based on labels, resulting in a different log collector DaemonSet for these Nodes.

Let's consider the following example.

You have node group with common label workingGroup: IntenseLogGeneration and for that specific group of nodes, you would like to set different resources than for rest of the cluster.

We assume that you only want to set requests.cpu to 2 and limits.cpu to 10 and have the same memory values like main DaemonSet.

otellogs:
additionalDaemonSets:
## intense will be suffix for daemonset for easier recognition
intense:
nodeSelector:
## we are using nodeSelector to select only nodes with `workingGroup` label set to `IntenseLogGeneration`
workingGroup: IntenseLogGeneration
resources:
requests:
cpu: 1
limits:
cpu: 10
daemonset:
# For main daemonset, we need to set nodeAffinity to not schedule on nodes with `workingGroup` label set to `IntenseLogGeneration`
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: workingGroup
operator: NotIn
values:
- IntenseLogGeneration

Keeping Source Category for metrics​

In order to keep Source Category for metrics, the following configuration can be applied:

metadata:
metrics:
config:
merge:
processors:
resource/delete_source_metadata:
attributes:
## Do not remove _sourceCategory
# - key: _sourceCategory
# action: delete
- key: _sourceHost
action: delete
- key: _sourceName
action: delete

Using newer Kube Prometheus Stack​

Due to breaking changes, we do not support the latest Kube Prometheus Stack. We are aware that it can be a major issue, so this section describes how to install newer Kube Prometheus Stack to work with our collection.

  1. Prepare values.yaml for Kube Prometheus Stack.
    • copy content of kube-prometheus-stack from values.yaml, e.g., for Sumologic Kubernetes Collection v3.9 you need to copy these lines
    • remove configuration for images for Kube Prometheus Stack to use newer versions, e.g., for Sumologic Kubernetes Collection v3.9 you need to remove tag for kube-state-metrics
    • add your custom configuration
  2. Upgrade sumologic chart without Kube Prometheus Stack by adding the following configuration to your values.yaml:
    kube-prometheus-stack:
    enabled: false
  3. Verify changes in newer Kube Prometheus Stack which you want to install, you can find information about changes here.
  4. Install CRD for Kube Prometheus Stack to do this please see upgrade instruction for Kube Prometheus Stack, for example to upgrade from 40.x to 41.x:
    kubectl apply --server-side -f
    https://raw.githubusercontent.com/prometheus-operator/prometheus-operator/v0.60.1/example/prometheus-operator-crd/monitoring.coreos.com_alertmanagerconfigs.yaml
    kubectl apply --server-side -f
    https://raw.githubusercontent.com/prometheus-operator/prometheus-operator/v0.60.1/example/prometheus-operator-crd/monitoring.coreos.com_alertmanagers.yaml
    kubectl apply --server-side -f
    https://raw.githubusercontent.com/prometheus-operator/prometheus-operator/v0.60.1/example/prometheus-operator-crd/monitoring.coreos.com_podmonitors.yaml
    kubectl apply --server-side -f
    https://raw.githubusercontent.com/prometheus-operator/prometheus-operator/v0.60.1/example/prometheus-operator-crd/monitoring.coreos.com_probes.yaml
    kubectl apply --server-side -f
    https://raw.githubusercontent.com/prometheus-operator/prometheus-operator/v0.60.1/example/prometheus-operator-crd/monitoring.coreos.com_prometheuses.yaml
    kubectl apply --server-side -f
    https://raw.githubusercontent.com/prometheus-operator/prometheus-operator/v0.60.1/example/prometheus-operator-crd/monitoring.coreos.com_prometheusrules.yaml
    kubectl apply --server-side -f
    https://raw.githubusercontent.com/prometheus-operator/prometheus-operator/v0.60.1/example/prometheus-operator-crd/monitoring.coreos.com_servicemonitors.yaml
    kubectl apply --server-side -f
    https://raw.githubusercontent.com/prometheus-operator/prometheus-operator/v0.60.1/example/prometheus-operator-crd/monitoring.coreos.com_thanosrulers.yaml
  5. Install Kube Prometheus Stack in the same namespace in which collection has been installed:
    helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
    helm repo update

    helm install kube-prometheus prometheus-community/kube-prometheus-stack \
    --namespace <NAMESPACE> \
    --version <VERSION> \
    -f values-prometheus.yaml

Lowering default ingest​

By default, the Helm Chart collects some data necessary for Sumo Logic dashboards to work. If these dashboards are not used, we suggest to disable collection of this data in order to lower the ingest.

In order to learn more about filtering out data please see the doc about filtering data.

Global customizations​

Some common Kubernetes resource attributes can be set globally for all resources managed by the Helm Chart. Due to Helm's limitations, they need to be separately set for subchart resources. This section lists the relevant attributes and configuration values.

note

Each subchart can be independently configured. To make sure you do not miss anything you want to change, make sure to check their documentation. Links to documentations of subcharts can be found here.

Node selectors​

These options allow you to set node selectors for whole subcharts. The values under these keys must be a map of node selectors. More information about node selectors can be found here.

subchartkey
sumologicsumologic.nodeSelector
kube-prometheus-stackkube-prometheus-stack.prometheus-node-exporter.nodeSelector
kube-state-metricskube-prometheus-stack.kube-state-metrics.nodeSelector
prometheuskube-prometheus-stack.prometheus.prometheusSpec.nodeSelector
opentelemetry-operatoropentelemetry-operator.nodeSelector
falcofalco.nodeSelector
telegraf-operatortelegraf-operator.nodeSelector

In the main sumologic chart, value specified in sumologic.nodeSelector can be overridden for the following pods:

componentkey
remoteWriteProxysumologic.metrics.remoteWriteProxy.nodeSelector
otelcolInstrumentationotelcolInstrumentation.statefulset.nodeSelector
tracesGatewaytracesGateway.deployment.nodeSelector
otellogsotellogs.daemonset.nodeSelector
setupJobsumologic.setup.job.nodeSelector
tracesSamplertracesSampler.deployment.nodeSelector
metadatametadata.metrics.statefulset.nodeSelector
metadatametadata.logs.statefulset.nodeSelector
pvcCleanerpvcCleaner.job.nodeSelector
metrics otelcolsumologic.metrics.collector.otelcol.nodeSelector

Tolerations​

These options allow you to set tolerations for whole subcharts. The values under these keys must be a list of tolerations. More information about node selectors can be found here.

subchartkey
sumologicsumologic.tolerations
kube-prometheus-stackkube-prometheus-stack.prometheus-node-exporter.tolerations
kube-state-metricskube-prometheus-stack.kube-state-metrics.tolerations
prometheuskube-prometheus-stack.prometheus.prometheusSpec.tolerations
opentelemetry-operatoropentelemetry-operator.tolerations
falcofalco.tolerations
telegraf-operatortelegraf-operator.tolerations

In the main sumologic chart, value specified in sumologic.tolerations can be overridden for the following pods:

componentkey
remoteWriteProxysumologic.metrics.remoteWriteProxy.tolerations
otelcolInstrumentationotelcolInstrumentation.statefulset.tolerations
tracesGatewaytracesGateway.deployment.tolerations
otellogsotellogs.daemonset.tolerations
setupJobsumologic.setup.job.tolerations
tracesSamplertracesSampler.deployment.tolerations
metadatametadata.metrics.statefulset.tolerations
metadatametadata.logs.statefulset.tolerations
pvcCleanerpvcCleaner.job.tolerations
metrics otelcolsumologic.metrics.collector.otelcol.tolerations

Affinity​

These options allow you to set affinity for whole subcharts. The values under these keys must be a map of affinities and anti-affinities. More information about affinity and anti-affinity can be found here.

subchartkey
sumologicsumologic.affinity
kube-prometheus-stackkube-prometheus-stack.prometheus-node-exporter.affinity
kube-state-metricskube-prometheus-stack.kube-state-metrics.affinity
prometheuskube-prometheus-stack.prometheus.prometheusSpec.affinity
opentelemetry-operatoropentelemetry-operator.affinity
falcofalco.affinity
telegraf-operatortelegraf-operator.affinity

In the main sumologic chart, value specified in sumologic.affinity can be overridden for the following pods:

componentkey
remoteWriteProxysumologic.metrics.remoteWriteProxy.affinity
otelcolInstrumentationotelcolInstrumentation.statefulset.affinity
tracesGatewaytracesGateway.deployment.affinity
otellogsotellogs.daemonset.affinity
setupJobsumologic.setup.job.affinity
tracesSamplertracesSampler.deployment.affinity
metadatametadata.metrics.statefulset.affinity
metadatametadata.logs.statefulset.affinity
pvcCleanerpvcCleaner.job.affinity
metrics otelcolsumologic.metrics.collector.otelcol.affinity

Pod labels​

These options allow you to set pod labels for whole subcharts. The values under these keys must be a map of pod labels. More information about labels can be found here.

subchartkey
sumologicsumologic.podLabels
kube-prometheus-stackkube-prometheus-stack.prometheus-node-exporter.podLabels
prometheuskube-prometheus-stack.prometheus.prometheusSpec.podMetadata.labels
opentelemetry-operatoropentelemetry-operator.manager.podLabels
falcofalco.podLabels
metrics-servermetrics-server.podLabels

In the main sumologic chart, you can specify additional pod labels for these pods:

componentkey
remoteWriteProxysumologic.metrics.remoteWriteProxy.podLabels
otelcolInstrumentationotelcolInstrumentation.statefulset.podLabels
tracesGatewaytracesGateway.deployment.podLabels
otellogsotellogs.daemonset.podLabels
setupJobsumologic.setup.job.podLabels
tracesSamplertracesSampler.deployment.podLabels
metadata (all)metadata.podLabels
metadata (metrics)metadata.metrics.statefulset.podLabels
metadata (logs)metadata.logs.statefulset.podLabels
pvcCleanerpvcCleaner.job.podLabels
metrics otelcolsumologic.metrics.collector.otelcol.podLabels

Pod annotations​

These options allow you to set pod annotations for whole subcharts. The values under these keys must be a map of pod annotations. More information about annotations can be found here.

subchartkey
sumologicsumologic.podAnnotations
kube-prometheus-stackkube-prometheus-stack.prometheus-node-exporter.podAnnotations
kube-state-metricskube-prometheus-stack.kube-state-metrics.podAnnotations
prometheuskube-prometheus-stack.prometheus.prometheusSpec.podMetadata.annotations
opentelemetry-operatoropentelemetry-operator.manager.podAnnotations
falcofalco.podAnnotations
metrics-servermetrics-server.podAnnotations

In the main sumologic chart, you can specify additional pod annotations for these pods:

componentkey
remoteWriteProxysumologic.metrics.remoteWriteProxy.podAnnotations
otelcolInstrumentationotelcolInstrumentation.statefulset.podAnnotations
tracesGatewaytracesGateway.deployment.podAnnotations
otellogsotellogs.daemonset.podAnnotations
setupJobsumologic.setup.job.podAnnotations
tracesSamplertracesSampler.deployment.podAnnotations
metadata (all)metadata.podAnnotations
metadata (metrics)metadata.metrics.statefulset.podAnnotations
metadata (logs)metadata.logs.statefulset.podAnnotations
pvcCleanerpvcCleaner.job.podAnnotations
metrics otelcolsumologic.metrics.collector.otelcol.podAnnotations

Images​

Every image configuration is a map that consists of two elements: repository and tag. Images defined in this helm chart also have pullPolicy key. For subcharts, use the following keys to override main images:

subchartkey
kube-prometheus-stackkube-prometheus-stack.prometheus-node-exporter.image
kube-state-metricskube-prometheus-stack.kube-state-metrics.image
prometheuskube-prometheus-stack.prometheus.prometheusSpec.image
opentelemetry-operatoropentelemetry-operator.manager.image
falcofalco.image
metrics-servermetrics-server.image
telegraf-operatortelegraf-operator.image

In the main sumologic chart, you can specify additional pod annotations for these pods:

componentkey
remoteWriteProxysumologic.metrics.remoteWriteProxy.image
otelcolInstrumentationotelcolInstrumentation.statefulset.image
tracesGatewaytracesGateway.deployment.image
otellogsotellogs.daemonset.image
setupJobsumologic.setup.job.image
tracesSamplertracesSampler.deployment.image
metadata (all)metadata.image
pvcCleanerpvcCleaner.job.image
metrics otelcolsumologic.metrics.collector.otelcol.image
oteleventsotelevents.image

You can also set a default image for every OpenTelemetry collector instance. To do so, use key sumologic.otelcolImage:

sumologic:
otelcolImage:
repository: "public.ecr.aws/sumologic/sumologic-otel-collector"
tag: "0.90.1-sumo-0"

## Add a -fips suffix to all image tags. With default tags, this results in FIPS-compliant otel images.
## See https://github.com/SumoLogic/sumologic-otel-collector/blob/main/docs/fips.md for more information.
addFipsSuffix: false

Use containers only from Docker Hub registry​

If DockerHub is the only available Docker registry option, we recommend the following configuration:

sumologic:
setup:
job:
image:
repository: docker.io/sumologic/kubernetes-setup
initContainerImage:
repository: docker.io/sumologic/busybox
otelcolImage:
repository: docker.io/sumologic/sumologic-otel-collector
metrics:
remoteWriteProxy:
image:
repository: docker.io/sumologic/nginx-unprivileged
falco:
image:
registry: docker.io
extra:
initContainers:
## Add initContainer to wait until kernel-devel is installed on host
- name: init-falco
image: docker.io/sumologic:1.36.0
command:
- "sh"
- "-c"
- |
while [ -f /host/etc/redhat-release ] && [ -z "$(ls /host/usr/src/kernels)" ] ; do
echo "waiting for kernel headers to be installed"
sleep 3
done
volumeMounts:
- mountPath: /host/usr
name: usr-fs
readOnly: true
- mountPath: /host/etc
name: etc-fs
readOnly: true
driver:
loader:
initContainer:
image:
registry: docker.io
otellogs:
daemonset:
initContainers:
changeowner:
image:
repository: docker.io/sumologic/busybox
opentelemetry-operator:
instrumentation:
dotnet:
repository: docker.io/sumologic/autoinstrumentation-dotnet
java:
repository: docker.io/sumologic/autoinstrumentation-java
python:
repository: docker.io/sumologic/autoinstrumentation-python
nodejs:
repository: docker.io/sumologic/autoinstrumentation-nodejs
instrumentationJobImage:
image:
repository: docker.io/sumologic/kubernetes-tools-kubectl
manager:
collectorImage:
repository: docker.io/sumologic/sumologic-otel-collector
image:
repository: docker.io/sumologic/opentelemetry-operator
kubeRBACProxy:
image:
repository: docker.io/sumologic/kube-rbac-proxy
testFramework:
image:
repository: docker.io/sumologic/busybox
tailing-sidecar-operator:
kubeRbacProxy:
image:
repository: docker.io/sumologic/kube-rbac-proxy
operator:
image:
repository: docker.io/sumologic/tailing-sidecar-operator
sidecar:
image:
repository: docker.io/sumologic/tailing-sidecar
telegraf-operator:
image:
# Ensure the image tag is the same like in Sumo Logic Kubernetes Collection Chart
sidecarImage: docker.io/sumologic/telegraf:1.21.2
repository: docker.io/sumologic/telegraf-operator
kube-prometheus-stack:
prometheus-node-exporter:
image:
repository: docker.io/sumologic/node-exporter
kube-state-metrics:
image:
repository: docker.io/sumologic/kube-state-metrics
prometheusOperator:
image:
repository: docker.io/sumologic/prometheus-operator
prometheus:
prometheusSpec:
image:
repository: docker.io/sumologic/prometheus
pvcCleaner:
job:
image:
repository: docker.io/sumologic/kubernetes-tools-kubectl
debug:
sumologicMock:
image:
repository: docker.io/sumologic/sumologic-mock
Status
Legal
Privacy Statement
Terms of Use

Copyright © 2025 by Sumo Logic, Inc.