In some cases, Sumo Logic disables a metrics source to limit the number of ingested time series. This is referred to as blacklisting. Sumo blacklists a metrics source that has received too many unique time series.
For a CloudWatch Metrics source or any source sending Prometheus metrics, the limit is 1M time series. For all other metric sources and formats, the limit is 500K. In both cases, the limit applies to the total count of unique time series received from the source over the last 7 days.
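The limit check described above can be sketched as follows. This is an illustrative model only, not Sumo Logic's implementation: the source-type labels, the function names, and the use of a strict "greater than" comparison are assumptions made for the example.

```python
from datetime import datetime, timedelta

# Limits as stated above. The source-type labels are hypothetical.
LIMITS = {"cloudwatch": 1_000_000, "prometheus": 1_000_000}
DEFAULT_LIMIT = 500_000
WINDOW = timedelta(days=7)


def unique_series_in_window(observations, now):
    """Count distinct time series seen in the 7 days before `now`.

    observations: iterable of (timestamp, series_id) pairs.
    """
    cutoff = now - WINDOW
    return len({series_id for ts, series_id in observations if ts >= cutoff})


def exceeds_limit(observations, source_type, now):
    """Return True if the source is over its limit (assumed strict comparison)."""
    limit = LIMITS.get(source_type, DEFAULT_LIMIT)
    return unique_series_in_window(observations, now) > limit
```

Repeated data points for the same series count once, and series last seen more than 7 days ago drop out of the count.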
This page has information about the blacklisting process and how to resolve the problem.
Why Sumo blacklists a metrics source
Too many unique time series can occur with metrics whose names contain dynamically generated strings, for example a timestamp. This is sometimes the case with Graphite metrics. Similarly, Dropwizard metric names sometimes contain thread names. With dynamically generated metric names, a new time series is created for each new name, so the total number of time series can grow without bound over time. Such time series are typically of little use, given their ephemeral nature.
In the case of EC2 CloudWatch metrics, Amazon’s metric naming convention causes a new time series to be created for each EC2 instance.
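To illustrate how dynamically generated names inflate the series count, the sketch below compares names that embed a thread name with a single stable name. The metric names are hypothetical and chosen only for the example.

```python
def count_unique_series(names):
    """Each distinct metric name counts as one time series."""
    return len(set(names))


# Hypothetical Dropwizard-style names that embed a thread name.
dynamic_names = [f"app.pool.worker-{i}.active" for i in range(1000)]

# The same 1000 measurements reported under one stable name.
stable_names = ["app.pool.active" for _ in range(1000)]

dynamic_count = count_unique_series(dynamic_names)  # one series per thread
stable_count = count_unique_series(stable_names)    # a single series
```

Every new thread (or timestamp) adds another series in the first case, while the second case stays at one series no matter how many data points are sent.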
When you run a metrics query that matches one or more blacklisted metrics sources, the following message is presented at the top of the metrics query tab:
Some of your metrics Sources are sending too many unique time series and have been temporarily disabled. The data sent while Sources are disabled is not recoverable. To re-enable the Sources, click Re-enable the sources.
A dialog box prompts you to re-enable the Source. Selecting Re-enable Sources cancels the blacklist for the moment, but blacklisting will soon be reinstated if the underlying issue is not addressed.
Audit logging and notifications for blacklisted metric sources
Sumo writes the following message to the audit index when it blacklists a metrics source:
User 0000000000000475 This metrics source has sent too many unique time series and has been temporarily disabled. The data sent while this source is disabled cannot be recovered. Details: type=Source SourceId=0000000006158DAE, SourceName=carbon2udp, CollectorId=00000000060BF6D2, CollectorName=stag-metricsstore-2, SourceHostName=stag-metricsstore-2, SourceCategory=metrics, createdBy=Collector Registration, modifiedBy=Collector Registration
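If you process audit messages programmatically, the fields after "Details:" can be pulled into a dictionary. The sketch below is based only on the single sample message above; the regular expression and the function name are assumptions, and the real message layout may vary.

```python
import re

# Sample audit message copied from the example above.
SAMPLE = (
    "User 0000000000000475 This metrics source has sent too many unique "
    "time series and has been temporarily disabled. The data sent while this "
    "source is disabled cannot be recovered. Details: type=Source "
    "SourceId=0000000006158DAE, SourceName=carbon2udp, "
    "CollectorId=00000000060BF6D2, CollectorName=stag-metricsstore-2, "
    "SourceHostName=stag-metricsstore-2, SourceCategory=metrics, "
    "createdBy=Collector Registration, modifiedBy=Collector Registration"
)

# A key=value pair ends where the next "key=" begins or at the end of the text.
# This layout is inferred from the one sample, not from a documented format.
PAIR = re.compile(r"(\w+)=(.*?)(?=,?\s+\w+=|$)")


def parse_details(message):
    """Return the key=value fields that follow "Details:" as a dict."""
    details = message.partition("Details:")[2].strip()
    return dict(PAIR.findall(details))
```

For the sample, `parse_details(SAMPLE)["SourceName"]` yields `carbon2udp`, which identifies the source to fix.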
When Sumo blacklists a source, it sends an email notification to the creator of the source.
Solve the underlying blacklisting issue
The sections below describe strategies for solving the underlying blacklisting issue.
Modify the source configuration
If the metric source is not a CloudWatch EC2 Source, modify the time series naming convention in the Source to exclude any dynamically inserted elements, such as date, timestamp, or thread names.
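One way to apply this on the sending side is to strip dynamic segments from a metric name before it is emitted. The following is a minimal sketch, assuming dot-separated names; the patterns and the function name are hypothetical and should be adapted to your own naming scheme.

```python
import re

# Hypothetical patterns for dynamic name segments.
DYNAMIC_SEGMENTS = [
    re.compile(r"^\d{10,13}$"),            # epoch seconds or milliseconds
    re.compile(r"^\d{4}-\d{2}-\d{2}$"),    # ISO-style date
    re.compile(r"^(thread|worker)-\d+$"),  # numbered thread names
]


def stable_metric_name(name, sep="."):
    """Drop segments of `name` that match any dynamic pattern."""
    kept = [
        segment
        for segment in name.split(sep)
        if not any(pattern.match(segment) for pattern in DYNAMIC_SEGMENTS)
    ]
    return sep.join(kept)
```

With these patterns, `app.jobs.1700000000.duration` becomes `app.jobs.duration`, and names without dynamic segments pass through unchanged.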
Replace EC2 CloudWatch metrics source with host metrics source
The recommendation described above (changing your time series naming convention) does not address the issue of too many unique time series from Amazon EC2 metrics. If you encounter this problem, we recommend that, instead of using Sumo's CloudWatch source for metrics, you install a Sumo Logic Collector on your EC2 instance and configure a Sumo host metrics source on that collector. The host metrics source sends a suite of standard system metrics, such as CPU, disk, and network metrics. The information is comparable to that provided by CloudWatch, and this approach is a much more cost-effective way to collect host metrics. See Host Metrics Source for Installed Collectors for more information.