
AWS Kinesis Firehose for Logs Source

An AWS Kinesis Firehose for Logs Source allows you to ingest CloudWatch logs or any other logs streamed and delivered via AWS Kinesis Data Firehose.

Amazon Kinesis Data Firehose is an AWS service that can reliably load streaming data into analytics platforms such as Sumo Logic. It is a fully managed service that automatically scales to match the throughput of your data and requires no ongoing administration. With Kinesis Data Firehose, you don't need to write applications or manage resources. You configure AWS services, such as VPC Flow Logs, to send logs to AWS CloudWatch, which streams them to Kinesis Data Firehose, which in turn automatically delivers them to your Sumo Logic account. This eliminates the need to create separate log processors or forwarders, such as AWS Lambda functions, which are constrained by timeout, concurrency, and memory limits.

The following diagram shows the flow of data with an AWS Kinesis Firehose for Logs Source:

[Diagram: flow of data with an AWS Kinesis Firehose for Logs Source]
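
To give a feel for the data that moves through this pipeline: CloudWatch Logs delivers subscription data as a gzip-compressed JSON document that holds a batch of log events. The following is a minimal sketch, not something you need to run in order to use the Source. The function name is hypothetical, and the log group, log stream, and message values are made up for illustration.

```python
import gzip
import json


def decode_subscription_payload(blob: bytes) -> list:
    """Return the raw log messages from one gzip-compressed subscription payload."""
    payload = json.loads(gzip.decompress(blob))
    # CONTROL_MESSAGE payloads are connectivity checks and carry no log data.
    if payload.get("messageType") != "DATA_MESSAGE":
        return []
    return [event["message"] for event in payload["logEvents"]]


# Sample payload with made-up values, shaped like a CloudWatch Logs subscription message.
sample = {
    "messageType": "DATA_MESSAGE",
    "owner": "123456789012",
    "logGroup": "example-vpc-flow-logs",
    "logStream": "example-stream",
    "subscriptionFilters": ["example-filter"],
    "logEvents": [
        {"id": "1", "timestamp": 1619713046000, "message": "first flow log record"},
        {"id": "2", "timestamp": 1619713047000, "message": "second flow log record"},
    ],
}

blob = gzip.compress(json.dumps(sample).encode("utf-8"))
messages = decode_subscription_payload(blob)
print(messages)
```

Each entry in `logEvents` corresponds to one log message that ends up in your Sumo Logic account.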

Create an AWS Kinesis Firehose for Logs Source

When you create an AWS Kinesis Firehose for Logs Source, you add it to a Hosted Collector. Before creating the Source, identify the Hosted Collector you want to use or create a new Hosted Collector. For instructions, see Configure a Hosted Collector.

To create an AWS Kinesis Firehose for Logs Source:

  1. In the Sumo Logic web app, select Manage Data > Collection > Collection.
  2. On the Collectors page, click Add Source next to a Hosted Collector.
  3. Select AWS Kinesis Firehose for Logs Source.
  4. Enter a Name for the Source. A description is optional.
  5. (Optional) For Source Host and Source Category, enter any string to tag the output collected from the Source. (Category metadata is stored in a searchable field called _sourceCategory.)
  6. SIEM Processing. Select the checkbox to forward your data to Cloud SIEM Enterprise.
  7. Fields. Click the +Add link to add custom log metadata Fields.
    • Define the fields you want to associate. Each field needs a name (key) and a value.
      • A green circle with a check mark is shown when the field exists and is enabled in the Fields table schema.
      • An orange triangle with an exclamation point is shown when the field doesn't exist, or is disabled, in the Fields table schema. In this case, an option to automatically add or enable the nonexistent fields in the Fields table schema is provided. If a field is sent to Sumo that does not exist in the Fields schema or is disabled, it is ignored (dropped).
  8. Set any of the following options under Advanced. Advanced options do not apply to uploaded metrics.
    • Enable Timestamp Parsing. This option is selected by default. If it's deselected, no timestamp information is parsed at all.
      • Time Zone. There are two options for Time Zone. You can use the time zone present in your log files, and then choose an option in case time zone information is missing from a log message. Or, you can have Sumo completely disregard any time zone information present in logs by forcing a time zone. Whichever option you choose, it's important to set the proper time zone. If the time zone of logs can't be determined, Sumo assigns them UTC; if the rest of your logs are from another time zone, your search results will be affected.
      • Timestamp Format. By default, Sumo will automatically detect the timestamp format of your logs. However, you can manually specify a timestamp format for a Source. See Timestamps, Time Zones, Time Ranges, and Date Formats for more information.
    • Enable Multiline Processing. See Collecting Multiline Logs for details on multiline processing and its options. Use this option if you're working with multiline messages (for example, Log4j messages or exception stack traces). Deselect this option if you want to avoid unnecessary processing when collecting single-message-per-line files (for example, Linux system.log).
      • Infer Boundaries. Enable when you want Sumo to automatically attempt to determine which lines belong to the same message. If you deselect the Infer Boundaries option, enter a regular expression in the Boundary Regex field to use for detecting the entire first line of multiline messages.
      • Boundary Regex. You can specify the boundary between messages using a regular expression. Enter a regular expression that matches the full first line of every multiline message in your log files.
      • Enable One Message Per Request. Select this option if you'll be sending a single message with each HTTP request. For more information, see Multiline options in HTTP sources.
  9. Processing Rules for Logs. Configure any desired filters (such as include, exclude, hash, or mask) as described in Create a Processing Rule. Processing rules are applied to log data, but not to metric data. Note that while the Sumo service will receive your data, data ingestion will be performed in accordance with the regular expressions you specify in processing rules.
  10. When you are finished configuring the Source, click Save.
  11. Copy the provided URL for the Source. You'll provide this to AWS in the next section.
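
The Boundary Regex option described above can be easier to reason about with a small example. The following is a minimal sketch of the idea only; it is not how Sumo Logic implements multiline processing. The function name, the sample log lines, and the regular expression are all hypothetical. It treats any line that the expression matches in full as the first line of a new message and appends every other line to the message before it.

```python
import re


def group_multiline(lines, boundary_regex):
    """Group lines into messages; a line fully matched by boundary_regex starts a new message."""
    pattern = re.compile(boundary_regex)
    messages = []
    for line in lines:
        if pattern.fullmatch(line) or not messages:
            messages.append(line)
        else:
            # Continuation line: attach it to the current message.
            messages[-1] += "\n" + line
    return messages


# Hypothetical boundary: the first line of each message begins with a date and time.
BOUNDARY = r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}.*"

raw = [
    "2021-04-29 09:17:26 ERROR Request failed",
    "java.lang.IllegalStateException: boom",
    "    at com.example.Handler.handle(Handler.java:42)",
    "2021-04-29 09:17:27 INFO Request served",
]

messages = group_multiline(raw, BOUNDARY)
print(len(messages))
```

Note that the expression ends in `.*` so that it covers the entire first line, not just the timestamp prefix.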

Set up CloudWatch to stream logs to Kinesis Data Firehose

You can use the AWS console or our CloudFormation Template.

AWS console
  1. Follow AWS's steps to Publish flow logs to CloudWatch Logs.

  2. Follow AWS's steps to set up a CloudWatch Logs subscription to send any incoming log events that match defined filters to your Amazon Kinesis Data Firehose delivery stream.

  3. Use the Create Delivery Stream wizard to configure Firehose to deliver logs to Sumo Logic. Provide the wizard with the Source URL you copied after creating the AWS Kinesis Firehose for Logs Source.
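
If you prefer to script the console steps above, the same setup can be approximated with the AWS SDK for Python (boto3). The following is a rough sketch under several assumptions: that the Sumo Logic destination is configured as a Firehose HTTP endpoint, that the IAM roles and the S3 bucket for failed data already exist, and that all names, ARNs, and the URL shown are placeholders. The helper function names are hypothetical. Only the request arguments are built here; nothing is sent to AWS unless you call `apply_requests`.

```python
def delivery_stream_request(stream_name, sumo_source_url, firehose_role_arn, backup_bucket_arn):
    """Build arguments for the Firehose create_delivery_stream call."""
    return {
        "DeliveryStreamName": stream_name,
        "DeliveryStreamType": "DirectPut",
        "HttpEndpointDestinationConfiguration": {
            # The URL is the one copied from the Sumo Logic Source.
            "EndpointConfiguration": {"Url": sumo_source_url, "Name": "Sumo Logic"},
            "RoleARN": firehose_role_arn,
            # Only records that could not be delivered are written to S3.
            "S3BackupMode": "FailedDataOnly",
            "S3Configuration": {
                "RoleARN": firehose_role_arn,
                "BucketARN": backup_bucket_arn,
            },
        },
    }


def subscription_filter_request(log_group, firehose_arn, cloudwatch_role_arn):
    """Build arguments for the CloudWatch Logs put_subscription_filter call."""
    return {
        "logGroupName": log_group,
        "filterName": "example-sumo-filter",  # hypothetical name
        "filterPattern": "",  # an empty pattern matches every log event
        "destinationArn": firehose_arn,
        "roleArn": cloudwatch_role_arn,
    }


def apply_requests(stream, subscription):
    """Send both requests. Requires boto3 and AWS credentials; not called in this sketch."""
    import boto3

    boto3.client("firehose").create_delivery_stream(**stream)
    # The delivery stream must be active before the subscription filter is added.
    boto3.client("logs").put_subscription_filter(**subscription)


# Placeholder values for illustration only.
SOURCE_URL = "https://example.invalid/your-source-url"
stream = delivery_stream_request(
    "example-sumo-stream",
    SOURCE_URL,
    "arn:aws:iam::123456789012:role/example-firehose-role",
    "arn:aws:s3:::example-failed-logs-bucket",
)
subscription = subscription_filter_request(
    "example-vpc-flow-logs",
    "arn:aws:firehose:us-east-1:123456789012:deliverystream/example-sumo-stream",
    "arn:aws:iam::123456789012:role/example-cloudwatch-role",
)
print(stream["HttpEndpointDestinationConfiguration"]["EndpointConfiguration"]["Url"])
```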

CloudFormation Template
Download our CloudFormation template and upload it when creating a stack on the AWS CloudFormation console.

On the Specify a stack name and parameters step of the AWS CloudFormation console, you'll provide the following:

  • Stack name
  • Sumo Logic Source endpoint URL, provided when you created the AWS Kinesis Firehose for Logs Source.
  • S3 path prefix (the prefix under which all failed logs are stored in the S3 bucket).
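
The same values can also be supplied programmatically. The following is a minimal sketch that uses boto3 and assumes you have the downloaded template available as a string. The parameter keys shown are hypothetical, so replace them with the parameter names declared in the template itself. The `Capabilities` setting is also an assumption, based on the template creating IAM resources. The stack is only created if you call `create_stack`.

```python
def stack_parameters(values):
    """Convert a plain dict into the ParameterKey/ParameterValue list CloudFormation expects."""
    return [{"ParameterKey": key, "ParameterValue": value} for key, value in values.items()]


def create_stack(stack_name, template_body, parameters):
    """Create the stack. Requires boto3 and AWS credentials; not called in this sketch."""
    import boto3

    return boto3.client("cloudformation").create_stack(
        StackName=stack_name,
        TemplateBody=template_body,
        Parameters=parameters,
        # Assumption: the template creates IAM roles, which requires this acknowledgement.
        Capabilities=["CAPABILITY_IAM"],
    )


# Hypothetical parameter keys and placeholder values.
params = stack_parameters({
    "SumoEndpointUrl": "https://example.invalid/your-source-url",
    "FailedLogsS3Prefix": "firehose-failed-logs/",
})
print(params)
```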