
Collect Logs for Google Cloud VPC

Instructions for collecting logs from Google Cloud VPC.

This page describes the Sumo pipeline for ingesting logs from Google Cloud Platform (GCP) services, and provides instructions for collecting logs from Google Cloud VPC.

Collection process for GCP services

The key components in the collection process for GCP services are: Google Logs Export, Google Cloud Pub/Sub, and Sumo's Google Cloud Platform (GCP) Source running on a hosted collector.

The integration works like this: the GCP service generates logs, which are exported and published to a Google Pub/Sub topic via Stackdriver. You then set up a Sumo Logic Google Cloud Platform Source that subscribes to this topic and receives the exported log data.


Configuring collection for GCP uses the following process: 

  1. Configure a GCP source on a hosted collector. You'll obtain the HTTP URL for the source, and then use Google Cloud Console to register the URL as a validated domain.  
  2. Create a topic in Google Pub/Sub and subscribe the GCP source URL to that topic.
  3. Create an export of GCP logs from Google Stackdriver Logging. Exporting involves writing a filter that selects the log entries you want to export, and choosing a Pub/Sub topic as the destination. The filter and destination are held in an object called a sink.

See the following sections for configuration instructions.
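
Before you configure the pipeline, you can verify that VPC flow logs are actually being written in your project. Here is a minimal sketch using the gcloud CLI; it assumes the Cloud SDK is installed and authenticated, and the subnet and region names are placeholders.

  # Preview a few VPC flow log entries. This is the same filter the
  # export sink uses later on this page.
  gcloud logging read 'resource.type="gce_subnetwork"' --limit=3

  # If nothing comes back, flow logging may not be enabled on the subnet.
  gcloud compute networks subnets update <your-subnet> \
      --region=<your-region> --enable-flow-logs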

Configure a Google Cloud Platform Source

The Google Cloud Platform (GCP) Source receives log data from Google Pub/Sub.

This Source is a Google Pub/Sub-only Source: it can be used only for log data formatted as coming from Google Pub/Sub.

  1. In Sumo Logic, select Manage Data > Collection > Collection.
  2. Select an existing Hosted Collector upon which to add the Source. If you don't already have a Collector you'd like to use, create one, using the instructions on Configure a Hosted Collector.
  3. Click Add Source next to the Hosted Collector and click Google Cloud Platform.
  4. Enter a Name to display for the Source. Description is optional.
  5. (Optional) For Source Host and Source Category, enter any string to tag the output collected from the source. Category metadata is stored in a searchable field called _sourceCategory, for example, "gcp".
  6. Fields. Click the +Add Field link to add custom log metadata Fields, then define the fields you want to associate. Each field needs a name (key) and value. Look for one of the following icons and act accordingly:
    • If an orange triangle with an exclamation point is shown, use the option to automatically add or enable the nonexistent fields before proceeding to the next step. The orange icon indicates that the field doesn't exist, or is disabled, in the Fields table schema. A field sent to Sumo that doesn't exist in the Fields schema, or is disabled, is ignored (dropped).
    • If a green circle with a checkmark is shown, the field exists and is already enabled in the Fields table schema. Proceed to the next step.
  7. Advanced Options for Logs
    • Enable Timestamp Parsing. This option is selected by default. If it's deselected, no timestamp information is parsed at all.
    • Time Zone. There are two options for Time Zone. You can use the time zone present in your log files, choosing a fallback option for messages in which time zone information is missing, or you can have Sumo Logic completely disregard any time zone information in the logs by forcing a time zone. Whichever option you choose, it is important to set the time zone properly: if the time zone of a log can't be determined, Sumo Logic assigns it UTC, and if the rest of your logs come from another time zone your search results will be affected.
    • Timestamp Format. By default, Sumo Logic will automatically detect the timestamp format of your logs. However, you can manually specify a timestamp format for a Source. See Timestamps, Time Zones, Time Ranges, and Date Formats for more information.
  8. Processing Rules for Logs. Configure desired filters (include, exclude, hash, or mask) as described in Create a Processing Rule. Processing rules are applied to log data, but not to metric data. Note that the Sumo service receives all of your data; ingestion is then governed by the regular expressions you specify in processing rules.
  9. When you are finished configuring the Source click Save.
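
When you save the Source, Sumo Logic displays its HTTP endpoint URL; this is the URL you subscribe to the Pub/Sub topic. If you script the topic and subscription rather than using the Cloud Console, the commands look roughly like the following sketch. The topic and subscription names (pub-sub-logs, push-to-sumo) match the examples used later on this page, and the endpoint URL is a placeholder for the one your Source reports.

  # Create the topic that the export sink will publish into.
  gcloud pubsub topics create pub-sub-logs

  # Subscribe the Sumo GCP Source's HTTP endpoint to the topic. Replace
  # the placeholder with the URL shown when you saved the Source.
  gcloud pubsub subscriptions create push-to-sumo \
      --topic=pub-sub-logs \
      --push-endpoint="https://<your-sumo-gcp-source-url>"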

Create export of Google Cloud VPC logs from Google Logging

In this step, you export logs to the Pub/Sub topic you created earlier.

  1. Go to Logging and click Logs Router.
  2. Click Create Sink.
  3. Click the arrow to Filter by label or text and select Convert to advanced filter.

  4. For resource_type, replace "<resource_variable>" with "gce_subnetwork".
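
After the replacement, the advanced filter should contain a line like the following (standard Google logging filter syntax for VPC flow logs):

  resource.type="gce_subnetwork"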

Create a sink for each GCP service whose logs you want to send to Sumo. For Google Cloud VPC, create a sink for the GCE Subnetwork service. To configure a sink, do the following:

  1. Select the service in the middle pane (GCE Subnetwork).

  2. In the Edit Export window on the right:

    1. Set the Sink Name. For example, "gcp-subnetwork."
    2. Set Sink Service to "Cloud Pub/Sub".
    3. Set Sink Destination to the newly created Pub/Sub topic. For example, pub-sub-logs.
    4. Click Create Sink.
  3. By default, GCP logs are stored within Stackdriver, but you can configure Stackdriver to exclude them without affecting the export to Sumo Logic described above. To exclude Stackdriver logs, follow the instructions in the GCP documentation on log exclusions.
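
If you create the sink from the command line instead, note that the sink's writer identity must be granted permission to publish to the topic, a step the Cloud Console typically offers to perform for you. A sketch reusing the example names above, with the project ID as a placeholder:

  # Create the sink, exporting VPC flow logs to the Pub/Sub topic.
  gcloud logging sinks create gcp-subnetwork \
      pubsub.googleapis.com/projects/<your-project-id>/topics/pub-sub-logs \
      --log-filter='resource.type="gce_subnetwork"'

  # Grant the sink's writer identity permission to publish to the topic.
  WRITER=$(gcloud logging sinks describe gcp-subnetwork \
      --format='value(writerIdentity)')
  gcloud pubsub topics add-iam-policy-binding pub-sub-logs \
      --member="$WRITER" --role=roles/pubsub.publisher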

Sample log message 

  "message": {
    "data": {
      "insertId": "h7cue3dc1fr",
      "jsonPayload": {
        "bytes_sent": "1836",
        "connection": {
          "dest_ip": "",
          "dest_port": 443,
          "protocol": 6,
          "src_ip": "",
          "src_port": 56552
        "dest_location": {
          "city": "Ashburn",
          "continent": "America",
          "country": "usa",
          "region": "Virginia"
        "end_time": "2018-01-26T12:35:10.115UTC",
        "packets_sent": "20",
        "reporter": "SRC",
        "rtt_msec": "49",
        "src_instance": {
          "project_id": "bmlabs-loggen",
          "region": "us-central1",
          "vm_name": "vm-selectstar-collector-again",
          "zone": "us-central1-c"
        "src_vpc": {
          "project_id": "bmlabs-loggen",
          "subnetwork_name": "default",
          "vpc_name": "default"
        "start_time": "2018-01-26T12:35:10.115UTC"
      "logName": "projects/bmlabs-loggen/logs/",
      "receiveTimestamp": "2018-01-26T12:35:10.115UTC",
      "resource": {
        "labels": {
          "location": "us-central1-c",
          "project_id": "bmlabs-loggen",
          "subnetwork_id": "3656133720937113003",
          "subnetwork_name": "default"
        "type": "gce_subnetwork"
      "timestamp": "2018-01-26T12:35:10.115UTC"
    "attributes": {
      "": "2018-01-26T12:35:10.115UTC"
    "message_id": "172581793992900",
    "messageId": "172581793992900",
    "publish_time": "2018-01-26T12:35:10.115UTC",
    "publishTime": "2018-01-26T12:35:10.115UTC"
  "subscription": "projects/bmlabs-loggen/subscriptions/push-to-sumo"

Example query

Average latency (ms) by subnet ID

_collector="HTTP Source for GCP Pub/Sub" logName resource timestamp
| json "" as type 
| parse regex "\"logName\":\"(?<log_name>[^\"]+)\"" 
| where type = "gce_subnetwork" | where log_name matches "projects/*/logs/"
| json "" as resource | json field=resource "labels.location","labels.project_id","labels.subnetwork_id","labels.subnetwork_name" as zone,project,subnetwork_id,subnetwork_name nodrop
| json "", "" as labels, payload
| json field=payload "src_instance","dest_instance" as src_instance,dest_instance nodrop 
| json field=payload "src_vpc.vpc_name","dest_vpc.vpc_name" as src_vpc,dest_vpc nodrop
| json field=payload "connection.src_ip","connection.dest_ip","connection.dest_port","connection.src_port" as src_ip,dest_ip,dest_port,src_port 
| json field=src_instance "project_id", "zone", "region", "vm_name" as src_project, src_zone, src_region, src_vm nodrop 
| json field=dest_instance "project_id", "zone", "region", "vm_name" as dest_project, dest_zone, dest_region, dest_vm nodrop
| json field=payload "bytes_sent","rtt_msec","packets_sent"  as bytes, rtt,packets  
| timeslice 1m
| avg(rtt) as latency by _timeslice, subnetwork_id, subnetwork_name 
| transpose row _timeslice column subnetwork_id,subnetwork_name