Sumo Logic

Blob Storage FAQs

This page answers frequently asked questions about the Azure Blob Storage integration.

What is FileOffsetMap?

FileOffsetMap is a table created in Azure Table Storage that is used for internal bookkeeping. Events generated by the Storage Account contain only the blob path and the blob size, so the Azure functions receive event messages with sizes such as 30 bytes, 40 bytes, and 70 bytes in random order. To determine the next byte range to read, Sumo Logic stores a mapping from each file to its current offset.
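
The bookkeeping described above can be sketched as follows. This is a minimal illustration, not the integration's actual code: the in-memory dictionary stands in for the FileOffsetMap table, and the function name `next_range` and the blob path are hypothetical.

```python
from typing import Dict, Optional, Tuple

# In-memory stand-in for the FileOffsetMap table: blob path -> next unread byte offset.
file_offset_map: Dict[str, int] = {}


def next_range(blob_path: str, blob_size: int) -> Optional[Tuple[int, int]]:
    """Return the inclusive byte range to read for this event, or None if nothing is new."""
    offset = file_offset_map.get(blob_path, 0)
    if blob_size <= offset:
        # Stale or out-of-order event: these bytes were already scheduled for reading.
        return None
    file_offset_map[blob_path] = blob_size
    return (offset, blob_size - 1)


# Events for one blob arriving out of order with sizes 40, 30, and 70 bytes.
ranges = [next_range("logs/app.log", size) for size in (40, 30, 70)]
print(ranges)  # [(0, 39), None, (40, 69)]
```

Because only the largest size seen so far advances the offset, a late event carrying a smaller size produces no duplicate read.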

How does the collection mechanism work?

For a summary of how the various components are stitched together in the pipeline, see the Monitoring data flow section of the Azure Blob Storage page.

How do I scale the function?

From the Application settings page, you can do any of the following to scale the function:

  • Increase maxBatchSize in the BlobTaskProducer host.json from the function app settings. This fetches more events and creates larger blocks for reading.
  • Increase the maxConcurrentCalls setting in the BlobTaskConsumer host.json. Increase it in small increments to avoid hitting the throttling limit.

Azure_New_Host_Key.png 

  • Increase prefetchCount to 2*maxBatchSize.

Azure-FAQ_Increase_BatchSize.png
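
The relationship between these settings can be sketched as a small helper. The function name, the starting values, and the `step` increment are hypothetical. The helper only computes candidate values and does not assume any particular host.json layout.

```python
def scaled_settings(max_batch_size: int, max_concurrent_calls: int, step: int = 2) -> dict:
    """Suggest the next values to try when scaling up.

    step is a hypothetical small increment for maxConcurrentCalls, chosen to
    stay clear of the throttling limit mentioned above.
    """
    return {
        "maxBatchSize": max_batch_size,
        # Keep the prefetch count at twice the batch size.
        "prefetchCount": 2 * max_batch_size,
        "maxConcurrentCalls": max_concurrent_calls + step,
    }


suggested = scaled_settings(max_batch_size=200, max_concurrent_calls=16)
print(suggested)
```

Apply the resulting values to the corresponding host.json files by hand, and watch the function logs for throttling errors before increasing them again.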

How do I ingest logs from Azure Blob Storage into multiple sourceCategories?

The following is a Field Extraction Rule (FER) solution. 

Extract the container name in an FER and override _sourceCategory with _sourceCategory/<containername>, so that searches use the new source category. For example:

_sourceCategory = azure_logs | json auto | parse field=resource_id "/NETWORKSECURITYGROUPS/*" 
as nsg_name | concat("azure_logs/", nsg_name) as _sourceCategory

Another approach is to modify the function to send the source category in headers. For more information, see "How do I route logs to different source categories based on log content?"
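
The header-based approach might look like the following sketch. It assumes the blob event subject has the form /blobServices/default/containers/<container_name>/blobs/<path>, and that the source category is overridden with the X-Sumo-Category header of a Sumo Logic HTTP source. The function names and the azure_logs base category are illustrative, and the deployed functions are not written in Python.

```python
from typing import Dict

CONTAINER_MARKER = "/containers/"


def container_from_subject(subject: str) -> str:
    """Extract the container name from a blob event subject ('' if absent)."""
    _, _, rest = subject.partition(CONTAINER_MARKER)
    return rest.split("/", 1)[0]


def sumo_headers(subject: str, base_category: str = "azure_logs") -> Dict[str, str]:
    """Build request headers that route logs to a per-container source category."""
    container = container_from_subject(subject)
    category = f"{base_category}/{container}" if container else base_category
    return {"X-Sumo-Category": category}


headers = sumo_headers("/blobServices/default/containers/insights-logs/blobs/2019/05/08/app.json")
print(headers)  # {'X-Sumo-Category': 'azure_logs/insights-logs'}
```

Subjects without a container segment fall back to the base category, so unexpected events are still ingested under a predictable name.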

How do I filter events by container name?

To filter events by container name, do the following:

  1. Go to Event subscription > Filters tab.
  2. Enter the following in the Subject Begins With field, replacing <container_name> with the name of the container from where you want to export logs. 
/blobServices/default/containers/<container_name>/

Azure-FAQ_Event-Subscription.png
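
To check which blobs a given prefix would match before saving the filter, a prefix test can be sketched locally. The helper names and sample subjects are hypothetical, and the sketch only mirrors the prefix match described for the Subject Begins With field.

```python
def subject_prefix(container_name: str) -> str:
    """Build the value for the Subject Begins With field."""
    return f"/blobServices/default/containers/{container_name}/"


def matches(subject: str, container_name: str) -> bool:
    """Return True if an event subject would pass the prefix filter."""
    return subject.startswith(subject_prefix(container_name))


in_container = matches("/blobServices/default/containers/logs/blobs/a.json", "logs")
similar_name = matches("/blobServices/default/containers/logs-archive/blobs/a.json", "logs")
print(in_container, similar_name)  # True False
```

The trailing slash in the prefix matters: without it, a filter for logs would also match a container named logs-archive.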

How do I troubleshoot Blob Storage integration?

  • Verify Block Blob Create Events are getting published - If events are not getting created, then either no new blobs are being created or the event grid subscription subscriber settings are not configured correctly. For example, the regex for the container may not match, or the event grid service could be down.
  • Verify Event Hub is receiving log messages - If events are not getting into the Event Hub, then the event grid subscription publisher settings are not configured properly.
  • Verify Service Bus Queue is receiving tasks - If the service bus is not receiving data, there might be something wrong with the SUMOBRTaskProducer function. Check the function's invocation logs. For example, the event payload format may have been changed by Microsoft, the function may not be able to write to the service bus, or the service bus may be down.
  • Verify with live tail - If you are not getting logs into Sumo Logic and everything else checks out, then there might be an issue in the SUMOBRTaskConsumer function. Check the function's invocation logs. For example, it may not be able to read from the Storage Account, the blob may have been deleted before it was read, or the log format may not be supported.

Blob Reader error messages

  • Error: The request is being throttled. at client.pipeline.error (D:\home\site\wwwroot\BlobTaskConsumer\node_modules\azure-arm-storage\lib\operations\storageAccounts.js:1444:19) at retryCallback (D:\home\site\wwwroot\BlobTaskConsumer\node_modules\ms-rest\lib\filters\systemErrorRetryPolicyFilter.js:89:9) at retryCallback (D:\home\site\wwwroot\BlobTaskConsumer\node_modules\ms-rest\lib\filters\exponentialRetryPolicyFilter.js:140:9) at D:\home\site\wwwroot\BlobTaskConsumer\node_module...FunctionName: BlobTaskConsumer

    Solution: Increase maxBatchSize in BlobTaskProducer's host.json. This fetches more events and creates larger blocks for reading. Then, decrease the maxConcurrentCalls setting in BlobTaskConsumer's host.json. This limits the number of concurrent invocations, reducing the number of read requests.

  • Error: HTDECK-JOBCOSTING-API__BE93-2019-05-08-14-e5260b.log"": [48255]} Exception while executing function: Functions.BlobTaskProducer Microsoft.Azure.WebJobs.Host. FunctionInvocationException : Exception while executing function: Functions.BlobTaskProducer ---> System.Exception : StorageError: The table specified does not exist. RequestId:3914a31a-e002-000e-1dad-05a995000000 Time:2019-05-08T14:48:29.9940095Z at async Microsoft.Azure.WebJobs.Script.Description.NodeFunctionInvoker.InvokeCore(Object[] parameters,FunctionInvocationContext context) at C:\projects\azure-webjobs-sdk-script\src\WebJobs.Script\Description\Node\NodeFunctionInvoker.cs : 196

    Solution: This error occurs when the FileOffsetMap table does not exist. Confirm that you created the table in Step 3: Configure Azure resources using ARM template, substep 11.

  • Error: The error in the following example occurs when the roleAssignment name is not unique, even though the name is already built from resourceGroup().id, which is unique.

    Azure-FAQ_FileOffsetMap_table.png

    For more information, see the following articles:

    https://social.msdn.microsoft.com/Forums/en-US/5267ce3b-8e48-4b1b-8e40-276006ad23e4/create-roleassignment-fails-with-error-quottenant-id-application-id-principal-id-and-scope-are?forum=WindowsAzureAD

    http://answers.flyppdevportal.com/MVC/Post/Thread/afc10f35-fa20-467e-b927-aeefdbf35eaf?category=azurescripting

    Solution: Create a new resource group for the Sumo Logic collection resources. If that doesn't fix the problem, then change the variables in the ARM template from this:

    "consumer_roleGuid": "[guid(parameters('sites_blobreaderconsumer_name'), uniqueString(deployment().name, resourceGroup().id))]",

    "dlq_roleGuid": "[guid(parameters('sites_DLQProcessor_name'), uniqueString(deployment().name, resourceGroup().id))]",

    To this: 

    "consumer_roleGuid": "[guid(parameters('sites_blobreaderconsumer_name'), uniqueString('<random unique word>', resourceGroup().id))]",

    "dlq_roleGuid": "[guid(parameters('sites_DLQProcessor_name'), uniqueString('<random unique word>', resourceGroup().id))]"

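    Why changing the seed helps can be illustrated with any deterministic GUID function. The sketch below uses Python's uuid5 purely as a stand-in and is not the algorithm behind the ARM template guid() or uniqueString() functions. The site name, deployment name, and resource group ID are hypothetical.

```python
import uuid


def role_guid(*parts: str) -> str:
    """Derive a deterministic GUID from the inputs (stand-in, not ARM's algorithm)."""
    return str(uuid.uuid5(uuid.NAMESPACE_URL, "-".join(parts)))


site = "blobreaderconsumer"                               # hypothetical site name
group_id = "/subscriptions/0000/resourceGroups/sumo-rg"   # hypothetical resource group ID

first = role_guid(site, "deployment-1", group_id)
repeat = role_guid(site, "deployment-1", group_id)
changed = role_guid(site, "my-unique-word", group_id)

print(first == repeat)   # True: same inputs always yield the same GUID
print(first == changed)  # False: a different seed word yields a new GUID
```

    Identical inputs always produce the same name, so a redeployment with unchanged inputs can collide with an existing role assignment. Substituting a new word changes the result.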
  • Error: Azure fails to install dependencies on a node. System.AggregateException : One or more errors occurred. ---> Error: Cannot find module 'azure-storage'

    Solution: Run npm install from the console.

    Azure-FAQ_BlobTaskConsumer.png

  • Error: Subscription for Microsoft.EventGrid is not registered.

    Solution: To register the provider do the following:

    1. Go to Subscriptions.

    2. Select the name of the subscription where the ARM template is deployed.

    3. Select Resource providers under Settings on the left.

    4. Search for Microsoft.EventGrid and register it.

    Azure-FAQ_Subscriptions.png

    Azure-FAQ_Register-Subscription1.png