This page provides answers for frequently asked integration questions about Azure Blob Storage.
What is FileOffsetMap?
FileOffsetMap is a table created in Azure Table Storage that is used for internal bookkeeping. The events generated from Storage Account only contain the blob size, so the Azure functions receive event messages containing sizes such as 30 bytes, 40 bytes, and 70 bytes in random order along with blob path. Sumo Logic stores a mapping for each file and the current offset to determine the next range.
How does the collection mechanism work?
For a summary of how various components are stitched together in the pipeline, see the Monitoring data flow section of the Azure Blog Storage page.
How do I scale the function?
From the Application settings page, you can do any of the following to scale the function:
- Increasing the maxBatchSize in the BlobTaskProducer host.json from function app settings. This fetches more events and creates larger blocks for reading.
- Increasing maxConcurrentCalls calls setting in the BlobTaskConsumer host.json. It is recommended that you increase it in smaller increments so as to not hit the throttling limit.
- Increasing the prefetchcount to 2*maxBatchSize.
How do I ingest logs from Azure Blob Storage into multiple sourceCategories?
The following is a Field Extraction Rule (FER) solution.
You extract the container name in FERs and override the _sourceCategory with _sourceCategory/<containername> so that when a user searches the new sourcecategory is used. For example:
_sourceCategory = azure_logs | json auto | parse field=resource_id "/NETWORKSECURITYGROUPS/*" as nsg_name | concat('azure_logs/", nsg_name) as _sourceCategory
Another approach is to modify the function to send source category in headers. For more information, see How do I route logs to different source categories based on log content?”
How do I filter events by container name?
To filter events by container name, do the following:
- Go to Event subscription > Filters tab.
- Enter the following in the Subject Begins With field, replacing <container_name> with the name of the container from where you want to export logs.
How do I troubleshoot Blob Storage integration?
- Verify Block Blob Create Events are getting published - If events are not getting created, then either no new blobs are getting created or the event grid subscription subscriber settings is not configured right. For example, the regex for container does not match or the event grid service could be down.
- Verify Event Hub is receiving log messages - If events are not getting into the Event Hub, then the event grid subscription publisher settings are not configured properly.
- Verify Service Bus Queue is receiving tasks - If service bus is not receiving data. there might be something wrong with SUMOBRTaskProducer function. Check the function's invocation logs. For example, the event payload format may have been changed by Microsoft, it's not able to write to service bus, or the service bus may be down.
- Verify with live tail - If you are getting logs into sumo and everything else checked out, then there might be an issue in SUMOBRTaskConsumer function. Check the function's invocation logs. For example, it may not be able to read the from Storage Account, the blob may have been deleted before it was read, or the log format may not be supported.
Blob Reader error messages
Error: The request is being throttled. at client.pipeline.error (D:\home\site\wwwroot\BlobTaskConsumer\node_modules\azure-arm-storage\lib\operations\storageAccounts.js:1444:19) at retryCallback (D:\home\site\wwwroot\BlobTaskConsumer\node_modules\ms-rest\lib\filters\systemErrorRetryPolicyFilter.js:89:9) at retryCallback (D:\home\site\wwwroot\BlobTaskConsumer\node_modules\ms-rest\lib\filters\exponentialRetryPolicyFilter.js:140:9) at D:\home\site\wwwroot\BlobTaskConsumer\node_module...FunctionName: BlobTaskConsumer
Solution: Increase the maxBatchSize in BlobTaskProducer's host.json This will fetch more events and will create larger blocks for reading. Then, decrease maxConcurrentCalls calls setting in BlobTaskConsumer's host.json. This will limit the number of concurrent invocations, reducing the number of read requests.
Error: HTDECK-JOBCOSTING-API__BE93-2019-05-08-14-e5260b.log"": } Exception while executing function: Functions.BlobTaskProducer Microsoft.Azure.WebJobs.Host. FunctionInvocationException : Exception while executing function: Functions.BlobTaskProducer ---> System.Exception : StorageError: The table specified does not exist. RequestId:3914a31a-e002-000e-1dad-05a995000000 Time:2019-05-08T14:48:29.9940095Z at async Microsoft.Azure.WebJobs.Script.Description.NodeFunctionInvoker.InvokeCore(Object parameters,FunctionInvocationContext context) at C:\projects\azure-webjobs-sdk-script\src\WebJobs.Script\Description\Node\NodeFunctionInvoker.cs : 196
Solution: This error comes when FileOffsetMap does not exists. Check and confirm whether you have created the following table in Step 3: Configure Azure resources using ARM template, substep 11.
Error: You'll see a Deployment Failed error when roleAssignment is not unique but we are already using resourcegroup.id in a name that is unique. The error details are:
Tenant ID, application ID, principal ID, and scope are not allowed to be updated. (Code: RoleAssignmentUpdateNotPermitted)
For more information, see the following articles:
Solution: Create a new resource group for the Sumo Logic collection resources. If that doesn't fix the problem, then change the variables in the ARM template from this:
"consumer_roleGuid": "[guid(parameters('sites_blobreaderconsumer_name'), uniqueString(deployment().name, resourceGroup().id))]",
"dlq_roleGuid": "[guid(parameters('sites_DLQProcessor_name'), uniqueString(deployment().name, resourceGroup().id))]",
"consumer_roleGuid": "[guid(parameters('sites_blobreaderconsumer_name'), uniqueString(‘<random unique word>’, resourceGroup().id))]",
"dlq_roleGuid": "[guid(parameters('sites_DLQProcessor_name'), uniqueString(‘<random unique word>’, resourceGroup().id))]"
Error: Azure fails to install dependencies on a node. System.AggregateException : One or more errors occurred. ---> Error: Cannot find module 'azure-storage'
npm installfrom the console.
- Error: Subscription for Microsoft.EventGrid is not registered.
Solution: To register the provider do the following:
1. Go To subscriptions.
2. Select the subscription name where ARM template is deployed.
3. Select the Resource providers under settings on the left.
4. Search for Microsoft.EventGrid and register it.