Azure Event Grid

Azure Event Grid is a highly scalable, fully managed Pub Sub message distribution service that offers flexible message consumption patterns using the MQTT and HTTP protocols. This integration helps in monitoring data plane requests, delivery failures and publish failures of Event Grid resources - custom topics, system topics and domains.
Log and metric types
For Azure Event Grid, you can collect the following logs and metrics:
- Resource Logs. Publish and delivery failure logs by Event Grid resources. It also includes audit traces for data plane operations including public and private access operations. The schema for resource logs is described here.
- Metrics. Metrics for Azure Event Grid are in the namespaces below:
  - Microsoft.EventGrid/domains
  - Microsoft.EventGrid/systemTopics
  - Microsoft.EventGrid/topics
For more information on supported dimensions, refer to Azure documentation.
Setup
Azure services send monitoring data to Azure Monitor, which can then stream the data to an event hub. Sumo Logic supports:
- Logs collection from Azure Monitor using our Azure Event Hubs source.
- Metrics collection using our Azure Metrics Source.
You must explicitly enable diagnostic settings for each domain, namespace, custom topic, and system topic you want to monitor. You can forward logs to the same event hub provided they satisfy the limitations and permissions as described here.
When you configure the event hubs source or HTTP source, plan your source category to ease the querying process. A hierarchical approach allows you to make use of wildcards. For example: Azure/EventGrid/Logs, Azure/EventGrid/Metrics.
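To illustrate why a hierarchical source category helps, here is a minimal Python sketch. It uses the standard-library fnmatch module as a stand-in for wildcard matching; it is not Sumo Logic's own matcher, and the Azure/Storage/Logs category is a hypothetical value added only for contrast.

```python
from fnmatch import fnmatchcase

# Two categories from the example above plus one hypothetical category.
source_categories = [
    "Azure/EventGrid/Logs",
    "Azure/EventGrid/Metrics",
    "Azure/Storage/Logs",  # hypothetical, for contrast only
]


def matching_categories(pattern, categories):
    """Return the categories that match a glob-style wildcard pattern."""
    return [c for c in categories if fnmatchcase(c, pattern)]


# One wildcard selects everything collected for Event Grid.
event_grid_only = matching_categories("Azure/EventGrid/*", source_categories)
# Another selects logs across every Azure service.
all_logs = matching_categories("Azure/*/Logs", source_categories)

print(event_grid_only)  # ['Azure/EventGrid/Logs', 'Azure/EventGrid/Metrics']
print(all_logs)         # ['Azure/EventGrid/Logs', 'Azure/Storage/Logs']
```

Because the service name and the data type each occupy a fixed level of the hierarchy, a single wildcard can scope a query by either one.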
Configure metrics collection
To set up the Azure Metrics source in Sumo Logic, refer to Azure Metrics Source.
The Sumo Logic Metrics source is currently in Beta. To participate, contact your Sumo Logic account executive.
- To set up the Azure Metrics source in Sumo Logic, refer to the shared beta documentation.
- In the Sumo Logic Azure Metrics source configuration, configure namespaces as Microsoft.EventGrid/domains, Microsoft.EventGrid/systemTopics, and Microsoft.EventGrid/topics.
Configure logs collection
In this section, you will configure a pipeline for shipping diagnostic logs from Azure Monitor to an Event Hub.
Diagnostic logs
- To set up the Azure Event Hubs source in Sumo Logic, refer to Azure Event Hubs Source for Logs.
- To create the Diagnostic settings in the Azure portal, refer to the Azure documentation. Perform the steps below for each Azure Event Grid namespace that you want to monitor.
  - Choose Stream to an event hub as the destination.
  - Select allLogs.
  - Use the Event Hub namespace and Event Hub name configured in the previous step in the destination details section. You can use the default policy RootManageSharedAccessKey as the policy name.
- Tag the location field in the source with the right location value.
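If you prefer to script the diagnostic setting rather than use the portal, the choices above map roughly onto a request body like the one built in the sketch below. The property names (eventHubAuthorizationRuleId, eventHubName, categoryGroup) reflect the Azure Monitor diagnostic settings REST API as we understand it, so verify them against the current Azure reference before use. The subscription ID, resource group, namespace, and event hub names are placeholders, and the helper functions are hypothetical.

```python
import json


def event_hub_rule_id(subscription_id, resource_group, eh_namespace,
                      policy="RootManageSharedAccessKey"):
    """Build the resource ID of an Event Hubs namespace authorization rule."""
    return (
        f"/subscriptions/{subscription_id}/resourceGroups/{resource_group}"
        f"/providers/Microsoft.EventHub/namespaces/{eh_namespace}"
        f"/authorizationRules/{policy}"
    )


def diagnostic_setting_body(rule_id, event_hub_name):
    """Build a diagnostic setting body that streams allLogs to an event hub.

    Assumption: property names follow the Azure Monitor diagnostic
    settings REST API; check the current API reference.
    """
    return {
        "properties": {
            "eventHubAuthorizationRuleId": rule_id,
            "eventHubName": event_hub_name,
            "logs": [{"categoryGroup": "allLogs", "enabled": True}],
        }
    }


# Placeholder values; replace with your own.
rule_id = event_hub_rule_id(
    "00000000-0000-0000-0000-000000000000", "my-rg", "my-eh-namespace"
)
body = diagnostic_setting_body(rule_id, "my-event-hub")
print(json.dumps(body, indent=2))
```

The sketch only builds and prints the body. Sending it requires an authenticated call against the Event Grid resource you want to monitor, which is out of scope here.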
Activity Logs
To collect activity logs, follow the instructions here. Skip this step if you are already collecting activity logs for the subscription.
Since this source contains logs from multiple regions, make sure that you do not tag this source with the location tag.
Installing the Azure Event Grid app
Now that you have set up data collection, install the Azure Event Grid Sumo Logic app to use the pre-configured dashboards that provide visibility into your environment for real-time analysis of Azure resources.
To install the app, do the following:
Next-Gen App: To install or update the app, you must be an account administrator or a user with the Manage Apps, Manage Monitors, Manage Fields, Manage Metric Rules, and Manage Collectors capabilities, depending on the content types included in the app.
- Select App Catalog.
- In the 🔎 Search Apps field, run a search for your desired app, then select it.
- Click Install App.
  Note: Sometimes this button says Add Integration.
- Click Next in the Setup Data section.
- In the Configure App section of your respective app, complete the following field.
  - Index. Specify a value for _index if the collection is configured with a custom partition. Learn more. The default value is sumologic_default (default partition).
- Click Next. You will be redirected to the Preview & Done section.
Post-installation
Once your app is installed, it will appear in your Installed Apps folder, and dashboard panels will start to fill automatically.
Each panel gradually fills with data that matches the time range query and has been received since the panel was created. Results will not be available immediately, but the panels will be updated with full graphs and charts over time.
As part of the app installation process, the following fields will be created by default:
- tenant_name. This field is tagged at the collector level. You can get the tenant name using the instructions here.
- location. The region the resource name belongs to.
- subscription_id. ID associated with a subscription where the resource is present.
- resource_group. The resource group name where the Azure resource is present.
- provider_name. Azure resource provider name (for example, Microsoft.Network).
- resource_type. Azure resource type (for example, storage accounts).
- resource_name. The name of the resource (for example, storage account name).
- service_type. Type of the service that can be accessed with an Azure resource.
- service_name. Services that can be accessed with an Azure resource (for example, in Azure Container Instances the service is Subscriptions).
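Several of these fields correspond to segments of a standard Azure resource ID. The Python sketch below shows that relationship for a hypothetical Event Grid custom topic. It is an illustration only: this page does not describe how Sumo Logic derives the fields, and parse_resource_id is a hypothetical helper, not part of any Sumo Logic or Azure library.

```python
def parse_resource_id(resource_id):
    """Split a top-level Azure resource ID into the fields it encodes.

    Expected shape (placeholders in braces):
    /subscriptions/{id}/resourceGroups/{group}/providers/{provider}/{type}/{name}
    """
    parts = resource_id.strip("/").split("/")
    if (
        len(parts) < 8
        or parts[0].lower() != "subscriptions"
        or parts[2].lower() != "resourcegroups"
        or parts[4].lower() != "providers"
    ):
        raise ValueError(f"Unexpected resource ID format: {resource_id!r}")
    return {
        "subscription_id": parts[1],
        "resource_group": parts[3],
        "provider_name": parts[5],
        "resource_type": parts[6],
        "resource_name": parts[7],
    }


# Placeholder resource ID for a custom topic.
fields = parse_resource_id(
    "/subscriptions/00000000-0000-0000-0000-000000000000"
    "/resourceGroups/my-rg/providers/Microsoft.EventGrid/topics/my-topic"
)
for name, value in fields.items():
    print(f"{name} = {value}")
```

The tenant_name and location fields are not part of the resource ID, which is why they are tagged separately on the collector and the source.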
Viewing the Azure Event Grid dashboards
All dashboards have a set of filters that you can apply to the entire dashboard. Use these filters to drill down and examine the data to a granular level.
- You can change the time range for a dashboard or panel by selecting a predefined interval from a drop-down list, choosing a recently used time range, or specifying custom dates and times. Learn more.
- You can use template variables to drill down and examine the data on a granular level. For more information, see Filtering Dashboards with Template Variables.
- Many of the Next-Gen apps allow you to provide the Index at the installation time and a default value for this key (sumologic_default). Based on your input, the app dashboards will be parameterized with a dashboard variable, allowing you to change the data partition queried by all panels. This restricts the query scope of all the dashboard queries to a specific data partition.
Overview
The Azure Event Grid - Overview dashboard provides details on the performance, authentication, and delivery status of your Azure Event Grid service.
Use this dashboard to:
- Monitor request patterns by location and network access type to optimize resource allocation and improve latency.
- Analyze authentication types to ensure proper security measures are in place and identify potential unauthorized access attempts.
- Track delivery and publish failures by topic to quickly identify and resolve issues affecting event distribution.
- Correlate publish success latency with destination processing duration to optimize the end-to-end event handling performance.

Administrative Operations
The Azure Event Grid - Administrative Operations dashboard provides details on the operational activities and status of your Azure Event Grid resources.
Use this dashboard to:
- Monitor the distribution of operation types and their success rates to ensure the proper functioning of your Event Grid system.
- Identify potential issues by analyzing the top operations causing errors and correlating them with specific users or applications.
- Track recent write and delete operations to maintain an audit trail of changes made to your Event Grid configuration.

Topic Errors
The Azure Event Grid - Topic Errors dashboard provides details on various error types, failed deliveries, and dropped events in your Azure Event Grid service for system topics and topics.
Use this dashboard to:
- Identify the most common error types affecting event publishing and delivery, such as "NotFound" and "Cancelled" errors.
- Analyze trends in delivery attempt failures and correlate them with specific resource groups or topics to pinpoint problematic areas.
- Monitor dropped event counts over time and investigate the reasons behind event drops to improve system reliability.
- Track the top failed topics and delivery destinations to prioritize troubleshooting efforts and optimize event routing.

Domain Errors
The Azure Event Grid - Domain Errors dashboard provides details on various error types, failed deliveries, and dropped events in your Azure Event Grid service for domains. You can filter the dashboard results for a particular topic using the topic filter.
Use this dashboard to:
- Identify the most common error types affecting event publishing and delivery, such as "NotFound" and "Cancelled" errors.
- Analyze trends in delivery attempt failures and correlate them with specific resource groups or topics to pinpoint problematic areas.
- Monitor dropped event counts over time and investigate the reasons behind event drops to improve system reliability.
- Track the top failed topics and delivery destinations to prioritize troubleshooting efforts and optimize event routing.

Topic Operations
The Azure Event Grid - Topic Operations dashboard provides details on the performance and reliability of your Event Grid service, including processing times, success rates, and failure percentages for system topics and topics.
Use this dashboard to:
- Monitor the average destination processing duration to identify potential bottlenecks or performance issues in event delivery.
- Track delivery failure percentages over time to quickly spot and address any spikes in unsuccessful event transmissions.
- Analyze the correlation between unmatched event percentages and advanced filter evaluation counts to optimize event routing and filtering.
- Identify trends in publish success latency and failure rates to ensure efficient event publishing and processing.
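As a rough illustration of the percentages mentioned above, the sketch below computes a failure percentage from two counts. The formula (failures divided by total attempts) is an assumption made for illustration; the dashboard's actual queries are not shown on this page and may differ. The function name and the sample counts are hypothetical.

```python
def failure_percentage(failed, succeeded):
    """Return failures as a percentage of all attempts.

    Assumed formula: 100 * failed / (failed + succeeded).
    Returns 0.0 when there were no attempts, to avoid dividing by zero.
    """
    total = failed + succeeded
    if total == 0:
        return 0.0
    return 100.0 * failed / total


# Hypothetical counts for one time slice.
delivery_failure_pct = failure_percentage(failed=5, succeeded=95)
print(delivery_failure_pct)  # 5.0
```

The same shape applies to an unmatched event percentage, with unmatched and matched event counts in place of failed and succeeded deliveries.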

Domain Operations
The Azure Event Grid - Domain Operations dashboard provides details on the performance and reliability of your Event Grid service, including processing times, success rates, and failure percentages for domains. You can filter the dashboard results for a particular topic using the topic filter.
Use this dashboard to:
- Monitor the average destination processing duration to identify potential bottlenecks or performance issues in event delivery.
- Track delivery failure percentages over time to quickly spot and address any spikes in unsuccessful event transmissions.
- Analyze the correlation between unmatched event percentages and advanced filter evaluation counts to optimize event routing and filtering.
- Identify trends in publish success latency and failure rates to ensure efficient event publishing and processing.

Policy and Recommendations
The Azure Event Grid - Policy and Recommendations dashboard provides details on policy events and recommendations for your Azure Event Grid resources.
Use this dashboard to:
- Monitor the success and failure rates of policy events to ensure proper configuration and compliance.
- Track and analyze recent recommendations to improve the performance and security of your Event Grid setup.
- Identify trends in policy events and recommendations over time to proactively address potential issues.

Troubleshooting
HTTP Logs and Metrics Source used by Azure Functions
To troubleshoot metrics collection, follow the instructions in Collect Metrics from Azure Monitor > Troubleshooting metrics collection.