Tensor Processing Units (TPUs) are Google's custom-developed application-specific integrated circuits (ASICs) used to accelerate machine learning workloads. For more details, refer to the GCP documentation.
Log and metric types
You can collect the logs and metrics for Sumo Logic's Google Cloud TPU integration by following the steps below.
Configure logs collection
- Collect Audit Logs using the Google Cloud Platform source. Access to these Audit Logs depends on your permissions and roles. To enable logging for Google TPU, refer to Google documentation. For more detail on the TPU operations being audited, refer to audited operations. While creating the sink in GCP, as part of the Choose logs to include in sink section, you can use the following query:
- Collect Platform Logs using the Google Cloud Platform source. Cloud TPU Worker logs contain information about a specific Cloud TPU worker in a specific zone, for example, the amount of memory available on the Cloud TPU worker (system_available_memory_GiB). While creating the sink in GCP, as part of the Choose logs to include in sink section, you can use the following query:
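As a rough illustration of how sink inclusion filters for the two log types above might be composed, here is a minimal Python sketch. It is not the official query for this integration. The helper `build_sink_filter` is hypothetical, and the filter values are assumptions: `tpu.googleapis.com` as the audited service name, `cloudaudit.googleapis.com` as a substring of the audit log name, and `tpu_worker` as the monitored resource type for Cloud TPU Worker logs. Verify each value in the Logs Explorer before using it in a sink.

```python
def build_sink_filter(log_name_contains=None, service_name=None, resource_type=None):
    """Join Cloud Logging filter clauses with AND (hypothetical helper)."""
    clauses = []
    if log_name_contains:
        # ':' is the "has" (substring) operator in the Logging query language
        clauses.append(f'logName:"{log_name_contains}"')
    if service_name:
        clauses.append(f'protoPayload.serviceName="{service_name}"')
    if resource_type:
        clauses.append(f'resource.type="{resource_type}"')
    return " AND ".join(clauses)


# Assumed values; confirm them against your own project's log entries.
AUDIT_FILTER = build_sink_filter(
    log_name_contains="cloudaudit.googleapis.com",
    service_name="tpu.googleapis.com",
)
PLATFORM_FILTER = build_sink_filter(resource_type="tpu_worker")

if __name__ == "__main__":
    print(AUDIT_FILTER)
    print(PLATFORM_FILTER)
```

Either resulting string could be pasted into the Choose logs to include in sink field. Keeping the audit and platform filters separate makes it easier to route them to different sinks if needed.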