Storage options for TPU data
This document describes data storage options you can use when training models on TPU VMs.
Workloads running on TPU VMs require data storage for the following tasks:
- Dataset downloading and preprocessing
- Host input pipeline processing
- Model training input
- Model training output
- Checkpoints
- Model weights
- Key-value cache offloading
You can use the following storage options with TPU VMs:
- Durable block storage, including the boot disk and attached storage disks
- Cloud Storage buckets, including the Cloud Storage Rapid product family (Rapid Bucket and regional buckets with Rapid Cache)
- Google Cloud Managed Lustre
Durable block storage
Durable block storage, also known as disks or volumes, is for data that you want to preserve after you stop, suspend, or delete your TPU VM. Durable block storage is still available even if the TPU VM crashes or fails. You can use the TPU VM boot disk or attach additional block storage to your TPU.
You might want to attach an additional disk in the following scenarios:
- The size of your training dataset exceeds the size of the TPU boot disk.
- You have read-only data and want faster read access using a Hyperdisk ML volume.
TPU generation and supported disk types
The following table shows the types of disks supported by each TPU generation:
| TPU generation | Supported disk types |
|---|---|
| TPU7x | Hyperdisk Balanced, Hyperdisk ML |
| v6e | Hyperdisk Balanced, Hyperdisk ML |
| v5p | Balanced Persistent Disk, Hyperdisk ML |
| v5e | Balanced Persistent Disk, Hyperdisk ML |
| v4 | Balanced Persistent Disk |
| v3 | Balanced Persistent Disk |
| v2 | Balanced Persistent Disk |
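
The table above can also be encoded as a lookup, for example to validate a provisioning script before it requests a disk. The following is a minimal sketch: the names `SUPPORTED_DISK_TYPES` and `is_disk_type_supported` are hypothetical and not part of any Google Cloud API, and the data is transcribed from the table, so it goes stale if the table changes.

```python
# Supported disk types per TPU generation, transcribed from the table above.
# This mapping is an illustration, not an authoritative or live source.
SUPPORTED_DISK_TYPES = {
    "TPU7x": ("Hyperdisk Balanced", "Hyperdisk ML"),
    "v6e": ("Hyperdisk Balanced", "Hyperdisk ML"),
    "v5p": ("Balanced Persistent Disk", "Hyperdisk ML"),
    "v5e": ("Balanced Persistent Disk", "Hyperdisk ML"),
    "v4": ("Balanced Persistent Disk",),
    "v3": ("Balanced Persistent Disk",),
    "v2": ("Balanced Persistent Disk",),
}


def is_disk_type_supported(tpu_generation: str, disk_type: str) -> bool:
    """Return True if the table lists disk_type for tpu_generation."""
    if tpu_generation not in SUPPORTED_DISK_TYPES:
        raise ValueError(f"Unknown TPU generation: {tpu_generation!r}")
    return disk_type in SUPPORTED_DISK_TYPES[tpu_generation]
```

For example, `is_disk_type_supported("v4", "Hyperdisk ML")` returns `False`, because the table lists only Balanced Persistent Disk for v4.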
TPU VM boot disk
By default, each TPU VM has a single 100 GB boot disk. When you create your VMs, you can configure a larger boot disk. For more information, see Create a customized boot disk. The boot disk contains the operating system, TPU drivers, and libraries. The boot disk can also temporarily store downloaded datasets for preprocessing, as well as model input and output data, as long as the total size of the data doesn't exceed the available space on the boot disk.
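Before downloading a dataset to the boot disk, you might check that it fits in the available space. The following is a minimal sketch that uses only the Python standard library. The function name `fits_on_disk` and the 10% headroom default are assumptions chosen for illustration, not recommendations from this document.

```python
import shutil


def fits_on_disk(required_bytes: int, path: str = "/", headroom: float = 0.10) -> bool:
    """Return True if required_bytes fits in the free space of the file system
    that contains path, while leaving `headroom` (a fraction of the total
    capacity) unused for the operating system, logs, and temporary files."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    reserve = int(usage.total * headroom)
    return required_bytes <= usage.free - reserve
```

If the check fails, the options described in this section apply: resize the boot disk, attach an additional disk, or keep the data in another storage option.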
If your application requires more storage space than the boot disk provides, you can resize the boot disk or add durable disks to your TPU VM instance. For more information, see the following resources:
- Create a new Persistent Disk volume
- Create a new Hyperdisk volume
- Modify the settings for a Google Cloud Hyperdisk volume
- Change the size of a Persistent Disk
For information about best practices when customizing the TPU VM boot disk, see Customize the TPU VM boot disk.
Attached storage
Both Google Cloud Hyperdisk and Persistent Disk are durable network storage devices that your VM instances can access like physical disks in a desktop or a server. You create both types of disks independently from your VM instances, so you can keep your data even after you delete your VM.
Advantages of using Hyperdisk over Persistent Disk include customizable performance and higher IOPS and throughput limits. For more information about Hyperdisk and Persistent Disk, see Choose a disk type.
When you attach a disk to a managed instance group (MIG) with a multi-host TPU slice, the system attaches the disk to each VM in that TPU slice. To prevent two or more TPU VMs from writing to a disk at the same time, you must configure all disks that you attach to a multi-host TPU slice as read-only. Read-only disks are useful for storing a dataset for processing on a TPU slice. Because Hyperdisk Balanced doesn't support read-only mode, you can attach a Hyperdisk Balanced volume only to a single TPU VM instance.
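The attachment rules above can be expressed as a small pre-flight check. The following is a sketch under stated assumptions: `check_attachment`, the mode strings, and `DISK_TYPES_WITHOUT_READ_ONLY` are hypothetical names for illustration and don't correspond to any Google Cloud API. The check encodes only the two rules stated in this section.

```python
from typing import List

READ_ONLY = "read-only"
READ_WRITE = "read-write"

# Per the text above, Hyperdisk Balanced doesn't support read-only mode.
DISK_TYPES_WITHOUT_READ_ONLY = {"Hyperdisk Balanced"}


def check_attachment(disk_type: str, mode: str, num_hosts: int) -> List[str]:
    """Return a list of rule violations; an empty list means the planned
    attachment is consistent with the rules described in this section."""
    if mode not in (READ_ONLY, READ_WRITE):
        raise ValueError(f"Unknown attachment mode: {mode!r}")
    if num_hosts < 1:
        raise ValueError("num_hosts must be at least 1")

    problems = []
    # Rule 1: disks attached to a multi-host slice must be read-only.
    if num_hosts > 1 and mode != READ_ONLY:
        problems.append("Disks attached to a multi-host slice must be read-only.")
    # Rule 2: some disk types can't be attached in read-only mode at all.
    if mode == READ_ONLY and disk_type in DISK_TYPES_WITHOUT_READ_ONLY:
        problems.append(f"{disk_type} doesn't support read-only mode.")
    return problems
```

With these rules, a Hyperdisk Balanced volume on a 4-host slice fails in either mode, which matches the statement that it can be attached only to a single TPU VM instance.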
For more information about using durable block storage, see Add a persistent disk to your VM and Add a Hyperdisk.
Disk backups
If the TPU VM is in an unknown state or you accidentally delete data, recovering that data from the boot disk is difficult. Back up your data using another storage option, such as Cloud Storage buckets.
If you store data on an attached disk, you can use disk snapshots, which incrementally back up data on a disk. The TPU VM boot disk does not support disk snapshots. For more information, see About disk snapshots.
Cloud Storage
Cloud Storage is an object storage system designed for storing and processing very large datasets (such as images, text, and video files) that require petabyte scale and high bandwidth. It is cost-efficient, scalable, and resilient. When you use Cloud Storage, you store data as objects in containers called buckets. The performance of Cloud Storage buckets depends on the storage class that you select and the location of the bucket relative to your TPU VM instance.
When using Cloud Storage, you select a bucket type based on your throughput, scale, cost, and data availability requirements. The Cloud Storage Rapid product family, which includes Rapid Bucket and regional buckets with Rapid Cache, provides high-performance options for AI/ML workloads:
- Rapid Bucket: A Cloud Storage capability that lets you store objects in the Rapid storage class by creating a zonal bucket. Zonal buckets support high throughput and sub-millisecond latency for open files. They are ideal for asynchronous checkpoints, training data, and model weights.
- Regional buckets: Cloud Storage buckets located in one or more regions. Regional buckets offer standard object storage performance. To meet the throughput and latency requirements of AI/ML workloads, use regional buckets with Rapid Cache. Rapid Cache is an SSD-backed zonal read cache for regional buckets that boosts throughput and reduces latency for read-intensive and cacheable workloads, such as training data loading or model weight loading for inference.
All Cloud Storage buckets have built-in redundancy to protect your data against equipment failure and to ensure data availability through data center maintenance events. Cloud Storage calculates checksums for all operations to help ensure data integrity.
Unlike durable block storage, Cloud Storage buckets don't restrict you to the zone where your instance is located. Additionally, you can read and write data to a bucket from multiple instances simultaneously. For example, you can configure instances in multiple zones to read and write data in the same bucket rather than replicate the data to durable block storage in multiple zones.
For more information, see Connect TPU VMs to Cloud Storage buckets.
Cloud Storage FUSE
Cloud Storage FUSE is a FUSE adapter that lets you mount buckets as a local file system. When using Google Kubernetes Engine, we recommend using the Cloud Storage FUSE CSI driver and the Cloud Storage FUSE profiles.
For more information about Cloud Storage FUSE, see the Cloud Storage FUSE documentation.
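Because a mounted bucket appears as a local file system, ordinary file APIs can read from it. The following is a minimal sketch that lists dataset files under a mount point using only the Python standard library. The function name `list_dataset_files`, the `.tfrecord` suffix, and the `/mnt/gcs-data` path in the usage note are assumptions for illustration. The sketch also assumes that the bucket is already mounted.

```python
from pathlib import Path
from typing import List


def list_dataset_files(mount_point: str, suffix: str = ".tfrecord") -> List[Path]:
    """Return a sorted list of regular files under mount_point whose names
    end with suffix. The directory tree is searched recursively."""
    root = Path(mount_point)
    if not root.is_dir():
        raise FileNotFoundError(f"Mount point not found: {mount_point}")
    return sorted(p for p in root.rglob(f"*{suffix}") if p.is_file())
```

For example, `list_dataset_files("/mnt/gcs-data")` returns the matching files under a hypothetical mount at `/mnt/gcs-data`. Recursive listing of a large bucket through a FUSE mount can be slow, so an input pipeline might list files once and reuse the result.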
Managed Lustre
Google Cloud Managed Lustre is a fully managed parallel file system with full POSIX compliance. It is designed for the low latency and high-concurrency metadata performance that model training and inference workloads require. It is optimized for workloads that need ultra-low latency (less than 1 ms), such as home directories, synchronous checkpoints, reinforcement learning (high-speed weight propagation storage), key-value (KV) cache offload storage, or training datasets with many small files (less than 1 MB). Managed Lustre offers a cost-effective Dynamic tier that automatically optimizes performance based on data access patterns.
For more information, see the Managed Lustre documentation.
What's next
- Create a new Persistent Disk volume
- Create a new Hyperdisk volume
- Connect TPU VMs to Cloud Storage buckets
- Storage best practices for AI/ML on TPU VMs
- Cloud Storage FUSE performance tuning best practices