A community for creating and using Virtual Zarr stores — cloud-optimized, chunk-based access to your existing archival data, without copying or converting it.
A great deal of valuable scientific data is stored in pre-cloud archival file formats like netCDF, HDF5, GRIB, or TIFF. A Virtual Zarr store lets you read that data through the Zarr API by generating lightweight references that point to the byte ranges of chunks inside the original files. You get fast, cloud-friendly, xarray-compatible access while the source data stays exactly where it is. No duplication, no rewriting.
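To make the idea of a byte-range reference concrete, here is a minimal, standard-library-only sketch. It is not VirtualiZarr's or Zarr's actual implementation: the file layout, the `build_manifest` and `read_chunk` helpers, and the manifest shape are all hypothetical and exist purely for illustration.

```python
import json
import os
import struct
import tempfile


def write_archival_file(path):
    """Write a toy 'archival' file: a 16-byte header, then two float32 chunks."""
    header = b"FAKEHDR_" * 2  # 16 bytes standing in for format metadata
    chunk0 = struct.pack("<4f", 0.0, 1.0, 2.0, 3.0)  # 16 bytes
    chunk1 = struct.pack("<4f", 4.0, 5.0, 6.0, 7.0)  # 16 bytes
    with open(path, "wb") as f:
        f.write(header + chunk0 + chunk1)


def build_manifest(path):
    """Map chunk keys to (path, offset, length) without copying any data.

    A real tool would discover these offsets by parsing the file format;
    here they are hard-coded because we wrote the file ourselves.
    """
    return {
        "0": {"path": path, "offset": 16, "length": 16},
        "1": {"path": path, "offset": 32, "length": 16},
    }


def read_chunk(manifest, key):
    """Fetch one chunk by seeking to its byte range in the original file."""
    ref = manifest[key]
    with open(ref["path"], "rb") as f:
        f.seek(ref["offset"])
        raw = f.read(ref["length"])
    return list(struct.unpack("<4f", raw))


tmpdir = tempfile.mkdtemp()
archival_path = os.path.join(tmpdir, "archive.bin")
write_archival_file(archival_path)

manifest = build_manifest(archival_path)
second_chunk = read_chunk(manifest, "1")  # reads only bytes 32..47

# The references are small and serializable; the data itself never moves.
manifest_json = json.dumps(manifest)
```

The manifest is the only new artifact: the original file is untouched, and each chunk read touches only the bytes it needs. Against object storage, the same lookup would become an HTTP range request instead of a `seek`.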
- VirtualiZarr — Create virtual Zarr stores using familiar xarray syntax. Open archival files as virtual datasets, combine them into a single coherent datacube, and serialize the chunk references to disk.
- Icechunk — A cloud-native, transactional storage engine for Zarr. One of the formats VirtualiZarr writes to, it lets virtual chunks (pointing at archival files) and native Zarr chunks be treated interchangeably, with versioning and transactional guarantees.
A typical workflow: use VirtualiZarr to virtualize and combine your files, commit the resulting references to Icechunk, then read everything back with xarray.open_zarr as if it were a single cloud-optimized store.
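The "combine" step can be pictured as merging per-file chunk manifests into one, shifting chunk indices so that each file occupies its own slab of the datacube. The sketch below is a conceptual illustration only and assumes a simple dot-separated chunk key; `combine_along_first_axis`, the example paths, offsets, and lengths are all hypothetical and are not part of the VirtualiZarr or Icechunk APIs.

```python
def combine_along_first_axis(manifests):
    """Concatenate per-file chunk manifests along axis 0.

    Each manifest maps a chunk key such as "1.0" (chunk index per axis)
    to a byte-range reference. Keys from later files are shifted along
    the first axis by the number of chunks that precede them.
    """
    combined = {}
    offset = 0
    for manifest in manifests:
        # Number of chunks this file contributes along axis 0.
        n_chunks = 1 + max(int(key.split(".")[0]) for key in manifest)
        for key, ref in manifest.items():
            first, *rest = key.split(".")
            new_key = ".".join([str(int(first) + offset), *rest])
            combined[new_key] = ref  # the reference itself is unchanged
        offset += n_chunks
    return combined


# Hypothetical references for two monthly files (made-up paths and byte ranges).
jan = {
    "0.0": {"path": "s3://bucket/jan.nc", "offset": 1024, "length": 4096},
    "1.0": {"path": "s3://bucket/jan.nc", "offset": 5120, "length": 4096},
}
feb = {
    "0.0": {"path": "s3://bucket/feb.nc", "offset": 1024, "length": 4096},
}

cube = combine_along_first_axis([jan, feb])
```

After combining, chunk `"2.0"` of the datacube still points into the February file. Only the index changed, which is why combining many files stays cheap regardless of how large the underlying data is.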
Have a question, an idea, or a workflow to share? Head to our community Discussions — it's the place to ask how to create or use Virtual Zarr stores, share projects, and help others.
- 💬 Community Discussions
- 📦 VirtualiZarr · docs
- 🧊 Icechunk · docs
- ⚡ Zarr