Skip to content

SLAF Blog

Welcome to the SLAF blog, where we share insights, technical deep dives, and updates about the Sparse Lazy Array Format for single-cell genomics.

Latest Posts

Blazing Fast Dataloaders #2: Ignatius takes a trip to the Library of Congress

Last updated: September 8, 2025

Remember Ignatius J Reilly from A Confederacy of Dunces? Voracious, impatient, impressionable, perambulatorily challenged? That's modern neural network pretraining on GPUs. In this post, we explore how SLAF's mixture of scanners approach achieves near-perfect randomization (88-90% of theoretical maximum) while maintaining 97% of sequential throughput performance. We dive deep into the "Library of Congress" metaphor to explain how our contraption delivers randomized books at high throughput without reorganizing the library.

6.4x Faster DataLoaders: Deconstructing PyTorch for Single-Cell Genomics

Last updated: August 22, 2025

Single-cell transcriptomics datasets have reached escape velocity, with modern experiments yielding counts for upwards of 5M cells × 20k genes. This technical deep dive explores how we achieved 6.4x performance improvement over standard PyTorch DataLoaders, reaching 28,207 cells/second through five key innovations: contiguous reads, single-threaded prefetching, vectorized window functions, block shuffling, and vectorized tokenization.

Introducing SLAF: The Single-Cell Data Format for the Virtual Cell Era

Last updated: August 14, 2025

Single-cell datasets have grown from 50k to 100M cells in less than a decade, creating a fundamental mismatch between our tools and our needs. This introduction to SLAF (Sparse Lazy Array Format) explores how we're solving 2025 problems with modern technology, combining the best ideas from Zarr, Dask, Lance, and Polars into a cloud-native, SQL-powered format designed for the modern single-cell era.


About SLAF

SLAF (Sparse Lazy Array Format) is a cloud-native single-cell storage format for the virtual cell era:

  • Zarr-inspired: Zero-copy, query-in-place access to cloud storage
  • Dask-inspired: Lazy computation graphs that optimize before execution
  • Lance + Polars-inspired: OLAP-powered SQL with pushdown optimization
  • Scanpy-compatible: Drop-in replacement for existing workflows

Get Started

Ready to try SLAF? Check out our quickstart guide or explore the API documentation. Find it on Github. Deep dive into benchmarks.


Have questions or want to contribute? We'd love to hear from you!