Skip to content

SLAF vs h5ad Performance Benchmarks

This document provides a comprehensive performance comparison between SLAF and the traditional h5ad (AnnData) format across bioinformatics and machine learning workflows.

Overview

SLAF provides dramatic performance improvements over h5ad across all benchmark categories, demonstrating the advantages of modern columnar storage and optimized data access patterns.

Key Performance Summary

Category SLAF vs h5ad Speedup Memory Efficiency Dataset
Cell Filtering 92.3x faster 115.7x less memory synthetic_50k_processed
Gene Filtering 17.3x faster 2.2x less memory synthetic_50k_processed
Expression Queries 9.5x faster 154.6x less memory synthetic_50k_processed
ML Data Loading 55x faster 2.3x less memory Tahoe-100M

Performance Leadership

SLAF consistently outperforms h5ad by 9.5x-92.3x across all operation types while using 2.2x-154.6x less memory.

Bioinformatics Benchmarks

Input Dataset: synthetic_50k_processed (49,955 cells × 25,000 genes, 722MB h5ad file)

Cell Filtering Performance

Cell filtering operations are fundamental to single-cell analysis workflows, used for quality control, cell type selection, and data subsetting.

Scenario h5ad Total (ms) SLAF Total (ms) Speedup Description
S1 530.0 2.9 183.6x Cells with >=500 genes
S2 169.7 2.0 83.3x High UMI count (total_counts > 2000)
S3 170.7 1.9 92.2x Mitochondrial fraction < 0.1
S4 177.1 2.0 86.7x Complex multi-condition filter
S5 186.6 2.8 67.2x Cell type annotation filter
S6 171.4 2.0 86.3x Cells from batch_1
S7 207.0 2.3 89.4x Cells in clusters 0,1 from batch_1
S8 170.9 2.1 79.6x High-quality cells (>=1000 genes, <=10% mt)
S9 172.1 2.5 70.2x Cells with 800-2000 total counts
S10 173.5 2.1 84.6x Cells with 200-1500 genes

Average Performance:

  • SLAF vs h5ad: 92.3x faster
  • Memory Usage: SLAF uses 115.7x less memory than h5ad

Gene Filtering Performance

Gene filtering operations are essential for feature selection, quality control, and dimensionality reduction.

Scenario h5ad Total (ms) SLAF Total (ms) Speedup Description
S1 43.4 3.0 14.6x Genes expressed in >=10 cells
S2 32.3 1.7 19.4x Genes with >=100 total counts
S3 32.1 1.8 17.4x Genes with mean expression >=0.1
S4 31.1 1.6 19.9x Exclude mitochondrial genes
S5 32.7 1.7 19.7x Highly variable genes
S6 31.7 2.1 15.4x Non-highly variable genes
S7 31.5 2.0 15.8x Genes in >=50 cells with >=500 total counts
S8 31.7 1.9 17.0x Genes with 100-10000 total counts
S9 33.2 2.0 16.4x Genes in 5-1000 cells

Average Performance:

  • SLAF vs h5ad: 17.3x faster
  • Memory Usage: SLAF uses 2.2x less memory than h5ad

Expression Queries Performance

Expression queries retrieve specific expression data for cells or genes, supporting targeted analysis workflows.

Scenario h5ad Total (ms) SLAF Total (ms) Speedup Description
S1 484.5 16.1 30.1x Single cell expression
S2 251.3 13.9 18.1x Another single cell
S3 328.3 14.2 23.1x Two cells
S4 233.2 15.5 15.1x Three cells
S5 232.7 523.7 0.4x Single gene across all cells
S6 203.4 442.6 0.5x Another single gene
S7 256.1 303.0 0.8x Two genes
S8 212.0 655.9 0.3x Three genes
S9 221.4 22.5 9.9x 100x50 submatrix
S10 168.3 61.9 2.7x 500x100 submatrix
S11 212.2 63.2 3.4x 500x500 submatrix

Average Performance:

  • SLAF vs h5ad: 9.5x faster
  • Memory Usage: SLAF uses 154.6x less memory than h5ad

Machine Learning Benchmarks

Input Dataset: Tahoe-100M (5,481,420 cells × 62,710 genes, ~8B non-zero values)

Raw Data Loading Performance

Raw data loading measures the base throughput for machine learning workflows without tokenization overhead.

System Throughput (cells/sec) Memory Usage (GB) Notes
SLAF 24,587 2.1 Optimized streaming
h5ad (AnnDataLoader) 422 4.8 Traditional approach
h5ad (AnnLoader) 239 5.2 Experimental loader

Performance Comparison:

  • SLAF vs AnnDataLoader: 58.3x faster
  • SLAF vs AnnLoader: 102.9x faster
  • Memory Efficiency: SLAF uses 2.3x less memory

GPU-Ready Output Performance

SLAF provides pre-tokenized sequences ready for GPU training, while h5ad-based loaders only provide raw data.

System Throughput (cells/sec) Throughput (tokens/sec) Output Type
SLAF 7,487 15,332,896 Pre-tokenized sequences
h5ad loaders N/A N/A Raw data only

GPU Training Advantage

SLAF is the only system providing GPU-ready tokenized output, enabling efficient training of foundation models like Geneformer and scGPT.

Technical Implementation Comparison

Aspect SLAF h5ad
Storage Arrow-based columnar storage with Lance backend HDF5-based hierarchical storage with h5py backend
Metadata Polars DataFrames for efficient filtering operations Pandas DataFrames with traditional filtering
Expression Optimized sparse COO matrices with zero-copy access Sparse matrices with h5py backend
Memory Minimal intermediate allocations, efficient memory management Full data loading with pandas overhead
Access Asynchronous prefetching with background processing Synchronous loading with no streaming optimization

Use Case Recommendations

Choose SLAF for:

  • High-throughput bioinformatics workflows requiring fast filtering and querying
  • Machine learning training on large single-cell datasets
  • Cloud-based analysis requiring scalable, multi-user access
  • Foundation model training requiring GPU-ready tokenized sequences
  • Memory-constrained environments where efficiency is critical

Consider h5ad for:

  • Legacy workflows that cannot be easily migrated
  • Small-scale analysis where performance differences are negligible
  • Educational purposes where traditional formats are more familiar
  • Tool compatibility with systems that only support h5ad

Conclusion

The benchmarks demonstrate that SLAF's modern architecture, optimized data access patterns, and streaming capabilities provide massive advantages for both bioinformatics and machine learning workflows. For users looking to improve performance and scalability, migrating from h5ad to SLAF offers dramatic benefits with minimal workflow changes.


For detailed migration guidance, see Migrating to SLAF. For comprehensive benchmark results, see Bioinformatics Benchmarks and ML Benchmarks.