Migrating to SLAF¶
Quick Start¶
Convert your single-cell data to SLAF format with just one command:
# Convert any supported format (auto-detection)
slaf convert data.h5ad output.slaf
slaf convert filtered_feature_bc_matrix/ output.slaf
slaf convert data.h5 output.slaf
That's it! SLAF automatically detects your file format and converts it with optimized settings.
Supported Formats¶
SLAF supports conversion from these common single-cell formats:
- AnnData (.h5ad files) - The standard single-cell format
- 10x MTX (filtered_feature_bc_matrix directories) - Cell Ranger output
- 10x H5 (.h5 files) - Cell Ranger H5 output format
Python API¶
from slaf.data import SLAFConverter
# Basic conversion (auto-detection)
converter = SLAFConverter()
converter.convert("data.h5ad", "output.slaf")
converter.convert("filtered_feature_bc_matrix/", "output.slaf")
converter.convert("data.h5", "output.slaf")
# Convert existing AnnData object
import scanpy as sc
adata = sc.read_h5ad("data.h5ad")
converter.convert_anndata(adata, "output.slaf")
Large Datasets¶
For large datasets (>100k cells), you can optimize performance:
# Use larger chunks for speed (if you have enough RAM)
slaf convert large_data.h5ad output.slaf --chunk-size 100000
# Create indices for faster queries
slaf convert large_data.h5ad output.slaf --create-indices
# Python API for large datasets
converter = SLAFConverter(chunk_size=100000, create_indices=True)
converter.convert("large_data.h5ad", "output.slaf")
Advanced Options¶
Most users won't need these, but they're available if needed:
CLI Options¶
# Specify format explicitly (if auto-detection fails)
slaf convert data.h5 output.slaf --format 10x_h5
# Use non-chunked processing (not recommended for large datasets)
slaf convert small_data.h5ad output.slaf --no-chunked
# Disable storage optimization (larger files but includes string IDs)
slaf convert data.h5ad output.slaf --no-optimize-storage
# Verbose output
slaf convert data.h5ad output.slaf --verbose
Python API Options¶
# Custom settings
converter = SLAFConverter(
chunk_size=50000, # Cells per chunk
create_indices=True, # Faster queries
optimize_storage=True, # Smaller files (default)
use_optimized_dtypes=True, # Better compression (default)
)
converter.convert("data.h5ad", "output.slaf")
What SLAF Does¶
SLAF converts your data to an optimized format that:
- Enables fast SQL queries on your data
- Works with any size dataset (memory-efficient processing)
- Preserves all metadata (cell types, gene info, etc.)
Next Steps¶
After converting your data:
- Explore:
slaf info output.slaf
- Query:
slaf query output.slaf "SELECT * FROM expression LIMIT 10"
- Use in Python:
import slaf; data = slaf.SLAFArray("output.slaf")
See the Getting Started guide for more examples.