SLAF Benchmark System
This document describes the SLAF benchmark suite for performance testing and documentation generation. The benchmark system has been refactored to separate bioinformatics and ML benchmarks.
🚀 Quick Start (Recommended)
Bioinformatics Benchmarks (CLI Integration)
Use the unified CLI interface for bioinformatics benchmark operations:
# Run bioinformatics benchmarks
slaf benchmark run --datasets pbmc3k_processed --types cell_filtering,expression_queries --verbose
# Generate summary from results
slaf benchmark summary --results comprehensive_benchmark_results.json
# Update documentation
slaf benchmark docs --summary benchmark_summary.json
# Run complete workflow
slaf benchmark all --datasets pbmc3k_processed --auto-convert
ML Benchmarks (Standalone Scripts)
ML benchmarks are run as standalone scripts:
# External dataloader comparisons
python benchmarks/benchmark_dataloaders_external.py
# Internal tokenization strategies
python benchmarks/benchmark_dataloaders_internal.py
# Prefetcher performance analysis
python benchmarks/benchmark_prefetcher.py
📁 File Structure
Core Files
- benchmarks/benchmark.py - Main bioinformatics benchmark runner with CLI integration
- benchmarks/benchmark_utils.py - Shared utilities for bioinformatics benchmarks
Bioinformatics Benchmark Modules (CLI Integrated)
- benchmarks/benchmark_cell_filtering.py - Cell filtering performance tests
- benchmarks/benchmark_gene_filtering.py - Gene filtering performance tests
- benchmarks/benchmark_expression_queries.py - Expression query performance tests
- benchmarks/benchmark_anndata_ops.py - AnnData operation performance tests
- benchmarks/benchmark_scanpy_preprocessing.py - Scanpy preprocessing performance tests
ML Benchmark Modules (Standalone)
- benchmarks/benchmark_dataloaders_external.py - External dataloader comparisons (SLAF vs scDataset, BioNeMo, etc.)
- benchmarks/benchmark_dataloaders_internal.py - Internal tokenization strategy comparisons (scGPT, Geneformer, etc.)
- benchmarks/benchmark_prefetcher.py - Prefetcher pipeline performance analysis
Output Files
- benchmarks/comprehensive_benchmark_results.json - Complete bioinformatics benchmark results
- benchmarks/benchmark_summary.json - Documentation-ready summary
- benchmarks/benchmark_output.txt - Detailed benchmark output
- benchmarks/benchmark_results.json - Legacy results file
🔧 CLI Commands (Bioinformatics Only)
Run Benchmarks
# Run all bioinformatics benchmark types
slaf benchmark run --datasets pbmc3k_processed --auto-convert
# Run specific benchmark types
slaf benchmark run --datasets pbmc3k_processed --types cell_filtering,expression_queries
# Run with verbose output
slaf benchmark run --datasets pbmc3k_processed --verbose --auto-convert
# Run on multiple datasets
slaf benchmark run --datasets pbmc3k_processed pbmc_68k --auto-convert
Generate Summary
# Generate summary from existing results
slaf benchmark summary --results comprehensive_benchmark_results.json
# Generate summary with custom output
slaf benchmark summary --results comprehensive_benchmark_results.json --output custom_summary.json
Update Documentation
# Update bioinformatics_benchmarks.md with summary data
slaf benchmark docs --summary benchmark_summary.json
# Update with custom summary file
slaf benchmark docs --summary custom_summary.json
Complete Workflow
# Run benchmarks, generate summary, and update docs
slaf benchmark all --datasets pbmc3k_processed --auto-convert --verbose
📊 Available Benchmark Types
Bioinformatics Benchmarks (CLI Integrated)
- cell_filtering - Metadata-based cell filtering performance
- gene_filtering - Metadata-based gene filtering performance
- expression_queries - Expression matrix slicing performance
- anndata_ops - AnnData operation performance
- scanpy_preprocessing - Scanpy preprocessing pipeline performance
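These benchmark types can also be driven from Python rather than the shell, for example in a CI job, by wrapping the documented CLI. The helper below is a minimal sketch, not part of the SLAF API: it only shells out to slaf benchmark run with a dataset name and a comma-separated list of types.
# Minimal sketch: drive the documented `slaf benchmark run` command from Python.
# Illustrative helper only; not part of the SLAF API.
import subprocess

BIO_BENCHMARK_TYPES = [
    "cell_filtering",
    "gene_filtering",
    "expression_queries",
    "anndata_ops",
    "scanpy_preprocessing",
]


def run_bio_benchmarks(dataset: str, types: list[str], verbose: bool = True) -> None:
    """Invoke the CLI for one dataset and a subset of benchmark types."""
    cmd = ["slaf", "benchmark", "run", "--datasets", dataset, "--types", ",".join(types)]
    if verbose:
        cmd.append("--verbose")
    subprocess.run(cmd, check=True)  # raises CalledProcessError if the CLI exits non-zero


if __name__ == "__main__":
    run_bio_benchmarks("pbmc3k_processed", BIO_BENCHMARK_TYPES)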
ML Benchmarks (Standalone Scripts)
- External Dataloader Comparisons - SLAF vs scDataset, BioNeMo SCDL, AnnDataLoader
- Internal Tokenization Strategies - scGPT, Geneformer, raw data loading
- Prefetcher Performance - Pipeline timing analysis across configurations
🎯 Usage Examples
Bioinformatics Development Workflow
# Quick test of cell filtering
slaf benchmark run --datasets pbmc3k_processed --types cell_filtering --verbose
# Comprehensive testing
slaf benchmark all --datasets pbmc3k_processed --auto-convert --verbose
ML Development Workflow
# Compare against external dataloaders
python benchmarks/benchmark_dataloaders_external.py
# Test different tokenization strategies
python benchmarks/benchmark_dataloaders_internal.py
# Analyze prefetcher performance
python benchmarks/benchmark_prefetcher.py
Performance Analysis
# Generate bioinformatics performance summary
slaf benchmark summary --results comprehensive_benchmark_results.json
# Update bioinformatics documentation with latest results
slaf benchmark docs --summary benchmark_summary.json
Multi-Dataset Testing
# Test bioinformatics benchmarks on multiple datasets
slaf benchmark run --datasets pbmc3k_processed pbmc_68k --types cell_filtering,expression_queries --auto-convert
📈 Output Files
Bioinformatics Results Files
- comprehensive_benchmark_results.json - Complete benchmark results with detailed timing and memory data
- benchmark_summary.json - Condensed summary for documentation updates
- benchmark_output.txt - Human-readable benchmark output with tables and analysis
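These files can be inspected programmatically with the standard json module. The snippet below is a minimal sketch that assumes only that the files are valid JSON; the exact schema is defined by the benchmark runner and is not documented here.
# Minimal sketch: peek at the bioinformatics result files without assuming a schema.
import json
from pathlib import Path

for path in (
    Path("benchmarks/comprehensive_benchmark_results.json"),
    Path("benchmarks/benchmark_summary.json"),
):
    if not path.exists():
        print(f"{path} not found - run `slaf benchmark run` / `slaf benchmark summary` first")
        continue
    with path.open() as f:
        data = json.load(f)
    # Report only the top-level structure
    if isinstance(data, dict):
        print(f"{path.name}: top-level keys -> {list(data.keys())}")
    else:
        print(f"{path.name}: top-level type -> {type(data).__name__}")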
ML Results Files
- ML benchmarks output results directly to the console with rich formatting
- Results are not automatically saved to files (manual documentation updates are required)
Documentation Integration
The bioinformatics benchmark system automatically updates docs/benchmarks/bioinformatics_benchmarks.md
with the latest performance data, ensuring documentation stays current with benchmark results. ML benchmarks are documented separately in docs/benchmarks/ml_benchmarks.md
and require manual updates.
🔍 Troubleshooting
Common Issues
- Dataset not found: Ensure datasets are in the correct directory and use --auto-convert to convert h5ad files
- Benchmark failures: Check that SLAF files exist and are properly formatted
- Memory issues: Some benchmarks require significant memory for large datasets
- ML benchmark dependencies: Ensure all ML dependencies are installed for standalone ML benchmarks
Debug Mode
# Run bioinformatics benchmarks with verbose output for debugging
slaf benchmark run --datasets pbmc3k_processed --types cell_filtering --verbose
# Run ML benchmarks with debug output
python benchmarks/benchmark_dataloaders_external.py --debug
📝 Contributing
Adding Bioinformatics Benchmarks
When adding new bioinformatics benchmarks:
- Create a new benchmark module following the existing pattern (a hypothetical skeleton follows this list)
- Add the benchmark type to the CLI in slaf/cli.py
- Update this documentation with the new benchmark type
- Test with slaf benchmark run --types your_new_benchmark
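The existing modules in benchmarks/ define the actual interface; the skeleton below is only a hypothetical illustration of what such a module tends to look like. The function name, signature, scenario names, and result fields are assumptions, not the real pattern.
# benchmarks/benchmark_your_new_benchmark.py -- hypothetical skeleton only.
# Names, signature, and result fields are illustrative assumptions; copy the
# real pattern from an existing module such as benchmark_cell_filtering.py.
import time


def benchmark_your_new_benchmark(slaf_path: str) -> list[dict]:
    """Time a few scenarios against a SLAF dataset and return result rows."""
    results = []
    for scenario in ("small_query", "large_query"):
        start = time.perf_counter()
        # ... run the operation under test against slaf_path ...
        elapsed = time.perf_counter() - start
        results.append(
            {
                "benchmark_type": "your_new_benchmark",
                "scenario": scenario,
                "slaf_time_seconds": elapsed,
            }
        )
    return results


if __name__ == "__main__":
    for row in benchmark_your_new_benchmark("pbmc3k_processed.slaf"):
        print(row)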
Adding ML Benchmarks
When adding new ML benchmarks:
- Create a new standalone benchmark script following the existing pattern (a hypothetical sketch follows this list)
- Add appropriate documentation in docs/benchmarks/ml_benchmarks.md
- Test the standalone script directly
- Consider integration with the CLI system in the future
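As a rough illustration of the standalone style (console output rather than JSON artifacts), the sketch below times a trivial stand-in workload and prints a table using the rich library. The script name, structure, metric, and the use of rich are assumptions; the existing standalone scripts define the actual pattern.
# benchmarks/benchmark_my_ml_case.py -- hypothetical standalone sketch.
# Structure, names, and the use of the `rich` library are illustrative
# assumptions; copy the real pattern from an existing standalone script.
import time

from rich.console import Console
from rich.table import Table


def time_batches(n_batches: int = 100) -> float:
    """Stand-in workload: replace the loop body with real dataloader iteration."""
    start = time.perf_counter()
    for _ in range(n_batches):
        pass  # iterate batches from the dataloader under test here
    return time.perf_counter() - start


if __name__ == "__main__":
    table = Table(title="My ML benchmark (illustrative)")
    table.add_column("Configuration")
    table.add_column("Total time (s)", justify="right")
    table.add_row("baseline", f"{time_batches():.3f}")
    Console().print(table)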
🏗️ Architecture
The benchmark system uses a modular design with two distinct approaches:
Bioinformatics Benchmarks (CLI Integrated)
- CLI Interface: Unified command-line interface in slaf/cli.py
- Benchmark Runner: Main orchestration in benchmarks/benchmark.py
- Individual Modules: Specialized benchmark tests in separate files
- Utilities: Shared functions in benchmarks/benchmark_utils.py
- Documentation: Automatic updates to docs/benchmarks/bioinformatics_benchmarks.md
ML Benchmarks (Standalone)
- Standalone Scripts: Independent benchmark scripts with rich console output
- External Comparisons: benchmark_dataloaders_external.py for competitor analysis
- Internal Analysis: benchmark_dataloaders_internal.py for tokenization strategies
- Pipeline Analysis: benchmark_prefetcher.py for prefetcher performance
- Documentation: Manual updates to docs/benchmarks/ml_benchmarks.md
🔄 Future Integration
The ML benchmarks are currently standalone but may be integrated with the CLI system in the future to provide:
- Unified benchmark execution
- Automatic result aggregation
- Integrated documentation updates
- Consistent output formatting
For now, ML benchmarks provide immediate value as standalone tools for development and performance analysis.