Core API
Public API from lance_array.core, in reading order: LanceArray (view type), TileCodec, open_array, and normalize_chunk_slices.
lance_array.core.LanceArray
2D view over a Lance dataset with one encoded tile per row.
Each tile is addressed by its logical grid index (tile_i, tile_j), which maps
to the Lance positional row index passed to take_blobs. Each stored payload is
decoded using the TileCodec chosen at write time; see decode_tile.
Indexing: NumPy-like view[row, col] — int, slice (including
step ≠ 1), ..., view[row] as view[row, :], and advanced indices
(integer or boolean ndarray, list) with the same broadcasting rules
as NumPy for 2D arrays. Overlapping tiles are read via batched take_blobs
and stitched (including partial edge tiles).
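As a rough illustration of the tile lookup behind a windowed read, the sketch below lists the grid indices that a half-open window touches. The helper `tiles_overlapping` and its signature are hypothetical and not part of lance_array; the library's own lookup and stitching logic may differ.

```python
def tiles_overlapping(rows, cols, chunks):
    """Return the (tile_i, tile_j) indices touched by a half-open window.

    rows and cols are (start, stop) pairs in pixel coordinates;
    chunks is the tile shape (tile_height, tile_width).
    """
    (r0, r1), (c0, c1) = rows, cols
    tile_h, tile_w = chunks
    if r1 <= r0 or c1 <= c0:
        return []  # empty window touches no tiles
    first_i, last_i = r0 // tile_h, (r1 - 1) // tile_h
    first_j, last_j = c0 // tile_w, (c1 - 1) // tile_w
    return [
        (i, j)
        for i in range(first_i, last_i + 1)
        for j in range(first_j, last_j + 1)
    ]


# Rows 100..299 span tile rows 0..2; columns 0..129 span tile columns 0..1.
window_tiles = tiles_overlapping((100, 300), (0, 130), chunks=(128, 128))
```

The resulting list is the set of rows that one batched read would need to fetch before cropping each tile to the requested window.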
Assignment: only for views opened with mode="r+". Supported keys match
basic NumPy indexing with slice step 1 on both axes; fancy and boolean
assignment is not implemented.
Full raster: to_numpy materializes the entire grid in one batched read
path.
Create on disk: LanceArray.to_lance writes a new dataset from a NumPy
image and returns a read-only view; open with mode="r+" to mutate.
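A minimal usage sketch, built only from the signatures documented on this page. It assumes lance_array and NumPy are installed and that `path` is a writable local directory. The wrapper function `roundtrip_demo` is hypothetical, and assigning an ndarray of matching shape is an assumption about what the right-hand side accepts.

```python
def roundtrip_demo(path):
    """Write a small raster, reopen it writable, read a window, and mutate it."""
    # Imports are local so the sketch can be loaded without the packages present.
    import numpy as np
    from lance_array.core import LanceArray, TileCodec

    image = np.arange(64 * 48, dtype=np.uint16).reshape(64, 48)

    # to_lance returns a read-only view, so reopen with mode="r+" to mutate.
    LanceArray.to_lance(path, image, (32, 32), codec=TileCodec.RAW)
    view = LanceArray.open(path, mode="r+")

    # A window that crosses tile boundaries, including partial edge tiles.
    window = view[10:40, 5:45]

    # Assignment uses basic indexing with slice step 1 on both axes.
    view[0:4, 0:4] = np.zeros((4, 4), dtype=image.dtype)

    return window, view.to_numpy()
```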
Attributes:
| Name | Type | Description |
|---|---|---|
| shape | tuple[int, int] | Raster shape |
| chunks | tuple[int, int] | Tile shape |
| dtype | dtype | Pixel dtype of the logical raster. |
Attributes
ndim: int
property
Number of dimensions.
Returns:
| Type | Description |
|---|---|
| int | Always |
coord_to_row: dict[tuple[int, int], int]
property
Map each tile grid index to its Lance positional row index.
Returns:
| Type | Description |
|---|---|
| dict[tuple[int, int], int] | Keys |
blob_column: str
property
Lance column name holding encoded tile payloads.
Returns:
| Type | Description |
|---|---|
| str | Blob v2 column name (default |
payload_layout: Literal['blob', 'bytes']
property
Physical payload layout for encoded tiles.
dataset: lance.LanceDataset
property
Underlying Lance dataset handle.
Returns:
| Type | Description |
|---|---|
| LanceDataset | Use e.g. |
n_tile_rows: int
property
Number of tile rows along axis 0.
Returns:
| Type | Description |
|---|---|
| int | |
n_tile_cols: int
property
Number of tile columns along axis 1.
Returns:
| Type | Description |
|---|---|
| int | |
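To make the relationship between shape, chunks, the tile counts, and coord_to_row concrete, here is a small sketch. It assumes the tile counts are the ceiling of shape over chunks (consistent with the partial edge tiles mentioned above) and uses a row-major layout for the mapping. Both helper names are hypothetical, and a dataset written with a different tile_order would have a different mapping.

```python
def tile_grid(shape, chunks):
    """Return (n_tile_rows, n_tile_cols), counting partial edge tiles."""
    n_tile_rows = -(-shape[0] // chunks[0])  # ceiling division
    n_tile_cols = -(-shape[1] // chunks[1])
    return n_tile_rows, n_tile_cols


def row_major_coord_to_row(shape, chunks):
    """Map each (tile_i, tile_j) to a positional row index, row-major order."""
    n_rows, n_cols = tile_grid(shape, chunks)
    return {
        (i, j): i * n_cols + j
        for i in range(n_rows)
        for j in range(n_cols)
    }


grid = tile_grid((1000, 700), (256, 256))
mapping = row_major_coord_to_row((1000, 700), (256, 256))
```

In this example the last tile row is only 1000 - 3 * 256 = 232 pixels tall, which is the kind of partial edge tile a read has to crop.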
Functions
__init__(dataset: lance.LanceDataset, chunk_shape: tuple[int, int], image_shape: tuple[int, int], coord_to_row: dict[tuple[int, int], int], decode_tile: Callable[[bytes], np.ndarray], *, blob_column: str = 'blob', payload_layout: Literal['blob', 'bytes'] = 'blob', tile_row_col: str = _TILE_ROW_COL, tile_col_col: str = _TILE_COL_COL, dtype: np.dtype | None = None, encode_tile: Callable[[np.ndarray], bytes] | None = None) -> None
Build a view from an open Lance dataset (prefer LanceArray.open or LanceArray.to_lance).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| dataset | LanceDataset | Open | required |
| chunk_shape | tuple[int, int] | | required |
| image_shape | tuple[int, int] | Full raster shape | required |
| coord_to_row | dict[tuple[int, int], int] | Map | required |
| decode_tile | Callable[[bytes], ndarray] | Decode one blob payload to a | required |
| blob_column | str | Lance blob column name for tile payloads. | 'blob' |
| dtype | dtype \| None | Raster dtype; defaults to | None |
| encode_tile | Callable[[ndarray], bytes] \| None | Encode a tile array to bytes; must be set for | None |
open(path: str | Path, *, mode: str = 'r') -> LanceArray
classmethod
Open a dataset written with LanceArray.to_lance using a sidecar manifest.
The dataset root must contain lance_array.json (written by LanceArray.to_lance).
For local paths this is a file under the dataset directory; for URIs (e.g.
s3://...) the manifest is read with smart-open, an optional dependency.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| path | str \| Path | Lance dataset directory or URI (e.g. | required |
| mode | str | | 'r' |
Returns:
| Type | Description |
|---|---|
| LanceArray | View over the on-disk dataset; use |
Raises:
| Type | Description |
|---|---|
| ValueError | If |
| FileNotFoundError | If |
| ImportError | If a remote URI is used without |
to_lance(path: str | Path, image: np.ndarray, chunk_shape: tuple[int, int], *, codec: TileCodec | str = TileCodec.RAW, blosc_typesize: int | None = None, blosc_clevel: int = 5, blosc_cname: str = 'zstd', blob_column: str = 'payload', data_storage_version: Literal['stable', '2.0', '2.1', '2.2', '2.3', 'next', 'legacy', '0.1'] = '2.2', tile_order: Literal['row_major', 'morton', 'hilbert'] = 'morton', payload_layout: Literal['blob', 'bytes'] = 'bytes') -> LanceArray
classmethod
Write a 2D image as one encoded tile per row and return a LanceArray.
The on-disk table stores tile coordinates (default columns
tile_row, tile_col), morton_code, and an encoded payload
column (blob-v2 or large-binary bytes depending on payload_layout).
A sidecar lance_array.json stores shape, chunk grid, dtype, and codec
parameters so LanceArray.open works.
Pass codec= as TileCodec or a string alias ("raw",
"blosc_numcodecs", "blosc2"). Blosc presets use blosc_typesize
(defaults to dtype itemsize), blosc_clevel, and blosc_cname where
applicable.
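For readers unfamiliar with Morton (Z-order) tile ordering, the sketch below interleaves the bits of the two tile indices and sorts tiles by the resulting code. It is illustrative only: the function name is hypothetical, and which axis occupies the even bits in lance_array's morton_code column is an assumption made here.

```python
def morton_code(tile_i, tile_j, bits=16):
    """Interleave the low `bits` bits of tile_j (even) and tile_i (odd)."""
    code = 0
    for b in range(bits):
        code |= ((tile_j >> b) & 1) << (2 * b)
        code |= ((tile_i >> b) & 1) << (2 * b + 1)
    return code


# A 2 x 4 tile grid, listed row-major and then sorted into Z-order.
tiles = [(i, j) for i in range(2) for j in range(4)]
ordered = sorted(tiles, key=lambda t: morton_code(*t))
```

Sorting this way keeps tiles that are close in 2D close together in insertion order, which is the usual motivation for choosing a space-filling curve over row_major.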
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| path | str \| Path | Output dataset directory. | required |
| image | ndarray | Full raster | required |
| chunk_shape | tuple[int, int] | | required |
| codec | TileCodec \| str | Built-in tile codec preset. | RAW |
| blosc_typesize | int \| None | Blosc | None |
| blosc_clevel | int | Blosc compression level. | 5 |
| blosc_cname | str | Blosc compressor name (e.g. | 'zstd' |
| blob_column | str | Name of the blob column in the Lance schema. | 'payload' |
| data_storage_version | Literal['stable', '2.0', '2.1', '2.2', '2.3', 'next', 'legacy', '0.1'] | Passed to | '2.2' |
| tile_order | Literal['row_major', 'morton', 'hilbert'] | Physical insertion order of tiles in the Lance table. | 'morton' |
Returns:
| Type | Description |
|---|---|
| LanceArray | Read-only view over the written dataset (use |
Raises:
| Type | Description |
|---|---|
| ValueError | If |
decode_tile(data: bytes) -> np.ndarray
Decode one blob from storage into a single tile array.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| data | bytes | Raw bytes from the blob column for one row. | required |
Returns:
| Type | Description |
|---|---|
| ndarray | Shape |
to_numpy() -> np.ndarray
Decode all tiles and return the full raster (single batched read path).
Returns:
| Type | Description |
|---|---|
| ndarray | Shape |
lance_array.core.TileCodec
Bases: Enum
How each tile is encoded in the Lance blob column.
Pass a member (or string alias such as "raw", "blosc_numcodecs",
"blosc2") to LanceArray.to_lance. BLOSC2 requires the blosc2
package (e.g. lance-array[zarr]). See enum members below for each preset.
Attributes
RAW = 'raw'
class-attribute
instance-attribute
Uncompressed contiguous bytes (dtype itemsize × cells per tile).
BLOSC_NUMCODECS = 'blosc_numcodecs'
class-attribute
instance-attribute
Blosc1 via numcodecs.Blosc (typical Zarr Blosc parity).
BLOSC2 = 'blosc2'
class-attribute
instance-attribute
Blosc2 via blosc2.compress / decompress (install blosc2).
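To show what the RAW preset's "dtype itemsize × cells per tile" payload amounts to, here is a minimal NumPy sketch of an uncompressed encode/decode pair. The function names are hypothetical, and C (row-major) byte order within a tile is an assumption; the library's actual RAW layout and its handling of partial edge tiles may differ.

```python
import numpy as np


def encode_raw(tile):
    """Serialize a tile as contiguous, uncompressed bytes (C order assumed)."""
    return np.ascontiguousarray(tile).tobytes()


def decode_raw(data, chunks, dtype):
    """Rebuild a tile from raw bytes; copy so the result is writable."""
    return np.frombuffer(data, dtype=dtype).reshape(chunks).copy()


tile = np.arange(12, dtype=np.uint16).reshape(3, 4)
payload = encode_raw(tile)
restored = decode_raw(payload, (3, 4), np.uint16)
```

Here the payload is 2 bytes × 12 cells = 24 bytes, matching the size formula given for RAW.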
lance_array.core.open_array(store: str | Path, *, mode: str = 'r') -> LanceArray
Open a Lance tile dataset (Zarr-style entry point).
Like zarr.open_array, but for a dataset written by LanceArray.to_lance
(includes lance_array.json).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| store | str \| Path | Dataset directory or URI passed to | required |
| mode | str | | 'r' |
Returns:
| Type | Description |
|---|---|
| LanceArray | Same as |
Raises:
| Type | Description |
|---|---|
| ValueError | Invalid |
| FileNotFoundError | Missing |
| ImportError | Remote URI without |
Notes
For s3://, gs://, or https://, install smart-open (extra
lance-array[cloud]). The manifest is read with smart_open before
opening Lance.
lance_array.core.normalize_chunk_slices(s: slice, dim: int) -> tuple[int, int]
Normalize a slice to (start, stop) with step 1 and positive span.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| s | slice | Slice along an axis of logical length | required |
| dim | int | Size of that axis. | required |
Returns:
| Type | Description |
|---|---|
| tuple[int, int] | Half-open interval |
Raises:
| Type | Description |
|---|---|
| ValueError | If |
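The behavior described for normalize_chunk_slices can be approximated with Python's built-in slice.indices. The sketch below is not the library's implementation: the name `normalize_slice_sketch` is hypothetical, and raising ValueError for a non-unit step or an empty span is an assumption inferred from the "step 1 and positive span" wording.

```python
def normalize_slice_sketch(s, dim):
    """Resolve a slice against an axis of length dim to a half-open (start, stop)."""
    # slice.indices clamps to [0, dim] and resolves None and negative bounds.
    start, stop, step = s.indices(dim)
    if step != 1:
        raise ValueError(f"only step 1 is supported, got step={step}")
    if stop <= start:
        raise ValueError(f"empty slice after normalization: {start}:{stop}")
    return start, stop


full = normalize_slice_sketch(slice(None), 10)
tail = normalize_slice_sketch(slice(-3, None), 10)
clamped = normalize_slice_sketch(slice(2, 100), 10)
```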