numcodecs_bitmap_index

BitmapIndexCodec for the numcodecs buffer compression API.

Modules:

  • typing

    Commonly used type variables.

Classes:

  • BitmapIndexCodec

    Codec that uses bitmaps to encode the most frequent bitpatterns in the data and encodes any remaining values as-is.

BitmapIndexCodec

BitmapIndexCodec(
    *,
    max_bitmaps: None | int = None,
    cost_factor: float = 1,
)

Bases: Codec

Codec that uses bitmaps to encode the most frequent bitpatterns in the data and encodes any remaining values as-is.

A simple heuristic encodes a bitpattern with a bitmap only where the direct savings outweigh the cost of the bitmap. The codec can be configured to bound the number of bitmaps (max_bitmaps) or to scale the cost (cost_factor).
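The savings-versus-cost idea can be illustrated with a small pure-Python toy. This is a sketch under stated assumptions, not the codec's actual algorithm or storage format: the functions bitmap_encode and bitmap_decode, the item_bits parameter, and the cost model (one bit per remaining element plus one stored copy of the pattern) are all hypothetical. Only the roles of max_bitmaps and cost_factor follow the description above.

```python
from collections import Counter


def bitmap_encode(values, item_bits=32, max_bitmaps=None, cost_factor=1.0):
    """Toy bitmap-index encoding of a list of ints (illustrative only)."""
    remaining = list(values)
    bitmaps = []  # (pattern, bitmap) pairs, in the order they were created
    while remaining and (max_bitmaps is None or len(bitmaps) < max_bitmaps):
        pattern, count = Counter(remaining).most_common(1)[0]
        # Assumed savings: every occurrence no longer needs item_bits of storage.
        savings = count * item_bits
        # Assumed cost: one bit per remaining element plus the pattern itself.
        cost = cost_factor * (len(remaining) + item_bits)
        if savings <= cost:
            break  # the bitmap would not pay for itself
        bitmaps.append((pattern, [v == pattern for v in remaining]))
        remaining = [v for v in remaining if v != pattern]
    return bitmaps, remaining


def bitmap_decode(bitmaps, remaining):
    """Invert bitmap_encode by replaying the bitmaps in reverse order."""
    out = list(remaining)
    for pattern, bitmap in reversed(bitmaps):
        rest = iter(out)
        out = [pattern if hit else next(rest) for hit in bitmap]
    return out


data = [0, 0, 0, 7, 0, 0, 3, 0]
bitmaps, rest = bitmap_encode(data)           # only the pattern 0 gets a bitmap
restored = bitmap_decode(bitmaps, rest)       # equals data again
cheap, _ = bitmap_encode(data, cost_factor=0.5)                 # more bitmaps
capped, _ = bitmap_encode(data, cost_factor=0.5, max_bitmaps=1)  # at most one
```

In this toy, lowering cost_factor makes rarer patterns worth a bitmap, while max_bitmaps stops the loop early regardless of the estimated savings.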

Encoding produces a 1D array of unsigned integers with the same itemsize as the original data.
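To make the term "bitpattern" concrete, the following standard-library sketch reinterprets a 4-byte float as an unsigned integer of the same size. The helper float32_bits is hypothetical and not part of this package; it only shows that values which compare equal numerically can still have distinct bit patterns.

```python
import struct


def float32_bits(x: float) -> int:
    """Return the IEEE 754 binary32 bit pattern of x as an unsigned 32-bit int."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


print(hex(float32_bits(0.0)))   # 0x0
print(hex(float32_bits(-0.0)))  # 0x80000000
print(hex(float32_bits(1.0)))   # 0x3f800000
print(0.0 == -0.0)              # True, although the bit patterns differ
```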

Parameters:
  • max_bitmaps (None | int, default: None ) –

    Maximum number of bitmaps to use.

  • cost_factor (float, default: 1 ) –

    Factor for the cost of a bitmap. Use >1 if bitmaps compress worse than the original data, or <1 if they compress better.

Methods:

  • encode

    Encode the data in buf by replacing the most frequent bitpatterns with bitmaps.

  • decode

    Decode the data in buf.

  • get_config

    Returns the configuration of the codec.

codec_id class-attribute instance-attribute

codec_id: str = 'bitmap-index'

encode

encode(
    buf: ndarray[S, dtype[T]],
) -> ndarray[tuple[int], dtype[U]]

Encode the data in buf by replacing the most frequent bitpatterns with bitmaps.

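Parameters:
  • buf (ndarray[S, dtype[T]]) –

    Data to be encoded.

Returns:
  • ndarray[tuple[int], dtype[U]] –

    Encoded 1D array with an unsigned integer dtype and the same itemsize as buf.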

decode

decode(
    buf: ndarray[tuple[int], dtype[U]],
    out: None | ndarray[S, dtype[T]] = None,
) -> ndarray[S, dtype[T]]

Decode the data in buf.

Parameters:
  • buf (ndarray[tuple[int], dtype[U]]) –

    Encoded 1D array with an unsigned integer dtype.

  • out (None | ndarray[S, dtype[T]], default: None ) –

    Writeable array to store the decoded data.

Returns:
  • ndarray[S, dtype[T]] –

    The decoded data.

get_config

get_config() -> dict

Returns the configuration of the codec.

numcodecs.registry.get_codec(config) can be used to reconstruct this codec from the returned config.

Returns:
  • config( dict ) –

    Configuration of the codec.
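As a rough illustration of the round trip through get_config, the snippet below shows what the returned dict plausibly looks like. The exact keys are an assumption inferred from the constructor parameters; only the 'bitmap-index' id and the parameter names and defaults appear on this page.

```python
# Assumed shape of the dict returned by get_config(); the keys are inferred
# from the constructor parameters and are not confirmed by this page.
config = {"id": "bitmap-index", "max_bitmaps": None, "cost_factor": 1}

# By numcodecs convention, "id" selects the registered codec class and the
# remaining entries are passed to it as keyword arguments.
codec_id = config["id"]
kwargs = {key: value for key, value in config.items() if key != "id"}
```

Passing such a dict to numcodecs.registry.get_codec performs this split internally, provided the codec class has been registered under its codec_id.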