Source: kerchunk
Section: python
Maintainer: Debian GIS Project <pkg-grass-devel@lists.alioth.debian.org>
Uploaders: Antonio Valentino <antonio.valentino@tiscali.it>
Build-Depends: debhelper-compat (= 13),
               dh-sequence-python3,
               dh-sequence-sphinxdoc <!nodoc>,
               pybuild-plugin-pyproject,
               python3-aiohttp <!nocheck>,
               python3-all,
               python3-astropy <!nodoc>,
               python3-cfgrib,
               python3-cftime,
               python3-dask <!nocheck>,
               python3-eccodes <!nodoc>,
               python3-fsspec,
               python3-h5netcdf <!nocheck>,
               python3-h5py,
               python3-netcdf4 <!nocheck>,
               python3-numcodecs,
               python3-numpy,
               python3-numpydoc <!nodoc>,
               python3-pytest <!nocheck>,
               python3-scipy,
               python3-setuptools,
               python3-setuptools-scm,
               python3-sphinx <!nodoc>,
               python3-sphinx-rtd-theme <!nodoc>,
               python3-tifffile <!nodoc>,
               python3-ujson,
               python3-xarray,
               python3-zarr
Standards-Version: 4.7.3
Testsuite: autopkgtest-pkg-pybuild
Homepage: https://github.com/fsspec/kerchunk
Vcs-Browser: https://salsa.debian.org/debian-gis-team/kerchunk
Vcs-Git: https://salsa.debian.org/debian-gis-team/kerchunk.git
Description: Cloud-friendly access to archival data
 Kerchunk is a library that provides a unified way to represent a
 variety of chunked, compressed data formats (e.g. NetCDF, HDF5, GRIB),
 allowing efficient access to the data from traditional file systems or
 cloud object storage.  It also provides a flexible way to create
 virtual datasets from multiple files.  It does this by extracting the
 byte ranges, compression information and other information about the
 data and storing this metadata in a new, separate object.
 This means that you can create a virtual aggregate dataset over
 potentially many source files, for efficient, parallel and
 cloud-friendly in-situ access without having to copy or translate
 the originals. It is a gateway to in-the-cloud massive data processing
 while the data providers still insist on using legacy formats for
 archival storage.
 .
 Features:
 .
  * completely serverless architecture
  * metadata consolidation, so you can understand a many-file dataset
    (metadata plus physical storage) in a single read
  * read from all of the storage backends supported by fsspec,
    including object storage (s3, gcs, abfs, alibaba), http, cloud user
    storage (dropbox, gdrive) and network protocols (ftp, ssh, hdfs,
    smb...)
  * loading of various file types (currently NetCDF4/HDF, GRIB2, TIFF,
    FITS, Zarr), potentially heterogeneous within a single dataset,
    without a need to go via the specific driver (e.g., no need for
    h5py)
  * asynchronous concurrent fetch of many data chunks in one go,
    amortizing the cost of latency
  * parallel access with a library like zarr without any locks
  * logical datasets spanning many (potentially millions of) data
    files, with direct access to and subselection of them via
    coordinate indexing across an arbitrary number of dimensions

Package: python3-kerchunk
Architecture: all
Depends: ${python3:Depends},
         ${misc:Depends}
Recommends: python3-cfgrib,
            python3-cftime,
            python3-h5py,
            python3-scipy,
            python3-xarray
Suggests: python3-aiohttp,
          python3-dask,
          python3-netcdf4
Description: ${source:Synopsis}
 ${source:Extended-Description}

Package: python-kerchunk-doc
Section: doc
Architecture: all
Multi-Arch: foreign
Depends: ${sphinxdoc:Depends},
         ${misc:Depends}
Suggests: www-browser
Description: ${source:Synopsis} (documentation)
 ${source:Extended-Description}
 .
 This package provides the HTML documentation for kerchunk.
