Backend storage

Data used for backend computations and their results are stored on a MinIO server running on PAVICS. MinIO is a scalable, high-performance object storage server that is API-compatible with the Amazon S3 cloud storage service.

MinIO organizes data in buckets. Each bucket is a top-level namespace for objects. The project’s bucket is called portail-ing. Each bucket may have its own access rules and policies (access rights, maximum size, versioning, etc.).
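As a minimal sketch of how object paths relate to buckets: the first path component names the bucket, and the rest is the object key within it. The helper split_object_path and the key test/data.zarr are hypothetical, used only for illustration.

```python
def split_object_path(path):
    """Split 'bucket/key/parts' into (bucket, key).

    The key is an empty string when the path names only a bucket.
    Hypothetical helper, not part of MinIO or s3fs.
    """
    bucket, _, key = path.strip("/").partition("/")
    return bucket, key


# Illustrative path: an object key inside the project's bucket
print(split_object_path("portail-ing/test/data.zarr"))
# → ('portail-ing', 'test/data.zarr')
```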

Linux client

The easiest way to read from and write to a MinIO instance is through the mc command-line client. To configure the client, run the following command:

mc alias set pavics https://minio.ouranos.ca

providing your access key and secret key when prompted. Ask your administrator for these keys. Read-only access is also possible without keys: press Enter at both prompts.

You can then list the contents of the bucket with mc ls:

mc ls pavics/portail-ing

To copy files to the bucket, assuming you have write access, use mc cp:

mc cp file.txt pavics/portail-ing/file_copy.txt

Add the -r flag to copy a directory recursively.

Programmatic access for developers

The MinIO server can be accessed programmatically in Python using s3fs. The two functions below show how to write a dataset to a bucket and how to read it back.


def write_to_minio(local_path="data.nc", root="test/data.zarr"):
    """Open netCDF dataset from local path and write to MinIO as a Zarr object."""
    import s3fs
    import xarray as xr

    # Open connection to MinIO server
    ACCESS_KEY = "<YOUR_ACCESS_KEY>"
    SECRET_KEY = "<YOUR_SECRET_KEY>"

    s3 = s3fs.S3FileSystem(anon=False, key=ACCESS_KEY, secret=SECRET_KEY, use_ssl=False,
                           client_kwargs={"endpoint_url": "http://minio.ouranos.ca"})

    # Create store from bucket name / object name
    store = s3fs.S3Map(root=root, s3=s3, check=False)

    # Open local dataset
    ds = xr.open_dataset(local_path).chunk()

    # Write to MinIO
    ds.to_zarr(store=store, mode="w", consolidated=True)

def read_from_minio(root="test/data.zarr"):
    """Read Zarr object from MinIO and return as xarray dataset."""
    import s3fs
    import xarray as xr

    s3r = s3fs.S3FileSystem(anon=True, use_ssl=False, client_kwargs={"endpoint_url": "http://minio.ouranos.ca"})
    store = s3fs.S3Map(root=root, s3=s3r, check=False)
    return xr.open_zarr(store=store)
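
The mc ls and mc cp commands from the Linux client section can be approximated from Python as well. The sketch below is one possible way to do it with s3fs, reusing the endpoint from the functions above. The helper names (filesystem_kwargs, list_objects, upload) are hypothetical, and anonymous listing only works if the bucket's policy allows it.

```python
ENDPOINT_URL = "http://minio.ouranos.ca"  # same endpoint as in the functions above


def filesystem_kwargs(access_key=None, secret_key=None):
    """Build keyword arguments for s3fs.S3FileSystem (hypothetical helper).

    Without both keys, the connection is anonymous (read-only).
    """
    kwargs = {"use_ssl": False, "client_kwargs": {"endpoint_url": ENDPOINT_URL}}
    if access_key and secret_key:
        kwargs.update(anon=False, key=access_key, secret=secret_key)
    else:
        kwargs["anon"] = True
    return kwargs


def list_objects(path="portail-ing", access_key=None, secret_key=None):
    """List the objects under a bucket or prefix, similar to `mc ls`."""
    import s3fs

    s3 = s3fs.S3FileSystem(**filesystem_kwargs(access_key, secret_key))
    return s3.ls(path)


def upload(local_path, remote_path, access_key, secret_key, recursive=False):
    """Copy a local file (or a directory if recursive=True), similar to `mc cp`."""
    import s3fs

    s3 = s3fs.S3FileSystem(**filesystem_kwargs(access_key, secret_key))
    s3.put(local_path, remote_path, recursive=recursive)
```

For example, list_objects("portail-ing") would return the object paths in the project's bucket, and upload("file.txt", "portail-ing/file_copy.txt", ACCESS_KEY, SECRET_KEY) would mirror the mc cp command shown earlier, assuming the keys grant write access.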