pydap

Join the chat at https://gitter.im/pydap/pydap

pydap is an implementation of the OPeNDAP/DODS protocol, written from scratch in pure Python. You can use pydap to access scientific data on the internet without having to download it; instead, you work with special array and iterable objects that download data on-the-fly as necessary, saving bandwidth and time. The module also comes with a robust-but-lightweight OPeNDAP server, implemented as a WSGI application.

Quickstart

pydap is a lightweight Python package that you can use in either of two modes: as a client or as a server. You can install the latest version using pip. After installing pip, you can install pydap with this command:

    $ pip install pydap

This will install pydap together with all the required dependencies. pydap is also available through Anaconda (conda-forge channel). The command below creates a fresh conda environment named pydap containing pydap, its required dependencies, and some commonly used additional packages:

    $ mamba create -n pydap -c conda-forge python=3.10 pydap numpy jupyterlab ipython netCDF4 scipy matplotlib

Now you simply activate the pydap environment:

    $ mamba activate pydap

You can now use pydap as a client and open any remotely served dataset, and pydap will download the accessed data on-the-fly as needed:

    >>> from pydap.client import open_url
    >>> dataset = open_url('http://test.opendap.org/dap/data/nc/coads_climatology.nc')
    >>> var = dataset['SST']
    >>> var.shape
    (12, 90, 180)
    >>> var.dtype
    dtype('>f4')
    >>> data = var[0,10:14,10:14]  # this will download data from the server
    >>> data
    <GridType with array 'SST' and maps 'TIME', 'COADSY', 'COADSX'>
    >>> print(data.data)
    [array([[[ -1.26285708e+00,  -9.99999979e+33,  -9.99999979e+33,
              -9.99999979e+33],
            [ -7.69166648e-01,  -7.79999971e-01,  -6.75454497e-01,
              -5.95714271e-01],
            [  1.28333330e-01,  -5.00000156e-02,  -6.36363626e-02,
              -1.41666666e-01],
            [  6.38000011e-01,   8.95384610e-01,   7.21666634e-01,
               8.10000002e-01]]], dtype=float32), array([ 366.]), array([-69., -67., -65., -63.]), array([ 41.,  43.,  45.,  47.])]
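
The value -9.99999979e+33 in the output above looks like a fill value for missing data rather than a real temperature. The sketch below shows one way to mask such values with NumPy before computing statistics. The array is copied from the printed output, and the -1e33 threshold is an assumption chosen for illustration; in practice, check the variable's attributes for the missing value the dataset declares.

```python
import numpy as np

# SST values copied from the printed output above (shape: 1 x 4 x 4).
sst = np.array(
    [[[-1.26285708e+00, -9.99999979e+33, -9.99999979e+33, -9.99999979e+33],
      [-7.69166648e-01, -7.79999971e-01, -6.75454497e-01, -5.95714271e-01],
      [1.28333330e-01, -5.00000156e-02, -6.36363626e-02, -1.41666666e-01],
      [6.38000011e-01, 8.95384610e-01, 7.21666634e-01, 8.10000002e-01]]],
    dtype=np.float32,
)

# Assumption: anything below -1e33 is a fill value, not a measurement.
masked = np.ma.masked_less(sst, -1e33)

n_missing = int(masked.mask.sum())  # number of masked (missing) cells
n_valid = int(masked.count())       # number of cells that remain usable
mean_sst = float(masked.mean())     # mean over the valid cells only

print(f"{n_missing} missing, {n_valid} valid, mean = {mean_sst:.3f}")
```

Working on a masked array keeps the original shape intact, so the result still lines up with the TIME, COADSY and COADSX maps returned alongside the data.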

For more information, please check the documentation on using pydap as a client.

pydap also comes with a simple server, implemented as a WSGI application. To use it, you first need to install the server and optionally a data handler:

    $ pip install "pydap[server,handlers.netcdf]"

This will install support for netCDF files; more handlers for different formats are available, if necessary. Now create a directory for your server data.

To run the server just issue the command:

    $ pydap --data ./myserver/data/ --port 8001 --workers 4 --threads 4

This will start a standalone server at http://localhost:8001/, serving netCDF files from ./myserver/data/, similar to the test server at http://test.pydap.org/. The server uses the WSGI standard and by default runs with 1 worker and 1 thread, but both can be set by the user, as in the command above (4 workers and 4 threads). pydap can also easily be run behind Apache. The server documentation has more information on how to better deploy pydap.
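
For readers unfamiliar with WSGI, the following minimal sketch shows what "a WSGI application" means in practice: a callable that takes a request environment and a `start_response` function and returns the response body. It uses only the Python standard library. `hello_app` and `call_app` are hypothetical names invented for this illustration; this is not pydap's server code, only a stand-in with the same interface.

```python
from wsgiref.util import setup_testing_defaults


def hello_app(environ, start_response):
    """Hypothetical stand-in for a WSGI application such as a data server."""
    body = b"hello from a WSGI app\n"
    headers = [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]


def call_app(app, path="/"):
    """Invoke a WSGI app directly, without opening a network socket."""
    environ = {}
    setup_testing_defaults(environ)  # fill in a minimal, valid WSGI environ
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        # Record what the application reports instead of sending it.
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(app(environ, start_response))
    return captured["status"], body


status, body = call_app(hello_app)
print(status, body)
```

Any WSGI-compliant server can host such a callable, which is why the same application can run standalone or behind a web server. For example, the standard library's `wsgiref.simple_server.make_server("localhost", 8001, hello_app).serve_forever()` would serve the stand-in app above over HTTP.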

Documentation

For more information, see the pydap documentation.

Help

If you need any help with pydap, please feel free to send an email to the mailing list.