HDA PySTAC-Client Introduction

This notebook shows the basic use of DestinE Data Lake (DEDL) Harmonised Data Access (HDA) with pystac-client. It covers iterating through Collections and Items and performing simple spatio-temporal searches.

Obtain DEDL Access Token to use the HDA service

pip install --quiet --upgrade destinelab

Note: you may need to restart the kernel to use updated packages.

import requests
import json
import os
from getpass import getpass
import destinelab as deauth

DESP_USERNAME = input("Please input your DESP username or email: ")
DESP_PASSWORD = getpass("Please input your DESP password: ")

auth = deauth.AuthHandler(DESP_USERNAME, DESP_PASSWORD)
access_token = auth.get_token()
if access_token is not None:
    print("DEDL/DESP Access Token Obtained Successfully")
else:
    print("Failed to Obtain DEDL/DESP Access Token")

auth_headers = {"Authorization": f"Bearer {access_token}"}
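
The header construction above can be wrapped in a small guard so that a missing token fails early instead of producing a `Bearer None` header. This is a minimal sketch: `build_auth_headers` is a hypothetical helper for illustration and is not part of destinelab.

```python
from typing import Dict, Optional


def build_auth_headers(token: Optional[str]) -> Dict[str, str]:
    """Return an Authorization header for a bearer token.

    Hypothetical helper: raises instead of silently building
    "Bearer None" when no token was obtained.
    """
    if not token:
        raise ValueError("No access token available; authentication failed.")
    return {"Authorization": f"Bearer {token}"}


# Example with a placeholder string (not a real credential).
example_headers = build_auth_headers("example-token")
print(example_headers)
```

In the notebook, the same function could be called with `access_token` in place of the placeholder string.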

Please input your DESP username or email:  eum-dedl-user
Please input your DESP password:  ········

Response code: 200
DEDL/DESP Access Token Obtained Successfully

Set username and password as environment variables to be used for DEDL data access

import os

os.environ["EODAG__DEDL__AUTH__CREDENTIALS__USERNAME"] = DESP_USERNAME
os.environ["EODAG__DEDL__AUTH__CREDENTIALS__PASSWORD"] = DESP_PASSWORD
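
The variable names above appear to follow the EODAG convention in which double underscores separate nesting levels of the provider configuration. The sketch below only illustrates that naming pattern under this assumption; `eodag_env_path` is a hypothetical helper and does not call EODAG itself.

```python
from typing import List


def eodag_env_path(var_name: str) -> List[str]:
    """Split an EODAG-style variable name into a lowercase config path.

    Assumption: "EODAG__" is the prefix and "__" separates nesting levels.
    """
    prefix = "EODAG__"
    if not var_name.startswith(prefix):
        raise ValueError(f"{var_name!r} does not start with {prefix!r}")
    return [part.lower() for part in var_name[len(prefix):].split("__")]


# Shows which configuration entry the variable is assumed to address.
print(eodag_env_path("EODAG__DEDL__AUTH__CREDENTIALS__USERNAME"))
# → ['dedl', 'auth', 'credentials', 'username']
```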

Create pystac client object for HDA STAC API

We first connect to an API by retrieving the root catalog, or landing page, of the API with the Client.open function.

from pystac_client import Client

HDA_API_URL = "https://hda.data.destination-earth.eu/stac"
cat = Client.open(HDA_API_URL, headers=auth_headers)

Query all available collections

As with a static catalog, the get_collections function iterates through the Collections in the Catalog. Because this is an API, it can retrieve all Collections with a single call rather than fetching each one individually.

from rich.console import Console
import rich.table

console = Console()

hda_collections = cat.get_collections()

table = rich.table.Table(title="HDA collections", expand=True)
table.add_column("ID", style="cyan", justify="right", no_wrap=True)
table.add_column("Title", style="violet", no_wrap=True)
for collection in hda_collections:
    table.add_row(collection.id, collection.title)
console.print(table)
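
Since get_collections yields collections one at a time, the result can be narrowed with an ordinary filter. The sketch below selects collections whose ID starts with a given prefix. It uses stand-in objects so that it runs without a connection: only the ID `EO.ESA.DAT.SENTINEL-1.L1_GRD` comes from this notebook, while the other ID, the titles, and the `filter_by_prefix` helper are hypothetical.

```python
from types import SimpleNamespace
from typing import Iterable, List


def filter_by_prefix(collections: Iterable, prefix: str) -> List:
    """Keep only the collections whose `id` starts with `prefix`."""
    return [c for c in collections if c.id.startswith(prefix)]


# Stand-ins for pystac Collection objects (only `.id` and `.title` are used).
sample_collections = [
    SimpleNamespace(id="EO.ESA.DAT.SENTINEL-1.L1_GRD", title="Placeholder title A"),
    SimpleNamespace(id="EXAMPLE.HYPOTHETICAL.COLLECTION", title="Placeholder title B"),
]

esa_collections = filter_by_prefix(sample_collections, "EO.ESA.")
for collection in esa_collections:
    print(collection.id, "|", collection.title)
```

With a live client, the same helper could be applied to `cat.get_collections()`. Wrapping that call in `list(...)` first keeps the results available for reuse, because an iterator can only be consumed once.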

Obtain provider information for each individual collection

table = rich.table.Table(title="HDA collections | Providers", expand=True)
table.add_column("Title", style="cyan", justify="right", no_wrap=True)
table.add_column("Provider", style="violet", no_wrap=True)

hda_collections = cat.get_collections()

for collection in hda_collections:
    collection_details = cat.get_collection(collection.id)
    provider = ','.join(str(x.name) for x in collection_details.providers)
    table.add_row(collection_details.title, provider)
console.print(table)

Inspect Items of a Collection

The main functions for getting items return iterators, and pystac-client retrieves additional pages as they are needed. In the search below, max_items=10 caps the total number of items returned across all pages.
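
The lazy paging described above can be modelled with a plain generator. This is a simplified sketch and not pystac-client's actual implementation: `iter_items`, `fake_fetch_page`, the page size of 10, and the 25 fake items are all assumptions made for illustration.

```python
from typing import Callable, Iterator, List, Optional


def iter_items(
    fetch_page: Callable[[int], List[str]],
    max_items: Optional[int] = None,
) -> Iterator[str]:
    """Yield items page by page, requesting a new page only when needed."""
    yielded = 0
    page_number = 0
    while max_items is None or yielded < max_items:
        page = fetch_page(page_number)
        if not page:
            return  # no more results
        for item in page:
            yield item
            yielded += 1
            if max_items is not None and yielded >= max_items:
                return  # cap reached, stop without requesting more pages
        page_number += 1


# Fake "server": 25 items served in pages of 10, with a log of page requests.
ITEMS = [f"item-{i}" for i in range(25)]
PAGE_SIZE = 10
requested_pages: List[int] = []


def fake_fetch_page(page_number: int) -> List[str]:
    requested_pages.append(page_number)
    start = page_number * PAGE_SIZE
    return ITEMS[start:start + PAGE_SIZE]


first_fifteen = list(iter_items(fake_fetch_page, max_items=15))
print(len(first_fifteen), requested_pages)
# → 15 [0, 1]
```

Only two pages are requested for fifteen items, which is the behaviour the iterator design is meant to give: the caller sees one continuous stream and no page is fetched beyond what the cap requires.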

coll_name = 'EO.ESA.DAT.SENTINEL-1.L1_GRD'
search = cat.search(
    max_items=10,
    collections=[coll_name],
    bbox=[-72.5,40.5,-72,41],
    datetime="2023-09-09T00:00:00Z/2023-09-20T23:59:59Z"
)

coll_items = search.item_collection()
console.print(f"For collection {coll_name} we found {len(coll_items)} items")
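
The `bbox` and `datetime` arguments used in the search can be built and checked programmatically. The sketch below assumes the usual STAC ordering of `[west, south, east, north]` in degrees and a closed `start/end` interval in UTC. `validate_bbox` and `datetime_interval` are hypothetical helpers and are not part of pystac-client.

```python
from datetime import datetime, timezone
from typing import List, Sequence


def validate_bbox(bbox: Sequence[float]) -> List[float]:
    """Check a 2D bounding box given as [west, south, east, north]."""
    if len(bbox) != 4:
        raise ValueError("bbox must have exactly four values")
    west, south, east, north = bbox
    if not (-180 <= west <= 180 and -180 <= east <= 180):
        raise ValueError("longitudes must lie within [-180, 180]")
    if not (-90 <= south <= north <= 90):
        raise ValueError("latitudes must satisfy -90 <= south <= north <= 90")
    # west > east is left unchecked: it can describe a box crossing the antimeridian.
    return [west, south, east, north]


def datetime_interval(start: datetime, end: datetime) -> str:
    """Format two timezone-aware datetimes as 'start/end' in UTC."""
    if start.tzinfo is None or end.tzinfo is None:
        raise ValueError("start and end must be timezone-aware")
    if start > end:
        raise ValueError("start must not be after end")
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    return "/".join(d.astimezone(timezone.utc).strftime(fmt) for d in (start, end))


bbox = validate_bbox([-72.5, 40.5, -72, 41])
interval = datetime_interval(
    datetime(2023, 9, 9, tzinfo=timezone.utc),
    datetime(2023, 9, 20, 23, 59, 59, tzinfo=timezone.utc),
)
print(bbox)
print(interval)
# → [-72.5, 40.5, -72, 41]
# → 2023-09-09T00:00:00Z/2023-09-20T23:59:59Z
```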

import geopandas

df = geopandas.GeoDataFrame.from_features(coll_items.to_dict(), crs="epsg:4326")
df.head()

Inspect STAC assets of an item

import rich.table

selected_item = coll_items[3]

table = rich.table.Table(title="Assets in STAC Item")
table.add_column("Asset Key", style="cyan", no_wrap=True)
table.add_column("Description")
for asset_key, asset in selected_item.assets.items():
    table.add_row(asset_key, asset.title)

console.print(table)

from IPython.display import Image

Image(url=selected_item.assets["thumbnail"].href, width=500)

down_uri = selected_item.assets["downloadLink"].href
console.print(f"Download link of asset is {down_uri}")

Download asset to JupyterLab

selected_item.id

'S1A_IW_GRDH_1SDV_20230914T225135_20230914T225200_050330_060F40_9F54'

selected_item.assets["downloadLink"]

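# Request the remote file, authenticating with the bearer token
data = requests.get(selected_item.assets["downloadLink"].href,
                    headers=auth_headers)
# Stop here if the server returned an error status
data.raise_for_status()
# Use the subtype of the asset's media type as the file extension
mtype = selected_item.assets["downloadLink"].media_type.split("/")[1]
# Save the file data to a local copy
with open(f"{selected_item.id}.{mtype}", "wb") as file:
    file.write(data.content)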
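
The file extension above is taken from the subtype of the asset's media type, which fails if the media type is missing or carries parameters. A slightly more defensive variant might look like the sketch below. `asset_filename` and the `bin` fallback extension are assumptions for illustration, not part of the HDA service or pystac.

```python
from typing import Optional


def asset_filename(item_id: str, media_type: Optional[str], default_ext: str = "bin") -> str:
    """Build a local filename from an item ID and an asset media type.

    Falls back to `default_ext` when the media type is missing or malformed,
    and ignores parameters such as "; charset=...".
    """
    if not media_type or "/" not in media_type:
        return f"{item_id}.{default_ext}"
    subtype = media_type.split("/", 1)[1].split(";", 1)[0].strip()
    return f"{item_id}.{subtype or default_ext}"


item_id = "S1A_IW_GRDH_1SDV_20230914T225135_20230914T225200_050330_060F40_9F54"
print(asset_filename(item_id, "application/zip"))
print(asset_filename(item_id, None))
```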