HDA Tutorial

This notebook demonstrates the first steps with the Harmonised Data Access (HDA) API.

First steps using Harmonised Data access API

  • Discover data of DestinE Data Portfolio

  • Search data of DestinE Data Portfolio and visualize the results

  • Access Data of DestinE Data Portfolio and visualize the thumbnails

  • This notebook demonstrates how to use the HDA (Harmonised Data Access) API by sending a few HTTP requests to it from Python.

    Throughout this quickstart notebook, you will learn:

    1. Discover: How to discover DEDL services and data collections through HDA.

    2. Authenticate: How to authenticate to search and access DEDL collections.

    3. Search data: How to search DEDL data through HDA.

    4. Visualize search results: How to see the results.

    5. Download data: How to download DEDL data through HDA.

    The detailed definition of each endpoint and its parameters is available in the HDA Swagger UI at:

    https://hda.data.destination-earth.eu/docs/

    Prerequisites:
  • For data discovery: none

  • For data access: a DestinE user account

  • Discover

    Settings

    Import the relevant modules

    We start off by importing the relevant modules for HTTP requests and json handling.

    from typing import Union
    import requests
    import json
    import urllib.parse
    from IPython.display import JSON
    from IPython.display import Image
    
    import geopandas
    import folium
    import folium.plugins
    from branca.element import Figure
    import shapely.geometry

    Define some constants for the API URLs

    In this section, we define the relevant constants, holding the URL strings for the different endpoints.

    # IDS
    SERVICE_ID = "dedl-hook"
    COLLECTION_ID = "EO.EUM.DAT.SENTINEL-3.SL_1_RBT___"
    ITEM_ID = "S3B_SL_1_RBT____20240918T102643_20240918T102943_20240919T103839_0179_097_336_2160_PS2_O_NT_004"
    
    # Core API
    HDA_API_URL = "https://hda.data.destination-earth.eu"
    SERVICES_URL = f"{HDA_API_URL}/services"
    SERVICE_BY_ID_URL = f"{SERVICES_URL}/{SERVICE_ID}"
    
    # STAC API
    ## Core
    STAC_API_URL = f"{HDA_API_URL}/stac/v2"
    CONFORMANCE_URL = f"{STAC_API_URL}/conformance"
    
    ## Item Search
    SEARCH_URL = f"{STAC_API_URL}/search"
    DOWNLOAD_URL = f"{STAC_API_URL}/download"
    
    ## Collections
    COLLECTIONS_URL = f"{STAC_API_URL}/collections"
    COLLECTION_BY_ID_URL = f"{COLLECTIONS_URL}/{COLLECTION_ID}"
    
    ## Items
    COLLECTION_ITEMS_URL = f"{COLLECTIONS_URL}/{COLLECTION_ID}/items"
    COLLECTION_ITEM_BY_ID_URL = f"{COLLECTIONS_URL}/{COLLECTION_ID}/items/{ITEM_ID}"
    
    ## HTTP Success
    HTTP_SUCCESS_CODE = 200

    Core API

    We can start off by requesting the HDA landing page, which provides links to the API definition, the available services (links services and service-doc) as well as the STAC API index.

    response=requests.get(HDA_API_URL)
    JSON(response.json())
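
A minimal sketch for navigating the landing page programmatically, assuming it exposes a `links` array of objects with `rel` and `href` keys, as STAC-style APIs usually do. The sample document is hypothetical and only illustrates the shape; the `services` and `service-doc` relation names are taken from the text above.

```python
def links_by_rel(document: dict) -> dict:
    """Group the href values of a `links` array by their `rel` value."""
    grouped = {}
    for link in document.get("links", []):
        grouped.setdefault(link.get("rel"), []).append(link.get("href"))
    return grouped


# Hypothetical landing-page excerpt, for illustration only.
sample_landing = {
    "links": [
        {"rel": "self", "href": "https://hda.data.destination-earth.eu"},
        {"rel": "services", "href": "https://hda.data.destination-earth.eu/services"},
        {"rel": "service-doc", "href": "https://hda.data.destination-earth.eu/docs/"},
    ]
}

rels = links_by_rel(sample_landing)
print(sorted(rels))
```

With a live response, `links_by_rel(response.json())` would give the same kind of lookup table, provided the real payload follows this layout.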

    STAC API

    The HDA is backed by a STAC API. The STAC API entry point is served under /stac (versioned as /stac/v2 in STAC_API_URL above) and exposes the search capabilities of the DEDL STAC interface.

    print(STAC_API_URL)
    JSON(requests.get(STAC_API_URL).json())

    Discover DEDL Services

    The /services endpoint will return the list of the DEDL services available for users of the platform.

    print(SERVICES_URL)
    JSON(requests.get(SERVICES_URL).json())

    The /services endpoint can also be used to discover services related to a certain topic:

    JSON(requests.get(SERVICES_URL,params = {"q": "dask"}).json())

    The API can also describe a specific service, identified by its serviceID (e.g. dedl-hook).

    The links with the relations describes and described by point to the reference documentation.

    print(SERVICE_BY_ID_URL)
    JSON(requests.get(SERVICE_BY_ID_URL).json())

    Discover DEDL data collections

    It is also possible to discover data collections related to a certain topic, offered by a certain provider, in a specific time interval. Here we specify an open-ended time interval to find collections with data starting from a given datetime.

    response = requests.get(COLLECTIONS_URL,params = {"q": "ozone,methane,fire","provider":"eumetsat","datetime":'2024-01-01T00:00:00Z/..'})
    
    JSON(response.json(), expanded=False)
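
The `datetime` value above uses the STAC interval syntax, where `..` marks an open bound. As a sketch, a small helper can build such strings from timezone-aware `datetime` objects; `stac_interval` is a hypothetical name introduced here, not part of the HDA API or of any library.

```python
from datetime import datetime, timezone
from typing import Optional


def stac_interval(start: Optional[datetime] = None, end: Optional[datetime] = None) -> str:
    """Format a STAC datetime interval; a missing bound becomes '..' (open-ended).

    Expects timezone-aware datetimes (naive ones would be read as local time).
    """
    if start is None and end is None:
        raise ValueError("at least one bound is required")

    def fmt(value: Optional[datetime]) -> str:
        if value is None:
            return ".."
        # Normalise to UTC and use the trailing 'Z' form shown in the tutorial.
        return value.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")

    return f"{fmt(start)}/{fmt(end)}"


open_ended = stac_interval(start=datetime(2024, 1, 1, tzinfo=timezone.utc))
print(open_ended)  # 2024-01-01T00:00:00Z/..
```

The result can be passed as the `datetime` entry of the `params` dictionary used in the request above.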

    Authenticate

    Obtain Authentication Token

    import json
    import os
    from getpass import getpass
    import destinelab as deauth
    
    DESP_USERNAME = input("Please input your DESP username or email: ")
    DESP_PASSWORD = getpass("Please input your DESP password: ")
    
    auth = deauth.AuthHandler(DESP_USERNAME, DESP_PASSWORD)
    access_token = auth.get_token()
    if access_token is not None:
        print("DEDL/DESP Access Token Obtained Successfully")
    else:
        print("Failed to Obtain DEDL/DESP Access Token")
    
    auth_headers = {"Authorization": f"Bearer {access_token}"}
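
Note that the cell above builds `auth_headers` even when no token was obtained, which would send `Bearer None`. A minimal sketch of a stricter variant follows; `bearer_headers` is a hypothetical helper name, and the token value in the example is a placeholder.

```python
from typing import Optional


def bearer_headers(token: Optional[str]) -> dict:
    """Return an Authorization header, refusing empty or missing tokens."""
    if not token:
        raise ValueError("no access token available; authenticate first")
    return {"Authorization": f"Bearer {token}"}


# "example-token" is a placeholder, not a real credential.
print(bearer_headers("example-token"))
```

Failing early here makes an authentication problem visible at this step rather than as a rejected request later in the notebook.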

    List Available Collections

    The /stac/collections endpoint returns a FeatureCollection object, listing all STAC collections available to the user.

    print(COLLECTIONS_URL)
    JSON(requests.get(COLLECTIONS_URL).json())

    By providing a specific collectionID (e.g. EO.EUM.DAT.SENTINEL-3.SL_1_RBT___), the user can get the metadata for a specific Collection. The collection used for this tutorial is SLSTR Level 1B Radiances and Brightness Temperatures - Sentinel-3

    print(COLLECTION_BY_ID_URL)
    JSON(requests.get(COLLECTION_BY_ID_URL).json())

    Search for Items in a specific collection

    It is also possible to get the list of items available in a given Collection using a simple search and sorting the results.

    FILTER = "?datetime=2024-09-18T00:00:00Z/2024-09-20T23:59:59Z&bbox=-10,34,-5,42.5&sortby=datetime&limit=5"
    
    print(COLLECTION_ITEMS_URL+FILTER)
    response=requests.get(COLLECTION_ITEMS_URL+FILTER, headers=auth_headers)  
    
    JSON(response.json())            

    The search endpoint

    The STAC API also provides an item search endpoint (/stac/search). This endpoint lets users search across collections for items that match the specified input filters.

    By default, the /stac/search endpoint will return the first 20 items found in all the collections available at the /stac/collections endpoint. Filters can be added either via query parameters in a GET request or added to the JSON body of a POST request.

    Each available filter is described in detail in the API documentation.

    The query parameters are added at the end of the URL as a query string: ?param1=val1&param2=val2&param3=val3

    FILTER = "&datetime=2024-09-18T00:00:00Z/2024-09-20T23:59:59Z&bbox=-10,34,-5,42.5&sortby=datetime&limit=10"
    SEARCH_QUERY_STRING = "?collections="+COLLECTION_ID+FILTER
    response=requests.get(SEARCH_URL + SEARCH_QUERY_STRING, headers=auth_headers)
    
    JSON(response.json())    
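
Instead of concatenating the query string by hand, it can be built from a dictionary. The sketch below uses `urllib.parse.urlencode` from the standard library and only prints the resulting URL; no request is sent. The parameter values are the ones used in the cell above.

```python
from urllib.parse import urlencode

SEARCH_URL = "https://hda.data.destination-earth.eu/stac/v2/search"

params = {
    "collections": "EO.EUM.DAT.SENTINEL-3.SL_1_RBT___",
    "datetime": "2024-09-18T00:00:00Z/2024-09-20T23:59:59Z",
    "bbox": ",".join(str(v) for v in (-10, 34, -5, 42.5)),
    "sortby": "datetime",
    "limit": 10,
}

# safe="/:," keeps the interval and bbox readable; other reserved characters are percent-encoded.
query_string = urlencode(params, safe="/:,")
print(f"{SEARCH_URL}?{query_string}")
```

Passing the same dictionary as `params=` to `requests.get`, as done earlier for the collections request, achieves the same effect and lets `requests` handle the encoding.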

    The same filters can be passed in the JSON body of a POST request. In this example the results are additionally sorted in descending order.

    BODY = {
        "collections": [
            COLLECTION_ID,
        ],
        "datetime" : "2024-09-18T00:00:00Z/2024-09-20T23:59:59Z",
        "bbox": [-10,34,
                  -5,42.5 ],
        "sortby": [{"field": "datetime","direction": "desc"}
                  ],
        "limit": 10,
    }
    
    response=requests.post(SEARCH_URL, json=BODY, headers=auth_headers)
    
    JSON(response.json())    
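
Because `limit` caps the number of items per response, larger result sets arrive in pages. The sketch below assumes the search response carries a link with `rel` set to `next`, as described by the STAC API item search specification; whether and how HDA populates it should be checked against a real response. The `fetch` callable and the two fake pages are hypothetical stand-ins so the example runs without network access.

```python
from typing import Callable, Iterator, Optional


def next_link(page: dict) -> Optional[dict]:
    """Return the link object with rel == 'next', if the page has one."""
    for link in page.get("links", []):
        if link.get("rel") == "next":
            return link
    return None


def iter_features(first_page: dict, fetch: Callable[[dict], dict], max_pages: int = 10) -> Iterator[dict]:
    """Yield features page by page, calling `fetch(link)` to load each next page."""
    page = first_page
    for _ in range(max_pages):  # hard cap so a misbehaving server cannot loop forever
        yield from page.get("features", [])
        link = next_link(page)
        if link is None:
            return
        page = fetch(link)


# Two hypothetical pages standing in for real search responses.
fake_pages = {"page-2": {"features": [{"id": "c"}], "links": []}}
page_1 = {
    "features": [{"id": "a"}, {"id": "b"}],
    "links": [{"rel": "next", "href": "page-2"}],
}

ids = [f["id"] for f in iter_features(page_1, fetch=lambda link: fake_pages[link["href"]])]
print(ids)  # ['a', 'b', 'c']
```

Against the live API, `fetch` could issue a GET on `link["href"]` with the authentication headers; for POST searches the next link may also carry a `method` and `body` that need to be honoured.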

    Visualize

    Visualize search results in a table

    Search results can be loaded into a GeoDataFrame and inspected as a table.

    df = geopandas.GeoDataFrame.from_features(response.json()['features'], crs="epsg:4326")
    df.head()
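
Because `GeoDataFrame.from_features` builds its columns from each feature's `properties` and `geometry`, the top-level item `id` may be missing from the table. A minimal sketch that extracts the identifier and acquisition time directly from the search response is shown below; `summarize` is a hypothetical helper and the sample feature is made up.

```python
def summarize(feature_collection: dict) -> list:
    """Return one {'id', 'datetime'} row per feature in a STAC search response."""
    rows = []
    for feature in feature_collection.get("features", []):
        props = feature.get("properties") or {}
        rows.append({"id": feature.get("id"), "datetime": props.get("datetime")})
    return rows


# Hypothetical search response with a single feature, for illustration only.
sample_response = {
    "features": [
        {"id": "item-1", "properties": {"datetime": "2024-09-18T10:26:43Z"}, "geometry": None}
    ]
}

rows = summarize(sample_response)
print(rows)
```

The resulting list of dictionaries can be handed to `pandas.DataFrame(rows)` if a tabular view is preferred.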

    Visualize search results in a map

    #map1 = folium.Map([38, 0],
    #                  zoom_start=4, tiles='Esri Ocean Basemap', attr='Tiles © Esri — Source: Esri, DeLorme, NAVTEQ')
    
    #map1 = folium.Map([38, 0],zoom_start=4)
    
    map1 = folium.Map([38, 0],zoom_start=4, tiles=None)
    
    nasa_wms = folium.WmsTileLayer(
        url='https://gibs.earthdata.nasa.gov/wms/epsg4326/best/wms.cgi',
        name='NASA Blue Marble',
        layers='BlueMarble_ShadedRelief',
        format='image/png',
        transparent=True,
        attr='NASA'
    )
    nasa_wms.add_to(map1)
    
    results=folium.GeoJson( response.json(),name='Search results',style_function=lambda feature: {
            "fillColor": "#005577",
            "color": "black",
            "weight": 1
        })
    
    results.add_to(map1)
    
    
    bbox=[-10,34,-5,42.5]
    bb=folium.GeoJson(
        shapely.geometry.box(*bbox),name='Search bounding box',style_function=lambda feature: {
            "fillColor": "#ff0000",
            "color": "black",
            "weight": 2,
            "dashArray": "5, 5",
        }
    )
    bb.add_to(map1)
    
    # Add layer control to toggle visibility
    folium.LayerControl().add_to(map1)
    
    
    #display(fig)
    map1
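
On the map, item footprints typically extend beyond the search bounding box, since a `bbox` filter matches items that intersect it rather than items contained in it. The intersection test itself can be sketched in a few lines; the footprints below are hypothetical, and boxes crossing the antimeridian are not handled.

```python
def bboxes_intersect(a, b) -> bool:
    """True if two [west, south, east, north] boxes overlap or touch."""
    a_west, a_south, a_east, a_north = a
    b_west, b_south, b_east, b_north = b
    return (
        a_west <= b_east
        and b_west <= a_east
        and a_south <= b_north
        and b_south <= a_north
    )


search_bbox = [-10, 34, -5, 42.5]

# Hypothetical item footprints, for illustration only.
print(bboxes_intersect(search_bbox, [-8, 40, 2, 50]))     # True: overlaps the search box
print(bboxes_intersect(search_bbox, [10, 34, 15, 42.5]))  # False: entirely to the east
```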
    

    Download

    An item belonging to a collection can be downloaded as a whole, or a single asset of a chosen item can be downloaded on its own.

    Download a specific item

    To get the metadata specific to a given item (identified by its itemID in a collection), the user can request the /stac/collections/{collectionID}/items/{itemID} endpoint.

    print(COLLECTION_ITEM_BY_ID_URL)
    response=requests.get(COLLECTION_ITEM_BY_ID_URL, headers=auth_headers) 
    JSON(response.json())             

    The metadata of a given item also contains the download link that the user can use to download the whole item.

    result = json.loads(response.text)
    downloadUrl = result['assets']['downloadLink']['href']
    print(downloadUrl)
    
    resp_dl = requests.get(downloadUrl,stream=True,headers=auth_headers)
    
    # If the request was successful, download the file
    if (resp_dl.status_code == HTTP_SUCCESS_CODE):
            print("Downloading "+ ITEM_ID + "...")
            filename = ITEM_ID + ".zip"
            with open(filename, 'wb') as f:
                for chunk in resp_dl.iter_content(chunk_size=1024): 
                    if chunk:
                        f.write(chunk)
                        f.flush()
            print("The dataset has been downloaded to: {}".format(filename))
    else:
        # Report the status of the download request itself, not of the earlier metadata request
        print("Request Unsuccessful! Error-Code: {}".format(resp_dl.status_code))
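
The chunked write loop above can be factored into a reusable function. This is a sketch under the assumption that the caller supplies an iterable of byte chunks, such as the one returned by `resp_dl.iter_content(...)`; here a list of made-up bytes stands in for the HTTP response so the example runs offline.

```python
import os
import tempfile
from typing import Iterable


def write_chunks(chunks: Iterable[bytes], path: str) -> int:
    """Write an iterable of byte chunks to `path`; return the number of bytes written."""
    written = 0
    with open(path, "wb") as handle:
        for chunk in chunks:
            if chunk:  # skip empty keep-alive chunks
                handle.write(chunk)
                written += len(chunk)
    return written


# Stand-in for resp_dl.iter_content(chunk_size=...); the bytes are made up.
fake_chunks = [b"PK\x03\x04", b"", b"payload"]
target = os.path.join(tempfile.mkdtemp(), "example.zip")
size = write_chunks(fake_chunks, target)
print(size)  # 11
```

With a real response this would be called as `write_chunks(resp_dl.iter_content(chunk_size=1024 * 1024), filename)`; a chunk size larger than 1024 bytes reduces the number of loop iterations for large products.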

    Download a specific asset of an item

    The metadata of a given item also contains a download link for each individual asset, which the user can use to download a specific asset of the chosen item. In the example below we download the asset “xfdumanifest.xml”.

    downloadUrl = result['assets']['xfdumanifest.xml']['href']
    print(downloadUrl)
    
    resp_dl = requests.get(downloadUrl,stream=True,headers=auth_headers)
    
    # If the request was successful, download the file
    if (resp_dl.status_code == HTTP_SUCCESS_CODE):
            print("Downloading "+ result['assets']['xfdumanifest.xml']['title'] + "...")
            filename = result['assets']['xfdumanifest.xml']['title']
            with open(filename, 'wb') as f:
                for chunk in resp_dl.iter_content(chunk_size=1024): 
                    if chunk:
                        f.write(chunk)
                        f.flush()
            print("The dataset has been downloaded to: {}".format(filename))
    else:
        # Report the status of the download request itself, not of the earlier metadata request
        print("Request Unsuccessful! Error-Code: {}".format(resp_dl.status_code))
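
Asset keys differ between collections, so a lookup such as `result['assets']['xfdumanifest.xml']` raises a bare `KeyError` when the key is absent. A minimal sketch of a lookup that reports the available keys instead is shown below; `asset_href` is a hypothetical helper, and the sample item with its `example.invalid` URLs is made up.

```python
def asset_href(item: dict, key: str) -> str:
    """Return the href of asset `key`, or raise a KeyError listing the available keys."""
    assets = item.get("assets", {})
    if key not in assets:
        raise KeyError(f"asset {key!r} not found; available: {sorted(assets)}")
    return assets[key]["href"]


# Hypothetical item excerpt; real hrefs come from the item metadata.
sample_item = {
    "assets": {
        "downloadLink": {"href": "https://example.invalid/item.zip"},
        "xfdumanifest.xml": {
            "href": "https://example.invalid/xfdumanifest.xml",
            "title": "xfdumanifest.xml",
        },
    }
}

print(asset_href(sample_item, "xfdumanifest.xml"))
```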

    Visualize the quicklook asset

    url = result['assets']["quicklook.jpg"]["href"]
    headers = {
        "Authorization": "Bearer " + access_token
    }
    
    response = requests.get(url, headers=headers)
    response.raise_for_status()
    
    Image(data=response.content,width=500)