Using DEDL API Keys for Automated Workflows

-Contents: Notebook summary

-Prerequisites: Prerequisites to run this notebook

Contents¶

This notebook demonstrates how to authenticate to the HDA API using an API key generated via My DataLake Services and backed by the OAuth 2.0 client credentials flow. This method is intended for machine-to-machine access, automation, and operational workflows that run without interactive user login.

Workflows often need to be:

Fully automated (CI/CD, pipelines, scheduled jobs)
Non-interactive (no browser login)
Scalable (distributed computing, Dask, batch jobs)

In these scenarios, API key authentication is the recommended mechanism, rather than personal user tokens.
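For orientation, the OAuth 2.0 client credentials grant (RFC 6749, section 4.4) comes down to a single POST to a token endpoint, authenticated with the client ID and secret. The sketch below only builds such a request and never sends it. The endpoint URL and the credentials are placeholders, not DestinE values, and the helper name `build_token_request` is made up for illustration; in this notebook the destinelab library performs the actual exchange.

```python
import base64
from urllib.parse import urlencode

# Hypothetical placeholder: NOT the real DestinE token endpoint.
TOKEN_URL = "https://auth.example.invalid/token"


def build_token_request(client_id, client_secret):
    """Build (without sending) a client-credentials token request.

    The client authenticates with HTTP Basic auth, one of the client
    authentication methods described in RFC 6749. Simple ASCII
    credentials are assumed here.
    """
    basic = base64.b64encode(f"{client_id}:{client_secret}".encode()).decode()
    headers = {
        "Authorization": f"Basic {basic}",
        "Content-Type": "application/x-www-form-urlencoded",
    }
    # The grant type tells the server that no end user is involved
    body = urlencode({"grant_type": "client_credentials"})
    return TOKEN_URL, headers, body


url, headers, body = build_token_request("my-client-id", "my-secret")
print(body)  # grant_type=client_credentials
```

No browser redirect or user password appears anywhere in the request, which is what makes this flow suitable for unattended jobs.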

In this notebook we demonstrate a simple programmatic data-access workflow using an API key.

To obtain a Data Lake API key, please refer to the user documentation at How to manage API keys.

Objective: Show how to retrieve an authentication token using the DestineLab library with API key credentials, and then use that token to access DestinE data through the HDA API.
Data Sources: DestinE-Data-Portfolio
Methods: API key credentials are used to obtain an access token via the destinelab library. The credentials are stored as environment variables. Using the retrieved access token, we query HDA to list the last week of data for the collection Global Ocean Colour (Copernicus-GlobColour), Bio-Geo-Chemical, L3 (daily) from Satellite Observations (Near Real Time), and download only the most recent plankton data at 300 m resolution.
Prerequisites: To run this notebook, the user needs a service account configured in DestinE Data Lake. Please refer to the Prerequisites section below.
Expected Output: Access to one data product in the DestinE-Data-Portfolio

Prerequisites¶

A DestinE user account is needed
API key credentials created in My DataLake Services

Imports¶

import os
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry
from tqdm import tqdm
import json
from getpass import getpass
from datetime import datetime, timedelta, timezone

from destinelab import DEDLServiceAccountAuth

Authentication using the API key¶

We assume:

A Service Account has been created in My DataLake Services
An API key has been generated and securely stored

Best practice:

Never hard-code API keys in notebooks
Use environment variables or secret managers
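For a fully unattended job, an interactive prompt is not an option, so one possible pattern is to read the credentials from the environment and fail immediately when they are missing. The sketch below uses the same variable names as this notebook's code; the helper name `load_credentials` is hypothetical and not part of destinelab.

```python
import os


def load_credentials(env=None):
    """Return (client_id, client_secret) read from environment variables.

    Raises RuntimeError instead of prompting, so an unattended job
    stops right away when a secret has not been provisioned.
    """
    env = os.environ if env is None else env
    required = ("HDA_CLIENT_ID", "HDA_CLIENT_SECRET")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return env["HDA_CLIENT_ID"], env["HDA_CLIENT_SECRET"]


# Usage with an explicit mapping; a real job would call load_credentials()
client_id, client_secret = load_credentials(
    {"HDA_CLIENT_ID": "example-id", "HDA_CLIENT_SECRET": "example-secret"}
)
```

Passing the mapping as an argument keeps the function easy to test without touching the real environment.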

The following code asks for your API key credentials the first time it runs and then stores them in environment variables for the current session.

# API key provisioned for a Service Account
HDA_CLIENT_ID = os.environ.get("HDA_CLIENT_ID")
HDA_CLIENT_SECRET = os.environ.get("HDA_CLIENT_SECRET")

# Prompt only when the credentials are not already in the environment
if HDA_CLIENT_ID is None or HDA_CLIENT_SECRET is None:
    HDA_CLIENT_ID = input("Please input your client ID: ")
    HDA_CLIENT_SECRET = getpass("Please input your API key: ")
    os.environ["HDA_CLIENT_ID"] = HDA_CLIENT_ID
    os.environ["HDA_CLIENT_SECRET"] = HDA_CLIENT_SECRET

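def requestANewToken():
    # Exchange the API key credentials for an access token
    try:
        access_token = DEDLServiceAccountAuth(
            client_id=HDA_CLIENT_ID, client_secret=HDA_CLIENT_SECRET
        ).get_token()
    except Exception as e:
        print("Failed to Obtain DEDL/DESP Access Token", e)
        # Re-raise: without a token there is no valid header to return
        raise

    if access_token is None:
        raise RuntimeError("Failed to Obtain DEDL/DESP Access Token: no token returned")

    print("DEDL/DESP Access Token Obtained Successfully")
    return {"Authorization": f"Bearer {access_token}"}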

auth_headers = requestANewToken()

Please input your client ID:  9e68b4cd-6041-44d2-83a9-68404879b7d2
Please input your API key:  ········

DEDL/DESP Access Token Obtained Successfully
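Long-running jobs may outlive a single access token. One way to avoid requesting a new token on every call is a small cache that refreshes the token shortly before it is expected to expire. The sketch below is generic: the class name `TokenCache` is hypothetical, and the 300-second lifetime is an assumed placeholder, not a documented DestinE value. The token-fetching callable is injected, so any function that returns a token string can be plugged in.

```python
import time


class TokenCache:
    """Cache an access token and refresh it shortly before it expires."""

    def __init__(self, fetch_token, lifetime_s=300, margin_s=30, clock=time.monotonic):
        self._fetch_token = fetch_token  # callable returning a token string
        self._lifetime_s = lifetime_s    # ASSUMED lifetime, not a documented value
        self._margin_s = margin_s        # refresh this many seconds early
        self._clock = clock              # injectable clock, useful for testing
        self._token = None
        self._expires_at = 0.0

    def headers(self):
        """Return Authorization headers, fetching a new token when needed."""
        now = self._clock()
        if self._token is None or now >= self._expires_at - self._margin_s:
            self._token = self._fetch_token()
            self._expires_at = now + self._lifetime_s
        return {"Authorization": f"Bearer {self._token}"}


# Demonstration with a stand-in fetcher instead of a real token request
cache = TokenCache(lambda: "example-token")
print(cache.headers())  # {'Authorization': 'Bearer example-token'}
```

In this notebook, the injected callable could wrap the same `DEDLServiceAccountAuth(...).get_token()` call used by `requestANewToken`.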

Programmatic HDA data access¶

This workflow mimics a batch job that:

Queries HDA to list the data of the Global Ocean Colour (Copernicus-GlobColour), Bio-Geo-Chemical, L3 (daily) from Satellite Observations (Near Real Time) collection available for the last week (HDA collection ID EO.MO.DAT.OCEANCOLOUR_GLO_BGC_L3_NRT_009_101).
Downloads only the most recent plankton data at 300 m resolution

1 - Query HDA to list the data available for the last week¶

COLLECTION_ID = "EO.MO.DAT.OCEANCOLOUR_GLO_BGC_L3_NRT_009_101"

# Current time in UTC
now_utc = datetime.now(timezone.utc)
# Date one week ago (UTC)
one_week_ago = (now_utc - timedelta(days=7)).date()

# Construct the interval from the start of that day to the end of today (UTC)
start = datetime.combine(one_week_ago, datetime.min.time(), tzinfo=timezone.utc)
end = datetime.combine(now_utc.date(), datetime.max.time(), tzinfo=timezone.utc)

datetime_range = f"{start.isoformat().replace('+00:00', 'Z')}/{end.isoformat().replace('+00:00', 'Z')}"

request_body = {
    "collections": [
        COLLECTION_ID,
    ],
    "datetime": datetime_range
}
request_body

{'collections': ['EO.MO.DAT.OCEANCOLOUR_GLO_BGC_L3_NRT_009_101'],
 'datetime': '2026-07-13T00:00:00Z/2026-07-20T23:59:59.999999Z'}
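Because the interval depends on the current time, a scheduled job is easier to verify when the date logic sits in a function that takes "now" as an argument. The sketch below repackages the same computation; the helper name `utc_day_interval` is made up for illustration.

```python
from datetime import datetime, time, timedelta, timezone


def utc_day_interval(now, days_back):
    """Return 'start/end' covering whole UTC days from days_back ago to today."""
    today = now.astimezone(timezone.utc).date()
    start = datetime.combine(today - timedelta(days=days_back), time.min, tzinfo=timezone.utc)
    end = datetime.combine(today, time.max, tzinfo=timezone.utc)

    def fmt(moment):
        # Use the compact 'Z' suffix instead of '+00:00'
        return moment.isoformat().replace("+00:00", "Z")

    return f"{fmt(start)}/{fmt(end)}"


fixed_now = datetime(2026, 7, 20, 9, 30, tzinfo=timezone.utc)
print(utc_day_interval(fixed_now, 7))
# 2026-07-13T00:00:00Z/2026-07-20T23:59:59.999999Z
```

With a fixed input the result is reproducible, so the interval can be checked in a unit test before the job is scheduled.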

BASE_URL = "https://hda.data.destination-earth.eu/stac/v2" 

response = requests.post(BASE_URL + '/search', json=request_body, headers=auth_headers)

response.raise_for_status()

# Keep only the fields needed later: item id, download link, and datetime
datasets = []
for i in response.json().get("features"):
    datasets.append({
        "id": i.get("id"),
        "downloadLink": i.get("assets").get("downloadLink").get("href"),
        # "alternate": i.get("assets").get("downloadLink").get("alternate").get("origin").get("href")
        "datetime": i.get("properties").get("datetime"),
    })

from pprint import pprint
pprint(datasets)

[{'datetime': '2026-07-13T00:00:00.000Z',
  'downloadLink': 'https://hda-download.lumi.data.destination-earth.eu/data/cop_marine/EO.MO.DAT.OCEANCOLOUR_GLO_BGC_L3_NRT_009_101/20260713_cmems_obs-oc_glo_bgc-plankton_nrt_l3-olci-300m_P1D/downloadLink',
  'id': '20260713_cmems_obs-oc_glo_bgc-plankton_nrt_l3-olci-300m_P1D'},
 {'datetime': '2026-07-14T00:00:00.000Z',
  'downloadLink': 'https://hda-download.lumi.data.destination-earth.eu/data/cop_marine/EO.MO.DAT.OCEANCOLOUR_GLO_BGC_L3_NRT_009_101/20260714_cmems_obs-oc_glo_bgc-plankton_nrt_l3-olci-300m_P1D/downloadLink',
  'id': '20260714_cmems_obs-oc_glo_bgc-plankton_nrt_l3-olci-300m_P1D'},
 {'datetime': '2026-07-15T00:00:00.000Z',
  'downloadLink': 'https://hda-download.lumi.data.destination-earth.eu/data/cop_marine/EO.MO.DAT.OCEANCOLOUR_GLO_BGC_L3_NRT_009_101/20260715_cmems_obs-oc_glo_bgc-plankton_nrt_l3-olci-300m_P1D/downloadLink',
  'id': '20260715_cmems_obs-oc_glo_bgc-plankton_nrt_l3-olci-300m_P1D'},
 {'datetime': '2026-07-16T00:00:00.000Z',
  'downloadLink': 'https://hda-download.lumi.data.destination-earth.eu/data/cop_marine/EO.MO.DAT.OCEANCOLOUR_GLO_BGC_L3_NRT_009_101/20260716_cmems_obs-oc_glo_bgc-plankton_nrt_l3-olci-300m_P1D/downloadLink',
  'id': '20260716_cmems_obs-oc_glo_bgc-plankton_nrt_l3-olci-300m_P1D'},
 {'datetime': '2026-07-17T00:00:00.000Z',
  'downloadLink': 'https://hda-download.lumi.data.destination-earth.eu/data/cop_marine/EO.MO.DAT.OCEANCOLOUR_GLO_BGC_L3_NRT_009_101/20260717_cmems_obs-oc_glo_bgc-plankton_nrt_l3-olci-300m_P1D/downloadLink',
  'id': '20260717_cmems_obs-oc_glo_bgc-plankton_nrt_l3-olci-300m_P1D'},
 {'datetime': '2026-07-18T00:00:00.000Z',
  'downloadLink': 'https://hda-download.lumi.data.destination-earth.eu/data/cop_marine/EO.MO.DAT.OCEANCOLOUR_GLO_BGC_L3_NRT_009_101/20260718_cmems_obs-oc_glo_bgc-plankton_nrt_l3-olci-300m_P1D/downloadLink',
  'id': '20260718_cmems_obs-oc_glo_bgc-plankton_nrt_l3-olci-300m_P1D'},
 {'datetime': '2026-07-13T00:00:00.000Z',
  'downloadLink': 'https://hda-download.lumi.data.destination-earth.eu/data/cop_marine/EO.MO.DAT.OCEANCOLOUR_GLO_BGC_L3_NRT_009_101/20260713_cmems_obs-oc_glo_bgc-reflectance_nrt_l3-olci-4km_P1D/downloadLink',
  'id': '20260713_cmems_obs-oc_glo_bgc-reflectance_nrt_l3-olci-4km_P1D'},
 {'datetime': '2026-07-14T00:00:00.000Z',
  'downloadLink': 'https://hda-download.lumi.data.destination-earth.eu/data/cop_marine/EO.MO.DAT.OCEANCOLOUR_GLO_BGC_L3_NRT_009_101/20260714_cmems_obs-oc_glo_bgc-reflectance_nrt_l3-olci-4km_P1D/downloadLink',
  'id': '20260714_cmems_obs-oc_glo_bgc-reflectance_nrt_l3-olci-4km_P1D'},
 {'datetime': '2026-07-15T00:00:00.000Z',
  'downloadLink': 'https://hda-download.lumi.data.destination-earth.eu/data/cop_marine/EO.MO.DAT.OCEANCOLOUR_GLO_BGC_L3_NRT_009_101/20260715_cmems_obs-oc_glo_bgc-reflectance_nrt_l3-olci-4km_P1D/downloadLink',
  'id': '20260715_cmems_obs-oc_glo_bgc-reflectance_nrt_l3-olci-4km_P1D'},
 {'datetime': '2026-07-16T00:00:00.000Z',
  'downloadLink': 'https://hda-download.lumi.data.destination-earth.eu/data/cop_marine/EO.MO.DAT.OCEANCOLOUR_GLO_BGC_L3_NRT_009_101/20260716_cmems_obs-oc_glo_bgc-reflectance_nrt_l3-olci-4km_P1D/downloadLink',
  'id': '20260716_cmems_obs-oc_glo_bgc-reflectance_nrt_l3-olci-4km_P1D'}]
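The listing above contains exactly ten items, and the reflectance entries stop at 16 July while the plankton entries reach 18 July, which may mean the response was cut at a page limit. STAC API search responses can advertise further results through a link whose `rel` is `next`. Whether HDA paginates this way, and with which page size, is an assumption to check against the HDA documentation. The sketch below shows the general pattern with the page-fetching step injected as a callable; `find_next_link`, `iter_features`, and `fetch_page` are hypothetical names.

```python
def find_next_link(page):
    """Return the link object with rel == 'next', or None if absent."""
    for link in page.get("links") or []:
        if link.get("rel") == "next":
            return link
    return None


def iter_features(first_page, fetch_page, max_pages=50):
    """Yield features from first_page and from pages reached via 'next' links.

    fetch_page is a caller-supplied callable that takes a link object and
    returns the next page as a dict; max_pages guards against endless loops.
    """
    page = first_page
    for _ in range(max_pages):
        yield from page.get("features") or []
        link = find_next_link(page)
        if link is None:
            return
        page = fetch_page(link)


# Demonstration with in-memory pages instead of HTTP calls
pages = {"p2": {"features": [{"id": "c"}], "links": []}}
first = {
    "features": [{"id": "a"}, {"id": "b"}],
    "links": [{"rel": "next", "href": "p2"}],
}
ids = [f["id"] for f in iter_features(first, lambda link: pages[link["href"]])]
print(ids)  # ['a', 'b', 'c']
```

With requests, `fetch_page` would typically issue a GET or POST to `link["href"]` depending on the link's optional `method` field, reusing the same authorization headers.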

2 - Select and download the most recent plankton data at 300 m resolution¶

# Requests can occasionally time out or fail with server errors, so it is convenient to define a retry strategy
retry_strategy = Retry(
    total=5,  # Total number of retries
    status_forcelist=[500, 502, 503, 504],  # List of 5xx status codes to retry on
    allowed_methods=["GET",'POST'],  # Methods to retry
    backoff_factor=1  # Wait time between retries (exponential backoff)
)

# Create an adapter with the retry strategy
adapter = HTTPAdapter(max_retries=retry_strategy)

# Create a session and mount the adapter
session = requests.Session()
session.mount("https://", adapter)

def download(href, auth_headers=auth_headers):
    # Stream the response so the file is not loaded into memory as a whole
    response = session.get(href, stream=True, headers=auth_headers)

    # A 401 usually means the token expired: request a new one and retry once
    if response.status_code == 401:
        auth_headers = requestANewToken()
        response = session.get(href, stream=True, headers=auth_headers)

    response.raise_for_status()

    content_disposition = response.headers.get('Content-Disposition')
    total_size = int(response.headers.get("content-length", 0))
    if content_disposition:
        filename = content_disposition.split('filename=')[1].split('"')[1]
    else:
        # Fall back to the last path segment of the URL
        filename = os.path.basename(href)

    # Open a local file in binary write mode and write the content
    print(f"downloading {filename}")

    with tqdm(total=total_size, unit="B", unit_scale=True) as progress_bar:
        with open(filename, 'wb') as f:
            for data in response.iter_content(1024):
                progress_bar.update(len(data))
                f.write(data)

    # Return the filename, flagged with ' NOT' if the file is missing on disk
    result = filename + ' NOT'
    if os.path.exists(filename):
        result = filename
    return result

# Select only plankton data at 300 m resolution
target = "plankton_nrt_l3-olci-300m_P1D"

filtered = [
    d for d in datasets
    if target in d.get("id", "") and "datetime" in d
]

if not filtered:
    raise RuntimeError("No datasets found matching the requested product")

# Select only the most recent item
most_recent = max(
    filtered,
    key=lambda d: datetime.fromisoformat(d["datetime"].replace("Z", "+00:00"))
)

download(most_recent["downloadLink"])

downloading 20260718_cmems_obs-oc_glo_bgc-plankton_nrt_l3-olci-300m_P1D.nc

451MB [00:10, 42.2MB/s]

'20260718_cmems_obs-oc_glo_bgc-plankton_nrt_l3-olci-300m_P1D.nc'
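The filename logic in `download` expects a quoted `filename="..."` parameter in the Content-Disposition header. A slightly more tolerant alternative, sketched below with the standard library only, also accepts unquoted values and falls back to the URL path. The helper name `filename_from_response` and the last-resort name `download.bin` are hypothetical choices, not part of the notebook or of HDA.

```python
import os
from email.message import Message
from urllib.parse import urlparse


def filename_from_response(content_disposition, href):
    """Pick a local filename from a Content-Disposition value, else from the URL path."""
    if content_disposition:
        msg = Message()
        msg["Content-Disposition"] = content_disposition
        name = msg.get_filename()
        if name:
            # Drop any directory part so the file lands in the working directory
            return os.path.basename(name)
    # Fall back to the last path segment, ignoring any query string
    return os.path.basename(urlparse(href).path) or "download.bin"


print(filename_from_response(
    'attachment; filename="20260718_example.nc"',
    "https://example.invalid/item/downloadLink",
))
# 20260718_example.nc
```

Stripping the directory part is a small safety measure in case a server sends a filename that contains path separators.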

Summary¶

In this notebook, we demonstrated how to authenticate to the DestinE Data Lake HDA API using API key credentials and the OAuth 2.0 client credentials flow, enabling secure, non-interactive access suitable for automated and scalable workflows.

Using the DestineLab library, we retrieved an access token from credentials stored as environment variables and then used that token to query the HDA API, discover recent data from the DestinE Data Portfolio, and download the latest plankton product from the Global Ocean Colour collection.

The main takeaway is that API key authentication provides a robust approach for machine-to-machine data access, allowing operational applications, pipelines, and distributed processing workflows to securely access DestinE data resources without requiring interactive user login.
