
Using DEDL API Keys for Automated Workflows

Demonstrating DestinE Data Lake API Keys usage for accessing data via HDA



-Contents: Notebook summary

-Prerequisites: Prerequisites to run this notebook

Contents

This notebook demonstrates how to authenticate to the HDA API using an API key that is generated via My DataLake Services and based on OAuth 2.0 client credentials. This method is intended for machine-to-machine access, automation, and operational workflows, without interactive user login.

Workflows often need to be:

  • Fully automated (CI/CD, pipelines, scheduled jobs)

  • Non-interactive (no browser login)

  • Scalable (distributed computing, Dask, batch jobs)

In these scenarios, authentication with API keys is the recommended mechanism, rather than personal user tokens.

In this notebook we demonstrate a simple programmatic data-access workflow using an API key.

To obtain a Data Lake API key, please refer to the user documentation at How to manage API keys.

  • Objective: Show how to retrieve an authentication token using the DestineLab library with API key credentials, and then use that token to access DestinE data through the HDA API.

  • Data Sources: DestinE-Data-Portfolio

  • Methods: API key credentials, stored as environment variables, are used to obtain an access token via the destinelab library. Using the retrieved access token, we query HDA to list the last week of data for the collection Global Ocean Colour (Copernicus-GlobColour), Bio-Geo-Chemical, L3 (daily) from Satellite Observations (Near Real Time), and download only the most recent plankton data at 300 m resolution.

  • Prerequisites: To run this notebook, the user needs a service account configured in DestinE Data Lake. Please refer to the Prerequisites section below.

  • Expected Output: Access to one data file in the DestinE-Data-Portfolio

Prerequisites

Imports

import os
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry
from tqdm import tqdm
import json
from getpass import getpass
from datetime import datetime, timedelta, timezone

from destinelab import DEDLServiceAccountAuth

Authentication using the API key

We assume:

  • A Service Account has been created in My DataLake Services

  • An API key has been generated and securely stored

Best practice:

  • Never hard-code API keys in notebooks

  • Use environment variables or secret managers
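In a scheduled job or CI pipeline there is no terminal to answer a prompt, so it can be useful to fail fast when the credentials are missing. The following is a minimal sketch under that assumption. `load_credentials` is a hypothetical helper, not part of destinelab, and it reuses the environment variable names `HDA_CLIENT_ID` and `HDA_CLIENT_SECRET` from this notebook.

```python
import os

def load_credentials(env=os.environ):
    """Return (client_id, client_secret) from the environment or raise.

    Hypothetical helper for non-interactive runs: it never prompts.
    """
    names = ("HDA_CLIENT_ID", "HDA_CLIENT_SECRET")
    # Treat unset and empty variables the same way
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return env["HDA_CLIENT_ID"], env["HDA_CLIENT_SECRET"]
```

Passing the mapping as an argument keeps the helper easy to test with a plain dictionary instead of the real environment.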

The following code prompts for your API key credentials the first time it runs and then stores them in environment variables of the current process.

# API key provisioned for a Service Account
HDA_CLIENT_ID = os.environ.get("HDA_CLIENT_ID")
HDA_CLIENT_SECRET = os.environ.get("HDA_CLIENT_SECRET")

# Prompt only when the credentials are not already in the environment
if HDA_CLIENT_ID is None or HDA_CLIENT_SECRET is None:
    HDA_CLIENT_ID = input("Please input your client ID: ")
    HDA_CLIENT_SECRET = getpass("Please input your API key: ")
    os.environ["HDA_CLIENT_ID"] = HDA_CLIENT_ID
    os.environ["HDA_CLIENT_SECRET"] = HDA_CLIENT_SECRET

def requestANewToken():
    """Request a new access token and return it as an Authorization header."""
    try:
        access_token = DEDLServiceAccountAuth(
            client_id=HDA_CLIENT_ID, client_secret=HDA_CLIENT_SECRET
        ).get_token()
    except Exception as e:
        print("Failed to Obtain DEDL/DESP Access Token", e)
        raise

    # Without a token there is nothing valid to put in the header
    if access_token is None:
        raise RuntimeError("Failed to Obtain DEDL/DESP Access Token")

    print("DEDL/DESP Access Token Obtained Successfully")
    return {"Authorization": f"Bearer {access_token}"}

auth_headers = requestANewToken()
Please input your client ID:  9e68b4cd-6041-44d2-83a9-68404879b7d2
Please input your API key:  ········
DEDL/DESP Access Token Obtained Successfully
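This notebook describes the API key as OAuth 2.0 client credentials. In that grant (RFC 6749, section 4.4) a client posts `grant_type=client_credentials` to a token endpoint, authenticates with its client ID and secret, and receives a JSON response containing an `access_token`. The sketch below only builds such a request and sends nothing. It is not the destinelab implementation: the token URL is a placeholder, and the real service may expect the credentials in the request body or require extra parameters such as a scope.

```python
def build_client_credentials_request(token_url, client_id, client_secret):
    """Build keyword arguments for requests.post() for a client credentials grant.

    token_url is a placeholder supplied by the caller; the real endpoint is
    not stated in this notebook.
    """
    return {
        "url": token_url,
        "data": {"grant_type": "client_credentials"},  # sent form-encoded
        "auth": (client_id, client_secret),  # HTTP Basic client authentication
        "timeout": 30,
    }

# Hypothetical values, for illustration only
kwargs = build_client_credentials_request(
    "https://auth.example.org/token", "my-client-id", "my-secret"
)
# token = requests.post(**kwargs).json()["access_token"]  # not executed here
```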

Programmatic HDA data access

This workflow mimics a batch job that:

  1. Queries HDA to list the data of the Global Ocean Colour (Copernicus-GlobColour), Bio-Geo-Chemical, L3 (daily) from Satellite Observations (Near Real Time) collection available for the last week (HDA collection ID EO.MO.DAT.OCEANCOLOUR_GLO_BGC_L3_NRT_009_101).

  2. Downloads only the most recent plankton data at 300 m resolution.

1 - Query HDA to list the data available for the last week

COLLECTION_ID = "EO.MO.DAT.OCEANCOLOUR_GLO_BGC_L3_NRT_009_101"
# Current time in UTC
now_utc = datetime.now(timezone.utc)
# Date one week ago (UTC)
one_week_ago = (now_utc - timedelta(days=7)).date()

# Construct the interval from the start of that day to the end of today (UTC)
start = datetime.combine(one_week_ago, datetime.min.time(), tzinfo=timezone.utc)
end = datetime.combine(now_utc.date(), datetime.max.time(), tzinfo=timezone.utc)

datetime_range = f"{start.isoformat().replace('+00:00', 'Z')}/{end.isoformat().replace('+00:00', 'Z')}"

request_body = {
    "collections": [
        COLLECTION_ID,
    ],
    "datetime": datetime_range
}
request_body
{'collections': ['EO.MO.DAT.OCEANCOLOUR_GLO_BGC_L3_NRT_009_101'], 'datetime': '2026-04-20T00:00:00Z/2026-04-27T23:59:59.999999Z'}
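The interval construction above can also be wrapped in a small function so that a batch job can change the window length or be tested with a fixed clock. This is a sketch only. `build_datetime_range` is a hypothetical name, and the sample date is chosen to reproduce the interval printed above.

```python
from datetime import datetime, time, timedelta, timezone

def build_datetime_range(now, days=7):
    """Return 'start/end' from midnight `days` days ago to the end of today (UTC)."""
    now = now.astimezone(timezone.utc)
    start = datetime.combine((now - timedelta(days=days)).date(), time.min, tzinfo=timezone.utc)
    end = datetime.combine(now.date(), time.max, tzinfo=timezone.utc)

    def to_z(dt):
        # Use the 'Z' suffix for UTC, as in the request body above
        return dt.isoformat().replace("+00:00", "Z")

    return f"{to_z(start)}/{to_z(end)}"

print(build_datetime_range(datetime(2026, 4, 27, 10, 30, tzinfo=timezone.utc)))
# 2026-04-20T00:00:00Z/2026-04-27T23:59:59.999999Z
```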
BASE_URL = "https://hda.data.destination-earth.eu/stac/v2"

response = requests.post(BASE_URL + "/search", json=request_body, headers=auth_headers)

response.raise_for_status()

# Keep only the ID, download link and datetime of each returned item
datasets = []
for i in response.json().get("features"):
    datasets.append({
        "id": i.get("id"),
        "downloadLink": i.get("assets").get("downloadLink").get("href"),
        # "alternate": i.get("assets").get("downloadLink").get("alternate").get("origin").get("href")
        "datetime": i.get("properties").get("datetime"),
    })
from pprint import pprint
pprint(datasets)
[{'datetime': '2026-04-20T00:00:00.000000Z',
  'downloadLink': 'https://hda-download.lumi.data.destination-earth.eu/data/cop_marine/EO.MO.DAT.OCEANCOLOUR_GLO_BGC_L3_NRT_009_101/20260420_cmems_obs-oc_glo_bgc-plankton_nrt_l3-olci-300m_P1D/downloadLink',
  'id': '20260420_cmems_obs-oc_glo_bgc-plankton_nrt_l3-olci-300m_P1D'},
 {'datetime': '2026-04-21T00:00:00.000000Z',
  'downloadLink': 'https://hda-download.lumi.data.destination-earth.eu/data/cop_marine/EO.MO.DAT.OCEANCOLOUR_GLO_BGC_L3_NRT_009_101/20260421_cmems_obs-oc_glo_bgc-plankton_nrt_l3-olci-300m_P1D/downloadLink',
  'id': '20260421_cmems_obs-oc_glo_bgc-plankton_nrt_l3-olci-300m_P1D'},
 {'datetime': '2026-04-22T00:00:00.000000Z',
  'downloadLink': 'https://hda-download.lumi.data.destination-earth.eu/data/cop_marine/EO.MO.DAT.OCEANCOLOUR_GLO_BGC_L3_NRT_009_101/20260422_cmems_obs-oc_glo_bgc-plankton_nrt_l3-olci-300m_P1D/downloadLink',
  'id': '20260422_cmems_obs-oc_glo_bgc-plankton_nrt_l3-olci-300m_P1D'},
 {'datetime': '2026-04-23T00:00:00.000000Z',
  'downloadLink': 'https://hda-download.lumi.data.destination-earth.eu/data/cop_marine/EO.MO.DAT.OCEANCOLOUR_GLO_BGC_L3_NRT_009_101/20260423_cmems_obs-oc_glo_bgc-plankton_nrt_l3-olci-300m_P1D/downloadLink',
  'id': '20260423_cmems_obs-oc_glo_bgc-plankton_nrt_l3-olci-300m_P1D'},
 {'datetime': '2026-04-24T00:00:00.000000Z',
  'downloadLink': 'https://hda-download.lumi.data.destination-earth.eu/data/cop_marine/EO.MO.DAT.OCEANCOLOUR_GLO_BGC_L3_NRT_009_101/20260424_cmems_obs-oc_glo_bgc-plankton_nrt_l3-olci-300m_P1D/downloadLink',
  'id': '20260424_cmems_obs-oc_glo_bgc-plankton_nrt_l3-olci-300m_P1D'},
 {'datetime': '2026-04-25T00:00:00.000000Z',
  'downloadLink': 'https://hda-download.lumi.data.destination-earth.eu/data/cop_marine/EO.MO.DAT.OCEANCOLOUR_GLO_BGC_L3_NRT_009_101/20260425_cmems_obs-oc_glo_bgc-plankton_nrt_l3-olci-300m_P1D/downloadLink',
  'id': '20260425_cmems_obs-oc_glo_bgc-plankton_nrt_l3-olci-300m_P1D'},
 {'datetime': '2026-04-20T00:00:00.000000Z',
  'downloadLink': 'https://hda-download.lumi.data.destination-earth.eu/data/cop_marine/EO.MO.DAT.OCEANCOLOUR_GLO_BGC_L3_NRT_009_101/20260420_cmems_obs-oc_glo_bgc-reflectance_nrt_l3-olci-4km_P1D/downloadLink',
  'id': '20260420_cmems_obs-oc_glo_bgc-reflectance_nrt_l3-olci-4km_P1D'},
 {'datetime': '2026-04-21T00:00:00.000000Z',
  'downloadLink': 'https://hda-download.lumi.data.destination-earth.eu/data/cop_marine/EO.MO.DAT.OCEANCOLOUR_GLO_BGC_L3_NRT_009_101/20260421_cmems_obs-oc_glo_bgc-reflectance_nrt_l3-olci-4km_P1D/downloadLink',
  'id': '20260421_cmems_obs-oc_glo_bgc-reflectance_nrt_l3-olci-4km_P1D'},
 {'datetime': '2026-04-22T00:00:00.000000Z',
  'downloadLink': 'https://hda-download.lumi.data.destination-earth.eu/data/cop_marine/EO.MO.DAT.OCEANCOLOUR_GLO_BGC_L3_NRT_009_101/20260422_cmems_obs-oc_glo_bgc-reflectance_nrt_l3-olci-4km_P1D/downloadLink',
  'id': '20260422_cmems_obs-oc_glo_bgc-reflectance_nrt_l3-olci-4km_P1D'},
 {'datetime': '2026-04-23T00:00:00.000000Z',
  'downloadLink': 'https://hda-download.lumi.data.destination-earth.eu/data/cop_marine/EO.MO.DAT.OCEANCOLOUR_GLO_BGC_L3_NRT_009_101/20260423_cmems_obs-oc_glo_bgc-reflectance_nrt_l3-olci-4km_P1D/downloadLink',
  'id': '20260423_cmems_obs-oc_glo_bgc-reflectance_nrt_l3-olci-4km_P1D'}]
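The chained `.get()` calls used to build the list above raise an `AttributeError` if an item has no `assets` or no `downloadLink` entry. In an unattended job it may be preferable to skip such items. The following is a minimal sketch of that idea. `extract_records` is a hypothetical helper and the sample features are made up, shaped like the fields this notebook reads from the search response.

```python
def extract_records(features):
    """Return id, downloadLink and datetime for each feature that has a download link."""
    records = []
    for feature in features or []:
        assets = feature.get("assets") or {}
        href = (assets.get("downloadLink") or {}).get("href")
        if href is None:
            continue  # skip items that cannot be downloaded
        records.append({
            "id": feature.get("id"),
            "downloadLink": href,
            "datetime": (feature.get("properties") or {}).get("datetime"),
        })
    return records

# Made-up features, for illustration only
sample = [
    {"id": "a", "assets": {"downloadLink": {"href": "https://example.org/a"}},
     "properties": {"datetime": "2026-04-20T00:00:00.000000Z"}},
    {"id": "b", "assets": {}, "properties": {}},
]
print(extract_records(sample))
```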

2 - Select the most recent plankton data at 300 m resolution

# Requests sometimes time out, so it is convenient to define a retry strategy
retry_strategy = Retry(
    total=5,  # Total number of retries
    status_forcelist=[500, 502, 503, 504],  # List of 5xx status codes to retry on
    allowed_methods=["GET",'POST'],  # Methods to retry
    backoff_factor=1  # Wait time between retries (exponential backoff)
)

# Create an adapter with the retry strategy
adapter = HTTPAdapter(max_retries=retry_strategy)

# Create a session and mount the adapter
session = requests.Session()
session.mount("https://", adapter)

def download(href, auth_headers=auth_headers):
    """Download the file at href into the working directory and return its name."""
    response = session.get(href, stream=True, headers=auth_headers)

    # The token may have expired: request a new one and retry once
    if response.status_code == 401:
        auth_headers = requestANewToken()
        response = session.get(href, stream=True, headers=auth_headers)

    response.raise_for_status()

    content_disposition = response.headers.get('Content-Disposition')
    total_size = int(response.headers.get("content-length", 0))
    if content_disposition:
        # Expects a quoted name, e.g. attachment; filename="file.nc"
        filename = content_disposition.split('filename=')[1].split('"')[1]
    else:
        # Fall back to the last path segment of the URL
        filename = os.path.basename(href)

    # Open a local file in binary write mode and write the content
    print(f"downloading {filename}")

    with tqdm(total=total_size, unit="B", unit_scale=True) as progress_bar:
        with open(filename, 'wb') as f:
            for data in response.iter_content(1024):
                progress_bar.update(len(data))
                f.write(data)

    # Return the file name, or the name followed by ' NOT' if the file was not written
    result = filename + ' NOT'
    if os.path.exists(filename):
        result = filename
    return result

# Select only plankton data at 300 m
target = "plankton_nrt_l3-olci-300m_P1D"

filtered = [
    d for d in datasets
    if target in d.get("id", "") and "datetime" in d
]

if not filtered:
    raise RuntimeError("No datasets found matching the requested product")

#select only the most recent data
most_recent = max(
    filtered,
    key=lambda d: datetime.fromisoformat(d["datetime"].replace("Z", "+00:00"))
)

download(most_recent["downloadLink"])
downloading 20260425_cmems_obs-oc_glo_bgc-plankton_nrt_l3-olci-300m_P1D.nc
476MB [00:14, 33.1MB/s] 
'20260425_cmems_obs-oc_glo_bgc-plankton_nrt_l3-olci-300m_P1D.nc'
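The `download` function retries once with a new token when the server answers 401. The refreshed headers are assigned to a local name, so a later call starts again from the original headers. One possible variation, sketched below, returns the refreshed headers to the caller. `get_with_refresh`, `FakeResponse` and `fake_send` are hypothetical names used only for this illustration, and the fake server stands in for HDA.

```python
def get_with_refresh(send, refresh, headers):
    """Call send(headers); on HTTP 401 call refresh() once and retry.

    Returns (response, headers) so the caller can keep the refreshed headers.
    """
    response = send(headers)
    if response.status_code == 401:
        headers = refresh()
        response = send(headers)
    return response, headers

# Stand-ins for illustration: a fake server that accepts only the "new" token
class FakeResponse:
    def __init__(self, status_code):
        self.status_code = status_code

def fake_send(headers):
    ok = headers.get("Authorization") == "Bearer new"
    return FakeResponse(200 if ok else 401)

response, headers = get_with_refresh(
    fake_send, lambda: {"Authorization": "Bearer new"}, {"Authorization": "Bearer old"}
)
print(response.status_code, headers["Authorization"])  # 200 Bearer new
```

In this notebook, `send` could be `lambda h: session.get(href, stream=True, headers=h)` and `refresh` could be `requestANewToken`.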

Summary

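This notebook showed how to use a DEDL API key, provisioned for a service account in My DataLake Services, to obtain an access token with the destinelab library without interactive login. With that token we queried HDA for the last week of data in the Global Ocean Colour near-real-time L3 collection and downloaded the most recent plankton product at 300 m resolution. The same pattern can be reused in automated and scheduled workflows.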

Resources and references