Reading CMEMS Copernicus data from the GFTS S3 bucket

import intake
import hvplot.xarray  # noqa: F401  (imported only to register the .hvplot accessor)
%%time
cat = intake.open_catalog("s3://gfts-reference-data/gfts_cmems_catalog.yaml")
CPU times: user 362 ms, sys: 60.4 ms, total: 423 ms
Wall time: 498 ms
list(cat)
['cmems_nws_2d', 'cmems_nws_3d', 'cmems_ibi_2d', 'cmems_ibi_3d']
cat["cmems_nws_2d"]
cmems_nws_2d:
  args:
    consolidated: false
    storage_options:
      fo: s3://gfts-reference-data/CMEMS_v6r1_NWS_PHY_NRT_NL_01hav_AN_2D_combined.parq
      remote_options:
        anon: false
      remote_protocol: s3
      target_options:
        anon: false
    urlpath: reference://
  description: Copernicus CMEMS_v6r1_NWS_PHY_NRT_NL_01hav_AN_2D data for GFTS
  driver: intake_xarray.xzarr.ZarrSource
  metadata:
    catalog_dir: s3://gfts-reference-data
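
The catalog entry above is a thin wrapper around a reference filesystem (`urlpath: reference://`) read through the Zarr engine. As a rough sketch of what an equivalent direct call might look like, the snippet below rebuilds the same arguments for `xarray.open_dataset`. The helper names `reference_open_kwargs` and `open_reference_dataset` are hypothetical, and `chunks={}` is an assumption made here to keep the arrays lazy; the other values are copied from the `args` block.

```python
def reference_open_kwargs(fo, anon=False):
    """Build keyword arguments that mirror the `args` block of the catalog entry."""
    return {
        "engine": "zarr",
        "chunks": {},  # assumption: request dask-backed (lazy) arrays
        "backend_kwargs": {
            "consolidated": False,
            "storage_options": {
                "fo": fo,  # location of the reference file
                "remote_protocol": "s3",
                "remote_options": {"anon": anon},
                "target_options": {"anon": anon},
            },
        },
    }


def open_reference_dataset(fo, anon=False):
    """Open the referenced dataset lazily.

    Needs xarray, zarr, fsspec and s3fs installed, plus credentials for the
    bucket when anon is False.
    """
    import xarray as xr  # imported here so the helper above stays dependency-free

    return xr.open_dataset("reference://", **reference_open_kwargs(fo, anon))


# Usage (not executed here, it needs network access and credentials):
# ds = open_reference_dataset(
#     "s3://gfts-reference-data/CMEMS_v6r1_NWS_PHY_NRT_NL_01hav_AN_2D_combined.parq"
# )
```

Going through the catalog remains the simpler route; the direct call is mostly useful for seeing which options the catalog passes along.
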
%%time
ds = cat["cmems_nws_2d"].to_dask()
CPU times: user 2.13 s, sys: 229 ms, total: 2.36 s
Wall time: 2.7 s
/srv/conda/envs/notebook/lib/python3.11/site-packages/intake_xarray/base.py:21: FutureWarning: The return type of `Dataset.dims` will be changed to return a set of dimension names in future, in order to be more consistent with `DataArray.dims`. To access a mapping from dimension names to lengths, please use `Dataset.sizes`.
  'dims': dict(self._ds.dims),
ds
<xarray.Dataset> Size: 521GB
Dimensions:    (latitude: 551, longitude: 936, time: 18048)
Coordinates:
  * latitude   (latitude) float32 2kB 46.0 46.03 46.06 ... 61.23 61.25 61.28
  * longitude  (longitude) float32 4kB -16.0 -15.97 -15.94 ... 9.921 9.949 9.977
  * time       (time) datetime64[ns] 144kB 2022-04-02T00:30:00 ... 2024-04-22...
Data variables:
    mlotst     (time, latitude, longitude) float64 74GB dask.array<chunksize=(1, 551, 936), meta=np.ndarray>
    thetao     (time, latitude, longitude) float64 74GB dask.array<chunksize=(1, 551, 936), meta=np.ndarray>
    ubar       (time, latitude, longitude) float64 74GB dask.array<chunksize=(1, 551, 936), meta=np.ndarray>
    uo         (time, latitude, longitude) float64 74GB dask.array<chunksize=(1, 551, 936), meta=np.ndarray>
    vbar       (time, latitude, longitude) float64 74GB dask.array<chunksize=(1, 551, 936), meta=np.ndarray>
    vo         (time, latitude, longitude) float64 74GB dask.array<chunksize=(1, 551, 936), meta=np.ndarray>
    zos        (time, latitude, longitude) float64 74GB dask.array<chunksize=(1, 551, 936), meta=np.ndarray>
Attributes: (12/13)
    Conventions:     CF-1.8
    comment:         
    contact:         https://marine.copernicus.eu/contact
    domain_name:     NWS36
    field_date:      20220402
    field_type:      mean
    ...              ...
    forecast_type:   analysis
    institution:     Nologin (Spain)
    licence:         https://marine.copernicus.eu/user-corner/service-commitm...
    references:      http://marine.copernicus.eu/
    source:          NEMO3.6
    title:           Ocean surface hourly mean fields for the North West Shel...
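
The sizes in the repr above can be checked by hand from the dimensions, the `float64` dtype and the chunk shape `(1, 551, 936)`. The snippet below is only a back-of-the-envelope calculation using those figures; it counts uncompressed bytes in memory and ignores the three small coordinate arrays.

```python
# Figures copied from the Dataset repr above.
n_lat, n_lon, n_time = 551, 936, 18048
bytes_per_value = 8  # float64
n_variables = 7      # mlotst, thetao, ubar, uo, vbar, vo, zos

# One dask chunk holds a full 2D field for a single time step.
bytes_per_chunk = n_lat * n_lon * bytes_per_value
bytes_per_variable = bytes_per_chunk * n_time
bytes_total = bytes_per_variable * n_variables

print(f"one chunk:    {bytes_per_chunk / 1e6:.1f} MB")
print(f"one variable: {bytes_per_variable / 1e9:.1f} GB")
print(f"all variables: {bytes_total / 1e9:.1f} GB")
```

Because each chunk spans the whole horizontal grid, extracting a time series at a single point still touches one chunk per time step, so such a selection reads far more data than the 144kB it returns.
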
# Temperature time series at the grid point nearest to 48.23°N, 7.154°W, loaded into memory
da = ds["thetao"].sel(latitude=48.23, longitude=-7.154, method="nearest").load()
da
<xarray.DataArray 'thetao' (time: 18048)> Size: 144kB
array([11.50500007, 11.51700007, 11.53400007, ..., 12.0570001 ,
       12.0500001 , 12.0440001 ])
Coordinates:
    latitude   float32 4B 48.23
    longitude  float32 4B -7.162
  * time       (time) datetime64[ns] 144kB 2022-04-02T00:30:00 ... 2024-04-22...
Attributes: (12/14)
    easting:        longitude
    latitude_max:   61.2819f
    latitude_min:   46.0036f
    long_name:      Temperature
    longitude_max:  9.977f
    longitude_min:  -15.996f
    ...             ...
    unit_long:      degrees_C
    units:          degrees_C
    valid_max:      22000
    valid_min:      -12000
    z_max:          0.494025f
    z_min:          0.494025f
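
The requested longitude was -7.154 but the returned coordinate is -7.162, because `method="nearest"` snaps to the closest grid point. The sketch below reproduces that snapping in plain Python, assuming the grid is evenly spaced between the `*_min` and `*_max` attributes shown above (an assumption, not something the output states). The helper `nearest_index` is a hypothetical name used for illustration only.

```python
def nearest_index(target, start, stop, n_points):
    """Return (index, coordinate) of the closest point on an evenly spaced axis."""
    step = (stop - start) / (n_points - 1)
    index = round((target - start) / step)
    index = min(max(index, 0), n_points - 1)  # clamp to the axis bounds
    return index, start + index * step


# Axis bounds from the attributes above, axis lengths from the Dataset dimensions.
lat_idx, lat = nearest_index(48.23, 46.0036, 61.2819, 551)
lon_idx, lon = nearest_index(-7.154, -15.996, 9.977, 936)

print(lat_idx, round(lat, 2))
print(lon_idx, round(lon, 3))
```

Under that assumption the grid spacing is about 0.028 degrees, so a nearest-neighbour selection can shift the requested position by up to roughly 0.014 degrees along each axis.
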
da.hvplot()