Access Sentinel-2 data from the CDSE STAC catalog¶
A DeepESDL example notebook¶
This notebook demonstrates how to access Sentinel-2 data from CDSE via the xcube-stac data store. The data is fetched via the CDSE STAC API.
The data can be accessed via S3, where key and secret can be obtained following the CDSE access documentation to EO data via S3. The store object will receive the key and secret upon initialization, as demonstrated below.
Please also refer to the DeepESDL documentation and visit the platform's website for further information!
Brockmann Consult, 2025
This notebook runs with the Python environment users-deepesdl-xcube-1.11.0. Please check the documentation for help on changing the environment.
import itertools
import matplotlib.pyplot as plt
from xcube.core.store import new_data_store, get_data_store_params_schema
from xcube_stac.utils import reproject_bbox
Store the credentials in a dictionary. If you don't have credentials yet, please follow the CDSE access documentation to EO data via S3 to obtain them.
credentials = {
    "key": "xxx",
    "secret": "xxx",
}
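Hard-coded secrets are easy to leak when a notebook is shared. As a minimal sketch, the dictionary could instead be built from environment variables. The variable names `CDSE_S3_KEY` and `CDSE_S3_SECRET` and the helper `load_credentials` are assumptions chosen for illustration; they are not defined by CDSE or xcube.

```python
import os


def load_credentials(env=os.environ):
    """Build the credentials dict from environment variables.

    The variable names are hypothetical; use whatever names you exported.
    """
    try:
        return {"key": env["CDSE_S3_KEY"], "secret": env["CDSE_S3_SECRET"]}
    except KeyError as err:
        raise RuntimeError(f"Missing environment variable: {err.args[0]}") from err


# Usage in the notebook (requires both variables to be set):
# credentials = load_credentials()
```

The resulting dictionary has the same `key` and `secret` entries as the one above, so it can be passed to the store in the same way.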
First, we get the store parameters needed to initialize a STAC data store. Note that key and secret of the S3 access are required.
%%time
store_params = get_data_store_params_schema("stac-cdse")
store_params
CPU times: user 42.3 ms, sys: 42.5 ms, total: 84.9 ms Wall time: 206 ms
<xcube.util.jsonschema.JsonObjectSchema at 0x7f91d80cd490>
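The default representation of the schema object is not very informative. The sketch below summarizes a JSON-schema dictionary as one line per property. It assumes the schema object can be converted to a plain dictionary with `to_dict()`; the example dictionary is made up for illustration and is not the actual `stac-cdse` schema.

```python
def summarize_schema(schema: dict) -> list[str]:
    """Return one line per property: name, type, and whether it is required."""
    required = set(schema.get("required", []))
    lines = []
    for name, prop in sorted(schema.get("properties", {}).items()):
        flag = "required" if name in required else "optional"
        lines.append(f"{name}: {prop.get('type', 'any')} ({flag})")
    return lines


# Illustrative schema only. With the real store one might call
# summarize_schema(store_params.to_dict()), assuming to_dict() is available.
example = {
    "type": "object",
    "properties": {"key": {"type": "string"}, "secret": {"type": "string"}},
    "required": ["key", "secret"],
}
for line in summarize_schema(example):
    print(line)
```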
Note that the user does not need to provide the URL for the CDSE STAC API. Only the key and secret for S3 access are required when initializing a stac-cdse data store. First, we will initialize a store in stacking mode. Then, for completeness, we will initialize a store in single-tile mode.
%%time
store = new_data_store("stac-cdse", stack_mode=True, **credentials)
CPU times: user 11 ms, sys: 972 μs, total: 12 ms Wall time: 171 ms
The data IDs point to STAC collections. So far, only 'sentinel-2-l2a' is supported.
%%time
#data_ids = store.list_data_ids()
#data_ids
CPU times: user 0 ns, sys: 3 μs, total: 3 μs Wall time: 4.77 μs
The parameters for the open_data method can be viewed below.
%%time
open_params = store.get_open_data_params_schema()
open_params
CPU times: user 31 μs, sys: 10 μs, total: 41 μs Wall time: 42.4 μs
<xcube.util.jsonschema.JsonObjectSchema at 0x7f916f06b770>
So far, only data from the collection sentinel-2-l2a can be accessed. We therefore set data_id to "sentinel-2-l2a". We set the bounding box to cover the greater Hamburg area and the time range to the second half of July 2020.
%%time
ds = store.open_data(
    data_id="sentinel-2-l2a",
    bbox=[9.1, 53.1, 10.7, 54],
    time_range=["2020-07-15", "2020-08-01"],
    spatial_res=10 / 111320,  # 10 m expressed in degrees (1 degree is about 111,320 m)
    crs="EPSG:4326",
    asset_names=["B02", "B03", "B04", "SCL"],
    apply_scaling=True,
    angles_sentinel2=True,
)
ds
CPU times: user 40.2 s, sys: 448 ms, total: 40.7 s Wall time: 1min 24s
<xarray.Dataset> Size: 27GB Dimensions: (time: 11, lon: 17813, lat: 10020, angle_lon: 37, angle_lat: 22, angle: 2, band: 3) Coordinates: * time (time) datetime64[ns] 88B 2020-07-15T10:15:59.024000 ... 2... spatial_ref int64 8B 0 * lon (lon) float64 143kB 9.1 9.1 9.1 9.1 ... 10.7 10.7 10.7 10.7 * lat (lat) float64 80kB 54.0 54.0 54.0 54.0 ... 53.1 53.1 53.1 * angle_lon (angle_lon) float64 296B 9.1 9.146 9.191 ... 10.65 10.7 10.74 * angle_lat (angle_lat) float64 176B 54.04 54.0 53.95 ... 53.14 53.1 * angle (angle) object 16B 'zenith' 'azimuth' * band (band) <U3 36B 'B02' 'B03' 'B04' Data variables: B02 (time, lat, lon) float32 8GB dask.array<chunksize=(1, 2048, 2048), meta=np.ndarray> B03 (time, lat, lon) float32 8GB dask.array<chunksize=(1, 2048, 2048), meta=np.ndarray> B04 (time, lat, lon) float32 8GB dask.array<chunksize=(1, 2048, 2048), meta=np.ndarray> SCL (time, lat, lon) uint16 4GB dask.array<chunksize=(1, 2048, 2048), meta=np.ndarray> solar_angle (angle, time, angle_lat, angle_lon) float32 72kB dask.array<chunksize=(2, 1, 22, 37), meta=np.ndarray> viewing_angle (angle, band, time, angle_lat, angle_lon) float32 215kB dask.array<chunksize=(2, 3, 1, 22, 37), meta=np.ndarray> Attributes: stac_item_ids: {'2020-07-15T10:15:59.024000': ['S2B_MSIL2A_20200715T1... stac_catalog_url: https://stac.dataspace.copernicus.eu/v1
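The `spatial_res` value converts 10 m to degrees using roughly 111,320 m per degree, and the sizes in the dataset summary can be cross-checked by hand. The following is a rough sketch with illustrative helper names; the grid the store actually builds may differ by a few pixels because of grid alignment.

```python
METERS_PER_DEGREE = 111_320  # approximate length of one degree at the equator


def meters_to_degrees(meters: float) -> float:
    return meters / METERS_PER_DEGREE


def estimate_size_gb(n_time, n_y, n_x, bytes_per_value=4):
    """Uncompressed size of one variable in gigabytes (1 GB = 1e9 bytes)."""
    return n_time * n_y * n_x * bytes_per_value / 1e9


res = meters_to_degrees(10)
n_lon = round((10.7 - 9.1) / res)   # roughly 17,800 columns
n_lat = round((54.0 - 53.1) / res)  # roughly 10,000 rows
print(res, n_lon, n_lat)

# Dimensions as reported in the dataset summary: time=11, lat=10020, lon=17813
print(estimate_size_gb(11, 10020, 17813))     # float32 band, about 7.9 GB
print(estimate_size_gb(11, 10020, 17813, 2))  # uint16 SCL, about 3.9 GB
```

Three float32 bands plus the uint16 SCL layer add up to about 27 GB, which matches the reported total. Note that one degree of longitude spans only about 66 km at 53.5°N, so a fixed step in degrees produces pixels that are narrower than 10 m in the east-west direction.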
As an example, we plot the B04 (red) band for a given timestamp. The image is a mosaic of multiple tiles. Additionally, we plot the solar and viewing angles.
%%time
fig, ax = plt.subplots(1, 3, figsize=(20, 6))
ds.B04.isel(time=-1)[::10, ::10].plot(ax=ax[0], vmin=0, vmax=0.2)
ds.solar_angle.isel(angle=0, time=-1).plot(ax=ax[1])
ds.viewing_angle.isel(band=2, angle=0, time=-1).plot(ax=ax[2])
CPU times: user 1min 27s, sys: 1.97 s, total: 1min 29s Wall time: 1min 9s
<matplotlib.collections.QuadMesh at 0x7f916665edb0>
Data access can be sped up by requesting the data in the UTM CRS, which is the native CRS of the Sentinel-2 products.
%%time
bbox = [9.1, 53.1, 10.7, 54]
crs_target = "EPSG:32632"
bbox_utm = reproject_bbox(bbox, "EPSG:4326", crs_target)
CPU times: user 1.28 ms, sys: 0 ns, total: 1.28 ms Wall time: 731 μs
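The target CRS `EPSG:32632` is WGS 84 / UTM zone 32N, which covers 6°E to 12°E and therefore contains the Hamburg bounding box. As a sketch, the EPSG code can be derived from a longitude/latitude point with the standard 6-degree zone rule. The helper `utm_epsg_for` is illustrative and not part of xcube-stac, and it ignores the special zones around Norway and Svalbard.

```python
import math


def utm_epsg_for(lon: float, lat: float) -> str:
    """Return the WGS 84 / UTM EPSG code for a longitude/latitude point."""
    zone = int(math.floor((lon + 180) / 6)) % 60 + 1
    base = 32600 if lat >= 0 else 32700  # northern / southern hemisphere
    return f"EPSG:{base + zone}"


bbox = [9.1, 53.1, 10.7, 54]
center_lon = (bbox[0] + bbox[2]) / 2
center_lat = (bbox[1] + bbox[3]) / 2
print(utm_epsg_for(center_lon, center_lat))  # EPSG:32632
```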
%%time
ds = store.open_data(
    data_id="sentinel-2-l2a",
    bbox=bbox_utm,
    time_range=["2020-07-15", "2020-08-01"],
    spatial_res=10,
    crs=crs_target,
    asset_names=["B02", "B03", "B04", "SCL"],
    apply_scaling=True,
    angles_sentinel2=True,
)
ds
CPU times: user 7.77 s, sys: 263 ms, total: 8.03 s Wall time: 51.4 s
<xarray.Dataset> Size: 17GB Dimensions: (time: 11, y: 10147, x: 10727, angle_x: 23, angle_y: 22, angle: 2, band: 3) Coordinates: * time (time) datetime64[ns] 88B 2020-07-15T10:15:59.024000 ... 2... spatial_ref int64 8B 0 * x (x) float64 86kB 5.066e+05 5.066e+05 ... 6.138e+05 6.138e+05 * y (y) float64 81kB 5.985e+06 5.985e+06 ... 5.883e+06 5.883e+06 * angle_x (angle_x) float64 184B 5.066e+05 5.116e+05 ... 6.166e+05 * angle_y (angle_y) float64 176B 5.988e+06 5.983e+06 ... 5.883e+06 * angle (angle) object 16B 'zenith' 'azimuth' * band (band) <U3 36B 'B02' 'B03' 'B04' Data variables: B02 (time, y, x) float32 5GB dask.array<chunksize=(1, 2048, 2048), meta=np.ndarray> B03 (time, y, x) float32 5GB dask.array<chunksize=(1, 2048, 2048), meta=np.ndarray> B04 (time, y, x) float32 5GB dask.array<chunksize=(1, 2048, 2048), meta=np.ndarray> SCL (time, y, x) uint16 2GB dask.array<chunksize=(1, 2048, 2048), meta=np.ndarray> solar_angle (angle, time, angle_y, angle_x) float32 45kB dask.array<chunksize=(2, 1, 22, 23), meta=np.ndarray> viewing_angle (angle, band, time, angle_y, angle_x) float32 134kB dask.array<chunksize=(2, 3, 1, 22, 23), meta=np.ndarray> Attributes: stac_item_ids: {'2020-07-15T10:15:59.024000': ['S2B_MSIL2A_20200715T1... stac_catalog_url: https://stac.dataspace.copernicus.eu/v1
Note that the search function in the CDSE STAC API is very slow. Further investigation and comparison with other STAC APIs are needed.
As an example, we plot the B04 (red) band for a given timestamp. The image is a mosaic of multiple tiles. Additionally, we plot the solar and viewing angles.
%%time
fig, ax = plt.subplots(1, 3, figsize=(20, 6))
ds.B04.isel(time=-1)[::10, ::10].plot(ax=ax[0], vmin=0, vmax=0.2)
ds.solar_angle.isel(angle=0, time=-1).plot(ax=ax[1])
ds.viewing_angle.isel(band=2, angle=0, time=-1).plot(ax=ax[2])
CPU times: user 58.1 s, sys: 795 ms, total: 58.8 s Wall time: 37.7 s
/home/conda/users/19e33cc7-1749710901-48-deepesdl-xcube-1.11.0/lib/python3.12/site-packages/dask/_task_spec.py:764: RuntimeWarning: invalid value encountered in divide return self.func(*new_argspec)
<matplotlib.collections.QuadMesh at 0x7f914e1e6a20>
Data store in the single-tile mode¶
For completeness, we initialize the data store in single-tile mode and open the data of one tile.
%%time
store = new_data_store("stac-cdse", stack_mode=False, **credentials)
CPU times: user 9.98 ms, sys: 0 ns, total: 9.98 ms Wall time: 88.8 ms
The data IDs point to a STAC item's JSON and are specified by the segment of the URL that follows the catalog's URL. The data IDs can be streamed using the following code where we show the first 10 data IDs as an example.
⚠️ Warning: store.list_data_ids() tries to collect every item in the archive before returning the result. This can take a long time and is not recommended.
%%time
data_ids = store.get_data_ids()
list(itertools.islice(data_ids, 10))
CPU times: user 11.4 ms, sys: 0 ns, total: 11.4 ms Wall time: 667 ms
['collections/sentinel-1-grd/items/S1C_EW_GRDM_1SDV_20250612T082159_20250612T082242_002747_005A85_CD30_COG', 'collections/sentinel-1-grd/items/S1C_EW_GRDM_1SDV_20250612T082059_20250612T082159_002747_005A85_C739_COG', 'collections/sentinel-1-grd/items/S1C_EW_GRDM_1SDV_20250612T081959_20250612T082059_002747_005A85_9273_COG', 'collections/sentinel-1-grd/items/S1C_EW_GRDM_1SDV_20250612T081855_20250612T081959_002747_005A85_F4FA_COG', 'collections/sentinel-1-grd/items/S1C_EW_GRDM_1SDH_20250612T081357_20250612T081421_002747_005A83_6503_COG', 'collections/sentinel-1-grd/items/S1C_EW_GRDM_1SDH_20250612T081257_20250612T081357_002747_005A83_8A20_COG', 'collections/sentinel-1-grd/items/S1C_EW_GRDM_1SDH_20250612T081157_20250612T081257_002747_005A83_94F3_COG', 'collections/sentinel-1-grd/items/S1C_EW_GRDM_1SDH_20250612T081057_20250612T081157_002747_005A83_EE29_COG', 'collections/sentinel-1-grd/items/S1C_EW_GRDM_1SDH_20250612T080957_20250612T081057_002747_005A83_44A9_COG', 'collections/sentinel-1-grd/items/S1C_EW_GRDM_1SDH_20250612T080857_20250612T080957_002747_005A83_3E51_COG']
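The difference between the two methods comes down to lazy versus eager iteration. The sketch below uses a stand-in generator, not the real store, to show that `itertools.islice` pulls only the requested number of items from an iterator.

```python
import itertools

fetched = []  # records which items the stand-in "catalog" actually produced


def fake_get_data_ids(total=1_000_000):
    """Stand-in for a lazy ID iterator; each yield mimics one catalog lookup."""
    for i in range(total):
        fetched.append(i)
        yield f"collections/example/items/item-{i}"


first_ten = list(itertools.islice(fake_get_data_ids(), 10))
print(len(first_ten), len(fetched))  # 10 10
```

Calling `list(fake_get_data_ids())` instead would walk through all one million entries before returning, which mirrors why listing every ID eagerly is slow.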
In the next step, we can search for items using search parameters. The following code shows which search parameters are available.
%%time
search_params = store.get_search_params_schema()
search_params
CPU times: user 32 μs, sys: 0 ns, total: 32 μs Wall time: 34.8 μs
<xcube.util.jsonschema.JsonObjectSchema at 0x7f914d2dc560>
Next, we will search for tiles of Sentinel-2 data.
%%time
descriptors = list(
    store.search_data(
        collections=["sentinel-2-l2a"],
        bbox=[9, 47, 10, 48],
        time_range=["2020-07-01", "2020-07-05"],
    )
)
[d.to_dict() for d in descriptors]
CPU times: user 208 ms, sys: 4.01 ms, total: 212 ms Wall time: 2.05 s
[{'data_id': 'collections/sentinel-2-l2a/items/S2B_MSIL2A_20200705T101559_N0500_R065_T32UNU_20230530T175912', 'data_type': 'dataset', 'bbox': [8.999728, 47.755819, 10.493269, 48.753013], 'time_range': ('2020-07-05T10:15:59.024Z', '2020-07-05T10:15:59.024Z')}, {'data_id': 'collections/sentinel-2-l2a/items/S2B_MSIL2A_20200705T101559_N0500_R065_T32UMU_20230530T175912', 'data_type': 'dataset', 'bbox': [8.087776, 47.759622, 9.132783, 48.752937], 'time_range': ('2020-07-05T10:15:59.024Z', '2020-07-05T10:15:59.024Z')}, {'data_id': 'collections/sentinel-2-l2a/items/S2B_MSIL2A_20200705T101559_N0500_R065_T32TNT_20230530T175912', 'data_type': 'dataset', 'bbox': [8.999733, 46.85664, 10.467277, 47.853702], 'time_range': ('2020-07-05T10:15:59.024Z', '2020-07-05T10:15:59.024Z')}, {'data_id': 'collections/sentinel-2-l2a/items/S2B_MSIL2A_20200705T101559_N0500_R065_T32TMT_20230530T175912', 'data_type': 'dataset', 'bbox': [7.760786, 46.858555, 9.13047, 47.853628], 'time_range': ('2020-07-05T10:15:59.024Z', '2020-07-05T10:15:59.024Z')}, {'data_id': 'collections/sentinel-2-l2a/items/S2A_MSIL2A_20200703T103031_N0500_R108_T32UNU_20230613T212700', 'data_type': 'dataset', 'bbox': [8.999728, 47.761055, 10.094773, 48.753013], 'time_range': ('2020-07-03T10:30:31.024Z', '2020-07-03T10:30:31.024Z')}, {'data_id': 'collections/sentinel-2-l2a/items/S2A_MSIL2A_20200703T103031_N0500_R108_T32UMU_20230613T212700', 'data_type': 'dataset', 'bbox': [7.639177, 47.757404, 9.132783, 48.752937], 'time_range': ('2020-07-03T10:30:31.024Z', '2020-07-03T10:30:31.024Z')}, {'data_id': 'collections/sentinel-2-l2a/items/S2A_MSIL2A_20200703T103031_N0500_R108_T32TNT_20230613T212700', 'data_type': 'dataset', 'bbox': [8.999733, 46.864171, 9.68382, 47.853702], 'time_range': ('2020-07-03T10:30:31.024Z', '2020-07-03T10:30:31.024Z')}, {'data_id': 'collections/sentinel-2-l2a/items/S2A_MSIL2A_20200703T103031_N0500_R108_T32TMT_20230613T212700', 'data_type': 'dataset', 'bbox': [7.662865, 46.858176, 9.13047, 47.853628], 
'time_range': ('2020-07-03T10:30:31.024Z', '2020-07-03T10:30:31.024Z')}]
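The item IDs follow the Sentinel-2 product naming convention, so fields such as the MGRS tile and the sensing time can be read directly from the ID. The parser below is a minimal sketch; the function name and dictionary keys are illustrative labels, not part of xcube-stac.

```python
from datetime import datetime


def parse_s2_data_id(data_id: str) -> dict:
    """Split a Sentinel-2 item ID into its naming-convention fields."""
    name = data_id.rsplit("/", 1)[-1]
    platform, product, sensing, baseline, orbit, tile, discriminator = name.split("_")
    return {
        "platform": platform,        # e.g. S2A or S2B
        "product": product,          # e.g. MSIL2A
        "sensing_time": datetime.strptime(sensing, "%Y%m%dT%H%M%S"),
        "baseline": baseline,        # processing baseline, e.g. N0500
        "relative_orbit": orbit,     # e.g. R065
        "tile": tile[1:],            # MGRS tile without the leading "T"
        "discriminator": discriminator,
    }


info = parse_s2_data_id(
    "collections/sentinel-2-l2a/items/"
    "S2B_MSIL2A_20200705T101559_N0500_R065_T32TMT_20230530T175912"
)
print(info["tile"], info["sensing_time"])
```

With the search result above, `{parse_s2_data_id(d.data_id)["tile"] for d in descriptors}` would collect the distinct tiles covered by the query.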
In the next step, we can open the data for each data ID. The following code shows which parameters are available for opening the data.
%%time
open_params = store.get_open_data_params_schema()
open_params
CPU times: user 36 μs, sys: 0 ns, total: 36 μs Wall time: 39.1 μs
<xcube.util.jsonschema.JsonObjectSchema at 0x7f914ff5ac30>
We select the bands B04 (red), B03 (green), and B02 (blue), together with the scene classification layer (SCL), and lazily load the corresponding data.
%%time
ds = store.open_data(
    "collections/sentinel-2-l2a/items/S2B_MSIL2A_20200705T101559_N0500_R065_T32TMT_20230530T175912",
    asset_names=["B04", "B03", "B02", "SCL"],
    apply_scaling=True,
    angles_sentinel2=True,
)
ds
CPU times: user 301 ms, sys: 12.4 ms, total: 313 ms Wall time: 1.79 s
<xarray.Dataset> Size: 2GB Dimensions: (y: 10980, x: 10980, angle_x: 23, angle_y: 23, angle: 2, band: 3) Coordinates: spatial_ref int64 8B 0 * x (x) float64 88kB 4e+05 4e+05 4e+05 ... 5.097e+05 5.098e+05 * y (y) float64 88kB 5.3e+06 5.3e+06 ... 5.19e+06 5.19e+06 * angle_x (angle_x) float64 184B 4e+05 4.05e+05 ... 5.05e+05 5.1e+05 * angle_y (angle_y) float64 184B 5.3e+06 5.295e+06 ... 5.19e+06 * angle (angle) object 16B 'zenith' 'azimuth' * band (band) <U3 36B 'B02' 'B03' 'B04' Data variables: B04 (y, x) float32 482MB dask.array<chunksize=(10980, 10980), meta=np.ndarray> B03 (y, x) float32 482MB dask.array<chunksize=(10980, 10980), meta=np.ndarray> B02 (y, x) float32 482MB dask.array<chunksize=(10980, 10980), meta=np.ndarray> SCL (y, x) uint8 121MB dask.array<chunksize=(10980, 10980), meta=np.ndarray> solar_angle (angle, angle_y, angle_x) float32 4kB dask.array<chunksize=(2, 23, 23), meta=np.ndarray> viewing_angle (angle, band, angle_y, angle_x) float32 13kB dask.array<chunksize=(2, 3, 23, 23), meta=np.ndarray> Attributes: stac_catalog_url: https://stac.dataspace.copernicus.eu/v1 stac_item_id: S2B_MSIL2A_20200705T101559_N0500_R065_T32TMT_20230530T...
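The SCL variable stores one class code per pixel. The sketch below lists the class codes commonly documented for Sentinel-2 Level-2A products and one possible selection of classes to treat as clear observations. Which classes to keep is a judgment call, and the labels should be checked against the product documentation for the processing baseline in use.

```python
SCL_CLASSES = {
    0: "no data",
    1: "saturated or defective",
    2: "dark area / topographic shadow",
    3: "cloud shadow",
    4: "vegetation",
    5: "not vegetated",
    6: "water",
    7: "unclassified",
    8: "cloud, medium probability",
    9: "cloud, high probability",
    10: "thin cirrus",
    11: "snow or ice",
}

# One possible choice of classes to treat as clear land/water observations.
CLEAR_CLASSES = {4, 5, 6}


def is_clear(scl_value: int) -> bool:
    return scl_value in CLEAR_CLASSES


print(SCL_CLASSES[4], is_clear(4))
print(SCL_CLASSES[9], is_clear(9))
```

With xarray, the same selection could be applied to a band as `ds.B04.where(ds.SCL.isin(list(CLEAR_CLASSES)))`.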
We plot the loaded data as an example below.
%%time
ds.B04[::10, ::10].plot(vmin=0.0, vmax=0.2)
CPU times: user 16 s, sys: 354 ms, total: 16.4 s Wall time: 10.5 s
<matplotlib.collections.QuadMesh at 0x7f916e877020>
We can also open a .jp2 file as an xcube multi-resolution dataset, in which we can select the resolution level, as shown below.
%%time
mlds = store.open_data(
    descriptors[3].data_id,
    data_type="mldataset",
    asset_names=["B04", "B03", "B02"],
    apply_scaling=True,
    angles_sentinel2=True,
)
mlds.num_levels
CPU times: user 18.5 ms, sys: 3.89 ms, total: 22.3 ms Wall time: 738 ms
5
%%time
ds = mlds.get_dataset(2)
ds
CPU times: user 201 ms, sys: 74 μs, total: 202 ms Wall time: 708 ms
<xarray.Dataset> Size: 90MB Dimensions: (x: 2745, y: 2745, angle_x: 23, angle_y: 23, angle: 2, band: 3) Coordinates: * x (x) float64 22kB 4e+05 4e+05 ... 5.097e+05 5.097e+05 * y (y) float64 22kB 5.3e+06 5.3e+06 ... 5.19e+06 5.19e+06 spatial_ref int64 8B 0 * angle_x (angle_x) float64 184B 4e+05 4.05e+05 ... 5.05e+05 5.1e+05 * angle_y (angle_y) float64 184B 5.3e+06 5.295e+06 ... 5.19e+06 * angle (angle) object 16B 'zenith' 'azimuth' * band (band) <U3 36B 'B02' 'B03' 'B04' Data variables: B04 (y, x) float32 30MB dask.array<chunksize=(256, 256), meta=np.ndarray> B03 (y, x) float32 30MB dask.array<chunksize=(256, 256), meta=np.ndarray> B02 (y, x) float32 30MB dask.array<chunksize=(256, 256), meta=np.ndarray> solar_angle (angle, angle_y, angle_x) float32 4kB dask.array<chunksize=(2, 23, 23), meta=np.ndarray> viewing_angle (angle, band, angle_y, angle_x) float32 13kB dask.array<chunksize=(2, 3, 23, 23), meta=np.ndarray> Attributes: stac_catalog_url: https://stac.dataspace.copernicus.eu/v1 stac_item_id: S2B_MSIL2A_20200705T101559_N0500_R065_T32TMT_20230530T...
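The level-2 dataset has 2745 pixels per side, one quarter of the 10980 pixels at full resolution, which is consistent with the resolution halving at each level. The sketch below tabulates sizes under that factor-of-two assumption, rounding up for odd sizes. This is inferred from the output above and is not a statement about xcube internals, so sizes at levels other than 0 and 2 are assumed rather than confirmed.

```python
import math


def level_size(base_size: int, level: int) -> int:
    """Pixels per side at a pyramid level, assuming a factor of 2 per level."""
    return math.ceil(base_size / 2**level)


def level_resolution(base_res: float, level: int) -> float:
    """Pixel size at a pyramid level, in the units of base_res."""
    return base_res * 2**level


for level in range(5):  # num_levels was 5 above
    print(level, level_size(10980, level), level_resolution(10, level))
```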
%%time
ds.B04[::10, ::10].plot(vmin=0.0, vmax=0.2)
CPU times: user 1.6 s, sys: 146 ms, total: 1.75 s Wall time: 8.8 s
<matplotlib.collections.QuadMesh at 0x7f914d4bc0b0>