Scientific Background

The atmosphere, terrestrial biosphere, hydrosphere, pedosphere and oceans are characterized by a series of complex phenomena whose interactions control the dynamics of the Earth system as a whole. Many of these phenomena are increasingly monitored with high-quality satellite and in-situ data. As a result, we now have a multitude of long-term and spatially connected data streams, which help us to quantify, for example, the condition of terrestrial vegetation, its seasonal dynamics, trends and extreme anomalies.

However, despite this unprecedented progress, it remains very difficult to analyse multiple data streams jointly and thereby understand the interactions between the Earth’s subsystems. The obstacles range from data discoverability, formatting inconsistencies and incompatible spatiotemporal resolutions to access restrictions. An Earth system scientist must overcome all of these barriers before the full potential of the available data streams can be tapped.

The Earth System Data Lab (ESDL) tries to overcome these barriers in a generic way:

Data cubes

Firstly, the ESDL provides access to a series of highly curated "data cubes". These analysis-ready data were collected following earlier user consultations and pre-processed to common spatiotemporal resolutions, ready for cloud-based computations. A data cube essentially consists of screened data with the dimensions "latitude", "longitude", "time" and "variable". Further dimensions can be added as a result of an analysis.
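To make the four dimensions concrete, the following is a minimal sketch of a toy data cube held in a plain NumPy array. It is not the ESDL's own data model or API: the grid sizes, the variable names and the axis order (variable, time, lat, lon) are assumptions chosen only for illustration, and the values are random numbers rather than real observations.

```python
import numpy as np

# Hypothetical coordinate axes; sizes are chosen only for illustration.
lat = np.linspace(-89.0, 89.0, 4)        # degrees north
lon = np.linspace(-179.0, 179.0, 8)      # degrees east
time = np.arange(12)                     # e.g. twelve monthly time steps
variables = ["air_temperature", "precipitation"]  # made-up variable names

# All variables share one grid, so they fit into a single array.
# Assumed axis order: (variable, time, lat, lon).
dims = ("variable", "time", "lat", "lon")
rng = np.random.default_rng(seed=0)
cube = rng.normal(size=(len(variables), time.size, lat.size, lon.size))

# Selecting one variable yields a (time, lat, lon) sub-cube.
precip = cube[variables.index("precipitation")]

# Fixing a grid cell yields the time series at that location.
series = precip[:, 2, 5]

print(dict(zip(dims, cube.shape)))
```

Because every variable lives on the same spatiotemporal grid, selecting a variable, a grid cell or a time step reduces to simple indexing, which is what makes joint analyses across variables straightforward.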

Framework

Secondly, the ESDL offers a framework to effectively map user-defined functions (UDFs) onto these data cubes. Users should be able to write their own functions, which are then mapped over the dimensions of the cube. We aim to support multiple programming languages so that users can work in the language they know best. Currently we support Python and Julia; we are working on an R interface and invite contributions for other languages.
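The idea of mapping a UDF over a cube dimension can be sketched with NumPy alone. The helper `map_cube` and the function `seasonal_amplitude` below are hypothetical names introduced for this illustration; they are not part of the ESDL framework, whose actual interface may look quite different.

```python
import numpy as np

def map_cube(cube, dims, udf, over):
    """Apply `udf` to every 1-D slice of `cube` along the dimension `over`.

    Hypothetical helper for illustration only, not the ESDL API.
    Returns the result array together with its dimension names.
    """
    axis = dims.index(over)
    result = np.apply_along_axis(udf, axis, cube)
    if result.ndim == cube.ndim:
        # The UDF returned a 1-D array, so the dimension is kept.
        out_dims = tuple(dims)
    else:
        # The UDF returned a scalar, so the dimension is reduced away.
        out_dims = tuple(d for d in dims if d != over)
    return result, out_dims

def seasonal_amplitude(series):
    """Example UDF: range (max minus min) of a single time series."""
    return series.max() - series.min()

# Toy cube with 4 time steps on a 2 x 3 grid; values increase by 6 per step.
dims = ("time", "lat", "lon")
cube = np.arange(24, dtype=float).reshape(4, 2, 3)

amplitude, out_dims = map_cube(cube, dims, seasonal_amplitude, over="time")
print(out_dims, amplitude.shape)
```

The user only writes the function for a single time series; the mapping step takes care of applying it to every grid cell and of tracking which dimensions remain in the result.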

Together, the Earth system data cubes and the framework for mapping user-defined functions form a virtual laboratory (accessed through JupyterLab) that lets users concentrate fully on exploring high-dimensional data across domains.

At this early implementation stage, we invite users to help us build this system, share their experience and improve it. The ESDL is committed to open-source computation and open data usage. We highly appreciate the contributions from multiple data providers who made this possible (for details on which data are included see Data Sets) and invite further suggestions.