The EMSO Data Management Platform: from prototype to full production

The ocean plays an integral role in regulating Earth’s climate and weather patterns, including the heat, freshwater and carbon cycles. Despite this recognised impact, it is still poorly understood.
The European Multidisciplinary Seafloor and water column Observatory (EMSO) aims to explore the oceans and to explain their role in the broader Earth systems, focusing on climate change, risks for biodiversity and natural hazards. EMSO’s observatories are platforms equipped with multiple sensors to measure chemical and physical parameters, for example ocean temperature or dissolved oxygen concentration.
EMSO aims to offer distributed data and services to its community, and the EGI Foundation assisted with this challenge during the EMSODEV project (EU Grant No: 676555).
EMSODEV was set up to develop and deploy the EMSO Generic Instrument Module (EGIM) as a fully operational distributed Research Infrastructure. A key component in providing accurate measurements of ocean parameters is the Data Management Platform (DMP). The prototype DMP ingests, consolidates, processes and archives data from EGIM, integrates the data management architectures of the regionally distributed EMSO nodes and makes data available to the community.
In addition to a data portal, which provides early access to quality-controlled EMSO data, the DMP provides scientific users with the following set of tools:
- The EMSODEV API allows scientific users and other European initiatives in the ocean sciences to interface with DMP data.
- MOODA (Module for Ocean Observatory Data Analysis) is a Python framework for direct data access that includes data analysis methods.
During the EMSODEV project, EGI assisted EMSO in implementing a DMP prototype using a subset of the EMSO data. To support this implementation, the EGI Foundation brokered access to cloud computing resources made available by RECAS-BARI, NCG-INGRID-PT, INFN-PADOVA-STACK and CESGA. A total of 340 vCPU cores and 9 TB of storage were made available.
EMSO is now planning to transition the prototype to production by the end of 2019. The operational system needs to run on a robust infrastructure, with well-curated data and no data loss, providing consistent data delivery to the user community. This will require a pledged allocation, in which resources are reserved in advance so that jobs are executed immediately after submission.
Together with EMSO, the EGI Foundation is scoping the requirements needed for EMSO to run its DMP in full production mode. The fully operational system will provide accurate, long-term measurements of ocean parameters. This, in turn, will lead to increased interoperability of EMSO nodes and the consistent collection of ocean essential variables.