EGI runs the DataHub service, which is based on the Onedata technology from CYFRONET. DataHub is a high-performance data management solution that offers unified data access across globally distributed environments and multiple types of underlying storage. It lets researchers share data, collaborate, and run computations on the stored data.
Users can bring data close to their community or to the compute facilities they use, in order to exploit it efficiently. Doing so only requires selecting which data, or which subset of it, should be available at which supporting provider.
This tutorial shows users and scientific communities how to publish, share, discover and reuse data with the EGI DataHub service.
The main features of DataHub are:
- Discovery of data spaces via a central portal.
- Policy-based data access.
- Replication of data across providers for resilience and availability.
- Integration with EGI Check-in, which allows access using community credentials, including from other EGI services and components.
- File catalog to track replication of data and manage logical and physical files.
With the EGI DataHub, communities can implement various access policies for the data they share:
- Unauthenticated, open access
- Access after user registration
- Access restricted to members of a scientific community
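
The three access policies above can be pictured with a small conceptual model. This is only an illustrative sketch and not the DataHub or Onedata API: the names `AccessPolicy`, `User` and `can_access`, as well as the community name `vo.example.org`, are hypothetical and exist only to make the distinction between the policies concrete.

```python
from dataclasses import dataclass, field
from enum import Enum


class AccessPolicy(Enum):
    """Hypothetical labels for the three policies listed above."""
    OPEN = "open"              # unauthenticated, open access
    REGISTERED = "registered"  # access after user registration
    COMMUNITY = "community"    # restricted to members of a community


@dataclass
class User:
    """Hypothetical user record: registration flag and community memberships."""
    registered: bool = False
    communities: set = field(default_factory=set)


def can_access(policy, user=None, community=None):
    """Return True if `user` may read data published under `policy`.

    `community` is the (hypothetical) community that owns the data and is
    only consulted for the COMMUNITY policy.
    """
    if policy is AccessPolicy.OPEN:
        return True            # anyone, even without an account
    if user is None:
        return False           # the other policies need an identified user
    if policy is AccessPolicy.REGISTERED:
        return user.registered
    return community in user.communities


# Usage: an anonymous visitor and a community member
member = User(registered=True, communities={"vo.example.org"})
print(can_access(AccessPolicy.OPEN))
print(can_access(AccessPolicy.COMMUNITY, member, "vo.example.org"))
```

In the real service, these decisions are made by DataHub together with EGI Check-in. The sketch only shows how each policy narrows the set of users who can reach the data.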