ERIC/open

The Eawag Research Data Institutional Repository

This is the place where Eawag scientists publish their research data. Research data is organized in Packages which contain one or more Resources. Resources are usually files containing research data proper or ancillary information such as a README-file. A URL pointing to external information might also constitute a Resource.

Data packages can be assigned to Projects to give users an overview of the data generated by or related to a specific research project.

Each Package is assigned to an Organization. Organizations represent individual research groups or labs at Eawag. Since Eawag departments are also represented as Organizations in ERIC/open, users can easily access all open research data published by a specific Eawag department.

The collection of data packages is not (yet) complete, since we are still in the process of discovering and bringing home legacy datasets that were previously published in Zenodo, Dryad, Pangaea and other places.

Data in ERIC/open is Findable.

Each Dataset has a globally unique and persistent identifier in the form of a Digital Object Identifier (DOI) which is registered with DataCite.

The meta-data records we submit for DOI creation conform to the latest DataCite Metadata Schema and contain quality-checked information for almost all applicable fields. For example, we are among the few repositories that attach spatial information to their meta-data. We continuously improve the extent and the quality of our meta-data, also retroactively.

Data in ERIC/open can be found through search services that index DOI meta-data, such as DataCite Search, Google Dataset Search and BASE.

ERIC/open can of course also be searched directly, either through the search box, which also supports an advanced query syntax, or through this service's API, which provides a powerful SOLR search syntax.
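As an illustration of searching through the API, the following minimal sketch builds a request for CKAN's `package_search` action and returns the matching packages. The base URL is a placeholder and the organization name in the usage line is hypothetical; both are assumptions and need to be replaced with real values.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: placeholder address, replace with the repository's real base URL.
BASE_URL = "https://ckan.example.org"


def package_search_url(query, filter_query=None, rows=10, base_url=BASE_URL):
    """Build the URL for a call to CKAN's package_search action."""
    params = {"q": query, "rows": rows}
    if filter_query:
        # "fq" restricts the result set using Solr filter-query syntax.
        params["fq"] = filter_query
    return f"{base_url}/api/3/action/package_search?{urlencode(params)}"


def search_packages(query, **kwargs):
    """Run the search and return the list of matching package dictionaries."""
    with urlopen(package_search_url(query, **kwargs)) as response:
        payload = json.load(response)
    if not payload.get("success"):
        raise RuntimeError(f"CKAN API reported an error: {payload.get('error')}")
    return payload["result"]["results"]


if __name__ == "__main__":
    # Only prints the request URL; no network access is needed for this line.
    # "some-department" is a hypothetical organization name.
    print(package_search_url("title:sediment",
                             filter_query="organization:some-department",
                             rows=5))
```

Calling `search_packages("title:sediment")` would then return one dictionary per package, each containing the package meta-data and its list of resources.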

Data in ERIC/open is Accessible.

All datasets can be downloaded directly over HTTPS, no strings attached. There is no need to create an account or disclose an email address. In fact, the data available through ERIC/open is better than "FAIR": it is truly Open Data and conforms to the highest standards regarding openness. ERIC/open's powerful RPC-style API exposes all of CKAN's core features and lends itself to efficient scripted data retrieval without user interaction.
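Scripted retrieval could look roughly like the sketch below. It asks CKAN's `package_show` action for a package and downloads every resource listed there, without any authentication. The base URL is a placeholder and the function names are chosen for this example only; treat both as assumptions.

```python
import json
from pathlib import Path, PurePosixPath
from urllib.parse import urlencode, urlparse
from urllib.request import urlopen, urlretrieve

# Assumption: placeholder address, replace with the repository's real base URL.
BASE_URL = "https://ckan.example.org"


def package_show_url(package_id, base_url=BASE_URL):
    """Build the URL for a call to CKAN's package_show action."""
    return f"{base_url}/api/3/action/package_show?{urlencode({'id': package_id})}"


def resource_downloads(package):
    """Return (filename, url) pairs for every resource of a package dictionary."""
    downloads = []
    for resource in package.get("resources", []):
        url = resource.get("url")
        if not url:
            continue  # skip resources without a URL
        # Use the last path segment as filename, falling back to the resource id.
        filename = PurePosixPath(urlparse(url).path).name or resource["id"]
        downloads.append((filename, url))
    return downloads


def download_package(package_id, target_dir="."):
    """Download all resources of one package into target_dir."""
    with urlopen(package_show_url(package_id)) as response:
        package = json.load(response)["result"]
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    for filename, url in resource_downloads(package):
        urlretrieve(url, str(target / filename))
```

Note that a Resource may also be a link to external information; in that case the sketch simply saves whatever the external URL returns.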

Data in ERIC/open is Interoperable.

ERIC/open's meta-data are formally specified, and contain qualified references to related resources such as other datasets, publications, other versions of the dataset, software repositories, ORCID identifiers, etc. in a standardized way.

Through its DataStore API, ERIC/open allows users to read, search and filter data directly, without having to download the entire file first. This of course requires tabular data in a suitable format, which cannot be enforced. The degree of interoperability that can be realized with respect to the research data proper varies and is determined by the conventions of the respective field, the availability of standards and the trade-off between data quality and cost that scientists have to make.
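A minimal sketch of such a filtered read, using CKAN's `datastore_search` action, is shown below. It only works for resources that have been loaded into the DataStore. The base URL is a placeholder, and the resource id and the column name `station` in the usage line are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: placeholder address, replace with the repository's real base URL.
BASE_URL = "https://ckan.example.org"


def datastore_search_url(resource_id, filters=None, text=None,
                         limit=100, offset=0, base_url=BASE_URL):
    """Build the URL for a call to CKAN's datastore_search action."""
    params = {"resource_id": resource_id, "limit": limit, "offset": offset}
    if filters:
        # Exact-match conditions are passed as a JSON object: {"column": value}.
        params["filters"] = json.dumps(filters)
    if text:
        params["q"] = text  # full-text search across the rows
    return f"{base_url}/api/3/action/datastore_search?{urlencode(params)}"


def fetch_records(resource_id, **kwargs):
    """Return the matching rows as a list of dictionaries."""
    with urlopen(datastore_search_url(resource_id, **kwargs)) as response:
        payload = json.load(response)
    return payload["result"]["records"]


if __name__ == "__main__":
    # Only prints the request URL; "abc" and "station" are hypothetical.
    print(datastore_search_url("abc", filters={"station": "A1"}, limit=5))
```

Larger tables can be read page by page by increasing `offset` in steps of `limit` until no more records are returned.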

Data in ERIC/open is Reusable.

When submitting data to ERIC (the Eawag-internal companion service to ERIC/open), scientists are guided through an ingest process that is designed to help them tag their dataset with a large variety of significant and precise attributes. Furthermore, Eawag RDM Services provide support and guidance for the preparation of a README file, which contains the information necessary to re-use the data in a scientific context. The guide also contains practical, state-of-the-art advice on the organization and preparation of re-usable research data packages. ERIC supports initiating and recording the (internal) review of datasets before making them public.

All meta-data provided by ERIC/open is in the public domain. Research data proper is also in the public domain. Packages might contain copyrighted material such as text, images, software, graphs. Greatest care is taken to make sure that licensing information for all content in ERIC/open is clearly stated. Please have a look at the more extensive explanation.

CKAN

ERIC/open is built on CKAN, the world’s leading Open Source data portal platform. CKAN is run by hundreds of state institutions, governments, companies and NGOs around the world who invest in its continued development. A large pool of developers, a highly competent core technical team, a solid yet easily extensible architecture, high code quality, excellent documentation and stewardship through Link Digital, Datopian and the Open Knowledge Foundation make CKAN the optimal choice for ERIC/open.

Other components

CKAN utilizes a battle-proven technology stack, including Python, Flask, Redis, Solr and PostgreSQL.

Custom code

A large number of modifications have been implemented to customize CKAN for Eawag's requirements. These are mostly visible on the "ingest" side, in ERIC/open's internal companion service "ERIC". The code written by the Eawag Research Data Management Services is available on GitHub.

ERIC/open

This is a FAIR Open Research Data repository.