Managing Research Data: a pilot study in Health and Life Sciences

Metadata for UWE data repository

Posted by Liz Holliday
14 Aug 2012

Implementation of the UWE data repository is progressing. The first stage, metadata identification and EPrints customisation, is complete and load testing has begun. The next stage will be user-driven: testing process workflows, metadata acceptance and usability with our seven pilot researchers.

We identified metadata requirements for the data repository using information from a range of sources. The DataCite workshop at the British Library (http://datacite.org/node/67) outlined a useful and practical approach to defining mandatory and optional metadata. Blog posts on selecting metadata by data.bris (minimal set of mandatory metadata), Research Data @Essex (Adapting EPrints for research data: metadata and design) and Iridium (CERIF in Practice Workshop) were particularly helpful in formulating our approach.

We will use the DCMI metadata terms and have selected mandatory and optional fields based on the DataCite Metadata schema v2.1. We chose DCMI for the same reasons discussed by David Boyd in his data.bris post (25/06/2012) and for two additional reasons. Firstly, Dublin Core is standard within our EPrints data repository. Secondly, as a non-research-intensive university, UWE is unlikely to commit to the British Library DataCite service at this pilot stage in RDM development.

Two levels of metadata are planned. The first is a basic level, collected on project record entry and data deposit. The second is an optional detailed level that will conform to disciplinary and subject metadata standards; researchers will supply this detailed metadata as additional files deposited with the research data set. We will also request some additional information to allow basic analytics by research administration.

At present UWE has a limited number of IT systems supporting research projects, and the data repository has no interoperability requirements with those that currently exist. We therefore expect most of our metadata to be entered manually by researchers when they deposit research data. Whilst this may allow a richer set of metadata than is possible with harvested metadata, it places the burden of metadata acquisition on the researcher. If that burden were considered too great, metadata capture would fail. We have therefore limited the mandatory fields to five: those required for citation and location. A further fourteen are optional. Of these nineteen fields, six are standard or entered automatically by EPrints. A key element of our researcher testing will be to assess the acceptance and usability of the metadata fields. If metadata entry is found to impose too great a burden, we will review these requirements.

UWE Metadata Summary (initial draft version)

Mandatory fields

Author/creator, title of project, publication date, publisher, location (identifier)

Optional fields

Resource type, contributor(s), summary, subject, rights, spatial coverage, temporal coverage, derived publications, related datasets, language, methodology, data description, file format, file size
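
The draft field list above could be checked at deposit time along the following lines. This is only a minimal sketch in Python, not the EPrints configuration we are using: the field keys (for example "creator", "publication_date"), the function name check_record and the sample values are all hypothetical, chosen for illustration.

```python
# Hypothetical field keys for the draft UWE metadata summary.
# These names are illustrative assumptions, not EPrints field identifiers.
MANDATORY_FIELDS = (
    "creator", "title", "publication_date", "publisher", "identifier",
)
OPTIONAL_FIELDS = (
    "resource_type", "contributors", "summary", "subject", "rights",
    "spatial_coverage", "temporal_coverage", "derived_publications",
    "related_datasets", "language", "methodology", "data_description",
    "file_format", "file_size",
)


def is_blank(value):
    """Treat None and whitespace-only strings as not supplied."""
    return value is None or not str(value).strip()


def check_record(record):
    """Return (missing, unknown) for a deposit record given as a dict.

    missing: mandatory fields that are absent or blank.
    unknown: keys that are in neither the mandatory nor the optional list.
    """
    known = set(MANDATORY_FIELDS) | set(OPTIONAL_FIELDS)
    missing = [f for f in MANDATORY_FIELDS if is_blank(record.get(f))]
    unknown = sorted(set(record) - known)
    return missing, unknown


# Made-up sample record: publisher is blank and identifier is absent.
sample = {
    "creator": "Example Researcher",
    "title": "Example project",
    "publication_date": "2012",
    "publisher": "  ",
    "language": "en",
}
missing, unknown = check_record(sample)
print("Missing mandatory fields:", missing)
print("Unrecognised fields:", unknown)
```

Keeping the mandatory list this short means a deposit is rejected only when information needed for citation or location is absent; everything else stays optional for the researcher.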