Data management for analysis and synthesis
Synthesis and analysis of scientific data have proven invaluable for the emergence of new theory, paradigms and knowledge. Much remains to be done to make data fully available and usable, to document how the data were used (the recipe), and to allow for their re-use. A significant part of this challenge relates to the documentation and management of the data, and a Data Management Plan (DMP) is a sound basis for any project. Indeed, in many parts of the world a DMP is a mandatory part of gaining research funding.
Questions commonly asked in a DMP
Helping researchers develop Data Management Plans (DMPs) is one of the key activities of a synthesis centre. A DMP documents what data are managed, and how, during and after a project, which is intended to improve and streamline decision-making. This work is part of an exercise to promote the culture of developing and using a Data Management Plan in research projects. See the CESAB workshop “Managing biodiversity data for scientific synthesis”, held in 2014.
Considerations for data management include the following:
1. Acquiring Data
What data are to be used? Are they:
- in the possession of a participant (and are there any ownership considerations)?
- data that need to be acquired?
- vital data that are not digital?
2. Managing Data
- Which format are the data in (CSV, database, GIS)?
- Where will the data be stored and in what format (e.g. Access database, Excel workbook)?
- Do the data have appropriate metadata attached?
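As an illustration of the metadata question, the following minimal sketch builds a simple descriptive record for a CSV table. The function name `describe_csv`, the field names in the record and the sample data are all hypothetical, chosen for illustration only; they do not follow any particular metadata standard.

```python
import csv
import io
import json


def describe_csv(csv_text, title, creator):
    """Build a minimal, illustrative metadata record for a CSV table."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)  # reading the rows also populates reader.fieldnames
    return {
        "title": title,
        "creator": creator,
        "format": "text/csv",
        "n_records": len(rows),
        "columns": reader.fieldnames,
    }


# Hypothetical example data, not taken from any real project.
sample = "site,species,count\nA,Quercus robur,12\nB,Fagus sylvatica,7\n"
metadata = describe_csv(sample, "Tree counts (example)", "Example Author")

# The record could be saved next to the data file, e.g. as a JSON sidecar.
print(json.dumps(metadata, indent=2))
```

Keeping even a small record like this alongside each data file makes it easier to answer the storage and format questions later in a project.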
3. Analysing Data
- Do you have to combine different data sets?
- Do you need assistance to combine your data sets?
- How will the data be processed?
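Combining data sets usually means matching records on a shared identifier and recording what could not be matched. The sketch below shows one way to do this with the Python standard library; the helper names `read_table` and `join_on_key`, the column names and the sample values are hypothetical, and the join assumes the key is unique in the second table.

```python
import csv
import io


def read_table(csv_text):
    """Parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def join_on_key(left, right, key):
    """Inner-join two tables on `key`; also return left keys with no match.

    Assumes `key` is unique in `right` (later duplicates would overwrite
    earlier ones in the lookup).
    """
    lookup = {row[key]: row for row in right}
    merged, unmatched = [], []
    for row in left:
        match = lookup.get(row[key])
        if match is None:
            unmatched.append(row[key])
        else:
            merged.append({**row, **match})
    return merged, unmatched


# Hypothetical example tables.
counts = read_table("site,count\nA,12\nB,7\nC,3\n")
sites = read_table("site,habitat\nA,forest\nB,grassland\n")

merged, unmatched = join_on_key(counts, sites, "site")
print(merged)
print("No site description for:", unmatched)
```

Reporting the unmatched keys, rather than silently dropping them, is part of documenting how the data were processed.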
4. Data to be produced
- What data will be generated in the research?
- What type of data will be created?
- Do the data have appropriate metadata associated with them?
The Knowledge Network for Biocomplexity (KNB) is a useful site for links to informatics innovations. It is the home of Morpho (data management for ecologists), Metacat (a flexible XML database) and EML (Ecological Metadata Language), among other things.
DataONE, in association with the California Digital Library, has made it easier to create a data management plan tailored to different fields and requirements (e.g. those of specific funding sources): DataONE DMPTool