Every research endeavor creates data.
North Dakota State University has a responsibility to steward the data you create during the course of your research. But, as the researcher, you need to make sure your data is in a state that will be useable for all the future researchers who will build on your work.
It is up to you to ensure your data remains useful, and you only need to take a few simple steps to make that happen.
Start by asking yourself these questions:
Document your answers to these questions so that you can plan how to manage your data in the future. This information will also be useful to use in creating your data management plan.
Why this is important:
Creating a plan now for what will happen with your data when your research is finished will not only help you to conceptualize how to organize the data as you generate it, but also make it easier to find it when you need to access it.
Why this is important:
Research is typically derived from one or more datasets. Maintaining an unmodified copy of the original data is necessary for reproducibility. Version control is also very useful: each saved version is a complete snapshot of your work at a point in time, so if a misstep is made in the research process, you can revert to an earlier point.
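As a minimal sketch of keeping an unmodified copy, the Python below copies a raw data file into a separate folder, records a SHA-256 checksum next to it, and marks the copy read-only. The function names, the `.sha256` sidecar file, and the folder layout are illustrative assumptions, not an NDSU requirement.

```python
import hashlib
import shutil
import stat
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def preserve_raw_copy(source: Path, raw_dir: Path) -> Path:
    """Copy source into raw_dir, record its checksum, and mark the copy read-only."""
    raw_dir.mkdir(parents=True, exist_ok=True)
    target = raw_dir / source.name
    shutil.copy2(source, target)
    # Store the checksum beside the copy so later changes can be detected.
    checksum_file = raw_dir / (source.name + ".sha256")
    checksum_file.write_text(sha256_of(target) + "\n")
    # Remove write permission (an assumed policy; adjust to your environment).
    target.chmod(stat.S_IREAD | stat.S_IRGRP | stat.S_IROTH)
    return target


def is_unmodified(target: Path) -> bool:
    """Compare the preserved copy against its recorded checksum."""
    recorded = (target.parent / (target.name + ".sha256")).read_text().strip()
    return sha256_of(target) == recorded
```

Running `is_unmodified` before each analysis is one way to confirm that the working files still trace back to the same original data. A version control system such as Git serves a related purpose for files that are expected to change.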
Maintain backups in one or more separate physical locations. If you are storing your primary data on an NDSU server, a good backup location may be an external drive or a cloud-based platform. Some services, including those at NDSU, may be backed up, but the backup scheme may not be sufficient for your research. This document explains storage options available to all NDSU researchers. A consultation with Kim Owen, Program Manager, Research & Education Network Resources, may help determine the best options for your research - 701.231.9522, kim.owen@ndsu.edu.
Why this is important:
If something happens to the computer used to store your original data (e.g. it gets damaged, corrupted, stolen, etc.), you will not lose all of your work.
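The backup step described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the backup locations are ordinary folders (for example, a mounted external drive or a cloud-synced directory), and each copy is verified by comparing SHA-256 checksums.

```python
import hashlib
import shutil
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file (reads the whole file into memory)."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def back_up(source_dir: Path, backup_dirs: list[Path]) -> dict[str, list[str]]:
    """Copy every file under source_dir to each backup location and verify it.

    Returns a mapping of backup location to the relative paths that failed
    verification (an empty list means every copy matched its source).
    """
    failures = {}
    files = sorted(p for p in source_dir.rglob("*") if p.is_file())
    for backup_dir in backup_dirs:
        failed = []
        for source in files:
            relative = source.relative_to(source_dir)
            target = backup_dir / relative
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(source, target)
            # A copy only counts as a backup if its contents match the source.
            if file_digest(source) != file_digest(target):
                failed.append(str(relative))
        failures[str(backup_dir)] = failed
    return failures
```

For very large files, a chunked checksum would be a better fit than reading each file into memory at once.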
Identify and annotate all data field headings. Be as thorough as possible in describing where the data is sourced from. This is important if your data is to be reused or shared with others, especially if your field naming includes abbreviations or placeholders. Include unaccounted-for variables that may affect the outcome or skew the data.
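One way to annotate field headings is to keep a small data dictionary alongside the data. The Python sketch below is illustrative only: the dataset, field names, and column layout (`field`, `description`, `units`, `source`) are invented for the example and are not a prescribed format.

```python
import csv
import io

# Hypothetical annotations for an imaginary soil-sampling dataset.
DATA_DICTIONARY = [
    {"field": "site_id", "description": "Sampling site identifier",
     "units": "", "source": "Assigned by the field team"},
    {"field": "ph", "description": "Soil pH at 10 cm depth",
     "units": "pH", "source": "Handheld meter reading"},
    {"field": "temp_c", "description": "Soil temperature at 10 cm depth",
     "units": "degrees Celsius", "source": "Probe thermometer reading"},
]

COLUMNS = ["field", "description", "units", "source"]


def write_data_dictionary(rows, handle):
    """Write the annotations as CSV so they can travel with the data."""
    writer = csv.DictWriter(handle, fieldnames=COLUMNS)
    writer.writeheader()
    writer.writerows(rows)


def undocumented_fields(data_header, rows):
    """Return the headings in the data file that have no annotation yet."""
    documented = {row["field"] for row in rows}
    return [name for name in data_header if name not in documented]


buffer = io.StringIO()
write_data_dictionary(DATA_DICTIONARY, buffer)
missing = undocumented_fields(["site_id", "ph", "temp_c", "notes"], DATA_DICTIONARY)
print(missing)  # ['notes']
```

Checking the data file's header against the dictionary, as `undocumented_fields` does, helps catch abbreviations or placeholder names that were never explained.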
Why this is important:
Documentation is important for the following reasons:
Follow standardized methods for citing data sources.
Why this is important:
Data is increasingly being recognized as a publication type similar to journal articles and books.
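To illustrate what a standardized data citation can look like, the Python sketch below assembles one from a few metadata fields. The element order (creators, year, title, version, publisher, identifier) loosely follows a pattern commonly recommended for datasets; it is an assumption for illustration, so check the style required by your field or publisher. The record values and the DOI are placeholders.

```python
from dataclasses import dataclass


@dataclass
class DatasetRecord:
    creators: list[str]
    year: int
    title: str
    publisher: str
    identifier: str  # a persistent identifier such as a DOI or Handle URL
    version: str = ""


def format_citation(record: DatasetRecord) -> str:
    """Build a citation string: Creators (Year). Title. Version. Publisher. Identifier"""
    authors = "; ".join(record.creators)
    parts = [f"{authors} ({record.year})", record.title]
    if record.version:
        parts.append(f"Version {record.version}")
    parts.append(record.publisher)
    return ". ".join(parts) + ". " + record.identifier


example = DatasetRecord(
    creators=["Doe, J.", "Roe, R."],
    year=2020,
    title="Example Soil Survey",
    publisher="Example Repository",
    identifier="https://doi.org/10.0000/example",  # placeholder, not a real DOI
    version="1.0",
)
print(format_citation(example))
# Doe, J.; Roe, R. (2020). Example Soil Survey. Version 1.0. Example Repository. https://doi.org/10.0000/example
```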
Follow NDSU, funder, and your field’s best practices on where and how you store your data. For more information see: http://libguides.lib.msu.edu/citedata
Follow NDSU, funder, and your field’s best practices on where and how you share your data. Include both short and long-term planning.
For long-term storage, consider a service that will provide you with a DOI for your data. For more information see: https://library.uic.edu/help/article/1966/what-is-a-doi-and-how-do-i-use-them-in-citations/
If your research funder requires that your data be shared and it includes personal and/or proprietary information that falls under FERPA, HIPAA, or other privacy or legal protection rules, create anonymized sharing versions of the final datasets.
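As a rough sketch of preparing a sharing version, the Python below drops direct identifiers and replaces a participant ID with a keyed pseudonym. The column names, the `participant` field, and the key handling are hypothetical. Removing direct identifiers is only a first step and does not by itself guarantee compliance with FERPA, HIPAA, or other rules, because combinations of remaining fields can still identify people; confirm the approach with your IRB or compliance office.

```python
import hashlib
import hmac

# Hypothetical names of columns that identify a person directly.
DIRECT_IDENTIFIERS = {"name", "email", "student_id"}


def pseudonym(value: str, secret_key: bytes) -> str:
    """Derive a stable, keyed pseudonym; the key must be kept private and never shared."""
    return hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256).hexdigest()[:12]


def sharing_version(rows, id_field, secret_key):
    """Return copies of rows without direct identifiers, keyed by a pseudonym."""
    shared = []
    for row in rows:
        clean = {k: v for k, v in row.items() if k not in DIRECT_IDENTIFIERS}
        clean["participant"] = pseudonym(str(row[id_field]), secret_key)
        shared.append(clean)
    return shared


rows = [
    {"student_id": "1001", "name": "A. Student", "email": "a@example.edu", "score": 88},
    {"student_id": "1002", "name": "B. Student", "email": "b@example.edu", "score": 92},
]
public_rows = sharing_version(rows, "student_id", secret_key=b"replace-with-a-private-key")
print(sorted(public_rows[0].keys()))  # ['participant', 'score']
```

Using a keyed hash rather than a plain hash means the pseudonyms cannot be reproduced by someone who only guesses the original IDs, as long as the key stays private.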
Why this is important:
See the plan you created in the first section
Why this is important:
It will help the future of your research by allowing your work to be duplicated and your data to be reused by others.
A data management plan (DMP) is a document that outlines how you will manage data related to your research/project. This may include plans for collecting, organizing, documenting, analyzing, preserving, and sharing the data. Creating and following a plan for managing your data throughout the life-cycle of your research can save time, increase research impact, and ensure long-term ability to preserve and access data. In addition, many funding agencies require a DMP as part of the grant application.
Below are tips for writing your DMP and options for using the NDSU Repository for sharing your data (including boilerplate text you can use in your DMP). You can also consult with your NDSU subject librarian throughout the process if questions arise.
Check out the DMP Tool for examples and templates for DMPs by funding agency. With a DMP Tool account you can create and write your DMP with guidance and tips for each section, based on funding agency requirements. When working with a template within the DMP, you can:
The Write Plan tab breaks down the DMP into sections required by the funding agency. Within each section there is a box for you to type your content with some formatting options. Not sure exactly what to include in each section? Check out the Guidance tabs.
While working through each section, consider the following questions for each category described in the DMPTool guidance. It is in your best interest that your Data Management Plan address as many of these questions as possible.
Adapted with permission from University of Minnesota's DMP Template from the Data Repository for University of Minnesota (DRUM).
The NDSU Repository is the university’s open access repository for scholarly output and data sharing that enables long-term access and preservation.
NDSU researchers may submit data to the NDSU Repository subject to the following submission criteria:
If appropriate for your data, use this boilerplate language in your DMP to demonstrate your institutionally supported strategy for data sharing and preservation:
A long-term data sharing and preservation plan will be used to store and make publicly accessible the data beyond the life of the project. The data will be deposited in the NDSU Repository (library.ndsu.edu/ir). This repository, hosted by NDSU Libraries (library.ndsu.edu), is an open access platform allowing for the dissemination and archiving of university scholarly output data. Curators review all incoming submissions and work with data authors to comply with data sharing requirements in ways that make data FAIR (Findable, Accessible, Interoperable, Reusable). The NDSU Repository provides long-term preservation of digital objects using services such as migration (limited format types), off-site backup, and bit-level checksums, and it assigns a Uniform Resource Identifier (URI) for archival citations (a Handle.net identifier and/or DOI). The data will be accompanied by the appropriate documentation, metadata, and code to facilitate reuse and provide the potential for interoperability with similar data sets.
Annotation: a note of explanation or comment added to a data field heading
Backup: a copy of one or more files created as an alternate in case the original data is lost or becomes unusable
CCAST: Center for Computationally Assisted Science and Technology - an NDSU research unit that supports NDSU research, and provides onsite hardware, software, and filesystems for researchers and their private and public sector partners.
Cloud-based platform: a computing platform that delivers its services via the internet; resources are available for computing, storage, and networking: https://www.sdxcentral.com/cloud/definitions/what-is-cloud/
Controlled vocabulary: standardized and organized arrangements of words and phrases that provide a consistent way to describe data. Users select from lists of preferred or authorized terms and spellings, which improves information retrieval by reducing the quantity and ambiguity of terms and ensuring consistent application: https://en.wikipedia.org/wiki/Controlled_vocabulary
Data citation: the practice of providing a reference to data in the same way as researchers routinely provide citations for things like journal articles, reports, and conference reports. Data citation is key to recognizing data as a primary research output: https://www.ands.org.au/working-with-data/citation-and-identifiers/data-citation
Data dictionary: a list of key terms and metrics with definitions, a data glossary, which can help to define workflows: https://medium.com/@leapingllamas/data-dictionary-a-how-to-and-best-practices-a09a685dcd61
Data field heading: supplemental data placed at the beginning of a block of data being stored or transmitted - should be descriptive
Dataset: a collection of related sets of information that is composed of separate elements but can be manipulated by a computer
DMP: data management plan
DMP Tool: a tool used for creating data management plans. Access the tool at: https://dmptool.org/
DOI: digital object identifier - a persistent identifier or handle used to identify objects uniquely, standardized by the International Organization for Standardization (ISO). Assigned DOIs resolve to the digital object to which they are assigned. Resolve a DOI at: https://www.doi.org/
Duplicability: the idea that research can be duplicated or replicated; as when an independent group of researchers copies a process and arrives at the same results as the original study; a method for establishing validity of results. Also referred to as replicability.
External drive: a portable storage device that can be attached to a computer through a wired connection or wirelessly
FERPA: Family Educational Rights and Privacy Act - affords students certain rights with respect to their educational records. See NDSU's Student Privacy Policy (FERPA).
Grant compliance: to ensure compliance with the terms of your grant, check in with the funding agency, NDSU, and the federal government's Grants 101 page at Grants.gov.
HIPAA: Health Insurance Portability and Accountability Act - specifically requires controls to protect covered data. Further information at Health Information Privacy, U.S. Department of Health and Human Services.
IR: institutional repository - a resource for providing storage and access to research generated at an institution. The NDSU Repository collects, preserves, and distributes digital content relating to North Dakota State University's mission, research, and scholarly activities.
IRB: Institutional Review Board - responsible for reviewing or certifying all research that includes human subjects prior to the start of the research project to ensure protection of participants' rights and welfare. NDSU's Institutional Review Board is part of the Office of Research and Creative Activity.
Markup language: a human-readable computer language that uses tags, such as HTML or XHTML
Metadata: a set of data that describes and provides information about other data; functions like a markup language
Metadata schema: a standardized structure for metadata. Commonly includes metadata components for information like dates, names, places, and titles. Usually XML-based, like Dublin Core, EAD, MODS, or discipline-specific markup: https://www.sciencedirect.com/topics/computer-science/metadata-schema
Network drive: a storage device on a local area network (LAN) - at NDSU, S:, U:, and X: drives are common identifiers for network drives provided by campus IT. Check with ITS or your department to see if other network storage is available to you.
Raw data: data collected by a researcher from first-hand sources; may be collected or generated by means of experimentation, surveys, or interviews and is collected specifically for a research project; may also be referred to as original, primary, or source data.
RCA: Office of Research and Creative Activity - centralized support, resources, and tools for all NDSU researchers.
Security: any data that contains information about human subjects must be stored and handled in a way that ensures the privacy and confidentiality of that information: https://research-compliance.umich.edu/data-security-guidelines
URI: uniform resource identifier; a string of characters that unambiguously identifies a particular resource. NDSU uses the Handle.Net Registry.
Version control: a system that records changes to a file or set of files over time so that you can recall specific versions later: https://git-scm.com/book/en/v2/Getting-Started-About-Version-Control