Manage and Store Data Effectively
October 19, 2016
Data is everywhere. It can be on your computer, a USB drive, a notebook, even a box in your basement. But what are the best ways to store your data? Why should you make your data accessible to your peers? What’s the difference between data and metadata? Richard Inouye, Liz Woolcott and Andrea Payant addressed these questions at October’s GrTS.
When you are collecting data, three important questions to consider are:
- What equipment will be used?
- Is there required processing?
- Are there quality assurance and control guidelines?
Whether or not you’re using online databases to collect data, these three questions will aid you in the gathering process.
These are important questions to ask as you begin to process and store your data. Large files may require you to purchase data storage software. Some file formats are not easily accessible to the public. You may want to consider what you’ll be using your data for, and what benefit it will be to the public, before you begin to format it.
These are important questions to consider when titling your data. One tip for making a title easy to recognize and search for is to include the full date of the study in year, month, day format. For example: 20161020.
You also want to provide as much context to the data as possible. Instead of simply listing the data as “Rivers,” consider titling the data something like, “Greater Yellowstone Rivers: 1:126,700 U.S. Forest Service Visitor Maps (1961-1983).”
The second example shows what was studied, where it was studied, when it was studied, and what scale was used in the study. Consider using these elements whenever you title data.
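The titling tips above can be sketched in a few lines of Python. This is only an illustration: the function name `build_title`, its parameters, and the sample values (including the scale) are hypothetical and do not come from the workshop.

```python
from datetime import date


def build_title(what: str, where: str, scale: str, study_date: date) -> str:
    """Combine when, where, what, and scale into one descriptive title."""
    # Year, month, day with no separators, e.g. 20161020, so titles sort by date.
    stamp = study_date.strftime("%Y%m%d")
    return f"{stamp} {where} {what} ({scale})"


# Hypothetical values, used only to show the resulting shape of a title.
title = build_title(
    what="Rivers",
    where="Greater Yellowstone",
    scale="1:24,000",
    study_date=date(2016, 10, 20),
)
print(title)
```

Leading with the date keeps a folder of data sets in chronological order when it is sorted by name.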
These tools are useful, but generally you should store your data in at least two distinct places. USU librarians use the LOCKSS system, short for Lots of Copies Keep Stuff Safe.
When storing data on multiple devices, make sure each device is kept in a separate physical location. This will help prevent losing every copy to flooding, fire, or other natural hazards.
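As a rough illustration of keeping more than one copy, the sketch below copies a file into several folders and checks that each copy matches the original. It is not the LOCKSS system itself. The function names `sha256_of` and `copy_to_locations` are made up for this example, and the folders you pass in are assumed to sit on separate devices.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 checksum of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_to_locations(source: Path, destinations: list[Path]) -> list[Path]:
    """Copy source into every destination folder and verify each copy."""
    expected = sha256_of(source)
    copies = []
    for folder in destinations:
        folder.mkdir(parents=True, exist_ok=True)
        target = folder / source.name
        shutil.copy2(source, target)  # copy2 also keeps file timestamps
        if sha256_of(target) != expected:
            raise OSError(f"Copy at {target} does not match the original")
        copies.append(target)
    return copies
```

A call might look like `copy_to_locations(Path("survey.csv"), [Path("/mnt/drive_a"), Path("/mnt/drive_b")])`, where both paths are placeholders for storage kept in different buildings.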
Also, consider how your data storage will impact its use by others. Are you storing your data in a manner where it will still be relevant in 15-20 years, or will the technology be obsolete by then? Are you storing your data in an easily accessible manner? These are questions to consider with data storage.
Data and Metadata
Colleagues can find your data. Researchers can use it. University personnel can publish the data sets and studies, and the library staff allow students and community members to access the information recorded.
It is important for you to collect metadata for all of these groups.
- Why were the data created?
- What processes were used to create the data?
- When were the data last updated?
- Who created the data?
- What fields are present and what do the values of those fields mean?
- Who do I contact about getting more information about the data?
- How do I obtain a hard copy of the data?
- Are there any limitations to the data?
- Constantly review your records for accuracy and completeness.
- Have a colleague review your records.
- Don’t use jargon or acronyms within your data; these can easily be misinterpreted, misunderstood or misapplied.
- Avoid special characters.
Check out discipline-specific metadata standards.
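One way to keep track of the metadata questions listed above is to record an answer to each one alongside the data. The sketch below is a minimal example, not a formal metadata standard: the class `DatasetMetadata`, its field names, and the sample values are all hypothetical.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class DatasetMetadata:
    """One slot per metadata question; empty means unanswered."""
    purpose: str = ""            # Why were the data created?
    methods: str = ""            # What processes were used to create the data?
    last_updated: str = ""       # When were the data last updated?
    creator: str = ""            # Who created the data?
    field_definitions: dict = field(default_factory=dict)  # Fields and their meanings
    contact: str = ""            # Who to contact for more information
    hard_copy: str = ""          # How to obtain a hard copy
    limitations: str = ""        # Any limitations to the data


def missing_answers(record: DatasetMetadata) -> list[str]:
    """List the questions that still have no answer."""
    return [name for name, value in asdict(record).items() if not value]


# Hypothetical record with only some questions answered.
record = DatasetMetadata(
    purpose="Track seasonal river depth",
    creator="Example Researcher",
    field_definitions={"depth_m": "Water depth in meters"},
)
print(json.dumps(asdict(record), indent=2))
print(missing_answers(record))
```

Running a check like `missing_answers` before depositing a data set is one simple way to review a record for completeness.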
Data Access and Sharing
This means all digital data within the scientific realm needs to be recorded, preserved and shared with the general public.
All agencies, including universities, that receive more than $100 million in Research and Development expenditures are required to make their data publicly accessible. Utah State falls under this category.
Open access allows others to carry out additional analysis. Studies build on one another, so it is important to know the methods and results of peers when conducting a new study.
Open access increases the impact of your data and results.
Open access to data also creates a better informed public, and community members can use your data and research to decide on public policies.
Sharing your data also benefits you personally: you are given recognition for your study, and you can become a subject matter expert in your field.
You also should consider how people will access your data, and if any special software will be required to access it.
When allowing for reuse of your data, consider whether or not you will require permission to be granted. Who will want to use the data? What is the intended future use of the data?
Also consider where you will store your data long term. Will you store it in a repository? Will you deposit all of the data in the chosen repository? What metadata documents will you include?
These are key questions to ask when considering data access.