Sparking Joy with Research: Organizing Your Data
As scientists, we are always collecting data. Over the course of a research project, the collection of data files grows, and at some point we struggle to find the files we need. Learning how to organize data efficiently is an essential skill that will help you in the long run. It also helps future researchers find your data when you are not around.
So how do you organize it all? We encourage you to find your inner “Marie Kondo” and use our tips to organize your data, including naming conventions for your files and folders. For starters, add dates to your file names and organize your files into project folders.
- Start with having different folders for different projects such as “Greenhouse Phytoremediation Study” and “Arid Land Soil Microbiome Project.”
- Divide these major project folders into specific folders for different data types, for example, “Soil Chemical Analyses” and “Microbial Molecular Data.”
- When naming your folders and data files, use names that are human-readable and descriptive and that include details about the data set, even if in abbreviated form. Names should not be too long, and they should be easy to identify and to import into data analytics tools. Also, ensure that there are no blanks in the file names. Instead, use underscores to separate the elements of the name. For example, the file “Year1_Site1_PlantData_121118” contains plant metrics data for Site 1 from Year 1, from a survey conducted on December 11, 2018.
- Include a .txt file in each folder describing the naming scheme for that folder and try to adhere to the scheme for all the files in the folder.
- One of the easiest ways to keep your field and lab data is in spreadsheets. Ensure that the first row is never empty and reserve it for descriptive headers. Each column should contain data in a single format (either numerical or text). Each row should be a record of a sample type or an individual response.
- Spreadsheets should be saved in nonproprietary formats such as .csv or .txt. Files kept in proprietary software formats can become incompatible when the software is updated, and the data in them can then be lost because it is no longer accessible.
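The naming and spreadsheet tips above can also be applied from a script. The following is a minimal Python sketch, not a prescribed workflow: the function names `build_file_name` and `save_records`, the “Plant_Metrics” folder, the column headers, and the sample values are all hypothetical and exist only for illustration. It assumes the date is written as MMDDYY, as in the example file name above.

```python
import csv
import tempfile
from datetime import date
from pathlib import Path


def build_file_name(year: int, site: int, data_type: str, survey_date: date) -> str:
    """Join the name elements with underscores; the date is written as MMDDYY."""
    parts = [f"Year{year}", f"Site{site}", data_type, survey_date.strftime("%m%d%y")]
    name = "_".join(parts)
    # Reject blanks so the name stays easy to import into analysis tools.
    if any(ch.isspace() for ch in name):
        raise ValueError(f"file name contains blanks: {name!r}")
    return name


def save_records(folder: Path, name: str, header: list, rows: list) -> Path:
    """Write rows to a .csv file whose first row holds the descriptive headers."""
    folder.mkdir(parents=True, exist_ok=True)
    path = folder / f"{name}.csv"
    with path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(header)  # first row is reserved for headers
        writer.writerows(rows)   # one record per row
    return path


if __name__ == "__main__":
    # A temporary directory stands in for a real project folder here.
    with tempfile.TemporaryDirectory() as tmp:
        folder = Path(tmp) / "Greenhouse_Phytoremediation_Study" / "Plant_Metrics"
        name = build_file_name(1, 1, "PlantData", date(2018, 12, 11))
        # Made-up placeholder values, not real measurements.
        path = save_records(folder, name, ["plant_id", "height_cm"], [["P01", 12.5]])
        # A short .txt note documents the naming scheme used in this folder.
        (folder / "README.txt").write_text(
            "Files are named Year<N>_Site<N>_<DataType>_<MMDDYY>.csv\n",
            encoding="utf-8",
        )
        print(path.name)
```

Building names in one function keeps every file in a folder on the same scheme, and the check for blanks catches a stray space before the file is written.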
Contributor: Priyanka Kushwaha is a postdoctoral researcher at the University of Arizona. Her current research is focused on establishing the links between phylogenetic and genetic diversity of microbial communities in nutrient-limited arid soils. She is also working on a greenhouse study to develop a mechanistic understanding of plant-microbial interactions in metal-contaminated soils by using plant metatranscriptomics.