Based on your analytical plan, identify the criteria your data need to meet so that you can answer your questions.
For example, if you plan to use a paired-samples t test (a t test for dependent means), your data have to meet the following assumptions:
interval or ratio scale of measurement
random sampling from a defined population
samples or data sets are linked in the population through repeated measures, natural association, or matching
scores are normally distributed in the population; difference scores are normally distributed
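The assumptions about linked samples and difference scores are easier to see in a worked calculation. The following is a minimal sketch, assuming each participant has one "before" and one "after" score; the function name `paired_t_statistic` and the scores are hypothetical and for illustration only. It reduces each pair to a single difference score and computes the t statistic from those differences, which is why the pairs must be linked and why the difference scores are what need to be normally distributed.

```python
import math
import statistics


def paired_t_statistic(before, after):
    """Return (t, degrees_of_freedom) for a t test on difference scores."""
    if len(before) != len(after):
        raise ValueError("linked samples must contain the same number of scores")
    if len(before) < 2:
        raise ValueError("at least two pairs are needed")

    # Each pair is reduced to one difference score.
    diffs = [a - b for a, b in zip(after, before)]

    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    if sd_diff == 0:
        raise ValueError("difference scores have no variance")

    n = len(diffs)
    standard_error = sd_diff / math.sqrt(n)
    return mean_diff / standard_error, n - 1


# Hypothetical scores for five participants measured twice.
before = [10, 12, 9, 11, 13]
after = [12, 14, 10, 13, 16]

t_stat, df = paired_t_statistic(before, after)
print(f"t({df}) = {t_stat:.3f}")
```

The sketch only computes the statistic. It does not check the scale of measurement, the sampling method, or normality, so those assumptions still have to be verified separately.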
At a minimum, take snapshots of your data at the key points listed below. A snapshot is a locked copy of your data files, saved to a backup storage location. Do not change these files; they are for reference only.
your raw data (before they are cleaned and processed)
your processed data (before they are analyzed)
the data used for your analysis, any analytical scripts or procedures used, and detailed notes about why data were selected for analysis or excluded
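One way to take a snapshot like those described above is sketched here. This is an illustration, not a prescribed procedure: the function name `take_snapshot`, the label `01_raw`, the file `SHA256SUMS.txt`, and the temporary folders are all assumptions made for the example. The sketch copies a data file into a labelled backup folder, marks the copy read-only, and records a SHA-256 checksum so that later changes to the copy can be detected.

```python
import hashlib
import stat
import shutil
import tempfile
from pathlib import Path


def take_snapshot(source, backup_dir, label):
    """Copy `source` into backup_dir/label, lock the copy, and record its checksum."""
    source = Path(source)
    target_dir = Path(backup_dir) / label

    # Refuse to overwrite an existing snapshot with the same label.
    target_dir.mkdir(parents=True, exist_ok=False)

    target = target_dir / source.name
    shutil.copy2(source, target)  # copy2 also preserves file timestamps

    # Remove write permission so the copy is for reference only.
    target.chmod(stat.S_IRUSR | stat.S_IRGRP | stat.S_IROTH)

    # Store a checksum next to the copy so it can be verified later.
    digest = hashlib.sha256(target.read_bytes()).hexdigest()
    (target_dir / "SHA256SUMS.txt").write_text(f"{digest}  {target.name}\n")
    return target, digest


# Demonstration with temporary folders and a hypothetical raw data file.
work_dir = Path(tempfile.mkdtemp())
backup_dir = Path(tempfile.mkdtemp())
raw_file = work_dir / "raw_data.csv"
raw_file.write_bytes(b"id,score\n1,10\n")

snapshot_path, checksum = take_snapshot(raw_file, backup_dir, "01_raw")
print(snapshot_path, checksum)
```

In practice the backup folder would sit on separate storage from the working files, and one labelled snapshot would be taken at each key point, with the analysis scripts and notes stored alongside the final one.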
Before you can begin to analyze your data, the data have to be cleaned, processed, screened, and sometimes split into separate datasets. This process can be full of confusion and uncertainty, but you can reduce both through planning, good note-taking, and taking snapshots (sometimes called data locks) of your data at key points during data collection, processing, and analysis.
The Software Tools database is the product of two NSF-funded Informatics Education Planning Workshops hosted by DataONE. The database provides a brief description of a wide range of tools that are recommended for use by scientists and students, as well as additional information and links to further resources.
The DiRT Directory is a registry of digital research tools for scholarly use. DiRT makes it easy for digital humanists and others conducting digital research to find and compare resources ranging from content management systems to music OCR, statistical analysis packages to mindmapping software.