Research Data Services

This guide describes the services offered by the Ruth Lilly Medical Library for the management, sharing, and preservation of research data at the IU School of Medicine.

Data Collection Best Practices

Designing a research proposal involves making decisions about how to represent information from the real world as data for analysis.  Areas to consider include:

  1. Does your approach uphold institutional and community standards for ethical data collection?
  2. What framework will you be using to collect your data, and why?
  3. How will quality control be integrated into data collection protocols?
  4. If you are collecting human subjects data, how can you build privacy into your study design?
  5. How can data collection processes anticipate the rest of the data life cycle?


Selected data collection resources:

File Naming Conventions

Establishing a file naming convention before data collection starts keeps your files findable as the volume of data grows over the project period.  A consistent convention saves the time you would otherwise lose searching for files and makes file names machine-readable, which in turn streamlines data analysis and validation.

Here is an example of a file naming convention:
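
As one hypothetical illustration, a convention can be written down as a small function that builds names and a pattern that checks them. The field order, separators, and every name in this sketch (project, site, subject number, collection date, version) are assumptions chosen for the example, not a standard endorsed by this guide; adapt them to your own project.

```python
import re
from datetime import date

# Hypothetical convention (an assumption for illustration only):
#   <project>_<site>_S<subject number>_<YYYYMMDD>_v<NN>.<ext>
NAME_PATTERN = re.compile(
    r"^(?P<project>[A-Za-z0-9]+)_"
    r"(?P<site>[A-Za-z0-9]+)_"
    r"(?P<subject>S\d{4})_"
    r"(?P<date>\d{8})_"
    r"v(?P<version>\d{2})"
    r"\.(?P<ext>[a-z0-9]+)$"
)


def build_name(project, site, subject_number, collected, version, ext):
    """Assemble a file name that follows the hypothetical convention."""
    return (
        f"{project}_{site}_S{subject_number:04d}_"
        f"{collected:%Y%m%d}_v{version:02d}.{ext}"
    )


def parse_name(filename):
    """Return the name's fields as a dict, or None if it breaks the convention."""
    match = NAME_PATTERN.match(filename)
    return match.groupdict() if match else None


name = build_name("sleepstudy", "IUSM", 12, date(2024, 3, 5), 1, "csv")
print(name)                              # a name that follows the convention
print(parse_name(name))                  # its fields, recovered by the pattern
print(parse_name("final data (2).csv"))  # None: does not follow the convention
```

Writing the date as YYYYMMDD and zero-padding numbers means that an ordinary alphabetical sort also orders files chronologically and by subject, and the same pattern can be reused to flag files that drift from the convention.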

File Formats

When you have the option, choosing an open source file format during data collection makes your data more accessible and interoperable than a proprietary format would.  Open source formats lower the barrier for other researchers to reuse your data, increasing the potential future impact of the data you create.

Example Open Source File Formats

Text: .txt, .html, .pdf (some types are proprietary to Adobe)
Images: .jpeg, .png, .tiff (specifications are open; .tiff is trademarked by Adobe)
Tabular: .csv
Audio: .flac (lossless codec), .mp3 (lossy codec)
Compression: .7z, .gz, .tar
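
A list like this can also be used to audit a project folder. The sketch below is a minimal example that copies the extensions listed above into a lookup and flags any file whose extension is not among them. The category keys, the function name, and the sample file names are assumptions made for illustration, and an unlisted extension is not necessarily proprietary; it only means the file deserves a second look.

```python
from pathlib import Path

# Extensions copied from the list above; the category keys and the
# function name are assumptions made for this sketch.
OPEN_FORMATS = {
    "text": {".txt", ".html", ".pdf"},
    "images": {".jpeg", ".png", ".tiff"},
    "tabular": {".csv"},
    "audio": {".flac", ".mp3"},
    "compression": {".7z", ".gz", ".tar"},
}


def classify(filename):
    """Return the listed category for a file's extension, or None if unlisted."""
    suffix = Path(filename).suffix.lower()  # compare case-insensitively
    for category, extensions in OPEN_FORMATS.items():
        if suffix in extensions:
            return category
    return None


# Hypothetical file names, used only to exercise the function.
files = ["visit1.csv", "scan_003.TIFF", "results.xlsx", "interview.flac"]
unlisted = [f for f in files if classify(f) is None]
print(unlisted)
```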



REDCap is a secure, web-based platform that supports data collection and data management for research, operations support, and quality improvement projects. The tool is free to use, and access is provided by the Indiana Clinical and Translational Sciences Institute (CTSI), the UITS Advanced Biomedical IT Core, and the IUSM Department of Biostatistics.

Common Data Elements

Common data elements (CDEs) are standardized definitions of variables for data collection.  Each CDE pairs a precisely defined question with a specified set of allowable responses.  Using CDEs increases the interoperability of your scientific data and opens up opportunities to extend your research impact, because datasets created with the same CDEs can be harmonized without losing information.
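
To make the idea concrete, here is a minimal sketch that models a CDE as a defined question plus a set of permitted responses. The class, the element, its allowed values, and the two site datasets are all hypothetical; none of them is taken from an actual repository entry.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CommonDataElement:
    """A hypothetical CDE: one defined question and its permitted responses."""
    name: str
    question: str
    permissible_values: frozenset

    def validate(self, response):
        """Return the response unchanged, or raise if the CDE does not allow it."""
        if response not in self.permissible_values:
            raise ValueError(f"{response!r} is not allowed for {self.name}")
        return response


# Illustrative element only; not copied from any repository.
smoking_status = CommonDataElement(
    name="smoking_status",
    question="What is the participant's current smoking status?",
    permissible_values=frozenset({"never", "former", "current", "unknown"}),
)

# Two hypothetical datasets collected at different sites with the same CDE.
site_a = [{"id": "A1", "smoking_status": "never"}]
site_b = [{"id": "B1", "smoking_status": "former"}]

# Because both sites used the same element, rows combine without recoding.
combined = site_a + site_b
for row in combined:
    smoking_status.validate(row["smoking_status"])
print(len(combined))
```

Had one site recorded free-text answers instead, the rows would need to be recoded by hand before merging, which is where information tends to be lost.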

The NIH CDE Repository is a central location to find existing CDEs.