Research Data Services

This guide describes the services offered by the Ruth Lilly Medical Library for the management, sharing, and preservation of research data at the IU School of Medicine.

README

A README file (often written in all capital letters to draw attention) makes data more understandable and reusable, whether by your future self or by others. In data sharing, a README is typically a plain text or markdown file that briefly describes the organization of everything else included, like the table of contents in a book. A good README's strength lies in describing the relationships among files, and it can be the best place to record certain details needed to reproduce a scientific analysis.

Here is an example of a README from a Figshare submission:

Setting up a README early in a project can help capture key information about datasets as they are being created, especially if many collaborators are involved in the collection and analysis processes.
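
One way to start a README early is to generate a skeleton from the project folder and fill in the descriptions as the data are created. The following is a minimal Python sketch of that idea, not a prescribed tool: the function name `build_readme`, the output layout, and the file name `README.txt` are assumptions made for illustration.

```python
from pathlib import Path


def build_readme(data_dir: Path, title: str, descriptions: dict[str, str]) -> str:
    """Return README text that lists every file under data_dir.

    `descriptions` maps a relative file path (with forward slashes) to a
    one-line description. Files without an entry get a TODO placeholder
    so that gaps are easy to spot. The layout is a hypothetical example.
    """
    # Collect relative paths of all files, skipping an existing README.
    relative_paths = sorted(
        p.relative_to(data_dir).as_posix()
        for p in data_dir.rglob("*")
        if p.is_file()
    )
    lines = [title, "", "Files", ""]
    for rel in relative_paths:
        if rel == "README.txt":
            continue
        note = descriptions.get(rel, "TODO: describe this file")
        lines.append(f"{rel}: {note}")
    return "\n".join(lines) + "\n"
```

Calling `build_readme(Path("my_project"), "Survey project", {"raw/survey.csv": "Raw survey responses"})` returns text that can be written to a README file and then edited by hand. The relationships among files and the steps for reproducing the analysis still need to be added by the people who know the project.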

Data Dictionary

A data dictionary, also called a codebook, is usually distinguished from a README by its focus on defining the variables within tabular data. These definitions may include each variable's data type as understood by the programming language used in data processing, a context in which "data dictionary" can have a more sharply defined technical meaning. Depending on the scope and complexity of the associated scientific data, a data dictionary may overlap with a README or replace it. The underlying purpose is to facilitate reproducibility in research, saving your future self and others from unnecessary deciphering effort.

Sample data dictionary

| Variable name | Data type | Data format | Description |
| --- | --- | --- | --- |
| Name | text | Last Name, First Initial. Middle Initial. | Survey responder's name |
| DOB | date/time | YYYY/MM/DD | Survey responder's date of birth |
| SSN | integer | XXX-XX-XXXX | Survey responder's social security number |
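
A data dictionary can also be kept in machine-readable form and used to check incoming records. The sketch below is one possible approach in Python, assuming every value arrives as a string (for example, when read from a CSV file). The names `DATA_DICTIONARY` and `check_record` and the specific format checks are illustrative assumptions, not part of the sample above.

```python
import re
from datetime import datetime


def _is_date(value: str) -> bool:
    """True if value is a real calendar date written as YYYY/MM/DD."""
    if not re.fullmatch(r"\d{4}/\d{2}/\d{2}", value):
        return False
    try:
        datetime.strptime(value, "%Y/%m/%d")
    except ValueError:
        return False
    return True


# One entry per variable in the sample data dictionary.
# The "check" functions are assumed interpretations of each stated format.
DATA_DICTIONARY = {
    "Name": {
        "format": "Last Name, First Initial. Middle Initial.",
        "check": lambda v: re.fullmatch(r"[^,]+, [A-Z]\. [A-Z]\.", v) is not None,
    },
    "DOB": {
        "format": "YYYY/MM/DD",
        "check": _is_date,
    },
    "SSN": {
        "format": "XXX-XX-XXXX",
        "check": lambda v: re.fullmatch(r"\d{3}-\d{2}-\d{4}", v) is not None,
    },
}


def check_record(record: dict[str, str]) -> list[str]:
    """Return a list of problems found in one record (empty if none)."""
    problems = []
    for name, entry in DATA_DICTIONARY.items():
        if name not in record:
            problems.append(f"{name}: missing")
        elif not entry["check"](record[name]):
            problems.append(f"{name}: expected format {entry['format']}")
    return problems
```

For example, `check_record({"Name": "Doe, J. Q.", "DOB": "1990/01/05", "SSN": "000-00-0000"})` returns an empty list, while a record with a missing or misformatted variable returns one message per problem. Checking the format as a string pattern is a design choice here: a value written with hyphens cannot be stored as a plain integer without losing its formatting.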