Chapter 2 Data and Code Books

“Numbers have life. They are not just symbols on paper.”     — Shakuntala

2.1 Overview

You have learned a bit about the structure and goals of the course and have heard from students who have engaged in their own research. Now you will begin to learn about the necessary steps in statistical inquiry, types of variables, and how to use code books to develop your own research question. Statistics is all about converting data into useful information.

2.2 Lesson

Learn about the role of populations and samples in statistical inquiry. Understand the importance of choosing a representative sample from a population of interest. Consider the structure of a data set and the meaning and function of the information in its rows and columns. Learn how code books document the way the data are arranged within a data set, what the various numbers and letters mean, and any special instructions on how to use the data. Develop skills in reading code books and formulating a research question. Watch the video lesson that accompanies this section.
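To make the relationship between a data set and its code book concrete, here is a minimal sketch in Python. It assumes a tiny data set in which each row is one respondent and each column is one variable. The variable names (ID, SEX, SMOKER) and their numeric codes are invented for illustration and do not come from any real study.

```python
# Hypothetical code book: names, labels, and value codes are invented
# for illustration only.
code_book = {
    "ID": {"label": "Respondent identifier", "values": None},
    "SEX": {"label": "Sex of respondent", "values": {1: "Male", 2: "Female"}},
    "SMOKER": {
        "label": "Smoked in the past 12 months",
        "values": {1: "Yes", 2: "No", 9: "Unknown"},
    },
}

# Each row (dictionary) is one observation; each key is one variable (column).
data = [
    {"ID": 1, "SEX": 1, "SMOKER": 2},
    {"ID": 2, "SEX": 2, "SMOKER": 1},
    {"ID": 3, "SEX": 2, "SMOKER": 9},
]


def decode(row, code_book):
    """Replace each numeric code in a row with the meaning the code book gives it."""
    decoded = {}
    for variable, value in row.items():
        values = code_book[variable]["values"]
        # Variables without value codes (such as ID) are kept unchanged.
        decoded[variable] = values.get(value, value) if values else value
    return decoded


decoded_rows = [decode(row, code_book) for row in data]
for row in decoded_rows:
    print(row)
```

Without the code book, a value such as 9 in the SMOKER column could easily be mistaken for a real quantity rather than a code for "Unknown", which is why the code book should be read before the data are analyzed.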

2.3 Assignment

Now it is time for you to formulate your own research question. You should spend ample time on this task as it will guide your work for the rest of the course.


One of the simplest research questions that can be asked is whether two topics or constructs are associated.

Is seeking medical treatment associated with income? Is water fluoridation associated with the number of cavities found during dentist visits? Is humidity associated with caterpillar reproduction?


First, carefully read through the available code book(s). If more than one data set has been made available to you, select the one that you would like to work with.

Next, identify a specific topic of interest and begin to prepare a code book of your own. Print the individual pages from the larger code book that cover it (i.e., the items or questions that measure your selected topic).

During an additional review of the code book, identify a second topic that you would like to explore in terms of its association with your original topic. Add the items or questions documenting this second topic to your personal code book.

For example, after looking through the code book for the NESARC study, I have decided that I am particularly interested in nicotine dependence. I am not sure which nicotine dependence items I will use (e.g., symptoms or a diagnosis), so for now I will include all of the relevant items in my personal code book.

While nicotine dependence is a good starting point, I need to determine what it is about nicotine dependence that interests me. It strikes me that the friends and acquaintances I have known over the years who became hooked on cigarettes did so over very different periods of time. Some seemed to be dependent soon after their first few experiences with smoking, and others only after many years of generally irregular smoking.

I decide that I am most interested in exploring the association between level of smoking and nicotine dependence.

I add items reflecting smoking (e.g., smoking quantity and frequency) to my personal code book. Bring a printed copy of your personal code book to each class session.
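If you also want to keep an electronic version of your personal code book, the selection steps above can be sketched in Python as follows. This is only an illustration under stated assumptions: the variable names, labels, and topic tags are hypothetical and are not the actual NESARC item names.

```python
# Hypothetical excerpt of a larger code book. Variable names, labels, and
# topic tags are invented for illustration; they are not real NESARC items.
full_code_book = {
    "AGE": {"label": "Age in years", "topic": "demographics"},
    "INCOME": {"label": "Household income", "topic": "demographics"},
    "ND_SYMPTOMS": {
        "label": "Number of nicotine dependence symptoms",
        "topic": "nicotine dependence",
    },
    "ND_DIAGNOSIS": {
        "label": "Nicotine dependence diagnosis",
        "topic": "nicotine dependence",
    },
    "CIGS_PER_DAY": {"label": "Usual number of cigarettes per day", "topic": "smoking"},
    "SMOKE_FREQ": {"label": "How often cigarettes are smoked", "topic": "smoking"},
}


def build_personal_code_book(code_book, topics):
    """Keep only the items that measure one of the selected topics."""
    return {
        name: entry
        for name, entry in code_book.items()
        if entry["topic"] in topics
    }


# First topic: nicotine dependence. Second topic: level of smoking.
personal_code_book = build_personal_code_book(
    full_code_book, {"nicotine dependence", "smoking"}
)

for name, entry in personal_code_book.items():
    print(f"{name}: {entry['label']}")
```

Keeping every relevant item for both topics at this stage mirrors the advice above: you can decide later which specific items to use.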