Week 5

One

This and several following problems concern the NHANES data.

Judging from the graph, what’s the most likely height among the NHANES people? (FYI: The heights are in meters.)
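Reading the "most likely" value off a density graph amounts to finding where the curve peaks. The sketch below illustrates that idea with a simple Gaussian kernel density estimate. It uses made-up heights rather than the NHANES data, and the function names and the bandwidth are hypothetical choices for illustration only.

```python
import math
import random

def gaussian_kde(sample, bandwidth):
    """Return a function that estimates the density of `sample` at a point x."""
    norm = 1.0 / (len(sample) * bandwidth * math.sqrt(2 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in sample
        )

    return density

def most_likely_value(sample, bandwidth, grid_points=200):
    """Return the grid point where the estimated density is highest."""
    lo, hi = min(sample), max(sample)
    density = gaussian_kde(sample, bandwidth)
    grid = [lo + (hi - lo) * i / (grid_points - 1) for i in range(grid_points)]
    return max(grid, key=density)

# Made-up heights in meters (NOT the NHANES data), centered near 1.65 m.
random.seed(1)
fake_heights = [random.gauss(1.65, 0.10) for _ in range(500)]

mode_height = most_likely_value(fake_heights, bandwidth=0.03)
print(round(mode_height, 2))
```

The printed value is the location of the peak of the estimated density, which is the same thing you locate by eye on the graph.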

Two

What’s the most likely height for women? For men?

Three

Here’s the joint density of BMI and age. The blue lines play the role of the contour lines on a topographic map.

What’s the most likely BMI for a 40-year-old? For a 70-year-old?

Four

You’re doing a study of whether grandparents go where the grandkids are. As a first attempt, you look at 20 ZIP codes. You find the relationship between the fraction of people in a ZIP code under 5 years old and the fraction 65 and older, plotted below.

  • Do the data indicate that ZIP codes with high elderly populations tend to have high child populations?

  • Looking at the confidence bands, are your data possibly consistent with there being no relationship (that is, a level line) between elderly population and child population?

You decide to get more data: study 80 ZIP codes. Here’s the result.

  • Is a flat line consistent with the data?

  • Compare the height of the confidence band with 20 ZIP codes to the height of the band with 80 ZIP codes. The larger sample has 4 times as much data as the smaller one. Roughly, what’s the ratio of the confidence band heights?

  • Statistical theory indicates that the width of a confidence band based on \(n\) points goes as \(1/\sqrt{n}\). Does this seem about right?
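To get a feel for the \(1/\sqrt{n}\) rule, here is a minimal simulation sketch. It is a simplification: it measures the width of an approximate 95% confidence interval on a sample mean rather than a regression confidence band, and it uses simulated standard-normal data, not ZIP code data. The function names are hypothetical.

```python
import math
import random
import statistics

def ci_width(sample):
    """Width of an approximate 95% confidence interval on the mean."""
    standard_error = statistics.stdev(sample) / math.sqrt(len(sample))
    return 2 * 1.96 * standard_error

def average_ci_width(n, trials=2000, seed=0):
    """Average the interval width over many simulated samples of size n."""
    rng = random.Random(seed)
    widths = [
        ci_width([rng.gauss(0.0, 1.0) for _ in range(n)])
        for _ in range(trials)
    ]
    return statistics.mean(widths)

width_20 = average_ci_width(20)
width_80 = average_ci_width(80)
ratio = width_80 / width_20
print(f"n=20: {width_20:.3f}  n=80: {width_80:.3f}  ratio: {ratio:.2f}")
```

Compare the printed ratio with \(\sqrt{20}/\sqrt{80}\), then compare both with what you read off the two graphs.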

Five

For a sample of 100 ZIP codes, the following graphic divides the ZIP codes into 5 income groups and shows, for each group, the fraction of the population that is 65 and older.

What relationship do you see between income and the fraction of the population that is 65 and older? (The notch shows the confidence interval on the median fraction.)

Here is the same analysis, but for a sample 16 times as big as before.

Do the notch sizes (confidence intervals) follow the \(1/\sqrt{n}\) rule?
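As a reminder of what the rule predicts, here is a short worked calculation. It assumes the notch height is proportional to \(1/\sqrt{n}\) with the same constant in both plots (that is, the spread within each income group is about the same in the two samples). The symbol \(w_n\), the notch height for a sample of \(n\) ZIP codes, is notation introduced here for illustration.

\[
\frac{w_{16n}}{w_{n}} = \frac{1/\sqrt{16n}}{1/\sqrt{n}} = \frac{1}{\sqrt{16}} = \frac{1}{4}
\]

The question is whether the notches in the two graphics shrink by roughly this factor.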



Written by Daniel Kaplan for the Data & Computing Fundamentals Course. Development was supported by grants from the National Science Foundation for Project Mosaic (NSF DUE-0920350) and from the Howard Hughes Medical Institute.