More on Voting

In the ranked-choice ballot system, once a voter has picked a first-choice candidate, there is no advantage in listing that same candidate as second or third choice.

In answering each of the following, you should include three things:

  1. A statement in English (using data verbs!) of your strategy for carrying out the calculation.
  2. The implementation of the calculation in R.
  3. The data table that results from the calculation. This will be printed automatically from (2). You do not have to format the result beautifully; it is sufficient to show the first few lines of the data table. (Remember head().)

Here are the questions.

Medicare Reimbursement

One way to measure the “spread” of values is with the “standard deviation.” The higher the standard deviation, the more spread out the values are.
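As a small illustration of how the standard deviation tracks spread, here is a sketch on a toy data table. The table `charges` and its columns `group` and `amount` are made up for this example and are not the columns of MedicareCharges; the dplyr data verbs are assumed to be available.

```r
library(dplyr)

# Toy data (hypothetical names): group A has identical values,
# group B has values spread around the same mean.
charges <- data.frame(
  group  = c("A", "A", "A", "B", "B", "B"),
  amount = c(10, 10, 10, 5, 10, 15),
  stringsAsFactors = FALSE
)

# Strategy: group the cases, then summarise each group by its standard deviation.
spread <- charges %>%
  group_by(group) %>%
  summarise(sd_amount = sd(amount))

head(spread)
```

Group A, whose values are all equal, has a standard deviation of 0, while group B, with the same mean but more varied values, has a standard deviation of 5.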

In the MedicareCharges data table:

library( mosaic )
# Read the published Google spreadsheet (CSV output) into a data table
DRGtoBodySystem <-
  fetchGoogle( "https://docs.google.com/spreadsheet/pub?key=0Am13enSalO74dFFFaGJWdk1IbUo4bVFXLXYzbE1KLXc&single=true&gid=0&output=csv" )

Make a bar chart showing how much Medicare pays for each category. Your report should show both the bar chart and the commands that generate it.
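The general shape of such a chart can be sketched on toy data. Everything named here (the table `payments` and the columns `category` and `paid`) is hypothetical and stands in for whatever the real tables contain; the sketch assumes dplyr and ggplot2 are installed.

```r
library(dplyr)
library(ggplot2)

# Toy data (hypothetical names and values), one row per payment.
payments <- data.frame(
  category = c("Circulatory", "Circulatory", "Respiratory",
               "Digestive", "Digestive"),
  paid     = c(100, 250, 80, 40, 60),
  stringsAsFactors = FALSE
)

# Strategy: group by category, summarise to a total, then draw one bar per category.
totals <- payments %>%
  group_by(category) %>%
  summarise(total_paid = sum(paid))

p <- ggplot(totals, aes(x = category, y = total_paid)) +
  geom_col()   # bar height is taken directly from total_paid
print(p)
```

Summarising before plotting keeps the totals available as a data table, which can be shown with head() alongside the chart.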

Zip Codes

Join the zip code geography and demography:
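A join of this kind might look like the following sketch. The two toy tables, and the key column name `zip`, are assumptions made for illustration; the real tables may use different names and a different key.

```r
library(dplyr)

# Toy stand-ins (hypothetical names and values) for the two zip code tables.
geography <- data.frame(
  zip       = c("00001", "00002", "00003"),
  latitude  = c(44.9, 40.7, 34.1),
  longitude = c(-93.2, -74.0, -118.2),
  stringsAsFactors = FALSE
)
demography <- data.frame(
  zip        = c("00002", "00003", "00004"),
  population = c(56000, 43000, 800),
  stringsAsFactors = FALSE
)

# inner_join() keeps only the zip codes that appear in both tables.
zips <- inner_join(geography, demography, by = "zip")
head(zips)
```

Here only "00002" and "00003" survive the join, since the other zip codes are missing from one table or the other.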

Pick out the 10000 zip codes with the highest population. Make a scatter plot of the latitude versus longitude. (Hint: arrange() and head().) Use color to represent the fraction of the population that is over 65.
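The arrange() and head() hint can be sketched as follows on a toy table. The table `zips` and all of its column names are hypothetical, and the example keeps only the top 3 rows rather than 10000 so that the result is easy to check by eye.

```r
library(dplyr)
library(ggplot2)

# Toy stand-in for the joined zip code table (hypothetical names and values).
zips <- data.frame(
  zip        = c("00001", "00002", "00003", "00004", "00005"),
  latitude   = c(44.9, 40.7, 34.1, 41.9, 29.8),
  longitude  = c(-93.2, -74.0, -118.2, -87.6, -95.4),
  population = c(1200, 56000, 43000, 800, 30000),
  over65     = c(300, 7000, 4300, 200, 6000),
  stringsAsFactors = FALSE
)

# Strategy: compute the over-65 fraction, sort by population (largest first),
# and keep the first few rows.
top_zips <- zips %>%
  mutate(frac_over65 = over65 / population) %>%
  arrange(desc(population)) %>%
  head(3)   # the exercise asks for 10000

# Longitude on x and latitude on y gives the usual map orientation.
p <- ggplot(top_zips, aes(x = longitude, y = latitude, color = frac_over65)) +
  geom_point()
print(p)
```

Sorting in descending order before head() is what makes the kept rows the most populous ones; head() on its own would simply take the first rows in whatever order the table happens to be in.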

How many zip codes have a WaterArea that is more than 50% of the LandArea? Make a scatter plot showing the geographical location of these, with color indicating the population.
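The counting step can be sketched with filter() on a toy table. The column names `WaterArea` and `LandArea` come from the question above; the table name `geo`, the `zip` column, and the values are made up for illustration.

```r
library(dplyr)

# Toy data (hypothetical values) with the two area columns named in the question.
geo <- data.frame(
  zip       = c("00001", "00002", "00003", "00004"),
  LandArea  = c(10, 4, 20, 2),
  WaterArea = c(1, 3, 10, 5),
  stringsAsFactors = FALSE
)

# Strategy: keep the cases where water area exceeds half the land area, then count them.
wet <- geo %>%
  filter(WaterArea > 0.5 * LandArea)

nrow(wet)
```

In this toy table two zip codes pass the filter. The third row, where WaterArea is exactly half of LandArea, does not, because the comparison is strictly "more than". The filtered table `wet` can then be passed to a scatter plot in the same way as any other data table.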



Written by Daniel Kaplan for the Data & Computing Fundamentals Course. Development was supported by grants from the National Science Foundation for Project Mosaic (NSF DUE-0920350) and from the Howard Hughes Medical Institute.