Visualizing Movie Ratings

Data Computing

A set of 100,000 ratings of movies by individuals was collected in the late 1990s by the GroupLens research team at the University of Minnesota. The GroupLens team provides the data directly at http://grouplens.org/datasets/movielens/100k/. These data were reformatted for the Data Computing book. Download them to your own computer with this statement:

download.file("http://tiny.cc/dcf/MovieLens.rda", 
              destfile = "MovieLens.rda")

You only need to download the data once. But each time you start a new R session, you will need to load() the data into that session. (Every time you knit a document, you are starting a new session just for the purpose of compiling that document.)
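A minimal sketch of the save-then-load round trip, using a small made-up table in a temporary file rather than the real download. The names `Toy`, `tmp`, and `env` are hypothetical. It illustrates that load() restores objects under the names they were saved with and returns those names.

```r
# Stand-in for MovieLens.rda: a temporary .rda file holding one made-up table.
Toy <- data.frame(age = c(24, 53, 33), sex = c("M", "F", "F"))
tmp <- tempfile(fileext = ".rda")
save(Toy, file = tmp)

# Load into a fresh environment so it is easy to see what the file contained.
env <- new.env()
loaded_names <- load(tmp, envir = env)

loaded_names   # names of the objects restored from the file
ls(env)        # the same names, now present in the environment
```

For the real data, a call such as load("MovieLens.rda") near the top of your document should restore the tables into the session, assuming the file sits in the working directory.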

MovieLens.rda contains three data tables:

Your task: Construct each of these graphics.

Show the appeal of different genres to the different sexes
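One possible starting point, sketched on a small made-up table rather than the real data. It assumes columns named genre, sex, and rating, as used with the All table elsewhere in this document. The table `toy` and the names `appeal` and `p` are hypothetical, and mean rating is only one of several reasonable ways to measure "appeal".

```r
library(dplyr)
library(ggplot2)

# Made-up ratings, standing in for the real joined table.
toy <- data.frame(
  genre  = c("Comedy", "Comedy", "Comedy", "Comedy",
             "Drama",  "Drama",  "Drama",  "Drama"),
  sex    = c("F", "M", "F", "M", "F", "M", "F", "M"),
  rating = c(4, 3, 5, 3, 3, 4, 4, 5)
)

# Average rating for each genre/sex combination.
appeal <- toy %>%
  group_by(genre, sex) %>%
  summarise(mean_rating = mean(rating), .groups = "drop")

# Side-by-side bars: one bar per sex within each genre.
p <- ggplot(appeal, aes(x = genre, y = mean_rating, fill = sex)) +
  geom_col(position = "dodge")
p
```

With the real data, the same pipeline could begin from the full table instead of `toy`. Whether a bar chart, a dot plot, or facets by genre reads best is a design choice left to you.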

Who are the reviewers?

Users %>%
  ggplot(aes(x = age)) + 
  geom_density(aes(fill = occupation), 
               color = NA, alpha = .7, position = "fill") + 
  facet_wrap( ~ sex)

Users %>%
  ggplot(aes(x = age)) + 
  geom_density(aes(fill = sex), 
               color = NA, alpha = .4, position = "fill")

Users %>%
  group_by(occupation) %>%
  tally() %>%
  arrange(desc(n))
## # A tibble: 21 × 2
##       occupation     n
##            <chr> <int>
## 1        student   196
## 2          other   105
## 3       educator    95
## 4  administrator    79
## 5       engineer    67
## 6     programmer    66
## 7      librarian    51
## 8         writer    45
## 9      executive    32
## 10     scientist    31
## # ... with 11 more rows

Ratings as people age

All %>%
  filter( genre != "unknown") %>%
  ggplot(aes(x = age, color = sex, y = rating)) + 
  geom_smooth() + 
  facet_wrap( ~ genre, scales = "free")
## `geom_smooth()` using method = 'gam'

All %>% 
  ggplot(aes(x = age, color = sex, y = rating)) +
  geom_smooth()
## `geom_smooth()` using method = 'gam'