CVC 2015

Computation and Visualization Consortium

Resources and Working Schedule

Resources

  • CRAN (Comprehensive R Archive Network) hosts R itself, contributed packages, and documentation.
  • The mosaic package (version 0.10 or greater) and its vignettes are available via CRAN and also via GitHub.
  • Accounts on the Calvin College RStudio server have been provided for all workshop participants. Teaching Statistics Using R.
  • Slides and files for sessions will be posted in the schedule as we go along (see below).
  • The CVC 2015 R Pubs site can be used to post documents for others to see.

Provide Feedback

We’ll use this Google Doc to record your questions and comments and to provide responses. Feel free to edit it at any time before, during, or (shortly) after the workshop.

Working Schedule

This schedule will be updated throughout the workshop. All items should be considered tentative until they happen, since we will adjust things to meet the needs and wishes of the participants.

Sunday

Arrival day for many coming from out of town.

Monday

  1. Welcome
    1. Welcome to CVC [Ben]
    2. Goals and outline for the week [Randy]
  2. Introductions [Randy]
    1. Activity: Split up into groups of three or four and introduce yourselves.
    2. Google Doc for second part.
  3. A First Case Study using Medicare Data [Danny]

  4. Introduction to R and RStudio [Randy]

  5. RMarkdown [Nick]

    1. Activity: Write and publish a description of some straightforward, data-oriented topic you’d like to present in class.
      • Include at least 1 plot (even if unrelated) to make sure you know how to do that.
      • Use the mosaic plain HTML template as a starting point.
      • Share this on RPubs here via the “Publish” button in RStudio. (See board for login and password.)
  6. How to organize your data [Danny]

    1. Introduction slides. The basics are:
      • Rectangular format: cases (rows) and variables (columns).
      • Separate analysis from data storage.
      • Use a codebook to describe your cases and variables in detail.
    2. Activity: The spreadsheet here contains data on the Minneapolis 2013 election by ward and precinct. Identify the elements of this spreadsheet that are not in standard rectangular data format.

    3. Activity: Enter data in this spreadsheet indicating the types of data you are interested in.

Tuesday

  1. Project Reports [Ben]
    1. Activity: Come prepared to give a (very brief) report on the status of your project idea(s).
  2. The graphics-data interface [Danny]
    1. Graphing concepts: slides [Randy/Danny]
    2. Activity: data <-> graphs [Danny] (PDF with the graphs and the questions)
  3. Introduction to ggplot2 – turning the concepts into pictures with the grammar of graphics

    1. slides
    2. Dope Sheets
  4. Working with data – 5 verbs of dplyr [Danny]
    Getting your data into the right shape for graphing or analysis
    1. Slides
    2. A case study with workshop participant Mary Harrington
    3. Activity: Baby names [Nick]
  5. Getting your own data into R [Randy]

    1. Notes
    2. Anscombe.csv
    3. Gender.xls

    Post workshop: a DataCamp tutorial

Wednesday

  1. More data wrangling with dplyr and tidyr
    1. Merging data from multiple sources [Danny]
    2. gather() [Randy]
    3. Building precursors paper
  2. Quick Shiny intro Example

  3. Ingesting your own data and starting on your project

Thursday

  1. More ggplot (see notes above) [Randy]

  2. Writing Functions [Randy]

    1. Notes
    2. Rds file
  3. Shiny tutorial. [Danny & Randy] The group designed an app to be embedded in an Rmd file.
    1. Here’s the Rmd file.
    2. We deployed the app here. Try it out!

Friday

  1. Please fill out the Project Report Form

  2. Live Project Reports
    1. Sanjive Qazi – Interactive Clusters in Shiny
    2. Dominique – Shiny & Generating Data on the fly
    3. Simon – Student Lab in Rmd with a Dygraph
  3. Finding R packages and documentation [Nick]

Additional Topics TBD

  1. Getting data from a database [Ben]
  2. Github [Randy]
  3. Shiny [Danny]
  4. Maps [Danny]
  5. Machine learning [Danny]
  6. Text mining [Nick]
  7. Modeling in R [Nick]
  8. Writing Functions [Randy]
  9. More ggplot2 [Randy]
  10. Basic Statistical Methods
  11. Data Scraping from the Web
    1. Finding packages that provide interfaces to APIs
  12. Image Analysis