CVC 2016 Working Schedule

Computation and Visualization Consortium

This is cvc.mosaic-web.org/Summer2016/schedule2016.html

Resources

Provide Feedback

We’ll use this Google Doc to record your questions and comments and to provide responses. Feel free to edit it at any time before, during, or (shortly) after the workshop.

Working Schedule

This schedule will be updated throughout the workshop. All items should be considered tentative until they happen, since we will adjust things to meet the needs and wishes of the participants.

Sunday

Arrival day for many coming from out of town.

Opening reception, 5 pm - 7 pm at Jo Hardin’s house

  • Check your email for details.

Monday

  1. Welcome (9am)
    1. Welcome to CVC/Pomona [Jo]
    2. Goals and outline for the week [Randy]
    3. Thinking with data
  2. Introductions [Randy]
    1. Activity: Split up into groups of three or four and introduce yourselves.
    2. Google Doc for second part.
  3. A First Case Study using Medicare Data [Danny]

  4. Introduction to R and RStudio [Randy]

  5. Project Example – Dynamic Data [Jo]

    1. slides for presentation; markdown file also available
    2. manuscript on arXiv
    3. complete example markdown files also available
  6. RMarkdown [Nick]

    • Creating PDF/HTML/Word documents in RStudio
    • TISE paper by Baumer et al.
    • Example: R Markdown example; R Markdown formatted (Student evaluations example from Open Intro)

    • Activity: Write and publish a description of some straightforward, data-oriented topic you’d like to present in class.
      • Include at least 1 plot (even if unrelated) to make sure you know how to do that.
      • Use the mosaic plain html template as a starting point
  7. How to organize your data [Danny]

    1. Introduction slides, but the basics are:
      • Rectangular format: cases (rows) and variables (columns)
      • Separate your analysis from data storage.
      • Use a codebook to describe your cases and variables in detail.
    2. Activity: The spreadsheet here contains data on the Minneapolis 2013 election by ward and precinct. Identify the elements of this spreadsheet that are not in standard rectangular data format.
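
The three basics listed above can be sketched in a few lines of base R. This is only a minimal illustration: the counts, the column names (`voters`, `ballots`), and the codebook wording are all invented here and are not taken from the workshop spreadsheet or slides.

```r
# Hypothetical precinct counts, invented for illustration only.
# Rectangular format: each row is one case (a precinct),
# each column is one variable.
votes <- data.frame(
  ward     = c(1, 1, 2),
  precinct = c(1, 2, 1),
  voters   = c(120, 95, 143),
  ballots  = c(80, 61, 97)
)

# Analysis is kept separate from data storage:
# derived quantities are computed in code, not typed into the table.
votes$turnout <- votes$ballots / votes$voters

# Summaries are computed on demand rather than stored
# as extra "total" rows mixed in with the cases.
ward_totals <- aggregate(cbind(voters, ballots) ~ ward, data = votes, FUN = sum)

# A codebook can itself be rectangular: one row per variable.
codebook <- data.frame(
  variable    = c("ward", "precinct", "voters", "ballots"),
  description = c("Ward number",
                  "Precinct number within the ward",
                  "Registered voters (hypothetical)",
                  "Ballots cast (hypothetical)")
)

print(votes)
print(ward_totals)
```

Keeping totals and derived columns out of the stored table means every stored row stays a single case, which is what tools such as dplyr and ggplot2 expect as input.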

Tuesday

  1. Project check-in: sharing ideas and one-on-ones

  2. The graphics-data interface
    1. graphing concepts: slides [Danny]
    2. Activity: data <-> graphs [Danny] PDF with the graphs
  3. Introduction to ggplot2 – turning the concepts into pictures with the grammar of graphics [Randy]

    1. slides
    2. Dope Sheets
  4. R parts of speech [Danny]. slides

  5. Working with data – 5 verbs of dplyr [Jo]
    Getting your data into the right shape for graphing or analysis

    1. slides for presentation

    2. Worksheet for presentation; markdown file also available (and on the Calvin server)

    3. Sample solution Rmd and Sample solution html

Wednesday

  1. More data wrangling with dplyr and tidyr

  2. Quick Shiny interlude Example 1; Shiny resources [Nick]

  3. Ingesting your own data and next steps for your project

  4. Quick demo of GitHub.

  5. Optional Tutorials

    1. Shiny [Danny]
    2. More ggplot2 [Randy]
  6. 4 pm: Seminar with Hilary Parker (from Stitch Fix and Not So Standard Deviations)

Thursday

  1. Opening (quick) interludes

    1. Brief status reports
    2. Using dplyr with an SQL database
    3. Making data available to others (esp. students) via the web. Notes
    4. Writing functions in R
  2. Working on projects

  3. Optional Tutorials

    1. GitHub [Randy] @ 11 am
    2. Maps with Leaflet [Danny] @ 1 pm. Examples of widgets and details on leaflet
    3. Linear Modeling [Nick] @ 2:30 pm
  4. Dinner at Buca di Beppo at 6:00pm

Friday

  1. Please fill out the Project Report Form

  2. Live Project Reports (in parallel)

  3. Closing thoughts and goodbyes (finish at 3:00pm)

Things we did this week

Additional Topics TBD

In the second half of the workshop, we will offer optional tutorials on topics of interest to several workshop participants. Here are some potential topics, but don’t be afraid to suggest others. If there is a name listed by a topic, talk to that person to find out more.

  1. GitHub [Randy]
  2. Shiny [Danny, Nick, or Jo] — Notes
  3. Leaflets, etc. [Danny]
  4. Maps [Danny]
  5. Machine learning [Danny]
  6. Text mining [Nick]
  7. Modeling in R [Nick]
  8. Writing Functions [Randy]
  9. More ggplot2 [Randy]
  10. ggvis [Nick or Randy]
  11. Data Scraping from the Web
  12. Getting data from a database
  13. Image Analysis
  14. Resampling [Nick]
  15. Basic Statistical Methods
  16. Linear models with “Less Volume, More Creativity” [Nick or Jo]
  17. Other topics as they arise

Thanks

  • Kudos to Kathy Sheldon for her assistance with logistics
  • HHMI support