CVC 2017 Working Schedule

Computation and Visualization Consortium

This is cvc.mosaic-web.org/Summer2017/schedule2017.html

Resources

Provide Feedback

We’ll use this Google Doc to record your questions and comments and to provide responses. Feel free to edit it at any time before, during, or (shortly) after the workshop.

Project ideas and updates

We’ll use this Google Doc to share information about project ideas and updates.

Working Schedule

This schedule will be updated throughout the workshop. All items should be considered tentative until they happen since we will adjust things to meet the needs and wishes of the participants.

Sunday

Arrival day for many coming from out of town.

  • Details TBA

Monday

  1. Welcome (9am)
    1. Welcome to CVC/Macalester [Danny]
    2. Goals and outline for the week [Randy]
  2. Introductions [Randy]
    1. Activity: Split up into groups of three or four and introduce yourselves.
    2. Google Doc for second part.
  3. A First Case Study using Medicare Data

    1. Thinking with data [Nick]

    2. Curriculum Guidelines for Data Science [Nick]

    3. Slides [Danny]

  4. Introduction to R and RStudio [Randy]

  5. RMarkdown [Nick]

    1. Creating PDF/HTML/Word documents in RStudio
    2. TISE paper by Baumer et al.
    3. RMarkdown cheatsheet
    4. Example: R Markdown example; R Markdown formatted (Student evaluations example from Open Intro)

    5. Activity: Write and publish a description of some straightforward, data-oriented topic you’d like to present in class.
      • Include at least 1 plot (even if unrelated) to make sure you know how to do that.
      • Use the mosaic fancy html template as a starting point
  6. How to organize your data [Danny]

    1. Introduction slides, but the basics are:
      • Rectangular format: cases (rows) and variables (columns)
      • Separate your analysis from data storage.
      • Use a codebook to describe your cases and variables in detail.
    2. Activity: The spreadsheet here contains data on the Minneapolis 2013 election by ward and precinct. Identify the elements of this spreadsheet that are not in standard rectangular data format.
  7. Homework: Think about your project (and add it to the following Google Doc) [Nick]
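
The data-organization basics from "How to organize your data" can be sketched in a few lines of base R. This is a minimal illustration, not the workshop's own material: the data frame `votes`, its columns, and all numbers are made up for the example and are not taken from the Minneapolis spreadsheet.

```r
# Hypothetical precinct-level counts (made-up numbers, for illustration only).
# Rectangular format: one row per case (a precinct), one column per variable.
votes <- data.frame(
  ward     = c(1, 1, 2, 2),
  precinct = c(1, 2, 1, 2),
  voters   = c(120, 95, 210, 180)
)

# A codebook, kept alongside the data, describes each variable.
codebook <- data.frame(
  variable    = c("ward", "precinct", "voters"),
  description = c("Ward number",
                  "Precinct number within the ward",
                  "Number of ballots cast (hypothetical)")
)

# Analysis stays separate from storage: ward totals are computed when needed
# rather than stored as extra "total" rows inside the data itself.
ward_totals <- aggregate(voters ~ ward, data = votes, FUN = sum)
print(ward_totals)
```

Keeping totals out of the stored data means every row of `votes` is the same kind of thing (a precinct), which is what makes later graphing and wrangling straightforward.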

Tuesday

  1. Project check-in
    1. sharing ideas and one-on-ones [Nick]
    2. update the Google Doc
  2. More graphics in R with ggformula [Randy and Danny]

    1. The graphics-data interface
    2. Expanding the ggformula toolkit [Randy]

    3. Brief Intro to ggplot2 [Randy]

  3. R parts of speech and “Chaining Syntax” [Danny]
    1. slides
    2. exercise
  4. Working with data – 5 verbs of dplyr [Nick]
    1. Goal: Get your data into the right shape for graphing or analysis

    2. slides for presentation

    3. tutorials
    4. Building precursors to data wrangling: https://arxiv.org/abs/1401.3269

    5. Useful online resource: downloadable chapter on data wrangling from http://mdsr-book.github.io/
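
As a rough sketch of what "getting your data into the right shape" can look like, the example below chains five commonly cited dplyr verbs (`filter()`, `select()`, `mutate()`, `summarise()`, `arrange()`) with the pipe. It assumes the dplyr package is installed; the data frame and its values are hypothetical, and the session's own slides may group or name the verbs differently.

```r
library(dplyr)

# Hypothetical data (made-up values), one row per precinct.
votes <- data.frame(
  ward       = c(1, 1, 2, 2),
  precinct   = c(1, 2, 1, 2),
  voters     = c(120, 95, 210, 180),
  registered = c(300, 250, 400, 450)
)

ward_summary <- votes %>%
  filter(registered > 0) %>%                  # keep rows that meet a condition
  select(ward, voters, registered) %>%        # keep only the needed columns
  mutate(turnout = voters / registered) %>%   # add a derived variable
  group_by(ward) %>%                          # summaries are computed per ward
  summarise(mean_turnout = mean(turnout),
            total_voters = sum(voters)) %>%   # collapse to one row per ward
  arrange(desc(total_voters))                 # sort wards by total voters

print(ward_summary)
```

Each step takes a data frame and returns a data frame, which is what lets the steps be chained in reading order.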

Wednesday

  1. More data wrangling with dplyr and tidyr [Randy]

    1. spread(), gather(), joins, etc.

    2. A data tidying case study: tidydata-TB-tutorial.Rmd

  2. Ingesting your own data and next steps for your project

  3. Teasers

    1. Shiny Example 1; Shiny resources [Nick]

    2. GitHub [Randy]

    3. learnr (with hints and checked answers and bells and whistles, see https://rstudio.github.io/learnr and https://rstudio.github.io/learnr/exercises.html#hints-and-solutions) [Danny]

    4. Leaflets (leaflet.Rmd, see also https://rstudio.github.io/leaflet) [Nick]

  4. Welcome to 2nd-half participants

  5. Optional Tutorials

Thursday

  1. Opening (quick) interludes

    1. Brief status reports
    2. Using dplyr with an SQL database
    3. Simple text mining for fun and profit
    4. Writing functions in R
    5. Googlesheets [Randy or Danny]
  2. Working on projects

  3. Optional Tutorials, TBD

  4. Using GitHub and Making Packages

Friday

  1. Please fill out the Project Report Form

  2. How to make a website (GitHub edition)
  3. Breakout sessions
    1. Shiny
    2. Web scraping
    3. Machine Learning
    4. Modeling in R
  4. Project Reports

  5. Closing thoughts and good-byes

Things we did this week

Here is a sampler of things people worked on during their “project time” this week.

  • USC Dornsife/LA Times Daybreak Election Poll
    • data available online
    • time series of overall trends Clinton vs Trump
    • to do: add in covariates from the data
  • Tutorials for lower division Science courses to show how stats can be used there
    • Chem, Bio, Env Science
  • Wrangle data for a longitudinal study

  • Animation of seismic waves passing through the US (EarthScope data)
    • ggmap for mapping
    • ffmpeg to stitch stills into a movie
  • Earthquake epicenter mapper with additional data overlays

  • Tutorial for intro stats class
    • learned to think about goals and flow of tutorial
    • first week of class tutorial – “run and adjust”
  • Migrating from SAS to R
    • love graphics in R
    • hoping to move more and more analyses to R
  • Bike share data in DC & Twin Cities
    • visualization of usage over time (and season)
  • HMP Explorer
    • Human Microbiome data
  • Two Shiny apps
    • Coin Flipping simulations
    • Goodness of Fit Test
  • Maps using R

  • Learn to make tutorials with
    • mathematical formulas
    • visualization of t-test
    • solution code
    • sliders to adjust plots and other output
    • exercises/quizzes
  • Visualization of Interlibrary Loan data

  • Make better connections between Intro Biostats and Bio Course
    • data wrangling for data that will be used
    • shiny app with repository of example analyses
  • Working with data from CDC survey on behavior risk

Additional Topics TBD

In the second half of the workshop, we will offer optional tutorials on topics of interest to several workshop participants. Here are some potential topics, but don’t be afraid to suggest others. If there is a name listed by a topic, talk to that person to find out more. (Note that Nick’s last day is Thursday!)

  1. GitHub [Randy]
  2. learnr [Danny]
  3. Shiny [Miles] — Notes
  4. Leaflets, Maps [Leaflets]
  5. Machine learning [Danny]
  6. Text mining [Nick]
  7. Modeling in R [Nick]
  8. Writing Functions [Randy]
  9. More ggplot2 [Randy]
  10. ggvis [Nick]
  11. ggplotly [Randy or Danny]
  12. Animated GIFs [Miles]
  13. Data Scraping from the Web [Miles or Danny]
  14. Getting data from a database [Nick]
  15. Resampling [Nick]
  16. Workflow for R Markdown in class and with students [Nick]
  17. Linear models with “Less Volume, More Creativity” [Nick]
  18. Creating R packages [Randy]
  19. R with LaTeX (similar to RMarkdown)
  20. Other topics as they arise

Thanks

  • HHMI support