A very large part of the use and presentation of data draws on a small set of concepts and techniques. Individually, these are not difficult, and each can be taught as a simple manoeuvre. In this way they are like Lego bricks. The complexity of data use and presentation comes from combining these concepts and techniques in various ways to achieve our specific purposes, just as an elaborate model can be built out of simple blocks.
The individual Lego bricks are simple. [1] | A city made by arranging Lego bricks. [2] |
---|---|
We’re going to start today with the infrastructure for these techniques:
In coming weeks, we will study
brosenberg). The initial password is the last 4 digits of your student ID number. This can be changed. Instructions here.
Packages: Run the script in install_packages.R
source("http://dtkaplan.github.io/DCF-Course-2014/Notes/Week-1/install_packages.R")
devtools::install_github("dtkaplan/DCFdevel")
devtools::install_github("dtkaplan/DCFinteractive")
DCF Homepage is on Moodle.
An FAQ for the course: http://dtkaplan.github.io/DCF-Course-2014/Notes/FAQ.html. For questions of a general methodological nature, e.g. “How do I rename a variable?”, please post the question on the Disqus discussion site.
Data sets we will access regularly for examples.
Data set | Description |
---|---|
BabyNames | Names of children as recorded by the US Social Security Administration |
CountryCentroids | Geographic location of countries |
CountryData | Many variables on countries from the 2014 CIA factbook |
CountryGroups | Membership in Country Groups |
DirectRecoveryGroups | |
MedicareCharges | |
MedicareProviders | |
MigrationFlows | Human Migration between Countries |
Minneapolis2013 | Ballots in the 2013 Mayoral election in Minneapolis |
NCI60 | Gene expression in cancer |
NCI60cells | Cell Line descriptions in the NCI-60 dataset |
WorldCities | Cities and their populations |
ZipDemography | Demographic information for most US ZIP Codes (Postal Codes) |
ZipGeography | Geographic information by US ZIP Codes (Postal Codes) |
registeredVoters | A sample of the voter registration list for Wake County, North Carolina in Fall 2010 |
Using the DCF template file for Rmd.
Create an Rmd file named Class-1.Rmd. Eventually, you will upload your HTML file to Moodle, under In-class, Week 1.
Markdown for …
TASK: Create a narrative description of your classes this term. Include links to the Moodle site, links to a relevant Wikipedia (or other) article, and an embedded figure (perhaps from Wikipedia).
Basics: Cases, Variables, rows, columns, quantitative, categorical
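As a minimal sketch of these terms in R: the table below is made up purely for illustration (the course names, departments, and numbers are hypothetical and not taken from any course data set). Each row is one case, each column is one variable, and a variable is either quantitative or categorical.

```r
# A small made-up table: each row is a case, each column is a variable.
# All values here are hypothetical, for illustration only.
classes <- data.frame(
  course     = c("Data Computing", "Statistics", "Ceramics"),  # categorical
  department = c("MATH", "MATH", "ART"),                        # categorical
  credits    = c(2, 4, 4),                                      # quantitative
  enrollment = c(24, 30, 12),                                   # quantitative
  stringsAsFactors = FALSE
)

n_cases     <- nrow(classes)   # number of cases (rows)
n_variables <- ncol(classes)   # number of variables (columns)

# Label each variable by how R stores it: numeric columns are treated
# here as quantitative, everything else as categorical.
variable_type <- ifelse(sapply(classes, is.numeric),
                        "quantitative", "categorical")
print(variable_type)
```

Storage type is only a rough guide. A ZIP code, for example, may be stored as a number even though it is really a categorical label, so the final judgement belongs to the person who knows what the variable means.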
Divide into groups and put your answers to the following in these spreadsheets: Group-1, Group-2, Group-3, Group-4, Group-5, Group-6
Make a separate tab for each table. Hand in your work here on Moodle. Work on the spreadsheets together with your group, but each student should individually create and hand in an Rmd -> HTML document containing links to those spreadsheets (or the spreadsheets themselves).
In the 1880s, Francis Galton started to make a mathematical theory of evolution.
Here’s part of a page from his lab notebook.
Here’s the original, untidy spreadsheet.
When done, upload it to Moodle, under In-class activity: RMD -> HTML.
Start on Assignment 1.
AssignmentOne-XXX.Rmd (where XXX is your initials).
We’ll be done when you’ve uploaded at least the first part of your assignment to Moodle. You can refine it later.
[1] Source: “Lego Color Bricks” by Alan Chia - Lego Color Bricks. Licensed under CC BY-SA 2.0 via Wikimedia Commons.
[2] Source: Trafalgar Legoland 2003 by Kaihsu Tai - Kaihsu Tai. Licensed under CC BY-SA 3.0 via Wikimedia Commons.