UATX Computing Tutorials: Data and Modeling

Author

Daniel Kaplan

Published

December 10, 2024


Preface


Many college-level students are expected to study a semester of statistics. The conventional approach—a course about correlations, distributions, and null-hypothesis testing—long ago became obsolete. That approach was designed for benchtop experimenters measuring a small handful of variables on a small number of specimens.

Today, large numbers of workers in many sectors of the economy have to work with and make sense of large-scale, potentially complicated data, often collected outside of an experimental setting. The tools needed for this work are hardly touched on in the conventional course. Perhaps this is due to the inherent conservatism of academia and the lack of incentives for meaningful innovation. This book is one attempt to improve the situation and bring statistics education into line with contemporary motivations and needs for working with data.

The University of Austin (UATX) was founded in 2024 out of a concern that many aspects of the university system have become dysfunctional, the shortcomings of the statistics (and mathematics) curriculum being just one part of the problem. The UATX founders, a combination of entrepreneurs and professors with extensive experience with the university system, identified another structure that tends to weaken what a student can get out of university study. This is the cafeteria approach to assembling a student's early college experience. Naturally, students like to have control over their choice of courses. Similarly, faculty like to be free to do what they want and to avoid anything that impinges on their autonomy. But there are great advantages to faculty working together and being accountable for the way courses are designed and run. Similarly, there are benefits to students following a carefully planned curriculum and having learning experiences in common. And for instructors of non-introductory courses, it helps greatly to know what students have previously studied and to build on and reinforce that earlier material.

This short book was written as part of the UATX core curriculum. All students are required to take two “quantitative reasoning” core courses, the second of which (“QR2”) is about working with and interpreting data. Since the course is required for all students, with no prerequisite, it needs to be accessible and motivating for all. It’s hard to do this with a conventional textbook. Instead, many of the readings for the course come from contemporary practitioners writing for a general audience.

Data and statistics are intimately tied to computing. So part of QR2 involves genuine computing with data. Consequently, all students are introduced to a powerful, professional, but compact and accessible language and system for data computing. This set of tutorials provides that introduction. Exercises help to establish and reinforce what was introduced in the tutorials.

The QR2 curriculum is divided into six topics, each focusing on a particular set of data computing tasks. For each topic, there is a corresponding tutorial.

To the student

In technical subjects, doing is at the heart of learning. The tutorials are interactive. They require only a standard web browser; the computing is built in, with immediate access to professional levels of computing organized around a compact and consistent style that is straightforward to learn from examples. Every tutorial, and the exercises that follow, provide a way for students to easily collect their work and send it to an instructor. This way, students can demonstrate what they have done (and be accountable for doing it!) and get feedback. Each tutorial comes with a set of exercises to practice and reinforce the skills covered.

Computing in the tutorials is done with the widely used R language. In the tutorials, you will find many “Interactive R Chunks” which have been pre-populated with working R commands. These commands are based on a carefully designed dialect of R developed specifically to smooth the path for the introductory student. Here’s a simple example:

Active R chunk 1

You will need to run these chunks in order to see the result. But you should also play around with the commands. For instance, in Tutorial 1 you can try out different variables than those initially used in the R chunk, different model annotations, different summary statistics, and so on. This “play” will be self-directed. Try out your ideas, look for anomalies, … explore!

To encourage this sort of play, there is an incentive. Each time a chunk is run, a record is kept. This record will be graded. The grade is not about getting things right; it’s about rewarding you for putting time and attention into the tutorial.

It is up to you to submit your play/work with the tutorials. This is straightforward, but submission requires your active participation.

At the end of each tutorial, there is a button that you press to initiate the submission of your work. Like this:

No answers yet collected

Pressing the button will collect all of the R commands given. In the exercises, the collection will include your multiple-choice and essay answers as well as the R commands you executed in Interactive R Chunks. The button will copy all this to your computer’s clipboard. Once copied, you will log in to a course web page and paste your clipboard contents.

When you close a tutorial or exercise document, your work will be erased. (This shortcoming will be addressed in future editions of the tutorials.) So it’s very important to submit your work before closing the document. If you come back to the document later on, you can submit your work in that session to augment your previous submission(s).
