Stats for Data Science
An MAA mini-course at JMM 2020, Daniel Kaplan

Stats for, not as, data science

Data science is not merely a rebranding of statistics.

It is plausible that statisticians could have led the development of data science, but historically computer scientists and people from fields such as genetics, marketing, public health, medicine, and remote sensing have played crucial roles.

Whether or not data science ought to be considered part of the mathematical sciences, any genuine approach should be grounded in realistic applications and in the kinds of problems, especially decision making, that data science is actually used to address.

For concise introductions to wrangling and visualization, see Statistical Inference via Data Science by Chester Ismay and Albert Y. Kim or Data Computing by Daniel Kaplan and Matthew Beckman.

I don’t know of a concise introduction to decision making from the statistics, mathematics, or computer science perspectives. (Please tell me if you do know of one!) But if you are willing to wade into the business literature, you would do well with How to Measure Anything by Douglas Hubbard. It even has a workbook.


MAA mini-course evaluation