Glyphs, Frames, and Scales

Data Computing

June 13, 2017

Glyphs and Data

In its original sense, in archeology, a glyph is a carved symbol.

(Images: a hieroglyph and a Mayan glyph)

Data Glyph

A data glyph is also a mark: a drawn symbol, such as a dot in a scatter plot, that stands for one case.

The features of a data glyph encode the values of variables.

See: http://docs.ggplot2.org/current/

Data Glyph Properties: Aesthetics

Aesthetics are visual properties of a glyph.

## Warning: Using size for a discrete variable is not advised.

Why “Aesthetic”?

Some Graphics Components

glyph
The basic graphical unit that represents one case. Other terms used include mark and symbol.
aesthetic
A visual property of a glyph, such as position, size, shape, or color.
scale
A mapping that translates data values into aesthetics.
frame
The position scale describing how data are mapped to x and y.
guide
An indication for the human viewer of the scale. This allows the viewer to translate aesthetics back into data values.

Scales

A scale is the relationship between a variable's value and the value of the aesthetic that the variable is mapped to.

The conversion from SBP to position is a scale.

The conversion from Smoker to color is a scale.
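Both conversions can be sketched in ggplot2. This is a minimal sketch, assuming a small data frame (named `bp` here purely for illustration) that holds the six example cases from this deck's glyph-ready table. The axis limits and the color names are arbitrary choices, not values taken from the original graphic.

```r
library(ggplot2)

# Hypothetical data frame holding the six example cases from this deck.
bp <- data.frame(
  sbp    = c(129, 105, 122, 128, 123, 122),
  dbp    = c(75, 62, 72, 83, 90, 77),
  smoker = c("never", "never", "never", "former", "former", "current")
)

p <- ggplot(bp, aes(x = sbp, y = dbp, color = smoker)) +
  geom_point() +
  # Position scale: SBP values from 100 to 140 span the horizontal axis.
  scale_x_continuous(limits = c(100, 140)) +
  # Color scale: each smoker level is translated into one named color.
  scale_color_manual(values = c(never   = "steelblue",
                                former  = "orange",
                                current = "firebrick"))
p
```

Changing either `scale_*` call changes how the same data values are drawn, without touching the data or the mapping.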

Guides

Guide: an indication to a human viewer of what the scale is.
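In ggplot2, axes and legends are the usual guides. The sketch below assumes a hypothetical data frame `bp` with the six example cases from this deck; the tick spacing and the legend title are illustrative choices only.

```r
library(ggplot2)

# Hypothetical data frame holding the six example cases from this deck.
bp <- data.frame(
  sbp    = c(129, 105, 122, 128, 123, 122),
  dbp    = c(75, 62, 72, 83, 90, 77),
  smoker = c("never", "never", "never", "former", "former", "current")
)

p <- ggplot(bp, aes(x = sbp, y = dbp, color = smoker)) +
  geom_point() +
  # Axis guide: tick marks every 10 units let the viewer read SBP from position.
  scale_x_continuous(breaks = seq(100, 130, by = 10)) +
  # Legend guide: shows which color stands for which smoker level.
  guides(color = guide_legend(title = "Smoking status"))
p
```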


Facets – using x and y twice
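One way to read "using x and y twice": faceting splits the cases into panels, and each panel repeats the same x/y frame. A minimal sketch with `facet_wrap()`, assuming a hypothetical data frame `bp` that holds the six example cases from this deck:

```r
library(ggplot2)

# Hypothetical data frame holding the six example cases from this deck.
bp <- data.frame(
  sbp = c(129, 105, 122, 128, 123, 122),
  dbp = c(75, 62, 72, 83, 90, 77),
  sex = c("male", "female", "male", "female", "male", "male")
)

# One panel per level of sex; every panel uses the same x and y frame.
p <- ggplot(bp, aes(x = sbp, y = dbp)) +
  geom_point() +
  facet_wrap(~ sex)
p
```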

Designing Graphics

Graphics are designed by the human expert (you!) in order to reveal information that’s latent in the data.

Design choices

More details, …, e.g. setting of aesthetics to constants
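Setting an aesthetic to a constant differs from mapping a variable to it: a constant goes outside `aes()` and applies to every glyph. A minimal sketch, assuming a hypothetical data frame `bp` with the six example cases from this deck; the color "blue" and size 3 are arbitrary.

```r
library(ggplot2)

# Hypothetical data frame holding the six example cases from this deck.
bp <- data.frame(
  sbp    = c(129, 105, 122, 128, 123, 122),
  dbp    = c(75, 62, 72, 83, 90, 77),
  smoker = c("never", "never", "never", "former", "former", "current")
)

# Mapped: color is inside aes(), so it varies with the smoker variable.
p_mapped <- ggplot(bp, aes(x = sbp, y = dbp)) +
  geom_point(aes(color = smoker))

# Constant: color and size are outside aes(), so every glyph looks the same.
p_constant <- ggplot(bp, aes(x = sbp, y = dbp)) +
  geom_point(color = "blue", size = 3)
```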

Good and Bad Graphics

Remember …

Graphics are designed by the human expert (you!) in order to reveal information that’s latent in the data.

Your choices depend on what information you want to reveal and convey.

Learn by reading graphics and determining which ways of arranging things are better or worse.

A basic principle is that a graphic is about comparison. Good graphics make it easy for people to perceive things that are similar and things that are different. Good graphics put the things to be compared “side-by-side”, that is, in perceptual proximity to one another.

Perception and Comparison

In roughly descending order of human ability to compare nearby objects:

  1. Position
  2. Length
  3. Area
  4. Angle
  5. Shape (but only a very few different shapes)
  6. Color

Color is the most difficult, because it is a 3-dimensional quantity.
- color gradients: we're good at these
- discrete colors: must be carefully selected
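The two cases correspond to different color scales in ggplot2. The sketch below assumes a hypothetical data frame `bp` with the six example cases from this deck; the gradient endpoints and the "Dark2" ColorBrewer palette are illustrative choices, not recommendations from the source.

```r
library(ggplot2)

# Hypothetical data frame holding the six example cases from this deck.
bp <- data.frame(
  sbp    = c(129, 105, 122, 128, 123, 122),
  dbp    = c(75, 62, 72, 83, 90, 77),
  smoker = c("never", "never", "never", "former", "former", "current")
)

# Gradient: a quantitative variable is mapped to a smooth light-to-dark ramp.
p_gradient <- ggplot(bp, aes(x = sbp, y = dbp, color = dbp)) +
  geom_point() +
  scale_color_gradient(low = "lightblue", high = "darkblue")

# Discrete: a categorical variable gets one color per level, drawn from a
# palette designed to keep the levels visually distinct.
p_discrete <- ggplot(bp, aes(x = sbp, y = dbp, color = smoker)) +
  geom_point() +
  scale_color_brewer(palette = "Dark2")
```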

Count the ways this graphic is bad

## Warning: Using size for a discrete variable is not advised.

Glyph-Ready Data

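Glyph-ready data has this form: one row for each glyph to be drawn, with the variables that will be mapped to aesthetics as columns.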

Glyph-ready data

##   sbp dbp    sex  smoker
## 1 129  75   male   never
## 2 105  62 female   never
## 3 122  72   male   never
## 4 128  83 female  former
## 5 123  90   male  former
## 6 122  77   male current
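For experimenting, the six rows above can be rebuilt by hand. The name `bp` is a placeholder chosen here; the source does not name the data frame.

```r
# Rebuild the six printed rows as a data frame (the name `bp` is hypothetical).
bp <- data.frame(
  sbp    = c(129, 105, 122, 128, 123, 122),
  dbp    = c(75, 62, 72, 83, 90, 77),
  sex    = c("male", "female", "male", "female", "male", "male"),
  smoker = c("never", "never", "never", "former", "former", "current")
)

nrow(bp)   # number of cases, and so the number of glyphs to draw
names(bp)  # the variables available for mapping to aesthetics
```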

Mapping of data to aesthetics

   sbp -> x      
   dbp -> y     
smoker -> color
   sex -> shape

Scales determine the details of the data -> aesthetic translation.
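In ggplot2 this mapping is written with `aes()`. A minimal sketch, assuming the six example cases are stored in a data frame called `bp` (a hypothetical name) and that the glyph is a point:

```r
library(ggplot2)

# Hypothetical data frame holding the six example cases from this deck.
bp <- data.frame(
  sbp    = c(129, 105, 122, 128, 123, 122),
  dbp    = c(75, 62, 72, 83, 90, 77),
  sex    = c("male", "female", "male", "female", "male", "male"),
  smoker = c("never", "never", "never", "former", "former", "current")
)

# Each argument of aes() pairs one aesthetic with one variable.
p <- ggplot(bp, aes(x = sbp, y = dbp, color = smoker, shape = sex)) +
  geom_point()
p
```

No scale is specified here, so ggplot2 falls back on its default scales for position, color, and shape.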

Layers – building up complex plots

Each layer may have its own data, glyphs, aesthetic mapping, etc.
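As an illustration, the sketch below stacks two point layers: one draws the individual cases, the other draws group means from a separate data frame. The names `bp` and `means`, and the choice of cross-shaped glyphs for the means, are assumptions made for this example.

```r
library(ggplot2)

# Hypothetical data frame holding the six example cases from this deck.
bp <- data.frame(
  sbp    = c(129, 105, 122, 128, 123, 122),
  dbp    = c(75, 62, 72, 83, 90, 77),
  smoker = c("never", "never", "never", "former", "former", "current")
)

# Second data set: mean sbp and dbp for each smoker level (one row per level).
means <- aggregate(cbind(sbp, dbp) ~ smoker, data = bp, FUN = mean)

p <- ggplot(bp, aes(x = sbp, y = dbp, color = smoker)) +
  # Layer 1: one small point per case, using the plot's own data.
  geom_point() +
  # Layer 2: one large cross per group, using its own data frame.
  geom_point(data = means, shape = 4, size = 5)
p
```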

Stats: Data Transformations

##   sbp dbp    sex smoker
## 1 129  75   male  never
## 2 105  62 female  never
## 3 122  72   male  never
## 4 128  83 female former
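As one illustration of a stat, a bar chart does not draw the rows themselves: `geom_bar()` first counts the cases in each category and then draws one bar per count. The sketch assumes the four rows printed above are stored in a hypothetical data frame `bp`; this particular stat is chosen here as an example and is not necessarily the one the original slide showed.

```r
library(ggplot2)

# The four rows printed above (the name `bp` is hypothetical).
bp <- data.frame(
  sbp    = c(129, 105, 122, 128),
  dbp    = c(75, 62, 72, 83),
  sex    = c("male", "female", "male", "female"),
  smoker = c("never", "never", "never", "former")
)

# geom_bar() applies a counting stat: 4 cases become 2 bars.
p <- ggplot(bp, aes(x = smoker)) +
  geom_bar()

# The transformed data that is actually drawn: former = 1, never = 3.
layer_data(p)[, c("x", "count")]
```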

What’s Next

  1. Eye-training

    • recognize and describe glyphs, aesthetics, scales, etc.
    • identify data required for a plot
  2. Data wrangling

    • get data into glyph-ready format (dplyr, tidyr)
  3. Graphics construction

    • start with: map variables to aesthetics interactively with scatterGraphHelper(), barGraphHelper(), densityGraphHelper()
    • move on to: describe data, glyphs, aesthetics, etc. to R using ggplot2