Appendix: R Programming Style Guide

This style guide serves to reconcile and extend two popular sources:

18.16 Document Organization

It’s a good habit to establish a template for consistent document organization. For example, the following elements are convenient to group as “Front Matter” at the beginning of the document.

  1. clean-up R environment: rm(list = ls())
  2. packages, etc: library() and source() statements
  3. user-defined functions: function() { ... }
  4. source data intake: read_csv(), etc
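
A minimal sketch of such a front matter block follows. The helper std_error(), the object name RawData, and the temporary CSV file are hypothetical and exist only so the sketch runs on its own; in a real document, step 4 would read the actual source file.

```r
# 1. clean up the R environment
rm(list = ls())

# 2. packages, etc.
library(readr)

# 3. user-defined functions
# hypothetical helper: standard error of the mean, ignoring missing values
std_error <- function(x) {
  sd(x, na.rm = TRUE) / sqrt(sum(!is.na(x)))
}

# 4. source data intake
# a small temporary file stands in for the real source data (assumption)
raw_path <- tempfile(fileext = ".csv")
write_csv(data.frame(id = 1:3, quantity = c(2, 4, 6)), raw_path)
RawData <- read_csv(raw_path)
```

Grouping these steps at the top means a reader can see every dependency and data source before reaching any analysis code.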

18.17 Whitespace and New Lines

When to begin a new line. Long lines of code can be more challenging to read efficiently.

  • after ~80 characters. Configure the RStudio IDE to display a reference line at 80 characters: RStudio >> Tools >> Global Options >> Code >> Display >> “Show margin”
  • after chain operators (e.g. %>%, +)
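
As an illustration of both rules, the sketch below writes the same pipeline twice using the built-in mtcars data. The object names SummaryLong and Summary and the column name avgMpg are made up for this example.

```r
library(dplyr)

# Harder to scan: a single line that runs past the 80-character margin
SummaryLong <- mtcars %>% filter(cyl > 4) %>% group_by(gear) %>% summarise(avgMpg = mean(mpg))

# Easier to scan: begin a new line after each chain operator
Summary <-
  mtcars %>%
  filter(cyl > 4) %>%
  group_by(gear) %>%
  summarise(avgMpg = mean(mpg))
```

Both versions produce the same result; only the layout differs.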

Blank lines are useful to separate command sequences just as paragraphs are used to separate sentence sequences.

  • before and after a sequence of chained commands
  • before and after a group of commands unified toward some goal

Comments are useful to include informative remarks about the code in plain language.

  • Entire commented lines should begin with # and one space before remarks

  • End-of-line comments appended to a line of code: precede with two spaces, #, and then one space before remarks. Additional spacing can be used to align at the # in case of several end-of-line comments in close succession.
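
The short sketch below shows whole-line comments, end-of-line comments aligned at the #, and a blank line separating two groups of commands. The temperature values and object names are invented for illustration.

```r
# Convert temperatures from Fahrenheit to Celsius
tempF <- c(32, 68, 212)
tempC <- (tempF - 32) * 5 / 9

# Summarize the converted values
avgC <- mean(tempC)      # arithmetic mean
rangeC <- range(tempC)   # minimum and maximum
```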

In-line spacing.

  • Place spaces around all infix operators including =, +, -, <-, etc., with the exception of : and ::
  • Always put a space after a comma, but never before
  • Place a space before left parentheses, except in a function call
  • Do not place spaces around code in parentheses or square brackets (unless there’s a comma, in which case see above).
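
The spacing rules can be compared side by side in a small sketch. The vector x and the names y and avg are arbitrary; both versions run and give the same values, so the difference is purely one of readability.

```r
x <- c(4, 8, 15, 16, 23, 42)

# Harder to read: no spaces around operators or after commas
y<-x[2:4]*2+1
avg<-base::mean(x,na.rm=TRUE)

# Preferred: spaces around <-, *, +, and =; none around : or ::
y <- x[2:4] * 2 + 1
avg <- base::mean(x, na.rm = TRUE)

# A space goes before ( after if, but not in the function call message()
if (avg > 10) {
  message("average exceeds 10")
}
```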

Curly braces: { and }

  • { an opening curly brace should never go on its own line and should always be followed by a new line
  • } a closing curly brace should always go on its own line, unless it’s followed by else
  • Indent the code inside curly braces. Use progressive indenting for nested curly braces.
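
A small sketch of these brace conventions, using a hypothetical helper count_signs() that tallies positive, negative, and zero values:

```r
# hypothetical helper: count positive, negative, and zero values
count_signs <- function(values) {
  counts <- c(positive = 0, negative = 0, zero = 0)
  for (v in values) {
    if (v > 0) {
      counts["positive"] <- counts["positive"] + 1
    } else if (v < 0) {
      counts["negative"] <- counts["negative"] + 1
    } else {
      counts["zero"] <- counts["zero"] + 1
    }
  }
  counts
}

Counts <- count_signs(c(-1, 0, 2, 5))
```

Each opening brace ends its line, each closing brace starts its own line except before else, and every nested level is indented one step further.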

18.18 Object Naming

Naming can be surprisingly challenging. Prioritize names that convey the intuitive meaning of the object, and then make the name short if possible.

  • Use a lowercase letter to begin variable, vector, and scalar names; use a CAPITAL letter to begin data frame, list, array, and matrix names
  • Use CamelCase or Snake_case
  • Avoid object names that coincide with popular functions (e.g. if you calculate a mean and store it as an object, name the object avg, NOT mean)
  • Identical object names should never refer to fundamentally different objects at different places in the same code script
  • Always use the <- assignment operator for object assignment (not =)
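
For example, the naming conventions might look like this in practice. The data values and the names heights, avg, and HeightData are invented for illustration.

```r
# lowercase first letter for scalars and vectors
heights <- c(1.62, 1.75, 1.80)
avg <- mean(heights)   # stored as avg, not mean, to avoid the function name

# CAPITAL first letter for data frames, lists, arrays, and matrices
HeightData <- data.frame(id = 1:3, height = heights)
```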

18.19 Other General Suggestions

Minimize nested syntax where possible:

  • Bad: Result <- select(filter(Data, condition), var1, var2)
  • Better: Result <- Data %>% filter(condition) %>% select(var1, var2)
  • Best:
    Result <- 
      Data %>%
      filter(condition) %>% 
      select(var1, var2)

Compound mutate and summarise statements should begin a new line after each comma. For example:

Result <- 
  Data %>%
  group_by(category) %>%
  summarise(N = n(), 
            Average = mean(quantity), 
            StDev = sd(quantity))
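
The same layout applies to mutate. The sketch below is one possible illustration; the small data frame standing in for Data and the new column names doubled, logQuantity, and isLarge are hypothetical.

```r
library(dplyr)

# hypothetical data standing in for Data
Data <- data.frame(category = c("a", "a", "b"), quantity = c(2, 4, 9))

Result <-
  Data %>%
  mutate(doubled = quantity * 2,
         logQuantity = log(quantity),
         isLarge = quantity > 5)
```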

Always preserve access to the uncorrected/raw source data. One should never modify, change, or overwrite the only accessible copy of the source data. Doing so inadvertently is irresponsible and doing so intentionally is unethical.
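
One way to follow this rule in code is to read the raw file once, apply corrections to a separate object, and write any corrected data to a new file. The sketch below uses base R; the temporary files, the object names RawData and CleanData, and the cleaning rule (drop missing and negative quantities) are assumptions made only for illustration.

```r
# a temporary file stands in for the raw source data (assumption)
raw_path <- tempfile(fileext = ".csv")
write.csv(data.frame(id = 1:4, quantity = c(10, NA, 30, -5)),
          raw_path,
          row.names = FALSE)

# read the raw data once and leave that object untouched
RawData <- read.csv(raw_path)

# apply corrections to a separate object
keep <- !is.na(RawData$quantity) & RawData$quantity >= 0
CleanData <- RawData[keep, ]

# write corrected data to a NEW file; the raw file is never overwritten
clean_path <- tempfile(fileext = ".csv")
write.csv(CleanData, clean_path, row.names = FALSE)
```

Because RawData and the raw file are never reassigned or rewritten, every correction can be traced back to, and re-derived from, the original source.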

18.20 Resources & additional code examples (including good and bad):