This style guide serves to reconcile and extend two popular sources:
It’s a good habit to establish a template for consistent document organization. For example, the following elements are convenient to group as “Front Matter” at the beginning of the document.
- rm(list = ls())
- library() and source() statements
- function() { ... }
- read_csv(), etc.

When to begin a new line. Long lines of code can be more challenging to read efficiently. Begin a new line after operators such as %>% and +.

Blank lines are useful to separate command sequences just as paragraphs are used to separate sentence sequences.
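As a rough illustration of grouping front matter and separating command sequences with blank lines, a script might open as sketched below. Several details are assumptions made so the sketch runs without extra packages: base R's read.csv() on inline text stands in for read_csv(), and the names SummarizeScores, Scores, and ScoreSummary are hypothetical.

```r
# Front matter ------------------------------------------------------------

# clean up the environment
rm(list = ls())

# packages and sourced scripts (stats ships with base R)
library(stats)

# user-defined functions (hypothetical helper)
SummarizeScores <- function(x) {
  c(Average = mean(x), StDev = sd(x))
}

# data intake (inline text stands in for a real file path)
Scores <- read.csv(text = "student,score\nA,90\nB,80\nC,70")

# Analysis ----------------------------------------------------------------

ScoreSummary <- SummarizeScores(Scores$score)
```

Each blank line marks the end of one group of related commands, so a reader can skim the script section by section.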
Comments are useful to include informative remarks about the code in plain language.
- Entire commented lines should begin with # and one space before remarks.
- End of line comments appended to a line of code: precede with two spaces, #, and then one space before remarks. Additional spacing can be used to align at the # in case of several end of line comments in close succession.
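The comment conventions above might look like the following in practice. This is only a sketch; the variable names TempF and TempC are hypothetical.

```r
# Convert temperatures from Fahrenheit to Celsius
TempF <- c(32, 212)            # freezing and boiling points of water
TempC <- (TempF - 32) * 5 / 9  # two spaces, #, one space, then the remark
```

The first end of line comment carries extra spacing so that both # symbols line up.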
In-line spacing. Place one space on each side of operators such as =, +, -, <-, etc., with the exception of : and ::.
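A short sketch of these spacing conventions, assuming the usual reading that most operators are surrounded by single spaces while : and :: are not. The object names are hypothetical.

```r
# no spaces around : or ::
Index <- 1:5
Spread <- stats::sd(Index)

# one space on each side of <-, =, +, -, and similar operators
Centered <- Index - mean(Index)
Total <- sum(Centered + 1, na.rm = TRUE)
```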
Curly braces: { and }
- { : an opening curly brace should never go on its own line and should always be followed by a new line.
- } : a closing curly brace should always go on its own line, unless it is followed by else.
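For instance, a small function with an if/else branch could be laid out as follows. DescribeSign is a hypothetical helper written only to show brace placement.

```r
DescribeSign <- function(x) {
  if (x < 0) {
    "negative"
  } else {
    "non-negative"
  }
}
```

Every opening brace ends its line, and the only closing brace that shares a line with other code is the one followed by else.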
Naming can be surprisingly challenging. Prioritize names that represent the intuitive meaning of the object, and then make the name short if possible.
- Use CamelCase or Snake_case.
- Avoid reusing the names of existing functions (avg, NOT mean).
- Identical object names should never refer to fundamentally different objects at different places in the same code script.
- Use the <- assignment operator (not = for object assignment).

Minimize nested syntax where possible:
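# nested calls (avoid where possible)
Result <- select(filter(Data, condition), var1, var2)

# piped on a single line
Result <- Data %>% filter(condition) %>% select(var1, var2)

# piped with a new line after each %>%
Result <-
  Data %>%
  filter(condition) %>%
  select(var1, var2)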
Compound mutate() and summarise() statements should begin a new line after each comma. For example:
Result <-
  Data %>%
  group_by(category) %>%
  summarise(N = n(),
            Average = mean(quantity),
            StDev = sd(quantity))
Always preserve access to the uncorrected/raw source data. One should never modify or overwrite the only accessible copy of the source data. Doing so inadvertently is irresponsible, and doing so intentionally is unethical.
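One way to follow this principle is to read the raw file once, apply corrections to a copy, and save the corrected version under a different name. The sketch below is illustrative only: the file names, the temporary directory, and the rule that ages above 120 are data entry errors are all assumptions made for the example.

```r
# Hypothetical file locations (temporary files stand in for real paths)
RawPath <- file.path(tempdir(), "survey_raw.csv")
CleanPath <- file.path(tempdir(), "survey_clean.csv")

# Simulate a raw source file containing an obviously mistyped age
writeLines(c("id,age", "1,34", "2,430", "3,27"), RawPath)

# Read the raw data and leave that object untouched
RawData <- read.csv(RawPath)

# Apply corrections to a copy only (assumed rule: ages above 120 are errors)
CleanData <- RawData
CleanData$age[CleanData$age > 120] <- NA

# Save corrected data under a different name; the raw file is not overwritten
write.csv(CleanData, CleanPath, row.names = FALSE)
```

Because RawData and the raw file are never written to, every correction can be traced and reversed by rerunning the script.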