Helpers for specifying nodes in simulations

Mix two variables together. The output will have the specified R-squared with `signal` and the specified variance (`var`, 1 by default).

Evaluate an expression separately for each case

## Usage

```
categorical(n = 5, ..., exact = TRUE)
cat2value(variable, ...)
bernoulli(n = 0, logodds = NULL, prob = 0.5, labels = NULL)
mix_with(signal, noise = NULL, R2 = 0.5, var = 1, exact = FALSE)
each(ex)
block_by(block_var, levels = c("treatment", "control"), show_block = FALSE)
random_levels(n, k = NULL, replace = FALSE)
```

## Arguments

- n
The symbol standing for the number of rows in the data frame to be generated by `datasim_run()`. Just use `n` as a symbol; don't assign it a value. (That will be done by `datasim_run()`.)

- exact
If `TRUE`, make R-squared or the target variance exactly as specified.

- variable
A categorical variable.

- logodds
Numerical vector used to generate Bernoulli trials. Can be any real number.

- prob
An alternative to `logodds`. Values must be in `[0,1]`.

- labels
Character vector: names for categorical levels, also used to replace 0 and 1 in `bernoulli()`.

- signal
The part of the mixture that will be correlated with the output.

- noise
The rest of the mixture. This will be **uncorrelated** with the output only if you specify it as pure noise.

- R2
The target R-squared.

- var
The target variance.

- ex
an expression potentially involving other variables.

- block_var
Which variable to use for blocking

- levels
Character vector giving names to the blocking levels

- show_block
Logical. If `TRUE`, put the block number in the output.

- k
Number of distinct levels.

- replace
If `TRUE`, use resampling on the set of `k` levels.

- ...
Assignments of values to the names in `variable`.
## Value

A numerical or categorical vector which will be assembled into
a data frame by `datasim_run()`

## Details

`datasim_make()` constructs a simulation which can then be run with `datasim_run()`. Each argument to `datasim_make()` specifies one node of the simulation using an assignment-like syntax such as `y <- 3*x + 2 + rnorm(n)`. The datasim helpers documented here are for use on the right-hand side of the specification of a node. They simplify potentially complex operations such as blocking, creation of random categorical variables, translation from categorical to numerical values, etc.

The target R-squared and variance (the `R2` and `var` arguments) will be achieved exactly only if `exact = TRUE`; otherwise they are reached only in the limit as the sample size goes to infinity.
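To see why an exact match needs extra work in a finite sample, here is a base-R sketch of one way to mix a signal with noise. The helper `mix_sketch()` and its `target_var` argument are hypothetical names used for illustration; this is not the package's implementation of `mix_with()`.

```r
# Hypothetical helper: mix `signal` and `noise` so the result has
# squared correlation R2 with `signal` and sample variance `target_var`.
mix_sketch <- function(signal, noise, R2 = 0.5, target_var = 1) {
  # Remove any in-sample correlation between the noise and the signal.
  noise_perp <- residuals(lm(noise ~ signal))
  # Standardize both parts to mean 0 and sample variance 1.
  s <- as.numeric(scale(signal))
  e <- as.numeric(scale(noise_perp))
  # Weights sqrt(R2) and sqrt(1 - R2) give a unit-variance mixture,
  # which is then rescaled to the target variance.
  sqrt(target_var) * (sqrt(R2) * s + sqrt(1 - R2) * e)
}

set.seed(101)
x <- rnorm(200)
w <- mix_sketch(x, rnorm(200), R2 = 0.75, target_var = 1)
c(R2 = cor(w, x)^2, variance = var(w))
```

Without the residualizing and standardizing steps, the sample correlation between signal and noise is only approximately zero, so the targets would be met only approximately, with the discrepancy shrinking as the sample grows.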

## Examples

```
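Demo <- datasim_make(
  # categorical node with levels a, b, and c
  g <- categorical(n, a=2, b=1, c=0.5),
  # translate each level of g to a number
  x <- cat2value(g, a=-1.7, b=0.1, c=1.2),
  # Bernoulli trials with log odds x, labeled "no"/"yes"
  y <- bernoulli(logodds = x, labels=c("no", "yes")),
  # random levels, with k = 4 distinct levels
  z <- random_levels(n, k=4),
  # mix x with noise, targeting R2 = 0.75 and variance 1
  w <- mix_with(x, noise=rnorm(n), R2=0.75, var=1),
  # assign treatment/control using w as the blocking variable
  treatment <- block_by(w),
  # evaluate the expression separately for each case
  dice <- each(rnorm(1, sd = abs(w)))
)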
```
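The `each()` node above matters because `rnorm(1, sd = abs(w))` on its own returns a single value even when `w` is a vector. The base-R sketch below illustrates the idea of per-case evaluation with `vapply()`; it is an assumption about the intent of `each()`, not the package's actual mechanism.

```r
set.seed(42)
w <- c(0.5, -2, 10)

# Evaluated once: only one draw is produced, however long w is.
once <- rnorm(1, sd = abs(w))

# Evaluated per case: one draw for each element of w, each with its own sd.
per_case <- vapply(w, function(wi) rnorm(1, sd = abs(wi)), numeric(1))

length(once)      # 1
length(per_case)  # 3
```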