A set of methods to generate random samples from data frames and data simulations. For data frames, individual rows are sampled. For vectors, elements are sampled.

Usage

sample(x, n, replace = FALSE, ...)

# S3 method for default
sample(
  x,
  n = length(x),
  replace = FALSE,
  prob = NULL,
  .by = NULL,
  groups = .by,
  orig.ids = FALSE,
  ...
)

resample(..., replace = TRUE)

Arguments

x

The object from which to sample

n

Size of the sample.

replace

Logical flag: whether to sample with replacement. (default: FALSE)

prob

Probabilities to use for sampling, one for each element of x

.by

Variables to use to define groups for sampling, as in {dplyr}. The sample size applies to each group.

groups

Variable indicating blocks to sample within

orig.ids

Logical. If TRUE, append a column named "orig.ids" identifying the row of the original x that each sampled row came from.

...

Arguments to pass along to specific sample methods.
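
The grouping and orig.ids behavior described above can be illustrated with a minimal base R sketch. This is not the package's implementation: the helper `sample_rows_by_group()`, its `group_col` argument, and the `toy` data frame are hypothetical names introduced only for illustration, and the exact contents of the package's "orig.ids" column may differ.

```r
# Hypothetical helper, NOT part of the package: per-group row sampling in base R.
sample_rows_by_group <- function(df, group_col, n, replace = FALSE) {
  # Row indices of df, split by the values of the grouping column.
  idx_by_group <- split(seq_len(nrow(df)), df[[group_col]])
  # Draw n row indices within each group. Indexing into `rows` avoids the
  # base::sample() surprise where a length-one numeric x is treated as 1:x.
  picked <- lapply(idx_by_group, function(rows) {
    rows[base::sample.int(length(rows), size = n, replace = replace)]
  })
  picked <- unlist(picked, use.names = FALSE)
  out <- df[picked, , drop = FALSE]
  # Record the source row of each sampled row (an assumption about what
  # orig.ids = TRUE is meant to convey).
  out$orig.ids <- picked
  rownames(out) <- NULL
  out
}

set.seed(101)
toy <- data.frame(grp = rep(c("a", "b"), each = 5), value = 1:10)
sample_rows_by_group(toy, "grp", n = 2)
```

The sample size is applied within each group, so the result here has two rows for "a" and two for "b", matching the statement that the sample size applies to each group.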

Value

A vector or a data frame depending on the nature of the x argument.

Details

These are based in spirit on the sample functions in the {mosaic} package, but are redefined here to 1) avoid a dependency on {mosaic} and 2) bring the arguments in line with the .by = features of {dplyr}.

Examples

sample(sim_03, n = 5) # run a simulation
#> # A tibble: 5 × 3
#>        g       x       y
#>    <dbl>   <dbl>   <dbl>
#> 1  0.552  1.17   -0.242 
#> 2 -0.675 -0.788   0.753 
#> 3  0.214  1.13   -1.25  
#> 4  0.311  0.0875  0.0741
#> 5  1.17   1.70    0.981 
sample(Clock_auction, n = 3) # from a data frame
#> # A tibble: 3 × 3
#>     age bidders price
#>   <dbl>   <dbl> <dbl>
#> 1   194       5  1356
#> 2   182      11  1979
#> 3   175       8  1545
sample(1:6, n = 6) # sample from a vector
#> [1] 2 5 6 4 1 3
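
The effect of the replace and prob arguments can be sketched with base R's own `base::sample()`, which is called explicitly here so the example does not depend on this package's methods. The variable names (`die`, `perm`, `boot`, `weighted`) and the weights are illustrative assumptions, not values taken from the package.

```r
set.seed(42)
die <- 1:6

# Without replacement and with size equal to length(x), the draw is a
# permutation: every element appears exactly once.
perm <- base::sample(die, size = length(die), replace = FALSE)

# With replacement, the same element can be drawn more than once. This is
# the kind of draw that resample() defaults to via replace = TRUE.
boot <- base::sample(die, size = length(die), replace = TRUE)

# prob supplies one weight per element; base::sample() does not require
# the weights to sum to one. Here the value 1 is five times as likely as
# each other face.
weighted <- base::sample(die, size = 1000, replace = TRUE,
                         prob = c(5, 1, 1, 1, 1, 1))
table(weighted)
```

Sampling a vector without replacement at its full length, as in `sample(1:6, n = 6)` above, therefore amounts to shuffling it.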