These functions provide an unbiased alternative to the corresponding base functions.

sample(x, size, replace = FALSE, prob = NULL)

sample.int(n, size = n, replace = FALSE, prob = NULL)

Arguments

x

either a vector of one or more elements from which to choose, or a positive integer.

size

a non-negative integer giving the number of items to choose.

replace

should sampling be with replacement?

prob

a vector of probability weights for obtaining the elements of the vector being sampled.

n

a positive number, the number of items to choose from.

Details

Weighted sampling and long vectors are currently not supported. In these cases, the functions fall back to the equivalent functions in base.

Note

The algorithm requires a random 32-bit unsigned integer as input. R does not provide an interface for such a random number; instead, unif_rand() returns a random double in \((0, 1)\). Internally, the result of unif_rand() is multiplied by \(2^{32}\) to produce a 32-bit unsigned integer. This works correctly for the default generator, Mersenne-Twister, since it produces a 32-bit unsigned integer that is then divided by \(2^{32}\). Other generators in R do not follow this pattern, however, so this procedure might introduce a new bias for them.
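The following is a minimal R sketch of the multiply-and-reject idea described in the referenced paper, not the implementation used by these functions. The names lemire_map() and draw_bounded() are hypothetical. To keep all products exact in double precision, the sketch assumes that s times 2^bits does not exceed 2^53, so with 32-bit words it only handles s up to 2^21.

```r
# Map one random word in [0, 2^bits) to a value in [0, s).
# Returns NA when the word falls into the rejection zone and a fresh
# word has to be drawn (hypothetical helper, for illustration only).
lemire_map <- function(word, s, bits = 32) {
  L <- 2^bits
  stopifnot(s >= 1, s <= L, L * s <= 2^53)  # products stay exact in doubles
  m <- word * s             # the "wide" product
  l <- m %% L               # low part of the product
  threshold <- (L - s) %% s # equals 2^bits mod s
  if (l < threshold) return(NA_real_)
  m %/% L                   # high part of the product is the result
}

# Draw a single value from 1..s, scaling a uniform double to a word
# in the same way the Note describes for unif_rand().
draw_bounded <- function(s, bits = 32) {
  repeat {
    word <- floor(runif(1) * 2^bits)
    k <- lemire_map(word, s, bits)
    if (!is.na(k)) return(k + 1)  # shift to R's 1-based convention
  }
}
```

With a small word size the mapping can be checked exhaustively: for bits = 8 and s = 6, every accepted word maps to one of six values, and each value is hit by the same number of words.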

References

Daniel Lemire (2018), Fast Random Integer Generation in an Interval, https://arxiv.org/abs/1805.10941.

Examples

# base::sample produces very different amounts of odd and even numbers
m <- 2/5 * 2^32
x <- sample(m, 1000000, replace = TRUE)
table(x %% 2)
#> 
#>      0      1 
#> 500434 499566
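The imbalance mentioned in the comment can be reproduced without any random numbers. The sketch below assumes that the biased approach maps a 32-bit word w to floor(m * w / 2^32), as base R historically did by taking floor(n * unif_rand()). For m = 2/5 * 2^32 this is mathematically floor(w * 2 / 5), so the pattern repeats every five words and a short run of consecutive words (the vector w here is only an illustrative stand-in for generator output) is enough to see it.

```r
# Consecutive integers standing in for equally likely generator words.
w <- 0:9999

# Mathematically the same as floor(m * w / 2^32) with m = 2/5 * 2^32.
x <- floor(w * 2 / 5)

# Count even and odd results.
counts <- table(x %% 2)
counts
```

Out of every five consecutive words, three map to an even value and two to an odd one, which is the kind of skew the unbiased functions avoid.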