grr (version 0.9.5)

sample2: A wrapper for sample.int and extract that makes it easy to quickly sample rows from any object, including Matrix and sparse matrix objects.

Description

Row names are not preserved.
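
For contrast, the sketch below shows what base R does with row names when rows are extracted by index, and how to discard them afterwards. It uses only base R and does not call sample2; the data frame `df` and the other variable names are made up for illustration.

```r
# Base R only; this does not call grr::sample2.
df <- data.frame(x = 1:3, row.names = c("a", "b", "c"))

# Indexing with a repeated row forces base R to make the row names unique.
dup <- df[c(1, 1, 2), , drop = FALSE]
names_kept <- rownames(dup)      # "a" "a.1" "b"

# Resetting row names replaces them with plain sequential ones.
rownames(dup) <- NULL
names_reset <- rownames(dup)     # "1" "2" "3"
```

If your code relies on row names as keys, copying them into an ordinary column before sampling is one way to keep them.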

Usage

sample2(x, size, replace = FALSE, prob = NULL)

Arguments

x
Object from which to extract elements.
size
A positive number, the number of items to choose.
replace
Should sampling be with replacement?
prob
A vector of probability weights for obtaining the elements of the vector being sampled.
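
As a rough illustration of how these arguments fit together, here is a minimal base-R sketch of the "draw indices with sample.int, then extract" idea. The helper `sample_rows_sketch` is hypothetical and is not grr's implementation; it assumes that objects with dimensions are sampled by row and everything else by element.

```r
# Hypothetical helper, not part of grr: a base-R approximation of the idea.
sample_rows_sketch <- function(x, size, replace = FALSE, prob = NULL) {
  # Objects with dimensions are sampled by row, everything else by element.
  by_row <- !is.null(dim(x))
  n <- if (by_row) nrow(x) else length(x)
  idx <- sample.int(n, size, replace = replace, prob = prob)
  if (by_row) x[idx, , drop = FALSE] else x[idx]
}

set.seed(1)
m <- matrix(1:20, nrow = 10)
s <- sample_rows_sketch(m, 4)                      # 4 of the 10 rows
l <- sample_rows_sketch(as.list(1:100), 5)         # 5 list elements
o <- sample_rows_sketch(m, 25, replace = TRUE)     # oversampling needs replace = TRUE
```

Asking for more items than the object contains only works with `replace = TRUE`; without it, `sample.int` stops with an error.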

Examples

library(grr)

# Sampling from a list
l1 <- as.list(1:1e6)
b <- sample2(l1, 1e5)

# Sampling from a data frame
orders <- data.frame(orderNum = sample(1e5, 1e6, TRUE),
                     sku = sample(1e3, 1e6, TRUE),
                     customer = sample(1e4, 1e6, TRUE),
                     stringsAsFactors = FALSE)

a <- sample2(orders, 250000)

# When oversampling, sample2 can be much faster than the alternatives,
# with the caveat that it does not preserve row names.
system.time(a <- sample2(orders, 2000000, TRUE))
system.time(b <- orders[sample.int(nrow(orders), 2000000, TRUE), ])

## Not run: 
# system.time(d <- dplyr::sample_n(orders, 2000000, replace = TRUE))
#
# # Can quickly sample from sparse matrices while preserving sparsity
# # (rsparsematrix comes from the Matrix package)
# sm <- Matrix::rsparsematrix(20000000, 10000, density = .0001)
# sm2 <- sample2(sm, 1000000)
## End(Not run)