Returns a stratified sample without replacement, based on the sampling fraction given for each stratum.
sampleBy(x, col, fractions, seed)

# S4 method for SparkDataFrame,character,list,numeric
sampleBy(x, col, fractions, seed)
x: A SparkDataFrame
col: column that defines strata
fractions: A named list giving sampling fraction for each stratum. If a stratum is not specified, we treat its fraction as zero.
seed: random seed
A new SparkDataFrame that represents the stratified sample
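To make the role of the named fractions list concrete, here is a minimal base-R sketch of the same idea on a local data.frame. It is not SparkR's implementation: the helper name sample_by_local is hypothetical, and the sketch assumes each row is kept independently with probability equal to its stratum's fraction, so the resulting group sizes are approximate rather than exact.

```r
# Illustrative only: a local, base-R analogue of stratified sampling.
# `sample_by_local` is a hypothetical helper, not part of SparkR.
sample_by_local <- function(df, col, fractions, seed) {
  set.seed(seed)
  strata <- as.character(df[[col]])
  # Look up the fraction for each row's stratum;
  # strata missing from `fractions` are treated as fraction 0.
  p <- vapply(strata, function(s) {
    f <- fractions[[s]]
    if (is.null(f)) 0 else as.numeric(f)
  }, numeric(1), USE.NAMES = FALSE)
  # Assumption: keep each row independently with probability p.
  keep <- runif(nrow(df)) < p
  df[keep, , drop = FALSE]
}

local_df <- data.frame(key = c("a", "a", "b", "c"), value = 1:4)
# Keep every "a" row; "b" and "c" are not listed, so they are dropped.
sample_by_local(local_df, "key", list(a = 1), seed = 36)
```

With a fraction of 1 every row of that stratum is kept, and with a fraction of 0 (or an unlisted stratum) none are, which mirrors the rule stated for the fractions argument.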
Other stat functions: approxQuantile(), corr(), cov(), crosstab(), freqItems()
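# NOT RUN {
df <- read.json("/path/to/file.json")
# Hypothetical strata names; use values that actually occur in the "key" column
fractions <- list(a = 0.5, b = 0.1)
sample <- sampleBy(df, "key", fractions, 36)
# }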