sdf_rhyper

Generator method for creating a single-column Spark dataframe comprised of
i.i.d. samples from a hypergeometric distribution.

R interface to Apache Spark, a fast and general
engine for big data processing, see <https://spark.apache.org/>. This
package supports connecting to local and remote Apache Spark clusters,
provides a 'dplyr' compatible back-end, and provides an interface to
Spark's built-in machine learning algorithms.

Edgar Ruiz

sparklyr

R Interface to Apache Spark

Javier Luraschi

Kevin Kuo

Kevin Ushey

JJ Allaire

Samuel Macedo

Hossein Falaki

Lu Wang

Andy Zhang

Yitao Li

Jozef Hajnala

Maciej Szymkiewicz

Wil Davis

 RStudio

 The Apache Software Foundation

sdf_rhyper function

<dl><dt>sc</dt>
<dd>A Spark connection.</dd>
<dt>nn</dt>
<dd>Sample size (the number of samples to generate).</dd>
<dt>m</dt>
<dd>The number of successes among the population.</dd>
<dt>n</dt>
<dd>The number of failures among the population.</dd>
<dt>k</dt>
<dd>The number of draws.</dd>
<dt>num_partitions</dt>
<dd>Number of partitions in the resulting Spark dataframe
(default: default parallelism of the Spark cluster).</dd>
<dt>seed</dt>
<dd>Random seed (default: a random long integer).</dd>
<dt>output_col</dt>
<dd>Name of the output column containing sample values (default: "x").</dd></dl>
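A minimal usage sketch of the arguments listed above, assuming a local Spark installation is available (for example one set up with `spark_install()`). The specific values for `nn`, `m`, `n`, `k`, `num_partitions`, and `seed`, and the names `samples` and `summary_tbl`, are illustrative choices and do not come from the package documentation.

```r
library(sparklyr)
library(dplyr)

# Assumption: Spark is installed locally and reachable with master = "local".
sc <- spark_connect(master = "local")

# Draw 1000 i.i.d. samples. Each sample is the number of successes seen in
# k = 10 draws without replacement from a population containing
# m = 50 successes and n = 30 failures.
samples <- sdf_rhyper(
  sc,
  nn = 1000,
  m = 50,
  n = 30,
  k = 10,
  num_partitions = 4,   # illustrative; defaults to the cluster's parallelism
  seed = 42,            # illustrative; fixes the random seed
  output_col = "x"
)

# Summarise inside Spark, then collect the one-row result into R.
# The theoretical mean is k * m / (m + n) = 6.25, so mean_x should be close to it.
summary_tbl <- samples %>%
  summarise(
    mean_x = mean(x, na.rm = TRUE),
    min_x  = min(x, na.rm = TRUE),
    max_x  = max(x, na.rm = TRUE)
  ) %>%
  collect()

print(summary_tbl)

spark_disconnect(sc)
```

Summarising before `collect()` keeps the aggregation on the Spark side, so only a single row is transferred to the R session regardless of how large `nn` is.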

Arguments

Generate random samples from a hypergeometric distribution — sdf_rhyper


