parameters (version 0.14.0)

data_partition: Partition data into a test and a training set

Description

Creates a training set and a test set from a data frame. The split can be stratified (i.e., the levels of a given factor are spread evenly across both sets) using the group argument.

Usage

data_partition(x, training_proportion = 0.7, group = NULL, seed = NULL)

Arguments

x

A data frame, or an object that can be coerced to a data frame.

training_proportion

The proportion (between 0 and 1) of rows assigned to the training set. The remaining rows are used for the test set.

group

A character vector indicating the name(s) of the column(s) used for stratified partitioning.

seed

A random number generator seed. Enter an integer (e.g., 123) so that the random sampling will be the same each time you run the function.
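
To illustrate how the arguments above interact, here is a minimal base R sketch of a seeded, stratified split. It is not the package's implementation: stratified_split is a hypothetical helper written for this illustration, and the rounding of the per-group training size is an assumption.

```r
# Hypothetical helper for illustration only; not part of the parameters package.
stratified_split <- function(x, training_proportion = 0.7, group = NULL, seed = NULL) {
  x <- as.data.frame(x)
  if (!is.null(seed)) set.seed(seed)  # fix the RNG state so the draw is repeatable

  # One stratum per observed combination of the grouping columns,
  # or a single stratum holding every row when no group is given.
  all_rows <- seq_len(nrow(x))
  strata <- if (is.null(group)) {
    list(all_rows)
  } else {
    split(all_rows, x[group], drop = TRUE)
  }

  # Within each stratum, draw the requested share of row indices for training.
  # Rounding the per-stratum size is an assumption of this sketch.
  train_idx <- unlist(lapply(strata, function(idx) {
    n_train <- round(length(idx) * training_proportion)
    idx[sample.int(length(idx), n_train)]
  }), use.names = FALSE)

  # Every row not drawn for training goes to the test set.
  test_idx <- setdiff(all_rows, train_idx)

  list(
    training = x[sort(train_idx), , drop = FALSE],
    test = x[test_idx, , drop = FALSE]
  )
}

parts <- stratified_split(iris, training_proportion = 0.7, group = "Species", seed = 123)
table(parts$training$Species)  # 35 rows per species, i.e. round(50 * 0.7)
nrow(parts$test)               # 45
```

Because the seed is set inside the helper, calling it again with the same arguments returns the same partition. With seed = NULL the split depends on the current state of the random number generator.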

Value

A list of two data frames, named test and training.
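
As a short sketch of working with the returned list, the two data frames can be pulled out by name. This assumes the parameters package (version 0.14.0, as documented here) is installed. The choice of 0.8 for training_proportion and 123 for seed is arbitrary.

```r
library(parameters)  # assumes parameters 0.14.0 is installed

# Stratified split of iris by Species, reproducible through the seed
parts <- data_partition(iris, training_proportion = 0.8, group = "Species", seed = 123)

names(parts)             # element names of the returned list
train <- parts$training  # data frame used for model fitting
test <- parts$test       # held-out data frame for evaluation

nrow(train)
nrow(test)
```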

Examples

# NOT RUN {
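# Add a second, artificial grouping column to iris
df <- iris
df$Smell <- rep(c("Strong", "Light"), 75)

# Simple random partition with the default training proportion (0.7)
head(data_partition(df))
# Partition stratified by one column
head(data_partition(df, group = "Species"))
# Partition stratified by two columns
head(data_partition(df, group = c("Species", "Smell")))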
# }