get_split_indexes_from_stratum returns a list of indexes for splitting its stratum argument into two parts. The parts differ in size by at most one. With default arguments, a random split-half is returned, which samples the elements for each part from stratum without replacement. Additional arguments to get_split_indexes_from_stratum select a range of other splitting methods.
get_split_indexes_from_stratum(
stratum,
method = c("random", "odd_even", "first_second"),
replace = FALSE,
split_p = 0.5,
subsample_p = 1,
careful = TRUE
)
stratum: (data frame, tibble, list, or vector) Object to split. Data frames and tibbles are counted and split by row; all other data types are counted and split by element.
method: (character) Splitting method. Note that the first_second and odd_even splitting methods only deliver a valid split with the default settings for the other arguments (subsample_p = 1, split_p = 0.5, replace = FALSE).
replace: (logical) If FALSE, splits are constructed by sampling from stratum without replacement. If TRUE, stratum is sampled with replacement.
split_p: (numeric) Desired joint size of both parts, expressed as a proportion of the size of the subsampled stratum. If split_p is larger than 1 and careful is FALSE, then parts are automatically sampled with replacement.
subsample_p: (numeric) Proportion of stratum to subsample for use in the split.
careful: (logical) If TRUE, stop with an error when called with arguments that may yield unexpected splits.
(list) List with two elements, each containing the indexes that select one of the two parts of stratum.
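The three splitting methods can be illustrated with a minimal base-R sketch. This is not the package's implementation: sketch_split_indexes() is a hypothetical helper, and the exact meaning of odd_even (odd positions versus even positions) and first_second (first half versus second half, with the extra element going to the first part) is an assumption based on the method names. Only the default settings (no subsampling, split-half, no replacement) are covered.

```r
# Hypothetical helper for illustration only; not part of the documented API.
sketch_split_indexes <- function(stratum,
                                 method = c("random", "odd_even", "first_second")) {
  method <- match.arg(method)
  # Data frames and tibbles are counted by row, everything else by element
  n <- if (is.data.frame(stratum)) nrow(stratum) else length(stratum)
  positions <- seq_len(n)
  if (method == "odd_even") {
    # Assumption: odd positions form part 1, even positions form part 2
    return(list(positions[positions %% 2 == 1], positions[positions %% 2 == 0]))
  }
  if (method == "first_second") {
    # Assumption: the first part receives the extra element when n is odd
    size_1 <- ceiling(n / 2)
  } else {
    # Random split-half: shuffle without replacement; when n is odd,
    # one randomly chosen part gets one more element than the other
    positions <- sample(positions)
    size_1 <- n %/% 2 + if (n %% 2 == 1) sample(0:1, 1) else 0
  }
  list(positions[seq_len(size_1)], positions[size_1 + seq_len(n - size_1)])
}

stratum <- letters[1:9]
indexes <- sketch_split_indexes(stratum)
stratum[indexes[[1]]]  # elements of the first part
stratum[indexes[[2]]]  # elements of the second part
```

Because the sketch returns positions rather than elements, the same two index vectors can be used to subset rows of a data frame or elements of a list.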
The following rounding rules apply to subsample size and split size:
If the size of the subsample, calculated as subsample_p times the size of stratum, is a fraction, then the subsample size is rounded up.
If the joint size of the two parts, calculated as 2 * split_p times the size of the subsampled stratum, is a fraction, the part size is rounded up.
If the joint size of the two parts is odd and replace is FALSE, then one of the parts randomly gets one more element than the other part.
If the joint size of the two parts is odd and replace is TRUE, part size is rounded up to the next whole number, so each of the splits has the same size.
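The rounding rules above can be worked through as plain arithmetic. The sketch below uses a hypothetical helper, sketch_part_sizes(), which is not part of the documented API. It assumes that a fractional joint size is rounded up before it is divided over the two parts, which is one reading of the second rule. For the case without replacement it reports the smaller part first, although which part receives the extra element is random.

```r
# Hypothetical helper for illustration only; mirrors the rounding rules as read above.
sketch_part_sizes <- function(n, subsample_p = 1, split_p = 0.5, replace = FALSE) {
  # Rule 1: a fractional subsample size is rounded up
  subsample_n <- ceiling(subsample_p * n)
  # Rule 2 (assumption): a fractional joint size is rounded up
  joint_n <- ceiling(2 * split_p * subsample_n)
  if (replace) {
    # Rule 4: with replacement, both parts get the rounded-up half
    part_sizes <- rep(ceiling(joint_n / 2), 2)
  } else {
    # Rule 3: without replacement, one part gets one more element when joint_n is odd
    part_sizes <- c(floor(joint_n / 2), ceiling(joint_n / 2))
  }
  list(subsample_n = subsample_n, joint_n = joint_n, part_sizes = part_sizes)
}

sketch_part_sizes(9)$part_sizes                       # 4 5
sketch_part_sizes(9, replace = TRUE)$part_sizes       # 5 5
sketch_part_sizes(10, subsample_p = 0.75)$part_sizes  # 4 4
```

In the last call the subsample size of 7.5 is rounded up to 8, so the joint size is 8 and both parts hold 4 elements.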
Other splitting functions: apply_split_indexes_to_strata(), apply_split_indexes_to_stratum(), check_strata(), get_split_indexes_from_strata(), split_df(), split_strata(), split_stratum(), stratify()
# Split-half. One of the splits gets 4 elements and the other 5
stratum <- letters[1:9]
indexes <- get_split_indexes_from_stratum(stratum)
apply_split_indexes_to_stratum(stratum, indexes[[1]], indexes[[2]])