colMeans_if calculates the mean of every column in a numeric or logical matrix, conditional on the frequency of observed data. If the frequency of observed values in a column is less than that specified by ov.min (or equal to it when inclusive = FALSE), then NA is returned for that column.
colMeans_if(x, ov.min = 1, prop = TRUE, inclusive = TRUE)
numeric vector of length = ncol(x) with names = colnames(x) providing the mean of each column or NA depending on the frequency of observed values.
x: numeric or logical matrix. If not a matrix, it will be coerced to one.
ov.min: minimum frequency of observed values required per column. If prop = TRUE, then this is a decimal between 0 and 1. If prop = FALSE, then this is an integer between 0 and nrow(x).
prop: logical vector of length 1 specifying whether ov.min should refer to the proportion of observed values (TRUE) or the count of observed values (FALSE).
inclusive: logical vector of length 1 specifying whether the mean should be calculated if the frequency of observed values in a column is exactly equal to ov.min.
Conceptually this function does: apply(X = x, MARGIN = 2, FUN = mean_if, ov.min = ov.min, prop = prop, inclusive = inclusive). For computational efficiency it does not actually do so, because the conditioning on missing values would then not be vectorized. Instead, it uses colMeans and then inserts NAs for columns that have too few observed values.
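The vectorized approach described above can be sketched in base R. This is an illustrative re-implementation, not the package source: the function name col_means_if_sketch and the toy matrix m are hypothetical, and the use of na.rm = TRUE in colMeans is an assumption about how the mean of the observed values is taken.

```r
# Hypothetical sketch of the "colMeans, then insert NAs" strategy.
# Not the package's actual code; na.rm = TRUE is an assumption.
col_means_if_sketch <- function(x, ov.min = 1, prop = TRUE, inclusive = TRUE) {
  x <- as.matrix(x)
  # Frequency of observed (non-missing) values per column, in one vectorized pass.
  observed <- colSums(!is.na(x))
  if (prop) observed <- observed / nrow(x)
  # Decide which columns have enough observed values.
  enough <- if (inclusive) observed >= ov.min else observed > ov.min
  # Compute all column means at once, then blank out the columns that fall short.
  out <- colMeans(x, na.rm = TRUE)
  out[!enough] <- NA
  out
}

# Toy matrix: column a has 4 observed values, b has 2, c has 1.
m <- cbind(a = c(1, 2, 3, 4), b = c(1, NA, NA, 4), c = c(NA, NA, NA, 8))

col_means_if_sketch(m, ov.min = 0.5)                     # c becomes NA
col_means_if_sketch(m, ov.min = 0.5, inclusive = FALSE)  # b and c become NA
col_means_if_sketch(m, ov.min = 2, prop = FALSE)         # c becomes NA
```

Counting observed values with colSums(!is.na(x)) handles every column in a single call, which is the efficiency gain over applying a per-column function.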
colSums_if
rowMeans_if
rowSums_if
colMeans
colMeans_if(airquality)
colMeans_if(x = airquality, ov.min = 150, prop = FALSE)
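To see which columns the second call is expected to set to NA, it can help to count the observed values per column directly. The snippet below uses only base R and the built-in airquality data; the variable name n_observed is chosen here for illustration.

```r
# airquality ships with R (datasets package): 153 rows, 6 columns.
n_observed <- colSums(!is.na(airquality))  # count of non-missing values per column
n_observed
n_observed / nrow(airquality)              # the same frequencies as proportions

# Columns with fewer than 150 observed values. With ov.min = 150 and
# prop = FALSE, these are the columns expected to come back as NA.
names(n_observed)[n_observed < 150]
```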