
expss (version 0.5.5)

fre: Simple frequencies and crosstabs with support of labels, weights and multiple response variables.

Description

  • fre returns a data.frame with six columns: labels or values, counts, valid percent (excluding NA), percent (with NA), percent of responses (for single-column x it equals the valid percent) and cumulative percent of responses.
  • cro returns data.frame with counts (possibly weighted) with column and row totals.
  • cro_tpct, cro_cpct, cro_rpct return a data.frame with table/column/row percent with column and row totals. The total margins always show weighted counts rather than 100%. Empty labels/factor levels are removed from the results of these functions. The base for multiple-response percent (x is a data.frame) is the number of valid cases (not the sum of responses), so the sum of percent may be greater than 100. A case is considered valid if it has at least one non-NA value.
  • cro_mean, cro_sum, cro_median return a data.frame with mean/sum/median. Empty labels/factor levels are removed from the results of these functions. NA's are always omitted.
  • cro_fun, cro_fun_df return a data.frame with custom summary statistics defined by the 'fun' argument. Empty labels/factor levels in predictor are removed from the results of these functions. NA treatment depends on the behavior of your 'fun'. To use weights, 'fun' must have a 'weight' argument and some logic for processing it. cro_fun applies 'fun' to each column in 'x' separately; cro_fun_df passes 'x' to 'fun' as a whole data.frame. So cro_fun(iris[, -5], iris$Species, fun = mean) gives the same result as cro_fun_df(iris[, -5], iris$Species, fun = colMeans). For cro_fun_df, names of 'x' will be converted to labels, if they are available, before 'fun' is applied. You should take care to return from 'fun' a rectangular object with appropriate row/column names - they will be used in the final result as labels.
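
The percent base described above for multiple-response variables can be illustrated with plain base R. This is only a sketch of the arithmetic, not the package's internal code; the data frame `answers` and all variable names are hypothetical.

```r
# Hypothetical multiple-response data: three respondents, up to two answers each.
answers <- data.frame(
  a1 = c(1, 1, NA),
  a2 = c(2, NA, NA)
)

# A case is valid if it has at least one non-NA value.
valid <- rowSums(!is.na(answers)) > 0
n_valid <- sum(valid)                            # two valid cases

# Count how often each code was chosen across all columns (NA is dropped).
counts <- table(unlist(answers))                 # code 1: 2 times, code 2: 1 time

pct_of_cases <- 100 * counts / n_valid           # base: number of valid cases
pct_of_responses <- 100 * counts / sum(counts)   # base: number of responses

sum(pct_of_cases)       # 150, greater than 100 because one case gave two answers
sum(pct_of_responses)   # 100, up to floating-point rounding
```

With the number of valid cases as the base, each percent reads as "share of respondents who chose this item", which is why the column can add up to more than 100.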

Usage

fre(x, weight = NULL)
cro(x, predictor, weight = NULL)
cro_cpct(x, predictor, weight = NULL)
cro_rpct(x, predictor, weight = NULL)
cro_tpct(x, predictor, weight = NULL)
cro_mean(x, predictor, weight = NULL)
cro_sum(x, predictor, weight = NULL)
cro_median(x, predictor)
cro_fun(x, predictor, fun, ..., weight = NULL)
cro_fun_df(x, predictor, fun, ..., weight = NULL)

Arguments

x
vector/data.frame. data.frames are considered as multiple response variables.
weight
numeric vector. Optional case weights. NA's and negative weights are treated as zero weights.
predictor
vector. Currently, a multiple-response predictor is not supported.
fun
custom summary function. It should always return a scalar/vector/matrix of the same size.
...
further arguments for fun
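
The rule for 'weight' (NA's and negative weights count as zero) can be mimicked in base R as follows. This is a minimal sketch of the stated rule, not the package's implementation; `w`, `w_clean` and `x` are hypothetical names.

```r
# Hypothetical case weights containing an NA and a negative value.
w <- c(1.5, NA, -2, 1)
x <- c("a", "b", "a", "b")

# Replace NA and negative weights with zero, as the 'weight' argument describes.
w_clean <- ifelse(is.na(w) | w < 0, 0, w)

# Weighted counts per category: sum of the cleaned weights.
weighted_counts <- tapply(w_clean, x, sum)
weighted_counts   # a = 1.5, b = 1
```

Cases with zero weight stay in the data but contribute nothing to the weighted counts.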

Value

object of class 'simple_table'/'summary_table'. Basically it is a data.frame, but the class is needed for the custom print method.

Examples

library(expss)
data(mtcars)
mtcars = modify(mtcars,{
    var_lab(vs) = "Engine"
    val_lab(vs) = c("V-engine" = 0, 
                    "Straight engine" = 1) 
    var_lab(am) = "Transmission"
    val_lab(am) = c(automatic = 0, 
                    manual=1)
})

fre(mtcars$vs)
with(mtcars, cro(am, vs))
with(mtcars, cro_cpct(am, vs))

# multiple-choice variable
# brands - multiple response question
# Which brands have you used during the last three months? 
set.seed(123)
brands = data.frame(t(replicate(20,sample(c(1:5,NA),4,replace = FALSE))))
# score - evaluation of tested product
score = sample(-1:1,20,replace = TRUE)
var_lab(brands) = "Used brands"
val_lab(brands) = make_labels("
                              1 Brand A
                              2 Brand B
                              3 Brand C
                              4 Brand D
                              5 Brand E
                              ")

var_lab(score) = "Evaluation of tested brand"
val_lab(score) = make_labels("
                             -1 Dislike it
                             0 So-so
                             1 Like it    
                             ")

fre(brands)
cro(brands, score)
cro_cpct(brands, score)

# 'cro_mean'

data(iris)
cro_mean(iris[, -5], iris$Species)

# 'cro_fun'

data(mtcars)
mtcars = modify(mtcars,{
    var_lab(vs) = "Engine"
    val_lab(vs) = c("V-engine" = 0, 
                    "Straight engine" = 1) 
    var_lab(hp) = "Gross horsepower"
    var_lab(mpg) = "Miles/(US) gallon"
})

# Label for 'disp' forgotten intentionally
with(mtcars, cro_fun(data.frame(hp, mpg, disp), vs, summary))

# or, the same with transposed summary
with(mtcars, cro_fun(data.frame(hp, mpg, disp), vs, function(x) t(summary(x))))

# very artificial example
a = c(1,1,1, 1, 1)
b = c(0, 1, 2, 2, NA)
weight = c(0, 0, 1, 1, 1)
cro_fun(b, a, weight = weight, 
     fun = function(x, weight, na.rm){
                 weighted.mean(x, w = weight, na.rm = na.rm)
             }, 
     na.rm = TRUE)


# comparison of 'cro_fun' and 'cro_fun_df'

data(iris)
cro_fun(iris[, -5], iris$Species, fun = mean)
# same result
cro_fun_df(iris[, -5], iris$Species, fun = colMeans)  

# usage for 'cro_fun_df' which is not possible for 'cro_fun'
# calculate correlations of variables with Sepal.Length inside each group
cro_fun_df(iris[,-5], iris$Species, fun = function(x) cor(x)[,1])

# or, pairwise correlations inside groups
cro_fun_df(iris[,-5], iris$Species, fun = cor)
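
The difference between applying 'fun' column by column and applying it to the whole data.frame can also be seen without the package. The following base R sketch is only an analogy for cro_fun versus cro_fun_df, not how they are implemented; `groups`, `by_column` and `by_frame` are hypothetical names.

```r
data(iris)

# Split the numeric columns by group, similar in spirit to using a predictor.
groups <- split(iris[, -5], iris$Species)

# Column-by-column: the function sees one numeric vector at a time.
by_column <- sapply(groups, function(g) sapply(g, mean))

# Whole data.frame: the function sees all columns of the group at once.
by_frame <- sapply(groups, colMeans)

# Both approaches give the same 4 x 3 matrix of group means.
all.equal(by_column, by_frame)
```

Functions such as cor need several columns at once, so they only fit the whole-data.frame style.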
