spread_coef: Spread model coefficients of list-variables into columns

Description

This function extracts the coefficients (and, optionally, standard errors and p-values) of fitted model objects that are stored in a list-variable of a (nested) data frame, and spreads these coefficients into new columns.

Usage

spread_coef(data, model.column, model.term, se, p.val, append = TRUE)

Value

A data frame with one column for each coefficient of the models that are stored in the list-variable of data; or, if model.term is given, a data frame with that term's estimate. If se = TRUE or p.val = TRUE, the returned data frame also contains columns for the coefficients' standard errors and p-values, respectively. If append = TRUE, the columns are appended to data, i.e. data is also returned.

Arguments

data: A (nested) data frame with a list-variable that contains fitted model objects (see 'Details').
model.column: Name or index of the list-variable that contains the fitted model objects.
model.term: Optional, name of a model term. If specified, only this model term (including its p-value) is extracted from each model and added as a new column.
se: Logical, if TRUE, standard errors for estimates will also be extracted.
p.val: Logical, if TRUE, p-values for estimates will also be extracted.
append: Logical, if TRUE (default), this function returns data with new columns for the model coefficients; else, a new data frame that contains only the model coefficients is returned.
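
For orientation, the quantities selected by model.term, se and p.val can be read off the coefficient table of a single fitted model. The following is a minimal base-R sketch under stated assumptions: it uses the built-in mtcars data and a hypothetical lm fit, and it does not claim to show how spread_coef obtains these values internally.

```r
# Illustration only: mtcars and this formula are assumptions,
# not part of the spread_coef examples.
fit <- lm(mpg ~ wt + hp, data = mtcars)

# Coefficient table of the fitted model: one row per term, with the
# columns "Estimate", "Std. Error", "t value" and "Pr(>|t|)".
coef_matrix <- coef(summary(fit))

# Pick a single term (analogous to model.term) together with its
# standard error (se) and p-value (p.val).
wt_row <- coef_matrix["wt", c("Estimate", "Std. Error", "Pr(>|t|)")]
print(wt_row)
```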

Details

This function requires a (nested) data frame (e.g. created by the nest-function of the tidyr-package), where several fitted models are saved in a list-variable (see 'Examples'). Since the models stored in such a list-variable are typically fitted with an identical formula, all models have the same dependent and independent variables and differ only in the subset of data they were fitted to. The function then extracts all coefficients from each model and saves each estimate in a new column. The result is a data frame in which each row represents one model and each of that model's coefficients has its own column.
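
The idea of "one row per model, one column per coefficient" can be sketched in base R. This is an illustration under stated assumptions (built-in mtcars data, a hypothetical grouping by cyl, and the hypothetical names groups, models and coef_table), not the package's implementation.

```r
# Illustration only: fit the same formula on each subset of the data,
# so that all models share the same terms.
groups <- split(mtcars, mtcars$cyl)
models <- lapply(groups, function(d) lm(mpg ~ wt + hp, data = d))

# Extract the coefficients of each model and stack them row-wise:
# each row is one model, each column is one coefficient.
coef_table <- as.data.frame(do.call(rbind, lapply(models, coef)))

# Keep the grouping value next to the coefficients.
coef_table$cyl <- names(models)

print(coef_table)
```

Because every model is fitted with the same formula, the coefficient vectors have identical names and can be bound together without any alignment step.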

Examples

if (require("dplyr") && require("tidyr") && require("purrr")) {
  # assumption: spread_coef() and the efc sample data come from the sjmisc package
  library(sjmisc)
  data(efc)

  # create nested data frame, grouped by dependency (e42dep)
  # and fit linear model for each group. These models are
  # stored in the list variable "models".
  model.data <- efc %>%
    filter(!is.na(e42dep)) %>%
    group_by(e42dep) %>%
    nest() %>%
    mutate(
      models = map(data, ~lm(neg_c_7 ~ c12hour + c172code, data = .x))
    )

  # spread coefficients, so we can easily access and compare the
  # coefficients across all models. The arguments `se` and `p.val`
  # default to `FALSE` when `model.term` is not specified
  spread_coef(model.data, models)
  spread_coef(model.data, models, se = TRUE)

  # select only a specific model term; here, `se` and `p.val` default to `TRUE`
  spread_coef(model.data, models, c12hour)

  # spread_coef can be used directly within a pipe-chain
  efc %>%
    filter(!is.na(e42dep)) %>%
    group_by(e42dep) %>%
    nest() %>%
    mutate(
      models = map(data, ~lm(neg_c_7 ~ c12hour + c172code, data = .x))
    ) %>%
    spread_coef(models)
}
