lmfreq: lmfreq

Description

Fits linear models to data grouped in frequency tables.

Usage

lmfreq(formula, data, freq = NULL)
.lmfreq(formula, tfq)
# S3 method for lmfreq
logLik(object, ...)
# S3 method for lmfreq
extractAIC(fit, scale = 0, k = 2, ...)
# S3 method for lmfreq
AIC(object, ..., k = 2)
# S3 method for lmfreq
nobs(object, ...)
# S3 method for lmfreq
summary(object, ...)
# S3 method for lmfreq
print(x, ...)
# S3 method for summary.lmfreq
print(x, digits = getOption("digits") - 3, ...)
# S3 method for lmfreq
predict(object, ...)

Arguments

formula

an object of class formula

data

a data frame that must contain all variables present in formula and freq

freq

a character string specifying the variable of frequency weights

tfq

a tablefreq object

object

an lmfreq object

...

See Details

fit

an lmfreq object

scale

not used

k

penalty parameter

x

an lmfreq object

digits

Value

It returns an object of class lmfreq, very similar to an lm object.

Details

It computes the linear model of a frequency table. See lm for further details.
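The principle can be sketched in base R without the package: collapse the data to one row per distinct combination of values, then fit by weighted least squares with the counts as weights. This is only an illustration and not necessarily how lmfreq is implemented. The object names `d`, `tab`, `full` and `weighted`, and the count column `freq`, are chosen here for the example.

```r
# Illustrative sketch in base R; not the package's implementation.
# Collapse iris to one row per distinct (Sepal.Length, Sepal.Width) pair,
# storing the number of occurrences in a column named 'freq'.
d <- iris[, c("Sepal.Length", "Sepal.Width")]
tab <- aggregate(freq ~ Sepal.Length + Sepal.Width,
                 data = cbind(d, freq = 1), FUN = sum)

# Fit on the full data and on the frequency table with counts as weights.
full     <- lm(Sepal.Length ~ Sepal.Width, data = d)
weighted <- lm(Sepal.Length ~ Sepal.Width, data = tab, weights = freq)

# The coefficient estimates agree up to numerical tolerance.
all.equal(coef(full), coef(weighted))
```

The weighted fit works on fewer rows whenever the data contain repeated combinations, which is where any speed gain from a frequency table would come from.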

Any variables not in the formula are removed from the data set.

The dot function (.lmfreq) is intended for programming purposes. It does not check the data.
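The logLik, AIC and nobs methods listed under Usage need to count original observations rather than rows of the frequency table. A minimal sketch of that bookkeeping in base R follows, assuming Gaussian errors. It shows the arithmetic only and is not taken from the package source; the names `tab`, `fit`, `n`, `rss`, `ll` and `aic` are chosen for this example.

```r
# Illustrative sketch in base R; not the package's implementation.
d <- iris[, c("Sepal.Length", "Sepal.Width")]
tab <- aggregate(freq ~ Sepal.Length + Sepal.Width,
                 data = cbind(d, freq = 1), FUN = sum)
fit <- lm(Sepal.Length ~ Sepal.Width, data = tab, weights = freq)

n   <- sum(tab$freq)                     # number of original observations
rss <- sum(tab$freq * residuals(fit)^2)  # residual sum of squares over all n

# Gaussian log-likelihood evaluated at the maximum likelihood estimates.
ll  <- -n / 2 * (log(2 * pi) + log(rss / n) + 1)

# AIC with penalty k = 2; the extra parameter is the error variance.
aic <- -2 * ll + 2 * (length(coef(fit)) + 1)

# Both match the values obtained from lm on the ungrouped data.
full <- lm(Sepal.Length ~ Sepal.Width, data = d)
all.equal(ll, as.numeric(logLik(full)))
all.equal(aic, AIC(full))
```

Calling logLik or AIC directly on the weighted lm fit would instead treat each table row as a single observation, which is why the counts are summed explicitly here.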

Examples

## Benchmark
if(require(hflights)){
  formula <-  ArrDelay ~ DepDelay   
  print(system.time(a <- lm(formula, data=hflights)))  ## ~0.4 seconds 
  print(system.time(b <- lmfreq(formula, data=hflights))) ## ~0.12 seconds, roughly 3x faster
}

l0 <- lm(Sepal.Length ~ Sepal.Width,iris)
summary(l0)

tfq <- tablefreq(iris[,1:2])
lf <- lmfreq(Sepal.Length ~ Sepal.Width,tfq, freq="freq")
summary(lf)

all.equal(coef(lf),coef(l0))
all.equal(AIC(lf),AIC(l0))

newdata <- data.frame(Sepal.Width=c(1,NA,7))
predict(lf, newdata)

if(require(MASS)){
   stepAIC(lf)
}

system.time(lmfreq(Sepal.Length ~ Sepal.Width,tfq, freq="freq"))
system.time(.lmfreq(Sepal.Length ~ Sepal.Width,tfq)) # Fast

library(dplyr)
igrouped <- iris %>% group_by(Species)
models <- igrouped %>% do(model=lmfreq(Sepal.Length ~ Sepal.Width, .))
coefs <- models %>%
  do(cbind(as.data.frame(rbind(coef(.$model))),
           Species=.$Species))
coefs

## Not run: ------------------------------------
# ## If the data are too granular, the benchmark is worse
# n <- 10^6
# data <- data.frame(y=rnorm(n),x=rnorm(n))
# system.time(lm(y~x,data)) ## ~5 seconds
# system.time(lmfreq(y~x,data)) ## ~ 15 seconds
# system.time(tfq <- tablefreq(data)) ## ~ 5 seconds
# nrow(tfq) # same number of rows as the original data
# system.time(.lmfreq(y~x,tfq)) ## ~ 10 seconds
## ---------------------------------------------
