Learn R Programming

zoib (version 1.6)

AlcoholUse: California County-level Teenager Monthly Alcohol Use data

Description

AlcoholUse contains the county-level monthly alcohol use data from students in California in years 2008 to 2010. The data can be downloaded at http://www.kidsdata.org. The data has information on the percentage of public school students in grades 7, 9, and 11 in five buckets of days (0, 1-2, 3-9, 10-19, 20-30) in which they drank alcohol in the past 30 days. Since the percentages across the five days categories add up to 1 given a combination of County, Grade and Gender. Data points associated with 0 are completely redundant and are not in AlcoholUse. zoib is applied to examining whether the proportions of alcohocol use in the past month are different across gender, grade, days of drinking.

Usage

data(AlcoholUse)

Arguments

Format

A data frame with 1340 observations on the following 6 variables.

County

a factor with 56 levels/clusters/counties.

Grade

a numeric/factor vector: 7, 9, 11 grades.

Days

a factor with levels 0 [1,2] [3,9] [10,19] [20,30].

MedDays

a numeric vector: the med point of each of the 4 day buckets.

Gender

a factor with levels Female Male.

Percentage

a numeric vector; Proportions of public school students in 4 buckets of days in which they drank alcohol in the past 30 days per Gender, Grader, and County

.

Examples

Run this code
  if (FALSE) {
  ##### eg3: modelling with clustered beta variables with inflation at 0
    library(zoib)
    data("AlcoholUse", package = "zoib")
    eg3 <- zoib(Percentage ~ Grade*Gender+MedDays|1|Grade*Gender+MedDays|1,
                data = AlcoholUse, random = 1, EUID= AlcoholUse$County,
                zero.inflation = TRUE,  one.inflation = FALSE, joint = FALSE, 
                n.iter=5000, n.thin=20, n.burn=1000)  
    sample1 <- eg3$coeff
    summary(sample1)
    
    # check convergence on the regression coefficients
    traceplot(sample1); 
    autocorr.plot(sample1);
    check.psrf(sample1)
    
    # plot posterior mean of y vs. observed y to check on goodness of fit.
    ypred = rbind(eg3$ypred[[1]],eg3$ypred[[2]])
    post.mean= apply(ypred,2,mean); 
    par(mfrow=c(1,1),mar=c(4,4,0.5,0.5))
    plot(AlcoholUse$Percentage, post.mean, xlim=c(0,0.4),ylim=c(0,0.4), 
         col='blue', xlab='Observed y', ylab='Predicted y', main="")
    abline(0,1,col='red')
  }

Run the code above in your browser using DataLab