api: Student performance in California schools

Description

The Academic Performance Index is computed for all California schools based on standardised testing of students. The data sets contain information for all schools with at least 100 students and for various probability samples of the data.

Usage

data(api)

Format

The full population data in apipop are a data frame with 6194 observations on the following 37 variables.

cds: Unique identifier
stype: Elementary/Middle/High School
name: School name (15 characters)
sname: School name (40 characters)
snum: School number
dname: District name
dnum: District number
cname: County name
cnum: County number
flag: reason for missing data
pcttest: percentage of students tested
api00: API in 2000
api99: API in 1999
target: target for change in API
growth: Change in API
sch.wide: Met school-wide growth target?
comp.imp: Met Comparable Improvement target
both: Met both targets
awards: Eligible for awards program
meals: Percentage of students eligible for subsidized meals
ell: 'English Language Learners' (percent)
yr.rnd: Year-round school
mobility: percentage of students for whom this is the first year at the school
acs.k3: average class size years K-3
acs.46: average class size years 4-6
acs.core: Number of core academic courses
pct.resp: percent where parental education level is known
not.hsg: percent parents not high-school graduates
hsg: percent parents who are high-school graduates
some.col: percent parents with some college
col.grad: percent parents with college degree
grad.sch: percent parents with postgraduate education
avg.ed: average parental education level
full: percent fully qualified teachers
emer: percent teachers with emergency qualifications
enroll: number of students enrolled
api.stu: number of students tested.

The other data sets contain additional variables pw for sampling weights and fpc to compute finite population corrections to variance.
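As a rough illustration of what pw and fpc hold, the sketch below sums the sampling weights in two of the samples and compares the result with the number of schools in apipop. It assumes the survey package is installed; the helper name N is introduced here only for illustration.

```r
library(survey)
data(api)

# Population size: number of schools in the full data set
N <- nrow(apipop)

# The sampling weights in a probability sample should sum to
# (an estimate of) the population size
sum(apisrs$pw)
sum(apistrat$pw)

# fpc holds the population size used for the finite population
# correction: one value for the simple random sample, one value
# per stratum for the stratified sample
unique(apisrs$fpc)
tapply(apistrat$fpc, apistrat$stype, unique)
```

In the stratified sample fpc varies by stype because the correction is applied within each stratum.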

Details

apipop is the entire population, apisrs is a simple random sample, apiclus1 is a cluster sample of school districts, apistrat is a sample stratified by stype, and apiclus2 is a two-stage cluster sample of schools within districts. The sampling weights in apiclus1 are incorrect (the weight should be 757/15), but they are kept as obtained from UCLA.
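One way to see the effect of the weight issue in apiclus1 is to build a design with the weight 757/15 and compare totals. This is a minimal sketch, not part of the original data: the column pw_fixed and the objects apiclus1_fixed and dclus1_fixed are hypothetical names, and the code assumes that fpc in apiclus1 records the number of districts in the population.

```r
library(survey)
data(api)

# Number of sampled districts (clusters) in apiclus1
n_clusters <- length(unique(apiclus1$dnum))

# Number of districts in the population, as recorded in fpc (assumption)
N_clusters <- unique(apiclus1$fpc)

# Hypothetical corrected weight: population clusters / sampled clusters
apiclus1_fixed <- transform(apiclus1, pw_fixed = N_clusters / n_clusters)

# Same one-stage cluster design, using the corrected weights
dclus1_fixed <- svydesign(id = ~dnum, weights = ~pw_fixed,
                          data = apiclus1_fixed, fpc = ~fpc)
svytotal(~enroll, dclus1_fixed, na.rm = TRUE)
svymean(~api00, dclus1_fixed)
```

Because the weights are constant within this sample, rescaling them changes estimated totals but leaves the point estimate of a mean unchanged.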

References

The API program was discontinued at the end of 2018, and the archive page at the California Department of Education is no longer available. The Wikipedia article has links to past material at the Internet Archive: https://en.wikipedia.org/wiki/Academic_Performance_Index_(California_public_schools)

Examples

library(survey)
data(api)
mean(apipop$api00)
sum(apipop$enroll, na.rm=TRUE)

#stratified sample
dstrat<-svydesign(id=~1,strata=~stype, weights=~pw, data=apistrat, fpc=~fpc)
summary(dstrat)
svymean(~api00, dstrat)
svytotal(~enroll, dstrat, na.rm=TRUE)

# one-stage cluster sample
dclus1<-svydesign(id=~dnum, weights=~pw, data=apiclus1, fpc=~fpc)
summary(dclus1)
svymean(~api00, dclus1)
svytotal(~enroll, dclus1, na.rm=TRUE)

# two-stage cluster sample
dclus2<-svydesign(id=~dnum+snum, fpc=~fpc1+fpc2, data=apiclus2)
summary(dclus2)
svymean(~api00, dclus2)
svytotal(~enroll, dclus2, na.rm=TRUE)

# two-stage 'with replacement'
dclus2wr<-svydesign(id=~dnum+snum, weights=~pw, data=apiclus2)
summary(dclus2wr)
svymean(~api00, dclus2wr)
svytotal(~enroll, dclus2wr, na.rm=TRUE)


# convert to replicate weights
rclus1<-as.svrepdesign(dclus1)
summary(rclus1)
svymean(~api00, rclus1)
svytotal(~enroll, rclus1, na.rm=TRUE)

# post-stratify on school type
pop.types<-xtabs(~stype, data=apipop)

rclus1p<-postStratify(rclus1, ~stype, pop.types)
dclus1p<-postStratify(dclus1, ~stype, pop.types)
summary(dclus1p)
summary(rclus1p)

svymean(~api00, dclus1p)
svytotal(~enroll, dclus1p, na.rm=TRUE)

svymean(~api00, rclus1p)
svytotal(~enroll, rclus1p, na.rm=TRUE)
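The examples above do not cover apisrs. A design for the simple random sample might be set up as follows, and its estimates can be compared with the known population values; the object name dsrs is an illustrative choice.

```r
library(survey)
data(api)

# simple random sample: no clusters (id = ~1) and no strata
dsrs <- svydesign(id = ~1, weights = ~pw, data = apisrs, fpc = ~fpc)
summary(dsrs)
svymean(~api00, dsrs)
svytotal(~enroll, dsrs, na.rm = TRUE)

# population values for comparison
mean(apipop$api00)
sum(apipop$enroll, na.rm = TRUE)
```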
