gencalib: g-weights of the generalized calibration estimator

Description

Computes the g-weights of the generalized calibration estimator. For the truncated and logit methods, the g-weights should lie within the specified bounds.

Usage

gencalib(Xs,Zs,d,total,q=rep(1,length(d)),method=c("linear","raking","truncated","logit"),
bounds=c(low=0,upp=10),description=FALSE,max_iter=500,C=1)

Arguments

Xs

matrix of calibration variables.

Zs

matrix of instrumental variables with the same dimensions as Xs.

d

vector of initial weights.

total

vector of population totals.

q

vector of positive values accounting for heteroscedasticity; the variation of the g-weights is reduced for small values of q.

method

calibration method (linear, raking, logit, truncated).

bounds

vector of bounds for the g-weights used in the truncated and logit methods; 'low' is the smallest value and 'upp' is the largest value.

description

if description=TRUE, a summary of the initial and final weights is printed, and their boxplots and histograms are drawn; by default, its value is FALSE.

max_iter

maximum number of iterations in Newton's method.

C

value of the centering constant; by default it equals 1.
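The method and bounds arguments select the function \(F\) that maps a linear score to a g-weight. The block below is a minimal sketch of the four classical forms, assuming C=1 and q=1 and using the standard calibration functions from the survey calibration literature. The helper names (F_linear, F_raking, F_truncated, F_logit) are hypothetical and are not exported by the package; the package's internal code may differ.

```r
# Hypothetical helpers (not part of the 'sampling' package); assume C = 1, q = 1.
F_linear    <- function(u) 1 + u
F_raking    <- function(u) exp(u)
# Linear form clipped to the interval [low, upp]
F_truncated <- function(u, low, upp) pmin(pmax(1 + u, low), upp)
# Logit form: increasing in u, equal to 1 at u = 0, bounded by low and upp
F_logit <- function(u, low, upp) {
  A <- (upp - low) / ((1 - low) * (upp - 1))
  (low * (upp - 1) + upp * (1 - low) * exp(A * u)) /
    ((upp - 1) + (1 - low) * exp(A * u))
}

u <- c(-50, 0, 50)
F_truncated(u, low = 0.5, upp = 1.5)
F_logit(u, low = 0.5, upp = 1.5)
```

The linear and raking forms are unbounded, which is why bounds only matters for the truncated and logit methods. The logit form requires low < 1 < upp in this sketch.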

Value

The function returns the vector of g-weights.

Details

Generalized calibration, also called the instrument vector method, computes the g-weights \(g_k=F(\lambda'z_k),\) where \(z_k\) is a vector with values defined for \(k\in s\) (or \(k\in r\), where \(r\) is the set of respondents) and having the same dimension as the specified auxiliary vector \(x_k\). The vectors \(z_k\) and \(x_k\) have to be strongly correlated.

The vector \(\lambda\) is determined from the calibration equation \(\sum_{k\in s} d_kg_k x_k=\sum_{k\in U} x_k\) or \(\sum_{k\in r} d_kg_k x_k=\sum_{k\in U} x_k\). The function \(F\) plays the same role as in the calibration method (see calib). If Xs=Zs, the usual calibration method is obtained.

If the method is "logit", the g-weights are centered around the constant C, with low<C<upp. In the calibration method, C=1 (see calib).
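For the linear case the calibration equation can be solved in closed form, which makes the role of the instruments easy to see. The sketch below assumes \(F(u)=1+u\) with \(g_k = 1 + q_k\lambda'z_k\), so that \(\lambda\) solves \(\big(\sum_{k\in s} d_k q_k x_k z_k'\big)\lambda = \sum_{k\in U} x_k - \sum_{k\in s} d_k x_k\). It uses base R only and is an illustration under these assumptions, not the package's implementation.

```r
set.seed(42)  # fixed seed so the random instruments are reproducible

# Sample calibration variables, initial weights and population totals
Xs <- cbind(c(1,1,1,1,1,0,0,0,0,0),
            c(0,0,0,0,0,1,1,1,1,1),
            1:10)
d     <- rep(5, 10)          # d_k = 1 / 0.2
q     <- rep(1, 10)
total <- c(24, 26, 290)

# Instruments: the calibration variables plus uniform noise
Zs <- Xs + matrix(runif(length(Xs)), nrow(Xs), ncol(Xs))

# Solve (sum_k d_k q_k x_k z_k') lambda = total - sum_k d_k x_k
A      <- crossprod(Xs * (d * q), Zs)
lambda <- solve(A, total - crossprod(Xs, d))

# Linear g-weights and the resulting calibrated totals
g <- as.numeric(1 + q * (Zs %*% lambda))
colSums(d * g * Xs)
```

The calibrated totals reproduce total up to numerical error. The other methods have no closed form, which is why max_iter controls an iterative Newton solution.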

References

Deville, J.-C. (1998). La correction de la nonréponse par calage ou par échantillonnage équilibré. Paper presented at the Congrès de l'ACFAS, Sherbrooke, Québec.

Deville, J.-C. (2000). Generalized calibration and application for weighting for non-response, COMPSTAT 2000: proceedings in computational statistics, p. 65--76.

Estevao, V.M., and Särndal, C.E. (2000). A functional form approach to calibration. Journal of Official Statistics, 16, 379--399.

Kott, P.S. (2006). Using calibration weighting to adjust for nonresponse and coverage errors. Survey Methodology, 32, 133--142.

Examples


# NOT RUN {
############
## Example 1
############
# matrix of sample calibration variables 
Xs=cbind(
c(1,1,1,1,1,0,0,0,0,0),
c(0,0,0,0,0,1,1,1,1,1),
c(1,2,3,4,5,6,7,8,9,10))
# inclusion probabilities
piks=rep(0.2,times=10)
# vector of population totals
total=c(24,26,290)
# matrix of instrumental variables
Zs=Xs+matrix(runif(nrow(Xs)*ncol(Xs)),nrow(Xs),ncol(Xs))
# the g-weights using the truncated method
g=gencalib(Xs,Zs,d=1/piks,total,method="truncated",bounds=c(0.5,1.5))
# the calibration estimator of X is equal to the 'total' vector
t(g/piks)%*%Xs
# the g-weights are between lower and upper bounds
summary(g)
############
## Example 2
############
# Example of generalized g-weights (linear, raking, truncated, logit),
# with the data of Belgian municipalities as population.
# Firstly, a sample is selected by means of Poisson sampling.
# Secondly, the g-weights are calculated.
data(belgianmunicipalities)
attach(belgianmunicipalities)
# matrix of calibration variables for the population
X=cbind(Totaltaxation/mean(Totaltaxation),medianincome/mean(medianincome))
# selection of a sample with expected size equal to 200
# by means of Poisson sampling
# the inclusion probabilities are proportional to the average income 
pik=inclusionprobabilities(averageincome,200)
N=length(pik)               # population size
s=UPpoisson(pik)            # sample
Xs=X[s==1,]                 # sample calibration variable matrix 
piks=pik[s==1]              # sample inclusion probabilities
n=length(piks)              # sample size
# vector of population totals of the calibration variables
total=c(t(rep(1,times=N))%*%X)  
# the population total
total
Z=cbind(TaxableIncome/mean(TaxableIncome),averageincome/mean(averageincome))
# defines the instrumental variables
Zs=Z[s==1,]
# computation of the generalized g-weights
# by means of different generalized calibration methods
g1=gencalib(Xs,Zs,d=1/piks,total,method="linear")
g2=gencalib(Xs,Zs,d=1/piks,total,method="raking")
g3=gencalib(Xs,Zs,d=1/piks,total,method="truncated",bounds=c(0.5,8))
g4=gencalib(Xs,Zs,d=1/piks,total,method="logit",bounds=c(0.5,1.5))
# In some cases, the calibration does not exist
# particularly when bounds are used.
# if the calibration is possible, the calibration estimator of X total is printed
if(checkcalibration(Xs,d=1/piks,total,g1)$result) print(c((g1/piks)%*% Xs)) else print("error")
if(!is.null(g2))
if(checkcalibration(Xs,d=1/piks,total,g2)$result) print(c((g2/piks)%*% Xs)) else print("error")
if(!is.null(g3))
if(checkcalibration(Xs,d=1/piks,total,g3)$result) print(c((g3/piks)%*% Xs)) else print("error")
if(!is.null(g4))
if(checkcalibration(Xs,d=1/piks,total,g4)$result) print(c((g4/piks)%*% Xs)) else print("error")
############
## Example 3
############
# Generalized calibration and adjustment for unit nonresponse in the 'calibration' vignette
# vignette("calibration", package="sampling")
# }
