selection (version 1.0)

caseIII: Corrects correlations using Case III

Description

Using Thorndike's Case III correction, caseIII corrects the correlation between x and y for direct range restriction on z (and, by implication, indirect restriction on x).

Usage

caseIII(data = NULL, x = 2, y = 3, z = 1, rxy, rzy, rxz, uz)

Arguments

data
A dataset containing the two incidentally restricted variables (X and Y) and complete information on the selection variable (Z). Alternatively, one can supply values for rxy, rzy, and rxz (together with uz) instead of a dataset.
x
The column index (or name) of the X variable
y
The column index (or name) of the Y variable
z
The column index (or name) of the Z variable (the one used for selection)
rxy
The restricted correlation between x (the indirectly selected variable) and y (the outcome variable).
rzy
The restricted correlation between z (the selection variable) and y (the outcome variable).
rxz
The restricted correlation between x (the indirectly selected variable) and z (the selection variable).
uz
The ratio of the restricted to the unrestricted standard deviation of z (i.e., sigmaz'/sigmaz).

Value

  • a scalar that is the unbiased estimate of the correlation between X and Y.

Details

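The Case III correction adjusts the restricted correlation between X and Y using the restricted correlations of each variable with Z (rxz and rzy) and the ratio uz. The result is an unbiased estimate of the correlation between X and Y, unattenuated by range restriction.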
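
For reference, the correction is commonly written in the following textbook form, restated here in terms of uz as defined under Arguments (restricted over unrestricted standard deviation of z). This is the standard statement of Thorndike's Case III rather than a transcription of the package source, and the symbol \hat{\rho}_{xy} for the corrected correlation is introduced here for illustration only.

```latex
\hat{\rho}_{xy} =
  \frac{r_{xy} + r_{xz}\, r_{zy} \left( \dfrac{1}{u_z^{2}} - 1 \right)}
       {\sqrt{\left[ 1 + r_{xz}^{2} \left( \dfrac{1}{u_z^{2}} - 1 \right) \right]
              \left[ 1 + r_{zy}^{2} \left( \dfrac{1}{u_z^{2}} - 1 \right) \right]}}
```

When uz = 1 (no restriction on z), the extra terms vanish and the corrected value equals the restricted rxy.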

References

Thorndike, R. L. (1949). Personnel selection: Test and measurement techniques. Oxford, England: Wiley.

Pearson, K. (1903). Mathematical contributions to the theory of evolution. XI. On the influence of natural selection on the variability and correlation of organs. Philosophical Transactions of the Royal Society of London. Series A, Containing Papers of a Mathematical or Physical Character, 200, 1-66.

See Also

caseIV, caseIIIR, em, rel.correction

Examples

# load the package and the example data
library(selection)
data(selection.example.data)
# keep only the rows with complete information (the selected cases)
new.dat = selection.example.data[!is.na(selection.example.data$Performance), ]
cor.mat = cor(new.dat[, c("R", "Biodata", "Performance")])
# correct assuming direct selection on R (z), indirect selection on Biodata (x),
# and Performance as the outcome (y)
corrected = caseIII(rxy = cor.mat["Biodata", "Performance"],
    rzy = cor.mat["R", "Performance"],
    rxz = cor.mat["Biodata", "R"],
    uz = sd(new.dat$R) / sd(selection.example.data$R))
corrected
## simulate data to show that the correction works
library(MASS)  # provides mvrnorm
cor.mat = matrix(c(1, .3, .4,
    .3, 1, .5,
    .4, .5, 1), nrow = 3)
data = mvrnorm(100000, mu = c(0, 0, 0), Sigma = cor.mat)
### restrict the data: x and y are observed only when z >= .5
data[data[, 1] < .5, 2:3] = NA
caseIII(data = data, x = 2, y = 3, z = 1)
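
The simulation above can also be checked by hand with base R only. The sketch below is an illustration, not the package's implementation: `case3_sketch` is a hypothetical helper that applies the textbook Case III formula, assuming uz is the restricted standard deviation of z divided by the unrestricted one, and the data are drawn with a Cholesky factor instead of mvrnorm so that no extra package is needed.

```r
# Hypothetical helper (not part of the selection package): textbook
# Case III formula, with uz = restricted SD of z / unrestricted SD of z.
case3_sketch <- function(rxy, rzy, rxz, uz) {
  k <- 1 / uz^2 - 1
  (rxy + rxz * rzy * k) / sqrt((1 + rxz^2 * k) * (1 + rzy^2 * k))
}

set.seed(1)
# Population correlations among z, x, y (columns 1, 2, 3)
Sigma <- matrix(c(1, .3, .4,
                  .3, 1, .5,
                  .4, .5, 1), nrow = 3)
n <- 100000
# Draw multivariate normal data using the Cholesky factor of Sigma
full <- matrix(rnorm(n * 3), ncol = 3) %*% chol(Sigma)

# Direct selection on z: keep only the cases with z >= 0.5
keep <- full[, 1] >= 0.5
r <- cor(full[keep, ])                    # restricted correlations
uz <- sd(full[keep, 1]) / sd(full[, 1])   # restricted / unrestricted SD of z

restricted <- r[2, 3]
corrected <- case3_sketch(rxy = r[2, 3], rzy = r[1, 3], rxz = r[1, 2], uz = uz)

# Compare the restricted and corrected values with the population value
c(restricted = restricted, corrected = corrected, population = Sigma[2, 3])
```

The restricted correlation should fall below the population value of .5, while the corrected value should land close to it. Exact numbers depend on the seed.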
