zlm: Zafar's Linear and Logistic Regressions

Description

Linear and logistic regression models that enforce fairness by bounding the correlation between the sensitive attributes and the fitted values.

Usage

# a fair linear regression model.
zlm(response, predictors, sensitive, unfairness)
# a fair logistic regression model.
zlrm(response, predictors, sensitive, unfairness)

Arguments

response

a numeric vector, the response variable.

predictors

a numeric matrix or a data frame containing numeric and factor columns; the predictors.

sensitive

a numeric matrix or a data frame containing numeric and factor columns; the sensitive attributes.

unfairness

a number in [0, 1], how unfair the model is allowed to be. A value of 0 means the model is completely fair, while a value of 1 means the model is not constrained to be fair at all.

Value

zlm() returns an object of class c("zlm", "fair.model"). zlrm() returns an object of class c("zlrm", "fair.model").
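
A minimal usage sketch on simulated data, checking the class of the returned object. It assumes that zlm() is the function provided by the fairml package and that the CVXR package is available as its optimization backend; the data and the variable names (s1, x1, x2, fit) are made up for illustration.

```r
# Simulated data; all variable names are illustrative, not part of the API.
set.seed(1)
n <- 200
sensitive <- data.frame(s1 = rnorm(n))
predictors <- data.frame(x1 = 0.5 * sensitive$s1 + rnorm(n), x2 = rnorm(n))
response <- 1 + predictors$x1 - predictors$x2 + rnorm(n)

# Assumption: zlm() comes from fairml and relies on CVXR for the
# constrained optimization; skip the fit if either package is missing.
if (requireNamespace("fairml", quietly = TRUE) &&
    requireNamespace("CVXR", quietly = TRUE)) {
  fit <- fairml::zlm(response = response, predictors = predictors,
                     sensitive = sensitive, unfairness = 0.05)
  # Per the Value section, this should include "zlm" and "fair.model".
  print(class(fit))
}
```

zlrm() takes the same four arguments, as shown in Usage.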

Details

zlm() and zlrm() define fairness as statistical parity.

Estimation maximizes the log-likelihood of the regression model under the constraint that the correlation between each sensitive attribute and the fitted values (on the linear predictor scale, in the case of logistic regression) is smaller than unfairness in absolute value. Both models include predictors as explanatory variables; the variables in sensitive appear only in the constraints.
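
The constraint described above can be checked by hand in base R. The sketch below is not the estimation procedure used by zlm() or zlrm(); it only computes the quantity being bounded, using ordinary unconstrained fits and simulated data. The helper max_abs_cor() and all variable names are hypothetical, and only numeric sensitive attributes are handled.

```r
set.seed(42)
n <- 500
s  <- rnorm(n)              # a single numeric sensitive attribute
x1 <- 0.8 * s + rnorm(n)    # predictor correlated with s
x2 <- rnorm(n)              # predictor independent of s
y  <- 1 + 2 * x1 - x2 + rnorm(n)
yb <- rbinom(n, 1, plogis(x1 - x2))

# Hypothetical helper: largest absolute correlation between any
# (numeric) sensitive attribute and the fitted values.
max_abs_cor <- function(fitted_values, sensitive) {
  max(abs(cor(as.matrix(sensitive), fitted_values)))
}

# Unconstrained fits, which ignore s entirely.
ols   <- lm(y ~ x1 + x2)
logit <- glm(yb ~ x1 + x2, family = binomial)

# For logistic regression the constraint is on the linear predictor scale.
r_lm  <- max_abs_cor(fitted(ols), s)
r_glm <- max_abs_cor(predict(logit, type = "link"), s)

unfairness <- 0.05
c(linear = r_lm, logistic = r_glm) <= unfairness
```

Because x1 is correlated with s, the unconstrained fits should violate a tight bound such as 0.05 even though s is not among the explanatory variables; a bound of 1 is always satisfied, which matches the description of the unfairness argument.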

References

Zafar MB, Valera I, Gomez-Rodriguez M, Gummadi KP (2019). "Fairness Constraints: A Flexible Approach for Fair Classification". Journal of Machine Learning Research, 20:1-42. https://www.jmlr.org/papers/volume20/18-262/18-262.pdf