Function computing all the different bounds: DGM and/or Variance
regCombin(
Ldata,
Rdata,
out_var,
nc_var,
c_var = NULL,
constraint = NULL,
nc_sign = NULL,
c_sign = NULL,
weights_x = NULL,
weights_y = NULL,
nbCores = 1,
methods = c("DGM"),
grid = 10,
alpha = 0.05,
eps_default = 0.5,
R2bound = NULL,
projections = FALSE,
unchanged = FALSE,
ties = FALSE,
seed = 2131,
mult = NULL
)
Use summary_regCombin for a user-friendly print of the estimates.
Returns a list containing, in order:
- DGM_complete or Variance_complete: the complete outputs of the functions DGM_bounds or Variance_bounds
and additional pre-treated outputs. In the names below, replace "method" by either "DGM" or "Variance":
- methodCI: the confidence region on the betanc without sign constraints
- methodpt: the bounds point estimates on the betanc without sign constraints
- methodCI_sign: the confidence region on the betanc with sign constraints
- methodpt_sign: the bounds point estimates on the betanc with sign constraints
- methodkp: the values of epsilon(q)
- methodbeta1: the confidence region on the betac corresponding to the common regressors Xc without sign constraints
- methodbeta1_pt: the bounds point estimates on the betac corresponding to the common regressors Xc without sign constraints
- methodbeta1_sign: the confidence region on the betac corresponding to the common regressors Xc with sign constraints
- methodbeta1_sign_pt: the bounds point estimates on the betac corresponding to the common regressors Xc with sign constraints
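The naming rule above can be spelled out in a few lines of base R. This is only a sketch of how the element names are formed from the method label; it does not call the package, and the object name `output` in the final comment is an assumption standing for the result of a regCombin call.

```r
# Build the names of the pre-treated outputs for one method,
# following the "replace method by DGM or Variance" rule above.
method <- "DGM"  # or "Variance"
suffixes <- c("CI", "pt", "CI_sign", "pt_sign", "kp",
              "beta1", "beta1_pt", "beta1_sign", "beta1_sign_pt")
element_names <- paste0(method, suffixes)
print(element_names)

# With a fitted object `output` (assumed name), one element would be read as:
# output[[paste0(method, "CI")]]
```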
- Ldata: a dataset including Y and possibly X_c=(X_c1,...,X_cq). X_c must be finitely supported.
- Rdata: a dataset including X_nc and the same variables X_c as in Ldata.
- out_var: the label of the outcome variable Y.
- nc_var: the labels of the regressors X_nc.
- c_var: the labels of the regressors X_c (if any).
- constraint: a vector of size q indicating the type of constraints (if any) on the function f(x_c1,...,x_cq) for k=1,...,q: "convex", "concave", "nondecreasing", "nonincreasing", "nondecreasing_convex", "nondecreasing_concave", "nonincreasing_convex", "nonincreasing_concave", or NA for no constraint. Default is NULL, namely no constraints at all.
- nc_sign: a vector of size p indicating sign restrictions on each of the p coefficients of X_nc. For each component, -1 corresponds to a minus sign, 1 to a plus sign and 0 to no constraint. Default is NULL, namely no constraints at all.
- c_sign: same as nc_sign but for X_c (accordingly, it is a vector of size q).
- weights_x: the sampling weights for the dataset Rdata. Default is NULL.
- weights_y: the sampling weights for the dataset Ldata. Default is NULL.
- nbCores: number of cores for the parallel computation. Default is 1.
- methods: method used for the bounds: "DGM" (default) and/or "Variance".
- grid: the number of points for the grid search on epsilon. If NULL, the grid search is not performed and epsilon is taken as eps_default. Default is 10.
- alpha: one minus the nominal coverage of the confidence intervals. Default is 0.05.
- eps_default: a pre-specified value of epsilon, used only if the grid search for selecting the value of epsilon is not performed, i.e., when grid is NULL. Default is 0.5.
- R2bound: the lower bound on the R2 of the long regression, if any. Default is NULL.
- projections: a boolean indicating if the identified set and confidence intervals on beta_0k for k=1,...,p are computed (TRUE), rather than the identified set and confidence region of beta_0 (FALSE). Default is FALSE.
- unchanged: a boolean indicating if the categories based on X_c must be kept unchanged (TRUE). Otherwise (FALSE), a thresholding approach is taken, imposing that each value appears more than 10 times in both datasets and represents more than 0.01 per cent of the pooled dataset (of size n_X+n_Y). Default is FALSE.
- ties: a boolean indicating if there are ties in the dataset. If not (FALSE), computation is faster. Default is FALSE.
- seed: the seed for the subsampling. Set to NULL to avoid fixing the seed. Default is 2131.
- mult: a list of multipliers of the selected epsilon, used to assess the robustness of the point estimates with respect to it. Default is NULL.
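The thresholding rule described for unchanged = FALSE can be illustrated with a small base R sketch. The helper `keep_categories` is hypothetical and is not the package's implementation. It reads "more than 10 times in both datasets" as more than 10 occurrences in each dataset, and it only reports which values of X_c pass the rule; what regCombin does with the remaining values is not specified here.

```r
# Hypothetical helper (not part of the package): which values of X_c
# satisfy the thresholding rule stated for unchanged = FALSE?
keep_categories <- function(xc_left, xc_right,
                            min_count = 10, min_share = 0.0001) {
  values <- union(unique(xc_left), unique(xc_right))
  n_pooled <- length(xc_left) + length(xc_right)  # n_X + n_Y
  keep <- vapply(values, function(v) {
    n_left <- sum(xc_left == v)
    n_right <- sum(xc_right == v)
    # more than 10 occurrences in each dataset, and more than
    # 0.01 per cent (a share of 0.0001) of the pooled dataset
    n_left > min_count && n_right > min_count &&
      (n_left + n_right) / n_pooled > min_share
  }, logical(1))
  values[keep]
}

# Toy data: the value 2 appears only 5 times in the left dataset.
xc_left <- c(rep(0, 50), rep(1, 45), rep(2, 5))
xc_right <- c(rep(0, 60), rep(1, 40))
kept <- keep_categories(xc_left, xc_right)
print(kept)
```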
library(RegCombin)
### Simulating according to this DGP
n <- 200
# Xnc_x is observed in Rdata; Xnc_y generates Y and is not observed
Xnc_x <- rnorm(n, 0, 1.5)
Xnc_y <- rnorm(n, 0, 1.5)
epsilon <- rnorm(n, 0, 1)
## true value
beta0 <- 1
Y <- Xnc_y * beta0 + epsilon
out_var <- "Y"
nc_var <- "Xnc"
# create the datasets
Ldata <- as.data.frame(Y)
colnames(Ldata) <- c(out_var)
Rdata <- as.data.frame(Xnc_x)
colnames(Rdata) <- c(nc_var)
############# Estimation #############
output <- regCombin(Ldata, Rdata, out_var, nc_var)
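The example above has no common regressors. As a sketch of how the two datasets might be laid out when a common regressor X_c is present, the code below simulates a binary X_c in both samples. The binary design, the coefficient 0.5 on X_c, the seed and the column name "Xc" are illustrative assumptions, not taken from the package documentation. The final call is left commented out and only reuses arguments listed in the usage above.

```r
set.seed(1)  # illustrative seed for reproducibility
n <- 200

# Sample behind Ldata: Y and Xc are observed, Xnc is not.
Xc_y <- rbinom(n, 1, 0.5)
Xnc_y <- rnorm(n, 0, 1.5)
Y <- 1 * Xnc_y + 0.5 * Xc_y + rnorm(n, 0, 1)

# Independent sample behind Rdata: Xnc and Xc are observed, Y is not.
Xc_x <- rbinom(n, 1, 0.5)
Xnc_x <- rnorm(n, 0, 1.5)

out_var <- "Y"
nc_var <- "Xnc"
c_var <- "Xc"

# The common regressor carries the same label in both datasets.
Ldata <- data.frame(Y = Y, Xc = Xc_y)
Rdata <- data.frame(Xnc = Xnc_x, Xc = Xc_x)

# With the package installed, the estimation call would take the form:
# output <- regCombin(Ldata, Rdata, out_var, nc_var, c_var = c_var)
```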