RegCombin (version 0.4.1)

regCombin: Function computing all the different bounds: DGM and/or Variance

Description

Computes all the different bounds, using the DGM method, the Variance method, or both.

Usage

regCombin(
  Ldata,
  Rdata,
  out_var,
  nc_var,
  c_var = NULL,
  constraint = NULL,
  nc_sign = NULL,
  c_sign = NULL,
  weights_x = NULL,
  weights_y = NULL,
  nbCores = 1,
  methods = c("DGM"),
  grid = 10,
  alpha = 0.05,
  eps_default = 0.5,
  R2bound = NULL,
  projections = FALSE,
  unchanged = FALSE,
  ties = FALSE,
  seed = 2131,
  mult = NULL
)

Value

Use summary_regCombin for a user-friendly print of the estimates. Returns a list containing, in order:

- DGM_complete or Variance_complete: the complete outputs of the functions DGM_bounds or Variance_bounds.

The list also contains additional pre-processed outputs. In the names below, replace "method" with either "DGM" or "Variance":

- methodCI: the confidence region on the betanc without sign constraints

- methodpt: the bounds point estimates on the betanc without sign constraints

- methodCI_sign: the confidence region on the betanc with sign constraints

- methodpt_sign: the bounds point estimates on the betanc with sign constraints

- methodkp: the values of epsilon(q)

- methodbeta1: the confidence region on the betac corresponding to the common regressors Xc without sign constraints

- methodbeta1_pt: the bounds point estimates on the betac corresponding to the common regressors Xc without sign constraints

- methodbeta1_sign: the confidence region on the betac corresponding to the common regressors Xc with sign constraints

- methodbeta1_sign_pt: the bounds point estimates on the betac corresponding to the common regressors Xc with sign constraints
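
The naming rule above can be sketched in a few lines of R. The helper result_names() is hypothetical and not part of RegCombin; it only applies the "method" substitution described in this section. The call to regCombin is guarded so the snippet still runs when the package is not installed, and the element names it looks up are assumed from this page rather than checked against the package source.

```r
# Hypothetical helper (not part of RegCombin): builds the element names
# listed above by substituting "method" with "DGM" or "Variance".
result_names <- function(method = c("DGM", "Variance")) {
  method <- match.arg(method)
  suffixes <- c("CI", "pt", "CI_sign", "pt_sign", "kp",
                "beta1", "beta1_pt", "beta1_sign", "beta1_sign_pt")
  c(paste0(method, "_complete"), paste0(method, suffixes))
}

dgm_names <- result_names("DGM")
print(dgm_names)

# Only attempt the estimation if the package is available.
if (requireNamespace("RegCombin", quietly = TRUE)) {
  set.seed(1)
  n <- 200
  Ldata <- data.frame(Y = rnorm(n, 0, 1.5) + rnorm(n, 0, 1))
  Rdata <- data.frame(Xnc = rnorm(n, 0, 1.5))
  output <- RegCombin::regCombin(Ldata, Rdata, "Y", "Xnc")

  # Show which of the documented names are actually present in the output.
  print(intersect(dgm_names, names(output)))
}
```

Comparing the expected names with names(output) is a cautious way to find out which elements a given call returned before indexing into the list.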

Arguments

Ldata

a dataset including Y and possibly X_c=(X_c1,...,X_cq). X_c must be finitely supported.

Rdata

a dataset including X_nc and the same variables X_c as in Ldata.

out_var

the label of the outcome variable Y.

nc_var

the labels of the regressors X_nc.

c_var

the labels of the regressors X_c (if any).

constraint

a vector of size q indicating the type of constraints (if any) on the function f(x_c1,...,x_cq) for k=1,...,q: "convex", "concave", "nondecreasing", "nonincreasing", "nondecreasing_convex", "nondecreasing_concave", "nonincreasing_convex", "nonincreasing_concave", or NA for no constraint. Default is NULL, namely no constraints at all.

nc_sign

a vector of size p indicating sign restrictions on each of the p coefficients of X_nc. For each component, -1 corresponds to a minus sign, 1 to a plus sign and 0 to no constraint. Default is NULL, namely no constraints at all.

c_sign

same as nc_sign but for X_c (accordingly, it is a vector of size q).

weights_x

the sampling weights for the dataset Rdata. Default is NULL.

weights_y

the sampling weights for the dataset Ldata. Default is NULL.

nbCores

number of cores for the parallel computation. Default is 1.

methods

method used for the bounds: "DGM" (Default) and/or "Variance".

grid

the number of points for the grid search on epsilon. If NULL, then grid search is not performed and epsilon is taken as eps_default. Default is 10.

alpha

one minus the nominal coverage of the confidence intervals. Default is 0.05.

eps_default

a pre-specified value of epsilon, used only if the grid search for selecting the value of epsilon is not performed, i.e., when grid is NULL. Default is 0.5.

R2bound

the lower bound on the R2 of the long regression if any. Default is NULL.

projections

a boolean indicating if the identified set and confidence intervals on beta_0k for k=1,...,p are computed (TRUE), rather than the identified set and confidence region of beta_0 (FALSE). Default is FALSE.

unchanged

a boolean indicating if the categories based on X_c must be kept unchanged (TRUE). Otherwise (FALSE), a thresholding approach is taken imposing that each value appears more than 10 times in both datasets and represents more than 0.01 per cent of the pooled dataset (of size n_X+n_Y). Default is FALSE.

ties

a boolean indicating if there are ties in the dataset. If not (FALSE), computation is faster. Default is FALSE.

seed

the seed used for the subsampling. Set to NULL to avoid fixing the seed. Default is 2131.

mult

a list of multipliers applied to the selected epsilon, used to assess the robustness of the point estimates with respect to it. Default is NULL.
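
To make the thresholding rule described for the unchanged argument concrete, here is a minimal base R sketch. The function passes_threshold() is hypothetical and is not the package's implementation. It assumes that "represents more than 0.01 per cent of the pooled dataset" refers to the pooled count of a value divided by n_X+n_Y, and it only flags which values meet the rule; what regCombin does with values that fail it is not covered here.

```r
# Hypothetical helper (not part of RegCombin): flags which values of a
# discrete common regressor satisfy the thresholds described for
# unchanged = FALSE. The share definition is an assumption.
passes_threshold <- function(xc_L, xc_R, min_count = 10, min_share = 1e-4) {
  values <- sort(union(unique(xc_L), unique(xc_R)))
  n_pooled <- length(xc_L) + length(xc_R)
  count_L <- vapply(values, function(v) sum(xc_L == v), integer(1))
  count_R <- vapply(values, function(v) sum(xc_R == v), integer(1))
  share <- (count_L + count_R) / n_pooled
  data.frame(
    value = values,
    count_L = count_L,
    count_R = count_R,
    share = share,
    # strict inequalities: "more than 10 times" and "more than 0.01 per cent"
    keep = count_L > min_count & count_R > min_count & share > min_share
  )
}

# Toy data: the value 2 appears only 5 times in the first sample.
xc_L <- c(rep(0, 120), rep(1, 75), rep(2, 5))
xc_R <- c(rep(0, 100), rep(1, 88), rep(2, 12))
check <- passes_threshold(xc_L, xc_R)
print(check)
```

In this toy example the values 0 and 1 pass, while 2 fails because it appears fewer than 11 times in one of the two samples.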

Examples

library(RegCombin)

### Simulate data according to the following DGP
n <- 200
Xnc_x <- rnorm(n, 0, 1.5)   # draws of X_nc observed in Rdata
Xnc_y <- rnorm(n, 0, 1.5)   # draws of X_nc that generate Y (not observed alongside Y)
epsilon <- rnorm(n, 0, 1)

## True value of the coefficient
beta0 <- 1
Y <- Xnc_y * beta0 + epsilon
out_var <- "Y"
nc_var <- "Xnc"

# Create the two datasets: Ldata holds Y, Rdata holds X_nc
Ldata <- as.data.frame(Y)
colnames(Ldata) <- c(out_var)
Rdata <- as.data.frame(Xnc_x)
colnames(Rdata) <- c(nc_var)


############# Estimation #############
output <- regCombin(Ldata, Rdata, out_var, nc_var)
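
The example above uses no common regressors. As a hedged sketch of how the c_var, nc_sign and c_sign arguments documented on this page fit together, the variant below adds one binary common regressor, observed in both datasets, and restricts both coefficients to be positive. The data-generating process and the names Ldata_c, Rdata_c and output_c are illustrative assumptions, and the call is guarded so the data construction runs even without the package.

```r
set.seed(2131)
n <- 200

# Binary common regressor X_c, drawn independently in each sample.
Xc_y <- rbinom(n, 1, 0.5)    # observed with Y
Xc_x <- rbinom(n, 1, 0.5)    # observed with X_nc

# X_nc is observed only in the second sample; Y only in the first.
Xnc_x <- rnorm(n, 0, 1.5)
Xnc_y <- rnorm(n, 0, 1.5)
Y <- 1 * Xnc_y + 0.5 * Xc_y + rnorm(n, 0, 1)

# Ldata_c contains Y and X_c; Rdata_c contains X_nc and the same X_c variable.
Ldata_c <- data.frame(Y = Y, Xc = Xc_y)
Rdata_c <- data.frame(Xnc = Xnc_x, Xc = Xc_x)

if (requireNamespace("RegCombin", quietly = TRUE)) {
  output_c <- RegCombin::regCombin(
    Ldata_c, Rdata_c,
    out_var = "Y",
    nc_var  = "Xnc",
    c_var   = "Xc",
    nc_sign = c(1),   # plus sign imposed on the coefficient of Xnc
    c_sign  = c(1)    # plus sign imposed on the coefficient of Xc
  )
  print(names(output_c))
}
```

Both sign vectors have length one here because there is a single regressor of each type (p = 1 and q = 1).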


