estimate: Create a multivariate estimate object.

Description

estimate creates an object of class estimate. The concept of an estimate is extended from the 1-dimensional case (cf. estimate1d) to the multivariate case, which includes a description of the correlations between the variables. An estimate of an n-dimensional variable is, at minimum, defined by each component being a 1-dimensional estimate. This means that for each component the type of its univariate parametric distribution and its 5% and 95% quantiles must be provided. In probability-theoretic terms, these are the marginal distributions of the components. Optionally, the individual medians and the correlations between the components can be supplied.

as.estimate tries to coerce a set of objects to class estimate.

Usage

estimate(distribution, lower, upper, ..., correlation_matrix = NULL)
as.estimate(..., correlation_matrix = NULL)

Value

An object of class estimate which is a list with components $marginal and $correlation_matrix:

$marginal

is a data.frame with mandatory columns:

Mandatory column	R-type	Explanation
`distribution`	`character vector`	Distribution types
`lower`	`numeric vector`	5%-quantiles
`median`	`numeric vector`	50%-quantiles or `NA`
`upper`	`numeric vector`	95%-quantiles

The row.names are the names of the variables. Each row has the properties of an estimate1d.

Note that median is a mandatory element of an estimate object, although it is not required as input. If a component of median is numeric and not NA, then lower <= median <= upper holds for that component. In any case, an estimate object has the property all(lower <= upper).

$correlation_matrix

is a symmetric matrix with row and column names being the subset of the variables supplied in $marginal which are correlated. Its elements are the corresponding correlations.

Arguments

distribution: character vector: defining the types of the univariate parametric distributions.
lower: numeric vector: lower bounds of the 90% confidence intervals, i.e. the 5%-quantiles of this estimate's components.
upper: numeric vector: upper bounds of the 90% confidence intervals, i.e. the 95%-quantiles of this estimate's components.
...: in estimate: optional arguments that can be coerced to a data frame comprising further columns of the estimate (for details cf. below).
in as.estimate: arguments that can be coerced to a data frame comprising the marginal distributions of the estimate components. Mandatory columns are distribution, lower and upper.
correlation_matrix: numeric matrix: containing the correlations of the variables (optional).

Details

The input arguments define the marginal distributions of the estimate and, through the correlation matrix, its joint distribution.

The structure of the estimate's marginal input information

in estimate: The marginal distributions are defined by the arguments distribution, lower and upper and, optionally, by further columns supplied in ... that can be coerced to a data.frame with the same length as the mandatory arguments.
in as.estimate: The marginal distributions are completely defined in .... These arguments must be coercible to a data.frame, all having the same length. Mandatory columns are distribution, lower and upper.

Mandatory input columns

Column	R-type	Explanation
`distribution`	`character vector`	Marginal distribution types
`lower`	`numeric vector`	Marginal 5%-quantiles
`upper`	`numeric vector`	Marginal 95%-quantiles

It must hold that lower <= upper for every component of the estimate.
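
The rules for the mandatory columns can be illustrated with a small base R check. This is only a sketch of the stated constraints, not the package's implementation; `check_marginal` is a hypothetical helper name introduced here for illustration.

```r
# Hypothetical helper (not part of the package): checks the mandatory
# input columns described above on a plain data.frame.
check_marginal <- function(marginal) {
  mandatory <- c("distribution", "lower", "upper")
  missing_cols <- setdiff(mandatory, names(marginal))
  if (length(missing_cols) > 0)
    stop("missing mandatory column(s): ", paste(missing_cols, collapse = ", "))
  if (!is.character(marginal$distribution))
    stop("'distribution' must be a character vector")
  if (!is.numeric(marginal$lower) || !is.numeric(marginal$upper))
    stop("'lower' and 'upper' must be numeric vectors")
  if (!all(marginal$lower <= marginal$upper))
    stop("lower <= upper must hold for every component")
  invisible(TRUE)
}

# Same values as the minimum estimate in the Examples section.
marginal <- data.frame(distribution = c("posnorm", "lnorm"),
                       lower = c(4, 4),
                       upper = c(50, 10),
                       stringsAsFactors = FALSE)
check_marginal(marginal)
```

A data frame that passes this check has the shape that as.estimate expects for its marginal information, according to the description above.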

Optional input columns

The optional parameters in ... provide additional characteristics of the marginal distributions of the estimate. Frequent optional columns are:

Column	R-type	Explanation
`variable`	`character vector`	Variable names
`median`	cf. below	Marginal 50%-quantiles
`method`	`character vector`	Methods for calculation of marginal distribution parameters

The `median` column

If supplied as input, any component of median can be NA, numeric (and not NA) or the character string "mean". If it equals "mean", it is set to rowMeans(cbind(lower, upper)) for this component; if it is numeric, lower <= median <= upper must hold for this component. If no median element is provided, the default is median=rep(NA, length(distribution)).
The median is relevant to the choice of method for generating random numbers (cf. random.estimate).
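
The median rules can be sketched in base R as follows. `resolve_median` is a hypothetical name, and the code only mirrors the behaviour described in this section; the package may handle edge cases differently.

```r
# Hypothetical helper (not the package's code): resolves a median input
# column according to the rules described above.
resolve_median <- function(lower, upper, median = rep(NA, length(lower))) {
  median <- as.character(median)               # allows mixing numbers, NA and "mean"
  is_mean <- !is.na(median) & median == "mean"
  # "mean" (and any other non-numeric string) becomes NA in this sketch
  out <- suppressWarnings(as.numeric(median))
  out[is_mean] <- rowMeans(cbind(lower, upper))[is_mean]
  ok <- is.na(out) | (lower <= out & out <= upper)
  if (!all(ok))
    stop("lower <= median <= upper must hold for every numeric median")
  out
}

# "mean" is replaced by (4 + 50) / 2, NA stays NA, 70 is kept as is.
resolve_median(lower  = c(4, 4, 50),
               upper  = c(50, 10, 2000),
               median = c("mean", NA, "70"))
```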

The structure of the estimate's correlation input information

The argument correlation_matrix is the submatrix of the full correlation matrix of the estimate that contains all correlated elements. Thus, its row and column names must be a subset of the variable names of the marginal distributions. This means that uncorrelated variables do not need to be listed explicitly.
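
To illustrate the relationship between the submatrix and the full correlation matrix, the following base R sketch embeds a submatrix into an identity matrix over all variables. `expand_correlation` is a hypothetical helper, and the package is not claimed to store or compute the full matrix this way.

```r
# Hypothetical helper: embeds a correlation submatrix into the full
# correlation matrix; variables not named in it are treated as uncorrelated.
expand_correlation <- function(variables, correlation_matrix) {
  stopifnot(all(rownames(correlation_matrix) %in% variables),
            identical(rownames(correlation_matrix),
                      colnames(correlation_matrix)))
  full <- diag(length(variables))              # identity: no correlations
  dimnames(full) <- list(variables, variables)
  correlated <- rownames(correlation_matrix)
  full[correlated, correlated] <- correlation_matrix
  full
}

# Variable names and correlation taken from the last example below.
sub <- matrix(c(1, -0.3, -0.3, 1), nrow = 2,
              dimnames = list(c("revenue1", "costs2"),
                              c("revenue1", "costs2")))
expand_correlation(c("revenue1", "revenue2", "costs1", "costs2"), sub)
```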

correlation_matrix must have all the properties of a correlation matrix, viz. symmetry, all diagonal elements equal to 1 and all off-diagonal elements between -1 and 1.
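
A minimal base R sketch of these three checks is given here. `is_correlation_matrix` and its `tol` parameter are hypothetical, and the function tests only the properties listed above, not any further conditions the package may impose.

```r
# Hypothetical helper: tests the three properties listed above.
is_correlation_matrix <- function(m, tol = 1e-8) {
  if (!is.matrix(m) || !is.numeric(m) || nrow(m) != ncol(m)) return(FALSE)
  off_diagonal <- m[row(m) != col(m)]
  isSymmetric(unname(m), tol = tol) &&         # symmetry
    all(abs(diag(m) - 1) < tol) &&             # diagonal elements equal 1
    all(off_diagonal >= -1 & off_diagonal <= 1) # off-diagonal in [-1, 1]
}

is_correlation_matrix(matrix(c(1, -0.3, -0.3, 1), nrow = 2))
```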

Examples

# Create a minimum estimate (only mandatory marginal information supplied):
estimateMin<-estimate(c("posnorm", "lnorm"),
                      c(        4,       4),
                      c(       50,      10))
print(estimateMin) 

# Create an estimate with optional columns (only marginal information supplied):
estimateMarg<-estimate(           c("posnorm", "lnorm"),
                                  c(        4,       4),
                                  c(       50,      10),
                         variable=c("revenue", "costs"),
                         median = c(   "mean",      NA),
                         method = c(    "fit",      ""))
print(estimateMarg)
print(corMat(estimateMarg))

# Create a minimum estimate from text (only mandatory marginal information supplied):
estimateTextMin<-"distribution, lower, upper
                  posnorm,      100,   1000
                  posnorm,      50,    2000
                  posnorm,      50,    2000
                  posnorm,      100,   1000"
estimateMin<-as.estimate(read.csv(header=TRUE, text=estimateTextMin, 
                          strip.white=TRUE, stringsAsFactors=FALSE))
print(estimateMin) 

# Create an estimate from text (only marginal information supplied):
estimateText<-"variable,  distribution, lower, upper, median, method
               revenue1,  posnorm,      100,   1000,  NA,        
               revenue2,  posnorm,      50,    2000,    ,     fit
               costs1,    posnorm,      50,    2000,  70,     calculate
               costs2,    posnorm,      100,   1000,  mean,             "
estimateMarg<-as.estimate(read.csv(header=TRUE, text=estimateText, 
                          strip.white=TRUE, stringsAsFactors=FALSE))
print(estimateMarg)
print(corMat(estimateMarg))

# Create an estimate from text (with correlated components): 
estimateTextMarg<-"variable,  distribution, lower, upper
                   revenue1,  posnorm,      100,   1000
                   revenue2,  posnorm,      50,    2000
                   costs1,    posnorm,      50,    2000
                   costs2,    posnorm,      100,   1000"
estimateTextCor<-",         revenue1, costs2
                  revenue1,        1,   -0.3
                  costs2,       -0.3,      1"
estimateCor<-as.estimate(read.csv(header=TRUE, text=estimateTextMarg, 
                          strip.white=TRUE, stringsAsFactors=FALSE),
                          correlation_matrix=data.matrix(read.csv(text=estimateTextCor, 
                                                                  row.names=1,
                                                                  strip.white=TRUE)))
print(estimateCor)
print(corMat(estimateCor))
