st.prelimFit: Compute preliminary estimates for a linear model with ST-distributed error term

Description

For a univariate or multivariate linear model where the error term is assumed to have skew-t (ST) distribution and the location parameter is a linear function of a set of explanatory values, the functions compute preliminary estimates to be used as initial values for a subsequent maximization of the likelihood function. These functions are mainly intended for internal package use.

Usage

st.prelimFit(x, y, w, quick = TRUE, verbose = 0, max.nu = 30, SN=FALSE)
mst.prelimFit(x, y, w, quick = TRUE, verbose = 0, max.nu = 30, SN=FALSE)

Value

A call to st.prelimFit returns a list with these components:

dp: a vector of estimates in the DP parameterization
residuals: a vector of residual values
logLik: the corresponding log-likelihood value

A call to mst.prelimFit returns a list with these components:

dp: a list with the estimates in the DP parameterization
shrink.steps: the number of shrinking steps applied to the original estimate of the scale matrix to obtain an admissible matrix
dp.matrix: a numeric matrix formed by the component-wise DP estimates
logLik: the corresponding log-likelihood value

Arguments

x: design matrix of numeric values. It may be missing; if present, the first column must contain all 1's.
y: vector of observations of length n, or a matrix with n rows.
w: a vector of non-negative integer weights of length n; if missing, a vector of all 1's is generated.
quick: logical value which regulates whether a very quick estimate is produced (default value TRUE); see ‘Details’ for additional information.
verbose: an integer value which regulates the amount of messages printed out; default value is 0.
max.nu: threshold for the estimated degrees of freedom
SN: logical value (default value: FALSE); if TRUE, a SN distribution is assumed.

Author

Adelchi Azzalini

Details

The underlying methodology is the one presented by Azzalini and Salehi (2020). In its essence, it is based on the selection of parameter values achieving the best matching between certain quantile-based summaries of the ST distribution and the corresponding empirical quantities for the sample or, in the presence of explanatory variables, the same quantities computed from the residuals after fitting a median regression.

Argument quick selects whether the above-described matching is performed in a quick or in an accurate way. Since the output values of this function are intended to be only initial values for subsequent likelihood maximization, this explains the default option quick=TRUE. Other possible values are FALSE and NULL; the latter simply sets alpha=0 and nu=10.

Since the methodology hinges on some selected sample quantiles, it can occasionally be spoiled by poor behaviour of these basic quantiles, especially for small or moderate sample sizes. The more visible effect of such situation is a very large value of the estimated degrees of freedom, which then hampers rather than help a subsequent likelihood maximization. It is therefore appropriate to set an upper limit max.nu to this component.

Argument x may be missing. In this case, a one-column matrix with all elements 1 is created.

References

Azzalini, A. and Salehi, M. (2020). Some computational aspects of maximum likelihood estimation of the skew-t distribution. In: Computational and Methodological Statistics and Biostatistics, edited by Andriëtte Bekker, Ding-Geng Chen and Johannes T. Ferreira. Springer. DOI: 10.1007/978-3-030-42196-0

Examples

Run this code

data(barolo)
attach(barolo)  
A75 <- (reseller=="A" & volume==75)  
log.price <- log(price[A75], 10)
prelimFit <- st.prelimFit(y=log.price)
detach(barolo)
# 
data(ais)
attach(ais)
prelim32 <- mst.prelimFit(y=cbind(BMI, LBM), x=cbind(1, Ht, Wt))
detach(ais)

Run the code above in your browser using DataLab