Given a hyper2
object and a point in probability space,
function gradient()
returns the gradient of the log-likelihood;
function hessian()
returns the bordered Hessian matrix. By
default, both functions are evaluated at the maximum likelihood estimate
for \(p\), as given by maxp()
.
gradient(H, probs=indep(maxp(H)))
hessian(H,probs=indep(maxp(H)),border=TRUE)
hessian_lowlevel(L, powers, probs, pnames,n)
is_ok_hessian(M, give=TRUE)
Function gradient()
returns a vector of length \(n-1\) with
entries being the gradient of the log-likelihood with respect to the
\(n-1\) independent components of
(p_1,...,p_n)(p_1,...,p_n), namely
(p_1,...,p_n-1)(p_1,...,p_n-1). The fillup
value p_n is calculated as
1-(p_1,...,p_n-1)1-(p_1,...,p_n-1).
If argument border
is TRUE
, function hessian()
returns an \(n\)-by-\(n\) matrix of second derivatives; the borders
are as returned by gradient()
. If border
is FALSE
,
ignore the fillup value and return an \(n-1\)-by-\(n-1\) matrix.
Calling hessian()
at the evaluate will not return exact zeros for
the constraint on the fillup value; gradient()
is used and this
does not return exactly zeros at the evaluate.
A hyper2
object
Components of a hyper2
object
A vector of probabilities
Character vector of names
Boolean, with default TRUE
meaning to return the
bordered Hessian and FALSE
meaning to return the Hessian
(warning: this option does not respect the unit sum constraint)
A bordered Hessian matrix, understood to have a single
constraint (the unit sum) at the last row and column; the output of
hessian(border=TRUE)
Boolean with default FALSE
meaning for function
is_ok_hessian()
to return whether or not M
corresponds to a negative-definite matrix, and TRUE
meaning
to return more details
Robin K. S. Hankin
Function gradient()
returns the gradient of the log-likelihood
function. If the hyper2
object is of size \(n\), then argument
probs
may be a vector of length \(n-1\) or \(n\); in the
former case it is interpreted as indep(p)
. In both cases, the
returned gradient is a vector of length \(n-1\).
The function returns the derivative of the loglikelihood with respect to
the \(n-1\) independent components of
(p_1,...,p_n)(p_1,...,p_n), namely
(p_1,...,p_n-1)(p_1,...,p_n-1). The fillup
value p_n is calculated as
1-(p_1+ + p_n-1)1-(p_1+...+p_n-1).
Function gradientn()
returns the gradient of the loglikelihood
function but ignores the unit sum constraint. If the hyper2
object is of size \(n\), then argument probs
must be a vector
of length \(n\), and the function returns a named vector of length
\(n\). The last element of the vector is not treated differently from
the others; all \(n\) elements are treated as independent. The sum
need not equal one.
Function hessian()
returns the bordered Hessian, a matrix
of size n+1 n+1(n+1)*(n+1), which is useful when using
Lagrange's method of undetermined multipliers. The first row and column
correspond to the unit sum constraint, p_1=1p_1+...+p_n=1.
Row and column names of the matrix are the pnames()
of the
hyper2
object, plus “usc
” for “Unit Sum
Constraint”.
The unit sum constraint borders could have been added with idiom
magic::adiag(0,pad=1,hess)
, which might be preferable.
Function is_ok_hessian()
returns the result of the second
derivative test for the maximum likelihood estimate being a local
maximum on the constraint hypersurface. This is a generalization of the
usual unconstrained problem, for which the test is the Hessian's being
negative-definite.
Function hessian_lowlevel()
is a low-level helper function that
calls the C++
routine.
Further examples and discussion is given in file
inst/gradient.Rmd
. See also the discussion at man/maxp.Rd
on the different optimization routines available.
data(chess)
p <- c(1/2,1/3)
delta <- rnorm(2)/1e5 # delta needs to be quite small
deltaL <- loglik(p+delta,chess) - loglik(p,chess)
deltaLn <- sum(delta*gradient(chess,p + delta/2)) # numeric
deltaL - deltaLn # should be small [zero to first order]
H <- hessian(icons)
is_ok_hessian(H)
Run the code above in your browser using DataLab