Solves the empirical Bayes normal means (EBNM) problem using the family of
nonnegative distributions consisting of mixtures where one component is a
point mass at zero and the other is a truncated normal distribution with
lower bound zero and nonzero mode. Typically, the mode is positive, with
the ratio of the mode to the standard deviation taken to be large, so that
posterior estimates are strongly shrunk towards one of two values (zero or
the mode of the normal component).
Identical to function ebnm with argument prior_family = "generalized_binary".
For details, see Liu et al. (2023), cited in References below.
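As a rough illustration of the prior family described above, the following base R sketch simulates means from a mixture of a point mass at zero and a truncated normal with lower bound zero, then adds normal noise. The values chosen for w, mu, and s, and the helper rtnorm_pos, are illustrative assumptions and are not part of the ebnm package.

```r
set.seed(1)

n     <- 1000
w     <- 0.3          # assumed weight on the truncated normal component
mu    <- 2            # assumed mode of the truncated normal component
scale <- 0.1          # ratio of (untruncated) standard deviation to mode
sigma <- scale * mu   # implied untruncated standard deviation (here 0.2)

# Hypothetical helper: inverse-CDF draws from N(mean, sd^2) truncated to [0, Inf).
rtnorm_pos <- function(n, mean, sd) {
  p_lo <- pnorm(0, mean, sd)            # probability mass below the lower bound
  u <- runif(n, min = p_lo, max = 1)    # uniform draws restricted to the kept region
  qnorm(u, mean, sd)
}

# Each mean is either exactly zero or drawn from the truncated normal.
is_nonnull <- rbinom(n, 1, w) == 1
theta <- numeric(n)
theta[is_nonnull] <- rtnorm_pos(sum(is_nonnull), mu, sigma)

# Observations: true means plus normal noise with known standard error s.
s <- 0.5
x <- theta + rnorm(n, sd = s)
```

With a small scale, the nonzero means cluster tightly around the mode, which is why posterior estimates are pulled towards either zero or the mode.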
ebnm_generalized_binary(
x,
s = 1,
mode = "estimate",
scale = 0.1,
g_init = NULL,
fix_g = FALSE,
output = ebnm_output_default(),
control = NULL,
...
)
An ebnm object. Depending on the argument to output, the object is a list containing elements:
data
A data frame containing the observations x and standard errors s.
posterior
A data frame of summary results (posterior means, standard deviations, second moments, and local false sign rates).
fitted_g
The fitted prior \(\hat{g}\).
log_likelihood
The optimal log likelihood attained, \(L(\hat{g})\).
posterior_sampler
A function that can be used to produce samples from the posterior. The sampler takes a single parameter nsamp, the number of posterior samples to return per observation.
S3 methods coef, confint, fitted, logLik, nobs, plot, predict, print, quantile, residuals, simulate, summary, and vcov have been implemented for ebnm objects. For details, see the respective help pages, linked below under See Also.
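The returned elements and a few of the S3 methods listed above can be inspected as in the following sketch. The simulated data are an illustrative assumption; the sketch also assumes that the ebnm package is installed and that the posterior sampler is available for this prior family when all outputs are requested.

```r
library(ebnm)

set.seed(1)
# Illustrative data: 150 true means at zero and 50 at 2, observed with noise.
theta <- c(rep(0, 150), rep(2, 50))
s <- 0.5
x <- theta + rnorm(length(theta), sd = s)

# Request every available output so that the posterior sampler is included.
fit <- ebnm_generalized_binary(x, s = s, output = ebnm_output_all())

names(fit)            # which elements were returned
head(fit$posterior)   # posterior summaries, one row per observation
fit$fitted_g          # the fitted prior
fit$log_likelihood    # log likelihood attained at the fitted prior

# A few of the S3 methods implemented for ebnm objects.
post_mean <- coef(fit)
n_obs     <- nobs(fit)
ll        <- logLik(fit)

# Draw 100 posterior samples per observation.
samp <- fit$posterior_sampler(nsamp = 100)
dim(samp)
```

With the default output = ebnm_output_default(), only a subset of these elements is returned, so the sampler has to be requested explicitly.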
A vector of observations. Missing observations (NAs) are not allowed.
A vector of standard errors (or a scalar if all are equal). Standard errors may not be exactly zero, and missing standard errors are not allowed.
A scalar specifying the mode of the truncated normal component, or "estimate" if the mode is to be estimated from the data (the location of the point mass is fixed at zero).
A scalar specifying the ratio of the (untruncated) standard deviation of the normal component to its mode. This ratio must be fixed in advance (i.e., it is not possible to set scale = "estimate" when using generalized binary priors).
The prior distribution \(g\). Usually this is left unspecified (NULL) and estimated from the data. However, it can be used in conjunction with fix_g = TRUE to fix the prior (useful, for example, to do computations with the "true" \(g\) in simulations). If g_init is specified but fix_g = FALSE, g_init specifies the initial value of \(g\) used during optimization. When supplied, g_init should be an object of class tnormalmix or an ebnm object in which the fitted prior is an object of class tnormalmix.
If TRUE, fix the prior \(g\) at g_init instead of estimating it.
A character vector indicating which values are to be returned. Function ebnm_output_default() provides the default return values, while ebnm_output_all() lists all possible return values. See Value below.
A list of control parameters to be passed to function optim, where method has been set to "L-BFGS-B".
The following additional arguments act as control parameters for the outer EM loops in the fitting algorithm. Each loop iteratively updates parameters \(w\) (the mixture proportion corresponding to the truncated normal component) and \(\mu\) (the mode of the truncated normal component):
wlist
A vector defining intervals of \(w\) for which optimal solutions will separately be found. For example, if wlist = c(0, 0.5, 1), then two optimal priors will be found: one such that \(w\) is constrained to be less than 0.5 and one such that it is constrained to be greater than 0.5.
maxiter
A scalar specifying the maximum number of iterations to perform in each outer EM loop.
tol
A scalar specifying the convergence tolerance parameter for each outer EM loop.
mu_init
A scalar specifying the initial value of \(\mu\) to be used in each outer EM loop.
mu_range
A vector of length two specifying lower and upper bounds for possible values of \(\mu\).
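The arguments above might be combined as in the following sketch, which first fits a prior with explicit settings for the outer EM loops and then reuses that prior on new data via g_init and fix_g = TRUE. The simulated data and every numeric setting (maxit, maxiter, tol, mu_init, mu_range) are illustrative assumptions rather than recommended values; the sketch assumes that the ebnm package is installed.

```r
library(ebnm)

set.seed(1)
# Illustrative data: 150 true means at zero and 50 at 2, observed with noise.
theta <- c(rep(0, 150), rep(2, 50))
s <- 0.5
x <- theta + rnorm(length(theta), sd = s)

# Estimate the prior, passing the outer EM loop settings as additional arguments.
fit <- ebnm_generalized_binary(
  x, s = s,
  scale    = 0.1,                 # sd-to-mode ratio, fixed in advance
  control  = list(maxit = 200),   # passed on to optim (method "L-BFGS-B")
  wlist    = c(0, 0.5, 1),        # solve separately for w below and above 0.5
  maxiter  = 50,                  # cap on iterations of each outer EM loop
  tol      = 1e-3,                # convergence tolerance of each outer EM loop
  mu_init  = 1,                   # starting value for the mode
  mu_range = c(0.1, 5)            # bounds on the mode
)

# Reuse the fitted prior on new observations without re-estimating it.
x_new <- theta + rnorm(length(theta), sd = s)
fit_fixed <- ebnm_generalized_binary(x_new, s = s, g_init = fit, fix_g = TRUE)

head(fit_fixed$posterior)
```

Passing the ebnm object itself as g_init relies on its fitted prior being of class tnormalmix, which is the case for a fit produced by this function.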
Yusha Liu, Peter Carbonetto, Jason Willwerscheid, Scott A Oakes, Kay F Macleod, and Matthew Stephens (2023). Dissecting tumor transcriptional heterogeneity from single-cell RNA-seq data by generalized binary covariance decomposition. bioRxiv 2023.08.15.553436.
See ebnm for examples of usage and model details.
Available S3 methods include coef.ebnm, confint.ebnm, fitted.ebnm, logLik.ebnm, nobs.ebnm, plot.ebnm, predict.ebnm, print.ebnm, print.summary.ebnm, quantile.ebnm, residuals.ebnm, simulate.ebnm, summary.ebnm, and vcov.ebnm.