If a model is "singular", this means that some dimensions of the
variance-covariance matrix have been estimated as exactly zero. This
often occurs for mixed models with complex random effects structures.
“While singular models are statistically well defined (it is
theoretically sensible for the true maximum likelihood estimate to
correspond to a singular fit), there are real concerns that (1) singular
fits correspond to overfitted models that may have poor power; (2) chances
of numerical problems and mis-convergence are higher for singular models
(e.g. it may be computationally difficult to compute profile confidence
intervals for such models); (3) standard inferential procedures such as
Wald statistics and likelihood ratio tests may be inappropriate.”
(lme4 Reference Manual)
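To make "estimated as exactly zero" concrete, the following minimal sketch builds the 2x2 variance-covariance matrix of a random intercept and a random slope and checks whether its smallest eigenvalue is numerically zero. The function names and the tolerance are illustrative assumptions and not part of lme4 or any other package. The sketch shows only the underlying idea, not how any particular software implements its check.

```python
import math

def cov_matrix(sd_intercept, sd_slope, corr):
    """2x2 covariance matrix of a random intercept and a random slope."""
    cov = corr * sd_intercept * sd_slope
    return [[sd_intercept ** 2, cov], [cov, sd_slope ** 2]]

def eigenvalues_2x2(m):
    """Closed-form eigenvalues (smallest, largest) of a symmetric 2x2 matrix."""
    a, b, d = m[0][0], m[0][1], m[1][1]
    mean = (a + d) / 2
    radius = math.hypot((a - d) / 2, b)
    return mean - radius, mean + radius

def is_singular(m, tol=1e-8):
    """True if one dimension of the matrix has (numerically) zero variance.

    The relative tolerance `tol` is an assumed, illustrative value.
    """
    smallest, largest = eigenvalues_2x2(m)
    return smallest <= tol * max(largest, 1.0)

# A slope variance of zero and a correlation of exactly 1 both give a
# singular matrix; a moderate correlation with two positive variances does not.
zero_slope = is_singular(cov_matrix(2.0, 0.0, 0.0))
perfect_corr = is_singular(cov_matrix(2.0, 1.0, 1.0))
regular = is_singular(cov_matrix(2.0, 1.0, 0.3))
```

Note that a correlation of exactly +1 or -1 makes the matrix singular even though both variances are positive: the zero-variance dimension is then a linear combination of the intercept and the slope.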
There is no gold standard for dealing with singularity or for choosing a
random-effects specification. Besides using fully Bayesian methods
(with informative priors), proposals in a frequentist framework are:
- avoid fitting overly complex models, such that the
  variance-covariance matrices can be estimated precisely enough
  (Matuschek et al. 2017)
- use some form of model selection to choose a model that balances
  predictive accuracy and overfitting/type I error (Bates et al. 2015,
  Matuschek et al. 2017)
- “keep it maximal”, i.e. fit the most complex model consistent
  with the experimental design, removing only terms required to allow a
  non-singular fit (Barr et al. 2013)
Note the difference between singularity and convergence: singularity
indicates an issue with the "true" best estimate, i.e. whether the maximum
likelihood estimate of the variance-covariance matrix of the random
effects is positive definite or only positive semi-definite. Convergence
is a question of whether we can assume that the numerical optimization has
worked correctly.
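The distinction can be illustrated with a case where no numerical optimization is involved at all. The sketch below assumes a balanced one-way random-intercept model, for which the maximum likelihood (not REML) variance estimates have a closed form. The function name and the data are hypothetical, and this is not the algorithm used by lme4. When the group means vary less than the residual noise alone would predict, the exact maximum likelihood estimate of the group variance lies on the boundary at zero, so the fit is singular although nothing failed to converge.

```python
def ml_variance_components(groups):
    """Closed-form ML estimates (group variance, residual variance).

    Assumes a balanced design: every group has the same number of values.
    """
    k = len(groups)          # number of groups
    n = len(groups[0])       # observations per group
    grand_mean = sum(sum(g) for g in groups) / (k * n)
    group_means = [sum(g) / n for g in groups]
    ss_between = n * sum((m - grand_mean) ** 2 for m in group_means)
    ss_within = sum((y - m) ** 2
                    for g, m in zip(groups, group_means) for y in g)

    # Unconstrained maximum of the likelihood.
    sigma2_resid = ss_within / (k * (n - 1))
    sigma2_group = (ss_between / k - sigma2_resid) / n
    if sigma2_group <= 0:
        # The maximum lies on the boundary: the group variance is exactly
        # zero and all variation is attributed to the residual term.
        return 0.0, (ss_between + ss_within) / (k * n)
    return sigma2_group, sigma2_resid

# Identical group means: the exact ML estimate is singular.
singular_fit = ml_variance_components([[1, 5, 9], [2, 5, 8], [0, 5, 10]])

# Clearly separated groups: the group variance is estimated as positive.
regular_fit = ml_variance_components([[1, 2, 3], [11, 12, 13], [21, 22, 23]])
```

In the first call the estimate is exact and still singular. Convergence diagnostics only matter once such estimates have to be found by an iterative optimizer.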