TraMineR (version 2.2-10)

seqstatd: Sequence of transversal state distributions and their entropies

Description

Returns the state relative frequencies, the number of valid states and the entropy of the state distribution at each position in the sequence.

Usage

seqstatd(seqdata, weighted=TRUE, with.missing=FALSE, norm=TRUE)

Value

A list with three elements: Frequencies (relative frequencies), ValidStates (number of valid states at each position), and Entropy (cross-sectional entropy at each position).

The returned list has attributes nbseq (number of sequences), cpal, xtlab, xtstep, tick.last, weighted, and norm.
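As a minimal sketch of how the returned object might be inspected, the snippet below builds the biofam sequence object used in the Examples section and pulls out the three list elements and one attribute named above. The variable name statd is only an illustrative choice, and the comments on the layout of Frequencies (states in rows, positions in columns) are an assumption to verify with dim().

```r
library(TraMineR)

data(biofam)
biofam.seq <- seqdef(biofam, 10:25)
statd <- seqstatd(biofam.seq)      # 'statd' is an illustrative name

names(statd)                       # the three list elements

freq <- statd$Frequencies          # relative frequencies
dim(freq)                          # assumed layout: states x positions
colSums(freq)                      # each position's frequencies should sum to 1

head(statd$ValidStates)            # number of valid states, first positions
head(statd$Entropy)                # cross-sectional entropy, first positions

attr(statd, "nbseq")               # number of sequences stored as an attribute
```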

Arguments

seqdata

a state sequence object as defined by the seqdef function.

weighted

If TRUE, distributions account for the weights assigned to the state sequence object (see seqdef). Set to FALSE to ignore the weights.

with.missing

If FALSE (default value), returned distributions ignore missing values.

norm

If TRUE (default value), entropy is normalized, i.e., divided by the entropy of the alphabet (the maximal entropy, reached when all states of the alphabet are equally frequent). Set to FALSE to obtain the entropy without normalization.
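The effect of norm can be checked with a short sketch. It assumes that, with with.missing=FALSE, the entropy of the alphabet equals the natural log of the alphabet size, so that the normalized values should equal the raw ones divided by log(s). The names h_norm, h_raw and s are illustrative only.

```r
library(TraMineR)

data(biofam)
biofam.seq <- seqdef(biofam, 10:25)

# Entropies with and without normalization
h_norm <- seqstatd(biofam.seq, norm = TRUE)$Entropy
h_raw  <- seqstatd(biofam.seq, norm = FALSE)$Entropy

# Alphabet size; log(s) is the entropy of a uniform distribution over s states
s <- length(alphabet(biofam.seq))

# Compare the normalized entropy with the raw entropy rescaled by log(s)
all.equal(as.numeric(h_norm), as.numeric(h_raw) / log(s))
```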

Author

Alexis Gabadinho and Gilbert Ritschard

Details

In addition to the state distribution at each position in the sequence, the seqstatd function also provides, for each time point, the number of valid states and the Shannon entropy of the observed cross-sectional state distribution. Letting \(p_i\) denote the proportion of cases in state \(i\) at the considered position, the entropy is

$$ h(p_1,\ldots,p_s) = -\sum_{i=1}^{s} p_i \log(p_i) $$

where \(s\) is the size of the alphabet and \(\log\) is the natural (base e) logarithm. The entropy is 0 when all cases are in the same state and is maximal when the same proportion of cases is in each state. It measures the diversity of states observed at the considered position.

The first studies to use sequences of cross-sectional entropies (but with aggregated transversal data) are Billari (2001) and Fussell (2005).
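The entropy formula can be illustrated by hand in base R. The sketch below is not TraMineR's internal code: shannon_entropy is a hypothetical helper, the three distributions are made-up examples over an alphabet of four states, and zero proportions are dropped on the usual convention that 0 log(0) = 0.

```r
# Shannon entropy (natural log) of a vector of proportions.
# Zero proportions are dropped, following the convention 0 * log(0) = 0.
shannon_entropy <- function(p) {
  p <- p[p > 0]
  -sum(p * log(p))
}

# Hypothetical cross-sectional distributions over an alphabet of s = 4 states
all_same <- c(1, 0, 0, 0)           # every case in the same state
uniform  <- rep(1 / 4, 4)           # cases evenly spread over the states
skewed   <- c(0.7, 0.1, 0.1, 0.1)   # one dominant state

h_same    <- shannon_entropy(all_same)   # minimum: 0
h_uniform <- shannon_entropy(uniform)    # maximum for s = 4: log(4)
h_skewed  <- shannon_entropy(skewed)     # strictly between 0 and log(4)

# Dividing by log(s) rescales the entropy to the interval [0, 1]
h_skewed_norm <- h_skewed / log(length(skewed))
```

The two boundary cases match the statement above: the entropy vanishes when all cases share one state and reaches log(s) when the states are equally frequent.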

References

Ritschard, G. (2021), "Measuring the nature of individual sequences", Sociological Methods and Research, doi:10.1177/00491241211036156.

Billari, F. C. (2001). The analysis of early life courses: complex descriptions of the transition to adulthood. Journal of Population Research 18 (2), 119-24.

Fussell, E. (2005). Measuring the early adult life course in Mexico: An application of the entropy index. In R. Macmillan (Ed.), The Structure of the Life Course: Standardized? Individualized? Differentiated?, Advances in Life Course Research, Vol. 9, pp. 91-122. Amsterdam: Elsevier.

See Also

plot.stslist.statd the plot method for objects of class stslist.statd,
seqdplot for higher level chronograms (state distribution plots),
seqHtplot for transversal entropy line over sequence positions, and
seqdHplot for chronograms with overlaid entropy line.

Examples

library(TraMineR)

data(biofam)
biofam.seq <- seqdef(biofam, 10:25)
sd <- seqstatd(biofam.seq)
## Plotting the state distribution
plot(sd, type="d")

## Line of cross-sectional entropies
plot(sd, type="Ht")

## ====================
## example with weights
## ====================
data(ex1)
ex1.seq <- seqdef(ex1, 1:13, weights=ex1$weights)

## Unweighted
seqstatd(ex1.seq, weighted=FALSE)

## Weighted (default)
seqstatd(ex1.seq, weighted=TRUE)
