seqpolyads: Measuring the Degree of Within-Polyadic Similarities

Description

The function measures how similar the member sequences of observed polyads are to one another, compared with polyads formed by randomly matching member sequences.

Usage

seqpolyads(seqlist, a=1, method="HAM", ...,
    w=rep(1,ncol(combn(1:length(seqlist),2))),
    s=36963, T=1000, core=1, replace=TRUE, weighted=TRUE,
    with.missing=FALSE, rand.weight.type=1, role.weights=NULL,
    show.time=FALSE)

Value

The function outputs a list of seven objects:

mean.dist: Vector of length 2 with the average observed and random within-polyadic distances.
U: Vector of N U statistics, one per observed polyad (see References).
U.tp: Vector of N p-values for the two-tailed t-test of the U statistic.
V: Vector of N V statistics, one per observed polyad (see References).
V.95: Vector of N 0/1 indicators: 1 if the V value is greater than 95 percent (see Details), 0 otherwise.
observed.dist: Vector of within-polyadic distances for the observed polyadic members.
random.dist: Vector of within-polyadic distances for the T number of randomly matched polyadic members.

Arguments

seqlist: A list of J>1 state sequence objects (class stslist) that together hold the input polyads. The state sequence objects in the list must all have the same number N of sequences and the same alphabet. Create each state sequence object with seqdef and the list with list, e.g., list(gen1.seq,gen2.seq,gen3.seq).
a: Integer, 1 or 2. Random generation mechanism. If 1 (default), sequences are drawn from the observed set of sequences. If 2, states are in addition drawn at random within each randomly drawn sequence. See the reference below for details.
method: String. Method for computing sequence distances. See seqdist. Additional arguments may be required depending on the method chosen.
...: Additional arguments passed to seqdist
s: Integer. Default 36963. Using the same seed number on the same computer guarantees the same results each time. Set s=NULL if you don't want to set a seed. The random generator can be chosen with RNGkind.
w: Integer vector. Default 1 for every pair. The weights assigned to the pairs of polyad members in the weight matrix. For dyadic sequences there is a single pair, so no weight is needed and the default of 1 applies. For triadic sequences there are three weights, given in row-wise order: first and second members, first and third members, second and third members. See the reference below.
T: Integer. Default 1,000. The number of randomized computations.
core: Integer. Default 1. Number of cores for the computation. When greater than 1, the procedure utilizes parallel processing.
replace: Logical. When a=2, should state sampling in each sequence be done with replacement? Default is TRUE. Ignored when a=1.
weighted: Logical. Should we account for the weights when present in the sequence objects? See details. Default is TRUE.
with.missing: Logical. Should the missing state be considered as a regular state? Default is FALSE.
rand.weight.type: Integer, 1 or 2. Ignored when weighted=FALSE. If 1 (default), weight of each randomized polyad is the average of original weights of its members. If 2, member weights are adjusted by dividing them by the sum of weights of all drawn members of the same type.
role.weights: NULL or vector of non-negative weights of same length as the list seqlist. Ignored when weighted=FALSE. If non null, role weights for determining the weights of the randomized polyads.
show.time: Logical. Should elapsed time be displayed? Default is FALSE.
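
The two random generation mechanisms selected by a, together with replace and s, can be illustrated with a minimal base R sketch. It does not call seqpolyads and is not the package's implementation. The matrix seqs and the names draw.a1, draw.a2.repl and draw.a2.norepl are hypothetical, and sequences are drawn with equal probability for simplicity.

```r
set.seed(36963)  # same value as the default of s; any fixed seed gives repeatable draws

# Hypothetical observed sequences for one member set: N = 4 sequences of length 5
seqs <- rbind(
  c("A", "A", "B", "B", "C"),
  c("A", "B", "B", "C", "C"),
  c("B", "B", "B", "C", "C"),
  c("A", "A", "A", "A", "B")
)

# a = 1: draw a whole sequence from the observed set
draw.a1 <- seqs[sample(nrow(seqs), 1), ]

# a = 2: in addition, draw the states at random within the drawn sequence
draw.a2.repl   <- sample(draw.a1, length(draw.a1), replace = TRUE)   # replace = TRUE
draw.a2.norepl <- sample(draw.a1, length(draw.a1), replace = FALSE)  # replace = FALSE
```

Without replacement the drawn sequence keeps its state counts and only the order changes. With replacement the state counts can change as well.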

Author

Tim Liao and Gilbert Ritschard

Details

The function computes the polyadic distance of the observed polyads, i.e., the (weighted) mean of the pairwise distances between members of the polyad. In addition, the following statistics are computed:

The U statistic measures for each observed polyad by how much its polyadic distance differs from the mean polyadic distance of T randomized polyads. U.tp is the p-value for a two-tailed t-test of the U statistic.
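
As a rough illustration of the U statistic for a single polyad, the base R sketch below uses simulated distances. The sign convention (mean random distance minus observed distance) and the form of the t-test are assumptions made for this sketch. The reference gives the exact definitions, and seqpolyads may compute them differently. The names obs.dist and rand.dist are hypothetical.

```r
set.seed(1)  # arbitrary seed, only to make the illustration repeatable

# Hypothetical inputs: one observed polyadic distance and 1000 randomized ones
obs.dist  <- 2
rand.dist <- rnorm(1000, mean = 4, sd = 1)

# U taken here as mean randomized distance minus observed distance
# (assumed sign convention; see the reference)
U <- mean(rand.dist) - obs.dist

# One possible two-tailed t-test: is the mean randomized distance equal to obs.dist?
# (assumed form of the test, not necessarily the one used by seqpolyads)
U.tp <- t.test(rand.dist, mu = obs.dist, alternative = "two.sided")$p.value
```

Under this convention a large positive U means that the observed members are closer to one another than randomly matched members are.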

The V statistic is, for each observed polyad, the proportion of T randomized polyads that have a greater polyadic distance. V.95 is an associated dummy that takes value 1 when the proportion V is greater than 95% and 0 otherwise.
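
The V statistic and its dummy can be reproduced by hand for one polyad. This is a minimal base R sketch with made-up distances that follows the description above. The names obs.dist and rand.dist are hypothetical.

```r
# Hypothetical inputs: one observed polyadic distance and T = 10 randomized ones
obs.dist  <- 2
rand.dist <- c(1.5, 2.5, 3, 3.5, 4, 4.5, 5, 2.2, 3.8, 1.9)

# Proportion of randomized polyads with a greater polyadic distance
V <- mean(rand.dist > obs.dist)

# Dummy equal to 1 when that proportion is greater than 95%
V.95 <- as.integer(V > 0.95)
```

Here 8 of the 10 randomized distances exceed the observed one, so V is 0.8 and V.95 is 0.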

When the sequence objects in seqlist have weights and weighted=TRUE, the randomized sequences are sampled using the weights of the first element in the list. All members of an observed polyad are assumed to have the same weight. This does not hold for the randomized polyads, whose members are sampled independently. The weight of each randomized polyad is therefore set to the average of the weights of its members. When role weights are provided with role.weights, a weighted average of the member weights is used instead. When rand.weight.type=1, the original member weights are used. When rand.weight.type=2, the weights of the randomly selected members are first divided by the sum of the weights of all randomly drawn members of the same type.
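
The weighting rules for randomized polyads can be sketched in base R as follows. This is one reading of the description above, not the package code. All names (wt, orig.w, n.rand, idx, mw and the result vectors) and all values are hypothetical, and members are drawn with equal probability to keep the sketch short.

```r
# Hypothetical weights for N = 4 observed polyads with J = 3 member types;
# all members of an observed polyad share the same weight
wt <- c(1.0, 2.0, 0.5, 1.5)
orig.w <- cbind(type1 = wt, type2 = wt, type3 = wt)

set.seed(2)
n.rand <- 5  # number of randomized polyads in this sketch

# Draw one member per type, independently, for each randomized polyad
idx <- sapply(1:3, function(j) sample(4, n.rand, replace = TRUE))
# Weights of the drawn members (n.rand x 3 matrix)
mw <- sapply(1:3, function(j) orig.w[idx[, j], j])

# rand.weight.type = 1: average of the original member weights
w.type1 <- rowMeans(mw)

# With role weights (hypothetical values): weighted average of the member weights
role.weights <- c(1, 2, 1)
w.role <- as.vector(mw %*% role.weights) / sum(role.weights)

# rand.weight.type = 2 (assumed reading): divide each member weight by the sum
# of weights of all drawn members of the same type, then average
mw.adj <- sweep(mw, 2, colSums(mw), "/")
w.type2 <- rowMeans(mw.adj)
```

With the type 2 adjustment in this form, the weights of the randomized polyads sum to 1.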

When core > 1, the function uses the doParallel package for parallel computation.
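
The polyadic distance defined at the start of this section, and the pair ordering expected by w, can be worked through for a single triad. This is a base R sketch under stated assumptions. The helper hamming and the object triad are hypothetical. The weighted mean is normalized by the sum of the weights here, which may differ from the package's internal computation.

```r
# Hypothetical helper: Hamming distance between two equal-length state vectors
hamming <- function(x, y) sum(x != y)

# One illustrative triad: three member sequences of length 5
triad <- list(
  grandparent = c("A", "A", "B", "B", "C"),
  parent      = c("A", "B", "B", "B", "C"),
  child       = c("A", "B", "C", "C", "C")
)

# Member pairs in the order produced by combn(): (1,2), (1,3), (2,3)
pairs <- combn(length(triad), 2)

# Default pair weights, built as in the default value of w shown in Usage
w <- rep(1, ncol(pairs))

# Pairwise distances, one per column of pairs
d <- apply(pairs, 2, function(p) hamming(triad[[p[1]]], triad[[p[2]]]))

# Polyadic distance: weighted mean of the pairwise distances
polyadic.dist <- weighted.mean(d, w)
```

The pairwise distances are 1, 3 and 2, so the unweighted polyadic distance is 2. A vector such as w <- c(2, 1, 1) would give more weight to the first pair (first and second members).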

References

Tim F. Liao (2021), "Using Sequence Analysis to Quantify How Strongly Life Courses Are Linked." Sociological Science 8, 48-72, doi:10.15195/v8.a3.

Examples

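library(TraMineR)        # seqdef, seqIplot
library(TraMineRextras)  # seqpolyads and the polyads data

data(polyads)
Gen <- polyads$Gen

## One state sequence object per generation
seqGrandP <- seqdef(polyads[Gen=="1st Generation",2:11])
seqParent <- seqdef(polyads[Gen=="2nd Generation",2:11])
seqChild <- seqdef(polyads[Gen=="3rd Generation",2:11])

## Stack the three generations and plot the sequences by generation
Seq <- rbind(seqGrandP,seqParent,seqChild)
slgth <- ncol(Seq)
colnames(Seq) <- 21:30
seqIplot(Seq,group=Gen,idxs=10:1,ylab="Triad",xlab="Age")

## List of member sequence objects passed to seqpolyads
seqL <- list(seqGrandP,seqParent,seqChild)
core <- 1

## Dyads (first two generations) and triads with the Hamming distance
seqG2.Tim <- seqpolyads(seqL[1:2],method="HAM",a=1,core=core,T=100)
seqG3.Tim <- seqpolyads(seqL,method="HAM",a=1,core=core,T=100)

## Same with the CHI2 distance computed over the whole sequence length
seqG2.Dur <- seqpolyads(seqL[1:2],method="CHI2",step=slgth,core=core,T=100)
seqG3.Dur <- seqpolyads(seqL,method="CHI2",step=slgth,core=core,T=100)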
