idn: Multidimensional evaluation on posets (Identification Function only)

Description

Given a partial order (arguments profiles and/or zeta) and a selected threshold, the function computes the identification function and returns it in an object of S3 class parsec. The identification function is computed by uniform sampling of the linear extensions of the input poset, through a C implementation of the Bubley-Dyer (1999) algorithm. idn is a simplified and faster version of evaluation that computes only the identification function.

Usage

idn(
    profiles = NULL,
    threshold,
    error = 10^(-3),
    zeta = getzeta(profiles),
    weights = {
        if (!is.null(profiles)) 
            profiles$freq
        else rep(1, nrow(zeta))
    },
    linext = lingen(zeta),
    nit = floor({
        n <- nrow(zeta)
        n^5 * log(n) + n^4 * log(error^(-1))
    }),
    maxint = 2^31 - 1
)

Value

profiles: an object of S3 class wprof reporting poset profiles and their associated frequencies (number of statistical units in each profile).
number_of_profiles: number of profiles.
number_of_variables: number of variables.
incidence: S3 class incidence, incidence matrix of the poset.
cover: S3 class cover, cover matrix of the poset.
threshold: boolean vector specifying whether a profile belongs to the threshold.
number_of_iterations: number of iterations performed by the Bubley-Dyer algorithm.
rank_dist: matrix reporting by rows the relative frequency distribution of the poverty ranks of each profile, over the set of sampled linear extensions.
thr_dist: vector reporting the relative frequency with which each profile is used as threshold in the sampled linear extensions. This result is useful for an a posteriori evaluation of the poset threshold.
prof_w: vector of weights assigned to each profile.
edges_weights: matrix of distances between profiles, used to evaluate the measures of gap.
idn_f: vector reporting the identification function, computed as the fraction of sampled linear extensions where a profile is in the downset of the threshold.
svr_abs: NA; use evaluation to obtain this result.
svr_rel: NA; use evaluation to obtain this result.
wea_abs: NA; use evaluation to obtain this result.
wea_rel: NA; use evaluation to obtain this result.
poverty_gap: NA; use evaluation to obtain this result.
wealth_gap: NA; use evaluation to obtain this result.
inequality: NA; use evaluation to obtain this result.
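
The definition of idn_f can be illustrated on a toy poset small enough to list every linear extension. The sketch below is base R only and does not call parsec: it replaces Bubley-Dyer sampling with exhaustive enumeration, and it assumes that a profile is "in the downset of the threshold" in a given linear extension when it is ranked at or below the highest-ranked threshold element. The helpers perms and is_linext and the four-profile poset are hypothetical, introduced only for illustration.

```r
# Toy poset: profiles of two binary variables under the product order.
profs <- c("11", "12", "21", "22")

# leq[i, j] is TRUE when profile i is below or equal to profile j.
leq <- matrix(FALSE, 4, 4, dimnames = list(profs, profs))
diag(leq) <- TRUE
leq["11", ] <- TRUE   # "11" is the bottom element
leq[, "22"] <- TRUE   # "22" is the top element

# All permutations of a vector (hypothetical helper, only viable for tiny inputs).
perms <- function(v) {
  if (length(v) <= 1) return(list(v))
  out <- list()
  for (i in seq_along(v)) {
    for (p in perms(v[-i])) out[[length(out) + 1]] <- c(v[i], p)
  }
  out
}

# A permutation, read from bottom to top, is a linear extension when
# every relation i <= j of the poset is respected by the positions.
is_linext <- function(ord) {
  pos <- match(seq_along(ord), ord)   # rank of each element in the ordering
  all(!leq | outer(pos, pos, "<="))
}

exts <- Filter(is_linext, perms(seq_along(profs)))

# Threshold given by profile name, as in the threshold argument of idn.
threshold <- "12"
thr <- match(threshold, profs)

# One column per linear extension: TRUE where the profile is ranked at or
# below the highest threshold element (assumed reading of "downset").
in_downset <- sapply(exts, function(ord) {
  pos <- match(seq_along(ord), ord)
  pos <= max(pos[thr])
})

# Fraction of linear extensions in which each profile is in the downset.
idn_f <- rowMeans(in_downset)
names(idn_f) <- profs
idn_f
```

This poset has two linear extensions, which differ only in the order of the incomparable profiles "12" and "21". With threshold "12", profile "21" falls in the downset in one of the two, so its value is 0.5, while "11" and "12" get 1 and "22" gets 0. idn estimates the same kind of fraction from sampled linear extensions rather than from the full enumeration.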

Arguments

profiles: an object of S3 class wprof.
threshold: a vector identifying the threshold. It can be a vector of indexes (numeric), a vector of poset element names (character) or a boolean vector of length equal to the number of elements.
error: the "distance" from uniformity in the sampling distribution of linear extensions.
zeta: the incidence matrix of the poset. An object of S3 class incidence. By default, extracted from profiles.
weights: weights assigned to profiles. If the argument profiles is not NULL, weights are by default set equal to profile frequencies, otherwise they are set equal to 1.
linext: the linear extension initializing the sampling algorithm. By default, it is generated by lingen(zeta). Alternatively, it can be provided by the user as a vector of element positions.
nit: Number of iterations in the Bubley-Dyer algorithm, by default evaluated using a formula of Karzanov and Khachiyan based on the number of poset elements and the argument error (see Bubley and Dyer, 1999).
maxint: Maximum integer. By default, the maximum integer representable on a 32-bit system. This argument is used to group iterations and run the compiled C code several times, so as to avoid memory indexing problems. Users can set a lower value of maxint when less RAM is available.
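
To get a feel for how the default nit grows with the number of poset elements and with error, the expression from the Usage section can be wrapped in a plain base R function. default_nit is a hypothetical helper name, not part of parsec, and the count of 24 profiles assumes that var2prof(varlen = c(3, 2, 4)) generates every combination of the three variables.

```r
# Default iteration count, copied from the nit default in the Usage section.
# default_nit is a hypothetical helper, not a parsec function.
default_nit <- function(n, error = 10^(-3)) {
  floor(n^5 * log(n) + n^4 * log(error^(-1)))
}

# Assumed number of profiles for varlen = c(3, 2, 4): 3 * 2 * 4 = 24.
n <- prod(c(3, 2, 4))

default_nit(n)                   # default error = 10^(-3)
default_nit(n, error = 10^(-2))  # looser tolerance, fewer iterations
default_nit(2 * n)               # doubling n multiplies the count by more than 2^5
```

For 24 profiles the default is on the order of 10^7 iterations. Because the dominant term grows as n^5 * log(n), the size of the poset affects the run time far more than the choice of error.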

References

Bubley R., Dyer M. (1999), Faster random generation of linear extensions, Discrete Math., 201, 81-88.

Fattore M., Arcagni A. (2013), Measuring multidimensional polarization with ordinal data, SIS 2013 Statistical Conference, BES-M3.1 - The BES and the challenges of constructing composite indicators dealing with equity and sustainability

Examples

profiles <- var2prof(varlen = c(3, 2, 4))
threshold <- c("311", "112")

res <- idn(profiles, threshold, maxint = 10^5)

summary(res)
plot(res)