collapse (version 1.2.0)

varying: Fast Check of Variation in Data

Description

varying is a generic function that checks, column-wise, for variation in the values of x, optionally within the groups g (e.g. a panel identifier).

Usage

varying(x, ...)

# S3 method for default
varying(x, g = NULL, any_group = TRUE, use.g.names = TRUE, ...)

# S3 method for matrix
varying(x, g = NULL, any_group = TRUE, use.g.names = TRUE, drop = TRUE, ...)

# S3 method for data.frame
varying(x, by = NULL, cols = NULL, any_group = TRUE, use.g.names = TRUE, drop = TRUE, ...)

# Methods for compatibility with plm:

# S3 method for pseries
varying(x, effect = 1L, any_group = TRUE, use.g.names = TRUE, ...)

# S3 method for pdata.frame
varying(x, effect = 1L, cols = NULL, any_group = TRUE, use.g.names = TRUE, drop = TRUE, ...)

# Methods for compatibility with dplyr:

# S3 method for grouped_df
varying(x, any_group = TRUE, use.g.names = FALSE, drop = TRUE, keep.group_vars = TRUE, ...)

Arguments

x

a vector, matrix, data.frame or grouped tibble (dplyr::grouped_df).

g

a factor, GRP object, atomic vector (internally converted to factor) or a list of vectors / factors (internally converted to a GRP object) used to group x.

by

same as g, but also allows one- or two-sided formulas, e.g. ~ group1 + group2 or var1 + var2 ~ group1 + group2. See Examples.

any_group

logical. If !is.null(g), FALSE will check and report variation in all groups, whereas the default TRUE only checks if there is variation within any group. See Examples.

cols

select columns using column names, indices or a function (e.g. is.numeric). Two-sided formulas passed to by overwrite cols.

use.g.names

logical. Make group names and add them to the result as names (vector method) or row names (matrix and data.frame methods). No row names are generated for data.tables and (by default) grouped tibbles.

drop

matrix and data.frame methods: drop dimensions and return an atomic vector if the result is 1-dimensional.

effect

plm methods: Select which panel identifier should be used for between and within transformations of the data. 1L means the first variable in the plm::index, 2L the second, etc. Index variables can also be called by name. More than one variable can be supplied.

keep.group_vars

grouped_df method: Logical. FALSE removes grouping variables after computation.

...

arguments to be passed to or from other methods.

Value

A logical vector or, if !is.null(g) and any_group = FALSE, a matrix or data.frame of logical vectors indicating whether the data vary (over the dimension supplied by g).

Details

Without groups passed to g, varying simply checks if there is any variation in the columns of x and returns TRUE for each column where this is the case and FALSE otherwise. A set of data points is defined as varying if it contains at least 2 distinct non-missing values (such that a non-0 standard deviation can be computed on numeric data). varying checks for variation in both numeric and non-numeric data.
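The definition above can be illustrated with a minimal base-R sketch. The helper is_varying and the data frame d are hypothetical names introduced only for this illustration; this is not how collapse implements varying, and the sketch makes no claim about how varying treats a column that is entirely missing.

```r
# Base-R sketch of the definition: a set of data points varies if it
# contains at least 2 distinct non-missing values.
# `is_varying` is a hypothetical helper, not part of collapse.
is_varying <- function(v) {
  length(unique(v[!is.na(v)])) >= 2L
}

# Hypothetical toy data with numeric and non-numeric columns
d <- data.frame(
  const = c(1, 1, 1, NA),        # one distinct non-missing value
  num   = c(1, 2, 2, NA),        # two distinct non-missing values
  chr   = c("a", "b", "a", NA),  # non-numeric data are checked the same way
  stringsAsFactors = FALSE
)

# One logical value per column
res <- vapply(d, is_varying, logical(1))
print(res)
```

Missing values are dropped before counting distinct values, so const counts as non-varying even though it contains an NA.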

If groups are supplied to g (or alternatively a grouped_df to x), varying can operate in one of 2 modes:

  • If any_group = TRUE (the default), varying checks each column for variation in any of the groups defined by g, and returns TRUE if such within-group variation is detected and FALSE otherwise. Thus only one logical value is returned for each column, and the computation on each column terminates as soon as variation within any group is found.

  • If any_group = FALSE, varying runs through the entire data checking each group for variation and returns, for each column in x, a logical vector reporting the variation check for all groups. If a group contains only missing values, NA is returned for that group.
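The two modes can be sketched for a single vector in base R as follows. The functions is_varying and varying_by_group are hypothetical helpers written for this illustration only. Unlike the behaviour described above, the sketch always evaluates every group and does not stop early when any_group = TRUE.

```r
# Per-group check: NA for an all-missing group, otherwise TRUE if the
# group holds at least 2 distinct non-missing values.
# Both functions are hypothetical helpers, not part of collapse.
is_varying <- function(v) {
  v <- v[!is.na(v)]
  if (length(v) == 0L) return(NA)  # group contains only missing values
  length(unique(v)) >= 2L
}

varying_by_group <- function(v, g, any_group = TRUE) {
  per_group <- vapply(split(v, g), is_varying, logical(1))
  if (any_group) {
    any(per_group, na.rm = TRUE)   # one value: variation within any group?
  } else {
    per_group                      # one value per group
  }
}

# Hypothetical toy data: "a" is constant, "b" varies, "c" is all missing
x <- c(1, 1, 2, 3, NA, NA)
g <- c("a", "a", "b", "b", "c", "c")

res_any <- varying_by_group(x, g)
res_all <- varying_by_group(x, g, any_group = FALSE)
print(res_any)
print(res_all)
```

Applying varying_by_group to each column of a data frame would give one logical value per column in the first mode and a groups-by-columns table in the second.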

See Also

Data Transformations, Collapse Overview

Examples

# NOT RUN {
## Checks overall variation in all columns
varying(wlddev)

## Checks whether data are time-variant i.e. vary within country
varying(wlddev, wlddev$country)

## Same as above but done for each country individually, countries without data are coded NA
varying(wlddev, wlddev$country, any_group = FALSE)
# }
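
The by and cols arguments of the data.frame method, described under Arguments, might be used as in the following sketch. The data frame d and its column names are hypothetical and serve only as an illustration. The sketch assumes the collapse package is installed and deliberately states no expected output.

```r
library(collapse)

# Hypothetical toy panel: two units observed in two periods
d <- data.frame(
  id   = c("a", "a", "b", "b"),
  year = c(1, 2, 1, 2),
  x    = c(1, 1, 2, 2),
  y    = c(1, 2, 3, 3)
)

# One-sided formula: group by id and check the remaining columns
v1 <- varying(d, ~ id)

# Two-sided formula: check only x and y within id
v2 <- varying(d, x + y ~ id)

# cols given as a function, combined with a one-sided formula
v3 <- varying(d, ~ id, cols = is.numeric)

# Per-group report instead of a single value per column
v4 <- varying(d, x + y ~ id, any_group = FALSE)

print(v1)
print(v2)
print(v3)
print(v4)
```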