bnspatial (version 1.1.1)

dataDiscretize: Discretize data

Description

These functions discretize continuous input data into classes. Classes can be defined by the user or, if the user provides the number of classes, calculated from quantiles (the default) or by equal intervals. dataDiscretize processes a single variable at a time, provided as a vector. bulkDiscretize discretizes multiple input rasters, optionally using parallel processing.

Usage

dataDiscretize(
  data,
  classBoundaries = NULL,
  classStates = NULL,
  method = "quantile"
)

bulkDiscretize(formattedLst, xy, inparallel = FALSE)

Arguments

data

numeric vector. The continuous data to be discretized.

classBoundaries

numeric vector or single integer. Interval boundaries to be used for data discretization. The outer values (minimum and maximum) are required. -Inf and Inf are allowed, in which case the data minimum and maximum are used to evaluate the mid values of the outer classes. Alternatively, a single integer giving the number of classes, to be split by quantiles (default) or by equal intervals.
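The handling of infinite outer boundaries can be illustrated with a minimal base R sketch. It is not the package's internal code: midValuesFor is a hypothetical helper, written only from the description above (mid value as the mean of the lower and upper boundaries, with infinite outer boundaries replaced by the data minimum and maximum).

```r
# Hypothetical helper (not part of bnspatial): mid value of each class,
# taken as the mean of its lower and upper boundaries. Infinite outer
# boundaries are swapped for the data minimum and maximum first.
midValuesFor <- function(x, boundaries) {
  n <- length(boundaries)
  if (is.infinite(boundaries[1])) boundaries[1] <- min(x, na.rm = TRUE)
  if (is.infinite(boundaries[n])) boundaries[n] <- max(x, na.rm = TRUE)
  (boundaries[-n] + boundaries[-1]) / 2
}

x <- c(0.1, 0.3, 0.6, 0.9)
finiteMids <- midValuesFor(x, c(0, 0.5, 1))         # finite outer boundaries
infiniteMids <- midValuesFor(x, c(-Inf, 0.5, Inf))  # outer boundaries from the data
print(finiteMids)
print(infiniteMids)
```

With finite boundaries the mid values depend only on the boundaries; with -Inf and Inf they shift toward the observed range of the data.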

classStates

vector. The state labels to be assigned to the discretized data.

method

character. What splitting method should be used? This argument is ignored if a vector of values is passed to classBoundaries.

  • quantile splits data into quantiles (default).

  • equal splits data into equally sized intervals based on data minimum and maximum.
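The difference between the two methods can be sketched in base R. This is an illustration under assumptions rather than the package's implementation: breaksFor is a hypothetical helper, and the use of quantile() with its default settings is an assumption about how quantile boundaries might be derived.

```r
# Hypothetical helper (not part of bnspatial): compute class boundaries
# for n classes using either quantiles or equal-width intervals.
breaksFor <- function(x, n, method = c("quantile", "equal")) {
  method <- match.arg(method)
  if (method == "quantile") {
    # n classes need n + 1 boundaries at evenly spaced probabilities
    unname(quantile(x, probs = seq(0, 1, length.out = n + 1), na.rm = TRUE))
  } else {
    # n equally wide intervals between the data minimum and maximum
    seq(min(x, na.rm = TRUE), max(x, na.rm = TRUE), length.out = n + 1)
  }
}

x <- c(1, 2, 3, 4, 5, 6, 7, 8, 100)
qBreaks <- breaksFor(x, 3, "quantile")
eBreaks <- breaksFor(x, 3, "equal")

# Count how many values fall in each class under the two methods
qCounts <- table(cut(x, qBreaks, include.lowest = TRUE))
eCounts <- table(cut(x, eBreaks, include.lowest = TRUE))
print(qCounts)
print(eCounts)
```

For this skewed sample, the quantile classes each hold three values, while the equal-width classes place eight values in the first class and leave the middle one empty. That is the usual reason to prefer quantiles when the data contain outliers.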

formattedLst

list. A formatted list, as returned by linkNode or linkMultiple.

xy

matrix. A matrix of spatial coordinates; first column is x (longitude), second column is y (latitude) of locations (in rows).

inparallel

logical or integer. Should the function use parallel processing facilities? Default is FALSE: a single process will be launched. If TRUE, all cores/processors but one will be used. Alternatively, an integer can be provided to dictate the number of cores/processors to be used.
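The rule described for inparallel can be sketched as a small helper. This is a hedged illustration, not the package's code: resolveCores and its available argument are hypothetical, and capping an explicit count at the number of detected cores is an assumption not stated in the documentation.

```r
# Hypothetical helper (not part of bnspatial): translate an `inparallel`
# style argument into a number of worker processes.
resolveCores <- function(inparallel = FALSE,
                         available = parallel::detectCores()) {
  # detectCores() can return NA on some platforms; fall back to one core
  if (is.na(available)) available <- 1L
  available <- as.integer(available)
  if (is.logical(inparallel)) {
    # TRUE: all cores but one; FALSE: a single process
    if (isTRUE(inparallel)) max(1L, available - 1L) else 1L
  } else {
    # Explicit count, capped at the available cores (an assumption)
    max(1L, min(as.integer(inparallel), available))
  }
}

print(resolveCores(FALSE, available = 4))
print(resolveCores(TRUE, available = 4))
print(resolveCores(2, available = 4))
```

Passing available explicitly keeps the example reproducible across machines; in real use the default would query the current machine.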

Value

dataDiscretize returns a named list of 4 vectors:

  • $discreteData: the discretized data; labels are applied if the classStates argument is provided

  • $classBoundaries: the class boundaries, i.e. the values splitting the classes

  • $midValues: the mid point of each class (the mean of its lower and upper boundaries)

  • $classStates: the labels assigned to each class

bulkDiscretize returns a matrix: each column corresponds to a node associated with input spatial data, and each row holds the discretized values at the coordinates specified by the argument xy.

Details

dataDiscretize

Examples

# NOT RUN {
s <- runif(30)

# Split by user-defined values. Values outside the boundaries are set to NA:
dataDiscretize(s, classBoundaries = c(0.2, 0.5, 0.8))

# Split by quantiles (default):
dataDiscretize(s, classStates = c('a', 'b', 'c'))

# Split by equal intervals:
dataDiscretize(s, classStates = c('a', 'b', 'c'), method = "equal")

# When -Inf and Inf are provided as external boundaries, $midValues of outer classes
# are calculated on the minimum and maximum values:
dataDiscretize(s, classBoundaries=c(0, 0.5, 1), classStates=c("first", "second"))[c(2,3)]
dataDiscretize(s, classBoundaries=c(-Inf, 0.5, Inf), classStates=c("first", "second"))[c(2,3)]

## Discretize multiple spatial data by location
list2env(ConwyData, environment())

network <- LandUseChange
spatialData <- c(ConwyLU, ConwySlope, ConwyStatus)

# Link multiple spatial data to the network nodes and discretize
spDataLst <- linkMultiple(spatialData, network, LUclasses, verbose = FALSE)
coord <- aoi(ConwyLU, xy=TRUE)
head( bulkDiscretize(spDataLst, coord) )
# }