Matrix.utils (version 0.9.8)

aggregate.Matrix: Compute summary statistics of a Matrix

Description

Similar to aggregate. Splits the matrix x into groups as specified by groupings, which can be one or more variables. The aggregation function is applied to all columns of x, or as specified in form. Warning: groupings will be made dense if it is sparse, though x will not.

Usage

# S3 method for Matrix
aggregate(x, groupings = NULL, form = NULL, fun = "sum", ...)

Arguments

x

a Matrix or matrix-like object

groupings

an object coercible to a group of factors defining the groups

fun

character string specifying the name of the aggregation function to be applied to all columns of x. Currently "sum", "count", and "mean" are supported.

...

arguments to be passed to or from methods. Currently ignored.

Value

A sparse Matrix. The row names correspond to the values of the groupings, or to the interactions of the groupings joined by an underscore (_).

There is an attribute crosswalk that includes the groupings as a data frame. This is necessary because it is not possible to include character or data frame groupings in a sparse Matrix. If needed, one can cbind(attr(x,"crosswalk"),x) to combine the groupings and the aggregates.
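The row-name convention and the idea behind the crosswalk can be previewed with base R alone. The sketch below uses a small hypothetical groupings data frame (the column names and values are invented for illustration) and assumes the labels follow the same pattern as interaction(..., sep = "_"); it does not call aggregate.Matrix itself.

```r
# Hypothetical groupings: two factor columns, four observations
groupings <- data.frame(
  customer = factor(c("c1", "c2", "c1", "c2")),
  state    = factor(c("ny", "ny", "ca", "ny"))
)

# One label per observation: the grouping values joined by "_"
labels <- interaction(groupings, sep = "_", drop = TRUE)

# A crosswalk-like lookup: one row of original grouping values
# for each distinct label
lookup <- unique(data.frame(label = as.character(labels), groupings))
print(lookup)
```

A lookup of this kind keeps the original character or factor columns next to the label, which a numeric sparse Matrix cannot store.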

Details

aggregate.Matrix uses its own implementations of the aggregation functions, so fun should be passed as a character string naming one of the supported functions.
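As a rough illustration of how group-wise sums can be computed without densifying the data, the sketch below multiplies a sparse group-indicator matrix by a sparse data matrix using only the Matrix package. It is one possible approach under the stated toy data, not necessarily how aggregate.Matrix is implemented, and the objects g, m, ind and sums are hypothetical names.

```r
library(Matrix)

# Toy data (hypothetical values): 4 rows falling into 3 groups
g <- factor(c("a", "b", "a", "c"))
m <- Matrix(c(1, 0, 2, 0,
              0, 3, 0, 4), nrow = 4, sparse = TRUE)

# Sparse indicator matrix: one row per level of g, one column per row of m
ind <- fac2sparse(g)

# The product adds up the rows of m belonging to each group;
# both operands are sparse, and so is the result
sums <- ind %*% m
print(sums)
```

For a small dense matrix, the same totals can be checked against base R with rowsum(as.matrix(m), g).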

See Also

summarise

aggregate

Examples

# NOT RUN {
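# Load the packages that provide Matrix(), rsparsematrix() and aggregate.Matrix()
library(Matrix)
library(Matrix.utils)
skus<-Matrix(as.matrix(data.frame(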
   orderNum=sample(1000,10000,TRUE),
   sku=sample(1000,10000,TRUE),
   amount=runif(10000))),sparse=TRUE)
#Calculate sums for each sku
a<-aggregate.Matrix(skus[,'amount'],skus[,'sku',drop=FALSE],fun='sum')
#Calculate counts for each sku
b<-aggregate.Matrix(skus[,'amount'],skus[,'sku',drop=FALSE],fun='count')
#Calculate mean for each sku
c<-aggregate.Matrix(skus[,'amount'],skus[,'sku',drop=FALSE],fun='mean')

m<-rsparsematrix(1000000,100,.001)
labels<-as.factor(sample(1e4,1e6,TRUE))
b<-aggregate.Matrix(m,labels)

# }
# NOT RUN {
orders<-data.frame(orderNum=as.factor(sample(1e6, 1e7, TRUE)),
   sku=as.factor(sample(1e3, 1e7, TRUE)),
   customer=as.factor(sample(1e4,1e7,TRUE)),
   state = sample(letters, 1e7, TRUE), amount=runif(1e7))
system.time(d<-aggregate.Matrix(orders[,'amount',drop=FALSE],orders$orderNum))
system.time(e<-aggregate.Matrix(orders[,'amount',drop=FALSE],orders[,c('customer','state')]))
# }
