dwtools (version 0.8.3.9)

eav: Entity-Attribute-Value data evaluate

Description

Evaluate an expression on data stored in the EAV model as you would on a regular wide table.
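
The idea can be sketched with plain data.table calls: cast the EAV data to wide, evaluate the expression, and melt the result back. This only illustrates the concept under the assumption that the value column is numeric; it is not taken from the dwtools source, and the names wide and long are made up for the example.

```r
library(data.table)

# EAV table: one row per (product, attribute) pair
dt <- data.table(product   = c(1, 1, 2, 2, 3, 3),
                 attribute = rep(c("amount_in_pack", "price"), 3),
                 value     = c(24, 115, 5, 200, 8, 20.5))

# 1. cast to wide: one row per entity, one column per attribute
wide <- dcast(dt, product ~ attribute, value.var = "value", fun.aggregate = sum)

# 2. evaluate the expression as on any wide table
wide[, price_of_pack := price * amount_in_pack]

# 3. melt back to EAV, now including the derived attribute
long <- melt(wide, id.vars = "product",
             variable.name = "attribute", value.name = "value")
long
```

Calling eav(dt, quote(price_of_pack := price * amount_in_pack)) on the keyed table, as in the examples, lets you write only step 2.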

Usage

eav(x, j,
    id.vars = key(x)[-length(key(x))],
    variable.name = key(x)[length(key(x))],
    measure.vars = names(x)[!(names(x) %in% key(x))],
    fun.aggregate = sum,
    shift.on = character(),
    wide = FALSE)

Arguments

x
data.table holding data in the EAV model.
j
quoted expression to evaluate, written the same way as it would be for a wide-format data.table.
id.vars
character vector of column names that define an Entity.
variable.name
character, name of the column that defines the Attribute.
measure.vars
character, name of the column that defines the Value.
fun.aggregate
function applied to duplicated Entity and Attribute combinations; it is used when dcasting.
shift.on
character, name of a column that should be excluded from the grouping variables so that shift can be used over that column. Use it only when the j expression is going to use shift. See examples.
wide
logical, default FALSE. When FALSE, EAV data is returned after the j expression is evaluated; when TRUE, a wide-format table is returned.
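
The effect of fun.aggregate on duplicated Entity and Attribute pairs can be seen with data.table's dcast alone. This is a sketch of the casting step the argument description refers to, not of eav's internals; the table dup and its values are made up for illustration.

```r
library(data.table)

# customer 1 has two 'salary' rows, i.e. a duplicated Entity-Attribute pair
dup <- data.table(customer  = c(1, 1, 2),
                  attribute = c("salary", "salary", "salary"),
                  value     = c(100, 50, 80))

# sum (the default of eav's fun.aggregate) adds the duplicates together
wide_sum  <- dcast(dup, customer ~ attribute, value.var = "value",
                   fun.aggregate = sum)

# another aggregate, here mean, collapses them differently
wide_mean <- dcast(dup, customer ~ attribute, value.var = "value",
                   fun.aggregate = mean)

wide_sum
wide_mean
```

Which function is appropriate depends on what a repeated row means in your data, for example an additional payment (sum) or a repeated measurement (mean).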

Details

The easiest way to use eav is to setkey on your entity and attribute columns; then only x and j need to be passed to the function. See examples.
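
To see why setting the key is enough, the default argument expressions from the Usage section can be evaluated by hand on a keyed table. The sketch below uses only data.table and base R; the variable k is introduced here for brevity.

```r
library(data.table)

dt <- data.table(product   = c(1, 1, 2, 2, 3, 3),
                 attribute = rep(c("amount_in_pack", "price"), 3),
                 value     = c(24, 115, 5, 200, 8, 20.5))

# key on entity column(s) first, attribute column last
setkey(dt, product, attribute)
k <- key(dt)

# the defaults from the Usage section, evaluated manually
id.vars       <- k[-length(k)]                    # all key columns but the last
variable.name <- k[length(k)]                     # the last key column
measure.vars  <- names(dt)[!(names(dt) %in% k)]   # every non-key column

id.vars
variable.name
measure.vars
```

The order of the key columns therefore matters: the attribute column must come last in the key, and any remaining non-key column is treated as a value column.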

Examples

suppressPackageStartupMessages(library(dwtools))

# basic product EAV
dt <- data.table(product=c(1,1,2,2,3,3),
                 attribute=rep(c('amount_in_pack','price'),3),
                 value=c(24,115,5,200,8,20.5),
                 key=c('product','attribute'))
dt
eav(dt, quote(price_of_pack := price * amount_in_pack))

# sources of income EAV, variable number of sources per customer
dt <- data.table(customer=c(1,2,2,2,3,3),
                 attribute=c('salary','salary','benefits','gambling','fraud','salary'),
                 value=c(560,490,120,85,200,380),
                 key=c('customer','attribute'))
dt
eav(dt, quote(total_income := rowSums(.SD,na.rm=TRUE)))

# sales of products over time
# prepare EAV
dt <- dw.populate(scenario='fact')[, .(prod_code,time_code,amount,value)
                                   ][, melt(.SD, id.vars=1:2, variable.name='measure', value.name='value')]
setkey(dt,prod_code,time_code,measure)
dt
system.time(
  r <- eav(dt, quote(avg_price:=value/amount))
) # great timing even on big sets thanks to data.table!
r

# shift.on usage - calc price change over time
eav(r, quote(price_change := avg_price - shift(avg_price, 1L, NA, "lag")[[1L]]), shift.on='time_code')

# leave in wide format
eav(r, quote(price_change := avg_price - shift(avg_price, 1L, NA, "lag")[[1L]]), shift.on='time_code', wide=TRUE)
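
The shift.on examples above can be read as a grouped lag on the wide table: the shift.on column is left out of the grouping, so shift runs over it within each remaining entity. The sketch below shows that reading with plain data.table. It is an interpretation of the argument description, not the dwtools implementation; the table wide_prices and its values are hypothetical, and it assumes a data.table version where shift returns a vector for a single vector input.

```r
library(data.table)

# hypothetical wide table: one row per product and time period
wide_prices <- data.table(prod_code = c(1, 1, 1, 2, 2, 2),
                          time_code = rep(1:3, 2),
                          avg_price = c(10, 12, 11, 5, 5, 7))
setkey(wide_prices, prod_code, time_code)  # rows ordered by time within product

# group by the entity columns except time_code, then lag over time
wide_prices[, price_change := avg_price - shift(avg_price, 1L, type = "lag"),
            by = prod_code]
wide_prices
```

The first period of each product has no previous price, so its price_change is NA.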
