
gainML (version 0.1.0)

analyze.p1: Apply Period 1 Analysis

Description

Conducts the period 1 analysis: selects the set of predictor variables that minimizes a k-fold cross-validation (CV) error measure, and builds a machine learning model that predicts the power output of the REF and CTR-b turbines from period 1 data.
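The idea of choosing covariates by k-fold CV error can be sketched as follows. This is only an illustration under stated assumptions: the simulated data, the helper cv.rmse, the greedy forward search, and the linear model are all hypothetical stand-ins chosen for brevity, since this page does not state which search strategy or learner analyze.p1 uses internally.

```r
# Hypothetical data: power depends on wind.spd and air.dens; noise.var is irrelevant.
set.seed(1)
n <- 200
dat <- data.frame(wind.spd = runif(n, 3, 15),
                  air.dens = runif(n, 1.1, 1.3),
                  noise.var = rnorm(n))
dat$power <- 50 * dat$wind.spd + 300 * dat$air.dens + rnorm(n, sd = 20)

# Lists of k training and k test datasets, mirroring the train/test arguments.
k <- 2
fold <- sample(rep(seq_len(k), length.out = n))
train <- lapply(seq_len(k), function(i) dat[fold != i, ])
test  <- lapply(seq_len(k), function(i) dat[fold == i, ])

# Hypothetical helper: average k-fold RMSE of a linear model (a stand-in learner)
# fitted on the given covariates.
cv.rmse <- function(vars) {
  f <- reformulate(vars, response = "power")
  mean(sapply(seq_len(k), function(i) {
    fit <- lm(f, data = train[[i]])
    sqrt(mean((test[[i]]$power - predict(fit, newdata = test[[i]]))^2))
  }))
}

# Greedy forward selection (an assumed search strategy): repeatedly add the
# covariate that lowers the CV RMSE the most; stop when nothing improves it.
candidates <- c("wind.spd", "air.dens", "noise.var")
selected <- character(0)
best <- Inf
repeat {
  remaining <- setdiff(candidates, selected)
  if (length(remaining) == 0) break
  scores <- sapply(remaining, function(v) cv.rmse(c(selected, v)))
  if (min(scores) >= best) break
  best <- min(scores)
  selected <- c(selected, names(which.min(scores)))
}
selected  # character vector of chosen covariates, analogous to opt.cov
```

The result plays the same role as the opt.cov component described under Value: a character vector naming the covariates retained.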

Usage

analyze.p1(train, test, ratedPW)

Arguments

train

A list containing k datasets that will be used to train the machine learning model.

test

A list containing k datasets that will be used to test the machine learning model and calculate CV error measures.

ratedPW

The rated power, in kW, common to the selected turbines (REF and CTR-b).

Value

The function returns a list of period 1 analysis results with the following components.

opt.cov

A character vector presenting the names of predictor variables chosen for the optimal set.

pred.REF

A list of \(k\) datasets, one per fold, each holding that fold's period 1 prediction for the REF turbine.

pred.CTR

A list of \(k\) datasets, one per fold, each holding that fold's period 1 prediction for the CTR-b turbine.

err.REF

A data frame of \(k\)-fold CV error measures for the REF turbine model, with one row per fold (\(k\) RMSE values and \(k\) BIAS values). The first column holds the RMSE values and the second column holds the BIAS values.

err.CTR

A data frame of \(k\)-fold CV error measures (RMSE and BIAS values) for the CTR-b turbine model. Structured like err.REF.

biasCurve.REF

A \(k\) by \(m\) matrix describing the binned BIAS curve for the REF turbine model, where \(m\) is the number of power bins. Technically speaking, the values are residuals, which are the negative of the BIAS.

biasCurve.CTR

A \(k\) by \(m\) matrix describing the binned BIAS curve for the CTR-b turbine model.
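The shapes of the error components above can be illustrated with a small, self-contained sketch. Everything in it is hypothetical: the simulated observed and predicted power, the column names obs and pred, the 100 kW bin width, and the choice to bin by observed power are assumptions for illustration, not the package's actual internals. BIAS is taken as the mean of predicted minus observed, consistent with the note that residuals are the negative of the BIAS.

```r
# Hypothetical observed and predicted power (kW) for k = 2 folds.
set.seed(42)
k <- 2
ratedPW <- 1000
folds <- lapply(seq_len(k), function(i) {
  obs <- runif(500, 0, ratedPW)
  data.frame(obs = obs, pred = obs + rnorm(500, mean = 5, sd = 30))
})

# One row per fold: RMSE in the first column, BIAS in the second.
# BIAS = mean(predicted - observed), i.e. the negative of the mean residual.
err <- do.call(rbind, lapply(folds, function(d) {
  data.frame(RMSE = sqrt(mean((d$pred - d$obs)^2)),
             BIAS = mean(d$pred - d$obs))
}))

# Binned residual curve: mean residual (observed - predicted) per power bin.
# The 100 kW bin width and binning by observed power are assumptions.
breaks <- seq(0, ratedPW, by = 100)
m <- length(breaks) - 1
bias.curve <- t(sapply(folds, function(d) {
  bin <- cut(d$obs, breaks = breaks, include.lowest = TRUE)
  tapply(d$obs - d$pred, bin, mean)
}))

dim(err)         # k rows, 2 columns
dim(bias.curve)  # k rows, m columns
```

Averaging over folds, for example with colMeans(err), gives a single summary per error measure.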

References

H. Hwangbo, Y. Ding, and D. Cabezon, 'Machine Learning Based Analysis and Quantification of Potential Power Gain from Passive Device Installation,' arXiv:1906.05776 [stat.AP], Jun. 2019. https://arxiv.org/abs/1906.05776.

Examples

# NOT RUN {
library(gainML)

# Build the three turbine data frames from the wtg dataset
df.ref <- with(wtg, data.frame(time = time, turb.id = 1, wind.dir = D,
 power = y, air.dens = rho))
df.ctrb <- with(wtg, data.frame(time = time, turb.id = 2, wind.spd = V,
 power = y))
df.ctrn <- df.ctrb
df.ctrn$turb.id <- 3

# Arrange the data into period 1 and period 2 with 2-fold train/test splits
data <- arrange.data(df.ref, df.ctrb, df.ctrn, p1.beg = '2014-10-24',
 p1.end = '2014-10-25', p2.beg = '2014-10-25', p2.end = '2014-10-26',
 k.fold = 2)

# Run the period 1 analysis with a rated power of 1000 kW
p1.res <- analyze.p1(data$train, data$test, ratedPW = 1000)
p1.res$opt.cov # The optimal set of predictor variables

# }
