The main function of Generalized Network-based Dimensionality Reduction and Regression (GNDR) for supervised learning.
ndrlm(Y,X,latents="in",dircon=FALSE,optimize=TRUE,
target="adj.r.square",rel_weight=FALSE,
cor_method=1,
cor_type=1,min_comm=2,Gamma=1,
null_model_type=4,mod_mode=1,use_rotation=FALSE,
rotation="oblimin",pareto=FALSE,fit_weights=NULL,
lower.bounds.x = c(rep(-100,ncol(X))),
upper.bounds.x = c(rep(100,ncol(X))),
lower.bounds.latentx = c(0,0,0,0),
upper.bounds.latentx = c(0.6,0.6,0.6,0.3),
lower.bounds.y = c(rep(-100,ncol(Y))),
upper.bounds.y = c(rep(100,ncol(Y))),
lower.bounds.latenty = c(0,0,0,0),
upper.bounds.latenty = c(0.6,0.6,0.6,0.3),
popsize = 20, generations = 30, cprob = 0.7, cdist = 5,
mprob = 0.2, mdist=10, seed=NULL)
Objective function for fitting
Target performance measure. The possible target measures are "adj.r.square" = adjusted R square (default), "r.sqauare" = R square, "MAE" = mean absolute error, "MAPE" = mean absolute percentage error, "MASE" = mean absolute scaled error, "MSE" = mean square error, "RMSE" = root mean square error
optimized hyperparameters
In the case of multiple objectives, TRUE provides a Pareto-optimal solution, while FALSE (default) provides the weighted mean of the objective functions (see fit_weights)
A numeric data frame of output variables
A numeric data frame of input variables
Latent model: "in", "out", "both", "none"
GNDA object, which is the result of model reduction and features selection in the case of employing latent-independent variables
Weights of input variables (used in ndr)
Optimized minimal eigenvector centrality value (used in ndr)
Optimized minimal communality value of indicators (used in ndr)
Optimized minimal common communalities (used in ndr)
Optimized minimal square correlation between indicators (used in ndr)
GNDA object, which is the result of model reduction and features selection in the case of employing latent-dependent variables
Weights of input variables (used in ndr)
Optimized minimal eigenvector centrality value (used in ndr)
Optimized minimal communality value of indicators (used in ndr)
Optimized minimal common communalities (used in ndr)
Optimized minimal square correlation between indicators (used in ndr)
List of linear regression models
Whether fittings are optimized or not
Output structure of the NSGA-II optimization (list), if the optimization value is TRUE (see mco::nsga2)
Logical variable. If direct connection is allowed (dircon=TRUE), not only the latent variables but also the excluded input variables are analyzed in the linear models as extra input variables.
Logical variable. If direct connection is allowed (dircon=TRUE), not only the latent variables but also the excluded output variables are analyzed in the linear models as extra input variables.
The list of input variables which are directly connected to output variables.
The list of output variables which are directly connected to output variables.
applied seed value (default=NULL, no seed)
Function (regression) name: NDRLM
Callback function
A numeric data frame of output variables
A numeric data frame of input variables
How latent variables are employed: "in" employs latent-independent variables (default); "out" employs latent-dependent variables; "both" employs both latent-dependent and latent-independent variables; "none" employs no latent variables (= multiple regression)
Whether to enable or disable direct connection between input and output variables (default=FALSE)
Optimization of fittings (default=TRUE)
Target performance measure. The possible target measures are "adj.r.square" = adjusted R square (default), "r.sqauare" = R square, "MAE" = mean absolute error, "MAPE" = mean absolute percentage error, "MASE" = mean absolute scaled error, "MSE" = mean square error, "RMSE" = root mean square error
Use relative weights. In this case, all weights should be non-negative. (default=FALSE)
Correlation method (optional). '1' Pearson's correlation (default), '2' Spearman's correlation, '3' Kendall's correlation, '4' Distance correlation
Correlation type (optional). '1' Bivariate correlation (default), '2' partial correlation, '3' semi-partial correlation
Minimal number of indicators per community (default: 2).
Gamma parameter in the multiresolution null model (default: 1).
'1' Differential Newman-Girvan's null model, '2' The null model is the mean of square correlations between indicators, '3' The null model is the specified minimal square correlation, '4' Newman-Girvan's model (default)
Community-based modularity calculation mode: '1' Louvain modularity (default), '2' Fast-greedy modularity, '3' Leading Eigen modularity, '4' Infomap modularity, '5' Walktrap modularity, '6' Leiden modularity
FALSE no rotation (default), TRUE the rotation is used.
"none", "varimax", "quartimax", "promax", "oblimin", "simplimax", and "cluster" are possible rotations/transformations of the solution. "oblimin" is the default, if use_rotation is TRUE.
In the case of multiple objectives, TRUE provides a Pareto-optimal solution, while FALSE (default) provides the weighted mean of the objective functions (see fit_weights)
weights of fitting the output variables (weights of means of objectives)
Lower bounds of weights of independent variables in GNDA
Upper bounds of weights of independent variables in GNDA
Lower bounds of hyper-parameters of GNDA for independent variables (values must be positive)
Upper bounds of hyper-parameters of GNDA for independent variables (values must be lower than one)
Lower bounds of weights of dependent variables in GNDA
Upper bounds of weights of dependent variables in GNDA
Lower bounds of hyper-parameters of GNDA for dependent variables (values must be positive)
Upper bounds of hyper-parameters of GNDA for dependent variables (values must be lower than one)
size of population of NSGA-II for fitting betas (default=20)
number of generations to breed of NSGA-II for fitting betas (default=30)
crossover probability of NSGA-II for fitting betas (default=0.7)
crossover distribution index of NSGA-II for fitting betas (default=5)
mutation probability of NSGA-II for fitting betas (default=0.2)
mutation distribution index of NSGA-II for fitting betas (default=10)
default seed value (default=NULL, no seed)
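As a rough illustration of the measures listed for target, the following base-R sketch computes them for a plain multiple regression on the freeny example data. It is not taken from the package: the variable names are hypothetical, and the formulas ndrlm uses internally may differ, in particular for MASE, which is assumed here to be scaled by the in-sample error of a naive one-step forecast.

```r
# Sketch only: base-R versions of the measures named for `target`.
# Assumption: ndrlm may compute these differently (notably MASE).
y <- as.numeric(freeny.y)   # response from the built-in freeny data
X <- freeny.x               # numeric matrix of predictors
fit <- lm(y ~ X)            # plain multiple regression, no latent variables

yhat <- as.numeric(fitted(fit))
err  <- y - yhat
s    <- summary(fit)

measures <- c(
  adj_r2 = s$adj.r.squared,                       # adjusted R square
  r2     = s$r.squared,                           # R square
  MAE    = mean(abs(err)),                        # mean absolute error
  MAPE   = mean(abs(err / y)) * 100,              # mean absolute percentage error
  MASE   = mean(abs(err)) / mean(abs(diff(y))),   # assumed scaling: naive forecast
  MSE    = mean(err^2),                           # mean square error
  RMSE   = sqrt(mean(err^2))                      # root mean square error
)
print(round(measures, 4))
```

The R-square measures are to be maximized, whereas the error measures are to be minimized.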
Zsolt T. Kosztyan*, Marcell T. Kurbucz, Attila I. Katona
e-mail*: kosztyan.zsolt@gtk.uni-pannon.hu
NDRLM performs variable fitting with feature selection. It tunes the GNDA method and uses the NSGA-II algorithm to fit the parameters.
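To show how the NSGA-II settings documented above (popsize, generations, cprob, cdist, mprob, mdist) map onto mco::nsga2, here is a minimal sketch on a toy two-objective problem. The function toy_objectives, the equal weights, and the two-dimensional search space are assumptions made for illustration; they do not reproduce the objectives that ndrlm builds internally.

```r
# Sketch only: a toy two-objective problem solved with mco::nsga2(),
# using the default NSGA-II settings documented for ndrlm.
library(mco)

# Hypothetical objectives: squared distances from two different points
toy_objectives <- function(w) {
  c(sum((w - 1)^2),   # objective 1: distance from (1, 1)
    sum((w + 1)^2))   # objective 2: distance from (-1, -1)
}

set.seed(2)  # seed R's random number generator before the run
res <- nsga2(toy_objectives, idim = 2, odim = 2,
             lower.bounds = rep(-100, 2), upper.bounds = rep(100, 2),
             popsize = 20, generations = 30,
             cprob = 0.7, cdist = 5, mprob = 0.2, mdist = 10)

# Pareto-optimal members of the final population
front <- res$value[res$pareto.optimal, , drop = FALSE]

# Alternative: collapse both objectives into one weighted mean
# (equal weights assumed) and pick the single best candidate
weights <- c(0.5, 0.5)
scalar  <- as.numeric(res$value %*% weights)
best    <- res$par[which.min(scalar), ]
```

The last two steps mirror the two behaviours described for pareto: either keep the whole Pareto front, or reduce the objectives to a single weighted mean.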
Kosztyan, Z. T., Kurbucz, M. T., & Katona, A. I. (2022). Network-based dimensionality reduction of high-dimensional, low-sample-size datasets. Knowledge-Based Systems, 109180. doi:10.1016/j.knosys.2022.109180
# Using NDRLM without fitting optimization
X<-freeny.x
Y<-freeny.y
NDRLM<-ndrlm(Y,X,optimize=FALSE)
summary(NDRLM)
plot(NDRLM)
if (FALSE) {
# Using NDRLM with optimized fitting
NDRLM<-ndrlm(Y,X)
summary(NDRLM)
# Using Leiden's modularity for grouping variables
X<-freeny.x
Y<-freeny.y
NDRLM<-ndrlm(Y,X,mod_mode=6)
plot(NDRLM)
# Using relative weights
NDRLM<-ndrlm(Y,X,mod_mode=6,rel_weight=TRUE)
plot(NDRLM)
# Using Spearman's correlation
NDRLM<-ndrlm(Y,X,cor_method=2)
summary(NDRLM)
# Using a larger population and more generations
NDRLM<-ndrlm(Y,X,popsize=52,generations=40)
summary(NDRLM)
# No latent variables
NDRLM<-ndrlm(Y,X,latents="none")
plot(NDRLM)
# In-out model
library(lavaan)
df<-PoliticalDemocracy # Data of Political Democracy
dem<-PoliticalDemocracy[,c(1:8)]
ind60<-PoliticalDemocracy[,-c(1:8)]
NBSEM<-ndrlm(dem,ind60,latents = "both",seed = 2)
plot(NBSEM)
}