Function for creating synthetic cases in order to balance the data for training with TEClassifierRegular or TEClassifierProtoNet. This is an auxiliary function for use with get_synthetic_cases_from_matrix to allow parallel computations.
create_synthetic_units_from_matrix(
matrix_form,
target,
required_cases,
k,
method,
cat,
k_s,
max_k
)
Returns a list which contains the text embeddings of the new synthetic cases as a named data.frame and their labels as a named factor.
matrix_form
Named matrix containing the text embeddings in matrix form. In most cases this object is taken from EmbeddedText$embeddings.
target
Named factor containing the labels/categories of the corresponding cases.
required_cases
int Number of cases necessary to fill the gap between the frequency of the class under investigation and the major class.
k
int The number of nearest neighbors used during the sampling process.
method
vector containing strings of the requested methods for generating new cases. Currently "smote", "dbsmote", and "adas" from the package smotefamily are available.
cat
string The category for which new cases should be created.
k_s
int Number of different values of k used in the complete generation process.
max_k
int The maximum number of nearest neighbors used during the sampling process.
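The value passed as required_cases can be derived from the label frequencies. The following is a minimal base R sketch, assuming a small hypothetical label factor; the object names freq and n_major are illustrative and not part of the package.

```r
# Hypothetical toy labels; the names identify the individual cases.
target <- factor(c("a", "a", "a", "a", "a", "b", "b", "c"))
names(target) <- paste0("case_", seq_along(target))

# Frequency of every category and the size of the major class.
freq <- table(target)
n_major <- max(freq)

# Gap between each category and the major class. The entry for a
# category is the number of synthetic cases needed to balance it.
required_cases <- n_major - freq
print(required_cases)
```

In this toy data, category "b" would need three additional cases and category "c" four, while the major class "a" needs none.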
Other data_management_utils: get_n_chunks(), get_synthetic_cases_from_matrix()
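To illustrate how the arguments k, cat, and required_cases interact, the following base R sketch generates synthetic cases by SMOTE-style interpolation between a case and one of its k nearest neighbors of the same category. It is an illustration of the general idea only, not the package's implementation (which delegates to smotefamily). The helper smote_sketch, the toy data, and the list element names embeddings and labels are all assumptions made for this example.

```r
set.seed(1)

# Hypothetical embeddings: 6 cases with 3 dimensions each.
matrix_form <- matrix(
  rnorm(18),
  nrow = 6,
  dimnames = list(paste0("case_", 1:6), paste0("dim_", 1:3))
)
target <- factor(c("a", "a", "a", "a", "b", "b"))
names(target) <- rownames(matrix_form)

# Hypothetical helper, not part of the package.
smote_sketch <- function(matrix_form, target, cat, required_cases, k) {
  minority <- matrix_form[target == cat, , drop = FALSE]
  n <- nrow(minority)
  stopifnot(n >= 2, required_cases >= 1)

  # A case cannot have more neighbors than there are other minority cases.
  k <- min(k, n - 1)
  distances <- as.matrix(dist(minority))

  new_names <- paste0("synthetic_", cat, "_", seq_len(required_cases))
  synthetic <- matrix(
    NA_real_,
    nrow = required_cases,
    ncol = ncol(minority),
    dimnames = list(new_names, colnames(minority))
  )

  for (i in seq_len(required_cases)) {
    # Pick a random minority case and one of its k nearest neighbors.
    base <- sample.int(n, 1)
    neighbors <- order(distances[base, ])[-1][seq_len(k)]
    partner <- neighbors[sample.int(length(neighbors), 1)]

    # Interpolate at a random position between the two cases.
    gap <- runif(1)
    synthetic[i, ] <- minority[base, ] +
      gap * (minority[partner, ] - minority[base, ])
  }

  labels <- factor(rep(cat, required_cases), levels = levels(target))
  names(labels) <- new_names

  list(embeddings = as.data.frame(synthetic), labels = labels)
}

result <- smote_sketch(matrix_form, target, cat = "b", required_cases = 2, k = 1)
str(result)
```

Because every synthetic case lies on the line segment between two existing cases of the same category, larger values of k allow more distant partners and therefore spread the new cases over a wider region.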