The MRNET approach starts by selecting the variable \(X_i\)
having the highest mutual information with the target \(Y\).
Then, it repeatedly enlarges the set of selected variables \(S\) by
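taking the \(X_k\) that maximizes
$$I(X_k;Y) - \frac{1}{|S|}\sum_{X_i \in S} I(X_k;X_i),$$
that is, its mutual information with \(Y\) minus its mean mutual
information with the \(X_i\) already in \(S\).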
The procedure stops when this maximal score becomes negative.
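
The selection loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the package's implementation: the function name `mrmr_select` and its arguments are hypothetical, the mutual information values are assumed to be precomputed, and ties are broken in favour of the lowest index.

```python
def mrmr_select(mi_target, mi_pair):
    """Greedy selection on precomputed mutual information values.

    mi_target[k]  -- I(X_k; Y), relevance of variable k to the target
    mi_pair[k][i] -- I(X_k; X_i), assumed symmetric
    Returns the selected indices in the order they were picked.
    (Hypothetical helper, written for illustration only.)
    """
    n = len(mi_target)
    if n == 0:
        return []

    # Start with the variable that is most informative about the target.
    first = max(range(n), key=lambda k: mi_target[k])
    selected = [first]
    remaining = set(range(n)) - {first}

    while remaining:
        def score(k):
            # Relevance minus mean redundancy with the selected variables.
            redundancy = sum(mi_pair[k][i] for i in selected) / len(selected)
            return mi_target[k] - redundancy

        # sorted() makes tie-breaking deterministic (lowest index wins).
        best = max(sorted(remaining), key=score)
        if score(best) < 0:
            break  # stop as soon as the best score is negative
        selected.append(best)
        remaining.remove(best)

    return selected


# Toy values (made up): variable 1 is relevant but redundant with 0 and 2.
mi_target = [0.9, 0.8, 0.3]
mi_pair = [
    [0.0, 0.85, 0.1],
    [0.85, 0.0, 0.9],
    [0.1, 0.9, 0.0],
]
result = mrmr_select(mi_target, mi_pair)
print(result)
```

In this toy input, variable 1 has high mutual information with the target but is never selected, because its mean redundancy with the variables already chosen exceeds its relevance.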
By default, the function uses all the available cores. To limit
the number of threads to N, export the environment variable
OMP_NUM_THREADS=N before calling the function.