get1stOfRepeatedByCol: Get first of repeated by column
Description
get1stOfRepeatedByCol sorts the matrix 'mat' and extracts only the 1st occurrence of each value in column 'sortBy'.
It then returns the non-redundant matrix (ie non-redundant with respect to column 'sortBy'; if 'markIfAmbig' specifies existing columns, ambiguous entries are marked there).
Note: problems may occur when 'sortSupl' or 'sortBy' is not present in 'mat' (or not intended for use).
Depending on 'asList', the result is either the non-redundant matrix or a list with the non-redundant lines ('unique') and the removed lines ('repeats').
Arguments
mat
(matrix or data.frame) data to be sorted and made non-redundant
sortBy
(character) name of the column whose elements should be made unique; may be a numeric or character column
sortSupl
(character) additional column name used to always select a specific 1st entry among repeats, default="ty"
asFirstLast
(character, length=2) to force specific strings from column 'sortSupl' as first and last when selecting the 1st of repeated terms, default=c("full","inter")
markIfAmbig
(character, length=2) column names: the 1st column will be set to 'TRUE' if ambiguous/repeated, the 2nd column will get a (heading) prefix, default=c("ambig","seqNa")
asList
(logical) if TRUE, return a list with the non-redundant lines ('unique') and the removed lines ('repeats')
abmiPref
(character) prefix to note ambiguous entries/terms, default="_"
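The sorting and selection described above can be approximated in base R. The following is only a minimal sketch of the idea, not the implementation of get1stOfRepeatedByCol: the toy data and the helper names 'tyRank', 'srt', 'isAmbig' and 'keep' are assumptions made for illustration, and the handling of the prefix 'abmiPref' is left out.

```r
# Toy matrix (hypothetical data); column names follow the defaults of this page
mat <- cbind(no = as.character(1:6),
             seq = c("B", "A", "B", "C", "A", "B"),
             ty = c("inter", "full", "full", "Nter", "inter", "Nter"))

# Rank the supplementary column: "full" first, "inter" last, all others in between
tyRank <- ifelse(mat[, "ty"] == "full", 1, ifelse(mat[, "ty"] == "inter", 3, 2))

# Sort by the key column, then by the rank of the supplementary column
srt <- mat[order(mat[, "seq"], tyRank), , drop = FALSE]

# Flag values occurring more than once, then keep the 1st line of each value
isAmbig <- srt[, "seq"] %in% srt[duplicated(srt[, "seq"]), "seq"]
keep <- !duplicated(srt[, "seq"])

# Mimic the list layout described for 'asList' (names 'unique' and 'repeats')
res <- list(unique = cbind(srt[keep, , drop = FALSE], ambig = isAmbig[keep]),
            repeats = srt[!keep, , drop = FALSE])
res$unique
```

Sorting before calling duplicated() is what makes the choice of the retained line deterministic: within each group of equal 'seq' values the line with the preferred 'ty' entry comes first and is the one kept.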
See Also
firstOfRepeated for (more basic) treatment of a simple vector; nonAmbiguousNum for numeric use (much faster)
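Examples
## toy data: 20 lines with repeated values in column 'seq'
aa <- cbind(no=as.character(1:20), seq=sample(LETTERS[1:15], 20, replace=TRUE),
  ty=sample(c("full","Nter","inter"), 20, replace=TRUE), ambig=rep(NA,20), seqNa=1:20)
## keep only the 1st occurrence of each value in column 'seq'
get1stOfRepeatedByCol(aa)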