taxotools (version 0.0.139)

get_accepted_names: get_accepted_names

Description

Match a name list against a master list and fetch the accepted names using the linkages provided within the data.

Usage

get_accepted_names(
  namelist,
  master,
  gen_syn = NA,
  namelookup = NA,
  mastersource = NA,
  match_higher = FALSE,
  fuzzymatch = TRUE,
  fuzzydist = 2,
  canonical = NA,
  genus = NA,
  species = NA,
  subspecies = NA,
  prefix = "",
  verbose = TRUE
)

Value

A data frame containing all the original columns plus the following additional columns:

accepted_name -

Accepted name present in the master. NA if not resolved.

method -

Method used to resolve the name. See Details for an explanation of each method.

Arguments

namelist

Data frame with the list of names to be resolved. It must contain either a column canonical holding the binomial or trinomial name (without markers such as spp. or var.), or separate columns for genus, species and subspecies (any sub-specific unit). In the second case, the names of those columns are passed through the genus, species and subspecies parameters.

master

Data frame with the required columns id, canonical and accid. Other columns, such as order and family, are optional. Column id typically holds a running id for each record. Column accid contains 0 if the name is the currently accepted name, or the id of the accepted name if the name is a synonym. Column canonical contains the binomial or trinomial without markers such as spp. or var.

gen_syn

Data frame with columns Original_Genus and Valid_Genus, where Original_Genus is the synonym and Valid_Genus is the genus present in the master. Default: NA when gen_syn is not used.

namelookup

Lookup data frame for names that need manual lookup. The required columns are binomial and validname, where binomial is the new name and validname is the name present in the master. Default: NA when namelookup is not used.

mastersource

Vector of sources to be given priority during assignment. Default: NA

match_higher

Whether to match genus and family names present in the canonical field. Default: FALSE

fuzzymatch

Whether to attempt fuzzy matching. Default: TRUE

fuzzydist

Fuzzy distance to use while matching. Default: 2

canonical

Column containing the names to be resolved to accepted names. Default: NA when columns for genus and species are specified.

genus

Column containing the genus names to be resolved to accepted names, typically accompanied by species and subspecies columns. Default: NA when the canonical parameter is supplied.

species

Column containing the species names to be resolved to accepted names, accompanied by genus. Default: NA

subspecies

Column containing the subspecies names to be resolved to accepted names, accompanied by genus and species. Default: NA

prefix

Prefix to be added to all the returned fields. Default: ""

verbose

Display process messages. Default: TRUE
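
The id/accid linkage described for the master argument can be illustrated with a minimal base R sketch. The helper lookup_accepted() is hypothetical and is not part of taxotools; it covers only the direct-match case and assumes every non-zero accid points at an existing id.

```r
# Toy master table following the id / canonical / accid convention:
# accid == 0 marks an accepted name, otherwise accid is the id of the accepted name.
master <- data.frame(
  id = c(1, 2, 3),
  canonical = c("Hypochlorosis ancharia",
                "Hypochlorosis tenebrosa",
                "Myrina ancharia"),
  accid = c(0, 1, 1),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not a taxotools function): follow accid to the accepted name.
lookup_accepted <- function(name, master) {
  hit <- match(name, master$canonical)
  if (is.na(hit)) return(NA_character_)            # name not present in master
  if (master$accid[hit] == 0) return(master$canonical[hit])
  master$canonical[match(master$accid[hit], master$id)]
}

accepted <- vapply(
  c("Myrina ancharia", "Hypochlorosis ancharia", "Abrothrix longipilis"),
  lookup_accepted, character(1), master = master, USE.NAMES = FALSE
)
```

Here the synonym and the accepted name resolve to the same accepted record, while the name missing from the master stays NA.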

Details

Name resolution methods:

direct -

direct match with an accepted name or a synonym

direct2 -

direct match with an accepted name or a synonym from a source not listed in mastersource

fuzzy -

resolved using fuzzy matching

gensyn -

genus substitution with known genus level synonyms

lookup -

resolved through manual lookup in earlier processing

sppdrop -

subspecies was dropped

sub2sp -

subspecies elevated to species

genus -

genus was matched

family -

family was matched

NA -

could not be resolved
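
As a rough illustration of the fuzzy method, the sketch below uses base R's adist() (generalized Levenshtein distance) with a cut-off of 2. Whether get_accepted_names() measures fuzzydist in exactly this way is an assumption, and fuzzy_candidate() is a hypothetical helper, not a taxotools function.

```r
# Hypothetical helper: return the closest candidate within max_dist edits, else NA.
fuzzy_candidate <- function(name, candidates, max_dist = 2) {
  d <- utils::adist(name, candidates)[1, ]   # edit distance to every candidate
  best <- which.min(d)
  if (d[best] <= max_dist) candidates[best] else NA_character_
}

candidates <- c("Hypochlorosis ancharia",
                "Pseudonotis humboldti",
                "Hypochlorosis lorquinii")

m1 <- fuzzy_candidate("Hypochlorosis ancharii", candidates)  # one substitution away
m2 <- fuzzy_candidate("Pseudonotis humboldtii", candidates)  # one extra letter
m3 <- fuzzy_candidate("Abrothrix longipilis", candidates)    # nothing within 2 edits
```

A small cut-off such as 2 tends to catch spelling variants like a doubled final "i" while leaving unrelated names unresolved.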

Note: Make sure all the data frames have the same character encoding to prevent errors.
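
One way to follow this note is to convert every character column to UTF-8 before calling the function. The sketch below uses only base R; to_utf8() is a hypothetical helper, and choosing UTF-8 as the common encoding is an assumption rather than a requirement stated by the package.

```r
# Hypothetical helper: re-encode all character columns of a data frame as UTF-8.
to_utf8 <- function(df) {
  is_chr <- vapply(df, is.character, logical(1))
  df[is_chr] <- lapply(df[is_chr], enc2utf8)
  df
}

# A string declared as latin1, as in the base R Encoding() documentation.
x <- "fa\xE7ile"
Encoding(x) <- "latin1"

df <- data.frame(name = x, stringsAsFactors = FALSE)
out <- to_utf8(df)
enc_after <- Encoding(out$name)
```

The same helper could be applied to namelist, master, gen_syn and namelookup so that all inputs share one encoding.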

See Also

Other Name functions: build_gen_syn(), cast_canonical(), cast_scientificname(), check_scientific(), expand_name(), guess_taxo_rank(), list_higher_taxo(), melt_canonical(), melt_scientificname(), resolve_names(), taxo_fuzzy_match()

Examples

# \donttest{
library(taxotools)

master <- data.frame("id" = c(1,2,3,4,5,6,7),
                    "canonical" = c("Hypochlorosis ancharia",
                                    "Hypochlorosis tenebrosa",
                                    "Pseudonotis humboldti",
                                    "Myrina ancharia",
                                    "Hypochlorosis ancharia tenebrosa",
                                    "Hypochlorosis ancharia obiana",
                                    "Hypochlorosis lorquinii"),
                     "family" = c("Lycaenidae", "Lycaenidae", "Lycaenidae",
                                  "Lycaenidae", "Lycaenidae", "Lycaenidae",
                                  "Lycaenidae"),
                    "accid" = c(0,1,1,1,0,0,0),
                    "source" = c("itis","itis","wiki","wiki","itis",
                                 "itis","itis"),
                    stringsAsFactors = FALSE)

mylist <- data.frame("id"= c(11,12,13,14,15,16,17,18,19),
                    "scname" = c("Hypochlorosis ancharia",
                                 "Hypochlorosis ancharii",
                                 "Hypochlorosis tenebrosa",
                                 "Pseudonotis humboldtii",
                                 "Abrothrix longipilis",
                                 "Myrinana anchariana",
                                 "Hypochlorosis ancharia ancharia",
                                 "Myrina lorquinii",
                                 "Sithon lorquinii"),
                    stringsAsFactors = FALSE)

# Resolve names against the master list only
res <- get_accepted_names(namelist = mylist,
                         master=master,
                         canonical = "scname")

gen_syn_list <- data.frame("Original_Genus"=c("Pseudonotis",
                                             "Myrina"),
                          "Valid_Genus"=c("Hypochlorosis",
                                          "Hypochlorosis"),
                          stringsAsFactors = FALSE)

# Also use known genus-level synonyms
res <- get_accepted_names(namelist = mylist,
                         master=master,
                         gen_syn = gen_syn_list,
                         canonical = "scname")

lookup_list <- data.frame("binomial"=c("Sithon lorquinii",
                                      "Hypochlorosis humboldti"),
                         "validname"=c("Hypochlorosis lorquinii",
                                       "Hypochlorosis lorquinii"),
                         stringsAsFactors = FALSE)

# Also use a manual lookup table
res <- get_accepted_names(namelist = mylist,
                         master=master,
                         gen_syn = gen_syn_list,
                         namelookup = lookup_list,
                         canonical = "scname")

# Split the canonical name into genus, species and subspecies columns
mylist_s <- melt_canonical(mylist, canonical = "scname",
                          genus = "genus",
                          species = "species",
                          subspecies = "subspecies")

# Resolve names supplied as separate genus, species and subspecies columns
res <- get_accepted_names(namelist = mylist_s,
                         master=master,
                         gen_syn = gen_syn_list,
                         namelookup = lookup_list,
                         genus = "genus",
                         species = "species",
                         subspecies = "subspecies")

# Same call, with "itis" passed as the master source
res <- get_accepted_names(namelist = mylist_s,
                         master=master,
                         gen_syn = gen_syn_list,
                         namelookup = lookup_list,
                         mastersource = c("itis"),
                         genus = "genus",
                         species = "species",
                         subspecies = "subspecies")

# New name list that includes genus and family names
mylist <- data.frame("id"= c(11,12,13,14,15,16,17,18),
                    "scname" = c("Hypochlorosis ancharia",
                                 "Hypochlorosis ancharii",
                                 "Hypochlorosis",
                                 "Pseudonotis",
                                 "Lycaenidae",
                                 "Pseudonotis humboldtii",
                                 "Abrothrix longipilis",
                                 "Myrinana anchariana"),
                    stringsAsFactors = FALSE)

# Match genus and family names found in the canonical column
res <- get_accepted_names(namelist = mylist,
                         master=master,
                         match_higher = TRUE,
                         canonical = "scname")
# }
