occ_search: Search for GBIF occurrences

Description

Search for GBIF occurrences

Usage

occ_search(
  taxonKey = NULL,
  scientificName = NULL,
  country = NULL,
  publishingCountry = NULL,
  hasCoordinate = NULL,
  typeStatus = NULL,
  recordNumber = NULL,
  lastInterpreted = NULL,
  continent = NULL,
  geometry = NULL,
  geom_big = "asis",
  geom_size = 40,
  geom_n = 10,
  recordedBy = NULL,
  recordedByID = NULL,
  identifiedByID = NULL,
  basisOfRecord = NULL,
  datasetKey = NULL,
  eventDate = NULL,
  catalogNumber = NULL,
  year = NULL,
  month = NULL,
  decimalLatitude = NULL,
  decimalLongitude = NULL,
  elevation = NULL,
  depth = NULL,
  institutionCode = NULL,
  collectionCode = NULL,
  hasGeospatialIssue = NULL,
  issue = NULL,
  search = NULL,
  mediaType = NULL,
  subgenusKey = NULL,
  repatriated = NULL,
  phylumKey = NULL,
  kingdomKey = NULL,
  classKey = NULL,
  orderKey = NULL,
  familyKey = NULL,
  genusKey = NULL,
  speciesKey = NULL,
  establishmentMeans = NULL,
  degreeOfEstablishment = NULL,
  protocol = NULL,
  license = NULL,
  organismId = NULL,
  publishingOrg = NULL,
  stateProvince = NULL,
  waterBody = NULL,
  locality = NULL,
  occurrenceStatus = "PRESENT",
  gadmGid = NULL,
  coordinateUncertaintyInMeters = NULL,
  verbatimScientificName = NULL,
  eventId = NULL,
  identifiedBy = NULL,
  networkKey = NULL,
  verbatimTaxonId = NULL,
  occurrenceId = NULL,
  organismQuantity = NULL,
  organismQuantityType = NULL,
  relativeOrganismQuantity = NULL,
  iucnRedListCategory = NULL,
  lifeStage = NULL,
  isInCluster = NULL,
  distanceFromCentroidInMeters = NULL,
  geoDistance = NULL,
  sex = NULL,
  dwcaExtension = NULL,
  gbifId = NULL,
  gbifRegion = NULL,
  projectId = NULL,
  programme = NULL,
  preparations = NULL,
  datasetId = NULL,
  datasetName = NULL,
  publishedByGbifRegion = NULL,
  island = NULL,
  islandGroup = NULL,
  taxonId = NULL,
  taxonConceptId = NULL,
  taxonomicStatus = NULL,
  acceptedTaxonKey = NULL,
  collectionKey = NULL,
  institutionKey = NULL,
  otherCatalogNumbers = NULL,
  georeferencedBy = NULL,
  installationKey = NULL,
  hostingOrganizationKey = NULL,
  crawlId = NULL,
  modified = NULL,
  higherGeography = NULL,
  fieldNumber = NULL,
  parentEventId = NULL,
  samplingProtocol = NULL,
  sampleSizeUnit = NULL,
  pathway = NULL,
  gadmLevel0Gid = NULL,
  gadmLevel1Gid = NULL,
  gadmLevel2Gid = NULL,
  gadmLevel3Gid = NULL,
  earliestEonOrLowestEonothem = NULL,
  latestEonOrHighestEonothem = NULL,
  earliestEraOrLowestErathem = NULL,
  latestEraOrHighestErathem = NULL,
  earliestPeriodOrLowestSystem = NULL,
  latestPeriodOrHighestSystem = NULL,
  earliestEpochOrLowestSeries = NULL,
  latestEpochOrHighestSeries = NULL,
  earliestAgeOrLowestStage = NULL,
  latestAgeOrHighestStage = NULL,
  lowestBiostratigraphicZone = NULL,
  highestBiostratigraphicZone = NULL,
  group = NULL,
  formation = NULL,
  member = NULL,
  bed = NULL,
  associatedSequences = NULL,
  isSequenced = NULL,
  startDayOfYear = NULL,
  endDayOfYear = NULL,
  limit = 500,
  start = 0,
  fields = "all",
  return = NULL,
  facet = NULL,
  facetMincount = NULL,
  facetMultiselect = NULL,
  skip_validate = TRUE,
  curlopts = list(http_version = 2),
  ...
)

Value

An object of class gbif, which is an S3 class list, with slots for metadata (meta), the occurrence data itself (data), the taxonomic hierarchy data (hier), and media metadata (media). In addition, the object has attributes listing the user-supplied arguments and whether it was a 'single' or 'many' search; that is, if you supply two values of the datasetKey parameter, two searches are done and it is a 'many' search. meta is a list of length four with offset, limit, endOfRecords and count fields. data is a tibble (aka data.frame). hier is a list of data.frames of the unique set of taxa found, where each data.frame is its taxonomic classification. media is a list of media objects, where each element holds a set of metadata about the media object.
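
As an illustration of how the meta fields might be used, the following base R sketch computes the zero-based offsets needed to page through a result set. The helper page_starts() is hypothetical (it is not part of rgbif), and the meta list is an invented stand-in with the four fields named above rather than real output.

```r
# Hypothetical helper (not part of rgbif): zero-based offsets needed to page
# through `count` records, `limit` records at a time.
page_starts <- function(count, limit = 500) {
  stopifnot(count >= 0, limit > 0)
  if (count == 0) return(numeric(0))
  seq(from = 0, to = count - 1, by = limit)
}

# Stand-in for the meta slot of a result; the values are invented.
meta <- list(offset = 0, limit = 500, endOfRecords = FALSE, count = 1234)

starts <- page_starts(meta$count, meta$limit)
print(starts)
```

Each offset could then be supplied as the start argument of a follow-up call, assuming the same limit is used throughout.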

Arguments

taxonKey

(numeric) A taxon key from the GBIF backbone. All included and synonym taxa are included in the search, so a search for Aves with taxonKey=212 will match all birds, no matter which species. You can pass many keys, e.g. occ_search(taxonKey=c(1,212)).

scientificName

A scientific name from the GBIF backbone. All included and synonym taxa are included in the search.

country

(character) The 2-letter country code (ISO-3166-1) of the country in which the occurrence was recorded. See enumeration_country().

publishingCountry

The 2-letter country code (as per ISO-3166-1) of the country of the organization that published the occurrence record. See enumeration_country().

hasCoordinate

(logical) Return only occurrence records with lat/long data (TRUE) or only records without lat/long data (FALSE). The default, NULL, returns all records.

typeStatus

Type status of the specimen. One of many options.

recordNumber

Number recorded by collector of the data, different from GBIF record number.

lastInterpreted

Date the record was last modified in GBIF, in ISO 8601 format: yyyy, yyyy-MM, yyyy-MM-dd, or MM-dd. Supports range queries, 'smaller,larger' (e.g., '1990,1991', whereas '1991,1990' wouldn't work).

continent

The source-supplied continent. One of:

"africa"
"antarctica"
"asia"
"europe"
"north_america"
"oceania"
"south_america"

Continent is not inferred but only populated if provided by the dataset publisher. Applying this filter may exclude many relevant records.

geometry

(character) Searches for occurrences inside a polygon in Well Known Text (WKT) format. A WKT shape written as either

"POINT"
"LINESTRING"
"LINEARRING"
"POLYGON"
"MULTIPOLYGON"

For example, "POLYGON((37.08 46.86,38.06 46.86,38.06 47.28,37.08 47.28,37.08 46.86))". Note that the last point repeats the first so that the ring is closed. See also the section WKT below.
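
A minimal base R sketch of building such a polygon string from a bounding box follows. The helper bbox_to_wkt() is hypothetical (it is not part of rgbif); it traces the corners in counter-clockwise order, matching the point order of the example above, and repeats the first point to close the ring.

```r
# Hypothetical helper (not part of rgbif): build a closed WKT polygon from a
# bounding box given as longitude (x) and latitude (y) limits.
bbox_to_wkt <- function(xmin, ymin, xmax, ymax) {
  stopifnot(xmin < xmax, ymin < ymax)
  # Corners in counter-clockwise order; the first point is repeated at the end.
  xs <- c(xmin, xmax, xmax, xmin, xmin)
  ys <- c(ymin, ymin, ymax, ymax, ymin)
  paste0("POLYGON((", paste(xs, ys, collapse = ","), "))")
}

wkt <- bbox_to_wkt(37.08, 46.86, 38.06, 47.28)
print(wkt)
```

The resulting string could then be passed as the geometry argument.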

geom_big

(character) One of "bbox" or "asis" (default).

geom_size

(integer) An integer indicating size of the cell. Default: 40.

geom_n

(integer) An integer indicating number of cells in each dimension. Default: 10.

recordedBy

(character) The person who recorded the occurrence.

recordedByID

(character) Identifier (e.g. ORCID) for the person who recorded the occurrence.

identifiedByID

(character) Identifier (e.g. ORCID) for the person who provided the taxonomic identification of the occurrence.

basisOfRecord

(character) The specific nature of the data record. See here.

"FOSSIL_SPECIMEN"
"HUMAN_OBSERVATION"
"MATERIAL_CITATION"
"MATERIAL_SAMPLE"
"LIVING_SPECIMEN"
"MACHINE_OBSERVATION"
"OBSERVATION"
"PRESERVED_SPECIMEN"
"OCCURRENCE"

datasetKey

(character) The occurrence dataset UUID key. It can be found in the dataset page URL. For example, "7e380070-f762-11e1-a439-00145eb45e9a" is the key for Natural History Museum (London) Collection Specimens.
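
Because a stray space or missing character in a key is easy to overlook, it can help to check the format before searching. The following base R sketch uses a hypothetical helper, is_uuid() (not part of rgbif), that only tests the 8-4-4-4-12 hexadecimal layout; it does not check that the key exists in GBIF.

```r
# Hypothetical helper (not part of rgbif): TRUE when x has the 8-4-4-4-12
# hexadecimal layout of a UUID. This checks the format only.
is_uuid <- function(x) {
  pattern <- "^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$"
  grepl(pattern, tolower(x))
}

good_key <- "7e380070-f762-11e1-a439-00145eb45e9a"
bad_key  <- "7e380070-f762-11e1-a439-00145 eb45e9a"  # contains a stray space

print(is_uuid(c(good_key, bad_key)))
```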

eventDate

(character) Occurrence date in ISO 8601 format: yyyy, yyyy-MM, yyyy-MM-dd, or MM-dd. Supports range queries, 'smaller,larger' (e.g., '1990,1991', whereas '1991,1990' wouldn't work).

catalogNumber

(character) An identifier of any form assigned by the source within a physical collection or digital dataset for the record. It may not be unique, but should be fairly unique in combination with the institution and collection code.

year

The 4-digit year. A year of 98 will be interpreted as AD 98. Supports range queries, 'smaller,larger' (e.g., '1990,1991', whereas '1991,1990' wouldn't work).

month

The month of the year, starting with 1 for January. Supports range queries, 'smaller,larger' (e.g., '1,2', whereas '2,1' wouldn't work).

decimalLatitude

Latitude in decimals between -90 and 90 based on WGS84. Supports range queries, 'smaller,larger' (e.g., '25,30', whereas '30,25' wouldn't work).

decimalLongitude

Longitude in decimals between -180 and 180 based on WGS84. Supports range queries, 'smaller,larger' (e.g., '-0.4,-0.2', whereas '-0.2,-0.4' wouldn't work).

elevation

Elevation in meters above sea level. Supports range queries, 'smaller,larger' (e.g., '5,30', whereas '30,5' wouldn't work).

depth

Depth in meters relative to elevation. For example 10 meters below a lake surface with given elevation. Supports range queries, 'smaller,larger' (e.g., '5,30', whereas '30,5' wouldn't work).
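
Several of the parameters above (year, month, decimalLatitude, decimalLongitude, elevation, depth) accept a 'smaller,larger' range string. As a sketch, a small base R helper can build that string and reject reversed bounds before any search is made. The name range_query() is hypothetical and not part of rgbif.

```r
# Hypothetical helper (not part of rgbif): build a 'smaller,larger' range
# string, failing early if the bounds are reversed.
range_query <- function(lower, upper) {
  stopifnot(length(lower) == 1, length(upper) == 1, lower <= upper)
  paste(lower, upper, sep = ",")
}

year_range <- range_query(1990, 1991)
lat_range  <- range_query(25, 30)
lon_range  <- range_query(-0.4, -0.2)
print(c(year_range, lat_range, lon_range))

# The strings could then be passed on, for example
# occ_search(year = year_range), which is not run here because it needs
# network access.
```

Calling range_query(1991, 1990) stops with an error instead of producing a range that would not work.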

institutionCode

An identifier of any form assigned by the source to identify the institution the record belongs to.

collectionCode

(character) An identifier of any form assigned by the source to identify the physical collection or digital dataset uniquely within the context of an institution.

hasGeospatialIssue

(logical) Includes/excludes occurrence records which contain spatial issues (as determined in our record interpretation), i.e. hasGeospatialIssue=TRUE returns only those records with spatial issues while hasGeospatialIssue=FALSE includes only records without spatial issues. The absence of this parameter returns any record with or without spatial issues.

issue

(character) One or more of many possible issues with each occurrence record. Issues passed to this parameter filter results by the issue. One of many options. See here for definitions.

search

(character) Query terms. The value for this parameter can be a simple word or a phrase. For example, search="puma".

mediaType

(character) Media type of "MovingImage", "Sound", or "StillImage".

subgenusKey

(numeric) Subgenus classification key.

repatriated

(character) Searches for records whose publishing country differs from the country in which the record was recorded.

phylumKey

(numeric) Phylum classification key.

kingdomKey

(numeric) Kingdom classification key.

classKey

(numeric) Class classification key.

orderKey

(numeric) Order classification key.

familyKey

(numeric) Family classification key.

genusKey

(numeric) Genus classification key.

speciesKey

(numeric) Species classification key.

establishmentMeans

(character) Provides information about whether an organism or organisms have been introduced to a given place and time through the direct or indirect activity of modern humans.

"Introduced"
"Native"
"NativeReintroduced"
"Vagrant"
"Uncertain"
"IntroducedAssistedColonisation"

degreeOfEstablishment

(character) Provides information about the degree to which an Organism survives, reproduces, and expands its range at the given place and time. One of many options.

protocol

(character) Protocol or mechanism used to provide the occurrence record. One of many options.

license

(character) The type of license applied to the dataset or record.

"CC0_1_0"
"CC_BY_4_0"
"CC_BY_NC_4_0"

organismId

(numeric) An identifier for the Organism instance (as opposed to a particular digital record of the Organism). May be a globally unique identifier or an identifier specific to the data set.

publishingOrg

(character) The publishing organization key (a UUID).

stateProvince

(character) The name of the next smaller administrative region than country (state, province, canton, department, region, etc.) in which the Location occurs.

waterBody

(character) The name of the water body in which the locations occur.

locality

(character) The specific description of the place.

occurrenceStatus

(character) Default is "PRESENT". Specify whether search should return "PRESENT" or "ABSENT" data.

gadmGid

(character) The GADM ID of the area from which occurrences are desired. See https://gadm.org/.

coordinateUncertaintyInMeters

A number or range between 0 and 1,000,000 which specifies the desired coordinate uncertainty. A coordinateUncertaintyInMeters=1000 will be interpreted as all records with exactly 1000 m. Supports range queries, 'smaller,larger' (e.g., '1000,10000', whereas '10000,1000' wouldn't work).

verbatimScientificName

(character) Scientific name as provided by the source.

eventId

(character) Identifier(s) for a sampling event.

identifiedBy

(character) Names of people, groups, or organizations who assigned the Taxon to the subject.

networkKey

(character) The occurrence network key (a UUID).

verbatimTaxonId

(character) The taxon identifier provided to GBIF by the data publisher.

occurrenceId

(character) Occurrence ID from the source.

organismQuantity

A number or range which specifies the desired organism quantity. An organismQuantity=5 will be interpreted as all records with exactly 5. Supports range queries, 'smaller,larger' (e.g., '5,20', whereas '20,5' wouldn't work).

organismQuantityType

(character) The type of quantification system used for the quantity of organisms. For example, "individuals" or "biomass".

relativeOrganismQuantity

(numeric) The relative measurement of the quantity of the organism (a number between 0 and 1). A relativeOrganismQuantity=0.1 will be interpreted as all records with exactly 0.1. Supports range queries, 'smaller,larger' (e.g., '0.1,0.5', whereas '0.5,0.1' wouldn't work).

iucnRedListCategory

(character) The IUCN threat status category.

"NE" (Not Evaluated)
"DD" (Data Deficient)
"LC" (Least Concern)
"NT" (Near Threatened)
"VU" (Vulnerable)
"EN" (Endangered)
"CR" (Critically Endangered)
"EX" (Extinct)
"EW" (Extinct in the Wild)

lifeStage

(character) The life stage of the occurrence. One of many options.

isInCluster

(logical) Identify potentially related records on GBIF.

distanceFromCentroidInMeters

A number or range. A value of "2000,*" means at least 2km from known centroids. A value of "0" would mean occurrences exactly on known centroids. A value of "0,2000" would mean within 2km of centroids. Max value is 5000.

geoDistance

(character) Filters to match occurrence records with coordinate values within a specified distance of a coordinate. Distance may be specified in kilometres (km) or metres (m). Example : "90,100,5km"
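
A small base R sketch of assembling a geoDistance string is given below. The helper geo_distance() is hypothetical (not part of rgbif), and it assumes the three components are ordered latitude, longitude, distance, which is consistent with the example above but is not stated explicitly here.

```r
# Hypothetical helper (not part of rgbif): build a geoDistance string.
# Assumption: the components are ordered latitude, longitude, distance.
geo_distance <- function(lat, lon, distance, unit = c("km", "m")) {
  unit <- match.arg(unit)
  stopifnot(lat >= -90, lat <= 90, lon >= -180, lon <= 180, distance > 0)
  paste0(lat, ",", lon, ",", distance, unit)
}

gd <- geo_distance(90, 100, 5, "km")
print(gd)
```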

sex

(character) The sex of the biological individual(s) represented in the occurrence.

dwcaExtension

(character) A known Darwin Core Archive extension RowType. Limits the search to occurrences which have this extension, although they will not necessarily have any useful data recorded using the extension.

gbifId

(numeric) The unique GBIF key for a single occurrence.

gbifRegion

(character) GBIF region based on country code.

projectId

(character) The identifier for a project, which is often assigned by a funded programme.

programme

(character) A group of activities, often associated with a specific funding stream, such as the GBIF BID programme.

preparations

(character) Preparation or preservation method for a specimen.

datasetId

(character) The ID of the dataset. Parameter may be repeated. Example : https://doi.org/10.1594/PANGAEA.315492

datasetName

(character) The exact name of the dataset. Not the same as dataset title.

publishedByGbifRegion

(character) GBIF region based on the owning organization's country.

island

(character) The name of the island on or near which the location occurs.

islandGroup

(character) The name of the island group in which the location occurs.

taxonId

(character) The taxon identifier provided to GBIF by the data publisher. Example : urn:lsid:dyntaxa.se:Taxon:103026

taxonConceptId

(character) An identifier for the taxonomic concept to which the record refers - not for the nomenclatural details of a taxon. Example : 8fa58e08-08de-4ac1-b69c-1235340b7001

taxonomicStatus

(character) A taxonomic status. Example : SYNONYM

acceptedTaxonKey

(numeric) A taxon key from the GBIF backbone. Only synonym taxa are included in the search, so a search for Aves with acceptedTaxonKey=212 will match occurrences identified as birds, but not any known family, genus or species of bird.

collectionKey

(character) A key (UUID) for a collection registered in the Global Registry of Scientific Collections. Example : dceb8d52-094c-4c2c-8960-75e0097c6861

institutionKey

(character) A key (UUID) for an institution registered in the Global Registry of Scientific Collections.

otherCatalogNumbers

(character) Previous or alternate fully qualified catalog numbers.

georeferencedBy

(character) Name of a person, group, or organization who determined the georeference (spatial representation) for the location. Example : Brad Millen

installationKey

(character) The occurrence installation key (a UUID). Example : 17a83780-3060-4851-9d6f-029d5fcb81c9

hostingOrganizationKey

(character) The key (UUID) of the publishing organization whose installation (server) hosts the original dataset. Example : fbca90e3-8aed-48b1-84e3-369afbd000ce

crawlId

(numeric) Crawl attempt that harvested this record.

modified

(character) The most recent date-time on which the occurrence was changed, according to the publisher. Can be a range. Example : 2023-02-20

higherGeography

(character) Geographic name less specific than the information captured in the locality term.

fieldNumber

(character) An identifier given to the event in the field. Often serves as a link between field notes and the event.

parentEventId

(character) An identifier for the information associated with a sampling event.

samplingProtocol

(character) The name of, reference to, or description of the method or protocol used during a sampling event. Example : malaise trap

sampleSizeUnit

(character) The unit of measurement of the size (time duration, length, area, or volume) of a sample in a sampling event. Example : hectares

pathway

(character) The process by which an organism came to be in a given place at a given time, as defined in the GBIF Pathway vocabulary. Example : Agriculture

gadmLevel0Gid

(character) A GADM geographic identifier at the zero level, for example AGO.

gadmLevel1Gid

(character) A GADM geographic identifier at the first level, for example AGO.1_1.

gadmLevel2Gid

(character) A GADM geographic identifier at the second level, for example AFG.1.1_1.

gadmLevel3Gid

(character) A GADM geographic identifier at the third level, for example AFG.1.1.1_1.

earliestEonOrLowestEonothem