This function calls the epitweetr database, which provides geolocation for the languages that have been activated and successfully processed in the Shiny app.
The geolocation process tries to find the best match in the GeoNames database (https://www.geonames.org/), including all local aliases for words.
If no language is associated with the text, all tokens are sent as a query to the indexed GeoNames database.
If a language code is associated with the text and this language has been trained in epitweetr, entity recognition techniques are used to identify the tokens in the text most likely to contain a location,
and only these tokens are sent to the GeoNames query.
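
The two cases above can be sketched as follows. This is an illustrative Python sketch, not epitweetr's implementation: the function `select_query_tokens`, the `trained_languages` argument, and the `is_location_candidate` predicate (a stand-in for the entity recognition step) are all hypothetical names. The text does not say what happens when a language code is present but not trained; the sketch assumes it falls back to sending all tokens.

```python
from typing import Callable, Iterable, List, Optional


def select_query_tokens(
    text: str,
    lang: Optional[str],
    trained_languages: Iterable[str],
    is_location_candidate: Callable[[str, str], bool],
) -> List[str]:
    """Choose which tokens would be sent to the GeoNames query (hypothetical helper)."""
    # Naive whitespace tokenization, for illustration only.
    tokens = text.split()

    # No language: every token is sent to the query.
    # ASSUMPTION: an untrained language is treated the same way.
    if lang is None or lang not in set(trained_languages):
        return tokens

    # Trained language: keep only the tokens that the (hypothetical)
    # entity recognition predicate flags as likely locations.
    return [tok for tok in tokens if is_location_candidate(tok, lang)]
```

For example, with a toy predicate that flags capitalized tokens, "outbreak in Lyon today" with a trained language yields only "Lyon", while the same text with no language yields all four tokens.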
A custom scoring function gives more weight to cities, with the weight increasing with population, to help with disambiguation.
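
One way such population weighting could work is sketched below in Python. The actual epitweetr scoring formula is not given here, so everything in this block is an assumption made for illustration: the `Candidate` fields, the logarithmic boost, and the function names `weighted_score` and `best_candidate`.

```python
import math
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Candidate:
    """A hypothetical GeoNames match for a query."""
    name: str
    match_score: float  # text-match score from the index (assumed non-negative)
    is_city: bool
    population: int


def weighted_score(c: Candidate) -> float:
    """Boost cities by a factor that grows with population (assumed formula)."""
    if not c.is_city:
        return c.match_score
    # log10 keeps very large cities from overwhelming the text-match score.
    return c.match_score * (1.0 + math.log10(1 + max(c.population, 0)))


def best_candidate(candidates: List[Candidate]) -> Optional[Candidate]:
    """Return the candidate with the highest weighted score, or None if empty."""
    if not candidates:
        return None
    return max(candidates, key=weighted_score)
```

With two candidates named "Paris" that have the same text-match score, this sketch prefers the one with the larger population, which is the disambiguation effect described above.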
Rules for forcing the geolocation choices of the algorithm and for tuning performance with manual annotations can be defined on the geotag tab of the Shiny app.
A prerequisite to this function is that the tasks download_dependencies, update_geonames and update_languages have been run successfully.
This function is called from the Shiny app on the geolocation evaluation tab, but it can also be used to evaluate the epitweetr geolocation algorithm manually.