We are working on a way to programmatically flag and/or remove these duplicate records. As you can imagine, this is rather difficult: data is often lost in translation, the number of significant digits can differ from provider to provider for the same data, etc.
Still, we think a single R interface to many occurrence record providers will provide a consistent way to work with occurrence data, making analyses and visualizations more repeatable across providers.
We are working on a set of tools for cleaning data, as well as removing duplicates, in the `spocc_clean` function, so keep an eye on that.
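To make the significant-digits problem concrete, here is a minimal base R sketch of one way duplicates could be flagged. It is not how `spocc_clean` works: the `flag_duplicates` helper, the column names, the made-up records, and the rounding tolerance are all assumptions for illustration.

```r
# Made-up occurrence records; the first two describe the same observation
# reported with different coordinate precision by two providers.
occ <- data.frame(
  name      = c("Accipiter striatus", "Accipiter striatus", "Accipiter striatus"),
  longitude = c(-97.12345, -97.12, -104.88),
  latitude  = c(32.87654, 32.88, 21.42),
  prov      = c("gbif", "ecoengine", "gbif"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: flag a record as a duplicate when its name and its
# rounded coordinates match an earlier record. `digits` is an assumed
# tolerance, not a recommended value.
flag_duplicates <- function(df, digits = 2) {
  key <- paste(
    tolower(trimws(df$name)),
    round(df$longitude, digits),
    round(df$latitude, digits),
    sep = "|"
  )
  df$duplicate <- duplicated(key)
  df
}

flagged <- flag_duplicates(occ)
flagged[, c("name", "prov", "duplicate")]
```

Rounding is a blunt tool: two nearly identical coordinates can fall on either side of a rounding boundary and be missed, and two genuinely distinct records can collapse into one key. That is part of why this is hard to do well.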
Do get in touch with us if you have concerns or ideas for eliminating duplicates, etc., at