Links for fulltext
(May Require Subscription)
- Publisher Website: 10.1038/s41597-023-02626-w
- Scopus: eid_2-s2.0-85175688705
- PMID: 37919303
- WOS: WOS:001098042400004
Article: A globally synthesised and flagged bee occurrence dataset and cleaning workflow
Title | A globally synthesised and flagged bee occurrence dataset and cleaning workflow |
---|---|
Authors | Dorey, JB; Fischer, EE; Chesshire, PR; Nava-Bolaños, A; O'Reilly, RL; Bossert, S; Collins, SM; Lichtenberg, EM; Tucker, EM; Smith-Pardo, A; Falcon-Brindis, A; Guevara, DA; Ribeiro, B; de Pedro, D; Pickering, J; Hung, KLJ; Parys, KA; McCabe, LM; Rogan, MS; Minckley, RL; Velazco, SJE; Griswold, T; Zarrillo, TA; Jetz, W; Sica, YV; Orr, MC; Guzman, LM; Ascher, JS; Hughes, AC; Cobb, NS |
Issue Date | 2-Nov-2023 |
Publisher | Nature Research |
Citation | Scientific Data, 2023, v. 10, n. 1 |
Abstract | Species occurrence data are foundational for research, conservation, and science communication, but the limited availability and accessibility of reliable data represents a major obstacle, particularly for insects, which face mounting pressures. We present BeeBDC, a new R package, and a global bee occurrence dataset to address this issue. We combined >18.3 million bee occurrence records from multiple public repositories (GBIF, SCAN, iDigBio, USGS, ALA) and smaller datasets, then standardised, flagged, deduplicated, and cleaned the data using the reproducible BeeBDC R-workflow. Specifically, we harmonised species names (following established global taxonomy), country names, and collection dates, and we added record-level flags for a series of potential quality issues. These data are provided in two formats, "cleaned" and "flagged-but-uncleaned". The BeeBDC package with online documentation provides end users the ability to modify filtering parameters to address their research questions. By publishing reproducible R workflows and globally cleaned datasets, we can increase the accessibility and reliability of downstream analyses. This workflow can be implemented for other taxa to support research and conservation. |
Persistent Identifier | http://hdl.handle.net/10722/344874 |
ISSN | 2052-4463 (2023 Impact Factor: 5.8; 2023 SCImago Journal Rankings: 1.937) |
ISI Accession Number ID | WOS:001098042400004 |
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Dorey, JB | - |
dc.contributor.author | Fischer, EE | - |
dc.contributor.author | Chesshire, PR | - |
dc.contributor.author | Nava-Bolaños, A | - |
dc.contributor.author | O'Reilly, RL | - |
dc.contributor.author | Bossert, S | - |
dc.contributor.author | Collins, SM | - |
dc.contributor.author | Lichtenberg, EM | - |
dc.contributor.author | Tucker, EM | - |
dc.contributor.author | Smith-Pardo, A | - |
dc.contributor.author | Falcon-Brindis, A | - |
dc.contributor.author | Guevara, DA | - |
dc.contributor.author | Ribeiro, B | - |
dc.contributor.author | de Pedro, D | - |
dc.contributor.author | Pickering, J | - |
dc.contributor.author | Hung, KLJ | - |
dc.contributor.author | Parys, KA | - |
dc.contributor.author | McCabe, LM | - |
dc.contributor.author | Rogan, MS | - |
dc.contributor.author | Minckley, RL | - |
dc.contributor.author | Velazco, SJE | - |
dc.contributor.author | Griswold, T | - |
dc.contributor.author | Zarrillo, TA | - |
dc.contributor.author | Jetz, W | - |
dc.contributor.author | Sica, YV | - |
dc.contributor.author | Orr, MC | - |
dc.contributor.author | Guzman, LM | - |
dc.contributor.author | Ascher, JS | - |
dc.contributor.author | Hughes, AC | - |
dc.contributor.author | Cobb, NS | - |
dc.date.accessioned | 2024-08-12T04:08:03Z | - |
dc.date.available | 2024-08-12T04:08:03Z | - |
dc.date.issued | 2023-11-02 | - |
dc.identifier.citation | Scientific Data, 2023, v. 10, n. 1 | - |
dc.identifier.issn | 2052-4463 | - |
dc.identifier.uri | http://hdl.handle.net/10722/344874 | - |
dc.description.abstract | Species occurrence data are foundational for research, conservation, and science communication, but the limited availability and accessibility of reliable data represents a major obstacle, particularly for insects, which face mounting pressures. We present BeeBDC, a new R package, and a global bee occurrence dataset to address this issue. We combined >18.3 million bee occurrence records from multiple public repositories (GBIF, SCAN, iDigBio, USGS, ALA) and smaller datasets, then standardised, flagged, deduplicated, and cleaned the data using the reproducible BeeBDC R-workflow. Specifically, we harmonised species names (following established global taxonomy), country names, and collection dates, and we added record-level flags for a series of potential quality issues. These data are provided in two formats, "cleaned" and "flagged-but-uncleaned". The BeeBDC package with online documentation provides end users the ability to modify filtering parameters to address their research questions. By publishing reproducible R workflows and globally cleaned datasets, we can increase the accessibility and reliability of downstream analyses. This workflow can be implemented for other taxa to support research and conservation. | - |
dc.language | eng | - |
dc.publisher | Nature Research | - |
dc.relation.ispartof | Scientific Data | - |
dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | - |
dc.title | A globally synthesised and flagged bee occurrence dataset and cleaning workflow | - |
dc.type | Article | - |
dc.identifier.doi | 10.1038/s41597-023-02626-w | - |
dc.identifier.pmid | 37919303 | - |
dc.identifier.scopus | eid_2-s2.0-85175688705 | - |
dc.identifier.volume | 10 | - |
dc.identifier.issue | 1 | - |
dc.identifier.eissn | 2052-4463 | - |
dc.identifier.isi | WOS:001098042400004 | - |
dc.publisher.place | BERLIN | - |
dc.identifier.issnl | 2052-4463 | - |