
Csv2xyd: A Python Software for Processing Large Biodiversity Datasets for Endemism Analysis
By: Jonathan Liria and Ana Soto-Vivas
References
- Nelson G, Ellis S. The history and impact of digitization and digital data mobilization on biodiversity research. Phil. Trans. R. Soc. B. 2018;374:20170391. DOI: 10.1098/rstb.2017.0391
- Scott B, Baker E, Woodburn M, Vincent S, Hardy H, Smith V. The Natural History Museum Data Portal. Database. 2019;1–14. DOI: 10.1093/database/baz038
- Escalante T, Morrone JJ, Rodríguez-Tapia G. Biogeographic regions of North American mammals based on endemism. Biological Journal of the Linnean Society. 2013;110:485–499. DOI: 10.1111/bij.12142
- Casagranda D, de Grosso DML. Areas of Endemism: Methodological and Applied Biogeographic Contributions from South America. InTech. 2013. DOI: 10.5772/55482
- Noguera-Urbano E. El endemismo: diferenciación del término, métodos y aplicaciones. Acta Zoológica Mexicana. 2017;33:89–107. DOI: 10.21829/azm.2017.3311016
- Szumik CA, Cuezzo F, Goloboff PA, Chalup AE. An optimality criterion to determine areas of endemism. Syst. Biol. 2002;51:806–816. DOI: 10.1080/10635150290102483
- Szumik CA, Goloboff PA. Areas of endemism: an improved optimality criterion. Syst. Biol. 2004;53:968–977. DOI: 10.1080/10635150490888859
- Szumik C, Casagranda MD, Roig-Juñent S. Manual de NDM/VNDM: Programas para la identificación de áreas de endemismo. Instituto Argentino de Estudios Filogenéticos, Año V, Vol. 3. Argentina; 2006. https://www.lillo.org.ar/phylogeny/endemism/Manual_VNDM.pdf
- Liria J, Szumik CA, Goloboff PA. Analysis of endemism of world arthropod distribution data supports biogeographic regions and many established subdivisions. Cladistics. 2021;37:559–570. DOI: 10.1111/cla.12448
- Liria J. Áreas de endemismo de Ecuador: un análisis a partir de datos de distribución de especies de plantas, animales y hongos. Revista mexicana de biodiversidad. 2022;93:e934031. DOI: 10.22201/ib.20078706e.2022.93.4031
- Maldonado C, Molina CI, Zizka A, Persson C, Taylor CM, Albán J, Chilquillo E, Rønsted N, Antonelli A. Species diversity and distribution in the era of Big Data. Global Ecology and Biogeography. 2015;24:973–984. DOI: 10.1111/geb.12326
- Feng X, Enquist B, Park D, Boyle B, Breshears D, Gallagher R, Lien A, Newman E, Burger J, Maitner B, Merow C, Li Y, Huynh K, Ernst K, Baldwin E, Foden W, Hannah LB, Morueta-Holme N, Neves D, Núñez-Regueiro MM, Oliveira-Filho A, Peet R, Pillet M, Roehrdanz P, Sandel B, Serra-Diaz JM, Šímová I, Svenning J, Violle C, Weitemier T, Wiser S, Lopez-Hoffman L. A review of the heterogeneous landscape of biodiversity databases: Opportunities and challenges for a synthesized biodiversity knowledge base. Global Ecology and Biogeography. 2022;31(7):1242–1260. DOI: 10.1111/geb.13497
- Van Rossum G, Drake FL. Python 3 Reference Manual. Scotts Valley, CA: CreateSpace; 2009. p. 242.
- Lundh F. An introduction to tkinter. 1999. www.pythonware.com/library/tkinter/introduction/index.htm
- McKinney W. Data Structures for Statistical Computing in Python. In: van der Walt S, Millman J, editors. Proceedings of the 9th Python in Science Conference. 2010;56–61. DOI: 10.25080/Majora-92bf1922-00a
- Rocklin M. Dask: Parallel Computation with Blocked algorithms and Task Scheduling. In: Proceedings of the 14th Python in Science Conference; 2015. pp. 126–132. DOI: 10.25080/Majora-7b98e3ed-013
- Cohen S. FuzzyWuzzy: Fuzzy string matching in Python. 2011. https://github.com/seatgeek/fuzzywuzzy
- Folium developers. Folium: Python Data. Leaflet.js Maps. 2017. https://python-visualization.github.io/folium/
- Harris C, Millman K, Van der Walt S, Gommers R, Virtanen P, Cournapeau D, Wieser E, Taylor L, Berg S, Smith S, Kern R, Picus M, Hoyer S, van Kerkwijk M, Brett M, Haldane A, Fernández del Río J, Wiebe M, Peterson P, Gérard-Marchant P, Sheppard K, Reddy T, Weckesser W, Abbasi H, Gohlke C, Oliphant T. Array programming with NumPy. Nature. 2020;585:357–362. DOI: 10.1038/s41586-020-2649-2
- Jordahl K, Van den Bossche J, Fleischmann M, Wasserman M, McBride J, Gerard F. geopandas/geopandas: v0.8.1. 2020. DOI: 10.5281/zenodo.3946761
- Gillies S. Shapely: Geometric objects, predicates, and operations; 2007. https://shapely.readthedocs.io
- Szumik CA, Goloboff PA. Higher taxa and the identification of areas of endemism. Cladistics. 2015;31:568–572. DOI: 10.1111/cla.12112
- Hill M. Diversity and evenness: a unifying notation and its consequences. Ecology. 1973;54:427–432. DOI: 10.2307/1934352
- Hijmans R. DIVA-GIS: A free computer program for mapping and geographic data analysis; 2012. https://www.diva-gis.org/
- Atlas of Living Australia. Occurrence download “Cnidaria”. 2024. https://doi.ala.org.au/doi/ba60d4b0-c8cb-4542-8940-5c19144b24cd (Accessed 5 August 2024).
- GBIF.org. GBIF Occurrence download “Chordata”. 2024a. DOI: 10.15468/dl.uwujsm (Accessed 15 July 2024).
- GBIF.org. GBIF Occurrence download “Aves”. 2024b. DOI: 10.15468/dl.rxhk3s (Accessed 17 July 2024).
DOI: https://doi.org/10.5334/jors.538 | Journal eISSN: 2049-9647
Language: English
Submitted on: Sep 27, 2024
Accepted on: Jul 14, 2025
Published on: Jul 22, 2025
Published by: Ubiquity Press
In partnership with: Paradigm Publishing Services
Publication frequency: 1 issue per year
Keywords:
© 2025 Jonathan Liria, Ana Soto-Vivas, published by Ubiquity Press
This work is licensed under the Creative Commons Attribution 4.0 License.