Researched and assessed the current state of taxonomic key generation. Drew on my data science experience to identify gaps and provide recommendations for the field. Supervised by Prof. John Caspersen.

Abstract

Taxonomic identification is immensely useful in addressing the well-documented biodiversity crisis, as well as in areas such as weed, pest, and disease monitoring, agriculture, medicine, and personal enjoyment. Taxonomic keys are a traditional, human-driven approach to identification and, despite their utility, have declined in relevance and production because of the taxonomic crisis and the immense effort they require. In their place, automated identification approaches such as image recognition and DNA barcoding have been developed to bypass the need for taxonomists and to produce a more efficient, uniform, and repeatable identification procedure. Such approaches, however, have several disadvantages that taxonomic keys do not share, which shows that keys remain necessary alongside automated approaches. Unfortunately, taxonomic keys have received very little attention from those applying computers to taxonomic identification, and they remain largely manual in their production and print-based in their use. There is a clear need for taxonomic keys to be modernized. Although some work towards this has been done, much of it has been relatively unsuccessful. Fortunately, there have been successful efforts to centralize and standardize taxonomic data, specifically classification and nomenclature, descriptions, and observations, all of which are useful for modernizing taxonomic keys. This thesis provides guidance, from the perspective of a data scientist, towards the modernization of taxonomic keys, taxonomic identification, and taxonomy as a whole. The guidance covers the elimination of repeated work, an emphasis on probability and on the recording of character observations, the importance of citizen science, and a vision for a unified platform for taxonomic identification.

Thesis

Thesis_final_document.pdf