New noninvasive technique identifies animal species in historical parchments
Species identification using reflectance spectrophotometry and machine learning
Intelligent Computing
A groundbreaking study by researchers from the University of Namur, Belgium, introduces a novel, contactless method for identifying the animal species used in historical parchment manuscripts, an essential aspect of cultural heritage studies. Traditionally, species identification has been performed using slightly invasive methods, but this classification model instead uses reflectance spectrophotometry, covering the ultraviolet, visible and near-infrared spectra, combined with machine learning for data analysis. The research was published Oct. 17 in Intelligent Computing, a Science Partner Journal, in an article titled “Animal Species Identification in Historical Parchments by Continuous Wavelet Transform–Convolutional Neural Network Classifier Applied to Ultraviolet–Visible–Near-Infrared Spectroscopic Data.”
The proposed classification model was evaluated for accuracy using both k-fold and historical validation techniques. In the k-fold validation, specifically a 4-fold cross-validation in which 75% of the data was used for training and 25% for testing in each fold, the model achieved an average accuracy of 85% across the training and validation sets. In the historical validation, the model was trained on a dataset containing the modern parchment spectra and tested on a dataset of 64 historical parchment spectra, achieving 79% accuracy despite the covariate shift between historical and modern parchments. The new method outperformed other classification methods, such as k-nearest neighbors and support vector machines, under both validation schemes, confirming its ability to classify species from ultraviolet–visible–near-infrared reflectance spectra.
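The 4-fold scheme described above can be sketched in a few lines of Python. This is only an illustration of the splitting logic, not the authors' code: the function names, the sample count of 100 and the per-fold accuracies are hypothetical, since the release does not give the size of the modern dataset or the individual fold results.

```python
import random

def k_fold_indices(n_samples, k=4, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index, so fold sizes differ by at most 1.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train_idx, test_idx

def mean_accuracy(fold_accuracies):
    """Average the per-fold accuracies into a single reported figure."""
    return sum(fold_accuracies) / len(fold_accuracies)

# Hypothetical dataset of 100 spectra: each fold trains on 75 and tests on 25.
splits = list(k_fold_indices(n_samples=100, k=4))
for train_idx, test_idx in splits:
    print(len(train_idx), len(test_idx))

# Hypothetical per-fold accuracies, averaged the way a k-fold score is reported.
print(mean_accuracy([0.80, 0.90, 0.85, 0.85]))
```

With four folds, every spectrum is used for testing exactly once, and each model is trained on the remaining three quarters of the data.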
The current limitation of the method is mainly the small size of the dataset used to train the model, according to the authors. In the future, data from parchments from more individual animals could be collected for inclusion in the dataset. Insights from the physicochemical properties of aging could also improve the training procedure, and therefore, the robustness and generalizability of the model.
In addition to achieving high classification accuracy, the study aimed to demystify the "black-box" nature of the classification process through explainability techniques. Shapley additive explanation values were used to reveal how the model reached its classifications. The mean Shapley values showed a different dependence on wavelength for the calf, sheep and goat parchment datasets. However, while the averaged values revealed broad trends, individual samples showed their own variations, a reminder of how complex it is to classify these spectra accurately.
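The idea behind Shapley values can be shown on a toy example. The sketch below computes exact Shapley values by enumerating every feature ordering, which is only feasible for a handful of features. Practical tools rely on approximations, and the release does not say which implementation the authors used. The three-feature `toy_model`, the input and the all-zero baseline are hypothetical stand-ins for a spectral classifier.

```python
from itertools import permutations

def shapley_values(model, x, baseline):
    """Exact Shapley values: average each feature's marginal contribution
    over all orderings in which features are switched from baseline to x."""
    n = len(x)
    phi = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        current = list(baseline)
        prev = model(current)
        for i in order:
            current[i] = x[i]          # reveal feature i
            new = model(current)
            phi[i] += new - prev       # marginal contribution of feature i
            prev = new
    return [p / len(orderings) for p in phi]

def toy_model(features):
    """Hypothetical model: one additive term plus one interaction term."""
    a, b, c = features
    return 2.0 * a + b * c

x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)
print(phi)
```

The attributions always add up to the difference between the model output for the sample and for the baseline, which is what makes per-wavelength averages of such values interpretable.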
From a physicochemical perspective, the researchers emphasized that species identification is not solely dependent on narrow-band molecular signatures. Instead, broader spectral features provide sufficient information for differentiation. “We hope this study will provide cultural heritage practitioners with valuable insights derived from high-level analysis,” the authors stated.
The study used a diverse dataset of ultraviolet–visible–near-infrared reflectance spectra from parchments made from calf, sheep, and goat skins. Samples varied in breed, individual animal, and preparation methods to enhance generalizability. To address the scarcity of labeled historical spectra, both modern and historical parchments were used in this study. Historical parchments were labeled by mass spectrometry peptide sequencing, while the number of modern samples was increased by measuring multiple regions of the skin and both the flesh and grain sides. Reflectance was measured using a double-beam spectrophotometer equipped with an integrating sphere, ensuring high-quality data collection across a wide wavelength range.
To prepare spectral data for machine learning, continuous wavelet transform was used as a preprocessing step, creating 2D scalograms that enhanced feature extraction. The authors said, "We harnessed the unsupervised feature extraction of autoencoders to assist a classification network in a common semisupervised training procedure. With this combination, we aimed to improve the generalization capability of the classification network."
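How a one-dimensional spectrum becomes a 2D scalogram can be illustrated with a minimal continuous wavelet transform. The release names neither the wavelet nor the scales nor any software, so the Ricker ("Mexican hat") wavelet, the scale list and the synthetic spectrum below are assumptions made purely for illustration, not the authors' preprocessing.

```python
import math

def ricker(t, scale):
    """Ricker ("Mexican hat") wavelet at offset t for a given scale."""
    x = t / scale
    norm = 2.0 / (math.sqrt(3.0 * scale) * math.pi ** 0.25)
    return norm * (1.0 - x * x) * math.exp(-0.5 * x * x)

def cwt_scalogram(signal, scales):
    """Return |CWT| as rows (one per scale) by columns (one per sample).

    Samples beyond the ends of the signal are treated as zero, so the
    values near both edges are affected by the truncation.
    """
    n = len(signal)
    scalogram = []
    for s in scales:
        half = int(5 * s)  # the wavelet is negligible beyond about 5 scale units
        row = []
        for pos in range(n):
            acc = 0.0
            for k in range(-half, half + 1):
                idx = pos + k
                if 0 <= idx < n:
                    acc += signal[idx] * ricker(k, s)
            row.append(abs(acc))
        scalogram.append(row)
    return scalogram

# Synthetic "spectrum": flat reflectance with one broad absorption-like dip.
spectrum = [1.0 - 0.3 * math.exp(-((i - 60) / 8.0) ** 2) for i in range(128)]
scales = [1, 2, 4, 8, 16]
image = cwt_scalogram(spectrum, scales)
print(len(image), len(image[0]))  # number of scales, number of spectral samples
```

Each row of `image` shows how strongly the spectrum matches the wavelet at one scale, so narrow and broad spectral features end up in different rows of the 2D array that a convolutional network can then process like an image.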
As data-driven approaches gain ground in the field of cultural heritage, the stories hidden in historical materials will be better explained, ensuring that the legacy of our ancestors is not only preserved but also accurately interpreted for future generations.