AI Model Hetairos Accurately Predicts Central Nervous System Tumor Subtypes from Digital Histology Images
Researchers have developed an artificial intelligence model called Hetairos that predicts central nervous system (CNS) tumor subtypes from digital histology images. The model, which was trained on over 11,000 slides from 9,606 patients across four continents, achieved an accuracy of 87% for its highest-confidence predictions and outperformed five board-certified neuropathologists in a direct comparison.
Hetairos uses a combination of computer vision and machine learning algorithms to analyze digital histology images and predict the likelihood of different tumor subtypes. The model was built using data from the Department of Neuropathology at University Hospital Heidelberg, which included CNS tumors from all age groups and was deliberately enriched for rare tumor types. Twenty percent of this dataset was held out for internal validation.
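A 20% hold-out of this kind can be sketched as follows. The article does not say how the split was drawn; this minimal sketch assumes the split is made per patient, so that slides from one patient never land on both sides. The function name, the slide and patient identifiers, and the seed are all hypothetical.

```python
import random


def patient_level_split(slides, val_fraction=0.2, seed=0):
    """Split (slide_id, patient_id) pairs so each patient is in exactly one set.

    Assumption: grouping by patient is illustrative, not the documented
    procedure used for Hetairos.
    """
    patients = sorted({patient for _, patient in slides})
    random.Random(seed).shuffle(patients)
    n_val = round(len(patients) * val_fraction)
    val_patients = set(patients[:n_val])
    train = [s for s in slides if s[1] not in val_patients]
    val = [s for s in slides if s[1] in val_patients]
    return train, val


# Toy data: 10 hypothetical patients with 2 slides each.
slides = [(f"slide_{i}", f"patient_{i // 2}") for i in range(20)]
train, val = patient_level_split(slides)
```

Splitting by patient rather than by slide avoids the model being validated on tissue from a patient it has already seen during training.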
The Hetairos model was subsequently validated on ten external cohorts from four continents, comprising an additional 4,645 tumors and 5,498 slides. Its predictions remained robust in the presence of common histological artifacts.
Hetairos assigns a probability to each possible class, and the probability mass usually concentrates on a single class or a few related classes. Of greatest interest is typically the class with the highest probability, known as the top-1 prediction, and its estimated probability, referred to as Hetairos's confidence. Top-1 predictions agreed with methylation classification results in 75% of all internal validation tumors.
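The relationship between the per-class probabilities, the top-1 prediction, and the confidence can be illustrated with a short sketch. The class names and probability values below are made up for illustration and are not Hetairos output; `top_k` is a hypothetical helper, not part of any published interface.

```python
def top_k(probs, k=1):
    """Return the k most probable (class, probability) pairs, best first."""
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
    return ranked[:k]


# Hypothetical probabilities for a single tumor (illustrative values only).
example_probs = {
    "class_A": 0.62,
    "class_B": 0.25,
    "class_C": 0.08,
    "class_D": 0.05,
}

# The top-1 prediction is the most probable class;
# its probability is what the article calls the confidence.
top1_class, confidence = top_k(example_probs, k=1)[0]
```

Here `top1_class` is `"class_A"` and `confidence` is `0.62`. Calling `top_k(example_probs, k=3)` would return the three most likely classes instead.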
The model's performance was evaluated across different confidence ranges, showing that accuracy increased as confidence rose. Even for low-confidence cases, the correct class was among the three most likely predictions 71% of the time, indicating that Hetairos can often meaningfully reduce the set of possible diagnoses from 102 subtypes to just 3.
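This kind of evaluation can be sketched as splitting cases by confidence and comparing top-1 with top-3 accuracy. The threshold of 0.9, the toy cases, and the function names are assumptions for illustration; the article does not state which confidence cut-offs were used.

```python
def ranked_classes(probs):
    """Class names sorted from most to least probable."""
    return sorted(probs, key=probs.get, reverse=True)


def topk_accuracy(cases, k):
    """Fraction of cases whose true class is among the k most probable."""
    if not cases:
        raise ValueError("no cases to evaluate")
    hits = sum(truth in ranked_classes(probs)[:k] for probs, truth in cases)
    return hits / len(cases)


def split_by_confidence(cases, threshold):
    """Separate cases by whether the top-1 probability reaches the threshold."""
    low = [c for c in cases if max(c[0].values()) < threshold]
    high = [c for c in cases if max(c[0].values()) >= threshold]
    return low, high


# Toy cases: (predicted probabilities, true class). Values are illustrative.
cases = [
    ({"A": 0.95, "B": 0.03, "C": 0.01, "D": 0.01}, "A"),  # confident, correct
    ({"A": 0.02, "B": 0.92, "C": 0.03, "D": 0.03}, "B"),  # confident, correct
    ({"A": 0.40, "B": 0.30, "C": 0.20, "D": 0.10}, "B"),  # unsure, truth ranked 2nd
    ({"A": 0.35, "B": 0.30, "C": 0.25, "D": 0.10}, "D"),  # unsure, truth ranked 4th
]

low, high = split_by_confidence(cases, threshold=0.9)  # 0.9 is a hypothetical cut-off
high_top1 = topk_accuracy(high, k=1)
low_top1 = topk_accuracy(low, k=1)
low_top3 = topk_accuracy(low, k=3)
```

In this toy data the low-confidence group has a top-1 accuracy of 0.0 but a top-3 accuracy of 0.5, which mirrors the idea that a short list of candidates can still be useful when the single best guess is unreliable.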
Hetairos's ability to distinguish different tumor classes was reflected in a visible clustering of subtypes. The model learned internal representations that group tumors into clusters, much as methylation analysis does. These clusters were not yet as distinct as those found by methylation analysis, but the emerging structures reflect histopathological similarity.
The researchers noted that while Hetairos performed well on common tumor subtypes, it struggled with rare ones. Data augmentation strategies, including oversampling and color space transformation, appeared insufficient to further improve performance on these underrepresented classes. The findings highlight the need for large datasets to confidently classify very rare tumor subtypes.
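One reason oversampling has limited effect can be seen in a minimal sketch of random oversampling. This is a generic illustration, not the authors' augmentation pipeline; the function name, labels, and sample identifiers are hypothetical.

```python
import random
from collections import defaultdict


def oversample(samples, seed=0):
    """Repeat minority-class samples at random until all classes are equal in size."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for item, label in samples:
        by_class[label].append(item)
    target = max(len(items) for items in by_class.values())
    balanced = []
    for label, items in by_class.items():
        # Extra draws are copies of existing items, not new observations.
        extra = [rng.choice(items) for _ in range(target - len(items))]
        balanced.extend((item, label) for item in items + extra)
    return balanced


# Toy data: five samples of a common class, one of a rare class.
samples = [(f"common_{i}", "common") for i in range(5)] + [("rare_0", "rare")]
balanced = oversample(samples)
```

After balancing, the rare class has as many entries as the common class, but every one of them is the same original sample. Oversampling changes how often a rare tumor is seen during training without adding new examples of it, which is consistent with the article's point that larger datasets are needed for very rare subtypes.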
Hetairos's predictive performance was evaluated across centers, demographics, and processing protocols, demonstrating its ability to maintain accuracy in settings unobserved during training. This is a critical aspect of digital pathology models, which must be able to generalize beyond their initial dataset.