image: HKUMed research team develops world’s first AI model for thyroid cancer diagnosis, with over 90% accuracy and reduced consultation preparation time. The research was led by Professor Joseph Wu Tsz-kei (centre) and Dr Matrix Fung Man-him (third right).
Credit: The University of Hong Kong
An interdisciplinary research team from the LKS Faculty of Medicine of the University of Hong Kong (HKUMed), the InnoHK Laboratory of Data Discovery for Health (InnoHK D²4H), and the London School of Hygiene & Tropical Medicine (LSHTM) has unveiled the world’s first artificial intelligence (AI) model designed to classify both the cancer stage and risk category of thyroid cancer, with accuracy exceeding 90%. The HKUMed AI model is expected to cut frontline clinicians’ pre-consultation preparation time by approximately 50%, in line with the HKSAR Government’s initiative to harness AI technology in healthcare. The findings were published in the journal npj Digital Medicine.
Background
Thyroid cancer is among the most prevalent cancers in Hong Kong and globally. Precision management of the disease often relies on two systems: (1) the 8th edition of the American Joint Committee on Cancer (AJCC) or Tumour-Node-Metastasis (TNM) cancer staging system to determine the cancer stage; and (2) the American Thyroid Association (ATA) risk classification system to categorise cancer risk. These systems are crucial for predicting patient survival and guiding treatment decisions. However, manually integrating complex clinical information into these systems can be time-consuming and inefficient.
Research methods and findings
The research team developed an AI assistant that leverages large language models (LLMs), like ChatGPT and DeepSeek, which are designed to understand and process human language, to analyse clinical documents and enhance the accuracy and efficiency of thyroid cancer staging and risk classification.
The model uses four offline open-source LLMs, namely Mistral (Mistral AI), Llama (Meta), Gemma (Google) and Qwen (Alibaba), to analyse free-text clinical documents. The AI model was trained on a United States-based open-access dataset of pathology reports from 50 thyroid cancer patients in The Cancer Genome Atlas Programme (TCGA), and was subsequently validated against pathology reports from 289 TCGA patients and 35 pseudo cases created by endocrine surgeons.
By combining the output of all four LLMs, the team improved the overall performance of the AI model, achieving overall accuracy of 88.5% to 100% in ATA risk classification and 92.9% to 98.1% in AJCC cancer staging. Compared to traditional manual document reviews, this advancement is expected to halve the time clinicians spend on pre-consultation preparation.
Number of cases | ATA risk category accuracy | AJCC cancer staging accuracy
50 TCGA patients (Training) | 100.0% | 94.1%
289 TCGA patients (Validation) | 95.5% | 98.1%
35 pseudo cases (Validation) | 88.5% | 92.9%
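The release does not say how the outputs of the four LLMs were combined. As a minimal sketch only, assuming a simple majority vote (one common way to ensemble classifiers, not necessarily the team's method), the idea could look like the following. The function names, the example labels and the tie-breaking rule are all hypothetical, and the sample data is made up for illustration; it is not drawn from the study.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the most frequent label among the models' predictions.

    Assumption: ties are resolved in favour of the label that appears
    first in the list. The release does not describe any such rule.
    """
    if not predictions:
        raise ValueError("at least one prediction is required")
    counts = Counter(predictions)
    best = max(counts.values())
    for label in predictions:
        if counts[label] == best:
            return label


def accuracy(predicted, reference):
    """Fraction of cases where the predicted label matches the reference."""
    matches = sum(p == r for p, r in zip(predicted, reference))
    return matches / len(reference)


# Hypothetical ATA risk labels for three made-up cases, one list per model.
model_outputs = {
    "mistral": ["low", "intermediate", "high"],
    "llama": ["low", "high", "high"],
    "gemma": ["low", "intermediate", "high"],
    "qwen": ["intermediate", "intermediate", "low"],
}
# Hypothetical reference labels (for example, assigned by a clinician).
reference = ["low", "intermediate", "high"]

# zip(*...) regroups the per-model lists into per-case lists of four votes.
ensemble = [majority_vote(list(votes)) for votes in zip(*model_outputs.values())]

print(ensemble)
print(accuracy(ensemble, reference))
```

In this toy data each individual model disagrees with the reference on at most one case, yet the combined vote matches all three, which illustrates why pooling several models can outperform any single one.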
Significance of the research
Professor Joseph T Wu, Sir Robert Kotewall Professor in Public Health and Managing Director of InnoHK D²4H at HKUMed, emphasised the model’s performance. ‘Our model achieves more than 90% accuracy in classifying AJCC cancer stages and ATA risk categories,’ he said. ‘A significant advantage of this model is its offline capability, which would allow local deployment without the need to share or upload sensitive patient information, thereby providing maximum patient privacy.’
‘In view of the recent debut of DeepSeek, we conducted further comparative tests with a “zero-shot approach” against the latest versions of DeepSeek—R1 and V3—as well as GPT-4o. We were pleased to find that our model performed on par with these powerful online LLMs,’ added Professor Wu.
Dr Matrix Fung Man-him, Clinical Assistant Professor and Chief of Endocrine Surgery, Department of Surgery, School of Clinical Medicine, HKUMed, stated, ‘In addition to providing high accuracy in extracting and analysing information from complex pathology reports, operation records and clinical notes, our AI model also dramatically reduces doctors’ preparation time by almost half compared to human interpretation. It could simultaneously provide cancer staging and clinical risk stratification based on two internationally recognised clinical systems.’
‘The AI model is versatile and could be readily integrated into various settings in the public and private sectors, and both local and international healthcare and research institutes,’ said Dr Fung. ‘We are optimistic that the real-world implementation of this AI model could enhance the efficiency of frontline clinicians and improve the quality of care. In addition, doctors will have more time to counsel their patients.’
‘In line with the government’s strong advocacy of AI adoption in healthcare, as exemplified by the recent launch of an LLM-based medical report writing system in the Hospital Authority, our next step is to evaluate the performance of this AI assistant with a large amount of real-world patient data. Once validated, the AI model can be readily deployed in real clinical settings and hospitals to help clinicians improve operational and treatment efficiency,’ explained Dr Carlos Wong, Honorary Associate Professor in the Department of Family Medicine and Primary Care, School of Clinical Medicine, HKUMed.
About the research team
The study was led by Professor Joseph Wu Tsz-kei, Sir Robert Kotewall Professor in Public Health in the School of Public Health, and Managing Director & Lead Scientist of InnoHK D²4H; Dr Matrix Fung Man-him, Clinical Assistant Professor and Chief of Endocrine Surgery in the Department of Surgery, School of Clinical Medicine; and Dr Carlos Wong King-ho, Honorary Associate Professor in the Department of Family Medicine and Primary Care, School of Clinical Medicine, and Senior Research Director in InnoHK D²4H; all under HKUMed. The first authors were Dr Eric Tang Ho-man and Dr Tingting Wu from InnoHK D²4H.
Acknowledgements
The research was supported by the Hong Kong Jockey Club Global Health Institute (HKJCGHI), and the InnoHK initiative of the Innovation and Technology Commission of the Hong Kong Special Administrative Region Government.
About the InnoHK Laboratory of Data Discovery for Health
The InnoHK Laboratory of Data Discovery for Health (InnoHK D²4H) aims to gather and curate massive, unique data resources and develop deep frontier analytics to protect global public health while improving individual healthcare through precision medicine. The InnoHK D²4H brings together a multi-disciplinary team of some of the world's leading scientists to apply AI and big data in ways that will transform their approaches to understanding and treating diseases.
Spearheaded by the University of Hong Kong with support from other world-renowned national and international academic institutions, the InnoHK D²4H is keen to work with health authorities, such as the World Health Organization and the China Centre for Disease Control and Prevention. By arranging collaboration across multiple disciplines and sectors, the InnoHK D²4H aims to advance the frontiers of healthcare technology in Hong Kong, the Greater Bay Area and beyond to produce ‘moonshots’, which will have tremendous healthcare benefits for global health.
Media enquiries
Please contact LKS Faculty of Medicine of The University of Hong Kong by email (medmedia@hku.hk).
Journal: npj Digital Medicine
Method of Research: Experimental study
Subject of Research: Not applicable
Article Title: Developing a named entity framework for thyroid cancer staging and risk level classification using large language models
Article Publication Date: 1-Mar-2025