Journal of Clinical Studies & Medical Case Reports
Category: Medical
Type: Research Article

Validation of an Artificial Intelligence Model for the Diagnosis of Thyroid Nodules: A Prospective Multicenter Study

Francesco Pignataro1*, Leo GUIDOBALDI2, Marco Gismant3, Manuela Nestola4 and Cristina Nisita5
1 Director of the Diagnostic Imaging Department of the Humanitas University Consortium, Italy
2 Cytopathologist, UOC Pathological Anatomy Sandro Pertini Hospital, Rome, Italy
3 Specialist in Internal Medicine, responsible for the Internal Ultrasound Service in diagnostic centers in Rome, Italy
4 Specialist in Internal Medicine - Internal Multiparametric Ultrasound Service - Altamedica Research Center Rome, Italy
5 Specialist in Gastroenterology and Digestive Endoscopy, Senior Sonographer in diagnostic centers in Rome, Italy

*Corresponding Author(s):
Francesco Pignataro
Director of the Diagnostic Imaging Department of the Humanitas University Consortium, Italy
Email: pignatarofrancesco@yahoo.it

Received Date: May 29, 2024
Accepted Date: Jun 13, 2024
Published Date: Jun 20, 2024

Abstract

Objective: To validate the diagnostic accuracy of an artificial intelligence (AI) model in the diagnosis of thyroid nodules, comparing the results with the evaluation of expert sonographers and cytology. 

Methods: This prospective multicenter study included 1500 thyroid nodules that underwent multiparametric ultrasound (two-dimensional ultrasound with 18 and 22 MHz probes, SMI power Doppler study, and strain ratio elastography) and fine-needle aspiration for cytological examination. The cases were numbered and associated, in double-blind fashion, with the cytological results and the expert sonographers' evaluation. The images were analyzed by a machine learning algorithm and by the Claude 3 Opus chatbot to create datasets. The predictive model was then tested on another 1000 cases, assigning each nodule a TI-RADS level of suspicion. The results were compared with the evaluation of expert sonographers and with cytology using statistical tests.

Results: The AI model showed a sensitivity of 96% (95% CI: 94-98%), a specificity of 95% (95% CI: 93-97%), and a correspondence with the evaluation of expert sonographers of over 95%. The agreement with cytological analysis was 94%. The results were statistically significant (p < 0.001).

Conclusion: The AI model validated in this study demonstrated high diagnostic accuracy in the evaluation of thyroid nodules, with a strong agreement with the evaluation of expert sonographers and cytology. The clinical implementation of this model could improve the management of patients with thyroid nodules, reducing inter-operator variability.

Introduction

Thyroid nodules are a common clinical condition, with an estimated prevalence of 20-70% in the general population [1]. Most thyroid nodules are benign, but 5-15% can be malignant [2]. Accurate diagnosis of thyroid nodules is crucial to identify patients who need further investigation or therapeutic interventions. Ultrasound is the first-line imaging method for the evaluation of thyroid nodules, but its interpretation may be subject to inter-operator variability [3]. 

Artificial intelligence (AI) has shown promising results in various fields of medicine, including diagnostic imaging [4]. The objective of this study is to validate the diagnostic accuracy of an AI model in the diagnosis of thyroid nodules, comparing the results with the evaluation of expert sonographers and cytology.

Materials and Methods

Study design 

A prospective multicenter study was conducted that included 1500 thyroid nodules in consecutive patients undergoing thyroid ultrasound at three referral centers for thyroid pathology. All images were anonymized and linked to their cases only through a numerical identifier, and all participants provided written informed consent.

Acquisition of ultrasound images 

All thyroid nodules were evaluated by multiparametric ultrasound, including two-dimensional ultrasound with 18 and 22 MHz probes when possible, SMI power Doppler study, strain ratio elastography, and micropure analysis. The images were acquired by expert sonographers, following a standardized sequential protocol. 

Cytological analysis 

All nodules underwent fine-needle aspiration for cytological examination, interpreted by experienced cytopathologists according to the Bethesda system [5,6]. Cases with cytology that was indeterminate or suspicious for malignancy underwent repeat fine-needle aspiration or surgical intervention, depending on the overall clinical picture.

Development and validation of the AI model 

The ultrasound images were analyzed by a machine-learning algorithm built on a convolutional neural network (CNN) trained on a large dataset of ultrasound images of thyroid nodules with cytological diagnosis and expert ultrasound evaluation. The CNN architecture included 5 convolutional layers, followed by pooling and fully connected layers. The algorithm was trained by transfer learning, starting from a network pre-trained on a generic image dataset (ImageNet) and fine-tuning the connection weights on the specific dataset of thyroid nodules. Training was performed on an NVIDIA Tesla V100 GPU, using the Adam optimizer and the binary cross-entropy loss function. In addition, the images were analyzed by the Claude 3 Opus chatbot, a conversational AI system based on an auto-regressive large language model developed by Anthropic, to create structured datasets containing the relevant ultrasound features of the nodules.
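The training objective named above (binary cross-entropy minimized with the Adam optimizer) can be illustrated with a minimal sketch. This is not the study's CNN: a tiny logistic classifier stands in for the network, and the toy feature values, the function names, and the hyperparameters (`steps`, `lr`, `beta1`, `beta2`) are assumptions chosen only for illustration.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict(theta, features):
    # theta holds one weight per feature followed by a bias term.
    return [sigmoid(sum(w * x for w, x in zip(theta, row)) + theta[-1])
            for row in features]

def binary_cross_entropy(probs, labels, eps=1e-12):
    # Mean binary cross-entropy; eps keeps log() away from zero.
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(labels)

def train_with_adam(features, labels, steps=500, lr=0.05,
                    beta1=0.9, beta2=0.999, eps=1e-8):
    n_features = len(features[0])
    theta = [0.0] * (n_features + 1)
    m = [0.0] * len(theta)  # first-moment (mean) estimates
    v = [0.0] * len(theta)  # second-moment (uncentered variance) estimates
    losses = []
    n = len(labels)
    for t in range(1, steps + 1):
        probs = predict(theta, features)
        losses.append(binary_cross_entropy(probs, labels))
        # Gradient of the mean BCE with respect to each weight and the bias.
        errors = [p - y for p, y in zip(probs, labels)]
        grad = [sum(e * row[j] for e, row in zip(errors, features)) / n
                for j in range(n_features)]
        grad.append(sum(errors) / n)
        for i, g in enumerate(grad):
            m[i] = beta1 * m[i] + (1 - beta1) * g
            v[i] = beta2 * v[i] + (1 - beta2) * g * g
            m_hat = m[i] / (1 - beta1 ** t)  # bias-corrected moments
            v_hat = v[i] / (1 - beta2 ** t)
            theta[i] -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta, losses

# Hypothetical toy data: two made-up feature scores per nodule, label 1 = malignant.
toy_features = [[0.1, 0.2], [0.2, 0.1], [0.3, 0.3],
                [0.8, 0.9], [0.9, 0.7], [0.7, 0.8]]
toy_labels = [0, 0, 0, 1, 1, 1]
theta, losses = train_with_adam(toy_features, toy_labels)
toy_predictions = [1 if p > 0.5 else 0 for p in predict(theta, toy_features)]
```

In a real transfer-learning setup the same loss and update rule would be applied to the weights of the pre-trained network rather than to a handful of parameters.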

The AI model was validated using a k-fold cross-validation technique (k=5), dividing the dataset into training, validation, and test subsets. Model performance was evaluated in terms of sensitivity, specificity, accuracy, and area under the ROC curve (AUC). Subsequently, the model was tested on an independent dataset of 1000 thyroid nodules, assigning each nodule a TI-RADS level of suspicion. The results were compared with the evaluation of expert sonographers and with cytology using the chi-square test and analysis of variance (ANOVA).
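As a rough illustration of the validation procedure, the sketch below splits 1500 cases into 5 folds and computes the four reported metrics on a small made-up example. The function names, the random seed, the 0.5 decision threshold, and the toy labels and scores are assumptions; the sketch yields only a training and a held-out part per fold, so any further validation split described above would have to be carved out of the training part.

```python
import random

def k_fold_indices(n_samples, k=5, seed=0):
    # Shuffle indices once, then deal them into k nearly equal folds.
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        held_out = folds[i]
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, held_out

def sensitivity_specificity_accuracy(labels, preds):
    # Labels and predictions are 1 (malignant) or 0 (benign).
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    tn = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    return tp / (tp + fn), tn / (tn + fp), (tp + tn) / len(labels)

def roc_auc(labels, scores):
    # AUC as the probability that a random positive outscores a random
    # negative, counting ties as one half.
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

splits = list(k_fold_indices(1500, k=5))

# Hypothetical model scores for eight nodules.
demo_labels = [1, 1, 1, 0, 0, 0, 0, 1]
demo_scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]
demo_preds = [1 if s >= 0.5 else 0 for s in demo_scores]
sens, spec, acc = sensitivity_specificity_accuracy(demo_labels, demo_preds)
auc = roc_auc(demo_labels, demo_scores)
```

With k=5 and 1500 cases, each fold holds out 300 cases and trains on the remaining 1200.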

Results

Characteristics of thyroid nodules 

Of the 1500 thyroid nodules included in the study, 1200 (80%) were benign and 300 (20%) were malignant according to the evaluation of expert sonographers. The mean size of the nodules was 18±9 mm for benign nodules and 22±11 mm for malignant nodules (p < 0.001). Malignant nodules more frequently presented irregular margins (75% vs 20%, p<0.001), microcalcifications (60% vs 15%, p < 0.001), and intralesional vascularity (70% vs 30%, p < 0.001) compared to benign nodules. 
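The group comparisons above can be illustrated with a chi-square test on a 2x2 table. The sketch below is only an illustration: the counts are back-calculated from the reported percentages for irregular margins (75% of 300 malignant nodules, 20% of 1200 benign nodules), which is an assumption, and the test is computed without continuity correction, which the article does not specify.

```python
import math

def chi_square_2x2(a, b, c, d):
    # Table layout: rows are groups, columns are feature present / absent.
    #   group 1: a present, b absent
    #   group 2: c present, d absent
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom, the chi-square survival function
    # equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

# Assumed counts: 225 of 300 malignant and 240 of 1200 benign nodules
# with irregular margins.
stat, p_value = chi_square_2x2(225, 75, 240, 960)
```

Under these assumed counts the statistic is about 339 on one degree of freedom, in line with the reported p < 0.001.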

AI model performance 

The AI model showed a sensitivity of 96% (95% CI: 94-98%), a specificity of 95% (95% CI: 93-97%), and an accuracy of 95% (95% CI: 94-96%) in the diagnosis of thyroid nodules, with an AUC of 0.98 (95% CI: 0.97-0.99). The correspondence with the evaluation of expert sonographers was 95% (95% CI: 94-96%), while the agreement with cytological analysis was 94% (95% CI: 93-95%). The results were statistically significant (p < 0.001). 
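The article does not state how the confidence intervals were computed. As one plausible reading, the sketch below applies the Wilson score interval to the sensitivity, assuming it was estimated on the 300 malignant nodules with 288 detected (96%); the counts and the choice of the Wilson method are assumptions, not details taken from the study.

```python
import math

def wilson_interval(successes, n, z=1.959963984540054):
    # Wilson score interval for a binomial proportion; the default z is
    # the two-sided 95% normal quantile.
    p_hat = successes / n
    denom = 1.0 + z * z / n
    center = (p_hat + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n
                               + z * z / (4 * n * n)) / denom
    return center - half_width, center + half_width

# Assumed counts: 288 of 300 malignant nodules correctly identified.
low, high = wilson_interval(288, 300)
```

Under these assumptions the interval runs from roughly 93% to 98%, close to the reported 94-98%.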

In the independent validation dataset of 1000 nodules, the AI model showed a sensitivity of 95% (95% CI: 92-97%), a specificity of 94% (95% CI: 92-96%), and an accuracy of 94% (95% CI: 93-95%) in defining the TI-RADS level of suspicion. The agreement with the evaluation of expert sonographers was 94% (95% CI: 92-95%) and with cytology was 93% (95% CI: 91-94%). The results were statistically significant (p < 0.001).

Discussion

This study validated an AI model for the diagnosis of thyroid nodules, demonstrating high diagnostic accuracy and strong agreement with the evaluation of expert sonographers and cytology. The results suggest that the clinical implementation of this model could improve the management of patients with thyroid nodules, reducing inter-operator variability and increasing the efficiency of the clinical workflow. 

The use of AI in the diagnosis of thyroid nodules has been the subject of growing interest in recent years [7,8]. However, most previous studies were based on retrospective datasets or compared the performance of AI only with cytology, without considering expert ultrasound evaluation. The present study overcomes these limitations, providing a prospective multicenter validation of the AI model and comparing the results with both the evaluation of expert sonographers and cytology. 

The strengths of this study include the prospective design, the sample size, and the use of a standardized protocol for the acquisition of ultrasound images. Furthermore, the use of a conversational AI system (the Claude 3 Opus chatbot) to create structured datasets represents an innovative approach, which could facilitate the integration of AI into clinical practice.

However, the study also has some limitations. First, despite multicenter validation, the results may not be generalizable to all populations of patients with thyroid nodules. Second, it was not possible to obtain histological confirmation for all nodules, limiting the evaluation of the diagnostic accuracy of the AI model. Finally, the clinical implementation of this model will require further evaluations of feasibility, acceptability, and impact on patient outcomes. 

In conclusion, this study validated an AI model for the diagnosis of thyroid nodules, demonstrating high diagnostic accuracy and strong agreement with the evaluation of expert sonographers and cytology. The clinical implementation of this model could improve the management of patients with thyroid nodules, reducing inter-operator variability and increasing the efficiency of the clinical workflow. Further studies are needed to confirm these results and evaluate the impact of AI on patient outcomes and clinical practice.

References

  1. Guth S, Theune U, Aberle J, Galach A, Bamberger CM (2009) Very high prevalence of thyroid nodules detected by high frequency (13 MHz) ultrasound examination. Eur J Clin Invest. 39: 699-706.
  2. Gharib H, Papini E, Garber JR, Duick DS, Harrell RM, et al. (2016) American Association of Clinical Endocrinologists, American College of Endocrinology, and Associazione Medici Endocrinologi Medical Guidelines for Clinical Practice for the Diagnosis and Management of Thyroid Nodules--2016 Update. Endocr Pract. 22: 622-639.
  3. Choi SH, Kim EK, Kwak JY, Kim MJ, Son EJ (2010) Interobserver and intraobserver variations in ultrasound assessment of thyroid nodules. Thyroid. 20: 167-172.
  4. Bi WL, Hosny A, Schabath MB, Giger ML, Birkbak NJ, et al. (2019) Artificial intelligence in cancer imaging: Clinical challenges and applications. CA Cancer J Clin. 69: 127-157.
  5. Tessler FN, Middleton WD, Grant EG, Hoang JK, Berland LL, et al. (2017) ACR Thyroid Imaging, Reporting and Data System (TI-RADS): White Paper of the ACR TI-RADS Committee. J Am Coll Radiol. 14: 587-595.
  6. Cibas ES, Ali SZ (2017) The 2017 Bethesda System for Reporting Thyroid Cytopathology. Thyroid. 27: 1341-1346.
  7. Li X, Zhang S, Zhang Q, Wei X, Pan Y, et al. (2019) Diagnosis of thyroid cancer using deep convolutional neural network models applied to sonographic images: a retrospective, multicohort, diagnostic study. Lancet Oncol. 20: 193-201.
  8. Buda M, Wildman-Tobriner B, Hoang JK, Thayer D, Tessler FN, et al. (2019) Management of Thyroid Nodules Seen on US Images: Deep Learning May Match Performance of Radiologists. Radiology. 292: 695-701.

Citation: Pignataro F, GUIDOBALDI L, Gismant M, Nestola M, Nisita C (2024) Validation of an Artificial Intelligence Model for the Diagnosis of Thyroid Nodules: A Prospective Multicenter Study. J Clin Stud Med Case Rep 11:239

Copyright: © 2024 Francesco Pignataro, et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

