Ensembled deep learning model outperforms human experts in diagnosing biliary atresia from sonographic gallbladder images

REVIEWED BY | Lisa McGuire | ASA SIG: Emerging technologies

REFERENCE | Authors: Wenying Zhou, Yang Yang, Cheng Yu, et al. | Journal: Nature Communications | Open Access: Yes

WHY THE STUDY WAS PERFORMED

Biliary atresia (BA) is a congenital disorder affecting the intrahepatic and extrahepatic bile ducts. Affected infants present with jaundice, pale stools and dark urine. The exact cause is unknown. Early diagnosis is critical because early surgical intervention is required to achieve long-term transplant-free survival. Ultrasound is recommended as the initial diagnostic imaging tool, although diagnosis is challenging. Artificial intelligence has the potential to help diagnose BA from sonographic gallbladder images. The purpose of this study was to develop an ensembled deep learning model (*EDLM) that accurately identifies BA and to test its effectiveness against human experts. The model aims to support the diagnosis of BA, particularly in rural and underdeveloped regions where imaging expertise is limited.

*Ensembled deep learning model (EDLM) - A model that combines the predictions of multiple learning algorithms, with the aim of performing better than any single algorithm alone.
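The idea of combining member models can be illustrated with a minimal sketch. It assumes each member model outputs a probability of BA and that the members are combined by simple averaging with a 0.5 decision threshold; the review does not state how the study's EDLM combines its members, so the function names, the averaging rule and the threshold are all illustrative assumptions.

```python
from statistics import mean


def ensemble_probability(member_probs):
    """Average the BA probabilities predicted by the member models.

    Assumption: simple averaging. The study may combine members differently.
    """
    if not member_probs:
        raise ValueError("need at least one member prediction")
    return mean(member_probs)


def ensemble_label(member_probs, threshold=0.5):
    """Return 1 (BA) if the averaged probability reaches the threshold, else 0.

    The 0.5 threshold is a hypothetical default, not a value from the study.
    """
    return 1 if ensemble_probability(member_probs) >= threshold else 0


if __name__ == "__main__":
    # Three hypothetical member models scoring the same gallbladder image.
    print(ensemble_label([0.9, 0.6, 0.3]))
```

Averaging is only one option; majority voting or a learned weighting are common alternatives.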

HOW THE STUDY WAS PERFORMED

The study enrolled infants younger than 5 months with hyperbilirubinemia and suspected BA. All diagnoses were confirmed by laparoscopic cholangiography, US-guided cholecystocholangiography, liver biopsy or follow-up. After images were screened for suitability, 3705 gallbladder images were obtained from the principal hospital and 841 from the external cohort of six collaborating hospitals. Both retrospective and prospective sonographic images were used.

The EDLM was evaluated and compared to human experts using both internal and external validation cohorts. Images captured with smartphones and ultrasound videos were also tested to assess robustness. Sensitivity, specificity and accuracy for BA diagnosis were measured and compared.
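For readers less familiar with these three measures, the sketch below shows how they are conventionally computed from binary labels (1 = BA, 0 = non-BA). The function name and the example labels are hypothetical and are not taken from the study.

```python
def diagnostic_metrics(y_true, y_pred):
    """Compute sensitivity, specificity and accuracy for binary labels.

    y_true: confirmed diagnoses (1 = BA, 0 = non-BA)
    y_pred: diagnoses made by a model or a reader
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # BA correctly identified
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # non-BA correctly identified
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # non-BA called BA
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # BA missed

    nan = float("nan")
    return {
        "sensitivity": tp / (tp + fn) if (tp + fn) else nan,
        "specificity": tn / (tn + fp) if (tn + fp) else nan,
        "accuracy": (tp + tn) / len(pairs) if pairs else nan,
    }


if __name__ == "__main__":
    # Hypothetical labels for eight infants.
    truth = [1, 1, 1, 0, 0, 0, 0, 1]
    preds = [1, 1, 0, 0, 0, 1, 0, 1]
    print(diagnostic_metrics(truth, preds))
```

Sensitivity matters most when a missed case is costly, as with BA, where delayed surgery worsens outcomes.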

The internal evaluation used fivefold cross-validation on a training cohort that was partitioned into five subsets, each containing an equivalent number of patients. Four of the subsets were used to train the EDLM, and the ensembled model then predicted the category of the images in the remaining subset. This process was repeated five times, so that each subset served once as the test set.
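The partitioning step can be sketched as follows. This is a minimal illustration, assuming each image carries a patient identifier and that patients are shuffled and dealt into folds round-robin; the function names, the seed and the dealing strategy are assumptions, not details reported in the review.

```python
import random


def patient_level_folds(image_patient_ids, k=5, seed=0):
    """Split image indices into k folds so all images of a patient share one fold."""
    patients = sorted(set(image_patient_ids))
    rng = random.Random(seed)
    rng.shuffle(patients)
    # Deal patients round-robin so fold sizes differ by at most one patient.
    fold_of_patient = {pid: i % k for i, pid in enumerate(patients)}
    folds = [[] for _ in range(k)]
    for idx, pid in enumerate(image_patient_ids):
        folds[fold_of_patient[pid]].append(idx)
    return folds


def cross_validation_splits(image_patient_ids, k=5, seed=0):
    """Yield (train_indices, test_indices) pairs, one pair per fold."""
    folds = patient_level_folds(image_patient_ids, k, seed)
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


if __name__ == "__main__":
    # Hypothetical cohort: 10 patients with 3 images each.
    ids = ["p%d" % (i // 3) for i in range(30)]
    for train, test in cross_validation_splits(ids):
        print(len(train), len(test))
```

Splitting by patient rather than by image keeps images of the same infant out of both the training and test sets, which would otherwise inflate the measured accuracy.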

The external validation tested the effectiveness of the EDLM with images from six other hospitals. This data was evaluated against the performance of three experts for accurate BA diagnosis.

Diagnosis was also tested using smartphone photographs, an approach that could be used reliably in a remote setting. For each patient in the external validation database, one image was photographed with a smartphone in the region of the gallbladder and fed to the EDLM for diagnosis. Additionally, diagnosis using real-time ultrasound video was tested. An auto segmentation model was trained with a collection of 34 sonographic videos from 34 infants. The diagnostic performance of the EDLM was compared to that of three human experts, each of whom independently made diagnoses by reviewing the videos and was blinded to other clinical information.

“The ensembled deep learning model in this study [has the potential to] improve the diagnosis of biliary atresia in various clinical application scenarios, particularly in rural and underdeveloped regions with limited expertise.”

FLOW CHART OF STUDY

Fig 1. Flow chart of the study

WHAT THE STUDY FOUND

In the internal evaluation, at both the image level and the patient level, the EDLM outperformed the two experts in diagnosing BA. The AI achieved an accuracy of 89.4%, compared with 87.4% for the highest-performing expert. In the external evaluation, the accuracy of the AI and the expert was very similar, at 87.6% and 87.8% respectively. However, the AI model achieved a higher sensitivity of 93.3%, compared with 90% for the highest-performing expert. Diagnosis based on smartphone images yielded an accuracy of 86.9%. The video data also gave promising results for the AI model, with an accuracy of 94.1%. Additionally, combining the prediction of the EDLM with that of human experts improved the identification of BA. Overall, the study concluded that the EDLM outperforms human experts in the diagnosis of BA.

RELEVANCE TO CLINICAL PRACTICE

These findings indicate EDLM can be used to help diagnose BA in both remote and hospital settings, improving diagnostic accuracy. This model is potentially deployable in multiple application scenarios, such as remote diagnosis when based on a smartphone app. EDLM is also helpful for the inexperienced radiologist in a hospital setting. Diagnosis based on combined predictions of human and AI further improves sensitivity even for expert radiologists. This model is predicted to be of particular benefit to those patients in underdeveloped regions without sufficient healthcare support.
