A Robust and Trustable Approach to Incorporate Medical Domain Knowledge in Machine Learning Models for Diabetic Retinopathy Screening Using Routine Lab Results
Published as a preprint on SSRN, 2024
Despite the high promise of artificial intelligence (AI) in disease prediction and diagnosis, its clinical adoption remains limited by the challenge of explaining machine learning predictions. Most AI models are complex and difficult to understand because they do not account for human cognitive limitations. Physicians, who will ultimately use these models, are often occupied with their daily responsibilities, leaving them limited bandwidth to engage in research efforts or provide feedback on the explainability of AI models. This study aims to develop a robust and trustable disease prediction model that overcomes this hurdle. To counteract the potential bias introduced by a limited number of participating physicians, we design a new bibliometric statistical metric that extracts medical domain knowledge from PubMed at the National Library of Medicine and identifies lab variables that are widely accepted as risk factors for the disease. We then propose a novel elaborative learning approach that accounts for human cognitive limitations in processing information, to ensure that physicians can use the model. The disease of focus in this study is diabetic retinopathy, a potentially sight-threatening complication of diabetes. The model achieves 96.65% accuracy and a 96.60% F1-score in predicting diabetic retinopathy using only eight features that align with medical domain knowledge, indicating its efficacy in identifying the disease. This performance underscores the model’s potential as a screening tool for primary care physicians, specifically for conditions like diabetic retinopathy, where early detection is critical. The eight variables also fit within the human cognitive limit on how much information can be processed simultaneously. This research offers a promising direction for using AI in healthcare, particularly in disease detection. It presents a scalable approach for implementing robust and trustable AI for other diseases.