A novel study assessed the potential of 2 AI platforms in providing accurate information to patients with HS.
A new study investigated how accurately AI chatbots provide information on hidradenitis suppurativa (HS) to patients.1 Accurate initial insight from these technologies could help shorten delays in diagnosis and treatment for patients with HS.
The study assessed version 3.5 of Chat Generative Pre-Trained Transformer (ChatGPT) and Bard, the 2 most widely used AI systems currently on the market. Each platform was asked the same 7 questions, which were developed by HS patients and clinicians.
Responses to each question were evaluated by 18 HS experts. The experts rated each AI response on a 5-point Likert scale (1: strongly agree; 2: agree; 3: neutral; 4: disagree; 5: strongly disagree). Each response was then classified as “appropriate” or “inappropriate” based on whether the experts agreed that it was valid.
The results showed a large disparity between the 2 systems, with ChatGPT significantly more accurate than Bard. Across all participants, there was a notable difference in mean scores (86% versus 14%, respectively).
Six of ChatGPT's 7 answers were deemed appropriate, with Q6 the only one considered inappropriate. The response to question 6 was classified as “completely incorrect,” as most experts rated it with the lowest possible score on the Likert scale. Conversely, only 1 of Bard's 7 responses (Q5) was deemed appropriate.
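As a quick arithmetic check, the counts above (6 of 7 responses appropriate for ChatGPT, 1 of 7 for Bard) round to 86% and 14%, which lines up with the percentages reported earlier. The short Python sketch below shows that calculation. The function name is hypothetical, and reading the reported percentages as shares of appropriate responses is an assumption, not something the study summary states.

```python
def percent_appropriate(appropriate: int, total: int) -> int:
    """Return the share of appropriate responses as a whole-number percentage."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100 * appropriate / total)


# Counts taken from the article: 7 questions were put to each platform.
chatgpt_share = percent_appropriate(6, 7)  # 6 of 7 deemed appropriate
bard_share = percent_appropriate(1, 7)     # 1 of 7 deemed appropriate

print(f"ChatGPT: {chatgpt_share}%  Bard: {bard_share}%")
```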
ChatGPT was relatively effective and outperformed Bard, particularly in the precision and reliability of its answers on symptoms and patient quality of life. However, both platforms had limitations in providing adequate treatment advice. ChatGPT also gave shorter answers, with a mean word count of 228±48 compared with 254±77 for Bard.
The investigators did note some limitations in this study. The small sample size of 18 HS experts along with the short list of 7 predefined questions may not be fully representative of the dermatology community.
A recent survey found that nearly 40% of patients use AI systems for health-related inquiries.2 Given this level of use, AI-driven medical advice needs to improve overall, especially on treatment options. Although these platforms should never replace professional consultations, advancing their capabilities could increase their reliability, according to the researchers.
AI could potentially be used in patient support, as an earlier understanding of symptoms could prompt patients to consult clinicians in a timely manner. This is especially important for outcomes in HS, where early intervention is essential. Further research is needed to assess the performance of AI across other dermatological conditions.
“From a systemic perspective, if AI systems such as ChatGPT or Bard can reliably improve symptom understanding, they could potentially ease pressure on healthcare systems by reducing the need for unnecessary consultations or diagnostic procedures,” the authors wrote. “However, if AI systems frequently provide incorrect information, the resulting delays in diagnosis or treatment could add to the burden on healthcare services, as patients may present with more advanced or complicated cases. It is therefore essential to improve the accuracy of AI responses, so that AI becomes a valuable asset rather than a liability in healthcare.”
References
1. Ezanno AC, Fougerousse AC, Pruvost-Balland C, Maccari F, Fite C; ResoVerneuil. AI in Hidradenitis Suppurativa: Expert Evaluation of Patient-Facing Information. Clin Cosmet Investig Dermatol. 2024;17:2459-2464. Published 2024 Nov 2. doi:10.2147/CCID.S478309
2. Sallam M. ChatGPT Utility in Healthcare Education, Research, and Practice: Systematic Review on the Promising Perspectives and Valid Concerns. Healthcare (Basel). 2023;11(6):887. Published 2023 Mar 19. doi:10.3390/healthcare11060887