A machine learning algorithm can identify clusters of patients with favourable glycaemic outcomes in a pool of European Gla-300 studies (REALI): Novel signposts for clinicians?

2018, 03 October

Authors: M. Rollot, M. Bonnemaire, C. Brulle-Wohlhueter, L. Pedrazzini, E. Boëlle-Le Corfec, G. Bigot, M. Didac, R. Bonadonna, P. Gourdy, D. Müller-Wieland, O. Hacman, A. Chiorean, N. Freemantle

N° 876, 2018-10, EASD 2018, Berlin, Germany


Background and aims

Data-driven subgroup discovery algorithms can detect consistent patterns of interest in large datasets. They may be instrumental in exploiting large healthcare databases through patient-level analyses that identify clinically relevant patient clusters. The REALI project pools data from several European Gla-300 studies, covering more than 10,000 people with type 1 or type 2 diabetes mellitus who were uncontrolled on antidiabetic therapy and initiated on Gla-300, and is an appropriate data source for such an analysis. The objective of the present work was to explore patient-related variables and to identify clusters of patients who: 1. achieved an HbA1c reduction ≥ 0.5% from baseline to end of study (EoS); 2. experienced a hypoglycaemia event during the study (HEOS); and 3. achieved the combined outcome of HbA1c < 7% at EoS without HEOS.
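The three outcomes above can be written down as per-patient flags. The following is a minimal sketch in Python, not the study's actual data model: the `PatientRecord` class, its field names, and the `outcome_flags` helper are hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass
class PatientRecord:
    # Hypothetical field names; the REALI data dictionary is not given in the abstract.
    hba1c_baseline: float  # HbA1c at baseline, in %
    hba1c_eos: float       # HbA1c at end of study (EoS), in %
    hypo_events: int       # number of hypoglycaemia events during the study


def outcome_flags(patient: PatientRecord) -> dict:
    """Return the three binary outcomes described in the abstract."""
    # Round to avoid floating-point noise when the drop is exactly on the 0.5 boundary.
    drop = round(patient.hba1c_baseline - patient.hba1c_eos, 2)
    heos = patient.hypo_events > 0
    return {
        "hba1c_drop_ge_0_5": drop >= 0.5,                             # outcome 1
        "heos": heos,                                                 # outcome 2
        "target_without_heos": patient.hba1c_eos < 7.0 and not heos,  # outcome 3
    }
```

Each flag can then serve as the binary target of a separate subgroup search.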

Materials and methods

Q-finder, a subgroup discovery algorithm, was first applied to a single study (Take Control) from the REALI pooled database. Q-finder is a proprietary, non-parametric supervised learning algorithm that makes no particular assumption about the distributions of the outcome or the explanatory variables. It explores the space of explanatory variables to identify regions where the specified outcome is more or less frequent than average. The output is a set of patient clusters, defined as combinations of variables and thresholds that characterize subpopulations with a significantly higher or lower probability of experiencing the outcome of interest. All results were tested for significance in models adjusted for confounding factors, with correction for multiple testing (Bonferroni correction).
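Q-finder itself is proprietary and its internals are not described here, so the following is only a generic sketch of threshold-based subgroup discovery with a Bonferroni correction, not the Q-finder implementation. It makes several simplifying assumptions that differ from the abstract: it tests single-variable rules only (no combinations), compares each subgroup with its complement rather than with the overall average, uses an unadjusted two-proportion z-test instead of confounder-adjusted models, and reads RR as a relative risk. All function and parameter names are hypothetical.

```python
import math


def two_sided_p(k1, n1, k0, n0):
    """Two-sided p-value of a two-proportion z-test (normal approximation)."""
    pooled = (k1 + k0) / (n1 + n0)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n0))
    if se == 0:
        return 1.0
    z = (k1 / n1 - k0 / n0) / se
    return math.erfc(abs(z) / math.sqrt(2))


def matches(row, var, op, threshold):
    """True if the row satisfies the rule `var <= threshold` or `var >= threshold`."""
    return row[var] <= threshold if op == "<=" else row[var] >= threshold


def find_subgroups(rows, outcome, variables, alpha=0.05, min_size=10):
    """Scan single-variable threshold rules and keep those significant after Bonferroni.

    rows: list of dicts with numeric explanatory variables and a 0/1 outcome.
    """
    # One candidate rule per (variable, direction, observed value).
    candidates = [
        (var, op, t)
        for var in variables
        for t in sorted({row[var] for row in rows})
        for op in ("<=", ">=")
    ]
    if not candidates:
        return []
    bonferroni_threshold = alpha / len(candidates)

    hits = []
    for var, op, t in candidates:
        inside = [r for r in rows if matches(r, var, op, t)]
        outside = [r for r in rows if not matches(r, var, op, t)]
        if len(inside) < min_size or len(outside) < min_size:
            continue  # skip very small subgroups or complements
        k1 = sum(r[outcome] for r in inside)
        k0 = sum(r[outcome] for r in outside)
        if k0 == 0:
            continue  # relative risk undefined when the complement has no events
        rr = (k1 / len(inside)) / (k0 / len(outside))
        p = two_sided_p(k1, len(inside), k0, len(outside))
        if p < bonferroni_threshold:
            hits.append({"var": var, "op": op, "threshold": t,
                         "n": len(inside), "rr": rr, "p": p})
    return sorted(hits, key=lambda h: h["p"])
```

Calling `find_subgroups(rows, "responder", ["hba1c_baseline"])` on a list of patient dictionaries returns rules such as `hba1c_baseline >= t` together with the subgroup size, relative risk, and p-value. Dividing `alpha` by the number of candidate rules is what keeps the family-wise error rate controlled when thousands of rules are scanned.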


Results

Take Control was a 24-week interventional study of Gla-300 efficacy and safety that included 631 patients. The algorithm performed thousands of queries, of which 32 were statistically significant after Bonferroni adjustment in models including confounding factors. Among the clusters generated, patients with a higher probability of an HbA1c reduction ≥ 0.5% at EoS were those with no previous insulin exposure (234 patients [pts], RR = 1.6, p < 10⁻⁶) and those with a baseline HbA1c value ≥ 7.9% (421 pts, RR = 1.5, p < 10⁻¹⁰). Additionally, our findings suggest that patients who received a pre-hypoglycaemia average insulin dose ≤ 33.3 U were at greater risk of HEOS (363 pts, RR = 1.7, p < 10⁻⁵). Moreover, patients with a disease duration ≤ 12 years and a baseline alkaline phosphatase value ≤ 64 U/L were more likely to achieve the combined outcome of HbA1c < 7% at EoS without HEOS (127 pts, RR = 2.2, p < 10⁻⁶).


Conclusion

This analysis is a promising new tool that provides simple criteria for clinicians to identify clusters of patients in whom treatment with Gla-300 is most efficacious and safe. The approach will be applied to additional REALI datasets.