Identifying patients with Gaucher Disease type 3 (GD3) in the Optum’s de-identified Market Clarity database: a clustering analysis

2022, 07 February


Authors: C. Rochmann, M. Génin, M. Blanchon, N. Lambert, L. Deplante, W. Heine, S. Kabadi

Date: 7–11 February 2022

N°262, 2022-02, Presented at WORLD Symposium 2022, San Diego, CA, USA




GD3 is a progressive lysosomal storage disorder (LSD), distinguished from GD1 by early, heterogeneous neurological manifestations. Neurological presentation in GD1 patients is not well understood. Differentiating GD3 from GD1 is challenging because there are no subtype-specific ICD-10 codes or subtype-specific treatment plans, and because GD3 is less prevalent.


The objective was to develop a GD3-specific algorithm to distinguish GD subtypes. A retrospective, unsupervised agglomerative hierarchical clustering analysis was conducted using Optum’s de-identified Market Clarity Database (1 January 2007 to 31 March 2020) to identify GD3 patients using clinical characteristics from physicians’ notes, diagnoses, treatments, and provider specialty. Patients with ≥2 ICD-10 codes (E75.22) for GD and/or ≥1 treatment for GD were included; those with other LSDs were excluded. Patients were characterized by clinical characteristics such as neurological signs, thrombocytopenia, splenomegaly, or bone density disorders.
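As an illustration only, agglomerative hierarchical clustering over binary clinical features can be sketched as follows. The abstract does not state the distance metric, linkage criterion, or feature encoding, so the Jaccard distance, average linkage, feature names, and toy patients below are all assumptions, not the authors' method.

```python
from itertools import combinations


def jaccard_distance(a: frozenset, b: frozenset) -> float:
    """Return 1 - |a & b| / |a | b|; two empty sets count as identical."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def agglomerative_clusters(patients: dict, n_clusters: int) -> list:
    """Average-linkage agglomerative clustering over patient feature sets.

    Starts with one cluster per patient and repeatedly merges the two
    clusters with the smallest mean pairwise distance until only
    n_clusters remain.
    """
    clusters = [[pid] for pid in sorted(patients)]

    def linkage(c1: list, c2: list) -> float:
        dists = [jaccard_distance(patients[p], patients[q]) for p in c1 for q in c2]
        return sum(dists) / len(dists)

    while len(clusters) > n_clusters:
        # Find the closest pair of clusters (i < j) and merge j into i.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]


# Hypothetical toy patients: each is the set of clinical features observed.
patients = {
    "p1": frozenset({"eye_movement_disorder", "seizures", "splenomegaly"}),
    "p2": frozenset({"eye_movement_disorder", "seizures", "thrombocytopenia"}),
    "p3": frozenset({"splenomegaly", "thrombocytopenia"}),
    "p4": frozenset({"splenomegaly", "thrombocytopenia", "bone_density_disorder"}),
}
clusters = agglomerative_clusters(patients, n_clusters=2)
print(clusters)
```

On this toy input the two patients sharing neurological features end up in one cluster and the two without them in the other. On real data the number of clusters would be chosen from the merge hierarchy rather than fixed in advance.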


Early occurrence of age-dependent neurologic signs (typical of GD3) and GD3-specific neurological symptoms were used to differentiate GD3 patients from GD1 patients. Clusters were described using the clustering features plus additional features, such as hospital visits or demographics, to give a broader description of each subpopulation. Ten clusters were identified, including one homogeneous cluster of suspected GD3 patients (n=33): they experienced severe disease with many GD-specific symptoms and accumulated severe, early, and atypical neurological signs, including eye movement disorders, but few age-dependent symptoms. A supplemental analysis identified GD3 patients (n=13) outside of this specific cluster who had either ‘Type3’ in the provider’s notes, GD3-specific neurological symptoms, or early occurrence of age-dependent neurological signs.
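The three supplemental criteria lend themselves to a simple rule-based check. The sketch below is a hypothetical rendering of that logic: the `Patient` fields, the symptom labels, the note-matching pattern, and the age cutoff passed in the usage lines are all placeholders, since the abstract does not define them.

```python
import re
from dataclasses import dataclass, field
from typing import FrozenSet, List, Optional, Set

# Matches "Type3", "type 3", or "Type-3"; does not match "type 30".
# The exact note-matching rule is an assumption.
TYPE3_PATTERN = re.compile(r"\btype[\s-]*3\b", re.IGNORECASE)


@dataclass
class Patient:
    """Hypothetical patient record; field names are illustrative."""
    patient_id: str
    notes: List[str] = field(default_factory=list)
    symptoms: Set[str] = field(default_factory=set)
    age_at_first_age_dependent_sign: Optional[float] = None


def supplemental_gd3_flags(
    patient: Patient,
    gd3_specific_symptoms: FrozenSet[str],
    early_onset_age_years: float,
) -> List[str]:
    """Return the supplemental criteria this patient meets (empty if none)."""
    reasons = []
    if any(TYPE3_PATTERN.search(note) for note in patient.notes):
        reasons.append("type3_in_notes")
    if patient.symptoms & gd3_specific_symptoms:
        reasons.append("gd3_specific_symptom")
    age = patient.age_at_first_age_dependent_sign
    if age is not None and age < early_onset_age_years:
        reasons.append("early_age_dependent_sign")
    return reasons


# Usage with placeholder values (neither the symptom set nor the cutoff
# comes from the abstract).
symptom_set = frozenset({"eye_movement_disorder"})
cutoff = 18.0
noted = Patient("a", notes=["Dx: Gaucher Type3, follow-up in 6 months"])
early = Patient("b", age_at_first_age_dependent_sign=9.0)
neither = Patient("c", notes=["Gaucher disease, stable"], symptoms={"splenomegaly"})
for p in (noted, early, neither):
    print(p.patient_id, supplemental_gd3_flags(p, symptom_set, cutoff))
```

Returning the list of matched criteria, rather than a single boolean, keeps a record of why each patient outside the main cluster was flagged.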


In the absence of GD subtype-specific ICD-10 codes, clustering analysis identified one cluster of GD3 patients based on clinical characteristics. The identified cluster included 7 of the 8 patients with GD3 mentioned in their provider’s notes, supporting correct identification of GD3 patients by the clustering analysis. Overall, the proportion of suspected GD3 patients identified is congruent with the reported GD3 prevalence (~5% of the overall GD population).

Poster of Identifying patients with Gaucher Disease type 3 (GD3) at the World Symposium in 2022