AliBERT achieves its first successful application in oncology!

18 October 2023

Grégoire Dugast
A successful proof of concept for AliBERT, the first French-language model specialized in the biomedical field. In partnership with a major French cancer institute, Quinten’s datalab team has developed a first concrete use case: the extraction of concepts and structured information from oncology medical reports. Until now, this Natural Language Processing (NLP) task has been highly complex, given the technical nature, diversity and specialization of these reports.

In medical research, structured data is crucial for reconstructing a patient’s history. However, some of these data are poorly or incompletely captured in hospital information systems. AliBERT, a pre-trained language model, enables structured data to be generated directly from medical reports, speeding up the costly, error-prone and time-consuming task of building databases based on chart reviews.

Using nearly 700 cancer patient files annotated by experts in the field, Quinten’s teams have trained AliBERT to recognize around ten key concepts in the follow-up of breast and lung cancer patients. In particular, these concepts can now be detected efficiently, with accuracy ranging from 80% to 95%, in several types of oncology reports (e.g., consultation reports, anatomopathology reports, and multidisciplinary team meeting reports, known in French as Réunion de Concertation Pluridisciplinaire (RCP)).
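As an illustration of how a per-concept accuracy figure like those above can be read, here is a minimal Python sketch. The article does not describe the evaluation protocol, so the function name, the concept names, and the annotations below are all hypothetical: the sketch simply compares expert annotations with model predictions, concept by concept.

```python
from collections import defaultdict


def per_concept_accuracy(gold, predicted):
    """Return the share of reports where each concept was predicted correctly.

    `gold` and `predicted` are parallel lists of dicts (one dict per report)
    mapping a concept name to its annotated or predicted value.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for gold_report, predicted_report in zip(gold, predicted):
        for concept, expected in gold_report.items():
            total[concept] += 1
            if predicted_report.get(concept) == expected:
                correct[concept] += 1
    return {concept: correct[concept] / total[concept] for concept in total}


# Hypothetical expert annotations and model outputs (not data from the project).
gold = [
    {"cancer_type": "breast", "metastasis": "no"},
    {"cancer_type": "lung", "metastasis": "yes"},
    {"cancer_type": "breast", "metastasis": "yes"},
    {"cancer_type": "lung", "metastasis": "no"},
]
predicted = [
    {"cancer_type": "breast", "metastasis": "no"},
    {"cancer_type": "lung", "metastasis": "no"},
    {"cancer_type": "breast", "metastasis": "yes"},
    {"cancer_type": "lung", "metastasis": "no"},
]

scores = per_concept_accuracy(gold, predicted)
print(scores)
```

Reporting one score per concept, rather than a single global figure, is what makes a range such as "80% to 95%" meaningful: some concepts are easier to detect than others.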

Until now, traditional methods such as regular expression search or non-specialized neural networks were unable to process and exploit these complex and technical documents. The AliBERT tool, specialized in biomedical language, now automatically extracts all this information from a large number of heterogeneous documents.
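To make the limitation of regular expression search concrete, here is a small Python sketch. The pattern, the concept (tumor size) and the example sentences are invented for illustration and written in English; the real reports are in French and are not reproduced here. The point is only that a hand-written pattern catches the phrasings its author anticipated and silently misses the rest.

```python
import re

# Hypothetical hand-written rule: capture a tumor size expressed in millimetres.
SIZE_PATTERN = re.compile(
    r"tumou?r (?:size|measuring)[: ]+(\d+)\s*mm", re.IGNORECASE
)


def extract_size_mm(text):
    """Return the tumor size in mm if the pattern matches, else None."""
    match = SIZE_PATTERN.search(text)
    return int(match.group(1)) if match else None


# Invented sentences showing how varied the wording of a report can be.
reports = [
    "Tumor size: 23 mm, no lymph node involvement.",
    "Tumour measuring 18 mm in the upper outer quadrant.",
    "The lesion measures approximately 2.3 cm at its widest.",
    "Mass of 23mm seen on imaging.",
]

results = [extract_size_mm(report) for report in reports]
print(results)
```

The first two sentences match; the last two express the same kind of information with different vocabulary or units and are missed. Covering every variant across heterogeneous report types would require an ever-growing set of rules, which is the gap a language model specialized in biomedical text is meant to close.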

A scientific publication is currently being written, reporting the state of the art and the progress achieved through this project. In the future, AliBERT will make it possible to reconstruct a patient’s care pathway from the medical report databases held in health data warehouses: an instrumental solution to accelerate research in the fight against cancer.

