The Open-Source Revolution of AliBERT in French Biomedical AI

Developed by Quinten in 2022, AliBERT is a language model specialized in French biomedical text. One version has been released as open source on the Hugging Face platform, marking a significant contribution to natural language processing (NLP) in healthcare.

AliBERT achieves its first successful application in oncology!

AliBERT, the first French-language model specialized in the biomedical field, has completed a successful proof of concept. In partnership with a major French cancer institute, Quinten’s datalab team has developed a first concrete use case: the extraction of concepts and structured information from oncology medical reports. Until now, this natural language processing (NLP) task has been highly complex, given the technical nature, diversity and specialization of these reports.
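To make the extraction task more concrete, here is a minimal sketch of one common way to frame it: token classification with BIO tags, where each token is labelled as beginning (B-), inside (I-) or outside (O) a concept, and the tagged tokens are then grouped into structured fields. This is an illustration only, not a description of Quinten's pipeline. The function name `bio_to_spans`, the label names (DIAGNOSIS, SITE, GRADE) and the example sentence are all hypothetical, and the tags are written by hand rather than predicted by a model.

```python
from typing import List, Tuple


def bio_to_spans(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group BIO-tagged tokens into (label, text) pairs.

    Hypothetical helper for illustration; a stray I- tag whose label
    differs from the open span is treated as the start of a new span.
    """
    spans: List[Tuple[str, str]] = []
    current_label = None
    current_tokens: List[str] = []

    def close_span() -> None:
        # Store the span being built, if there is one.
        if current_label is not None:
            spans.append((current_label, " ".join(current_tokens)))

    for token, tag in zip(tokens, tags):
        starts_new = tag.startswith("B-") or (
            tag.startswith("I-") and tag[2:] != current_label
        )
        if starts_new:
            close_span()
            current_label = tag[2:]
            current_tokens = [token]
        elif tag.startswith("I-"):
            current_tokens.append(token)
        else:  # "O": token is outside any concept
            close_span()
            current_label = None
            current_tokens = []
    close_span()
    return spans


# Hand-written example (not real model output, labels are assumptions).
tokens = ["Carcinome", "canalaire", "infiltrant", "du", "sein", "gauche",
          ",", "grade", "II", "."]
tags = ["B-DIAGNOSIS", "I-DIAGNOSIS", "I-DIAGNOSIS", "O", "B-SITE", "I-SITE",
        "O", "B-GRADE", "I-GRADE", "O"]

spans = bio_to_spans(tokens, tags)
for label, text in spans:
    print(f"{label}: {text}")
```

In a real setting the tags would come from a fine-tuned language model rather than being written by hand, and the resulting (label, text) pairs could populate the structured fields of a report.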

AliBERT: the first pretrained language model for French biomedical text

The paper “AliBERT: A pretrained language model for French biomedical text” was written in collaboration with Aman Berhe, Guillaume Draznieks, Vincent Martenot, Valentin Masdeu, Lucas Davy and Jean-Daniel Zucker. Models built on the BERT architecture, which allows contextual learning from text documents, are mostly trained on general English text resources. Results in other languages, especially on specialized topics that require deep domain knowledge and vocabulary, are […]