Machine Learning Enables More Precise Molecular Diagnostics of Acute Lymphoblastic Leukemia

Researchers in Kiel have developed ALLCatchR, a new bioinformatics program for more precise identification of subtypes of acute lymphoblastic leukemia

Acute lymphoblastic leukemia (ALL) is a life-threatening condition of the blood-forming system. It is the most common type of cancer in children, but can also affect adults. In ALL patients, there is an excessive proliferation of immature, non-functional white blood cells (lymphocytes) in the blood and bone marrow. Similar to other types of cancer, ALL arises from genetic changes in the cells. The type of genetic alteration often determines the severity and course of the disease. For the most common subtype of ALL, known as B-cell precursor acute lymphoblastic leukemia (BCP-ALL), a research team from the University Medical Center Schleswig-Holstein (UKSH), led by Prof. Claudia Baldus and Dr. Lorenz Bastian, has developed a new bioinformatics tool that enables precise diagnostics of molecular subtypes.

Currently, up to 26 molecular subtypes of BCP-ALL are differentiated based on genomic alterations and the associated gene expression profiles. To accurately diagnose these molecular subtypes, RNA sequencing is gaining importance and establishing itself as the new diagnostic standard. While identifying genetic changes (such as fusion genes) through RNA sequencing is well established, systematic approaches for gene expression analysis are less advanced. The research team therefore developed ALLCatchR, a dedicated classifier based on machine learning.

Dr. Lorenz Bastian & Dr. Thomas Beder, authors of the study

ALLCatchR uses RNA sequencing data to assign BCP-ALL samples to 21 predefined molecular subtypes. “The classifier was trained using 1,869 gene expression profiles derived from four different patient cohorts with established definitions for BCP-ALL subtypes, making it the world’s largest dataset currently available. In three independent test cohorts with 1,018 samples, the classifier achieved an accuracy of 95.7% in subtype assignment. In 84.6% of cases, even ‘high-confidence predictions’ were achieved with an accuracy of 99.7%,” says first author Dr. Thomas Beder, summarizing the key results of the study. Only 1.2% of samples were not classified. ALLCatchR significantly outperformed existing programs and identified subtype candidates even for samples that previously lacked a clear assignment.
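
For readers who want to see how figures of this kind are typically derived, the following is a minimal Python sketch. It is not the ALLCatchR code and does not use the R package's interface. The `Prediction` record, the `summarize` function, the toy data, and the choice to compute overall accuracy over all test samples (counting unclassified ones as not correct) are assumptions made purely for illustration; the study may define these quantities differently.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Sequence


@dataclass
class Prediction:
    """One test sample (hypothetical record layout, for illustration only)."""
    true_subtype: str
    predicted_subtype: Optional[str]  # None means the sample was left unclassified
    high_confidence: bool


def summarize(preds: Sequence[Prediction]) -> Dict[str, float]:
    """Compute summary metrics of the kind quoted in the text.

    Assumption: overall accuracy is taken over all samples, so an
    unclassified sample counts as not correct.
    """
    n = len(preds)
    if n == 0:
        raise ValueError("no predictions given")

    classified = [p for p in preds if p.predicted_subtype is not None]
    correct = [p for p in classified if p.predicted_subtype == p.true_subtype]
    high = [p for p in classified if p.high_confidence]
    high_correct = [p for p in high if p.predicted_subtype == p.true_subtype]

    return {
        "accuracy": len(correct) / n,
        "unclassified_fraction": (n - len(classified)) / n,
        "high_confidence_fraction": len(high) / n,
        # Accuracy restricted to the high-confidence predictions.
        "high_confidence_accuracy": (
            len(high_correct) / len(high) if high else float("nan")
        ),
    }


# Toy data with placeholder labels; these are not results from the study.
toy = [
    Prediction("subtype_A", "subtype_A", True),
    Prediction("subtype_A", "subtype_A", True),
    Prediction("subtype_B", "subtype_B", False),
    Prediction("subtype_B", "subtype_A", False),
    Prediction("subtype_B", None, False),
]

summary = summarize(toy)
for name, value in summary.items():
    print(f"{name}: {value:.1%}")
```

In this sketch, the high-confidence accuracy is computed only over the subset of predictions flagged as high-confidence, which is why it can be higher than the overall accuracy.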

To highlight the differences and similarities of BCP-ALL compared to normal B-cell development, the research group also created an RNA sequencing reference for human B-lymphopoiesis. This allowed researchers to demonstrate for the first time that diverse BCP-ALL subtypes resemble specific B-cell developmental stages. These patterns are also reflected in the activation of shared signaling pathways and could serve as the basis for targeted therapies. “ALLCatchR enables routine application of RNA sequencing in BCP-ALL diagnostics with systematic gene expression analysis for precise subtype assignment. Furthermore, it yields new insights into the underlying developmental biology of the disease,” summarizes Lorenz Bastian.

The results of this study were published in the journal HemaSphere. The ALLCatchR program is freely available as an R package.


This study is a further result of the Clinical Research Group “CATCH ALL – Towards a Cure for Adults and Children with Acute Lymphoblastic Leukemia (ALL)”. Since January 2022, the research group has been funded with around five million euros by the German Research Foundation. Research and clinical teams work in close collaboration to translate promising therapeutic approaches into clinical practice, with the aim of improving the chances of cure for patients of all ages. Professor Claudia Baldus, spokesperson for CATCH ALL, emphasizes: “The new findings can contribute to further improving the oncological precision diagnostics and therapy of acute lymphoblastic leukemia.” Further information can be found on the Clinical Research Group’s website.

A German version of the press release can be found on the CAU Kiel news page.

Original publication

Beder, T., Hansen, B.-T., Hartmann, A. M., Zimmermann, J., Amelunxen, E., Wolgast, N., Walter, W., Zaliova, M., Antić, Ž., Chouvarine, P., Bartsch, L., Barz, M. J., Bultmann, M., Horns, J., Bendig, S., Kässens, J., Kaleta, C., Cario, G., Schrappe, M., Neumann, M., Gökbuget, N., Bergmann, A. K., Trka, J., Haferlach, C., Brüggemann, M., Baldus, C. D., Bastian, L. (2023). The Gene Expression Classifier ALLCatchR Identifies B-cell Precursor ALL Subtypes and Underlying Developmental Trajectories Across Age. HemaSphere, 7(9), e939. DOI: 10.1097/HS9.0000000000000939.

Text: Dr. Claudia Taubenheim
