Figure 1 | Overview of immuneML.
The main immuneML application areas are sequence- and repertoire-based prediction of AIRR with application to (a) immunodiagnostics and therapeutics research and (b) the development of AIRR-based methods. We show three use cases belonging to these application areas. Use case 1: reproduction of the study by Emerson et al.6 on repertoire classification; use case 2: extension of the platform with a novel convolutional neural network (CNN) classifier for prediction of TCR-pMHC binding that allows paired-chain input; use case 3: benchmarking of ML methods with respect to their ability to recover a sequence-implanted signal corresponding to a simulated immune event. The immuneML core is composed of three pillars: (c) AIRR-seq data input and filtering, (d) ML, and (e) interpretability analysis. Each pillar comprises modules that may be interconnected to build an immuneML workflow. (f) immuneML uses a customizable specification file (YAML), which allows full reproducibility and sharing with collaborators or the broader research community. An overview of how immuneML analyses can be specified is given in Supplementary Figure 1. (g) immuneML may be operated via the Galaxy web interface or the command line. (h) All immuneML modules are extendable; documentation for developers is available online. (i) immuneML is available as a Python package and a Docker image, and may be deployed to cloud frameworks (e.g., AWS, Google Cloud). Abbreviations: CMV (cytomegalovirus).