Background
Accurate neoantigen identification is crucial for assessing neoantigen burden and optimizing cancer vaccine formulation. Concurrent activation of CD4+ and CD8+ T cells is essential for inducing a robust immune response. However, current analytic tools predominantly focus on the MHC-I pathway, typically relying on the gene expression level and MHC-binding affinity of mutated peptides, while neglecting the role of the MHC-II pathway. To address this gap, we present a comprehensive pipeline that encompasses both the MHC-I and MHC-II pathways and integrates three critical dimensions of the immune response: neoantigen abundance, MHC presentation, and T cell receptor (TCR) recognition. We use it to select neoantigens and to improve the prediction of immunotherapy response.
Methods
We first identified significant metrics for each dimension and established an optimal method for aggregating metrics across individual MHC alleles. To model TCR recognition, we developed a cross-reactivity (CR) distance model inspired by the work of Łuksza et al. [1]. We trained the model separately for MHC-I and MHC-II using TCR-peptide binding data. We then employed a gradient boosting model to integrate metrics from the three dimensions and predict CD4+ and CD8+ immune responses. Finally, we combined the MHC-I and MHC-II predictions for each neoantigen and aggregated neoantigen burden according to tumor subclonal structure to stratify treatment response in immunotherapy cohorts.
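As an illustration of what aggregating a metric across MHC alleles can look like, the sketch below contrasts two schemes under one plausible reading of their names: "best-binding" takes the recognition score from the single allele with the strongest predicted binding, while "binding-masked maximum" takes the highest recognition score among all alleles predicted to bind. The abstract does not define either scheme, so the function names, the percentile-rank cutoff of 2.0, and the toy scores are all assumptions made for this example, not the pipeline's actual implementation.

```python
from typing import Dict, Optional

# Hypothetical cutoff: a binding percentile rank <= 2.0 counts as a predicted binder.
BINDER_RANK_CUTOFF = 2.0


def best_binding(binding_rank: Dict[str, float], recognition: Dict[str, float]) -> float:
    """Recognition score from the allele with the best (lowest) binding rank."""
    best_allele = min(binding_rank, key=binding_rank.get)
    return recognition[best_allele]


def binding_masked_maximum(
    binding_rank: Dict[str, float],
    recognition: Dict[str, float],
    cutoff: float = BINDER_RANK_CUTOFF,
) -> Optional[float]:
    """Maximum recognition score over alleles predicted to bind.

    Alleles whose binding rank exceeds the cutoff are masked out.
    Returns None when no allele passes the cutoff.
    """
    masked = [recognition[allele] for allele, rank in binding_rank.items() if rank <= cutoff]
    return max(masked) if masked else None


# Toy data for one peptide across three alleles (illustrative values only).
binding_rank = {"HLA-A*02:01": 0.4, "HLA-B*07:02": 1.5, "HLA-C*07:01": 8.0}
recognition = {"HLA-A*02:01": 0.30, "HLA-B*07:02": 0.85, "HLA-C*07:01": 0.95}

print(best_binding(binding_rank, recognition))
print(binding_masked_maximum(binding_rank, recognition))
```

In this toy case the two schemes disagree: the best-binding allele carries a low recognition score, whereas the masked maximum picks up the higher score from a second predicted binder and ignores the non-binding allele.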
Results
The binding-masked maximum method outperformed the commonly used best-binding method for aggregating recognition metrics across MHC alleles. The CR distance model was significantly associated with T cell activity (P = 0.002 for MHC-I and P = 0.01 for MHC-II) and, for MHC-I, achieved the best rank improvement among the predictors compared. Additionally, our machine learning model effectively integrated metrics across dimensions, achieving an AUROC of 0.76 on the isolated MHC-I test set, compared with 0.72 for models using only presentation metrics. Finally, integrating MHC-I and MHC-II predictions provided a more accurate assessment of neoantigen burden, outperforming models that considered only one pathway in 6 of 7 immunotherapy cohorts.
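For readers less familiar with the AUROC values reported above, the following minimal sketch computes AUROC directly from its definition: the probability that a randomly chosen immunogenic peptide receives a higher score than a randomly chosen non-immunogenic one, with ties counted as one half. The function name and the toy labels and scores are assumptions for illustration and are not data from this study.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    labels: 1 for immunogenic, 0 for non-immunogenic.
    scores: model outputs, where higher means more likely immunogenic.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Toy example (illustrative values only).
toy_labels = [1, 0, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.6, 0.7, 0.2]
print(round(auroc(toy_labels, toy_scores), 3))
```

The pairwise form is quadratic in the number of examples, which is fine for a small illustration; for large test sets a rank-based implementation from an established statistics library would be the more practical choice.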
Conclusions
We developed a pipeline to identify immunogenic neoantigens and predict immunotherapy response by considering both MHC-I and MHC-II pathways. Our CR distance model improved TCR recognition predictions, and the binding-masked maximum method proved superior for aggregating metrics across MHC alleles. By effectively integrating multiple dimensions and both MHC pathways, our approach offers a comprehensive assessment of neoantigen burden, underscoring the importance of considering both pathways in neoantigen prediction for cancer immunotherapy.
Reference
Łuksza M, Sethna ZM, Rojas LA, Lihm J, Bravi B, Elhanati Y, et al. Neoantigen quality predicts immunoediting in survivors of pancreatic cancer. Nature 2022;606:389–95.