Cell Reports
`Distinct mutational processes shape selection of
`MHC class I and class II mutations across primary
`and metastatic tumors
`
`Graphical abstract
`
`[Graphical abstract; recoverable labels: >30 cancer types; mutational processes (APOBEC/AID, MSI, TMB); refractory; mutation calling with personalized references; loss of function; high dN/dS ratio.]
`
`Authors
`
`Michael B. Mumphrey, Noshad Hosseini,
`Abhijit Parolia, ..., Malini Raghavan,
`Arul Chinnaiyan, Marcin Cieslik
`
`Highlights
`• Hapster detects MHC class I and II mutations with high
`sensitivity and specificity
`
`• MHC genes are among the most recurrently mutated genes
`pan-cancer
`
`• Tumor mutation burden and mutational processes shape the
`spectrum of MHC mutations
`
`• MHC missense mutations are likely loss of function,
`disrupting B2M and antigen binding
`
`CellPress
`
`Mumphrey et al., 2023, Cell Reports 42, 112965
`August 29, 2023 © 2023 The Authors.
`https://doi.org/10.1016/j.celrep.2023.112965
`
`
`JHU 2034
`Merck Sharp v. Johns Hopkins
`IPR2024-00649
`
`Correspondence
`mcieslik@med.umich.edu
`
`In brief
`Using the personalized mutation caller
`Hapster, Mumphrey et al. report a pan-cancer
`analysis of positive selection for
`MHC class I and class II mutations across
`primary and metastatic cancers. Their
`analysis provides evidence for the
`enrichment of inactivating MHC
`mutations in select cancers, as well as the
`mutational processes responsible.
`
`[Graphical abstract, continued: MHC-I/II mutation compendium (1,079 MHC-I and 990 MHC-II mutations); MHC-I/II among the most recurrent driver genes; positional enrichment in protein functional domains; statistical recurrence analysis.]
`
`

`

`OPEN ACCESS
`
`Distinct mutational processes shape selection
`of MHC class I and class II mutations
`across primary and metastatic tumors
`
`Michael B. Mumphrey, Noshad Hosseini, Abhijit Parolia, Jie Geng, Weiping Zou, Malini Raghavan,
`Arul Chinnaiyan, and Marcin Cieslik*
`1Department of Pathology, University of Michigan, Ann Arbor, MI 48109, USA
`2Department of Computational Medicine & Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
`3Michigan Center for Translational Pathology, University of Michigan, Ann Arbor, MI 48109, USA
`4Department of Microbiology & Immunology, University of Michigan, Ann Arbor, MI 48109, USA
`5Department of Urology, University of Michigan, Ann Arbor, MI 48109, USA
`6Center of Excellence for Cancer Immunology and Immunotherapy, University of Michigan, Ann Arbor, MI 48109, USA
`7Howard Hughes Medical Institute, Ann Arbor, MI 48109, USA
`8University of Michigan Rogel Cancer Center, Ann Arbor, MI 48109, USA
`9Senior author
`10Lead contact
`*Correspondence: mcieslik@med.umich.edu
`https://doi.org/10.1016/j.celrep.2023.112965
`
`SUMMARY
`
`Disruption of antigen presentation via loss of major histocompatibility complex (MHC) expression is a
`strategy whereby cancer cells escape immune surveillance and develop resistance to immunotherapy. Here, we
`develop the personalized genomics algorithm Hapster and accurately call somatic mutations within the MHC
`genes of 10,001 primary and 2,199 metastatic tumors, creating a catalog of 1,663 non-synonymous mutations
`that provide key insights into MHC mutagenesis. We find that MHC class I genes are among the most
`frequently mutated genes in both primary and metastatic tumors, while MHC class II mutations are more
`restricted. Recurrent deleterious mutations are found within haplotype- and cancer-type-specific hotspots
`associated with distinct mutational processes. Functional classification of MHC residues reveals significant
`positive selection for mutations disruptive to the B2M, peptide, and T cell binding interfaces, as well as to
`MHC chaperones.
`
`INTRODUCTION
`
`The immune system is capable of identifying and eliminating
`cancer cells via CD8+ T cell-mediated cytotoxicity.1 To avoid
`this destruction, successful cancers often evolve strategies to
`disrupt T cell immunity, such as overexpression of the immunosuppressive
`PD-L12 or reduced expression of key proteins such
`as the major histocompatibility complex (MHC) class I molecules.3
`Research into how tumors escape T cell surveillance
`has led to immunotherapies that restore cancer immunity in
`select patients.4–9 However, even the most promising immunotherapies
`still only provide a clinical benefit in a minority
`of cases.10 Understanding the mechanisms that lead to
`primary and acquired resistance to T cell-based immunotherapies
`will be critical for the continued improvement of patient
`outcomes.
`
`T cells are able to identify malignant cells via the presence of
`mutant peptides known as neoantigens.1 In order to detect these
`neoantigens, T cells require the peptides to be presented at the
`cell surface by the MHC. The MHC class I proteins present neoantigens
`to cytotoxic CD8+ T cells1 and, as a result, are directly
`involved in the destruction of malignant cells. Consistent with
`this role as a tumor suppressor, correlations between decreased
`MHC class I expression and poor outcomes have been repeatedly
`observed across many cancers.11–16 The MHC class II proteins
`present neoantigens to CD4+ T cells and play an important
`role in tumor suppression.17–19 Their expression can be induced
`in most cell types,20 including cancer cells,21–23 and MHC class
`II-restricted neoantigen vaccines have been shown to trigger an
`anti-tumor immune response,24 suggesting that cancer-cell-specific
`MHC class II expression may also play a tumor-suppressor
`role.
`Identifying the mechanisms that lead to loss of the MHC will be
`key to understanding resistance to T cell-based immunother-
`apies. However, genetic studies of the MHC are hindered by
`the extreme polymorphism of the MHC genes.25 To address
`this issue, we created Hapster, a generalized algorithm that con-
`structs personalized reference sequences for alignment and mu-
`tation calling within polymorphic genes. The high sensitivity and
`specificity of the identified mutations enabled us to study the
`mutational and evolutionary processes driving somatic MHC
`loss in primary and metastatic tumors.
`
`Cell Reports 42, 112965, August 29, 2023 © 2023 The Authors.
`This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
`
`

`

`Resource
`
`Figure 1. Hapster algorithm and validation
`(A) Simplified overview of the Hapster algorithm. For each gene, Hapster infers optimal reference sequences from normal sequencing data, realigns to these
`personalized references, and calls mutations. For a complete description, see STAR Methods.
`(B) Germline variants identified from 69 WES samples from the 1000 Genomes Project relative to either the standard reference GRCh38 or to dynamically selected
`references using 8 different haplotypers. A perfect reference sequence should produce 0 apparent germline variants.
`(C) Fraction of simulated insertions or deletions (indels) and SNVs that were either called and passed all filters, were called and filtered by either Hapster or
`Mutect2, or were never called. Shown here are mutations simulated at a VAF of 0.45 and a coverage of 100×.
`(D) QQ plot for observed RNA-seq read support for HLA variants, assuming read support is only due to sequencing error according to a beta-binomial model.
`Variants were identified by Hapster alone (red) or by both Hapster and Polysolver (blue) from WES data. A comparison is shown to randomly generated alternate
`bases (gray), which are only supported by noisy reads and follow the null model (diagonal black line).
`(E) Boxplot showing the number of private germline variants observed per tumor in those cases with or without somatic MHC mutations. Wilcoxon rank sum test p
`values with BH correction.
`
`RESULTS
`
`Hapster allows for more sensitive and specific somatic
`mutation calls in MHC genes
`Hapster is a complete mutation calling pipeline that (1) selects
`personalized reference haplotype sequences, (2) prunes contaminant and misaligned reads, and
`(3) detects26,27 and filters variants (Figure 1A). For the first function, in principle, any
`existing human leukocyte antigen (HLA) haplotyper28–34 could
`be used to identify HLA haplotype sequences. However, in prac-
`tice, existing haplotypers often report HLA types for which only
`the sequences for exons 2 and 3 are known35 (Figure S1A).
`This is insufficient for somatic mutation calling where a full-length
`sequence is needed to call variants in all exons/introns. We
`therefore developed a generalized haplotyping algorithm to re-
`turn full-length sequences for MHC class I and class II genes.
`For the second and third functions, we developed MHC-specific
`strategies for pruning of contaminant and misaligned reads orig-
`inating from other MHC genes and pseudogenes, as well as for
`the identification of false positive somatic mutation calls. A
`detailed overview of the Hapster pipeline is included (STAR
`Methods).
`To benchmark the haplotype inference portion, we used a set
`of 69 whole-exome sequencing (WES) samples from the 1000
`Genomes Project with reported haplotype calls.33,36 To compare
`Hapster with other methods, we called germline variants in WES
`data relative to each haplotyper’s inferred haplotype for each in-
`dividual. Using sequencing data as ground truth, a perfectly
`identified reference would lead to no germline variants being
`identified. We observe that, relative to the standard GRCh38
`reference, there is a median of 17–38 germline variants observed
`per allele. All tested haplotypers improve upon this, with each
`having a median of 0 and a mean of <0.5 observed germline mu-
`tation per allele (Figure 1B), similar to the genome-wide average
`of 1 variant per kilobase.37 In a larger population of 10,001
`normal tissue samples from TCGA, Hapster identified 0–3 germ-
`line variants per allele (Table S1).
`To assess Hapster’s sensitivity, we simulated 200 synthetic
`MHC haplotypes with a random mutation, followed by simulated
`WES at depths ranging from 5× to 100× and variant allele fractions
`(VAFs) of 0.025–0.45. Of the 200 simulated mutations, 94%
`(187/200) were successfully identified (Figure 1C) at 100×
`coverage and a VAF of 0.45. Following filtering, 18 of these calls
`were removed by either Hapster’s or Mutect2’s filters, giving a
`final sensitivity of 85% (169/200) for high-coverage clonal vari-
`ants. Inspection of the 13 variants that failed to be called showed
`that 12 were in regions of low coverage following probe capture
`(Table S2). As such, they are false negatives due to the capture
`kit design rather than to Hapster’s algorithm. When looking at re-
`sults over the range of coverages and VAFs, we see that as
`coverage and VAF decrease, sensitivity decreases as expected
`due to lower read support for variants (Figure S1D). A compari-
`son of simulated vs. observed VAFs for each mutation call shows
`that at most simulated VAFs, Hapster produces calls with slightly
`lower observed VAFs, likely due to a slight loss of reads following
`read filtering (Figure S1E).
`To assess specificity, we called somatic mutations in all 450
`samples from the TCGA head and neck squamous cell carcinoma
`(HNSC) cohort with tumor and normal labels swapped
`such that no somatic variants should be identified. In 9 cases,
`an apparent somatic variant was identified that passed all filters.
`Assuming all 9 calls are false positives gives a specificity of 98%
`(441/450).
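
The sensitivity and specificity figures quoted above follow directly from the reported counts. As a minimal sketch (the function and variable names are illustrative and are not part of Hapster), the arithmetic can be reproduced as follows:

```python
# Reproduce the reported benchmark percentages from the counts in the text.
# Function and variable names are illustrative; they are not part of Hapster.

def rate(numerator: int, denominator: int) -> float:
    """Return numerator / denominator as a fraction in [0, 1]."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return numerator / denominator

simulated = 200        # simulated mutations at 100x coverage and a VAF of 0.45
called = 187           # identified before filtering
filtered_out = 18      # removed by Hapster's or Mutect2's filters
passed = called - filtered_out

call_rate = rate(called, simulated)      # reported in the text as 94%
sensitivity = rate(passed, simulated)    # reported in the text as 85%

swapped_samples = 450  # HNSC samples with tumor and normal labels swapped
false_positive_cases = 9
specificity = rate(swapped_samples - false_positive_cases, swapped_samples)  # reported as 98%

print(f"call rate {call_rate:.3f}, sensitivity {sensitivity:.3f}, specificity {specificity:.3f}")
```

Treating each label-swapped sample as one negative case is the convention implied by the 441/450 figure; a per-site specificity would be far closer to 1.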
`To assess Hapster’s accuracy using an orthogonal sequencing
`technology, we used Hapster to call somatic mutations
`in WES data and then determined if these same mutations
`were supported by paired RNA sequencing (RNA-seq) data. Es-
`tablished RNA-seq validation methods are not ideal, as they rely
`on alignment of reads to a reference, which would be inappro-
`priate in validating Hapster. We therefore developed an orthog-
`onal alignment-free kmer-based approach to determine if the
`reads support variants in the RNA based on a beta-binomial
`model of sequencer error (STAR Methods), avoiding reference
`selection or alignment biases. Of the 80 variants in the WES
`data, 72 had high enough coverage in the RNA-seq data to un-
`dergo validation. Of these, 63 variants (88%) had read support
`significantly exceeding the null model (p < 0.05 [Benjamini-Hochberg
`adjusted (BH-adj.)], Figure 1D), and 4 (5%) were truncating
`with evidence of nonsense-mediated decay (Figure S1B).
`This leaves only 5 variants (7%) without RNA evidence, poten-
`tially due to the limitations of our model, low tumor cellularity,
`loss of heterozygosity (LOH), or transcriptional silencing.38
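
The statistical step of the RNA-seq validation described above can be sketched as a one-sided beta-binomial test followed by Benjamini-Hochberg adjustment. This is only an illustration under stated assumptions: the shape parameters `A` and `B`, the read counts, and all function names are hypothetical, and the k-mer counting that produces the read support in the paper is not reproduced here.

```python
import math

def betabinom_pmf(k: int, n: int, a: float, b: float) -> float:
    """P(X = k) for a beta-binomial with n trials and shape parameters a, b."""
    log_comb = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    log_beta_num = math.lgamma(k + a) + math.lgamma(n - k + b) - math.lgamma(n + a + b)
    log_beta_den = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    return math.exp(log_comb + log_beta_num - log_beta_den)

def upper_tail(k: int, n: int, a: float, b: float) -> float:
    """P(X >= k): chance of k or more supporting reads arising from error alone."""
    return min(1.0, sum(betabinom_pmf(i, n, a, b) for i in range(k, n + 1)))

def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical error model with mean error rate a / (a + b) = 0.001.
A, B = 1.0, 999.0
# Hypothetical (supporting reads, total reads) pairs for four candidate variants.
observations = [(12, 60), (0, 45), (3, 80), (1, 30)]
pvals = [upper_tail(k, n, A, B) for k, n in observations]
for (k, n), p, q in zip(observations, pvals, bh_adjust(pvals)):
    print(f"{k}/{n} reads: p = {p:.3g}, BH-adj. p = {q:.3g}")
```

A variant with no supporting reads gets a p-value of 1 under this model, which is how a site without RNA evidence would fail validation.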
`For a second orthogonal validation, we performed Sanger
`sequencing on tumors from MI-ONCOSEQ39 with sufficient
`DNA or tissue samples. All 14 candidate variants called by Hap-
`ster were detected in the Sanger chromatograms from tumor
`specimens while being absent in traces from patient-matched
`normal tissues (Figure S1C). Addressing the possibility of germ-
`line variants being miscalled as somatic due to poor reference
`selections, we found no evidence of enrichment of somatic mu-
`tations in cases with higher numbers of private germline variants
`(Figure 1E).
`Finally, we applied Hapster to a larger set of 7,746 samples
`from TCGA that have previously reported mutations called by
`both the Broad Genomic Data Analysis Center (GDAC) standard
`reference-based pipeline and the Polysolver personalized pipe-
`line.34,40 We found that when calling mutations in the MHC class I
`genes, Hapster detected over twice as many non-synonymous
`mutations as the GDAC pipeline and 36% more than Polysolver
`(Table S3; Figures 1F and S2A). Next we examined variant allele
`frequency (VAF) distributions. Hapster tended to report slightly
`higher VAFs (Figure S2B) due to preserving more variant reads
`(Figure S2C) rather than failing to call low VAF mutations.
`
`Figure 1 (continued).
`(F) Comparison of non-synonymous mutation calls for the MHC class I genes between the GDAC pipeline, Polysolver, and Hapster across various cancer types
`from TCGA. Lightly shaded bars represent possible false positives.
`(G) Comparison of mutational consequences for variants called by the standard GDAC pipeline, Polysolver, or Hapster in the MHC genes vs. oncogenes, tumor
`suppressors, and neutral gene mutations from TCGA. Oncogenes (OGs): KRAS, PIK3CA, IDH1, CTNNB1, FOXA1, BRAF, AKT1, EGFR. Tumor suppressors (TSs):
`TP53, RB1, PTEN, APC, BRCA2, VHL. Neutral genes: all others.
`See also Figures S1 and S2 and Tables S1, S2, and S3.
`
`Conversely, variants exclusive to Hapster tended to be low
`VAF mutations, supporting its higher sensitivity (Figure S2D).
`We next performed an exhaustive search for potential false pos-
`itives resulting from misalignment of sequencing reads origi-
`nating from other homologous MHC genes or pseudogenes.
`We found that only 6% of Hapster’s non-synonymous
`calls matched known sequences in any other MHC gene, a
`rate significantly lower than that of Polysolver (15%, Fisher’s
`exact test p < 1e-10, BH-adj.) but similar to the GDAC pipeline
`(4%, Fisher’s exact test p = 1, BH-adj.) (Figures 1F and S2E;
`Table S3). An analysis of synonymous mutation calls shows
`the apparently recurrent synonymous variants p.T214T and
`p.A269A, which are identified as somatic mutations by Poly-
`solver (Figure S2F). These mutations are unlikely to be under
`extreme positive selection but have sequences exactly matching
`non-classical MHC class I genes, i.e., are likely due to alignment
`errors from HLA-E, HLA-F, or HLA pseudogenes. We next
`compared the distribution of functional consequences of HLA
`mutations called by each of the approaches. For both Hapster
`and the GDAC pipeline, synonymous mutation calls were under-
`represented when compared to neutral genes, consistent with
`what would be expected for a potential driver gene (Figure 1G).
`In contrast, we found that Polysolver had an over-representation
`of synonymous calls, many of which can likely be attributed to
`misaligned reads originating from non-classical MHC class I
`genes (Figure S2F; Table S3).
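
The comparison of false-positive rates between callers above relies on Fisher's exact test. A minimal standard-library sketch of a two-sided test on a 2x2 table is shown below. The counts in the example are hypothetical, chosen only to match the reported 6% and 15% rates, because the underlying counts are not given in this section.

```python
from math import comb

def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x in the top-left cell given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    observed = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    tail = sum(p for p in (prob(x) for x in range(lo, hi + 1))
               if p <= observed * (1 + 1e-9))
    return min(1.0, tail)

# Hypothetical counts: calls matching another MHC gene vs. calls that do not.
hapster = (30, 470)      # 6% of 500 calls
polysolver = (75, 425)   # 15% of 500 calls
p_value = fisher_exact_two_sided(*hapster, *polysolver)
print(f"two-sided p = {p_value:.3g}")
```

When many such tests are run, the resulting p-values would still need a multiple-testing adjustment such as the BH procedure used in the text.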
`
`Pan-cancer compendium of MHC class I and class II
`mutations
`To comprehensively characterize MHC class I and class II muta-
`tion rates in human cancer, we analyzed 10,001 tumors across
`35 cancer types from TCGA and 2,199 tumors across 24 cancer
`types from MI-ONCOSEQ39 (Table S4), for a total compendium
`of 2,069 MHC class I and class II mutations (Figure 2A;
`Table S5). Samples from TCGA are mainly primary tumors, with
`the exception of the melanoma cohort (skin cutaneous melanoma
`[SKCM]), which consists only of metastatic samples. Microsatellite
`unstable (MSI) tumors are immunologically distinct due to their
`significantly higher neoantigen burden,41 and we have therefore
`separated them from their microsatellite stable (MSS) counterparts
`within the colon (colon adenocarcinoma [COAD]), stomach (stom-
`ach adenocarcinoma [STAD]), and endometrial (uterine corpus
`endometrial carcinoma [UCEC]) TCGA cohorts. While some other
`cancers also have distinct subtypes, such as breast cancer
`(BRCA) estrogen receptor (ER)+/- and HNSC human papillomavirus
`(HPV)+/-, no significant difference in MHC mutation rates
`was observed between subtypes (BH-adj. chi-squared p = 0.27–1).
`These cohorts were therefore not subdivided for further analyses.
`Samples from MI-ONCOSEQ are metastatic/refractory
`and represent a significantly more advanced form of disease
`compared with the corresponding TCGA cohorts, with cases
`generally having received multiple forms of prior systemic therapy
`for their primary cancer, and one or more rounds of therapy for their
`metastatic cancer.
`Mutations were in general distributed uniformly across the gene
`body but occasionally concentrated within prominent hotspots
`(Figures 2A and S3A). We found that for MHC class I, HLA-A and
`HLA-B contained significantly more mutations than HLA-C, and
`that for MHC class II, HLA-DRA contained significantly more mu-
`tations than all other MHC class II genes except for HLA-DQB1
`(Figure 2B). Within each HLA gene, no allele was found to bear
`an excess of mutations. In primary tumors, we noted substantial
`variation in both mutational frequency and functional conse-
`quences across tumor types and MHC gene classes (Figure 2C).
`We found non-synonymous MHC class I and class II mutations
`in 10.5% of primary tumors (ranging from 2.7% to 72.5% across
`cancer types) (Figure S3B), with 5.6% (range: 0.2%–62.3%) of pa-
`tients harboring an MHC class I and 5.7% (range: 1.1%–21.7%)
`harboring an MHC class II somatic variant. Consistent with previ-
`ous reports that MSI tumors are under strong pressure to lose
`MHC function,42,43 the COAD-MSI, STAD-MSI, and UCEC-MSI
`cohorts make up 3 of the top 4 cohorts for MHC class I mutations
`(Figures 2D and S3B), with the majority of variants being loss-of-
`function (LOF) frameshifts or stop gains. MHC class II mutations
`were also most prevalent in cancers with high mutation burden
`including MSI tumors and melanoma (Figure 2E). However, LOF
`mutations in the top-mutated cohorts were less frequent and the
`variation in mutation rates across cancer types was lower
`compared with MHC class I (Figures 2E and S3B). Interestingly,
`there was a slight pan-cancer association between both MHC
`class I and class II mutations and immune cell infiltration after
`adjusting for tumor purity (Figure S3C), but further cohort-level
`analysis showed that no individual cancer type had a significant
`association after multiple testing correction.
`
`Prevalence of MHC class I and class II mutations in
`primary vs. metastatic tumors
`The prevalence of MHC mutations in metastatic tumors is un-
`known, a critical gap in knowledge considering the predominant
`use of immunotherapies in this setting and the immunological
`differences between the primary and metastatic tumor microen-
`vironment (TME).39,44–47 Overall, we observed non-synonymous
`MHC class I and class II mutations in 7.6% (range: 3.3%–20%) of
`metastatic/refractory patients, with substantial variation in muta-
`tional frequency and functional consequences between cancer
`types (Figures 2F, 2G, and S3D). To directly compare mutation
`rates between primary and metastatic cancers, we created a
`set of pairings to match TCGA cohorts to MI-ONCOSEQ cohorts
`(Table S6). For 15/17 pairings (88%), there were no significant
`changes in primary vs. metastatic MHC class I or class II mutation
`rates. However, for prostate and breast cancers, we
`observed a significant increase in MHC class I mutations in met-
`astatic cancers compared with primary cancers (Figure S3E;
`prostate: F(1, 909) = 9.35, p = 0.03; breast: F(1, 1140) = 12.8,
`p = 0.01, BH-adj.). No significant differences were seen in
`MHC class II mutations.
`Overall, these data provide a comprehensive look at MHC
`class I and class II mutations pan-cancer, across both primary
`and metastatic tumors. We find that somatic mutations of
`HLA-A and HLA-B are most common, while HLA-C and MHC
`class II genes are less frequently mutated. While some significant
`differences in MHC mutation rate between primary and metasta-
`tic tumors are noted, the majority of MHC mutations in metasta-
`tic tumors are expected to be already present in the primary
`tumor.
`
`Figure 2. Compendium of MHC class I and class II mutations in primary and metastatic tumors
`(A) Distribution of all observed mutations in both primary and metastatic cancers across the coding region of the MHC genes. Binding pocket secondary
`structures are noted above.
`(B) Significant differences in the prevalence of non-synonymous mutations and indels of individual MHC class I and class II genes. *p < 0.05; **p < 0.01;
`***p < 0.001; ****p < 0.0001, BH-adj. Fisher’s exact test.
`(C) Cohort-specific mutation rates for MHC class I and class II genes across all primary and metastatic cancers. Values are scaled to the number of individuals
`within each cohort. Colors represent the fraction of cancers with non-synonymous/indel mutations.
`(D–G) Cohort summaries of coding region mutations in MHC class I (D and F) and class II (E and G) genes in primary (D and E) and metastatic (F and G) cancers.
`Values are scaled by the number of individuals within each cohort.
`See also Figure S3 and Tables S4 and S5.
`
`Positive selection of non-synonymous MHC somatic
`mutations
`To identify positive selection of functional mutations within the
`MHC genes, we applied CBaSE48 to each primary and metastatic
`cohort from TCGA and MI-ONCOSEQ. HLA genes and haplotypes
`are codominant, and each allele presents a largely unique
`set of neoantigens.49 Additionally, specific T cell responses are
`often immunodominant and mounted against only a few of
`the presented neoantigens such that the mutation of a single
`HLA allele may result in the complete inability to present an
`immunodominant neoantigen. Given this, we treat all MHC class
`I genes (and separately, all MHC class II genes) as one functional
`unit, analogous to multiple genes of a protein complex,50 taking
`into account the increased genomic length of this combined set
`of genes. In primary cancers, CBaSE identified 6 cohorts (COAD-
`MSI, STAD-MSI, diffuse large B cell lymphoma [DLBC], cervical
`squamous cell carcinoma [CESC], HNSC, LUSC) with statisti-
`cally significant evidence for positive selection of non-synony-
`mous variants in the MHC class I genes and 3 cohorts
`(cholangiocarcinoma [CHOL], kidney chromophobe [KICH],
`uveal melanoma [UVM]) for the MHC class II genes (Figure 3A).
`By this measure, the MHC class I genes are tied for 7th and
`the MHC class II genes are tied for the 17th most recurrent driver
`genes pan-cancer as determined by applying CBaSE to all pro-
`tein-coding genes across primary cancers. A similar trend was
`identified in metastatic and refractory cancers with the MHC
`class I genes being mutated in two lymphoma cohorts
`(M-DLBC, M-LYM), making them tied for 6th most recurrent
`pan-cancer driver gene by number of cohorts significantly
`mutated (Figure 3B). As an alternative measure of positive selec-
`tion, we used Fisher’s method to create a combined score (Fpos)
`for the strength of selection across all cohorts (Figures 3C and
`3D). We found that in both primary and metastatic cancers, the
`MHC class I genes scored in the top 0.1% of all protein-coding
`genes according to this metric of positive selection (Figures 3C
`and 3D), and in primary cancers, the MHC class II genes scored
`in the top 0.5% (Figure 3C). Due to the exclusion of MHC class II
`genes from the sequencing panel in a subset of MI-ONCOSEQ
`samples, we were not statistically powered to investigate selec-
`tion of MHC class II genes in metastatic cohorts.
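
Fisher's method, used above to build the combined score, pools independent p-values through the statistic -2 Σ ln(p), which follows a chi-squared distribution with 2k degrees of freedom for k p-values. The sketch below is a generic implementation with hypothetical per-cohort p-values. How CBaSE output is fed into the score in the paper is not specified in this section, so that part is an assumption.

```python
import math

def chi2_sf_even_df(x: float, df: int) -> float:
    """Upper-tail probability of a chi-squared variable with even degrees of freedom."""
    if df <= 0 or df % 2:
        raise ValueError("df must be a positive even integer")
    half = x / 2.0
    term, total = 1.0, 1.0
    # Closed form: exp(-x/2) * sum_{i < df/2} (x/2)^i / i!
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total

def fisher_combined(pvalues):
    """Return (statistic, combined p-value) for a list of independent p-values."""
    if not pvalues or any(not 0.0 < p <= 1.0 for p in pvalues):
        raise ValueError("p-values must lie in (0, 1]")
    statistic = -2.0 * sum(math.log(p) for p in pvalues)
    return statistic, chi2_sf_even_df(statistic, 2 * len(pvalues))

# Hypothetical per-cohort p-values for positive selection of one gene set.
cohort_pvalues = [0.001, 0.20, 0.03, 0.50, 0.008]
stat, combined_p = fisher_combined(cohort_pvalues)
print(f"statistic = {stat:.2f}, combined p = {combined_p:.3g}")
```

Because the statistic sums evidence across cohorts, a gene set can score highly from several moderately small p-values even when no single cohort is significant, which is why it complements the count of significantly mutated cohorts.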
`We next looked at the clonality of mutations within the 6 TCGA
`cohorts (COAD-MSI, STAD-MSI, DLBC, CESC, HNSC, LUSC)
`reported to be significantly mutated by CBaSE. We show that
`the majority of mutations in HLA-A (111/164, 68%) and HLA-B
`(132/179, 74%) within these cohorts have a cancer cell fraction
`(CCF) >0.7, consistent with the variants providing a survival
`advantage followed by a clonal sweep (Figure 3E). In tumors
`with multiple hits, we see that in each of these cohorts, the ma-
`jority (63%–100%) of cases have at least one clonal variant
`(Figures S4A and S4B). When there is at least one leading clonal
`mutation, we see that co-occurring MHC class I mutations are
`also primarily clonal in the DLBC (80%) and HNSC (88%) co-
`horts, while in the remaining cohorts, we observe a mix of clonal
`and subclonal mutations (Figures S4A and S4C). In cohorts
`showing no evidence of positive selection, the proportion of
`clonal mutations was significantly lower in both HLA-A and
`HLA-B (Figure 3E), consistent with these being mostly subclonal
`passenger mutations. In both groups, HLA-C is primarily subclo-
`nal, indicating that mutations in this gene may not provide as
`much survival benefit, consistent with our earlier finding that
`HLA-C is less frequently mutated than HLA-A and HLA-B
`(Figure 2B).
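
The clonality call above is a simple threshold on cancer cell fraction. As a short sketch (names are illustrative and the example CCF values are hypothetical), the classification and the reported HLA-A and HLA-B percentages can be reproduced as follows:

```python
CLONAL_CCF_THRESHOLD = 0.7  # mutations with CCF above this are treated as clonal

def is_clonal(ccf: float) -> bool:
    """Classify one mutation as clonal using the CCF > 0.7 rule from the text."""
    return ccf > CLONAL_CCF_THRESHOLD

def clonal_fraction(ccfs) -> float:
    """Fraction of mutations in a list of CCF values that are clonal."""
    if not ccfs:
        raise ValueError("need at least one CCF value")
    return sum(is_clonal(c) for c in ccfs) / len(ccfs)

# Hypothetical CCF values for the mutations of one cohort.
example_ccfs = [0.95, 0.82, 0.40, 0.71, 0.10]
example_fraction = clonal_fraction(example_ccfs)

# Counts reported in the text for the positively selected TCGA cohorts.
hla_a_percent = round(100 * 111 / 164)
hla_b_percent = round(100 * 132 / 179)
print(example_fraction, hla_a_percent, hla_b_percent)
```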
`
`Impact of tumor mutation burden on MHC class I and
`class II mutation frequency
`To investigate the association between tumor mutational burden
`(TMB) and MHC mutations, we compared the local TMB within
`the MHC genes to the genome-wide TMB for each cancer
`cohort. As TMB increases, we expect the number of passenger
`mutations in a gene to increase stochastically. However, as TMB
`increases, neoantigen burden also increases, and we would
`expect increased selective pressures for LOF MHC mutations.
`We therefore expect all cancer types to show a positive associ-
`ation between TMB and MHC mutations, but in cohorts with sig-
`nificant evidence of positive selection, this increase should be
`elevated due to the added effect of both TMB- and neoanti-
`gen-induced selective pressures. We show this to be the case,
`with significantly mutated cohorts having a higher local TMB
`within the MHC genes than other cohorts of similar global TMB
`(Figure S4D).
`We originally hypothesized that somatic loss of MHC class II
`should mirror that of MHC class I given that both have been
`shown to promote anti-tumor immune responses. However,
`there was no association at the cohort level between MHC class
`I mutations and MHC class II mutations after controlling for TMB
`(Figures S4E–S4G). While MHC class I mutations appeared to be
`most prevalent in cancer types with high TMB, MHC class II mu-
`tations were frequently increased in low-TMB cancers with few
`MHC class I mutations.
`
`Functional consequences of MHC class I and class II
`mutations
`We next characterized the distributions of mutation functional
`consequences in cohorts with and without evidence of positive
`selection. We constructed an approximately neutral model by
`looking at the distribution of functional consequences across
`2.6 million mutations called from the entirety of the TCGA, the
`overwhelming majority of which are known to be passengers51
`(Figure 3F, “TCGA”). MHC class I mutations within cohorts
`showing no evidence of positive selection showed a conse-
`quence distribution nearly identical to that of the neutral model
`(Figure 3F, “unselected”), suggesting that these mutations are
`primarily passengers. However, in each of the 8 cancer types
`that did show positive selection, there was a significant difference
`in consequence distributions when compared to the TCGA-
`derived neutral model (chi-squared tests, p < 1e-3 to 1e-16,
`BH-adj.). Consistent with the MHC’s role as a tumor suppressor,
`this deviation was caused by an increase in truncating mutations,
`which accounted for more than 40% of mutations in most co-
`horts, as compared with the expected neutral rate of 12%.
`The DLBC, CESC, and HNSC cohorts all have a high proportion
`of stop gains (46%, 32%, and 28%, respectively) within the
`MHC class I genes, accounting for 56% (60/108) of all observed
`stop gains despite only comprising 8% (792/10,001) of TCGA pa-
`tients (Figure S4H). Notably, frameshift mutations in MHC class II
`were rare even in MSI tumors but were unexpectedly common in
`some MSS tumors including glioblastoma (GBM), ovarian cancer
`(OV), and liver hepatocellular carcinoma (LIHC). These cohorts
`were also depleted of synonymous mutations (Figure 3G).
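
One way to compare a cohort's consequence distribution with the neutral model, as in the chi-squared tests above, is a goodness-of-fit test against the neutral proportions. The sketch below assumes that framing. All counts and proportions are hypothetical except the 12% neutral truncating rate quoted in the text, and the function names are illustrative.

```python
import math

def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability P(X >= x) for a chi-squared variable with df degrees of freedom."""
    if df < 1:
        raise ValueError("df must be at least 1")
    if x <= 0.0:
        return 1.0
    y = x / 2.0
    # Start from Q(1/2, y) = erfc(sqrt(y)) for odd df or Q(1, y) = exp(-y) for even df,
    # then apply Q(a + 1, y) = Q(a, y) + y**a * exp(-y) / Gamma(a + 1).
    if df % 2:
        a, q = 0.5, math.erfc(math.sqrt(y))
    else:
        a, q = 1.0, math.exp(-y)
    while a < df / 2.0 - 1e-9:
        q += math.exp(a * math.log(y) - y - math.lgamma(a + 1.0))
        a += 1.0
    return min(1.0, q)

def consequence_test(observed: dict, neutral: dict):
    """Chi-squared goodness-of-fit of observed counts against neutral proportions."""
    if set(observed) != set(neutral):
        raise ValueError("categories must match")
    total = sum(observed.values())
    stat = sum((observed[c] - total * neutral[c]) ** 2 / (total * neutral[c])
               for c in observed)
    return stat, chi2_sf(stat, len(observed) - 1)

# Hypothetical neutral proportions; only the 12% truncating rate comes from the text.
neutral = {"missense": 0.64, "synonymous": 0.24, "truncating": 0.12}
# Hypothetical counts for one positively selected cohort.
observed = {"missense": 42, "synonymous": 9, "truncating": 45}
stat, p = consequence_test(observed, neutral)
print(f"chi-squared = {stat:.1f}, p = {p:.3g}")
```

With small cohorts, categories whose expected count falls below about 5 would usually be merged before applying the chi-squared approximation.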
`Similar to the TCGA DLBC cohort, the refractory M-DLBC
`cohort showed both a high mutation rate and a bias toward trun-
`cating mutations in the MHC class I genes (35%). Other non-
`DLBC refractory lymphomas (M-LYM) had a lower MHC class I
`mutation rate but still had a bias toward truncating mutations
`(65%) (Figures 3F and S4I). The lymphomas alone account for
`52% of stop gains observed across all MI-ONCOSEQ cohorts
`(11/21) despite containing only 7% of patients.
`
`Figure 3. Evidence for strong positive selection and deleteriousness of MHC somatic mutations
`(A and B) Top 30 genes showing evidence of positive selection in primary (A) or metastatic (B) cancers by CBaSE by number of cohorts with significant evidence.
`(C and D) Comparison of the number of cohorts significantly mutated vs. pan-cancer metastatic Fpos for protein-coding genes in primary (C) or metastatic
`(D) cancers as measured by CBaSE. Vertical dashed lines show the cutoff for the top 0.5% of genes by Fpos.
`(E) Cancer cell fraction (CCF) of MHC class I variants in TCGA cohorts showing significant evidence of positive selection compared with all other cohorts. Vertical
`line shows 70% CCF, above which mutations are considered clonal. ***p < 0.001, Wilcoxon rank sum test after BH correction.
`(F) Proportion of functional consequences observed in various groups: “TCGA,” 2,600,654 pan-cancer mutations from TCGA, an approx. neutral model;
`“unselected,” cohorts showing no evidence of positive selection; others, cohorts showing evidence of positive selection (n = 21–96 mutations within positively
`selected cohorts). “TCGA” and “unselected” are average frequencies across cohorts, with error bars showing SEM.
`(G) Functional consequences of MHC class II mutations in select primary cohorts. Mutational consequence distribution of known OGs (KRAS, PIK3CA, IDH1,
`CTNNB1, FOXA1, BRAF, AKT1, EGFR) and TSs (TP53, RB1, PTEN, APC, BRCA2, VHL) are shown.
`(H) Sample-level co-occurrence of mutations in either the MHC class I or APM genes within positively selected cohorts. Percentage values show percentage of
`mutated samples containing a hit in both the MHC class I and APM.
`
`The HNSC,
`CESC, and LUSC cohorts in TCGA are all squamous cell
`carcinomas that correspond to a single cohort M-SQCC within
`MI-ONCOSEQ. Similar to what was observed across the
`primary squamous cancers, the pan-squamous M-SQCC (squa-
`mous cell carcinoma) cohort showed an overall elevated muta-
`tion rate and a high rate of LOF mutations when considering
`frameshifts, stop gains, and splice region variants (35%; Figure S4J).
`Metastatic MSI tumors are underrepresented in
`MI-ONCOSEQ, preventing any comparison with primary MSI
`tumors. Altogether, these data reveal striking differences in
`mutation frequency and deleteriousness not only across cancer
`types but also between MHC class I and class II genes.
`
`Patterns of mutual exclusivity and independence of
`MHC mutations
`We next looked at the relationships between deleterious mutations
`in the MHC class I genes and the APM (antigen presentation ma-
`chinery)52 (Figure 3H). Other genes linked to MHC class I have
`been identified as cancer driver genes (e.g., beta-2 microglobulin
`[B2M]53), and it has been shown that driver gene
