Supplementary Data for "CHASMplus reveals the scope of somatic missense mutations driving human cancers"
Dataset posted on 19.07.2019 by Collin Tokheim, Rachel Karchin
Large-scale cancer sequencing studies of patient cohorts have statistically implicated many cancer driver genes, with a long tail of infrequently mutated genes. Here we present CHASMplus, a computational method to predict driver missense mutations, which is uniquely powered to identify rare driver mutations within the long tail. We show that it substantially outperforms comparable methods across a wide variety of benchmark sets. Applied to 8,657 samples across 32 cancer types, CHASMplus identifies over 4,000 unique driver mutations in 240 genes, further distinguished by their specific cancer types. Our results support a prominent emerging role for rare driver mutations, with substantial variability in the frequency spectrum of drivers across cancer types. The trajectory of driver discovery may already be effectively saturated for certain cancer types, a finding with policy implications for future sequencing. As a resource to handle newly observed rare driver mutations, we systematically score every possible missense mutation across the genome.
With the ever-growing pace of DNA sequencing of human tumors, the total number of detected mutations in cancer continues to rise at an accelerating rate. However, only a few mutations in each tumor may actually “drive” the growth of cancer, some of which can have value for diagnostic, prognostic, or therapeutic purposes. Based on a new rigorous statistical analysis of The Cancer Genome Atlas (TCGA), we find a prominent emerging role for rare missense mutations predicted to be “drivers” of cancer. This may have implications for genome-driven precision oncology, since rare driver mutations that are putatively actionable could be newly observed in a patient, thus requiring personalized modeling and assessment. To extend beyond the TCGA, we provide a systematic resource to assess such newly observed missense mutations as cancer drivers.
Lastly, we assess the driver landscape of human cancers and find that discovery for some cancer types is already approaching saturation.
Detailed results of the manuscript "CHASMplus reveals the scope of somatic missense mutations driving human cancers" are provided in the following Supplementary Tables.
Supplementary Table 1. Features used by CHASMplus.
Supplementary Table 2. Driver somatic missense mutation results from pan-cancer analysis.
Supplementary Table 3. Cancer type-specific driver somatic missense mutation results.
Supplementary Table 4. Subtype enrichment for driver missense mutations predicted by CHASMplus.
Supplementary Table 5. Comparison of CHASMplus to saturation mutagenesis experiments of PTEN.
Supplementary Table 6. CHASMplus results on 1,013 prostate adenocarcinoma samples (Armenia et al.).
Supplementary Table 7. Cancer type-specific driver somatic missense mutation results for skin cutaneous melanoma.