Article

Investigation of Exome-Wide Tumor Heterogeneity on Colorectal Tissue-Based Single Cells

by Nikolett Szakállas 1,2,*, Alexandra Kalmár 2, Barbara Kinga Barták 2, Zsófia Brigitta Nagy 2, Gábor Valcz 3, Tamás Richárd Linkner 2, Kristóf Róbert Rada 2, István Takács 2 and Béla Molnár 2
1 Department of Biological Physics, Faculty of Science, Eötvös Loránd University, 1053 Budapest, Hungary
2 Department of Internal Medicine and Oncology, Faculty of Medicine, Semmelweis University, 1085 Budapest, Hungary
3 HUN-REN-SU Translational Extracellular Vesicle Research Group, 1117 Budapest, Hungary
* Author to whom correspondence should be addressed.
Int. J. Mol. Sci. 2025, 26(2), 737; https://doi.org/10.3390/ijms26020737
Submission received: 25 November 2024 / Revised: 9 January 2025 / Accepted: 13 January 2025 / Published: 16 January 2025
(This article belongs to the Special Issue Molecular Findings in Colorectal Cancer)
Figure 1
The distribution of the Tumor Mutational Burden (TMB) in the NAT and CRC samples is shown on a logarithmic scale. On average, the CRC group exhibits a higher TMB, indicating a greater number of somatic variations compared to the NAT group.
Figure 2
Summary plots of the (a) NEG, (b) NAT, and (c) CRC samples. Variant classification distribution: the X-axis represents the number of variants, and the Y-axis represents the variant type categories. Variant type plot: the X-axis represents the number of variants, and the Y-axis represents the variant type categories and SNV class plot. Variants per sample plot: the X-axis represents the ID of samples, and the Y-axis represents the number of variants. Variant classification summary: the X-axis represents the variant classifications, and the Y-axis represents the number of variants. Top 10 mutated genes: the X-axis represents the number of mutations, and the Y-axis lists the top 10 mutated genes.
Figure 3
Cross-cancer genome mutation patterns may serve as a proxy to identify positive (collaboration) or negative (synthetic lethal) epistatic relationships between recurrently mutated driver genes. The epistatic relationship between two driver genes may be inferred from cross-cancer mutation patterns, whereby co-occurrence may indicate a synergistic interaction in promoting tumorigenesis. By contrast, mutually exclusive driver genes may negatively impact tumorigenesis when mutated jointly. Here, mutually exclusive and co-occurring gene pairs are presented in a triangular matrix per tissue group: (a) NEG, (b) NAT, and (c) CRC. Bluish-green indicates a tendency toward co-occurrence, whereas brown indicates a tendency towards mutual exclusivity. The intensity of the greenish regions corresponds to the significance of the relationship between genes, and the star symbol denotes a higher (p < 0.01) significance than the dot (p < 0.05).
Figure 4
The tumor heterogeneity patterns of samples of the (a) NEG, (b) NAT, and (c) CRC groups. The X-axis represents the ID of samples, and the Y-axis represents the total number of mutations detected. The number of detected mutations per sample is marked above the corresponding columns, and the height of the columns is proportional to the number of detected variants. Different colors correspond to different genes. The gene-color coding is illustrated on the top-right corners of the figures.
Figure 5
Comparative plots regarding the sequencing input samples: (a) single-cell vs. bulk sequencing, (b) cfDNA vs. bulk sequencing, and (c) single-cell vs. cfDNA sequencing. The X-axis represents the genes, and the Y-axis represents the number of detected mutations on a logarithmic scale. The different sequencing methods are represented by different colors, where red is associated with bulk, blue is associated with single-cell, and green is associated with cfDNA sequencing.
Figure 6
The efficiency of the different sequencing methods regarding the detection of mutations. The X-axis represents the different sequencing methods: (P) cfDNA, (B) bulk, and (S-C) single-cell sequencing. The Y-axis represents the number of detected variants. The height of the columns is proportional to the amount of detected variants. Different colors correspond to different genes. The illustrated specific genes are characteristic of non-hypermutated colon tumors: APC, TTN, TP53, KRAS, MUC16, MUC5B, PIK3CA, BRAF, SOX9, RYR1, RYR2, RYR3, FBXW7, ARID1A, COL5A1, COL6A3, KIAA019, and PCDH17. The color-coding of genes is illustrated on the top-right corners of the figure.
Figure A1
The comparative oncomutational plot of the NAT and CRC groups. The X-axis represents the occurrence rate of the mutations of the genes, and the Y-axis represents the examined genes. The left-side of the figure corresponds to the CRC group, and the right-side illustrates the findings regarding the NAT group.

Abstract

The progression of colorectal cancer is strongly influenced by environmental and genetic conditions. One of the key factors is tumor heterogeneity which is extensively studied by cfDNA and bulk sequencing methods; however, we lack knowledge regarding its effects at the single-cell level. Motivated by this, we aimed to employ an end-to-end single-cell sequencing workflow from tissue-derived sample isolation to exome sequencing. Our main goal was to investigate the heterogeneity patterns by laser microdissecting samples from different locations of a tissue slide. Moreover, by studying healthy colon control, tumor-associated normal, and colorectal cancer tissues, we explored tissue-specific heterogeneity motifs. For completeness, we also compared the performance of the whole-exome bulk, cfDNA, and single-cell sequencing methods based on variation at the level of a single nucleotide.

1. Introduction

Bulk DNA and RNA sequencing techniques are suitable for exploring the tumor microenvironment of cancer tissue. Several studies [1,2,3,4] indicate that both cell-extrinsic characteristics (such as the immunosuppressive tumor microenvironment, including suppressor cells and macrophages) and intracellular characteristics (epigenetic alterations, cancer cell metabolism, and oncogenic signaling) can be examined in detail. Through single-cell sequencing, a systematic transcriptional atlas can be built to delineate molecular and cellular heterogeneity as well as immune infiltration. In addition, it allows us to identify novel cell lineages and unique interactions between tumor cells and the surrounding microenvironment. Moreover, single-cell studies have revealed cancer-initiating and progenitor mutations in driver genes [5]. Combining these findings with multiregional tissue sampling can give a detailed picture of the tumor’s microenvironment, particularly concerning intratumoral heterogeneity [6].
Genomic instability and enhanced clonal evolution under different environmental conditions are general causes of tumor heterogeneity. Intratumor heterogeneity refers to the diversity between tumor cells in a single patient [7]. It can manifest as spatial heterogeneity, which describes the distribution of genetically diverse tumor subpopulations across different tumor areas, and as temporal heterogeneity, which covers dynamic variations in the genetic diversity of an individual tumor over time [8]. Intertumor heterogeneity appears when multiple patients harbor tumors of the same histological type and is believed to result from patient-specific factors, including germline genetic variations, differences in the somatic mutation profile, and environmental conditions. A significant consequence is that, despite dramatic initial responses, almost all cancer types eventually resist targeted therapies due to intratumoral heterogeneity [9].
The objective of tissue isolation is to focus on specific, predetermined areas of tissues—such as transitional, differentiated, and invasive regions—that will later be subjected to various molecular analyses. Several techniques can be employed for the dissection of tissue samples, including bulk scraping, manual macrodissection, and laser capture microdissection. Choosing the most appropriate method is determined by the volume of interest (microdissection or macrodissection) and the features of the future analyses executed. Macrodissection can be performed without the use of a microscope or specialized equipment, typically targeting tumors that can be clearly defined without magnification. In contrast, during tissue microdissection, a microscope and/or other specialized equipment are also involved, such as a micromanipulator for movement, a laser source for isolation, or an adhesive tool for sample collection. Microdissectors typically operate on heterogeneous/mixed tissues, as well as on small well-defined tissue regions and functional units [10]. Regarding the origin of the target samples, formalin-fixed paraffin-embedded (FFPE) block tissues and fresh-frozen (FF) biopsies can be distinguished. Both types are suitable for macro- and microdissection procedures; however, fresh frozen tissues are used due to their potential superiority in preserving DNA and the absence of deparaffinization steps.
The process of drug resistance in various cancers, arising from tumor heterogeneity, can be observed and analyzed using single-cell RNA (scRNA) and single-cell DNA (scDNA) sequencing techniques [11,12]. Single-cell sequencing methods exploit the potential of micromanipulation and microdissection techniques. The method involves amplifying either the whole genome or specific regions of interest, constructing sequencing libraries, and employing next-generation sequencing technologies. The differences between cells are even greater for RNA, as it is more vulnerable to the influence of micro- and macroenvironmental stimuli [13]. scRNAseq provides an unprecedented opportunity and has demonstrated its utility in dissecting intratumor heterogeneity at a single-cell resolution.
Our current understanding of intratumoral heterogeneity in cancers is largely derived from the analysis of bulk tumor specimens; however, most bulk tumor specimens consist of a mixture of non-malignant cells and various subpopulations of cancer cells. Since a single tissue lesion cannot adequately represent the complex metastatic nature of many cancers, the ideal approach would involve collecting biopsy samples from multiple locations within the tumor. However, this is not a patient-friendly solution. Currently, cell-free DNA (cfDNA) profiling allows the detection of more heterogeneous driver alterations in a patient relative to single-tissue biopsies [14], but the high-resolution mutational profile of each individual tumor cell remains to be explored. Combining single-cell sequencing with multi-region, morphology-based sampling can be an informative investigational strategy. Sampling multiple regions within a single lesion makes it easier to determine the extent of spatial heterogeneity within an individual tumor. The molecular makeup of the cancer cells that predominate at different sites can differ due to the variable influences of microenvironment-related factors and site-specific stressors.
As part of Hungary’s Oncogenome Program in 2022, Kalmár et al. performed a large-volume study focusing on the whole-exome sequencing of colon tissues and matching cfDNA samples [15]. They found that the genes that were the most frequently mutated were APC, TP53, TTN, and KRAS in CRC tissue in the Hungarian cohort analyzed. In terms of the cfDNA WES results, tumor somatic variants were found in 6/33 CRC cases. Additionally, targeted panel sequencing was carried out on a subset of cfDNA samples. This revealed somatic variants in 8 of the 12 enrolled patients and identified 12 out of 20 tumor somatic variants within the targeted regions. In contrast, WES recovered only 20% of variants in the same targeted regions from the cfDNA of these patients. These contradictory results encouraged us to perform the whole-exome single-cell sequencing of samples from the same patients that had previously undergone WES plasma and bulk sequencing. Having data from the different sequencing methods facilitated the comparison of the results.
The presence of heterogeneity in malignant tumors can induce specific resistance to therapies. Although many research articles investigated this phenomenon, our methodological resources still lack tumor morphology-based heterogeneity-targeted single-cell techniques. The combination of laser microdissection and single-cell sequencing has been applied many times in studies in neuroscience [16,17,18]; breast, ovarian, and liver cancer [19,20,21]; and endometriosis [22]. These research areas mainly focused on single-cell RNA sequencing; however, whole-exome and whole-genome single-cell DNA sequencing on laser microdissected samples is rarely applied [18,22]. In addition, only a few examples corresponding to these types of investigations were presented in paraffin-embedded or fresh frozen tissues.
Encouraged by our findings, we conducted a study on colorectal cancer samples derived from tissue, which were collected using laser ablation and analyzed through single-cell whole-exome DNA sequencing. We aimed to highlight the benefits of single-cell sequencing and multi-region sampling to demonstrate the heterogeneity observed in our single-cell sequencing approach. We sequenced and examined samples from various locations within dedicated colon tissue sections to characterize tumor heterogeneity at the single-cell level. The selection of isolated samples was based on their morphological structures. Additionally, our goal was to identify characteristics related to variation and tumor heterogeneity across different sequencing methods. To achieve this, we performed both whole-exome bulk sequencing and circulating free DNA (cfDNA) sequencing of the solid and blood samples from the same patient, respectively.

2. Results

Based on the generated Dragen Enrichment reports for single-cell samples, we gathered general statistics regarding the number of bases, variants, depth of coverage, and the average efficiency of read and base reference mapping. The results showed mapping efficiencies of 29.12% and 27.59% for the NEG group, 58.06% and 57.62% for the NAT group, and 54.9% and 54.63% for the CRC group, respectively. The mean coverage depths, along with their corresponding standard deviations, for the different groups were as follows: NEG: 24.84 ± 37.94; NAT: 26.37 ± 39.94; and CRC: 31.53 ± 44.91. These coverage values are summarized in Table 1.
Figure 1 illustrates the distribution of tumor mutational burden (TMB) values. The boxplots of the per-sample TMBs of somatic variants are presented on a logarithmic scale. CRC samples had the higher median value, 7.26 somatic alterations per megabase (Mb), whereas 5.02 somatic variants per Mb were detected for the NAT samples.
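The TMB values above are variant counts normalized by the size of the sequenced target. The following is a minimal sketch of that calculation, not the pipeline used in this study: the function name, the assumed capture size (`EXOME_SIZE_MB`), and the per-sample counts are all hypothetical placeholders.

```python
from statistics import median


def tumor_mutational_burden(n_somatic_variants: int, target_size_mb: float) -> float:
    """Return somatic variants per megabase (Mb) of sequenced target."""
    if target_size_mb <= 0:
        raise ValueError("target_size_mb must be positive")
    return n_somatic_variants / target_size_mb


# Assumption: size of the exome capture in Mb (placeholder, not from the article).
EXOME_SIZE_MB = 35.0

# Hypothetical per-sample somatic variant counts (not data from this study).
example_counts = [70, 140, 350]

example_tmb = [tumor_mutational_burden(n, EXOME_SIZE_MB) for n in example_counts]
median_tmb = median(example_tmb)
```

The group-level value is then the median of the per-sample TMBs, which is the statistic a boxplot places at its center line.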
The median number of mutations in the negative control samples was found to be 1103. After excluding the germline variants, the median number of mutations for the CRC and NAT groups decreased much more than for the NEG due to the applied somatic filter (Figure 2). The median number of mutations for NAT and CRC was 136 and 182, respectively, as presented in Figure 2. Detailed variant data per sample can be seen in Table A1.
To understand the relationship between genes present in the different tumor-related regions, we explored the mutually exclusive and co-occurring events using the somaticInteractions function of maftools. A pairwise Fisher’s exact test was performed to detect mutually exclusive and co-occurring events, and the result is presented in Figure 3. Our findings revealed that different genes are involved in co-occurrence and exclusion across the NEG, NAT, and CRC regions; in CRC, certain genes co-occur at higher rates while mutual exclusivity becomes less pronounced. To illustrate the increase in the rate of co-occurrence, we highlight the genes RYR2 and TNR as examples. Their co-occurrence is more significant in CRC (Figure 3c, p < 0.01) than in NAT (Figure 3b, p < 0.05).
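A pairwise Fisher’s exact test of this kind can be illustrated with a short, self-contained sketch. It is not the maftools implementation; the helper names and the sample sets are hypothetical. For a gene pair, the samples are arranged in a 2×2 table (both mutated, only the first, only the second, neither), and the two-sided p-value sums the probabilities of all tables that are no more likely than the observed one.

```python
from math import comb


def pair_table(mutated_a: set, mutated_b: set, all_samples: set) -> tuple:
    """Build the 2x2 counts (both, only A, only B, neither) for a gene pair."""
    both = len(mutated_a & mutated_b)
    only_a = len(mutated_a - mutated_b)
    only_b = len(mutated_b - mutated_a)
    neither = len(all_samples - mutated_a - mutated_b)
    return both, only_a, only_b, neither


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test p-value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of seeing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Sum all tables at most as probable as the observed one.
    total = sum(p for p in (prob(x) for x in range(lo, hi + 1))
                if p <= p_obs * (1 + 1e-7))
    return min(total, 1.0)


def tendency(a: int, b: int, c: int, d: int) -> str:
    """Direction of the association, judged from the cross-products."""
    if a * d > b * c:
        return "co-occurrence"
    if a * d < b * c:
        return "mutual exclusivity"
    return "none"


# Hypothetical example: eight samples and two genes (not data from this study).
samples = {f"S{i}" for i in range(1, 9)}
gene_x = {"S1", "S2", "S3", "S4"}
gene_y = {"S1", "S2", "S3", "S5"}

table = pair_table(gene_x, gene_y, samples)
p_value = fisher_exact_two_sided(*table)
direction = tendency(*table)
```

With only a handful of samples per group, even a visibly skewed table can yield a large p-value, which is worth keeping in mind when reading the significance marks in Figure 3.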

2.1. Investigation of Tumor Heterogeneity in Single-Cell Sequenced Samples

Investigations based on single-cell heterogeneity began after identifying variants corresponding to specific groups. For the negative control, only germline variants were analyzed, while both germline and somatic variants were assessed for colorectal cancer (CRC) and normal adjacent tissue (NAT) samples. A custom Python script was used to apply certain restrictions to the variant dataset. This method focused on collecting mutations in several key genes associated with non-hypermutated colon tumors, including APC, TTN, TP53, KRAS, MUC16, MUC5B, PIK3CA, BRAF, SOX9, RYR1, RYR2, RYR3, FBXW7, ARID1A, COL5A1, COL6A3, KIAA019, and PCDH17 for all locations cumulatively, and for the distinct locations one by one. The characteristics of this investigation for each sample can be seen in Figure 4. The different NEG, NAT, and CRC target regions provided similar mutation profiles but showed their uniqueness in certain aspects, for example, in the case of the mutated genes and the number of mutations, suggesting the presence of some sort of heterogeneity. Mutation profiles were derived from the different presence rates of the listed genes.
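The gene-restricted counting described above can be sketched as follows. This is a minimal illustration, not the custom script used in the study: the input format (pairs of sample ID and gene symbol), the function names, and the example records are assumptions, and the gene symbols are copied as spelled in the text.

```python
from collections import Counter, defaultdict

# Gene list as given in the text (symbols copied as spelled there).
GENES_OF_INTEREST = {
    "APC", "TTN", "TP53", "KRAS", "MUC16", "MUC5B", "PIK3CA", "BRAF", "SOX9",
    "RYR1", "RYR2", "RYR3", "FBXW7", "ARID1A", "COL5A1", "COL6A3", "KIAA019",
    "PCDH17",
}


def count_mutations_by_location(variants, genes=GENES_OF_INTEREST):
    """Count mutations per gene for each location (sample).

    variants: iterable of (sample_id, gene_symbol) pairs, one per variant.
    Returns a dict mapping sample_id to a Counter of gene -> mutation count.
    """
    per_location = defaultdict(Counter)
    for sample_id, gene in variants:
        if gene in genes:
            per_location[sample_id][gene] += 1
    return dict(per_location)


def cumulative_counts(per_location):
    """Sum the per-location counters into one cumulative profile."""
    total = Counter()
    for counts in per_location.values():
        total.update(counts)
    return total


# Hypothetical variant records (not data from this study).
example_variants = [
    ("CRC_1", "APC"), ("CRC_1", "TTN"), ("CRC_1", "TTN"),
    ("CRC_2", "KRAS"), ("CRC_2", "GENE_X"),
]

by_location = count_mutations_by_location(example_variants)
overall = cumulative_counts(by_location)
```

Keeping the per-location counters separate from the cumulative profile mirrors the two views described in the text: all locations together and each location one by one.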
We also aimed to demonstrate the variational differences between the NAT and CRC groups from the same patient. Therefore, we created a smaller database containing fewer genes and fewer samples, with restrictions on the coverage values. In this examination, we only used NAT samples with coverage greater than 10X and CRC samples with coverage greater than 1X. Accordingly, we excluded five CRC samples with coverage depths <1X and six NAT samples with coverage depths <10X, leaving seven CRC and six NAT samples for evaluation. The database is confined to the list of cancer-related genes that are consistently mutated in all samples as follows: TTN, APC, KRAS, TP53, PIK3CA, FBXW7, and SOX9. Table 2 shows that mutations in these genes appeared at different rates in the individual samples of the two groups: APC in 6/6 (NAT) and 6/7 (CRC), KRAS in 5/6 (NAT) and 5/7 (CRC), TP53 in 5/6 (NAT) and 4/7 (CRC), PIK3CA in 3/6 (NAT) and 5/7 (CRC), FBXW7 in 5/6 (NAT) and 5/7 (CRC), and SOX9 in 2/6 (NAT) and 4/7 (CRC) samples. The TTN gene was the exception, being mutated in 100% of the samples (6/6 (NAT) and 7/7 (CRC)).
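The coverage restriction and the per-gene presence rates can be illustrated with a small sketch. The data layout, the function names, and the example samples are hypothetical; only the thresholds (greater than 10X for NAT, greater than 1X for CRC) come from the text.

```python
def filter_by_coverage(samples: dict, min_coverage: float) -> dict:
    """Keep samples whose mean coverage is strictly greater than min_coverage."""
    return {sid: info for sid, info in samples.items()
            if info["coverage"] > min_coverage}


def gene_presence(samples: dict, gene: str) -> tuple:
    """Return (samples carrying a mutation in gene, total samples)."""
    hits = sum(1 for info in samples.values() if gene in info["mutated_genes"])
    return hits, len(samples)


# Hypothetical NAT samples (not data from this study).
nat_samples = {
    "NAT_1": {"coverage": 32.0, "mutated_genes": {"TTN", "APC"}},
    "NAT_2": {"coverage": 4.5, "mutated_genes": {"TTN"}},
    "NAT_3": {"coverage": 15.2, "mutated_genes": {"TTN", "KRAS"}},
}

NAT_MIN_COVERAGE = 10.0  # threshold stated in the text for NAT samples

nat_kept = filter_by_coverage(nat_samples, NAT_MIN_COVERAGE)
apc_hits, nat_total = gene_presence(nat_kept, "APC")
```

Reporting the rate as a pair such as 1/2 rather than a bare percentage keeps the differing group sizes visible, as in the 6/6 versus 6/7 notation above.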

2.2. Comparison of Different Input Samples

We also summarized the general statistics of the short-read sequencing results of the bulk exome and the matched plasma samples. According to our results, we found many more variants in the exonic regions of 12 different single-cell locations compared to a whole bulk sample (Figure 5a). The number of mutations detected in the cfDNA results fell between the bulk (Figure 5b) and the single-cell sequencing results (Figure 5c). Moreover, we determined the mutational profiles for the 12 individual single-cell locations, which are presented in Figure 4 for the NEG, NAT, and CRC groups, respectively.
The performance of single-cell sequencing is presented in Figure 6, which shows that it detects many more distinct variants than the other two methods. Summarizing the plasma, bulk, and single-cell results, the methods can be ranked by the number of detected variants: bulk sequencing detected the fewest mutations, the cfDNA method came second, and single-cell sequencing detected the most. Therefore, the multi-region single-cell method can detect many more variants than whole-exome bulk and cfDNA sequencing from the same biological samples and thus provides a powerful method for investigating tumor heterogeneity. This finding is important for future investigations of unique tumor-specific mutations.
To illustrate our findings with a clinically relevant example, we drew on scientific articles focusing on cases characterized by resistance to anti-EGFR, antiangiogenic, 5-fluorouracil (5-FU), chemo, and cetuximab therapies [23,24,25,26,27]. From these, we constructed a list of genes that are mutated in patients with therapy-resistant colorectal cancer and provided the corresponding mutational attributes. These genes were the following: APC, KRAS, TP53, PIK3CA, FBXW7, SMAD4, NRAS, EGFR, BRAF, RNF43, and ARID1A. Combined KRAS and TP53 mutations have been reported as main factors in chemotherapy resistance [26]. Furthermore, since the G12 and G13 subtypes of KRAS mutations are mainly present among colorectal patients with poor survival [28], our objective was to examine their occurrence in our sample by whole-exome sequencing. We had previously confirmed by ddPCR analysis that the sample investigated in our study does not contain the KRAS G12/G13 mutations. By single-cell, bulk, and cfDNA sequencing, we were also able to confirm the absence of these KRAS G12/G13 mutations. Encouraged by this, we performed an additional mutational analysis on the therapy-resistance genes listed above and were able to detect several likely pathogenic alterations in them. Interestingly, the different methods are distinguished in the non-benign non-analogous mutation lists, which are demonstrated in Table 3. To summarize our findings, single-cell sequencing identified the most alterations (250) compared to the bulk sequencing results (94), consistent with our previous results. In more detail, 219 and 63 alterations were revealed only by single-cell and only by bulk sequencing, respectively, and 31 mutations were found by all methods (219 + 31 = 250 and 63 + 31 = 94). Further analysis of the detected mutations listed in Table A2 without a ClinVar annotation is outside of the scope of this publication.
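The counts in this paragraph fit together if 219 and 63 are read as method-specific alterations and 31 as the shared ones. The sketch below shows that set arithmetic. The variant identifiers are synthetic placeholders chosen only to reproduce the reported counts, the function name is hypothetical, and the comparison is simplified to two methods.

```python
def compare_methods(single_cell: set, bulk: set) -> dict:
    """Split two variant sets into method-specific and shared parts."""
    return {
        "single_cell_only": len(single_cell - bulk),
        "bulk_only": len(bulk - single_cell),
        "shared": len(single_cell & bulk),
        "single_cell_total": len(single_cell),
        "bulk_total": len(bulk),
    }


# Synthetic variant IDs sized to match the counts reported in the text:
# 250 single-cell alterations, 94 bulk alterations, 31 in common.
single_cell_variants = {f"var_{i}" for i in range(250)}
bulk_variants = {f"var_{i}" for i in range(219, 313)}

summary = compare_methods(single_cell_variants, bulk_variants)
```

In practice each variant would be keyed by something like chromosome, position, reference allele, and alternate allele, so that the same alteration called by two methods maps to the same set element.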

3. Discussion

The primary drawback of the current scDNA- and scRNA-seq methods is the use of liquid-based samples, which do not provide any morphological information due to their bloodstream origin. To enhance the precision and efficiency of single-cell sequencing, it is necessary to obtain exact morphological information, which is present on tissue slides, and thus, we can select the specific areas of interest. This can be achieved by punching tissue blocks (PTB) or laser capture microdissection (LCM) methods. Studies have shown that LCM is more favorable for analyzing and comparing morphologically distinct patterns, including single cells or clusters of fewer than five cancer cells ahead of the invasive front [22]. The combination of LCM and two-dimensional gel electrophoresis reveals the proteomic heterogeneity in cancer surgical specimens. By distinguishing tumor cells according to tissue localization, protein spots with different intensities were observed in tumor cell groups compared to normal epithelial cells [29].
Tissue heterogeneity complicates the identification of tumor markers, and the results of the proteomic analysis of the entire tissue may be considered controversial. LCM can overcome this problem by isolating individual tumor cells. However, the small number of cells obtained by LCM severely limits the required proteome coverage and biomarker discovery potential that can be achieved using conventional proteomics platforms [30]. This disadvantage can be eliminated by the capture of multiregion tissue from the same slide as we have demonstrated in our study.
The depth of the coverage of the sequencing is an important metric for measuring the quality of variant calls. In our study, the mean coverage depths and their standard deviations per group were derived as NEG: 24.84 ± 37.94; NAT: 26.37 ± 39.94; and CRC: 31.53 ± 44.91. Based on the detailed data of Table 1, it can also be considered that in addition to the deeper coverage values (≥30X), there were many inadequate values (<10X). We can consider excluding data with low coverage; however, as we dealt with the data of individual groups collectively in most of our investigations, we kept the information corresponding to them. In the cases illustrated in Figure 4, the results of individual rare-cell locations showed that samples with low coverage gave different mutational profiles than those with higher coverage. Encouraged by this, we performed a trial to exclude data with low coverage to prove that they do not significantly influence our collective results and that they only have a slight effect. The data of this examination are presented in Table 2. However, the consequence of exclusion was that the number of samples was different between the two groups; therefore, we performed simple normalization, that is, we normalized the data derived from seven samples to that of six samples. This is denoted by subscript 6 in the summation in the last row. Normalized values showed that a higher total number of mutations can be observed in the case of NAT samples compared to CRC, even in samples with a higher coverage by the collective results.
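The normalization of seven samples to six can be read as a simple linear rescaling of the group total. The sketch below assumes exactly that; the scaling rule, the function name, and the example total are assumptions for illustration, not details confirmed by the text.

```python
def normalize_total(total: float, n_samples: int, n_reference: int) -> float:
    """Rescale a group total from n_samples to n_reference samples.

    Assumes a linear rescaling: total * n_reference / n_samples.
    """
    if n_samples <= 0 or n_reference <= 0:
        raise ValueError("sample counts must be positive")
    return total * n_reference / n_samples


# Hypothetical total mutation count over seven CRC samples (not study data).
crc_total_7 = 70
crc_total_6 = normalize_total(crc_total_7, n_samples=7, n_reference=6)
```

After rescaling, the CRC total can be compared directly with the total from the six NAT samples.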
Variants with a higher occurrence rate were detected in the NEG group assigned to the reference genome. This can be explained by the fact that the germline variant calling for this group listed all deviations from the known reference, including every individual-specific benign mutation; theoretically, the presence of the well-known somatic variations is not allowed (and this does not precisely exclude their existence). As discussed in the study of H. Lee-Six et al. [31], colorectal neoplastic changes can occur in morphologically normal tissue, and their incidence tends to increase with age. Their higher occurrence involves an increased mutational burden in normal colorectal crypts with a range of 1500–15,000 alterations per Mb. Hence, the investigation of negative cancer-free tissue biopsies allows us to gain insight into the different aspects of the earliest stages of the clonal evolution of colorectal cancers, namely, the range of mutational processes, the frequency of driver mutations, and the clonal dynamics of colonic stem cells. Furthermore, another non-scientific but relatable explanation for the increased number of mutations in NEG compared to NAT and CRC can be the ‘doubled’ amount of the input NEG sample.
In Figure 5b,c, the heterogeneity patterns seem very similar for the two groups, but a deeper analysis reveals that the patterns became more diverse per sample in the case of NAT samples. In the image of the computational pattern (Figure A1), we can also observe several differences between the tissue groups. Oncogenic mutations are represented at higher rates in CRC than in NAT, especially in the genes BRAF, FBXW7, and PCDH17. Their mutation rates are zero in the NAT group, suggesting that they possibly have maintenance-like effects in tumor cells. Scientific resources have clarified that mutations in genes BRAF, FBXW7, and PCDH17 contribute to resistance to therapy against immunotherapy with EGFR inhibitor and 5-FU chemotherapy, respectively [32,33,34]. Interestingly, our results show that mutations in COL5A1 are present in NAT rather than in CRC samples, which can indicate some sort of preventive effect or may, on the contrary, serve as a marker of tumor progression. The literature reports that the overexpression of COL5A1 promotes tumor progression and metastasis and correlates with poor patient survival [35]. To strengthen our results, we performed an additional KEGG pathway analysis using the ShinyGO platform [36] on the therapy-resistance genes (APC, KRAS, TP53, PIK3CA, FBXW7, SMAD4, NRAS, EGFR, BRAF, RNF43, and ARID1A), similarly to the study by He et al. [37]. We found that many corresponding alterations are present in cancer development-related signaling pathways, such as the PI3K-AKT, Ras, Wnt, TGF-beta, p53, ErbB, mTOR, and MAPK pathways. Based on this, mutations in the oncogenes KRAS, BRAF, and NRAS and in the tumor suppressor genes APC, SMAD4, and TP53 require more attention in the context of (single-cell) tumor heterogeneity and need to be investigated in more detail, as we have done in our study.
Conventional anticancer therapies remove most cells from the tumor mass. However, the small populations that survive therapy must also be considered in the future, as they evolve adaptive resistance strategies leading to treatment failure [38]. Several mutations in genes associated with therapy resistance were identified by the different sequencing systems, as presented in Table A2. We detected 38 and 11 mutations in APC, 7 and 0 in KRAS, 22 and 13 in TP53, 27 and 10 in PIK3CA, 14 and 3 in FBXW7, 13 and 0 in SMAD4, 3 and 1 in NRAS, 50 and 35 in EGFR, 67 and 14 in BRAF, 5 and 2 in RNF43, and 4 and 5 in ARID1A by single-cell and bulk sequencing, respectively. Additionally, we performed an analysis including specific alterations which were collected based on scientific resources [23,24,25,26,27], namely p.E542K, p.E545K, p.H1047R in PIK3CA, p.S192R, p.G465E, p.S492R, p.R451C, p.K467T in EGFR, p.V600E in BRAF, p.G659Vfs*41, p.G659Sfs*87, p.R117Pfs*42, p.R117S, p.R117C, p.R117Afs*41, p.R117Pfs*8, p.R117H, p.R117P, p.R117Pfs*41 in RNF43, and p.D1850Tfs*33, p.D1850Gfs*4, p.D1850G, p.Q309K, p.Q309*, p.Q309H, p.Q758Dfs*58, p.Q758*, p.Q758Rfs*75, and p.Q758Pfs*59 in ARID1A. From this detailed list, we detected p.R117H in RNF43 using both single-cell and bulk sequencing. When associated with a BRAF p.V600E mutation, this alteration promotes a better prognosis in CRC patients who receive PD-1/PD-L1 inhibitors and the combination of anti-EGFR/BRAF therapy [23]. However, in our case, in the absence of this specific BRAF mutation, this better prognosis cannot be declared. Together, our findings support the efficiency of the multi-region single-cell assay in the detection and identification of therapy-resistance variations. To achieve more precise results, our assay needs methodological improvements, especially regarding the coverage values.
In summary, the detailed analysis of genes, variants, and heterogeneity in colorectal cancer (CRC) and the neighboring normal adjacent tissue (NAT) samples leads to several conclusions. Normal sites associated with tumor regions should not be mistakenly classified as normal controls or tumor-free areas. Instead, they represent a mixture of hereditary and somatic alterations. One possible explanation for this observation is that NAT may function as a transition site between tumors and cancer-negative regions, as previously proposed by Aran et al. [39]. Furthermore, our interpretation of NAT regions is supported by the research conducted by Kim et al. [40], who found that NAT exhibits a greater number of differentially expressed genes (DEGs) compared to tumors and demonstrated better prognostic abilities. To fully understand the mutational landscape, more detailed investigations are necessary. However, it is likely that NAT contains both progenitor and inhibitor mutations, suggesting that the complexity of these interactions should not be underestimated.
As demonstrated in the paragraph discussing the sequencing coverage of our samples, the main limitation of this study is the uneven distribution of the coverage values. Insufficient coverage can arise from the nature of the sample source: the diameter of the collected samples was in the micrometer range, which implies proportionally less DNA and high inter-sample variation. Targeted sequencing applications could resolve coverage issues independently of the sample size and origin. Our main goal here was to explore the possibility of single-cell sequencing on tissue samples and to demonstrate investigations related to heterogeneity. Another limitation of this work is that it does not characterize how variant and tumor heterogeneity differ between the sequencing methods. As this falls outside the scope of the present study, we seek opportunities to fill this gap in a future publication.

4. Materials and Methods

4.1. Clinical Samples

All samples were obtained from untreated patients after they had signed written informed consent forms. Colonic specimens were collected during surgery from tumors and histologically normal adjacent tissue (NAT) at the 1st Department of Surgery, Semmelweis University, Budapest, Hungary. The samples were then stored at −80 °C until use. In addition, tissue samples from the same locations were immediately fixed in buffered formalin, and experienced pathologists established the histological diagnoses. The study was carried out according to the Declaration of Helsinki and was approved by the local ethics committee and government authorities (Regional and Institutional Committee on Science and Research Ethics (ETT TUKEB); No.: 14383-2/2017/EKU Semmelweis University, Budapest, Hungary).
Fresh frozen colorectal tissue samples were embedded in Tissue-Tek® O.C.T.™ Compound (Sakura Finetek, Torrance, CA, USA). Cryosections (15 μm) were prepared and mounted on MMI MembraneSlides (Molecular Machines & Industries GmbH, Eching, Germany). The slides were then stored at −80 °C for later use. Hematoxylin-eosin staining was performed to visualize the tissue morphology. For dehydration, slides were first dipped in 90% ethanol for 10 min, then in 100% ethanol for 10 min, and finally in xylene for 5 min.
Single-cell samples were cut from a membrane frame slide by laser microdissection (Laser Capture Microdissection, Molecular Machines & Industries GmbH). Target areas with a diameter of 30–40 μm and a height of approximately 20–30 μm were collected. The height was derived from the thickness of the membrane slide (4–6 μm) and the thickness of the tissue section (6–20 μm). The samples were selected based on their morphological structure at random sites of the histological sections. Each group contained circles from 12 different regions of the corresponding slide. Sample collection was performed using 0.2 mL MMI Diffuser caps (Molecular Machines & Industries GmbH). Overall, we had 12 NAT, 12 colorectal cancer (CRC), and 12 cancer-negative healthy (NEG) samples. Each NEG sample consisted of two neighboring pieces, which doubled the collected area to ensure a sufficient sample amount; in later steps, each pair was treated as a single sample. We attempted to perform laser microdissection on samples that met specific morphological criteria. For the NEG group, any areas with signs of inflammation were strictly excluded. In the case of the NAT group, we selected areas that were closest to tumorous sites but were confirmed by a pathologist to be non-cancerous. Since our examination focuses on colorectal cancer tissues, we assumed the absence of other cancers. Slides were observed using an inverted microscope (Fully Motorized and Automated Inverted Microscope IX83, Olympus Life Science, Waltham, MA, USA) before and after dissection.

4.2. Single-Cell DNA Extraction, Library Preparation, and Next-Generation Sequencing

DNA extraction and whole-genome amplification were performed with the REPLI-g Single Cell Kit (Qiagen GmbH, San Diego, CA, USA) according to the manufacturer’s instructions. Briefly, as a first step, cells collected by laser microdissection were lysed and DNA was denatured at 65 °C for 30 min. After adding a neutralization buffer that stops denaturation, whole-genome amplification was performed with REPLI-g sc-DNA Polymerase, with incubation at 30 °C for 2 h and enzyme inactivation at 65 °C for 3 min. Following quantity and quality control with the Qubit HS dsDNA kit on the Qubit 1.0 fluorometer (Thermo Fisher Scientific, Waltham, MA, USA) and a high-sensitivity DNA chip on the Bioanalyzer 2100 microcapillary electrophoresis system (Agilent Technologies, Santa Clara, CA, USA), 200–1000 ng of amplified DNA was further processed. For the library preparation, the QIAseq FX Single Cell DNA Library kit (Qiagen GmbH) was applied. Enzymatic fragmentation was performed with FX Enhancer at 4 °C for 1 min (pre-chilled thermocycler), 32 °C for 15 min, 65 °C for 30 min, and a 4 °C hold. After the adapter ligation, clean-up was performed with 0.8× Agencourt AMPure XP (Beckman Coulter, Indianapolis, IN, USA) and then with 1× AMPure XP beads. PCR-based library amplification was carried out with QIAseq HiFi Mastermix (Qiagen GmbH) under the following thermocycling conditions: 98 °C for 2 min, 6 cycles of 98 °C for 20 s, 60 °C for 30 s, and 72 °C for 30 s, followed by a final extension at 72 °C for 1 min. Post-amplification clean-up was performed with 1× AMPure XP beads. Quality was verified with the Bioanalyzer 2100, and quantity was assessed with the QIAseq™ Library Quant Assay Kit (Qiagen GmbH) according to the manufacturer’s instructions.
Whole-exome capture was completed using the QIAseq Human Exome Kit (Qiagen GmbH) with 200 ng input DNA per sample. For hybridization capture, indexed libraries were pooled in groups of six, and the reaction was carried out according to the manufacturer’s instructions. Following the binding of the hybridized targets to streptavidin beads, washing steps were completed and post-capture amplification was carried out with the Illumina Library Amplification Post Hybrid Capture PCR Mix and Primer Mix Illumina Library Amplification with the following thermocycling conditions: 98 °C for 2 min, 7 cycles of 98 °C for 20 s, 60 °C for 30 s, and 72 °C for 30 s, then 72 °C for 1 min. Clean-up was completed with 1.5× AMPure XP beads. The quality of the library was again evaluated with the Bioanalyzer 2100, and the exact amount of each library was quantified with the QIAseq™ Library Quant Assay Kit (Qiagen GmbH) according to the manufacturer’s instructions. The library pools, at 4 nM each, were combined, and the combined sample was further prepared according to the Denature and Dilute Libraries Guide (Illumina Inc., San Diego, CA, USA). Finally, 12 samples per run were examined.
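Bringing each quantified library pool to a common concentration before combining is a C1·V1 = C2·V2 dilution. The sketch below assumes a 4 nM target (our reading of the pooling step above); the function name, the stock concentrations, and the 20 µL final volume are hypothetical illustration values and do not come from the kit protocol.

```python
def dilution_volumes(stock_nm, target_nm, final_volume_ul):
    """Return (stock volume, diluent volume) in µL to reach target_nm.

    Uses C1 * V1 = C2 * V2; raises if the stock is already too dilute.
    """
    if stock_nm < target_nm:
        raise ValueError("stock concentration is below the target concentration")
    stock_ul = target_nm * final_volume_ul / stock_nm
    return stock_ul, final_volume_ul - stock_ul


# Hypothetical quantification results (nM) for two library pools
measured_nm = {"pool_A": 10.0, "pool_B": 8.0}

for name, conc in measured_nm.items():
    stock_ul, diluent_ul = dilution_volumes(conc, target_nm=4.0, final_volume_ul=20.0)
    print(f"{name}: {stock_ul:.1f} µL library + {diluent_ul:.1f} µL diluent")
```

Once every pool is at the same molar concentration, combining equal volumes gives each pool an equal share of the sequencing run.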
Paired-end next-generation sequencing was performed using the NextSeq High Output Kit on a NextSeq 500/550 Instrument (Illumina Inc.) with 149 cycles for Read 1 and Read 2 and 10 cycles for Index 1 and 2 (corresponding to the QIAseq UDI Y-Adapters (Qiagen GmbH) protocol).

4.3. Bioinformatic Analyses

The same short-read sequencing bioinformatic analysis was performed on single-cell, whole-exome bulk, and cfDNA sequencing data. The results of the whole-exome bulk and plasma sequenced samples were available from the study of Kalmár et al. [15]. Briefly, the shared evaluation steps were as follows. Demultiplexing and FASTQ file generation were performed using the Illumina BaseSpace interface (Illumina Inc.). Next, we used the FastQC and MultiQC tools to assess the quality of the sequencing reads. The raw sequence reads were aligned to the Human Reference Genome GRCh38. SNP and short indel germline variants were called with the Dragen Germline Variant Caller v. 4.2.4 (Illumina Inc.) on the NEG, CRC, and NAT samples. The Dragen Somatic Variant Caller v. 4.2.7 (Illumina Inc.) was run in tumor-normal mode for the CRC and NAT samples.
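The SNP, deletion, and insertion counts reported per sample can be tallied directly from the REF and ALT columns of a variant call file. The following is a minimal sketch, assuming plain tab-separated VCF text; the function names and the example records are hypothetical, and this is not how the Dragen callers report their own metrics.

```python
def classify_variant(ref, alt):
    """Classify one REF/ALT pair as SNP, insertion, deletion, or other."""
    if alt.startswith("<") or alt in {"*", "."}:
        return "other"  # symbolic, spanning-deletion, or missing allele
    if len(ref) == 1 and len(alt) == 1:
        return "SNP"
    if len(ref) < len(alt):
        return "insertion"
    if len(ref) > len(alt):
        return "deletion"
    return "other"  # equal-length multi-base substitution


def count_variant_types(vcf_lines):
    """Count variant types over VCF records, one count per ALT allele."""
    counts = {"SNP": 0, "insertion": 0, "deletion": 0, "other": 0}
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        fields = line.rstrip("\n").split("\t")
        ref, alts = fields[3], fields[4]
        for alt in alts.split(","):
            counts[classify_variant(ref, alt)] += 1
    return counts


# Hypothetical records for illustration only
example_vcf = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t100\t.\tA\tG\t.\tPASS\t.",
    "chr1\t200\t.\tAT\tA\t.\tPASS\t.",
    "chr2\t300\t.\tC\tCTT,G\t.\tPASS\t.",
    "chr3\t400\t.\tG\t<DEL>\t.\tPASS\t.",
]
variant_counts = count_variant_types(example_vcf)
print(variant_counts)
```

Multi-allelic records are counted once per alternate allele, which is one possible convention; counting once per record would give lower totals.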
The variant call (.vcf) files were annotated using the SnpEff eff variant annotation tool on the Galaxy website [41], and the mutation annotation (.maf) files were generated with the vcf2maf package [42] and the Ensembl Variant Effect Predictor (VEP), release 102 [43]. The clinical impact of the variants was evaluated according to the ClinVar [44] database. Variant characteristics were summarized using the maftools [45] ‘plotmafSummary’ function. The summary plots for the CRC and NAT groups include only the somatic mutations. The tumor mutation burden (TMB) values were calculated using the ‘tmb’ function of the maftools package as the number of non-silent mutations per megabase (Mb) for each data group. The target capture size was set to 37 Mb according to the exome sequencing kit used for our samples. An oncoplot and the landscape of somatic interactions were produced with the oncoplot and somaticInteractions functions of maftools, respectively. To perform the KEGG pathway analysis of the therapy resistance genes, we used the ShinyGO platform [36].
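The TMB definition used here, non-silent mutations per megabase of captured target, reduces to a single division. The sketch below restates that definition with the 37 Mb capture size given above; it is not the maftools implementation, and the function name and mutation counts are hypothetical.

```python
import math

CAPTURE_SIZE_MB = 37.0  # target capture size stated for the exome kit


def tumor_mutation_burden(n_non_silent, capture_size_mb=CAPTURE_SIZE_MB):
    """Non-silent mutations per megabase of captured target."""
    if capture_size_mb <= 0:
        raise ValueError("capture size must be positive")
    return n_non_silent / capture_size_mb


# Hypothetical per-sample counts of non-silent mutations
non_silent_counts = {"sample_1": 74, "sample_2": 370}

for sample, count in non_silent_counts.items():
    tmb = tumor_mutation_burden(count)
    print(f"{sample}: TMB = {tmb:.2f} per Mb, log10 = {math.log10(tmb):.2f}")
```

The log10 value is printed because the TMB distributions are displayed on a logarithmic scale; a sample with zero non-silent mutations would need separate handling before taking the logarithm.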
After summarizing the results of the different sequencing methods, several analyses were performed in the Python and R programming languages, and a cumulative data table per method was created to make data illustration easier. The heterogeneity-related bar plots were generated in Microsoft Excel based on the cumulative table, in which the columns represented the different locations and the rows represented the gene mutations. The TMB boxplots were created with the pyplot.boxplot function of the matplotlib package [46].
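A cumulative table of this shape, with mutations as rows and locations as columns, can be built as a presence/absence matrix. The following is a minimal sketch of that layout; the function names, location labels, and mutation labels are hypothetical placeholders rather than our data.

```python
def build_cumulative_table(calls_by_location):
    """Return (locations, table), where table maps mutation -> 0/1 per location."""
    locations = list(calls_by_location)
    mutations = sorted({m for calls in calls_by_location.values() for m in calls})
    table = {
        m: [int(m in calls_by_location[loc]) for loc in locations]
        for m in mutations
    }
    return locations, table


def regions_per_mutation(table):
    """Number of locations in which each mutation was detected."""
    return {mutation: sum(row) for mutation, row in table.items()}


# Placeholder calls per dissected region (illustrative labels only)
calls_by_location = {
    "region_1": {"GENE_A:var1", "GENE_B:var2"},
    "region_2": {"GENE_A:var1"},
    "region_3": {"GENE_C:var3", "GENE_A:var1"},
}

locations, table = build_cumulative_table(calls_by_location)
print("mutation", *locations, sep="\t")
for mutation, row in table.items():
    print(mutation, *row, sep="\t")
print(regions_per_mutation(table))
```

A mutation present in every column is shared across regions, while one present in a single column is region-specific, which is the contrast the heterogeneity bar plots summarize.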

5. Conclusions

In conclusion, the different sequencing methods detect mutations on different scales. Among the methods investigated, single-cell sequencing of tissue samples performed best for investigating heterogeneity and therapy resistance. Our results indicate that sequencing separate sample pieces from various locations, rather than analyzing whole bulk or cfDNA samples, has greater potential, especially when the goal is to present region-specific mutational patterns. Although we completed single-cell sequencing using a short-read next-generation sequencing instrument, the possibility of achieving much more compact results with a long-read third-generation device remains and needs to be investigated.

Author Contributions

N.S., A.K., B.K.B. and B.M.: conceptualization and revision; N.S., A.K., B.K.B. and B.M.: methodology; N.S.: data analysis; N.S. and B.M.: literature research and drafting; B.K.B., B.M., I.T., K.R.R., T.R.L., G.V. and Z.B.N.: critical revision of the manuscript. All authors have contributed to the article and approved the submitted version. All authors have read and agreed to the published version of the manuscript.

Funding

The authors declare that financial support was received for the research, authorship, and/or publication of this article. This project was financed by the NRDI Fund (FK0201NEPE/TPKNKTA-47) and the Fund’s National Cardiovascular Laboratory (RRF-2.3.1-21-2022-00003). This project has been implemented with the support provided by the Ministry of Culture and Innovation of Hungary and the National Research, Development and Innovation Fund (KDP-2023/C2270480).

Informed Consent Statement

The study was conducted according to the Declaration of Helsinki and approved by the local ethics committee and government authorities (Regional and Institutional Committee of Science and Research Ethics (ETT TUKEB) No.: 14383-2/2017/EKU Semmelweis University, Budapest, Hungary).

Data Availability Statement

Due to privacy issues, we do not provide full data availability regarding the single-cell sequencing part of this publication. The data are used in accordance with the consent provided by the participants without compromising their anonymity. Upon request, we can provide data for peer review purposes. As we used the results of the bulk and cfDNA sequencing of a previous publication, these are available at https://cbioportal.vo.elte.hu/cbioportal (accessed on 19 September 2022).

Conflicts of Interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Appendix A

Figure A1. The comparative oncomutational plot of the NAT and CRC groups. The X-axis represents the occurrence rate of the mutations of the genes, and the Y-axis represents the examined genes. The left side of the figure corresponds to the CRC group, and the right side illustrates the findings regarding the NAT group.
Table A1. Summary of coverage, TMB, fragment length, and the number of different mutation types of the NEG, NAT, and CRC data. The variant results of the NAT and CRC groups are expanded by the data of somatic alterations.
ID | Mean Region Coverage Depth | TMB | Median Fragment Length | Germline SNP | Germline Deletion | Germline Insertion | Somatic SNP | Somatic Deletion | Somatic Insertion
NEG18.222.119414,7756223
NEG225.195.3819848,08432503
NEG32.57.74170428016422
NEG44.26.04179526812354
NEG50.518.5618010206
NEG610.924.821767490188340
NEG776.639.819160,52415232158
NEG8126178.72191183,4651373068
NEG98.927.817814,8956227
NEG1019.266.2620237,99916529
NEG1114.38.0220628,89115692
NEG121.752.32284359062
NAT137.273.9426492,880101130555,00235015355
NAT211.48.5422714,0401214189066361169
NAT313.58.1621718,9704133811,3488691178
NAT4125.5226.52198261,09454110137,773728610,311
NAT585.3108.54164134,821208673,45037535428
NAT633.748.2822057,9928771435,85424703594
NAT70.81.7826818921287112594169
NAT80.70.56214151821997975126
NAT95.61.882053387101092196164266
NAT100.90.182249703257174762
NAT111.60.542399502246304780
NAT120.20.0825723005159822
T1141.2294.7241232,1331481516136,63213559003
T212.65.682555510161683743126494
T366118.04212103,0076668561,9506413810
T40.30.6426627037220442
T510.862556651235051553
T614.151.6625924,8024844415,0293491545
T762205.2253137,039169172482,05413636703
T80.81.16259128614667927111
T90.41.38244385023236441
T100.10.23051100568418
T1158.842606608141263854126503
T1274.8176.2623193,550120112157,14511615684
Table A2. Mutations in the genes APC, ARID1A, BRAF, EGFR, FBXW7, KRAS, NRAS, PIK3CA, RNF43, SMAD4, and TP53 contributing to therapy resistance. The table is divided into three parts, listing the mutations detected by single-cell sequencing only, by bulk sequencing only, and by both methods. The common mutations are listed separately in the last part and are not included in the lists of the individual methods.
Gene | Sequence Variation | dbSNP ID | Mutation Classification
Mutations Detected by Single-Cell Sequencing Only
APC | n.112707592dupG | - | -
APC | c.135+5252G>C | - | -
APC | c.135+5253A>T | - | -
APC | c.135+5254G>C | - | -
APC | c.136-230C>A | rs2464805 | benign
APC | c.221-291C>A | - | -
APC | c.4853dupT | - | -
APC | c.6172G>T | - | -
APC | c.*433T>A | - | -
APC | c.*1958_*1959insTTAC | - | -
APC | c.*1965T>C | - | -
APC | c.*2220_*2221ins | - | -
APC | c.934-8_934-7ins | rs1561535860 | likely benign
APC | c.934-4dupA | - | -
APC | c.5613T>C | - | -
APC | c.*267_*273delCCATCCC | - | -
APC | c.*281_*283delTTT | rs42427 | benign
APC | c.*285A>G | rs866006 | benign
APC | c.*1098T>C | - | benign
APC | c.*1556C>G | - | -
APC | c.560-12T>C | - | -
APC | c.1881-762G>A | rs41116 | benign
APC | c.*413_*414dupAA | rs2289484 | benign
APC | c.559+37C>A | rs1554084977 | uncertain significance
APC | c.559+223C>T | rs41115 | benign
APC | c.627_628insAGAAGATGAA | rs1580673845 | benign
APC | c.628_628+1ins | rs2229995 | benign
ARID1A | c.*421delA | - | -
ARID1A | c.*724C>A | - | -
ARID1A | c.2295-161delT | - | -
ARID1A | c.2496-130_2496-128delAAA | - | -
BRAF | c.1763T>C | rs1562954580 | uncertain significance
BRAF | c.1695-940C>A | - | -
BRAF | c.1695-1205G>A | - | -
BRAF | c.1695-5750C>G | - | -
BRAF | c.1694+8566G>A | - | -
BRAF | c.1694+3374T>G | - | -
BRAF | c.1694+3266A>G | - | -
BRAF | c.1694+2940C>T | - | -
BRAF | c.1140+3214G>A | - | -
BRAF | c.1140+2430C>T | - | -
BRAF | c.1140+2069A>G | - | -
BRAF | c.1140+1915G>T | - | -
BRAF | c.1140+1665dupG | - | -
BRAF | c.981-2296A>G | - | -
BRAF | c.980+2576T>A | - | -
BRAF | c.980+1801_980+1802delAA | - | -
BRAF | c.241-198G>C | - | -
BRAF | c.981-356dupA | - | -
BRAF | c.981-1080_981-1042del | - | -
BRAF | c.589G>A | - | -
BRAF | c.243C>A | - | -
BRAF | c.984-1276C>T | - | -
BRAF | c.1315-470C>T | - | -
BRAF | c.1315-479A>C | - | -
BRAF | c.1315-482A>T | - | -
BRAF | c.451-200C>T | - | -
BRAF | c.1251+48_1251+49delGA | - | -
BRAF | c.1178-1544T>C | - | -
BRAF | c.1178-1548A>G | - | -
BRAF | c.1178-1551C>T | - | -
BRAF | c.1178-1557C>T | - | -
BRAF | c.981-2446G>C | - | -
BRAF | c.160delC | - | -
BRAF | c.1394T>C | - | -
BRAF | c.328dupT | - | -
BRAF | c.1804-85C>A | - | -
BRAF | c.2138C>T | - | -
BRAF | c.814-174G>A | - | -
BRAF | c.814-77_814-76insAATA | - | -
BRAF | c.1059+170C>A | - | -
BRAF | c.1737A>G | - | -
BRAF | c.1141-1044C>T | - | -
BRAF | c.1141-1097A>G | - | -
BRAF | c.1141-1111G>A | rs373442098 | uncertain significance
BRAF | c.1141-1637_1141-1636delTT | - | -
BRAF | c.1140+633C>G | - | -
BRAF | c.1140+610_1140+615delAGCTAT | - | -
BRAF | c.861-75dupT | - | -
BRAF | c.860+457delC | - | -
BRAF | c.112-5717G>T | - | -
BRAF | c.112-5732A>G | - | -
BRAF | c.112-5778A>G | - | -
BRAF | c.112-6770_112-6769insA | - | -
BRAF | c.112-7499A>G | - | -
BRAF | c.112-7501A>T | - | -
BRAF | c.112-7503C>T | - | -
BRAF | c.*274+536A>G | - | -
BRAF | c.983+1398T>C | - | -
BRAF | c.983+1236_983+1237insCAAGAGGT | - | -
BRAF | c.983+1233_983+1234delTT | - | -
BRAF | c.983+1232T>C | - | -
BRAF | c.983+1186G>A | - | -
BRAF | c.876+630A>G | - | -
EGFR | c.88+48720_88+48721delTG | - | -
EGFR | c.322A>T | - | -
EGFR | c.2469+5015G>A | - | -
EGFR | c.2625+13C>G | - | -
EGFR | c.2947-203G>A | - | -
EGFR | c.3272-1104_3272-1092del | - | -
EGFR | c.3272-1071delG | - | -
EGFR | c.3272-1068delA | - | -
EGFR | c.3272-1064_3272-1063insAAAA | - | -
EGFR | c.3272-438T>C | - | -
EGFR | c.3272-408C>A | - | -
EGFR | c.*932dupA | - | -
EGFR | c.3271+191T>A | - | -
EGFR | c.3271+191T>G | - | -
EGFR | c.3271+1166C>A | - | -
EGFR | c.3272-1115_3272-1099del | - | -
EGFR | c.*781G>C | - | -
EGFR | c.*1151dupC | - | -
EGFR | c.*1382dupT | - | -
EGFR | c.*1957G>A | - | -
EGFR | c.1695-2105C>G | - | -
EGFR | c.1741+165A>G | rs10228436 | benign
EGFR | c.1695-1134A>G | rs2227984 | benign
EGFR | c.2469+108delG | - | -
EGFR | c.3271+188T>G | - | -
EGFR | c.3271+282T>A | rs2075110 | benign
EGFR | c.1721A>G | - | -
EGFR | c.1314+556_1314+557dupTT | rs10241451 | benign
EGFR | c.1314+394A>G | - | -
EGFR | c.1314+256G>A | - | -
EGFR | c.1178-443A>G | - | -
EGFR | c.1178-648G>A | - | -
EGFR | c.3271+809G>A | rs2072454 | benign
EGFR | c.3271+976G>C | - | -
EGFR | c.3272-611_3272-608delTACA | - | -
EGFR | n.140724136T>C | rs2075109 | benign
EGFR | n.140726106G>C | - | -
EGFR | c.2128-5dupT | - | -
EGFR | c.2128-27C>T | - | benign
EGFR | c.1993-90G>T | rs2227983 | benign
EGFR | c.1993-93A>C | rs2227984 | benign
EGFR | c.1742-352C>G | rs2241055 | benign
EGFR | c.1742-353C>G | - | -
EGFR | c.1741+318A>G | - | -
EGFR | c.1695-53G>A | - | -
EGFR | c.1314+557dupT | - | -
FBXW7 | c.2001dupG | - | -
FBXW7 | c.*1125delT | - | -
FBXW7 | c.*1111A>C | - | -
FBXW7 | c.986-147A>C | - | -
FBXW7 | c.1728_1729insAAACAAC | - | -
FBXW7 | c.1727_1728ins | - | -
FBXW7 | c.1721_1722ins | - | -
FBXW7 | c.1716_1717ins | - | -
FBXW7 | c.933+18A>G | - | -
FBXW7 | c.934-191A>T | - | -
FBXW7 | c.934-15_934-14delTC | - | -
FBXW7 | c.934-12_934-11insAC | - | -
KRAS | c.876+408_876+409dupTT | - | -
KRAS | c.556-72_556-71delAA | - | -
KRAS | c.257C>T | - | -
KRAS | c.1308+459A>C | - | -
KRAS | c.1308+468C>T | - | -
KRAS | c.1308+472C>T | - | -
KRAS | c.1308+473A>G | - | -
NRAS | c.-2070dupT | - | -
NRAS | c.1251+52A>T | rs61758221 | benign
PIK3CA | c.*3631T>C | - | -
PIK3CA | c.*1606delA | - | -
PIK3CA | c.1251+54A>C | - | -
PIK3CA | c.1251+57G>C | - | -
PIK3CA | c.1251+58C>A | - | -
PIK3CA | c.*355G>T | - | -
PIK3CA | c.*480C>A | - | -
PIK3CA | c.*488_*490delTCC | - | -
PIK3CA | c.*494G>T | - | -
PIK3CA | c.1736_1737insAAAACAAA | - | -
PIK3CA | c.1735G>T | - | -
PIK3CA | c.1733C>A | - | -
PIK3CA | c.2496-124G>T | rs7623154 | benign
PIK3CA | c.2923A>T | rs17550640 | benign
PIK3CA | c.2936+21dupA | - | -
PIK3CA | c.*654T>G | rs3729676 | benign
PIK3CA | c.645+173A>G | - | -
PIK3CA | c.934-132delG | - | -
PIK3CA | c.3381G>C | - | -
PIK3CA | c.6200A>G | - | -
PIK3CA | c.6633C>T | - | -
RNF43 | c.*6514G>A | - | -
RNF43 | c.451-5T>C | - | -
RNF43 | c.376-90A>G | - | -
RNF43 | c.2936+19A>G | - | -
SMAD4 | c.692dupG | - | pathogenic
SMAD4 | c.905-1G>C | - | -
SMAD4 | c.1139+385A>T | - | -
SMAD4 | c.*5005dupT | - | -
SMAD4 | c.*5116C>G | - | -
SMAD4 | c.*5131A>G | - | -
SMAD4 | c.*5191C>G | - | -
SMAD4 | c.*5535_*5536delAC | - | -
SMAD4 | c.*5541_*5552del | - | -
SMAD4 | c.*5863_*5867delGAAAA | - | benign
SMAD4 | c.*5994A>C | - | benign
SMAD4 | c.*6433G>A | - | -
SMAD4 | c.1060-42G>T | - | -
TP53 | c.984-1412delT | - | -
TP53 | c.983+1201A>G | - | -
TP53 | c.556-71delA | - | -
TP53 | c.556-149T>A | - | -
TP53 | c.424C>T | - | -
TP53 | c.259-161_259-158delAAAA | - | -
TP53 | c.259-162_259-158delAAAAA | - | -
TP53 | c.258+123dupT | - | -
TP53 | c.99dupC | - | -
TP53 | c.-22+41_-22+48delACCTGGAG | - | -
TP53 | c.-145-190C>A | - | -
TP53 | c.-145-1184T>C | - | -
TP53 | c.966dupG | - | -
TP53 | c.582+132T>C | - | -
TP53 | c.1060-69G>T | - | -
TP53 | c.1308+477T>C | - | -
TP53 | c.1308+478G>C | - | -
TP53 | c.1308+501C>T | - | -
TP53 | c.1308+521_1308+574del | - | -
TP53 | c.*3333C>T | - | -
Mutations Detected by Bulk Sequencing Only
APC | c.136-1428A>C | rs2464807 | other
APC | c.220+124C>G | rs76552546 | likely benign
APC | c.2413C>T | rs587779783 | pathogenic
APC | c.4666dupA | rs587783031 | pathogenic
ARID1A | c.1920+6177G>T | - | -
ARID1A | c.1921-1059A>T | - | -
ARID1A | c.2252-97A>T | rs113319329 | benign
ARID1A | c.2733-400A>G | - | -
ARID1A | c.3199-95A>G | rs76490152 | benign
BRAF | n.140726457T>C | - | -
BRAF | c.*1215A>T | - | -
BRAF | c.2128-16C>T | rs368721021 | benign/likely benign
BRAF | c.1177+146G>A | rs1267632 | benign
BRAF | c.1140+3180G>T | - | -
BRAF | c.505-6693T>C | - | -
BRAF | c.505-9562G>A | - | -
BRAF | c.504+3486T>C | - | -
BRAF | c.504+142G>A | - | benign
BRAF | c.139-23483C>T | - | -
EGFR | c.88+37643T>C | - | -
EGFR | c.89-55393G>A | - | -
EGFR | c.89-29869C>A | - | -
EGFR | c.559+214G>T | rs2270427 | benign
EGFR | c.1498+22A>T | rs1558544 | benign
EGFR | c.1498+142C>T | rs759162 | benign
EGFR | c.1499-177A>G | rs11536635 | benign
EGFR | c.1880+733A>C | - | -
EGFR | c.2361G>A | - | benign
EGFR | c.2469+4027T>C | - | -
EGFR | c.2508C>T | - | benign
EGFR | c.2625+196A>G | rs6970262 | benign
EGFR | c.2709T>C | rs1140475 | benign
EGFR | c.2849-551T>G | - | -
EGFR | c.3162+200_3162+201insAG | rs34723095 | benign
EGFR | c.3272-123G>A | rs2692456 | benign
EGFR | c.3333_3334insTTTTTTTTTTTTT | - | -
EGFR | c.3337delC | - | -
EGFR | c.3339_3350delGCCTCTGAACCC | - | -
EGFR | c.3353C>G | - | -
EGFR | c.3355C>G | - | -
EGFR | c.3356C>A | - | -
EGFR | c.3368C>T | rs775317295 | uncertain significance
EGFR | c.*9367A>G | - | -
FBXW7 | c.*3466C>G | - | -
FBXW7 | c.-69-40817T>C | - | -
PIK3CA | c.1059+62C>A | rs2699895 | benign
PIK3CA | c.1060-17C>A | rs2699896 | benign
PIK3CA | c.1145+54A>G | rs3729679 | benign
PIK3CA | c.2016-27A>T | rs6443625 | benign
PIK3CA | c.*5631C>T | - | -
PIK3CA | c.*10339G>T | - | -
RNF43 | c.2057C>G | rs9652855 | benign
TP53 | c.*4020_*4049del | - | -
TP53 | c.*274+522T>G | - | -
TP53 | c.*274+31A>G | - | -
TP53 | c.877-1G>A | rs587782272 | pathogenic
TP53 | c.555+62A>G | - | benign
TP53 | c.259-91G>A | - | benign
TP53 | c.259-160_259-158delAAA | - | -
TP53 | c.98C>G | - | benign
TP53 | c.-22+41_-21-54del | - | -
TP53 | c.-44+38C>G | - | benign
TP53 | c.-14962_-14959dupGTTT | - | -
Mutations Detected by Both Methods
APC | c.1458T>C | rs2229992 | benign
APC | c.4479G>A | rs41115 | benign
APC | c.5034G>A | rs42427 | benign
APC | c.5268T>G | rs866006 | benign
APC | c.5465T>A | rs459552 | benign
APC | c.5880G>A | rs465899 | benign
APC | c.7504G>A | rs2229995 | benign
BRAF | c.2128-54_2128-51dupCTTT | - | -
BRAF | c.1992+16G>C | rs3789806 | benign/likely benign
BRAF | c.1992+14A>G | - | -
BRAF | c.1929A>G | rs9648696 | benign
EGFR | c.474C>T | rs2072454 | benign
EGFR | c.560-84T>C | rs2075109 | benign
EGFR | c.628+104C>T | rs2075110 | benign
EGFR | c.629-62A>G | rs11506105 | benign
EGFR | c.1006+151T>C | rs3735059 | benign
EGFR | c.1562G>A | rs2227983 | benign
EGFR | c.1881-600G>A | rs10228436 | benign
EGFR | c.1887T>A | rs2227984 | benign
EGFR | c.1920-215G>C | rs2241055 | benign
EGFR | c.2283+96A>G | rs2017000 | benign
EGFR | c.2284-60T>C | rs10241451 | benign
FBXW7 | c.1746G>A | - | -
NRAS | c.-3343C>T | - | -
PIK3CA | c.352+40A>G | rs3729674 | benign
PIK3CA | c.1173A>G | rs2230461 | benign
PIK3CA | c.2295-57C>G | rs2699889 | benign
PIK3CA | c.*10365T>C | - | -
RNF43 | c.350G>A | rs2257205 | benign
TP53 | c.665+92T>G | rs12951053 | benign
TP53 | c.665+72C>T | rs12947788 | benign

References

  1. Galeano Nino, J.L.; Wu, H.; LaCourse, K.D.; Kempchinsky, A.G.; Baryiames, A.; Barber, B.; Futran, N.; Houlton, J.; Sather, C.; Sicinska, E.; et al. Effect of the intratumoral microbiota on spatial and cellular heterogeneity in cancer. Nature 2022, 611, 810–817. [Google Scholar] [CrossRef]
  2. Zhang, M.; Hu, S.; Min, M.; Ni, X.; Lu, Z.; Sun, X.; Wu, J.; Liu, B.; Ying, X.; Liu, Y. Dissecting transcriptional heterogeneity in primary gastric adenocarcinoma by single cell RNA sequencing. Gut 2021, 70, 464–475. [Google Scholar] [CrossRef] [PubMed]
  3. Wang, T.; Dang, N.; Tang, G.; Li, Z.; Li, X.; Shi, B.; Xu, Z.; Li, L.; Yang, X.; Xu, C.; et al. Integrating bulk and single-cell RNA sequencing reveals cellular heterogeneity and immune infiltration in hepatocellular carcinoma. Mol. Oncol. 2022, 16, 2195–2213. [Google Scholar] [CrossRef] [PubMed]
  4. Kumar, V.; Ramnarayanan, K.; Sundar, R.; Padmanabhan, N.; Srivastava, S.; Koiwa, M.; Yasuda, T.; Koh, V.; Huang, K.K.; Tay, S.T.; et al. Single-Cell Atlas of Lineage States, Tumor Microenvironment, and Subtype-Specific Expression Programs in Gastric Cancer. Cancer Discov. 2022, 12, 670–691. [Google Scholar] [CrossRef] [PubMed]
  5. Li, C.; Wu, S.; Yang, Z.; Zhang, X.; Zheng, Q.; Lin, L.; Niu, Z.; Li, R.; Cai, Z.; Li, L. Single-cell exome sequencing identifies mutations in KCP, LOC440040, and LOC440563 as drivers in renal cell carcinoma stem cells. Cell Res. 2017, 27, 590–593. [Google Scholar] [CrossRef] [PubMed]
  6. Gerlinger, M.; Rowan, A.J.; Horsewell, S.; Math, M.; Larkin, J.; Endesfelder, D.; Gronroos, E.; Martinez, P.; Matthews, N.; Stewart, A.; et al. Intratumor heterogeneity and branched evolution revealed by multiregion sequencing. N. Engl. J. Med. 2012, 367, 976. [Google Scholar] [CrossRef] [PubMed]
  7. Ramón Y Cajal, S.; Sesé, M.; Capdevila, C.; Aasen, T.; De Mattos-Arruda, L.; Diaz-Cano, S.J.; Hernández-Losa, J.; Castellví, J. Clinical implications of intratumor heterogeneity: Challenges and opportunities. Mol. Med. 2020, 98, 161–177. [Google Scholar] [CrossRef]
  8. Zhu, L.; Jiang, M.; Wang, H.; Sun, H.; Zhu, J.; Zhao, W.; Fang, Q.; Yu, J.; Chen, P.; Wu, S.; et al. A narrative review of tumor heterogeneity and challenges to tumor drug therapy. Ann. Transl. Med. 2021, 9, 1351. [Google Scholar] [CrossRef] [PubMed]
  9. Wu, H.; Guo, C.; Wang, C.; Xu, J.; Zheng, S.; Duan, J.; Li, Y.; Bai, H.; Xu, Q.; Ning, F.; et al. Single-cell RNA sequencing reveals tumor heterogeneity, microenvironment, and drug-resistance mechanisms of recurrent glioblastoma. Cancer Sci. 2023, 114, 2609–2621. [Google Scholar] [CrossRef]
  10. Walsh, E.M.; Halushka, M.K. A Comparison of Tissue Dissection Techniques for Diagnostic, Prognostic, and Theragnostic Analysis of Human Disease. Pathobiology 2023, 90, 199–208. [Google Scholar] [CrossRef]
  11. Peretz, C.A.C.; McGary, L.H.F.; Kumar, T.; Jackson, H.; Jacob, J.; Durruthy-Durruthy, R.; Levis, M.J.; Perl, A.; Huang, B.J.; Smith, C.C. Single-cell DNA sequencing reveals complex mechanisms of resistance to quizartinib. Blood Adv. 2021, 5, 1437–1441. [Google Scholar] [CrossRef] [PubMed]
  12. Lei, Y.; Tang, R.; Xu, J.; Wang, W.; Zhang, B.; Liu, J.; Yu, X.; Shi, S. Applications of single-cell sequencing in cancer research: Progress and perspectives. Hematol. Oncol. 2021, 14, 91. [Google Scholar] [CrossRef]
  13. Li, X.; Wang, C.Y. From bulk, single-cell to spatial RNA sequencing. Int. J. Oral. Sci. 2021, 13, 36. [Google Scholar] [CrossRef]
  14. Parikh, A.R.; Leshchiner, I.; Elagina, L.; Goyal, L.; Levovitz, C.; Siravegna, G.; Livitz, D.; Rhrissorrakrai, K.; Martin, E.E.; Van Seventer, E.E.; et al. Liquid versus tissue biopsy for detecting acquired resistance and tumor heterogeneity in gastrointestinal cancers. Nat. Med. 2019, 25, 1415–1421. [Google Scholar] [CrossRef] [PubMed]
  15. Kalmár, A.; Galamb, O.; Szabó, G.; Pipek, O.; Medgyes-Horváth, A.; Barták, B.K.; Nagy, Z.B.; Szigeti, K.A.; Zsigrai, S.; Csabai, I.; et al. Patterns of Somatic Variants in Colorectal Adenoma and Carcinoma Tissue and Matched Plasma Samples from the Hungarian Oncogenome Program. Cancers 2023, 15, 907. [Google Scholar] [CrossRef] [PubMed]
  16. Zhao, P.; Mondal, S.; Martin, C.; DuPlissis, A.; Chizari, S.; Ma, K.Y.; Maiya, R.; Messing, R.O.; Jiang, N.; Ben-Yakar, A. Femtosecond laser microdissection for isolation of regenerating C. elegans neurons for single-cell RNA sequencing. Nat. Methods 2023, 20, 590–599. [Google Scholar] [CrossRef] [PubMed]
  17. Smajic, S.; Prada-Medina, C.A.; Landoulsi, Z.; Ghelfi, J.; Delcambre, S.; Dietrich, C.; Jarazo, J.; Henck, J.; Balachandran, S.; Pachchek, S.; et al. Single-cell sequencing of human midbrain reveals glial activation and a Parkinson-specific neuronal state. Brain 2022, 145, 964–978. [Google Scholar] [CrossRef]
  18. Turan, Z.G.; Richter, V.; Bochmann, J.; Parvizi, P.; Yapar, E.; Işıldak, U.; Waterholter, S.K.; Leclere-Turbant, S.; Son, C.D.; Duyckaerts, C.; et al. Somatic copy number variant load in neurons of healthy controls and Alzheimer’s disease patients. Acta Neuropath. Commun. 2022, 10, 175. [Google Scholar] [CrossRef] [PubMed]
  19. Massalha, H.; Bahar Halpern, K.; Abu-Gazala, S.; Jana, T.; Massasa, E.E.; Moor, A.E.; Buchauer, L.; Rozenberg, M.; Pikarsky, E.; Amit, I.; et al. A single cell atlas of the human liver tumor microenvironment. Mol. Syst. Biol. 2020, 16, e9682. [Google Scholar] [CrossRef] [PubMed]
  20. Paul, E.D.; Huraiová, B.; Valková, N.; Birknerova, N.; Gábrišová, D.; Gubova, S.; Ignačáková, H.; Ondris, T.; Bendíková, S.; Bíla, J.; et al. Multiplexed RNA-FISH-guided Laser Capture Microdissection RNA Sequencing Improves Breast Cancer Molecular Subtyping, Prognostic Classification, and Predicts Response to Antibody Drug Conjugates. medRxiv 2023. [Google Scholar] [CrossRef]
  21. Ikeda, H.; Miyao, S.; Nagaoka, S.; Takashima, T.; Law, S.M.; Yamamoto, T.; Kurimoto, K. High-quality single-cell transcriptomics from ovarian histological sections during folliculogenesis. Life Sci. Alliance 2023, 6, e202301929. [Google Scholar] [CrossRef] [PubMed]
  22. Pavlič, A.; Urh, K.; Boštjančič, E.; Zidar, N. Analyzing the invasive front of colorectal cancer—By punching tissue block or laser capture microdissection? Pathol. Res. Pract. 2023, 248, 154727. [Google Scholar] [CrossRef] [PubMed]
  23. Tang, Y.L.; Li, D.D.; Duan, J.Y.; Sheng, L.M.; Wang, X. Resistance to targeted therapy in metastatic colorectal cancer: Current status and new developments. World J. Gastroenterol. 2023, 29, 926–948. [Google Scholar] [CrossRef]
  24. Wang, Q.; Shen, X.; Chen, G.; Du, J. Drug Resistance in Colorectal Cancer: From Mechanism to Clinic. Cancers 2022, 14, 2928. [Google Scholar] [CrossRef] [PubMed]
  25. Johnson, R.M.; Qu, X.; Li, C.F. ARID1A mutations confer intrinsic and acquired resistance to cetuximab treatment in colorectal cancer. Nat. Commun. 2022, 13, 5478. [Google Scholar] [CrossRef] [PubMed]
  26. Tang, Y.; Fan, Y. Combined KRAS and TP53 mutation in patients with colorectal cancer enhance chemoresistance to promote postoperative recurrence and metastasis. BMC Cancer 2024, 24, 1155. [Google Scholar] [CrossRef] [PubMed]
  27. Zhao, B.; Wang, L.; Qiu, H.; Zhang, M.; Sun, L.; Peng, P.; Yu, Q.; Yuan, X. Mechanisms of resistance to anti-EGFR therapy in colorectal cancer. Oncotarget 2017, 8, 3980–4000. [Google Scholar] [CrossRef] [PubMed]
  28. Zhou, S.L.; Xin, H.Y.; Sun, R.Q.; Zhou, Z.J.; Hu, Z.Q.; Luo, C.B.; Wang, P.C.; Li, J.; Fan, J.; Zhou, J. Association of KRAS Variant Subtypes with Survival and Recurrence in Patients with Surgically Treated Intrahepatic Cholangiocarcinoma. JAMA Surg. 2022, 157, 59–65. [Google Scholar] [CrossRef]
  29. Sugihara, Y.; Taniguchi, H.; Kushima, R.; Tsuda, H.; Kubota, D.; Ichikawa, H.; Fujita, S.; Kondo, T. Laser microdissection and two-dimensional difference gel electrophoresis reveal proteomic intra-tumor heterogeneity in colorectal cancer. J. Proteom. 2013, 78, 134–147. [Google Scholar] [CrossRef] [PubMed]
  30. Zhang, Y.; Ye, Y.; Shen, D.; Jiang, K.; Zhang, H.; Sun, W.; Zhang, J.; Xu, F.; Cui, Z.; Wang, S. Identification of transgelin-2 as a biomarker of colorectal cancer by laser capture microdissection and quantitative proteome analysis. Cancer Sci. 2010, 101, 523–529. [Google Scholar] [CrossRef]
  31. Lee-Six, H.; Olafsson, S.; Ellis, P.; Osborne, R.J.; Sanders, M.A.; Moore, L.; Georgakopoulos, N.; Torrente, F.; Noorani, A.; Goddard, M.; et al. The landscape of somatic mutation in normal colorectal epithelial cells. Nature 2019, 574, 532–537. [Google Scholar] [CrossRef] [PubMed]
  32. Clarke, C.N.; Kopetz, E.S. BRAF mutant colorectal cancer as a distinct subset of colorectal cancer: Clinical characteristics, clinical behavior, and response to targeted therapies. J. Gastrointest. Oncol. 2015, 6, 660–667. [Google Scholar] [PubMed]
  33. Shang, W.; Yan, C.; Liu, R.; Chen, L.; Cheng, D.; Hao, L.; Yuan, W.; Chen, J.; Yang, H. Clinical significance of FBXW7 tumor suppressor gene mutations and expression in human colorectal cancer: A systemic review and meta-analysis. BMC Cancer. 2021, 21, 770. [Google Scholar] [CrossRef] [PubMed]
  34. Liu, S.; Lin, H.; Wang, D.; Li, Q.; Luo, H.; Li, G.; Chen, X.; Li, Y.; Chen, P.; Zhai, B.; et al. PCDH17 increases the sensitivity of colorectal cancer to 5-fluorouracil treatment by inducing apoptosis and autophagic cell death. Signal Transduct. Target Ther. 2019, 4, 53. [Google Scholar] [CrossRef] [PubMed]
  35. Feng, G.; Ma, H.M.; Huang, H.B.; Li, Y.W.; Zhang, P.; Huang, J.J.; Cheng, L.; Li, G.R. Overexpression of COL5A1 promotes tumor progression and metastasis and correlates with poor survival of patients with clear cell renal cell carcinoma. Cancer Manag. Res. 2019, 11, 1263–1274. [Google Scholar] [CrossRef] [PubMed]
  36. Ge, S.X.; Jung, D.; Yao, R. ShinyGO: A graphical gene-set enrichment tool for animals and plants. Bioinformatics 2020, 36, 2628–2629. [Google Scholar] [CrossRef] [PubMed]
  37. He, J.; Han, J.; Liu, J.; Yang, R.; Wang, J.; Wang, X.; Chen, X. Genetic and Epigenetic Impact of Chronic Inflammation on Colon Mucosa Cells. Front. Genet. 2021, 12, 722835. [Google Scholar] [CrossRef]
  38. Valcz, G.; Buzás, E.; Gatenby, R.A.; Ujvari, B.; Molnár, B. Small extracellular vesicles from surviving cancer cells as multiparametric monitoring tools of measurable residual disease and therapeutic efficiency. Biochim. Biophys. Acta BBA Cancer 2024, 1879, 189088. [Google Scholar] [CrossRef] [PubMed]
  39. Dvir, A.; Camarda, R.; Odegaard, J.; Paik, H.; Oskotsky, B.; Krings, G.; Goga, A.; Sirota, M.; Butte, A. Comprehensive analysis of normal adjacent to tumor transcriptomes. Nat. Commun. 2017, 8, 1077. [Google Scholar]
  40. Kim, J.; Kim, H.; Lee, M.S.; Lee, H.; Kim, Y.J.; Lee, W.Y.; Yun, S.H.; Kim, H.C.; Hong, H.K.; Hannenhalli, S.; et al. Transcriptomes of the tumor-adjacent normal tissues are more informative than tumors in predicting recurrence in colorectal cancer patients. J. Transl. Med. 2023, 21, 304. [Google Scholar] [CrossRef] [PubMed]
  41. The Galaxy Community. The Galaxy platform for accessible, reproducible, and collaborative data analyses: 2024 update. Nucleic Acids Res. 2024, 52, W83–W94. [Google Scholar] [CrossRef] [PubMed]
  42. Kandoth, C.; Gao, J.; Mattioni, M.; Struck, A.; Boursin, Y.; Penson, A.; Chavan, S. mskcc/vcf2maf: vcf2maf v1.6.16. (v1.6.16), Zenodo: Genève, Switzerland, 2020. [Google Scholar] [CrossRef]
  43. McLaren, W.; Gil, L.; Hunt, S.E.; Riat, H.S.; Ritchie, G.R.; Thormann, A.; Flicek, P.; Cunningham, F. The Ensembl Variant Effect Predictor. Genome Biol. 2016, 17, 122. [Google Scholar] [CrossRef] [PubMed]
  44. Landrum, M.J.; Lee, M.J.; Riley, G.R.; Jang, W.; Rubinstein, W.S.; Church, D.M.; Maglott, D.R. ClinVar: Public archive of relationships among sequence variation and human phenotype. Nucleic Acids Res. 2014, 42, D980–D985. [Google Scholar] [CrossRef]
  45. Mayakonda, A.; Lin, D.C.; Assenov, Y.; Plass, C.; Koeffler, H.P. Maftools: Efficient and comprehensive analysis of somatic variants in cancer. Genome Res. 2018, 28, 1747–1756. [Google Scholar] [CrossRef] [PubMed]
  46. Hunter, J.D. Matplotlib: A 2D graphics environment. Comput. Sci. Eng. 2007, 9, 90–95. [Google Scholar] [CrossRef]
Figure 1. The distribution of the Tumor Mutational Burden (TMB) in the NAT and CRC samples is shown on a logarithmic scale. On average, the CRC group exhibits a higher TMB, indicating a greater number of somatic variations compared to the NAT group.
Figure 2. Summary plots of the (a) NEG, (b) NAT, and (c) CRC samples. Variant classification distribution: the X-axis represents the number of variants, and the Y-axis represents the variant classification categories. Variant type and SNV class plots: the X-axis represents the number of variants, and the Y-axis represents the variant type or SNV class categories. Variants per sample plot: the X-axis represents the ID of samples, and the Y-axis represents the number of variants. Variant classification summary: the X-axis represents the variant classifications, and the Y-axis represents the number of variants. Top 10 mutated genes: the X-axis represents the number of mutations, and the Y-axis lists the top 10 mutated genes.
Figure 3. Cross-cancer genome mutation patterns may serve as a proxy to identify positive (collaboration) or negative (synthetic lethal) epistatic relationships between recurrently mutated driver genes. The epistatic relationship between two driver genes may be inferred from cross-cancer mutation patterns, whereby co-occurrence may indicate a synergistic interaction in promoting tumorigenesis. By contrast, mutually exclusive driver genes may negatively impact tumorigenesis when mutated jointly. Here, mutually exclusive and co-occurring gene pairs are presented in a triangular matrix per tissue group: (a) NEG, (b) NAT, and (c) CRC. Bluish-green indicates a tendency toward co-occurrence, whereas brown indicates a tendency toward mutual exclusivity. The color intensity corresponds to the significance of the relationship between genes, and the star symbol denotes a higher significance (p < 0.01) than the dot (p < 0.05).
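The caption of Figure 3 does not state which test underlies the star and dot markers. As a minimal sketch, assuming a pairwise two-sided Fisher's exact test on a 2 × 2 table of samples (an assumption, not a detail given in the caption), the tendency and the significance marker for one gene pair could be derived as follows. The function names and the example counts are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(both, only_a, only_b, neither):
    """Two-sided Fisher's exact p-value for a 2x2 table of sample counts.

    both: samples with gene A and gene B mutated
    only_a / only_b: samples with exactly one of the two genes mutated
    neither: samples with neither gene mutated
    """
    row_a = both + only_a          # samples with gene A mutated
    row_not_a = only_b + neither   # samples with gene A wild-type
    col_b = both + only_b          # samples with gene B mutated
    total = row_a + row_not_a

    def prob(x):
        # Hypergeometric probability of a table with x double-mutant samples
        return comb(row_a, x) * comb(row_not_a, col_b - x) / comb(total, col_b)

    lowest = max(0, col_b - row_not_a)
    highest = min(row_a, col_b)
    p_observed = prob(both)
    # Sum the probabilities of all tables at least as unlikely as the observed one
    p_value = sum(
        prob(x) for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )
    return min(1.0, p_value)


def tendency(both, only_a, only_b, neither):
    """Direction of the association, judged by the cross-product of the table."""
    if both * neither > only_a * only_b:
        return "co-occurrence"
    if both * neither < only_a * only_b:
        return "mutual exclusivity"
    return "none"


def marker(p_value):
    """Significance symbols as described in the caption of Figure 3."""
    if p_value < 0.01:
        return "*"
    if p_value < 0.05:
        return "."
    return ""


# Hypothetical counts for one gene pair across 12 samples
p = fisher_exact_two_sided(6, 0, 0, 6)
print(tendency(6, 0, 0, 6), marker(p))
```

With only about a dozen samples per group, such a test has little power, so only strongly skewed tables reach either threshold.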
Figure 4. The tumor heterogeneity patterns of samples of the (a) NEG, (b) NAT, and (c) CRC groups. The X-axis represents the ID of samples, and the Y-axis represents the total number of mutations detected. The number of detected mutations per sample is marked above the corresponding columns, and the height of the columns is proportional to the number of detected variants. Different colors correspond to different genes. The gene-color coding is illustrated on the top-right corners of the figures.
Figure 5. Comparative plots regarding the sequencing input samples: (a) single-cell vs. bulk sequencing, (b) cfDNA vs. bulk sequencing, and (c) single-cell vs. cfDNA sequencing. The X-axis represents the genes, and the Y-axis represents the number of detected mutations on a logarithmic scale. The different sequencing methods are represented by different colors, where red is associated with bulk, blue is associated with single-cell, and green is associated with cfDNA sequencing.
Figure 6. The efficiency of the different sequencing methods regarding the detection of mutations. The X-axis represents the different sequencing methods: (P) cfDNA, (B) bulk, and (S-C) single-cell sequencing. The Y-axis represents the number of detected variants. The height of the columns is proportional to the number of detected variants. Different colors correspond to different genes. The illustrated genes are characteristic of non-hypermutated colon tumors: APC, TTN, TP53, KRAS, MUC16, MUC5B, PIK3CA, BRAF, SOX9, RYR1, RYR2, RYR3, FBXW7, ARID1A, COL5A1, COL6A3, KIAA019, and PCDH17. The color-coding of genes is shown in the top-right corner of the figure.
Table 1. Mean coverage depth values for individual samples with columns representing sample IDs and rows indicating different tissue groups. Each row–column intersection reflects the unique coverage values.
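| Group | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | Mean | STD |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| NEG | 8.2 | 25.1 | 2.5 | 4.2 | 0.5 | 10.9 | 76.6 | 126 | 8.9 | 19.2 | 14.3 | 1.7 | 24.8 | 37.9 |
| NAT | 37.2 | 11.4 | 13.5 | 125.5 | 85.3 | 33.7 | 0.8 | 0.7 | 5.6 | 0.9 | 1.6 | 0.2 | 26.4 | 39.9 |
| CRC | 141.2 | 12.6 | 66 | 0.3 | 1 | 14.1 | 62 | 0.8 | 0.4 | 0.1 | 5 | 74.8 | 31.5 | 44.9 |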
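The Mean and STD columns of Table 1 can be checked with a few lines of Python. This is a minimal sketch that uses the NEG and NAT coverage values as read from the table and assumes that STD denotes the sample standard deviation (n − 1 denominator), an assumption that is consistent with the reported figures. The dictionary name and the threshold variable are illustrative.

```python
from statistics import mean, stdev

# Mean coverage depth per sample, as read from Table 1 (samples 1-12)
coverage = {
    "NEG": [8.2, 25.1, 2.5, 4.2, 0.5, 10.9, 76.6, 126, 8.9, 19.2, 14.3, 1.7],
    "NAT": [37.2, 11.4, 13.5, 125.5, 85.3, 33.7, 0.8, 0.7, 5.6, 0.9, 1.6, 0.2],
}

for group, depths in coverage.items():
    # stdev() is the sample standard deviation (divides by n - 1)
    print(group, round(mean(depths), 1), round(stdev(depths), 1))

# Number of NAT samples passing the >10X coverage threshold used for Table 2
nat_threshold = 10
nat_passing = sum(depth > nat_threshold for depth in coverage["NAT"])
print("NAT samples above threshold:", nat_passing)
```

The large standard deviations relative to the means reflect how unevenly coverage is distributed across the single-cell samples.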
Table 2. Heterogeneity mutation data of single-cell samples for the genes TTN, APC, KRAS, TP53, PIK3CA, FBXW7, and SOX9. Only samples with adequate coverage (for CRC >1X, for NAT >10X) were included. The table presents the occurrence and number of mutations in the specific genes per sample. The last row of the table shows the total number of mutations for all samples. Where the number 6 appears next to the summation sign, the total number of mutations has been normalized to six samples. This normalization was necessary due to the unequal number of samples in the different groups.
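| Gene | NAT: Number of Mutations | NAT: Samples Containing the Mutated Gene | NAT: Occurrence Rate of the Mutated Genes | CRC: Number of Mutations | CRC: Samples Containing the Mutated Gene | CRC: Occurrence Rate of the Mutated Genes |
|---|---|---|---|---|---|---|
| TTN | 174 | 6/6 | 100% | 197 | 7/7 | 100% |
| APC | 137 | 6/6 | 100% | 197 | 5/7 | 71% |
| KRAS | 73 | 5/6 | 83% | 45 | 5/7 | 71% |
| TP53 | 62 | 5/6 | 83% | 66 | 4/7 | 57% |
| PIK3CA | 10 | 3/6 | 50% | 38 | 5/7 | 71% |
| FBXW7 | 14 | 5/6 | 83% | 25 | 5/7 | 71% |
| SOX9 | 7 | 2/6 | 33% | 7 | 4/7 | 57% |
| Total | ∑ = 477; ∑₆ = 477 | | | ∑ = 498; ∑₆ = 427 | | |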
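The normalization described in the caption of Table 2 can be illustrated with a short calculation. This sketch assumes a simple proportional rescaling (total × 6 / number of samples), which reproduces the reported normalized totals. The function names are illustrative and not taken from the study.

```python
def normalize_total(total_mutations, n_samples, reference_n=6):
    """Rescale a group's total mutation count to a common number of samples."""
    return round(total_mutations * reference_n / n_samples)


def occurrence_rate(samples_with_gene, n_samples):
    """Percentage of samples in which the gene carries at least one mutation."""
    return round(100 * samples_with_gene / n_samples)


# Reported totals from the last row of Table 2
print(normalize_total(477, 6))  # NAT: six samples, so the total is unchanged
print(normalize_total(498, 7))  # CRC: seven samples rescaled to six

# Occurrence rates, e.g. KRAS in CRC (5 of 7 samples) and SOX9 in NAT (2 of 6)
print(occurrence_rate(5, 7), occurrence_rate(2, 6))
```

Rescaling to a common sample count makes the group totals comparable, but it does not correct for the different coverage thresholds applied to the NAT and CRC samples.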
Table 3. The non-benign mutations identified by the single-cell and bulk sequencing assays. The variant results were annotated using the ClinVar database. This table lists only those mutations that have previously been reported with clinical significance.
Single-Cell Sequencing

| Gene | Sequence Variation | dbSNP ID | Mutation Classification |
|---|---|---|---|
| BRAF | c.1763T>C | rs1562954580 | uncertain significance |
| BRAF | c.1141-1111G>A | rs373442098 | conflicting classifications of pathogenicity |
| TP53 | c.424C>T | rs1597371187 | uncertain significance |
| SMAD4 | c.692dupG | rs377767334 | pathogenic |
| APC | c.559+37C>A | rs1554084977 | uncertain significance |
| APC | c.560-84T>C | rs1561605775 | uncertain significance |

Bulk Sequencing

| Gene | Sequence Variation | dbSNP ID | Mutation Classification |
|---|---|---|---|
| APC | c.136-1428A>C | rs2464807 | other |
| APC | c.2413C>T | rs587779783 | pathogenic |
| APC | c.4666dupA | rs587783031 | pathogenic |
| EGFR | c.3368C>T | rs775317295 | uncertain significance |
| TP53 | c.877-1G>A | rs587782272 | pathogenic/likely pathogenic |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
