In contrast to approaches that compare control (i.e., normal) and treated (i.e., disease) samples pairwise, we compared colorectal cancer samples not only to a set of control samples but also against a wide range of samples and conditions to collect the differentially expressed genes and identify target genes. We identified specific genes for colorectal cancer and showed that they are significantly associated with colorectal cancer in the literature. Analysis of independent datasets revealed a significantly distinct expression pattern for glucocorticoid receptor (GR) and ring finger protein 43 (RNF43) in colorectal cancer samples. GR was downregulated whereas RNF43 was upregulated in colorectal cancer with respect to various conditions in different datasets. In the HCT116 colorectal cancer cell line, knock-down of GR levels with siRNA resulted in increased RNF43 levels, suggesting that GR might be a negative regulator of RNF43. Our study suggests that the downregulation of GR might be involved in the upregulation of RNF43 in colorectal cancer.
Academic Editor: Rizwan Khan, Aligarh Muslim University
Checked for plagiarism: Yes
Review by: Single-blind
Copyright © 2016 Ertugrul Dalgic, et al.
The authors have declared that no competing interests exist.
Analyses of gene expression levels generally focus on differential expression between two conditions. In cancer studies the comparison would be tumor vs. normal samples, long- vs. poor-survival samples, or metastatic vs. non-metastatic samples 1, 2. However, the analysis of a multi-conditional large-scale gene expression dataset has provided useful information, such as identifying genes with switch-like behavior, which are not readily recovered using pairwise data analysis 3. Analyzing the expression levels of a gene across a diverse condition space provides a better measure of the specificity of the gene for a particular condition 4. To determine whether a gene of interest for a condition is specific to that condition and differentially expressed, we need to understand the distribution of “normal” expression levels of that gene. This set of possible expression levels can be more readily obtained from a large-scale multi-condition dataset, which can be used to determine the “normal” expression distribution of a gene and, in turn, to better define the expression levels that are “abnormal”. For example, p53, a well-known gene commonly mutated in many cancer types, does not appear to be differentially expressed in a small dataset that compares only two conditions, i.e., control vs. γ-irradiation samples (wherein the DNA-damage response gene p53 is known to be activated), but was identified in a large dataset comprising multiple conditions not limited to the DNA-damage response 3. Thus, a specific gene for a condition of interest was readily uncovered by comparing the expression levels of the condition of interest with many different conditions. This approach was also able to predict novel genes for cancer.
The TACSTD2 (tumor-associated calcium signal transducer 2) gene was shown to be significantly expressed in patient-derived breast cancer samples when compared to a wider range of controls, and the significance of TACSTD2 for breast cancer cells was confirmed experimentally in the MCF7 and MDA-MB-231 breast cancer cell lines 3, 4. Regression-based analysis of datasets with multiple conditions was also shown to be useful for identifying drug targets 5.
Unlike previous studies that identified specific genes based on the gene expression level changes by comparing tumor to only normal samples, we identified differentially expressed genes specific to colorectal cancer by comparing the colorectal cancer samples to samples from a large set of diverse conditions. The large sample set with over 5000 samples included normal samples from various tissues including colorectal tissue, different cancer samples, and cell lines, as well as other disease samples, covering more than 200 types of samples. We calculated the separation of the expression values using a D value (see Supplementary Information), which represents a normalized absolute difference of the mean values between two populations 3. To assess the relevance of the list of differentially expressed genes, we compared the lists of differentially expressed genes obtained from a pairwise comparison of the colon cancer samples and normal samples vs. those obtained from the multiple comparison for their enrichment of colorectal cancer related publications in the literature. The multiple comparison approach yielded more significant genes that are related to colorectal cancer than the pairwise comparison (Figure 1). Based on the assumption that the appearance of a gene together with colorectal cancer in the literature indicates the importance of the gene in colorectal cancer, the multiple comparison approach appeared to offer better predictors of cancer type-specific genes.
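The D value described above can be illustrated with a short sketch. The exact normalization is defined in the Supplementary Information and is not reproduced in this text, so the form used here (absolute difference of the means divided by the sum of the two sample standard deviations) is an assumption, as are the function name `d_value` and the expression values, which are invented for illustration only.

```python
from statistics import mean, stdev


def d_value(group_a, group_b):
    """Separation between two sets of expression values.

    ASSUMED form: |mean_a - mean_b| / (sd_a + sd_b). The normalization
    actually used in the study is given in its Supplementary Information.
    """
    spread = stdev(group_a) + stdev(group_b)
    if spread == 0:
        raise ValueError("both groups have zero variance; D is undefined")
    return abs(mean(group_a) - mean(group_b)) / spread


# Hypothetical log-scale expression values for a single gene.
cancer_samples = [9.1, 9.4, 8.9, 9.3, 9.0]
reference_samples = [6.2, 6.8, 5.9, 6.5, 6.1, 6.4]

d = d_value(cancer_samples, reference_samples)
is_distinct = d >= 2.0  # cut-off of 2, as used in this study
```

For the multiple comparison approach, `reference_samples` would hold the values from all other conditions in the dataset. For the pairwise approach, it would hold only the normal colorectal samples.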
Figure 1. Literature comparison of differentially expressed genes in colorectal cancer samples with respect to only the non-cancer colorectal samples (pairwise) vs. all other samples (multiple conditions). (A) Fraction of genes that are relevant to colorectal cancer in the PubMed database when different D value cut-offs around 2 are chosen. (B) Significance of the genes that are relevant to colorectal cancer in the PubMed database when different D value cut-offs around 2 are chosen, based on a permutation test. (C) Modulation of mRNA levels of RNF43 and GRα. Specific siRNA treatments for RNF43 and GR (RNF43 siRNA 1 and GR siRNA 1) were performed in the HCT116 colorectal cancer cell line and normalized to scrambled (control) siRNA. Real-time PCR results of the primers specific for RNF43 and GRα (GR primers and the first set of RNF43 primers) are normalized to the results from the beta-actin primers. Fold change values are based on 9 values representing the comparison of the 3 treated samples and their replicates to the 3 control samples and their replicates. * indicates p ≤ 0.05 based on the comparison of expression values of the treatment replicates to the control replicates.
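The permutation test behind panel (B) is not described in detail in this text. One common way to set up such a test, shown here purely as a sketch under assumed conventions, is to draw random gene sets of the same size as the selected list and count how often their literature-associated fraction is at least as large as the observed one. The function name, the add-one correction, and the toy gene identifiers are all hypothetical.

```python
import random


def literature_enrichment(selected, all_genes, literature_genes,
                          n_perm=10000, seed=0):
    """Return (observed fraction, permutation p-value).

    selected         -- genes passing the D value cut-off
    all_genes        -- list of every gene measured on the platform
    literature_genes -- set of genes co-cited with the disease
    """
    rng = random.Random(seed)
    selected = list(selected)
    size = len(selected)
    observed = sum(g in literature_genes for g in selected) / size

    at_least_as_large = 0
    for _ in range(n_perm):
        draw = rng.sample(all_genes, size)  # random gene set, same size
        fraction = sum(g in literature_genes for g in draw) / size
        if fraction >= observed:
            at_least_as_large += 1

    # Add-one correction keeps the estimated p-value above zero.
    p_value = (at_least_as_large + 1) / (n_perm + 1)
    return observed, p_value


# Toy universe: 1000 placeholder genes, 50 of them "in the literature".
universe = [f"g{i}" for i in range(1000)]
in_literature = set(universe[:50])

fraction, p = literature_enrichment(universe[:10], universe, in_literature)
```

Running the same function on the pairwise and the multiple comparison gene lists, at each D value cut-off, would give the two curves compared in the figure.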
This study does not suggest that either the pairwise or the multiple comparison approach is superior, but rather that they provide different sets of genes that could be integrated to offer a more robust list of important genes. Surprisingly, when both approaches were applied to the large multi-condition dataset, they shared only one gene for colorectal cancer, NR3C1 (GR). The multiple comparison approach thus provides a very different set of genes from the pairwise approach. Although the multiple comparison approach is likely to provide more specific genes for the condition of interest than the pairwise comparison approach, it has some drawbacks, such as the challenge of (i) selecting the reference set of samples, (ii) merging a large set of samples from different sources into a single reference set, and (iii) comparing a relatively small set of samples (the sample set of interest, e.g., the disease samples) to a much larger set of samples (reference samples from various sources). Therefore, this study provides a comparative analysis and suggests that integrating different approaches to determine differentially expressed genes could provide novel mechanistic insight for a condition of interest.
We set the threshold for the D value to be at least 2, indicating two distinct populations 3, and obtained a list of colorectal cancer specific genes that have significantly distinct expression profiles in colorectal cancer from the other samples in the multi-condition dataset. After obtaining a list of specific genes using the multiple comparison approach, we confirmed the differential expression of these genes with an independent colon cancer dataset. For the pairwise comparison approach, we performed the D value analysis for the second dataset that included colon tumor samples and paired normal samples. We identified only the glucocorticoid receptor (NR3C1, GR) and ring finger protein 43 (RNF43) genes as having D values greater than 2 in the pairwise comparison approach, from the list of genes specific to colorectal cancer obtained from the multiple comparison approach. NR3C1 expression levels were lower in the colorectal cancer samples than other samples, whereas RNF43 expression levels were higher in the colorectal cancer samples than other samples in both multiple and pair-wise comparison approaches. In order to test if this observation held across different experiments, we analyzed the expression levels of NR3C1 and RNF43 genes in an independent microarray dataset, which included data from various colorectal cancer cell lines. We confirmed that NR3C1 levels are low, while RNF43 levels are high in most of the colorectal cancer cell lines, including HCT116 (Supplementary Fig. 1). Therefore, this potential relationship between the genes was further explored in vitro.
GR (NR3C1) is a nuclear receptor for glucocorticoid hormones, primarily involved in maintaining homeostasis in response to stress. However, it has diverse roles in various cell types and under different conditions 6. GR expression levels and function vary among different colorectal cancer patient samples and cell lines 6, 7, 8. GR was found to be epigenetically downregulated or absent at the protein level in some patient tissue-derived colorectal cancer samples as well as in certain colorectal cancer cell lines 7, 8. To investigate whether low levels of GR are related to high levels of RNF43 and to identify a possible regulatory mechanism between GR and RNF43, we knocked down GR and RNF43 levels using specific siRNAs in the HCT116 colorectal cancer cell line. While knock-down of GR increased RNF43 mRNA levels, knock-down of RNF43 did not change GR mRNA levels (Figure 1C, Supplementary Fig. 2).
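The Figure 1 legend states that the real-time PCR results were normalized to beta-actin and that the fold change is based on 9 values obtained by comparing 3 treated samples to 3 control samples. The text does not give the formula, so the sketch below assumes the widely used 2^(-ΔΔCt) calculation and forms all 3 × 3 treated/control pairings. The function names and the Ct values are hypothetical.

```python
from itertools import product


def delta_ct(target_ct, actin_ct):
    """Target Ct normalized to the beta-actin Ct of the same sample."""
    return target_ct - actin_ct


def pairwise_fold_changes(treated, control):
    """Fold change for every treated/control pairing.

    treated, control -- lists of (target_ct, actin_ct) tuples.
    ASSUMES the 2^(-ddCt) method; the study's exact calculation
    is not stated in the text.
    """
    folds = []
    for (t_target, t_actin), (c_target, c_actin) in product(treated, control):
        ddct = delta_ct(t_target, t_actin) - delta_ct(c_target, c_actin)
        folds.append(2 ** (-ddct))
    return folds


# Hypothetical Ct values: RNF43 primers after GR siRNA vs. scrambled siRNA.
gr_sirna = [(24.0, 16.0), (24.2, 16.1), (23.9, 16.0)]
scrambled = [(25.0, 16.0), (25.1, 16.1), (24.9, 15.9)]

folds = pairwise_fold_changes(gr_sirna, scrambled)  # 3 x 3 = 9 values
```

A fold change above 1 for RNF43 after GR knock-down, together with a fold change near 1 for GR after RNF43 knock-down, would correspond to the one-directional effect reported here.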
This result suggests a possible negative regulation of RNF43 by GR at the transcriptional level (Figure 2). The downregulation of GR and the upregulation of RNF43 vary in different colorectal cancer cell lines, with the GR and RNF43 levels in the HCT116 cell line in the mid-range (Supplementary Fig. 1). The variability of GR and RNF43 levels in different colorectal cancer cell lines suggests that their deregulation could have differing impacts on various colorectal cancer cell lines. In the HCT116 cell line, downregulating GR levels by siRNA upregulated RNF43 levels. In support of transcriptional regulation, GR has a binding site 138 kb upstream of the RNF43 transcription start site in lung carcinoma cells 9. The position of this binding site is consistent with the results of a large-scale analysis of glucocorticoid-induced target genes negatively regulated by GR, in which GR binds at sites distant from the transcription start site of these genes (a median of 146 kb, in contrast to a median of 11 kb for positively regulated genes) 9. In a different study, GR was found to have a binding site for RNF43 transcriptional regulation, based on computational analysis of chromatin immunoprecipitation sequencing data from U2OS osteosarcoma cells 10. Alternatively, β-catenin was found to be a positive regulator of RNF43 transcription in the HCT116 colorectal cancer cell line, by directly binding to its promoter together with TCF4 11. RNF43 can be induced by Wnt signaling in colorectal cancer, and the Wnt pathway is well known to induce the progression of colorectal cancer 12, 13. RNF43 is upregulated in colorectal cancer, and knock-down of RNF43 suppresses the growth of colorectal cancer 14. Therefore, it is possible that the Wnt signaling pathway, as an early event in colorectal cancer, induces RNF43, which may in turn downregulate proteins like p53 and other tumor suppressors 15.
Interestingly, RNF43 has also been found to be mutated in colorectal cancer samples as well as in the HCT116 cell line, and to have a negative effect on Wnt signaling, suggesting that it has diverse roles in different cases of colorectal cancer 12, 16, 17. The complex roles of RNF43 in colorectal cancer could be related to the varying levels of RNF43 in the different colorectal cancer cell lines (Supplementary Fig. 1). GR has also been shown to downregulate the Wnt signaling pathway by direct inactivation of β-catenin 18. By downregulating the Wnt signaling pathway, GR represses targets of the Wnt pathway such as Cyclin D1. In this scenario, it is possible that GR could also indirectly downregulate RNF43, another target of the Wnt pathway, through inactivation of β-catenin (Figure 2). This would be a second potential mechanism, in addition to direct transcriptional repression, by which GR downregulates RNF43.
Figure 2. Putative mechanism for the relationship between GR and RNF43 in colorectal cancer. The effects of GR on β-catenin and directly on RNF43 transcription have not been proven in colorectal cancer and are therefore shown in red.
In summary, we applied a top-down approach to large-scale gene expression datasets to find differentially expressed genes specific to colorectal cancer. We applied both the multiple comparison approach, in which we compared colorectal cancer samples to a wide variety of samples, and the pairwise comparison approach, in which we compared colorectal cancer samples to only normal samples. Integration of these different approaches showed that levels of GR and RNF43 are low and high, respectively, in colorectal cancer, but that their levels vary among cell lines. Silencing GR in a colorectal cancer cell line suggests a potential regulatory relationship between GR and RNF43. Namely, GR could be involved in the negative regulation of RNF43 transcription, thereby controlling its activity. The control of RNF43 levels by GR could be disrupted in some cases of colorectal cancer, such that downregulation of GR abolishes the negative transcriptional regulation of RNF43 and thereby activates it, which in turn could contribute to the progression of colorectal tumorigenesis. Nevertheless, the role of RNF43 in colorectal tumorigenesis is complex and could differ in various cases of colorectal cancer 14, 17. The findings of this study could have important clinical implications, given that GR positivity correlates with colorectal cancer survival and that vaccine therapies using RNF43-derived antigens are currently in clinical trials for colorectal cancer 8, 19, 20. Namely, the potential use of therapies modulating glucocorticoid and GR levels together with the RNF43 vaccine should be investigated in colorectal cancer patients for whom the tumor-suppressing and tumor-promoting roles of GR and RNF43 levels could be clearly defined.