Purpose: Molecular characteristics derived from gene-expression profiling can improve the prediction of treatment responses and, ultimately, the clinical outcome of cancer patients. We aimed to develop a genetic signature to improve the prediction of chemosensitivity and prognosis in patients with colorectal cancer (CRC).
Patients and Methods: We analyzed microarray data of 32 CRC patients to explore the potential functions and pathways involved in disease relapse in CRC. Gene expression profiles and clinical follow-up information of GSE39582, GSE17536, and GSE103479 were downloaded from the Gene Expression Omnibus (GEO) database to identify prognostic genes. Eventually, a 15-mRNA signature model was established, and its efficacy for predicting chemosensitivity and prognosis was examined.
Results: Based on the proposed 15-mRNA signature, patients in the test series could be classified into high-risk and low-risk subgroups with significantly different overall survival (OS) (hazard ratio [HR]=1.48, 95% confidence interval [CI]=1.30–1.70, P≤0.001). The prognostic value of this 15-mRNA signature was confirmed in another validation series. Further analysis revealed that the prognostic value of this signature was independent of TNM stage and that it could predict adjuvant chemosensitivity in patients with early-stage CRC.
Conclusion: We identified a novel 15-mRNA signature in patients with CRC, which could be clinically helpful for prognosis evaluation and for selecting patients with early-stage CRC to undergo adjuvant chemotherapy.
Colorectal cancer (CRC) is one of the most common types of cancer worldwide.1 Common strategies for CRC management include chemotherapy, radiotherapy, targeted therapy, and surgical resection. Treatment decisions are primarily based on TNM stage, according to the staging system of the American Joint Committee on Cancer (AJCC) or the Union for International Cancer Control (UICC), which is regarded as the strongest prognostic parameter in CRC.2–4 In addition to TNM staging, histological differentiation and vascular and nerve invasion also influence prognosis to some extent.5–9 Generally, the standard treatment for stage III disease, and for stage II disease with unfavorable prognostic characteristics, is surgery in combination with 5-Fluorouracil (5-Fu)-based adjuvant chemotherapy.10 However, prognosis varies even among patients with identical TNM stage and prognostic parameters. Therefore, more sensitive indicators of malignant aggressiveness and chemosensitivity are needed to complement TNM staging.
It has been reported that molecular biomarkers may improve prognostic evaluation and the identification of potential high-risk patients. For instance, RAS gene mutation has been studied as a prognostic factor in CRC in a number of previous studies.10–12 Additionally, several other prognostic biomarkers for CRC have been tested, such as fibroblast growth factor receptor (FGFR), human epidermal growth factor receptor 2 (HER2), and epidermal growth factor receptor (EGFR).13–15 However, additional valuable molecular biomarkers are needed to improve the clinical outcome of patients with CRC. Several studies have shown that identification and application of integrated genetic profiles may improve the prediction of outcomes in cancer patients, as well as the prediction of response or toxicity for various anticancer drugs.16–20 Therefore, the search for genetic signatures may be valuable for prognosis in the management of CRC.
In the present study, we investigated the genes and functional pathways that were highly correlated with disease relapse in CRC. In addition, we analyzed mRNA expression profiles from a cohort of 895 patients. Using univariate regression analysis, we identified 15 genes with strong prognostic influence. The prognostic values of these genes and their association with chemosensitivity were then tested and validated.
Patients who had undergone curative resection of stage III CRC in the Sixth Affiliated Hospital of Sun Yat-sen University (Guangzhou, China) between 2007 and 2010 were studied, as previously reported.21 This study was approved by the Ethics Committee of the Sixth Affiliated Hospital of Sun Yat-sen University. Predefined inclusion criteria were as follows: 1) histologically proven adenocarcinoma of the colon or rectum, pT1-4N1-2M0; 2) R0 resection (histologically negative margins); 3) fresh frozen tissue of the primary lesion available in the tissue bank of the Sixth Affiliated Hospital of Sun Yat-sen University; 4) treatment with adjuvant modified FOLFOX6 or CapeOx; and 5) follow-up showing recurrence or metastasis within 1.5 years (defined as non-responders), or disease-free status for at least 3 years (defined as responders). Exclusion criteria included: 1) history of invasive colorectal malignancy or other malignancies; 2) history or family history of familial adenomatous polyposis or hereditary nonpolyposis CRC; or 3) laparoscopic resection of synchronous colorectal malignant lesions.
Total RNA was isolated from frozen samples using the RiboPure™ Kit (Ambion, Austin, TX, USA). Gene expression profiles were detected by using Affymetrix HG-U133 Plus 2.0 GeneChips (Affymetrix Inc., Santa Clara, CA, USA) according to manufacturer’s instructions, as described previously.22
Raw CEL files were normalized using the robust multichip average (RMA) method implemented in the Affy package in R.23 Probes with absent signal in all samples were removed. Differential expression analysis was performed using the Limma R package,24 and the cutoff for significantly high or low expression between responders and non-responders was a fold change of at least two and a P-value less than 0.05.
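The thresholding step described above can be sketched as follows. This is a minimal illustration only, assuming log2-scale expression values (as RMA produces), so that a two-fold change corresponds to an absolute log2 fold change of at least 1. The names `ProbeResult` and `filter_degs` and all example values are hypothetical, and the Limma moderated test that would produce the P-values is not reproduced here.

```python
import math
from typing import NamedTuple


class ProbeResult(NamedTuple):
    probe_id: str
    log2_fc: float   # log2 fold change, non-responders vs responders (assumed direction)
    p_value: float


def filter_degs(results, min_fold_change=2.0, max_p=0.05):
    """Keep probes with at least `min_fold_change` in either direction and P < max_p."""
    min_abs_log2 = math.log2(min_fold_change)
    return [
        r for r in results
        if abs(r.log2_fc) >= min_abs_log2 and r.p_value < max_p
    ]


# Hypothetical probe-level results, for illustration only.
example = [
    ProbeResult("probe_a", 1.4, 0.003),   # passes both thresholds
    ProbeResult("probe_b", -1.1, 0.020),  # passes (lower in non-responders)
    ProbeResult("probe_c", 0.6, 0.001),   # fold change too small
    ProbeResult("probe_d", 2.0, 0.200),   # P-value too large
]
selected = [r.probe_id for r in filter_degs(example)]
```

Both criteria must hold at once, so a probe with a large fold change but a non-significant P-value is excluded, as is a significant probe with a small fold change.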
Functional enrichment analysis was performed using the MSigDB H (hallmark) gene set collection and the C5 Gene Ontology (GO) gene set collection.25,26 GO functional annotations describe the molecular function, biological process, and cellular component of differentially expressed genes (DEGs). Hallmark gene sets represent canonical tumor pathways. The DEGs were used to calculate P-values and to identify significantly enriched categories by the hypergeometric distribution method. A P-value <0.001 and an odds ratio >10 were taken as the thresholds for statistical significance.
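For illustration, a hypergeometric enrichment test with the two thresholds above can be sketched as below. This is a sketch under stated assumptions rather than the exact procedure used: the background size, the function names, and the example counts are hypothetical, and the odds ratio is taken from the usual 2×2 table (DEG or not, in gene set or not).

```python
from math import comb


def hypergeom_enrichment_p(n_background, n_in_set, n_degs, n_overlap):
    """Upper-tail probability P(X >= n_overlap) under the hypergeometric distribution."""
    total = comb(n_background, n_degs)
    upper = min(n_in_set, n_degs)
    tail = sum(
        comb(n_in_set, k) * comb(n_background - n_in_set, n_degs - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


def odds_ratio(n_background, n_in_set, n_degs, n_overlap):
    """Odds ratio of the 2x2 table (DEG vs non-DEG, in set vs not in set)."""
    a = n_overlap                                      # DEG and in set
    b = n_degs - n_overlap                             # DEG, not in set
    c = n_in_set - n_overlap                           # in set, not DEG
    d = n_background - n_in_set - n_degs + n_overlap   # neither
    if b == 0 or c == 0:
        return float("inf")
    return (a * d) / (b * c)


# Hypothetical counts: 20000 background genes, a 16-gene set, 44 DEGs, 3 in common.
p = hypergeom_enrichment_p(20000, 16, 44, 3)
oratio = odds_ratio(20000, 16, 44, 3)
is_enriched = p < 0.001 and oratio > 10
```

Requiring both a small P-value and a large odds ratio guards against very large gene sets that reach significance with only a modest relative excess of DEGs.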
The associations between gene expression and patients' disease-free survival (DFS), relapse-free survival (RFS), or overall survival (OS) were assessed by univariate Cox regression analysis. Prognostic genes that were statistically significant (P<0.05) in at least two of the three GEO series were selected to construct the prognostic model. The risk score was calculated from the expression of each selected gene, weighted by its estimated regression coefficient, which was obtained from the univariate Cox regression analysis of the training set. Using this risk score formula, each patient in each series was scored and then assigned to the high-risk or low-risk group, with the median risk score of the corresponding series as the cutoff value.
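The scoring and median split described above can be sketched as follows. The gene names, coefficients, and expression values are placeholders rather than values from this study, and assigning scores equal to the median to the low-risk group is an assumption made here, since the tie-handling rule is not stated.

```python
from statistics import median


def risk_score(expression, coefficients):
    """Sum of gene expression values weighted by their Cox regression coefficients."""
    return sum(coefficients[gene] * expression[gene] for gene in coefficients)


def split_by_median(scores):
    """Label each patient 'high' or 'low' relative to the cohort's median score."""
    cutoff = median(scores.values())
    # Assumption: a score exactly equal to the median is labeled 'low'.
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}


# Placeholder coefficients (positive = risk gene, negative = protective gene).
coefficients = {"GENE_A": 0.5, "GENE_B": -0.3}

# Placeholder expression profiles for four hypothetical patients.
cohort = {
    "p1": {"GENE_A": 2.0, "GENE_B": 1.0},
    "p2": {"GENE_A": 1.0, "GENE_B": 2.0},
    "p3": {"GENE_A": 3.0, "GENE_B": 0.0},
    "p4": {"GENE_A": 0.0, "GENE_B": 1.0},
}

scores = {pid: risk_score(expr, coefficients) for pid, expr in cohort.items()}
groups = split_by_median(scores)
```

Because the cutoff is recomputed within each cohort, roughly half of the patients in any series fall into each group, regardless of platform-specific shifts in expression scale.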
Survival differences between high-risk group and low-risk group in each dataset were assessed by the Kaplan–Meier estimator, and then compared by using the Log rank test. We used the Cox proportional hazards model to perform univariate and multivariate regression analyses. Other statistical analyses were conducted by using R 3.5.1 software. Statistical significance was set at P-value <0.05.
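As an illustration of the comparison described above, a two-group log-rank test can be written with the standard library alone. This is a simplified sketch, not the R routine used in the analysis: the function name and the toy follow-up data are hypothetical, and the P-value uses the chi-square distribution with one degree of freedom.

```python
from math import erfc, sqrt


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square statistic, P-value with 1 d.f.)."""
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e})

    observed_a = expected_a = variance = 0.0
    for t in event_times:
        # Patients still at risk just before time t, per group.
        n_a = sum(1 for u, _, g in data if u >= t and g == 0)
        n_b = sum(1 for u, _, g in data if u >= t and g == 1)
        # Events occurring exactly at time t, per group.
        d_a = sum(1 for u, e, g in data if u == t and e and g == 0)
        d_b = sum(1 for u, e, g in data if u == t and e and g == 1)
        n, d = n_a + n_b, d_a + d_b
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)

    if variance == 0:
        return 0.0, 1.0
    chi2 = (observed_a - expected_a) ** 2 / variance
    # Survival function of a chi-square variable with 1 d.f.
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value


# Toy data: follow-up times with event indicators (1 = death, 0 = censored).
chi2, p_value = logrank_test([1, 2], [1, 1], [3, 4], [1, 1])
```

The test compares the observed number of events in one group with the number expected if both groups shared the same hazard at every event time.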
RNA isolated from 16 non-responders and 16 responders passed the quality control criteria for gene expression analysis. The quality of the microarray gene expression data was confirmed as well. Non-hierarchical cluster analysis of the samples showed no remarkable distinction between non-responders and responders. In addition, 57 probes denoting 50 DEGs showed a significant difference between non-responders and responders. After removal of 6 genes on the Y chromosome and the inactivated X chromosome, which reflected sex imbalance, 25 DEGs were expressed at higher levels in non-responders, and 19 DEGs were expressed at higher levels in responders (Figure 1A).
The genes significantly dysregulated between responders and non-responders were annotated with GO terms and with Hallmark terms for canonical tumor pathways. In total, 25 significantly enriched pathways were identified that may participate in cancer metastasis and recurrence (Figure 1B). The 110 unique genes from 3 of these pathways were then selected as candidate genes for the subsequent analysis. These pathways were CXCR chemokine receptor (GO:0045236), negative regulation of anoikis (GO:2000811), and regulation of synapse assembly (GO:0051963), consisting of 16, 17, and 79 genes, respectively. Among these 112 genes, PTK2 and NTRK2 were counted twice because each is involved in both regulation of synapse assembly and negative regulation of anoikis.
Each GEO series was analyzed independently by subjecting the expression data of the above-mentioned candidate genes to univariate Cox regression analysis. Both DFS/RFS and OS were queried in GSE39582, GSE17536, and GSE103479. In total, 108 genes were analyzed in the 3 independent series; 2 genes (LRRTM3 and LRRC24) were not detected. As a result, 15 genes that significantly influenced survival in at least 2 of the 3 series were selected. Among these genes, positive coefficients indicated that higher expression levels of 6 genes (SIX4, PIK3CA, ITGB1, ITGA5, DKK1, and CXCL5) were associated with shorter survival, whereas the other 9 genes (WNT5A, PTRH2, NLGN3, LRTM1, EPHB3, EPHB2, CXCL2, CXCL1, and CBLN2) were protective factors for survival. The results of the Cox regression analysis of the 15 genes are illustrated in Figure 2.
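The selection rule above (significance in at least 2 of 3 series) amounts to a simple count across series. A minimal sketch follows; the function name, the placeholder gene names, and the P-values are hypothetical, and only the series accession numbers come from this study.

```python
def select_prognostic_genes(p_values_by_series, alpha=0.05, min_series=2):
    """Return genes with P < alpha in at least `min_series` of the given series."""
    counts = {}
    for p_values in p_values_by_series.values():
        for gene, p in p_values.items():
            if p < alpha:
                counts[gene] = counts.get(gene, 0) + 1
    return sorted(gene for gene, c in counts.items() if c >= min_series)


# Placeholder univariate Cox P-values per series (not results from this study).
p_values_by_series = {
    "GSE39582": {"GENE_A": 0.001, "GENE_B": 0.030, "GENE_C": 0.400},
    "GSE17536": {"GENE_A": 0.020, "GENE_B": 0.200, "GENE_C": 0.010},
    "GSE103479": {"GENE_A": 0.300, "GENE_B": 0.040, "GENE_C": 0.600},
}
selected_genes = select_prognostic_genes(p_values_by_series)
```

A gene missing from one series (for example, because it was not detected on that platform) is simply not counted for that series and can still qualify through the other two.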
Risk scores were calculated as the sum of the expression levels of the 15 mRNAs, each weighted by its coefficient from the univariate Cox regression analysis of OS in the training set (GSE39582). The risk score was formulated as follows:
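The fitted coefficients of the original formula are not reproduced in this text. In general form, and consistent with the description above, a score of this type can be written as shown here, where the symbols are notation introduced for illustration: $x_{ij}$ is the expression level of gene $i$ in patient $j$, and $\beta_i$ is the univariate Cox regression coefficient of gene $i$ for OS in the training set.

```latex
\mathrm{Risk\ score}_j = \sum_{i=1}^{15} \beta_i \, x_{ij}
```

Under this form, genes with $\beta_i > 0$ raise the score as their expression increases, and genes with $\beta_i < 0$ lower it.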
We then scored each patient in the training set and ranked the patients according to their risk scores. Patients were classified as high-risk or low-risk using the median risk score as the cutoff value. Patients in the high-risk group had significantly shorter OS (Log rank test, P<0.001) (Figure 3A) and DFS/RFS (Log rank test, P<0.001) in the training set. The 15-mRNA risk score was also significant when evaluated as a continuous variable in univariate regression analyses of OS (HR=1.48, 95% CI=1.30–1.70, P≤0.001) and DFS/RFS (HR=1.49, 95% CI=1.31–1.70, P<0.001).
TNM stage is a well-known prognostic factor for CRC. We therefore assessed the prognostic value of the risk score separately in patients with early-stage (stage I–III) CRC and in patients with metastatic (stage IV) CRC. The risk score was found to be a significant prognostic factor independent of metastasis status. There were 502 patients with early-stage CRC and 60 patients with metastatic CRC. Among patients with early-stage CRC, those with a high-risk score had significantly shorter OS (Log rank test, P=0.015) (Figure 3B). Among patients with metastatic CRC, those with a high-risk score also had significantly shorter OS (Log rank test, P=0.004) (Figure 3B).
To confirm our findings, the 15-mRNA signature was validated in two microarray cohorts, GSE17536 and GSE103479, comprising 332 patients. Using the presented risk formula, we classified the patients in GSE17536 and GSE103479 into high-risk and low-risk groups based on the median risk score of each series. Patients in the high-risk group had significantly shorter OS (Log rank test, P=0.0014) (Figure 4A) and shorter DFS/RFS (Log rank test, P<0.0001). We also assessed the prognostic value of the risk score separately in patients with early-stage CRC and in patients with metastatic CRC. The risk score was again a significant prognostic factor independent of metastasis status. There were 293 patients with early-stage CRC and 39 patients with metastatic CRC. Among patients with early-stage CRC, those with high-risk scores had significantly shorter OS (Log rank test, P=0.011) (Figure 4B). Among patients with metastatic CRC, those with high-risk scores had significantly shorter OS (Log rank test, P=0.007) (Figure 4B).
To determine whether the proposed model has prognostic value for patients with stage I–III CRC who received adjuvant chemotherapy after surgery, we pooled the stage I–III CRC patients from GSE39582, GSE17536, and GSE103479, comprising 792 patients, of whom 694 had data on adjuvant chemotherapy. As expected, patients who received adjuvant chemotherapy had significantly longer OS than those who did not (Log rank test, P=0.005). Using the presented risk formula, we classified patients into high-risk and low-risk groups based on the median risk score. In the low-risk group (N=353), patients who underwent adjuvant chemotherapy had significantly longer OS than those who did not (Log rank test, P<0.001), whereas in the high-risk group (N=341), OS did not differ significantly between patients who did and did not undergo adjuvant chemotherapy (Log rank test, P=0.213) (Figure 5A).
To further investigate the difference in prognosis between the high-risk and low-risk groups, a multivariate Cox regression analysis was performed in 417 patients with stage I–III CRC from GSE39582, since this series had the most complete clinical data. The analysis included age, sex, adjuvant chemotherapy, T stage, N stage, location of the primary tumor, and mutation status of the KRAS and BRAF genes as covariates. We found that the 15-mRNA risk score (HR=1.27, 95% CI=1.04–1.55, P=0.020) remained an independent prognostic factor for OS (Figure 5B). We then stratified patients by risk score and performed multivariable Cox regression analysis separately in the high-risk and low-risk groups. In the low-risk group, adjuvant chemotherapy was associated with significantly better OS (HR=0.24, 95% CI=0.09–0.59, P=0.002) and advanced T stage with significantly worse OS (HR=2.19, 95% CI=1.30–3.69, P=0.003) (Figure 6A), while in the high-risk group, no significant prognostic factor was noted except for N stage (HR=1.46, 95% CI=1.06–2.00, P=0.019) (Figure 6B). These results suggest that patients with a low-risk score may benefit from adjuvant chemotherapy, whereas patients with a high-risk score did not appear to benefit from the adjuvant chemotherapy regimen. For such patients, the adjuvant chemotherapy regimen may need to be enhanced or modified, and more stringent follow-up is essential.
Currently, clinical and pathological features may be insufficient to accurately predict survival outcomes, whereas genetic approaches have been demonstrated to be effective for this purpose. In the current research, we applied univariate Cox regression analyses to microarray data to screen for genes with strong prognostic value, and attempted to establish a model of prognosis and chemotherapy benefit with these genes.
To identify prognostic genes, a microarray analysis was carried out on 32 CRC patients. It revealed that recurrence or resistance after adjuvant chemotherapy for CRC was associated with dysregulated genes involved in several pathways, including CXCR chemokine receptor, negative regulation of anoikis, and regulation of synapse assembly. The CXCR chemokine receptor term consists of genes whose products interact selectively and non-covalently with a chemokine receptor in the CXCR family. Studies have reported that upregulated expression of genes in the CXCR family plays a role in either promotion or inhibition of cancer cells, in addition to drug resistance.27–29 The negative regulation of anoikis term consists of genes involved in any process that stops, prevents, or reduces the frequency, rate, or extent of anoikis, a process in which cells are triggered to die when they are separated from their surrounding matrix and neighboring cells. Resistance to anoikis is considered a critical contributor to tumor invasion and metastasis.30,31 A synapse is a structure that permits a cell to pass an electrical or chemical signal to another effector cell. Recent studies found that specific synapses play an essential role in the tumorigenesis of glioma32,33 and in brain metastasis from breast cancer.34 Moreover, actin remodeling at the cancer cell side of immunological synapses is emerging as an important mechanism of tumor immune evasion.35 These findings reveal a potential link between synaptic communication and tumors, with potential clinical implications. However, this aspect requires further study.
We then focused on identifying key genes involved in the above-mentioned three pathways. A prognostic model was established consisting of 15 genes with strong prognostic effects. To date, a number of studies have reported the effects on tumors of single genes included in the model, such as WNT5A, DKK1, EPHB2, SIX4, ITGB1, PIK3CA, CXCL5, and CXCL2.36–43 However, the majority of these studies did not assess the molecular mechanisms involved in the regulation of gene expression. Additionally, a number of genes have not been studied in tumors, such as NLGN3, PTRH2, CBLN2, and LRTM1. After establishing a predictive model using the aforementioned genes, we undertook survival analysis of the risk score. Patients with a low-risk score had significantly longer OS in both the training set and the validation set. The risk score was also found to be a significant prognostic factor independent of metastasis status. Since the tumor tissue samples of the initial exploratory cohort were collected before adjuvant chemotherapy and grouped according to the recurrence and metastasis events found during follow-up, the molecular difference between the groups may arise from tumor malignancy, from the chemosensitivity of the tumor to 5-Fu, or from both.
Moreover, 5-Fu-based adjuvant chemotherapy after resection is a standard treatment for patients with early-stage CRC. Several studies have demonstrated the advantage of predicting the response to 5-Fu-based chemotherapy from gene expression data, and showed its great potential in clinical application.44,45 In this study, we noted that patients in the low-risk group had a favorable response to adjuvant chemotherapy, whereas those in the high-risk group did not benefit from adjuvant chemotherapy. These results indicate that the proposed classifier may be clinically applicable for selecting patients with early-stage CRC to undergo adjuvant chemotherapy. Patients in the low-risk group are encouraged to undergo 5-Fu-based adjuvant chemotherapy to prolong OS, while patients in the high-risk group may require an enhanced chemotherapy regimen or an alternative regimen.
Compared with prognostic models developed in other studies, the innovation of the model presented in this study lies in its function-based gene selection and its combinatorial modeling of multiple pathways. Another valuable feature is that the proposed model is not only a prognostic model, but can also serve to reflect the benefit of chemotherapy in early-stage CRC.
However, as with other expression-based classifiers, clinical use is limited by differences among qualitative and quantitative detection methods. Although all the data used in this study were normalized using the RMA method, the detection platforms were not identical, highlighting the need for further validation in prospective studies. Additionally, the combined effects of pathways on tumors should be explored, and the relevant mechanisms need to be studied, as the majority of studies have investigated the association between a single pathway and tumor biological behavior. The combined effects and cross-talk of interconnected pathways in tumors may be valuable for guiding future cancer research.
In conclusion, we used a large amount of transcriptome data to identify a prognostic 15-mRNA signature for CRC patients. Our results showed that the 15-mRNA signature might be an effective prognostic biomarker of CRC. We hope that the 15-mRNA signature may be clinically applicable for selecting patients with early-stage CRC to undergo adjuvant chemotherapy. Future research should concentrate on validating our findings in planned clinical trials and on elucidating the role of these genes in the biological behavior of tumors and in drug resistance.
This study was supported by the National Natural Science Foundation of China [grant number 81974369]; Science and Technology Planning Project of Guangdong Province [grant number 2017B030308006]; Key Technologies Research and Development Program of Guangzhou [grant number 201704020144].
AJCC, American Joint Committee on Cancer; CRC, colorectal cancer; CXCR, chemokine (C-X-C motif) receptor; DEGs, differentially expressed genes; DFS, disease-free survival; EGFR, epidermal growth factor receptor; FGFR, fibroblast growth factor receptor; 5-Fu, 5-Fluorouracil; GO, Gene Ontology; HER2, human epidermal growth factor receptor 2; HR, hazard ratio; OS, overall survival; RFS, relapse-free survival; RMA, robust multichip average; UICC, Union for International Cancer Control.
The microarray data of 32 patients that supported the findings of this study are available from the corresponding author upon request. Public data can be found online at the GEO database (accession nos. GSE39582, GSE17536, and GSE103479).
This study was approved by the Ethics Committee of the Sixth Affiliated Hospital of Sun Yat-sen University, and all patients provided written informed consent in accordance with the Declaration of Helsinki.