Mechanistic insights into zearalenone-accelerated colorectal cancer 2 in mice using integrative multi-omics approaches

Summary 17 Zearalenone (ZEA), a secondary metabolite from Fusarium fungi found in cereal-based foods, 18 promotes the growth of colon, breast, and prostate cancers in vitro. However, the lack of animal 19 studies hinders a deeper mechanistic understanding of the cancer promotive effect of ZEA. The 20 aim of this study was to unveil the effect of ZEA on colon cancer progression and its underlying 21 mechanism. Through integrative analyses of transcriptomics, metabolomics, metagenomics, and 22 host phenotypes, we investigated the impact of a 4-week ZEA intervention on colorectal cancer in 23 xenograft mice. Our results showed that a 4-week ZEA intervention increased the tumor weight 24 twofold. ZEA exposure significantly increased the mRNA and protein levels of BEST4, DGKB 25 and KI67, and the phosphorylation levels of ERK1/2 and AKT. Metabolomics analysis of the 26 serum revealed the levels of amino acids, including histidine, arginine, citrulline and glycine 27 decreased significantly in the ZEA group. Furthermore, ZEA lowered the alpha diversity of the 28 gut microbiota and reduced the abundance of 9 genera, such as Tuzzerella and Rikenella, etc. 29 Further association analysis indicated that Tuzzerella was negatively associated with the 30 expression of BEST4 and DGKB genes, serum uric acid level and tumor weight. In addition, 31 circulatory hippuric acid level was positively correlated with the tumor weight and the expression 32 of oncogenic genes, including ROBO3, JAK3 and BEST4. Altogether, our results indicated that 33 ZEA promoted colon cancer tumor progression through enhancing the BEST4/AKT/ERK1/2 34 pathway, lowering the circulatory amino acid concentration, alternating the gut microbiota 35 composition, and suppressing the SCFA production. 36 37 Keywords 38 Zearalenone, colon cancer, RNA-Seq, gut microbiota, metabolomics, multi-omics 3 39 Introduction 40 Colorectal cancer (CRC) is one of the most common cancers and the third leading cause of 41 cancer death [1], while the pathogenesis of CRC is still undergoing active exploration. Exogenous 42 estrogen exposure has been classified as an environmental hazard, contributing to CRC 43 progression through binding to estrogen receptors [2]. Zearalenone (ZEA) is a mycotoxin 44 commonly found in our diet, mainly in cereals and grains. ZEA is a secondary metabolite produced 45 by Fusarium species. Due to its structural similarity to estrogen, studies on ZEA carcinogenicity 46 have therefore focused on hormone-dependent cancers. A recent epidemiological study discovered 47 a link between increased urinary ZEA metabolites and the development of breast cancer [3]. 48 Furthermore, ZEA has been implicated in promoting cancer cell growth and progression in breast 49 [4], prostate [5] and colon cancer cells [6, 7] in vitro. These investigations, however, disregarded 50 the three-dimensional growth, microenvironment of tumors, and immunoregulatory role of the gut 51 microbiota in living animals [8]. To the best of our knowledge, there is no in vivo study evaluating 52 the impact of ZEA on CRC. 53 In recent years, increasing evidence suggests that dysbiosis in the gut microbiome may 54 contribute to CRC development [9-11]. The microenvironment of CRC is a complex community 55 composed of cancer cells and microorganisms [12]. Previous studies have shown that the gut 56 microbiota shapes immune responses and promotes tumor growth [13]. Since ZEA is mainly 57 digested and absorbed through the gastrointestinal tract, the gut microbiota is directly exposed to 58 ZEA and potentially regulates the metabolism and carcinogenesis of ZEA [14]. Indeed, studies 59 have indicated that ZEA disrupts the composition and function of the gut microbiota in healthy 60 animals [15-17]. 4 61 In this study, using combined analyses of the tumor transcriptome, targeted and untargeted 62 metabolomics and 16S rDNA sequencing, we comprehensively evaluated the impact of ZEA 63 exposure on the development and progression of CRC in a colon cancer xenograft mouse model. 64 We aimed to investigate the underlying mechanisms related to accelerated tumor growth and 65 identify the key transcriptional signature, gut microbes, and metabolites that potentially contribute 66 to ZEA-induced colon cancer development. 67 68 Materials and Methods 69 Animal study 70 Nude BALB/cAnN mice were purchased from the Laboratory Animal Unit of the University 71 of Hong Kong. Five-week-old male mice were housed in individual ventilated cages in a sterile 72 environment with regulated temperature (23-24ºC) and relative humidity (60-70%) on a 12-hour 73 light cycle. The mice were fed a phytoestrogen-free diet (Open Standard diet, D11112201, 74 Research Diets Inc., New Brunswick, NJ) and water provided ad libitum. After one week of acclimatization, the mice were injected in the right flank with 1x106 75 SW480 human colon cancer cells mixed with PBS/Matrigel (1:1). When the tumor size reached 70-100 mm3 76 , the mice were 77 randomly divided into two treatment groups (n=9 per group): control [oral gavage vehicle (olive 78 oil) thrice weekly] or ZEA (0.5 mg/kg body weight (BW) in olive oil thrice weekly) (Figure 1A). 79 The dosage in this study was chosen based on an animal study in which 1 mg/kg BW ZEA 80 promoted DNA adduct formation in healthy mice, and the reported levels of ZEA in grain-based 81 food in humans (3.0-33.0 µg/kg bw/day) [14, 18], in which 33.0 µg/kg bw/day was converted 82 through body surface area conversion from humans to mice [19]. The measurement of tumor size 83 was made as previously described with modification [20]. The sizes of tumors were measured by 5 84 an electric caliper thrice weekly. Tumor volumes were calculated based on the formula: volume (mm3 ) = (π x length (mm) × width (mm)2 85 )/6. The length represents the longest tumor diameter, 86 and the width represents the perpendicular tumor diameter. BW and food consumption were 87 recorded thrice weekly throughout the study. After 28 days, the mice were sacrificed with 88 pentobarbital (250 mg/kg, intraperitoneal injection), and the tumors and caecal contents were 89 removed, weighed and snap frozen in liquid nitrogen. Animal handling for the experimentation 90 was approved by the Committee on the Use of Live Animals in Teaching and Research (CULATR 91 No. 4785-18), the University of Hong Kong. 92 93 RNA extraction and RNA-seq 94 Total RNA was isolated from the tumor tissue using an RNAspin Mini kit (GE Healthcare, 95 UK) according to the manufacturer’s instructions. After passing the quality check, the samples 96 were fragmented and reversely transcribed to cDNA. The final cDNA library was then built by 97 Novogene (Novogene, China) through the following procedure: purification, terminal repair, A98 tailing, ligation of sequencing adapters, size selection and PCR enrichment. Finally, reads were 99 generated using Illumina HiSeq 2500. 100 101 Quality control of RNA-Seq reads and quantification of gene expression 102 We mapped all raw reads to the GRCm38 (Genome Reference Consortium Mouse Build 38, 103 NCBI BioProject Accession: PRJNA20689) genome using BWA v0.7.17 and removed those with 104 more than 95% mapping coverage. The ribosomal RNA (rRNA) reads were filtered with 105 SortMeRNA [21]. Raw reads were further processed for quality control using the following 6 106 procedures [22] (available at 107 https://github.com/TingtZHENG/metagenomics/blob/master/scripts/fqc.pl): (i) Illumina 108 primers/adaptors/linker sequences were removed; (ii) paired-end reads with 25 bp consecutively 109 exact matches from both ends were removed to avoid PCR duplicates; and (iii) terminal regions 110 with continuous Phred-based quality less than 20 were removed. High-quality remaining reads 111 were aligned using TopHat2 v2.1.1 employing the GRCh38 (Genome Reference Consortium 112 Human Build 38, NCBI BioProject Accession: PRJNA31257) assembly as the human reference 113 genome. Thereafter, alignment outputs from TopHat2 were formatted and sorted with SAMtools 114 v1.9. SAMtools outputs were used to perform gene counts using the htseq-count function from 115 HTseq framework v0.11.2. In addition, RSEM v1.3.1 was executed to estimate fragments per 116 kilobase of transcript per million mapped reads (FPKM) based on the alignment results. Based on 117 the read count of genes for each sample, we first applied centered log-ratio (clr) transformation. 118 119 Analysis of differentially expressed genes and enriched pathways 120 Subsequently, differentially expressed genes were identified by DESeq2 v1.34.0, which 121 calculated the significance of gene expression profile alternations in comparison groups. The 122 adjusted p-value < 0.1 were used as the cutoff values for significantly differentially expressed 123 genes. Next, differentially expressed pathways were identified by Generally Applicable Gene-set 124 Enrichment (GAGE, v2.44.0) analysis using gene sets derived from KEGG pathways. An adjusted 125 p-value < 0.1 was used to define significantly differentially expressed pathways in the comparison 126 groups. In addition, Gene set enrichment analysis (GSEA, v4.0.3) was performed on gene 127 expression profiles using gene sets from the Molecular Signatures Database (MSigDB, v7.1): (1) 7 128 Cancer Hallmarks, (2) Oncogenic Signatures and (3) Chemical and Genetic Perturbations. |NES 129 score| > 1 and FDR < 0.25 were used as the cutoff values for significantly enriched gene sets. 130 131 Hematoxylin-eosin (H&E) staining, immunohistochemistry (IHC) and TUNEL assay 132 A part of the tumor tissue was washed, fixed immediately with formalin and embedded in 133 paraffin for sectioning into 5 mm slices. The sections were then deparaffinized and used to conduct 134 H&E staining, IHC, or TUNEL fluorescent staining. H&E staining was performed using H&E kit 135 from Baso (Zhuhai, China) according to the manufacturer’s recommendations. The TUNEL assay 136 was performed using In Situ Cell Death Detection Kit (Roche Diagnostics) according to the 137 manufacturer’s instructions with modification [23]. In brief, the slides were incubated in 0.1 M 138 citrate buffer in a 70°C water bath for 30 minutes and then blocked with CAS block (Thermo 139 Fisher, USA). The slides were washed and then incubated with TUNEL reaction mixture for 60 140 min at 37°C. For IHC, the sections were rehydrated in graded alcohols and distilled water. It was 141 then boiled in Tris-EDTA followed by quenching endogenous peroxidase activities using 3% 142 H2O2. The sections were immersed in CAS-Bloc Histochemical Reagent (Thermo Fisher, USA) 143 and incubated with primary antibodies against pERK1/2, pAKT, Ki67, BEST4, (1:100, Abcam) 144 and DGKB (Invitrogen, USA) at 4°C overnight. It was then washed thoroughly and incubated with 145 anti-mouse or anti-rabbit HRP-conjugated secondary antibody (Bio-Rad) or anti-goat HRP146 conjugated secondary antibody (Bio-Rad). Positive signals were visualized using 3,3'- 147 diaminobenzidine (Abcam, USA). The nuclei were counterstained with hematoxylin. Expression 148 in IHC images was quantified by NIH ImageJ v1.50 software as described in [24]. 149 8 150 Non-targeted metabolomics analysis 151 One hundred microliters of serum sample were added to 1 mL ice-cold 100% methanol, spiked 152 with 4-chloro-phenylalanine as the internal standard at a final concentration of 100 ng/mL. The 153 mixture was vortexed for 30 s and kept at -20°C for 20 min. The cold mixture was then centrifuged 154 at 17,000 x g at 4°C for 15 min. The supernatant was collected and dried under a gentle flow of 155 nitrogen. The dried samples were reconstituted with 100 µL 60% methanol and sonicated for 1 156 min before centrifugation at 17,000 x g for 10 min. The supernatant collected was subjected to 157 LC‒MS/MS analysis using Agilent 6540 UHPLC-QTOF-MS/MS (Agilent Technologies, Santa 158 Clara, CA, USA) with a Waters ACQUITY HSS T3 column, 2.1 mm × 100 mm, 1.7 μm (Milford, 159 MA). The column temperature was set at 40°C and the injection volumes for positive mode and 160 negative mode were set at 10 µL and 15 µL, respectively. The mobile phases consisted of water 161 with 0.1% formic acid (A) and acetonitrile with 0.1% formic acid (B) at a flow rate of 0.3 mL/min. 162 The gradient (B%) started with 1% from 0 to 1 min and increased to 99% from 1 min to 10 min, 163 holding at 99% until 13 min and then returned to 1% at 13.5 min and maintained until 16 min. The 164 MS parameters were as follows: capillary voltage 4000 V, nozzle voltage 1500 V, skimmer voltage 165 65 V, drying gas temperature 300°C, sheath gas temperature 320°C, fragmentor voltage 140 V, 166 drying gas flow rate 8 L/min, sheath gas flow rate 11 L/min, and nebulizer pressure 40 psi. The 167 metabolites were identified by comparing the data to the HMDB and METLIN databases using 168 Progenesis QI software version 3.0 (Nonlinear Dynamics, Newcastle, UK). 169 170 Targeted metabolomics for amino acid profiling 171 Serum samples (50 µL ) were mixed with 50 µL internal standard (Kairos Amino Acid Internal 172 Standard Set, Waters USA) and water according to the manufacturer’s instructions. The mixture 9 173 was then centrifuged at 9000 x g for 15 min. The mixture (10 µL) was then mixed with 70 µL of 174 borate buffer and 10 µL of AccQ•Tag reagent. After vortexing for 5 s, the mixture was allowed to 175 stand at room temperature for 1 min and heat for 10 min at 55°C. The mixture was then analyzed 176 using Agilent 6460 UHPLC-QqQ-MS/MS (Agilent Technologies, Santa Clara, CA, USA) with a 177 Waters Cortecs C18 column, 2.1 mm × 150 mm, 1.6 μm (Milford, MA). For chromatographic 178 separation, a binary mobile phase system was used including 0.1% formic acid in water (A) and 179 0.1% formic acid in acetonitrile (B) with a flow rate of 0.5 mL/min. The gradient elution program 180 (B%) for each run started at 1%, lasted for 1 min, and then increased to 13% at 2 min, 13% to 15% 181 from 2 to 5.5 min, 15% to 95% from 5.5 to 6.5 min, kept at 95% until 7.5 min, then decreased to 182 1% from 7.5 min to 7.6 min and maintained at 1% until 9 min. The column temperature was 55°C, 183 and the injection volume was 2 μL. Data processing was achieved using Agilent MassHunter 184 Workstation software (version B.06.00). 185 186 Metabolomics data analysis 187 Pathway analysis was performed to integrate the targeted amino acid results using the 188 Ingenuity Pathway Analysis Software (IPA) version 70750971 (QIAGEN, Redwood City, USA). 189 Metabolites strongly correlated with tumor weight were identified by Spearman's rank correlation 190 analysis. The |correlation| > 0.6 and p-value < 0.05 were used as the cutoff values of statistical 191 significance. 192 10 193 Total microbial DNA extraction and 16S rDNA gene quantitative analysis 194 The DNA content of the caecal samples was extracted using a DNeasy Powersoil Kit (Qiagen, 195 Germany) according to the manufacturer’s instructions. The quality of extracted DNA samples 196 was examined through 1% agarose gel electrophoresis. The DNA samples were then sent to 197 Novogene Co. Ltd (Beijing, China) for sequencing. After quality check, the hypervariable region 198 V3-V4 (341F&534R) of the 16S rDNA was amplified. PCR was then performed by Phusion High199 Fidelity PCR master mix (New English Biolabs, England). The PCR product was quantified with 200 2% agarose electrophoresis gel. The products between 400 and 450 bp were extracted by Qiagen 201 Gel Extraction Kit (Qiagen, Germany). Samples were quantified through Qubit and prepared for 202 sequence library generation by the NEBNext Ultra DNA library prep kit. The resulting library was 203 then sequenced on the Illumina HiSeq 2500 platform. The R package dada2 v1.22.0 was used to 204 perform read quality control and resolve amplicon sequence variants (ASVs). Thirty low-quality 205 3’ bases of the forward and reverse reads were removed following inspection of QC plots. Reads 206 with any Ns or a maximum expected error of > 2 following truncation and mapping to phiX were 207 also removed. Error estimation was calculated on all samples (pooled) following dereplication. 208 After that, dada2 merged overlapping forward and reverse reads, and then removed chimeric 209 sequences. Following chimera removal, taxonomy was assigned independently to the SILVA 210 v138.1 reference database. Next, Shannon diversity at the genus level was calculated with the R 211 package vegan v2.6.2. Furthermore, the alpha diversity indexes were compared between the 212 experimental and control groups using independent-sample t-test. The R package vegan and 213 compositions v2.0.4 [25] were used to calculate beta diversity based on robust Aitchison distance. 214 The Wilcoxon rank sum test was used to evaluate whether there was significant difference for pair215 wise distance within groups between comparison groups. We then evaluated beta diversity in 11 216 multiple dimension scale (MDS) ordinations with vegan. Moreover, permutational multivariate 217 analysis of variances (PERMANOVA) of the distance matrices between groups was performed to 218 calculate R values (0 = no difference; 1 =maximum dissimilarity) to determine dissimilarities 219 among groups. Differential abundance of taxa at the genus level between comparison groups was 220 completed using Analysis of Compositions of Microbiomes with Bias Correction (ANCOM-BC) 221 implemented in R package ANCOMBC v1.4.0 [26]. An adjusted p value < 0.05 was used as the 222 cutoff to define significantly differentially abundant genera between groups. 223 224 Gene expression, microbiota, and metabolome integrative analysis 225 We performed an integrative analysis to reveal the associations among multi-omics and 226 phenotypic data. First, Spearman’s correlation coefficients between the differentially expressed 227 genes, differentially abundant microbial taxa and differentially abundant metabolites between the 228 experimental and control groups were calculated. Statistical significance was set as FDR < 0.05 229 after Bonferroni correction, and a strong correlation was defined as |correlation|>0.8. Furthermore, 230 we built the significant strong correlations into a visualization network using Cytoscape software 231 v3.9.1. 232 233 Short chain fatty acids (SCFAs) analysis of caecal contents 234 The concentration of SCFAs in the caecum was measured by gas chromatography–mass 235 spectrometry (GC–MS) as previously described with modification [27, 28]. In brief, the caecal 236 content samples were homogenized using a blade homogenizer (T25, ULTRA-TUREAX, IKA) in 237 an extraction buffer containing 0.005 M sodium hydroxide with internal standard (10 µg/ml 12 238 deuterated acetic acid, acetic acid-d4). The homogenates were centrifuged at 13,200 x g for 20 min 239 at 4°C. The supernatant was mixed with 0.5 ml of 1-propanol/pyridine (3:2, v:v) and 0.1 ml of 240 propyl chloroformate. The mix was vortexed for 1 min and incubated at 60°C to derivatize the 241 SCFAs. The derivatized samples were then mixed with 0.5 ml hexane and centrifuged at 2000 x g 242 for 4 min. Thereafter, 400 µl of the upper layer was transferred to a glass vial for GC–MS analysis 243 (Agilent 6890N-5973 GC‒MS, USA) set according to Zheng et al [27] report. The concentration 244 of SCFAs was quantified using the calibration curves constructed using acetic acid, propionic acid, 245 and butyric acid response ratio against acetic acid-d4. 246 247 Statistical analysis 248 Data are presented as the means ± SDs. Analysis was performed by using independent-sample 249 t-test on GraphPad Prism 9.0 (GraphPad Software, CA, USA), unless specified. A p-value of