Membranome Similarity between Glioblastoma Multiforme Cell Lines and Primary Tumors

Genes encoding proteins associated with the plasma membrane, collectively referred to as the membranome, have long been recognized to play an important role in the development and maintenance of glioblastoma multiforme (GBM). GBM cell lines are commonly used to mimic tumors in in vitro experiments, but the extent to which they resemble GBM tumors with respect to the membranome is unclear. The present study explores the resemblance of GBM cell lines to primary tumors regarding membranome expression. Gene expression data was retrieved from the Cancer Cell Line Encyclopedia (CCLE) and The Cancer Genome Atlas (TCGA). Membranomic genes were annotated, and tumor purity was accounted for when correlating tumors and cell lines. The results suggest that some commonly used cell lines, including AM38 and U87MG, display relatively little resemblance to the tumor membranome. Differential gene expression analysis and subsequent gene set enrichment analysis showed that numerous genes related to neurexins/neuroligins, ion homeostasis, and synaptic signaling were downregulated in cell line membranomes compared to those of GBM tumors. The findings suggest that the membranome of GBM cell lines exhibits pronounced changes in gene expression compared to primary tumors and may not be completely representative of the disease process.


Introduction
Glioblastoma multiforme (GBM), a high-grade glioma, is a common malignant primary brain tumor in adults [1]. Patients diagnosed with primary glioblastoma, which usually presents without prior history of low-grade disease, have a median survival of 1 year [2]. Research in GBM commonly utilizes established cell lines to mimic the cancer biology for developing diagnostics and drug therapies [3]. Indeed, a large number of GBM cell lines have been used to screen putative drug compounds; however, many compounds show potential in vitro but underperform clinically [4]. These patterns demonstrate the need to examine more critically how the biology of in vitro GBM cell lines compares to that of in vivo tumors, and to identify any differences that may interfere with translational research.
Proteins associated with the plasma membrane have long been recognized to play a central role in the development and maintenance of cancer by mediating interactions between tumor cells and their surrounding environment, including cell-to-cell signaling, molecular transport, cell adhesion, and interaction with the extracellular matrix [5]. The genes coding for this class of membrane-associated proteins are hereinafter referred to as the membranome. The membranome is particularly relevant to oncology as it comprises many potential targets for drug therapies, as is evident in monoclonal antibody-based therapies [6,7]. Correspondingly, over 60% of FDA-approved drug therapies target the membranome, and approximately 34% of all clinical drugs target the G protein-coupled receptor (GPCR) superfamily, which is the largest class of cell-surface receptors [8]. While in vivo tumors are composed of various cell types interacting with one another and the extracellular matrix through multiple signaling pathways mediated by surface proteins, cell lines utilized for in vitro models consist of homogeneous cell populations interacting within an artificial environment. Differences in cell membrane composition between cell lines and primary tumors thus result in different responses to cancer therapies, with cell lines generally showing higher sensitivity to anticancer agents compared to primary tumors [4,9].
Despite the prevalent use of cell lines to establish drug therapies for GBM [10][11][12][13], the extent to which the cell lines commonly used for GBM research replicate the membranome of GBM primary tumors has yet to be examined. Although previous reports have provided insight into the membranomic correlation of GBM cell lines and tumors, these studies were limited by their relatively small sample sizes and their utilization of less sensitive microarray technologies [14]. To provide a more comprehensive analysis, the present study analyzed membranome-related gene expression for over 100 primary GBM tumors and over 30 GBM cell lines from The Cancer Genome Atlas (TCGA) and the Cancer Cell Line Encyclopedia (CCLE), respectively. Additionally, the analysis adjusted for tumor purity, which can be a significant confounder in primary tumor transcriptomic data [15]. Using these extensively curated repositories of high-throughput sequencing data and accounting for the purity of the primary tumor samples, it was found that certain cell lines correlate more strongly with tumors than others in terms of membranome similarity.

Data Collection
Gene expression data from RNA sequencing and the accompanying annotation file for cell lines were downloaded from CCLE (https://portals.broadinstitute.org/ccle/home). Level 3 released RSEM gene abundances for RNASeqV2 for GBM tumor samples were retrieved from the Broad GDAC portal (http://gdac.broadinstitute.org/), which hosts the TCGA project. The CCLE and TCGA databases were chosen for this analysis due to their open availability and comprehensive analysis of the genomic landscape of glioblastoma multiforme. In total, gene expression data from 142 primary GBM tumors and 31 GBM cell lines was analyzed. Tumor purity (defined broadly as the proportion of nonimmune cells in the tissue sample) was calculated for tumor and cell line samples using the ESTIMATE algorithm [16].

Classification of Membranome Genes and Correlation Analysis
In the present study, the membranome was defined as the set of all human genes coding for proteins integrated within or covalently associated with the cell plasma membrane. All genes shared between the CCLE and TCGA datasets were screened using a combined analysis of previously established gene ontology annotations [17] and the results from the "Membranome 2.0" database [18]. The list of putative membrane protein genes was further filtered manually to exclude genes coding for proteins localized to intracellular compartments, such as the nuclear membrane, and to include additional membrane-associated proteins identified from the literature, such as glycosylphosphatidylinositol-anchored proteins. The final membranome dataset consisted of counts for a total of 4,340 protein-coding genes. Data was subjected to upper quartile normalization and log2-transformed. To account for tumor-infiltrating cells that might bias the bulk tumor gene expression data, genes that were highly correlated with tumor purity scores were removed (R > −0.4, padj < 0.01). The top 3,000 genes, ranked by interquartile range across primary tumor samples, were then selected to calculate rank-based Spearman correlations between GBM cell lines and primary tumors, as these genes have the highest likelihood of being biologically informative.
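The gene selection and correlation steps described above can be illustrated with a minimal sketch. It is written in Python with the standard library only, not in the R code used for the study, and it assumes that normalization, log2 transformation, and purity-based filtering have already been applied. All function names, gene labels, and expression values are hypothetical and serve only to show the calculation.

```python
from statistics import median, quantiles

def average_ranks(values):
    """Return 1-based ranks, assigning tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation; undefined (division by zero) if x or y is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

def spearman(x, y):
    """Spearman correlation computed as the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

def iqr(values):
    """Interquartile range (the quartile convention here is an assumption)."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return q3 - q1

def top_variable_genes(tumor_expr, n_top):
    """tumor_expr maps gene -> expression values across tumor samples."""
    ranked = sorted(tumor_expr, key=lambda g: iqr(tumor_expr[g]), reverse=True)
    return ranked[:n_top]

def median_tumor_correlation(cell_line, tumors, genes):
    """Median Spearman correlation of one cell line against every tumor."""
    x = [cell_line[g] for g in genes]
    return median(spearman(x, [t[g] for g in genes]) for t in tumors)

# Hypothetical toy data: four genes measured in three tumors and one cell line.
tumor_expr = {
    "G1": [1.0, 5.0, 9.0],
    "G2": [2.0, 2.5, 3.0],
    "G3": [0.0, 4.0, 10.0],
    "G4": [3.0, 3.1, 3.2],
}
genes = top_variable_genes(tumor_expr, n_top=3)  # the study used the top 3,000
tumors = [{g: tumor_expr[g][i] for g in tumor_expr} for i in range(3)]
cell_line = {"G1": 4.0, "G2": 2.0, "G3": 6.0, "G4": 3.0}
score = median_tumor_correlation(cell_line, tumors, genes)
```

Summarizing each cell line by its median correlation across all tumors gives one value per cell line, which is the quantity by which cell lines are ranked in the Results.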

Differential Expression and Gene Set Enrichment Analysis
DESeq2 v.3.1 [19] was used to normalize data and determine differentially expressed genes, with purity estimates used as covariates. A gene was considered differentially expressed if its absolute log2 fold change was > 3 and its FDR-adjusted p value (padj) was < 0.01. Differentially expressed genes were subjected to pre-ranked gene set enrichment analysis [20], performed using the Bioconductor R package fgsea v.1.10.1 [21]. Processes were deemed statistically significant if the FDR-adjusted p value was < 0.0001.
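The differential expression criteria above can be illustrated with a short sketch. It assumes that the FDR adjustment is the Benjamini-Hochberg procedure, which is the default adjustment in DESeq2, and it is written in Python rather than R purely for illustration. The function names, gene labels, and values are hypothetical and are not taken from the study data.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p values in the input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p value to the smallest so that each adjusted
    # value never exceeds the adjusted value of the next-larger p value.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted

def differentially_expressed(results, lfc_cutoff=3.0, padj_cutoff=0.01):
    """results maps gene -> (log2 fold change, adjusted p value).

    Genes without an adjusted p value (None) are skipped.
    """
    return {
        gene
        for gene, (lfc, padj) in results.items()
        if padj is not None and abs(lfc) > lfc_cutoff and padj < padj_cutoff
    }

# Hypothetical results for five genes.
results = {
    "GENE_A": (-4.2, 0.0005),
    "GENE_B": (3.5, 0.02),     # fails the padj cutoff
    "GENE_C": (1.0, 0.0001),   # fails the fold change cutoff
    "GENE_D": (5.1, 0.009),
    "GENE_E": (-3.0, 0.001),   # |log2FC| equals 3, not greater than 3
}
hits = differentially_expressed(results)
```

Both conditions must hold for a gene to be retained, so a gene with a large fold change but weak statistical support, or strong support but a small fold change, is excluded.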

Software Tools
All relevant analyses were conducted in the R programming language [22]. The R package ggplot2 [23] was used to visualize the results.

Results
The observed median correlations between the membranome of GBM cell lines and GBM primary tumors ranged from 0.63 to 0.75 (Figure 1a). Cell lines SNU201, SNB75, and SNU626 showed the highest median correlations, while AM38, U87MG, and SF295 exhibited the lowest (Figure 1a). To gain insight into the specific transcriptional differences underlying these differences in correlation, differentially expressed genes between cell line and tumor membranomes were analyzed while also adjusting for the amount of immune and stromal infiltrates. Of the 4,340 membranome genes compared between tumors and cell lines, 959 were significantly dysregulated (FDR-adjusted p value < 0.05, log2 fold change > 3 or < −3) (Table S1). Plotting the top two principal components, which together accounted for 73% of the variance, showed that tumors and cell lines tended to cluster separately (Figure 1b). To explore the gene sets dysregulated between tumors and cell lines, gene set enrichment analysis was performed. Gene sets related to synaptic signal transduction, ion homeostasis, L1CAM interactions, and neurexins/neuroligins were significantly upregulated in primary tumors compared to cell lines (p value < 0.05) (Figure 1c). Processes related to G protein-coupled receptor signaling, vesicle-mediated transport, and O-linked glycosylation were significantly upregulated in cell lines relative to tumors (Figure 1c).

Discussion
With an increasing number of GBM cell lines being developed [24], choosing optimal cell lines for laboratory experiments is crucial. Previous research describes the most commonly used cell lines for GBM as U251, U87, and U373 (which has similar characteristics to U251) [25][26][27][28]. Analysis of conventional GBM cell lines from the CCLE database showed that most resemble GBM primary tumors. However, several cell lines, such as AM38, display little resemblance. The lack of comparable features in these cell lines may be attributed to acquired genomic damage or to the variation among GBM primary tumors in the TCGA database. Further exploration of the membranome gene expression data showed that a large number of genes are in fact underexpressed in the cell line membranome compared to that of tumors.
Several of these underexpressed genes were associated with synaptic signaling. Indeed, recent evidence suggests that gliomas engage in synaptic communication and that this neural integration may be crucial for glioma progression [29,30]. It has also been shown that neuron-glioma interactions are bidirectional and that gliomas induce hyperexcitability by secreting synaptogenic factors and reducing surrounding inhibitory interneurons [30][31][32][33]. Additionally, the cell line membranome displayed decreased expression of genes related to ion transport and homeostasis. Several ion channels have been implicated in glioblastoma proliferation and migration, including potassium and sodium channels [34][35][36]. Several genes associated with potassium channels (KCNJ16, KCNA6, KCNC1, KCNF1, KCNH8, KCNMB4) and sodium channels (SLC4A10, SCN2B, SCN7A, SCNN1B, SCN2A) were found to be significantly downregulated in cell lines (Online Resource 1).
Additionally, genes related to neurexins and neuroligins exhibited a similar pattern of decreased expression in cell lines. Neuroligins are synaptic cell-adhesion molecules (CAMs) that interact with presynaptic neurexins to mediate transsynaptic signaling [32][33][34].
A limitation of the current study is that only transcriptomic data was used to analyze the membranome of GBM cell lines and tumor samples. Future examinations should use gene expression data in conjunction with other molecular profiling assays such as metabolomics and proteomics. Nonetheless, these initial findings suggest changes in the membranome of GBM cell lines compared to primary tumors that may not be representative of the disease process. Previous literature examining the membranome of other cancer cell types and primary tumors demonstrates the significance of membranome genes for the tumor phenotype, but analysis at the protein level would be appropriate to validate our transcription-level conclusions [14]. Correspondingly, a number of genes related to GBM disease development were underexpressed in cell lines that are important in GBM research. Our study design included a literature search to determine how commonly each cell line appears in the published literature, and these commonly used cell lines exhibit distinct differences in membranome gene expression when compared to primary tumors. Researchers should be mindful of these genes when conducting translational research, and further work is warranted to explore whether cell lines in culture medium can be induced to resemble tumor gene expression.