Transcriptome Signatures of Dysregulated Brain Dynamics Induce Entangled Network States.

Marks of brain disorders may be visible through abnormal networks characterized by states with distinct signatures or connectivity patterns. Apart from neurodegenerations, drug abuse or eventually addiction may exert complex perturbative effects over human health, including damage to various organs and especially brain inflammation, volume shrinkage and functional deficits. These can also be regarded as signs of accelerated ageing, which involve multiple comorbidities associated with long-lasting cognitive impairments. The present study investigates drug effects at brain level through transcriptional changes potentially reflecting specific molecular and cellular dysregulation marks. As signatures arising in various brain regions may change, the data here analyzed refer to cross-referenced transcriptome profiles that were in part experimentally measured and in part computed. A methodological aim refers to network inference centered on a few well-characterized entangled states whose connectivity patterns identify clusters and motifs. At the biological annotation level an interesting role is played by the Vascular Endothelial Growth Factor A (VEGFA), a central connector of kinases involved in cytoskeleton rearrangement, synaptogenesis and synaptic functions, neurogenesis, cell growth, cell adhesion, neurohormone signaling apoptosis. VEGFA associations in both co-expression and protein-protein interaction networks reveal patterns linked to MCL1, a BCL2 family Apoptosis regulator.


Background
Mainly due to induced toxicity, drug abuse can accelerate brain ageing but also trigger early onset of disorders that in turn are typical of age [1,2]. To explain the complex dynamics underlying drug abuse and addiction, various studies have been proposed covering brain regions like Nucleus Accumbens (NAc), Amygdala, Dorsal Striatum, Hippocampus and Prefrontal Cortex. Each of such regions being relevant, NAc's role is for instance critical for mediating reward-motivated behavior and is an important component of a major dopaminergic pathway, namely the mesolimbic. An important player to explain pathological conditions is dopamine [3][4][5]. Elevated dopamine levels are observed in drug users and genomic signatures associated with addiction have been already linked to pathological conditions due to the role played by dopamine [6][7][8]. In NAc, increased feelings of reward may be induced, together with a reduced dopamine function.
NAc has been implicated in neurological and psychiatric disorders beyond drug abuse and addiction, say depression, obsessive-compulsive, bipolar, anxiety disorders, Parkinson's disease, Alzheimer's disease, Huntington's disease, obesity [9]. Several studies indicate that reward mechanisms explain the abnormal plasticity associated with drug abuse [6]. Drugs affect the reward brain regions, and in turn trigger neuroplasticity, i.e. the synaptic signaling among neurons crucial to learning and memory. Therefore, an important objective is to characterize molecular features underlying addicted cellular networks and their regulation mechanisms. Due to the complexity of such associations, and the consolidated practice of using networks to decipher complex systems, our hypothesis is that network inference can explain gene sets association starting, ideally, from an estimated differential expression gene (DEG) profile and then identifying from it clusters and motifs representing the patterns of dysregulation in brain activity determined by drug abuse. As a second aspect, finding tightly correlated nodes that taken together summarize significant information about the states of a perturbed system may characterize entangled states with embedded signatures that refer to the cause of the perturbation itself.
Considering that gene expression profiling has allowed to assess possible differential patterns between cocaine abusers and well-matched control subjects [7,8], the fact that the interrogation of thousands of gene transcripts has revealed only 1% as significant indicates the presence of possible limitations in these types of approaches (see also [10], and references therein). In general, high-throughput-derived transcriptome profiles can be quite large as syntheses of all RNA transcripts but represent static snapshots, i.e., measures with a specific temporal dimension attached, and the same holds for the spatial relationships embedded in a profile that is then mapped onto the network.
The role of network structures reflecting system's states is valuable because adds one further component, connectivity, which may better elucidate perturbation effects.

Materials and Methods
The unitary components of the transcriptome, namely gene readouts or transcripts, can be considered integrable into a collection of all the gene readouts present in a cell. Networks represent only statically correlative or causative associations between genes depending on the underlying metric and on the measurements mapped at node level. The metric determines what links populate the network, while the transcriptomic expression values are usually the measurements reported at the nodes. Our study refers to a few datasets described below.

Dataset-1
Experimental high-throughput work (RNA-Seq) was recently conducted at our and associated laboratories for measuring cocaine use effects (unpublished results). NAc tissue samples were used to deliver a profile with 167 significantly identified genes from 178 detected transcripts. The co-expression and interaction networks dynamics are computed here via STRING.db (V. 10.5) [11] to explore functional characteristics and assess the informativeness of network connectivity associated to the transcriptome.

Dataset-2
Brain disorders often involve multiple genes and complex interactions between them whose identification is particularly challenging and requires ad hoc prioritization tools. The tool brain-coX (https://shiny.wehi.edu.au/freytag.s/brain-coX/) [12] was used for cross-referencing the results and perform in silico gene prioritization based on multiple gene expression datasets in the post-mortem human brain.
Protein-chemical interactions were obtained from the Comparative Toxicogenomics Database (http://ctdbase.org/) [15], providing manually curated information from the literature in support to hypotheses on etiology and exposome-driven disease mechanisms. Finally, drug-target interactions were obtained by the DrugBank database (V5.0) (https://www.drugbank.ca/) [16], accounting for about 1000 drug entries (including one fifth FDA-approved small molecule drugs, 241 FDA-approved biotech protein/peptide drugs) and over 6000 experimental drugs, and with 4661 drug target/enzyme/transporter/carrier sequences linked to them. A network platform for integrative mapping applications was used with blend of solutions (available from [17,18]).

Clustering Methods
The Markov Cluster Algorithm (MCL) is a fast scalable unsupervised clustering applied to networks and based on stochastic flow simulation [19]. Each cluster has at least one distinct element compared to the other clusters and these together form a set of strongly connected components with an associated graph. The MCL process initiates with an input matrix and a graph with one strongly connected component that then grows in number yielding a clustering distribution related to the density of the input matrix. The clustering itself is stable under perturbations. The k-Means clustering is a distance-based algorithm that partitions the data into a pre-specified number of clusters. The distance function measures the similarity between cases assigned to the nearest cluster. This unsupervised learning iteratively groups the data into k predefined non-overlapping clusters maximizing the intra-cluster points similarity.

Entangled Network States
Networks allow inference to be conducted from one or multiple datasets represented by simple structures such as nodes and links. Network interactomes may display configurations of perturbed interactions, i.e. those changing in response to a certain stimulus. Such conditional networks are typically heterogenous, i.e., they show the rewiring effects induced by external factors or time-course events. Modularity, which is a characteristic of biological networks, adapts to perturbed conditions when forming hierarchies of functional relationships. These tell us what levels of functional stability and/or plasticity the system presents in response to activation/inhibition dynamics induced by nonbaseline conditions.
Of interest here is an understanding of networks whose complex architectures of interconnected nodes reflect a sort of entanglement (E) effect. This can be considered a quantum measure of correlation that reveals how the information is encoded among all the network interacting nodes (with single or multiple features combined into tensors etc.). There is a growing attention to this topic with the goal of putting into relationships E with network topological properties and more in general associating graphs to quantum states. In principle, quantum entanglement (QE) occurs between multiple interacting entities when their quantum state cannot be described independently but only as a system. For instance, QE may describe a state that two particles share because superposed or nonseparable, making impossible to describe them individually. Equivalently, entangled states indicate that if one entity is measured, specified or resolved in its characteristics, then the effects extend to its counterparts. At that point any interaction that occurs has system's relevance, i.e., it reflects the influence of an overall entangled state, eventually causing the initial superposition to collapse in view of particular perturbations.
Translated into the language of networks, stable states are usually considered attractors and other states can be identified as possibly perturbed ones. In parallel with entangled states, also modules are highly intercorrelated entities able to measure the effects of perturbations and define the system's response to external events through state transitions. Such transitions have not just localized influence but can trigger changes at global network scale. Even if biological network configurations are inherently sparse it is generally observed that their remodulation is systemic and covers variable resolutions (from local to global). Perturbations associated to drug-induced changes are expected to induce synergistic interconnections between transcriptome-driven bioentities (genes, proteins) and mimic entangled states in their transition dynamics. Once again, the state of a network is determined by the state of its component modules and thus changes in the global state depend on changes of state occurring at component level and then transmitted or propagated via the underlying interdependence structure.

Identified Network Motifs
Networks of interactions between proteins (PPIN) are used to display associations and discover emergent properties or unexpected events that the complex network self-organization laws tend to generate. Measures of global topology or locality (structures that repeat or not) may both enable network inference. Figure 1 shows a co-expressions network computed from a subset of the NAc profile. Clusters were computed by applying k-means and MCL. Use of tandem clustering serves the scope of proving consistency of findings by reducing method-induced bias. This is summarized by three identified (significant) down-regulated motifs: (a) FYN-SH3KBP1-PTPRE (b) VEGFA-FGFR1-FGF22-SOS2; (c) DLG1-DUT-GUK1.

Figure 1
NAc co-expression network with clusters. Source is STRING db (evidence mode: experiments, database and co-expression interaction sources). Red, yellow and blue arrows converge to correspondingly identified motifs.
The third, Motif c, includes: VEGFA (log2(FC)=-1.25), or Vascular Endothelial Growth Factor A, a member of the PDGF/VEGF growth factor family inducing proliferation and migration of vascular endothelial cells and essential for physiological/pathological angiogenesis. VEGFA plays also a role in white matter maintenance and homeostasis, as it can induce oligodendrocyte precursor cells (OPC) migration via a ROS and FAK-dependent mechanism [21].

Bio-annotations
Connectivity measured at PPIN scale ( Figure 2) offers the possibility to infer associations from NAc patterns. Specifically, the main motif refers to VEGFA, known to induce proliferative processes essential for both physiological and pathological angiogenesis. Also, very recent study showed that the expression levels of this growth factor are significantly affected by chronic cocaine use in both regiondependent and cocaine treatment-time dependent ways [22]. Here, VEGFA down-regulation indicates a general inhibition of angiogenesis, suggesting the nonprevalence of neuroplasticity induced by either chronic or acute effects of cocaine use. VEGFA appears connected with MCL1, which encodes an anti-apoptotic protein, and is member of the Bcl-2 family (not directly involved in proliferation). MCL1 is up-regulated and its pattern associated with the inhibited VEGFA was observed in other studies, for instance in multiple myeloma cell proliferation with inhibition of VEGF-induced MCL1 up-regulation [23], or in leukemia [24]. In our context, further insights came from other down-regulated (circled in Figure 2) proteins. MCL1 is connected directly with SIRPA (inhibited signal transduction activity possibly related with synaptogenesis and synaptic function), while the inhibited SH3KBP1 is indirectly connected with VEGFA being implicated in multiple processes (apoptosis, cytoskeletal rearrangement, cell adhesion, etc.). MUC1 is also inhibited and connected with VEGFA, being involved in cellular growth and survival. Canonical pathways for the genes listed in the three motifs are in Table 1.

In Silico Prioritization Tool
Aimed at the identification of high-confidence candidate genes possibly causative of disease phenotypes, a key factor is the prioritization of such candidates to limit unnecessary experimental validations. In general, a few candidate genes with specific relevance are produced in genomic studies, while automatically generated candidate lists result incompatible with experimental validations. Therefore, in silico gene prioritization ranks the candidate genes by integrating profiles from various datasets (see large-scale benchmarking of methods in [26]). Due to the complexity of biological processes in the brain, gene expression tends to differentiate from other tissues [27]. In addition, variation in gene expression is observed over individuals' development, suggesting larger effects exerted temporally rather than spatially [28].
For the purpose of gene prioritization, the web-tool brain-coX was explored in its brain transcriptome expression data gathered from multiple studies involving developing and ageing human brains. The majority of the collected samples come from normal (neurologically non-diseased) individuals useful for benchmarking "known disease genes" and prioritizing candidate gene lists (also referred to BrainGEP [29]). Figure 3 illustrates gene relationships established between our candidate genes and reference genes retrieved from brain-coX. Notably, the candidate gene list was cleaned (conventional way mode) and thresholded (20% the option of choice) while the selected age range was kept quite large (from adolescence to late adulthood, which is compatible with drug consumption/abuse habits). The gene prioritization (Table 2) indicates 14 candidate genes prioritized against the selected datasets and reported in COEX and PPI network configurations.

Prioritized Gene Lists
The gene list that was found with the above described tool includes: a) HAPLN2, forming the hyaluronan-associated matrix in the central nervous system and facilitating neuronal conduction and general structural stabilization; b) BOK, a BCL2 Family Apoptosis Regulator, acting as pro-apoptotic; c) DLG1, playing a possible role in cell proliferation and synaptogenesis; d) APBB1IP, involved in the signal transduction from Ras activation to actin cytoskeletal remodeling, suppressing insulin-induced promoter activities through AP1 and SRE, and mediating Rap1-induced adhesion; e) HTATIP2, involved in apoptosis and autophagy and cytoskeletal signaling pathways; f) SOS2, involved in development VEGF and Gap Junctions signaling pathways; g) ITPK1, referred to inositol metabolism and playing a role in the development of the neural tube; h) CDK16, which belongs to the cdc2/cdkx subfamily of the ser/thr family of protein kinases (relevant for vesicle-mediated transport processes and exocytosis and regulates GH1 release by brain neurons, neuron differentiation and dendrite development); i) ARHGAP44, which acts as a GAP for CDC42 and RAC involved in synaptic plasticity; l) SH3D19, suppressor of Ras-induced cellular transformation and Ras-mediated activation of ELK1 by EBP and regulator of ADAM proteins in the signaling of EGFR-ligand shedding and of cell morphology and cytoskeletal organization. Figure 3 is centered on NOTCH3, key regulator of nervous system development and promoter of neural stem cells renewal and proliferation [32]. Here, it includes the candidate MUC1 and its main connected path links to SH3KBP1, associated with NUMB (neurogenesis and repair) and DLL1 (brain development), and to SIRPA, associated with Notch signaling via DTX3 and RBPJ that recruit chromatin remodeling complexes and PSEN1 known to be involved with AD and dementia.

Role of VEGFA
It is of interest to check VEGFA then, previously identified as a determinant of a PPI network motif. Based on two normal brain datasets interrogated in brain.coX, VEGFA role appears marginally linked to the rest. When changing the reference dataset (normal and abnormal brain [33]), VEGFA returns to play a major role. In Figure 4, a VEGFA-centered network is displayed with the same known reference genes as in the previous context. More regulations need to be investigated and both TFs and miRNAs are then considered.

Figure 4
VEGFA sub-networks. Context from the Brain-coX reference data set, in which individuals with abnormal brains are included among the 457 individuals of age ranging widely across 908 samples from cerebellum and frontal cortex.

NAc Transcription Factor-Mediated Regulations
In Figure 5 TFs are displayed with reference to their target genes in our candidate gene list. The network at the top panel includes the minimum connected network (214 nodes; 1249 edges). The network at the bottom panel is topologically constrained, showing high-degree nodes. Note that SH3KBP1 is included, and is under regulatory influences from various TFs, namely: a) GATA3, important regulator of T-cell development and related to Notch signaling; b) NFIC, involved in transcription and replication; c) HOXA5, part of homeobox genes whose expression and differentiation are regulated during embryonic development; d) SRF, involved in cell proliferation, differentiation, growth and apoptosis, and acting downstream of MAPK; e) STAT3 (part of the JAK-STAT signaling cascade, acting in response to cytokines and growth factors, and playing a key role in cell growth and apoptosis; f) PPARG, regulating the transcription of many genes; g) NFKB1, involved in many biological processes such as inflammation, immunity, differentiation, cell growth, tumorigenesis and apoptosis; h) YY1, repressing and activating many promoters; i) FOXC1 and FOXL1, involved in regulation of embryonic development, and of metabolism and cell proliferation, respectively; l) SREBF1, whose related pathways are apoptosis and the survival caspase cascade; m) CREB1, whose related pathways are Trk receptor signaling (mediated by MAPK), hypoxia, Dopamine D1-like receptor family signaling, Akt signaling, and involving different cellular processes including the synchronization of circadian rhythmicity.

Figure 5
Transcription factor -target gene networks. Top panel: Whole TF-target network (240 nodes and 1295 edges). Bottom panel -selected sub-network with constraints fixed at degree ≥15 and betweenness ≥ 1. Note among the targets, SH3KBP1 is a highly connected node (indicated by yellow arrow).

NAc miRNA-mediated Regulations
Note that miRNAs may contribute to the long-lasting forms of synaptic plasticity through the modulation of regulatory pathways involving dendritic mRNA trafficking, synaptic mRNA translation, alteration of actin cytoskeleton, neurotransmitter metabolism, etc. [34]. With the scope of accounting for potential regulators of our targets, we examined the whole miRNA-target genes interactome ( Figure  6) by looking at TarBase and miRTarBase. Expectedly, we could find a very rich network of interactions between miRNAs and targets. By constraining topologically this network to nodes of degree level > 5, and centrality > 1, both VEGFA and MCL1 appeared as main hubs. With MCL1 as a target, it appears that miR-615-3p is part of a small group known to target innate and inflammatory factors, while also miR-20a-5p belongs to a family affecting target genes implicated in axonal guidance. This miRNA, together with miR-17-5p, miR-106b, and miR-16 have also been shown to act in concert to regulate E2F activity [35], and to be involved in regulating APP expression [36]. Then, another brain-specific miRNA, miR-124, has been implicated in neuronal differentiation and it promotes nervous system development, identifying a set of target genes involved in stress response and neural plasticity. In particular, miR-124 was among those few identified as cocaine-regulated in rats; and by using lentiviral vectors to over-express or silence this miRNA in NAc played a role in rats' ability to exhibit conditioned place preference for cocaine [37,38]. Overall, miRNA regulation acts on the expression of drug-metabolizing enzymes and transporters. In our specific context, due to the imposed network constraints, miRNA-mediated regulation appears to play a significant role in modulating the effects of cocaine on the nervous system [39,40].

Drug Interactions
The protein-drug interactions were also computed using the DrugBank db, and 13 significant subnetworks were identified. While the primary and richest network is centered on ACHE, or Acetylcholinesterase found in neuronal synapses and with a role in neuronal apoptosis (in particular, it terminates the signal transduction at the neuromuscular junction), the second sub-network is centered on VEGFA, as shown in Figure 7 (top panel). When the protein-chemical interactions is retrieved from the Comparative Toxicogenomics Database (CTD), the network (Figure 7, bottom panel) reveals 3 main communities (computed by the label propagation algorithm [41]) with one centered on VEGFA and overlapping to some extent with MCL1, due to high degree and centrality scores.

Discussion and Conclusions
Repeated exposure to drugs may lead to persistent changes in gene expression that are suspected to induce structural plasticity adaptation in multiple brain regions. Assuming that brain dynamics may show signatures of such adaptive process, transcriptome data were used to explore the significance of specific network patterns belonging to particularly entangled states, i.e., those tightly correlated because embedded in cluster motifs. Some of these emerged significantly from our analyses. In order to infer with confidence, we cross-referenced our detections with various methods and biological sources, including multiple studies on the brain effects from drug use. Through the employed coexpression and PII networks multiple types of associations and interactions were investigated to capture differential transcriptome features possibly associated to perturbation effects exerted over the selected brain region. One in particular, NAc, allows to observe specific neuronal functionality related to the reward circuit.
As a general consideration emerging from our findings, the identified brain transcriptome landscape reflects not surprisingly a diffuse inhibitory response. A mix of brain regulatory drivers was found to underlie the specific patterns identified through network motifs. As the milieu of neural types that can be addressed in each particular brain region is different, this variety can affect the neuronal connectivity patterns and the way in which responses to stimuli occur. A recent study emphasized the fact that the progression from recreational to recurrent drug use involves distinct brain areas at different stages of the drug addiction process [42]. Interestingly, the continuity of exposure to drugs justifies distinct impacts over the neuronal structural plasticity upstream NAc medium spiny neurons.
The study has demonstrated that the identifications obtained from the transcriptome can reveal, when interconnected, functionally relevant patterns reflecting entangled states of the system under study. The predominant patterns emphasized in our study appear mostly down-regulated and centered on the association between VEGFA and MCL1. This is especially highlighted in the PPI network, although both recur significantly in other types of networks, such as miRNA-targets, drug interaction, and protein-chemical interaction ones. This convergence seems to indicate a clear effect on regulation visible across various network configurations of regulation dynamics. One aspects that these marks suggest is the non-prevalence of specific effects when considering either acute or chronic drug use based on the identifications related to the NAc response. However, whether these differential transcriptome patterns can offer sufficient evidence to support causative relations with either chronic or acute drug effects remains unclear. This conclusion may indeed hold also because of the possible presence of brain disorders of difficult diagnosis, bringing up the issue of regulatory roles that may be played by comorbidities.
Overall, the inherent limitations of our analysis have nevertheless important implications for future studies. Inference on the effects of drug use may reasonably start with transcriptomic measurements as these determine influences on brain regulatory circuits observed at a genomic scale, therefore systemically. Following such step, multiple data sensitivities should be considered and investigated by integrative omics analyses, indeed a necessary direction for future studies [43]. A promising area is methylome-transcriptome integrated profiles to assess complex regulatory influences. In turn, the methodologies to conduct inference should adapt to the complexity of data integrations in order to increase the confidence in the results and consolidate them through biological validations. Drug effects may likely determine complex stratifications that are hardly detectable by specific measurements or perhaps only approximated (given suitable controls, depending on confounders, etc.) at either transcriptome and/or methylome levels. At a structural level, it is important the role of novel transcript features, for instance gene fusion delivered by integrative omics studies [44]. At a functional level, it is important to build joint profiles and establish through them the influences that can be measured at systems level. Finally, at the computational level it is important to leverage differential networks and measure their sensitivity to different perturbation levels by mapping extensively the brain system.

Author Contributions
Enrico Capobianco did all the research work of this study.