Discerning Functional Connections in the Pulsed Neural Networks with the Dynamic Bayesian Network Structure Search Method Based on a Genetic Algorithm

Abstract

It is important to explore the potential structural characteristics of biological networks and the regulatory mechanisms of network behaviors at the system level. In this study, a dynamic Bayesian network structure search method (DBNSSM) based on a genetic algorithm is employed to infer and locate functional connections in pulsed neural networks (PNNs), which are typical artificial neural networks. During the network structure search, a minimum description length score is calculated for each candidate network structure. The score reflects two characteristics of the network structure: (1) its likelihood given the network dynamic response data and (2) its complexity. Both must be considered together when selecting network structures. The DBNSSM is applied to time-series data from PNNs and thereby discerns the functional connections that collectively define the network structure. Analyzing multichannel electrophysiological data of biological neural networks with the DBNSSM is therefore feasible.

1. Introduction

Molecular regulatory mechanisms in neural systems have been intensely studied, and the basic processes of information handling by neurons are well understood (Trimble et al., 1991; Schuldiner et al., 1995). However, little evidence is available on the fine structure of complex biological neural networks (BNNs), for two major reasons (Wu et al., 2006; Greicius et al., 2009): (1) noninvasive measurements that record the electrical activity of BNNs at high temporal and spatial resolution are lacking; and (2) even if mass data are available, effective “inverse engineering” methods are still required to retrieve the basic structural information and fundamental working mechanisms of BNNs.

Substantial efforts have been made to identify functional connections in BNNs. Promising experimental methods such as multiple neuron recordings (Kruger, 1983), voltage-sensitive dyes (Baker et al., 2005), and multielectrode arrays (Erickson et al., 2008; Kang et al., 2009; Spira and Hai, 2013) can simultaneously monitor the electrical activity of a large number of neurons in real time and provide multichannel neural spike data with sufficient temporal and spatial resolution. Multichannel simultaneous recordings produce multivariable datasets that contain abundant causal information on the investigated biological networks, and these datasets can be used to infer network structures and regulatory mechanisms. Thus, data-based inference of connective structures and subsequent examination of the underlying regulatory mechanisms play a central role in cutting-edge neuroscience and computational biology research (Watabeuchida et al., 2012).

Various methods for inferring biological network structure have been developed in recent decades, such as the Boolean network model (Shmulevich et al., 2002), differential equation modeling (Kim et al., 2007), the likelihood-ratio test method (Caines and Chan, 1975), and Granger causality (Doerfler et al., 2013). These methods have been extensively applied to investigate causal dependency in protein–protein networks, genetic regulatory networks, and metabolite networks. However, most of them are of limited use for analyzing BNNs because they simplify or linearize the system dynamics and require detailed a priori knowledge of system order and structure. Owing to the nonlinear threshold property of the neuronal membrane potential and to neuronal plasticity, the dynamics of BNNs are more complex and call for a nonlinear or probabilistic method.

Bayesian networks (BNs) are probabilistic graphical models that naturally represent the static probabilistic dependency among the network variables involved. BNs have been intensively applied to multivariable biological network data analysis (Jansen et al., 2003; Yu et al., 2004). Dynamic Bayesian networks (DBNs) can model the stochastic evolution of a set of random variables over a time horizon (Perrin et al., 2003; Hanks and Madigan, 2005; Zou and Conzen, 2005) and are thus more suitable for dynamical causality analysis of multivariable datasets. DBN modeling is essentially a multivariable time-series analysis method that analyzes all dynamical couplings in the network simultaneously. DBNs possess significant advantages over competing representations, such as the Granger causality test (Doerfler et al., 2013), which is essentially a linear system analysis method; Kalman filters (Chen, 2003), which handle only unimodal posterior distributions with linear models; and hidden Markov models (HMMs; Smyth et al., 1997), whose parameterization grows exponentially with the number of state variables. In the last decade, DBNs have been extensively applied to investigate causal dependency in genetic regulatory networks (Friedman et al., 2000; Jansen et al., 2003; Perrin et al., 2003; Zou and Conzen, 2005). Nevertheless, the use of DBNs to characterize neuronal networks at the single-neuron and ensemble levels has not been fully studied.

In this study, a dynamic Bayesian network structure search method (DBNSSM) based on a genetic algorithm (GA) is employed to infer and locate the underlying dependency structures of pulsed neural networks (PNNs), typical artificial BNNs. During the network structure search, a minimum description length (MDL) score, which includes both the likelihood of the network structure and the complexity of its connective structure, is evaluated for each candidate network structure. A globally optimal structure is sought using a GA, a global search method that mimics the process of DNA evolution (Whitley, 2014). Finally, the DBNSSM is applied to time-series data obtained from PNNs, which are based on “integrate and fire” neurons (Doerfler et al., 2013). As a result, functional connections in PNNs can be correctly inferred, which shows the effectiveness of the DBNSSM for BNN structure inference. The remainder of this study is organized as follows: the DBNSSM is presented in Section 2, numerical studies and the comparison with simulated annealing are reported in Section 3, and conclusions and discussion are given in Section 4.

2. Methods

2.1. Dynamic Bayesian networks

BNs (Lam and Bacchus, 1994) are a special case of probabilistic graphical models, which are diagrammatic representations of probability distributions (Buntine, 1994; Bach and Jordan, 2004; Friedman, 2004). A BN represents the probabilistic dependency among variables using a directed acyclic graph (DAG), which consists of nodes (also called vertices) connected by directed links (also called edges or arcs) and contains no cyclic path.

BNs provide a common modeling framework for various existing modeling methods, such as the naive Bayesian model, implicit class model, HMM, and Kalman filter. Among the many potential “inverse engineering” approaches applied to infer the functional connections of BNNs (ranging from simple unions and intersections of datasets to Kalman filters, artificial neural networks, Granger causality, HMMs, and support-vector machines), BNs offer several advantages (Friedman et al., 1997; Heckerman, 1999): (1) enabling the combination of highly dissimilar types of data (i.e., numerical and categorical) by converting them into a common probabilistic framework without unnecessary simplification; (2) accommodating missing data; and (3) naturally weighting each information source according to its reliability. However, BNs have some fundamental limitations when applied to multichannel biological data, such as the requirement of an acyclic structure and the restriction to static data. To overcome these shortcomings, DBNs are introduced to represent and reason about the stochastic evolution of a set of dynamical random variables over time (Yu et al., 2004), and they are more suitable for dynamical causality analysis of multivariable data.

A biological network containing n nodes is represented as a discrete variable set X = {X₁, X₂, …, X_n}. Each variable (or node) X_i (i = 1, 2, …, n) has s_i possible values: $w_{i1}, w_{i2}, \ldots, w_{is_i}$. The parent set of X_i is denoted as Pa(X_i), which contains q_i nodes and has a total of p_i possible unique instantiations, that is,

$$p_i = \prod_{X_l \in \mathrm{Pa}(X_i)} s_l. \tag{1}$$

With v_ij (j = 1, 2, …, p_i) representing the j-th unique instantiation of Pa(X_i), the conditional probability $\mathrm{P}\left(x_i = w_{ik} \mid \mathrm{Pa}(x_i) = v_{ij}\right) = \theta_{i,j,k}$ denotes the probability of X_i taking its k-th value (k = 1, 2, …, s_i) when Pa(X_i) takes its j-th unique instantiation. Clearly, for any i and j, $\sum_{k=1}^{s_i} \theta_{i,j,k} = 1$. All values of $\theta_{i,j,k}$ compose a conditional probability table (CPT) for the network.

A BN is defined as a DAG with a joint probability distribution over a variable set X, formally denoted as $BN = (G, \Theta)$. The vertices of the DAG G correspond to the network variables {X₁, X₂, …, X_n}; $\Theta$ is the CPT given the network structure G. It is reasonable to assume that each node X_i is independent of its nondescendants given its parent nodes. Following the chain rule, the joint probability of a BN can be formulated as a product of the conditional probabilities specified for all network variables, that is,

$$\mathrm{P}(X_1, X_2, \ldots, X_n) = \prod_{i=1}^{n} \mathrm{P}\left(X_i \mid \mathrm{Pa}(X_i)\right). \tag{2}$$

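A BN describes the probability distribution of a biological network over a static dataset. A DBN extends this representation to dynamical processes and can therefore explain the dynamical causality within biological networks. In the formulation of a DBN, the joint probability distribution over the time stamps t = 1, 2, …, T is written as follows:

$$\mathrm{P}\left(\mathbf{X}(0), \mathbf{X}(1), \ldots, \mathbf{X}(T)\right) = \mathrm{P}\left(\mathbf{X}(0)\right) \cdot \prod_{t=1}^{T} \mathrm{P}\left(\mathbf{X}(t) \mid \mathbf{X}(t-1)\right), \tag{3}$$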

where the vector $\mathbf{X}(0)$ represents the initial states of a prior network. Although not stated explicitly, the foregoing derivation relies on the first-order Markov assumption, that is, the current network states are determined only by the states one step earlier and are independent of all earlier states. Thus, a DBN, formulated as a product of state transition probabilities for all network variables, is given as follows:

$$\mathrm{P}\left(\mathbf{X}(0), \ldots, \mathbf{X}(T)\right) = \mathrm{P}\left(\mathbf{X}(0)\right) \cdot \prod_{t=1}^{T} \prod_{i=1}^{n} \mathrm{P}\left(X_i(t) \mid \mathrm{Pa}\left(X_i(t)\right)\right). \tag{4}$$
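As a minimal sketch of how the transition probabilities of a first-order DBN could be estimated from a discretized time series, the following Python fragment counts how often each child value at time t follows each parent configuration at time t − 1 and normalizes the counts. The function names, the data layout (one tuple of node states per time step), and the toy series are illustrative assumptions and are not taken from the DBNSSM implementation.

```python
from collections import Counter

def transition_counts(series, child, parents):
    """Count how often the child takes value k at time t while its
    parents (read at time t-1) are in configuration j."""
    counts = Counter()
    for t in range(1, len(series)):
        config = tuple(series[t - 1][p] for p in parents)
        counts[(config, series[t][child])] += 1
    return counts

def estimate_cpt(counts):
    """Maximum-likelihood estimate: count(j, k) / count(j)."""
    totals = Counter()
    for (config, _), n in counts.items():
        totals[config] += n
    return {(config, value): n / totals[config]
            for (config, value), n in counts.items()}

# Toy data (assumed): node 1 copies node 0 with a one-step delay.
series = [(0, 0), (1, 0), (0, 1), (1, 0), (1, 1), (0, 1), (0, 0)]
cpt = estimate_cpt(transition_counts(series, child=1, parents=[0]))
```

In this toy series the child always repeats the previous state of its parent, so both observed entries of the estimated table equal 1.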

2.2. MDL criteria

A variety of methods can evaluate the “loss functions” of BNs or DBNs with respect to a given network dataset. Loss functions are constructed using a number of typical criteria: expectation–maximization (Friedman, 1998), the Akaike information criterion (Akaike et al., 1998), the Bayesian information criterion (Schwarz, 1978), the Bayesian Dirichlet equivalent (Heckerman et al., 1995), and MDL. The MDL principle was originally proposed in studies exploring universal coding (Rissanen, 1978) and was subsequently applied to structure learning of BNs (Lam and Bacchus, 1994). According to the MDL criterion, the optimal BN structure is the network that minimizes the sum of the network parameter length and the network response data length, which means that network complexity must be balanced against how well the network structure matches the response data. For this purpose, the MDL score consists of the following two parts:

(1) The network parameter length

An n-node BN requires $\log_2(n)$ bits to encode the index of one node (using binary coding). The parent list of the i-th node can therefore be encoded in $q_i \cdot \log_2(n)$ bits, since it has q_i parent nodes. Hence, to store the structure of a BN, the required binary data length is

$$\sum_{i=1}^{n} q_i \cdot \log_2(n). \tag{5}$$

In addition, in a CPT, each variable X_i has $p_i \times (s_i - 1)$ parameters: since $\sum_{k=1}^{s_i} \theta_{i,j,k} = 1$ for any i and j in the network, only (s_i − 1) independent parameters (instead of s_i) are required for each fixed index j (corresponding to one column of the CPT). The estimation of each conditional probability requires $\frac{1}{2}\log_2(m)$ bits of data length, where m is the number of data instances of X. For a DBN, m represents the length of the time series; that is, $m = T + 1$. Thus, the total length required for saving a full CPT is as follows:

$$\frac{1}{2}\log_2(m) \sum_{i=1}^{n} p_i \left(s_i - 1\right). \tag{6}$$

(2) The compressed length of the response data

Supposing that all data instances are independent and the dataset D is complete, the binary length of all instance data can be evaluated in the form of a conditional information entropy:

$$
\begin{aligned}
DL_{data}(\mathbf{D} \mid BN) &= (T + 1) \sum_{i=1}^{n} H\left(X_i \mid \mathrm{Pa}(X_i)\right) \\
&= -\log_2 \mathrm{P}(\mathbf{D} \mid BN) \\
&= -\sum_{t=0}^{T} \log_2 \mathrm{P}(D_t \mid BN) \\
&= -\log_2 \left( \mathrm{P}\left(\mathbf{X}(0)\right) \cdot \prod_{t=1}^{T} \mathrm{P}\left(\mathbf{X}(t) \mid \mathbf{X}(t-1)\right) \right) \\
&= -\log_2 \prod_{i=1}^{n} \prod_{j=1}^{p_i} \prod_{k=1}^{s_i} \left( N_{ijk} / N_{ij} \right)^{N_{ijk}},
\end{aligned}
\tag{7}
$$

where $H(X \mid Y) = -\sum P(x \mid y) \log P(x \mid y)$ is the information entropy of X conditioned on Y; N_ijk counts the number of instances in which X_i = w_ik given that Pa(X_i) = v_ij; and N_ij is the total number of instances in which Pa(X_i) = v_ij over all values of X_i, that is, $N_{ij} = \sum_{k=1}^{s_i} N_{ijk}$. Combining the three data lengths in Equations (5) to (7), the total MDL score of a structured BN or DBN given all data instances is as follows:

$$\mathrm{MDL}(BN \mid \mathbf{D}) = \sum_{i=1}^{n} q_i \cdot \log_2(n) + \frac{1}{2}\log_2(m) \sum_{i=1}^{n} p_i \left(s_i - 1\right) + DL_{data}(\mathbf{D} \mid BN). \tag{8}$$

Based on the MDL criterion, the optimal model structure is selected as the one with the minimum MDL score.
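The scoring described in this section can be sketched in a few lines of Python. The fragment below is an illustration under stated assumptions rather than the authors' implementation: all nodes share the same number of states, the parents of a node are read from the previous time step, the initial-state term P(X(0)) is ignored, and the function name `mdl_score` and the toy series are hypothetical.

```python
import math
from collections import Counter

def mdl_score(series, parents, n_states=2):
    """MDL score of a first-order DBN structure on a discrete time series.

    series  : list of tuples, one tuple of node states per time step
    parents : parents[i] lists the nodes whose state at t-1 drives node i at t
    """
    n = len(series[0])
    m = len(series)                                   # m = T + 1 samples
    # Structure length: q_i * log2(n) bits per node.
    dl_struct = sum(len(pa) for pa in parents) * math.log2(n)
    # CPT length: 0.5 * log2(m) bits for each of the p_i * (s_i - 1) parameters.
    n_params = sum(n_states ** len(pa) * (n_states - 1) for pa in parents)
    dl_cpt = 0.5 * math.log2(m) * n_params
    # Data length: -log2 of the product of (N_ijk / N_ij) ** N_ijk.
    dl_data = 0.0
    for i, pa in enumerate(parents):
        n_ijk, n_ij = Counter(), Counter()
        for t in range(1, m):
            cfg = tuple(series[t - 1][p] for p in pa)
            n_ijk[(cfg, series[t][i])] += 1
            n_ij[cfg] += 1
        dl_data -= sum(c * math.log2(c / n_ij[cfg])
                       for (cfg, _), c in n_ijk.items())
    return dl_struct + dl_cpt + dl_data

# Toy data (assumed): node 1 copies node 0 with a one-step delay.
x0 = [0, 0, 0, 1, 0, 1, 1, 1] * 15
series = [(x0[t], x0[t - 1] if t > 0 else 0) for t in range(len(x0))]
score_true = mdl_score(series, [[], [0]])    # structure with edge 0 -> 1
score_empty = mdl_score(series, [[], []])    # structure without edges
```

On this toy series the structure containing the true edge receives the lower score, because the few extra bits spent on the edge and its CPT column are far outweighed by the shorter encoding of node 1.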

2.3. GA for network structure optimization

Many algorithms can be used to search for the optimal structure of a BN with the lowest loss function, that is, the minimum MDL score. Because of their computational simplicity, best-first search (Chickering, 2003) and greedy search (Burnhan and Anderson, 2002) are commonly employed for BN structure optimization. Both algorithms are limited in that they often converge to locally optimal solutions that depend strongly on the initial structure. Simulated annealing (SA; Glover, 1990) is an alternative for network structure optimization, since it is a global optimization algorithm. However, the SA algorithm is typically criticized for its relatively long computational time to converge (Glover, 1990).

In this study, a global optimization algorithm is used for structure inference of DBNs in PNNs. Combinations of DBNs and GAs have been reported previously (Wang et al., 2006; Ross and Zuviria, 2007), with a focus on the theoretical derivation of GAs for DBN structure search. A GA is a parallel search algorithm that mimics the inheritance and mutation of genetic information from parents to children. The fitness of the children can thus be rapidly improved globally by repeating three major steps: mutation, crossover, and selection (Schmitt, 2004). Details of the GA used in the DBNSSM are provided in Section S1 in Supplementary Material.
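The three GA steps can be sketched as follows. This is a minimal illustration and not the GA of Section S1: the chromosome encoding (one bit per candidate edge, no self-connections), truncation selection, one-point crossover, bit-flip mutation, and all parameter values (`pop_size`, `generations`, `p_mut`) are assumptions, and `mdl_score` is a hypothetical helper that ignores the initial-state term.

```python
import math
import random
from collections import Counter

def mdl_score(series, parents, n_states=2):
    """MDL score of a first-order DBN structure (lower is better)."""
    n, m = len(series[0]), len(series)
    score = sum(len(pa) for pa in parents) * math.log2(n)
    score += 0.5 * math.log2(m) * sum(
        n_states ** len(pa) * (n_states - 1) for pa in parents)
    for i, pa in enumerate(parents):
        n_ijk, n_ij = Counter(), Counter()
        for t in range(1, m):
            cfg = tuple(series[t - 1][p] for p in pa)
            n_ijk[(cfg, series[t][i])] += 1
            n_ij[cfg] += 1
        score -= sum(c * math.log2(c / n_ij[cfg])
                     for (cfg, _), c in n_ijk.items())
    return score

def ga_search(series, pop_size=30, generations=40, p_mut=0.05, seed=0):
    """Search parent sets with selection, crossover, and mutation."""
    rng = random.Random(seed)
    n = len(series[0])
    # One gene per candidate edge source -> target; self-connections excluded.
    genes = [(src, tgt) for src in range(n) for tgt in range(n) if src != tgt]

    def decode(chrom):
        return [[src for (src, tgt), bit in zip(genes, chrom)
                 if bit and tgt == i] for i in range(n)]

    def fitness(chrom):
        return mdl_score(series, decode(chrom))

    pop = [[rng.randint(0, 1) for _ in genes] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness)
        survivors = pop[:pop_size // 2]               # selection (elitist)
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            cut = rng.randrange(1, len(genes))        # one-point crossover
            child = a[:cut] + b[cut:]
            child = [bit ^ (rng.random() < p_mut) for bit in child]  # mutation
            children.append(child)
        pop = survivors + children
    best = min(pop, key=fitness)
    return decode(best), fitness(best)

# Toy data (assumed): node 1 copies node 0 with a one-step delay.
x0 = [0, 0, 0, 1, 0, 1, 1, 1] * 15
series = [(x0[t], x0[t - 1] if t > 0 else 0) for t in range(len(x0))]
best_parents, best_score = ga_search(series)
```

Because the better half of each generation is carried over unchanged, the best score never increases from one generation to the next; on the two-node toy series the search settles on the single edge from node 0 to node 1.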

2.4. Pulsed neural networks

The DBNSSM with the GA is applied to time-series data obtained from PNNs (Maass and Bishop, 1999) to test its effectiveness for network structure inference. PNNs are artificial networks based on “integrate and fire” neurons, also called spiking neurons (Back and Hoffmeister, 1991). Spiking neuron models use recent insights from neurophysiology and temporal coding to pass information between neurons (Maass, 1997), which closely mimics realistic communication between neurons. The fundamental mathematical description can be found in Section S2 in Supplementary Material, where Supplementary Figure S1 provides the kernel functions used in the mathematical modeling and the firing process of a membrane voltage.
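For readers without access to the supplementary model, the integrate-and-fire idea can be illustrated with a simple leaky integrate-and-fire network. The sketch below is a stand-in, not the kernel-based model of Section S2: the update rule, the function name `simulate_pnn`, and every parameter value except the synaptic efficacy 2.7/n are assumptions chosen only so that the example runs.

```python
import random

def simulate_pnn(weights, duration_ms=5000, dt=1.0, tau=10.0,
                 threshold=1.0, drive=0.12, noise=0.05, seed=0):
    """Simulate a small network of leaky integrate-and-fire neurons.

    weights[j][i] is the synaptic efficacy from neuron j to neuron i.
    Returns one list of spike times (in ms) per neuron.
    """
    rng = random.Random(seed)
    n = len(weights)
    v = [0.0] * n                      # membrane potentials
    fired = [False] * n                # which neurons fired in the last step
    spikes = [[] for _ in range(n)]
    for step in range(int(duration_ms / dt)):
        t = step * dt
        new_fired = [False] * n
        for i in range(n):
            # Input from neurons that fired in the previous step.
            syn = sum(weights[j][i] for j in range(n) if fired[j])
            # Leak toward rest, constant drive, synaptic input, and noise.
            v[i] += dt * (-v[i] / tau + drive) + syn + noise * rng.gauss(0, 1)
            if v[i] >= threshold:
                spikes[i].append(t)
                v[i] = 0.0             # reset after firing
                new_fired[i] = True
        fired = new_fired
    return spikes

# A 3-node chain 0 -> 1 -> 2 with efficacy 2.7 / n.
w = 2.7 / 3
weights = [[0, w, 0], [0, 0, w], [0, 0, 0]]
spikes = simulate_pnn(weights, duration_ms=1000)
```

Each returned list is one channel of the n-channel spike recording; such lists can then be discretized into the binary time series on which the structure search operates.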

3. Results

3.1. Inferring functional connections among PNNs

To illustrate its ability to infer functional connections, the DBNSSM is applied to 2- and 3-node PNNs with the topologies shown in Figure 1. For simplicity, we assume that the synaptic efficacy is the same for all connections at a given network scale and is inversely proportional to the number of network nodes: $w_{ij} = 2.7/n$. For example, for 3-node PNNs, all $w_{ij} = 0.9$.

FIG. 1.

All potential topologies of 2- and 3-node PNNs (not considering the node numeral order): (a) 2-node network structures, (b) 3-node network structures. Arrows indicate the directions of synaptic regulation.

For all tested 2- and 3-node PNN topologies, the same set of parameter values was assigned to each neuron (Supplementary Fig. S1). Each network is simulated for 5 seconds and generates n-channel spike trains from its n neurons (n = 2 or 3). The spike recordings are sampled according to the protocol in Section S3 in Supplementary Material.

For an example 3-node PNN (Fig. 2a), the spike trains generated by the network simulation are shown in Figure 2b, and the binary discretized time series are shown in Figure 2c. Based on these time-series data, the DBNSSM was applied to the 2- and 3-node PNNs to infer their internal functional connections. For the network structure search, the GA is employed to find the optimal structure based on the multivariable time series from the network simulation. The major parameters of the GA are selected as described in Section S1 in Supplementary Material. As a result, functional connections are successfully inferred for most of the 2- and 3-node network topologies. The inference results are summarized in the first two rows of Table 1.

FIG. 2.

Simulation of a 3-node PNN. (a) Network topology. The numerical value near each arrow represents the strength of synaptic interactions. Arrows denote the direction of synaptic connections. (b) The spike trains of the 3-node PNN, that is, neuronal recordings (the total simulation period is 5 seconds and only 2 seconds is shown here). (c) The two-valued discretized time-series obtained from the 3 spike trains after binning at 10-ms intervals.
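The binning step mentioned in panel (c) can be sketched as follows. The sampling protocol itself is given in Section S3; the rule used here (a bin is set to 1 if it contains at least one spike), the function name, and the example spike times are assumptions for illustration only.

```python
def bin_spike_trains(spike_trains, duration_ms, bin_ms=10):
    """Convert spike times (ms) into a two-valued time series.

    Returns one tuple per bin; entry i is 1 if neuron i fired at least
    once inside that bin and 0 otherwise.
    """
    n_bins = int(duration_ms // bin_ms)
    series = [[0] * len(spike_trains) for _ in range(n_bins)]
    for i, train in enumerate(spike_trains):
        for t in train:
            b = int(t // bin_ms)
            if 0 <= b < n_bins:        # ignore spikes outside the window
                series[b][i] = 1
    return [tuple(row) for row in series]

# Hypothetical spike times for two neurons over 50 ms.
trains = [[3.0, 27.5, 41.0], [12.2, 48.9]]
binary = bin_spike_trains(trains, duration_ms=50)
# binary == [(1, 0), (0, 1), (1, 0), (0, 0), (1, 1)]
```

With 10-ms bins, a 5-second recording yields 500 samples per channel.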

Table 1.

Functional Connection Inference Results for 2- to 50-Node Pulsed Neural Networks Using the Dynamic Bayesian Network Structure Search Method

Network scale	No. of tested structures	Maximum possible no. of network connections	CIR	FPIR	FNIR
2-node	100	200	0.9419	0.0465	0.0116
3-node	100	600	0.9789	0.0141	0.007
4-node	100	1200	0.9489	0.0511	0.0284
5-node	100	2000	0.896	0.0298	0.0742
10-node	100	9000	0.8323	0.125	0.0427
20-node	100	38000	0.8098	0.1364	0.0538
30-node	100	87000	0.8199	0.13	0.0501
40-node	100	156000	0.8196	0.1302	0.0502
50-node	100	245000	0.8037	0.1414	0.0549

For each network scale, 100 randomly connected networks are tested using the dynamic Bayesian network optimization method. The CIR, FPIR, and FNIR are calculated.

CIR, correct inference ratio; FNIR, false-negative inference ratio; FPIR, false-positive inference ratio.

The DBNSSM can also be applied to 4-, 5-, 10-, 20-, 30-, 40-, and 50-node PNN structures. We randomly construct 100 network structures for each network scale and conduct simulations similar to those for the 2- and 3-node networks. The connectivity ratio is set to 20%, which closely mimics the real situation of in vitro cultured biological neuronal networks. Self-connections are prohibited for every node; thus, the total number of connections and disconnections is $n(n-1)$. Key features such as the correct inference ratio (CIR), the false-positive inference ratio (FPIR), and the false-negative inference ratio (FNIR) are calculated relative to the total number of connections and disconnections, as shown in Table 1. The CIR remains high at all network scales, ranging from 0.8037 to 0.9789. The highest CIR, 0.9789, is obtained for 3-node networks. The FNIR first falls to 0.007 for 3-node networks and then rises to its maximum of 0.0742 for 5-node networks. The CIR drops from 0.896 for 5-node networks to 0.8323 for 10-node networks and then fluctuates only slightly from 10-node up to 50-node networks.
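A minimal sketch of how the three ratios could be computed for a single network is given below. It assumes that each ratio is normalized by the n(n − 1) possible directed connections, as the description above suggests; the function name and the example matrices are hypothetical.

```python
def inference_ratios(true_adj, inferred_adj):
    """Return (CIR, FPIR, FNIR) for one network.

    Both arguments are n x n 0/1 matrices in which entry [i][j] = 1 means
    a connection from node i to node j; the diagonal is ignored.
    """
    n = len(true_adj)
    total = n * (n - 1)                # connections plus disconnections
    correct = false_pos = false_neg = 0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            if true_adj[i][j] == inferred_adj[i][j]:
                correct += 1
            elif inferred_adj[i][j]:
                false_pos += 1         # inferred but not present
            else:
                false_neg += 1         # present but not inferred
    return correct / total, false_pos / total, false_neg / total

# Hypothetical 3-node example: true edges 0->1 and 1->2,
# inferred edges 0->1 and 0->2.
true_adj = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]
inferred = [[0, 1, 1], [0, 0, 0], [0, 0, 0]]
cir, fpir, fnir = inference_ratios(true_adj, inferred)
```

Here four of the six possible connections are classified correctly, one is a false positive, and one is a false negative, so the three ratios are 4/6, 1/6, and 1/6 and sum to 1.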

3.2. A comparison between GA and SA

To show the advantage of the GA over other network structure search methods, we also test the SA algorithm (Glover, 1990), a global optimization method, to optimize DBN structures of PNNs. The SA algorithm is set up as follows:

The SA algorithm starts from the empty network structure $\Theta_0$ and continues until either a maximum of 100 steps is reached or a state with the minimum MDL score or less is observed. The acceptance probability of the updated structure is 1/(1 + exp(MDL_new – MDL_old)/T_max), where MDL_new is the MDL score of the updated structure and MDL_old is the MDL score of the current structure. T_max is the initial annealing temperature and is set to 10. The damping factor of the annealing temperature is 0.9.
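The SA setup can be sketched as follows. Several points are assumptions: the acceptance rule is read as 1/(1 + exp((MDL_new − MDL_old)/T)) with T starting at T_max = 10 and multiplied by 0.9 after every step, a move flips one randomly chosen edge, the early-stopping condition is omitted, and the stand-in score used in the example is not an MDL score.

```python
import math
import random

def accept_prob(mdl_new, mdl_old, temp):
    """Probability of accepting the updated structure (assumed reading)."""
    x = (mdl_new - mdl_old) / temp
    if x > 700:                        # avoid overflow in exp for tiny temp
        return 0.0
    return 1.0 / (1.0 + math.exp(x))

def anneal(score, n_edges, steps=100, t_max=10.0, damping=0.9, seed=0):
    """Simulated annealing over edge bit-vectors, starting from no edges."""
    rng = random.Random(seed)
    state = [0] * n_edges              # empty network structure
    current = score(state)
    best, best_score = state[:], current
    temp = t_max
    for _ in range(steps):
        cand = state[:]
        cand[rng.randrange(n_edges)] ^= 1          # flip one edge
        cand_score = score(cand)
        if rng.random() < accept_prob(cand_score, current, temp):
            state, current = cand, cand_score
            if current < best_score:
                best, best_score = state[:], current
        temp *= damping                # cool down
    return best, best_score

# Stand-in score (assumed): Hamming distance to a hypothetical target.
target = [1, 0, 1, 1, 0, 0]
def toy_score(state):
    return sum(a != b for a, b in zip(state, target))

best, best_score = anneal(toy_score, n_edges=len(target))
```

In practice `score` would be the MDL score of the structure encoded by the bit-vector. Equal scores are accepted with probability 0.5 under this rule, and worse structures become increasingly unlikely to be accepted as the temperature falls.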

Ten rounds of 100 simulations were conducted for each of the two algorithms, and the CIR, FPIR, and FNIR of the connection inference were tested separately for each network scale. Statistical hypothesis tests show that the CIR obtained with the GA is significantly greater than that obtained with the SA algorithm. Details of the Behrens–Fisher tests are listed in Section S4 in Supplementary Material.
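The exact test procedure is described in Section S4. As an illustration only, the sketch below computes the Welch statistic, a common approximate treatment of the Behrens–Fisher problem of comparing two means with unequal variances; whether this matches the procedure in Section S4 is an assumption, and the input values are made up.

```python
import math
import statistics

def welch_t(a, b):
    """Welch t statistic and Welch-Satterthwaite degrees of freedom
    for mean(a) - mean(b), without assuming equal variances."""
    va = statistics.variance(a) / len(a)
    vb = statistics.variance(b) / len(b)
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

# Made-up CIR values for illustration only; not results from this study.
cir_ga = [0.95, 0.96, 0.97, 0.96, 0.95]
cir_sa = [0.90, 0.92, 0.91, 0.93, 0.89]
t_stat, dof = welch_t(cir_ga, cir_sa)
```

A one-sided p-value would then be read from the t distribution with `dof` degrees of freedom and compared with the chosen significance level.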

Results of hypothesis tests are summarized in Table 2. The null hypothesis for the mean value of the CIRs, H₀ (i), was rejected for 3-, 4-, and 5-node networks. In addition, the null hypothesis for the mean value of FPIRs, H₀ (ii), was rejected for 3-, 4-, and 5-node networks. This evidence shows that the mean value of the CIR based on the GA is significantly greater than the one based on the SA algorithm for 3-, 4-, and 5-node simulations.

Table 2.

A Comparison Between the Genetic Algorithm and Simulated Annealing Algorithm for Functional Connection Inferences of Pulsed Neural Networks

Null hypothesis	H₀ (i)	H₀ (ii)	H₀ (iii)
Network scale	CIR	FPIR	FNIR
2-node	0.6388000	0.6388000	Not available
3-node	0.0005244	0.0000361	0.2136000
4-node	0.0001142	0.0000030	0.2567000
5-node	0.0024000	0.0002728	0.1245000

For each network scale, 10 rounds of 100 simulations were conducted for each algorithm. The CIR, FPIR, and FNIR of the connection inference obtained with the two search algorithms are tested separately for each network scale. Two-sample t-tests (left-tailed) are performed to test whether the mean CIR of the GA is significantly greater than that of the SA algorithm. A significance level of $\alpha = 0.05$ is used. The three null hypotheses H₀ are: (i) The mean value of the CIR by the GA is equal to or smaller than that by the SA algorithm. (ii) The mean value of the FPIR by the GA is equal to or greater than that by the SA algorithm. (iii) The mean value of the FNIR by the GA is equal to or greater than that by the SA algorithm.

CIR, correct inference ratio; GA, genetic algorithm; SA, simulated annealing.

4. Discussion and Conclusions

In this study, the DBNSSM is developed to investigate the synthetic BNN structure. Aided by the MDL and GA, the DBNSSM rapidly identifies and locates functional connections without a priori knowledge of the synthetic BNN structures. Particularly, the method is efficient for nonlinear neuronal dynamics, which often reflects the threshold property of neuronal membrane potential and neuronal plasticity. Although DBNs have been extensively applied to investigate causal dependency among genetic regulatory networks (Friedman et al., 2000; Jansen et al., 2003; Perrin et al., 2003; Zou and Conzen, 2005), these analyses have not been fully explored and widely reported in application to BNNs due to the threshold property of neuronal membrane potential and neuronal plasticity.

In addition, the proposed method requires only the measurement of dynamic expression profiles of the network nodes. With advances in high-throughput measurement methods, such as multiple single-neuron recordings, voltage-sensitive dyes, and multielectrode array technology, this modeling technique may soon become applicable to large-scale in vitro or in vivo BNNs. The DBNSSM might greatly improve the understanding of the relationship between network dynamics and network topology.

5. Data Reports

The data used in this study originated from the mathematical model of PNNs for testing the effectiveness of the proposed network structure inference method. The synthetic model and dataset are available to Journal of Computational Biology editors and reviewers during peer review, and can also be provided to any readers if required at the time of publication.

6. Studies Involving Human Subjects, Animal Research, and Clinical Trials

None.

Acknowledgments

This work was financially supported by grants from the National Natural Science Foundation of China (Grant Nos. 61364018, 81503167, and 61863029), the Inner Mongolia Autonomous Region Natural Science Foundation (Grant No. 2016JQ07), and the Program for Young Talents of Science and Technology in Universities of Inner Mongolia Autonomous Region (Grant No. NJYT-15-A05).

Authors' Contributions

C.D. and C.Y.D. conceived and designed the algorithms. X.Y.C. performed the simulations and acquired the data. C.D., C.Y.D., and X.Y.C. analyzed the data. C.D. and C.Y.D. prepared the article.

Author Disclosure Statement

The authors declare that there are no competing financial interests.

Supplementary Material

References

1. Akaike, Petrov B.N., and Csaki. 1998. Information theory and an extension of the maximum likelihood principle, 199–213. In Parzen, Tanabe, and Kitagawa, eds. Selected Papers of Hirotugu Akaike. Springer, New York.

2. Bach F.R. and Jordan M.I. 2004. Learning graphical models for stationary time series. IEEE T. Signal Proces. 52, 2189–2199.

3. Back and Hoffmeister. 1991. Extended selection mechanisms in genetic algorithms, 92–99. In Belew R.K. and Booker L.B., eds. 4th International Conference on Genetic Algorithms. San Mateo, CA.

4. Baker B.J., Kosmidis E.K., Vucinic, et al. 2005. Imaging brain activity with voltage- and calcium-sensitive dyes. Cell. Mol. Neurobiol. 25, 245.

5. Buntine W.L. 1994. Operations for learning with graphical models. J. Artif. Intell. Res. 2, 159–225.

6. Burnham K.P. and Anderson D.R. 2002. Model Selection and Multi-model Inference: A Practical Information-Theoretic Approach. Springer, New York.

7. Caines and Chan. 1975. Feedback between stationary stochastic processes. IEEE Trans. Automat. Contr. 20, 498–508.

8. Chen. 2003. Bayesian Filtering: From Kalman Filters to Particle Filters, and Beyond. Adaptive Systems Lab, McMaster University, Hamilton, Ontario, Canada.

9. Chickering D.M. 2003. Optimal structure identification with greedy search. J. Mach. Learn. Res. 3, 507–554.

10. Doerfler, Lyon, Nagele, et al. 2013. Granger causality in integrated GC-MS and LC-MS metabolomics data reveals the interface of primary and secondary metabolism. Metabolomics, 9, 564–574.

11. Erickson, Tooker, Tai Y.-C., and Pine. 2008. Caged neuron MEA: A system for long-term investigation of cultured neural network connectivity. J. Neurosci. Methods 175, 1–16.

12. Friedman. 1998. The Bayesian structural EM algorithm, 129–138. In Proceedings of the Fourteenth Conference on Uncertainty in Artificial Intelligence (UAI'98), Madison, Wisconsin. Morgan Kaufmann Publishers, Inc., San Francisco, CA.

13. Friedman. 2004. Inferring cellular networks using probabilistic graphical models. Science, 303, 799–805.

14. Friedman, Geiger, and Goldszmidt. 1997. Bayesian network classifiers. Mach. Learn. 29, 131–163.

15. Friedman, Linial, Nachman, et al. 2000. Using Bayesian networks to analyze expression data. J. Comput. Biol. 7, 601–620.

16. Glover. 1990. Tabu search part I. Informs J. Comput. 1, 89–98.

17. Greicius M.D., Supekar, Menon, et al. 2009. Resting-state functional connectivity reflects structural connectivity in the default mode network. Cereb. Cortex 19, 72.

18. Hanks and Madigan. 2005. Probabilistic temporal reasoning. Found. Artif. Intell. 1, 315–342.

19. Heckerman. 1999. A tutorial on learning with Bayesian networks. In Jordan, ed. Learning in Graphical Models. MIT Press, Cambridge, MA.

20. Heckerman, Geiger, and Chickering D.M. 1995. Learning Bayesian networks: The combination of knowledge and statistical data. Mach. Learn. 20, 197–243.

21. Jansen, Yu, Greenbaum, et al. 2003. A Bayesian networks approach for predicting protein-protein interactions from genomic data. Science, 302, 449–453.

22. Kang, Lee J.-H., Lee C.-S., et al. 2009. Agarose microwell based neuronal micro-circuit arrays on microelectrode arrays for high throughput drug testing. Lab Chip 9, 3236–3242.

23. Kim, Kim, and Cho K.-H. 2007. Inferring gene regulatory networks from temporal expression profiles under time-delay and noise. Comput. Biol. Chem. 31, 239–245.

24. Kruger. 1983. Simultaneous individual recordings from many cerebral neurons: Techniques and results. Rev. Physiol. Biochem. Pharmacol. 98, 177–233.

25. Lam and Bacchus. 1994. Learning Bayesian belief networks: An approach based on the MDL principle. Comput. Intell. 10, 269–293.

26. Maass. 1997. Networks of spiking neurons: The third generation of neural network models. Neural Networks, 10, 1659–1671.

27. Maass and Bishop C.M. 1999. Pulsed Neural Networks. MIT Press, Bradford Book, Cambridge, MA.

28. Perrin B.-E., Ralaivola, Mazurie, et al. 2003. Gene networks inference using dynamic Bayesian networks. Bioinformatics, 19(Suppl. 2), ii138–ii148.

29. Rissanen. 1978. Modeling by shortest data description. Automatica, 14, 465–471.

30. Ross B.J. and Zuviria. 2007. Evolving dynamic Bayesian networks with multi-objective genetic algorithms. Appl. Intell. 26, 13–23.

31. Schmitt L.M. 2004. Theory of Genetic Algorithms II: Models for genetic operators over the string-tensor representation of populations and convergence to global optima for arbitrary fitness function under scaling. Theor. Comput. Sci. 310, 181–231.

32. Schuldiner, Shirvan, and Linial. 1995. Vesicular neurotransmitter transporters: From bacteria to humans. Physiol. Rev. 75, 369–393.

33. Schwarz. 1978. Estimating the dimension of a model. Ann. Statist. 6, 461–464.

34. Shmulevich, Dougherty E.R., Kim, et al. 2002. Probabilistic Boolean networks: A rule-based uncertainty model for gene regulatory networks. Bioinformatics, 18, 261–274.

35. Smyth, Heckerman, and Jordan M.I. 1997. Probabilistic independence networks for hidden Markov probability models. Neural Comput. 9, 227–269.

36. Spira M.E. and Hai. 2013. Multi-electrode array technologies for neuroscience and cardiology. Nat. Nanotechnol. 8, 83–94.

37. Trimble W.S., Linial, and Scheller R.H. 1991. Cellular and molecular biology of the presynaptic nerve terminal. Annu. Rev. Neurosci. 14, 93–122.

38. Wang, Yu, and Yao. 2006. Learning dynamic Bayesian networks using evolutionary MCMC, 45–50. In 2006 International Conference on Computational Intelligence and Security, Guangzhou, China.

39. Watabe-Uchida, Zhu, Ogawa S.K., et al. 2012. Whole-brain mapping of direct inputs to midbrain dopamine neurons. Neuron, 74, 858–873.

40. Whitley. 2014. An executable model of a simple genetic algorithm. Found. Genet. Algorithms 2, 45–62.

41. Wu, Yao, Long Z.Y., et al. 2006. Functional connectivity in the resting brain: An analysis based on ICA, 175–182. In Proceedings of the International Conference on Neural Information Processing, ICONIP 2006, Hong Kong, China, October 3–6.

42. Yu, Smith V.A., Wang P.P., et al. 2004. Advances to Bayesian network inference for generating causal networks from observational biological data. Bioinformatics, 20, 3594–3603.

43. Zou and Conzen S.D. 2005. A new dynamic Bayesian network (DBN) approach for identifying gene regulatory networks from time course microarray data. Bioinformatics, 21, 71–79.
