State-of-the-art review on Bayesian inference in structural system identification and damage assessment

Abstract

Bayesian inference provides a powerful approach to system identification and damage assessment for structures. The application of Bayesian methods is motivated by the fact that inverse problems in structural engineering, including structural health monitoring, are typically ill-conditioned and ill-posed when using noisy incomplete data because of various sources of modeling uncertainties. One should not just search for a single “optimal” value for the vector of model parameters but rather attempt to describe the whole family of plausible model parameters based on measured data using a Bayesian probabilistic framework. In this article, the fundamental principles of Bayesian analysis and computation are summarized; then a review is given of recent state-of-the-art practices of Bayesian inference in system identification and damage assessment for civil infrastructure. Discussions of the benefits and deficiencies of these approaches, as well as potentially useful avenues for future studies, are also provided. Our focus is on meeting challenges that arise from system identification and damage assessment for civil infrastructure, but the theories presented also have considerably broader applicability for inverse problems in science and technology.

Keywords

Bayesian inference, Bayesian model class assessment, Bayesian model updating, damage assessment, sparse Bayesian learning, structural health monitoring, structural system identification, uncertainty quantification

Introduction

In the last few decades, worldwide efforts to implement structural health monitoring systems on civil infrastructure have produced large amounts of data. This abundance of data has motivated the development of an increasing number of data processing (information extracting) techniques to understand the behavior and performance of civil infrastructure under real environmental conditions. These computer-based techniques, developed in system identification research (e.g. Beck, 2010; Ghanem and Shinozuka, 1995; Sirca and Adeli, 2012), are key components in model-based inversions for damage detection and assessment. Here, system identification refers to the inverse problem of finding a mathematical model of the structural system on the basis of measured data. However, there are always some challenges encountered in system identification. One challenge is that sensors are typically installed at only a limited number of locations, so that we are unable to resolve detailed spatial information about the structure. In addition, there are always modeling uncertainties involved because of sensor noise, inadequate theory for certain system behaviors, simplifying approximations for developing structural models, thermally induced variations in structural properties, and so on. Due to these facts, no model is expected to exactly represent the system input/output (I/O) behavior.

Inverse problems in structural system identification are typically ill-conditioned and ill-posed when treated deterministically because there is insufficient information in the collected data to precisely determine a model within a realistic class of structural models. One commonly employed approach to handle this difficulty involves regularized parameter estimation approaches, such as Tikhonov regularization (Tarantola, 2005), where a regularization term is added to the data-matching term in the objective function to be minimized. However, the relationship of the resulting unique solution to the solution of the original unregularized inverse problem is uncertain.

The presence of substantial modeling uncertainties suggests that when solving inverse problems in structural system identification, the objective should not be limited to the search for a single “optimal” parameter vector. Instead, an attempt should be made to describe the family of all plausible values of the model parameters based on the available data. Bayesian inference provides a general, rational, and robust tool that is capable of handling the difficulty of non-unique solutions (Beck, 1989; Beck and Katafygiotis, 1991, 1998; Katafygiotis and Beck, 1998). It treats the parameter estimation problem using Bayes’ theorem to determine the posterior distribution for the parameter vector based on the available data. Probability in the Bayesian perspective represents a degree of plausibility of an uncertain proposition conditional on stated information (Cox, 1946; Jaynes, 1957, 2003; Beck, 2010). The proposition may refer to events, structural model parameters or even the model itself. The posterior distribution therefore quantifies the updated relative plausibility of the different values of the model parameters on the basis of the available incomplete information. Similarly, the posterior probability distribution obtained from Bayes’ theorem at the model class level can be used to quantify the plausibility of each model class within a set of candidate model classes for their consistency with both the observed data and the prior information.

The objective of this article is to review Bayesian inference approaches in system identification, especially structural damage assessment, including both vibration-based and wave propagation–based methods. By treating the damage assessment problem within a framework of plausible inference in the presence of incomplete information, the Bayesian framework provides a promising way to locate and assess structural damage, which may occur away from the sensor locations or be hidden from sight. Furthermore, being able to quantify the uncertainties of the structural model parameters is essential for a robust prediction of future safety and reliability of structural systems. This review starts with a section that introduces the underlying theory and perspectives of Bayesian analysis and computation. Section “Applications of Bayesian inference in system identification and damage assessment” gives a literature review for the application of Bayesian methods to real problems of system identification and damage identification for civil infrastructures. This section presents an end-to-end Bayesian framework that starts with building Bayesian models and ends with characterizing the final posterior distribution of the model parameters. Section “Sparse Bayesian learning and applications in structural damage assessment” introduces a recently developed hierarchical sparse Bayesian learning (SBL) methodology to perform sparse stiffness change inference based on the vibration data and to perform flaw (or defect) detection using wave propagation data. In the final section, some conclusions are drawn and suggestions for future research are provided.

Bayesian inference

Bayesian and Frequentist probability

In the area of statistical analysis, there are two broad categories of probability interpretations, namely “Frequentist” and “Bayesian” probabilities. Probability in the Frequentist definition is interpreted as the relative frequency of occurrence of an “inherently random” event in the “long run,” and probability distributions are considered as inherent properties of “random” phenomena (Mises, 1981 [1939]). However, this definition is not always operational because it requires well-defined “random” experiments that can be conceived as repeatable; for example, the probability of a model is not meaningful. It is also not practical for establishing distributions of multi-dimensional continuous variables because of the huge amount of effort that would be required for gathering the necessary relative frequencies in trials. Furthermore, the definition involves the concept of “inherent randomness” of events, which is assumed but cannot be proved (Beck, 2010, 2014).

In contrast, Bayesian probability quantifies the states of plausible knowledge about phenomena because of our limited capacity to collect or understand the relevant information, rather than the existence of “inherent randomness” in nature and the probability axioms can be derived as a multi-valued logic for quantitative plausible reasoning under uncertainty (Beck, 2010; Cox, 1946, 1961; Jaynes, 2003). These probability logic axioms provide a rigorous foundation for applying Bayesian inference. They incorporate not only parametric uncertainty (uncertainty regarding which model in a proposed set should be used to represent the system behavior), but also non-parametric uncertainty because of the existence of model prediction errors resulting from the approximate nature of any model. Under this interpretation, the probability of a model is a measure of its plausibility relative to other models within a set and one’s inferences regarding the relative plausibility of each model are updated through Bayes’ theorem as data evidences accumulate. Such a concept makes Bayesian inference more suitable for inverse problems in structural system identification than the Frequentist approach to inference.

Bayesian model updating

A key concept in Bayesian model updating for systems is a stochastic system model class $M$ , which consists of a set of probabilistic predictive I/O models for the structural system together with a prior distribution over this set that quantifies the initial relative plausibility of each predictive model. The data $D$ can be used to update the relative plausibility of each predictive model in $M$ by computing the posterior probability density function (PDF), $p (w | D, M)$ , for the uncertain model parameters, $w \in W \subset R^{N}$ using Bayes’ theorem

\begin{matrix} p (w | D, M) = p (D | w, M) p (w | M) / p (D | M) \\ = c^{- 1} p (D | w, M) p (w | M) \end{matrix}

(1)

where $c = p (D | M)$ is the normalizing constant, which is called the evidence or marginal likelihood for the model class $M$ given the data $D$ ; $p (D | w, M)$ , as a function of $w$ , is the likelihood function, which expresses the probability of obtaining data $D$ based on the predictive PDF for the response given by model parameters $w$ within $M$ ; and $p (w | M)$ is the prior PDF, which quantifies the initial plausibility of each model defined by the value of the model parameters $w$ . For ill-conditioned inverse problems, the prior PDF can be selected to provide “soft” constraints on the model updating and thereby provide regularization (Bishop, 2006). One important feature of the posterior PDF $p (w | D, M)$ is the maximum a posteriori (MAP) value of the model parameters $w$ , that is, $\hat{w} = \arg\max_{w} p (w | D, M)$ , which is the most probable value of $w$ conditional on the data $D$ . Another useful feature is to define the more plausible values of $w$ by a “confidence interval,” which is an interval of parameter values centered on the MAP value that corresponds to a specified posterior probability of, say, 0.90 or 0.95.
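For a scalar parameter, equation (1) can be evaluated directly on a grid, which makes the roles of the likelihood, the prior, and the normalizing constant concrete. The following is a minimal sketch only: the data values, the prediction-error standard deviation `sigma`, and the Gaussian prior are hypothetical choices made for illustration and do not come from any of the reviewed studies.

```python
import math

# Hypothetical data: noisy measurements of a scalar stiffness scaling parameter w.
data = [0.92, 1.05, 0.98, 1.01, 0.95]
sigma = 0.05                      # assumed known prediction-error standard deviation
prior_mean, prior_sd = 1.0, 0.2   # assumed Gaussian prior p(w | M)

def log_gauss(x, mean, sd):
    return -0.5 * math.log(2 * math.pi * sd**2) - 0.5 * ((x - mean) / sd) ** 2

def log_unnormalized_posterior(w):
    log_like = sum(log_gauss(d, w, sigma) for d in data)    # log p(D | w, M)
    return log_like + log_gauss(w, prior_mean, prior_sd)    # + log p(w | M)

# Evaluate on a uniform grid and normalize numerically (the role of c in equation (1)).
n_grid, lo, hi = 4001, 0.5, 1.5
dw = (hi - lo) / (n_grid - 1)
grid = [lo + i * dw for i in range(n_grid)]
log_post = [log_unnormalized_posterior(w) for w in grid]
shift = max(log_post)                                  # guards against underflow
unnorm = [math.exp(lp - shift) for lp in log_post]
norm = sum(unnorm) * dw
posterior = [u / norm for u in unnorm]                 # p(w | D, M) on the grid

# MAP value: the grid point with the largest posterior density.
i_map = posterior.index(max(posterior))
w_map = grid[i_map]

# Interval centered on the MAP value that contains 90% of the posterior probability.
half = 0
while sum(posterior[max(i_map - half, 0): i_map + half + 1]) * dw < 0.90:
    half += 1
interval = (grid[max(i_map - half, 0)], grid[min(i_map + half, n_grid - 1)])
```

Grid evaluation is only practical for one or two parameters; it is used here because it keeps every term of Bayes' theorem visible.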

Bayesian model class assessment and Bayesian Ockham Razor

In system identification, we are often faced with the problem of choosing the most plausible model class from a set of competing candidates to represent the behavior of the system of interest based on the measured data $D$ , that is, model class assessment (or selection, or comparison) (Beck and Yuen, 2004). Given a discrete set of chosen probabilistic model classes, $M = \{M_{m} : m = 1, 2, \dots, M\}$ , for a system, the posterior probability $P (M_{m} | D, M)$ is computed from Bayes’ theorem at the model class level (our convention is to use $P (\cdot)$ for probabilities and $p (\cdot)$ for PDFs)

P (M_{m} | D, M) = p (D | M_{m}) P (M_{m} | M) / p (D | M)

(2)

In the above, $p (D | M_{m})$ is the evidence for model class $M_{m}$ provided by the data $D$ (additional conditioning on $M$ is irrelevant), which is given by the Total Probability Theorem

p (D | M_{m}) = \int p (D | w, M_{m}) p (w | M_{m}) d w

(3)
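Equations (2) and (3) can be illustrated with a scalar parameter, for which the evidence integral is a one-dimensional quadrature. In this sketch the two model classes `M1` and `M2`, which differ only in their prior on $w$ , as well as the data and `sigma`, are hypothetical and chosen purely for illustration.

```python
import math

data = [0.92, 1.05, 0.98, 1.01, 0.95]   # hypothetical measurements
sigma = 0.05                             # assumed prediction-error standard deviation

def gauss_pdf(x, mean, sd):
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

def likelihood(w):
    """p(D | w, M_m) with independent Gaussian prediction errors."""
    return math.prod(gauss_pdf(d, w, sigma) for d in data)

def evidence(prior_mean, prior_sd, n=20001, lo=-1.0, hi=4.0):
    """Equation (3) by the trapezoidal rule over a scalar parameter w."""
    dw = (hi - lo) / (n - 1)
    total = 0.0
    for i in range(n):
        w = lo + i * dw
        weight = 0.5 if i in (0, n - 1) else 1.0
        total += weight * likelihood(w) * gauss_pdf(w, prior_mean, prior_sd)
    return total * dw

# Two hypothetical model classes: (prior mean, prior standard deviation) for w.
priors = {"M1": (1.0, 0.2), "M2": (2.0, 0.2)}
evidences = {name: evidence(*p) for name, p in priors.items()}

# Equation (2) with equal prior probabilities P(M_m | M) = 1/2.
total_evidence = sum(evidences.values())
post_prob = {name: ev / total_evidence for name, ev in evidences.items()}
```

With these numbers the data lie near $w = 1$ , so almost all posterior probability goes to the model class whose prior is centered there.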

Usually, the model classes are considered equally plausible a priori, that is, $P (M_{m} | M) = 1 / M$ . In that case, the posterior probabilities in equation (2) are proportional to the evidences, so the computation of the multi-dimensional integral in equation (3) for the evidence function is vital in Bayesian model class assessment. It has been shown that the log evidence can be expressed as the difference between two terms (Beck, 2010; Muto and Beck, 2008)

\begin{matrix} \log [p (D | M_{m})] \\ = \int \log [p (D | w, M_{m})] p (w | D, M_{m}) d w \\ - \int \log [\frac{p (w | D, M_{m})}{p (w | M_{m})}] p (w | D, M_{m}) d w \end{matrix}

(4)

The first term is the posterior mean of the log likelihood function, which is a measure of the average goodness-of-fit of the model class $M_{m}$ , and the latter term is the relative entropy of the posterior $p (w | D, M_{m})$ relative to the prior $p (w | M_{m})$ , which is a measure of the amount of information gain about $w$ from the data $D$ (information-theoretic model complexity). The merit of equation (4) is that it shows rigorously, without introducing any ad-hoc concepts, that the log evidence for model class $M_{m}$ explicitly builds in a trade-off between the data-fit of the model class and its information-theoretic complexity. This means that the average log goodness-of-fit is penalized by the information gained about the model parameters in the sense of Shannon (Cover and Thomas, 2006). This trade-off between data-fit and model complexity is known as the Bayesian Ockham Razor (Beck, 2010; Gull, 1988; Jefferys and Berger, 1992; Mackay, 1992). This is important in system identification applications, since overly complex models often lead to overfitting of the data and the subsequent response predictions may be extremely sensitive to the modeling error and details of specific data (measurement noise, environmental effects, etc.). An optimal model class should have good data-fitting capability but small prediction differences due to perturbation of the model parameters.
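The decomposition in equation (4) can be checked numerically for a scalar parameter by computing the data-fit term and the information-gain term separately and comparing their difference with the log evidence. The sketch below uses hypothetical data, a hypothetical noise level, and a hypothetical Gaussian prior; it illustrates the identity and is not an implementation from the cited works.

```python
import math

data = [0.92, 1.05, 0.98, 1.01, 0.95]   # hypothetical measurements
sigma = 0.05                             # assumed prediction-error standard deviation
prior_mean, prior_sd = 1.0, 0.2          # assumed Gaussian prior on w

def log_gauss(x, mean, sd):
    return -0.5 * math.log(2 * math.pi * sd**2) - 0.5 * ((x - mean) / sd) ** 2

n, lo, hi = 8001, 0.0, 2.0
dw = (hi - lo) / (n - 1)
grid = [lo + i * dw for i in range(n)]
log_like = [sum(log_gauss(d, w, sigma) for d in data) for w in grid]
log_prior = [log_gauss(w, prior_mean, prior_sd) for w in grid]

# Log evidence by a Riemann sum, using the log-sum-exp trick for stability.
log_joint = [ll + lp for ll, lp in zip(log_like, log_prior)]
shift = max(log_joint)
log_evidence = shift + math.log(sum(math.exp(v - shift) for v in log_joint) * dw)

# Posterior density on the grid.
post = [math.exp(v - log_evidence) for v in log_joint]

# First term of equation (4): posterior mean of the log likelihood (data fit).
data_fit = sum(p * ll for p, ll in zip(post, log_like)) * dw

# Second term: relative entropy of the posterior with respect to the prior.
info_gain = sum(p * (lj - log_evidence - lp)
                for p, lj, lp in zip(post, log_joint, log_prior)) * dw
```

Here `data_fit - info_gain` reproduces `log_evidence`, and `info_gain` is positive because the data concentrate the posterior relative to the prior.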

Once Bayesian model class assessment is implemented, the most plausible model class based on the available data $D$ is accessible. In cases where there are multiple plausible model classes, Bayesian model averaging may be used where the expected value of a quantity of interest h conditioned on the data $D$ and all the chosen model classes $M$ can be estimated by

E (h | D, M) = \sum_{m = 1}^{M} E (h | D, M_{m}) P (M_{m} | D)

(5)

Hierarchical Bayesian model and empirical Bayes method

The Empirical Bayes Method (Bishop, 2005) is an inference procedure in which the prior distribution of the model parameters $w$ is selected using the data $D$ . This method can be viewed as an approximation of a full Bayesian framework involving a hierarchical Bayesian model, which is a Bayesian model expressed in multiple levels (by placing a hyper-prior on the prior at each level). Figure 1 is a graphical representation of a hierarchical model for one level, where $γ$ is called the hyper-parameter vector because it is the parameter of prior distribution, $p (w | γ)$ , for $w$ . The posterior PDF of the model parameter vector, $w$ , is inferred using a full Bayesian approach

\begin{matrix} p (w | D) = \int p (w | γ, D) p (γ | D) d γ = \int \frac{p (D | w) p (w | γ)}{p (D | γ)} \\ p (γ | D) d γ \end{matrix}

(6)

Figure 1.

Graphical hierarchical model representation, where each arrow denotes the conditional dependencies used in the joint probability model, $p (D, w, γ) = p (D | w) p (w | γ) p (γ)$ .

However, the higher level involves the evaluation of a multi-dimensional integral over the space of hyper-parameters $γ$ that is analytically intractable. One approach is to use the Laplace asymptotic method to approximate the integral, assuming that the posterior $p (γ | D)$ has a pronounced peak at its MAP value $\tilde{γ}$ (Beck and Katafygiotis, 1998)

p (w | D) \approx p (w | \tilde{γ}, D) = \frac{p (D | w) p (w | \tilde{γ})}{p (D | \tilde{γ})}

(7)

where the MAP estimate $\tilde{γ}$ is learned from the data $D$

\tilde{γ} = \arg max_{γ} p (γ | D) = \arg max_{γ} p (D | γ) p (γ)

(8)

This procedure is sometimes called Empirical Bayes using Type-II Maximum Likelihood Approximation (Mackay, 1992; Tipping, 2004) when $p (γ)$ is chosen as constant. It is seen that the final prior PDF $p (w | \tilde{γ})$ is not directly specified but instead is learned from the data from a specified class of priors $p (w | γ)$ . Note that the MAP estimation of hyper-parameter vector $γ$ involves maximization of the evidence function $p (D | γ)$ when $p (γ)$ is constant so that all values of $γ$ are considered to be equally plausible a priori, where

p (D | γ) = \int p (D | w) p (w | γ) d w

(9)

For a Gaussian likelihood function that has a mean linear in $w$ and for a Gaussian prior, this integral can be evaluated analytically (Tipping, 2001). Because of the Bayesian Ockham Razor (Beck, 2010; Gull, 1988; Jefferys and Berger, 1992; Mackay, 1992), the learning of the hyper-parameters $γ$ automatically implements a penalty against data overfitting of the models in Bayesian inference.
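The Type-II maximum likelihood step in equations (8) and (9) can be made concrete for the simplest linear-Gaussian case: a scalar weight $w$ with prior $N (0, γ)$ and outputs $y_{i} = x_{i} w$ plus Gaussian noise. The evidence is then Gaussian in closed form. The sketch below is an illustration under these assumptions; the data, the noise variance `sigma2`, and the grid search over `gamma` are hypothetical choices.

```python
import math

# Hypothetical regression data y_i = x_i * w + noise, with a scalar weight w.
x = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
y = [0.41, 0.77, 1.28, 1.55, 2.07, 2.36]
sigma2 = 0.01     # assumed known noise variance

s = sum(xi * xi for xi in x)                # x^T x
q = sum(xi * yi for xi, yi in zip(x, y))    # x^T y
yy = sum(yi * yi for yi in y)               # y^T y
n = len(y)

def log_evidence(gamma):
    """log p(D | gamma) for prior w ~ N(0, gamma).

    The marginal of y is N(0, sigma2 * I + gamma * x x^T); its determinant and
    inverse follow from the matrix determinant lemma and the Sherman-Morrison
    formula for a rank-one update.
    """
    log_det = n * math.log(sigma2) + math.log(1.0 + gamma * s / sigma2)
    quad = yy / sigma2 - gamma * q * q / (sigma2 * (sigma2 + gamma * s))
    return -0.5 * (n * math.log(2 * math.pi) + log_det + quad)

# Maximize the evidence over gamma on a log-spaced grid (flat hyper-prior assumed).
gammas = [10 ** (-4 + 6 * i / 2000) for i in range(2001)]
gamma_hat = max(gammas, key=log_evidence)

# Closed-form stationary point of this scalar problem, for comparison.
gamma_closed = max(0.0, (q * q - sigma2 * s) / (s * s))

# Posterior of w under the learned prior variance, as in equation (7).
post_var = 1.0 / (s / sigma2 + 1.0 / gamma_hat)
post_mean = post_var * q / sigma2
```

The prior variance is not fixed in advance: it is selected by the data through the evidence, which is the sense in which the prior is "learned" from a specified class of priors.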

Useful Bayesian approximation tools

In Bayesian inference, the normalization of the posterior PDF is usually intractable because it involves a high-dimensional integral and so we must use approximations to proceed. In fact, the main computational issue in Bayesian analysis is the evaluation of multi-dimensional integrals. The normalizing constant $p (D | M) = \int p (D | w, M) p (w | M) d w$ in equation (1) is analytically intractable except in special cases, where conjugate priors are used (Bishop, 2006).

Laplace’s method of asymptotic approximation

Based on the topology of the likelihood function, $p (D | w, M)$ , in the parameter space, three categories for a model class have been defined (Katafygiotis and Beck, 1998): globally identifiable, locally identifiable, and unidentifiable based on the sensor data $D$ , corresponding respectively to unique, multiple but isolated, and a continuum of maximum likelihood estimates (MLEs) ( $\hat{w} = \arg\max_{w} p (D | w, M)$ ). Full Bayesian updating can treat all these cases (Yuen et al., 2004) but sometimes approximate inference methods are applied for special cases because they require less computational effort.

If the model class is globally identifiable based on $D$ , Laplace’s method can be used to approximate the integral in the evidence function $p (D | M)$ as (e.g. Beck and Katafygiotis, 1991, 1998; Yuen, 2010)

p (D | M) \approx p (D | \tilde{w}, M) p (\tilde{w} | M) {(2 π)}^{\frac{N}{2}} {| H (\tilde{w}) |}^{- \frac{1}{2}}

(10)

where $H (\tilde{w})$ is the Hessian matrix of the function $- \ln [p (D | w, M) p (w | M)]$ calculated at the MAP estimated values $\tilde{w}$ . Correspondingly, the posterior PDF $p (w | D, M)$ in equation (1) can be approximated as Gaussian, where the mean is the MAP estimated value $\tilde{w}$ and the covariance matrix is equal to the inverse of the Hessian matrix $H (\tilde{w})$ calculated at $\tilde{w}$ . Laplace asymptotic approximations for the posterior PDF are also available for the locally identifiable case (Beck and Katafygiotis, 1998; Yang and Beck, 1998) and the unidentifiable case (Katafygiotis and Lam, 2002). Because these approximations need all MLEs to be found, they are only feasible for low-dimensional parameter spaces.
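A scalar example shows how the Laplace approximation in equation (10) compares with direct quadrature. The sketch takes $H$ as the second derivative of the negative log of the product of likelihood and prior, so that it is positive at the MAP value. The mildly nonlinear model (a measured quantity equal to $\sqrt{w}$ plus Gaussian error), the data, and the prior are all hypothetical.

```python
import math

# Hypothetical data: measured frequency ratios, modeled as sqrt(w) plus Gaussian
# error, where w is a scalar stiffness scaling parameter.
data = [0.985, 1.002, 0.978, 0.996, 0.989]
sigma = 0.02                      # assumed prediction-error standard deviation
prior_mean, prior_sd = 1.0, 0.3   # assumed Gaussian prior on w

def log_gauss(x, mean, sd):
    return -0.5 * math.log(2 * math.pi * sd**2) - 0.5 * ((x - mean) / sd) ** 2

def neg_log_joint(w):
    """-ln[p(D | w, M) p(w | M)]."""
    log_like = sum(log_gauss(d, math.sqrt(w), sigma) for d in data)
    return -(log_like + log_gauss(w, prior_mean, prior_sd))

# Locate the MAP value on a fine grid (adequate for a scalar parameter).
lo, hi, n = 0.5, 1.5, 10001
dw = (hi - lo) / (n - 1)
grid = [lo + i * dw for i in range(n)]
values = [neg_log_joint(w) for w in grid]
w_map = grid[values.index(min(values))]

# Hessian (a second derivative here, since N = 1) by central finite differences.
h = 1e-4
hessian = (neg_log_joint(w_map + h) - 2 * neg_log_joint(w_map)
           + neg_log_joint(w_map - h)) / h**2

# Equation (10) with N = 1.
evidence_laplace = math.exp(-neg_log_joint(w_map)) * math.sqrt(2 * math.pi / hessian)

# Reference value by direct quadrature over the same grid.
evidence_quad = sum(math.exp(-v) for v in values) * dw

# Standard deviation of the Gaussian approximation to p(w | D, M).
posterior_sd = 1.0 / math.sqrt(hessian)
```

Because the posterior in this example is sharply peaked and nearly Gaussian, the two evidence values agree closely; the agreement degrades when the posterior is skewed or multi-modal.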

Markov chain Monte Carlo method

For the general case, Markov chain Monte Carlo (MCMC) samplers can be used. These algorithms generate samples that are consistent with any probability distribution (e.g. the posterior distribution $p (w | D, M)$ ) by constructing a Markov chain that has the desired distribution as its equilibrium distribution. In recent years, this class of algorithms has received considerable attention for Bayesian model updating because it can provide a full characterization of the posterior uncertainty, no matter whether or not the data available is sufficient to constrain the updated parameters to give a globally identifiable model class (Robert and Casella, 2004).
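As a concrete instance of such a sampler, the sketch below implements a basic random-walk Metropolis algorithm with a symmetric Gaussian proposal. It needs only an unnormalized target density. The target `log_f`, the step size, the chain length, and the burn-in length are hypothetical choices for illustration; the algorithms used in the studies cited in this review are more elaborate.

```python
import math
import random

def log_f(w):
    """Log of a hypothetical unnormalized posterior p(D | w, M) p(w | M).

    One datum d = 2.0 with Gaussian error (std 0.5) and a Gaussian prior N(0, 3^2).
    """
    return -0.5 * ((2.0 - w) / 0.5) ** 2 - 0.5 * (w / 3.0) ** 2

def metropolis(log_target, w0, n_samples, step, rng):
    """Random-walk Metropolis sampler with a symmetric Gaussian proposal."""
    samples = []
    w, lw = w0, log_target(w0)
    for _ in range(n_samples):
        cand = w + rng.gauss(0.0, step)
        lc = log_target(cand)
        # Accept with probability min(1, f(cand) / f(w)); the normalizing
        # constant cancels in this ratio, so it never has to be computed.
        if rng.random() < math.exp(min(0.0, lc - lw)):
            w, lw = cand, lc
        samples.append(w)
    return samples

rng = random.Random(0)
chain = metropolis(log_f, w0=0.0, n_samples=20000, step=1.0, rng=rng)
burned = chain[2000:]                         # discard an assumed burn-in period
post_mean = sum(burned) / len(burned)
post_var = sum((w - post_mean) ** 2 for w in burned) / len(burned)
```

For this Gaussian target the exact posterior mean and variance are available in closed form (about 1.946 and 0.243), which gives a simple check on the sample statistics.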

Several MCMC methods have been proposed with the goal of improving the computational efficiency of posterior sampling in Bayesian model updating of systems (e.g. Beck and Au, 2002; Beck and Zuev, 2013; Catanach and Beck, 2017; Cheung and Beck, 2009; Ching and Chen, 2007; Straub and Papaioannou, 2015). Most of these are based on the Metropolis–Hastings MCMC algorithm (Hastings, 1970). The advantage of the Metropolis–Hastings algorithm is that it can draw samples from any PDF $p (w) = f (w) / K$ , where K is the normalization constant that cannot be readily evaluated, provided that the value of the function $f (w)$ (e.g. $p (D | w, M) p (w | M)$ in equation (1)) can be computed. Transitional MCMC (Ching and Chen, 2007) is one of the most widely used methods in Bayesian system identification. This method was inspired by an adaptive Metropolis-Hastings method (Beck and Au, 2002) and is applicable for Bayesian inversion problems with higher dimensions. It also enables model class assessment by providing an estimate of the multi-dimensional integral in equation (3) for the evidence as a by-product. Transitional MCMC is fundamentally a sequential Monte Carlo method where samples are taken from a series of intermediate PDFs $p_{j}$ in an adaptive manner, where

\begin{matrix} p_{j} (w | D, M) \propto p (w | M) {p (D | w, M)}^{s_{j}}, \\ j = 0, \dots, J; 0 = s_{0} < s_{1} < \dots < s_{J} = 1 \end{matrix}

(11)

where $j$ is the stage number and $s_{j}$ is the corresponding tempering or annealing parameter for the $j$th stage. This parameter controls the speed of the gradual transition from the prior $p (w | M)$ (when $j = 0$ and $s_{0} = 0$ ), to the posterior $p (w | D, M)$ (when $j = J$ and $s_{J} = 1$ ) and it is automatically computed in the process to form the intermediate PDFs so they are not too different.
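One way to choose the tempering parameter automatically is to require that the importance weights ${p (D | w, M)}^{s_{j + 1} - s_{j}}$ of the current samples have a prescribed coefficient of variation (COV). The sketch below shows only this step selection for the first stage, using bisection; the target COV of 1, the likelihood, and the prior are assumptions made for illustration, and the resampling and MCMC move steps of a full sequential sampler are omitted.

```python
import math
import random

def log_likelihood(w):
    """Hypothetical log p(D | w, M): one datum d = 2.0 with Gaussian error std 0.1."""
    return -0.5 * ((2.0 - w) / 0.1) ** 2

def weight_cov(log_likes, ds):
    """Coefficient of variation of the importance weights p(D | w, M)^ds."""
    shift = max(log_likes)                  # rescaling does not change the COV
    weights = [math.exp(ds * (ll - shift)) for ll in log_likes]
    mean = sum(weights) / len(weights)
    var = sum((wt - mean) ** 2 for wt in weights) / len(weights)
    return math.sqrt(var) / mean

def next_exponent(log_likes, s_current, target_cov=1.0, tol=1e-6):
    """Bisection for s_next in (s_current, 1] so that the weight COV hits target_cov."""
    if weight_cov(log_likes, 1.0 - s_current) <= target_cov:
        return 1.0                          # can move straight to the posterior
    lo, hi = 0.0, 1.0 - s_current
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if weight_cov(log_likes, mid) > target_cov:
            hi = mid
        else:
            lo = mid
    return s_current + 0.5 * (lo + hi)

rng = random.Random(1)
prior_samples = [rng.gauss(0.0, 3.0) for _ in range(5000)]   # stage j = 0: p(w | M)
log_likes = [log_likelihood(w) for w in prior_samples]
s1 = next_exponent(log_likes, s_current=0.0)
```

Because the likelihood here is much narrower than the prior, the first exponent `s1` comes out far below 1, so several intermediate stages would be needed before reaching the posterior.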

Other recent popular techniques for Bayesian approximations are Approximate Bayesian Computation (ABC) methods (Chiachio et al., 2014; Marin et al., 2012; Vakilzadeh et al., 2017) and Variational Bayesian methods (Bishop, 2006; Fujimoto et al., 2011; Li and Der Kiureghian, 2017). ABC methods are applicable even when an analytical formula for the likelihood function ( $p (D | w, M)$ in equation (1)) is elusive or it is computationally costly to evaluate it in Bayesian inference. The key idea of this class of methods is to sample from a posterior distribution conditional on model predicted outputs (rather than on observed data vector $D$ ) that are acceptably close to the observed data $D$ in the output space under some metric. The key idea for variational Bayesian methods is to find a surrogate distribution close to the true posterior PDF by approximately minimizing the Kullback–Leibler divergence between these two distributions over a specified class of distributions, such as a Gaussian family.
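The key idea of ABC can be sketched with the simplest variant, rejection sampling: draw parameters from the prior, simulate outputs, and keep the draws whose simulated outputs fall within a tolerance of the observation. The simulator, the uniform prior, the summary statistic, and the tolerance `eps` below are hypothetical; the ABC-SubSim algorithm cited above is considerably more efficient than this baseline.

```python
import random

def simulate_summary(w, rng, n_obs=20):
    """Hypothetical simulator: the summary is the mean of n_obs noisy model outputs."""
    return sum(rng.gauss(w, 1.0) for _ in range(n_obs)) / n_obs

def abc_rejection(observed, n_draws, eps, rng):
    """Keep prior draws whose simulated summary lies within eps of the observed one."""
    accepted = []
    for _ in range(n_draws):
        w = rng.uniform(0.0, 5.0)           # draw from an assumed uniform prior
        if abs(simulate_summary(w, rng) - observed) <= eps:
            accepted.append(w)
    return accepted

rng = random.Random(2)
accepted = abc_rejection(observed=2.0, n_draws=20000, eps=0.1, rng=rng)
abc_mean = sum(accepted) / len(accepted)
```

No likelihood value is ever evaluated: only forward simulations and a distance in the output space are needed, at the cost of discarding most of the draws.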

Applications of Bayesian inference in system identification and damage assessment

Bayesian inference for accurately detecting, locating, and assessing damage from severe loading events or progressive structural deterioration has been studied for nearly two decades. Numerous techniques have been developed using the probability logic-based unifying Bayesian system identification framework presented in Beck (1989), Beck and Katafygiotis (1991, 1998), and Beck (2010). Typically, there are two categories of methods for structural damage detection: one is vibration based and the other is wave propagation based.

Vibration-based damage assessment using Bayesian inference

For vibration-based damage assessment, many methods utilize the dependence of the identified structural modal parameters, such as natural frequencies and mode shapes, on physical properties of structures, that is, stiffness, mass, and damping. On the other hand, other methods directly utilize the measured time-domain vibration response to infer the physical properties of structures. For small-amplitude vibrations, such as ambient vibrations of structures, it is reasonable to choose linear structural models with classical damping and to parameterize the uncertain stiffness matrix, $K \in R^{N_{d} \times N_{d}}$ using a sub-structuring approach

K (θ) = K_{0} + \sum_{j = 1}^{N_{K}} η_{j} K_{j}

(12)

where $K_{j} \in R^{N_{d} \times N_{d}}, j = 1, \dots, N_{K}$ , is the nominal contribution of the $j$th substructure to the overall stiffness matrix, $K$ . The corresponding stiffness scaling parameter $η_{j}, j = 1, \dots, N_{K},$ is a factor that allows modification of the $j$th substructure stiffness to make it more consistent with the real structure behavior. Note that the mass matrix $M$ can also be parameterized in the same affine manner with mass scaling parameters $ρ_{j}, j = 1, \dots, N_{M}$ . In this case, the structural model parameter vector $θ$ to be updated should include not only the stiffness scaling parameters $η_{j}, j = 1, \dots, N_{K}$ , but also the mass scaling parameters $ρ_{j}, j = 1, \dots, N_{M}$ .
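The affine parameterization in equation (12) is straightforward to assemble. The following sketch uses a hypothetical two-story shear building with one substructure per story; the stiffness values and the 20% reduction are illustrative assumptions.

```python
def mat_add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def mat_scale(c, a):
    return [[c * x for x in row] for row in a]

def assemble_stiffness(eta, k_sub, k0):
    """K = K_0 + sum_j eta_j K_j, as in equation (12)."""
    k = [row[:] for row in k0]
    for eta_j, k_j in zip(eta, k_sub):
        k = mat_add(k, mat_scale(eta_j, k_j))
    return k

# Hypothetical 2-DOF shear building with nominal story stiffnesses k1 and k2.
k1, k2 = 2000.0, 1500.0
K0 = [[0.0, 0.0], [0.0, 0.0]]
K1 = [[k1, 0.0], [0.0, 0.0]]          # first-story contribution
K2 = [[k2, -k2], [-k2, k2]]           # second-story contribution

K_nominal = assemble_stiffness([1.0, 1.0], [K1, K2], K0)
K_damaged = assemble_stiffness([1.0, 0.8], [K1, K2], K0)   # 20% loss in story 2
```

A stiffness scaling parameter below 1 for a given substructure, as in `K_damaged`, is how a local stiffness loss is represented in this parameterization.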

Modal-based Bayesian approaches

Bayesian inference for structural damage assessment first appeared in Vanik (1997), Vanik et al. (2000), and Sohn and Law (1997) using modal data information. In the framework presented in Vanik (1997) and Vanik et al. (2000), the likelihood function for the structural model parameters $θ$ is written as the product of the PDFs for the modal frequencies $\{{\hat{ω}}_{r}^{2}\}_{r = 1}^{N_{m}}$ and mode shape components $\{{\hat{ψ}}_{r}\}_{r = 1}^{N_{m}}$

p (D | θ) = \prod_{r = 1}^{N_{m}} p ({\hat{ω}}_{r}^{2} | θ) p ({\hat{ψ}}_{r} | θ)

(13)

where $N_{m}$ is the number of modes observed and the modal parameters are modeled as independently distributed from mode to mode and from modal frequency to mode shape. The PDFs for the $r$th modal frequency ${\hat{ω}}_{r}^{2}$ and mode shape components ${\hat{ψ}}_{r}$ are obtained from the following two model equations, respectively

{\hat{ω}}_{r}^{2} = ω_{r}^{2} (θ) + e_{{\hat{ω}}_{r}^{2}}

(14a)

{\hat{ψ}}_{r} = a_{r} Γ ψ_{r} (θ) + e_{{\hat{ψ}}_{r}}

(14b)

where $a_{r}$ is a scaling factor, and $Γ \in R^{N_{o} \times N_{d}}$ , a matrix of ones and zeros, picks the observed degrees of freedom (DOFs) in the “measured” mode shape ${\hat{ψ}}_{r} \in R^{N_{o}}$ from the full model mode shapes, $ψ_{r} (θ) \in R^{N_{d}}$ . Using the Principle of Maximum Information Entropy (Jaynes, 1983, 2003), the combined prediction errors and measurement errors for the model modal parameters $ψ_{r} (θ)$ and $ω_{r}^{2} (θ)$ are modeled as independent zero-mean Gaussian variables with unknown variances, which gives the largest uncertainty for the sets $\{e_{{\hat{ω}}_{r}^{2}}\}_{r = 1}^{N_{m}}$ and $\{e_{{\hat{ψ}}_{r}}\}_{r = 1}^{N_{m}}$ subject to the first two moment constraints. This produces the Gaussian likelihood functions $p ({\hat{ω}}_{r}^{2} | θ)$ and $p ({\hat{ψ}}_{r} | θ)$ . After defining the prior PDF of $θ$ , Bayes’ theorem can be applied to infer the posterior PDF of $θ$ .
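The modal likelihood of equations (13) and (14) can be sketched for a two-DOF model whose eigenpairs are available in closed form. Several simplifying assumptions are made here and are not taken from the cited works: both DOFs are observed (so that $Γ$ is the identity), unit masses are used, the prediction-error standard deviations `sd_freq` and `sd_shape` are fixed rather than unknown, the scaling factor $a_{r}$ is set by least squares, and modes are matched simply by their order.

```python
import math

def eig_sym_2x2(k, m):
    """Eigenpairs of K psi = omega^2 M psi for a 2x2 symmetric K and M = m * I."""
    a, b, d = k[0][0] / m, k[0][1] / m, k[1][1] / m
    mean, diff = 0.5 * (a + d), 0.5 * (a - d)
    root = math.hypot(diff, b)
    pairs = []
    for lam in (mean - root, mean + root):
        vec = [b, lam - a]                       # valid when b != 0
        norm = math.hypot(vec[0], vec[1])
        pairs.append((lam, [v / norm for v in vec]))
    return pairs

def stiffness(theta, k1=2000.0, k2=1500.0):
    """Hypothetical 2-DOF shear building scaled by theta = (eta_1, eta_2)."""
    e1, e2 = theta
    return [[e1 * k1 + e2 * k2, -e2 * k2], [-e2 * k2, e2 * k2]]

def log_gauss(x, mean, sd):
    return -0.5 * math.log(2 * math.pi * sd**2) - 0.5 * ((x - mean) / sd) ** 2

def log_likelihood(theta, freqs_meas, shapes_meas, sd_freq, sd_shape, mass=1.0):
    """log p(D | theta) following equation (13), with Gamma = I."""
    total = 0.0
    model_modes = eig_sym_2x2(stiffness(theta), mass)
    for (lam, psi), lam_hat, psi_hat in zip(model_modes, freqs_meas, shapes_meas):
        total += log_gauss(lam_hat, lam, sd_freq)                  # equation (14a)
        # Least-squares scaling factor a_r for the model mode shape (an assumption).
        a_r = sum(p * q for p, q in zip(psi_hat, psi)) / sum(q * q for q in psi)
        total += sum(log_gauss(ph, a_r * p, sd_shape)              # equation (14b)
                     for ph, p in zip(psi_hat, psi))
    return total

# Noise-free "measured" modal data generated from a damaged state (eta_2 = 0.8).
theta_true = (1.0, 0.8)
modes = eig_sym_2x2(stiffness(theta_true), 1.0)
freqs_meas = [lam for lam, _ in modes]
shapes_meas = [psi for _, psi in modes]

ll_true = log_likelihood(theta_true, freqs_meas, shapes_meas, 20.0, 0.05)
ll_nominal = log_likelihood((1.0, 1.0), freqs_meas, shapes_meas, 20.0, 0.05)
```

The likelihood is higher at the damaged parameter values than at the nominal ones, which is the information that Bayes' theorem then combines with the prior on $θ$ .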

Bao et al. (2013) employed this formulation for data fusion-based structural damage detection under varying temperature conditions. Here, the temperature change effects on the modal parameters were considered in the construction of the likelihood function for $θ$ in equation (13). Behmanesh and Moaveni (2015) applied this method for identification of simulated damage on a footbridge where an adaptive Metropolis–Hastings algorithm (Andrieu and Thoms, 2008; Haario et al., 2001) was used to sample the posterior PDF of the structural model parameters. Lam et al. (2014) extended this method to detect railway ballast damage under a concrete sleeper. They employed a model class assessment technique to select the most plausible number of ballast regions given multiple sets of modal data. Behmanesh et al. (2017) also incorporated a model class assessment method, where multiple model classes were defined as different subsets of the contributing modes.

One major difficulty for the approaches above is that mode matching is required, that is, it is necessary to match model modes (e.g. $ψ_{r} (θ)$ ) and experimental modes (e.g. ${\hat{ψ}}_{r}$ ), one by one. This is a nontrivial task because usually only partial mode shapes are measured. Moreover, the order of modes may switch due to the fact that the damage-induced local stiffness loss may affect some modes more than others; this makes mode matching even more challenging. To deal with this difficulty, the concept of system mode shapes was introduced in Beck et al. (2001) as additional variables for Bayesian model updating. The system mode shape parameters $ϕ$ represent the actual underlying mode shapes of the linear dynamic structural system at all DOFs corresponding to those of the structural model, but they are distinct from the model mode shapes, $\{ψ_{r} (θ)\}_{r = 1}^{N_{m}}$ , in equation (14). The other benefit of introducing system mode shapes is that they do not require solution of the nonlinear eigenvalue problem of a structural model. Instead of the model modal frequencies, $ω_{r}^{2} (θ)$ , the Rayleigh quotient frequencies, $ω_{r}^{2} (θ, ϕ)$ , can be computed from the structural model parameters and the system mode shape parameters

ω_{r}^{2} (θ, ϕ) = \frac{ϕ_{r}^{T} K (θ) ϕ_{r}}{ϕ_{r}^{T} M ϕ_{r}}

(15)
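The Rayleigh quotient in equation (15) requires only matrix-vector products. The sketch below evaluates it for a hypothetical two-DOF model with unit masses, once at the exact lowest mode shape (obtained in closed form) and once at a slightly perturbed trial shape.

```python
import math

def mat_vec(a, v):
    return [sum(x * y for x, y in zip(row, v)) for row in a]

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def rayleigh_quotient(k, m, phi):
    """omega_r^2(theta, phi) = phi^T K phi / (phi^T M phi), as in equation (15)."""
    return dot(phi, mat_vec(k, phi)) / dot(phi, mat_vec(m, phi))

# Hypothetical 2-DOF example with unit masses.
K = [[3500.0, -1500.0], [-1500.0, 1500.0]]
M = [[1.0, 0.0], [0.0, 1.0]]

# Exact lowest eigenpair of this 2x2 problem in closed form.
lam1 = 2500.0 - math.hypot(1000.0, 1500.0)
phi_exact = [K[0][1], lam1 - K[0][0]]

rq_exact = rayleigh_quotient(K, M, phi_exact)

# Perturb the first component of the shape by 5% and re-evaluate.
phi_trial = [phi_exact[0] * 1.05, phi_exact[1]]
rq_trial = rayleigh_quotient(K, M, phi_trial)
```

At the exact mode shape the quotient equals the eigenvalue, and a first-order error in the shape produces only a second-order error in the frequency, which is why the quotient is a convenient substitute for an eigen-solution.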

Then Bayes’ theorem can be used to express the posterior probabilities of the uncertain parameters $θ$ and $ϕ$ . Ching and Beck (2004a, 2004b) proposed an Expectation–Maximization algorithm to find the MAP values of these parameters, together with the prediction error variance parameters. They analyzed the Phase II simulated benchmarks (Bernal et al., 2002) and experimental benchmark (Ching and Beck, 2003; Dyke et al., 2003) that were sponsored by the IASC-ASCE Task Group on Structural Health Monitoring. Most of the damage was detected and assessed successfully. Goller et al. (2012) found that it is vital to weigh differently the relative contributions of the likelihoods relating to modal frequencies and mode shape data in equation (13) to provide balanced model updating results. Bayesian model class assessment was employed for selecting the most plausible weight parameter (defined as the ratio between the prediction error variances of mode shape vectors and modal frequencies) based on the modal data.

For damage assessment of complex civil infrastructures, we would like to treat each structural member as a substructure so that we can infer which, if any, members have been damaged. Therefore, high-dimensional model parameter vectors $θ$ often arise. Ching et al. (2006b) proposed a Gibbs sampler method to efficiently sample the posterior PDF of the high-dimensional vector $θ$ . The effective dimension is kept low by decomposing the uncertain parameters into three groups and iteratively sampling the posterior distribution of one parameter group conditional on the other two groups and the measured data. The three parameter groups are the model parameters $θ$ , system mode shapes $ϕ$ , and variances for the prediction errors $\{e_{r}\}_{r = 1}^{N_{m}}$ . For each mode, the prediction error $e_{r}$ is defined for the prediction of the measured frequency based on the system mode shape $ϕ_{r}$ and model parameters $θ$ as follows

e_{r} = (K (θ) - {\hat{ω}}_{r}^{2} M) ϕ_{r}, r = 1, \dots, N_{m}

(16)

where $e_{r}$ is modeled as Gaussian based on the maximum entropy distribution (Jaynes, 1983, 2003) subject to the first two moments as constraints. By employing this prediction error equation, Yan and Katafygiotis (2015) developed a Bayesian damage assessment method for using modal information from multiple sensor setups. The problem is formulated as minimizing an objective function with respect to the three parameter groups above, which incorporates the information of local mode shape components corresponding to different sensor setups automatically.

Yuen et al. (2006) introduced system frequencies $\{ω_{r}^{2}\}_{r = 1}^{N_{m}}$ as uncertain parameters but they used the eigen-equations of the structural dynamics model only in the prior to provide a soft constraint

p (ω^{2}, ϕ | θ, β) = c_{1} \exp \{- \frac{β}{2} \sum_{r = 1}^{N_{m}} ‖ (K (θ) - ω_{r}^{2} M) ϕ_{r} ‖^{2}\}

(17)

where $c_{1}$ is a constant that is independent of the system modal parameters, $ω^{2}$ and $ϕ$. An iterative scheme involving a series of coupled linear optimization problems was employed to find the MAP values of the structural model parameters $θ$ and system modal parameters, $ω^{2}$ and $ϕ$. By incorporating a finite element (FE) model reduction technique in this formulation, Yin et al. (2017) developed a methodology for detection of bolted connection damage in steel frame structures. The novel feature is that only partial components of system mode shapes $ϕ$ are inferred. This is practical in cases where the dimension of full-system mode shapes $ϕ$ is extremely large but considerably fewer DOFs are measured, resulting in unreliable full-system mode shape inference.

Time-domain Bayesian approaches

The other category of Bayesian damage assessment is the time-domain approach, which is particularly appropriate when the structural properties vary with time and the data are observed sequentially. Typically, a stochastic state-space model is defined for the state time history $\{x_{n}\}_{n = 1}^{N_{t}}$ by specifying a state transition PDF

\begin{matrix} \forall n \in ℤ^{+}, p (x_{n} | x_{n - 1}, u_{n - 1}, θ) \\ = N (x_{n} | f_{n} (x_{n - 1}, u_{n - 1}, θ), Q_{n}) \end{matrix}

(18)

along with a state-to-output PDF

\forall n \in ℤ^{+}, p (y_{n} | x_{n}, u_{n}, θ) = N (y_{n} | g_{n} (x_{n}, u_{n}, θ), R_{n})

(19)

where $u_{n}$ and $y_{n}$ denote the (external) input and output vectors, respectively, at time instant $t_{n}$, and $Q_{n}$ and $R_{n}$ are the prior covariance matrices of the uncertain state and output prediction errors, respectively. Ching et al. (2006a) presented a comparison study of two Bayesian filtering algorithms, the extended Kalman filter (EKF) and a particle filter, for the estimation of the augmented state vector containing both the state (displacement and velocity) vector and structural model parameters $θ$. Yuen and Kuok (2016) proposed a Bayesian probabilistic algorithm for online estimation of the noise parameters of the EKF, motivated by the fact that improper assignment of the noise covariance matrices $Q_{n}$ and $R_{n}$ leads to divergence in the estimates and misleading uncertainty quantification for the system state and model parameters. Using the general hierarchical state-space model in equations (18) and (19), Vakilzadeh et al. (2017) examined the performance of the ABC-SubSim algorithm (Chiachio et al., 2014) for Bayesian updating of the model parameters of dynamical systems, together with the noise parameters. For the case of unknown, or partially unknown, input excitations, Astroza et al. (2017) presented a Bayesian method for nonlinear FE model updating and seismic input identification.
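To make the role of equations (18) and (19) in filtering more concrete, the following is a minimal sketch of one predict/update step of a generic extended Kalman filter. It assumes that the model parameters $θ$ have already been appended to the state vector (the augmented state mentioned above), so they do not appear as a separate argument. The function name `ekf_step`, its argument names, and the user-supplied Jacobian callbacks are hypothetical; this is not the implementation used in the cited studies.

```python
import numpy as np

def ekf_step(x, P, u_prev, u, y, f, g, F_jac, G_jac, Q, R):
    """One EKF step for x_n ~ N(f(x_{n-1}, u_{n-1}), Q), y_n ~ N(g(x_n, u_n), R).

    f, g         : state transition and state-to-output functions
    F_jac, G_jac : their Jacobians with respect to the state
    """
    # Prediction through the (linearized) state transition PDF.
    x_pred = f(x, u_prev)
    F = F_jac(x, u_prev)
    P_pred = F @ P @ F.T + Q

    # Update with the (linearized) state-to-output PDF.
    G = G_jac(x_pred, u)
    S = G @ P_pred @ G.T + R             # innovation covariance
    K = P_pred @ G.T @ np.linalg.inv(S)  # Kalman gain
    x_new = x_pred + K @ (y - g(x_pred, u))
    P_new = (np.eye(len(x)) - K @ G) @ P_pred
    return x_new, P_new

# Scalar check: identity dynamics and observation, unit prior and noise variance.
x_new, P_new = ekf_step(
    x=np.array([0.0]), P=np.array([[1.0]]), u_prev=None, u=None,
    y=np.array([2.0]),
    f=lambda x, u: x, g=lambda x, u: x,
    F_jac=lambda x, u: np.eye(1), G_jac=lambda x, u: np.eye(1),
    Q=np.zeros((1, 1)), R=np.eye(1),
)
print(x_new, P_new)
```

In the scalar check the prior and the measurement are equally informative, so the updated mean lies midway between them and the variance is halved. The sensitivity to $Q_{n}$ and $R_{n}$ noted above enters through the innovation covariance and the gain.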

Bayesian pattern recognition methods

Closely related to artificial intelligence and machine learning, pattern recognition is another popular approach in structural damage detection and assessment. Lam et al. (2006) introduced a Bayesian artificial neural network (ANN) design method for pattern recognition-based damage detection, where an “optimal” ANN model class is automatically selected based on the set of ANN training data. In the ANN training process, the calculated features of damage-induced changes in Ritz vectors and corresponding damage scenarios in the structural model are treated as inputs and targets, respectively. The calculated features are defined by

Δ R (k) = {[r_{1}^{T} (k), \dots, r_{N_{R}}^{T} (k)]}^{T} - {[r_{1}^{T} (0), \dots, r_{N_{R}}^{T} (0)]}^{T}, \quad k = 1, \dots, N_{dp}

(20)

where $N_{dp}$ and $N_{R}$ are the total number of damage patterns considered and the number of extracted Ritz vectors, respectively; the Ritz vectors $r_{i}$ are computed as the mass-normalized product of the flexibility matrix and spatial load vector. In the real identification process, the damage scenario is identified by inputting the measured Ritz vector changes into the trained ANN model. Lam and Ng (2008) extended the Bayesian ANN design method to include the selection of activation (transfer) functions for neurons in the hidden layer. A comparison study showed that an ANN trained on modal parameters performs better than one trained on Ritz vectors. Bayesian neural network models were also examined in Arangio and Beck (2012), and the automatic relevance determination method (MacKay, 1994; Neal, 1996) was applied to evaluate the relative importance of every input in the neural networks and separate relevant variables from those that are redundant. The applicability of these Bayesian neural networks was investigated in Arangio and Bontempi (2015) for the identification of damage of a cable-stayed bridge in China. This study demonstrated that the method is able to detect anomalies in the structural behavior produced by damage. Figueiredo et al. (2014) proposed an MCMC-based Bayesian pattern recognition method for damage detection, where the Bayesian approach is used to cluster structural responses of the bridges into a reduced number of state conditions by inferring the parameters of a finite mixture of Gaussian distributions. The method can be viewed as an improvement over the classical MLE-based expectation–maximization algorithm, and it has the potential to overcome some difficulties in dealing with structural responses that contain the effects of environmental temperature variability.

Wave propagation-based damage detection using Bayesian inference

Wave propagation-based damage detection techniques, such as the Lamb wave method and ultrasonic NDT (non-destructive testing), are widely regarded as promising tools for quantitative identification of damage in civil engineering structures, and they have been studied intensively over the last several decades. The use of Bayesian inference in these approaches is also increasing because uncertainties in the measurement and modeling processes are unavoidable.

For Lamb wave methods, Ng et al. (2009) introduced a Bayesian framework to detect and characterize laminar damage in beam structures using guided wave signals measured at a single point. The uncertain model parameters $θ$ to be inferred include damage location, length, depth, and Young’s modulus of the material. Given the measured guided wave data $q_{m}$, the likelihood function is defined as

p (q_{m} | θ, β) = \frac{1}{{(2 π β^{- 1})}^{\frac{N_{t} N_{0}}{2}}} \exp [- \frac{β}{2} \sum_{t = 1}^{N_{t}} ∥ q_{m} (t) - q (t, θ) ∥^{2}]

(21)

where $q (t, θ)$ is the guided wave response calculated using the spectral FE method. A two-stage optimization process consisting of simulated annealing followed by a standard simplex search method was employed for determining the MAP values of the posterior PDF of the model parameters $θ$ to characterize the multivariate damage and quantify the associated uncertainties. The method is applicable only to the single-crack case. To identify multiple cracks, He and Ng (2017) introduced a Bayesian model class assessment technique to determine the most plausible number of cracks based on the guided wave data before further crack parameter identification and associated uncertainty quantification. Another source of measurement information for damage detection is the time-of-flight of scattered Lamb waves. Yan (2013) utilized this information to produce a Bayesian inference method in which an MCMC algorithm developed by Nichols et al. (2010) was employed to characterize the posterior distributions of the unknown damage location and wave velocity parameters.

Regarding ultrasonic NDT, Wang et al. (2015) combined cluster analysis and Bayesian theory for assessing external corrosion location and depth in buried pipeline structures. Chiachio et al. (2017) presented a multilevel Bayesian approach for identifying Young’s moduli, number, and position of damaged layers in composite laminates using through-transmission ultrasonic measurements. Three steps were defined in their Bayesian inverse procedure: (1) inferring the posterior PDF $p (θ | M_{j}, q_{m})$ of model parameters (defined by Young’s moduli of damaged layers) for a specific damage hypothesis using equation (1); (2) obtaining the plausibility $P (M_{j} | P_{i}, q_{m})$ of a particular damage hypothesis (associated with damage positions) among the set of candidate hypotheses using equation (2); (3) assessing the degree of plausibility of the given damage pattern (defined by the number of damaged layers) within a predefined damage pattern set $P = \{P_{i}\}$ by

P (P_{i} | P, q_{m}) \propto p (q_{m} | P_{i}, P) P (P_{i} | P)

(22)

where

p (q_{m} | P_{i}, P) = \sum_{j} p (q_{m} | M_{j}, P_{i}) P (M_{j} | P_{i}, P)

(23)

The most plausible damage hypothesis and pattern can be selected through the Bayesian model class assessment at different levels.
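The pattern-level computation in equations (22) and (23) can be sketched numerically as follows. The example assumes that the log-evidence $\log p (q_{m} | M_{j}, P_{i})$ of every damage hypothesis has already been computed at the lower level; the function names and all numerical values are made up for illustration and do not come from Chiachio et al. (2017).

```python
import numpy as np

def log_sum_exp(a):
    """Numerically stable log(sum(exp(a)))."""
    a = np.asarray(a, dtype=float)
    m = a.max()
    return m + np.log(np.exp(a - m).sum())

def pattern_posterior(log_evidence, hypothesis_prior, pattern_prior):
    """Posterior plausibility of each damage pattern P_i.

    log_evidence[i][j]     : log p(q_m | M_j, P_i)
    hypothesis_prior[i][j] : P(M_j | P_i, P)
    pattern_prior[i]       : P(P_i | P)
    """
    # Equation (23): evidence of each pattern, summed over its hypotheses.
    log_pattern_evidence = np.array([
        log_sum_exp(np.asarray(le, dtype=float) + np.log(hp))
        for le, hp in zip(log_evidence, hypothesis_prior)
    ])
    # Equation (22): multiply by the pattern prior and normalize.
    log_post = log_pattern_evidence + np.log(pattern_prior)
    return np.exp(log_post - log_sum_exp(log_post))

# Two hypothetical patterns with two and three candidate hypotheses.
log_ev = [np.log([1e-3, 3e-3]), np.log([1e-3, 1e-3, 4e-3])]
hyp_prior = [[0.5, 0.5], [1 / 3, 1 / 3, 1 / 3]]
post = pattern_posterior(log_ev, hyp_prior, pattern_prior=[0.5, 0.5])
print(post)
```

Working with log-evidences avoids underflow, since evidences of realistic datasets are typically extremely small numbers.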

Sparse Bayesian learning and applications in structural damage assessment

In vibration-based damage assessment, there is a fundamental trade-off between the spatial resolution of the inferred damage locations and the reliability of the probabilistically inferred damaged state. In reality, the information available from the structure’s local network of sensors will generally be insufficient to support a member-level resolution of stiffness loss from damage. Accordingly, larger substructures consisting of assemblages of structural members may be necessary in order to reduce the number of model parameters. In this case, defining a proper threshold to determine whether the damage features have shifted from their healthy state is very important to alleviate false positive and false negative detections. However, it is very challenging to establish a reliable threshold value in a rigorous manner in order to issue a timely damage alarm (Sohn et al., 2005). A general strategy to alleviate this problem is to incorporate as much prior knowledge as possible in order to constrain the set of solutions; in particular, it is helpful to exploit the prior information that structural stiffness change from damage typically occurs at a limited number of locations in a structure in the absence of its collapse. Recently, by exploiting this prior knowledge about the spatial sparseness of damage, the effectiveness of sparse recovery techniques, for example, $ℓ_{1}$-norm regularized least squares (Candes et al., 2006; Chen et al., 1999; Tropp and Gilbert, 2007) and SBL (sparse Bayesian learning) techniques (Tipping, 2001), has been explored to produce more robust damage assessment even for high-dimensional model parameter spaces (higher-resolution damage localization) (e.g. Hou et al., 2018a, 2018b; Huang and Beck, 2015, 2018a; Huang et al., 2017a, 2017b; Zhou et al., 2015).

In the field of guided wave/ultrasonic NDT signal processing for damage or defect detection, sparse signal recovery algorithms have also attracted increasing attention over the past decade (Hong et al., 2006; Raghavan and Cesnik, 2007; Wu et al., 2017a, 2017b, 2018; Zhang et al., 2008). The prior information exploited in this application is that the object being inspected contains a limited amount of damage or defects, and so the measured signal should be a linear combination of the echoes reflected from them.

In this section, the recent progress of SBL-based structural damage detection and assessment is reviewed. The general theory of SBL is first introduced, and thereafter, the recently developed vibration-based SBL methods are discussed, followed by the wave propagation–based SBL methods.

Sparse Bayesian learning

In this section, we only briefly review the theory of SBL and refer to Tipping (2001) and Faul and Tipping (2002) for a more detailed description. It is supposed that the model prediction of the measured output is $\hat{y} = f + e + m \in ℝ^{N_{o}}$, which involves a deterministic function

f = \sum_{j = 1}^{N_{p}} w_{j} Θ_{j} = Θ w

(24)

along with uncertain model prediction error $e$ and measurement noise $m$, where $Θ = [Θ_{1}, \dots, Θ_{N_{p}}]$ is a general $N_{o} \times N_{p}$ design matrix with column vectors $\{Θ_{j}\}_{j = 1}^{N_{p}}$ that may depend on inputs $\hat{u}$ and $w = [w_{1}, \dots, w_{N_{p}}]^{T}$ is a corresponding coefficient vector. Based on the principle of maximum information (largest uncertainty) entropy subject to the first two moment constraints (Jaynes, 1983, 2003), the combination of the prediction error $e$ and measurement noise $m$ is modeled as a zero-mean Gaussian vector with covariance matrix $β^{- 1} I_{N_{o}}$. This yields a Gaussian likelihood function based on the data $\hat{y}$:

\begin{matrix} p (\hat{y} | w, β) = {(2 π β^{- 1})}^{- \frac{N_{o}}{2}} \exp (- \frac{β}{2} ‖ \hat{y} - Θ w ‖_{2}^{2}) \\ = N (\hat{y} | Θ w, β^{- 1} I_{N_{o}}) \end{matrix}

(25)

The prior distribution for the parameter vector, $w$ , is assigned as follows

\begin{matrix} p (w | α) = \prod_{j = 1}^{N_{p}} p (w_{j} | α_{j}) = \prod_{j = 1}^{N_{p}} N (w_{j} | 0, α_{j}) \\ = \prod_{j = 1}^{N_{p}} \left[ {(2 π α_{j})}^{- 1 / 2} \exp \left\{- \frac{1}{2} α_{j}^{- 1} w_{j}^{2} \right\} \right] \end{matrix}

(26)

The key to the model sparseness is the utilization of the $N_{p}$ independent variance hyper-parameters $\{α_{1}, \dots, α_{N_{p}}\}$ that moderate the strength of the Gaussian prior. Note that an extremely small value of $α_{j}$ implies that the corresponding term $w_{j} Θ_{j}$ in equation (24) has an insignificant contribution to the modeling of measurements $\hat{y}$ because it essentially produces a Dirac delta-function at zero for the prior of $w_{j}$, and so for its posterior.

The learning of the coefficient vector $w$ from measured output $\hat{y}$ is characterized by applying Bayes’ Theorem to infer the posterior PDF $p (w | \hat{y})$. Based on the Empirical Bayes method (Laplace asymptotic approximation), this posterior is approximated as

\begin{matrix} p (w | \hat{y}) = \int p (w | \hat{y}, α, β) p (α, β | \hat{y}) d α d β \\ \approx p (w | \hat{y}, \tilde{α}, \tilde{β}) = p (\hat{y} | w, \tilde{β}) p (w | \tilde{α}) / p (\hat{y} | \tilde{α}, \tilde{β}) \\ = N (w | μ, Σ) \end{matrix}

(27)

with

Σ = {(\tilde{β} Θ^{T} Θ + {\tilde{A}}^{- 1})}^{- 1}, μ = \tilde{β} Σ Θ^{T} \hat{y}

(28)

where $\tilde{A} = \operatorname{diag} ({\tilde{α}}_{1}, \dots, {\tilde{α}}_{N_{p}})$ is the prior covariance matrix for $w$. The approximation of equation (27) is based on the assumption that the posterior $p (α, β | \hat{y})$ is highly peaked at the MAP value

\begin{matrix} \{\tilde{α}, \tilde{β}\} = \arg\max_{[α, β]} p (α, β | \hat{y}) \\ = \arg\max_{[α, β]} \{p (\hat{y} | α, β) p (α) p (β)\} \end{matrix}

(29)

Two optimization algorithms have been proposed in the SBL literature to find the MAP values $\tilde{α}$ and $\tilde{β}$. One is Tipping’s original iterative algorithm (Tipping, 2001) and the other is Tipping and Faul’s “Fast Algorithm” (Tipping and Faul, 2003). It is found that the maximization in equation (29) drives many of the hyper-parameters $α_{j}$ toward zero during the learning process. Thereby, a sparse model vector $w$ is produced, that is, many of its components become zero. This is the Bayesian Ockham Razor at work (Beck, 2010; Gull, 1988; Jefferys and Berger, 1992; Mackay, 1992).
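The pruning behavior described above can be reproduced with a small numerical sketch. The code below uses EM-type hyper-parameter updates under uniform hyper-priors, which is a simpler and slower variant than either of the two algorithms cited above, together with the posterior formulas of equation (28), where $α_{j}$ is a prior variance. The function name `sbl_em`, the floor on $α_{j}$, the initial values, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def sbl_em(Theta, y, n_iter=300, alpha_floor=1e-10):
    """Sparse Bayesian learning with EM-type updates (illustrative variant).

    alpha[j] is the prior VARIANCE of w[j]; beta is the noise precision.
    """
    n_obs, n_par = Theta.shape
    alpha = np.ones(n_par)
    beta = 1.0 / (0.1 * np.var(y))

    def posterior(alpha, beta):
        # Equation (28): posterior covariance and mean of w.
        Sigma = np.linalg.inv(beta * Theta.T @ Theta + np.diag(1.0 / alpha))
        return beta * Sigma @ Theta.T @ y, Sigma

    for _ in range(n_iter):
        mu, Sigma = posterior(alpha, beta)
        gamma = 1.0 - np.diag(Sigma) / alpha  # how well each w[j] is determined
        resid = y - Theta @ mu
        # EM updates; the floor avoids division by zero as alpha[j] -> 0.
        alpha = np.maximum(mu**2 + np.diag(Sigma), alpha_floor)
        beta = n_obs / (resid @ resid + gamma.sum() / beta)

    mu, Sigma = posterior(alpha, beta)
    return mu, Sigma, alpha, beta

# Synthetic test: 3 non-zero coefficients out of 20, 50 noisy observations.
rng = np.random.default_rng(0)
Theta = rng.standard_normal((50, 20))
w_true = np.zeros(20)
w_true[[2, 7, 15]] = [2.0, -1.5, 1.0]
y = Theta @ w_true + 0.01 * rng.standard_normal(50)

mu, Sigma, alpha, beta = sbl_em(Theta, y)
print(np.round(mu, 3))
```

The variances $α_{j}$ of the irrelevant terms shrink during the iterations, so their posterior means are pulled toward zero, while the three active coefficients are left essentially unregularized.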

Huang et al. (2014) found that the SBL algorithm suffers from a robustness problem: there are local maxima for equation (29) that may trap the hyper-parameter optimization if the number of measurements $N_{o}$ is considerably smaller than the number of model parameters $N_{p}$ ; this leads to non-robust Bayesian updating results. Several robustness enhancement algorithms (Huang et al., 2011, 2014, 2016, 2018b, 2018c, 2018d) have been developed, with the goal of increasing the signal reconstruction accuracy in compressive sensing for highly and approximately sparse signals.

Sparse Bayesian learning methods for vibration-based damage detection and assessment

Hierarchical Bayesian model class

For damage detection purposes, the hierarchical Bayesian model in Figure 2 was presented in Huang and Beck (2015), where $ω^{2}$ and $ϕ$ denote the system natural frequencies and mode shapes, respectively, that correspond to the identified natural frequencies and mode shapes ${\hat{ω}}^{2}$ and $\hat{ψ}$ . A joint prior PDF is assigned to the system modal parameters $ω^{2}$ and $ϕ$ and structural stiffness parameters $θ$ of the structural model. This is accomplished by introducing an equation error precision parameter $β$ to explicitly control how closely the system and model modal parameters agree:

\begin{matrix} p (ω^{2}, ϕ, θ | β) \propto {(2 π / β)}^{- N_{m} N_{d} / 2} \\ \exp \left\{- \frac{β}{2} \sum_{r = 1}^{N_{m}} ‖ (K (θ) - ω_{r}^{2} M) ϕ_{r} ‖^{2} \right\} \end{matrix}

(30)

Figure 2.

Hierarchical Bayesian model representation of the structural system identification problem.

Although the system modal parameters $ω^{2}$ and $ϕ$ are a nonlinear function of the structural model parameters $θ$ , the joint prior $p (ω^{2}, ϕ, θ | β)$ can be decomposed into the product of a conditional PDF for any one of the parameter vectors and a marginal PDF for the other two. Therefore, a series of coupled linear-in-the-parameter problems can be set up (Huang et al., 2017a, 2017b).

Motivated by the SBL framework (Tipping, 2001), model sparseness in the stiffness changes is promoted by taking the MAP value ${\hat{θ}}_{u}$ from the calibration state as pseudo-data for $θ$ and defining the likelihood function

p ({\hat{θ}}_{u} | θ, α) = \prod_{s = 1}^{N_{θ}} N ({\hat{θ}}_{u, s} | θ_{s}, α_{s})

(31)

where the hyper-parameters $α_{s}$ are learned from the modal data. If hyper-parameter $α_{s} \to 0$ , then $θ_{s} \to {\hat{θ}}_{u, s}$ , which is interpreted as the sth substructure being undamaged. Gaussian likelihood functions, $p ({\hat{ω}}^{2} | ω^{2}, ρ)$ and $p (\hat{ψ} | ϕ, η)$ , are also defined for the system parameters $ω^{2}$ and $ϕ$ with corresponding precision parameters $ρ$ and $η$ , respectively. In addition, we model our prior uncertainty for the equation error precision $β$ by an exponential hyper-prior, $p (β | b_{0})$ , with rate parameter, $b_{0}$ .

Fast sparse Bayesian learning algorithm

Based on the hierarchical model in Figure 2, Huang et al. (2017a) proposed a fast sparse Bayesian learning algorithm that focuses on an analytical derivation of the posterior PDF of the stiffness parameters $θ$ and collects all other uncertain parameters in the vector $δ = [(ω^{2})^{T}, ρ^{T}, ϕ^{T}, η, α^{T}, β, b_{0}]^{T}$. The latter are treated as “nuisance” parameters, which are integrated out using Laplace’s approximation method (Beck and Katafygiotis, 1998). Assuming that the posterior $p (δ | {\hat{ω}}^{2}, \hat{ψ}, {\hat{θ}}_{u})$ is highly peaked at the MAP value $\tilde{δ}$, the posterior PDF of $θ$ is approximated by

\begin{matrix} p (θ | {\hat{ω}}^{2}, \hat{ψ}, {\hat{θ}}_{u}) = \int p (θ | δ, {\hat{ω}}^{2}, \hat{ψ}, {\hat{θ}}_{u}) p (δ | {\hat{ω}}^{2}, \hat{ψ}, {\hat{θ}}_{u}) d δ \\ \approx p (θ | \tilde{δ}, {\hat{ω}}^{2}, \hat{ψ}, {\hat{θ}}_{u}) \end{matrix}

(32)

where $\tilde{δ} = \arg\max p (δ | {\hat{ω}}^{2}, \hat{ψ}, {\hat{θ}}_{u}) = \arg\max p ({\hat{ω}}^{2}, \hat{ψ}, {\hat{θ}}_{u} | δ) p (δ)$. Treating the prior $p (δ)$ as uniform, the maximization of the evidence $p ({\hat{ω}}^{2}, \hat{ψ}, {\hat{θ}}_{u} | δ)$ here is effectively implementing the Bayesian Ockham Razor. This suppresses the occurrence of false and missed alarms for stiffness reductions, as shown in Appendix 1.

Sparse Bayesian learning algorithms using partial Gibbs sampling combined with Laplace’s approximation

To provide a fuller treatment of the posterior uncertainty, it is necessary to avoid the Laplace approximation in the fast SBL algorithm that involves the system modal parameters, $\{ω^{2}, ϕ\}$, and the equation error precision parameter $β$. Huang et al. (2017b) accomplished this using Gibbs sampling (GS) to draw posterior samples from $p (ϕ, ω^{2}, θ, β | {\hat{ω}}^{2}, \hat{ψ}, {\hat{θ}}_{u})$ by decomposing the whole model parameter vector into four groups and repeatedly sampling from one parameter group conditional on the other three groups and the available data $\{{\hat{ω}}^{2}, \hat{ψ}, {\hat{θ}}_{u}\}$. The effective dimension is then four, rather than the considerably higher total number of model parameters. Laplace’s approximation is used only for the integrals that marginalize the hyper-parameters from the conditional posterior PDF. In this GS method, the conditional posterior PDFs

p (ϕ | {\hat{ω}}^{2}, \hat{ψ}, {\hat{θ}}_{u}, ω^{2}, θ, β) = p (ϕ | \hat{ψ}, ω^{2}, θ, β)

(33a)

p (ω^{2} | {\hat{ω}}^{2}, \hat{ψ}, {\hat{θ}}_{u}, ϕ, θ, β) = p (ω^{2} | {\hat{ω}}^{2}, ϕ, θ, β)

(33b)

p (θ | {\hat{ω}}^{2}, \hat{ψ}, {\hat{θ}}_{u}, ϕ, ω^{2}, β) = p (θ | {\hat{θ}}_{u}, ϕ, ω^{2}, β)

(33c)

p (β | {\hat{ω}}^{2}, \hat{ψ}, {\hat{θ}}_{u}, ϕ, ω^{2}, θ)

(33d)

are successively sampled to generate samples from the full posterior PDF $p (ϕ, ω^{2}, θ, β | {\hat{ω}}^{2}, \hat{ψ}, {\hat{θ}}_{u})$ when the number of samples n is sufficiently large (beyond burn-in) since the Markov chain created by the GS is ergodic.
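The mechanics of sampling one parameter group conditional on the others can be illustrated on a toy target. In the sketch below, a standard bivariate Gaussian with correlation `rho` stands in for the joint posterior, and its two coordinates play the role of two parameter groups; the conditional PDFs of equations (33a) to (33d) are not implemented. The function name and all settings are assumptions for this illustration.

```python
import numpy as np

def gibbs_bivariate_normal(rho, n_samples, burn_in=1000, seed=0):
    """Gibbs sampler for a zero-mean, unit-variance bivariate Gaussian.

    Each full conditional is Gaussian: x | y ~ N(rho * y, 1 - rho**2),
    and symmetrically for y | x.
    """
    rng = np.random.default_rng(seed)
    cond_std = np.sqrt(1.0 - rho**2)
    x, y = 0.0, 0.0
    samples = np.empty((n_samples, 2))
    for n in range(burn_in + n_samples):
        x = rng.normal(rho * y, cond_std)  # sample group 1 given group 2
        y = rng.normal(rho * x, cond_std)  # sample group 2 given group 1
        if n >= burn_in:                   # discard the burn-in period
            samples[n - burn_in] = (x, y)
    return samples

samples = gibbs_bivariate_normal(rho=0.5, n_samples=20000)
print(samples.mean(axis=0), np.corrcoef(samples.T)[0, 1])
```

After burn-in, the sample mean and correlation approach those of the target distribution, which is the ergodicity property invoked above.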

For the updating of stiffness scaling parameters $θ$ and system modal parameters, $ω^{2}$ and $ϕ$, the corresponding model classes are investigated, as seen from the hierarchical Bayesian model in Figure 2. The application of Bayes’ Theorem at the model class level automatically penalizes models of $θ$ ($ω^{2}$ or $ϕ$) that “underfit” or “overfit” the associated data ${\hat{θ}}_{u}$ (${\hat{ω}}^{2}$ or $\hat{ψ}$). Consequently, more reliable updating results for the three parameter vectors are obtained. This is the Bayesian Ockham Razor (Beck, 2010). Note that it is also tractable to marginalize out the equation error precision parameter $β$ to remove it from the posterior distributions as a “nuisance” parameter. This leads to Student’s t-distributions for the posteriors $p (ϕ | \hat{ψ}, ω^{2}, θ)$, $p (ω^{2} | {\hat{ω}}^{2}, ϕ, θ)$, and $p (θ | {\hat{θ}}_{u}, ϕ, ω^{2})$ (Huang et al., 2017b). Student’s t-PDFs have heavy tails and so the associated algorithm is more robust against noise and outliers.

Full Gibbs sampling procedure for sparse Bayesian learning

In order to characterize the full posterior uncertainty, the GS is implemented to draw posterior samples from the joint posterior PDF $p (ω^{2}, ϕ, θ, β, ρ, η, α, b_{0} | {\hat{ω}}^{2}, \hat{ψ}, {\hat{θ}}_{u})$ in Huang and Beck (2018). To alleviate any inefficiency where the Markov Chain samples may get trapped in local maxima of the posterior PDF for the hyper-parameter $α$ because of a very large number of uncertain parameters to be inferred, a sequential Bayesian inference procedure was introduced based on the hierarchical Bayesian model in Figure 2. The full joint posterior PDF is given by

\begin{matrix} p (ω^{2}, ϕ, θ, ρ, η, α, β, b_{0} | {\hat{ω}}^{2}, \hat{ψ}, {\hat{θ}}_{u}) \\ = p (ω^{2}, ϕ, ρ, η | θ, α, β, b_{0}, {\hat{ω}}^{2}, \hat{ψ}, {\hat{θ}}_{u}) \\ p (θ, α, β, b_{0} | {\hat{ω}}^{2}, \hat{ψ}, {\hat{θ}}_{u}) \end{matrix}

(34)

The full posterior uncertainty is characterized by first taking the generated samples, $\{θ^{(n)}, α^{(n)}, β^{(n)}, {b_{0}}^{(n)}\}, n = 1, \dots, N$, from the PDF $p (θ, α, β, b_{0} | {\hat{ω}}^{2}, \hat{ψ}, {\hat{θ}}_{u})$. Thereafter, posterior samples, $\{{ω^{2}}^{(n)}, ϕ^{(n)}, ρ^{(n)}, η^{(n)}\}, n = 1, \dots, N$, are drawn from the conditional posterior PDF, $p (ω^{2}, ϕ, ρ, η | θ^{(n)}, α^{(n)}, β^{(n)}, {b_{0}}^{(n)}, {\hat{ω}}^{2}, \hat{ψ}, {\hat{θ}}_{u}), n = 1, \dots, N$, using GS.

Multi-task sparse Bayesian learning methods

Multi-task learning is a method that attempts to exploit informative relationships or data redundancy between $M$ different groups of measurements $\{y_{m}\}_{m = 1}^{M}$, which may improve the SBL performance. Huang et al. (2018a) presented a multi-task SBL method by assigning a shared hyper-prior and prediction error precision parameter, which characterizes the common sparseness profile across multiple tasks. To enhance the learning robustness and posterior uncertainty quantification accuracy, the algorithm marginalized out the common prediction error precision parameter instead of merely finding its MAP value. Then the posterior PDF of the combined parameter vector $\{w_{m}\}_{m = 1}^{M}$, where $y_{m} = Θ_{m} w_{m} + e_{m} + m_{m}$ (see equation (24)), is

\begin{matrix} p (\{w_{m}\}_{m = 1}^{M} | \{y_{m}\}_{m = 1}^{M}, α, a_{0}, b_{0}) \\ = \int p (\{w_{m}\}_{m = 1}^{M} | \{y_{m}\}_{m = 1}^{M}, α, β) p (β | \{y_{m}\}_{m = 1}^{M}, α, a_{0}, b_{0}) d β \end{matrix}

(35)

where $b_{0}$ is the prior rate parameter for the Gamma prior of the prediction error precision parameter, $β$ . The MAP values of the hyper-parameters are inferred using the datasets from all learning tasks

(\tilde{α}, {\tilde{b}}_{0}) = \arg\max p (α, b_{0} | \{y_{m}\}_{m = 1}^{M})

(36)

This approach was applied to identify structural stiffness losses by exploiting a commonality among stiffness reduction models in the temporal domain, that is, the damage changes by a “small” amount over adjacent time periods. It has been shown that damage patterns are more reliably detected in both qualitative and quantitative ways by this sharing of related information using multi-task learning. Huang et al. (2018b) employed a multi-task SBL to adaptively borrow the respective strengths of two fractal dimension-based damage indices to acquire a unifying damage identification index.

Application to IASC-ASCE Phase II benchmark problems

The fast SBL algorithm with and without the sparseness constraint (Huang et al., 2017a) and the full GS SBL algorithm (Huang and Beck, 2018) were applied to the brace damage patterns in the IASC-ASCE Phase II simulated benchmark (Bernal et al., 2002) and experimental benchmark problems (Ching and Beck, 2003; Dyke et al., 2003). The benchmark structure is a four-story, two-bay by two-bay steel braced-frame. Results for the damage scenarios DP1B.ps (.ps denotes partial-sensor measurements, which are taken at the third floor and the roof) and Config. 5 from the simulated and experimental benchmarks, respectively, are reported here. The stiffness scaling parameter vector $θ$ has 16 components, one for each of the four faces of each of the four stories. The true stiffness ratios of the damaged substructures, relative to the calibration configuration, are 88.7% for $θ_{1, + y}$ and $θ_{1, - y}$ in DP1B.ps and 77.4% for $θ_{1, - y}$ in Config. 5, that is, stiffness reductions of 11.3% and 22.6%, respectively.

In Figures 3 and 4, all the samples generated from the full GS SBL algorithm (Huang and Beck, 2018), excluding those in the burn-in period (4000 samples), are plotted in the $\{θ_{1, + x}, θ_{1, + y}\}$ and $\{θ_{1, - x}, θ_{1, - y}\}$ spaces for DP1B.ps and Config. 5, respectively. They show that the stiffness reductions corresponding to $θ_{1, + y}$ and $θ_{1, - y}$ for DP1B.ps and $θ_{1, - y}$ for Config. 5 are correctly identified and quantified as far as the sample means are concerned. Considerably larger posterior uncertainties can be observed in the stiffness scaling parameters for the Config. 5 scenario compared with those for DP1B.ps. This is because of the larger modeling errors in this real data case, especially for those components corresponding to real damage locations.

Figure 3.

Post burn-in samples of some posterior stiffness parameters for the DP1B.ps scenario, plotted in (a) $\{θ_{1, + x}, θ_{1, + y}\}$ and (b) $\{θ_{1, - x}, θ_{1, - y}\}$ spaces by running the full GS SBL algorithm. The reduction in stiffness shown by $θ_{1, + y}$ and $θ_{1, - y}$ reflects the damage in the corresponding substructures.

Figure 4.

Post burn-in samples of some posterior stiffness parameters for the Config. 5 scenario, plotted in (a) $\{θ_{1, + x}, θ_{1, + y}\}$ and (b) $\{θ_{1, - x}, θ_{1, - y}\}$ spaces by running the full GS SBL algorithm. The reduction in stiffness shown by $θ_{1, - y}$ reflects the damage in the corresponding substructure.

Figures 5 and 6 show the posterior probability densities of the damage extent fraction f for each substructure, which is the decrease in each stiffness parameter divided by its original calibration value. The posterior probability densities are estimated by the computed posterior PDFs (fast algorithms) or posterior samples (full GS algorithm). Damaged substructures should have large posterior probability density values where the stiffness reduction value is close to the real value. By comparing the results, it is seen that the fast SBL and GS algorithms give more accurate stiffness reduction ratios than the method without the sparseness constraint. Moreover, the false and missed damage indications have been effectively suppressed. This shows the benefit of exploiting damage sparseness. The performance of the two SBL algorithms is similar, although false damage detections (actual undamaged substructures that have probability densities shifted to larger damage extents) occur less often for the full GS SBL algorithm for Config. 5 (Figure 6). This is because of the robust treatment of the hyper-parameters by a fuller posterior uncertainty quantification. For example, the probability densities of $θ_{1, + y}, θ_{3, + y}, θ_{4, + y}, θ_{2, - y}, θ_{3, - y}$ , and $θ_{4, - y}$ are shifted to larger damage extents for the fast SBL algorithm, which tends to produce false detections. Moreover, the damage extent estimation for $θ_{1, - y}$ (22.6%) is more accurately quantified for the full GS SBL method than for the fast SBL method.

Figure 5.

DP1B.ps scenario: (a) Approximated Gaussian PDFs for the fast SBL algorithm with sparseness turned off and (b) Approximated Gaussian PDFs for the fast SBL algorithm; (c) Kernel probability densities built from 6000 post burn-in stiffness parameter samples by running the full GS SBL algorithm. The two damaged substructures correspond to $θ_{1, + y}$ and $θ_{1, - y}$ .

Figure 6.

Config. 5 scenario: (a) Approximated Gaussian PDFs for the fast SBL algorithm with sparseness turned off and (b) Approximated Gaussian PDFs for the fast SBL algorithm; (c) Kernel probability densities built from 6000 post burn-in stiffness parameter samples by running the full GS SBL algorithm. The only damaged substructure corresponds to $θ_{1, - y}$ .

Sparse Bayesian learning application in guided wave/ultrasonic NDT signal processing

In guided wave/ultrasonic NDT signal processing, the signal obtained from pulse-echo mode testing can be represented as a linear combination of echoes reflected from damage or defects in the sample being examined. A generalized expression of the signal is given by

y (t) = \sum_{l = 1}^{L} c_{l} ϕ_{l} (t) + ξ (t)

(37)

where $c_{l}$ is the weighting coefficient of the $l$th echo $ϕ_{l} (t)$ and $ξ (t)$ is a term representing noise in the signal. When an accurate representation of each echo is available, the recorded signal $y (t)$ can be represented using only a few terms, that is, the sparseness of the representation of $y (t)$ can be exploited. Therefore, SBL can be employed to infer the weighting coefficients $c_{l}$ from the measured signal vector $y$ using the following linear equation

y = Φ c + ξ

(38)

where $Φ \in ℝ^{K \times L}$ is an over-complete dictionary matrix ($K \ll L$) that consists of $L$ basis vectors $ϕ_{l}$ (also called atoms), and $c = [c_{1}, \dots, c_{L}]$ is a sparse weighting coefficient vector.

SBL applications in ultrasonic NDT

In ultrasonic NDT, the received signals are often contaminated by noise from both the measurement system and test sample (structure noise, due to scattering of ultrasonic waves by the “grain” microstructure of the tested material). To suppress noise and to increase the visibility of echoes for detection, Zhang et al. (2008) proposed a methodology in which the SBL algorithm developed by Wipf and Rao (2004) was employed to decompose the noisy NDT signals. The dictionary $Φ$ consisted of several fixed-scale critically sampled cosine Gabor bases where each atom is defined by the parameters $\{s, z, v\}$, as follows

g = (A / \sqrt{s}) \exp (- π {(t - z)}^{2} / s^{2}) \cos (v (t - z))

(39)

where s is the scale of the function, z is its translation, and v is the frequency modulation. One limitation of this SBL algorithm is that it is extremely computationally demanding, especially when the dimension of dictionary $Φ$ is large. As such, Wu et al. (2017a) developed a signal processing method that employs a robust sparse Bayesian learning algorithm (RSBL) to process noisy NDT signals for flaw detection. The RSBL algorithm was developed by Huang et al. (2014). It is based on the “fast algorithm” by Tipping and Faul (2003) but enhanced for better robustness by a successive relaxation strategy and stochastic optimization searching scheme to alleviate the optimization problem of the hyper-parameters being stuck in local maxima. Moreover, instead of using a uniformly distributed parameter set for dictionary design, Wu et al. (2017a) chose dictionary parameters in accordance with the energy distribution of the signal. Although the structure noise is not sparse in the spatial domain, the proposed dictionary design strategy can produce sparse representations of the structure noise by utilizing its limited frequency range and bandwidth. This modeling is useful to increase the sparse Bayesian learning accuracy of the weighting coefficient vector, $c$ .
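To make the structure of such a dictionary concrete, the following sketch builds cosine Gabor atoms of the form in equation (39) on a regular parameter grid and stacks them as unit-norm columns. The grid values, the sampling rate, and the function names `gabor_atom` and `gabor_dictionary` are assumptions chosen for illustration; they do not reproduce the dictionary designs of Zhang et al. (2008) or Wu et al. (2017a).

```python
import numpy as np


def gabor_atom(t, scale, shift, freq, amplitude=1.0):
    """Cosine Gabor atom: Gaussian envelope centred at `shift` times a cosine carrier.

    `freq` is an angular frequency (rad/s), matching cos(freq * (t - shift)).
    """
    tau = t - shift
    envelope = np.exp(-np.pi * tau**2 / scale**2)
    return (amplitude / np.sqrt(scale)) * envelope * np.cos(freq * tau)


def gabor_dictionary(t, scales, shifts, freqs):
    """Stack one unit-norm atom per (scale, shift, freq) combination as columns."""
    atoms, params = [], []
    for s in scales:
        for u in shifts:
            for v in freqs:
                g = gabor_atom(t, s, u, v)
                atoms.append(g / np.linalg.norm(g))
                params.append((s, u, v))
    return np.column_stack(atoms), params


# Hypothetical setup: 256 samples at 100 MHz, two scales, 64 shifts, three carriers.
t = np.arange(256) * 1e-8
scales = [0.2e-6, 0.4e-6]
shifts = t[::4]
freqs = 2 * np.pi * np.array([4e6, 5e6, 6e6])

Phi, params = gabor_dictionary(t, scales, shifts, freqs)
print("dictionary shape (K, L):", Phi.shape)
```

With these choices the dictionary has more atoms than samples, so it is over-complete in the sense used in equation (38).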

SBL applications in guided wave damage/defect detection

In guided wave (GW) testing, the identification and recovery of each guided wave mode in the received signal is vital for damage or defect characterization and localization. Once individual modes are identified, defect localization is straightforward. In particular, if the amplitude of each mode is known, it is possible to further characterize the size of the defect.

To process narrowband guided wave signals in which signal dispersion is negligible, Wu et al. (2017b) introduced an SBL-based method, where the Gabor model given in equation (39) was utilized to approximate the GW pulses. To form an efficient over-complete dictionary, the three Gabor parameters $(ω, s, u)$ were designed as follows: the frequencies $ω$ (the frequency modulation, written $v$ in equation (39)) evenly divide the total power of the signal; the scale parameters $s$ are uniformly distributed in the range $(0.5 s_{0}, 2 s_{0})$ , where $s_{0}$ is the bandwidth of the generated signal; and the parameters $u$ evenly separate the area between the upper and lower envelopes of the signal.

For dispersive guided wave signal processing, the Gabor dictionary becomes inefficient. To address this, Wu et al. (2018) proposed a parameterized chirp model for approximating dispersive guided wave signals using a polynomial approximation of the frequency–wavenumber dispersion relation $k (f)$ . This wavenumber, as a function of frequency $f$ , characterizes the dispersion property of the GW mode in the waveguide. By utilizing a third-order polynomial approximation of $k (f)$ , the time-domain waveform of the pulse at the travel distance $x = x_{0}$ can be obtained as

\begin{aligned} g (x_{0}, t) &= \operatorname{Re} [G (0, f) e^{- j k (f) x_{0}}] \\ &\approx \operatorname{Re} [G (0, f) e^{- j x_{0} (\frac{1}{3} d_{2} f^{3} + \frac{1}{2} d_{1} f^{2} + d_{0} f + c_{0})}] \end{aligned}

(40)

where $G (0, f)$ is the Fourier transform of the pulse at $x = 0$ and $d_{2}$ , $d_{1}$ , $d_{0}$ , and $c_{0}$ are the coefficients of the polynomial approximation of $k (f)$ . Based on this model, Wu et al. (2018) presented a signal processing method that utilizes an advanced SBL algorithm presented in Huang et al. (2016) to recover multiple dispersive GW modes from noisy signals for damage detection and localization. The dictionary design was based on the propagation path of each mode, which is closely linked to signal dispersion. As a result, the distances between the defect and the actuator and between the defect and the receiver can be easily obtained, making defect localization straightforward. In addition, the SBL algorithm presented in Huang et al. (2016) can treat both highly sparse and approximately sparse signal models, so the method is robust against noise in the signals.
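The dispersion model of equation (40) can be explored numerically with a short sketch. It assumes that the time-domain waveform is recovered by an inverse FFT of the phase-shifted spectrum, a step that equation (40) leaves implicit, and it uses a hypothetical tone burst, wave speed, and polynomial coefficients rather than values from Wu et al. (2018). The function name `propagate_pulse` is likewise an assumption.

```python
import numpy as np


def propagate_pulse(g0, dt, x0, d2, d1, d0, c0):
    """Propagate a real pulse g0(t), sampled at step dt, over a distance x0.

    The wavenumber is modeled by the cubic polynomial used in equation (40):
    k(f) = d2 f^3 / 3 + d1 f^2 / 2 + d0 f + c0.
    """
    n = len(g0)
    f = np.fft.rfftfreq(n, dt)                  # non-negative frequencies (Hz)
    G0 = np.fft.rfft(g0)                        # spectrum of the pulse at x = 0
    k = d2 * f**3 / 3 + d1 * f**2 / 2 + d0 * f + c0
    # Apply the propagation phase and return to the time domain.
    # irfft assumes a conjugate-symmetric spectrum, so the output is real.
    return np.fft.irfft(G0 * np.exp(-1j * x0 * k), n)


# Hypothetical tone burst: 50 kHz carrier under a Gaussian envelope.
dt = 1e-6
t = np.arange(1024) * dt
g0 = np.exp(-((t - 200e-6) / 40e-6) ** 2) * np.cos(2 * np.pi * 50e3 * (t - 200e-6))

# Non-dispersive check: k(f) = 2*pi*f / c only delays the pulse by x0 / c.
c = 5000.0                                      # assumed wave speed (m/s)
delayed = propagate_pulse(g0, dt, x0=0.5, d2=0.0, d1=0.0, d0=2 * np.pi / c, c0=0.0)

# A quadratic term in k(f) makes the group velocity frequency dependent.
dispersed = propagate_pulse(g0, dt, x0=0.5, d2=0.0, d1=1e-7, d0=2 * np.pi / c, c0=0.0)
print("peak amplitude before and after dispersion:", g0.max(), np.abs(dispersed).max())
```

Evaluating this function over a grid of travel distances is one way to generate distance-indexed atoms, which is the property that links each recovered atom to a propagation path.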

This method was verified through the experimental study shown in Figure 7, where a notch was prefabricated as damage. Figure 8 shows the recovered signals and the individual modes obtained. Using the propagation information of the recovered modes and the triangulation method, the location of the damage was obtained, as presented in Figure 9. The detected notch is close to its actual position (the distance between the two positions is approximately 17 mm). Notably, measurements from only two sensors were sufficient to localize the notch.
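One simple way to triangulate from path lengths is sketched below. It assumes that each recovered mode yields the total length of the path from the actuator to the defect and on to a sensor, and it searches a grid for the point that best matches two such lengths. The layout is hypothetical and is not the geometry of Figure 7; the grid search and the names `path_length` and `locate_defect` are illustrative choices, not the procedure of Wu et al. (2018).

```python
import math


def path_length(point, actuator, sensor):
    """Length of the scattered path actuator -> point -> sensor."""
    return (math.hypot(point[0] - actuator[0], point[1] - actuator[1])
            + math.hypot(point[0] - sensor[0], point[1] - sensor[1]))


def locate_defect(actuator, sensors, measured_paths, x_lim, y_lim, step):
    """Grid search for the point whose path lengths best match the measured ones."""
    nx = int(round((x_lim[1] - x_lim[0]) / step)) + 1
    ny = int(round((y_lim[1] - y_lim[0]) / step)) + 1
    best_point, best_cost = None, float("inf")
    for i in range(nx):
        for j in range(ny):
            p = (x_lim[0] + i * step, y_lim[0] + j * step)
            cost = sum((path_length(p, actuator, s) - d) ** 2
                       for s, d in zip(sensors, measured_paths))
            if cost < best_cost:
                best_point, best_cost = p, cost
    return best_point


# Hypothetical layout (metres); not the geometry of Figure 7.
actuator = (0.0, 0.0)
sensors = [(0.3, 0.0), (-0.2, 0.0)]
true_defect = (0.1, 0.25)
# In practice each path length would come from a recovered mode
# (arrival time times group velocity); here it is computed from the true position.
measured_paths = [path_length(true_defect, actuator, s) for s in sensors]

estimate = locate_defect(actuator, sensors, measured_paths,
                         x_lim=(-0.5, 0.5), y_lim=(0.0, 0.5), step=0.005)
print("estimated defect position:", estimate)
```

Because the actuator and both sensors lie on one line in this toy layout, the search is restricted to one side of that line to remove the mirror-image solution.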

Figure 7.

Experimental setup for damage localization.

Figure 8.

(a) Processed signal and recovered modes for sensor 1 and (b) processed signal and recovered modes for sensor 2.

Figure 9.

Notch localization.

Discussion and future prospects

This article presented a state-of-the-art review of Bayesian inference and its application in structural system identification and damage assessment of civil infrastructure. Because of space limitations, other applications of Bayesian inference in system identification, such as modal identification, are not included. Based on the literature review, the following concluding remarks can be made:

A powerful Bayesian probabilistic framework is available for treating modeling uncertainty in system identification that is based solely on the probability logic axioms. It allows plausible reasoning regarding system behavior based on noisy incomplete data without invoking the concept of “inherent randomness.” Rather than considering only a point estimate based on a single model, Bayes’ theorem is used to compute the posterior probability distribution and quantify the relative plausibility of each model in a parameterized set of system models.

Comparing the posterior probability at the model class level automatically implements a quantitative form of the Ockham Razor. Roughly speaking, this principle states that models should not be more complex than is sufficient to explain the data. The Bayesian Ockham Razor penalizes model classes that “overfit” the data, which is important in real applications since overly complex models often lead to overfitting of the data and then subsequent response predictions may be unreliable.

To allow a computationally feasible Bayesian implementation, various Bayesian approximation tools have been developed for robust analysis and characterization of the posterior distribution in Bayesian updating and model class assessment involving a large number of uncertain parameters. Which tool is appropriate depends on the situation. For example, Laplace’s asymptotic approximation is useful if the amount of data is not too small and the model class is globally identifiable. When the chosen class of models is unidentifiable or only locally identifiable based on the data, so that there are multiple maximum likelihood estimates (MLEs), stochastic simulation methods such as MCMC are more practical for calculating the model class evidence.

The application of Bayesian inference to both vibration-based and wave propagation-based damage assessment was reviewed. Using a Bayesian probabilistic formulation, the updated posterior probability distribution of the uncertain damage-related model parameters is obtained. Not only are the most probable estimates inferred, but the associated posterior uncertainties are also quantified, including the probability of substructure damage of various amounts. The concept of system mode shape was utilized in the vibration-based damage assessment methods. This avoids the challenging mode-matching problem and the necessity of solving the nonlinear inverse problem related to a structural model eigenvalue equation.

Hierarchical sparse Bayesian learning methodologies have attracted interest in recent years for inferring sparse stiffness losses in vibration-based damage assessment and for flaw detection using guided wave/ultrasonic NDT signal processing. Incorporating prior information about the spatial sparseness of structural damage has been found to help suppress false damage detections. Moreover, the algorithms have the appealing feature that they automatically select all algorithmic parameters, so that no user intervention is required.
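The Bayesian Ockham Razor mentioned in the remarks above can be illustrated on a toy problem where the model class evidence is available in closed form. The sketch assumes polynomial model classes with a zero-mean Gaussian prior on the coefficients and Gaussian noise of known standard deviation; the data, the prior width, and the function name `log_evidence` are hypothetical and are not taken from any study reviewed here.

```python
import numpy as np


def log_evidence(x, y, order, noise_std, prior_std):
    """Exact log evidence of a polynomial model class of the given order.

    With prior N(0, prior_std^2 I) on the coefficients and Gaussian noise,
    the data are marginally Gaussian: y ~ N(0, noise_std^2 I + prior_std^2 Phi Phi^T).
    """
    Phi = np.vander(x, order + 1, increasing=True)
    C = noise_std**2 * np.eye(len(x)) + prior_std**2 * Phi @ Phi.T
    _, logdet = np.linalg.slogdet(C)
    quad = y @ np.linalg.solve(C, y)
    return -0.5 * (len(x) * np.log(2 * np.pi) + logdet + quad)


# Synthetic data generated from a quadratic with additive noise.
rng = np.random.default_rng(1)
x = np.linspace(-1.0, 1.0, 30)
y = 1.0 + 2.0 * x - 3.0 * x**2 + 0.1 * rng.standard_normal(x.size)

log_ev = [log_evidence(x, y, m, noise_std=0.1, prior_std=10.0) for m in range(6)]
for m, value in enumerate(log_ev):
    print(f"polynomial order {m}: log evidence = {value:.1f}")
print("order with the largest evidence:", int(np.argmax(log_ev)))
```

Low-order classes are penalized through a poor data fit, while high-order classes are penalized because their prior probability is spread over many parameter values that the data rule out.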

To enhance the application of Bayesian inference in civil engineering and other related areas in science and technology, the following directions for future research are suggested:

Most past Bayesian inference applications in system identification and damage assessment have involved low-dimensional model parameters. There are computational challenges in applying the Bayesian approach to high-dimensional inverse problems, such as how to efficiently sample high-dimensional posterior parameter spaces and how to robustly explore the features implied by the collection of models corresponding to the posterior samples of the model parameters. Further research is desirable to develop new methods for exploring high-dimensional model parameter spaces, such as iterative block-parameter Gibbs sampling algorithms, and for refining model parameter spaces, such as variable-resolution approaches that permit a progressive refinement of the model parameterization.

The bottlenecks in Bayesian model updating and uncertainty quantification of nonlinear structural models require further study. For complex nonlinear models, an analytical expression for the likelihood function may be difficult or even impossible to obtain. Methods such as Approximate Bayesian Computation (ABC) have the potential to bypass the evaluation of likelihood functions and should be further explored in applications.

Bayesian inference is a powerful statistical framework for dealing with big datasets that avoids data overfitting and allows model uncertainty to be explicitly quantified. To overcome the computing challenges posed by large-scale spatial and temporal datasets in structural health monitoring, the development of scalable Bayesian inference algorithms should be explored. Topics for research include subsampling big datasets with a stochastic method that exploits the redundancy in large-scale datasets, developing recursive Bayesian estimation for inferring an unknown PDF over time using sequential datasets, and producing modular and portable software for distributed/parallel computing platforms.

Bayesian methods can enhance many machine learning methods (including deep learning) by handling missing data, extracting much more information from small datasets, and automatically tuning hyper-parameters. Moreover, Bayesian methods allow us to quantify both modeling and measurement uncertainty in learning and making predictions, which is a desirable feature in various fields. Future research in machine learning methodologies and applications can benefit by exploring Bayesian methods.
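As a pointer to how likelihood-free inference works in its simplest form, the sketch below implements plain ABC rejection sampling on a toy problem. It is a minimal illustration under stated assumptions: the simulator is a Gaussian with unknown mean standing in for a structural model, the summary statistic is the sample mean, and the function name `abc_rejection`, the tolerance, and the prior range are all hypothetical. More efficient ABC variants, such as those based on subset simulation, are not shown.

```python
import numpy as np


def abc_rejection(observed, simulate, prior_sample, summary, tol, n_draws, rng):
    """Keep prior draws whose simulated summary lies within tol of the observed one."""
    s_obs = summary(observed)
    accepted = []
    for _ in range(n_draws):
        theta = prior_sample(rng)
        if abs(summary(simulate(theta, rng)) - s_obs) <= tol:
            accepted.append(theta)
    return np.array(accepted)


def simulate(theta, rng):
    """Toy simulator: 20 noisy responses with unknown mean theta (no likelihood used)."""
    return rng.normal(theta, 1.0, size=20)


def prior_sample(rng):
    """Uniform prior on the unknown parameter."""
    return rng.uniform(-5.0, 5.0)


rng = np.random.default_rng(0)
observed = simulate(1.5, rng)
samples = abc_rejection(observed, simulate, prior_sample, np.mean,
                        tol=0.1, n_draws=20000, rng=rng)
print("accepted samples:", samples.size)
print("approximate posterior mean and std:", samples.mean(), samples.std())
```

The accepted samples approximate the posterior only up to the chosen tolerance and summary statistic, so both choices trade accuracy against the acceptance rate.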

Footnotes

Declaration of Conflicting Interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research is financially supported by the National Key Research and Development Program of China (2017YFC1500605) and the National Natural Science Foundation of China (Grant Nos. 51778192, 51638007 and 51308161).

ORCID iDs

Yong Huang

Biao Wu

References

Andrieu, Thoms (2008) A tutorial on adaptive MCMC. Statistics and Computing 18: 343–373.

Arangio, Beck (2012) Bayesian neural networks for bridge integrity assessment. Structural Control and Health Monitoring 19(1): 3–21.

Arangio, Bontempi (2015) Structural health monitoring of a cable-stayed bridge with Bayesian neural networks. Structure and Infrastructure Engineering 11(4): 575–587.

Astroza, Ebrahimian, et al. (2017) Bayesian nonlinear structural FE model and seismic input identification for damage assessment of civil structures. Mechanical Systems and Signal Processing 93: 661–687.

Bao, Xia, et al. (2013) Data fusion-based structural damage detection under varying temperature conditions. International Journal of Structural Stability and Dynamics 12(6): 1250052.

Beck (1989) Statistical system identification of structures. In: Proceedings of 5th international conference on structural safety and reliability, San Francisco, CA, 7–11 August.

Beck (2010) Bayesian system identification based on probability logic. Structural Control and Health Monitoring 17: 825–847.

Beck (2014) Bayesian system identification and the Bayesian Ockham Razor. In: Proceedings of the 9th international conference on structural dynamics, Porto, 30 June–2 July.

Beck (2002) Bayesian updating of structural models and reliability using Markov Chain Monte Carlo simulation. Journal of Engineering Mechanics 128(4): 380–391.

Beck, Katafygiotis (1991) Updating of a model and its uncertainties utilizing dynamic test data. In: Spanos, Brebbia (eds) Computational Stochastic Mechanics. Dordrecht: Springer, pp. 125–136.

Beck, Katafygiotis (1998) Updating models and their uncertainties. I: Bayesian statistical framework. Journal of Engineering Mechanics 124(4): 455–461.

Beck, Yuen (2004) Model selection using response measurements: Bayesian probabilistic approach. Journal of Engineering Mechanics 130(2): 192–203.

Beck, Zuev (2013) Asymptotically independent Markov sampling: a new Markov Chain Monte Carlo scheme for Bayesian inference. International Journal for Uncertainty Quantification 3(5): 445–474.

Beck, Vanik (2001) Monitoring structural health using a probabilistic measure. Computer-Aided Civil and Infrastructure Engineering 16(1): 1–11.

Behmanesh, Moaveni (2015) Probabilistic identification of simulated damage on the Dowling Hall footbridge through Bayesian finite element model updating. Structural Control and Health Monitoring 22: 463–483.

Behmanesh, Moaveni, Papadimitriou (2017) Probabilistic damage identification of a designed 9-story building using modal data in the presence of modeling errors. Engineering Structures 131: 542–552.

Bernal, Dyke, Lam, et al. (2002) Phase II of the ASCE benchmark study on SHM. In: Proceedings of 15th ASCE engineering mechanics conference, New York, 2–5 June, pp. 1048–1055.

Bishop (2005) Neural Networks for Pattern Recognition. New York: Oxford University Press.

Bishop (2006) Pattern Recognition and Machine Learning. New York: Springer.

Candes, Romberg, Tao (2006) Robust uncertainty principles: exact signal reconstruction from highly incomplete frequency information. IEEE Transactions on Information Theory 52(2): 489–509.

Catanach, Beck (2017) Bayesian system identification using auxiliary stochastic dynamical systems. International Journal of Nonlinear Mechanics 94: 72–83.

Chen, Donoho, Saunders (1999) Atomic decomposition by basis pursuit. SIAM Journal on Scientific and Statistical Computing 20(1): 33–61.

Cheung, Beck (2009) Bayesian model updating using Hybrid Monte Carlo Simulation with application to structural dynamics models with many uncertain parameters. Journal of Engineering Mechanics 135: 243–255.

Chiachio, Bochud, Chiachio, et al. (2017) A multilevel Bayesian method for ultrasound-based damage identification in composite laminates. Mechanical Systems and Signal Processing 88: 462–477.

Chiachio, Beck, Chiachio, et al. (2014) Approximate Bayesian computation by subset simulation. SIAM Journal on Scientific Computing 36(3): A1339–A1358.

Ching, Beck (2003) Two-step Bayesian structure health monitoring approach for IASC-ASCE phase II simulated and experimental benchmark studies. Technical Report EERL 2003-02, Earthquake Engineering Research Laboratory, California Institute of Technology, Pasadena, CA.

Ching, Beck (2004a) Bayesian analysis of the Phase II IASC–ASCE Structural Health Monitoring experimental benchmark data. Journal of Engineering Mechanics 130(10): 1233–1244.

Ching, Beck (2004b) New Bayesian model updating algorithm applied to a Structural Health Monitoring benchmark. Structural Health Monitoring 3(4): 313–332.

Ching, Chen (2007) Transitional Markov Chain Monte Carlo method for Bayesian model updating, model class selection, and model averaging. Journal of Engineering Mechanics 133(7): 816–832.

Ching, Beck, Porter (2006a) Bayesian state and parameter estimation of uncertain dynamical systems. Probabilistic Engineering Mechanics 21(1): 81–96.

Ching, Muto, Beck (2006b) Structural model updating and health monitoring with incomplete modal data using Gibbs sampler. Computer-Aided Civil and Infrastructure Engineering 21(4): 242–257.

Cover, Thomas (2006) Elements of Information Theory. Hoboken, NJ: Wiley-Interscience.

Cox (1946) Probability, frequency and reasonable expectation. American Journal of Physics 14(1): 1–13.

Cox (1961) The Algebra of Probable Inference. Baltimore, MD: Johns Hopkins Press.

Dyke, Bernal, Beck, et al. (2003) Experimental phase II of the structural health monitoring benchmark problem. In: Proceedings of 16th Engineering Mechanics conference, ASCE, Seattle, USA.

Faul, Tipping (2002) Analysis of sparse Bayesian learning. In: Dietterich, Becker, Ghahramani (eds) Advances in Neural Information Processing Systems 14. Cambridge, MA: MIT Press, pp. 383–389.

Figueiredo, Radu, Worden, et al. (2014) A Bayesian approach based on a Markov-chain Monte Carlo method for damage detection under unknown sources of variability. Engineering Structures 80: 1–10.

Fujimoto, Satoh, Fukunaga (2011) System identification based on variational Bayes method and the invariance under coordinate transformations. In: Proceedings of the 50th IEEE conference on CDC-ECC, Orlando, FL, pp. 3882–3888. New York: IEEE.

Ghanem, Shinozuka (1995) Structural-system identification. I: theory. Journal of Engineering Mechanics 121(2): 255–264.

Goller, Beck, Schuëller (2012) Evidence-based identification of weighting factors in Bayesian model updating using modal data. Journal of Engineering Mechanics 138(5): 430–440.

Gull (1988) Bayesian inductive inference and maximum entropy. In: Erickson, Smith (eds) Maximum Entropy and Bayesian Methods. Dordrecht: Kluwer Academic Publishers, pp. 53–74.

Haario, Saksman, Tamminen (2001) An adaptive Metropolis algorithm. Bernoulli 7: 223–242.

Hastings (1970) Monte Carlo sampling methods using Markov Chains and their applications. Biometrika 57(1): 97–109.

(2017) Guided wave-based identification of multiple cracks in beams using a Bayesian approach. Mechanical Systems and Signal Processing 84: 324–345.

Hong, Sun, Kim (2006) Waveguide damage detection by the matching pursuit approach employing the dispersion-based chirp functions. IEEE Transactions on Ultrasonics Ferroelectrics and Frequency Control 53(3): 592–605.

Hou, Xia, Zhou (2018a) Structural damage detection based on l₁ regularization using natural frequencies and mode shapes. Structural Control and Health Monitoring 25(3): e2107.

Hou, Xia, Bao, et al. (2018b) Selection of regularization parameter for l₁-regularized damage detection. Journal of Sound and Vibration 423: 141–160.

Huang, Beck (2015) Hierarchical sparse Bayesian learning for structural health monitoring with incomplete modal data. International Journal for Uncertainty Quantification 5(2): 139–169.

Huang, Beck (2011) Robust diagnostics for Bayesian compressive sensing technique in structural health monitoring. In: The 8th international workshop on structural health monitoring, Stanford, USA, 13–15 September.

Huang, Beck (2018) Full Gibbs sampling procedure for Bayesian system identification incorporating sparse Bayesian learning with automatic relevance determination. Computer-Aided Civil and Infrastructure Engineering 33(9): 712–730.

Huang, Beck (2017a) Hierarchical sparse Bayesian learning for structural damage detection: theory, computation and application. Structural Safety 64: 37–53.

Huang, Beck (2017b) Bayesian system identification based on hierarchical sparse Bayesian learning and Gibbs sampling with application to structural damage assessment. Computer Methods in Applied Mechanics and Engineering 318: 382–411.

Huang, Beck (2018a) Multi-task sparse Bayesian learning with applications in Structural Health Monitoring. Computer-Aided Civil and Infrastructure Engineering. Epub ahead of print 21 August 2018. DOI: 10.1111/mice.12408

Huang, Beck, et al. (2014) Robust Bayesian compressive sensing for signals in Structural Health Monitoring. Computer-Aided Civil and Infrastructure Engineering 29(3): 160–179.

Huang, Beck, et al. (2016) Bayesian compressive sensing for approximately sparse signals and application to structural health monitoring signals for data loss recovery. Probabilistic Engineering Mechanics 46: 62–79.

Huang, et al. (2018b) Fractal dimension based damage identification incorporating multi-task sparse Bayesian learning. Smart Materials and Structures 27: 075020.

Huang, Ren, Beck, et al. (2018c) Sequential Bayesian compressed sensing. In: The 7th world conference on structural control and monitoring, 7WCSCM, Qingdao, China, 22–25 July.

Huang, Shao (2018d) Diagnosis and accuracy enhancement of compressive-sensing signal reconstruction in structural health monitoring using multi-task sparse Bayesian learning. Smart Materials and Structures. Available at: https://doi.org/10.1088/1361-665X/aae9b4

Jaynes (1957) Information theory and statistical mechanics. Physical Review 106(4): 620–630.

Jaynes (1983) In: Rosenkrantz (ed.) Papers on Probability, Statistics and Statistical Physics. Dordrecht, Holland: Reidel Publishing.

Jaynes (2003) Probability Theory: The Logic of Science. Cambridge: Cambridge University Press.

Jefferys, Berger (1992) Ockham’s Razor and Bayesian analysis. American Scientist 80: 64–72.

Katafygiotis, Beck (1998) Updating models and their uncertainties. II: Model identifiability. Journal of Engineering Mechanics 124(4): 463–467.

Katafygiotis, Lam (2002) Tangential-projection algorithm for manifold representation in unidentifiable model updating problems. Earthquake Engineering & Structural Dynamics 31(4): 791–812.

Lam (2008) The selection of pattern features for structural damage detection using an extended Bayesian ANN algorithm. Engineering Structures 30(10): 2762–2770.

Lam, Wong (2014) The Bayesian methodology for the detection of railway ballast damage under a concrete sleeper. Engineering Structures 81: 289–301.

Lam, Yuen, Beck (2006) Structural health monitoring via measured Ritz vectors utilizing artificial neural networks. Computer-Aided Civil and Infrastructure Engineering 21: 232–241.

Der Kiureghian (2017) Operational modal identification using variational Bayes. Mechanical Systems and Signal Processing 88: 377–398.

MacKay DJC (1992) Bayesian methods for adaptive models. PhD Thesis, Computation and Neural Systems, California Institute of Technology, Pasadena, CA.

MacKay DJC (1994) Chapter 6: Bayesian methods for backpropagation networks. In: MacKay DJC (ed.) Model of Neural Networks III. Berlin: Springer, pp. 211–254.

Marin, Pudlo, Robert, et al. (2012) Approximate Bayesian computational methods. Statistics and Computing 22: 1167–1180.

Mises (1981 [1939]) Probability, Statistics, and Truth. New York: Dover Publications (in German).

Muto, Beck (2008) Bayesian updating and model class selection using stochastic simulation. Journal of Vibration and Control 14: 7–34.

Neal (1996) Bayesian learning for neural networks. Lecture Notes in Statistics, Berlin.

Veidt, Lam (2009) Guided wave damage characterization in beams utilizing probabilistic optimization. Engineering Structures 31(12): 2842–2850.

Nichols, Link, Murphy, et al. (2010) A Bayesian approach to identifying structural nonlinearity using free-decay response: application to damage detection in composites. Journal of Sound and Vibration 329(15): 2995–3007.

Raghavan, Cesnik CES (2007) Guided-wave signal processing using chirplet matching pursuits and mode correlation for structural health monitoring. Smart Materials and Structures 16(2): 355–366.

Robert, Casella (2004) Monte Carlo Statistical Methods (ed Fienberg). 2nd ed. New York: Springer.

Sirca, Adeli (2012) System identification in structural engineering. Scientia Iranica 19(6): 1355–1364.

Sohn, Allen, Worden, et al. (2005) Structural damage classification using extreme value statistics. Journal of Dynamic Systems Measurement and Control 127(1): 125–132.

Sohn, Law (1997) A Bayesian probabilistic approach for structure damage detection. Earthquake Engineering & Structural Dynamics 26(12): 1259–1281.

Straub, Papaioannou (2015) Bayesian updating with structural reliability methods. Journal of Engineering Mechanics 141(3): 04014134.

Tarantola (2005) Inverse Problem Theory. Philadelphia, PA: Society for Industrial and Applied Mathematics.

Tipping (2001) Sparse Bayesian learning and the relevance vector machine. Journal of Machine Learning Research 1: 211–244.

Tipping (2004) Bayesian inference: an introduction to principles and practice in machine learning. In: Bousquet, et al. (eds) Advanced Lectures on Machine Learning. Springer-Verlag Berlin Heidelberg, pp. 41–62.

Tipping, Faul (2003) Fast marginal likelihood maximization for sparse Bayesian models. In: Proceedings of 9th international workshop on artificial intelligence and statistics, Key West, FL, 3–6 January.

Tropp, Gilbert (2007) Signal recovery from random measurements via orthogonal matching pursuit. IEEE Transactions on Information Theory 53(12): 4655–4666.

Vakilzadeh, Huang, Beck, et al. (2017) Approximate Bayesian Computation by Subset Simulation using hierarchical state-space models. Mechanical Systems and Signal Processing 84: 2–20.

Vanik (1997) A Bayesian probabilistic approach to structural health monitoring. Technical Report EERL-9707. Pasadena, CA: Earthquake Engineering Research Laboratory, Caltech.

Vanik, Beck (2000) Bayesian probabilistic approach to structural health monitoring. Journal of Engineering Mechanics 126(7): 738–745.

Wang, Yajima, Liang, et al. (2015) A Bayesian model framework for calibrating ultrasonic in-line inspection data and estimating actual external corrosion depth in buried pipeline utilizing a clustering technique. Structural Safety 54: 19–31.

Wipf, Rao (2004) Sparse Bayesian learning for basis selection. IEEE Transactions on Signal Processing 52(8): 2153–2164.

Wu, Huang, Krishnaswamy (2017a) A Bayesian approach for sparse flaw detection from noisy signals for ultrasonic NDT. NDT&E International 85: 76–85.

Wu, Huang, Chen, et al. (2017b) Guided-wave signal processing by the sparse Bayesian learning approach employing Gabor pulse model. Structural Health Monitoring 16(3): 347–362.

Wu, Huang (2018) Sparse recovery of multiple dispersive guided-wave modes for defect localization using a Bayesian approach. Structural Health Monitoring. Available at: https://doi.org/10.1177/1475921718790212

Yan (2013) A Bayesian approach for damage localization in plate-like structures using Lamb waves. Smart Materials and Structures 22(3): 035012.

Yan, Katafygiotis (2015) A novel Bayesian approach for structural model updating utilizing statistical modal information from multiple setups. Structural Safety 52: 260–271.

Yang, Beck (1998) Generalized trajectory methods for finding multiple extrema and roots of functions. Journal of Optimization Theory and Applications 97(1): 211–227.

Yin, Jiang, Yuen (2017) Vibration-based damage detection for structural connections using incomplete modal data by Bayesian approach and model reduction technique. Engineering Structures 132: 260–277.

Yuen (2010) Recent developments of Bayesian model class selection and applications in civil engineering. Structural Safety 32(5): 338–346.

Yuen, Kuok (2016) Online updating and uncertainty quantification using nonstationary output-only measurement. Mechanical Systems and Signal Processing 66–67: 62–77.

Yuen, Beck (2004) Structural damage detection and assessment using adaptive Markov Chain Monte Carlo simulation. Structural Control and Health Monitoring 11(4): 327–347.

Yuen, Beck, Katafygiotis (2006) Efficient model updating and health monitoring methodology using incomplete modal data without mode matching. Structural Control and Health Monitoring 13(1): 91–107.

Zhang, Harvey, Braden (2008) Signal denoising and ultrasonic flaw detection via overcomplete and sparse representations. Journal of the Acoustical Society of America 124(5): 2963–2972.

Zhou, Xia, Weng (2015) L₁ regularization approach to structural damage detection using frequency data. Structural Health Monitoring 14(6): 571–582.