Physics-informed neural networks with transfer learning for space charge impedances in particle accelerators

Abstract

The physics-informed neural network (PINN) method, a powerful approach for solving partial differential equations with deep learning, has recently been applied to the modeling of electrodynamic interaction problems involving a relativistic beam in charged particle accelerators. In the present study, transfer learning (TL) is applied to the PINN based on the total-field (TF) formulation. It is shown that TL can significantly accelerate the training process of the TF-PINN in the simulation of the space charge impedances of an infinitely long beam pipe of elliptical and rectangular cross section.

Keywords

deep learning, transfer learning, impedance, electromagnetic field

Introduction

The physics-informed neural network (PINN) method^1,2 is a powerful approach for solving partial differential equations (PDEs) via deep learning. The key idea is to embed the PDEs into the loss function of a deep neural network (DNN) using automatic differentiation.

Recently, this approach has been successfully applied to the modeling of electrodynamic interaction problems involving a relativistic beam in charged particle accelerators.^3–6 The PINN based on the total-field (TF) formulation was proposed in previous studies.^3–5 We shall call this approach TF-PINN. The concept of transfer learning (TL)^7,8 was introduced into the PINN based on the scattered-field formulation,⁶ where the basic idea of introducing TL for the simulation of the beam impedance⁹ is briefly described. However, TL has not yet been discussed in the framework of TF-PINN.^3–5

The purpose of the present study is to apply TL to TF-PINN.³ The main focus is to show that TL can be applied to TF-PINN for the simulation of space charge impedances.³ We limit our discussion to this point.

PINN method for space charge impedances in particle accelerators

Problem formulation

When a charged particle beam passes through vacuum chamber components in a particle accelerator, an electromagnetic field can be induced. To evaluate the effects of electromagnetic interaction of the accelerator beam with vacuum chamber walls, the concept of beam impedance is used in the frequency domain. It is assumed that the beam charge density distribution is not changed during its passage. In accelerator physics, the beam impedance in the direction of motion is defined as⁹

\begin{matrix} Z_{\parallel} = - \frac{E_{z}}{I_{b}} \end{matrix}

(1)

where I_b = Qv = Qβc is the total beam current, c is the speed of light in vacuum, Q is the total charge, v = v e_|| is the beam velocity, e_|| is the unit vector in the longitudinal direction (z-direction), and E_z is the longitudinal component of the electric field. It is assumed that the beam moves along the axis of an infinitely long vacuum chamber. To calculate (1), one needs to obtain E_z for a particular harmonic component with an angular frequency ω = 2πf. The beam charge and current densities (ρ, J) and E_z can be expressed as⁹

ρ (r, t) = ρ_{⊥} (x, y) e^{j (ω t - k z)}, J = ρ v, E_{z} (r, t) = E_{z} (x, y) e^{j (ω t - k z)}

(2)

where r = (x,y,z). The longitudinal field E_z obeys the following PDE^3,9

\begin{matrix} \nabla_{⊥}^{2} E_{z} - \frac{k^{2}}{γ^{2}} E_{z} = - \frac{j k}{ε_{0} γ^{2}} ρ_{⊥} \end{matrix}

(3)

where $\nabla_{⊥}^{2} = \partial_{x}^{2} + \partial_{y}^{2}$, k = ω/v is the wavenumber, γ = (1−β²)^{−1/2} is the Lorentz factor, and ε₀ is the permittivity of vacuum. In the current study, we limit our interest to the case of a round Gaussian beam⁹:

\begin{matrix} ρ_{⊥} = \frac{Q}{2 π σ_{r}^{2}} e^{- \frac{{(x - x_{c})}^{2}}{2 σ_{r}^{2}} - \frac{{(y - y_{c})}^{2}}{2 σ_{r}^{2}}} \end{matrix}

(4)
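As a quick numerical check of (4), the following minimal Python sketch evaluates the round Gaussian density and confirms that it integrates to Q over the transverse plane. The function names `rho_perp` and `integrate_rho`, the beam center at the origin, and the integration grid are illustrative assumptions; Q = 1 pC and σ_r = 1 mm are the values used later in the numerical results.

```python
import math

def rho_perp(x, y, Q=1e-12, sigma_r=1e-3, xc=0.0, yc=0.0):
    """Round Gaussian transverse charge density of Eq. (4)."""
    r2 = (x - xc) ** 2 + (y - yc) ** 2
    return Q / (2.0 * math.pi * sigma_r ** 2) * math.exp(-r2 / (2.0 * sigma_r ** 2))

def integrate_rho(Q=1e-12, sigma_r=1e-3, half_width=8e-3, n=400):
    """Midpoint-rule integral of rho_perp over a square of side 2 * half_width.

    The square extends to 8 sigma_r (an assumed cut-off), where the Gaussian
    tails are negligible.
    """
    h = 2.0 * half_width / n
    total = 0.0
    for i in range(n):
        x = -half_width + (i + 0.5) * h
        for j in range(n):
            y = -half_width + (j + 0.5) * h
            total += rho_perp(x, y, Q, sigma_r)
    return total * h * h

total_charge = integrate_rho()
print(total_charge)  # should be very close to Q
```

Because the density is smooth everywhere, the same function can be evaluated at arbitrary sampling points, which is what the TF formulation relies on.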

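Here, σ_r is the Gaussian parameter in the radial direction and (x_c, y_c) is the position of the beam center. To obtain E_z for a given vacuum chamber cross-section, we solve (3) with the perfect electric conductor boundary condition (PEC-BC)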

\begin{matrix} E_{z} = 0 \end{matrix}

(5)

This means that the tangential component of the electric field is zero on the boundary surface. Here, E_z is called the space charge (SC) field because it is induced by the SC ρ in a vacuum chamber. In general, the SC field E_z can be expressed as a superposition of the free-space field of the SC and the scattered field.⁶ In this paper, the superposed (total) field E_z is chosen as the field quantity to be solved for. In the following, we shall call this the TF formulation.

PINN method based on the total-field formulation

Here, we summarize the PINN method³ based on the TF formulation for SC impedances in particle accelerators. In this method, PDE (3) and PEC-BC (5) are embedded into the loss function of a DNN^10,11 using automatic differentiation.¹² The method is schematically shown in Figure 1. Note that γ and ρ_⊥ are explicitly included in the constructed network. This method works well when the beam charge density is transversally smooth, as in (4). An elliptical bi-Gaussian charge can also be modeled in the TF-PINN framework.³ However, as already mentioned in a previous study,⁶ discontinuous beam charges such as a point charge, a ring charge, and a round uniform charge density with a hard edge cannot be addressed in this framework, because a loss function including such charges is not differentiable over the computational domain. Although not shown here, the input, output, PDE, and BC are properly scaled as in previous studies.^3–5

Figure 1.

Physics-informed neural networks for space charge impedances.

Application of transfer learning to PINN method in impedance simulation

As mentioned in the problem formulation, the beam impedance (1) is obtained from the total field driven by the beam current in an accelerator vacuum chamber in the frequency domain. In many cases, a frequency range of interest is specified. In the procedure used so far, the learnable parameters (weights and biases) are initialized at each frequency point of interest and then the training process starts; the same initialization is used again for the parameters to be trained at the next frequency point. The main point of this paper is to change this initialization procedure in the TF-PINN method by using the concept of transfer learning,^7,8 that is, reusing the parameters learned on one task for the training of other tasks. Note that this idea was applied to the scattered-field formulation in a previous study,⁶ but its applicability to the TF formulation has not yet been studied.

In the TF-PINN, the DNN is trained as a surrogate for the solution of (3) with (5) at each of the frequency points. To accelerate the training process with a gradient-based optimizer, the initialization of the learnable parameters of the DNN is important. Except for the first frequency point, DNN parameters trained at a lower frequency point can be used to initialize the DNN parameters at the adjacent higher frequency point. If the frequency step is sufficiently small, the TF solution obtained at one frequency point is expected to be similar to that at the adjacent higher frequency point, and the corresponding parameters may then also be similar. The next training process starts from this initialization with similar parameters. This approach is expected to reduce the number of iterations required to optimize the DNN parameters. Application examples of TL to the impedance simulation are shown in the numerical results. The algorithm is as follows:

1. Set up a computational domain for the PDE, boundary surfaces for the PEC-BC, source domains of ρ_⊥, a frequency range, and physical constants such as ε₀ and γ.

2. Generate a set of sampling points S within the vacuum domain and on the chamber walls that surround it. No point is generated outside the chamber.

3. Construct a DNN with output ${\hat{E}}_{z} (x, y; θ)$ as a surrogate of the PDE solution $E_{z}$, where θ is a vector containing all weights w and biases b in the DNN to be trained, and σ denotes an activation function.

4. Define the loss function L related to (3) and (5).

5. Initialize the learnable parameters θ at each frequency point, except the first, using the parameters previously optimized at the preceding frequency point.

6. Train the constructed DNN to find the best parameters θ by minimizing the loss function L via a gradient-based optimizer such as Adam or L-BFGS,¹³ until L is smaller than a threshold $ϵ$.
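The control flow of the algorithm above, in particular the parameter-initialization step, can be illustrated with a deliberately simplified Python sketch. It is not the TF-PINN itself: the function `train` is a hypothetical toy stand-in that runs plain gradient descent on a scalar quadratic loss whose minimizer drifts smoothly with frequency, and the names `frequency_sweep`, `theta_init`, and the frequency values are assumptions made only for illustration. The stopping rule mirrors the threshold and iteration cap used in the numerical results.

```python
def train(theta, target, lr=0.1, eps=1e-6, max_iter=30000):
    """Toy stand-in for PINN training: gradient descent on (theta - target)**2.

    Returns the optimized parameter and the number of iterations used.
    """
    for it in range(max_iter):
        loss = (theta - target) ** 2
        if loss < eps:
            return theta, it
        theta -= lr * 2.0 * (theta - target)  # one gradient step
    return theta, max_iter

def frequency_sweep(freqs, use_tl, theta_init=0.0):
    """Train at each frequency point; with TL, warm-start from the previous optimum."""
    theta_prev = None
    iterations = []
    for f in freqs:
        if use_tl and theta_prev is not None:
            theta0 = theta_prev   # transfer learning: reuse previously trained parameters
        else:
            theta0 = theta_init   # fresh initialization (first point, or no TL)
        target = f                # hypothetical: the optimum drifts smoothly with frequency
        theta_prev, n_it = train(theta0, target)
        iterations.append(n_it)
    return iterations

freqs = [1.0, 1.1, 1.2, 1.3]      # arbitrary units, small frequency step
iters_cold = frequency_sweep(freqs, use_tl=False)
iters_warm = frequency_sweep(freqs, use_tl=True)
print(iters_cold, iters_warm)
```

In this toy setting the warm-started runs need fewer iterations from the second frequency point onward, because the starting point is already close to the new optimum. Whether the same holds for an actual DNN depends on the frequency step being small enough for adjacent solutions to be similar.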

In the above method, we use the loss function L defined by

L (θ; S) = L_{P D E} (θ; S_{d}) + L_{B C} (θ; S_{b})

(7)

\begin{aligned} L_{PDE} (θ; S_{d}) & = \frac{1}{N_{d}} \sum_{(x, y) \in S_{d}} | f (x, y; θ) |^{2}, \\ L_{BC} (θ; S_{b}) & = \frac{1}{N_{b}} \sum_{(x, y) \in S_{b}} | {\hat{E}}_{z} (x, y; θ) |^{2} \end{aligned}

(8)

\begin{matrix} f = \nabla_{⊥}^{2} {\hat{E}}_{z} - \frac{k^{2}}{γ^{2}} {\hat{E}}_{z} + \frac{j k}{ε_{0} γ^{2}} ρ_{⊥} \end{matrix}

(9)

where S_d is the set of N_d sampling points in the computational domain and S_b is the set of N_b sampling points on the boundary surface; the full set of sampling points S consists of S_d and S_b. Minimizing L_PDE enforces (3) at the finite set of sampling points S_d in the computational domain. Similarly, minimizing L_BC enforces (5) at the finite set of sampling points S_b on the boundary surface.
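To make the structure of the loss in (7) to (9) concrete, the sketch below evaluates L_PDE and L_BC for a known trial field in a rectangular cross section. Several things here are assumptions for illustration only: the Laplacian is approximated with central finite differences instead of automatic differentiation, the trial field `u_trial` is an analytic function and not a DNN, the frequency is set to 1 GHz, and `rho_manufactured` is an artificial source constructed so that the trial field solves (3) exactly, which gives a simple way to check the residual.

```python
import math
import random

EPS0 = 8.8541878128e-12  # vacuum permittivity [F/m]
C0 = 299792458.0         # speed of light in vacuum [m/s]

def laplacian_fd(u, x, y, h):
    """Five-point central-difference approximation of the transverse Laplacian."""
    return (u(x + h, y) + u(x - h, y) + u(x, y + h) + u(x, y - h) - 4.0 * u(x, y)) / h ** 2

def residual(u, rho, x, y, k, gamma, h):
    """PDE residual f of Eq. (9) for a candidate field u."""
    return (laplacian_fd(u, x, y, h)
            - (k / gamma) ** 2 * u(x, y)
            + 1j * k / (EPS0 * gamma ** 2) * rho(x, y))

def loss(u, rho, S_d, S_b, k, gamma, h=1e-5):
    """Eqs. (7)-(8): mean squared residual on S_d plus mean squared field on S_b."""
    l_pde = sum(abs(residual(u, rho, x, y, k, gamma, h)) ** 2 for x, y in S_d) / len(S_d)
    l_bc = sum(abs(u(x, y)) ** 2 for x, y in S_b) / len(S_b)
    return l_pde + l_bc, l_pde, l_bc

# Rectangular cross section occupying [0, W] x [0, H], SI units.
W, H = 18e-3, 10e-3
gamma = 100.0
beta = math.sqrt(1.0 - 1.0 / gamma ** 2)
k = 2.0 * math.pi * 1e9 / (beta * C0)  # k = omega / v at an assumed 1 GHz

random.seed(0)
S_d = [(random.uniform(0.0, W), random.uniform(0.0, H)) for _ in range(500)]
S_b = ([(random.uniform(0.0, W), yb) for yb in (0.0, H) for _ in range(50)]
       + [(xb, random.uniform(0.0, H)) for xb in (0.0, W) for _ in range(50)])

def u_trial(x, y):
    """Trial field that vanishes on the rectangle boundary, as required by Eq. (5)."""
    return math.sin(math.pi * x / W) * math.sin(math.pi * y / H)

kappa2 = (math.pi / W) ** 2 + (math.pi / H) ** 2

def rho_manufactured(x, y):
    """Artificial source for which u_trial solves Eq. (3) exactly (checking only)."""
    return (kappa2 + (k / gamma) ** 2) * EPS0 * gamma ** 2 / (1j * k) * u_trial(x, y)

def rho_zero(x, y):
    """No source: u_trial does not satisfy Eq. (3), so the residual is large."""
    return 0.0

total_m, l_pde_m, l_bc_m = loss(u_trial, rho_manufactured, S_d, S_b, k, gamma)
total_0, l_pde_0, l_bc_0 = loss(u_trial, rho_zero, S_d, S_b, k, gamma)
print(l_pde_m, l_pde_0, l_bc_m)
```

With the matching source, L_PDE drops to the level of the finite-difference error, while a mismatched source leaves a large residual. In the PINN, the optimizer plays the corresponding role by adjusting θ until both terms are small.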

Numerical results

To investigate the effect of TL on the TF-PINN,³ we show application examples for the impedance analysis of two vacuum chambers. We calculate the SC impedances of a round Gaussian charge density with total charge Q = 1 pC, Gaussian parameter σ_r = 1 mm in the radial direction, and Lorentz factor γ = 100 in an infinitely long vacuum chamber with an elliptical or rectangular cross section of height H = 10 mm and width W = 18 mm. Throughout this study, we adopt a fully connected neural network and the Swish activation function¹⁴ for the DNN architecture. We use three hidden layers and 30 neurons per layer. At each frequency point, the DNN parameters are updated to minimize the loss function L by using the L-BFGS algorithm until L becomes smaller than a threshold $ϵ = 10^{- 6}$ or the number of iterations exceeds 30,000.
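For orientation, a fully connected network of the stated size (three hidden layers, 30 neurons each, Swish activation) can be sketched in plain Python as follows. The two outputs, standing for the real and imaginary parts of the field, the Glorot-style random initialization, and all function names are assumptions, since these details are not specified here; input scaling is also omitted.

```python
import math
import random

def swish(z):
    """Swish activation with beta = 1, i.e. z * sigmoid(z)."""
    return z / (1.0 + math.exp(-z))

def init_params(layer_sizes, seed=0):
    """Random weights and zero biases for a fully connected network."""
    rng = random.Random(seed)
    params = []
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        scale = math.sqrt(2.0 / (n_in + n_out))  # Glorot-style scale (an assumption)
        w = [[rng.gauss(0.0, scale) for _ in range(n_in)] for _ in range(n_out)]
        b = [0.0] * n_out
        params.append((w, b))
    return params

def forward(params, x, y):
    """Evaluate the network at one (x, y) point; Swish on hidden layers only."""
    a = [x, y]
    for idx, (w, b) in enumerate(params):
        z = [sum(wi * ai for wi, ai in zip(row, a)) + bi for row, bi in zip(w, b)]
        a = z if idx == len(params) - 1 else [swish(v) for v in z]
    return a

def count_params(params):
    """Total number of trainable weights and biases."""
    return sum(len(w) * len(w[0]) + len(b) for w, b in params)

layer_sizes = [2, 30, 30, 30, 2]  # inputs (x, y); two outputs (Re, Im) are an assumption
params = init_params(layer_sizes)
out = forward(params, 0.1, -0.2)
n_params = count_params(params)
print(n_params, out)
```

Under these assumptions the parameter vector θ has 2012 entries, which is the quantity that is carried over from one frequency point to the next when TL is used.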

Figure 2 shows the SC impedances calculated for the elliptical and rectangular vacuum chambers using the TF-PINN with and without TL. For reference, the corresponding numerical results obtained with the boundary element method (BEM)¹⁵ are also displayed. The results of the TF-PINN with TL agree with those of the TF-PINN without TL and with the BEM results. This shows that the TF-PINN with TL can be successfully applied to impedance modeling for both vacuum chamber geometries considered here.

Figure 2.

Space charge field and impedances for two different vacuum chambers obtained with and without transfer learning in TF-PINN. (a) Elliptical chamber. (b) Rectangular chamber.

Figure 3 shows the number of training iterations in the TF-PINN with and without TL. From the second frequency point onward, the number of iterations at each frequency point is reduced by using TL. The SC fields at two adjacent frequency points are displayed in the insets of Figure 2. As expected, the field distribution obtained at the first frequency point is similar to that at the second frequency point. These results demonstrate that the training process of the TF-PINN can be accelerated with TL.

Figure 3.

Effect of transfer learning in TF-PINN based impedance simulation of a Gaussian charge density in two different vacuum chambers. (a) Elliptical chamber. (b) Rectangular chamber.

Conclusion

The TF-PINN combined with TL has been developed and successfully applied to the computation of the SC impedances of a round Gaussian charge density in an infinitely long PEC vacuum chamber with elliptical and rectangular cross sections. It has been found that TL is useful for accelerating the training process in TF-PINN.

A transversally smooth beam charge was assumed in the present study. Removing the limitation of TF-PINN in modeling discontinuous beam charges is left for future work.

Footnotes

Acknowledgment

The author would like to thank anonymous reviewers for their helpful comments that improved this article.

ORCID iD

Kazuhiro Fujita

Funding

The author(s) received no financial support for the research, authorship, and/or publication of this article.

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

References

1. Raissi, Perdikaris, Karniadakis. Physics-informed neural networks: a deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. J Comput Phys 2019; 378: 686–707.

2. Karniadakis, Kevrekidis, et al. Physics-informed machine learning. Nature Rev Phys 2021; 3: 422–440.

3. Fujita. Physics-informed neural network method for space charge effect in particle accelerators. IEEE Access 2021; 9: 164017–164025.

4. Fujita. Physics-informed neural network method for modelling beam-wall interactions. Electron Lett 2022; 58: 390–392.

5. Fujita. Electromagnetic field computation of multilayer vacuum chambers with physics-informed neural networks. Front Phys 2022; 10: 967645.

6. Fujita. Impedance modeling of accelerator beams with discontinuous charge density using scattered-field physics-informed neural networks. IEICE Electron Expr 2023; 20: 1–6.

7. Weiss, Khoshgoftaar, Wang. A survey of transfer learning. J Big Data 2016; 3: Article 9.

8. Markidis. The old and the new: can physics-informed deep-learning replace traditional linear solvers? Front Big Data 2021; 4: article no. 669097.

9. Zotter, Kheifets. Impedance and Wakes in High-Energy Accelerators. Singapore: World Scientific, 1998.

10. Aggarwal. Neural networks and deep learning: A textbook. Cham, Switzerland: Springer International Publishing AG, 2018.

11. Goodfellow, Bengio, Courville. Deep learning (Adaptive computation and machine learning series). Cambridge, MA: The MIT Press, 2016.

12. Baydin, Pearlmutter, Radul, et al. Automatic differentiation in machine learning: a survey. J Mach Learn Res 2017; 18: 5595–5637.

13. Kochenderfer, Wheeler. Algorithms for Optimization. Cambridge, MA: The MIT Press, 2019.

14. Ramachandran, Zoph. Searching for activation functions. In: Proc. ICLR. 2018.

15. Fujita. Impedance computation of cryogenic vacuum chambers using boundary element method. Phys Rev Accel Beams 2022; 25: 064601.