Abstract
Designing an electromagnetic device, as with many other devices, is an inverse problem: the required performance and some constraints on the inputs are specified, but the solution to the design problem is non-unique. Conventionally, at the start of the design process, information on potential solutions needs to be generated quickly so that a designer can make effective decisions before moving on to detailed performance analysis, but the amount of information that can be obtained from simple analysis tools is limited. Machine learning may be able to assist by increasing the amount of information available at the early stages of the design process. This is not a new concept; it has been considered for several decades but has always been limited by the computational power available. Recent advances in machine learning might allow for the creation of a more effective “sizing” stage of the design process, thus reducing the cost of generating a final design. The goal of this paper is to review some of the work in applying artificial intelligence to the design and analysis of electromagnetic devices and to discuss what might be possible by considering examples, drawn from the past three decades, of the use of machine learning in several tools used in conventional design.
Introduction
Engineering Design is an inverse problem. The issue is that there is a need for a particular performance – for example the torque-speed curve of an electric motor – and the question is “what is the device that can meet this specification or need?”. Inverse problems can be described as two classes – the first is a situation where the structure is given and is capable of producing the outputs required but the inputs needed to generate a particular output are unknown. In this case, the inverse problem becomes a control problem and the inputs are adjusted based on feedback provided by the outputs. The second class is more challenging where the device itself needs to be determined. The goal of this problem is to produce a device which has a performance envelope which can include all the possible output performances required while the inputs are maintained within specified ranges. The traditional approach to solving this problem is to “guess” or “size” a possible solution to provide a starting point and then iterate on the solution by varying the various design parameters to try to meet the performance specifications. This inverse problem has led to the development of advanced analysis methods for predicting the performance of a possible device with approaches which range from simple equivalent circuits to full three-dimensional multi-physics solutions. These are then included in an optimization process. However, this is expensive and considerable effort has been applied to simplify the inverse problem for electromagnetic devices. In particular, over the past three decades, the concepts behind artificial intelligence in the form of expert systems as well as machine learning have often been considered as approaches to accelerating the design process. Throughout the 1980’s and 1990’s there was considerable effort in applying knowledge-based techniques to accelerate finite element analysis and a substantial review of the work up to the end of the 1990’s is provided in [1]. 
The work in [2] and [3] provides examples of these approaches applied specifically to magnetics. In [2], the authors describe an adaptive technique where an expert system is used to refine a finite element mesh based on error estimation. In [3], heuristics are used to guide and advise on the solution of computational electromagnetics problems. While there was some success, these systems were limited by the difficulty of acquiring the knowledge needed to be effective. This is discussed later in this paper. Simultaneously, automated learning systems, i.e. ones which could acquire knowledge from examples, were being developed using neural networks; [4,5] and [6] are typical of this approach. Over the past two decades, there has been considerable work on the optimization component of the design process; [7] provides a recent review of trends in optimization as part of the inverse problem solution and proposes the use of machine learning.
Traditionally, the design process for an electromagnetic device follows a series of steps which are commonly used for the creation of many different types of system. However, the details of these steps are heavily dependent on the available models to describe the device performance and the computational time and effort required to implement them. In the case of an electromagnetic device, it is usually a component in a larger system and thus the design process often starts with a set of requirements which are imposed by the system needs. They may include overall physical dimensions, power and torque outputs over a speed range, minimal efficiencies, electrical supply details (voltages, currents, frequencies, etc.), to name but a few. The goal is to take these requirements and translate them into an electromagnetic device that satisfies the constraints and meets the performance specified while minimizing the cost of achieving the solution.
The starting point is essentially one of identifying the space for a possible solution (based on experience plus sizing) and then exploring that space to identify the structure that satisfies the requirements, i.e. an optimal solution. This process is one of increasing detail in the modelling as shown in Fig. 1. Approximate costs for each analysis type for a single solution in terms of the cpu time needed on a single core on a typical workstation are given. At the start of the design process, algebraic models can provide fast answers to initial questions but the amount of information available for decision making is limited. As the design process moves forwards, the models provide more information but with rapidly increasing costs measured both in engineer and computer time. At some point, the cost benefit of added information may result in the process being terminated and a physical prototype being constructed. Access to information late in the process can result in expensive redesigns.

Fig. 1. The cost of information in the design process.
The key, however, is to implement the design process as effectively as possible. In order to achieve this, the designer relies on both existing knowledge and a series of tools which can answer questions related to the performance of the proposed design. These tools may have a variety of capabilities, but the point is that each has a place in the process. In fact, many models developed decades ago still have a valuable place in the set of tools needed for efficient design. For example, simple algebraic models may allow a fast initial exploration of the design space to identify candidate solutions which might satisfy the requirements. More capable tools might then be used to examine these candidates in more detail and at some point a decision is made to construct a physical prototype. This step is expensive and, ideally, should occur only at the end of the design process. While the design process is, fundamentally, solving an inverse problem, the traditional approach to this is to use a simulation of the device (i.e. a model to predict performance) which allows easy modification of the device parameters and then embed this in an optimization loop. As a result, much of the work in developing “design” tools over the past decades has concentrated on solving the analysis problem, i.e. estimating the performance of a particular design solution, in ever more detail. These advanced tools are expensive computationally and take significant amounts of time to generate answers. While this approach can work well, its problem is that information the designer would like to have in the initial phases of the process is only determined at a late stage. The designer is left to make early decisions which may later turn out to produce an inappropriate design, requiring a return to the start of the process. In effect, the information gain is not where the designer needs it.
The intention of this paper is to review and identify where Artificial Intelligence and Machine Learning technologies might be used to assist in accelerating the design process by providing relevant information earlier and thus removing the bottlenecks in existing systems. It is not the intention to go into great depth through the paper but rather to highlight where existing techniques might enhance the overall process, thus making it more effective and reducing overall costs. Some examples are given based on previous work of the author to try to illustrate the potential impact.
In general, given a specific requirement, it is impossible to implement a design process without any understanding of the relationships between the structure of a device and the output performance. Consequently, an electrical machine (i.e. a device intended to convert electromagnetic energy into mechanical outputs) could not be conceived, let alone designed, until there is some basic understanding of the relationships between its structure, the operating environment, the electric and magnetic fields and the concepts related to magnetic fields and force generation. Once a basic device exists, it is conceivable and possible to implement a design process based solely around the construction of a series of physical prototypes each of which is a modification of the previous ones. However, this is an extremely expensive and time-consuming exercise and, consequently, not viable. The development of virtual models does not actually change the process, it just replaces physical systems with virtual ones and it is usually expected that the virtual models can allow more flexibility in exploring the design space as well as returning results in a shorter time span.
The required performance, i.e. the rationale for the design process, is usually specified through a set of Key Performance Indicators (KPI’s) such as a torque-speed requirement, a minimal efficiency, etc. The development of simulation tools, i.e. the tools for constructing virtual devices, is driven directly by the need to be able to determine the actual KPI’s given the prototype (virtual or real) structure. In general, since the 19th century, tools for the performance estimation of electrical machines have been generated using whatever could be extracted from the basic understanding of fields and the computational capabilities that were available. Thus, there exist a large number of algebraic models of electrical machines – many specific to a particular class of electrical machine, e.g. induction machines, dc machines, ac machines, etc. These models are, in fact, both fast to execute and can deliver accurate performance predictions and have been developed and tuned over decades of design. So why are advanced tools, such as those based on field analysis, needed? The problem is that while these “traditional” tools can produce predictions of the terminal performance of an electrical machine, they cannot, in general, provide local information.
For example, a local concentration of losses may not be visible in the performance at the terminals of a device but it could have a significant impact on the thermal performance and could ultimately lead to failure. Additionally, the field distribution can aid a designer in understanding where material can be removed without impacting the performance, thus reducing overall costs. Such choices can, though, result in increasing the loading in other parts of the system (for example, the thermal performance) leading to early failures of a device. In summary, while field solutions are seen as necessary, they are most useful when it comes to understanding local effects in an electrical machine and these effects can provide the links to other areas of physics such as structural, acoustic, fluids, etc. This has assumed that relatively simple models exist for all classes of machine. This may not be true. For novel topologies, algebraic models may not exist or may be expensive to develop and the field solution is general – it applies to all structures. However, the computational cost of acquiring this information can be high both in terms of the processors and memory required as well as the time taken to achieve a result. When the overall design process is considered, the ideal simulation tool would be one which can provide an effective estimate of the KPI’s, including those related to areas of physics other than electromagnetics, in the early stages of the process and at a low computational cost, i.e. attempting to reach the lower right hand corner of Fig. 1.
The main problem with developing a field solution is that it is, in fact, the computational equivalent of constructing a prototype physical device, energizing it and measuring the performance (in fact, the physical device could be considered to be an analogue computer capable of solving Maxwell’s equations in real time). This is an “analysis” process and can only be used by a designer once an initial design exists. It can then be used to explore the design space to optimize or improve the design by running digital experiments varying various design parameters to gauge their impact on the performance. However, the question remains as to how the initial design is created. This process is heavily dependent, at present, on the existence of a design engineer who can make initial choices based on experience and a knowledge of the capability of the company which will manufacture the system. In general, there have been “rules” related to the structure of an electrical machine which have enabled designers to begin the process. Examples include the rationale for the number of poles to be used (related to the design speed); the number of slots used per pole (related to the back emf being generated and the winding structure); the number of rotor slots (related to the level of tooth ripple torque that can be generated); etc. In an environment where the operating frequency of a device and the voltage are more or less fixed, and the device runs at a single speed, the rules could be stated relatively clearly. The initial design process is often referred to as “sizing” and rules based on an understanding of the magnetic operation of an electrical machine can be used to generate algebraic models which, in turn, can be used to estimate the initial structure of a possible solution. References [8–11] and [12] are typical of the development of sizing approaches used by designers.
While these approaches to developing analytical models provide a way of starting the design process, the rules used may not be a good guide in a modern environment where the voltage and frequency are now controllable and the device might operate over a range of speeds and torques. However, the idea that initial design could be implemented based on a set of rules resulted in efforts to use artificial intelligence approaches in the initial design almost in parallel with the development of simulation models. This work can be traced back to the 1950’s where initial attempts were made to implement what might now be referred to as “decision trees” but, at the time, were seen as implementations of standard design sheets which included choices based on decisions that had already been made [13,14]. The approaches used by Veinott and his colleagues were very similar to many that are being considered today but were limited by the computational power available at the time. However, they did enable significant advances in the design process of electrical machines and were used successfully in several companies for many years for both design synthesis and analysis.
While recent developments in the fields of Artificial Intelligence and Machine Learning have raised the visibility of alternate approaches to generating the information needed for a design both by providing “surrogates” for some of the existing tools and, possibly, accelerating other existing tools, some of the published work in this area can be traced back over the last four decades. The remainder of this paper will consider the basis behind these technologies and illustrate where they might be used to address the needs of the design process.
Rule-based artificial intelligence
Rule-based systems are developed around a formal logic system which allows the “correctness” of a piece of information to be verified. Through an inferencing process, new facts related to a particular problem can be generated based on what is currently known and the existing and applicable set of rules. The “rules” are a representation of the “knowledge” in the system and operate as a series of “situation-action” pairs, i.e. given a particular set of facts (by definition “true”) the rules allow new facts to be determined. Additionally, as rules are added to the system, the implications of the rules may result in the generation of new rules. A typical architecture for a system of this sort is given in Fig. 2. Given a set of facts and a set of rules, the system can deduce new information and can, in theory, explain how it generated this. The generation of information continues through a process of “forward” or “backward” chaining until there are no rules left in the knowledge base that could be used on the problem. At that point, if a solution cannot be found, then the system will fail and should request more information from the designer [15–17].
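The forward-chaining process described above can be illustrated with a minimal sketch. The rules and fact names below are hypothetical and were invented purely for illustration; they are not taken from any published design system. The function applies every applicable rule until no new fact can be deduced and keeps a trace so that each deduction can be explained.

```python
from typing import FrozenSet, List, Set, Tuple

# A rule is a "situation-action" pair: if every premise is a known fact,
# the conclusion becomes a new fact.
Rule = Tuple[FrozenSet[str], str]


def forward_chain(facts: Set[str], rules: List[Rule]) -> Tuple[Set[str], List[str]]:
    """Apply rules until none can add a new fact.

    Returns the final fact set and a trace explaining each deduction.
    """
    known = set(facts)
    trace: List[str] = []
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in known and premises <= known:
                known.add(conclusion)
                trace.append(f"{sorted(premises)} -> {conclusion}")
                changed = True
    return known, trace


# Hypothetical rules, invented for illustration only.
RULES: List[Rule] = [
    (frozenset({"variable_speed", "high_efficiency"}), "consider_pm_machine"),
    (frozenset({"consider_pm_machine", "wide_speed_range"}), "consider_interior_pm"),
    (frozenset({"fixed_speed", "low_cost"}), "consider_induction_machine"),
]

if __name__ == "__main__":
    requirements = {"variable_speed", "high_efficiency", "wide_speed_range"}
    final_facts, explanation = forward_chain(requirements, RULES)
    for step in explanation:
        print(step)
```

When the loop ends without reaching a useful conclusion, a real system would have to ask the designer for more facts; that interaction is not modelled in this sketch.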
The issues with a rule-based approach to design are two-fold. First, the knowledge needs to be generated by an expert in the field. Second, the number of rules needed to create an effective design can be extremely large and can result in contradictions occurring in the rule-base. However, if the basic knowledge involved in the early process of relating the requirements to a particular structure can be implemented, it is possible that such a system could provide the first step in the design of a device. Such an approach is often referred to as an “Expert System”. While an interesting path to explore, these systems appear to have been left behind as a result of recent developments in machine learning and a lack of experts to provide the knowledge base.

Fig. 2. A typical expert system architecture.
Machine Learning is an approach to try to overcome the issue with acquiring knowledge in an “expert system” although its goals and capabilities are somewhat different. Whereas the Rule-Based approach attempts to simulate the human reasoning system with a logical reaction to a set of known facts, based on physics and experience, coupled with an ability to explain “why”, the machine learning system uses a very large and heavily interconnected set of simple computational elements, “neurons”. Each neuron implements the same function but the outputs are determined by the inputs which, in turn, are a weighted set of the values being presented to the neuron. These values may come from the input set or from a previous layer of neurons. The weights determine the performance of the network in terms of the output response to a particular input vector.
Such a system can “learn” the desired responses by being shown data sets consisting of a series of input vectors and the desired response to each one. This is sometimes referred to as “labelled data”, i.e. the output provides a label that “classifies” the input. Learning is a process of determining the weights of the connections into each neuron and this is, effectively, an optimization process and is often implemented through the use of techniques such as gradient descent. For large networks, the number of weights to be determined can be of the order of millions or more and thus determining them can require a large number of training examples – often in the thousands or tens of thousands. Such networks, once trained, are capable of two possible tasks. The first is modelling a high dimensional response surface – so an example might be the torque output of a motor given its geometry and excitation conditions. The second is a classification, i.e. recognizing an input pattern and matching it to a known class of objects. In this case, an example might be to suggest to which class of motor a particular set of requirements belongs, i.e. generating an initial starting point for the design process. The process described here is often referred to as “supervised learning”, i.e. the output vector is defined.
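As a minimal sketch of supervised learning, the example below trains a single neuron on a small labelled data set. The sigmoid activation, the squared-error measure, the learning rate and the toy data (a logical AND) are all assumptions chosen for illustration, and the weights are updated after every example (stochastic gradient descent) rather than over the whole set at once.

```python
import math
from typing import List, Sequence, Tuple


def neuron(weights: Sequence[float], bias: float, inputs: Sequence[float]) -> float:
    """Weighted sum of the inputs passed through a sigmoid activation."""
    s = bias + sum(w * x for w, x in zip(weights, inputs))
    return 1.0 / (1.0 + math.exp(-s))


def train(
    data: List[Tuple[Sequence[float], float]], epochs: int = 5000, lr: float = 0.5
) -> Tuple[List[float], float]:
    """Determine the weights by gradient descent on the squared error."""
    weights = [0.0] * len(data[0][0])
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in data:
            out = neuron(weights, bias, inputs)
            # Derivative of 0.5 * (out - target)^2 with respect to the weighted sum.
            delta = (out - target) * out * (1.0 - out)
            weights = [w - lr * delta * x for w, x in zip(weights, inputs)]
            bias -= lr * delta
    return weights, bias


# Labelled data: each input vector is paired with its desired response.
DATA = [((0.0, 0.0), 0.0), ((0.0, 1.0), 0.0), ((1.0, 0.0), 0.0), ((1.0, 1.0), 1.0)]

if __name__ == "__main__":
    w, b = train(DATA)
    for x, label in DATA:
        print(x, label, round(neuron(w, b, x), 3))
```

A practical network contains many such neurons arranged in layers, but the learning step is the same in principle: the weights are adjusted to reduce the mismatch between the output and the label.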
An alternate approach to learning is to create systems which are unsupervised, i.e. the network is expected to learn on its own. To achieve this, two networks may be connected together with one being the mirror image of the other. The first network generates an output vector from a set of inputs and the second takes the output vector of the first and attempts to regenerate the original input vector. The intermediate output vector is often referred to as the “latent” vector. The outputs of the second network can be compared with the inputs to the first in order to train the system. This structure is known as an “auto-encoder” and the second network usually uses the same set of weights as the first.
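The auto-encoder structure can be sketched with a deliberately small linear example in which the second network reuses the weights of the first (tied weights). The linear neurons, the data set and the learning rate are assumptions made to keep the example short; practical auto-encoders use non-linear layers and far larger inputs.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]  # rows = latent units, columns = input components


def encode(w: Matrix, x: Vector) -> Vector:
    """First network: input vector -> latent vector, z = W x."""
    return [sum(wij * xj for wij, xj in zip(row, x)) for row in w]


def decode(w: Matrix, z: Vector) -> Vector:
    """Second (mirror) network reusing the same weights: x' = W^T z."""
    return [sum(w[i][j] * z[i] for i in range(len(w))) for j in range(len(w[0]))]


def loss(w: Matrix, data: List[Vector]) -> float:
    """Summed squared reconstruction error over the data set."""
    total = 0.0
    for x in data:
        x_hat = decode(w, encode(w, x))
        total += 0.5 * sum((a - b) ** 2 for a, b in zip(x_hat, x))
    return total


def train(w: Matrix, data: List[Vector], steps: int = 500, lr: float = 0.05) -> Matrix:
    """Batch gradient descent; no labels are used, only the inputs themselves."""
    for _ in range(steps):
        grad = [[0.0] * len(w[0]) for _ in w]
        for x in data:
            z = encode(w, x)
            r = [a - b for a, b in zip(decode(w, z), x)]  # reconstruction residual
            w_r = encode(w, r)
            for i in range(len(w)):
                for j in range(len(w[0])):
                    # Gradient for tied weights: z_i * r_j + (W r)_i * x_j
                    grad[i][j] += z[i] * r[j] + w_r[i] * x[j]
        w = [[wij - lr * gij for wij, gij in zip(w_row, g_row)]
             for w_row, g_row in zip(w, grad)]
    return w


# Two-component inputs that all lie along one direction, so a single latent
# value is enough to regenerate them.
DATA = [[t * 0.6, t * 0.8] for t in (-1.0, -0.5, 0.5, 1.0)]
W0 = [[0.3, 0.1]]

if __name__ == "__main__":
    w_trained = train(W0, DATA)
    print("loss before:", loss(W0, DATA), "after:", loss(w_trained, DATA))
    print("latent vector for first input:", encode(w_trained, DATA[0]))
```

The training signal is simply the difference between the regenerated vector and the original input, which is why no labelled outputs are required.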
Variations on these structures depend on the functions implemented in the neuron and the interconnections. For example, if every neuron in one layer is connected to every neuron in the previous layer, the network is said to be Fully interconnected and Feedforward (FFN). If the neurons in one layer are connected to a subset of neurons in the previous layer, functions such as convolutions can be created leading to a Convolutional Neural Network (CNN), which is usually used for pattern recognition or for reducing the size of an input vector to its significant components. If a neuron has feedback included, then the output is dependent on the current input and the last output generated, i.e. it contains a memory system. This is usually referred to as a Recurrent Neural Network (RNN). Additionally, the function of the neuron can be based on a Radial Basis Function, rather than just a pure sum, leading to an RBF network. While these architectures may be interchangeable in terms of addressing a particular problem, each will perform better if the characteristics of the problem match those of the network.
However, the actual architecture of the network needed for an accurate representation of the problem characteristics is often a matter of experimentation. The goal is to use as few neurons as possible in each layer and to minimize the number of layers. Each layer adds to the execution time of the network as data has to propagate through it.
Machine learning in the design process
Given the above descriptions of the capabilities of machine learning and the needs of the design process, there are several areas, both in generating a design and in accelerating the computational processes, where machine learning might help achieve the goal of minimizing cost and overall design time. Four examples are given here. The first is the generation of a surrogate model that can predict the torque and efficiency performance of an electrical machine based on its structure and the input conditions; the second relates to speeding up the mesh generation process for a finite element based modelling system; the third uses neural networks to model hysteretic phenomena in magnetic materials and the fourth applies reinforcement learning for the topological optimization of a simple actuator system.
Surrogate model for torque and efficiency
Torque and Efficiency are two of the critical KPI’s for most machine requirements. They are also complex to compute needing several finite element solutions which, for some machine types, can be time-consuming. This is an ideal candidate for a surrogate model based on a neural network. One process for computing the efficiency map of an electrical machine involves several steps. The first is to identify the direct and quadrature axis flux linkage maps, as the flux linkage of the machine at a particular position is needed by the control system. This is usually done through a set of finite element analyses. The second stage connects the drive outputs to the flux linkage to set up the operating condition for the electrical machine and the torque can then be computed, usually through another finite element simulation. The use of finite element simulations can be seen as deriving points on a response surface for a machine relating the geometric and operational parameters to the performance parameter, e.g. the flux linkage, the torque or the efficiency. However, the finite element simulations are expensive and the result is that computing an efficiency map for an electrical machine can take significant time and thus this process is not feasible for the early stages of the design process when a fast estimate is required.
In this situation, neural network surrogates to replace the finite element simulations could result in efficiency estimates being available in minimal time early in the design process thus reducing the overall time needed to achieve a solution. The process is described in [18,19] but in brief, a large data set of machine performances was generated using a parameterised version of a basic IPM (interior permanent magnet) machine to predict the flux linkages. The inputs to the system were the geometry of the device and the current value and advance angle. The outputs were the d and q flux linkages. This data set was used to train two networks, a standard FFN (for the operational conditions) and a CNN for the geometry, which was described through an image, and the outputs were combined in a third network to generate the flux linkages. The flux linkage information is an input to the control system and is combined with the speed and torque requirements to generate the excitation conditions for the machine. Again, using a parameter scan of the operational envelope of the machine, a data set was constructed to predict the efficiency and torque for any operating condition. Because the efficiency varies smoothly over the operational space of the machine, a recurrent neural network was used for the final stage so that the information could be spatially related. A dataset of 3000 examples was used to train and validate the network. The resulting efficiency maps from the finite element approach and the neural network surrogate are shown in Fig. 3 and the error is less than 1.5% over the range. However, the neural network approach can execute up to 500 times faster than the finite element simulations and thus it can be applied right at the start of the design process allowing more informed decisions about potential designs to be made early, reducing the overall costs of the process.

Fig. 3. (a) The generation of DQ flux linkages, (b) the generation of efficiency maps, (c) the efficiency map generated by a finite element based system, (d) the neural network prediction of the efficiency map.
An appropriate finite element mesh depends on several issues. The first is the geometry, i.e. the boundaries of the various regions in the problem; the second is the material property interfaces where sharp corners in boundaries can cause field singularities; and the third is the field distribution itself. While the latter is unknown, if an estimate can be made of the field and how it might vary over the solution space, then that information can be used to determine the mesh density needed. Conventional mesh generators mostly rely on only the geometric information to create initial meshes and the users are often provided with toolsets that allow them some control over the mesh structure. However, in effect, the user is controlling the mesh based on what the field distribution is likely to be – i.e. information is being added based on the experience of the user. Typically, estimating the effect on the field of a sharp corner or predicting the distribution of eddy currents will allow a user to refine a mesh appropriately. A machine learning system that can add the solution information to the meshing process can considerably improve the user experience and the accuracy of a finite element solution while, at the same time, significantly reducing the number of steps needed if an adaptive solver is implemented. While a simple FFN can be used for this problem, the issue is to minimize the input vector and thus the size of the network needed. Again, this work is described in detail in [20,21] so only a brief review is given here.
The choice of an input vector for meshing an electromagnetic field problem has several requirements. The first is that the data should be rotationally invariant – the field does not depend on the orientation of the problem. The point at which the field is being measured or, in terms of the goal of mesh generation, the density of the mesh at the point, should depend on the position of the point relative to features in the problem which could influence the field structure, e.g. the nearest current, the nearest magnetic material, the nearest corner, etc. This results in an input vector of the form shown in Fig. 4 [20,21].

Fig. 4. Parameterization for the network.
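The rotational invariance requirement for the mesh-density input vector can be illustrated with a short sketch. The three feature types and the use of plain nearest distances are assumptions made for illustration; the actual parameterization is the one described in [20,21]. Because the vector contains only distances from the query point to nearby features, rotating the whole problem leaves it unchanged.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def nearest_distance(p: Point, features: List[Point]) -> float:
    """Distance from p to the closest point of one feature type."""
    return min(math.dist(p, f) for f in features)


def input_vector(p: Point, features: Dict[str, List[Point]]) -> List[float]:
    """Describe p only by its distances to nearby features (fixed key order)."""
    return [nearest_distance(p, features[name]) for name in sorted(features)]


def rotate(p: Point, angle: float) -> Point:
    """Rotate a point about the origin by the given angle in radians."""
    c, s = math.cos(angle), math.sin(angle)
    return (c * p[0] - s * p[1], s * p[0] + c * p[1])


# Hypothetical feature locations, for illustration only.
FEATURES: Dict[str, List[Point]] = {
    "corner": [(1.0, 1.0), (1.0, -1.0)],
    "current": [(0.0, 2.0)],
    "magnetic_material": [(2.0, 0.0), (3.0, 0.5)],
}

if __name__ == "__main__":
    query = (0.5, 0.25)
    angle = 0.7
    rotated = {k: [rotate(f, angle) for f in v] for k, v in FEATURES.items()}
    print(input_vector(query, FEATURES))
    print(input_vector(rotate(query, angle), rotated))  # same values up to rounding
```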
To generate a training set for the network, a finite element solution was generated with a problem which exhibited many of the features found in a typical magnetic device and the mesh was refined and the problem solved until there was no significant change in the local fields when the mesh was further refined. The resulting element sizes at a large number of points across the solution domain were recorded and used, along with the input vectors at each point, to train the network. The final network architecture consisted of an input layer having 9 inputs, and 2 hidden layers having 24 and 18 neurons respectively. The output was the proposed element size at the given point. Typical results from the trained network are shown in Fig. 5. As can be seen, the basic mesh generated by the network has clearly taken the potential field structure into account by the way it has increased the mesh density in areas where the change in magnetic field would be very rapid.
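To give a sense of how small this network is, the sketch below builds a fully connected network with the stated layer sizes (9 inputs, hidden layers of 24 and 18 neurons, one output) and evaluates it once. The tanh hidden activation, the linear output, the bias terms and the random untrained weights are assumptions; the text specifies only the layer sizes.

```python
import math
import random
from typing import List, Tuple

Layer = Tuple[List[List[float]], List[float]]  # (weight matrix, bias vector)

LAYER_SIZES = [9, 24, 18, 1]  # layer sizes as stated in the text


def init_network(sizes: List[int], seed: int = 0) -> List[Layer]:
    """Create untrained layers with small random weights and zero biases."""
    rng = random.Random(seed)
    return [
        ([[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)],
         [0.0] * n_out)
        for n_in, n_out in zip(sizes[:-1], sizes[1:])
    ]


def count_parameters(network: List[Layer]) -> int:
    """Number of weights and biases that training has to determine."""
    return sum(len(w) * len(w[0]) + len(b) for w, b in network)


def forward(network: List[Layer], x: List[float]) -> float:
    """Propagate an input vector through every layer and return the single output."""
    for k, (w, b) in enumerate(network):
        s = [bi + sum(wij * xj for wij, xj in zip(row, x)) for row, bi in zip(w, b)]
        is_output = k == len(network) - 1
        x = s if is_output else [math.tanh(v) for v in s]  # assumed activations
    return x[0]


if __name__ == "__main__":
    net = init_network(LAYER_SIZES)
    print("parameters:", count_parameters(net))
    print("untrained output:", forward(net, [0.1] * 9))
```

Under these assumptions the network has a few hundred parameters, far fewer than the millions mentioned earlier for large networks, which is consistent with the stated goal of keeping the input vector and the network small.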

As a second test on the network, meshes were created when the core was constructed from magnetic material and when it was built of air (i.e. did not exist) and the results, shown in Fig. 6, clearly demonstrate that the system is generating meshes based on an “understanding” of the probable field behaviour. The impact of a permeable structure can be seen in the mesh generated in the excitation coil (purple) and in the narrow part of the airgap.

Fig. 6. Comparison of meshes generated with different core materials.
When used with an adaptive solver, this approach to mesh generation can cut the number of steps needed to obtain an accurate solution by a factor of two, thus significantly speeding up the solution process without needing any user input to guide the mesh generator.
While the dominant loss mechanism in most electromagnetic devices is the resistive loss in the windings, the iron loss in the stator and rotor can be of importance and needs to be modelled. If it is large in small areas of the device, while it may not be a major contributor to the global loss, it can cause significant thermal problems which, if not considered in the design process can lead to the failure of the device. Magnetic materials have losses due to the fact that the magnetization process requires that energy be transferred to the material and then, as the field decreases, some of that energy can be recovered but some is lost due to the work involved in moving magnetic domains. If this is not considered, there will be errors (albeit relatively small) in the field distributions and in the efficiency calculations. Because of the loss involved, the trajectory of the magnetic flux densities (B) and fields (H) in the B-H plane forms a loop. In fact, there is a series of nested loops depending on the maximum field that was reached in any particular cycle. This property is referred to as “hysteresis” and the actual structure of the loops depends on the chemical and crystalline structure of the magnetic material. The modelling of the physical process is particularly complex so many models are “phenomenological”, i.e. they do not attempt to implement the actual physics of the magnetization process. In solving a magnetic field problem involving this material property, it is necessary to know where on the loop the particular piece of material was the last time the field was computed before a new operational point can be found. In a finite element code, this calculation has to be performed for every element which contains a magnetic material – and there may be thousands or hundreds of thousands of these. Thus a fast model is needed if this effect is to be included. 
Again, the hysteretic performance can be considered as a surface in the B-H plane and, as such, should be representable by a neural network, with the one major issue that the system has to have memory of the last operating point. This problem has been explored using neural networks over the past 25 years [22–25]. Work on material modelling using machine learning has been reported by several authors, of which [26] is typical. The example here is a recent implementation using both an FFN and an RNN to compare performance [27,28].
While it is preferable to work with measured hysteresis data on a particular material, the experiments were performed using data generated from an implementation of the Preisach model [29,30], shown in Fig. 7. In the case of hysteresis, the input vector contains the last two B-H pairs and the next value of H. The output of the network should be the next value of B. The last two pairs are used to resolve the field reversal issue. The result from an FFN is shown in Fig. 8. Similar results were obtained using an RNN but its execution time was almost 50 times that of the FFN. This can largely be attributed to the extra weights that are needed to implement the memory system inside the RNN. Again, the goal here is to reduce the size of the network as much as possible to shorten the execution time, since this can massively affect the time taken by the finite element solver.

Data generated from the Preisach model.

Neural network prediction compared with Preisach model.
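The input/output arrangement described for the hysteresis network can be sketched as follows. This is only an illustration under stated assumptions: the function names (`play_hysteresis`, `make_samples`) are hypothetical, and the data come from a single play (backlash) operator followed by a saturation curve, which is a much simpler stand-in for the Preisach model used in the experiments. The point is only to show how each training sample is assembled from the last two B-H pairs and the next value of H.

```python
import numpy as np

def play_hysteresis(h, width=0.5):
    """Toy hysteretic B(H): one play (backlash) operator followed by tanh.

    Assumption: this is a stand-in for Preisach-generated data,
    not an implementation of the Preisach model itself.
    """
    b = np.empty_like(h)
    state = 0.0
    for k, hk in enumerate(h):
        # The internal state only moves when H leaves the dead band around it,
        # which is what gives the model memory of the previous operating point.
        state = min(max(state, hk - width), hk + width)
        b[k] = np.tanh(state)
    return b

def make_samples(h, b):
    """Inputs: (H[k-2], B[k-2], H[k-1], B[k-1], H[k]); target: B[k]."""
    x = np.column_stack([h[:-2], b[:-2], h[1:-1], b[1:-1], h[2:]])
    y = b[2:]
    return x, y

# Two cycles of a sinusoidal excitation (arbitrary units).
t = np.linspace(0.0, 4.0 * np.pi, 400)
h = 2.0 * np.sin(t)
b = play_hysteresis(h)
x, y = make_samples(h, b)
```

Each row of `x` can then be passed to any regression network. Because the two previous pairs show whether H was rising or falling, the network can tell the ascending branch from the descending one without keeping an internal state.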
The topological design of an electromagnetic device to achieve a particular set of objectives has been the subject of considerable research interest over the past three decades. The rationale is to improve the performance of a device not merely by changing the shape of a component but by actually adding and removing magnetic structures, including introducing holes in a structure, to guide the magnetic field more effectively. Much of the work has considered conventional optimization systems coupled with a system capable of switching material properties in particular regions of a device [31,32]. This can be improved by using sensitivity analysis coupled to the optimizer but it is, inherently, a computationally expensive process [33]. To reduce the computational cost involved, the performance of the topologically modified structure can be predicted by using a deep learning system [34]. Such a system can remove the need for detailed finite element analyses during the optimization process. Additionally, transfer learning can be used to reduce the size of the training sets needed for novel structures [35]. However, one of the problems with “classical” topology optimization is that it can generate structures which are non-manufacturable due to the large number of holes (air regions) which may be introduced to guide the magnetic flux. As a result, topological design often needs a post-processing operation to edit the design into something which can actually be made. An alternative possibility is to use a neural network which has been trained to recognize structures which can achieve the desired objectives [36]. One approach to doing this is to create a system which can “grow” a magnetic structure by using a basic set of moves (up, down, left, right) from its present position, leaving a trail of magnetic material behind it as it moves through the space available for the device.
The decision on which move to make is based on a policy which determines which move is most likely to improve the objective. By applying reinforcement learning, the system is rewarded if it makes a move which results in an improvement in the objective function or it is penalized if a move does not improve the device performance – this modifies the policy. Since this is a continuous trail, the problem with introducing many air spaces can be mostly removed.
The system includes a learning agent, Fig. 9, which consists of “actor” and “critic” neural networks. The environment is the problem space. The output from the learning agent is a move to make in the environment, i.e. to place a piece of magnetic material or not. The input from the environment is an evaluation of whether the move improved the overall objective, i.e. if placing a piece of magnetic material has increased the force on the armature. The critic evaluates the result and updates the policy being used by the actor, i.e. the direction to move in. The “learning element” is responsible for improving both the policy and the critic. The goal of the system is to gain sufficient information about the environment to implement an optimal policy. The policy is basically the probability of taking a particular action (a_i) in a given state (s). The update to the policy is implemented based on the gradient of the improved actions. The system is described in [37,38].

Learning agent structure which interacts with the environment.
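The interaction between actor, critic and environment can be illustrated with a deliberately small sketch. Everything in it is an assumption made for illustration: the actor and critic are lookup tables rather than neural networks, the state is just the controller position, and `objective` is a hypothetical surrogate (the number of material cells in the right-hand column) standing in for a field solution. It is not the system of [37,38]; it only shows a one-step actor-critic update driving a controller that leaves a trail of material.

```python
import numpy as np

MOVES = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right

def objective(material):
    # Hypothetical surrogate objective: material cells in the right-hand column.
    return float(material[:, -1].sum())

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def run_episode(theta, value, rng, n=5, steps=12, gamma=0.95,
                lr_actor=0.1, lr_critic=0.2):
    """One pass of the controller; theta (actor) and value (critic) are updated in place."""
    material = np.zeros((n, n), dtype=bool)
    r, c = 0, 0                      # controller starts in the top-left cell
    material[r, c] = True
    for _ in range(steps):
        s = r * n + c
        probs = softmax(theta[s])    # policy: probability of each move in state s
        a = rng.choice(4, p=probs)
        dr, dc = MOVES[a]
        nr = min(max(r + dr, 0), n - 1)   # moves off the grid are clipped
        nc = min(max(c + dc, 0), n - 1)
        before = objective(material)
        material[nr, nc] = True      # leave a trail of magnetic material
        # Reward an improvement; a small step cost penalizes moves that do not help.
        reward = objective(material) - before - 0.01
        s_next = nr * n + nc
        # Critic: temporal-difference error and value update.
        delta = reward + gamma * value[s_next] - value[s]
        value[s] += lr_critic * delta
        # Actor: gradient of log softmax policy, scaled by the critic's TD error.
        grad_log = -probs
        grad_log[a] += 1.0
        theta[s] += lr_actor * delta * grad_log
        r, c = nr, nc
    return material, objective(material)

rng = np.random.default_rng(0)
n = 5
theta = np.zeros((n * n, 4))         # actor preferences per (state, move)
value = np.zeros(n * n)              # critic estimate per state
scores = [run_episode(theta, value, rng, n=n)[1] for _ in range(500)]
material, final_score = run_episode(theta, value, rng, n=n)
```

Because the material is only ever placed where the controller moves, every structure produced is a single connected trail, which is the property that limits the number of isolated air regions.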
The approach was implemented to design an actuator structure [39]. The objective was to maximize the force on the armature while a computational constraint of 500 episodes (iterations) was imposed, and convergence was assumed to be achieved if at least 20% of the episodes were within 5% of the optimal value. The system was given the excitation coil layout but then needed to determine where magnetic material should be placed. Figure 10 shows the initial state of the design space. The coil and armature positions are defined and the design space for the core is indicated by the set of squares. The initial position for the controller is indicated by the 3 by 3 grey area in the top left of the domain. The objective is to increase the force on the armature but this depends on the flux density in the airgap, so the average value of the flux density through the armature is used as a surrogate for the force. The design proceeds by moving the controller, and several passes through the problem are allowed, each one improving the core design if it can. Figure 11 shows the state at the end of the first pass. The learning process allows the system to optimize the moves that it makes, thus speeding up the process of finding the optimum core topology. The solution provided by the reinforcement learning system is compared with previously published results for the same problem [39,40] in Table 1 and demonstrates the effectiveness of the system.
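The stopping rule quoted above can be written down in a few lines. The sketch below is one possible reading, not the implementation used in [39]: it assumes that the “optimal value” is the best objective value observed so far, that the objective is positive and being maximized, and the names `has_converged` and `MAX_EPISODES` are hypothetical.

```python
MAX_EPISODES = 500  # computational budget quoted in the text

def has_converged(scores, fraction=0.20, tolerance=0.05):
    """True if at least `fraction` of the episodes lie within `tolerance` of the best.

    Assumption: "optimal value" is read as the best score seen so far.
    """
    best = max(scores)
    if best <= 0:
        return False
    near = sum(1 for s in scores if s >= (1.0 - tolerance) * best)
    return near >= fraction * len(scores)

# 11 of 50 episodes (22%) are within 5% of the best value of 1.0.
example = [1.0] * 5 + [0.96] * 6 + [0.5] * 39
converged = has_converged(example)
```

In a training loop, this check would be evaluated after each episode and the run stopped either when it returns true or when `MAX_EPISODES` is reached.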

Starting state for the actuator design.

State of the design at the end of the first pass.
Results of the optimization process for the armature force
The previous section has described areas within the design and analysis process where Machine Learning might be applied. The examples given have, in general, demonstrated that such an approach can be applied successfully. However, the major computational cost of the design process is in the solution of the field equations, i.e. in predicting the field distribution within the device to an accuracy which is appropriate for predicting the KPIs and enabling multi-physics simulations. Here, again, much work has been done using several different neural network architectures. In [41] the authors describe the use of neural networks for the modelling of a high frequency wave problem. The ultimate goal in this case is to provide a real time estimate of the field behaviour by implementing the network in hardware. This is an interesting concept in itself and it illustrates the speed gain which might be attainable through a machine learning system. Rather than predicting the field through a neural network, there has also been some consideration of using a neural network as a replacement for finite elements in the analysis system. In such an approach, the locally connected neurons provide a method for minimizing the energy in the system [42]. Again, the target was to try to generate real time solutions for an electromagnetic field problem.
More recently, the interest in generating predictions of the magnetic field through the use of neural networks has focussed on structures such as CNNs and auto-encoders [43]. The approach can be shown to work but is limited in that it uses images of field solutions as the training data for the network; these have a fixed resolution and may not be able to capture all the details of a real magnetic device. However, the application shows the potential for fast field prediction, something that would provide significant benefits early in the design process in terms of the links to multi-physics analysis. Such an approach uses a conventional loss function for training, i.e. it minimizes the error between the input and output images. A refinement of this method is to include physics knowledge in the network training process. By including the underlying partial differential equations of the physics in the loss function, the number of training examples needed for a network can be significantly reduced. Such an approach is described as a “Physics Informed Neural Network” or PINN and is, in some sense, a development of the concepts in [32]. The approach has been described in [44] and [45].
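As an illustration of what including the physics in the loss function can look like, consider a two-dimensional magnetostatic problem in which the network output $A_\theta$ approximates the magnetic vector potential. The formulation below is a generic sketch and not necessarily the one used in [44] or [45]; the symbols $\nu$ (reluctivity), $J$ (source current density), $\lambda$ (weighting factor), and the sample counts $N_d$ and $N_c$ are assumptions introduced here.

```latex
\mathcal{L}(\theta) =
  \underbrace{\frac{1}{N_d}\sum_{i=1}^{N_d}
     \bigl| A_\theta(\mathbf{x}_i) - A_i \bigr|^2}_{\text{data term}}
  \;+\;
  \lambda\,
  \underbrace{\frac{1}{N_c}\sum_{j=1}^{N_c}
     \bigl| \nabla\cdot\bigl(\nu\,\nabla A_\theta\bigr)(\mathbf{x}_j)
            + J(\mathbf{x}_j) \bigr|^2}_{\text{physics residual}}
```

The first term is the conventional error against the $N_d$ available field solutions $A_i$. The second term penalizes violation of the governing equation $\nabla\cdot(\nu\,\nabla A) = -J$ at $N_c$ collocation points, which need no precomputed solution; this is why the number of training examples can be reduced.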
Finally, developments of these machine learning approaches can also contribute to solving the inverse problem involving multi-physics interactions, such as in the TEAM Workshop Problem 36 where a magnetic and thermal problem are coupled and the goal is to determine the design of the coil structures needed to heat a body. A potential solution to the problem is illustrated in [46].
Conclusions
Artificial Intelligence and Machine Learning are not new technologies. Over time, the developments in computational capabilities have largely been used to create improved physics simulation systems which allow designers to work with virtual models of their proposed systems in a way which can produce much more information than the real device in a laboratory. These systems are underpinned by mathematics that can prove properties such as convergence, provide the conditions on the correctness of the solution, etc. This is often referred to as “hard computing”, meaning that it is built on a solid foundation. However, such systems, while very powerful, tend to require large computational resources and have significant limitations when used in a design process. As a result, much of the focus on tools for design has moved back towards algebraic models and simplifications which enable high speed performance predictions, allowing for real time interactions between the designer and the design system. However, these fast models cannot provide all the information which would, ideally, be needed early in the process. The technologies associated with Machine Learning are part of a set of possible techniques which can embed knowledge and enable more information to be provided at a lower cost. This paper has discussed where in the design process machine learning can be harnessed and has reviewed some of the relevant work in the area. Additionally, a limited set of examples has been used to highlight areas where Machine Learning can be used effectively. Taken together, these examples indicate that Machine Learning based tools might make it possible both to improve the early stages of design decision making and to accelerate the computational tools, such as the finite element based simulators, so that they become more useful earlier in the process, thus moving the design systems nearer to the lower right hand corner of Fig. 1.
As such it is expected that, over time, with further development, Machine Learning based surrogates will become a significant component of the design process.
