Abstract
Iterative Learning Control (ILC) is an intelligent control algorithm that can effectively handle a tracking error in any system that operates in a repetitive manner. In practice, it is hardly possible to implement a single gain learning control law to improve the tracking performance due to the existence of large transient growth. To prevent the growth, this paper proposes a time-varying learning control design using the unique concept of fuzzy logic control to track the desired trajectory as well as the desired control input signal. The proposed control design is developed on both serial and parallel ILC configurations. The two configurations are initially constructed and implemented on a robotic manipulator with the use of a single gain learning control law. To avoid bad transients, the gain adjustment mechanism based on fuzzy logic control is introduced to vary the learning gain in each time step for enhancing the robustness of the system. According to the simulation and experiment on a robotic manipulator, both ILC structures with the proposed mechanism achieve the desired learning performance without bad transients.
Introduction
Robotic manipulators play an important role in trajectory tracking applications, such as pick and place tasks, mechanical cutting, grinding, deburring, and polishing tasks, including welding and painting. These applications require the end-effector to keep track of the desired trajectory within acceptable limits. However, recent applications require higher levels of precision, leading to a challenge in developing new control algorithms to achieve higher precision.
Numerous tracking control techniques have been proposed to improve tracking performance in tracking applications. Proportional-Integral-Derivative (PID) controllers are the most common. [1] introduces a guideline for tuning PID gains for a one-link robotic manipulator. [2] combines a PID algorithm and electrical stimulation to track the desired limb trajectory in an exoskeleton. Sliding Mode Control (SMC) is another type of control algorithm that has been widely implemented to demonstrate the trajectory tracking performance of robotic manipulators [3–5]. A detection and tracking system, proposed by [6], is developed for a welding robot, and an absolute interpolation method is proposed for the real-time tracking algorithm. Another solution presented by [7] makes use of an Iterative Learning Control (ILC) technique to track a three-dimensional path in the fastening process.
Some other control techniques, such as an observer-based estimator or the least squares method, are demonstrated in [8, 9] to prove their control efficiency. Most of the addressed works provide a stability analysis using the Lyapunov method. Moreover, [10] summarizes an extensive literature survey on trajectory tracking control algorithms used in industrial machines. Meanwhile, [11] provides a comprehensive review of the control strategies for flexible manipulators and joints, presenting the advantages and disadvantages of model-based and non-model-based methods, including PID, Fuzzy Logic, Neural Networks, Repetitive, Sliding Mode, Robust, Adaptive, Optimal, Predictive, and Linear Quadratic Regulator (LQR)/Linear Quadratic Gaussian (LQG) control.
According to [11], ILC is categorized as an intelligent control algorithm that can effectively handle a tracking error. The basic idea of ILC is to improve the performance of a classical feedback control system by reducing the tracking error from one iteration to another. This can be achieved by observing the error at the current iteration to adjust the control input in the subsequent trial.
Based on a review of existing literature, two types of control input signals are used for adjustment: 1) The control input signal outside the feedback control system [12–14]; and 2) The control input signal inside the feedback control system [15–19]. The former creates a cascade configuration, a so-called serial architecture, in which the reference command at the current iteration is reshaped according to the tracking error recorded from the last iteration. The latter constructs a parallel architecture, known as a plug-in configuration, providing access to vary the internal control input generated by the feedback controller. In most commercial manipulators, lack of access to manipulate the internal control input is quite common. As a result, the serial architecture is more conventional in terms of control implementation.
A few existing works [20, 21] explicitly compare the two ILC configurations, including stability analysis. However, none provide a comparison in terms of practical implementation. The study by [20] explores the two configurations and shows error convergence in the frequency domain, pointing out that the monotonic decay conditions of the error differ between the two configurations when using ILC for tracking a reference trajectory. In contrast, [21] concludes that the stability proofs for the two configurations are identical. It should be carefully noted that even though the error propagation from one iteration to the next looks the same for both structures, the system matrix does not refer to the same system dynamics.
To the best of the authors’ knowledge, [22] is the only work providing a comprehensive evaluation of the two structures in terms of implementation. The serial and parallel ILC architectures were constructed and implemented on an Atomic Force Microscopy (AFM) nanopositioner. The comparative results showed that the parallel ILC architecture always outperformed the serial ILC architecture, although it did take longer to converge. In addition, the variation in the learned command input under the serial structure indicated potential numerical sensitivity issues. However, such issues were not apparent in this investigation.
In many published works, one of the two ILC configurations is chosen to demonstrate the effectiveness of ILC without any justification for the choice. Even though both are evidently capable of eliminating the tracking error, none of the existing works has thoroughly studied the pros and cons of the two configurations. This work therefore bridges the gap and provides guidelines for the selection of an appropriate ILC structure, as well as a theoretical analysis of the stability conditions in the time domain in which the transients are fully considered.
The two configurations are originally constructed using a D-type [23], or PD-type learning control law to evaluate their tracking efficiency. However, in practice, a large transient growth can be expected due to a model mismatch or the propagation of noise and nonrepetitive disturbances.
The use of a Q-filter is a common technique for addressing the issue, as discussed in [24, 25]. The study by [24] presents a Linear Time-Varying Q-filter incorporating an ILC algorithm to prevent transient growth. Monotonic convergence conditions, relying on the choice of the Q-filter, are only investigated in numerical examples. An alternative Q-filter configuration in ILC provided by [25] is experimentally applied on a wafer stage. In these works, little guidance is given on choosing the filter bandwidth, even though tracking performance deteriorates unnecessarily when an inappropriate bandwidth is chosen. Some other techniques designed to prevent growth of the error signals include a data-driven ILC approach [26], an ILC gain boundary method [27], and a control input boundary design [28].
Instead of relying on a Q-filter for practical ILC, this work develops a rule-based mechanism to vary the learning control gains for serial and parallel ILC architectures. The time-varying gains are adjusted so that the system can track the desired trajectory as well as the desired control input signal. This avoids amplification of the control input signal while learning from the historical error.
In summary, the main contributions of this work are as follows. Rather than integrating a Q-filter as in many other studies, this study uses fuzzy logic control (FLC) in the gain adjustment mechanism (GAM) to track both the desired learning control input and the desired trajectory. Based on the design criteria, FLC is used as a tool to adjust the learning gains so as to trade off the tracking error against the growth of the learning control signal. To the authors’ knowledge, taking the desired learning control input into account while tracking the desired trajectory has not been proposed before. The two configurations are initially constructed and simulated using the approximate system obtained from experimental data, and subsequently implemented on a robotic manipulator to compare the efficacy of the two ILC structures and to verify the effectiveness of the GAM.
The rest of the paper is organized as follows. Section 2 introduces an overview of ILC including its structures and a stability analysis. Section 3 presents the iterative learning controller with a fixed gain design that will later be adjusted by the GAM in section 4. Section 5 demonstrates the efficiency of using the GAM in the simulation and experiment with a robotic manipulator, while the conclusion to the paper is provided in section 6.
Iterative learning control
A feedback control system representing a single joint of a robotic manipulator is considered in this study. Under the assumption of a one time-step delay in the system, the output corresponding to the command,
The purpose of using ILC is to eliminate tracking error by learning from historical information. Assuming that a control system produces the same tracking error every time it executes an identical command, the error and control input signals recorded in the previous iteration can be used to compute a more appropriate control input signal that reduces the tracking error in the current run.
In general, a learning control law can be written as
where q is the forward time-shift operator qx (k) ≡ x (k + 1), L (q) is a learning function and Q (q) represents a low-pass filter, designed to prevent amplification of the learning control input. Since the closed-loop system has a one time-step delay, the control input u (k) attempts to cancel the error e (k + 1). To eliminate e_j (k), k ∈ [1, N], the set of control inputs for iteration j is therefore expressed as
With the concept of adjusting the control input to the system, a natural question arises concerning which control input signal should be adjusted: the one inside the feedback control loop or the one outside the loop (the command). In the literature, two ILC architectures are generally constructed using the same concept of adjusting the control input signal but at different locations. The first configuration is called the serial architecture, where the external command to the feedback control system is adjusted by the ILC mechanism. The other configuration is known as the parallel architecture or plug-in configuration, where the internal control input in the feedback loop is directly modified by the learning law. More details of the two configurations, including stability analysis, are provided in the following subsections.
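As a rough illustration of the iteration-domain update described above, the following Python sketch applies a fixed-gain learning law with Q (q) = 1 and L (q) = l to a hypothetical first-order plant with a one time-step delay. The plant coefficients, the gain of 0.5, the reference, and the function names are assumptions chosen for illustration and are not taken from this study. In a serial architecture the learned signal would be the command to the closed loop, and in a parallel architecture it would be added to the feedback controller output; the sketch does not model that distinction.

```python
import numpy as np

def simulate(u, a=0.5, b=1.0):
    """Hypothetical plant y(k+1) = a*y(k) + b*u(k) with y(0) = 0; returns y(1..N)."""
    y = np.zeros(len(u) + 1)
    for k in range(len(u)):
        y[k + 1] = a * y[k] + b * u[k]
    return y[1:]

def ilc_run(y_des, gain=0.5, iterations=20):
    """Fixed-gain update u_{j+1}(k) = u_j(k) + gain * e_j(k+1)."""
    u = np.zeros_like(y_des)
    rms_history = []
    for _ in range(iterations):
        e = y_des - simulate(u)      # e[k] holds e_j(k+1) because of the delay
        rms_history.append(float(np.sqrt(np.mean(e ** 2))))
        u = u + gain * e             # batch update at the end of the run
    return u, rms_history

# Assumed reference trajectory y_des(1..N): half a sine period over 50 steps.
y_des = np.sin(np.linspace(0.0, np.pi, 50))
u_learned, rms_history = ilc_run(y_des)
```

For this particular plant and gain the RMS error shrinks at every iteration; a real closed loop with model mismatch or noise need not behave this way, which is the motivation for the transient analysis that follows.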
To eliminate the tracking error using serial ILC architecture, the command to the feedback control system is adjusted before the next run. Figure 1 depicts a block diagram of the serial ILC architecture from iterations j to j + 1.

Block diagram of serial ILC architecture.
At iteration j, the output from the feedback controller is called
Since the control input
In the initial trial, where the learning control mechanism has not yet been activated, the learning control input
In order to investigate the stability of the serial ILC architecture, the closed-loop control system can be constructed as a discrete Linear Time Invariant (LTI) state-space model as
This can be rewritten in a packaged form as
By taking the limit at infinity,
Using the definition that the spectral norm of
It can be clearly observed that the error norm monotonically decays if
Figure 2 depicts a block diagram of the parallel ILC architecture from iterations j to j + 1. The learning mechanism has the same concept as the serial structure in that it records the learning control input and corresponding error from the previous iteration to adjust the learning control signal in the current run. However, instead of adjusting the external command to the feedback control system as in the serial structure, the internal control input

Block diagram of the parallel ILC architecture.
Since the control input is a summation of the learning control input and output from the feedback controller, one can generalize by updating only the learning control input as in the serial structure in (3). In the initial run, the learning control input is set to zero, resulting in
A simple form of the learning control matrix can be represented as
where the learning control gain l kk treats the associated error at time-step k + 1 as presented in (3).
In order to design a fixed gain ILC with l kk = l, ∀ k, it is common to consider the error propagation from one iteration to the next as displayed in Proposition 1 for a serial ILC and Proposition 2 for a parallel ILC.
In the serial structure,
A similar phenomenon applies to the parallel structure whereby
To prevent a bad transient, one needs a practical solution. The next section introduces a gain adjustment mechanism to further adjust the fixed gains along the diagonal line. A variety of techniques have been proposed for tuning the learning gains over the learning matrix, for example, [29] makes use of an optimization technique to vary the learning gains based on the system model, while [30] uses a model-free adaptive ILC with a high order control law, increasing the computational complexity. However, this current work mainly focuses on adjusting the diagonal gains, regardless of the system model.
This section introduces the gain adjustment mechanism (GAM), a practical method to adjust the learning control gains based on Fuzzy Logic Control (FLC) to suppress error growth.
The design of the GAM
The algorithm initially seeks the learning control input providing the lowest error norm before noticing the bad transients. The aforementioned learning control input is nominated as a desired learning control input
Applying the same concept to both serial and parallel structures, Fig. 3 presents block diagrams of the ILC system, integrated with the GAM for the serial and parallel architecture shown side by side.

Block diagrams of an ILC with the GAM.
It can be observed from each block diagram that an extra box, as highlighted in orange, is added to the structure. This represents a GAM based on FLC to iteratively tune the learning control gains l11, …, l NN constructed in (20).
Before starting the adjusting mechanism, all gains are identically set to an initial value of l. After the mechanism starts, the maximum gain allowed through the adjustment is limited to 1.
The inputs for the FLC mechanism are two kinds of errors, each first scaled with unit vector normalization to produce new values spanning the zero to one interval. The former is calculated from the desired control input as
Both inputs are fed into the FLC with the fuzzy rules indicated in Table 1. The learning control gain l kk is increased when
Figure 4 shows the input-output membership functions of the Mamdani-type fuzzy inference system. The input, either
Fuzzy rules for the adaptive gain algorithm

Membership functions for inputs and outputs.

Surface plot for the Gain Adjustment Mechanism.
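To make the inference step concrete, the following Python sketch shows a minimal Mamdani-type system with two normalized inputs and one output gain in [0, 1]. It is only an assumed stand-in: the three triangular membership functions, the nine rules, and all names (`tri`, `LABELS`, `RULES`, `fuzzy_gain`, `unit_normalize`) are hypothetical and do not reproduce Table 1 or the membership functions of Fig. 4. The rule base merely encodes the general idea that a large deviation from the desired control input should pull the gain down, while a large tracking error with a small control-input deviation should push it up.

```python
import numpy as np

def unit_normalize(signal):
    """Scale magnitudes by the Euclidean norm so every entry lies in [0, 1]."""
    signal = np.abs(np.asarray(signal, dtype=float))
    norm = np.linalg.norm(signal)
    return signal / norm if norm > 0 else signal

def tri(x, a, b, c):
    """Triangular membership with feet a, c and peak b (shoulder if a == b or b == c)."""
    x = np.asarray(x, dtype=float)
    left = np.ones_like(x) if a == b else np.clip((x - a) / (b - a), 0.0, 1.0)
    right = np.ones_like(x) if b == c else np.clip((c - x) / (c - b), 0.0, 1.0)
    return np.minimum(left, right)

# Hypothetical labels: Small, Medium, Large on the [0, 1] interval.
LABELS = {"S": (0.0, 0.0, 0.5), "M": (0.0, 0.5, 1.0), "L": (0.5, 1.0, 1.0)}

# Hypothetical rules: (tracking-error label, input-error label) -> gain label.
RULES = {
    ("S", "S"): "M", ("M", "S"): "L", ("L", "S"): "L",
    ("S", "M"): "S", ("M", "M"): "M", ("L", "M"): "M",
    ("S", "L"): "S", ("M", "L"): "S", ("L", "L"): "S",
}

def fuzzy_gain(e_track, e_input, resolution=201):
    """Mamdani inference: min for AND, max aggregation, centroid defuzzification."""
    universe = np.linspace(0.0, 1.0, resolution)
    aggregated = np.zeros_like(universe)
    for (track_label, input_label), gain_label in RULES.items():
        strength = min(float(tri(e_track, *LABELS[track_label])),
                       float(tri(e_input, *LABELS[input_label])))
        clipped = np.minimum(strength, tri(universe, *LABELS[gain_label]))
        aggregated = np.maximum(aggregated, clipped)
    return float(np.sum(universe * aggregated) / np.sum(aggregated))

gain_high = fuzzy_gain(1.0, 0.0)   # large tracking error, small input error
gain_low = fuzzy_gain(0.0, 1.0)    # small tracking error, large input error
```

Because the output universe is [0, 1], the defuzzified gain can never exceed the upper limit of 1 mentioned above; in a time-varying scheme the function would be evaluated once per time step to fill the diagonal of the learning matrix.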
This section firstly shows the occurrence of tracking error from a robotic manipulator performing a screw fastening maneuver. The following subsections compare the efficiency of using a typical ILC and an ILC with the gain adjustment mechanism for both architectures in simulation and experimental conditions.
Tracking error from a robotic manipulator
Figure 6 shows the four-joint robotic manipulator from Seiko Epson Corporation and its task path used in the experiments. The prismatic joints R and Z move along the Y and Z-axes, respectively, while the revolute joints T and A rotate around the Z-axis. An electric screwdriver is installed at the tip of the robot arm to perform the screw fastening task.

4 joint Seiko D-Tran RT3200.
The manipulator is controlled by a National Instruments cRIO-9075 CompactRIO controller, which is a real-time controller that can generate precise sample timing intervals. The control input signals are generated directly from the real-time controller, while the tracking errors are measured by high precision absolute encoders with a resolution of 12 bits. A MATLAB program installed in the host computer is used as an interface to access data and compute the proper commands of the robot. Each robot joint is controlled independently by a conventional feedback controller with the appropriate gains adjusted to obtain optimal performance.
To perform the fastening process, Fig. 7a displays the desired path of the robot end effector moving from points 1 to 5 in three dimensions. The points along the path indicate the target location of the screwdriver tip. The path initiates at point 1 with the co-ordinate (0, 0, 0) and passes point 2, moving toward point 3 to tighten a screw. It then rotates to point 4 and heads down to point 5, with the aim of tightening another screw before returning to the initial point. It should be noted that the desired path is designed in such a way that the screwdriver does not directly touch the target screw to avoid damage. A gap of 1 cm is left between the tip of the screwdriver and target screw to ensure all components are safe from the moving robot.
In order for the robot arm to move along the desired path, the three actuators, i.e., joints R, T, and Z, receive commands according to the inverse kinematics of the robot arm [7], as illustrated by the blue solid lines in Fig. 7b. The corresponding outputs from each joint are displayed as red dashed lines.

The screw fastening path.
The parameter φ represents the rotation angle of joint T around the Z-axis. The parameter d1, which does not appear in the illustration, refers to an arbitrary displacement from the floor to the tip of the screwdriver. For simplicity, the value is set to zero so that the change in the Z-axis only occurs with the movement of joint Z. The parameters d2 and d3 are displacements in the Z and Y-axes, respectively. The trajectory consists of 730 time-steps with a sampling time of 0.055 s, which is longer than the minimum time required for program execution. This is equivalent to a trajectory of 40.15 s.
Figure 8a shows the tracking error corresponding to Fig. 7b for a single iteration. In joint Z in particular, where the vertical movement is considered, the robot arm cannot respond fast enough to keep up with the change in reference. In addition, once it has almost reached the desired level, the transient response oscillates slightly before the system reaches the steady state. Since the position of each link is controlled by a classical feedback controller, the controller gain can be increased to achieve a more rapid response. However, a high gain is likely to produce a larger overshoot, resulting in an undesirable response. A learning method with automatic gain tuning is therefore proposed to eliminate the tracking error without retuning the feedback controller gains.
Figure 8b displays the root mean square (RMS) error calculated according to the tracking error executed from nine runs. It can be observed that the RMS errors in each iteration are approximately identical, and this problem will be eliminated when the ILC method is introduced in the next subsection.
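For reference, the RMS error of one run can be computed from the recorded reference and output as in the short Python sketch below. The function name and the sample data are illustrative assumptions; the sketch only shows the usual definition of the RMS value over the N time-steps of a run.

```python
import numpy as np

def rms_error(reference, output):
    """Root mean square of the tracking error over one run."""
    error = np.asarray(reference, dtype=float) - np.asarray(output, dtype=float)
    return float(np.sqrt(np.mean(error ** 2)))

# Made-up four-sample run in which the output misses the reference by 1 each step.
example = rms_error([0.0, 0.0, 0.0, 0.0], [1.0, -1.0, 1.0, -1.0])
```

Evaluating this once per iteration gives one point on an RMS-versus-iteration curve.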

Tracking error of the robotic manipulator.
Based on the input-output information observed in Fig. 7b, the closed-loop control system for each link can be approximated using the command ‘tfest’ in MATLAB. The algorithm determines the numerator and denominator polynomial coefficients using nonlinear least-squares search-based updates to minimize the weighted prediction error norm. The second-order transfer function in discrete time can generally be presented as
Parameters used in the closed-loop system
The mean fitting accuracies from the approximation are 97.35%, 97.35%, and 88.46% for joints R, T, and Z, respectively. Since the classical controller for each link is only composed of a proportional gain, the order of the open-loop system remains the same as in the closed-loop system. The gains were experimentally adjusted to achieve the lowest tracking error. With the proportional gain of 2.8 for joint R, 60 for joint T, and 25 for joint Z, the corresponding open-loop systems, also presented in the form of (25), can consequently be calculated using the parameters displayed in Table 3.
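The step from a closed-loop model to an open-loop model can be sketched as follows, assuming unity negative feedback around a proportional gain, so that G_cl(z) = K G(z) / (1 + K G(z)). Writing G_cl = B(z)/A(z) gives G(z) = B(z) / (K (A(z) − B(z))), which keeps the denominator order unchanged. The feedback structure, the function names, and the polynomial coefficients below are assumptions for illustration; only the gain of 2.8 is the value quoted for joint R.

```python
import numpy as np

def open_loop_from_closed_loop(num_cl, den_cl, k_p):
    """Recover G(z) from G_cl = k_p*G / (1 + k_p*G), assuming unity negative feedback."""
    num_cl = np.asarray(num_cl, dtype=float)
    den_cl = np.asarray(den_cl, dtype=float)
    # Left-pad the numerator so both polynomials have the same length.
    padded = np.concatenate([np.zeros(len(den_cl) - len(num_cl)), num_cl])
    num_ol = padded / k_p
    den_ol = den_cl - padded
    return num_ol, den_ol

def close_loop(num_ol, den_ol, k_p):
    """Re-close the loop: G_cl = k_p*N / (D + k_p*N)."""
    return k_p * num_ol, den_ol + k_p * num_ol

# Hypothetical second-order closed-loop model (not the coefficients of Table 2).
num_cl = [0.02, 0.01]
den_cl = [1.0, -1.5, 0.53]
num_ol, den_ol = open_loop_from_closed_loop(num_cl, den_cl, k_p=2.8)
num_chk, den_chk = close_loop(num_ol, den_ol, k_p=2.8)
```

Re-closing the loop around the recovered model returns the original closed-loop polynomials, which is a convenient sanity check on the algebra.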
Parameters used in the open-loop system
Simulation tests are performed using the estimated models in Tables 2 and 3 with the references shown in Fig. 7a to fully evaluate the following control approaches: 1) Fixed gain ILC with l kk =0.25 for k ∈ [1, N] and 2) Time-varying gain ILC with the GAM. Stability analyses, including the tracking performance for each strategy, are provided in the next subsections.
This subsection considers the tracking performance of the control technique described in the previous section. The learning control matrix used in the learning control law can be expressed as (20) with l kk =0.25, ∀k. For the fixed gain ILC, it is previously seen that the learning gain should be less than
Note that even though the learning gain can be increased to develop fast-learning behavior, one can initially select the gain without expecting to achieve an optimal response since the gain will be iteratively adjusted by the GAM. As a result, the gain will be automatically varied as a trade-off between learning performance and stability robustness.
The learning control matrix is successively applied to the estimated models in (25) for both configurations. Table 4 summarizes the relevant stability indicators, which are the left-hand terms of the convergence conditions for both the serial and parallel structures.
Stability indicators for the fixed gain ILC
Based on the stability indicators illustrated in Table 4, both structures provide similar analytical results. It can be clearly seen that the asymptotic stability condition is satisfied for all systems. In contrast, the monotonic decay condition is only satisfied for joint T where the indicator gets extremely close to one but is still less than the stability boundary. However, the indicators for joints R and Z are greater than one, implying that the bad transient apparently exists, but will eventually die out when iteration progresses.
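The difference between the two kinds of indicator can be illustrated numerically. The Python sketch below assumes the common lifted form e_{j+1} = (I − lP) e_j, where P is the lower-triangular Toeplitz matrix of Markov parameters; the propositions for the serial and parallel structures define the exact matrices used in this work, so this form, the second-order model coefficients, and the function names are assumptions. The spectral radius governs asymptotic stability, and the largest singular value (spectral norm) governs monotonic decay.

```python
import numpy as np

def markov_parameters(n_steps, a1=1.6, a2=-0.8, b1=0.1, b2=0.1):
    """Impulse response h(1..n_steps) of the hypothetical model
    y(k) = a1*y(k-1) + a2*y(k-2) + b1*u(k-1) + b2*u(k-2)."""
    y = np.zeros(n_steps + 1)
    u = np.zeros(n_steps + 1)
    u[0] = 1.0
    for k in range(1, n_steps + 1):
        y[k] = a1 * y[k - 1] + b1 * u[k - 1]
        if k >= 2:
            y[k] += a2 * y[k - 2] + b2 * u[k - 2]
    return y[1:]

def stability_indicators(gain, n_steps=200):
    h = markov_parameters(n_steps)
    # Lifted lower-triangular Toeplitz map from u(0..N-1) to y(1..N).
    P = np.zeros((n_steps, n_steps))
    for i in range(n_steps):
        P[i, : i + 1] = h[i::-1]
    M = np.eye(n_steps) - gain * P
    # M is lower triangular, so its eigenvalues are its diagonal entries.
    rho = float(np.max(np.abs(np.diag(M))))
    sigma_max = float(np.linalg.norm(M, 2))
    return rho, sigma_max

rho, sigma_max = stability_indicators(gain=0.25)
```

For this assumed model, `rho` is below one while `sigma_max` is above one, which is the situation described for joints R and Z: the error eventually converges, but it may grow for many iterations first.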
Figure 9 displays the plots of Root Mean Square Error (RMSE) versus the iteration number for all joints using both structures. The simulation results provide supportive evidence to verify that all systems are eventually stable, although some bad transients can be observed in joints R and Z as expected from the stability indicators.

RMSE using the fixed gain ILC.
The RMSE graphs for the serial architecture are considered first. The RMSE in joint T gradually decreases and remains essentially unchanged after iteration 20. For the other two joints, the RMSE decays in the early iterations. After iteration 19 for joint R and iteration 120 for joint Z, the graph reverses and increases dramatically until reaching its peak at iteration 8,410 for joint R and iteration 12,679 for joint Z. Immediately afterward, the RMSE falls continuously again as the bad transient dies out.
The RMSE graphs in the parallel structure provide a similar pattern to those in the serial structure but longer iterations are needed before noticing the evidence. The RMSE graph in joint T converges to nearly zero around iteration 3000 while the graph for joints R and Z decays until iteration 57 and iteration 3540, respectively. The graph then bends toward the peak at iteration 23,760 for joint R and iteration 316,862 for joint Z. After reaching the peak, it persistently declines after an extensive number of iterations.
Although the error decay is similar for the serial and parallel structures, it is of interest to compare the error reduction achieved by each. The serial structure outperforms the parallel structure in that the final RMSE levels are lower in all joints. The serial structure therefore provides not only a faster response but also a lower final RMSE level.
Since the bad transient can be observed in joints R and Z, one cannot wait until it disappears in a real situation. As a result, the fixed gain method is not practical to implement. The next subsection demonstrates a practical method for removing the bad transient in a system.
In this subsection, the gains used in the learning control matrix are adapted by the gain adjustment mechanism explained in the aforementioned section. The learning control matrix is initially chosen as the fixed gain ILC. The gains along the main diagonal line are iteratively tuned based on the FLC algorithm. It should be noted that the maximum learning gains are set to one which is substantially below the stability boundary.
Beginning with the GAM, prior knowledge is required of the desired learning control input. Based on the previous fixed gain ILC results, the desired learning control inputs providing the lowest RMSE for joints R, T, and Z are from iteration 19, 9014, and 120 in the serial structure and iteration 57, 20000, 3540 in the parallel structure. The learning gains will subsequently be adjusted to track the desired learning control input as well as the desired reference based on the GAM.

RMSE using time-varying ILC with the GAM.
The tracking performance using the GAM for both structures is displayed in Fig. 10. It shows that the serial structure outperforms the parallel in every joint, as observed in the fixed gain ILC results. In addition, the bad transients clearly disappear in joints R and Z, supporting the advantage of using the time-varying ILC. However, the final RMSE level is slightly greater than that of the fixed gain ILC since the time-varying gains are adjusted to trade off between the tracking error and desired control input.
Table 5 summarizes the average last RMSE values read from Figs. 9 and 10.
The desired trajectory displayed in Fig. 7b is used as a reference for tracking the robotic manipulator. In the initial run, the tracking error from each joint is only observed from the closed-loop control system. The recorded error, together with the past control input, is used in the fixed gain learning control law to calculate a proper trajectory for the control signal to the next iteration. The process is repeated by applying the tracking error and control input to the learning control law containing the fixed gain or the time-varying gain adjusted by the GAM, to compute the control input signal to the next iteration. Since this is a batch update process, it is executed at the end of each iteration. Only datasets of the tracking errors and control input signals from the one iteration need to be memorized before computing a proper control input signal for the next run.
Figure 11 illustrates the RMS errors as the iteration progresses from both architectures in each joint using the fixed gain ILC of 0.25. One can notice a decay in RMS errors at the beginning of the iterations. However, all serial ILC systems eventually became unstable and were terminated before any damage could occur. In addition, the parallel ILC systems perform similarly but take longer for instability to be observed. Even though the growth of the RMS error in joints T and Z can barely be seen, the accumulation of high-frequency signals fluctuating around the transitions can obviously be perceived from the output signals of both joints.

RMSE using the fixed gain ILC from the experiment.
With the assistance of the GAM, the desired learning control input is obtained from the iteration containing the lowest RMSE in Fig. 11. The

RMSE using time-varying ILC with the GAM from the experiment.
Figure 12 shows the RMSE with the time-varying ILC gains adjusted by the GAM. All serial and parallel ILC systems can be stabilized with an acceptable final error level. The RMS error in the serial structure decays faster and settles at a lower level than that of the parallel structure in joints T and Z.
In the serial structure, the RMS error has been reduced by 42% (from 0.6 cm to 0.35 cm) within 10 iterations, or 6 mins, in joint R, 38% (from 0.8 deg to 0.5 deg) within 5 iterations, or 3 mins, in joint T, and 23% (from 0.43 cm to 0.33 cm) within 50 iterations, or 33 mins, in joint Z.
In contrast, the error reduction in the parallel structure has a slower decay rate and a greater final level. The RMS error reduces by 42% (from 0.6 cm to 0.35 cm) within 20 iterations, or 12 mins, in joint R, 12.5% (from 0.8 deg to 0.7 deg) within 5 iterations, or 3 mins, in joint T, and 19% (from 0.43 cm to 0.35 cm) within 60 iterations, or 40 mins, in joint Z.
Table 6 summarizes the average last RMSE values read from Figs. 11 and 12.
This subsection discusses the complexity of the GAM in terms of computational performance when using both structures. Execution is performed in MATLAB on a computer with a 2 GHz Quad-Core Intel Core i5 processor and 16 GB of 3733 MHz memory.
The execution times are obtained using the Run and Time feature in MATLAB. Up to 100,000 iterations are performed to calculate the average execution time, T exe , for an iteration on each structure. The results are displayed in Table 7, with ILC and GAM representing the fixed gain ILC and the varying gain ILC with the GAM, respectively. The results show that the varying gain ILC with the GAM takes several times longer than the fixed gain ILC due to the complexity of the proposed method. However, the extra execution time involved in introducing the GAM is insignificant when performing a real task. The amount of memory used, Mem, by running 100,000 iterations is also reported in Table 7.
Computational performance of the GAM on both structures
It should also be noted that the execution time and memory usage for the serial structure appear to be lower than those of the parallel structure since the computations for internal feedback control systems are taken into account. In fact, the execution times and memory usage for both structures are indistinguishable in the experiment since the only difference between them is the location at which the learning control input signal is added.
This paper presents a practical design of an iterative learning control law using a gain adjustment mechanism based on fuzzy logic control. To demonstrate its efficiency, three joints of a robotic manipulator are each commanded using a feedback control system. Repetitive tracking errors corrupted by noise can be observed and reduced using ILC. Two different ILC structures are constructed for a comparative study. The error convergence conditions for both configurations are analyzed and demonstrated in the time domain. The simulation results are consistent with the stability analysis of error convergence. Transient growth is expected to appear when using a single gain learning control law. The time-varying gains tuned by the gain adjustment mechanism are therefore introduced to avoid growth in the errors. Unlike the Q-filter commonly used in practical ILC, the gain adjustment mechanism can handle the error growth without the need to select an appropriate value for the filter bandwidth. The effectiveness of the gain adjustment mechanism is verified on the robotic manipulator. From the experiment, it is evident that the serial structure outperforms the parallel structure due to a faster learning speed and a lower level of error. It should also be noted that even though the gain adjustment mechanism adds computational complexity when recursively calculating the control input signal, the processor can handle the extra computation time without causing any problem. Moreover, the execution times and memory usage change insignificantly when the algorithm is added in the experiment.
Acknowledgment
This work was supported by Petchra Pra Jom Klao Ph.D. Research Scholarship (Grant No. 1/2561) from King Mongkut’s University of Technology Thonburi and National Science Research and Innovation Fund (NSRF).
