Abstract
This paper presents a novel methodology to identify the dynamic parameters of a real robot with a convolutional neural network (CNN). Conventional identification methodologies assume continuous motion signals. However, the measured signals are quantized in amplitude and discrete in time. Therefore, the time required to identify the parameters of a robot with a limited measurement system depends on an optimized motion trajectory performed by the robot. The proposed methodology consists of an algorithm that uses a CNN trained with data generated by the dynamic model of the case-study robot. A processing technique is proposed to transform the position, velocity, acceleration, and torque signals of the robot into an image whose features are extracted by the CNN to determine the dynamic parameters. The proposed algorithm does not require an optimal trajectory to find the dynamic parameters. A proposed time-spectral evaluation metric is used to compare the robot data with the identification data. The validation results show that the proposed methodology identifies the parameters of a Cartesian robot in less than 1 second, scoring above 90% on the proposed evaluation metric for the real robot and above 98% for the simulations.
Introduction
Parametric identification is a highly relevant line of research in robotics and instrumentation because of the interest in knowing the behavior of a robotic system with high precision and accuracy. Applications ranging from motion control systems with dynamic compensation [1] to the detection of faults induced by external disturbances [2] require the values of the dynamic parameters to achieve the performance predicted by their theory.
The most widely used algorithm for parameter identification in the consulted works is the least-squares (LS) algorithm reported in [3–6]. Weighted versions of LS are used in [7], whereas in other LS parametric identification work the process is split to decrease its complexity, as shown in [8] and [9]. In [10], LS is used to identify the parameters of a robot with 6 degrees of freedom (DOF). A comparison between LS and an indirect parametric method shows that LS performs better. Although LS is widely used, the trajectory followed by the robot must be optimized for parameter identification, as explained in [11]. The mathematical structure of LS is based on the dynamic model of the robot, and consequently, some parameters are not directly measured. In [12], some properties of the inertia tensors of manipulator robots are defined and used to identify parameters.
Neural networks are used as a complement to the dynamic model of a robot, as shown in [13], where a Long Short-Term Memory (LSTM) neural network is used to improve the conservative model, and in [14], which uses a feed-forward neural network (FFNN) to model the friction of a 6-DOF robot. In [15], a radial-based neural network (RBNN) is used to compensate for the unmodeled phenomena of a robot. In [16], the kinematic uncertainties of a 6-DOF robot are compensated with a two-layer FFNN, achieving an error of less than one millimeter. In [17], LS is combined with an LSTM neural network to reduce the uncertainties of a 6-DOF robot.
Other applications of neural networks to parametric identification are found in the following works. In [18], a Deep Neural Network (DNN) is implemented to calibrate a 6-DOF robot that uses a laser to measure the position. In [19], a DNN is used to measure the external force applied to a remotely controlled robot. In [20], a recurrent neural network (RNN) estimates the torque of a 2-DOF robot. In [21], the combination of a DNN and the Nonlinear Auto-Regressive Exogenous model (NARX) is used to estimate the motion of a 5-DOF robot. The paper [22] uses a CNN as a motion estimator applied to three different mechanical systems. In [23], the cuckoo search algorithm is used to identify the dynamic parameters of a 6-DOF robot. In [24], a nonlinear regressor is used for the parameter identification of a da Vinci robot. The work in [25] uses reverse engineering methods to determine the inertia and mass parameters of the 7-DOF Franka Emika Panda.
The robot trajectory in parametric identification is optimized in papers that use LS and maximum likelihood to reduce the execution time. In [25], the trajectory of a 7-DOF robot is optimized with mechanical constraints for the identification of its parameters. In [26], the basic steps to optimize a trajectory for the parameter identification of a robot are reported. Due to the complexity of the trajectory optimization, the work reported in [3] addresses the maximum Hadamard height to obtain a complexity of O(n) instead of the original O(nm^2 + n^3) in the parametric identification of a 2-DOF robot.
The main objective of optimizing trajectories in parameter identification is to find the best movement signal to achieve short identification times. However, these works do not consider that the velocities and accelerations are estimates obtained from the position signals. Consequently, trajectory optimization takes longer than the execution of the identification algorithm, as shown in [4, 14].
The methodology proposed in this paper has not yet been addressed in the reviewed related work. The survey reported in [27] shows applications in which the CNN performs very well in speech, vision, and language tasks because of the rapid growth and the remarkable improvements achieved in recent years. A comparison between a CNN and an FFNN applied to three mechanical systems is found in [22]. The CNN is oriented to work on a bounded region of the input data rather than in a combinatorial form. The results show that the CNN has a better response than the FFNN due to its region-oriented architecture. Even in the presence of noise in the motion signals, the CNN response is robust.
This work has two contributions. The first is the proposed methodology, which identifies the parameters of a robot without trajectory optimization; consequently, the time required for parameter identification is reduced. In addition, once the CNN of the proposed methodology is trained with a specific robot model, the algorithm can be scaled to identify multiple robots with the same model. The novelty of this work lies in the extraction of the parameter residuals by a CNN without an optimized trajectory and in the transformation of the position, torque, and estimates of velocity and acceleration into a small image that contains enough information to identify the parameters in a short time. The second contribution is the validation of the dynamic model of the feed-drive robot by comparing the measured torques with the torques estimated from the parameters obtained with the proposed algorithm.
This paper is organized as follows. First, the proposed methodology to identify the parameters of a robot is developed. Then, the application of the identification methodology to a feed-drive robot is described, and the results of the algorithm are shown. Finally, the most important points of this work are discussed and the conclusions are presented.
Proposed methodology
The proposed methodology consists of the blocks shown in Fig. 1. First, a robot follows a predefined trajectory that does not need to be optimized. The measurement acquisition system provides only the position and torque signals of the robot. Therefore, a filtering, differentiation, and subsampling block is used to estimate the velocities and accelerations. Together with the torque, the motion signals are subsampled to a length short enough to be processed quickly. Then, the initial dynamic parameters P0 are initialized to create an image with the proposed transformation technique.
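The estimation of velocity and acceleration from a sampled position signal can be illustrated with finite differences. The sketch below is only an assumption about how such a block might look: the text does not specify the filter or the differentiation scheme, and the function names `central_difference` and `moving_average` are hypothetical.

```python
def central_difference(signal, dt):
    """Estimate the time derivative of a uniformly sampled signal."""
    n = len(signal)
    if n < 2:
        raise ValueError("need at least two samples")
    deriv = [0.0] * n
    # One-sided differences at the two end points.
    deriv[0] = (signal[1] - signal[0]) / dt
    deriv[-1] = (signal[-1] - signal[-2]) / dt
    # Central differences in the interior.
    for k in range(1, n - 1):
        deriv[k] = (signal[k + 1] - signal[k - 1]) / (2.0 * dt)
    return deriv


def moving_average(signal, window):
    """Centered moving-average low-pass filter (assumed filter type)."""
    half = window // 2
    out = []
    for k in range(len(signal)):
        lo = max(0, k - half)
        hi = min(len(signal), k + half + 1)
        out.append(sum(signal[lo:hi]) / (hi - lo))
    return out


# Example: position samples -> velocity -> acceleration.
dt = 0.0025  # sampling time in seconds (value reported for the case-study robot)
position = [(k * dt) ** 2 for k in range(11)]
velocity = central_difference(moving_average(position, 1), dt)
acceleration = central_difference(velocity, dt)
```

Because each differentiation amplifies quantization noise, a real implementation would likely filter before every derivative; the window length would have to be tuned for the encoder in use.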

General diagram of the proposed methodology.
Once the image made with the robot signals and P0 is obtained, the CNN analyzes it to find the parameter residuals of the initial dynamic parameters. A proposed evaluation metric compares the robot torque with the torque estimated from the parameters identified by the CNN. The output dynamic parameters are denormalized when the evaluation metric gives a value greater than a threshold value v_s. In the following subsections, the details of the blocks of Fig. 1 are described.
The developed neural network consists of three convolutional layers, two pool layers, and three feed-forward layers, shown in Fig. 2. This CNN has a 100-pixel by 100-pixel input map and four outputs to determine the eight parameters shown in the case study in section 3. Convolution operations work with D input channels as described in Equation 1 [28]:

Proposed CNN for the identification algorithm.
where X_d is the input map of channel d ∈ D, W_{d,m} is the kernel associated with input channel d and output m, B_m is the bias matrix, f is the activation function, and Y_m is the m-th output map. The max function is used in the 2 x 2 pooling window, and the address of the maximum value is stored in the DRM vector for the CNN training process. In Fig. 2, the convolution operation is represented by an asterisk (*), the addition of the bias by a plus sign (+), and the pooling layer by an R. The activation function used in the convolutional and hidden feed-forward layers is the Rectified Linear Unit (ReLU(z) = max(0, z)) because it avoids the vanishing of the loss gradient during training. For the output layer, the sigmoid function described in Equation 2 is used.
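For reference, the definitions above are consistent with the usual multichannel convolution layer and the logistic sigmoid. The expressions below are the standard textbook forms written with the symbols of this section; they are a reconstruction under that assumption and may differ in notation from Equations 1 and 2.

```latex
Y_m = f\!\left( B_m + \sum_{d \in D} W_{d,m} * X_d \right),
\qquad
\sigma(z) = \frac{1}{1 + e^{-z}}
```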
The kernel sizes are selected according to the behavior of the loss function during CNN training. The network is designed so that the first layer processes the input image in large areas, the second layer in smaller areas, and the third layer extracts the details of the image. The best results are obtained with 9x9 kernels for the first layer, 5x5 for the second, and 3x3 for the third. The proposed CNN has 375,640 synaptic weights and 29,384 neurons in total.
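The feature-map sizes implied by these kernels can be checked with a few lines of arithmetic. The sketch assumes stride-1 convolutions without padding and non-overlapping 2 x 2 pooling after the first two convolutional layers; neither assumption is stated explicitly, but together they reproduce the 46x46 and 21x21 input maps reported for layers 2 and 3 in the computational-cost analysis.

```python
def valid_conv_size(size, kernel):
    """Output width of a stride-1 convolution without padding."""
    return size - kernel + 1


def pool_size(size, window=2):
    """Output width of non-overlapping pooling."""
    return size // window


def feature_map_sizes(input_size=100, kernels=(9, 5, 3), pooled_layers=(0, 1)):
    """Return (input width, output width) for each convolutional layer."""
    sizes = []
    size = input_size
    for index, kernel in enumerate(kernels):
        conv_out = valid_conv_size(size, kernel)
        sizes.append((size, conv_out))
        # Pooling is assumed to follow only the layers listed in pooled_layers.
        size = pool_size(conv_out) if index in pooled_layers else conv_out
    return sizes


print(feature_map_sizes())  # [(100, 92), (46, 42), (21, 19)]
```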
Two transposed vectors are multiplied to obtain an image from the motion and torque signals. The first step in the image creation is subsampling the robot position and torque signals with the Discrete Cosine Transform II (DCT-II).
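One common way to shorten a signal with the DCT-II is to keep only its first coefficients and invert the transform at the shorter length. The sketch below illustrates that idea in plain Python; it is an assumption about the kind of operation involved rather than the subsampling equation of this paper, and the names `dct2`, `idct2`, and `dct_subsample` are hypothetical.

```python
import math


def dct2(x):
    """Orthonormal DCT-II of a real sequence."""
    n = len(x)
    coeffs = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        total = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        coeffs.append(scale * total)
    return coeffs


def idct2(coeffs):
    """Inverse of dct2 (an orthonormal DCT-III)."""
    n = len(coeffs)
    x = []
    for i in range(n):
        total = coeffs[0] * math.sqrt(1.0 / n)
        total += sum(
            coeffs[k] * math.sqrt(2.0 / n) * math.cos(math.pi * (i + 0.5) * k / n)
            for k in range(1, n)
        )
        x.append(total)
    return x


def dct_subsample(x, new_length):
    """Shorten x to new_length samples by truncating its DCT-II spectrum."""
    if not 0 < new_length <= len(x):
        raise ValueError("new_length must be between 1 and len(x)")
    coeffs = dct2(x)[:new_length]
    # Rescale so that sample amplitudes are preserved after the length change.
    gain = math.sqrt(new_length / len(x))
    return idct2([gain * c for c in coeffs])


# Example: reduce a 400-sample slow sinusoid to 100 samples.
signal = [math.sin(2 * math.pi * i / 400) for i in range(400)]
short = dct_subsample(signal, 100)
```

This direct implementation costs O(n^2) operations per transform; a library routine based on the FFT would be the practical choice for long records.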
The image is created with vector ψ and vector α, as described in Fig. 3. These vectors are different for each robot, and their content is a function of the specific dynamic model of each robot. The general representation for the ψ and α is in Equation 6:

Image from signals of the feed-drive robot.
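The multiplication of the two vectors can be read as an outer product, so two vectors of 100 samples give the 100 x 100 image expected by the CNN. The following sketch shows that reading together with a min-max scaling to the range 0 to 1; the scaling method and the function names are assumptions made for illustration, and the actual contents of ψ and α depend on the robot model.

```python
def outer_product_image(psi, alpha):
    """Return the image M with M[i][j] = psi[i] * alpha[j]."""
    return [[p * a for a in alpha] for p in psi]


def normalize_image(image):
    """Min-max scale all pixels to the range 0 to 1 (assumed normalization)."""
    flat = [v for row in image for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        # A constant image carries no contrast; map it to zeros.
        return [[0.0 for _ in row] for row in image]
    return [[(v - lo) / (hi - lo) for v in row] for row in image]


# Example with short placeholder vectors.
psi = [1.0, 2.0]
alpha = [3.0, 4.0, 5.0]
image = normalize_image(outer_product_image(psi, alpha))
```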
A repository of n_sim simulations of the dynamic model
The training data in Fig. 4 is subsampled with Equation 3. The motion and torque signals of the simulations are used to generate the images. The training process of the CNN uses two types of images: the Mp images, constructed with the dynamic parameters that match the position and torque signals, and the Mn images, which use random parameters, as illustrated in Fig. 5.

Simulation to create the data to train the CNN.

Training image generation.
Half of the image repository is Mp, and the other half is Mn. The labels of the Mp images are zero because the parameter error is zero. The labels of the Mn images are the difference between the simulation parameters and the randomly selected parameters. Each image is rotated to one of four orientations chosen at random (0°, 90°, 180°, 270°) to prevent the CNN from memorizing the images (see Fig. 5).
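The construction of one matched and one mismatched training example can be sketched as follows. The callable `build_image`, the sampling range of the random parameters, and the sign convention of the label (random parameters minus simulation parameters, so that subtracting the label from the guess recovers the simulation parameters) are assumptions made for this illustration.

```python
import random


def rotate90(image, quarter_turns):
    """Rotate a 2-D list counterclockwise by quarter_turns * 90 degrees."""
    for _ in range(quarter_turns % 4):
        # Transpose, then reverse the row order: one counterclockwise turn.
        image = [list(row) for row in zip(*image)][::-1]
    return image


def make_training_pair(true_params, build_image, rng, low=-1.0, high=1.0):
    """Return one matched (Mp) and one mismatched (Mn) labeled image."""
    # Mp: image built with the true parameters, so the residual label is zero.
    mp_image = build_image(true_params)
    mp_label = [0.0] * len(true_params)
    # Mn: image built with random parameters; the label is their offset.
    guess = [rng.uniform(low, high) for _ in true_params]
    mn_image = build_image(guess)
    mn_label = [g - t for g, t in zip(guess, true_params)]
    # Each image gets an independent random orientation.
    mp_image = rotate90(mp_image, rng.randrange(4))
    mn_image = rotate90(mn_image, rng.randrange(4))
    return (mp_image, mp_label), (mn_image, mn_label)


# Example with a toy image builder (hypothetical, for illustration only).
toy_builder = lambda p: [[p[0], p[1]], [p[1], p[0]]]
pair = make_training_pair([0.2, -0.4], toy_builder, random.Random(0))
```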
The CNN training process uses the error backpropagation algorithm with the loss function
The evaluation metric analyzes the similarity between two signals in the time and DCT-II frequency domain. The correlation is used as an evaluation metric in the related work, but the high-frequency variations are not considered. The proposed metric
The algorithm shown in pseudocode 1 identifies the parameters of a robot introducing the robot position and torque signals, its dynamic model, and the CNN trained parameters. In lines 1 and 2, the derivative is estimated by the function
In line 6, the variable v for the evaluation metric of Equation 8 is initialized to zero. The parameters are initialized by assigning n_P numbers, as shown in line 7. The while loop of line 9 runs until v exceeds a threshold v_s and the identified parameters P_out are physically possible: the coefficients of inertia and friction must be positive, while the gravity coefficients can be positive or negative. Within the while loop, n_P random numbers in the range 0 to 1 are assigned to P_out. The vectors ψ and α are determined using Equation 6, as shown in lines 10 and 11, to create the image M of line 12. In line 13, M is normalized to the range 0 to 1, and it is fed to the CNN in line 14.
Because the sigmoid function is used in the CNN output layer, the parametric residual P_r is recovered in line 15 by applying the inverse sigmoid function. In line 16, the identified parameters are obtained by subtracting P_r from P_out. The torque τ_out determined by P_out is constructed with the dynamic model f(x, P) and the set of signals x, as shown in lines 17 and 18. The evaluation metric is applied to τ_out and the input torque to determine their similarity. Finally, P_out is denormalized in line 20 so that values inside and outside the range -1 to 1 can be identified.
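The loop described above can be outlined in Python as a rough sketch. The trained network, the image builder, the dynamic model, the evaluation metric, and the physical-feasibility check are passed in as callables because their definitions are not reproduced here; all argument names are hypothetical, the iteration cap is an added safeguard, and the denormalization step is omitted.

```python
import math
import random


def logit(y, eps=1e-12):
    """Inverse of the sigmoid, clipped away from 0 and 1."""
    y = min(max(y, eps), 1.0 - eps)
    return math.log(y / (1.0 - y))


def identify(signals, torque, build_image, cnn, model, metric, is_physical,
             n_params, threshold, rng, max_iter=1000):
    """Repeat random restarts until the metric exceeds the threshold."""
    for _ in range(max_iter):
        guess = [rng.random() for _ in range(n_params)]      # P_out in 0..1
        image = build_image(signals, torque, guess)          # normalized M
        residual = [logit(y) for y in cnn(image)]            # P_r
        params = [g - r for g, r in zip(guess, residual)]    # P_out - P_r
        score = metric(torque, model(signals, params))       # v
        if score > threshold and is_physical(params):
            return params, score
    raise RuntimeError("no acceptable parameter set found")
```

Passing the components as arguments keeps the loop independent of any particular robot model, which mirrors the claim that only the image construction and the trained weights are model specific.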
Description of the robot
The robot to be identified with the proposed methodology is a 2-DOF feed-drive robot [31]. The robot uses two DC motors, one for each degree of freedom, each coupled to a 131.25:1 gearbox. Two incremental encoders with a resolution of 8400 counts per revolution of the gearbox output shaft are used to measure the position. The maximum velocities of the x-motor, v_mx, and the z-motor, v_mz, are 10×10^3 rpm. Both gearboxes are connected to an endless screw that is part of the ball screw drive mechanism used to convert rotational movement into linear movement with a reduction of 2/π mm/rad. Figure 6 shows the connection between the x-axis nut and the z-axis support table, which has a rectified linear rail to support the base.

Feed-drive robot of two degrees of freedom.
The embedded electronics of the robot, shown in Fig. 7, consist of a field-programmable gate array (FPGA) with a microprocessor connected to the peripheral controller firmware required to drive the DC motors of the robot, process the encoder position signals, and implement the control system that sets the desired trajectory. The robot signals are acquired through Wi-Fi communication between the FPGA system and an external computer with a sampling time of 2.5 ms. The microprocessor programming and the data acquisition are handled by software designed in LabVIEW.

General diagram of the embedded electronics.
The schematic diagram of the feed-drive robot is shown in Fig. 8. The output shafts of the gearboxes, the couplings, and the endless screws are connected in series to facilitate the determination of the dynamic model. The gearboxes and the nuts of the ball screw mechanism are each modeled as a reduction constant, a spring, and a damper, as shown in Fig. 8. The couplings and supports of the robot are represented by springs and dampers.

Schematic diagram of the feed-drive robot. The torsional stiffnesses κ_mgx and κ_mgz are represented by the gears, and the torsional friction coefficients β_mgx and β_mgz act between the gear teeth. A connection bar represents the torsional stiffnesses κ_gsx and κ_gsz, and a hollow bar represents the torsional friction coefficients β_gsx and β_gsz.
The dynamic model of the robot considers that the sum of the external forces and reaction forces acting on a body is equal to its mass times its acceleration. In the case of rotational joints, forces are replaced by torques and mass by a moment of inertia:
The model equations for the x and z axes are described in Equation 1:
where the angular positions of the motors (θ_m, φ_m), of the output shafts of the gearboxes (θ_g, φ_g), and of the endless screws (θ_s, φ_s), together with the linear displacements x_l, x_s, x_A, z_l, and z_s, are collected in the vectors θ and φ. The equations of the feed drive begin with the motors, which are modeled with the inertia parameters J_mx and J_mz and the viscous friction coefficients β_mx and β_mz; the applied torque is set by a PWM signal of the embedded system that corresponds to the motor torque [31]. The gearboxes are modeled with the total axial stiffness of the gears, κ_mgx and κ_mgz. The coefficients β_mgx and β_mgz model the viscosity between the gear teeth, and β_gx and β_gz model the lubricant of the gearboxes. The connection between the output shafts of the gearboxes and the endless screws, whose angles are θ_s and φ_s, is modeled with the axial stiffness coefficients κ_gsx and κ_gsz and the viscous friction coefficients β_gsx and β_gsz. The axial motion is converted to linear motion with a nut; κ_slx and κ_slz model its stiffness, and β_slx and β_slz its viscosity. The masses m_lx and m_lz represent the tables to which the nuts are attached. The mass m_A corresponds to the support for the z-axis. Finally, the tables move on the bases represented by m_sx and m_sz, whose stiffness and viscosity are defined by k_sx, k_sz, b_sx, and b_sz.
The mass parameters (
The mechanical couplings between the motors and gearboxes (κ_mgx, κ_mgz), the gearboxes and endless screws (κ_gsx, κ_gsz), and the screws and nuts (κ_slx, κ_slz), as well as the mechanical supports (k_sx, k_sz, k_Ax), are in the matrices K_x and K_z. The gearbox reductions are r_x = 131.25 and r_z = 131.25, while the nut reductions are n_x = 2/π mm/rad and n_z = 2/π mm/rad. To group the dynamic parameters of Equations 10, the deformations of the mechanical connections are considered to be smaller than the encoder resolution: Δa = 1.1905e-04 rev (one encoder count, 1/8400 of a revolution) for axial connections and ΔL = 4.7619e-07 m for linear displacements. Under these considerations, the connection terms vanish (K_x θ → 0, K_z φ → 0). Thus, eight parameters can be determined to obtain a reduced version of Equations 9, where τ_x = r_x τ_mx and τ_z = r_z τ_mz. The variables θ and φ are taken from the output shafts of both gearboxes. With the set of equations shown in Equation 12, the ψ and α vectors are constructed with the eight parameters as shown in Equation 13. For this specific case study, the α vector is set as the torque. The main idea of the image construction for the feed-drive robot is that when the identified parameters match the torque signals, the image M contains an approximation of the squared torque. It is important to note that the ψ and α vectors are not limited to the form presented in this case study.
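The two resolution values can be reproduced from the encoder count and the nut reduction. The short calculation below uses only figures quoted in the robot description; note that 1.1905e-04 equals 1/8400, that is, one encoder count expressed as a fraction of a revolution, whereas one count expressed in radians is 2π/8400.

```python
import math

COUNTS_PER_REV = 8400            # encoder counts per output-shaft revolution
NUT_REDUCTION = 2e-3 / math.pi   # n_x = n_z = 2/pi mm/rad, converted to m/rad

rev_per_count = 1.0 / COUNTS_PER_REV
rad_per_count = 2.0 * math.pi / COUNTS_PER_REV
lead_per_rev = NUT_REDUCTION * 2.0 * math.pi       # 4 mm of travel per revolution
linear_per_count = lead_per_rev / COUNTS_PER_REV

print(f"{rev_per_count:.4e} rev")    # 1.1905e-04 rev
print(f"{rad_per_count:.4e} rad")    # 7.4800e-04 rad
print(f"{linear_per_count:.4e} m")   # 4.7619e-07 m
```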
The verification of the algorithm uses five simulations of Equation 12 computed with the classic fourth-order Runge-Kutta ordinary differential equation solver. For this case study, the frequency of the sinusoidal trajectory is set to 0.2875 rad/s. The FPGA contains a programmed trajectory follower for the 2-DOF feed-drive robot. For the x-axis, the trajectory is θ_d = 2π sin(0.2875t), and for the z-axis it is φ_d = 2π cos(0.2875t).
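A classic fourth-order Runge-Kutta step and the two reference trajectories can be written compactly. In the sketch below, `axis_dynamics` is a hypothetical single-axis model with only inertia and viscous friction, used as a stand-in because the reduced eight-parameter model is not reproduced here; using the 2.5 ms sampling time as the integration step is also an assumption.

```python
import math


def rk4_step(f, t, y, h):
    """Advance y' = f(t, y) by one classic fourth-order Runge-Kutta step."""
    k1 = f(t, y)
    k2 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k1)])
    k3 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k2)])
    k4 = f(t + h, [yi + h * ki for yi, ki in zip(y, k3)])
    return [yi + h / 6 * (a + 2 * b + 2 * c + d)
            for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]


def desired_trajectory(t, omega=0.2875):
    """Reference angles (theta_d, phi_d) for the x and z axes in radians."""
    return 2 * math.pi * math.sin(omega * t), 2 * math.pi * math.cos(omega * t)


def axis_dynamics(inertia, friction, torque_of_t):
    """Hypothetical axis model: inertia * acc + friction * vel = torque."""
    def f(t, state):
        _, velocity = state
        acceleration = (torque_of_t(t) - friction * velocity) / inertia
        return [velocity, acceleration]
    return f


# Example: simulate one second of the toy axis under a constant unit torque.
h = 0.0025
state, t = [0.0, 0.0], 0.0
model = axis_dynamics(inertia=1.0, friction=1.0, torque_of_t=lambda t: 1.0)
for _ in range(400):
    state = rk4_step(model, t, state, h)
    t += h
```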
Data acquisition from the feed-drive robot is implemented via a wireless interface in LabVIEW 2015. The raw data is transferred from the FPGA board to a personal computer to be processed by the identification algorithm. The proposed algorithm was implemented in MATLAB R2017b with the optimized two-dimensional convolution function
Training of the CNN
The computer used for the parameter identification algorithm has a GTX 1060 GPU and an 8th-generation Intel i7 processor. The computational cost is determined experimentally for convolution layers 1 (100x100 input map with 9x9 kernel), 2 (46x46 input map with 5x5 kernel), and 3 (21x21 input map with 3x3 kernel). Figure 9 shows the runtime of the For Loop Convolution (FLC), which is the direct implementation of Equation 1, of the Optimized Processor Convolution (OPC) of MATLAB, and of the convolution solved by the GPU. All times in this figure are for a single input map and a single convolution kernel. For layers 4, 5, and 6, the lowest computational cost is obtained using the processor. The total run time of the CNN is approximately 3.3426 ms.

Computational cost of the layers 1, 2 and 3.
The generated training set described in Fig. 5 is divided into a training set and a test set to avoid overfitting of the CNN. Figure 10 shows the evolution of the loss function for both sets with a learning rate of η = 0.14. The loss function is close to zero in the first 0.1 × 10^6 iterations due to the range of the parameter residuals. However, the sigmoid function tends to minimize the residuals of the parameters, and the error evaluated with the proposed metric is greater than the training loss.

Loss function of the CNN training process.
The normalized Fourier frequency spectrum of the 210 trained kernels is shown in Fig. 11. The characteristics of the image M are extracted using the shape of each kernel spectrum to determine the parameters with the feed-forward part of the CNN. It is observed that the kernels are different from each other, and each kernel, in combination with the activation function, determines the parameters of Equation 18.

Frequency spectrum of the kernels of the layers 1, 2 and 3 using the Fourier transform.
Figure 12 shows four Mp and four Mn images with their respective labels. The Mp images for Equation 12 approximate the squared torque, but the Mn images do not match it. When the vectors ψ and α contain incorrect parameters, the resulting image is unbalanced.

Normalized images Mp and Mn from the training set.
The parameters used in the five simulations are described in Table 1. The simulations S4 and S5 have parameters outside the range -1 to 1 to verify the denormalization step of the algorithm. Table 2 describes the results of the evaluation metric along with the execution time of the identification algorithm and the identified parameters.
Dynamic parameters used in the five simulations

Comparison between torque simulations and torque obtained by the identified parameters.
Identified dynamic parameters with evaluation metric and execution time
The evaluation with
The trajectory performed by the feed-drive robot is shown in Fig. 14, where, for both axes, the position, the torque, and the estimates of velocity and acceleration are trimmed to match the sinusoidal training trajectory. Multiplying θ and φ by the ball screw reduction parameters n_x and n_z gives the position of the end effector, as shown in Fig. 15, where the black arrow shows the direction of the trajectory.

Position signals and velocity and acceleration estimates for both axes of the feed-drive robot.

Two-dimensional trajectory of the feed-drive robot.
Table 3 shows the identification results for the parameters of Equation 18 of the feed-drive robot, together with the evaluation metric and the algorithm execution time. For both axes of the feed-drive robot, the
Identified dynamic parameters with evaluation metric and execution time
Figure 16 shows the comparison between the torques of the feed-drive robot and the torques obtained with the identified parameters. For the x-axis, the shape of the real torque is close to that of the torque built with the identified parameters. The error, shown as a black dashed line, is larger at the zero-torque transitions than elsewhere. In the real torque, the z is close, but the

Comparison between feed-drive torques and torque obtained by the identified parameters.
The proposed evaluation metric
The characteristics of the proposed evaluation metric for electronic signals make it ideal for comparison because it considers the frequency spectrum in the DCT-II transform.
Optimizing a trajectory to identify dynamic robot parameters takes time because each new trajectory needs to be tested on the real robot. With the proposed methodology, trajectory optimization is not necessary, and the processing time is reduced. The preprocessing technique applied to the input data can subsample the signals to any integer value length with the DCT-II; this frequency transformation contains the most significant information in the first frequency bins.
The transformation of signals into an image helps in the training of the proposed CNN. Other transformation methods such as the time-frequency spectrum are reported. However, the technique used in this work does not require the frequency spectrum and has the necessary information in a simple way to be processed by the CNN to obtain the dynamic parameters. The proposed CNN-based algorithm works with estimates made by position measurements rather than real velocity and acceleration measurements. The CNN can generalize the input information to identify parameters not included in the training data set.
The proposed algorithm has a short running time. Once the CNN has been trained with one type of robot, it can be applied to any real robot with the same dynamic model, which makes it helpful in identifying many robots. The future work of this investigation is to extend the dynamic model to increase the similarity with the proposed metric.
Conclusions
This paper has presented an algorithm for parameter identification based on a convolutional neural network. A feed-drive robot is used to test the proposed algorithm. The extraction of the parameters with a CNN and the creation of the image are the novel parts of this work. For the image creation from the robot motion signals, the proposed transformation technique puts enough information into a small image to determine the parameters quickly. The proposed methodology does not require the optimization of the trajectory performed by the robot. Therefore, once the CNN is trained with the dynamic model of a specific type of robot, the algorithm can identify the parameters of any robot described by the same model as the one used to train the CNN.
Unlike correlation, the most widely used metric in parameter identification papers, the proposed metric analyzes the information in time and frequency. This characteristic makes it well suited for comparing electronic signals such as torque measurements. The results show that, for the five simulations, the algorithm obtained a similarity greater than 98% because the simulated model was the same as the one used for the CNN training. The results of the feed-drive identification show that the similarity exceeds 90%. The parameter identification for this robot helps validate the eight-parameter dynamic model, since both estimated torque signals are close to the experimental torque signals. The results presented in this paper show that the proposed methodology works for the identification of dynamic parameters. Consequently, future work will apply the proposed algorithm to identify other types of robots commonly used in industry and research.
