Abstract
Many high-dimensional engineering and scientific applications in communication, control, robotics, computer vision, biometrics, and related fields require an intelligent and robust neural system that can process high-dimensional information efficiently, and building such a system remains difficult. In the literature, conventional real-valued neural networks have been applied to problems with high-dimensional parameters, but the resulting network structures are highly complex, slow to train, and sensitive to noise. These networks also cannot learn magnitude and phase values simultaneously in space. A quaternion is a number that carries magnitude in all four directions, with phase information embedded within it. This paper presents a learning machine built on a quaternionic domain neural network that processes the magnitude and phase information of high-dimensional data directly. The learning and generalization capability of the proposed learning machine is evaluated on chaotic time series prediction (Lorenz system and Chua’s circuit), 3D linear transformations, and 3D face recognition as benchmark problems, which demonstrate the significance of the work.
Keywords
Introduction
Machine learning concentrates mainly on improving the intelligent behavior of a system through experience. It is currently one of the fastest-growing technical fields and builds a bridge between artificial intelligence and data science. Developing new learning algorithms and theory for online or high-dimensional data at low computational cost remains challenging [26]. In addition, high-dimensional information processing through neural networks is emerging as a fascinating but challenging field for second-generation neurocomputing researchers. Recent research on high-dimensional neural networks has established their superiority over first-generation real-valued neural networks (RVNN), as addressed in [1, 8]. Although RVNN have been used to process high-dimensional data, the network needs too many real-valued neurons, so the resulting network becomes large in structure and slow to learn. For a complex network structure, reliability is an important element, but reliability calculation is an NP-hard problem; therefore a simulation approach is a feasible way of assessing network reliability [24]. However, some recent advances in the application of RVNN are based on the modeling of active devices such as a transistor [23]. To advance learning algorithms, the instance selection algorithm based on cross-validation [25] and the MapReduce and voting mechanism (MRVIS) [27] have been applied only to large data sets and compared with some classical algorithms in terms of learning speed and selection ratio. The RVNN also cannot process phase information during learning and generalization of mappings on the plane [3, 7]. The complex-valued neural networks (CVNN) with nonparametric activation functions [28] can readily process two-dimensional information with phase as a single number, which leads to a drastic reduction in network complexity along with better performance.
Neural networks for three-dimensional information, however, still need exhaustive investigation. Applications with three-dimensional information are common in computer vision, robotics, biometrics, bioinformatics, etc. A few researchers have attempted machine learning with three-dimensional information by treating it as a vector [8, 9]. The corresponding learning algorithms place restrictions on the weight matrix, and a vector does not provide the freedom that a complex number does in CVNN [9]. Thus, there is a strong need for a neural network that can readily process different high-dimensional parameters as numbers and can be easily incorporated into various applications of intelligent machine design, like CVNN [2, 3–6]. In the development of higher-order number systems, the complex numbers (2D), quaternions (4D), octonions (8D), and sedenions (16D) were developed by mathematicians in the past, but there is no number system in three dimensions [10]. The studies [1–3, 7] also show that the CVNN has outperformed the RVNN even for real-valued problems; therefore we propose to exploit quaternions in neural networks to process three- and four-dimensional problems.
Neurocomputing with high-dimensional number systems can overcome the learning and generalization difficulties of large conventional neural networks and lead to lower complexity. The quaternion is a hyper-complex number initially introduced by the Irish mathematician Hamilton [11]. It has been extensively employed in the fields of quantum mathematics, physics, computer graphics, signal processing, and control [12, 17]. This number system has recently entered neural networks through quaternionic neurons, analogous to complex- or real-valued neurons, to develop efficient machine learning in higher dimensions. A few attempts have been made in this direction: the orthogonal decision boundary of a single quaternionic neuron has been used to solve the 4-bit parity problem in [18]; the quaternionic MLPs proposed in [15] suffer from the existence of singularities; quaternion-valued algorithms are proposed for adaptive filtering in [16, 17]; and basic work on a quaternionic-valued neural network with a sigmoid activation function is presented in [14, 19]. In this paper, we not only present a simple, straightforward, yet capable machine learning algorithm for a sufficiently general structure of the quaternionic domain neural network (QDNN), but also evaluate it over a wide spectrum of applications, such as function approximation, motion interpretation, and recognition in space. The parameters of the QDNN, such as synaptic weights, biases, input-output signals, and internal potentials, are quaternions and are represented as quaternion matrices in a multi-layered neural network. Although Hamilton proposed quaternionic numbers (
This paper investigates the general structure of the QDNN and its learning algorithm through simulation on benchmark problems from different application areas. The rest of this paper is organized as follows: Section 2 presents a complete machine learning framework with pseudo code for learning in the quaternionic domain. Section 3 evaluates the learning and generalization capability through function approximation, linear transformations, and 3D face recognition. The conclusion and future scope of the work are presented in Section 4.
Machine learning in quaternionic domain
The quaternionic number system is a straightforward extension of the real and complex number systems, in which four components are incorporated in a single number; the first component acts as the real part and the other three as imaginary parts with unit vectors (
All bold-type letters denote either quaternionic variable or quaternionic matrix. The conjugate of quaternionic variable (
The learning algorithm incorporates the basic operations of quaternion algebra [11, 12]. The addition and subtraction of two quaternionic matrices
The norm of quaternionic matrix
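The basic quaternion operations referred to above can be sketched in a few lines of Python. This is a minimal illustration, assuming a quaternion is stored as a 4-tuple (real part first, then the three imaginary parts); the function names `q_add`, `q_mul`, `q_conj`, and `q_norm` are hypothetical and not taken from the paper. For quaternionic matrices, addition and conjugation would apply element-wise.

```python
import math

def q_add(p, q):
    """Component-wise sum of two quaternions stored as 4-tuples."""
    return tuple(a + b for a, b in zip(p, q))

def q_mul(p, q):
    """Hamilton product p*q (non-commutative)."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,  # real part
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,  # i component
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,  # j component
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,  # k component
    )

def q_conj(q):
    """Conjugate: negate the three imaginary components."""
    a, b, c, d = q
    return (a, -b, -c, -d)

def q_norm(q):
    """Euclidean norm of the four components."""
    return math.sqrt(sum(x * x for x in q))

# The unit vectors do not commute: i*j = k, whereas j*i = -k.
i, j = (0, 1, 0, 0), (0, 0, 1, 0)
print(q_mul(i, j), q_mul(j, i))
```

The product of a quaternion with its conjugate is purely real and equals the squared norm, which is the identity that norm-based error measures rely on.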
A three-layered (L-M-N) quaternionic domain neural network (QDNN) has L inputs, with M and N quaternionic neurons in the hidden and output layers respectively. All input, output, weight, and bias signals are treated as quaternionic matrices, as represented in Equation (1). The derivation of the optimization technique uses the basic operations of quaternion algebra, which give a compact and generalized derivation of the backpropagation algorithm (QDBP) for the three-layered network.
Forward pass
Let us consider
The matrix of inputs (I) at the input layer of the network is defined by $I = [I_1\; I_2\; I_3\; \dots\; I_L]^T$. The initialization of synaptic connection weights
The internal potential matrix (
The internal potential matrix (
In order to develop a QDNN based learning machine, we present the derivation of the error-backpropagation learning algorithm in quaternion domain (QDBP) through minimization of average mean square error (E) of the network:
The update equations for the weight and bias matrices are obtained by applying gradient descent optimization to the mean square error (E). The weight update matrix (Δ
For simplicity and better understanding, we further present an algorithm QDNN_TRAIN(.) for training the quaternionic domain neural network (QDNN), which is built from the procedures QDNN_INIT(.), QDNN_FORWARD(.) and QDNN_BACKWARD(.). The learning and generalization ability of a three-layered neural structure is obtained by optimizing the mean square error. The procedure QDNN_INIT(.) randomly initializes the weight and bias matrices of the network. It calls the RANDOM_QM(a, b) procedure, which randomly generates the quaternionic matrix of each interconnection weight and neuron bias in the range from a to b. The QDNN_FORWARD(.) procedure implements the forward pass of the QDNN and hence generates the internal potentials (
The ACTIVATION_FUNCTION(.) limits the output of the corresponding neuron of the network. To update the weight and bias matrices, QDNN_BACKWARD(.) implements the backward pass of the QDNN. All required procedures are presented in pseudo code as follows:
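As an illustration of how the initialization and forward-pass procedures fit together, the following is a minimal Python sketch, not the paper's pseudo code. It assumes a split-type activation (tanh applied to each quaternion component), weights that multiply inputs from the left, and a squared-error measure averaged over output neurons; none of these choices is confirmed by the text above. The lowercase function names mirror the procedure names but are hypothetical, and the backward pass is omitted because its update equations are not reproduced here.

```python
import math
import random

def q_add(p, q):
    return tuple(a + b for a, b in zip(p, q))

def q_mul(p, q):
    """Hamilton product of two quaternions stored as 4-tuples."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
            a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
            a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
            a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2)

def random_qm(rows, cols, a, b, rng):
    """Quaternionic matrix with every component drawn uniformly from [a, b]."""
    return [[tuple(rng.uniform(a, b) for _ in range(4)) for _ in range(cols)]
            for _ in range(rows)]

def activation_function(q):
    # ASSUMPTION: split-type activation, tanh on each component.
    return tuple(math.tanh(x) for x in q)

def qdnn_init(n_in, n_hidden, n_out, a=-1.0, b=1.0, seed=0):
    """Random weights and biases for a three-layered (L-M-N) network."""
    rng = random.Random(seed)
    return {
        "w_hidden": random_qm(n_hidden, n_in, a, b, rng),
        "b_hidden": random_qm(n_hidden, 1, a, b, rng),
        "w_out": random_qm(n_out, n_hidden, a, b, rng),
        "b_out": random_qm(n_out, 1, a, b, rng),
    }

def layer_forward(weights, biases, inputs):
    outputs = []
    for w_row, b_row in zip(weights, biases):
        potential = b_row[0]  # internal potential starts at the bias
        for w, x in zip(w_row, inputs):
            # ASSUMPTION: the weight multiplies the input from the left.
            potential = q_add(potential, q_mul(w, x))
        outputs.append(activation_function(potential))
    return outputs

def qdnn_forward(net, inputs):
    hidden = layer_forward(net["w_hidden"], net["b_hidden"], inputs)
    return layer_forward(net["w_out"], net["b_out"], hidden)

def squared_error(outputs, targets):
    # ASSUMPTION: squared quaternion norm of the error, averaged over neurons.
    total = sum((o - t) ** 2
                for out, tgt in zip(outputs, targets)
                for o, t in zip(out, tgt))
    return total / len(outputs)

net = qdnn_init(2, 6, 2)
out = qdnn_forward(net, [(0.0, 0.1, 0.2, 0.3), (0.0, 0.0, 0.0, 0.0)])
print(len(out), "output quaternions")
```

Because quaternion multiplication is non-commutative, the side on which the weight multiplies the input is a genuine design choice, and a backward pass must be derived consistently with it.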
Performance evaluation of learning machine through benchmark problems
In this section, we evaluate the effectiveness of the learning machine on a wide spectrum of benchmark problems: function approximation, linear transformations, and 3D face recognition. The components of all quaternionic weights and biases are randomly initialized in the range –1 to 1. The quaternionic variable
Function approximations
The Lorenz system
Comparison of training and testing performance for Lorenz system

3D plot of the Lorenz system tested by the QDNN network trained through QDBP.
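To make the prediction task concrete, the sketch below generates a Lorenz trajectory and turns it into one-step-ahead training pairs, with each 3D state encoded as a pure quaternion (zero real part). The parameter values σ = 10, ρ = 28, β = 8/3, the step size, the initial state, and the fourth-order Runge-Kutta integrator are the commonly used textbook choices and are assumptions here; the settings actually used in the experiment are not given in the text above. All function names are hypothetical.

```python
def lorenz_rhs(state, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Right-hand side of the Lorenz equations (assumed classic parameters)."""
    x, y, z = state
    return (sigma * (y - x), x * (rho - z) - y, x * y - beta * z)

def rk4_step(f, state, dt):
    """One classical fourth-order Runge-Kutta step."""
    k1 = f(state)
    k2 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)))
    k3 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)))
    k4 = f(tuple(s + dt * k for s, k in zip(state, k3)))
    return tuple(s + dt / 6.0 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def lorenz_series(n_steps, dt=0.01, start=(1.0, 1.0, 1.0)):
    """Trajectory of n_steps + 1 states, starting from an assumed initial state."""
    states = [start]
    for _ in range(n_steps):
        states.append(rk4_step(lorenz_rhs, states[-1], dt))
    return states

def to_pure_quaternion(point):
    """Encode a 3D point as a quaternion with zero real part."""
    return (0.0,) + tuple(point)

def one_step_pairs(states):
    """(input, target) pairs: the state at step t and the state at step t + 1."""
    qs = [to_pure_quaternion(s) for s in states]
    return list(zip(qs[:-1], qs[1:]))

pairs = one_step_pairs(lorenz_series(1000))
print(len(pairs), "training pairs")
```

In practice the states would likely be rescaled into the output range of the activation function before training; that normalization is left out of this sketch.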
Chua’s circuit is the simplest autonomous electronic circuit containing resistors, capacitors, and inductors that exhibits chaotic behavior under specific parametric conditions [22]. It satisfies the criteria for chaos: it contains one or more non-linear elements, one or more locally active resistors, and three or more energy storage devices. It uses one Chua’s diode as the non-linear element, one locally active resistor, and two capacitors and one inductor as energy storage devices. The dynamics of Chua’s circuit are governed by three state equations as
Comparison of training and testing performance for Chua’s circuit

Testing through the QDNN network trained by QDBP for Chua’s circuit.
In order to evaluate the performance of QDNN, we have considered a three-layered neural structure (2-M-2). This section presents the learning of linear transformations (rotation, scaling, and translation and their combinations) by QDNN through a few sets of points on the line and generalization over complicated 3D objects. Each quaternionic variable
For training a three-layered 2-6-2 QDNN, all experiments use a straight line in space containing a small number of input data points (21 points) and a reference point (the midpoint). Each point (x, y, z) on the line is fed to the first input, and the reference point $(x_r, y_r, z_r)$ is fed to the second input. Including the reference point gives the learning system more information, which yields better accuracy. Similarly, the first and second neurons of the output layer produce the transformed point (x′, y′, z′) on the line and the transformed reference point
The 2-6-2 QDNN is trained on a scaling transformation with scaling factor 1/2 through input-output mapping over a 3D line containing 21 points, with (0, 0, 0) as the reference point, as shown in Fig. 3(a). The convergence of the mean square error (Fig. 3(b)) shows the learning capability of the proposed network: with a learning rate of 0.00005, training converges to MSE = 1.005567e-05 after 20000 iterations. The trained network generalizes over more complicated standard geometric structures, namely a sphere (4141 data points), a cylinder (2929 data points), and a torus (10201 data points), as presented in Fig. 4(a-c) respectively.
(a) Training input-output mapping for scaling with scaling factor 1/2; (b) Convergence of mean square error.
Testing results from similarity transformation over (a) sphere, (b) cylinder, and (c) torus.
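The training set for the scaling experiment can be sketched as follows: 21 points on a straight line, each paired with the midpoint as a reference, and both mapped to their scaled images. The line endpoints used here are hypothetical (the text gives only the number of points and the reference point (0, 0, 0)), and encoding each 3D point as a pure quaternion is an assumption, as are all function names.

```python
def line_points(start, end, n=21):
    """n equally spaced points from start to end (both inclusive)."""
    return [tuple(s + (e - s) * i / (n - 1) for s, e in zip(start, end))
            for i in range(n)]

def scale(point, factor):
    return tuple(factor * c for c in point)

def pure(point):
    """Encode a 3D point as a quaternion with zero real part (assumption)."""
    return (0.0,) + tuple(point)

def scaling_dataset(start, end, factor=0.5, n=21):
    """List of (inputs, targets) samples for a two-input, two-output network.

    inputs  = [point on the line, reference point]
    targets = [scaled point, scaled reference point]
    """
    pts = line_points(start, end, n)
    ref = pts[n // 2]  # midpoint of the line serves as the reference point
    samples = []
    for p in pts:
        inputs = [pure(p), pure(ref)]
        targets = [pure(scale(p, factor)), pure(scale(ref, factor))]
        samples.append((inputs, targets))
    return samples

# Hypothetical endpoints, chosen so that the midpoint is (0, 0, 0).
samples = scaling_dataset((-0.5, -0.5, -0.5), (0.5, 0.5, 0.5))
print(len(samples), "samples; first target:", samples[0][1][0])
```

Each sample feeds two quaternions into the network, which matches the two inputs and two outputs of the 2-6-2 structure described above.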

The 2-6-2 QDNN is next trained on a combination of scaling (scaling factor 1/2) and translation (0.3 units in the positive Y-direction) through input-output mapping over a 3D line containing 21 data points referenced at (0, 0, 0), as shown in Fig. 5(a). The convergence curve in Fig. 5(b) shows that, with a learning rate of 0.00005, the network reaches a mean square error of 2.58514e-05 after 20000 iterations. The trained network generalizes well over more complicated standard geometric structures, namely a sphere (4141 data points), a cylinder (2929 data points), and a torus (10201 data points), as shown in Fig. 6(a-c) respectively.
(a) Training patterns: input-output mapping shows transformation with scaling factor 1/2, followed by translation by 0.3 units in the positive Y-direction; (b) Convergence of mean square error.
Testing results from similarity transformation through (a) sphere, (b) cylinder, and (c) torus.

The QDNN is then trained on a general linear transformation (scaling factor 1/2, counterclockwise rotation about the X-axis by π/2 radian, and translation by (0, 0, 0.3)) through input-output mapping over a straight line with reference point (0, 0, 0), as shown in Fig. 7(a). The 2-6-2 QDNN model is trained on these transformations using 21 data points on a straight line. With a learning rate of 0.00005, the mean square error converges to 1.0e-04 after 20000 iterations, as shown in Fig. 7(b). The trained network also generalizes over more complicated standard geometric structures, namely a sphere (4141 data points), a cylinder (4141 data points), and a torus (10201 data points), as shown in Fig. 8(a-c) respectively.
(a) Training mapping patterns through a straight line (scaling factor 1/2, counterclockwise rotation about the X-axis by π/2 radian, and translation by (0, 0, 0.3)); (b) Square error during training on the straight-line pattern.
Generalization of a linear transformation (scaling factor 1/2, counterclockwise rotation about the X-axis by π/2 radian, and translation by (0, 0, 0.3)) over (a) sphere, (b) cylinder, and (c) torus.
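The target transformation of this experiment can be written down directly, which is useful for producing the expected outputs on test shapes. The sketch below expresses the rotation with the standard unit-quaternion sandwich product q p q*, then applies it to a sphere sampled on a latitude-longitude grid. The order of operations (scale, then rotate, then translate), the unit sphere radius, and the 41 × 101 grid (which yields 4141 points) are assumptions, and all names are hypothetical; the network itself learns the mapping from data rather than using this formula.

```python
import math

def q_mul(p, q):
    """Hamilton product of two quaternions stored as 4-tuples."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
            a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
            a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
            a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2)

def rotate_about_x(point, angle):
    """Counterclockwise rotation about the X-axis via q * p * conj(q)."""
    q = (math.cos(angle / 2), math.sin(angle / 2), 0.0, 0.0)
    q_conj = (q[0], -q[1], -q[2], -q[3])
    p = (0.0,) + tuple(point)  # embed the point as a pure quaternion
    rotated = q_mul(q_mul(q, p), q_conj)
    return rotated[1:]  # drop the (zero) real part

def transform(point, factor=0.5, angle=math.pi / 2, shift=(0.0, 0.0, 0.3)):
    # ASSUMPTION: scale first, then rotate, then translate.
    scaled = tuple(factor * c for c in point)
    rotated = rotate_about_x(scaled, angle)
    return tuple(r + s for r, s in zip(rotated, shift))

def sphere_points(n_lat=41, n_lon=101, radius=1.0):
    """Latitude-longitude grid on a sphere (grid sizes are assumptions)."""
    pts = []
    for i in range(n_lat):
        theta = math.pi * i / (n_lat - 1)
        for j in range(n_lon):
            phi = 2 * math.pi * j / (n_lon - 1)
            pts.append((radius * math.sin(theta) * math.cos(phi),
                        radius * math.sin(theta) * math.sin(phi),
                        radius * math.cos(theta)))
    return pts

targets = [transform(p) for p in sphere_points()]
print(len(targets), "transformed sphere points")
```

Under these assumptions the point (0, 1, 0) is scaled to (0, 0.5, 0), rotated onto the Z-axis, and shifted to approximately (0, 0, 0.8).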

All transformation experiments demonstrate the intelligent behavior of the QDNN for motion interpretation of 3D objects. Further, this experiment points to a way of generalizing motion in intelligent system design for a variety of operations.
This section presents a basic experiment; although the data set is small, it has wide implications for the applicability of the proposed learning machine to 3D recognition. The proposed method has considerable potential for successful recognition under variable head position, orientation, and facial expression. Two experiments are conducted here to learn and classify point cloud data of 3D faces using the proposed quaternionic domain backpropagation algorithm. A simple (1-2-1) QDNN with a single input and output performs the experiments using only two quaternionic neurons in the hidden layer.
The first experiment is performed on a dataset containing five faces of the same person (4654 point-cloud data points) with different orientations and poses; the QDNN is trained on one face (Fig. 9a) and tested on all faces. Table 3 presents the testing MSE (mean square error) of all five faces; the values are comparable, which indicates that they are faces of the same person irrespective of variations in face orientation and pose. This shows the straightforward learning and generalization ability of a simple QDNN, which is not possible with an RVNN.
Five 3D faces of the same person with different orientations and poses.
Comparison of testing MSE of faces of the same person with different orientations (training MSE = 0.0001)
Similarly, the second experiment is performed on a dataset containing five faces of different people (6397 point-cloud data points); the QDNN is trained on one face (Fig. 10a) and tested on all faces. Table 4 presents the testing MSE of each face obtained from the trained network, which shows that the MSE of the other four faces is much higher than that of the face (Fig. 10a) used in training. This demonstrates that the simple QDNN correctly distinguishes faces of the same person from faces of different persons. It again reveals the learning and generalization capability of the proposed learning machine, which a real-valued neural network lacks.
Five 3D faces of different persons.
Comparison of testing MSE of faces of different persons (training MSE = 0.0001)
In this paper, we present an efficient and generalized learning machine for high-dimensional problems and evaluate it on a variety of problems from different areas. The proposed neural network, with its learning algorithm in the quaternionic domain, directly processes three- or four-dimensional data without handling the individual components and the phase information among them separately. A quaternion is a number that carries the magnitude of the intended components, with the phase information of each component embedded in it. Thus, the quaternionic domain neural network (QDNN) leads to a simple network structure, efficient learning, and better performance, whereas the conventional real-valued neural network (RVNN) deals with individual components and hence needs a large topology, learns slowly, and performs poorly. Apart from that, the RVNN and the complex-valued neural network (CVNN) do not work for problems that require learning and generalizing phase information, such as 3D object recognition and motion or transformation of objects in 3D space. It is worth mentioning again that the proposed machine learns a composition of transformations through input-output mapping over a line containing a small set of points and generalizes this motion over complex geometric structures such as a sphere, a cylinder, and a torus. Although the recognition problem presented for 3D imaging is small and basic, it is very encouraging for prospective researchers because of the network's simplicity, faster convergence, and results.
