Abstract
Isometric feature mapping (ISOMAP) is one of the classical methods of nonlinear dimensionality reduction (NLDR) and seeks for low dimensional (LD) structure of high dimensional (HD) data. However, the inverse problem of ISOMAP has never been investigated, which recovers the HD sample from the related LD sample, and its application prospect in data representation, generation, compression and visualization will be very brilliant. Because the inverse problem of ISOMAP is ill-posed and undetermined, the sparsity of HD data is employed to reconstruct the HD data from the corresponding LD data. The theoretical architecture of sparse reconstruction of ISOMAP representation comprises the original ISOMAP algorithm, learning algorithm of sparse dictionary, general ISOAMAP embedding algorithm and sparse ISOMAP reconstruction algorithm. The sparse ISOMAP reconstruction algorithm is an optimization problem with sparse priors of the HD data, which is resolved by the alternating directions method of multipliers (ADMM). It is uncovered from the experimental results that, in the case of very LD ISOMAP representation, the proposed method outperforms the state-of-the-art methods, such as discrete cosine transformation (DCT) and sparse representation (SR), in the reconstruction performance of signal, image and video data.
Keywords
Introduction
Currently, human society has already entered the epoch of big data. The explosive increasing of various big data is emerging, such as the big data of text, speech, image, video, sensor, media, network, biology, medicine, industry and business. According to the authoritative prediction, the quantity of the big data will reach 44 zettabytes by 2020 [1]. Being faced with the dramatically surging of big data, the classical processing algorithms of big data stand a severe challenge. It has become one of the urgent requirements of the present development of science and technology to study the new category and high efficiency methods of big data acquisition, representation, compression, transmission and storage [2].
Contemporarily, human society is stepping into the era of artificial intelligence, which has already changed, is now changing, and will continue to change the ways of our learning, working, and living [3]. The subdisciplines of artificial intelligence, including the fundamentals of artificial intelligence, machine learning, pattern recognition, natural language processing, knowledge representation and intelligent system, make a leap-forward development, and provide powerful support of theory and technology for big data processing [4].
Nonlinear dimensionality reduction (NLDR) or manifold learning (ML) is an important branch of machine learning, can map high dimensional (HD) space data into low dimensional (LD) space data, and achieves extensive theoretical researches and real applications in big data representation, classification, recognition and visualization [5]. The inverse problem of NLDR refers to the rebuilding of the HD sample from the related LD sample, and holds highly theoretical value and tremendous application prospect in big data representation, generation, compression and visualization [6]. The inverse problem of NLDR in very LD space is worthy of more attention, because it can offer amazing capability of data compression. For instance, firstly, a data point in HD space is projected to a data point in 1D or 2D LD space by NLDR; secondly, the HD point is reconstructed from the related LD point by resolving the inverse problem of NLDR; finally, extremely high data compression ratio can beattained.
Because the dimension of LD space sample is far less than that of HD space sample and the LD sample loses information compared with the HD sample, the inverse problem of NLDR is an ill-posed and undetermined one, which is hard to resolve [7]. The efficient method to resolve the ill-posed problem is to fully take advantage of the priors of HD data, such as the sparse priors of HD data and deep learning priors of HD data.
This paper focuses on isometric feature mapping (ISOMAP) and its inverse problem [8]. ISOMAP is one of the traditional NLDR algorithms, and it attempts to find intrinsic LD representation of HD data by preserving the geodesic distances between neighbors in both HD space and LD space. The inverse problem of ISOMAP will be solved depending on the sparse priors of HD data.
The main contributions of this paper consist of the theoretical architecture of ISOMAP reconstruction, including original ISOMAP, sparse dictionary learning, embedding and reconstruction modules, and the sparse ISOMAP reconstruction algorithm based on sparse priors of HD data and the alternating directions method of multipliers (ADMM) [9].
This paper is an extended version of our earlier conference paper [10]. It focuses on ISOMAP algorithm, but the conference paper concentrates on general NLDR methods and takes ISOMAP as an example. This submission rewrites and enhances the whole conference paper, including the sections of introduction, related-work, theory, experiments and conclusions. Especially, this paper elaborates the sparse ISOMAP reconstruction algorithm, which is omitted in the conference paper. In addition, this submission utilizes new data sets in the experimental section, compared with the conference paper.
The rest part of this paper is arranged as follows. The related-work of the proposed method is summarized in section 2, the theoretical fundamentals are elaborated in section 3, the experimental evaluation is conducted in section 4, and the conclusions are drawn in section 5.
Related-work
ISOMAP is a mature NLDR algorithm and its improvement algorithms and new applications continue to spring up [10–19]. The improvement ISOMAP algorithms are listed in references [10–14]. Zhang Yan proposes a semi-supervised local multi-manifold ISOMAP learning framework by linear embedding, that can apply the labeled and unlabeled training samples to perform the joint learning of neighborhood preserving local nonlinear manifold features and a linear feature extractor [10]. Shi Hao presents Landmark-ISOMAP to improve the scalability of ISOMAP [11]. Qu Taiguo raises the Dijkstra’s algorithm with Fibonacci heap to speed up ISOMAP [12]. Yang Bo puts forward multi-manifold discriminant ISOMAP for visualization and classification [13]. Liang Dong brings forward an extensive landmark selection method to enhance both efficiency and representational capability of ISOMAP [14]. The new applications of ISOMAP algorithm are enumerated in references [15–19]. Liu Qingchao puts forward an approach for predicting urban road traffic states based on ISOMAP manifold learning [15]. Xu Kang-Kang presents an ISOMAP-based spatiotemporal modeling for Lithium-Ion battery thermal process [16]. Kanishka Ganvir raises a watershed classification method using ISOMAP technique and hydrometeorological attributes [17]. Li Ming-ai brings forward an adaptive feature extraction method of motor imagery electroencephalograph (EEG) with optimal wavelet packets and supervised and explicit mapping ISOMAP (SE-ISOMAP) [18]. Kashniyal Jyoti comes up with a progressive ISOMAP algorithm for wireless sensor networks localization [19].
However, the inverse problem of ISOMAP has never been studied by now. Actually, the inverse problem of NLDR has also never been fully investigated. In recent years, only one journal paper involves the inverse problem of NLDR, which adopts radial basis function interpolation to resolve the inverse problem of general bi-Lipschitz nonlinear dimensionality reduction mapping [6]. Its major disadvantages comprise, dense sampling in HD space is needed, the dimension of LD space cannot be less than the intrinsic dimension of HD data, lossless reconstruction of HD data cannot be gained, and the inverse problem in very LD space has never beenresearched.
The solutions to the inverse problems in other disciplines can be borrowed to resolve the inverse problem of ISOMAP. There are many inverse problems in different disciplines, such as imaging, compressed sensing, super resolution, denoising, deblurring, enhancement, inpainting, recovery and rebuilding [20]. The inverse problems of imaging consist of magnetic resonance imaging (MRI), computer tomography (CT), holographic imaging and ultrasonic imaging, which study the reconstruction of HD image data from LD measurement data [21]. The inverse problem of compressed sensing researches the signal recovery problem of speech, image, video and sensor, under the condition that the sampling rate is far less than the Nyquist sampling frequency, and recovers HD original signal from LD measurement sample [22]. The inverse problem of super resolution investigates the resolution promotion of optical, infrared, radar and hyperspectral images, and obtains high resolution HD sample from related low resolution LD sample [23]. The solutions to these inverse problems make use of different priors of original data, such as sparse priors and deep learning priors of HD data.
The priors of HD data include total variation, sparsity, low-rank and locality constraint [24–30]. Total variation refers to the integral of the absolute value of signal gradients [24]. Because of the increasing of the total variation of noisy signal, the minimization of total variation can remove the noise and keep the signal details. Sparsity means a signal in the real world can be represented by a linear combination of limited basis signals, with minimal non-zeros coefficients of the linear combination [25]. The property of low-rank is referred to as a signal in the real world can also be expressed by a linear combination of limited basis signals, maintaining the global features of the original signal [26]. The property of locality constraint refers to a signal in the real world can likewise be approximated to a linear combination of limited basis signals, retaining the local features of the original signal [27]. For example, sparsity and low-rank priors based parallel MRI [28], low-rank and locality constraint priors based robust super resolution of face image [29], and sparsity and locality constraint priors based super resolution of infraredimage [30].
In summary, this paper concentrates on the inverse problem of ISOMAP with sparse priors.
Theory
Recall of classical ISOMAP
The classical or original ISOMAP algorithm searches an LD embedded nonlinear manifold within the HD space. The input data of the ISOMAP algorithm is described by the following matrix:
The classical ISOMAP algorithm comprises the following three steps. Neighborhood Graph
Construct a neighborhood graph
The neighborhood matrix Shortest Paths
Compute the shortest paths using Dijkstra’s algorithm or Floyd’s algorithm. The shortest paths matrix Multidimensional Scaling
Find the LD representation of the input data utilizing multidimensional scaling (MDS). For MDS, it is needed to construct matrix
Let:
From formulas (7) and (8), the following equation can be derived:
Extracting the d largest positive eigenvalues λ1, λ2, …, λ
d
and the d corresponding eigenvectors
The residual variance of the ISOMAP algorithm is defined by the following expressions:
The proposed theory architecture for the reconstruction of ISOMAP representation contains two progressive models: the general ISOMAP reconstruction architecture and the sparse ISOMAP reconstruction architecture. The general ISOMAP reconstruction architecture is illustrated in Fig. 1, where: I0 is the classical ISOMAP algorithm; I
e
is the general ISOMAP embedding algorithm, which will be discussed later; I
r
is the general ISOMAP reconstruction algorithm, which will be introduced later;

The general ISOMAP reconstruction architecture.
In order to promote the reconstruction performance of the general ISOMAP reconstruction architecture by making use of sparse priors of HD data, the sparse ISOMAP reconstruction architecture is shown in Fig. 2, where: D0 is the sparse dictionary learning algorithm, which will be described later;

The sparse ISOMAP reconstruction architecture.
The training data pair, which comprises the HD data and the corresponding LD data, is described by the following formulas:
For a given HD testing sample, its LD embedding can be attained by general ISOMAP embedding algorithm, which tries to retain the distances from the LD embedding sample to its neighbors as those from the corresponding HD sample to its neighbors, and is depicted by the subsequent minimization problem:
For a given LD testing sample, its HD reconstruction can be achieved by general ISOMAP reconstruction algorithm, which attempts to maintain the distances from the HD reconstructed sample to its neighbors as those from the related LD sample to its neighbors, and is expressed by the following minimization problem:
Because the general ISOMAP algorithm cannot provide a satisfied reconstruction of original HD data, the sparse priors of HD data are considered to improve the reconstruction quality. According to the training data pair, a sparse dictionary is learned by the following minimization problem:
Relying on the aforementioned sparse dictionary of HD data, the sparse ISOMAP reconstruction algorithm can be described by the following minimization problem:
The abovementioned constrained minimization problem can be rewritten as the subsequent two alternative constrained minimization problems:
They can be further rewritten as the following two alternative unconstraint minimization problems:
The derivative of φ
sr
with respect to
The derivative of φ
s
with respect to
Hence, the foregoing alternative unconstrained minimization problems can be resolved by the following alternative standard gradient descent procedures:
Finally, the sparse reconstruction sample of ISOMAP representation is gained by the following formula:
The description of the sparse ISOMAP reconstruction algorithm is listed in Table 1.
The description of the sparse ISOMAP reconstruction algorithm
The description of the sparse ISOMAP reconstruction algorithm
Experimental conditions
Three experiments are designed to evaluate the performance of the proposed method. The first is the experiment of image reconstruction, the second is the experiment of video reconstruction, and the third is the experiment of signal reconstruction.
The experimental images come from the University of Southern California - Signal and Image Processing Institute (USC-SIPI) image database (http://sipi.usc.edu/database/database.php). It has four volumes: Textures, Aerials, Miscellaneous, and Sequences. The first three volumes are employed for the experiment of image reconstruction, and the last volume is applied to the experiment of video reconstruction. All color images are converted into grayscale images, and the image size is adjusted to 256×256. Each image is partitioned into small blocks with same size 8×8.
The experimental videos are also derived from the testing sequences of highly efficient video coding (HEVC) [31]. The HEVC testing sequences comprises five subsets: A, B, C, D, and E, and only subset D is adopted for the experiment. All color frames are transformed into grayscale frames, and the frame size is 240×416. Each frame is divided into small patches with the same size 8×8.
The experimental signals are chosen from the Apnea-Electrocardiograph (Apnea-ECG) Database (https://www.physionet.org/physiobank/database/apnea-ecg/) [32, 33]. Only 20 records, a01-a20, are selected for experiment from the total 70 records. Each record is divided into fragments with same size, and each fragment holds 90 sampling data points, which is corresponding to a cycle of ECG signal. Each fragment is normalized by peak alignment. For each record, the first 1024 chips are utilized as training samples, and the second 1024 chips are utilized as testing samples.
The proposed method is compared with two state-of-the-art methods of image and video coding: sparse representation (SR) [34] and discrete cosine transformation (DCT) [35]. The SR algorithm is K-SVD, the size of sparse dictionary is 256, the sparse level is arbitrary, the number of iterations is 20, the stop criterion is a given representation error, and the algorithm of sparse coefficients estimation is orthogonal matching pursuit (OMP) [34]. The DCT algorithm is direct discrete cosine transform and dictionary learning of DCT is not exploited [35].
This paper is devoted to the situation of LD representations of HD data, such as 1D, 2D and 3D LD representations. For the proposed method, one, two or three LD ISOMAP representations of image patches are employed for reconstruction. For SR, the most important one, two or three sparse coefficients of image patches are preserved for reconstruction. For DCT, the most important one, two or three DCT coefficients of image patches are retained for reconstruction.
The experimental hardware platform is a laptop computer with 2.6 GHz CPU and 8GB memory, and the experimental software platform is Matlab 2016a on Windows 10.
Experiment of image reconstruction
The experiment of image reconstruction is devised as self embedding and reconstruction. That is to say: an image is utilized to establish a training set and then the same image is rebuilt from the training set. The reason for this design is that, we expect to preliminarily verify the reconstruction performance of the proposed method. The usage of different training and testing samples is probed in the following experiment of video reconstruction.
The experimental results of the USC-SIPI image database are listed in Tables 2–4, which respectively enumerate the results of the aerials volume, miscellaneous volume and textures volume. It is revealed in Tables 2–4 that, the proposed method is superior to DCT and SR in the peak signal to noise ratio (PSNR) of image reconstruction, in the situation of the 2D and 3D LD representations. It is also indicated in Tables 2–4 that, the average PSNRs of miscellaneous volume are higher than those of aerials volume, in the case of 2D and 3D LD representations. This is because that the images in aerials volume hold more details than those of miscellaneous volume. It is further reflected in Tables 2–4 that, the average PSNR of textures volume is higher than that of miscellaneous volume, in the situation of 3D LD representation. This is due to that fact that, the images of textures volume possess more repeated patches than those of miscellaneous volume and the ISOMAP algorithm depends on the similarities between image patches, although the former holds more details than the latter.
The experimental results of the aerials volume of USC-SIPI image database
The experimental results of the aerials volume of USC-SIPI image database
The experimental results of the miscellaneous volume of USC-SIPI image database
The experimental results of the textures volume of USC-SIPI image database
The reconstruction images of Tables 2–4 in the situation of 3D LD representations are respectively illustrated in Figs. 3–5, where the proposed method owns higher reconstruction performance than DCT and SR.

The first four reconstruction images of the aerials volume of USC-SIPI image database from left to right: original image, reconstruction image of DCT, reconstruction image of SR, reconstruction image of the proposed method.

The first four reconstruction images of the miscellaneous volume of USC-SIPI image database from left to right: original image, reconstruction image of DCT, reconstruction image of SR, reconstruction image of the proposed method.

The first four reconstruction images of the textures volume of USC-SIPI image database from left to right: original image, reconstruction image of DCT, reconstruction image of SR, reconstruction image of the proposed method.
It is indicated by the experimental results that, if the training samples and testing samples are very similar, even the same, the proposed method can reach excellent reconstruction performance. It can also be discovered by the experimental results that, DCT surpasses SR in reconstruction performance. This may be due to the fact that, the implementation of DCT is not based on a trained DCT dictionary but on the direct DCT. SR is image patch set oriented and is the optimal solution to a set of image patches, hence more sparse coefficients may be needed to recover a given image patch. Inversely, DCT is image patch oriented and is the best solution to a given image patch, thus fewer DCT coefficients may be required to reconstruct theimage patch.
The major weakness of the experiment of image reconstruction is that, the training and testing sample sets are the same. On the one hand, the experimental results of image reconstruction reflect the recovery capability of the proposed method to some extent. On the other hand, the following experiment of video reconstruction employs different training and testing sample sets and can further prove the performance of the proposed method. Another shortcoming of the proposed method is that, the performance promotion is at the expense of huge computation load. However, the proposed method can be exploited by a non-real-time system, which needs high reconstruction performance.
Therefore, in the situation of same training and testing samples, the proposed method achieves perfect reconstruction performance with 2D or 3D LD ISOMAP representations.
The experiment of video reconstruction is designed to testify the performance of the proposed method with different training and testing samples. The first frame of a video is employed as the training samples, and the second frame of the video is adopted as the testing samples.
The experimental results of the sequences volume of USC-SIPI image database is enumerated in Table 5, and the experimental results of subset D of the HEVC testing sequences is listed in Table 6. It is shown in Tables 5 and 6 that, the proposed method overmatches DCT and SR in reconstruction performance in the case of 2D and 3D LD low representations. The average PSNRs of USC-SIPI sequences are lower than those of HEVC sequences. This is owing to the fact that, USC-SIPI sequences possess more details than HEVC sequences.
The experimental results of the sequences volume of USC-SIPI image database
The experimental results of the sequences volume of USC-SIPI image database
The experimental results of subset D of the HEVC testing sequences
The reconstruction frames of Tables 5 and 6 in the situation of 3D LD representations are respectively illustrated in Figs. 6 and 7, where the proposed method holds higher reconstruction performance than DCT and SR.

The reconstruction frames of the sequences volume of USC-SIPI image database from left to right: original frame, reconstruction frame of DCT, reconstruction frame of SR, reconstruction frame of the proposed method.

The reconstruction frames of subset D of the HEVC testing sequences from left to right: original frame, reconstruction frame of DCT, reconstruction frame of SR, reconstruction frame of the proposed method.
The experimental results of the records in the Apnea-ECG Database
It is obvious in the experimental results that, the average PSNRs of the image reconstruction outbalance those of video reconstruction. This is because of the similarities of the former between training and testing samples is better than those of the latter.
Hence, the proposed method can gain satisfied performance of video reconstruction with different training and testing samples.
The experimental results of ECG signal reconstruction are illustrated in Table 7. The root of mean square error (RMSE) substitutes for PSNR as the performance metric. It is reflected in Table 7 that, in the case of 2D and 3D representations, the proposed method possesses the least average RMSEs. Thus, the proposed method outbalances the state-of-the-art methods, such as DCT and SR, in the reestablishment capability of 1D ECG signal. This is due to the fact that, ECG signal is periodic and there are many similarities between different chips. It is also revealed in Table 7 that, 1D SR representation overmatches 2D and 3D SR representations in rebuilding ability of ECG signal. This is owing to the fact that, a testing sample owns a nearest neighbor, which is proportional to an atom in the SR training dictionary.
The experimental results of ECG signal reconstruction are also shown in Fig. 8. Only the reconstruction signals of 3D representation in the first cycle of record a09 are displayed. It is further disclosed in Fig. 8 that, the proposed method is superior to DCT and SR in recovery quality of ECG signal.

The reconstruction signals of 3D representation in the first cycle of record a09. Black curve with cross mark is the original signal, red curve with star mark is the recovery signal of the proposed method, green curve with circle mark is the recovery signal of DCT, and the blue curve with square mark is the recovery signal of SR.
This paper discusses the inverse problem of ISOMAP and its applications in data reconstruction. The theoretical framework of the proposed method includes four parts: the classical ISOMAP algorithm, the general ISOMAP embedding algorithm, the sparse dictionary learning algorithm, and the sparse ISOMAP reconstruction algorithm. The sparse reconstruction algorithm is a minimization problem based on the sparse priors of the HD data, and is resolved by the conventional ADMM method. It is testified by the experimental results of image, video and signal reconstruction that, the proposed method is superior to the traditional DCT and SR methods in the reconstruction performance of image, video and signal data, in the situation of very LD representations. It is also verified by the experimental results that, the proposed method is especially suitable for the situation that the training and testing samples are very similar. It is further indicated by the experimental results that, the proposed method is very fit for the reconstruction of texture images, which holds plenty of similar image patches.
In our future work, deep learning priors of HD data will be utilized to resolve the inverse problem of ISOMAP, and the out-of-sample extensions of ISOMAP will also be studied.
