Abstract
In medical imaging, high-quality images are lacking in many areas, such as magnetic resonance (MR). Due to many acquisition impediments, the generated images often do not have enough resolution to support an adequate diagnosis. Image super-resolution (SR) is an ill-posed problem that tries to infer information from the image to enhance its resolution. Nowadays, deep learning techniques have become a powerful tool to extract features from images and infer new information. In MR, most recent works are based on the minimization of the errors between the input and the output images based on the Euclidean norm. This work presents a new methodology to perform three-dimensional SR based on the combination of Lp-norms in the loss layer. Two multiobjective optimization techniques are used to combine two cost functions. The proposed loss layers were trained with the SRCNN3D and DCSRN networks, tested with two MR structural T1-weighted datasets, and then compared with the traditional Euclidean loss. Experimental results show significant differences in terms of Peak Signal-to-Noise Ratio (PSNR), Structural Similarity Index (SSIM) and Bhattacharyya Coefficient (BC), while the residual images show refined details.
Introduction
The improvement of image quality and resolution is a constant aim in medical imaging, due to the critical importance of these images to find the correct diagnosis and treatment for patients. This is not only reflected in the optimization of acquisition techniques, for instance in the case of magnetic resonance imaging (MRI), but also at the post-processing stage, with an ever-increasing interest in new, improved algorithms. Enhancing resolution is particularly relevant in this area, given the need to inspect the details of anatomical structures and to locate functional information in a more precise way. Many machine-learning approaches are being proposed for medical imaging applications, and deep learning is becoming increasingly popular among them [1]. The interest of these algorithms for medical imaging is developing as they evolve towards greater efficiency and reliability. The improvement of image resolution is a fundamental step towards attaining an adequate performance in subsequent phases of the medical image processing pipeline. For example, segmentation of the regions of the human brain by clustering is a key task [2], which can benefit from the enhancement of the quality of the input image. Deep learning-based MRI super-resolution technology has the potential to become a commonplace procedure in all MRI medical protocols [3].
These methods have largely superseded the traditional interpolation and spline-based approaches to increasing image resolution, which typically cause blurring. Among them, example-based methods have become popular as super-resolution techniques [4]. Some exploit the internal similarities of the image [5], while others use external datasets to learn mapping patterns between LR and HR images [6]. Recently, an example-based super-resolution algorithm, the SRCNN convolutional neural network [7], and its 3D version SRCNN3D [8] have attracted great attention because of their ability to learn an end-to-end mapping between LR and HR images, thus avoiding the need to learn dictionaries or manifolds to model the high-resolution space.
More generally, convolutional neural networks (CNNs) have demonstrated excellent performance in image and video processing. These methods are under constant development [9, 10, 11], and they have been successfully applied to the detection and recognition of objects, the classification of images and recommender systems [12, 13]. This has also been facilitated by the power of new graphics processing units (GPUs) and more specific hardware developments [14]. Hundreds of articles based on the development of CNNs have been published in several areas [15], including medical image analysis [4, 5, 16, 17], where the use of CNNs is progressively expanding.
Deep learning neural networks use loss functions that are commonly based on the squared Euclidean norm. However, the use of alternatives like
Using an
However, it is still interesting to be able to count on the robustness of the traditional squared Euclidean norm. Multiobjective optimization methods have been developed for the case where more than one goal is pursued in the optimization process. For example, image enhancement aims to reduce the noise while preserving the small details. Usually, these two goals clash, since noise reduction is often achieved by smoothing out the image and thereby removing the details. Multiobjective optimization makes it possible to combine two or more loss functions and, depending on the methods and parameters used, to give more weight to the most relevant goal in each case. A fundamental concept related to multiobjective optimization is that of the Pareto front [29], which is a surface in the space of possible solutions that comprises all solutions that are not dominated by any other solution, i.e. there is no other solution that is at least as good for all the optimized goals and strictly better for at least one of them.
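As an illustration of the dominance relation behind the Pareto front, the following minimal sketch filters a small set of candidate solutions, assuming that both objectives are to be minimized. The function names and the candidate values are hypothetical and serve only to illustrate the definition.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if a is at least as good as b in every objective
    and strictly better in at least one (all objectives minimized)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points: List[Point]) -> List[Point]:
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (loss_1, loss_2) pairs for four candidate solutions.
candidates = [(1.0, 5.0), (2.0, 2.0), (3.0, 3.0), (5.0, 1.0)]
front = pareto_front(candidates)
```

Here the candidate (3.0, 3.0) is discarded because (2.0, 2.0) is better in both objectives, while the remaining three candidates represent different trade-offs and are all kept.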
Multiobjective optimization is based on the premise that the improvement of one of the objectives may lead to the deterioration of another objective. Therefore, a single globally optimal solution is generally not available, so a search over the Pareto front is required. Evolutionary algorithms can be employed to approximate the Pareto front of a problem within a population of possible solutions, which has led to their extensive application [30]. Popular current proposals for multiobjective optimization include heuristic methods based on evolutionary algorithms [31]. Among these, genetic algorithms, decomposition-based proposals, particle swarm optimization, bat algorithms, harmony search, ant colony optimization, and non-dominated sorting genetic algorithms are designed to cope with various challenges of multiobjective optimization problems. All of them use operators inspired by biological evolution in order to improve a population of possible solutions to the optimization problem. Also, it is very common for several methodologies to be hybridized, by combining search and updating methods, or by alternating methods in different phases [30]. In our case, the number of objectives to be optimized is two, although there are methods to optimize more than four objectives, which is known as many-objective optimization [32].
Scalarization is another approach to multiobjective optimization, which is based on the optimization of a scalar function that combines the multiple objective functions of the original problem. There are many scalarization methods in multiobjective optimization with different characteristics such as convexity, boundedness, the ability to generate properly efficient solutions, the number of additional constraints, etc. For example, the elastic constraint method [33] gives conditions for the characterization of properly efficient solutions, and the augmented weighted Chebyshev scalar problem [34] generates properly efficient solutions for certain selected values of the weights and the augmentation parameter. This work focuses on a particularization of the Pascoletti-Serafini scalarization [35], which is guaranteed to generate properly efficient solutions provided that suitable weights are selected.
In light of the above, our proposal involves combining two different cost functions with different values of
The rest of this paper is organized as follows: The theoretical background of this work is detailed in Section 2. Section 3 provides a description of the SR network to be improved, the datasets for testing and the optimization experiments, followed by the obtained results. A discussion of the obtained results is carried out in Section 4. Finally, the conclusions and proposals for future work are presented in Section 5.
Methodology
In this section, we propose a new optimization framework to train deep neural networks. Our proposal considers cost functions based on the
Lp-norm loss functions
Next, the usage of
The starting point for our strategy is the realization that the squared Euclidean norm loss function, i.e.
The standard configuration of a deep learning neural network includes a loss function given by the average for all training data of the squared Euclidean norm (
In light of the above, we advocate the use of
where
The gradient of the
where
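A minimal sketch of an Lp-type loss and its gradient with respect to the network output is given next. The formulas of the original text are not recoverable here, so the sketch assumes the common form in which the loss is the mean of |e|^p over all voxels, with e the difference between prediction and target; the normalization, the function names and the handling of e = 0 are assumptions, not the authors' implementation.

```python
from typing import List


def lp_loss(pred: List[float], target: List[float], p: float) -> float:
    """Mean of |pred - target|**p over all voxels (assumed normalization)."""
    return sum(abs(a - b) ** p for a, b in zip(pred, target)) / len(pred)


def lp_loss_grad(pred: List[float], target: List[float], p: float) -> List[float]:
    """Gradient of lp_loss with respect to each predicted voxel:
    p * |e|**(p - 1) * sign(e) / n, with e = pred - target."""
    n = len(pred)
    grad = []
    for a, b in zip(pred, target):
        e = a - b
        if e == 0.0:
            # The derivative is not defined at e = 0 when p <= 1;
            # using 0 there is a convention chosen for this sketch.
            grad.append(0.0)
        else:
            sign = 1.0 if e > 0 else -1.0
            grad.append(p * abs(e) ** (p - 1) * sign / n)
    return grad
```

With p = 2 this reduces to the usual mean squared error, whose gradient grows linearly with the error, whereas with p = 1 every nonzero error contributes a gradient of the same magnitude.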
The optimization of the
As done in standard learning procedures, in our proposal the adjustment of the weights of a neural network is carried out by gradient descent. Since gradient descent minimizes a single loss function, this implies that a combined loss function
Weighted sum scalarization (WSS [49]). In this simple approach, the combined loss function is defined as the weighted average of the two loss functions:
where
Scheme of the proposed model: the LR image is fed into a convolutional neural network with a modified loss layer, producing an optimized SR image. These networks are based on the minimization of the residue between the original HR image and the output of the network.
Weighted Chebyshev scalarization (WCS [50, 51]). This strategy is based on the previous calculation of an ideal point:
where the minima are computed over the entire domain of the loss functions
where again
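The two scalarizations can be sketched as follows for a pair of already computed loss values. This is a hedged illustration rather than the authors' code: it assumes that the two weights are non-negative and sum to one, and that the ideal point is (0, 0), which is the minimum of a norm-based loss over its whole domain. The function names and the default ideal point are hypothetical.

```python
from typing import Tuple


def weighted_sum(l1: float, l2: float, w1: float) -> float:
    """WSS: weighted average of the two losses, with w2 = 1 - w1 (assumed)."""
    return w1 * l1 + (1.0 - w1) * l2


def weighted_chebyshev(
    l1: float,
    l2: float,
    w1: float,
    ideal: Tuple[float, float] = (0.0, 0.0),
) -> float:
    """WCS: largest weighted deviation of the two losses from the ideal point."""
    w2 = 1.0 - w1
    return max(w1 * abs(l1 - ideal[0]), w2 * abs(l2 - ideal[1]))


# Hypothetical loss values and weight.
wss_value = weighted_sum(2.0, 4.0, 0.25)
wcs_value = weighted_chebyshev(2.0, 4.0, 0.25)
```

In the weighted sum both losses always contribute to the gradient, whereas in the Chebyshev form only the term that attains the maximum contributes at a given step, so the training signal can switch between the two losses from one iteration to the next.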

Once the selected combined loss function
This section reports the experiments that were carried out. The proposed methodology is applicable to any kind of regression network whose optimization layer is based on the minimization of a cost function that compares the input and the output of the network. Thus, two sets of experiments with two different convolutional neural networks were carried out, which are described in Subsections 3.1 and 3.2, together with the datasets used and the low-resolution image generation procedure. In addition, Subsection 3.3 presents a third experiment with artificially generated anisotropic data in order to study the performance of the proposed methodology. The metrics employed to evaluate the performance of the proposal are detailed in Subsection 3.4, and the parameter tuning of the proposals is performed in Subsection 3.5. Subsection 3.6 details the statistical analysis we carried out. Finally, Subsection 3.7 sums up the outcomes of the experimental analysis.
Four different optimization models have been tested in each experiment: the standard squared Euclidean norm (
Experiment 1: SRCNN3D
Firstly, we make use of the SRCNN3D deep neural network [8], which is a convolutional neural network that carries out the super-resolution of three-dimensional MR images.
SRCNN3D is based on the application of three blocks of convolutional layers successively, comprising Rectified Linear Unit (ReLU) layers. The method first creates a pre-interpolated image
where
This network is trained using overlapping patches extracted from a set of HR reference images. Each patch is down-sampled and then up-sampled, and a set of input-target pairs is created to learn an end-to-end function between low and high-resolution images. Specific details of the implementation of this network can be found in the literature [8].
A scheme of the operation of the network is shown in Fig. 1. Our proposal¹ consists of the substitution of the squared Euclidean cost function
In order to carry out an adequate analysis of the performance of each model, it was necessary to provide a large dataset of MR images. Nowadays, the number of public datasets has increased, although it is still hard to find the ideal images to be processed since they usually have images of both control and pathological subjects. For this experiment, we considered the OASIS-1 dataset, which consists of cross-sectional MRI data from 416 subjects aged 18 to 96. Data were acquired on a 1.5-T Vision scanner with a 1.0
A total of 220 T1-weighted MR images of the dataset were considered for the evaluation of the proposed models, which correspond to indices from 0001 to 0240 of type MR1 (patient’s first visit), except image 0080.
Since the whole dataset contains high-resolution images only, low-resolution images were created from the high-resolution ones and fed into the networks. As stated in [52], the observation model is usually decomposed into a space-invariant blurring model, such as a Gaussian kernel with the full-width-at-half-maximum (FWHM) equal to the slice thickness, followed by a linear downsampling operator. SRCNN3D is based on this model. For that purpose, the following procedure was applied. First, the HR images were cropped to make the image dimensions divisible by the zoom factor. Then, a 3D Gaussian filter with a standard deviation equal to 1 was applied. Finally, the Matlab function imresize3 was used to perform a 3D cubic interpolation to obtain the LR image. This is a standard procedure to generate the LR versions of HR images for the evaluation of MRI super-resolution algorithms, as seen in [53, 54, 55].
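The bookkeeping of this procedure can be sketched in a few lines. The snippet below only computes the cropped and low-resolution volume shapes and the FWHM that corresponds to a given Gaussian standard deviation; it does not perform the filtering or the cubic interpolation, which the text carries out in Matlab. The function names and the example volume shape are hypothetical.

```python
import math
from typing import Tuple

Shape = Tuple[int, int, int]


def crop_shape(shape: Shape, zoom: int) -> Shape:
    """Largest shape not exceeding `shape` whose dimensions are divisible by zoom."""
    return tuple(dim - dim % zoom for dim in shape)


def lr_shape(shape: Shape, zoom: int) -> Shape:
    """Shape of the low-resolution volume after cropping and downsampling."""
    return tuple(dim // zoom for dim in crop_shape(shape, zoom))


def fwhm_from_sigma(sigma: float) -> float:
    """Full width at half maximum of a Gaussian with standard deviation sigma."""
    return 2.0 * math.sqrt(2.0 * math.log(2.0)) * sigma


# Illustrative volume shape, not taken from the text.
hr = (176, 208, 176)
cropped = crop_shape(hr, 3)
low = lr_shape(hr, 3)
fwhm = fwhm_from_sigma(1.0)
```

A standard deviation of 1 corresponds to an FWHM of roughly 2.35 in the same units, which is useful to keep in mind when relating the filter to the slice thickness mentioned above.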
Training procedure
The SRCNN3D has been developed using the Caffe package [56] on a Python framework. In this work, training over 50000 iterations was carried out for each model as well as for the parameter selection. We considered this value to cover enough epochs for the network to converge properly without taking too much time, since each training run takes around 12–14 hours to complete. The rest of the network hyper-parameters were set to their defaults: momentum of 0.9, learning rate of 0.0001 and batch size of 256, using Stochastic Gradient Descent (SGD) for model optimization. In order to make the experiments replicable, we set the pseudorandom seed in the Caffe engine to the value 1701.
As described in Subsection 3.1.1, from a total of 220 images, the first 120 were used for training. Specifically, the first 100 images were used to train the network and the next 20 were used as a validation set to monitor the error curves. The remaining 100 images were used for testing. We divided each of the training, validation and testing sets into 10 folds with an equal number of images (10, 2 and 10 images, respectively) in order to carry out the statistical analysis described in Subsection 3.6. Although the training/testing proportion may seem inadequate, the SRCNN3D model is patch-based and extracts around 15000 samples from each training image in order to have a sufficient number of inputs, while testing is carried out on the whole image without patch extraction.
Furthermore, taking advantage of the fact that this network can be trained for multiple scale factors at the same time, zoom factors 2, 3 and 4 were employed in our analysis. To this end, three times as many patches were extracted from the training dataset and fed to the network.
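The partition described above can be sketched as follows. The image identifiers are placeholders (consecutive integers standing in for the 220 selected images), and the assumption that folds are formed from consecutive images is ours, since the text does not state how images were assigned to folds.

```python
from typing import List, Sequence


def split_into_folds(items: Sequence[int], n_folds: int) -> List[List[int]]:
    """Split items into n_folds consecutive folds of equal size."""
    size, remainder = divmod(len(items), n_folds)
    if remainder:
        raise ValueError("number of items must be divisible by n_folds")
    return [list(items[i * size:(i + 1) * size]) for i in range(n_folds)]


# Placeholder identifiers for the 220 selected images.
images = list(range(220))

train, validation, test = images[:100], images[100:120], images[120:]

train_folds = split_into_folds(train, 10)            # 10 images per fold
validation_folds = split_into_folds(validation, 10)  # 2 images per fold
test_folds = split_into_folds(test, 10)              # 10 images per fold
```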
Experiment 2: DCSRN
In the second set of experiments, we make use of the 3D Densely Connected Super-Resolution Network (DCSRN) [57], which is focused on the super-resolution of three-dimensional MR images.
DCSRN is based on a densely-connected block. The network starts with a convolutional layer applied to the input image, and the output is fed to a densely-connected block with 4 units, each composed of a batch normalization layer and an exponential linear unit activation followed by a convolutional layer. In the end, a convolution is applied before providing the final SR image.
This network is patch-based so it is faster in training, back-propagation is more efficient, and the model is smaller. The patch size provided to the network is 64
HCP dataset
In this set of experiments, the Human Connectome Project (HCP) [58, 59] was employed, which provides a large amount of neuroimaging data from multiple sites, including structural MRI, functional MRI and diffusion tensor imaging (DTI). Specifically, we used the HCP Young Adult 1200 Subjects Data Release, which includes 1113 structural MR scans acquired on a 3-T Siemens scanner. The image size is 256
The low-resolution images were created in a different manner from the one used for SRCNN3D. A Fourier-based procedure is applied [60]: first, the Fast Fourier Transform (FFT) is computed on the HR images, then the resolution is degraded by zeroing the outer part of the 3D
Training loss curves of the WSS model optimization (logarithmic scale) using SRCNN3D network.
The DCSRN has been developed using TensorFlow 1.8 on a Python 3.6 framework. The training was carried out with the default parameters set by its authors, over a total of 49000 iterations and a batch size of 2. This number of iterations was determined empirically by monitoring the loss curves until they stabilized. Adam optimizer with a learning rate of
From the 600 images of the HCP1113 dataset, the first 500 were used for training and the remaining 100 images were used for testing. As described in Subsection 3.1.2, the training and testing sets were each divided into 10 folds, with 50 and 10 images, respectively, to carry out the statistical analysis described in Subsection 3.6. The DCSRN model is patch-wise based and 200 patches of 64
Handling anisotropic data
In many cases, isotropic low-resolution structural brain MRI is uncommon. When MR images are acquired, for example in the T2 or FLAIR modality, they typically retain a high in-plane resolution to provide data of sufficient quality for radiologists to interpret across slices. Thus, the resolution is sacrificed in the through-plane direction and the voxel sizes become anisotropic.
In order to handle this type of image, the SRCNN3D network was used because it is able to restore an image in one plane without additional training. A set of images from different databases was selected and low-resolution images were generated artificially by extending the voxel size in the last plane by a factor of 2, 3 and 4. Thus, if the image has voxel resolution 1
Four images were used in this experiment:
Images 10 and 11 (named MPRAGE10 and MPRAGE11, respectively) of the Kirby 21 dataset [61]. These data were acquired using a 3-T MR scanner with a 1.0
An image of the Medical Research Center of the University of Málaga (CIMES) acquired with size 256
An image of the IBSR dataset [62] with image size 256
In order to evaluate the performance of the proposed model three different quality measures were employed: Peak Signal-to-Noise Ratio (PSNR), Structural Similarity Index (SSIM) [63] and Bhattacharyya coefficient (BC) [64].
First of all, PSNR focuses on the intensity values obtained from the algorithm when it is compared with the ground truth image. The unit of measurement is dB (decibel), where higher is better. It is defined as follows:
where peak is the maximum possible value of the image and
On the other hand, SSIM focuses on structural similarities between images, returning a value between 0 and 1 (higher is better). This measure makes it possible to check whether the edges are correctly preserved, and it is formulated as:
where
Finally, the BC measures the closeness of the two discrete pixel probability distributions
where
From a qualitative point of view, it is useful to analyze the residual images obtained by the subtraction of the GT image
The best performance is such that the residual image is the zero matrix. The constant 0.5 was added to the residual images for the sake of clarity, thereby obtaining gray images.
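Since the formulas of this subsection are not reproduced here, the following sketch uses the standard definitions of PSNR (ten times the base-10 logarithm of the squared peak over the mean squared error) and of the Bhattacharyya coefficient (sum over histogram bins of the square root of the product of the two probabilities). Images are treated as flat lists of intensities in [0, 1], the sign convention of the residual (ground truth minus reconstruction) is an assumption, and all function names are hypothetical.

```python
import math
from typing import List


def psnr(gt: List[float], sr: List[float], peak: float = 1.0) -> float:
    """Peak Signal-to-Noise Ratio in dB between ground truth and reconstruction."""
    mse = sum((a - b) ** 2 for a, b in zip(gt, sr)) / len(gt)
    if mse == 0.0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)


def bhattacharyya_coefficient(p: List[float], q: List[float]) -> float:
    """Overlap between two discrete probability distributions (1 = identical)."""
    return sum(math.sqrt(a * b) for a, b in zip(p, q))


def residual_for_display(gt: List[float], sr: List[float]) -> List[float]:
    """Residual shifted by 0.5 so that a perfect reconstruction is mid-gray."""
    return [a - b + 0.5 for a, b in zip(gt, sr)]
```

With this convention, a residual image that is uniformly 0.5 corresponds to a perfect reconstruction, and brighter or darker voxels mark where the reconstruction underestimates or overestimates the ground truth.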
Our main aim when tuning the parameters of the proposals was to generalize as well as possible, so a set of images different from the ones used for training and testing was used. In the case of the SRCNN3D network, three different images from 3 different datasets were used to fine-tune the model parameters of each cost function:
Regarding the DCSRN network, the MGH HCP Adult Diffusion dataset was used, which comprises 35 young adult structural scans using the MGH Siemens 3T Connectome scanner.
The PSNR, SSIM and BC measures of the above-presented images were computed, and a ranking was established by sorting each tested parameter according to its performance on each image. The assigned points were accumulated over all images, so that the lowest accumulated score indicates the best configuration. Thus, there are two kinds of parameters to be tuned: the
Firstly, we performed a set of experiments fixing weights and varying the
Secondly, for both WSS and WCS cost functions a set of weight values
WSS model optimization for SRCNN3D: PSNR, SSIM, and BC rankings across the three tuning images are shown varying 
WCS model optimization for SRCNN3D: PSNR, SSIM, and BC rankings across the three tuning images are shown varying 
The weighted Chebyshev scalarization (WCS) is based on the maximum value between the two
In order to make a reasonable selection of the best
In Table 1 the final configurations of the models based on our previous analysis are summarized. With respect to the DCSRN network, the configurations of the WSS and WCS methods were established taking into account the previous analysis. Thus, the WSS parameters were set up with the best two
Parameter selection of the proposed cost functions
Mean and standard deviation of the rank values computed among the PSNR, SSIM and BC ranks for both models: WSS, with
First, a Friedman aligned ranks test [66, 67] is performed in order to check whether at least two of the methods represent populations with different median values, i.e. whether the methods have significantly different performance. This technique is a variant of the Friedman test that can be used under the same circumstances, although the Friedman aligned ranks test is more appropriate when the number of methods to be compared is low.
In this technique, if we have
where
Then, if the obtained p-value is smaller than the level of significance
Additionally, box plots were generated, one box plot per performance measure, where each method is associated to a box, and the
First, we evaluate the proposed methodology from a quantitative point of view for each experiment. For each quality measure and each zoom factor, a Friedman aligned ranks test was carried out to measure the differences between the cost functions tested. A total of 10 mean values corresponding to the 10 test repetitions were computed for each method and passed to the Friedman aligned ranks test. The methods were ranked assigning 1, 2, 3 or 4 points for each repetition and the accumulated results are the ones presented in the following tables.
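For reference, the average aligned ranks of such a table of results can be sketched as below. This follows the standard construction of the aligned ranks (subtract the mean of each repetition, rank all aligned values jointly, average the ranks per method) and may differ in details from the procedure actually used; it does not compute the test statistic or the p-value. The function name and the example scores are hypothetical.

```python
from typing import List


def aligned_average_ranks(
    scores: List[List[float]], higher_is_better: bool = True
) -> List[float]:
    """scores[i][j] is the value of method j on repetition i.
    Returns the average aligned rank of each method (lower is better)."""
    n, k = len(scores), len(scores[0])

    # Align each observation by subtracting the mean of its repetition.
    aligned = []
    for row in scores:
        location = sum(row) / k
        aligned.extend((value - location, j) for j, value in enumerate(row))

    # Rank all k * n aligned observations together; rank 1 is the best.
    order = sorted(range(len(aligned)), key=lambda t: aligned[t][0],
                   reverse=higher_is_better)
    ranks = [0.0] * len(aligned)
    pos = 0
    while pos < len(order):
        end = pos
        while (end + 1 < len(order)
               and aligned[order[end + 1]][0] == aligned[order[pos]][0]):
            end += 1
        mean_rank = (pos + end) / 2 + 1  # tied values share the mean rank
        for t in range(pos, end + 1):
            ranks[order[t]] = mean_rank
        pos = end + 1

    totals = [0.0] * k
    for (_, method), rank in zip(aligned, ranks):
        totals[method] += rank
    return [total / n for total in totals]


# Hypothetical PSNR values: 2 repetitions (rows) for 2 methods (columns).
example = aligned_average_ranks([[30.0, 31.0], [28.0, 29.5]])
```

With 4 methods and 10 repetitions, the 40 aligned observations receive ranks from 1 to 40, so the average ranks of the methods are centered around 20.5.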
Results of experiment 1
PSNR aligned Friedman ranks are shown in Table 2. The lowest values for all zoom factors, i.e. the best, are always achieved by the model based on a unique
Friedman Aligned Rankings of the methods for PSNR measure and for zoom factors 2, 3 and 4, computed for the SRCNN3D network. The last row shows the probability value to reject the null hypothesis
The results are more varied for the SSIM and BC measures, whose statistical analysis is summarized in Tables 3 and 4. For larger scale factors (3 and 4), where the network needs to be more precise to recover the voxel’s information, the weighted Chebyshev scalarization is clearly the best cost function, achieving the lowest ranks of all the Friedman analyses performed. This means that the WCS is more suitable for recovering the structural features of the MR image than both the usual squared Euclidean norm and the
Friedman Aligned Rankings of the methods for SSIM measure and for zoom factors 2, 3 and 4, computed for the SRCNN3D network. The last row shows the probability value to reject the null hypothesis
The differences in terms of BC are smaller. In Table 4 all the average rankings have values around 20, although again for zoom factors 3 and 4 the WCS method has the best outcome, corroborated by the p-value. However, we can see that the squared Euclidean norm is the second best cost function. As the BC measures the differences in pixel probability distributions, we can infer that the image histograms obtained by all the tested methods are very similar, thereby avoiding the inclusion of artifacts or anomalous intensity values.
Friedman Aligned Rankings of the methods for BC measure and for zoom factors 2, 3 and 4, computed for the SRCNN3D network. The last row shows the probability value to reject the null hypothesis
Friedman Aligned Rankings of the methods for PSNR, SSIM, and BC measures computed for the DCSRN network. Last row shows the probability value to reject the null hypothesis
Comparison of the PSNR, SSIM and BC for the four models using SRCNN3D network and 
Comparison of the PSNR, SSIM, and BC for the four models using DCSRN network. Box plots of 10 runs are displayed, where the medians are plotted as horizontal gray lines, while the means are plotted as gray circles.
The variances between the 10 runs of 10 different images we executed are depicted as box plots in Fig. 6, for scale factors 2, 3 and 4. Analyzing first the zoom 2 (first row), in terms of PSNR and SSIM the
PSNR, SSIM and BC measures computed with the SRCNN3D network varying the scale factor of the third dimension of the anisotropic LR image. The blue color represents the best method
Qualitative results for OASIS-0174 image for each model, applied with zoom factor 2. Slices of sagittal, coronal and axial views are shown in a 3D representation. The second row shows the reconstructed image by each algorithm and the third row shows residual images between the reconstructed and the original HR image.
Figure 7 summarizes the outcomes obtained by the DCSRN network after modifying its cost function. As explained in Subsection 3.2.1, this network was trained by simulating degradation by Fourier transforms similar to zoom factor 4. The most stable method, with the least dispersion in its results, is WSS, which yields better SSIM and BC measures than the squared Euclidean norm. The mean value (circle) of PSNR is on par with the
The statistical analysis carried out to check if the methods are significantly different is presented in Table 5. The average rankings computed by the Aligned Friedman test showed that WCS is the first method for PSNR followed by WSS model, and this one is the best for SSIM and BC. It should be remarked the difference obtained with respect to the
The WSS and WCS results of this set of experiments confirm the idea of improving the performance by the combination of two
Anisotropic super-resolution
The last set of experiments deals with anisotropic images. The quantitative outcomes of the zoomed images are collected in Table 6. The values of the scale factors refer to the SR applied to the third dimension only. The rows show the results obtained for each of the four tested images and the columns represent the measures obtained for each scale. The best values for each measure and zoom factor are highlighted in blue.
The best optimization method is shared by the
Qualitative results for a section of the coronal view of the OASIS-0177 image for each model, applied with zoom factor 3. The second row shows the reconstructed image by each algorithm and the third row shows residual images between the reconstructed and the original HR image.
On the other hand, the differences between method performances increase when the scale factor applied in the SR process is higher, because more information must be recovered from that present in the image. This occurs for every method if we compare them with the squared Euclidean norm, obtaining around 1% of improvement in the quality of the image. It should be noted that the network was only trained for isotropic tasks, so it is reasonable to expect that training focused on this type of image would substantially improve the outcomes.
In terms of qualitative outcomes, three different images of the OASIS dataset are presented. Figure 8 shows a three-dimensional perspective with one slice of the sagittal, coronal and axial planes of image numbered as 0174, using an augmentation factor of 2. The differences can be seen in the central part of the image, where the intensity of gray varies from one method to another. If we focus on the residual images, the whitest image should be the best approximation to the original HR image. Here, the method based on the
In Fig. 9 a section of the coronal plane of the OASIS-0177 image is shown. In this case, the restoration factor was 3. Here the effect of the
Finally, the visual outcome of the DCSRN network is analyzed in Fig. 10. Here a section of the test image with ID 206929 is displayed. The amount of information to be recovered is quite high, causing distortions in the enhanced image. Focusing on the residual images, the different performance of the optimization methods can be seen more clearly. The WCS method tends to resolve the intensities better, because large gray surfaces are clearer than with the other methods. On the other hand, fine details are remarked by WSS and
Qualitative results for a section of a reconstructed patch of the image ID 206929 from HCP dataset for each model, applied with zoom factor 4. The second row shows the reconstructed image by each algorithm and the third row shows residual images between the reconstructed and the original HR image.
The above-presented experiments have demonstrated that the multiobjective optimization of the cost function makes the SR networks more precise. Two different ways to combine the
The WCS performed well with SRCNN3D: the SSIM and BC values improved with respect to the squared Euclidean norm. Nevertheless, the WSS method worked better than WCS with the DCSRN network. In this case, WSS yielded good outcomes for PSNR, SSIM, and BC alike, although WCS also restored the images well. There are two factors that may affect the results of the experiments: the type of the neural network and the data used for training.
Firstly, the effect of backpropagation in the learning procedure may be crucial to the performance of the methods. With a small network like SRCNN3D (only three convolutional layers), the Chebyshev scalarization achieved more stability. However, when a larger, densely connected network is trained, this methodology loses efficiency and the weighted sum of norms outperforms the rest of the models. Thus, the different layers of the network learned the features of the images better because they were interconnected. This fact may indicate that the larger the network used, the more efficient might be the combination of
Secondly, the amount of data and patch sizes were different in each case. For DCSRN, a larger patch is used, which covers more details of the image, and the minimization of the errors is more effective if the cost function is more complex. The WCS is essentially one
The qualitative outcomes showed that the scalarization methods can provide refined results. The differences among methods were more noticeable in the DCSRN network than in SRCNN3D. The reason might be the degradation model on which they are based. SRCNN3D carries out an initial interpolation that can smooth the effect of the restoration, while DCSRN is designed to enhance LR images produced by a low-pass-filtering-like degradation, i.e. the number of voxels is the same and there is no intermediate blurring effect.
Conclusions and future works
This work presents a multiobjective optimization model for deep super-resolution neural networks. With the aim of improving the brain magnetic resonance image super-resolution, the usual squared Euclidean loss layer is substituted by combinations of
SRCNN3D and DCSRN models, and OASIS and HCP datasets were employed for experiments. Three different models were compared to the
In future work, we will extend and apply the proposed idea to other machine learning tasks, such as recommendation and person tracking [68]. The proposed approach could be extended to other neural networks in order to improve the quality of the outputs in other tasks such as noise removal or segmentation. Moreover, the depth of the network seems to be the key to a correct back-propagation of the proposed cost function errors. An extensive analysis with deeper neural networks may improve the performance with lower values of
Footnotes
The source code and demo of the proposed approach will be published in case of acceptance.
Acknowledgments
This work is partially supported by the Ministry of Economy and Competitiveness of Spain under grants TIN2016-75097-P and PPIT.UMA.B1.2017. It is also partially supported by the Ministry of Science, Innovation and Universities of Spain (grant number RTI2018-094645-B-I00), project name Automated detection with low cost hardware of unusual activities in video sequences. It is also partially supported by the Autonomous Government of Andalusia (Spain) under grant MA18-FEDERJA-084, project name Detection of anomalous behavior agents by deep learning in low cost video surveillance intelligent systems. All of them include funds from the European Regional Development Fund (ERDF). The authors thankfully acknowledge the computer resources, technical expertise and assistance provided by the SCBI (Supercomputing and Bioinformatics) center of the University of Málaga. They have also been supported by the Biomedical Research Institute of Málaga (IBIMA). They also gratefully acknowledge the support of NVIDIA Corporation with the donation of two Titan X GPUs used for this research. The authors also thankfully acknowledge the grant of the Universidad de Málaga. Karl Thurnhofer-Hemsi (FPU15/06512) is funded by a PhD scholarship from the Spanish Ministry of Education, Culture and Sport under the FPU program. The authors acknowledge the funding from the following grants, which was used to develop the OASIS database by its creators: P50 AG05681, P01 AG03991, R01 AG021910, P50 MH071616, U24 RR021382, R01 MH56584. HCP data were provided [in part] by the Human Connectome Project, WU-Minn Consortium (Principal Investigators: David Van Essen and Kamil Ugurbil; 1U54MH091657) funded by the 16 NIH Institutes and Centers that support the NIH Blueprint for Neuroscience Research; and by the McDonnell Center for Systems Neuroscience at Washington University.
