Abstract
Detecting forged digital images has been an active research area in recent times. Tampering introduces artifacts within images that differentiate tampered images from authentic images. Forgery detection techniques try to identify these artifacts by analyzing differences in the texture properties of the image. In this paper, we propose a multi-texture description based method to detect tampering. The texture descriptors considered are Local Binary Pattern, Local Phase Quantization, Binarized Statistical Image Features and Binary Gabor Pattern. The method captures subtle texture variations at different scales and orientations using Steerable Pyramid Transform (SPT) decomposition of the image. The texture descriptors extracted from each subband image after SPT decomposition are combined to form the multi-texture representation. Then, the ReliefF feature selection method is applied to this high-dimensional multi-texture representation to generate a compact representation. This compact multi-texture representation is classified using a Random Forest classifier. We have evaluated the performance of individual texture descriptors and multiple textures in detecting image forgery. Experimental results show that the compact multi-texture description improves detection accuracy.
Introduction
Image forgery detection is an active research area where researchers devise techniques that expose forgery. A skilled forger will try to hide the traces with the help of efficient image processing tools capable of creating image alterations unidentifiable by naked eyes. Considering the visual impact of digital images, both common people and digital forensic experts demand techniques capable of authenticating visual contents.
Image forgeries range from the simple act of increasing the brightness of an image to complex copy-paste operations that take the semantics of the images into account. Even a simple modification such as brightness enhancement is considered a forgery if it changes the meaning of the image adversely. Forgery detection techniques address various types of forgery such as image splicing and copy-move forgery. In image splicing, image regions are copied from one image to another image, whereas in a copy-move forgery, image regions are copied from and pasted onto the same image.
Image forgery detection techniques analyze the artifacts introduced by forgery. For example, an image splicing operation may leave some unevenness around the boundary of the region that is pasted onto another image. This is reflected as a texture difference compared to the texture pattern of an unaltered authentic image. Thus, the texture variation can be relied upon as a clue for image forgery detection. A number of forgery detection techniques that expose traces of texture variation have been proposed recently [1, 24]. Most of these techniques follow a general pipeline consisting of three stages: image representation in a convenient color space, extraction of features representing texture, and application of a machine learning technique to the extracted features.
In the proposed work, we address forgery detection based on texture variations using multiple texture features. This work is motivated by the work of Khan et al. on texture classification [13], in which the authors demonstrated that a compact multi-texture description improved the performance of texture classification. In our work, we combine multiple texture descriptors instead of using a single texture descriptor. In addition, we consider a multi-orientation, multi-scale texture representation using the steerable pyramid transform to capture texture variations at different orientations and scales.
Here, we evaluated four texture descriptors: Local Binary Pattern (LBP) [17], Local Phase Quantization (LPQ) [18], Binarized Statistical Image Features (BSIF) [12] and Binary Gabor Pattern (BGP) [25]. Experimental results showed that combining the descriptors improved the detection accuracy. Among the four descriptors, two, LBP and LPQ, have previously been used for forgery detection. In the respective previous works, the authors employed the descriptors separately and achieved improved performance. In addition to these two texture descriptors, we considered two recent descriptors, BSIF and BGP, that were employed for texture classification in the work of Khan et al. [13].
In the proposed work, a multi-orientation, multi-scale subband decomposition is carried out on images using Steerable Pyramid Transform (SPT) [21]. For each subband generated by SPT, a multi-texture description is obtained. Finally, all the multi-texture descriptors from various subbands are concatenated to form a high dimensional feature vector. We applied a variant of Relief feature selection method known as ReliefF [14] for reducing the dimensionality of the feature vector. ReliefF selects a subset of features capable of discriminating forged and authentic images. The selected highly discriminating feature subset is fed to a Random Forest classifier [8]. Experimental results showed that feature selection improved detection accuracy. Also, we investigated the performance obtained on combining various combinations of texture descriptors.
The rest of the paper is organized as follows. Section 2 gives an overview of related work considering texture variations for forgery detection. Section 3 describes the proposed method. Section 4 explains the experimental setup. Results are discussed in Section 5. Finally, Section 6 concludes the work.
Related work
Many image forgery detection techniques have been proposed that consider the variations in illumination color and texture patterns introduced within an image by forging operations [22, 23]. Recent works have applied different texture descriptors to capture the variations in texture patterns.
El-Alfy and Qureshi captured the variations in the correlation of pixels using a Markov process in the spatial and Discrete Cosine Transform (DCT) domains [4]. The dimensionality of the feature vector is reduced using Principal Component Analysis (PCA) and the result is fed into an SVM classifier. The method obtained an accuracy of 99.82% on the Columbia Digital Video and Multimedia Lab (DVMM) dataset. Zhang et al. tried to capture the texture variations in the image using Local Binary Pattern (LBP) features extracted from Multi-Block Discrete Cosine Transform (MBDCT) [24]. PCA is carried out on the features for dimensionality reduction, followed by classification using an SVM classifier. The method achieved an accuracy of 91.38% on the Columbia dataset. He et al. proposed an image forgery detection technique using Markov features in the DCT and DWT domains [7]. The dimension of the feature vector is reduced by SVM-RFE (support vector machine recursive feature elimination). Authentic images and forged images are classified using an SVM classifier. The method obtained an accuracy of 89.76% on CASIA v2.0 and 93.55% on the Columbia dataset.
Muhammad et al. developed another image forgery detection technique capable of considering texture variations in different orientations using Steerable Pyramid Transform (SPT). The texture variations are captured using Local Binary Pattern (LBP) [15]. From YCbCr color space, the chrominance components of the image are decomposed into multi-scale multi-orientation subbands using SPT. LBP is used to extract texture features from each subband. The LBP descriptors from all the subbands are concatenated to obtain the feature vector. Later, a combination of L0-norm and Learning based (LLB) feature selection scheme is used for selecting the most discriminating features. Finally, the reduced feature vector is fed to an SVM classifier. The method obtained an accuracy of 94.89%, 97.33% and 96.39% in CASIA v1.0, CASIA v2.0 and Columbia datasets respectively.
Isaac and Wilscy used Gabor wavelets and Local Phase Quantization (LPQ) to capture the texture variations within images for detecting image forgery [11]. Here, the RGB image is represented in YCbCr color space. A Gabor wavelet is applied to the Cr component and the texture variations in the Gabor subband images are described using LPQ features. The LPQ features obtained from the different Gabor subband images are fed to an SVM classifier. This method achieved a detection accuracy of 99.83% and 99.45% on the CASIA v1.0 and Columbia datasets respectively.
Hussain et al. evaluated the performance of multi-scale Weber Local Descriptors (WLD) and multi-scale LBP for image forgery detection [10]. Their experiments showed that WLD resulted in a detection accuracy of 92.62% and 96.52% on CASIA v1.0 and CASIA v2.0 respectively. The authors also showed that multi-scale LBP obtained a detection accuracy of 85.93% on CASIA v1.0.
Agarwal and Chand devised an image forgery detection technique that enhances the texture variations using an entropy filter and captures the enhanced texture patterns using Local Phase Quantization (LPQ) [1]. In the first stage, the RGB image is transformed to YCbCr color space. Both chrominance channels, Cb and Cr, are fed to entropy filters with neighborhoods of 3 × 3, 3 × 5, 5 × 3 and 5 × 5. LPQ descriptors are extracted from each of these entropy filtered images and concatenated to form a feature vector. This feature vector is fed to an SVM classifier, which obtained an accuracy of 95.41%, 98.33% and 91.14% on the CASIA v1.0, CASIA v2.0 and Columbia datasets respectively.
Alahmadi et al. proposed an image forgery detection technique using Local Binary Pattern (LBP) and DCT [2]. Here, the input RGB image is converted to YCbCr color space. The chrominance channels are divided into overlapping blocks and the texture variation in each block is captured using LBP codes. The LBP code blocks are converted to the frequency domain by DCT. Finally, the standard deviations of corresponding DCT coefficients of all blocks are fed to an SVM classifier. The method obtains an accuracy of 97.00%, 97.5%, and 97.77% in CASIA v1.0, CASIA v2.0 and in Columbia datasets respectively.
El-Alfy and Qureshi used intra-block Markov features extracted from LBP and DCT domains for training an SVM classifier for forgery detection [5]. Experimental results show that combining Markov features from both LBP and DCT has improved the detection accuracy. The method exhibits an accuracy of 97.33% on Columbia dataset and 99.73% on CASIA v2.0 datasets.
Among the previous works, Hussain et al. evaluated multi-scale WLD and LBP [10]; Isaac and Wilscy considered LPQ on Gabor subband images [11]; and Muhammad et al. used LBP on SPT subband images [15]. LBP and LPQ provided good performance when applied alone. Hence, we considered LBP and LPQ in the multi-texture representation. In addition to LBP and LPQ, we considered two recent texture descriptors, BSIF and BGP, used earlier in the compact multi-texture representation of Khan et al. for texture classification. In the proposed work, we evaluated each of the descriptors for its discriminability and combined them for better performance in forgery detection. Though combining multiple texture features has been attempted in various computer vision problems such as texture classification and face recognition, none of the previous works attempted to combine multiple texture descriptors for image forgery detection.
Image forgery detection based on compact multi-texture representation
The proposed method based on compact multi-texture representation consists of four stages, as shown in Fig. 1. As a preprocessing step, the RGB image is represented in YCbCr space. Previous works have shown that the chrominance components, Cb and Cr, represent texture patterns better than the gray level image or the Y component in YCbCr [1, 15]. Also, Muhammad et al. showed that the SPT decomposition of the chrominance components improved the detection accuracy [15]. Therefore, we considered a multi-scale, multi-orientation decomposition of the chrominance components using the Steerable Pyramid Transform (SPT) for capturing subtle variations in texture.
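The preprocessing step can be illustrated with a short sketch. The text does not state which RGB-to-YCbCr variant was used, so the full-range BT.601 (JPEG/JFIF) coefficients below are an assumption, and the function name `rgb_to_ycbcr` is hypothetical.

```python
import numpy as np

def rgb_to_ycbcr(rgb):
    """Split an (H, W, 3) RGB image into float Y, Cb and Cr planes.

    Assumption: full-range BT.601 (JPEG/JFIF) coefficients; the paper
    does not say which YCbCr variant it used.
    """
    rgb = np.asarray(rgb, dtype=np.float64)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr

# Only the two chrominance planes would be passed on to the decomposition.
image = np.full((4, 4, 3), 128, dtype=np.uint8)
_, cb_plane, cr_plane = rgb_to_ycbcr(image)
```

A neutral gray input maps to Cb = Cr = 128, which is a quick sanity check on the coefficients.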
In the proposed method, a 2-level decomposition is carried out on the Cb and Cr channels. At level zero, each of the chrominance components is decomposed into a lowpass subband and a highpass subband. The lowpass subband is subsampled by a factor of 2 in the horizontal and vertical directions. At level one, the subsampled lowpass subband image is decomposed into four orientation subbands and a lowpass subband. At level two, this lowpass band is again subsampled by two and decomposed into four orientation subbands and another lowpass subband. Thus, a total of 10 subband images (1 highpass + 4 orientation subbands from level 1 + 4 orientation subbands from level 2 + 1 lowpass) are generated. These subband images provide a better representation of subtle texture variations at different scales.
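The subband count can be checked with a small bookkeeping sketch. It only enumerates the subbands named in the text and does not perform the transform itself; the function name and the label strings are hypothetical.

```python
def spt_subband_names(levels=2, orientations=4):
    """List the subbands of one channel: highpass, oriented bands, lowpass."""
    names = ["highpass"]
    for level in range(1, levels + 1):
        for k in range(orientations):
            names.append(f"level{level}_orientation{k}")
    names.append("lowpass")  # residual lowpass band after the last level
    return names

subbands = spt_subband_names()
print(len(subbands))  # 10 = 1 highpass + 4 + 4 oriented + 1 lowpass
```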
Multi-texture description
A multi-texture representation is extracted from each of the 10 subbands using four different texture descriptors discussed here.
Local Binary Pattern (LBP)
LBP is a rotation invariant texture descriptor proposed by Ojala et al. [17]. For each pixel, a neighborhood of the pixel is considered and the central pixel value is subtracted from each neighborhood pixel. If the difference is negative, it is represented as a zero; otherwise, it is represented as a one. Finally, all the bits are concatenated to form a numeric value. The distribution of these numeric values throughout the image is represented as a 256-bin histogram describing the texture of the image.
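The thresholding and histogram steps described above can be sketched as follows. This is a minimal illustration of the basic 3 × 3, 8-neighbour LBP that yields 256 codes, not the authors' implementation. The neighbour ordering, the handling of ties (a difference of zero is mapped to 1) and the name `lbp_histogram` are assumptions, and no rotation-invariant mapping is applied.

```python
import numpy as np

def lbp_histogram(image):
    """Normalized 256-bin histogram of basic 8-neighbour LBP codes."""
    img = np.asarray(image, dtype=np.float64)
    h, w = img.shape
    center = img[1:-1, 1:-1]  # border pixels have no full neighbourhood
    # Neighbour offsets (row, col), clockwise from the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros(center.shape, dtype=np.int64)
    for bit, (dy, dx) in enumerate(offsets):
        neighbour = img[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        # Bit is 1 when neighbour - center >= 0, else 0.
        codes |= (neighbour >= center).astype(np.int64) << bit
    hist = np.bincount(codes.ravel(), minlength=256)
    return hist / hist.sum()
```

On a constant image every comparison is a tie, so all pixels receive code 255 and the histogram has a single non-zero bin.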
Local Phase Quantization (LPQ)
The LPQ features proposed by Ojansivu and Heikkilä [18] use Fourier phase information. For each pixel in the input image, a neighbourhood is considered and represented in the frequency domain, where the local frequency coefficients are taken at a set of frequencies. A threshold is used to obtain a binary representation of the value at each frequency. The binary codes from the real and imaginary parts are combined to obtain an 8-bit binary code. This code is calculated for all the pixels in the image. The histogram generated from the codes of all pixels in the image is termed the LPQ descriptor, which is a 256-dimensional feature vector.
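A simplified sketch of the LPQ computation is given below. It evaluates a local Fourier transform at four low frequencies and quantizes the signs of the real and imaginary parts into an 8-bit code. The window size of 7, the choice of the four frequencies, the zero threshold and the name `lpq_histogram` are assumptions, and the decorrelation step used in some LPQ variants is omitted.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def lpq_histogram(image, win=7):
    """Normalized 256-bin histogram of simplified LPQ codes (win must be odd)."""
    img = np.asarray(image, dtype=np.float64)
    r = (win - 1) // 2
    coords = np.arange(-r, r + 1)
    yy, xx = np.meshgrid(coords, coords, indexing="ij")
    a = 1.0 / win  # lowest non-zero frequency for this window size
    freqs = [(a, 0.0), (0.0, a), (a, a), (a, -a)]
    # All win x win neighbourhoods, shape (H - win + 1, W - win + 1, win, win).
    patches = sliding_window_view(img, (win, win))
    codes = np.zeros(patches.shape[:2], dtype=np.int64)
    bit = 0
    for u, v in freqs:
        kernel = np.exp(-2j * np.pi * (u * xx + v * yy))
        coeff = np.tensordot(patches, kernel, axes=([2, 3], [0, 1]))
        # One bit for the sign of the real part, one for the imaginary part.
        codes |= (coeff.real > 0).astype(np.int64) << bit
        codes |= (coeff.imag > 0).astype(np.int64) << (bit + 1)
        bit += 2
    hist = np.bincount(codes.ravel(), minlength=256)
    return hist / hist.sum()
```

Four frequencies with two sign bits each give the 8-bit code, hence the 256 histogram bins.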
Binarized Statistical Image Features (BSIF)
BSIF is a dense descriptor in which the binary code for a pixel is generated by convolving a neighborhood region of pixels with filters that are learnt in advance using independent component analysis [12]. Training is performed using image patches randomly sampled from a small set of natural images. Thus, the filters capture the statistical properties of natural images. The advantage is that the filters can be learnt for a specific application. BSIF outperformed LBP and LPQ and exhibited tolerance to blurring and rotation when tested on a face recognition application.
Binary Gabor Pattern (BGP)
Zhang et al. in [25] proposed a texture feature using Gabor filters. Here, an image is convolved with even symmetric and odd symmetric Gabor filters at three different resolutions to obtain a 216 bin histogram, termed as the Binary Gabor Pattern (BGP).
The four descriptors from each subband are concatenated to form the multi-texture representation.
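The size of the concatenated representation follows from simple arithmetic. In the sketch below, the LBP, LPQ and BGP lengths come from the text. The BSIF length of 256 is not stated in the text; it is an assumption, inferred as the value that makes the total match the 19680 features reported for the two-channel representation.

```python
# Histogram length of each descriptor (the BSIF value is an assumption, see above).
DESCRIPTOR_LENGTHS = {"LBP": 256, "LPQ": 256, "BSIF": 256, "BGP": 216}
SUBBANDS_PER_CHANNEL = 10   # from the 2-level, 4-orientation SPT
CHANNELS = ("Cb", "Cr")

per_subband = sum(DESCRIPTOR_LENGTHS.values())     # 984
per_channel = per_subband * SUBBANDS_PER_CHANNEL   # 9840
total = per_channel * len(CHANNELS)                # 19680

print(per_subband, per_channel, total)
```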
Compact representation of multiple textures
The image decomposition over multiple orientations and scales, together with the multi-texture description, results in a high-dimensional feature vector. When both Cb and Cr channels are considered with a 2-scale, 4-orientation SPT, the multi-texture description results in a total of 19680 features. This increases the memory requirements as the number of images in the dataset increases. Therefore, the most distinguishing features are selected using the feature selection algorithm ReliefF [14]. ReliefF is capable of handling missing data as well as noisy data. ReliefF resulted in a compact representation that gave better performance when compared with other feature selection methods such as Fisher, Laplacian, Local Learning based, and L0-norm.
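The idea behind ReliefF can be sketched in a few lines. This is a simplified two-class version written for illustration, not the FSLib implementation used in the experiments. It visits every sample instead of a random subset, uses the Manhattan distance on min-max scaled features, and does not handle missing values. The function names and the default `k=5` are assumptions.

```python
import numpy as np

def relieff_weights(X, y, k=5):
    """Simplified two-class ReliefF: one relevance weight per feature."""
    X = np.asarray(X, dtype=np.float64)
    y = np.asarray(y)
    span = X.max(axis=0) - X.min(axis=0)
    span[span == 0] = 1.0                      # avoid division by zero
    Xs = (X - X.min(axis=0)) / span            # scale features to [0, 1]
    n, d = Xs.shape
    weights = np.zeros(d)
    for i in range(n):
        diff = np.abs(Xs - Xs[i])              # per-feature differences
        dist = diff.sum(axis=1)                # Manhattan distance to sample i
        not_self = np.arange(n) != i
        hit_idx = np.flatnonzero((y == y[i]) & not_self)
        miss_idx = np.flatnonzero(y != y[i])
        hits = hit_idx[np.argsort(dist[hit_idx])[:k]]
        misses = miss_idx[np.argsort(dist[miss_idx])[:k]]
        if hits.size:
            weights -= diff[hits].mean(axis=0)    # penalize within-class spread
        if misses.size:
            weights += diff[misses].mean(axis=0)  # reward between-class spread
    return weights / n

def select_top_features(X, y, n_keep, k=5):
    """Indices of the n_keep highest-weighted features, best first."""
    return np.argsort(relieff_weights(X, y, k))[::-1][:n_keep]
```

A feature receives a high weight when it differs little between a sample and its nearest neighbours of the same class but differs strongly from its nearest neighbours of the other class.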
Forgery detection
In the final stage of the proposed method, the compact multi-texture representation is fed to a Random Forest classifier [8] for forgery detection. Random forest is an ensemble learning method where a number of decision trees are considered and the output class is selected based on the mode of output of the different decision trees.
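The voting rule can be illustrated with a minimal sketch, assuming each tree has already produced a class label for the image. The function name `forest_vote` and the label strings are hypothetical, and training of the trees is not shown.

```python
from collections import Counter

def forest_vote(tree_predictions):
    """Return the most frequent label (the mode) among per-tree predictions.

    Ties are resolved in favour of the label that appears first in the input.
    """
    label, _count = Counter(tree_predictions).most_common(1)[0]
    return label

print(forest_vote(["forged", "authentic", "forged"]))  # forged
```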
Experimental setup
The details of the three standard datasets used in the experiments and the performance criteria for evaluating the proposed method are discussed here. In this work, feature selection is carried out using the Feature Selection Code Library (FSLib) for Matlab [19, 20] (see footnote 1).
Datasets
Three publicly available datasets, Columbia Color [16, 9], CASIA v1.0 [3] and CASIA v2.0 [3], were used in the experiments. All three datasets contain authentic and forged color images.
Columbia
The Columbia uncompressed image splicing detection evaluation dataset contains 183 authentic images and 180 forged images with resolutions from 757 × 568 to 1152 × 768. Images are stored in either TIF or BMP format.
CASIA v1.0
CASIA v1.0 contains a total of 1721 images where 800 images are authentic and 921 images are forged. Images are saved in JPG format with resolution of either 384 × 256 or 256 × 384.
CASIA v2.0
CASIA v2.0 is the largest dataset containing 12613 images with 7491 authentic images and 5122 forged images. Images are stored in JPG, tif or BMP format. The resolution of images varies from 240 × 160 to 900 × 600.
Evaluation criteria
We conducted 10-fold cross validation using the Random Forest classifier available in the open source machine learning package WEKA [6]. For evaluating the performance, four criteria were used: Detection Accuracy, Precision, Recall and True Negative Rate (TNR), as defined in Equations 1, 2, 3 and 4 respectively.
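The equations referred to above are not reproduced in this text. The sketch below uses the standard confusion-matrix definitions of the four criteria and assumes that forged images are the positive class. The function name and the example counts are hypothetical and are not results from the experiments.

```python
def detection_metrics(tp, tn, fp, fn):
    """Standard metrics from confusion-matrix counts.

    Assumption: 'forged' is the positive class, so tp counts forged images
    detected as forged and tn counts authentic images kept as authentic.
    Denominators are assumed to be non-zero.
    """
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
        "tnr": tn / (tn + fp),
    }

# Illustrative counts only.
example = detection_metrics(tp=90, tn=80, fp=20, fn=10)
```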
Results and discussion
In this section, we begin with the detailed experiments conducted to investigate the performance of different combinations of texture descriptors. This is followed by an analysis of the effect of feature selection. The effect of combining the chrominance channels is also analyzed, and the performance of the proposed method on various categories of forgery using CASIA v1.0 and CASIA v2.0 is discussed.
The effect of multi-texture representation
The effect of multi-texture representation is studied by evaluating the detection accuracy using possible combinations of LBP, LPQ, BSIF and BGP. All the combinations are evaluated on Cb, Cr and combined Cb-Cr channels using CASIA v1.0, CASIA v2.0 and Columbia. Table 1 shows the feature vector dimension for each of the possible combinations. The length of the feature vector after feature selection using ReliefF is 970, 480 and 700 for CASIA v1.0, CASIA v2.0 and Columbia respectively. Table 2 shows the result of the evaluation of different combinations of texture descriptors after feature selection. Among the four descriptors, LBP and LPQ showed better performance when applied individually. On CASIA v1.0, LPQ provides the best detection accuracy on Cr, Cb and combined Cb-Cr channels. On CASIA v2.0, LPQ gives better performance on Cr and Cb-Cr channels. On Columbia, LPQ gives better detection accuracy on Cb and Cb-Cr channels. LBP gives the best performance on Cb on CASIA v1.0, and on Cr on Columbia. Even though BSIF shows the lowest performance when applied individually, including BSIF in combinations improved detection accuracy. Overall, the results reveal that combining the texture descriptors improved detection accuracy.
On all datasets, the multi-texture representation improved detection accuracy compared to even the best performing individual descriptor. On CASIA v1.0, gains of 2.44%, 1.22%, and 1.39% are obtained on Cb, Cr and combined Cb-Cr channels respectively. Similarly, on CASIA v2.0, gains of 0.25%, 0.28% and 0.23% are obtained on Cb, Cr and combined Cb-Cr channels respectively. Also, on Columbia, gains of 0.28%, 1.93%, and 2.76% are obtained on Cb, Cr and combined Cb-Cr channels respectively. These results show that the complementarity of the different texture descriptors has helped in improving the performance of the multi-texture representation.
The effect of combining Cb and Cr
Figure 2 illustrates the comparison of detection accuracy on Cb, Cr and combined Cb-Cr on CASIA v1.0, CASIA v2.0 and Columbia datasets. Also, Table 3 gives a detailed evaluation of performance on Cb, Cr and combined Cb-Cr channels for the three datasets. On all the three datasets, the combined Cb-Cr channels resulted in better detection accuracy. Table 3 shows results obtained on combining LBP, LPQ, BSIF and BGP.
The effect of compact representation
Experiments showed that the multi-texture representation improved the detection accuracy over individual texture descriptors. However, representing texture using a number of different texture descriptors results in a high-dimensional feature vector. This makes the idea of combining multiple texture descriptors infeasible due to the heavy memory requirements. Therefore, a feature selection method that selects the most discriminating subset of features should be employed.
In our work, we used the ReliefF feature selection technique to obtain a compact multi-texture representation. Table 4 shows how feature selection affects the performance of the multi-texture representation on the Columbia and CASIA v1.0 datasets. Using the feature selection method, the number of features is reduced from 19680 to 700 on Columbia and from 19680 to 970 on CASIA v1.0. The feature length limits of 700 and 970 were selected based on experiments. On Columbia, the detection accuracy is improved by 1.65% and on CASIA v1.0, the detection accuracy is improved by 14.58%.
Analyzing performance on different types of forgery using CASIA v1.0
CASIA v1.0 contains different types of forged images such as spliced and copy-move images. The forged images are also created with various post-processing operations such as rotation, scaling, deforming and combinations of these operations. Table 5 shows the performance evaluation of the proposed method on these categories of forged images. The best results are obtained for combinations of post-processing operations such as rotation and scaling, and scaling and deforming. The reason for this better performance is that forged image regions to which a combination of post-processing operations is applied exhibit strong variations in texture patterns [2].
Analyzing the performance on forged image regions with different shapes using CASIA v1.0
In CASIA v1.0, the copy-pasted regions in forged images vary in shape. The shapes are arbitrary, circular, rectangular and triangular. Table 6 shows the results obtained for each of these categories. The detection accuracy obtained for arbitrary and triangular shapes is higher than that for circular and rectangular shapes. This indicates that the multiple texture descriptors considered in the proposed method are better at capturing texture variations around forged regions of arbitrary and triangular shapes.
Analyzing performance on different types of forgery using CASIA v2.0
CASIA v2.0 contains samples of various image tampering operations such as image splicing, copy-move, rotation and scaling. Experiments are conducted on each of the categories to analyze the relationship between detection performance and the nature of tampering. Table 7 shows the performance of the multi-texture representation consisting of LBP, LPQ, BSIF and BGP. In all the categories except rotation, the best detection accuracy is obtained on the combined Cb-Cr channels. For rotated images, the best accuracy is obtained on the Cr channel. The accuracy varies between 92.60% and 97.5%.
The effect of entropy filtering
Here, we verified the effect of enhancing the texture variation on detection accuracy. Entropy filtering provided better detection accuracy for image forgery detection in the technique proposed by Agarwal and Chand [1]. Hence, we evaluated the performance of the multi-texture representation on entropy filtered Cb and Cr channels. Table 8 shows the results obtained with and without entropy filtering on the Columbia and CASIA v1.0 datasets. Experimental results show that better detection accuracy is obtained without entropy filtering.
Comparison with other methods
We compared the performance of the proposed method with other methods that were evaluated on CASIA v1.0, CASIA v2.0 and Columbia. Table 9 compares the detection accuracy of various methods. The proposed method used Random Forest classifier as it gave the best performance when experiments were conducted with other classifiers in WEKA. The performance results of all the other methods are taken from respective publications.
The proposed method exhibits good performance on various sub-categories of forgery on CASIA v2.0, even though its performance is lower than that of the other methods on CASIA v1.0 and Columbia. Only the proposed method considered combining multiple texture descriptors for forgery detection. It is important to note that the experimental results shown in Table 2 reveal that the detection accuracy improved when multiple texture descriptors are combined rather than applied individually.
Figure 3 shows the performance comparison on different sub-categories of forgery using CASIA v2.0. The proposed method shows comparable performance with Muhammad et al.’s technique in copy-move forgery. In image splicing, Muhammad et al.’s method shows slightly better performance. Though Muhammad et al.’s method is better than the proposed method in scaling, the proposed method outperforms Muhammad et al.’s method in the rotation and no transformation categories in CASIA v2.0.
Conclusion
In this paper, we proposed a compact multi-texture representation for image forgery detection. Experimental results showed that the multi-texture representation captures subtle texture variations better by drawing on the complementary discriminative power of the individual texture descriptors. We considered four texture descriptors: LBP, LPQ, BSIF and BGP. From YCbCr color space, the chrominance components, Cb and Cr, are decomposed into multi-scale, multi-orientation subbands using SPT. Texture features using the four texture descriptors are extracted from each subband. Finally, the texture features from each subband are concatenated to form the feature vector. The high dimensionality of this combined multi-texture representation is reduced by selecting the most discriminating subset of features using the ReliefF method. The selected features are fed to a Random Forest classifier for forgery detection. We conducted an extensive evaluation of the performance of individual texture descriptors and the multi-texture representation on standard datasets such as CASIA v1.0, CASIA v2.0 and Columbia.
Experimental results showed that the multi-texture representation improves the detection accuracy compared with applying each descriptor individually. Also, experimental evaluation revealed that the compact form of the multi-texture representation resulted in better detection accuracy. The proposed method resulted in better performance than the state-of-the-art method in detecting different categories of forgeries in CASIA v2.0 such as copy-move, rotation, and no transformation. In the future, we plan to evaluate the effect of higher-level subbands of SPT with a combination of more robust texture descriptors.
Footnotes
1. The research in this paper uses the Feature Selection Code Library (FSLib).
Acknowledgments
The authors would like to thank the Higher Education Department, Government of Kerala for funding this research and the Department of Computer Science and Engineering, College of Engineering, Trivandrum for providing the facilities.
