Image forgery detection using deep textural features from local binary pattern map

Abstract

Manipulation of digital images has become common owing to easy access to online photo editing applications and image editing software. Forged images are widely used on social media to create deceitful propaganda about an individual or a particular event, and to fabricate evidence, even in court proceedings. Ensuring the integrity of digital images is therefore of prime significance and has become an active research area. In this paper, a novel technique for image forgery detection is proposed. The method uses the layer activation of Inception-ResNet-v2, a pretrained Convolutional Neural Network (CNN), to extract deep textural features from the Rotation Invariant – Local Binary Pattern (RI-LBP) map of the chrominance image. Non-negative Matrix Factorization (NMF) is used to reduce the dimensionality of the extracted features. The dimensionality-reduced features are used to train a quadratic Support Vector Machine (SVM) classifier that labels images as forged or authentic. The method is assessed on four benchmark datasets (CASIA ITDE v1.0, CASIA ITDE v2.0, CUISDE and IFS-TC). Extensive experimental analysis shows improved detection accuracy compared to the state-of-the-art methods.

Keywords

Deep learning, rotation invariant-local binary pattern, pretrained convolutional neural networks, deep textural features, image forgery detection

1 Introduction

Images often carry sensitive information, and every day a huge number of images are posted on social media platforms. The authenticity of images shared on social media is uncertain, since anyone can purposefully and easily manipulate images using photo editing software such as Adobe Photoshop and PhotoScape. The widespread propagation of manipulated images may cause social or political unrest among the public. Thus, the privacy and authenticity of digital images shared via the Internet are an important concern. Images are also used as evidence in police investigations, and presenting a manipulated image to a court may lead to an incorrect judgment. Digital image forgery detection has therefore become an active research area, and researchers are exploring various methods to detect image forgeries.

Copy move forgery [3] and image splicing [22] are the two broad categories of image manipulation techniques. In copy move forgery, a region of an image is copied and pasted at some other location in the same image, whereas a spliced image is a composite created by copying portions from one or more images and pasting them onto another image. Figure 1 demonstrates examples of image manipulations. Figure 1(a) is an original image of crime scene evidence showing one cartridge, and Fig. 1(b) is the manipulated (copy move forged) version of the original image with two cartridges. The rooster displayed in Fig. 1(c) is copied and pasted onto Fig. 1(d) to create the composite image shown in Fig. 1(e).

Fig. 1

(a) Original image of evidence showing a cartridge (b) Copy move forged image of evidence showing two cartridges [28] (c) Original image one (d) Original image two and (e) Spliced image [9].

Digital image forgery detection approaches can be broadly categorized as active and passive [30]. Active approaches rely on pre-embedded information such as a watermark [18] or a digital signature [19]. Passive techniques rely on image-specific features that can clearly discriminate a forged image from an authentic one. Only a few images in the media contain pre-embedded information, and hence passive methods are extensively studied. The traditional machine learning workflow of passive methods consists of pre-processing the images, handcrafted feature extraction, selection of optimum features and training a suitable classification model. Feature engineering is a time-consuming and complex part of any machine learning framework, owing to the difficulty of defining appropriate features for different types of manipulations.

Recent advances in data-driven techniques such as deep learning using Convolutional Neural Networks (CNNs) [15] have shown exceptional results in general image classification problems. These CNNs are capable of learning rich feature representations directly from images [21]. The layer activations of pretrained CNN models can be used as feature extractors [4] for numerous applications in the field of computer vision.

In this work, a novel method is proposed for image forgery detection by exploiting the power of a pretrained CNN along with the rich texture representation capability of Rotation Invariant – Local Binary Pattern (RI-LBP) maps. The textural inconsistencies in images due to manipulations are captured by RI-LBP maps of chrominance images. The deep textural features are extracted from these RI-LBP maps using the layer activation of a pretrained CNN. The dimensionality of the extracted features is reduced using the Non-negative Matrix Factorization (NMF) technique. The dimensionality-reduced features are used for training a Support Vector Machine (SVM) classifier. The method is evaluated on four benchmark image forgery datasets: (i) Chinese Academy of Sciences Institute of Automation Image Tampering Detection Evaluation version 1.0 (CASIA ITDE v1.0) database, (ii) version 2.0 of CASIA ITDE database (CASIA ITDE v2.0) [9], (iii) Columbia Uncompressed Image Splicing Detection Evaluation (CUISDE) dataset [11] and (iv) Image Forensic Challenge (IFS-TC) dataset. The experimental analysis shows improved detection accuracy compared to other state-of-the-art methods for image forgery detection.

The remainder of this paper is structured as follows. Section 2 discusses the earlier works related to the proposed technique. The proposed method and its details are explained in Section 3, which is followed in Section 4 by the specifications of the experimental setup. The experimental results and discussions are presented in Section 5. Section 6 gives the conclusions and future work.

2 Related work

Many studies have been done on image forgery detection using handcrafted feature engineering techniques, and researchers have proposed various techniques for separating manipulated images from authentic ones. In this section, we discuss some passive image forgery detection methods for copy move and image splicing forgeries.

Al-Hammadi et al. [1] proposed a technique using curvelet transform and LBP for image forgery detection. Curvelet transform is applied on the chrominance images and its LBP histograms are calculated. A feature vector is formed by combining these histograms and is fed to an SVM model. Muhammad et al. [20] applied a Steerable Pyramid Transform (SPT) on the chrominance images. LBP histograms are then obtained for each SPT sub band, and a feature vector is formed by concatenating these histograms. The feature vector is then passed to an SVM classifier.

Hussain et al. [12] suggested a forgery detection technique using multi-scale Weber Local Descriptors (WLD) and multi-scale LBP techniques. A Locally Learning Based (LLB) algorithm is used for feature selection and the selected features are then fed to an SVM classifier. Alahmadi et al. [2] proposed a block based technique using LBP and Discrete Cosine Transform (DCT) for image forgery detection. The chroma component of the image is divided into overlapping blocks, and the LBP image of each block is obtained. DCT is then applied on each LBP image and the statistical measures of the DCT coefficients are used as the feature vector to train an SVM classifier. Vidyadharan and Thampi [32] utilized a multi-texture feature extraction technique that combines texture descriptors such as Local Phase Quantization (LPQ), LBP, Binary Gabor Pattern (BGP) and Binarized Statistical Image Features (BSIF) for detecting image forgeries. These texture features are extracted from the Steerable Pyramid Transform (SPT) sub bands of the image and are then combined to form the multi-texture descriptor. The ReliefF feature selection method is used to generate a compact representation of texture, and the selected features are given to a Random Forest classifier.

A method based on Gabor wavelets and LPQ for detecting image forgery is proposed by Isaac and Wilscy [13]. Gabor wavelet transform is applied on the Cr component of the image at different scales and orientations. Then the LPQ features obtained from the different Gabor sub band images are reduced using NMF technique and given to an SVM classifier. The above-mentioned approaches [1 , 32] utilized handcrafted feature extraction techniques to capture the discriminative features between manipulated regions and authentic regions.

Recently, some image forgery detection methods have used CNNs to detect image forgeries. Rao and Ni [26] designed a CNN architecture and trained the network using image patch samples. This trained CNN is used to extract the patch based features of an image using a sliding window. The extracted features are combined using a feature fusion technique and fed to an SVM classifier. Rota et al. [27] proposed a CNN architecture for tampered image classification, and the network is trained with patches of the training images to perform a classification. Zhou et al. [35] trained a rich model Convolutional Neural Network (rCNN) for detecting image forgery using a special block strategy. Shi et al. [29] proposed a Dual-domain CNN architecture for image forgery detection. The network is trained with image patches to perform the classification. These methods [26 , 35] use image patches or blocks for training a CNN; however, this patch based approach may lead to the loss of evidence in image forgery detection. A large amount of labelled data is essential for obtaining an accurate and reliable classification model. Also, training a CNN architecture from scratch is computationally expensive. In image forgery detection tasks, it is often difficult to obtain a vast amount of labelled training data. To overcome these issues, we use the transfer learning approach [33], exploiting the power of a pretrained CNN as a feature extractor [6, 7] along with the rich textural description capability of LBPs. LBP is a discriminative and computationally efficient texture descriptor [23]. Thus LBPs can capture hidden texture variations due to image manipulations.

These observations motivated us to combine the deep feature extraction power of CNNs and the rich texture description capability of LBPs for forgery detection. In the proposed method, we extract deep textural features from the RI-LBP maps of chrominance images using the layer activation of a pretrained CNN. The NMF technique is used to obtain an optimized number of features, and these features are then given to an SVM classifier with a quadratic kernel for training. The details of the proposed technique, experimental setup and analysis of experimental results are presented in the following sections.

3 Proposed method

An overview of the proposed technique for image forgery detection is provided in Fig. 2. The proposed method combines the high texture description power of LBP with the rich feature representations of a pretrained CNN. The deep textural features are extracted from RI-LBP maps of the chrominance (Cb, Cr) channels of images using the layer activation of Inception-ResNet-v2 [31], a pretrained CNN. The NMF technique is used to reduce the dimensionality of the extracted deep textural features, and these dimensionality-reduced features are used to train a quadratic SVM model for classifying images into forged or authentic.

Fig. 2

An overview of the proposed image forgery detection method.

The method contains the following stages. (A) Conversion of RGB images to the YCbCr color space, since it has been shown in the literature [1, 34] that the chrominance channels (Cb, Cr) are more useful for identifying image forgeries. We conducted experiments to verify this claim and found that higher detection accuracies are obtained when chrominance channels are used, compared to the luminance channel (Y) or RGB images. The details of this experimental analysis are given in Section 5. (B) Obtaining RI-LBP maps from the chrominance components of images. RI-LBP is an effective texture descriptor that is able to capture texture variations in images due to manipulations. (C) Deep textural feature extraction using a pretrained CNN. The fully connected layer of the pretrained network Inception-ResNet-v2 is used as a feature extractor for extracting deep features from RI-LBP maps, since the deeper layers give rich discriminative features [25]. (D) Dimensionality reduction of the extracted features using NMF. (E) Classification of images into forged or authentic using an SVM classifier with a quadratic kernel. The individual steps are explained in the following subsections.

3.1 Conversion of RGB images to YCbCr color space

In this pre-processing step, the RGB color image is converted into the YCbCr color space. The conversion is done using the JPEG 2000 standard as shown in Equation (1). The chrominance-blue (Cb) and chrominance-red (Cr) channels give higher forgery detection accuracies than the luminance (Y) component of the YCbCr image, and hence the two chrominance components (Cb, Cr) are considered for further processing.

$\begin{bmatrix} Y \\ Cb \\ Cr \end{bmatrix} = \begin{bmatrix} 0.299 & 0.587 & 0.114 \\ -0.16875 & -0.33126 & 0.5 \\ 0.5 & -0.41869 & -0.08131 \end{bmatrix} \begin{bmatrix} R \\ G \\ B \end{bmatrix}$ (1)
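
The conversion in Equation (1) can be illustrated with a minimal NumPy sketch. This is not the implementation used in the experiments; the function names `rgb_to_ycbcr` and `chrominance_channels` are hypothetical, and the sketch applies the matrix exactly as printed, without the offset that 8-bit storage conventions usually add to Cb and Cr.

```python
import numpy as np

# Coefficients copied from Equation (1); rows produce Y, Cb and Cr.
RGB_TO_YCBCR = np.array([
    [0.299, 0.587, 0.114],
    [-0.16875, -0.33126, 0.5],
    [0.5, -0.41869, -0.08131],
])


def rgb_to_ycbcr(rgb):
    """Apply Equation (1) to an (H, W, 3) RGB array.

    Returns an (H, W, 3) float array ordered as Y, Cb, Cr.
    No offset is added to Cb/Cr (assumption: the matrix is used as printed).
    """
    rgb = np.asarray(rgb, dtype=np.float64)
    # Each pixel vector [R, G, B] is multiplied by the 3x3 matrix.
    return rgb @ RGB_TO_YCBCR.T


def chrominance_channels(rgb):
    """Return the Cb and Cr planes, the two channels kept for later stages."""
    ycbcr = rgb_to_ycbcr(rgb)
    return ycbcr[..., 1], ycbcr[..., 2]
```

For a neutral grey pixel (R = G = B) both chrominance values come out at approximately zero, which is why these channels respond mainly to color structure rather than brightness.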

3.1 Obtaining Rotation Invariant – Local Binary Pattern (RI-LBP) maps

The texture of an image provides details about the spatial arrangement of color or intensity variations. The manipulations like copy move or splicing induce unnatural textural variations in images. LBPs are good discriminative and computationally proficient texture descriptors [23] which can capture hidden texture variations in manipulated images. Even though LBP efficiently captures the local texture structure, it is not rotation invariant. To attain the rotation invariance in local binary patterns, the RI-LBP [24] is used in this work. RI-LBP_X is obtained by circularly rotating the original LBP_X until its minimum binary value is reached as given in Equation (2), where LBP_X is the local binary pattern of an image considering X neighboring pixels. The function ROR(LBP_X,i) symbolizes the circular shift operator which circularly shifts LBP_X array i times to the right.

${RILBP}_{X} = \min \{ ROR({LBP}_{X}, i) \mid i = 0, 1, \dots, X - 1 \}$ (2)

In the proposed method, RI-LBPs are obtained from the chrominance (Cb, Cr) components of each image using a 3×3 neighborhood. Because a CNN performs convolution operations, which treat pixel values as ordered quantities, the unordered local binary pattern codes cannot be given directly as input to the CNN. To overcome this issue, the local binary pattern codes are mapped to a 3D metric space using Multi-Dimensional Scaling (MDS) [17]. The resulting 3-channel RI-LBP maps are used as input to the CNN model.
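
Equation (2) with a 3×3 neighborhood (X = 8) can be sketched as follows. This is an illustrative NumPy sketch under stated assumptions, not the code used in the experiments: the helper names `lbp_codes`, `ror8` and `ri_lbp` are hypothetical, a neighbour contributes a 1 bit when it is greater than or equal to the centre pixel, border pixels are dropped, and the MDS mapping to three channels is not included.

```python
import numpy as np


def lbp_codes(channel):
    """8-bit LBP codes for the interior pixels of a 2D array (3x3 neighborhood)."""
    c = np.asarray(channel, dtype=np.float64)
    h, w = c.shape
    center = c[1:-1, 1:-1]
    # Neighbour offsets listed in circular order around the centre pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros(center.shape, dtype=np.uint8)
    for bit, (dy, dx) in enumerate(offsets):
        neighbour = c[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        # Set this bit wherever the neighbour is at least as large as the centre.
        codes |= np.where(neighbour >= center, np.uint8(1 << bit), np.uint8(0))
    return codes


def ror8(codes, i):
    """Circularly shift 8-bit codes i positions to the right (the ROR operator)."""
    wide = codes.astype(np.uint16)
    return (((wide >> i) | (wide << (8 - i))) & 0xFF).astype(np.uint8)


def ri_lbp(channel):
    """Equation (2): minimum over all 8 circular shifts of each LBP code."""
    codes = lbp_codes(channel)
    rotations = np.stack([ror8(codes, i) for i in range(8)])
    return rotations.min(axis=0)
```

Rotating the input by 90 degrees shifts every code circularly by two bit positions, so taking the minimum over all shifts leaves the value at each pixel unchanged.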

3.2 Extracting deep textural features using pretrained CNN

CNN is a deep neural network architecture that contains convolutional layers, pooling layers and classification layers. CNNs are used mainly for various image recognition applications. One of the most significant results in deep learning is the use of CNNs for the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) [15]. These pretrained CNNs have learned rich feature representations for a wide range of images. The early layers of pretrained CNNs are able to learn basic features and the deeper layers use these features to produce more discriminative rich features [25].

In the proposed method, the layer activation of Inception-ResNet-v2 [31], a pretrained CNN, is used as a feature extractor. The Inception-ResNet-v2 network achieved the lowest error percentage (Top-1 error of 19.9% and Top-5 error of 4.9%) on the ILSVRC dataset. Rich features are extracted from the layers that are nearer to the classification layer [21], and hence in the proposed technique the fully connected (fc) layer of Inception-ResNet-v2 is used to extract deep features. The dimension of the deep features extracted using the fc layer (‘prediction’) is 1000. Instead of RGB images, RI-LBP maps of images are given as inputs to the pretrained CNN for deep feature extraction, and the extracted deep features are called deep textural features. Hence in this work, we combine the power of a pretrained CNN and the rich texture description capability of LBPs for forgery detection. The dimension of the extracted deep features is reduced using the NMF technique, and the dimensionality-reduced features are used to train an SVM classifier.

3.3 Dimensionality reduction using NMF

Dimensionality reduction techniques help to select the relevant features for training a classifier by removing redundant features. This reduces the size of the feature vector and thereby the training time of the classifier. It also helps to avoid overfitting. In this work, the dimensionality of the extracted deep features is reduced to an optimum size using the NMF technique [16]. Let $X \in ℝ^{D \times N}$ be a non-negative feature matrix whose N columns are the D-dimensional samples. NMF decomposes X into the product of two non-negative matrices U and V as given in Equation (3), where $U \in ℝ^{D \times K}$ and $V \in ℝ^{N \times K}$.

$X \approx {UV}^{T}$ (3)

Each sample x_n in X can be approximated as a linear combination of the columns of U, weighted by the components of the n^th row of V, as in Equation (4).

$x_{n} \approx \sum_{k = 1}^{K} u_{k} v_{nk}$ (4)

Hence, U is a collection of basis vectors, while v_n, the n^th row of V (the n^th column of $V^{T}$), is the coding vector of the n^th data sample. The Squared Euclidean Distance (SED) is the cost function minimized to solve for U and V, as shown in Equation (5).

$O^{NMF} (U, V) = \| X - {UV}^{T} \|^{2}$ (5)
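
One common way to minimize the cost in Equation (5) is the multiplicative update scheme of Lee and Seung [16]. The sketch below is a minimal NumPy illustration under that assumption; the text does not state which solver was used, and the function names `nmf` and `sed`, the iteration count, the random initialization and the small constant `eps` are all hypothetical choices.

```python
import numpy as np


def nmf(X, K, n_iter=200, eps=1e-10, seed=0):
    """Factorize non-negative X (D x N) as U @ V.T with U (D x K), V (N x K).

    Uses multiplicative updates for the squared Euclidean cost.
    Assumes every entry of X is non-negative.
    """
    X = np.asarray(X, dtype=np.float64)
    rng = np.random.default_rng(seed)
    D, N = X.shape
    U = rng.random((D, K)) + eps
    V = rng.random((N, K)) + eps
    for _ in range(n_iter):
        # Element-wise updates keep U and V non-negative at every step.
        U *= (X @ V) / (U @ (V.T @ V) + eps)
        V *= (X.T @ U) / (V @ (U.T @ U) + eps)
    return U, V


def sed(X, U, V):
    """Squared Euclidean (Frobenius) reconstruction error of Equation (5)."""
    return np.linalg.norm(X - U @ V.T, "fro") ** 2
```

With samples stored as the columns of X, row n of V is the K-dimensional coding vector of sample n, so V is the reduced feature matrix that would be passed to the classifier.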

3.4 Classification using quadratic SVM

The dimensionality-reduced deep textural features are used to train an SVM classifier [10] for classifying images into two classes, authentic and forged. We use an SVM with a quadratic kernel for classification. On initial testing, the quadratic kernel was found superior to other kernel functions such as linear, cubic and Gaussian. The quadratic kernel implicitly maps the features into a higher dimensional feature space in which they can be linearly separated. The quadratic kernel function is defined in Equation (6), where $\vec{x}$ and $\vec{z}$ are vectors in feature space.

$K (\vec{x}, \vec{z}) = {(1 + {\vec{x}}^{T} \vec{z})}^{2}$ (6)
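
As a small illustration of Equation (6), the kernel values for two sets of feature vectors can be computed in NumPy as follows. The function name `quadratic_kernel` is hypothetical and the sketch only evaluates the kernel; it does not train an SVM.

```python
import numpy as np


def quadratic_kernel(X, Z):
    """Gram matrix with entries (1 + x^T z)^2 for rows x of X and rows z of Z."""
    X = np.atleast_2d(np.asarray(X, dtype=np.float64))
    Z = np.atleast_2d(np.asarray(Z, dtype=np.float64))
    # X @ Z.T holds all pairwise inner products x^T z.
    return (1.0 + X @ Z.T) ** 2
```

For example, with x = (1, 2) and z = (3, 4) the inner product is 11, so the kernel value is (1 + 11)^2 = 144. In scikit-learn, the same kernel corresponds to `SVC(kernel="poly", degree=2, gamma=1, coef0=1)`, which is one possible way to reproduce this classifier outside MATLAB.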

4 Experimental setup

MATLAB R2019a is used for implementing the proposed technique on a GPU based system. The system contains an Intel(R) Core(TM) i7 processor with an NVIDIA GTX 1060 graphics card. The details of the datasets and the performance evaluation metrics are discussed in the following subsections.

4.1 Datasets

CASIA ITDE v1.0, CASIA ITDE v2.0, CUISDE and IFS-TC¹ are the four publicly accessible image forgery detection datasets used for evaluating the proposed method. The details of these benchmark datasets are given in Table 1.

Table 1
Details of datasets

Dataset	Original	Forged	Total	Image format	Image resolution
CASIA ITDE v1.0	800	921	1721	JPEG	384×256
CASIA ITDE v2.0	7491	5123	12614	JPEG, TIFF, BMP	384×256, 900×600
CUISDE	183	180	363	TIFF, BMP	757×568, 1152×768
IFS-TC (Phase-1-Train)	1050	450	1500	PNG	1025×768

4.2 Performance evaluation metrics

In the proposed method, forged images are taken as positive and authentic images are considered as negative. The following performance metrics are used for assessing the proposed method:

$Accuracy = 100 \times \frac{(TP + TN)}{(TP + TN + FP + FN)}$ (7)

$Precision = \frac{TP}{(TP + FP)}$ (8)

$Recall = \frac{TP}{(TP + FN)}$ (9)

$F_{Measure} = \frac{2 \times (Precision \times Recall)}{(Precision + Recall)}$ (10)

where TP represents True Positives (the count of forged images that are correctly classified as forged), TN represents True Negatives (the count of authentic images that are correctly classified as authentic), FP represents False Positives (the count of authentic images that are misclassified as forged) and FN represents False Negatives (the count of forged images that are misclassified as authentic).

Accuracy is defined as percentage of images that are classified correctly and is calculated as in Equation (7). Precision and Recall of the classifier are obtained using Eqs. (8) and (9), respectively. F_Measure is the harmonic mean of precision and recall and is calculated using Equation (10).
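
Equations (7) to (10) translate directly into code. The sketch below is illustrative only; the function name `forgery_metrics` is hypothetical, and it does not guard against empty classes (a zero denominator).

```python
def forgery_metrics(tp, tn, fp, fn):
    """Return (accuracy in %, precision, recall, F-measure) from confusion counts.

    Forged images are the positive class, authentic images the negative class.
    """
    accuracy = 100.0 * (tp + tn) / (tp + tn + fp + fn)   # Equation (7)
    precision = tp / (tp + fp)                           # Equation (8)
    recall = tp / (tp + fn)                              # Equation (9)
    f_measure = 2 * precision * recall / (precision + recall)  # Equation (10)
    return accuracy, precision, recall, f_measure
```

With made-up counts of TP = 90, TN = 95, FP = 5 and FN = 10 (not taken from the experiments), the accuracy is 92.5%, the precision is 90/95, the recall is 0.9 and the F-measure is 180/195.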

5 Experimental results and discussions

Four experiments are performed to study the efficacy of the proposed work. The following experiments are conducted to study:

Effect of chrominance channels on the detection accuracy

Effect of feature dimensionality reduction on detection accuracy

Effect of extracting deep textural features from RI-LBPs

Performance of SVM classifier with other classifiers

The experiments are done on four standard image forgery datasets, and finally a comparison with state-of-the-art methods is made to evaluate the performance of the proposed work.

5.1 Effect of chrominance channels on the detection accuracy

In this experiment, we evaluate the effectiveness of the chrominance channels (Cb, Cr), the luminance channel (Y) and RGB images in detecting image forgeries. The detection accuracies are evaluated for each case. The dimension of the deep features extracted from the RI-LBP map of each image is 1000. The dimensionality of these features is reduced using the NMF technique, and the optimal feature dimension for each dataset is found empirically. The optimum feature dimension for each dataset is tabulated in Table 2. The experiments are performed using these optimum numbers of features. The results are shown in Fig. 3, which shows that higher accuracies are obtained when chrominance channels are used for feature extraction. Hence in this work, the chrominance channels (Cb, Cr) are used for forgery detection and analysis.

Table 2
Optimum feature dimensions for various datasets

Dataset	Optimum feature dimension
CASIA ITDE v1.0	500
CASIA ITDE v2.0	600
CUISDE	52
IFS-TC (Phase-1-Train)	350

Fig. 3

Detection accuracies of various datasets by considering chrominance channels (Cr and Cb), luminance channel (Y) and RGB images.

Table 3 gives the performance evaluation of the proposed method using the chrominance channels (Cb, Cr) and the effect of combining them. It can be observed from Table 3 that the highest detection accuracy for CASIA ITDE v1.0, 99.1%, is obtained with a recall of 1.00 and a precision of 0.99 when the Cr channel is used. The highest accuracy for CASIA ITDE v2.0, 99.3%, is obtained with the Cb channel. The highest detection accuracy for CUISDE, 98.3%, is obtained when the features from the Cb and Cr channels are concatenated (Cb+Cr). For IFS-TC, the highest accuracy, 97.7%, is obtained with the Cb channel. It can also be seen that a detection accuracy greater than 99% is obtained for CASIA ITDE v2.0 in all three cases (Cb, Cr, Cb+Cr).

Table 3

Performance evaluation

Dataset	Channel	Accuracy (%)	Recall	Precision	F_Measure
CASIA ITDE v1.0	Cr	99.1	1.00	0.99	0.99
	Cb	98.0	0.99	0.98	0.98
	Cb+Cr	97.2	0.99	0.96	0.97
CASIA ITDE v2.0	Cr	99.1	0.99	0.99	0.99
	Cb	99.3	0.99	0.99	0.99
	Cb+Cr	99.1	0.99	0.99	0.99
CUISDE	Cr	95.0	0.98	0.93	0.95
	Cb	96.7	0.99	0.95	0.97
	Cb+Cr	98.3	0.99	0.98	0.98
IFS-TC (Phase-1-Train)	Cr	95.6	0.91	0.95	0.93
	Cb	97.7	0.96	0.96	0.96
	Cb+Cr	96.5	0.93	0.95	0.94

5.2 Effect of feature dimensionality reduction on detection accuracy

In this experiment, we study the impact of feature dimensionality reduction using the NMF technique on the detection accuracy. The number of deep features extracted from each image is 1000, and the optimum number of features after feature reduction for each dataset is given in Table 2. The detection accuracy before and after applying feature reduction on the chrominance Cb channel is shown in Fig. 4. The accuracies obtained without feature reduction are 94.2% for CASIA ITDE v1.0, 95.1% for CASIA ITDE v2.0, 91.5% for CUISDE and 94.0% for IFS-TC. After applying feature reduction using NMF, the detection accuracy improves on every dataset. The accuracy for CASIA ITDE v1.0 improves from 94.2% to 98.0% with a feature dimension of 500. For CASIA ITDE v2.0, the accuracy rises from 95.1% to 99.3% with a feature dimension of 600. The accuracy for CUISDE rises from 91.5% to 96.7% with a feature dimension of 52, and the accuracy for IFS-TC increases from 94.0% to 97.7% with a feature dimension of 350. Thus feature reduction using NMF increased the detection accuracy by about 4 percentage points on average across the four datasets.
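
As a quick check on this average, the per-dataset gains for the Cb channel can be recomputed from the before and after accuracies quoted above (CASIA ITDE v1.0, CASIA ITDE v2.0, CUISDE and IFS-TC, in that order):

$\frac{(98.0 - 94.2) + (99.3 - 95.1) + (96.7 - 91.5) + (97.7 - 94.0)}{4} = \frac{3.8 + 4.2 + 5.2 + 3.7}{4} = \frac{16.9}{4} \approx 4.2$

that is, roughly 4 percentage points, with the largest gain (5.2 points) on CUISDE.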

Fig. 4

Detection accuracies before and after feature reduction considering Cb channel.

5.3 Effect of extracting deep textural features from RI-LBP maps

In this experiment, the effect of using the RI-LBP map of images in the proposed method is investigated. To study this, deep features are extracted directly from the chrominance channels of images using the fully connected layer of Inception-ResNet-v2. The extracted features are reduced using the NMF technique and a quadratic SVM classifier is trained using these features. The results are compared with the detection accuracies obtained by using deep textural features extracted from RI-LBP maps of the chrominance channels. The Cb channel is considered for this comparison and the results are presented in Fig. 5. The detection accuracies are higher when deep features are extracted from RI-LBP maps. This is because local binary patterns are powerful texture descriptors, and textural inconsistencies induced in forged images are well captured by LBPs. Thus, extracting deep textural features from RI-LBP maps of the chrominance channels enhances the detection accuracy.

Fig. 5

Detection accuracies for various datasets without and with RI-LBP map for deep feature extraction.

5.4 Performance of SVM classifier with other classifiers

To examine the effectiveness of the quadratic SVM classifier in the proposed method, we compared its performance with that of other classifiers. K Nearest Neighbor (KNN) [8] and Decision Tree (DT) [5] are the two classifiers used for the comparison. The CUISDE dataset is considered for this evaluation. The experiment is done using the optimum deep features extracted from the RI-LBP maps of the (Cb+Cr) channels of the images, and Table 4 shows the comparison results. The quadratic SVM classifier reaches 98.3% accuracy, well above KNN (86.8%) and DT (76.6%).

Table 4
Comparison of detection accuracy (%) obtained by using various classifiers on CUISDE dataset

Classifiers	Accuracy (%)
Quadratic SVM	98.3
KNN	86.8
DT	76.6

5.5 Comparison with state-of-the-art methods

The detection accuracy of the proposed method is compared with four state-of-the-art methods that use deep learning techniques for detecting image forgeries [26, 27, 29, 35]. These methods used the CASIA ITDE v1.0 and CASIA ITDE v2.0 datasets for experimental evaluation, and their detection accuracies are taken directly from the respective papers. Table 5 gives the comparison; the highest detection accuracies obtained by the proposed method (99.10% for CASIA ITDE v1.0 using Cr and 99.30% for CASIA ITDE v2.0 using Cb) are used. The proposed method is also evaluated on the CUISDE and IFS-TC (Phase-1-Train) datasets. To the best of our knowledge, forgery detection on the CUISDE and IFS-TC datasets using deep learning techniques has not been reported in the literature so far. The highest detection accuracies reported using traditional machine learning techniques are 98.72% [14] for CUISDE and 85% [13] for IFS-TC. The proposed method obtained a comparable detection accuracy of 98.30% on the CUISDE dataset, while a large improvement is obtained for the IFS-TC dataset with an accuracy of 97.70%. This comparative analysis indicates that the proposed method, using deep textural features obtained from RI-LBP maps of chrominance images, outperforms the compared deep learning methods on the CASIA datasets and the best reported result on IFS-TC, while remaining comparable on CUISDE.

Table 5
Comparison of the proposed method with state-of-the-art methods (detection accuracy, %)

Methods	CASIA ITDE v1.0	CASIA ITDE v2.0
Proposed Method	99.10	99.30
Zhou et al. [35]	98.04	98.02
Rota et al. [27]	–	97.44
Rao and Ni [26]	98.04	97.83
Shi et al. [29]	–	99.25

6 Conclusion and future work

This paper proposes a novel method for detecting image forgeries that uses the layer activation of a pretrained CNN, Inception-ResNet-v2, as a feature extractor. In this work, RGB images are converted into the YCbCr color space and the RI-LBP maps of the chrominance (Cb, Cr) channels are obtained. The fully connected layer of Inception-ResNet-v2 is used for extracting deep textural features from the RI-LBP maps. The NMF technique is used for reducing the dimensionality of the extracted deep features, and these features are used to train a quadratic SVM classifier for classifying images into forged or authentic. The proposed method is assessed on four benchmark datasets: CASIA ITDE v1.0, CASIA ITDE v2.0, CUISDE and IFS-TC. The results show that the proposed technique outperforms the compared state-of-the-art methods on the CASIA and IFS-TC datasets and is comparable to the best reported result on CUISDE. The power of the pretrained CNN, along with the rich texture representation capability of RI-LBP maps, helped to increase the performance of the method, as is evident from the results obtained. In future, we aim to develop efficient methods to detect deepfake images.

References

1. Al-Hammadi, Muhammad, Hussain and Bebis, Curvelet transform and local texture based image forgery detection, Advances in Visual Computing. ISVC 2013. Lecture Notes in Computer Science, Springer (2013), 503–512.

2. Alahmadi, Hussain, Aboalsamh, Muhammad, Bebis and Mathkour, Passive detection of image forgery using DCT and local binary pattern, Signal, Image and Video Processing 11 (2017), 81–88.

3. Alahmadi A.A., Hussain, Aboalsamh, Muhammad and Bebis, Splicing image forgery detection based on DCT and local binary pattern, 2013 IEEE Global Conference on Signal and Information Processing, IEEE, 2013, 253–256.

4. Ali Sharif, Hossein, Josephine and Stefan, CNN Features off-the-shelf: An Astounding Baseline for Recognition, IEEE Conference on Computer Vision and Pattern Recognition Workshops 2014, 512–519.

5. Breiman, Classification and Regression Trees, Routledge, New York, 1984.

6. Canziani, Paszke and Culurciello, An analysis of deep neural network models for practical applications, arXiv preprint arXiv:1605.07678v4 (2017).

7. Cimpoi, Maji, Kokkinos and Vedaldi, Deep Filter Banks for Texture Recognition, Description and Segmentation, Journal of Computer Vision 118 (2016), 65–94.

8. Cover and Hart P.E., Nearest neighbor pattern classification, IEEE Transactions on Information Theory 13 (1967), 21–27.

9. Dong, Wang and Tan, CASIA image tampering detection evaluation database, 2013 IEEE China Summit and International Conference on Signal and Information Processing, IEEE, 2013, 422–426.

10. Hsu C.-W., Chang C.-C. and Lin C.-J., A Practical Guide to Support Vector Classification, 2003.

11. Hsu Y.-F. and Chang S.-F., Detecting image splicing using geometry invariants and camera characteristics consistency, IEEE International Conference on Multimedia and Expo, 2006, 549–552.

12. Hussain, Qasem, Bebis, Muhammad, Aboalsamh and Mathkour, Evaluation of image forgery detection using multi-scale weber local descriptors, International Journal on Artificial Intelligence Tools 24 (2015), 1–28.

13. Isaac M.M. and Wilscy, Multiscale local gabor phase quantization for image forgery detection, Multimedia Tools and Applications 76 (2017), 25851–25872.

14. Isaac M.M. and Wilscy, Image forgery detection using region-Based Rotation invariant Co-occurrences among adjacent LBPs, Journal of Intelligent and Fuzzy Systems 34 (2018), 1679–1690.

15. Krizhevsky, Sutskever and Hinton G.E., ImageNet classification with deep convolutional neural networks, Advances in Neural Information Processing Systems (2012), 1097–1105.

16. Lee D.D. and Seung H.S., Algorithms for non-negative matrix factorization, Advances in Neural Information Processing Systems (2001), 556–562.

17. Levi and Hassner, Emotion recognition in the wild via convolutional neural networks and mapped binary patterns, Proceedings of the 2015 ACM on International Conference on Multimodal Interaction, ACM (2015), 503–510.

18. Lu C.S. and Liao H.Y.M., Multipurpose watermarking for image authentication and protection, IEEE Transactions on Image Processing 10 (2001), 1579–1592.

19. Lu C.S. and Liao H.Y.M., Structural digital signature for image authentication: an incidental distortion resistant scheme, IEEE Transactions on Multimedia 5 (2003), 161–173.

20. Muhammad, Al-Hammadi M.H., Hussain and Bebis, Image forgery detection using steerable pyramid transform and local binary pattern, Machine Vision and Applications 25 (2014), 985–995.

21.

Nanni

, Ghidoni

and Brahnam

, Handcrafted vs. non-handcrafted features for computer vision classification, Pattern Recognition71 (2017), 158–172.

22.

T.-T.

and Chang

S.-F.

, A model for image splicing, 2004 International Conference on Image Processing (ICIP), IEEE, 2004, 1169–1172.

23.

Ojala

, Pietikäinen

and Harwood

, A comparative study of texture measures with classification based on feature distributions, Pattern Recognition29 (1996), 51–59.

24.

Ojala

, Pietikäine

and Mäenpää

, Multiresolution gray-scale and rotation invariant texture classification with local binary patterns, IEEE Transactions on Pattern Analysis and Machine Intelligence24 (2002), 971–987.

25.

Oquab

, Bottou

, Laptev

and Sivic

, Learning and transferring mid-level image representations using convolutional neural networks, Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (2014), 1717–1724.

26.

Rao

and Ni

, A deep learning approach to detection of splicing and copy-move forgeries in images, 2016 IEEE InternationalWorkshop on Information Forensics and Security (WIFS), IEEE, 2017, 1–6.

27.

Rota

, Sangineto

, Conotter

and Pramerdorfer

, Bad Teacher or Unruly Student: Can Deep Learning Say Something in Image Forensics Analysis? 2016 23rd International Conference on Pattern Recognition (ICPR), IEEE, 2017, 2503–2508.

28.

Shah

, Shinde

and Kukreja

, Retouching detection and steganalysis, International Journal of Engineering Innovation & Research2 (2013), 487–490.

29.

Shi

, Shen

, Kang

and Lv

, Image manipulation detection and localization based on the dual-domain convolutional neural networks, IEEE Access6 (2018), 76437–76453.

30.

Sutthiwan

, Shi

Y.Q.

, Zhao

, Ng

T.T.

and Su

, Markovian rake transform for digital image tampering detection, Transactions on Data Hiding and Multimedia Security VI. Lecture Notes in Computer Science, Springer (2011), 1–17.

31.

Szegedy

, Ioffe

, Vanhoucke

and Alemi

A.A.

, Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning, Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence (AAAI-17) (2017), 4278–4284.

32.

Vidyadharan

D.S.

and Thampi

S.M.

, Digital image forgery detection using compact multi-texture representation, Journal of Intelligent and Fuzzy Systems32 (2017), 3177–3188.

33.

Yosinski

, Clune

, Bengio

and Lipson

, How Transferable are features in deep neural networks? Proceedings of the 27th International Conference on Neural Information Processing Systems, MIT Press (2014), 3320–3328.

34.

Zhao

, Li

and Wang

, Detecting digital image splicing in chroma spaces, International Workshop on Digital Watermarking, Springer, 2010, 12–22.

35.

Zhou

, Ni

and Rao

, Block-based convolutional neural network for image forgery detection, Digital Forensics and Watermarking. IWDW 2017. Lecture Notes in Computer Science, Springer (2017), 65–76.


Table 1
Details of datasets

Dataset                   Original   Forged   Total   Image format       Image resolution
CASIA ITDE v1.0           800        921      1721    JPEG               384×256
CASIA ITDE v2.0           7491       5123     12614   JPEG, TIFF, BMP    384×256, 900×600
CUISDE                    183        180      363     TIFF, BMP          757×568, 1152×768
IFS-TC (Phase-1-Train)    1050       450      1500    PNG                1025×768


Table 2
Optimum feature dimensions for various datasets

Dataset                   Optimum feature dimension
CASIA ITDE v1.0           500
CASIA ITDE v2.0           600
CUISDE                    52
IFS-TC (Phase-1-Train)    350

Table 4
Comparison of detection accuracy (%) obtained by using various classifiers on the CUISDE dataset

Classifier       Accuracy (%)
Quadratic SVM    98.3
KNN              86.8
DT               76.6

Table 5
Comparison of the proposed method with state-of-the-art methods (detection accuracy, %)

Method              CASIA ITDE v1.0   CASIA ITDE v2.0
Proposed method     99.10             99.30
Zhou et al. [35]    98.04             98.02
Rota et al. [27]    –                 97.44
Rao and Ni [26]     98.04             97.83
Shi et al. [29]     –                 99.25