Abstract
To abandon the use of overlapping block division and resist the tampering factor of illumination change, a novel copy-move forgery detection method is proposed based on Maximally Stable Extremal Regions (MSERs) and the Local Intensity Order Pattern (LIOP), that integrates block-based and keypoints-based methods. The method involves the following steps: first, affine transformation invariant MSERs are used to maintain the geometrical transformation invariance and reduce computational complexity; second, LIOP features are used to describe the texture and resist illumination change; and finally, RANdom SAmple Consensus is applied to remove false matches. The experiments indicate that the proposed method has great performance for scaling, rotation and illumination changes. Moreover, the method has the high robustness to Gaussian noise, Gaussian blur and JPEG compression.
Keywords
Introduction
Due to the increasing ease of image forgeries in daily life, such as social networks, advertisements and court evidence, the authentication of images has become an important issue. Passive forgery detection, which analyses the content of an image, is an effective way to detect tampering without watermarking insertion. Copy-move forgery is an operation involving copying a region and pasting it in same image, where the illumination and compressed factor are the same.
The Copy-Move Forgery Detection (CMFD) methods could be divided into three classifications, namely, overlapping block-based, keypoints-based and non-overlapping irregular block-based.
In the overlapping block-based scheme, Fridrich first proposed a CMFD method based on Discrete Cosine Transform (DCT) [1]. Several improved DCT methods [2, 3] have focused on the robustness of noise and blur. Prakash applied overlapped circular blocks and DCT features than the prevalent methods, which had lower computational complexity [3]. Moreover, several features have been applied to CMFD block-based schemes, such as Non-negative Matrix Factorization (NMF) [4], Principal Component Analysis (PCA) [5], Dyadic Wavelet Transform (DyWT) [6] and Fourier-Mellin Transform (FMT) [7]. To solve the rotation transformation invariance, the Hu [8], Blur [9] and Zernike [10] moments are used as features in CMFD. However, the overlapping block-based CMFD scheme, in which the input image is divided into overlapping blocks, cannot address the geometrical transformation of the forgery regions and significantly increases the computational expense. In the keypoints-based schemes, Scale Invariant Feature Transform (SIFT) [11], improved SIFT [12, 13] and Speeded Up Robust Feature (SURF) [14] are widely used. A feature descriptor called DAISY [15] is used to resist contrast variation and scale changes. Because of the efficiency of the rapid extraction and high matching rate, a CMFD method based on scaled Oriented FAST and Rotated BRIEF (ORB) [16] was presented. Alfraih proposed the method based on MSER features and K-means Clustering [17], which had a great performance on removing false matching. Nonetheless, the keypoints-based CMFD scheme is not robust to smooth regions and performs unevenly in post-processing operations. Recently, several novel schemes based on non-overlapping irregular blocking, such as over-segmentation [18, 19], Dense-Field [20], hashing [21] and triangles of local keypoints [22], were presented. Zandi applied an iterative scheme to detect keypoints even in low contrast regions, which had better performance than state-of-the-art methods [23]. Abdel-Basset proposed 2-level clustering strategy based on spatial and transformation domains, able to individuate the altered areas, with had high reliability and could deal with multiple cloning [24].
Nonetheless, most of CMFD schemes are useless for copy-move with illumination change. Therefore, we propose a novel CMFD method to maintain the advantages of the block-based and keypoints-based schemes. The affine transformation invariance patches are used to maintain the geometrical transformation and reduce the computational complexity. The Local Intensity Order Pattern (LIOP) [25] features are used to effect illumination change.
The rest of this paper is organized as follows. In Section II, the proposed CMFD method is expounded, with each step described in detail. Then, the dataset, metrics and experiments on different tampering factors and types of post-processing are shown in Section III. Finally, Section IV presents the conclusions.
Overview of the proposed CMFD method
This section explains the four steps of the proposed CMFD method. The framework is shown in Fig. 1. First, affine transformation invariance patches are extracted by applying Maximally Stable Extremal Regions (MSERs) [26], which have the advantage of affine transformation invariance. Second, each region is described by the LIOP features, which are invariance to illuminate change to make sure matching in copied and pasted regions. Third, LIOP features are matched by Euclidean distance. However, there have some dissonant false matches, as is shown in Fig. 1(c). Finally, false matches are removed through RANdom SAmple Consensus (RANSAC) [27].

The framework of the proposed CMFD method.
To abandon the use of traditional overlapping blocks division and to improve the computational efficiency, affine transformation invariance patches are extracted from the input image. The procedure is described in Fig. 2.

Procedure for affine transformation invariance patch extraction.
The Maximally Stable Extremal Regions (MSERs) approach, proposed by Matas, was evaluated as an excellent region extraction method in [28]. MSERs are invariant to affine transformation and image intensity, which ensures that the same regions are extracted when tampered areas are transformed and the illumination is changed. Moreover, the computational complexity of MSERs is almost linear with the number of pixels, which can improve the computational efficiency. Therefore, MSERs, which are stable over a large range of thresholds, are extracted first. A more detailed procedure can be found in [26].
Because the MSERs are irregular in shape, as shown in Fig. 2(c), they are usually fitted to ellipses using the covariance matrix C, which is defined as Equation (1):
Since Local Intensity Order Pattern is invariant to rotation and illumination changes, which was applied to describe the patch texture in our method. The procedure for the LIOP features is explained as Fig. 3.

Procedure of LIOP descriptor extraction.
Region division is based on the intensity order, which is invariant to rotation and illumination changes and contains spatial information. For each affine transformation invariant patch, patch(i), the pixels in the patch are first sorted and divided equally to B bins (B = 6 according to [26]), where the number of pixels in each bins are the same. The ith bin is recorded as bin i (), as is shown in Fig. 3(b).
LIOP descriptor
Assume that P(x0)={I(x1), I(x2), I(x3), ... , I(x
N
)}, where P(x0) is the set of grey values, I(x
i
) is the grey value of point x
i
and x
i
is the ith neighbouring sample point of point x0, as is shown in Fig. 4(a). Let P(x0) be sorted in non-descending order. The index of the is recorded as γ (P (x0)) = π = {i1, i 2, · · ·, i
N
}, where π ∈ Π, Π is all of the possible permutations of {1, 2, …, N} (N = 4 according to [24]), and γ() is the function mapping π to ind (π), as is shown in Fig. 4(b). The feature descriptor φ (x0) is defined as Equation (3):

The procedure of LIOP descriptor conduction.
Finally, for each affine transformation invariant patch in Section 2.2.1, the LIOP descriptor of patch(i) is defined as Equation (5):
In the matching process, the Euclidean distance is used to describe the similarity of the LIOP features, which indicate the similarity of patches. Mathematically, the Euclidean distance between patch(a) and patch(b) is defined as Equation (6):
As shown in Fig. 1(c), there are several false matches. Therefore, RANSAC, proposed by Fischler [27], was applied to evaluate the affine transformation model through iteration to maintain the inliers (correct matches) and remove the outliers (false matches) that do not fit the model. After a certain number of iterations, the affine transformation model has the greatest number of correct matches. The false match removal result after RANSAC is shown in Fig. 1(d).
Dataset
In order to evaluate the efficiency of the algorithm, the original images are the format of TIFF and PNG, which are constructed by Christlei [29], CoMoFoD [30], COVERAGE [31] and total have 348 images. Then, the copy-move forged regions are transformed using 4 tampering factors: naive, rotation, scaling and illumination. As a result, the number of tampered images are 4176. naive: The copied regions are pasted in the same image with no geometric transformation. rotation: The copied regions are rotated by 5 different angles {2°, 10°, 45°, 90°, 180°}. scaling: We scale the copied regions by 4 different scaling factors {0.7, 0.9, 1.1, 1.2}. illumination: To perfectly adapted to background, two different settings, namely, uniform illumination and non-uniform illumination, are applied to the forged regions. The uniform illumination is that forged regions are operated through directional light and non-uniform illumination is that forged regions are operated through omni light in Photoshop CS4.
Metrics
The performance of the proposed CMFD method is evaluated at the image level, measured by the ratio of the number of correctly detected images to the number of tampered images (True positive rate, T
PR
) and the ratio of the number of falsely detected images to the number of original images (False positive rate, F
PR
), which are defined by Equation (8)–(9).
Experiments on tampering factors
To evaluate the performance of the proposed method, the 4 tampering factors in our dataset were estimated. Example results are shown in Fig. 5. As shown in the third and fourth columns of Fig. 5, CMFD-SIFT [13] and ORB [16] almost could detect the copy-move forgery in 4 tampering factors, but it could not cover full tampered regions. As shown in the fifth column of Fig. 5, Dense-field [20] accurately identifies the tampering location for the naïve and scaling, however, it is ineffective for rotation and illumination. In contrast, according to the last column of Fig. 5, our proposed method performs well on naïve images, rotation, scaling, illumination, and achieves the accurate tampering location with no false matches.

Example results for different tampering factors with respect to naïve images, rotation, scaling and illumination from top to bottom. Column 1 shows the original images. Column 2 shows the tampered images, where red solid lines are the copied regions and red dashed lines are the pasted regions. Column 3 shows the results based on CMFD-SIFT [13], where the green lines are the matching regions. Column 4 shows the results based on ORB [16], where the red lines are the matching regions. Column 5 shows the results based on Dense-field [20], where the black regions are the tampered regions. Column 6 shows the results based on the proposed method, where the red ellipses are MSERs and the green lines are the matching regions.
As is shown in Fig. 6, the statistical results are analysed through ROC curves, where indicate that our proposed method has better performance than CMFD-SIFT [13], ORB [16] and Dense-field [20].

In this section, we evaluate the robustness of our proposed CMFD method against three types of post-processing attacks, namely, Gaussian white noise, Gaussian blur and JPEG compression, on applied to the tampered images mentioned in Section 3.3.1. Gaussian white noise: Forged images are processed by Gaussian noise through the Matlab function Gaussian blur: Gaussian blur is added to the forged images, with the same window size, (w = 3) but different sigma (σ= 1, 3, 5). JPEG compression: Tampered images are re-compressed with different JPEG quality factors, varying from 100 to 40, in steps of – 20.
Some examples of the results are shown in Fig. 7. As shown in the second and third column of Fig. 7, CMFD-SIFT [13] and ORB [16] only locate parts of tampered region on Gaussian blur, JPEG compression and fail to locate tampered region on Gaussian noise. On the contrary, Dense-field [20] detectstampered regions in Gaussian blur but has false detection on JPEG compression and Gaussian noise from the forth column of Fig. 7. Moreover, as shown in the last column of Fig. 7, the proposed method exhibits the best robustness to noise, blur and JPEG compression. From a statistical analysis of the ROC curves, which are shown in Fig. 8, our proposed method has higher TPR at the same FPR than CMFD-SIFT [13], ORB [16], Dense-field [20] for different degrees of Gaussian noise, Gaussian blur and JPEGcompression.

Example results for post-processing with Gaussian noise, Gaussian blur and JPEG compression from top to bottom. Column 1 shows the results of the tampered images, where red solid lines are the copied regions and red dashed lines are the pasted regions. Column 2 shows the results based on CMFD-SIFT [13], where the green lines are the matching regions. Column 3 shows the results based on ORB [16], where the red lines are the matching regions. Column 4 shows the results based on Dense-field [20], where the black regions are the tampered regions. Column 5 shows the results based on the proposed method, where the red ellipses are MSERs and the green lines are the matching regions.

In this section, time-consuming is recorded through testing CoMoFoD database. Since the image size of CoMoFoD are all 512*512, which would compare equally with CMFD-SIFT [13], ORB [16], Dense-field [20]. The hardware environment of the experiments is a 2.8 GHz Intel Core i5 processor; the software used is Matlab R2014b. In order to clearly analyse, the frame is divided into feature extraction and matching. The result is shown in Table 1. It’s obvious that the block-based method Dense-field [20] has the highest time-consuming. CMFD-SIFT [13] applied Agglomerative hierarchical clustering (AHC) to match, which consumes more on matching. ORB [16] extra needs to build pyramid scale space. Our proposed method perform better than CMFD-SIFT [13] and ORB [16], since we use MSERs to extract affine invariance regions.
The comparison of time-consuming
The comparison of time-consuming
This paper presents a novel CMFD method based on MSERs and LIOP, integrating block-based and keypoints-based methods. Our proposed method abandons the traditional block-based scheme that divides the image into overlapping blocks and maintains the superiority of the keypoints-based scheme, which is effective for rotation and scale geometric transforms. Moreover, our method performs well on the tampering factor of illumination change and is robust against Gaussian noise, Gaussian blur and JPEG compression. However, our proposed method is still not effective for some degrees of rotation and scaling, which will be improved in future research.
