Abstract
The rapid progress in the transfer of information and its availability are the reasons behind the widespread use of the Internet. Among the various forms of digital data to which watermarking can be applied for security and authentication, document images are the most complicated and challenging category. Watermarking a document image is very difficult because these images have very limited redundancy, and for this reason there has been very little research in this area. Comprehensive research is needed to ensure the effective assessment, review and implementation of document image watermarking techniques. This paper reviews existing document image watermarking techniques against different evaluation parameters. Based on this review, a variety of challenges and directions are identified for the development of effective watermarking techniques for document images.
Keywords
Introduction
Technological innovations in the world of information technology have developed an environment in which information can be stored, exchanged, duplicated, copied and distributed. The illegal manipulation and dissemination of digital media such as text documents, images, video and audio has increased with this growth. Information security monitoring, which requires not only encryption but also traffic safety, is now increasingly in demand [88,106].
Many approaches, such as cryptography, watermarking and steganography, are used to secure digital information. Digital watermarking uses an algorithm to imperceptibly embed a user watermark in digital data, as shown in Fig. 1. The data may be in various formats such as text document, image, audio or video. Digital watermarking can be used to secure ownership and ensure authentication, so that the owner of the digital media can prove ownership if required. The main objective of a watermarking technique is to secure a digital document [39,67,87,95].
Figure 1 shows the general embedding and extraction process. The watermark is embedded into the document with the help of a key, and the watermarked document is transferred via a communication channel. The channel is not secure, and the watermarked document can be altered by attacks such as insertion, deletion and reordering. On the receiver side, the watermark is extracted from the watermarked document and compared with the original watermark. If the two watermarks are the same, the document is authenticated; otherwise it is treated as tampered.

Watermark embedding and tamper detection process.
A digital document image generally consists of textual contents which are prone to common image processing attacks. This vulnerability makes their security and protection a key problem. Document image security is problematic due to the lack of effective watermarking techniques [4,128]. Preserving the integrity of digital content while maintaining its availability and secrecy is important. The rise of the Internet has led to growth in the use of many languages such as Chinese, English, Arabic and Spanish [85]. In the LayoutLM model, text and layout information are used jointly in a single framework, and the model was evaluated on three tasks: form understanding, receipt understanding and scanned document image classification [114]. Due to variations in the nature and characteristics of text contents, most of the existing document watermarking techniques have concentrated on certain languages only. To resolve the challenges and vulnerabilities in the transmission of documents and the hiding of information caused by language barriers, more general document watermarking techniques that can be extended to any kind of text need to be designed and developed. Figure 2 shows the year-wise number of papers published in the document watermarking domain. The number of papers on document watermarking has been decreasing over the last four years. This suggests that novel watermarking techniques, applicable to document or text images and able to exploit the specific features of these images, need to be designed.
The purpose of this analysis is to conduct a comprehensive investigation into the current status of document watermarking development, including its theory, methodology and uses. In addition, open research issues requiring significant research are explored, with an emphasis on information integrity, availability of information, retention of originality, secrecy of information, protection of information, transformation of documents, implementation of cryptography, and language versatility.

Year-wise paper published on document watermarking.
The article is structured as follows. Section 2 presents the literature review and Section 3 includes a discussion of the requirements of watermarking techniques. Section 4 outlines document watermarking approaches which are currently available. Section 5 presents the logical and physical embedding processes employed in digital watermarking. Applications of document image watermarking are covered in Section 6. Section 7 describes the various types of parameters. Section 8 addresses the research challenges and issues. In the last section, the findings of this study are summarised.
Watermarking techniques for artificial intelligence are reviewed by [14]. Artificial Intelligence (AI), Machine Learning (ML) and Deep Learning (DL) technologies are compared in the context of watermarking. A summary and comparison of the contributions of the surveyed schemes are provided from several technical perspectives. The authors outline current research issues and future directions that could close knowledge gaps for scientists and technologists working in this field.
[77] proposed a blockchain-oriented watermarking technique for the authentication and integrity of construction information. The 5W1H (Who?, What?, When?, Where?, Why?, How?) watermarks are hidden within digital content to create a blockchain-based deployment architecture that secures information integrity and authentication for construction tasks.
[36] proposed a watermarking technique for digital images in cybersecurity using a compass edge detector and Least Significant Bit (LSB) embedding. The approach relies on the least significant bit mechanism, compass edge detection and chaotic encryption for watermark embedding. Its efficacy and performance are systematically assessed against a range of geometrical and image-processing attacks.
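The LSB mechanism on its own can be illustrated with a minimal sketch. It assumes a cover given as a flat list of 8-bit grey levels and writes one watermark bit per pixel; the function names `embed_lsb` and `extract_lsb` are hypothetical, and the compass edge detection and chaotic encryption used in [36] are not modelled here.

```python
def embed_lsb(pixels, bits):
    """Write one watermark bit into the least significant bit of each pixel."""
    if len(bits) > len(pixels):
        raise ValueError("watermark is longer than the cover")
    marked = list(pixels)
    for i, bit in enumerate(bits):
        # Clear the lowest bit of the pixel, then set it to the watermark bit.
        marked[i] = (marked[i] & ~1) | bit
    return marked


def extract_lsb(pixels, n_bits):
    """Read the first n_bits least significant bits back out."""
    return [p & 1 for p in pixels[:n_bits]]


# Toy cover (assumed values) and watermark.
cover = [120, 121, 200, 37, 64, 255, 0, 18]
watermark = [1, 0, 1, 1, 0, 0, 1, 0]
marked = embed_lsb(cover, watermark)
recovered = extract_lsb(marked, len(watermark))
```

Each pixel changes by at most one grey level, which is why LSB embedding is usually imperceptible but also easy to destroy.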
Copyright protection of medical images has proven essential, since even small modifications might put a patient’s life at risk. Digital watermarking technology is widely used in healthcare applications to confirm the authenticity of these images. The basic concepts of watermarking, including watermarking in the biometric, blockchain, machine learning, spatial and transform domains, watermarking using optimization, and watermarking using encryption and compression, are reviewed by [68]. A robust, high-capacity data hiding technique using the Integer Wavelet Transform (IWT) and Least Significant Bit (LSB) embedding to secure medical data is proposed by [103]. Multiple watermarks are embedded into a medical colour image: the cover image is first split into three channels, IWT is applied to each channel, and the LSB method is then used to hide several marks inside the cover image. The image mark is compressed with a lossless approach before embedding, which decreases storage and transmission overhead and increases the embedding capacity of the marked colour image. A watermarking technique for encrypted images using a Generative Adversarial Network (GAN) provides authentication and copyright protection and helps to avoid data leakage in healthcare scenarios [102]. First, the image is encrypted with the help of a chaotic map and randomised singular value decomposition (RSVD). A GAN model is then developed to create a watermark by hiding several marks within an image. Finally, the created watermark is embedded into the encrypted image.
Reversible data hiding (RDH) technique for JPEG document images using zero coefficients embedding is proposed by [119]. The integrity and authenticity of JPEG document images can be ensured by reversible data hiding. JPEG document images provide a large number of zero coefficients to contain data because of their obvious borders and simple texturing. The best options for maintaining the visual quality are those with zero coefficients, as determined by a mean square error analysis. Then, to greatly reduce the visual distortion, the optimal zero coefficients are chosen by minimizing a carefully created embedding cost.
Requirements of digital watermarking techniques
Following are the characteristics of a watermarking technique, as shown in Fig. 3.
Watermarking techniques fall into three categories according to the robustness property: robust, fragile and semi-fragile [97,127]. A robust technique is capable of resisting attacks such as compression, rotation and scaling. A fragile watermarking technique, on the other hand, cannot withstand any attack, while a semi-fragile method guarantees that the watermark can withstand a few attacks. Robustness can thus be used to compare text watermarking strategies by measuring how many attacks each can withstand. There are no exact indicators for measuring robustness. In document watermarking, however, robustness is tested using the attacks that seek to break the watermarked document image. Formatting and content modification are the two types of attack that typically threaten watermarked text information [8,10,108,127]. A technique that resists such attacks is considered robust; techniques that resist only half or none of them are called semi-robust or non-robust, respectively. To determine the security of a watermarking technique, it must be assumed that the intruder is aware of the watermark embedding method. The intruder can attempt to examine the watermarked document image and to remove the watermark bits. The intruder should not be able to change the watermark bits without modifying the contents of the watermarked image. Identifying the best protection procedure and strategy is important in order to increase the efficacy of the watermarked document.
The above requirements conflict with one another. For instance, increasing the embedding capacity reduces the visual quality of the watermarked image and decreases the robustness against attacks. The most critical properties for any watermarking technique are robustness and imperceptibility. The conflict between these properties poses many difficulties in the design of a robust watermarking technique.

Characteristics of a watermarking technique.
Work related to watermarking of document images began in 1997. This initiative inspired other scholars to work in the domain of document image watermarking. Several watermarking approaches for the authentication and protection of digital document images have been proposed [82] as shown in Fig. 4. These approaches to watermarking of document images are categorized as:
Structural based watermarking: In this approach, in order to embed the watermark bits, the lines, letters and spaces of the text content are modified [22].
Linguistic based watermarking: In this approach, the language of text present in document image is analyzed and modified to embed watermark bits [17]. Since then a number of modifications have been suggested by several researchers [3,51].
Image based watermarking: In this approach, characteristics of the contents of the document image are used to create the watermark. Using some embedding approach, this watermark is then inserted into the cover image.

Document watermarking techniques.
In the structural based approach, the structure or features of the text are adjusted to hide the watermark bits. This includes the general formatting of the digital cover text, within which text content is changed to embed the watermark bits using its words or sentences. The writing style or the positions of words and letters can also be changed for this purpose, for example by repeating certain letters or changing the features of the text. General text properties are detected, and some physical properties are used to hide the watermark bits in the text content.
Watermark bits are often inserted by shifting up or down the words and sentences of the text of a document image in the techniques proposed by [22,24]. To improve the Brassil techniques, a variety of somewhat different approaches were proposed. Examination of the average word width within every line is also one of the structural watermarking techniques. Other approaches have also been developed on the basis of word classifications [66]. In these methods, a word is classified according to its characteristics. Another method takes advantage of the justified paragraphs and the uneven spacing found in the document to embed the watermark bits [7].
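A spacing-based structural scheme can be sketched in plain text as follows. This is an illustrative analogue only, not the method of [7] or [22,24]: it assumes that one space between words encodes a 0 and two spaces encode a 1, and the function names are hypothetical.

```python
import re


def embed_spacing(text, bits):
    """Encode each bit in the gap before a word: one space = 0, two spaces = 1."""
    words = text.split()
    if len(bits) > len(words) - 1:
        raise ValueError("not enough word gaps for the watermark")
    out = [words[0]]
    for i, word in enumerate(words[1:]):
        bit = bits[i] if i < len(bits) else 0  # unused gaps stay single-spaced
        out.append((" " if bit == 0 else "  ") + word)
    return "".join(out)


def extract_spacing(marked, n_bits):
    """Recover the bits from the width of the first n_bits gaps."""
    gaps = re.findall(r" +", marked)
    return [0 if len(gap) == 1 else 1 for gap in gaps[:n_bits]]


original = "the quick brown fox jumps"
marked = embed_spacing(original, [1, 0, 1])
recovered = extract_spacing(marked, 3)
# Normalising the whitespace (as retyping would) removes the watermark entirely.
retyped = " ".join(marked.split())
```

The last line shows why such formatting-based marks do not survive retyping: the visible words are unchanged, but the gaps that carried the bits are gone.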
One method inserts the watermark bits using a keyword present in the text. The keyword is selected by the owner, and the watermark bits are created from the lengths of the words preceding and following the keyword in the text [56].
The first watermarking method based on Chinese text was proposed by [73]. The authors used the characteristics of the pictographic characters in Chinese language. These features are used in their technique to create the redundancy that is further used to embed the watermark. Chinese character properties are also used by other Chinese text watermarking techniques, as they are found to be ideal objects for embedding watermark bits. A watermarking technique was developed by following the pronunciation of polyphonic characters and features of two Chinese character polyphones [37]. The other method used was Chinese sentence entropy, measured on the basis of word frequency. The key sentences for embedding the secret bits are chosen based on entropy [37,121].
Most existing techniques use features of Arabic script characters to embed watermark bits in Arabic text. The characters are extended or replaced with similar Unicode characters, or the diacritics are modified, to embed the watermark bits [10,43].
The structural based techniques described above are not immune to format based attacks such as copying, pasting and retyping, although some of them are immune to printing and font changes. Their resistance to content based attacks depends on the robustness of the individual technique.
Linguistic based approach
The linguistic based approach relies on natural language processing methods. To add the watermark bits, it uses the syntactic and semantic nature of the text in the digital cover image [27,108]. The watermark bits are inserted in such a way that the structure and context of the text remain unchanged. Some watermarking methods are based on semantic transformation, syntactic transformation, or a combination of both, depending on the language of the text [72]. Synonym substitution is used to preserve imperceptibility, and substituting one word per sentence increases the embedding capacity [42].
In the syntactic approach, the words of the content are manipulated to insert the watermark bits. Verbs, nouns, adjectives, pronouns, prepositions, synonyms and other grammatical features of the text are used for embedding. These grammatical changes are made without altering the meaning of the content. The word sequence can also be rearranged to add the watermark bits, for example by changing an expression from active to passive voice, adding a subject or otherwise altering a sentence. Syntactic trees and transformations were among the earliest approaches [15–17]. Later modifications introduced several other syntactic based approaches [64,65,83,84,111].
In the semantic based method, the watermark bits are added by modifying the words of the text content. It consists of approaches such as synonym replacement, algorithms based on noun-verbs, typo-based algorithms, acronyms and abbreviations, presupposition-based algorithms, and text-based representational sequences. This method relies on the language and uses vocabulary, grammar or structure to embed the watermark bits [27,78,86].
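Synonym replacement, the simplest of the semantic approaches listed above, can be sketched as follows. The synonym pairs, the convention that the first word of a pair encodes 0 and the second encodes 1, and the function names are all assumptions made for illustration; a real system would need a curated, context-aware synonym resource.

```python
# Hypothetical synonym pairs: (word encoding bit 0, word encoding bit 1).
PAIRS = [("big", "large"), ("quick", "fast"), ("begin", "start")]

ENCODE = {}  # any word of a pair -> the pair itself
DECODE = {}  # word -> bit it represents
for zero_word, one_word in PAIRS:
    ENCODE[zero_word] = ENCODE[one_word] = (zero_word, one_word)
    DECODE[zero_word], DECODE[one_word] = 0, 1


def embed_synonyms(text, bits):
    """Replace each substitutable word by the synonym that encodes the next bit."""
    remaining = list(bits)
    out = []
    for word in text.split():
        if word in ENCODE and remaining:
            word = ENCODE[word][remaining.pop(0)]
        out.append(word)
    if remaining:
        raise ValueError("text has too few substitutable words")
    return " ".join(out)


def extract_synonyms(text, n_bits):
    """Read the bits back from the synonyms that appear in the text."""
    bits = [DECODE[w] for w in text.split() if w in DECODE]
    return bits[:n_bits]


marked = embed_synonyms("we begin with a big and quick test", [1, 0, 1])
recovered = extract_synonyms(marked, 3)
```

The capacity is limited to one bit per substitutable word, which reflects the limited redundancy of text noted earlier.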
Image based approach
In this approach, text contents are viewed as a series of text images and bits of watermark image are hidden in content of digital original text. The watermarked text is considered to be an image and it is no longer possible to copy and paste the text and it must be retyped for reproduction. To create a watermark, the watermark is translated into a text string and combined with the characteristics of cover text. Using certain embedding algorithms, watermark bits are embedded into the content of cover text. A watermark logo is generally created to identify the validity of the original image and to check the possibility of the text being altered.
In recent years, image based approaches to text watermarking have been suggested. One approach embeds the watermark bits using the pixels of letters [118]. Curved letters are used for watermarking by changing the curves of the letters according to the specific bits. The curve of a letter in the watermarked text image thus varies slightly in direct relation to the embedded bit.
The image based approach to document watermarking is deemed safe against format based attacks, which are common for digital documents.
Embedding process
Embedding in a document image is the process by which the watermark bits are created and inserted in the original document image to create a watermarked image. There are two types of embedding, namely zero embedding and physical embedding, that can be used during the embedding process. Both methods have their advantages and drawbacks as discussed in the following subsections.
Zero watermarking techniques
Zero watermarking is a fully invisible, imperceptible approach to watermarking. It is equivalent to data hiding in which the concealed data is completely transparent to the viewer, because no embedding is performed on the cover media. The watermark bits are associated with the text content logically, without physically embedding the watermark inside the digital carrier. This is achieved by producing the watermark data from the features of the carrier image. The generated watermark is kept confidential, while the original document is released to the public. To validate the authenticity of a target document, the watermark information is generated again from the same features of the cover image and compared with the original watermark bits. Authenticity or protection is proved if the two match [53,99,105,112,127]. The following requirements should be fulfilled by a zero watermarking technique:
The features used to create a zero watermark should fully reflect the cover image.
These features should be collision resistant, i.e. the probability that the same zero watermark bits are extracted from different cover images should be very low.
In this technique, the structural or attribute properties of the text contents of the image are analyzed and then used to create a watermark. This watermark is kept secure with a Certifying Authority (CA) and can be used for future authentication. The method appears to be both very robust and imperceptible, as no watermark information is present in the cover document image released to the public. Nevertheless, some disadvantages need to be stressed. Even though the method is safe against format based and content based attacks, a possible attacker can easily change or modify all of the contents so that the information about the watermark can no longer be found. The text information can be misused if the watermark is hard to recognize. Logical or zero embedding methods relevant to document images are explored in this section. The study of these approaches in relation to attacks and results is summarized in Table 1.
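The register-then-verify cycle of zero watermarking can be sketched as below. This is a minimal illustration under stated assumptions, not any surveyed technique: the feature (the parity of each word length) is a deliberately simple stand-in, the key is formed by XOR-ing the features with the owner's watermark, and the names `text_features`, `register` and `verify` are hypothetical.

```python
def text_features(text, n_bits):
    """Assumed toy feature: parity of the length of each of the first n_bits words."""
    words = text.split()
    if len(words) < n_bits:
        raise ValueError("text too short for the requested watermark length")
    return [len(w) % 2 for w in words[:n_bits]]


def register(text, watermark):
    """Build the key stored with the CA; the cover text itself is not modified."""
    feats = text_features(text, len(watermark))
    return [f ^ w for f, w in zip(feats, watermark)]


def verify(text, key, watermark):
    """Regenerate the features, recover the watermark and return the match rate."""
    feats = text_features(text, len(key))
    recovered = [f ^ k for f, k in zip(feats, key)]
    return sum(r == w for r, w in zip(recovered, watermark)) / len(watermark)


original = "document images have very limited redundancy for watermarking"
watermark = [1, 0, 1, 1, 0, 0, 1, 0]
key = register(original, watermark)

tampered = original.replace("images", "image")
rate_original = verify(original, key, watermark)
rate_tampered = verify(tampered, key, watermark)
```

A match rate of 1.0 authenticates the document, while the single-word edit above lowers it. A parity feature is far from collision resistant, so a practical scheme would need much stronger features.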
POS tag sequence based zero watermarking
The patterns of a sentence are represented by its Part of Speech (POS) tag sequence. These patterns are characterized by the number of POS tags and their permutation, and the tag sequence can be used in zero watermarking. Many different sentences can be built to express the same meaning, so the number of possible word patterns and POS tag sequences is practically unlimited. Because the contents and expressions of different texts vary, their POS tag sequences are expected to differ. The final zero watermark is produced by applying a transformation to the characteristics of the POS sequence.
A zero watermarking technique using POS tag sub-sequences is proposed by [45]. Tag sequences are extracted using a chaotic method to create a watermark. This approach offers good protection because the sequence of selected POS tags is not known to the attacker and the chaotic process is unpredictable. Robustness is also reported against reformatting and document type conversion, interchangeable substitution and sentence transformation attacks, which have a very low rate of success.
[90] proposed a zero watermarking approach to authenticate Chinese documents. The frequency of POS tags and their entropy are used as features. Watermark bits are generated using a forward cloud model generator and the features derived from the POS tags. This approach is very vulnerable to insertion and deletion attacks. A high tamper detection rate can be achieved if insertions and deletions affect less than 20% of the text. Synonym replacement attacks have little effect on the watermark’s degree of resemblance.
Prepositions based zero watermarking
A document image consists of sentences, and each sentence contains words or phrases that link a noun or pronoun to another part of the sentence, such as a verb or an adjective. Such a word or phrase is referred to as a preposition. Prepositions are used in several ways in the watermarking process; zero watermarking uses the occurrence, or maximum occurrence, of prepositions.
The most frequently occurring preposition in the text is used by [49] as a text separator, which is then used to create text partitions. Next, the most frequently occurring non-vowel ASCII character in each partition is listed. The author key is generated from this list and the watermark bits selected by the user. The generated watermark is stored with the CA to keep it secure. The precision of the extracted watermark bits is evaluated against insertion and deletion attacks at rates of 5%, 10%, 20% and 50%. Precision is lowest when the insertion and deletion ratio is highest. It is hard to remove the watermark bits completely without deleting the text contents.
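The partitioning step described for [49] can be sketched as follows. The preposition list, the alphabetical tie-breaking rule and all function names are assumptions for illustration, and the final step that combines the character list with the user's watermark bits is not specified here, so it is left out.

```python
import re
from collections import Counter

PREPOSITIONS = {"in", "on", "of", "to", "by", "for", "with", "at", "from"}  # assumed list
VOWELS = set("aeiou")


def most_frequent(counts):
    """Most frequent item; ties are broken alphabetically (an assumption)."""
    return min(counts, key=lambda item: (-counts[item], item))


def partition(words, separator):
    """Split the word list at every occurrence of the separator word."""
    parts, current = [], []
    for word in words:
        if word == separator:
            if current:
                parts.append(current)
            current = []
        else:
            current.append(word)
    if current:
        parts.append(current)
    return parts


def feature_list(text):
    """Return the separator and the top non-vowel letter of each partition."""
    words = re.findall(r"[a-z]+", text.lower())
    prep_counts = Counter(w for w in words if w in PREPOSITIONS)
    if not prep_counts:
        raise ValueError("text contains no known preposition")
    separator = most_frequent(prep_counts)
    letters = []
    for part in partition(words, separator):
        counts = Counter(c for w in part for c in w if c not in VOWELS)
        letters.append(most_frequent(counts) if counts else "?")
    return separator, letters


text = "the owner of the image sent a copy of the report to the editor of the journal"
separator, letters = feature_list(text)
```

Because the list depends on letter frequencies inside each partition, inserting or deleting words changes it, which is what makes tampering detectable.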
A preposition and double letters were used by [55] to build the watermark. The data partitions in the document image are evaluated using the frequency of repeated letters, and the watermark is formed from the number of letters in each interval. The original and extracted watermark bits are compared to judge their degree of closeness. This strategy ensures protection and is more resistant to attacks such as insertion, deletion and reordering.
A blind watermark approach based on the appearance of prepositions and double letters in the text document was suggested by [50]. This approach offers protection of copyright and protects the cover image from addition, deletion and reordering attacks. This process is also checked for ASCII characters and reports a reasonable percentage of watermark derived with results above 90%.
Keywords based zero watermarking
There are various keywords in a document image. These keywords are used in watermarking process by considering their length and frequency of occurrence. Word length and multiple occurrences of the words are used for watermarking.
A zero watermarking technique based on words is proposed by [57]. To create a watermark key for every sentence, all words with more than four letters are selected and used. Watermark distortion rate and pattern matching are evaluated for insertion and deletion attacks. Experimental results are reported for low, moderate and high levels of tampering.
Characteristics of the text are used by [56] to propose another technique for creating the watermark. The keyword that appears several times is selected, and the length of the word that precedes and follows the keyword is used to build the watermark. Watermark distortion rate is evaluated against the insertion and deletion attacks. The tamper detection accuracy of the technique is measured and found to be better than the existing techniques.
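The keyword idea can be sketched as follows: every occurrence of the chosen keyword contributes the lengths of its neighbouring words to the watermark pattern. The function names and the simple position-wise distortion measure are assumptions for illustration and are not taken from [56].

```python
import re


def keyword_watermark(text, keyword):
    """List (previous word length, next word length) for each keyword occurrence."""
    words = re.findall(r"[a-z]+", text.lower())
    pattern = []
    for i, word in enumerate(words):
        if word == keyword:
            prev_len = len(words[i - 1]) if i > 0 else 0
            next_len = len(words[i + 1]) if i + 1 < len(words) else 0
            pattern.append((prev_len, next_len))
    return pattern


def distortion_rate(registered, observed):
    """Fraction of pattern entries that no longer match, compared position by position."""
    n = max(len(registered), len(observed))
    if n == 0:
        return 0.0
    same = sum(a == b for a, b in zip(registered, observed))
    return 1 - same / n


original = ("the scanned image shows text and each image contains words "
            "while every image needs protection")
registered = keyword_watermark(original, "image")

# Insertion attack: one word added before the second keyword occurrence.
attacked = original.replace("each image", "each binary image")
observed = keyword_watermark(attacked, "image")
rate = distortion_rate(registered, observed)
```

Only the entry next to the inserted word changes, so the distortion rate also gives a rough indication of how much of the text was altered.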
Another secure and robust technique is proposed by [53] in which the bits of watermark image are logically embedded in the text content of the digital document cover image. Watermark key is created by using text elements, double letters and most commonly used English words. The technique is being tested for insertion and deletion attacks and found to be efficient.
Image to text based zero watermarking
Image to text based zero watermarking techniques transform an image watermark to a text watermark. The first letter that most commonly appears is generally used to create a watermark, in these techniques.
An image-to-text watermarking technique in which the watermark image is first converted into an alphabetical watermark is proposed by [55]. Using the prepositions in the text as separators, a key is created from the first letter that appears most frequently (MOFL). The separators are used to form groups, and the first double letter occurrence in every group is used to construct a MOFL list. The watermark letters and the list are used to generate the watermark key. The watermark has been shown to survive insertion, reordering and deletion attacks. Compared with existing techniques, this technique provides a higher percentage of successfully detected watermarks after attacks.
A technique using an image-to-text converter to generate a watermark key was suggested by [105]. The watermark image is embedded in the duplicated original file where it is processed and categorized into word sets and based on its features, the watermark key is created. Technique provides protection against forgery and unauthorized exploitation of information in digital data. By means of a blind and fragile watermark extraction procedure, watermark key is secured.
Entropy based zero watermarking
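Entropy of a sentence is the average amount of information it carries. In its standard Shannon form it is defined as $H = -\sum_{i=1}^{n} p_i \log_2 p_i$, where $p_i$ is the probability of the $i$-th distinct word and $n$ is the number of distinct words in the sentence.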
Word entropy in Chinese text is used by [121] to produce a watermark. The entropy is measured using the word frequency and is then used to select the key sentences. The watermark is built in accordance with the order of these key sentences. Due to the complexities of Chinese text semantics, a sentence with a larger word frequency has a high entropy. The technique is tested against attacks such as addition, deletion and synonym substitution. Since it offers good imperceptibility and security, it is used for protection purposes.
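Entropy-based sentence selection can be sketched as follows. The sketch computes the Shannon entropy $H = -\sum_i p_i \log_2 p_i$ of each sentence's own word distribution and keeps the indices of the highest-entropy sentences. The exact entropy definition used in [121] is not reproduced here, so this choice, the whitespace-style tokenisation (Chinese text would first need word segmentation) and the function names are assumptions.

```python
import math
import re
from collections import Counter


def sentence_entropy(sentence):
    """Shannon entropy (in bits) of the word distribution inside one sentence."""
    words = re.findall(r"\w+", sentence.lower())
    if not words:
        return 0.0
    total = len(words)
    counts = Counter(words)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def key_sentences(text, k):
    """Indices of the k highest-entropy sentences, returned in document order."""
    sentences = [s.strip() for s in re.split(r"[.!?。]", text) if s.strip()]
    ranked = sorted(range(len(sentences)),
                    key=lambda i: sentence_entropy(sentences[i]),
                    reverse=True)
    return sorted(ranked[:k])


selected = key_sentences("a a a a. a b c d. a a b b.", 2)
```

In the toy text, the sentence with four distinct words has the highest entropy and the fully repetitive one has zero entropy, so the second and third sentences are selected.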
A technique that uses the features of Chinese content sentences is proposed by [74]. The text of the document is divided into sentences, and the semantic code of each word is then used to measure its entropy. Word entropy, relevance, weighting function and length are used to measure the weight of each sentence. The created key is encrypted and stored with the CA. This technique offers security, robustness and protection.
Model based zero watermarking
A document image consists of sentences and each sentence includes different words. Probabilistic features as well as syntactic and semantic features of textual contents of an image can be used in zero watermarking.
Space models based zero watermarking was developed by [122]. The zero watermark is generated from a 3-D model on the basis of 2-D coordinates of word level and sentence weights of the sentence-level. 2-D word space layout consists of both the length and frequency of the word and is expanded to a 3-D format. By mapping 3-D model with the sentence, text watermark is constructed. This approach is evaluated against three most frequent attacks; attack of syntactic transformation, synonymous replacement attack and deleting attack. This technique is used to protect the digital document images.
Markov’s model is utilized by [6] to propose a watermarking technique in which watermark is generated by using the probabilistic features for document image content authentication. Text information is analyzed with the help of hidden Markov model. This strategy provides protection against insertion and deletion attacks.
[120] proposed a three-dimensional (3-D) space model using syntactic and semantic features of the digital text image. A watermark is generated on the basis of an abstract set and can later be derived by comparing the distance of each sentence of the document image. Syntactic transformation, synonymous substitution, addition, deletion and reordering attacks were performed in the experiment. The model was vulnerable to deletion and reordering attacks, whereas syntactic transformation and synonymous replacement attacks were well resisted.
A third-order Markov model for English text images is analyzed by [18]. The analysis is used to find the relationships that exist between the contents of the text image in order to generate the watermark bits. The probabilities of the interrelationships between the contents are extracted from the English text document. This technique is used for authentication and provides better robustness against attacks such as insertion, deletion and reordering than existing techniques.
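A word-level Markov analysis of this kind can be sketched as follows. The transition counts between word states serve as the zero watermark, and tampering is measured by how many registered transitions survive in the received text. The default order of 1 (chosen for brevity), the matching measure and the function names are assumptions and do not reproduce the procedure of [18].

```python
import re
from collections import Counter, defaultdict


def markov_watermark(text, order=1):
    """Map each state (a tuple of `order` words) to counts of the words that follow it."""
    words = re.findall(r"[a-z']+", text.lower())
    transitions = defaultdict(Counter)
    for i in range(len(words) - order):
        state = tuple(words[i:i + order])
        transitions[state][words[i + order]] += 1
    return {state: dict(following) for state, following in transitions.items()}


def match_rate(registered, observed):
    """Fraction of registered transitions that are still present in the observed text."""
    total = sum(sum(following.values()) for following in registered.values())
    if total == 0:
        return 1.0
    kept = sum(min(count, observed.get(state, {}).get(word, 0))
               for state, following in registered.items()
               for word, count in following.items())
    return kept / total


registered = markov_watermark("the cat sat on the mat")
observed = markov_watermark("the cat sat on the rug")
rate = match_rate(registered, observed)
```

Passing `order=3` would give a third-order model. A higher order makes the pattern more specific to the text but needs longer documents to produce useful counts.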
A zero watermarking technique based on a Markov model for English text documents is proposed by [40]. The technique is capable of providing content authentication and tamper detection for English text documents. It provides more robustness and better performance than other existing techniques, especially for insertion and deletion attacks.
A Markov model is used by [5] to propose a zero watermarking technique for Arabic text. This technique is based on the fourth level of the word mechanism and provides tamper detection and content authentication. Integration of zero watermarking with the hidden Markov model enhances the embedding capacity as well as robustness.
Chinese machine code and information strategy based zero watermarking
Chinese machine code and the information gain approach are used in a text zero watermark scheme in which the Chinese edit distance drives both zero watermark generation and detection. In each paragraph, the edit distance between the characteristic words is computed and used to pick the commutation location, and the resulting edit distance matrix is used to produce a zero watermark.
A watermarking technique for text documents based on Chinese machine code and information tactics is proposed by [123]. The weights of the terms in each paragraph are computed, and Chinese machine code formally expresses the text features. The edit distance is measured and the point of commutation is located to create a watermark. This technique has good robustness and protection against attacks such as addition, deletion, synonymous substitution and syntactic transformation. It provides security with the help of the copyright holder ID and a date stamp for every generated key.
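The edit distance matrix at the core of this scheme can be sketched as follows, using the standard Levenshtein distance on character strings. How the matrix is turned into watermark bits and how the commutation point is chosen are not specified here, so only the matrix step is shown; the function names and the example words are assumptions.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # substitute or match
        prev = cur
    return prev[-1]


def distance_matrix(words):
    """Pairwise edit distances between the characteristic words of a paragraph."""
    return [[edit_distance(a, b) for b in words] for a in words]


# Hypothetical characteristic words; Python strings compare by Unicode character.
words = ["水印技术", "水印算法", "文本"]
matrix = distance_matrix(words)
```

The matrix is symmetric with a zero diagonal, so only its upper triangle carries information for watermark generation.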
Transform based zero watermarking
Transforms are used in zero watermarking. A transform changes the image from the spatial domain to a transform domain. After transformation, the image is divided into blocks of different coefficient values. These blocks may be further used for watermarking.
The Discrete Cosine Transform (DCT) transforms a digital image into a set of numbers called coefficients. Because it operates on real numbers only, it avoids the cost of complex-number arithmetic. The DCT is derived from the minimum mean square error criterion for image coding and is a sub-optimal orthogonal transform relative to the K-L transform.
[38] proposed a DCT based zero watermarking technique for Chinese document images. The cover document image is partitioned into 64 blocks and the DCT is applied to every block. Low-frequency to medium-frequency DCT coefficients are chosen to form a one-dimensional sequence. Keys are used to encrypt the sequence, using the logistic map, to create the zero watermark sequence. The detected watermark and the registered watermark are compared to compute a similarity measure for verification.
[99] proposed a wavelet transform based zero watermarking technique applicable to document images. The lifting wavelet transform is applied to the cover image to obtain its subbands. Non-overlapping blocks of the same size are generated from these subbands. A zero watermark is created from the features of each block. A key watermark is combined with this watermark to generate a meaningful watermark. The technique can detect tampering in any region of the document image. Its performance is reported to be superior to that of existing techniques.
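The general flow of a block-DCT zero watermark can be sketched in Python as follows. The sketch assumes an 8 × 8 grid of blocks (64 blocks), four hand-picked low-frequency coefficient positions per block, sign-based binarisation, a logistic map with parameter 3.99 thresholded at 0.5, and XOR as the encryption step. All of these choices and all function names are illustrative assumptions rather than the parameters of [38].

```python
import numpy as np


def dct_matrix(n):
    """Orthonormal DCT-II basis matrix of size n x n (rows are frequencies)."""
    k = np.arange(n).reshape(-1, 1)
    i = np.arange(n).reshape(1, -1)
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)
    return c


def logistic_bits(x0, count, mu=3.99):
    """Key-dependent bit stream from the logistic map x <- mu * x * (1 - x)."""
    bits = np.empty(count, dtype=np.uint8)
    x = x0
    for n in range(count):
        x = mu * x * (1.0 - x)
        bits[n] = 1 if x > 0.5 else 0
    return bits


def zero_watermark(image, key, grid=8, picks=((0, 1), (1, 0), (1, 1), (0, 2))):
    """Binarise selected block-DCT coefficients and XOR them with key bits."""
    h, w = image.shape
    bh, bw = h // grid, w // grid
    ch, cw = dct_matrix(bh), dct_matrix(bw)
    feature = []
    for r in range(grid):
        for c in range(grid):
            block = image[r * bh:(r + 1) * bh, c * bw:(c + 1) * bw].astype(float)
            coeffs = ch @ block @ cw.T  # 2-D DCT of the block
            feature.extend(1 if coeffs[u, v] >= 0 else 0 for u, v in picks)
    feature = np.array(feature, dtype=np.uint8)
    return feature ^ logistic_bits(key, feature.size)


def similarity(w1, w2):
    """Fraction of matching bits between detected and registered watermarks."""
    return float(np.mean(w1 == w2))


rng = np.random.default_rng(0)
cover = (rng.random((64, 64)) > 0.5).astype(np.uint8)
registered = zero_watermark(cover, key=0.37)
print(similarity(registered, zero_watermark(cover, key=0.37)))  # 1.0
```

The cover image itself is never changed; only the derived sequence is registered, and verification regenerates it from the received image with the same key.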
Digital signature based zero watermarking
A digital signature is a mathematical scheme used to confirm whether digital images or documents are authentic. The digital signature technique is applied using symmetric key or asymmetric key methods. It guarantees the content authenticity, integrity, and privacy of the transmitted data. The two most widely used public key digital signature methods are the Rivest–Shamir–Adleman (RSA) encryption algorithm and the Digital Signature Algorithm (DSA).
[105] proposed a hybrid technique based on zero watermarking and digital signatures to secure digital media. Logical embedding is performed by embedding the watermark bits in the cover document, and the watermark image is transformed into a character sequence. Unicode is used to convert words into binary values. Compared to other existing watermarking techniques, this technique performs better in terms of tamper detection, robustness, capacity ratio, imperceptibility, document authentication and language independence. Its drawback is that it needs a large amount of memory to store the created keys with the certification authority.
Chinese phonetic alphabet and chaotic equation based zero watermarking
Chinese phonetic alphabets are divided into three categories: initials, finals and syllables.
A watermarking technique connecting the syllable parts of the Chinese phonetic alphabet is proposed by [130]. Every syllable is assigned initial and final values, and a frequency is calculated from the total of these values. A sequence is constructed from the totals using the logistic chaotic equation and is then transformed. The output of this transformation is used to create the text watermarking key. The one-dimensional logistic chaotic equation is used because its chaotic features meet the requirements of a key sequence, and the robustness is evaluated against content attacks such as format changes, deletion and addition. The results demonstrate the algorithm’s resistance to addition attacks, as well as good robustness and high resistance against tampering attacks. To prevent illegal access, the key is registered with a certified authority along with the date and licence.
Word occurrence and selection ratio based zero watermarking
A document image consists of sentences, and each sentence contains different words. Word presence plays a vital role in determining the ratio of word occurrence and the size of a replacement attack.
A zero watermarking approach based on a replacement attack for document images was developed by [19]. The replacement attack changes the contents of a document without altering the structure of the text content. The Word Occurrence Ratio (WOR) is used to determine the rate of presence of a word within a document, and the Word Selection Ratio (WSR) is used to determine the size of the replacement attack. The technique depends on three key stages: replacement resource selection, new word preparation and replacement action. The results show that word-based methods are not capable of detecting replacement attacks on a document and are less accurate in calculating the size of advanced replacement attacks than of normal attacks. This technique is used for authentication purposes.
Effective characters list based zero watermarking
[96] proposed an Effective Characters List (ECL) based zero watermarking technique for document images. The watermark is constructed using the concept of conversion between selected characters of a document. The number of characters to be selected for watermark generation is then calculated using the Effectiveness Ratio (ER). Deletion, insertion and reordering attacks are applied to the original digital documents to verify the effectiveness of the technique.
Types of attacks and performance evaluation of zero watermarking
Table 1 describes the attacks and metrics used by the authors to measure the efficiency of the zero watermarking approach.
In physical embedding, the watermark bits are embedded into the original digital image. The watermark bits take the form of linguistic, image or structural manipulations. Embedded watermark bits can be invisible, slightly visible or visible. In visible watermarking, the embedded watermark bits are visible to the receiver [87], which discourages copying and reuse of the image and helps to promote the owner of the work. This type of embedding is usually done with an image watermark; visible text watermarks are less common because text content can easily be manipulated and retyped. Visible data can, however, also be deleted or altered through unauthorized access.
Depending on the process used, the modifications made to the text vary from barely visible to completely invisible. In invisible as well as partially visible physical embedding, only authorized users are aware of the watermark. Thus, in the case of copying and tampering, the watermark bits are not lost or removed, because the unauthorized user does not know that there are watermark bits in the text. The invisible watermark can impede tampering with the text and makes it easy to detect any kind of manipulation. Related work on the physical embedding of document image watermarks, together with a study of attacks and results, is discussed in Table 3.
Feature based physical embedding
A document image contains different elements such as lines, words and characters. Physical embedding can be achieved using the features of these elements. Each element has different features that can be used to hide secret bits of the watermark. The watermark bits may be embedded into the digital document image by shifting or altering these elements.
Different shifts of the lines and words of the cover image are used by [23] to hide watermark bits. In the line shift approach, the watermark is embedded by displacing an entire text line vertically. The line is shifted up or down, whereas the lines immediately above and below it are not moved. In the word shifting approach, the position of a word within a text line is moved horizontally to embed the watermark. The word may be moved right or left, while neighboring words remain unchanged. In character shifting, features of a character are altered to embed the watermark bits. These alterations are made by extending or shortening the length of the characters by one or more pixels.
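The line shift approach can be illustrated with a small Python sketch on a synthetic binary page. The page model (solid bands of ink standing in for text lines), the one-pixel shift, the convention that an upward shift encodes bit 1, and all function names are assumptions chosen for illustration; they are not the parameters used in [23].

```python
import numpy as np


def make_page(n_lines=5, line_height=6, gap=6, width=40):
    """Synthetic binary page: each text line is a solid band of ink (1s)."""
    page = np.zeros((gap + n_lines * (line_height + gap), width), dtype=np.uint8)
    tops = [gap + k * (line_height + gap) for k in range(n_lines)]
    for top in tops:
        page[top:top + line_height, :] = 1
    return page, tops


def shift_line(page, top, line_height, delta):
    """Move one line band vertically by `delta` rows (negative = up)."""
    out = page.copy()
    band = page[top:top + line_height, :].copy()
    out[top:top + line_height, :] = 0
    out[top + delta:top + delta + line_height, :] = band
    return out


def embed(page, tops, line_height, bits):
    """Shift every second line: up for bit 1, down for bit 0; neighbours stay fixed."""
    out = page
    for bit, top in zip(bits, tops[1::2]):
        out = shift_line(out, top, line_height, -1 if bit else 1)
    return out


def line_centres(page):
    """Middle row of each run of inked rows in the horizontal projection."""
    inked = page.sum(axis=1) > 0
    centres, start = [], None
    for r, on in enumerate(np.append(inked, False)):
        if on and start is None:
            start = r
        elif not on and start is not None:
            centres.append((start + r - 1) / 2.0)
            start = None
    return centres


def extract(page):
    """A marked line lying above the midpoint of its two neighbours decodes as 1."""
    c = line_centres(page)
    return [1 if c[k] < (c[k - 1] + c[k + 1]) / 2.0 else 0
            for k in range(1, len(c) - 1, 2)]


page, tops = make_page()
marked = embed(page, tops, 6, [1, 0])
print(extract(marked))  # [1, 0]
```

Only every second line carries a bit, so the unshifted neighbours serve as references during extraction and the original page is not needed for decoding.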
[25] proposed watermarking techniques that embed a single code word in each document for each recipient. In this work, the authors proposed three techniques: line-shift coding, word-shift coding and feature coding. The discussed techniques are extremely reliable even if the documents are photocopied.
A feature calibration based technique for embedding and detecting watermarks in document images is proposed by [13]. In this technique, two sets of symmetrically arranged partitions are specified in the bounding boxes of the text line, and features are extracted from every partition. The difference between the average feature values of the two sets is encoded as a positive or negative displacement for every bit of the watermark. This technique is used for protection purposes. To assess the efficiency of the technique, the average and standard deviation of the feature displacement are evaluated.
Some of the measurable properties of digital content fonts are used by [21] to develop a technique for embedding data in color or monochrome images. The embedding process includes two scans of the original document. The first scan produces a preview of the document image at a lower resolution, and the second scan is performed at maximum resolution and is used to create the copy of the document.
A data embedding technique for document images is proposed by [33]. The data are embedded by changing the widths of a few successive character spaces in a given line. During the embedding process, characters are grouped into blocks using sentence or word spacing as a group boundary. Blocks whose character size is less than the threshold value are not used for embedding a watermark. The disadvantage is that the width of the character spacing is very limited relative to the width of the word spacing.
Several watermarking techniques used for document images have been reviewed by [28]. The techniques reviewed are: text line, word or character shifting; boundary modifications; fixed partitioning of the document image into blocks; character feature modifications; run-length pattern modifications; and half-tone image modifications.
Different lines of text have different inter-word spacing. This feature is used by [46] to propose a watermarking technique for protection purposes. The inter-word spacing of different lines of text is altered so that the average spaces of the lines form a sine wave. The technique is compared with the previous techniques: line-shift coding, word-shift coding and feature coding. This approach supports both blind and non-blind techniques.
An authentication technique based on block pixel rearrangement for binary images is developed by [48]. Grayscale images are converted into binary images using a halftone technique. First, the image is divided into 9 × 9 non-overlapping blocks, and each 9 × 9 block is further divided into non-overlapping 3 × 3 blocks. The standard deviation is used as an authentication signal and is created by reordering the pixels in the selected 3 × 3 blocks. The main drawback of this approach is the deterioration introduced in the encoded binary image.
[113] proposed a blind data hiding technique for binary image authentication and annotation. The flippability of pixels is assessed before the embedding process, and a significant amount of data is embedded without any visible artifacts. Secret data is embedded by manipulating pixels with high flippability scores. Features based on groups of pixels are used to embed the data.
A blind watermarking technique for the authentication of text document images that combines inter-character and word spaces was suggested by [117]. Inter-character spaces are used together with word spaces to enhance the hiding capacity. Different embedding rules are used to embed the watermark bits. English and French text documents are used in the experiment. The major disadvantages of this technique are its high computational complexity and its susceptibility to noise attacks.
[129] used Render Sequence Encoding (RSE) to suggest a technique for authenticating text. It encodes the hidden data into documents by altering the display sequence of words and characters, without altering the appearance or contents of the digital document. Encoding is accomplished by means of a special permutation of the display sequence. RSE hides data in formatted documents that include all text layout information, such as PostScript, Portable Document Format (PDF), Printer Control Language and Device Independent documents.
An adaptive Least Significant Bit (LSB) replacement technique is used by [94] to propose a copyright protection technique. An embedding capacity map is generated from a saliency map, after which adaptive LSB replacement is performed to generate the watermarked image. Imperceptibility and robustness are optimized to verify the results.
Transform based physical embedding
Transforms are also used in physical embedding. A transform changes the image from the spatial domain into a transform domain. The coefficients of the image are used to embed the bits of a watermark. This can give better robustness than spatial domain techniques.
[80] developed a watermarking technique based on the Discrete Cosine Transform (DCT) for binary images. The DC components of the DCT coefficients are used to embed the bits of the watermark image. Before the embedding process, a pre-processing step blurs the binary image into a grayscale image. After embedding, a post-processing step binarizes the image again. Pre-processing is required to prevent the watermark embedding in the DC components from failing, and binarization ensures that the result is still a binary image. The technique provides not only robustness but also protection against cropping and noise attacks.
Spread spectrum modulation and the Human Visual System are used by [131] to develop an image watermarking technique for binary documents. Watermark patterns are created using the DCT coefficients and are inserted into the cover document image. Perceptual masks are used to determine the locations of flipped pixels. The robustness of the technique is evaluated against four types of attacks: noising, denoising, geometrical and print-scan attacks. The output of the technique is evaluated by using
A blind watermarking technique for document images using the Discrete Wavelet Transform (DWT), the Discrete Fourier Transform (DFT) and Arnold scrambling is proposed by [72]. Before the embedding process, the Arnold scrambling transform is applied to the watermark image for security purposes. DWT is used to divide the cover image into four subbands
A hybrid non-blind watermarking technique using DWT and Singular Value Decomposition (SVD) is proposed by [58]. Four subbands (LL, HL, LH, and HH) of the cover image are generated using the DWT. SVD is then performed on the LL sub-band, and the diagonal singular value coefficients are modified with the help of a scaling factor. After that, coefficients of
[70] proposed a DCT based semi-blind watermarking technique for Quran images. First, the cover image is split into 8 × 8 blocks, and DCT is then applied to every block. DCT and interpolation are used during the embedding process. 120 Quran text images were used in the experiment. PSNR, MSE, Visual Information Fidelity (VIF), Universal Quality Index (UQI), Structural Similarity Index (SSIM), Noise Quality Measure (NQM) and Information Fidelity Criterion (IFC) were used to evaluate the performance of the technique.
[32] proposed an adaptive image enhancement technique for handwritten document images. A threshold value derived from the histogram is used to equalize the contrast of a handwritten image through contrast limited adaptive histogram equalization. Directional transform enhancement is then applied to foreground and interfering strokes to improve the quality of the image. The technique also increases the readability of handwritten document images.
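Two of the quality measures listed above, MSE and PSNR, follow directly from their standard definitions and can be sketched in Python as follows. The function names and the default peak value of 255 (8-bit images) are assumptions for illustration; the remaining measures need considerably more machinery and are not shown.

```python
import numpy as np


def mse(cover, marked):
    """Mean squared error between the cover and the watermarked image."""
    a = np.asarray(cover, dtype=np.float64)
    b = np.asarray(marked, dtype=np.float64)
    return float(np.mean((a - b) ** 2))


def psnr(cover, marked, peak=255.0):
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    err = mse(cover, marked)
    if err == 0.0:
        return float("inf")
    return float(10.0 * np.log10(peak ** 2 / err))


cover = np.zeros((4, 4), dtype=np.uint8)
marked = cover.copy()
marked[0, 0] = 4                 # one pixel changed by 4 grey levels
print(mse(cover, marked))        # 1.0
print(round(psnr(cover, marked), 2))  # 48.13
```

A higher PSNR indicates that the watermarked image is closer to the cover image, which is why it is commonly reported as an imperceptibility measure.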
A watermarking technique based on DCT and the Integer Wavelet Transform (IWT) for text images is proposed by [12]. IWT is applied to the cover image to create sub-bands, and DCT is applied to the low-frequency sub-band. For each block, 9, 16, 25 and 36 coefficients are selected. The attacks used are JPEG compression, rotation, cropping, median filtering, salt and pepper noise and histogram equalization. This technique is used for the purpose of protection. Parameters like
A watermarking technique based on stable regions and object fill for grayscale document images was suggested by [76]. Using image processing operations, the document image is transformed into an intermediate form to find the stable region. Object segmentation is applied to find the separated objects in the stable region. The performance of the technique is measured in terms of imperceptibility, robustness and capacity against attacks such as JPEG compression, geometric transformation and the print and scan process.
[34] proposed a blind watermarking technique for document images using DWT and a Quick Response (QR) code. The cover image is decomposed into n levels of frequency channels using DWT, and the QR code is embedded in a sub-band of the final DWT level. The robustness of the technique is tested against attacks such as JPEG compression, cropping, scaling and salt and pepper noise. SSIM and
[124] proposed a watermarking technique to improve the security of two-dimensional codes by using QR. The cover image is divided into blocks, and the two-dimensional DCT is applied to every block. A smaller block is then selected, and SVD is applied to it. The two-dimensional DCT and SVD are also applied to the watermark image. The technique shows good robustness, invisibility, low computational cost and high security.
A watermarking technique for digital documents by using encryption and compression was suggested by [98]. Multiple watermarks are embedded by using non sub-sampled contourlet transform, DWT and SVD transform. SHA-256 and Lempel Ziv Welch (LZW) are used to encrypt and compress the digital document. Multiple watermarks can be extracted from distorted cover image.
Distance reciprocal distortion measure based physical embedding
The Distance-Reciprocal Distortion Measure (DRDM) measures distortion by applying a weight matrix in which each weight is the reciprocal of the distance from the center pixel. It can be used to determine the amount of distortion caused by flipping a particular pixel of a binary document image. The distance between pixels plays a crucial role in the human perception of distortion in document images.
A secure data embedding technique for binary document images was suggested by [79]. This technique provides security against tampering and supports authentication. The amount of distortion caused by flipping a specific pixel in a binary document image is measured using DRDM. The technique combines DRDM with a 2-D shifting technique and an even-odd embedding technique.
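The reciprocal-distance weighting described above can be sketched in Python as follows. The sketch assumes a 5 × 5 weight matrix normalised to sum to one, edge padding at the image border, and a per-pixel distortion computed as the weighted count of neighbours that differ from the flipped value. The function names are hypothetical, and the normalisation over the whole image used by the full measure is omitted.

```python
import numpy as np


def drd_weights(m=5):
    """m x m weights: reciprocal distance to the centre, zero at the centre."""
    c = m // 2
    ii, jj = np.mgrid[0:m, 0:m]
    dist = np.sqrt((ii - c) ** 2 + (jj - c) ** 2)
    w = np.zeros((m, m))
    w[dist > 0] = 1.0 / dist[dist > 0]
    return w / w.sum()


def flip_distortion(original, x, y, m=5):
    """Distortion caused by flipping pixel (x, y) of a binary (0/1) image."""
    c = m // 2
    padded = np.pad(np.asarray(original, dtype=float), c, mode="edge")
    block = padded[x:x + m, y:y + m]  # neighbourhood centred on (x, y)
    flipped_value = 1.0 - float(original[x, y])
    return float(np.sum(np.abs(block - flipped_value) * drd_weights(m)))


blank = np.zeros((9, 9), dtype=np.uint8)
solid = np.ones((9, 9), dtype=np.uint8)
solid[4, 4] = 0
print(flip_distortion(blank, 4, 4))  # isolated new dot: highest distortion
print(flip_distortion(solid, 4, 4))  # filling a hole: lowest distortion
```

Under these assumptions, a flipped pixel that agrees with its close neighbours scores low, so an embedding technique can prefer such pixels.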
Image and text based physical embedding
Image as well as text can be used for zero watermarking. Watermarks are created using a combination of image and text to make the document image more secure. This combination can also enhance the robustness of the watermarking technique.
A watermarking technique using a combination of image and text watermarks for document images is proposed by [59]. The watermark bits are logically inserted in the content of the text image, and the text image is further encrypted using the Rivest–Shamir–Adleman (RSA) algorithm to provide security. On the receiver side, the encrypted text is decrypted prior to the extraction phase. The extracted watermark bits are compared with the original watermark to verify the authenticity of the cover document image.
A text image watermarking technique for printed documents was suggested by [47]. A Fourier descriptor is used to flip trivial pixels on the boundary of a character based on frequency information. A multi-bit watermark is embedded in a single character with the help of a quadratic quantization function. A QR code, with its high decoding reliability and robust error correction capability, can be used as the watermark information to reduce the effect of bit errors and enhance the robustness of the watermark information.
[1] developed an intelligent text watermarking technique to protect Latin-based information from malicious attacks on social media. A secret watermark is inserted in the text and can be extracted later to provide evidence of ownership. The technique is based on an instance-based learning algorithm, and all text words are labelled so that their integrity can be evaluated against attacks such as forgery, tampering and plagiarism.
[100] proposed a tamper detection technique for text images using a Markov matrix and entropy. A character pattern is generated by computing the entropy of every sentence and a Markov matrix from the occurrences of characters. The entropy of each sentence is encoded as Zero Width Characters (ZWCs) and placed after the sentence terminator. The ZWCs of the character patterns are embedded at the end of the text of the image. The same process is performed on the receiver side for tamper detection.
A watermarking technique for the protection and authentication of digital document images using a hashing algorithm is proposed by [101]. A hash value of the cover document image is produced by applying Message Digest 5 (MD5). The hash value and the watermark image are translated into ZWCs using a lookup table before the embedding process. The same process is applied on the receiver side to detect tampering. The integrity rate is used as a parameter for comparison with existing techniques.
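The hash-to-ZWC idea can be sketched in Python as follows. The four-entry lookup table (two bits per zero-width character), the choice of code points, the placement of the ZWCs at the end of the text, and the function names are all assumptions made for illustration; the lookup table of [101] is not reproduced here, and only the hash value (not a watermark image) is embedded.

```python
import hashlib

# Hypothetical lookup table: each zero-width character carries two bits.
ZWC = ["\u200b", "\u200c", "\u200d", "\u2060"]
ZWC_INDEX = {ch: i for i, ch in enumerate(ZWC)}
TAIL_LEN = 16 * 4  # an MD5 digest has 16 bytes, 4 ZWCs per byte


def bytes_to_zwc(data):
    """Encode bytes as invisible characters, two bits at a time."""
    return "".join(ZWC[(byte >> shift) & 0b11]
                   for byte in data for shift in (6, 4, 2, 0))


def zwc_to_bytes(zwc_text):
    """Inverse of bytes_to_zwc."""
    vals = [ZWC_INDEX[ch] for ch in zwc_text]
    return bytes((vals[k] << 6) | (vals[k + 1] << 4) | (vals[k + 2] << 2) | vals[k + 3]
                 for k in range(0, len(vals), 4))


def embed(text):
    """Append the MD5 digest of the visible text as zero-width characters."""
    digest = hashlib.md5(text.encode("utf-8")).digest()
    return text + bytes_to_zwc(digest)


def verify(marked):
    """Recompute the digest of the visible part and compare with the hidden one."""
    visible, tail = marked[:-TAIL_LEN], marked[-TAIL_LEN:]
    if len(tail) != TAIL_LEN or any(ch not in ZWC_INDEX for ch in tail):
        return False
    return hashlib.md5(visible.encode("utf-8")).digest() == zwc_to_bytes(tail)


marked = embed("Pay 100 to Alice")
print(verify(marked))                        # True
print(verify(marked.replace("100", "900")))  # False
```

The marked text renders the same as the original because the appended characters have zero width. MD5 is used here only because the reviewed technique names it.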
Syntactic tree based physical embedding
The syntactic structures of the document image are represented by syntactic trees and these trees are not very large. These syntactic trees are used for document watermarking.
[17] developed a watermarking technique based on natural language for document images. This technique relies on the text-meaning representations (TMR) of the text sentences, which are richer and larger trees than the syntactic trees. The semantic and syntactic marking schemes are used together to design a system that can detect tampered text without using any side information.
[84] proposed a Natural language watermarking approach for text documents. Original text image is converted into a syntactic tree where the syntactic hierarchies and functional dependencies are encoded. In order to embed the bits of watermark, sentences in syntax tree format are used.
Edge based physical embedding
The characteristic of text document images can be identified using the histogram of the edge directions. These characteristics are used for the embedding of watermark bits. Edge directions histogram is used to improve the robustness. Edge pixels are used to enhance the data embedding capacity.
A watermarking technique based on edge direction histograms for grayscale text documents is proposed by [66]. The histogram of edge directions is used to create empty spaces for embedding. The Sobel edge detector is used to find the edge strengths and directions for a grayscale document image. Each edge direction is then quantized into 16 levels. The image is partitioned into a set of primitive blocks, and both mother and child blocks are chosen. The child blocks are used to embed the watermark data. English, Korean and Chinese document images have been used in the experiment to evaluate the efficiency of the technique.
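The first steps of this pipeline, Sobel gradients followed by a 16-level edge direction histogram, can be sketched in Python as follows. The edge padding, the strength threshold and the function names are illustrative assumptions; the block partitioning and the embedding itself are not shown.

```python
import numpy as np


def sobel_gradients(img):
    """Horizontal (gx) and vertical (gy) Sobel responses with edge padding."""
    p = np.pad(np.asarray(img, dtype=float), 1, mode="edge")
    gx = (p[:-2, 2:] + 2 * p[1:-1, 2:] + p[2:, 2:]) \
        - (p[:-2, :-2] + 2 * p[1:-1, :-2] + p[2:, :-2])
    gy = (p[2:, :-2] + 2 * p[2:, 1:-1] + p[2:, 2:]) \
        - (p[:-2, :-2] + 2 * p[:-2, 1:-1] + p[:-2, 2:])
    return gx, gy


def edge_direction_histogram(img, levels=16, min_strength=1e-6):
    """Quantise the direction of every edge pixel into `levels` bins."""
    gx, gy = sobel_gradients(img)
    strength = np.hypot(gx, gy)
    angle = np.arctan2(gy, gx) % (2 * np.pi)  # direction in [0, 2*pi)
    bins = np.floor(angle / (2 * np.pi) * levels).astype(int) % levels
    return np.bincount(bins[strength > min_strength], minlength=levels)


step = np.zeros((8, 8))
step[:, 4:] = 1.0  # dark left half, bright right half
print(edge_direction_histogram(step))  # all edge pixels fall into bin 0
```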
A technique based on invisible and high data capacity data hiding by using edge pixels is proposed by [107]. In the embedding process, connected components of edge pixels are used to embed the data. For blind data hiding, the total number of vertical and horizontal lines between two consecutive embeddable diagonal edge lines are taken as a parameter. The external boundary of a character is often used to embed data. Inner boundary is used to enhance the embedding capacity of the technique.
Reversible data hiding based physical embedding
Reversible data hiding is a data embedding technique in which the cover image is restored without any loss after extraction of the embedded watermark. This technique is helpful in enhancing the protection of the encrypted cover image.
A reversible data hiding technique with lossless reconstruction for binary images was suggested by [109]. This approach is based on pair-wise logical computation. To construct a new 2M-bit binary sequence, bit 1 is inserted in front of each bit in the secret binary bit stream. The technique provides good embedding capacity as well as lossless recovery of the watermark, but it is not applicable to grey and color images.
Signature based physical embedding
A digital signature can also be used for physical embedding. For the authentication of images, digital signature and watermarking techniques are used. A digital signature is a cryptographic tool used to verify the authenticity as well as the integrity of digital images.
[116] developed a semi-fragile watermarking technique to authenticate the text document images. This technique considers the smoothness and connectivity of the block. The block’s content signature is generated and inserted in the block. A pixel’s flippability depends on denoise pattern matching. During embedding process a block size of 7 × 7 is used and a moving window of size 3 × 3 is used for each block. The Look Up table consisting of the denoise pattern index is used. The documents in English and Chinese language are used as a cover image.
A watermarking technique based on a document analyzer and signature generator for digital documents is proposed by [89]. The Document Analyzer and Signature (DAS) generator works on the author information, the actual text and its features. The number of watermark bits that can be embedded in the document image is computed by DAS. The main point of the invisible insertion algorithm for watermarking is that the output image is the same as the input image.
A self-embedding watermarking approach is used by [104] to propose a watermarking technique for digital documents. The technique detects seal and signature entities. Recovery information is derived from these entities and embedded throughout the document after encryption. Authentication is performed at the pixel level using three integrity check bits based on location, neighborhood and pixel value.
Entropy based physical embedding
Entropy is a computation of image information content and the corresponding states of intensity level of every pixel. It is used in quantitative analysis and to evaluate details of the image. The value of entropy provides a better comparison of the information of the image.
An entropy-based data embedding technique for digital document images was suggested by [69]. First, smallest fonts that appear most commonly are identified in the document and then unique regions at word level are identified where the embedding process causes minimal perceptual distortion. Variations of the entropy are used to detect the beginning and end of the character, as well as the suitable positions in the word for embedding the data of the watermark.
A technique for embedding data using entropy for binary document images is proposed by [63]. Entropy is used to select blocks of the cover image with minimal perceptual distortion. This technique provides good visual quality and reduces response time as well as computational complexity. It is based on Authentication Watermarking by Template ranking with symmetrical Central pixels (AWTC) with entropy measurements and is used for authentication.
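The entropy measure underlying these techniques is the Shannon entropy of the intensity histogram, which can be sketched in Python as follows. The block size and the function names are assumptions for illustration; how the block entropies are ranked to choose embedding positions is specific to each technique and is not shown.

```python
import numpy as np


def shannon_entropy(block, levels=256):
    """Shannon entropy (in bits) of the intensity histogram of a block."""
    hist = np.bincount(np.asarray(block, dtype=np.int64).ravel(), minlength=levels)
    p = hist[hist > 0] / hist.sum()
    return float(-np.sum(p * np.log2(p)))


def block_entropies(image, size=8):
    """Entropy of every non-overlapping size x size block, keyed by its top-left corner."""
    h, w = image.shape
    return {(r, c): shannon_entropy(image[r:r + size, c:c + size])
            for r in range(0, h - size + 1, size)
            for c in range(0, w - size + 1, size)}


page = np.zeros((16, 16), dtype=np.uint8)
page[:8, 4:8] = 1  # top-left block is half ink, half background
for corner, value in block_entropies(page).items():
    print(corner, value)
```

For a binary image the entropy of a block lies between 0 (uniform block) and 1 bit (equal numbers of black and white pixels).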
Block based physical embedding
The cover document image is partitioned into blocks prior to the embedding process. It is easier to work on a block than on the whole image. The block size is determined by the watermarking approach.
A data hiding technique based on connectivity for binary image authentication is proposed by [115]. Prior to the embedding process, the flippability of each pixel is determined. Pixels are flipped only where this does not destroy the connectivity of the pixels. Fixed 3 × 3 block (FB), Non-Interlaced Block (NIB) and Interlaced Block (IB) definitions are used to partition the image into blocks. During the embedding process, only one bit of the watermark image is embedded in each block. Various types of images, such as cartoons and French, Chinese, English, Japanese and handwritten text, are used to determine the efficiency of the technique.
[30] proposed a fragile watermarking technique for document images. The cover document is partitioned into blocks of the same size, and a pseudo-random mapping of each block is generated. The watermark is created using the authentication and recovery information and is embedded in the corresponding mapping block. Authentication information is created with the help of the LSB value of every pixel in the block. A dual option parity check method is used to check the authenticity of the watermarked image. This technique is used for both recovery and tamper detection.
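The idea of carrying one watermark bit per block can be sketched in Python with a simple parity rule on fixed 3 × 3 blocks. This is a deliberately simplified stand-in: it enforces the parity of the ink count by flipping the centre pixel, whereas the technique in [115] selects the pixel to flip by its flippability and connectivity. The parity rule, the pixel choice and the function names are assumptions for illustration.

```python
import numpy as np


def block_origins(shape, size=3):
    """Top-left corners of the non-overlapping size x size blocks."""
    h, w = shape
    return [(r, c) for r in range(0, h - size + 1, size)
            for c in range(0, w - size + 1, size)]


def embed_bits(image, bits, size=3):
    """Make the parity of the ink count in block k equal to bits[k]."""
    out = image.copy()
    origins = block_origins(out.shape, size)
    if len(bits) > len(origins):
        raise ValueError("more bits than blocks")
    for bit, (r, c) in zip(bits, origins):
        block = out[r:r + size, c:c + size]  # view into `out`
        if int(block.sum()) % 2 != bit:
            block[size // 2, size // 2] ^= 1  # simplified pixel choice
    return out


def extract_bits(image, count, size=3):
    """Blind extraction: read back the parity of each block."""
    origins = block_origins(image.shape, size)[:count]
    return [int(image[r:r + size, c:c + size].sum()) % 2 for r, c in origins]


cover = np.zeros((9, 9), dtype=np.uint8)
bits = [1, 0, 1, 1, 0, 0, 1, 0, 1]
marked = embed_bits(cover, bits)
print(extract_bits(marked, len(bits)))  # [1, 0, 1, 1, 0, 0, 1, 0, 1]
```

Extraction needs only the watermarked image and the block layout, and at most one pixel per block is changed.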
[110] proposed an authentication technique using the principle of block patterns for binary images. Two embedding block patterns: non-symmetrical block and dual pair block are designed to enhance the visual quality of watermarked images. Two types of matching pair (MP) methods are used to minimise the embedding process effect: internal MP adjustment and external MP adjustment.
A segmentation based robust watermarking technique for the protection of document images was suggested by [29]. The block encoding approach is used to compress the binary watermark logo. Depending on the presence or absence of information in the original document image, empty and non-empty segments are created. To obtain subbands, the two-level integer wavelet transform is applied to the non-empty segments of the cover document image. Quantization is used to hide the compressed watermark bits in the lower subband blocks. The parameters used to determine the technique’s efficiency are
[91] proposed a fragile watermarking technique for document image by using linear block mapping. This fragile technique is used to detect small changes in watermark bits and also used for authentication process. Chaotic map is used to spread the neighboring pixels of the document image to a mostly dispersed location.
Slope based physical embedding
Watermarking based on slope can be used for physical embedding. It uses the characteristics of the sloping letters of a language: the slopes of typical sloping letters are used to embed the watermark bits.
A watermarking technique for the Farsi language based on sloping letters is proposed by [35]. Watermark bits are embedded in four common sloping letters by changing the value of their slopes. The key parts of the sloping letters are located below the baseline, and the letters differ only in the number of points placed on top of them. Sloping letters appear in both distinct and associated forms. The slope of the letter can be determined as follows:
Script format based physical embedding
A document file has a particular format such as Portable Document Format (PDF) or Open Document Format (ODF). These formats are smaller in size compared to other image files and are known to be more secure than other file format types.
A script format document authentication technique is developed by [41]. Watermark bits are embedded into the document file during the embedding process. The tamper detection capability is evaluated on 200 documents; if more than 0.25% of the total characters of a document are changed, the technique is able to detect the alteration.
[2] proposed a watermarking technique for E-government document images. DWT is applied to the cover image up to two levels, after which SVD is applied. Non-reversible embedding is applied to the diagonal elements of the DWT-SVD decomposed document image.
A watermarking technique for PDF documents using Discarded Page Objects was suggested by [125]. The content of the PDF document is read in binary form with the help of a translator. The Advanced Encryption Standard (AES) is used for the encryption and decryption process. Robustness and invisibility are the parameters used to evaluate the performance of the technique.
[31] proposed a watermarking technique for PDF documents to improve the accuracy of content extraction. When a watermark is embedded into a PDF document, it can affect the order of the content; this technique overcomes that problem. Direct extraction of text from the PDF and Optical Character Recognition are used to improve the extraction of content.
A watermarking technique for PDF documents based on file page objects is proposed by [62]. Huffman coding is used to compress the PDF document. A suitable page object is used to embed the watermark bits. Neither the content nor the format of the document is affected, because the watermark is embedded in the page objects of the PDF document. The technique shows good robustness and imperceptibility.
A double watermarking technique for PDF documents is proposed by [126]. The Data Encryption Standard (DES) is applied to the watermark image to generate hexadecimal watermark information. The PDF file information is read up to the end of the file, and the first ID string in the trailing information of the PDF file is replaced with the hexadecimal watermark sequence. The watermark is extracted without any distortion after text and page attacks. The technique shows good robustness.
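The replacement of the first ID string can be sketched as follows. This is only an illustration under several assumptions: the watermark is already available as a hexadecimal string (the DES step is not reproduced), the file has a classic trailer whose `/ID` entry is an array of two hexadecimal strings, and the function names and the regular expression are hypothetical rather than taken from [126].

```python
import re

# Matches the first hexadecimal string of an "/ID [<...> <...>]" trailer entry.
ID_PATTERN = re.compile(rb"(/ID\s*\[\s*<)([0-9A-Fa-f]*)(>)")


def embed_in_id(pdf_bytes, watermark_hex):
    """Replace the first ID string of the last trailer with the watermark."""
    if not re.fullmatch(r"[0-9A-Fa-f]+", watermark_hex):
        raise ValueError("watermark must be a hexadecimal string")
    matches = list(ID_PATTERN.finditer(pdf_bytes))
    if not matches:
        raise ValueError("no /ID entry found")
    last = matches[-1]  # the trailer closest to the end of the file
    return (pdf_bytes[:last.start(2)]
            + watermark_hex.encode("ascii")
            + pdf_bytes[last.end(2):])


def extract_from_id(pdf_bytes):
    """Read the first ID string of the last trailer back as the watermark."""
    matches = list(ID_PATTERN.finditer(pdf_bytes))
    if not matches:
        raise ValueError("no /ID entry found")
    return matches[-1].group(2).decode("ascii")


if __name__ == "__main__":
    # Toy trailer fragment, not a complete PDF file.
    trailer = (b"trailer\n<< /Size 5 /Root 1 0 R "
               b"/ID [<0123456789ABCDEF> <FEDCBA9876543210>] >>\n"
               b"startxref\n123\n%%EOF\n")
    marked = embed_in_id(trailer, "DEADBEEF00112233")
    print(extract_from_id(marked))
```

The second ID string is left untouched in this sketch. Files that store the trailer dictionary in a cross-reference stream, or that are encrypted, would need additional handling that is not shown.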
Kashida based physical embedding
Kashida, an Arabic character used to extend the connection between adjacent letters, may be used for watermarking purposes. There are some rules for using kashida in a document image, for example that it cannot be used at the beginning or end of a word or of a preposition. These rules can be used to produce a zero watermark. Imperceptibility depends on the average number of kashidas per word.
[43] proposed a technique using kashida in which watermark bits are embedded in Arabic characters. The embedding process does not alter the content of the text, and a kashida may be placed before or after particular letters. The technique randomizes the position of every kashida based on a series of random values. The random number assigned to each partition is determined using a pseudo random number generator. The capacity ratio is measured and found to be larger than that of existing techniques. The technique does not change the written content and provides good capacity, security, and robustness.
[8] proposed an improved kashida technique for copyright protection and authentication of Arabic document images. The kashidas are embedded in front of particular characters. For bit 1, the kashida is inserted and for bit 0, it is omitted. In order to improve security, some rules are followed in the embedding process of the kashida. The technique is evaluated on the capacity ratio, and it is found that higher imperceptibility results in lower capacity.
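The rule "kashida present for bit 1, absent for bit 0" can be sketched in a few lines. The carrier letter set, the choice to place the kashida directly after a carrier letter in logical (stored) order, and the function names are assumptions for illustration; the actual rules of [8] are not reproduced, and the cover text is assumed to contain no kashida of its own.

```python
KASHIDA = "\u0640"  # ARABIC TATWEEL
# Hypothetical carrier letters (beh, seen, meem, noon); each joins to the next letter.
CARRIERS = set("\u0628\u0633\u0645\u0646")


def is_arabic_letter(ch):
    """True for characters in the basic Arabic letter range, excluding kashida."""
    return "\u0621" <= ch <= "\u064a" and ch != KASHIDA


def kashida_embed(cover, bits):
    """Insert a kashida after a carrier letter for bit 1, nothing for bit 0."""
    out, pending = [], list(bits)
    for i, ch in enumerate(cover):
        out.append(ch)
        nxt = cover[i + 1] if i + 1 < len(cover) else ""
        # A slot is a carrier letter that is not the last letter of its word.
        if pending and ch in CARRIERS and is_arabic_letter(nxt):
            if pending.pop(0) == 1:
                out.append(KASHIDA)
    if pending:
        raise ValueError("cover text too short for the watermark")
    return "".join(out)


def kashida_extract(marked, n_bits):
    """Read one bit per slot: 1 if a kashida follows the carrier, else 0."""
    bits = []
    for i, ch in enumerate(marked):
        if len(bits) == n_bits:
            break
        nxt = marked[i + 1] if i + 1 < len(marked) else ""
        if ch in CARRIERS and nxt == KASHIDA:
            bits.append(1)
        elif ch in CARRIERS and is_arabic_letter(nxt):
            bits.append(0)
    return bits


if __name__ == "__main__":
    cover = "\u0628\u0633\u0645\u0629 \u0645\u0646\u0628\u0631"  # two Arabic words
    marked = kashida_embed(cover, [1, 0, 1, 1, 0, 1])
    print(kashida_extract(marked, 6))
```

Removing every kashida from the marked text returns the cover text, which is why the content of the document is preserved.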
An authentication technique based on frequency recurrence properties is proposed by [9], in which a kashida is inserted in front of specific characters. For bit 1, the kashida is inserted and for bit 0, it is omitted. This approach provides good security, capacity and robustness. However, a large number of kashida characters may raise suspicion and reduce security and imperceptibility. Watermark bits are not embedded at every letter, which decreases the embedding capacity of the technique. The kashida approach is also not applicable to short texts. While the technique supports invisibility, it is vulnerable to attacks like copy and paste, retyping and Optical Character Recognition (OCR).
Letter occurrence based physical embedding
Letters that appear in a document image may also be used for physical embedding. A document image consists of a number of words, letters and lines. The occurrences of letters are used to embed the watermark bits.
A text watermarking technique using a combined image-plus-text watermark is proposed by [53]. In the embedding process, watermark bits are inserted through the use of double letters. First, the watermark image is divided into text and image. An alphabetical watermark is created by converting the image to alphabetic characters. Depending on the preposition input and the size of the group, partitions and groups are formed. Using the watermark and the second largest double letter occurrence from every group, the key generator produces an author key that is stored with the Trusted Authority (TA) and used to authenticate the cover image.
An information hiding technique for vocalized Arabic text is proposed by [20]. Diacritics (vowel signs), which are optional in written Arabic, are used for embedding. In the embedding process, when the secret bit is 1 the diacritic is kept as it is, and when the bit is 0 the diacritic is deleted. The number of letters and spaces as well as the number of diacritics of the cover image are considered for embedding.
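The keep-or-delete rule for diacritics can be sketched as below. The sketch assumes that the receiver still holds the vocalized cover text and compares it with the marked text, which may differ from the extraction procedure of [20]; the function names are hypothetical, and the cover is assumed not to contain the same diacritic twice in a row.

```python
def is_diacritic(ch):
    """True for the Arabic vowel signs from fathatan (U+064B) to sukun (U+0652)."""
    return "\u064b" <= ch <= "\u0652"


def diacritic_embed(cover, bits):
    """Keep a diacritic for bit 1 and delete it for bit 0."""
    out, used = [], 0
    for ch in cover:
        if is_diacritic(ch) and used < len(bits):
            if bits[used] == 1:
                out.append(ch)
            used += 1
        else:
            out.append(ch)
    if used < len(bits):
        raise ValueError("cover text has too few diacritics")
    return "".join(out)


def diacritic_extract(cover, marked, n_bits):
    """Recover the bits by checking which cover diacritics survive in the marked text."""
    bits, pos = [], 0  # pos walks through the marked text
    for ch in cover:
        if len(bits) == n_bits:
            break
        if is_diacritic(ch):
            kept = pos < len(marked) and marked[pos] == ch
            bits.append(1 if kept else 0)
            if kept:
                pos += 1
        else:
            pos += 1  # letters and spaces are never removed
    return bits


if __name__ == "__main__":
    # Three letters, each followed by a fatha (U+064E).
    cover = "\u0643\u064e\u062a\u064e\u0628\u064e"
    marked = diacritic_embed(cover, [1, 0, 1])
    print(diacritic_extract(cover, marked, 3))
```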
Arabic letters have no single fixed shape; their shapes depend on their position in the word. This characteristic is used by [11] to propose two watermarking techniques. These techniques use the pseudo space in Arabic text documents, which makes connected letters appear separate, to embed watermark bits. The first technique embeds the watermark bits by adding a pseudo space to the Arabic text using the dot character. The second technique adds a normal space to the pseudo space, which improves the embedding capacity. This approach shows good robustness and imperceptibility against attacks like copy and paste, formatting and tampering, but it is vulnerable to attacks like retyping.
[60] proposed an invisible digital watermarking method for Arabic digital document images. In Arabic, the diacritics of the letters play a crucial role in their meaning. Each letter paired with its diacritic is converted from its Unicode representation into UTF-8 form, and an XOR operation is used to generate the watermark key. This technique can also be applied to sensitive text because it does not change the nature or content of the text.
Hashing based physical embedding
A hash function maps data of arbitrary size to fixed-size values, which are called hash codes, digests, or hashes. These values are used to index a fixed-size table called a hash table.
A watermarking technique using character encoding and attributes for combined Chinese and English text was suggested by [95]. The MD5 hash function is applied to the original watermark prior to its embedding. The technique can embed more than one bit in a single character and provides an error correction mechanism. This technique is used for protection purposes.
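The hashing step can be sketched with Python's standard `hashlib` module. The function name and the conversion of the digest into a bit list are assumptions for illustration; how [95] maps the hashed watermark onto character attributes is not reproduced. Note that MD5 is a one-way hash rather than an encryption scheme, and it is no longer considered collision resistant.

```python
import hashlib


def watermark_to_bits(watermark):
    """Hash the watermark text with MD5 and return the 128 digest bits."""
    digest = hashlib.md5(watermark.encode("utf-8")).digest()  # 16 bytes
    # Most significant bit of each byte first.
    return [(byte >> shift) & 1 for byte in digest for shift in range(7, -1, -1)]


if __name__ == "__main__":
    bits = watermark_to_bits("owner: example")
    print(len(bits))
```

Whatever the length of the original watermark, the embedder always has to hide a payload of fixed length.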
Character based physical embedding
Microsoft Word supports characteristics that can be used for physical embedding. Unicode extended characters provide robustness, high imperceptibility and adequate capacity for text watermarking.
Unicode extended characters are used by [3] to propose a watermarking technique for digital documents. Two lookup tables are designed to divide the available characters into two groups. A character from the first group encodes a 0 and a character from the second group encodes a 1. This technique demonstrates high imperceptibility and robustness against attacks like conversion, copying, addition and deletion. The robustness analysis showed that the technique tolerates almost all of the possible attacks and that the watermark is retrieved with high precision. The capacity assessment indicates that the technique has an excellent payload capacity of around 2 bits per word. The watermark does not survive retyping and font change attacks, and demonstrates only a certain degree of resistance to reordering attacks. The technique provides protection for digital documents.
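The two-table idea can be sketched as follows. The concrete tables of [3] are not given here, so the sketch assumes two small groups of Unicode space characters and replaces ordinary inter-word spaces with a member of the group that matches the bit; the group contents and function names are hypothetical.

```python
# Hypothetical lookup tables of visually similar Unicode space characters.
GROUP_0 = ["\u2004", "\u2005"]  # three-per-em space, four-per-em space: bit 0
GROUP_1 = ["\u2006", "\u2009"]  # six-per-em space, thin space: bit 1


def table_embed(cover, bits):
    """Replace ordinary spaces with a character from the group matching each bit."""
    out, used = [], 0
    for ch in cover:
        if ch == " " and used < len(bits):
            group = GROUP_1 if bits[used] == 1 else GROUP_0
            out.append(group[used % len(group)])  # rotate through the group
            used += 1
        else:
            out.append(ch)
    if used < len(bits):
        raise ValueError("cover text has too few spaces")
    return "".join(out)


def table_extract(marked):
    """Map every group character found in the text back to its bit."""
    bits = []
    for ch in marked:
        if ch in GROUP_0:
            bits.append(0)
        elif ch in GROUP_1:
            bits.append(1)
    return bits


if __name__ == "__main__":
    marked = table_embed("the quick brown fox jumps over", [1, 0, 0, 1])
    print(table_extract(marked))
```

Retyping the text replaces every special space with an ordinary one, which illustrates why this family of techniques does not survive retyping attacks.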
A text watermarking technique using homoglyph character substitution for Latin symbols was suggested by [92]. This technique is based on a password watermark and is used for authentication purposes. It preserves the shape and content of the text without converting the text to an image. It is invisible, preserves the properties of the content and is blind in nature. The technique uses alternate Unicode symbols to ensure visual indistinguishability as well as length preservation. The selected symbols are replaced with visually identical symbols that have different Unicode encodings. The technique was analysed on a data set of 1.8 million New York papers.
[93] proposed a text watermarking technique to protect the Intellectual Property (IP) of text documents. Even small portions of digital text content are protected by using fine-grain text watermarking. The technique is based on homoglyph character substitution for white-spaces and Latin symbols. The technique does not alter the content of the text, and the watermarked text, whose watermark is generated with the help of a hash function, is visually indistinguishable from the original text content. The technique shows good robustness against attacks like partial copy and paste.
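Homoglyph substitution can be sketched with a small table of Latin letters and their Cyrillic look-alikes. The table, the one-bit-per-carrier rule and the function names are assumptions for illustration and are not the symbol sets of [92] or [93]; the cover text is assumed to contain no Cyrillic letters of its own.

```python
# Latin letters and visually similar Cyrillic letters (illustrative table).
HOMOGLYPHS = {
    "a": "\u0430",
    "c": "\u0441",
    "e": "\u0435",
    "o": "\u043e",
    "p": "\u0440",
}
REVERSE = {glyph: latin for latin, glyph in HOMOGLYPHS.items()}


def homoglyph_embed(cover, bits):
    """Substitute a carrier letter with its homoglyph for bit 1, keep it for bit 0."""
    out, used = [], 0
    for ch in cover:
        if ch in HOMOGLYPHS and used < len(bits):
            out.append(HOMOGLYPHS[ch] if bits[used] == 1 else ch)
            used += 1
        else:
            out.append(ch)
    if used < len(bits):
        raise ValueError("cover text has too few carrier letters")
    return "".join(out)


def homoglyph_extract(marked, n_bits):
    """Blind extraction: a Latin carrier reads as 0, a homoglyph reads as 1."""
    bits = []
    for ch in marked:
        if len(bits) == n_bits:
            break
        if ch in HOMOGLYPHS:
            bits.append(0)
        elif ch in REVERSE:
            bits.append(1)
    return bits


if __name__ == "__main__":
    marked = homoglyph_embed("a peace of cocoa", [1, 0, 1, 1, 0, 0, 1, 0])
    print(homoglyph_extract(marked, 8))
```

The substitution is one character for one character, so the length of the text is preserved, and extraction needs only the marked text and the table.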
Data mining based physical watermarking
A watermarking technique based on data mining for copyright protection and ownership verification was suggested by [61]. Data mining principles are used to select features from the document image in which to embed the watermark bits. The technique provides copyright protection for text documents in local and cloud computing paradigms. To test attacks like formatting, addition, and deletion, 20 separate text documents have been used. The technique is capable of achieving a high degree of imperceptibility where
Convolutional neural network based watermarking
[71] proposed a technique for document images using a Convolutional Neural Network (CNN). Three views of the document images were generated by extracting the table frame, text region and shape. These views are merged and resized to a standard image. Three classic methods are used to evaluate the proposed technique.
A watermarking technique for handwritten documents based on a Fully Convolutional Network (FCN) was suggested by [75]. In a pre-processing step, the gray levels with the highest values are replaced by the mean value of the document content. The FCN is used to find the watermarking region in which the watermark bits are hidden. Imperceptibility and robustness are the parameters used to test the efficiency of the technique.
Types of attacks and performance evaluation of physical embedding watermarking
Table 2 describes the attacks and metrics used by the authors to evaluate the performance of the physical embedding approaches.
Authentication
This application is used to detect changes to the watermarked image. A digital signature is an example of a mechanism used for the authentication of a document image. Authentication may be based on the content or may cover the complete document image.
Access control
Different users have different access to the document image. This application helps avoid unauthorized copying of the document image.
Media forensics
This application is very useful in gathering evidence of illegal activity. This type of watermarking enforces agreements between the content owner and the individuals with whom the content is shared.
Protection
Document image watermarking is very useful for protecting the copyright and ownership of the document image. Digital content is embedded in the image to identify the copyright owners. Digital content may also be embedded to prevent unauthorized duplication of the content.
Localization
The localization application of document image watermarking can be used to identify the precise location where a document image has been tampered with.
Tampered recovery
The tampered recovery application is used to restore the tampered regions after tampering has been detected.
Evaluation parameters
Different techniques use different types of parameters to evaluate their efficiency. Table 3 describes the various criteria used by existing document watermarking techniques to evaluate performance.
Types of Parameters
Although document image watermarking has been studied extensively, further research is required to resolve several open issues that present techniques are not in a position to solve. The following subsections discuss some of the key research concerns and directions for future research work.
Standard data set
A data set is one of the fundamental requirements for evaluating the effectiveness of any watermarking technique. The existing document watermarking techniques do not use a common data set; authors use different data sets to develop their techniques. The key problem is the absence of a standard data set, since a common data set is useful for comparing results with existing techniques.
Language independent technique
There is no general technique developed for watermarking of document images. A technique that can be applied independently of the language of the document image is the need of the hour. A variety of techniques are developed for document image watermarking but each technique works for a specific language, such as some techniques for Arabic and some techniques for Chinese. A technique that can be applied to all languages needs to be developed.
Hybrid technique
A hybrid technique for document image watermarking should be developed. Such a technique should extend to all document types and support both logical and physical embedding. The technique should be independent of the language of the document image.
JPEG2000 attack
The JPEG2000 compression attack should be considered during the attack process. No technique has been developed to withstand the JPEG2000 attack in document watermarking. The existing techniques can only tolerate attacks such as insertion, deletion, reordering, transformation and cropping etc.
Tolerable technique
A general document image watermarking technique should be developed that can withstand all types of image processing attacks. Current techniques cannot withstand all sorts of such attacks. It is also necessary to ensure a proper level of robustness of watermarked images against such attacks.
Transformation of document
An embedded watermark may be lost during the transformation of a text into a different format, e.g., from PDF to Word and vice versa. A document image watermarking technique should be developed to make sure that the watermark stays protected and secure in any format during transformation.
Lack of flexibility
The key problem with the existing document image watermarking techniques is the lack of automation and flexibility. The current text watermarking approaches are computationally inefficient and can only be deployed in certain fields under predetermined assumptions.
Lack of deep learning based techniques
Recently, deep learning and neural networks have achieved noticeable advancement, especially in the areas of image processing, segmentation, classification and natural image security. Apart from the few CNN and FCN based techniques reviewed above, there is hardly any watermarking technique based on deep learning that can be used for the security of document images.
Conclusion
This paper explores the techniques relevant to watermarking of document images. Watermarking of the document image is increasingly popular and a number of novel and innovative techniques have been proposed. This area of watermarking faces a number of challenges, especially in terms of enhancing the development and accuracy of text detection. This article addressed the most common challenges and hurdles to document image watermarking. A brief description of the existing techniques in this field is given, the potential drawbacks are evaluated, and the findings are analyzed. The key tasks have been listed for potential study in the field of document watermarking.
Declarations
Ethical approval:
This declaration is “not applicable”.
Funding:
No funding was received.
Availability of data and materials:
Data sharing is not applicable to this article, as noted in the data availability statement of the main manuscript.
