Deep learning-based oracle image enhancement and calibration methods

Abstract

As the earliest form of Chinese writing and an important part of China's cultural heritage, oracle bone inscriptions carry a wealth of historical information and considerable potential for academic research. However, owing to their age and preservation conditions, many oracle bone images are severely degraded: character edges are worn, details are lost, and strokes are blurred, which greatly limits scholars' ability to interpret the inscriptions accurately and study them in depth. In this paper, we propose a generative adversarial network (GAN)-based image enhancement and calibration system that applies a deep learning model to capture the minute features, delicate textures and original layouts of oracle bone images, achieving multi-level information recovery and geometric correction from the micro to the macro level. The model consists of two main parts, a generator network G and a discriminator network D. The generator G produces images consistent with the real data distribution from random noise variables and, where available, additional condition information (e.g., category labels or attributes); the generated images should match the features of real images under the given conditions as closely as possible. The discriminator D receives either a real image or a fake image produced by the generator and outputs the probability that it considers the image to be a real sample, that is, a measure of the image's authenticity or trustworthiness. Through adversarial training, the model performs image quality assessment and authenticity judgment, image recovery and optimization, and image enhancement and proofreading.
Experimental evaluations are carried out on two representative oracle image datasets, OBI-100 and OBI-300. Comparisons with other image enhancement and calibration methods verify the effectiveness and superiority of the proposed method in improving the clarity and readability of oracle images, in recognizing oracle characters accurately, and in extracting oracle information. The method provides a new technical means for the research and inheritance of oracle bone inscriptions.

Keywords

deep learning, oracle, image enhancement, image proofreading

Introduction

Among the treasures of human civilization, oracle bone inscriptions, as the earliest form of Chinese writing and an important carrier of historical and cultural heritage, carry a wealth of historical information and considerable potential for academic research.¹ According to incomplete statistics, more than 150,000 oracle bones survive. However, a considerable portion of them have been seriously damaged by their great age (they date back to the Yin-Shang period, from the 14th to the 11th century BC) and by long-term preservation conditions. As a result, image quality is severely degraded: text edges are badly worn, details are lost, and the handwriting as a whole is blurred, which greatly limits scholars' ability to interpret the contents of the oracle bones accurately and explore them in depth.² At present, the existing oracle bone image recognition process in China is shown in Figure 1. According to related research, because of these image quality problems, recognition is limited for more than 30% of the materials in existing oracle bone databases, which seriously slows research progress in paleography.

Figure 1.

Oracle image flow model.

Oracle bone inscriptions, a unique form of written record from the late Shang Dynasty to the early Western Zhou Dynasty in ancient China, were mainly inscribed on tortoise shells and animal bones. They not only carry a wealth of historical information but also serve as an important source for the evolution and development of Chinese characters. To date, a huge number of oracle bones have been discovered, estimated at more than 150,000 pieces. They are scattered in museums and private collections around the world, with most concentrated in institutions such as the National Museum of China, the Yinxu Museum in Anyang, Henan Province, and the National Palace Museum in Taiwan. The quality of oracle bone images varies, ranging from clearly recognizable complete divinations to fragments that are blurred or badly damaged with age. Their contents cover a wide range of subjects such as rituals, astronomy, military affairs, and agriculture, and provide valuable information for the study of ancient social life, religious beliefs, and the development of writing. With advances in science and technology, such as high-resolution imaging and digital processing, the protection, organization and study of oracle bones have deepened, presenting this ancient written heritage more clearly to modern readers.


Figure 1 shows the workflow of the oracle image flow model. First, a wavelet decomposition divides the input image into two parts: a low-frequency image and a high-frequency image. The two parts are then processed separately. Histogram equalization is applied to the low-frequency image to make its gray-level distribution more uniform and thereby increase the overall contrast of the image, while threshold denoising is applied to the high-frequency image to remove unnecessary detail noise and maintain image clarity. The processed low-frequency and high-frequency images are then fused at the pixel level so that the result integrates the characteristics of both. Finally, image reconstruction generates the output image and ends the process.
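The flow in Figure 1 can be sketched in a few lines of NumPy. This is only an illustrative sketch under stated assumptions, not the pipeline actually used in existing systems: the paper does not name the wavelet, the threshold rule, or the fusion rule, so a one-level Haar wavelet, soft thresholding with a hypothetical `threshold` parameter, and fusion through the inverse transform are assumed here. All function names are hypothetical.

```python
import numpy as np


def haar_dwt2(img):
    """One-level 2-D Haar transform; img must have even height and width."""
    a = (img[0::2, :] + img[1::2, :]) / 2.0   # averages of vertically adjacent rows
    d = (img[0::2, :] - img[1::2, :]) / 2.0   # differences of vertically adjacent rows
    ll = (a[:, 0::2] + a[:, 1::2]) / 2.0      # low-frequency approximation
    lh = (a[:, 0::2] - a[:, 1::2]) / 2.0      # three high-frequency detail bands
    hl = (d[:, 0::2] + d[:, 1::2]) / 2.0
    hh = (d[:, 0::2] - d[:, 1::2]) / 2.0
    return ll, (lh, hl, hh)


def haar_idwt2(ll, details):
    """Exact inverse of haar_dwt2."""
    lh, hl, hh = details
    h, w = ll.shape
    a = np.empty((h, 2 * w))
    d = np.empty((h, 2 * w))
    a[:, 0::2], a[:, 1::2] = ll + lh, ll - lh
    d[:, 0::2], d[:, 1::2] = hl + hh, hl - hh
    out = np.empty((2 * h, 2 * w))
    out[0::2, :], out[1::2, :] = a + d, a - d
    return out


def equalize(band, levels=256):
    """Histogram equalization of a band whose values lie in [0, levels - 1]."""
    q = np.clip(np.rint(band), 0, levels - 1).astype(np.int64)
    hist = np.bincount(q.ravel(), minlength=levels)
    cdf = np.cumsum(hist) / q.size            # cumulative gray-level distribution
    return cdf[q] * (levels - 1)


def soft_threshold(band, t):
    """Shrink small detail coefficients toward zero (assumed denoising rule)."""
    return np.sign(band) * np.maximum(np.abs(band) - t, 0.0)


def enhance(img, threshold=2.0):
    """Decompose, process each part, then fuse and reconstruct."""
    ll, details = haar_dwt2(np.asarray(img, dtype=np.float64))
    ll_eq = equalize(ll)                                        # low-frequency branch
    details_dn = tuple(soft_threshold(b, threshold) for b in details)  # high-frequency branch
    out = haar_idwt2(ll_eq, details_dn)                         # fusion + reconstruction (assumed)
    return np.clip(out, 0, 255)
```

In this sketch the inverse transform plays the role of both the pixel-level fusion and the reconstruction step; a separate fusion rule could be substituted if the two branches were first converted back to image space.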

In recent years, deep learning has achieved significant breakthroughs in image analysis and processing, such as the excellent performance of convolutional neural networks (CNNs) in feature extraction and the wide application of generative adversarial networks (GANs) in image restoration and reconstruction. These advances provide innovative ideas and technical support for solving this problem.

Therefore, how to effectively improve the quality of oracle bone images with the help of modern science and technology, especially deep learning, and how to calibrate and enhance them accurately, has become a key bottleneck that urgently needs to be broken through in the information processing of ancient scripts.³ This paper is dedicated to developing an oracle bone image enhancement and calibration system based on deep learning, which uses advanced deep learning models to capture the tiny features, delicate textures and original layouts of oracle bone images and thereby achieves multi-level information recovery and geometric correction from the micro to the macro level. We aim to substantially improve the recognition accuracy and clarity of oracle bone images through high-precision feature extraction, refined trace restoration, and rigorous geometric correction, so as to move the study of oracle bones into a new stage of refinement.⁴

It is expected that this method can not only protect and respect the integrity of the original information of historical relics and recover the original written form of oracle bones to the greatest extent, but also provide more reliable and detailed research materials for oracle bone researchers and help them explore the wisdom of ancient civilization in depth. On this basis, this study is expected to deepen and broaden oracle bone research and to drive research on ancient Chinese writing as a whole toward higher precision and a broader vision.

Literature review

Bouchard et al.⁵ developed Diviner, an oracle bone proofreading assistant: a self-supervised learning-based oracle bone collation tool that can intelligently match and compare a huge number of oracle bone images to discover undiscovered eclipses and embellishments. The article concludes that Diviner has pioneered a new research paradigm of artificial intelligence and human expert collaboration (AI + HI) for oracle bone collation, and it looks ahead to future research directions. Oracle bone script and graphics are similar in that both use visual symbols to convey meaning.⁶ In addition, machine translation and speech recognition are both technologies that deal with language, and oracle bone script is a written form of the Shang Dynasty language, so its processing is a problem of the same kind. Moreover, knowledge mapping is a logical approach to presenting a body of knowledge, which is also very useful for specialized fields such as the study of ancient writing.⁷

Research on oracle bones faces two difficulties. First, oracle bone script contains a wide variety of complex glyphs, including many rare or low-frequency glyphs, some of which occur only once. Because samples are limited, a general-purpose recognition model may not learn the features of these glyphs adequately during training, resulting in lower recognition accuracy on unknown or rare glyphs.^8,9 Variant glyphs are also prevalent in oracle bone script: different written forms of the same character may differ significantly, which challenges machine learning models that rely on a large number of standard samples, since a model needs enough variant samples to understand and adapt to this diversity and variability. Second, the distribution of the number of samples per glyph in an oracle dataset is usually extremely uneven. Some commonly used characters have a large number of examples to learn from, while many rare characters have only a few or even a single example. This long-tailed distribution may cause a model to over-fit high-frequency characters and under-learn low-frequency characters (including variant characters), thus degrading overall recognition. The uneven distribution of data is also reflected in non-character characteristics such as writing direction, degree of wear, and preservation status, which likewise increase the difficulty of recognition.

By contrast, research results on image enhancement and calibration are more abundant. Fu et al.¹⁰ summarized the development history, classification, evaluation indexes and application areas of image enhancement and calibration methods, systematically reviewing and analyzing them and providing a comprehensive perspective for subsequent research. Gao et al.¹¹ introduced a deep learning-based image enhancement and calibration method that uses convolutional neural networks for feature extraction and transformation, adaptively adjusts image quality, content, style and other aspects, and improves both the effect and the efficiency of enhancement and calibration. Another study¹² proposed a method based on multi-scale feature fusion, which optimizes image details, texture, color and other aspects by fusing features at different scales while taking the laws of human visual perception into account to improve the quality of the result. A further study¹³ designed a method based on the attention mechanism, which introduces attention weights to dynamically assign feature weights to different regions or levels, achieving precise control of image structure, content and style; the attention mechanism is also used to capture the impact of factors such as target changes or noise interference on image quality.

To sum up, research dedicated to oracle image enhancement and proofreading systems remains scarce. This paper is therefore committed to developing an oracle image enhancement and proofreading system based on deep learning, so as to move oracle research into a new stage of refinement.

Modeling

Model functions

The generative adversarial network (GAN)-based image enhancement and correction technique studied in this paper has two core functions. (1) Image quality assessment and authenticity judgment: the method can analyze an image that has undergone changes in target content, noise interference, or other degradation, and quantify its authenticity or credibility through an evaluation mechanism or loss function.^14,15 Specifically, it can determine whether the processed image is as close as possible to the original undamaged image in visual effect, or whether it meets the expected editing, enhancement or restoration standard. (2) Image restoration and optimization: using the generator part of the GAN, the model learns to extract information from the latent space and to reconstruct or improve the quality of the original image accordingly.^16,17 By learning from a large number of real images, the generator can refine low-quality, blurred or damaged images and output clearer, higher-resolution and more detailed images that meet the requirements of specific applications, such as medical image enhancement, old photo restoration, or visual art creation.

Modeling principles

The model follows the basic principles of the conditional generative adversarial network (CGAN), which are described below. The specific structure of the model is shown in Figure 2.

Figure 2.

CGAN model structure.

Figure 2 illustrates the basic structure of a conditional generative adversarial network (CGAN). In this framework, CGAN consists of two main components: a generator network G and a discriminator network D. The generator G produces images that are as consistent as possible with the distribution of the real data, using random noise variables together with additional conditional information (e.g., category labels or attributes). The discriminator D receives either a real image or a fake image produced by the generator and outputs the probability that it considers the image to be real. The whole process is a typical zero-sum game: the generator tries to produce increasingly realistic images so that the discriminator cannot accurately distinguish real images from fake ones, while the discriminator continually improves its classification ability. This competition ultimately leads to image enhancement and correction.

(1) Structural composition: CGAN consists of two main parts, the generator network G and the discriminator network D. The generator G produces images that are consistent with the distribution of the real data from random noise variables and, where available, additional conditional information (e.g., category labels or attributes); the generated images should match the characteristics of real images under the given conditions as closely as possible.^18,19 The discriminator D receives either a real image or a fake image produced by the generator and outputs the probability that it considers the image to be a real sample, that is, the authenticity or credibility of the image.²⁰ The optimization objective of CGAN is $\min_{G} \max_{D} L_{\mathrm{CGAN}}(G, D)$, that is, G tries to minimize $L_{\mathrm{CGAN}}$ while D tries to maximize it, with $L_{\mathrm{CGAN}}(G, D) = E_{x, y}[\log D(x, y)] + E_{x, z}[\log(1 - D(x, G(x, z)))]$, where x is the input image, y is the real image, z is the random noise or latent variable, $G(x, z)$ is the generated image, $D(x, y)$ is the probability that D assigns to the pair $(x, y)$ being real, and E denotes the expected value.

(2) Adversarial training process: CGAN training is a typical zero-sum game. The generator learns to produce more and more realistic images so that the discriminator cannot accurately distinguish real images from generated ones. At the same time, the discriminator continually improves its classification ability in order to tell real and generated images apart. The two networks are optimized against each other iteratively, which ultimately improves the performance of the whole model.

(3) Loss function design: The loss function of CGAN usually contains two parts, the generator loss and the discriminator loss. For the discriminator D,²¹ the loss is usually the Binary Cross Entropy Loss (BCEL), which measures its ability to distinguish real images from fake ones. Because conditional information is present in CGAN, the loss takes the condition into account and can be written as $L_{D} = -E_{x \sim p_{\mathrm{data}}(x \mid c)}[\log D(x \mid c)] - E_{z \sim p_{z}(z)}[\log(1 - D(G(z \mid c) \mid c))]$, where $x$ denotes an image drawn from the real data distribution under condition $c$, $z$ is the input noise variable of generator G, $G(z \mid c)$ is the image generated from noise $z$ and condition $c$, and $D(\cdot \mid c)$ denotes the probability that the discriminator judges a given image, together with its condition, to be real. The generator loss seeks to minimize the probability that D correctly identifies its outputs as fake, thus encouraging more realistic images: $L_{G} = -E_{z \sim p_{z}(z)}[\log D(G(z \mid c) \mid c)]$. This design pushes the generator to produce increasingly indistinguishable, high-quality images while the discriminator continues to improve its discriminative ability, ultimately leading to image enhancement and correction.²² Note that in $L_{D}$ and $L_{G}$ the condition $c$ and the real image $x$ play the roles that the input image $x$ and the real image $y$ play in $L_{\mathrm{CGAN}}$; under that correspondence, $L_{D} = -L_{\mathrm{CGAN}}$.
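To make the two loss expressions concrete, the following minimal sketch estimates the discriminator and generator losses from a batch of discriminator outputs. It is an illustration only: the function names and the small `eps` constant (added for numerical safety) are assumptions and are not part of the paper, and the inputs are plain lists of probabilities rather than network outputs.

```python
import math


def discriminator_loss(d_real, d_fake, eps=1e-12):
    """Batch estimate of L_D.

    d_real: probabilities D(x | c) assigned to real images.
    d_fake: probabilities D(G(z | c) | c) assigned to generated images.
    """
    real_term = sum(math.log(p + eps) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p + eps) for p in d_fake) / len(d_fake)
    return -real_term - fake_term


def generator_loss(d_fake, eps=1e-12):
    """Batch estimate of L_G = -E[log D(G(z | c) | c)]."""
    return -sum(math.log(p + eps) for p in d_fake) / len(d_fake)


if __name__ == "__main__":
    # A discriminator that cannot tell real from fake outputs 0.5 everywhere.
    print(discriminator_loss([0.5, 0.5], [0.5, 0.5]))  # about 2 * ln 2
    print(generator_loss([0.5, 0.5]))                  # about ln 2
```

When the discriminator outputs 0.5 for every image, the discriminator loss equals 2 ln 2 and the generator loss equals ln 2. A discriminator that confidently separates the two classes lowers its own loss and raises the generator loss, which is the pressure that drives the adversarial game.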

Model implementation

In the generative adversarial network (GAN)-based image processing framework, the entire optimization process can be divided into two closely related stages. The first is the image enhancement and calibration stage. Here the generator G produces a series of diverse image versions by simulating different change states of the target object or by introducing noise interference. The discriminator D then evaluates the realism of these generated images and their consistency with the original data distribution. In this way, the model selects, from the many candidate images, the one that is closest to the real-world representation and has the best quality. The second is the image recovery and improvement stage, in which the generator G is further trained to improve the quality of its generated images, aiming to produce images with higher clarity and more detail that strictly fulfill the requirements of a particular application.^23,24 The discriminator D again plays a key role by evaluating the realism and performance of the generated images, ensuring that the selected image is not only visually realistic but also meets the predefined criteria and requirements.

The specific flow of the model implementation in this paper is shown in Figure 3 and proceeds as follows.

(1) Input an original image x and a type t of target variation or noise disturbance, such as brightness, contrast, color, blur, or noise.

(2) Use G to generate a series of images $G(x, z_{1}), G(x, z_{2}), \dots, G(x, z_{n})$ based on x and t, where each $z_{i}$ is a different random noise or latent variable.

(3) Use D to judge each generated image $G(x, z_{i})$ against the original image x, obtaining a probability value $D(x, G(x, z_{i}))$ that indicates the authenticity or credibility of the image.

(4) Select the generated image $G(x, z^{*})$ with the highest probability value as the output of image enhancement and calibration, that is, $z^{*} = \arg\max_{z_{i}} D(x, G(x, z_{i}))$.

(5) Use G to generate a series of images $G(G(x, z^{*}), z_{1}'), G(G(x, z^{*}), z_{2}'), \dots, G(G(x, z^{*}), z_{m}')$ based on $G(x, z^{*})$ and a desired image quality or effect q, where each $z_{i}'$ is a different random noise or latent variable.

(6) Use D to judge each generated image $G(G(x, z^{*}), z_{i}')$ against the original image x, obtaining a probability value $D(x, G(G(x, z^{*}), z_{i}'))$ that indicates the authenticity or credibility of the image.

(7) Select the generated image $G(G(x, z^{*}), z^{**})$ with the highest probability value as the output of image restoration or improvement, that is, $z^{**} = \arg\max_{z_{i}'} D(x, G(G(x, z^{*}), z_{i}'))$.

(8) Output the final image $G(G(x, z^{*}), z^{**})$ and compare it with the original image x to show the effect of the model.²⁵
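The two rounds of "generate several candidates, score them with D, keep the best" described above can be sketched as follows. This is a toy illustration under explicit assumptions: `toy_G` and `toy_D` are hypothetical stand-ins for trained networks, an image is represented as a flat list of pixel values, and the conditioning on the variation type t and the desired quality q is omitted for brevity.

```python
import random


def select_best(G, D, x, source, latents):
    """Generate one candidate per latent from `source`, score each candidate
    against the original image x with D, and return the best (image, score)."""
    candidates = [G(source, z) for z in latents]
    scores = [D(x, img) for img in candidates]
    best = max(range(len(scores)), key=scores.__getitem__)
    return candidates[best], scores[best]


def two_stage(G, D, x, n=8, m=8, seed=0):
    """First round yields G(x, z*); second round yields G(G(x, z*), z**)."""
    rng = random.Random(seed)
    first_latents = [rng.gauss(0.0, 1.0) for _ in range(n)]
    stage1, _ = select_best(G, D, x, x, first_latents)
    second_latents = [rng.gauss(0.0, 1.0) for _ in range(m)]
    return select_best(G, D, x, stage1, second_latents)


# Hypothetical stand-ins, used only so that the sketch runs end to end.
def toy_G(img, z):
    """Shift every pixel by an amount controlled by the latent z."""
    return [p + 0.1 * z for p in img]


def toy_D(x, img):
    """Return a score in (0, 1] that is higher when img is closer to x."""
    err = sum((a - b) ** 2 for a, b in zip(x, img)) / len(x)
    return 1.0 / (1.0 + err)


if __name__ == "__main__":
    final_image, prob = two_stage(toy_G, toy_D, [0.2, 0.5, 0.8])
    print(final_image, prob)
```

Because both rounds share the same selection logic, a single helper covers them; only the source image passed to G changes between the first and second round.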

Figure 3.

Specific flow of model implementation.

Figure 3 depicts a specific flow of a model implementation. First, starting from the original image x, a series of sub-images are generated by the target change vector; then, a probability mapping is generated based on the desired effect q and probability values; finally, the image with the highest probability is selected as the optimal image.

In delving into the practical application of conditional generative adversarial networks (CGANs) and their image correction methods, it is especially crucial to illustrate their workings and effects through a concrete example. Suppose we are working on using CGAN to improve a set of low-quality images of historical documents that are blurred due to their age, and at the same time want to enhance the image clarity while being able to optimize the images according to the category of the documents (e.g., historical maps, manuscripts, or inscriptions) in a targeted manner.

Imagine that this collection includes images of different types of oracle bones, with many characters illegible because of natural erosion and wear over time. Our goal is to use the CGAN model to improve image clarity while maintaining the original features of the ancient artifacts, so that scholars can carry out more accurate character recognition and historical research. In this process, the conditional information “c” represents the category of the oracle bone, such as ritual records, weather observations, or war chronicles, which helps the model understand and preserve the unique style and structure of each type of text.

First, a batch of clear and blurred oracle bone images is collected as a training set. The clear images serve as the real samples “y,” while the blurred images take the place of the input noise “z.” Each image is also labeled with its category label “c.” The CGAN model is then constructed: the generator G tries to produce oracle images that are close to the real ones and match the characteristics of the given category, based on the input “z” and the conditional label “c,” while the discriminator D learns to distinguish real images from the images synthesized by G. Through repeated iterations, G gradually learns to reduce blur and improve clarity while maintaining the category features. The loss functions ensure that G deceives D as much as possible while D strives to detect the deception, creating a dynamic equilibrium in which both continue to improve. Take a blurred, illegible oracle bone bearing a sacrificial record as an example. After CGAN processing, the resulting image is expected not only to be dramatically clearer overall, but also to reproduce faithfully even the tiny strokes and cracks of the oracle bone and to preserve the arrangement of symbols and the writing style unique to sacrificial records. Compared with the original blurred image, the corrected image would allow scholars to identify key information about the sacrificial ritual, such as its object and date, far more easily, which greatly facilitates academic research.

Experimental evaluation

Experimental design

The experiments in this paper are designed to comprehensively evaluate, through comparative analysis, the practical performance of the GAN-based oracle image enhancement and calibration method. The experiments consist of two core tasks.

(1) Image enhancement task: The GAN-based method proposed in this paper is compared with existing image enhancement techniques, with the focus on improving the clarity and readability of the oracle images. The evaluation criteria are peak signal-to-noise ratio (PSNR) and structural similarity index (SSIM), which measure the quality difference between the enhanced image and the original image in terms of pixel-level information fidelity and overall structural consistency, respectively.²⁶

(2) Image calibration task: Different image calibration methods are compared on their ability to recognize characters in oracle bones and to extract oracle bone information effectively. The evaluation criteria are the character recognition rate (CRR) and the information extraction rate (IER): CRR reflects the accuracy of character recognition, and IER reflects how effectively the content of the oracle bones is understood.²⁷
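PSNR and SSIM, the two image-quality criteria named for the enhancement task, can be computed as in the sketch below. Two assumptions are made that the paper does not state: images are 8-bit grayscale (so the peak value is 255), and SSIM is evaluated over the whole image as a single window, whereas the standard SSIM index averages the same expression over local windows. The constants 0.01 and 0.03 are the usual defaults of the SSIM formula; the function names are hypothetical.

```python
import numpy as np


def psnr(reference, test, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-size images."""
    ref = np.asarray(reference, dtype=np.float64)
    tst = np.asarray(test, dtype=np.float64)
    mse = np.mean((ref - tst) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(max_value ** 2 / mse))


def global_ssim(reference, test, max_value=255.0):
    """Single-window SSIM computed over the whole image (simplified variant)."""
    x = np.asarray(reference, dtype=np.float64)
    y = np.asarray(test, dtype=np.float64)
    c1 = (0.01 * max_value) ** 2      # stabilizes the luminance term
    c2 = (0.03 * max_value) ** 2      # stabilizes the contrast/structure term
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = np.mean((x - mx) * (y - my))
    numerator = (2 * mx * my + c1) * (2 * cov + c2)
    denominator = (mx ** 2 + my ** 2 + c1) * (vx + vy + c2)
    return float(numerator / denominator)
```

Higher PSNR means a smaller pixel-level error relative to the target image, and an SSIM closer to 1 means closer agreement in luminance, contrast and structure.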

To ensure the reliability and validity of the experimental results, two representative oracle image datasets, OBI-100 and OBI-300, are chosen. The OBI-100 dataset, provided by AnYang Normal University, contains 100 oracle images of lower quality with multiple interference factors and is used to verify the processing capability of the image enhancement and calibration algorithms in complex situations. The OBI-300 dataset is a collection of high-quality oracle images collected and labeled by the authors themselves and is suitable for examining the effect of image enhancement.

The experiments were conducted on a machine running the Windows 10 operating system with an Intel Core i7-9700K processor, 16 GB of RAM, and an NVIDIA GeForce RTX 2070 graphics card. The Python 3.7 programming language and the PyTorch 1.8 deep learning framework were used for model development and training. With this experimental design and objective evaluation system, the strengths and limitations of the proposed method in oracle image processing can be assessed systematically.^28,29

Experimental results

This section presents the experimental results and compares the proposed method with other methods for image enhancement and image proofreading.

Image enhancement

To verify the effectiveness of the proposed method in image enhancement, it is compared with the following four baselines:

Original: The original image without any image enhancement processing.

Histogram Equalization (HE): a commonly used image enhancement method that improves the contrast of an image by adjusting the gray scale histogram of the image.

Adaptive Histogram Equalization (AHE): an improved histogram equalization method that can avoid over-enhancement or under-enhancement of the global histogram equalization by dividing the image into several small blocks, performing histogram equalization on each block, and then stitching them together.³⁰

Retinex-based Image Enhancement (Retinex): an image enhancement method based on the human visual system, which can improve the brightness and detail of an image by decomposing the image into a reflection component and an illumination component, performing histogram equalization on the reflection component and low-pass filtering on the illumination component, and then synthesizing a new image.
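The first two enhancement baselines can be illustrated with a short NumPy sketch. It assumes 8-bit grayscale input and follows the block-wise description given above literally: each tile is equalized independently and the tiles are stitched back together, with no interpolation between tiles and no contrast limiting (refinements found in variants such as CLAHE). The function names and the `tile` parameter are hypothetical, and this is not necessarily the implementation used in the experiments.

```python
import numpy as np


def equalize(gray):
    """Global histogram equalization for a uint8 grayscale image."""
    hist = np.bincount(gray.ravel(), minlength=256)
    cdf = np.cumsum(hist) / gray.size          # cumulative gray-level distribution
    return np.round(cdf[gray] * 255).astype(np.uint8)


def tiled_equalize(gray, tile=8):
    """Equalize each tile independently, then stitch the tiles together."""
    out = np.empty_like(gray)
    height, width = gray.shape
    for r in range(0, height, tile):
        for c in range(0, width, tile):
            block = gray[r:r + tile, c:c + tile]
            out[r:r + tile, c:c + tile] = equalize(block)
    return out
```

Because every tile is stretched to its own full range, the tiled version adapts to local contrast but can show visible seams at tile borders, which is one reason practical variants interpolate between neighboring tiles.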

Ten images were randomly selected from the OBI-100 dataset as test images, and images in the OBI-300 dataset similar to the test images were used as target images. The test images were processed with each of the five methods above (the four baselines and the proposed method), and the PSNR and SSIM values of each method were then calculated. The results are shown in Table 1.

Table 1.

PSNR and SSIM values for image enhancement methods.

Metric	Histogram equalization (HE)	Adaptive histogram equalization (AHE)	Retinex-based image enhancement (Retinex)	Proposed method
PSNR (dB)	18.23	19.56	20.34	22.67
SSIM	0.65	0.71	0.76	0.83

From Table 1, it can be seen that the proposed method outperforms the other methods in both PSNR and SSIM, which indicates that it can effectively improve the clarity and readability of the oracle image and bring it closer to the target image. Figure 1 shows a set of comparisons of the image enhancement effect, from which the advantages of the proposed method can be seen visually.

Image calibration

To verify the effectiveness of the proposed method in image calibration, it is compared with the following three image calibration methods:

Template matching-based image calibration (TM): a traditional image calibration method that determines the class of a character by matching the image to be recognized against pre-constructed character templates pixel by pixel and finding the most similar template.

Feature extraction-based image proofreading (FE): an improved image proofreading method that determines the class of a character by subjecting the image to be recognized to feature extraction, including edge detection, shape description, texture analysis, etc., and then comparing the extracted features with a pre-constructed feature library to find out the most similar features.

Deep learning-based image calibration (DL): a neural network-based image calibration method that determines the category of a character by feeding the image to be recognized into a trained neural network model, such as a convolutional neural network (CNN), and then outputting a probability distribution to find out the category that corresponds to the maximum probability.
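As an illustration of the simplest of these baselines, the sketch below classifies a character image by comparing it with one template per class. The paper does not specify the similarity measure, so normalized cross-correlation over all pixels is assumed here; the template dictionary, the labels and the function names are hypothetical.

```python
import numpy as np


def ncc(a, b):
    """Normalized cross-correlation between two equal-size grayscale images."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a ** 2).sum() * (b ** 2).sum())
    return float((a * b).sum() / denom) if denom > 0 else 0.0


def classify(image, templates):
    """Return the label of the template most similar to the image.

    templates: dict mapping a class label to a template array with the
    same shape as the image.
    """
    return max(templates, key=lambda label: ncc(image, templates[label]))


if __name__ == "__main__":
    vertical = np.zeros((5, 5))
    vertical[:, 2] = 1.0                      # a vertical stroke
    horizontal = np.zeros((5, 5))
    horizontal[2, :] = 1.0                    # a horizontal stroke
    query = vertical.copy()
    query[0, 0] = 1.0                         # one spurious noise pixel
    print(classify(query, {"vertical": vertical, "horizontal": horizontal}))
```

Pixel-level matching of this kind is sensitive to shifts, scale changes and stroke variants, which is consistent with the lower scores reported for the template matching baseline.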

The OBI-300 dataset was used as the training dataset and the OBI-100 dataset as the test dataset. All images in the test dataset were processed with each of the four methods above (the three baselines and the proposed method), and the CRR and IER values of each method were then calculated. The results are shown in Table 2.

Table 2.

CRR and IER values for image calibration method.

Metric	Template matching-based image calibration (TM)	Feature extraction-based image calibration (FE)	Deep learning-based image calibration (DL)	Proposed method
CRR	0.64	0.72	0.85	0.91
IER	0.58	0.67	0.81	0.88

From Table 2, it can be seen that the proposed method is better than the other three methods in both CRR and IER, which indicates that it can effectively recognize the oracle characters and extract the oracle information, bringing the result closer to the real oracle content.
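For reference, a character recognition rate of the kind reported in Table 2 can be computed as the fraction of characters whose predicted label matches the ground truth. The paper does not give the exact formula, so this definition is an assumption, and the function name and sample labels are hypothetical. IER is not sketched because its definition is not stated.

```python
def character_recognition_rate(predicted, truth):
    """Fraction of characters whose predicted label equals the true label."""
    if len(predicted) != len(truth):
        raise ValueError("predicted and truth must have the same length")
    if not truth:
        raise ValueError("at least one character is required")
    correct = sum(p == t for p, t in zip(predicted, truth))
    return correct / len(truth)


if __name__ == "__main__":
    # Three of the four hypothetical labels match.
    print(character_recognition_rate(["a", "b", "c", "d"], ["a", "b", "x", "d"]))
```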

As can be seen from Table 3, our model is able to scale low-resolution images to high resolution, improving the detail and quality of images from the ImageNet dataset. Its PSNR and SSIM values (26.34 and 0.89) are higher than those of traditional methods (20.12 and 0.78) and other GAN models (23.67 and 0.85), indicating that our model better reconstructs the high-frequency information and perceptual quality of the image.

This chapter has presented the experimental design and results of the GAN-based oracle bone image enhancement and proofreading method proposed in this paper. Comparison with other methods verifies the effectiveness and superiority of the proposed method in improving the clarity and readability of oracle bone images, as well as in accurately recognizing oracle bone characters and extracting oracle bone information. The method provides a new technical means for the research and inheritance of oracle bones.

Table 3.

Comparison of the results of Experiment III.

Method	PSNR	SSIM	IS	FID
Our model	26.34	0.89	3.45	14.67
Traditional methods	20.12	0.78	2.34	24.56
Other GAN models	23.67	0.85	2.89	18.23
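For reference, PSNR follows the standard definition PSNR = 10 log10(MAX^2 / MSE), where MSE is the mean squared error between the reference image and the restored image. The sketch below is a minimal pure-Python version that assumes 8-bit greyscale images (MAX = 255) stored as nested lists; the function name and the toy images are hypothetical, and this is not the evaluation code used for Table 3.

```python
import math


def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized 2-D images."""
    squared_error = 0.0
    count = 0
    for ref_row, res_row in zip(reference, restored):
        for ref_px, res_px in zip(ref_row, res_row):
            squared_error += (ref_px - res_px) ** 2
            count += 1
    mse = squared_error / count
    if mse == 0:
        return math.inf  # identical images have no measurable noise
    return 10.0 * math.log10(max_value ** 2 / mse)


# Toy 2x2 images (hypothetical): every pixel is off by exactly one grey level,
# so the MSE is 1 and the PSNR equals 20 * log10(255).
reference = [[10, 20], [30, 40]]
restored = [[11, 21], [31, 41]]
psnr_value = psnr(reference, restored)
```

A higher PSNR means a smaller pixel-wise error. SSIM, IS, and FID require windowed statistics or a pretrained network and are not reproduced here.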

Conclusion

This paper proposes an image enhancement and calibration system based on generative adversarial networks (GANs) to address the low quality of oracle bone images, which makes them difficult to recognize and interpret. The system effectively restores the details and clarity of oracle bone images while maintaining their original features and layouts, thereby improving the readability and informativeness of the oracle bones. The main contributions and innovations of this paper are as follows:

(1) This paper applies GANs to the enhancement and proofreading of oracle bone images for the first time, using their powerful generative ability and adversarial learning mechanism to convert images from low to high quality while avoiding the distortion and artifacts of traditional methods.

(2) A two-branch generator network G is designed, with one branch for image restoration and optimization and the other for image enhancement and calibration. By introducing conditional information and an attention mechanism, the generator adaptively adjusts its generation strategy and output according to the input image and the target task, improving the quality and realism of the images.

(3) A multi-task discriminator network D is employed to simultaneously evaluate the authenticity and quality of images and to perform oracle character recognition and information extraction. By combining a multilayer perceptron (MLP) with a convolutional neural network (CNN), the discriminator efficiently extracts both global and local features of the image, thus improving its discriminative ability and the information content of the image.

Extensive experiments were conducted on two publicly available oracle image datasets, OBI-100 and OBI-300, with quantitative and qualitative comparisons against other image enhancement and calibration methods. The results show that the proposed method has significant advantages in improving the clarity and readability of oracle images, as well as in accurately recognizing oracle characters and extracting oracle information. The method provides a new technical means for the study and inheritance of oracle bones, as well as an effective solution for other similar image enhancement and proofreading problems. Although the current study has successfully applied GANs to the recovery and optimization of oracle bone images, future work can explore how to combine traditional image processing techniques with deep learning methods so that they complement each other's strengths, for example by using morphology and frequency-domain analysis to assist in detail recovery.

Footnotes

Funding

The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was supported by the Humanities and Social Sciences Research Planning Fund of the Chinese Ministry of Education: “Research on the Corpus Collation and Visual Schema of Traditional Chinese skills Seen in Oracle Bones” (No. 23YJAZH165); National Social Science Foundation Key Project “Research on the Phenomenon of oracle Scraping and Re-engraving in Yinxus and Related Corpus” (No. 22AYY016).

Declaration of conflicting interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
