Abstract
The Chinese Maze Murders, a Chinese culture-based detective novel transcreated by Dutch sinologist Robert Hans van Gulik, represents Chinese cultures multimodally and reconciles Chinese-English cultural gaps effectively. Figuring out what cross-cultural functions and contributions of illustration-text multimodal combinations in The Chinese Maze Murders make would be inspiring for effectively promoting and applying multimodal ways in Chinese-English cultural translation. Yet, there is a scarcity of translation studies investigating it directly. This multimodal corpus-based study first identifies the different culture-representing features of the illustration-text multimodal combinations in the Chinese and English versions of The Chinese Maze Murders quantitatively, and then proposes the fundamental functions and contributions of illustration-text combinations that could facilitate cross-cultural translation qualitatively. The corpus textual statistics show that the Chinese version comprises 23.04% material cultural elements and 76.96% non-material cultural elements. While the English version comprises 36.58% material cultural elements and 63.42% non-material cultural elements. The different type-allocation of textually represented cultural elements in two language versions is regarded as an operating result of the mechanism of illustration-text multimodal combinations. Based on further retroductive corpus statistics-derived example analysis, it is argued that illustration-text multimodal combinations could facilitate cross-cultural translation in (a) transforming abstract cultural elements into concrete cultural elements, (b) integrating cultural elements as “meaning wholes,” and (c) visualizing cultural implications. This study sheds new light on the cultural functions of illustration-text multimodal combinations in translation, which provides practical implications for mitigating the cultural incommensurability of Chinese-English translation.
Plain Language Summary
“The Chinese Maze Murders,” a detective novel developed by Dutch sinologist Robert Hans van Gulik, serves as a transcreated representation of Chinese culture that effectively bridges the cultural gaps between Chinese and English. Exploring the cross-cultural functions and contributions of illustration-text multimodal combinations in this novel can inspire the promotion and application of multimodal approaches in Chinese-English cultural translation. However, there is a lack of direct investigation into this aspect within translation studies. This study utilizes a multimodal corpus-based approach to quantitatively identify the cultural features represented by illustration-text multimodal combinations in both the Chinese and English versions of “The Chinese Maze Murders.” Furthermore, it proposes the fundamental functions and contributions of these combinations, which facilitate cross-cultural translation qualitatively. The allocation of culturally significant elements in different textual forms is seen as an outcome of the mechanisms employed by illustration-text multimodal combinations. Through additional analysis using retroductive corpus statistics, it is argued that illustration-text multimodal combinations aid cross-cultural translation by (a) transforming abstract cultural elements into concrete representations, (b) integrating cultural elements as coherent wholes, and (c) visualizing cultural implications. This study sheds new light on the cultural functions of illustration-text multimodal combinations in translation, offering practical implications for addressing the cultural incommensurability present in Chinese-English translation.
Introduction
Multimodal texts challenge the traditional meaning-making techniques of pure verbal texts and inevitably influence the way translation operates. Images and texts together form the most common type of multimodal texts (e.g., picture books, comics, textbooks, technical manuals, advertisements, etc.), and the translation of image-text multimodal texts has already gained both practical and academic significance in the current social context. Visual mode (images) as an additional resource for communication greatly compensates for the inadequacy of sole verbal mode (texts) resource-based communication. Scholars have contributed much to generalizing image-text relationships to investigate the mechanism of visual-verbal mode interaction (Bateman, 2014; Kress & Leeuwen, 2021; Liu & O’Halloran, 2009; Marsh & Domas White, 2003; Martinec & Salway, 2005), and their studies have set a solid foundation for exploring image-text synergic interaction in translation production.
Picture books (Mateo, 2015; Oittinen, 2001; Van Meerbergen, 2009;), comics (Borodo, 2015; Kaindl, 2004; Zhao, 2023), advertisements (S. Li, 2019; Song, 2021; Torresi, 2008), and cover design in translated book (Mossop, 2018; Yu & Song, 2017;) are common research objects of image-text multimodal translation studies in recent 20 years, most of which are studied from semantic, pragmatic and semiotic perspective to uncover the potential contribution and influence of using combined modes in enhancing meaning reconstruction and communication in translations. The cultural function of image-text multimodal texts is then noticed by many scholars, such as cultural transformation and representation of culture-specific character’s image (Chen, 2018), mode adaptation to meet the cultural needs of the target language (Fu, 2013; L. Li et al., 2019), and manipulations to ideological narration (Altahmazi, 2020). Image-text multimodal texts are proved to be effective in translingual culture communication, thus more research is needed to identify breakthroughs and overcome barriers of present cross-cultural translations.
Multimodal corpus analysis is a basic method of multimodal studies (Jewitt et al., 2016, p. 110), it is quite useful and frequently used in describing potential correlations between modes in a generalized way and empirically verifying the proposed multimodal hypotheses and theories (Jewitt et al., 2016, p. 125). It is a common way to apply corpus methods in audio-visual translation studies, especially in subtitling, dubbing, and audio description studies. However, the corpus is rarely applied in visual-verbal (image-text) translation studies because designing an annotation framework that meets the requirements of both descriptive power and computational applicability (Pastra, 2008, p. 299) is rather challenging and complex for image-text multimodal texts. To further explore how cultural functions of image-text multimodal texts are realized in translation, it is worth trying to apply the multimodal corpus analysis method in describing and generalizing them.
Though it is not a typical form of image-text multimodal texts, Robert Hans van Gulik’s transcreated bilingual novel The Chinese Maze Murders has notable cultural image-text multimodal features directly reflected by the Chinese-styled illustrations in the novel drawn by van Gulik himself. Very few studies have been carried out to investigate the relationships between illustrations and texts in van Gulik’s novels, and fewer studies have studied it in the context of cultural translation. Van Gulik’s cross-cultural creation, writing, drawing, and translation together form a unique multimodal translingual practice that is highly suitable for investigating how cultural meaning in translation is compensated or enhanced by combined modes of visual illustrations and verbal texts.
In this article, the cultures represented by the illustration-text combinations of The Chinese Maze Murders in both English and Chinese are going to be investigated through an ad hoc multimodal corpus that is based on a pilot illustration-text annotation framework, aiming to explain why illustration-text multimodal combinations facilitate culture’s representation in translation.
Literature Review
Robert Hans van Gulik, The Chinese Maze Murders, and Illustrations
Robert Hans van Gulik (1910–1967) has multiple identities as a Dutch diplomat, sinologist, and crime fiction writer. His multi-identity life facilitates him to develop special interests in Chinese culture that are different from those of his contemporary sinologists and previous missionaries to China. Van Gulik’s intensive studies in Chinese painting, calligraphy, lute, zoology, and sexology, which can be considered “atypical,” is exemplified in his renowned crime novel series, the Judge Dee Mysteries. This series presents authentic detective stories based on the real historical figure of, Di Renjie (狄仁傑) or Judge Dee during the Tang Dynasty of China. Before creating the Judge Dee Mysteries series, van Gulik has already annotated and translated two Chinese crime and judicial literary works, namely Tang Yin Bi Shi (棠阴比事) / Crime and Punishment in Ancient China: T’ang Yin Pi Shih and Wu Zetian Si Da Qi An (武則天四大奇案) / Dee Gong An, Celebrated Cases of Judge Dee. It is van Gulik’s initial annotation and translation of the above two works that inspires his later creation of the Judge Dee Mysteries series, certain case records and plot settings from Tang Yin Bi Shi and Wu Zetian Si Da Qi An are creatively adapted and incorporated into the series. Thus, van Gulik’s later creation can be regarded as a transcreation practice. What van Gulik expects from writing and publishing the Judge Dee Mysteries series is to promote traditional Chinese Gong An (公案) or court-case fiction style worldwide and prove the attractiveness of the underlying Chinese culture (Shi, 2017, p. 222). By combining the writing style of Western popular detective fiction and the potting style of Chinese Gong An fiction, van Gulik successfully represents an imagined vivid society of the Tang Dynasty of China and portrays an impressive Chinese detective image of Di Renjie (Dee Gong) in a transculturally and translingually acceptable way. The underlying techniques of neutralizing cultural dissonance of his transcreation practice are inspiring to global culture’s circulation.
Literary illustrations are proved to be culturally productive in translation, especially in preserving the sense of transcultural continuity (Ferrand, 2014, p. 181) and appealing to audiences (Colombo, 2014, p. 402). There are 189 illustrations in all 17 books of the Judge Dee Mysteries series and Dee Gong An, Celebrated Cases of Judge Dee. All the illustrations are originally drawn by van Gulik himself. Plot-related scenes are mostly depicted in the illustrations, in which various real-life Chinese cultural elements are embedded. Van Gulik’s image writing technique as literary illustrations has visualized and expanded the cultural narrative space of his novels. The Chinese cultural elements that beyond verbal textual description can be adequately presented, assisting readers in recontextualizing and perceiving Chinese culture.
The Chinese Maze Murders is the only book in the Judge Dee Mysteries series that has been translated back into Chinese by van Gulik himself, which means both the English and Chinese version of it can be regarded as the source text. Though the Chinese version is self-translated after writing the English version, it is not a derivative work of the English version. Considering the Chinese cultural theme of this book, it is reasonable to regard the Chinese version as the source text culturally. Since van Gulik has done all the writing, illustrating, and translating works independently, it is representative to investigate multimodal visual-verbal cultural functions based on The Chinese Maze Murders.
Image-Text Multimodality and Translation
In a general sense, a “modality” or “mode” is a means or semiotic resource for meaning making, and the term “multimodality” refers to the combination of different “modes” (e.g., gesture, speech, image, inscription, video, etc.) to generate meaning (Jewitt et al., 2016, pp. 2–3). Every single mode has its distinct possibilities and inherent limitations in generating meaning. What the term “multimodality” highlights is to offset the limitations of a single mode by exploring the synergy of multiple modes interplay. Though verbal modes of language such as speech and writing are most widely studied, especially in linguistics and semiotics, it is premature and ambiguous to consider verbal mode as the most generative and resourceful mode (Jewitt et al., 2016, pp. 17–23). Multimodal studies encourage investigating “multimodal wholes” to reconcile the traditional opposition and imbalance between verbal and non-verbal modes. Image-text combination is a prevalent form of “multimodal wholes” wherein visual grammar, as proposed by Kress and Leeuwen (2021), demonstrates that visual images possess a similar mechanism of meaning making to linguistic texts. This proposition establishes a robust theoretical foundation for subsequent research on multimodal collaborative interaction, and the research on clarifying the interactive relationship between images and texts become crucial reference for exploring cross-cultural translation functions of illustration-text multimodal combination in this study.
Although it is a general knowledge that combining visual images and verbal texts gives rise to a sense of meaning multiplication, we still need to know how it works in detail (Bateman, 2014, p. 7). To figure out how images and texts function synergistically, scholars have contributed much to clarifying the interplaying relations between images and texts. Marsh and Domas White (2003) propose a taxonomy of relationships between images and texts inductively by reviewing and reorganizing findings of related previous studies in different fields. Their taxonomy includes 49 image functions toward texts that are further classified into 11 lower-level types and three higher-level groups. Though Marsh and White’s taxonomy offers a systematic descriptive tool for image-text relations, its inductive nature makes it inappropriate to be used as a theory to explain how image-text combinations function (Martinec & Salway, 2005, p. 339). Martinec and Salway (2005) develop a grammatical descriptive system of image-text semantics relations based on Halliday and Matthiessen’s (2004) logico-semantic and status relations and Barthes (1977) system of anchorage, illustration, and relay. Martinec and Salway’s image-text relations system not only categorizes the relationship between images and texts but also explains the generative mechanism of image-text synergy from the perspectives of functional linguistics and semiotics. Additionally, their system has solid theoretical foundations that objectively demonstrate how images and texts are structured and function, thereby addressing the theoretical insufficiencies in Marsh and Domas White (2003). Following the description and explanation of image-text interaction, Liu and O’Halloran (2009) further explore the cohesive devices between images and texts from the perspective of discourse analysis, providing additional explanations for the underlying reasons behind the semantic extension effects observed in image-text multimodal combinations. Liu (2023), from a socio-semiotic perspective, posits that the image-text relationship forms a continuum with cohesion and tension as two polarities. By manipulating the cohesive relationships between images and texts, diverse yet coherent attitudinal meanings can be generated, thereby elucidating the reasons behind the diversity of meaning forms exhibited in multimodal texts. The above studies delineate the general framework of research on the theory of image-text relationships, progressing from describing the relationship between images and texts to explaining the mechanisms of image-text interaction, and further to utilizing these interaction mechanisms for presenting diversified semantic information. The explicit understanding of the image-text interaction mechanism is an important prerequisite for exploring the cross-cultural translation functions of image-text multimodal combinations.
Methodologically, corpus-based methods are widely employed in multimodal studies, translation studies, and studies involving multimodal translation. It is worth noting that when it comes to parallel comparison between different modalities, the key lies in how to utilize a corpus to describe, annotate, and compare the features of different modalities using a unified standard. In multimodal comparative studies, using textual transcriptions to describe different modal features is a convenient way for modality description and comparison (Taylor, 2013, p. 100). However, detailed textual transcriptions do not meet the requirements of concise and accurate corpus annotation, which hinders the retrieval, statistical analysis, and comparison of modal features. Therefore, the method of modal texts transcription is difficult to apply to corpus-based multimodal translation studies. If it is necessary to utilize a corpus to compare multiple modalities, it is prerequisite to design a tagging form that conforms to the requirements of corpus annotation format and can simultaneously reflect the common characteristics of multiple modalities. Pastra (2008) proposes a multimodal corpus annotation method based on specific semantic connections between modalities. For instance, the content features jointly reflected by a pair of image-text combinations can serve as the corpus tags for this image-text combination. As a result, by using corpus tools, it becomes possible to retrieve the image and text modal information related to this label in one search, thereby achieving the purpose of describing, comparing, and statistically analyzing the characteristics of multiple modalities.
Currently, there is limited study in the field of corpus-based multimodal translation that directly focuses on the phenomenon of image-text interaction in translation. S. Li (2019) applies an image-text multimodal corpus to investigate the translation issues of Chinese dishes in restaurant menus. Her study proposed that a cross-modal translation approach based on images is beneficial in conveying the visual esthetics of dishes that cannot be presented solely through textual translation. However, judging from the research methodology employed in this study, Li did not make specific annotations or statistical analyses of the modal characteristics of the images using the corpus, and the images were only utilized as a component of the corpus to provide evidence for the analysis and conclusions of the textual aspect.
The multimodal studies reviewed above are of critical reference significance to this study in both theoretical and operational aspects. Building upon the existing theoretical research achievements regarding the image-text interactive relation and the logic of Pastra’s (2008) multimodal annotation framework, this study will design and utilize a dedicated illustration-text corpus annotation framework specifically for exploring the cultural translation of The Chinese Maze Murders in a multimodal context.
Methodology
The particular use of multimodal corpus in this study is to provide data evidence for identifying what and how cultural functions are performed by illustration-text interaction in translation. Before conducting further corpus analysis, two pre-requisite tasks need to be completed to facilitate corpus compilation: data collection and processing, as well as the design of the corpus annotation framework.
The multimodal corpus used in this study includes two sets of data: illustrations and texts. These two sets of data were obtained from the original English version of The Chinese Maze Murders (1997) and the self-translated Chinese version of Di Renjie Qi An (2000). The Chinese Maze Murders is the original English version written around 1950 and republished by the University of Chicago Press in 1997 based on its first 1956 publication in Netherlands. Di Renjie Qi An is the self-translated Chinese version written around 1952 and republished by Qunzhong Chubanshe in 2000 based on its first 1953 publication in Singapore. The visual illustrations data includes eight commonly shared illustrations in both English and Chinese version, and the verbal texts data includes all the descriptive texts directly related to each of the eight illustrations. The eight illustrations and their descriptive texts are extracted and saved as picture files and text files for further annotation. UAM Image Tools 2.1 and UAM Corpus Tools 6 are used to annotate the images and texts gathered in this study. Both the two corpus tools are developed by Mick O’Donnell wherein UAM Image Tools is specialize in annotating and indexing image data, and UAM Corpus Tools is specialize in annotating and indexing textual data. What makes the two corpus tools well suited to this study is that there is a same-designed fully customizable built-in annotation layer editor in them, meeting the need for customized manual annotation of cultural elements of the corpus. Additionally, although the annotations for illustrations and texts are performed separately in the two corpus tools, the identical design of two sets of annotation layer editors allows for the sharing of the same annotation framework. In other words, through a unified illustration-text annotation framework, each annotation tag can correspond to a set of associated illustrations and texts. Consequently, the comparative analysis of the illustration-text relationship features in the Chinese and English versions of The Chinese Maze Murders can be conducted in a corpus-based manner.
The critical phase of compiling the multimodal corpus is designing an annotation framework that covers all the recognizable cultural features of illustrations and texts in an integrated way, thus the culture’s representation features of both illustrations and texts can be marked computationally. The fundamental designing logic of this culture-oriented illustration-text annotation framework lies in that mode transformation only affects the effectiveness and impression of culture’s representation, so the signified of Chinese cultural elements are fixed regardless of the variation in their modes of presentation. Within this logical framework, the semantic association or commonality between the illustrations and texts modalities lies in their shared signified of Chinese cultural elements. Therefore, by constructing the annotation framework based on the signified of Chinese cultural elements, both illustrations and texts can be incorporated into a unified corpus comparison framework. Once the annotation framework is constructed, the cultural translation functions of illustration-text interaction in The Chinese Maze Murders can be concluded by comparing the differences in the modalities of presenting various Chinese cultural elements between the two language versions. Identifying and classifying the signified of Chinese cultural elements, that is, types of Chinese cultural elements, in The Chinese Maze Murders are thus a significant step to construct the annotation framework. It is noticeable that van Gulik’s Judge Dee Mysteries series is characterized by abundant descriptions of Chinese cultural materials as van Gulik intended to project his traditional Chinese literati’s cultural tastes into those cultural thematic items in the novels (Wei, 2023, p. 68). Van Gulik’s writing preference of arranging plots with rich Chinese cultural materials has provided fertile ground for the detailed classification of cultural elements in his novels. Shi (2017) and Tang (2017) have contributed to classifying cultural elements in van Gulik’s novels. Shi (2017) categorized the Chinese cultural elements in van Gulik’s works as cultural thematic artifacts, primarily consisting of material cultural elements such as traditional Chinese architecture, religious artifacts, imperial objects, Chinese-style weapons, and everyday items. While Tang (2017) categorizes them as culture-specific items, encompassing a wider range of intangible cultural elements, such as Chinese religious beliefs, unique cultural expressions, modes of address, official systems, and attire. By integrating the classification approaches of Shi (2017) and Tang (2017), and in order to comprehensively reflect the presentation characteristics of cultural elements in both illustrations and texts, the illustration-text annotation framework in this study will be constructed based on two major categories: material cultural elements and non-material cultural elements found in The Chinese Maze Murders.
A further modified and customized cultural elements typology exclusive to The Chinese Maze Murder is proposed as the corpus annotation framework based on the previous typologies of cultural elements in van Gulik’s novels, as shown in Figures 1 and 2. To facilitate tagging and statistic efficiency, the annotation frameworks for texts and illustrations are separately designed and applied. However, they can be combined as a whole annotation system as they share identical structures and classification principles, and they are essentially two parts of the general annotation system of cultural elements in The Chinese Maze Murder. All the cultural elements are categorized into two major types: material and non-material. Further subcategories are extended based on these two major types. Material cultural elements refer to those tangible and concrete cultural objects in real life, and non-material cultural elements refer to those intangible and abstract cultural concepts or expressions. Two adjustments are made to reflect the distinct representation individuality of texts and illustrations as far as possible without damaging the integrity and consistency of the whole annotation system. The first one is the extension of non-material type in texts because texts convey various non-material cultural elements while illustrations are incapable of conveying them. The second one is the addition of more classifications and subcategories to material type in illustrations because illustrations show material cultural elements in a more detailed way. Categories and subcategories of the framework are set in a consistent and unified format, which is in full compliance with the criterion of a commonly applicable annotation framework for both illustrations and texts.

Annotation framework of culture elements in text.

Annotation framework of cultural elements in illustration.
The overall corpus is annotated and proofread manually in accordance with the above annotation framework. Notably, only clearly recognizable cultural elements in illustrations are tagged to avoid ambiguity when compared with their corresponding text equivalents. The whole multimodal corpus includes three sub-corpora: corpus of English texts (CET), the corpus of Chinese texts (CCT), and the corpus of illustrations (CI). Table 1 shows the basic information of the multimodal corpus. Through further data analysis of tag sets in this corpus, the seemingly incommensurable visual illustrations and verbal texts become comparable, and their interaction can thus be explored. Due to the natural requirements of contrastive analysis, the Chinese version is regarded as the source text and the English version is regarded as the target text considering the culture’s subjectivity, meaning that the Chinese version is a benchmark for the analysis even though the Chinese version was written later than the English version.
Basic Information of the Multimodal Corpus.
Corpus Analysis
Before any further data analysis, it is necessary to clarify the kinds of mode relations between illustrations and texts in The Chinese Maze Murders. Based on the reviewed generally accepted taxonomies of image-text relations in multimodal discourse, the illustration-text relation in The Chinese Maze Murders can be best defined positionally and logico-semantically according to Martinec and Salway’s arguments. Positionally, the illustrations and texts are independently distributed with equal status in all van Gulik’s novels, that is, each of the two modes narrates independently and does not affect the reading comprehension adequacy of one another, and the novel is still complete if there are no illustrations. However, they are coherent with each other by a shared topic and together create a larger multimodal subject. Logico-semantically, the illustrations and texts in The Chinese Maze Murders show a relation of expansion. More specifically, they are related by two subtypes, extension and enhancement, of the expansion. Illustrations and texts in The Chinese Maze Murders always add new or related information to each other and modify each other circumstantially, for example, textual narration extends the background story of illustrations and illustrations enhance the text by place, character appearance, environment, etc. In Pastra’s (2008) framework, their relations can also be defined as equivalence, that is, illustrations and texts convey equivalent information in different forms that can be literal or figurative equivalent to each other. Recognizing the natural relations between illustrations and texts in The Chinese Maze Murders is inspiring for grasping potential realization ways of their multimodal cultural functions in translation.
In general, the motivation behind multimodal design stems from the author’s literary creative techniques. However, this study does not primarily aim to explain the reasons behind the specific presentation of illustrations and texts in their given form. Instead, this study will concentrate more on their cultural effects in translation, thus the following corpus analysis is a function-oriented description of the multimodal phenomenon in translation.
Table 2 is the selected statistics of the annotated multimodal corpus that shows cultural elements frequencies of two main categories. As Table 2 illustrates, the English version conveys more material cultural elements and fewer non-material cultural elements than the Chinese version in a significant way. What can be identified from such a significant variance is that inherent language distinguishing qualities place restrictions on forms of representing cultural elements, especially on language-related cultural concepts and expressions. For the English version, most non-material Chinese cultural elements cannot be equivalently represented because language signifiers in English cannot signify the invisible signified in Chinese. In contrast, representing material cultural elements in English is feasible so long as there are visible and describable signified in real life. Though the English version cannot fully represent all non-material cultural elements of the Chinese version, it can expand material cultural elements with more details. Moreover, illustrations contribute much to reconciling the difference in cultural elements allocation between the two language versions.
Selected Statistics of the Multimodal Corpus.
Illustrations play a more effective role in the English version than in the Chinese version. In the Chinese version, illustrations mainly perform esthetic functions, while illustrations in the English version mainly perform information-enhancing functions. Chinese language readers do not need illustrations to assist their cultural understanding. Chinese cultural elements can be easily associated and perceived with only textual descriptions in the Chinese version. For Chinese language readers, illustrations are more of esthetic embellishments that strengthen the reader’s sense of reading immersion. In contrast to Chinese language readers, it is challenging for English language readers to grasp Chinese cultural elements in text form, especially concepts of Chinese religion and beliefs, place and architecture’s name, and some culture-related language expressions. For English language readers, illustrations act as visualized glosses that provide complementary information to help them reinforce their cultural recognition from text reading, for example, some Chinese religious concepts are converted into equivalent visible material objects, and special architectures can be depicted with more visible details beyond text description. Even if there are still some Chinese cultural elements that cannot be illustrated, the vivid material scenes that have been depicted in illustrations could stimulate English language readers’ subjective cultural imagination, enhancing their cultural understanding in another way. To sum up, illustrations and texts function differently and their combination also performs different cultural functions to cater to different language readerships, and such combinations appear more practical in cultural meaning transmission for English language readers. The following section will further analyze the cultural functions of illustration-text combinations that facilitate dissolving Chinese-English cultural incommensurability and how they are realized in detail with multimodal corpus-based examples.
Transforming Abstract to Concrete
Statistics of the multimodal corpus have shown that the English version conveys more material cultural elements and fewer non-material cultural elements in texts, suggesting that there is an abstract-concrete transformation of culture’s representation in text and illustrations contribute to bridging the meaning gap resulting from the different allocation of cultural elements between the English version and the Chinese version. Thus, the fundamental cultural function of illustration-text combination in the English version is switching representing forms of various Chinese cultural elements from abstract to concrete with minimized cultural meaning loss. The switching process is realized by a three-phase operation. Firstly, abstract non-material cultural elements in Chinese, mostly null-equivalent in English, are textually “translated” (not source-target translation in the strict sense) and transformed into concrete material cultural elements in English to make them acceptable enough to meet English language norms, avoiding clashes resulted from uncontrollable inherent language differences. Then, illustrations highlight the transformed as well as other material cultural elements in English by visualized images of detailed object depictions to stabilize their cultural signified. Lastly, illustrations provide additional complementary visual information for those remaining non-material cultural elements in English to mitigate cultural meaning comprehension barriers caused by the textual translation. Through the implementation of the three-phase multimodal switching operation, the representation of culture is effectively managed in cross-language contexts.
Example 1 exemplifies how this cultural function works, and the cultural elements are highlighted with underlines.
Example 1.
Chinese texts: ……那时他正画着一个
English text: ……working on a picture representing the Black Judge of the Nether World……It were all pictures of Buddhist saints and deities. The goddess Kwan Yin was very well represented, sometimes single, sometimes with a group of attendant deities.
Example 1 describes the scene of Judge Dee’s first meeting with Woo Feng the suspect and painter, and Figure 3 is the illustration that depicts it. In the Chinese version, “阎罗王,”“神佛,”“观音菩萨” and “菩萨” are all cultural concepts of Chinese religion and folklore, and they are represented as abstract non-material cultural elements in the text. While in the English version, abstract non-material cultural elements such as “阎罗王” are “translated” and transformed into concrete material cultural elements of “Chinese painting representing the Black Judge of the Nether World” instead of being represented as individual abstract cultural concepts. In other words, “drawing about the Black Judge of the Nether World” is different from “working on a picture representing Black Judge of the Nether World” as the former is centered on “Black Judge of the Nether World” and the latter is centered on “picture.” For English language readers, recognizing a concrete material Chinese-styled “picture” would be easier than recognizing an abstract null-equivalent Chinese folklore concept in English, so the reader’s reading fluency will not be damaged even if they have trouble perceiving “阎罗王.”“A picture representing the Black Judge of the Nether World” and “pictures of Buddhist saints and deities” are represented as material cultural elements in English, and the illustration highlights them with visual images. As Figure 3 illustrates, Judge Dee on the left is looking at paintings on the wall, and Woo Feng the suspect and painter on the right is painting. The paintings on the table and the paintings on the wall show readers what “Black Judge of the Nether World” (阎罗王) and “Buddhist saints and deities” (神佛) look like respectively, visually enhancing the reader’s cultural comprehension by visible cultural signified. The visual enhancement of material cultural elements not only specifies the signified of textual description but also evokes the reader’s cultural schematic construction of the story scene. “Goddess Kwan Yin” (观音菩萨) and “a group of attendant deities” (菩萨) are kept in their original form of non-material elements in English, and their figures are sketched in the illustration. Though the sketches of “Goddess Kwan Yin” and “a group of attendant deities” in the illustration are simple line drawings, they still provide enough additional visual information for readers to substantialize these two non-material cultural elements in mind. With the synergy of illustration-text combination, cultural elements in example 1 are well represented in English based on the three-phase switching operation, and possible cultural meaning recognition gaps caused by purely textual translation are bridged by illustrations to the utmost extent.

Judge Dee in Woo Feng’s Studio.
Integrating Cultural Elements
The inner cohesion between cultural elements in translated texts could be hard to grasp for English language readers who are not familiar with Chinese cultures, especially when there are many material cultural elements being densely described. As Table 2 shows, the English version conveys much more specific material cultural elements than the Chinese version, which could lead to the problem of identifying material cultural element clusters effectively. A material cultural element cluster refers to a group of material cultural elements that together modify the same subject, and readers could be confused about the modified subject if they have trouble seeing the cultural bond between elements inside the cluster. What illustration-text combination facilitates is stabilizing every single cluster in texts to a fixed part in illustrations, and readers will get the characteristics of the modified subjects visually even if they could not discern the inner cohesion of the densely distributed cultural elements within a cluster in texts. As Table 3 shows, the distributions of environment-related material cultural elements in the two versions are highly different in frequency. Example 2 exemplifies how illustration-text multimodal combination integrates the exceeded material cultural elements in the English version and makes them distinct “meaning wholes.” The cultural elements are highlighted with underlines.
Statistics of “Environment” of Material Cultural Elements.
Example 2.
Chinese text: ……又把书桌上的
English text: Then Judge Dee looked at
Example 2 describes the scene of Judge Dee’s investigation of General Ding’s crime scene, and the Figure 4 is the illustration that depicts it. The short text description of the desk set is densely composed by material Chinese cultural elements. All the seven cultural elements underlined in the English text make up a cultural element cluster that modifies the typical traditional Chinese writing desk. Though “set of writing materials,”“bamboo brush holder,”“stone slab,”“red porcelain water container,”“cake of ink,” and “stand of carved jade” are all represented as material cultural elements, integrating them in mind as a “meaning whole” is still challenging for English language readers because they possess no experience of using these items in daily life. Thus, the inner cohesion of these elements is lost in text translation. To complement the lost cultural information, the desk with seven cultural elements is visualized by the illustration, directly showing readers how these cultural elements are arranged together by their inner cohesion as well as what the modified subject looks like in a proper layout. The desk in the illustration groups all seven cultural elements and forms a “meaning whole” that not only visually highlights the appearance of every material cultural element within it but also integrates seemingly cluttered cultural elements in text in a culturally reasonable and meaningful way. In other words, the cultural function of integrating cultural elements as independent “meaning wholes” performed by illustration-text multimodal combination is essentially a visual enhancement of cross-lingual cultural experience transmission.

Judge Dee in General Ding’s Library.
Visualizing Cultural Implications
Texts usually convey cultural implications beyond words, and such cultural implications are too subtle to be noticed by English language readers. Cultural implications of The Chinese Maze Murders are associated with certain cultural elements, and they can be perceived as cultural artistic conceptions, cultural intertextuality, cultural atmosphere of certain scenes, implicit cultural factors that promote plot development, overall cultural style of the imagined Tang-dynasty society, etc. For Chinese language readers, textual descriptions are capable enough to evoke their perceptions of cultural implications because they are fully aware of the associative meaning of every single cultural element. While this is hard to realize for English language readers as their cultural association cannot be effectively stimulated based on their incomplete recognition of cultural elements in texts. For example, there is a plot that Woo Feng the suspect and painter says that Judge Dee could be set as a model for painting “Black Judge of the Nether World,” and hence Judge Dee’s lieutenant Sergeant Hoong is furious with him. Knowing what “Black Judge of the Nether World” looks like is a prerequisite to a better understanding of this plot, and it could be confusing for English language readers if there are no illustrations as shown in Figures 3 and 5. As the illustration shows, Judge Dee’s official dress and winged cap make him look like the “Black Judge of the Nether World” in Woo Feng’s painting, which explains Woo Feng’s satire. Though the appearance of Judge Dee as an official is already depicted in texts, it is hard to imagine his appearance based on merely textual depiction for English language readers as discussed in the last section. With the complimentary visual information provided by illustrations can English language readers better perceive the cultural implications of “Black Judge of the Nether World” and its implicit connection with plot developments.

Judge Dee’s first appearance as official.
As the corpus statistics in Table 2 show, there are few original cultural expressions and language styles of non-material cultural elements that have been kept in the English version, which is largely resulted from the inherent language differences. Though these language-related cultural elements have been transformed into material cultural elements as much as possible to enhance the readability of the English version, their literal-evoked associative meanings are lost in the transformation. A representative example is shown in Figure 6 of a Chinese landscape painting called “Bowers of Empty Illusion” (“虚空楼阁” in Chinese), which is crucial to the development of the story. This painting is depicted by a series of “four-character idioms” in the Chinese version, while the English version deconstructs these “four-character idioms” into individual words, erasing their cultural associations and original stylistic meaning. Fortunately, illustrations complement that with abundant visual information. The painting in the story is meant to represent typical characteristics of traditional Chinese landscape painting and reflects the artistic conception of the philosophical spirit of Chinese Taoism. More specifically, the illustration first vividly portrays the distinctive features of Chinese traditional painting, which are characterized by freehand brushwork with bold outlines, in stark contrast to the realistic traditions of the Western painting. Then, it depicts a visible scenery with Taoism conception by its contents of mountains, clouds, and pinewoods. For English language readers unfamiliar with Chinese painting art and philosophical traditions, the illustration provides them with enough visual stimulation to comprehend or imagine the cultural implications of the text description of the painting and further grasp the character image of the painting’s owner Governor Yoo. The cultural function of visualizing cultural implications is commonly performed by illustration-text multimodal combinations in the novel, compensating a lot for the meaning lost in text translation.

A Chinese landscape painting.
Conclusion
This study has investigated culture’s representation in The Chinese Maze Murders from a perspective of cultural functions of illustration-text multimodal combination and interaction in a multimodal corpus approach. The research results firstly reveal that there is a quantitative pattern of cultural elements form transformation in translation based on corpus statistics, and then indicate that there are three fundamental cultural functions of illustration-text multimodal combination that facilitate culture’s representation in translation. Transforming abstract cultural elements into concrete cultural elements, integrating cultural elements as “meaning wholes,” and visualizing cultural implications are the three fundamental cultural functions of illustration-text multimodal texts of The Chinese Maze Murders. Also, these three fundamental cultural functions can be applied to further cultural multimodal translation practice. To sum up, this study attempts to testify the possibility of simultaneously corpus indexing images and texts based on a unified set of annotation framework to study cultural multimodal translation, and it turns out to be practicable. It is noteworthy that multimedia development provides fertile ground for exploring new forms of translating cultures. Culture’s translation is encouraged to be studied from more perspectives beyond traditional purely textual analysis.
Footnotes
Author Contributions
Lin Deng is a Ph.D. candidate in Translation Studies of the School of International Studies at Shaanxi Normal University, China. His research interests include multimodal translation studies, corpus-based translation studies, translation of Chinese classics, and cross-cultural translation.
Culture’s Representation in Van Gulik’s Transcreated Novel The Chinese Maze Murders: A Multimodal Corpus Approach
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
Data Availability Statement
Data sharing not applicable to this article as no datasets were generated or analyzed during the current study.
