Environment-Based Prompt Framework for AI-Generated Façade Design

Abstract

As automated retail environments continue to expand globally, their façade design has emerged as a critical factor influencing consumer perception, psychological comfort, and visual appeal. At the same time, current design practices often prioritize technological efficiency over visual wellness. This study proposes a generative AI-assisted design methodology grounded in the Environment-Based Design (EBD) framework. The approach emphasizes visual dimensions of the WELL Building Standard and integrates biophilic design principles to enhance façade aesthetics in automated retail contexts. This study has four research objectives: (1) to extract WELL-aligned visual design variables for façade design through literature, certification mapping, and case analysis; (2) to develop structured prompt strategies and ControlNet-based image generation workflows using Stable Diffusion XL; (3) to evaluate perceptual outcomes through Learned Perceptual Image Patch Similarity (LPIPS) metrics and expert scoring across wellness-relevant dimensions; and (4) to analyze design trade-offs and limitations, and identify opportunities for recursive improvement within the EBD framework. Eight façade images, consisting of the original and seven AI-generated variants, were evaluated by five experts using a 7-point Likert scale across five perceptual criteria. The results show that the "Material + Pattern" strategy received the highest ratings in perceived material quality and natural features; "Color + Material + Pattern" showed the most balanced overall performance. Perceptual similarity was quantitatively assessed using LPIPS, confirming that multidimensional interventions led to greater visual deviation from the original design. Expert comments emphasized the warmth and affinity created by natural textures, while cautioning against excessive decorative complexity. Open-ended feedback was subjected to thematic analysis, revealing nuanced perceptions of design richness, comfort, and realism. 
This study demonstrates the feasibility of operationalizing health-focused visual principles within an AI-assisted design pipeline. The proposed approach offers a scalable and reproducible method for enhancing the emotional and aesthetic quality of automated retail façades. Future research should extend the scope of visual dimensions, including form, signage clarity, and transparency, and incorporate multimodal user experience evaluations to better reflect real-world engagement.

Keywords

prompt engineering; AI-generated design; Environment-Based Design (EBD); façade design; automated retail store

Introduction

Background

As automated retail stores spread worldwide, their spatial environment and visual design have become an important topic of public concern. These cashierless, self-checkout facilities employ artificial intelligence, sensor technologies, and advanced automation systems to provide continuous 24-hour service with minimal human intervention (Nam et al., 2025). Despite their significant operational and technological advantages, the architectural design of such stores, particularly the façade, is often neglected or oversimplified, resulting in environments that lack visual appeal, material expression, and psychological comfort (Majid, 2022). Prior research by Yun et al. (2024) has confirmed that the integration of biophilic elements in automated stores positively affects visual attention and emotional response, as demonstrated through eye-tracking experiments and self-reporting methods. However, those findings were largely perceptual and lacked implementation through generative design strategies.

As contemporary design thinking places increasing emphasis on spatial quality and user well-being, health-oriented design and wellness-focused frameworks such as the WELL Building Standard provide structured principles for improving both physical and psychological aspects of the built environment. Developed by the International WELL Building Institute (IWBI), the WELL v2 standard comprises ten core concepts that target air, water, light, sound, materials, and mind, among others, to promote holistic health in architectural spaces (IWBI, 2020). Although primarily applied to healthcare, workplace, and residential environments, specific principles pertaining to visual comfort, natural connection, and material clarity are equally relevant to the façade design of automated retail facilities. This trend reflects a broader shift in architecture and facility planning, where built environments are increasingly recognized as key determinants of human health, safety, and well-being (Marberry et al., 2022). However, despite the growing application of these standards in architectural practice, their integration into automated retail environments remains limited.

At the same time, recent advances in image-generative AI, such as Stable Diffusion XL with ControlNet, have created new opportunities for rapid visual exploration in architectural ideation (Cao et al., 2025; Podell et al., 2023). Yet, despite the acknowledged relevance of WELL and biophilic principles, few studies have systematically translated these frameworks into façade design strategies for automated retail stores. Architects and designers lack structured workflows that integrate wellness frameworks, generative design tools, and perception-based evaluation, creating a methodological gap in health-centered façade design research (Ali & Lee, 2023; Sourek, 2024).

To address this gap, this study proposes an Environment-Based Design (EBD) framework that integrates WELL and biophilic principles into a generative façade design strategy. Building on selected visually oriented criteria—color, material, and pattern—derived from WELL guidelines and biophilic design concepts (Kellert & Calabrese, 2015), this study applies the EBD logic in three phases: 1) extracting WELL-aligned visual variables; 2) constructing AI prompts for generative design; and 3) conducting expert-based perceptual evaluation (Zeng & Cheng, 1991).

The aim of this study is to establish a health-centered, AI-assisted façade design workflow by translating visual aspects of WELL and biophilic design concepts into generative parameters and evaluation structures for automated retail environments.

Research Objectives

To extract WELL-aligned visual design variables for façade design through literature, certification mapping, and case analysis.

To develop structured prompt strategies and ControlNet-based image generation workflows using Stable Diffusion XL.

To evaluate perceptual outcomes through LPIPS metrics and expert scoring across wellness-relevant dimensions.

To analyze design trade-offs and limitations, and identify opportunities for recursive improvement within the EBD framework.

Research Questions

RQ0 (Integrative): How can WELL and biophilic design principles be operationalized—within an EBD framework—into promptable façade variables that guide generative AI toward health-aligned, attractive concepts for automated retail stores?

RQ1: How can health-oriented façade design principles (WELL + biophilic) be translated into core visual variables suitable for AI-driven design of automated retail stores?

RQ2: What are the visual impacts of AI-generated façade variations under different prompt strategies (e.g., Color, Material, Pattern)? How can they be evaluated?

RQ3: What trade-offs and perceptual patterns emerge when combining multiple design variables? How can they inform recursive prompt adjustments in future EBD-guided workflows?

RQ4: What methodological limitations exist in using expert-based and AI-driven workflows for WELL-aligned design? How can future research address these limitations?

Literature Review

The Rise of Automated Retail and Spatial Experience Issues

Automated retail stores, which operate without human staff using digital interfaces and AI systems, have evolved into diverse forms such as smart convenience stores, vending modules, and uncrewed supermarkets (Nam et al., 2025). These systems optimize transaction efficiency and labor costs, but often embody a “design vacuum”—a lack of human-centered aesthetic consideration (Majid, 2022). Most façades emphasize branding or technological functionality, often with repetitive, industrialized visual expressions (Jo et al., 2024).

In a recent experiment, Nam et al. (2025) found that consumer preferences for uncrewed stores are significantly influenced not only by operational factors, but also by environmental qualities such as spatial aesthetics and perceived safety. These findings suggest that physical design remains a crucial factor in shaping user acceptance, even in highly automated environments.

Meanwhile, architectural studies have revealed that spatial cues such as natural light, texture, and form are critical in establishing environmental legibility and emotional response. However, in automated retail design, the literature has focused on interior layout, user-device interaction, and UX/UI optimization (Kim & Lee, 2021), at the expense of façade-related perceptual and psychological concerns.

Recent studies have emphasized the importance of physiological and behavioral responses in evaluating spatial environments. For instance, Kim and Kim (2022) applied biometric tools to quantify emotional responses to architectural stimuli, offering a data-driven perspective on spatial affect. Kim and Lee (2021) assessed consumer attention and arousal using eye-tracking technology in virtual retail environments, demonstrating how visual cues shape user perception and engagement. Similarly, Kim (2024) employed VR-based eye-tracking to capture initial gaze attraction in branded spaces, highlighting the influence of spatial composition on early user attention. These studies reinforce the value of integrating perceptual data into the evaluation of visual design features in commercial and retail spaces.

Environment-Based Design (EBD) in Architectural Reasoning

EBD did not emerge as a formal methodology until 2011 (Zeng, 2011), but its theoretical foundation was laid earlier in Zeng and Cheng's (1991) proposal of recursive logic as the logic of design. Unlike deductive or inductive reasoning, recursive logic frames design as a process in which problems, solutions, and knowledge evolve simultaneously, and always in relation to the environment (Zeng, 2002).

Building on this foundation, Zeng (2011) introduced EBD as a structured methodology premised on the idea that “design starts from the environment, functions for the environment, and brings changes to the environment.” This orientation positions the environment not as a backdrop but as both the source of inspiration and the locus of transformation.

EBD operationalizes this orientation through a recursive cycle of environment analysis, conflict identification, and solution generation, with each new solution immediately reintegrated into the environment for subsequent iterations until no further undesirable conflicts remain. Asking questions is the core of the EBD methodology (Wang & Zeng, 2009), the primary mechanism through which designers probe the environment to elicit hidden requirements, uncover implicit constraints, and generate the knowledge necessary to identify and resolve conflicts. By systematically asking both generic and domain-specific questions, designers expand and refine their understanding of the environment, ensuring that subsequent solutions are grounded in contextual realities rather than abstract assumptions. Finally, Zeng (2015) characterized this process as an environmental evolution, highlighting the co-development of problems, solutions, and knowledge as the environment changes.
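The recursive cycle described above can be sketched in a few lines of Python. This is only an illustrative reading of the analysis, conflict identification, and solution generation loop, not an implementation taken from the EBD literature. The function names (`ebd_cycle`, `find_conflicts`, `resolve`), the dictionary representation of the environment, and the toy façade attributes are all assumptions introduced here.

```python
from typing import Callable, Dict, List, Tuple

Environment = Dict[str, str]  # hypothetical representation: attribute -> current state


def ebd_cycle(
    environment: Environment,
    find_conflicts: Callable[[Environment], List[str]],
    resolve: Callable[[Environment, str], Environment],
    max_iterations: int = 10,
) -> Tuple[Environment, List[str]]:
    """Repeat analysis -> conflict identification -> solution until no conflicts remain."""
    resolved: List[str] = []
    for _ in range(max_iterations):
        # Environment analysis and conflict identification.
        conflicts = find_conflicts(environment)
        if not conflicts:
            break  # no undesirable conflicts remain
        # The new solution is reintegrated into the environment for the next pass.
        environment = resolve(environment, conflicts[0])
        resolved.append(conflicts[0])
    return environment, resolved


# Toy example with invented values: desired states for two facade attributes.
TARGETS = {"color": "nature-inspired palette", "material": "wood texture"}


def find_conflicts(env: Environment) -> List[str]:
    """A conflict is any attribute whose state differs from its target."""
    return [key for key in TARGETS if env.get(key) != TARGETS[key]]


def resolve(env: Environment, key: str) -> Environment:
    """Return a new environment in which one conflicting attribute is resolved."""
    return {**env, key: TARGETS[key]}


final_env, resolved = ebd_cycle(
    {"color": "industrial grey", "material": "bare metal"}, find_conflicts, resolve
)
print(final_env)
print(resolved)
```

In this toy setting each iteration resolves a single conflict, so the loop makes the "one solution, then re-analyze" character of the cycle explicit; a real application would replace the target lookup with designer questioning and evaluation.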

In the context of automated retail design, where human presence is limited and operations are predominantly mediated through digital interfaces, the EBD framework lays a compelling foundation for the reintroduction of environmental quality and user perception into architectural considerations. By emphasizing environmental legibility, material articulation, and affective response, EBD encourages designers to reconceptualize façades not merely as branding surfaces, but also as perceptual interfaces that mediate the relationship between users and spatial environments. In this study, three visual dimensions (color, material, and pattern) are the key perceptual mediators within this framework.

To consolidate the theoretical foundation, these dimensions can be explicitly linked to constructs from the EBD framework. EBD emphasizes the translation of environmental cues into cognitive representations that guide design reasoning (Zeng, 2015). According to this view, color has been shown to influence emotional states and stress regulation (Küller et al., 2006), reflecting EBD's concern with affective responses. Pattern relates to biophilic principles of complexity and order, aligning with EBD's recursive reasoning about environmental coherence (Yang et al., 2023). Material corresponds to the sensory attributes of the built environment, serving as inputs that shape perception and design decisions (Nguyen & Zeng, 2012; Yang et al., 2022). These links clarify the integration of visual dimensions within EBD while suggesting that future research may extend this mapping to additional WELL variables.

Furthermore, the EBD approach demonstrates conceptual alignment with generative design workflows, especially those characterized by iterative visual output and evaluation processes. In both frameworks, the design evolves through a recursive feedback cycle in which environmental requirements or design objectives are translated into visual representations, subsequently assessed and refined based on contextual fit. This structural correspondence underscores the applicability of EBD as a theoretical foundation for guiding the formulation of design instructions and the evaluation of visual outcomes in computational façade design practices.

Biophilic Design and WELL Building Standard

Biophilic design emphasizes humans’ innate tendency to connect with nature by integrating natural patterns, materials, and spatial forms into the built environment (Kellert & Calabrese, 2015). Numerous empirical studies have demonstrated that biophilic elements can reduce stress, support attention restoration, and enhance users’ spatial identity. These effects are especially valuable in high-frequency commercial environments, where functional efficiency often takes precedence over psychological comfort. Recent neuroarchitecture research has substantiated these findings. For instance, Kim and Gero (2022) showed that biophilic features can elicit measurable neurophysiological responses, reinforcing the relevance of biophilic principles in wellness-oriented architectural frameworks. Similarly, Jung et al. (2023) demonstrated in a virtual reality hospital patient room that introducing biophilic design elements such as plant walls and digital nature walls improved users’ emotional state. Questionnaire results indicated that plant walls reduced negative affect, while digital nature walls enhanced positive affect. EEG analysis revealed that biophilic design increased relaxation-related low-frequency activity and decreased tension-related high-frequency activity. These findings provide convergent psychological and neural evidence for the positive impact of biophilic interventions on human well-being (Jung et al., 2023).

The WELL Building Standard integrates biophilia into multiple categories, including Mind, Light, Materials, and Biophilia I/II, highlighting its value in promoting visual wellness and user well-being (IWBI, 2020; Tabassum & Park, 2024). WELL's Biophilia I feature encourages the integration of nature through environmental elements and patterns; Biophilia II promotes a deeper, sustained human-nature connection.

Although WELL and biophilic strategies have been widely adopted in residential and workplace contexts (Kim & Park, 2025), their application to retail environments, particularly automated stores, remains underexplored. Recent bibliometric reviews indicate that WELL-related research is heavily skewed toward office and residential buildings, with commercial sectors receiving much less attention (Kokatnur et al., 2025). This issue is important because spatial design in these settings has a direct impact on user behavior, attentional focus, and emotional responses.

Recent research also highlights the importance of user-centered perception and affective experience in spatial design. Biometric-based methods are increasingly employed to measure emotional responses to architectural stimuli (Kim & Kim, 2022), offering insights beyond superficial aesthetics. These methods support the operationalization of WELL and biophilic concepts not only as theoretical ideals but also as practical design tools. In parallel, recent studies have explored how text-to-image generative AI tools can capture and reinterpret the visual language of biophilic design, offering new expressive possibilities in AI-assisted workflows (Thampanichwat et al., 2025).

AI-Assisted Design and Façade Generation Tools

Recent advances in generative artificial intelligence have introduced new opportunities for architectural design, particularly through image-to-image (img2img) generation workflows. Among these, Stable Diffusion XL (SDXL) has emerged as a powerful open-source tool that enables high-quality visual outputs guided by both textual and visual inputs (Cao et al., 2025). In contrast to closed commercial platforms such as Midjourney, DALL·E, and Imagen, SDXL supports extensible control modules like ControlNet and provides transparent parameterization, which is advantageous for reproducibility and method disclosure in architectural research. When paired with ControlNet, SDXL facilitates structure-preserving img2img workflows particularly suitable for façade contexts where geometric fidelity must be maintained (Podell et al., 2023).

AI approaches to façade design can be compared along several categories. GAN-based methods have been applied to façade and style generation, producing compelling transformations but requiring curated datasets and extensive training, while offering limited semantic controllability (Wang et al., 2022). Commercial diffusion tools such as Midjourney deliver high-quality visualizations and rapid outputs but operate as closed systems, restricting access to intermediate control signals and limiting their integration with wellness-oriented semantics (Jo et al., 2024; Kim & Park, 2025). Parameter-efficient fine-tuning techniques like LoRA provide lightweight adaptation and allow style-specific training with small data requirements; however, they mainly inject stylistic priors and do not independently guarantee geometry preservation or the mapping of certification standards into visual outputs (Petráková & Šimkovič, 2023; Ali & Lee, 2023). In contrast, the SDXL + ControlNet configuration used in this study enables both localized control and transparent prompt-level manipulation of WELL-aligned visual dimensions (Color, Material, Pattern), offering methodological clarity and semantic rigor that other approaches lack.

These capabilities are well-suited for early-stage design ideation. Designers can utilize text- or image-guided diffusion models to explore stylistic variations and reinterpret architectural elements with improved semantic fidelity, as demonstrated by recent advancements in semantic image synthesis (Ali & Lee, 2023; Petráková & Šimkovič, 2023; Wang et al., 2022). Recent studies have also explored the integration of local identity and biophilic design principles into the generative pipeline, thereby enhancing the cultural and psychological relevance of AI-generated façades (Jo et al., 2024; Kim & Park, 2025; Thampanichwat et al., 2025). However, few investigations have aligned these outputs with wellness certification systems such as the WELL Building Standard—an area addressed by this study.

Beyond image generation, researchers have examined how these tools affect design thinking and creativity. Veloso (2025) discusses the use of multimodal large language models and precedent-based reasoning in architectural education, reflecting a growing shift toward collaborative and adaptive design paradigms (Jun & Jia, 2025).

Evaluation methods for generative outcomes are also advancing. LPIPS is widely used to measure visual deviations from baseline designs, while expert-based evaluations provide valuable insight into experiential quality and conceptual fit (Sourek, 2024). Recent studies assess the consistency and usability of synthetic datasets produced by Stable Diffusion (Stöckl, 2023), supporting the development of reliable, quantifiable metrics for AI-assisted façade design.

Taken together, these comparisons clarify the rationale for the method selection in this paper. GAN-based approaches demand high training costs and offer weak semantic alignment, Midjourney provides closed but visually appealing outputs, and LoRA improves adaptability yet focuses mainly on stylistic adaptation. Against this backdrop, SDXL + ControlNet emerges as the most proportionate and reproducible choice for early-phase façade ideation under WELL/EBD constraints, ensuring that generative outputs are both visually compelling and aligned with wellness- and biophilia-oriented design logic.

Methodology

This study adopts the EBD theory (Zeng, 2011; Zeng & Cheng, 1991) as the guiding methodological framework to develop an AI-assisted, health-oriented façade design process for automated retail stores. EBD emphasizes designing from, for, and to the environment—advocating a recursive logic that connects environmental inputs, design intentions, and feedback outcomes in a dynamic loop. Unlike function-based or purely formal approaches, EBD begins by analyzing and modeling environmental conditions across three domains—human, built, and natural—and transforms them into operable design requirements from the earliest stage.

In the context of automated retail stores—typically characterized by minimal human presence and utilitarian aesthetics—façade design often lacks emotional resonance, natural integration, and perceptual appeal. To fill this gap, this study begins with WELL Building Standard principles and translates selected features into actionable visual dimensions (Color, Material, Pattern). These dimensions serve as the basis for semantic prompt construction, guiding generative design via Stable Diffusion and ControlNet. The aim is to visualize biophilic design intentions within technically constrained, high-frequency retail façades.

To establish a recursive and closed-loop design logic that progresses from environmental intention through AI-generated expression to perceptual evaluation, this study adopts a three-phase methodological structure.

Visual Dimension Framework Development

Through literature synthesis, WELL feature analysis, and representative case studies, this phase identifies three perceptually salient design dimensions—Color, Material, and Pattern. These dimensions form the semantic control structure for constructing prompts in the AI generation stage.
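As an illustration of how three dimensions can act as a semantic control structure, the sketch below enumerates every non-empty combination of Color, Material, and Pattern and assembles a prompt for each. Three dimensions yield seven combinations, which matches the number of AI-generated variants reported in the abstract; whether the study's variants correspond exactly to these combinations is an assumption here. The base prompt and the phrase attached to each dimension are hypothetical placeholders, not the prompts used in the study.

```python
from itertools import combinations
from typing import Dict

# Hypothetical prompt fragments; the study's actual prompt wording is not reproduced here.
DIMENSION_PHRASES: Dict[str, str] = {
    "Color": "calming nature-inspired color palette",
    "Material": "natural wood and textured surface finishes",
    "Pattern": "rhythmic biomorphic facade pattern",
}

# Assumed base description shared by all variants.
BASE_PROMPT = "automated retail store facade, street view"


def build_strategies(phrases: Dict[str, str]) -> Dict[str, str]:
    """Return {strategy label: prompt} for every non-empty combination of dimensions."""
    names = list(phrases)
    strategies: Dict[str, str] = {}
    for size in range(1, len(names) + 1):
        for combo in combinations(names, size):
            label = " + ".join(combo)
            prompt = ", ".join([BASE_PROMPT] + [phrases[d] for d in combo])
            strategies[label] = prompt
    return strategies


strategies = build_strategies(DIMENSION_PHRASES)
for label, prompt in strategies.items():
    print(f"{label}: {prompt}")
```

Keeping each dimension as a separate fragment means a single phrase can be revised after evaluation without touching the others, which suits the recursive adjustment that EBD calls for.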

AI Image Generation and Technical Evaluation

Using Stable Diffusion XL with ControlNet, the study performs guided image-to-image inpainting of baseline façades. All generated images are quantitatively compared with the original façade using the LPIPS metric to assess the degree of visual variation and perceptual deviation.
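The comparison step can be sketched as ranking each variant by its distance from the baseline image. To stay self-contained, the example below uses a placeholder distance (mean absolute pixel difference on tiny grayscale grids), which is not LPIPS; in practice the `distance` argument would be replaced by a learned perceptual metric. The function names and the toy pixel values are invented for illustration and are not results from the study.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Image = Sequence[Sequence[float]]  # toy grayscale image: rows of pixel values in [0, 1]


def mean_abs_diff(a: Image, b: Image) -> float:
    """Placeholder distance (NOT LPIPS): mean absolute pixel difference."""
    total = 0.0
    count = 0
    for row_a, row_b in zip(a, b):
        for pixel_a, pixel_b in zip(row_a, row_b):
            total += abs(pixel_a - pixel_b)
            count += 1
    return total / count


def rank_by_deviation(
    baseline: Image,
    variants: Dict[str, Image],
    distance: Callable[[Image, Image], float] = mean_abs_diff,
) -> List[Tuple[str, float]]:
    """Return (label, distance) pairs sorted from most to least similar to the baseline."""
    scored = [(label, distance(baseline, image)) for label, image in variants.items()]
    return sorted(scored, key=lambda pair: pair[1])


# Toy 2x2 images, invented for illustration only.
baseline = [[0.5, 0.5], [0.5, 0.5]]
variants = {
    "Color + Material + Pattern": [[0.9, 0.1], [0.2, 0.8]],
    "Color": [[0.6, 0.5], [0.5, 0.5]],
}

ranking = rank_by_deviation(baseline, variants)
for label, value in ranking:
    print(f"{label}: {value:.3f}")
```

Passing the metric as a parameter keeps the ranking logic independent of the metric's implementation, so the same code applies once a real perceptual distance is available.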

Expert Evaluation and Framework Validation

Expert reviewers in architecture and wellness design assess the images based on pre-defined perceptual criteria. Their feedback is analyzed to trace prompt effectiveness and design coherence. Finally, the overall design logic is formalized using a Recursive Object Model (ROM) to visualize the structure of the EBD-informed generative process.
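A minimal sketch of how expert ratings might be aggregated is given below, assuming the setup reported in the abstract (five experts, a 7-point Likert scale). The criterion labels, the image labels, and every score are placeholders invented for illustration; they are not the study's data, and the study's own analysis procedure may differ.

```python
from statistics import mean
from typing import Dict, List

Ratings = Dict[str, Dict[str, List[int]]]  # image -> criterion -> one score per expert

# Placeholder scores invented for illustration; these are NOT the study's data.
ratings: Ratings = {
    "Original": {"criterion_A": [3, 4, 3, 4, 3], "criterion_B": [2, 3, 3, 2, 3]},
    "Variant X": {"criterion_A": [6, 5, 6, 6, 5], "criterion_B": [5, 6, 5, 6, 6]},
}


def check_ratings(data: Ratings, n_experts: int = 5, low: int = 1, high: int = 7) -> None:
    """Raise ValueError if a score list has the wrong length or an out-of-range value."""
    for image, criteria in data.items():
        for criterion, scores in criteria.items():
            if len(scores) != n_experts:
                raise ValueError(f"{image}/{criterion}: expected {n_experts} scores")
            if any(not (low <= score <= high) for score in scores):
                raise ValueError(f"{image}/{criterion}: score outside {low}-{high}")


def mean_by_criterion(data: Ratings) -> Dict[str, Dict[str, float]]:
    """Average the expert scores for each image and criterion."""
    return {
        image: {criterion: mean(scores) for criterion, scores in criteria.items()}
        for image, criteria in data.items()
    }


def overall_mean(data: Ratings) -> Dict[str, float]:
    """Average all scores for each image across criteria and experts."""
    return {
        image: mean([score for scores in criteria.values() for score in scores])
        for image, criteria in data.items()
    }


check_ratings(ratings)
per_criterion = mean_by_criterion(ratings)
overall = overall_mean(ratings)
print(per_criterion)
print(overall)
```

Reporting per-criterion means alongside the overall mean helps distinguish a variant that excels on one criterion from one that performs evenly across all of them.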

This three-phase approach proposes a novel, environment-led design framework that uses AI tools not merely for image synthesis, but also for translating certification logic into perceptual design quality. It offers an operational model for applying EBD to façade development in human-absent, health-sensitive spaces such as automated retail stores.

To illustrate the recursive design logic central to this study, Figure 1 presents a ROM diagram that visualizes the structured relationship between the environmental context, prompt design, generative image production, and perceptual feedback.

Figure 1.

ROM of the AI-Assisted WELL-Based Façade Design Process.

Identified Visual Dimensions from Literature Review

This phase establishes the theoretical foundation for AI-assisted design by identifying three core visual dimensions—Color, Material, and Pattern—through a systematic literature review of biophilic design, architectural visualization, and health-oriented building frameworks.

The selection of these dimensions was guided by three criteria: (1) relevance in existing design and environmental psychology research; (2) controllability in prompt-based image generation; and (3) perceptual clarity for expert evaluation. Among the range of visual attributes found in literature, these three were determined to be the most suitable for façade-level interventions in automated retail stores, where space is limited, interaction time is short, and branding needs are strong.

Color refers to the use of natural, calming, or psychologically supportive hues in façade elements such as signage, frames, or lighting. It is critical for conveying emotional tone, environmental harmony, and spatial legibility in compact urban retail contexts.

Material focuses on the visual representation of surface textures and finishes, including wood, metal, or patterned glass. These influence perceived warmth, safety, and biophilic authenticity—qualities essential in building trust and comfort in unattended commercial spaces.

Pattern encompasses geometric rhythm, modular layering, and ornamental repetition within the composition of façade components. Pattern contributes to visual richness and cognitive engagement, especially important in standardized and visually competitive environments like street-facing retail stores.

Together, color, material, and pattern offer a perceptually grounded and technically operable framework for AI-assisted design generation. They allow for structured prompt development, image variation control, and expert-based evaluation. Most importantly, this dimension system acts as a conceptual bridge between design generation and health-oriented goals, aligning visual aesthetics with principles from biophilic design and WELL-based wellness frameworks.

Although visual dimensions such as façade shape articulation or form are also frequently discussed in WELL-related and biophilic design literature, this study focuses on Color, Material, and Pattern due to their high frequency across biophilic and wellness-oriented design frameworks, as well as their practical controllability within current AI image generation workflows. Other dimensions were excluded at this stage to maintain experimental focus and prompt clarity. Table 1 summarizes the identified visual dimensions and their theoretical grounding in biophilic and wellness-oriented design frameworks.

Table 1.

Visual Dimension Analysis.

Core Dimensions	Contents	Source
Ten categories: Form, Space, Movement, Light, Color, Material, Object, View, Sound, and Weather.	Identified three types of biophilic design experiences: direct nature, indirect nature, and the human-nature relationship, emphasizing the importance of sensory attributes such as color, light and shadow, form, and material.	(Kellert, 2008)
(1) Nature in the Space: Visual Connection with Nature, Non-Visual Connection with Nature, Non-Rhythmic Sensory Stimuli, Thermal & Airflow Variability, Presence of Water, Dynamic & Diffuse Light, Connection with Natural Systems. (2) Natural Analogues: Biomorphic Forms & Patterns, Material Connection with Nature, Complexity & Order. (3) Nature of the Space: Prospect, Refuge, Mystery, Risk/Peril	Categorized biophilic design into 14 patterns across three dimensions—Nature in the Space, Natural Analogues, and Nature of the Space—emphasizing visual connections to nature, material references, patterns, and spatial experiences. The framework provides a practical basis for analyzing visual attributes such as color, material, form, and natural patterns in architectural design.	(Browning et al., 2014)
Color, Shape and Form	Proposed an AI-assisted biophilic façade design method for elderly housing using fine-tuned Stable Diffusion (LoRA + ControlNet), based on color and form features. The method was validated through expert evaluation and FID scoring, aiming to improve visual inclusivity and design adaptability for aging populations.	(Kim & Park, 2025)
Color, Material, Pattern	Conducted a systematic review of optimization methods for building façade design, classifying objectives and variables, comparing algorithm performance, and summarizing current tools, limitations, and future directions in building façade optimization.	(Shan & Junghans, 2023)
Shape, Form, Pattern	Explored the use of AI-based tools (VQGAN + CLIP) for biophilic form generation in architectural design. Conducted literature review, generative experiments using nature-inspired prompts, and image evaluation based on biophilic design criteria.	(Viliunas & Grazuleviciute-Vileniske, 2022)
Color, Light, Shape	Proposed an AI-based method for transforming traditional residential façades into commercially viable designs using a LoRA + ControlNet diffusion model trained on nighttime storefront data. Validated performance through qualitative and quantitative evaluation to support adaptive reuse in historic urban areas.	(Zhang et al., 2024)
Outline, Style and Shape	Proposed a bottom-up digital design approach using GAN-based ifaçade to generate infill façade designs by referencing adjacent buildings. Demonstrated its potential for early-stage urban design through image-based synthesis and heuristic modeling of abstract façade elements.	(Ali & Lee, 2023)
Style, Form, Pattern	Developed a fine-tuned generative AI model for creating alternative façade designs that reflect local identity. The method uses street-view image data and paired text prompts for additional training, enabling efficient generation of regionally adapted design concepts in early-stage communication.	(Jo et al., 2024)
Shape, Form	Proposed a workflow combining Grasshopper with Stable Diffusion (LoRA + ControlNet) to generate and preview regional traditional façades for Minnan architecture. This AI-assisted approach enables efficient visualization and modification during historic district renewal.	(Xu et al., 2024)
Material, Pattern	Developed a multimodal AI system integrating GPT-4 Vision, Stable Diffusion, LoRA, and ControlNet to support ecologically adaptive façade design. The system enables non-experts to generate site-specific, biodiversity-friendly designs by combining visual inputs, plant suitability data, and substrate analysis.	(Wei & Herr, 2025)
Color, Shape, Form	Explored the interaction between AI-generated graffiti and architectural façades using Stable Diffusion, framing graffiti as a third skin over modified and original surfaces. The study analyzed creative outputs across environmental contexts, revealing new potentials for urban expression and spatial reinterpretation.	(Shih, 2024)

Certification Standards Analysis Based on WELL

To ensure that the AI-generated façade designs are aligned with health-oriented and biophilic principles, this study adopts the WELL Building Standard as its primary evaluative framework. A comprehensive review was conducted on the ten core concepts of WELL v2 (IWBI, 2020), in addition to biophilia-related entries from WELL v1, specifically Feature 88 (Biophilia I – Qualitative) and Feature 100 (Biophilia II – Quantitative), to determine their applicability to the visual aspects of façade design.

While categories such as Air, Water, and Sound offer minimal direct relevance to external visual expression, other WELL concepts—such as Light, Materials, Mind, and Community—provide meaningful guidance on how built environments can support human health and perception through design. These features are therefore mapped to the three core visual dimensions identified in Section 3.1.

This mapping enables the translation of abstract certification principles into tangible visual variables that can be operationalized in AI prompt formulation and image evaluation. The Color dimension reflects WELL features that address daylight autonomy, lighting quality, and visual harmony, all of which influence user comfort and emotional response. The Material dimension aligns with WELL requirements for transparency, non-toxicity, and natural sourcing, supporting healthful and biophilic material application. The Pattern dimension draws from WELL's emphasis on wayfinding, spatial rhythm, and community identity, guiding the application of geometric and culturally resonant design elements.

The detailed correspondence between WELL's features and the three visual dimensions is summarized in the following tables:

Table 2-A presents WELL features associated with Color, emphasizing light, emotional tone, and nature-inspired palettes.

Table 2-A.

Mapping of WELL Features to the Visual Dimension: Color.

WELL Concept	Feature Code	Feature Name	Relevance to Façade Design
Air	–	Indoor Air Quality	○ No direct color implication in exterior design
Water	–	Water Quality & Distribution	○ No impact on visual color expression
Nourishment	–	Food Environment	○ Not related to façade design elements
Light	L01	Daylight Autonomy	⬤ Natural light influences visible color tone and ambience
Light	L02	Visual Lighting Design	⬤ Enhances visibility, avoids glare, supports color rendering
Movement	–	Circulation Promotion	○ No color control implications
Thermal Comfort	–	Temperature Control	○ Affects thermal perception, not color
Sound	–	Acoustic Comfort	○ Not linked to visual color properties
Material	–	(See Material Table)	—
Mind	V03	Visual Aesthetics	⬤ Influences color harmony and emotional response
Community	–	(See Pattern Table)	—
Biophilia I (v1)	88	Biophilia I	⬤ Supports use of daylight and natural palettes
Biophilia II (v1)	100	Biophilia II	⬤ Encourages warm, nature-related color schemes

Table 2-B organizes features for Material, highlighting traceability, texture, and biophilic authenticity.

Table 2-B.

Mapping of WELL Features to the Visual Dimension: Material.

WELL Concept	Feature Code	Feature Name	Relevance to Façade Design
Air	–	Indoor Air Quality	○ Internal air filters don't affect external materials
Water	–	Water Quality	○ Plumbing systems have no visible façade representation
Nourishment	–	Food Transparency	○ Not related to building materials
Light	–	(See Color Table)	—
Movement	–	Movement Support	○ Pathways and signage, not material per se
Thermal Comfort	–	Temperature Control	○ Unless wall insulation is visible (rare), not relevant
Sound	–	Acoustic Comfort	○ Mostly interior material application
Material	X06	Material Restrictions	⬤ Promotes non-toxic, nature-sourced façade materials
Material	X07	Enhanced Material Transparency	⬤ Supports use of traceable, natural materials like stone, wood
Mind	V03	Visual Aesthetics	⬤ Texture and material finish directly influence perception
Community	C13	Inclusion & Accessibility	⬤ Some material cues (e.g., tactile surfaces) may convey inclusion
Biophilia I (v1)	88	Biophilia I	⬤ Encourages organic material use like timber and clay
Biophilia II (v1)	100	Biophilia II	⬤ Promotes material diversity rooted in nature

Table 2-C outlines entries linked to Pattern, focusing on form rhythm, visual orientation, and cultural symbolism.

Table 2-C.

Mapping of WELL Features to the Visual Dimension: Pattern.

WELL Concept	Feature Code	Feature Name	Relevance to Façade Design
Air	–	Indoor Air Quality	○ Not related to spatial form
Water	–	Water Quality	○ No façade form implications
Nourishment	–	Nutrition	○ Not connected to form composition
Light	–	(See Color Table)	—
Movement	V02	Circulation & Wayfinding	⬤ Façade shape can facilitate intuitive movement and entry
Thermal Comfort	–	(Interior comfort only)	○ Rarely influences outer form
Sound	–	Acoustic Comfort	○ Form has minor or no visual impact on sound control
Material	–	(See Material Table)	—
Mind	V03	Visual Aesthetics	⬤ Supports emotional connection through spatial rhythm and geometry
Community	C01	Community Identity	⬤ Façade form can reflect local cultural patterns or heritage
Biophilia I (v1)	88	Biophilia I	⬤ Biophilic geometry, organic curves, fractals
Biophilia II (v1)	100	Biophilia II	⬤ Extended integration of natural patterns into built form

These structured mappings provide a conceptual and methodological foundation for AI-driven façade generation that prioritizes visual wellness. They also enhance expert evaluation by offering clearly defined references rooted in a globally recognized certification system.

Furthermore, this mapping also reflects the conceptual logic of the EBD framework. Prior studies have demonstrated the adaptability of EBD across domains, such as its application to quality management systems (Sun et al., 2011). By linking color, material, and pattern to WELL variables within an EBD perspective, this study situates visual façade design in a recursive reasoning process, where environmental cues are translated into cognitive representations that guide design decisions. This cross-theoretical connection underscores the broader validity of integrating WELL-based standards into computational façade design.

In addition, this study deliberately limits its scope to visual WELL concepts. Non-visual categories such as Air, Water, and Sound, while essential to occupant well-being, were excluded because they cannot be represented in façade imagery or assessed through visual perception methods. In contrast, the dimensions of Color, Material, and Pattern are directly observable in generative outputs, correspond to WELL principles related to Light, Materials, and Mind, and are consistent with biophilic design theory (Kellert & Calabrese, 2015; Yun et al., 2024). This methodological narrowing ensures consistency between the design variables, the capabilities of AI-based façade generation, and the expert evaluation process, thereby reinforcing the academic rigor of the dimension selection.

Case Analysis of Automated Retail Stores

To support the development of a WELL-based façade design framework for automated retail stores, this study analyzes five cases: Amazon Go (USA), Bingo Box (China), 7-Eleven Shop & Go (Singapore), Lawson Digital Store (Japan), and Super Swift (South Korea). Unlike previous research that focuses on interior layouts and operational systems, this study investigates façade expression and organizes the analysis according to visual elements aligned with WELL criteria.

As discussed in Sections 3.1 and 3.2, a preliminary set of façade design dimensions—color, material, pattern, shape, texture, and transparency—was extracted through literature review and mapped to relevant WELL v1/v2 certification entries. Through further observation of actual design trends in selected cases, Color, Material, and Pattern emerged as the most frequently expressed and perceptually impactful dimensions. These elements consistently conveyed spatial atmosphere, directed user attention, and enhanced natural perception across diverse store types.

The key characteristics of each case are as follows.

Amazon Go (USA): A glass curtain wall with dark metallic frames and warm lighting accents; clean modular divisions and light-dark contrast enhance visibility and night recognition.

Bingo Box (China): Bold orange-white color blocks, steel container materials, and illuminated signage; modular LED strip arrangements introduce a rhythmic visual composition.

7-Eleven Shop & Go (Singapore): Silver-toned aluminum cladding and frameless sensors; subtle horizontal panel lines suggest a calm and minimalistic visual texture.

Lawson Digital Store (Japan): Light matte materials and soft wood-textured panels; subtle repetition of vertical planks and warm signage evoke locality and coherence.

Super Swift (South Korea): Translucent glass and film combined with warm wood-like tones; clean layering and soft graphical decals support biophilic comfort within a transparent layout.

Notably, while the “Pattern” dimension in this case analysis includes compositional rhythm, segmentation, and surface articulation—reflecting common façade design logic in real-world automated stores—this study intentionally narrows its interpretation in the generative design phase. For the purpose of prompt control and AI-based image generation, “Pattern” is defined as two-dimensional surface graphics or decorative motifs, such as organic overlays or geometric ornaments. This ensures consistency in variable manipulation while aligning with WELL's emphasis on visual coherence and biophilic harmony.

These comparative observations are summarized in Table 3, which visualizes the Color, Material, and Pattern characteristics of each case along with corresponding façade images for reference. The analysis validates the practical relevance of the three dimensions in addressing WELL-related façade considerations, thus finalizing them as core input parameters for the design experiment.

Table 3.

Visual Dimensions × Case Study Visual Element Matrix.

Case Name	Color	Material	Pattern
Amazon Go	Dark neutral tones with warm interior lighting; high contrast for brand logo visibility	Glass curtain wall with metallic frame; visible interior wooden ceiling textures	Strong vertical and horizontal framing grid; clear alignment of LED strips and signage
Bingo Box	White and orange contrast; solid white base for container visibility in urban context	Modular metal panels and transparent glass; minimal structural exposure	Evenly segmented panel layout; modular product shelving visible through façade
7-Eleven Shop & Go	Bright orange-red framing; strong brand color identity; enhanced with LED signage	Smooth metal cladding and transparent security glass; integration with vending tech	Diagonal brand striping; functional and graphic alignment along consumer flow
Lawson Digital Store	Dark frame tones; neutral palette for equipment; focus on functional visibility	Plastic gates and metal shelving; temporary vertical columns for spatial boundary	Symmetrical alignment of entry gates; visual layering through product display lines
Super Swift	Soft white tones with transparent surfaces; natural light integration from façade	Fully transparent tempered glass; minimal frame; wood-textured shelving inside	Minimal panel segmentation; shelving and lighting form internal geometric rhythm

Prompt Strategies as Design Direction

To integrate architectural certification standards with AI-assisted façade design, this study establishes a three-layered prompt structure aligned with the three core visual dimensions (Color, Material, and Pattern) defined in the preceding visual analysis. This structure unifies the vocabulary (WELL-based and descriptive keywords) with the grammar of sequencing and weighting. This integration provides a controllable and replicable foundation for image generation.

Each prompt group is systematically constructed using the following layered logic:

Core Elements (Nouns): Represent architectural features such as “glass storefront,” “wood-clad façade,” or “entrance wall.” These serve as the semantic anchors of the generated image.

Descriptive Modifiers (Adjectives): Describe visual characteristics including texture, tone, materiality, or lighting, such as “earth-toned,” “modular,” or “leaf-inspired.”

WELL-Based Labels: Directly map to WELL Building Standard (v1/v2) and biophilic design concepts such as “visual comfort,” “material health,” or “spatial identity.”

This prompt logic is not merely descriptive but operational, allowing prompts to act as design direction modules for AI image generation, rather than fixed instructions. Combined with ControlNet execution, this layered vocabulary enables partial inpainting of source images, preserving spatial context while selectively altering the appearance of the façade under different visual intentions.

The system is summarized in Table 4, which outlines how each visual dimension is translated into prompt keywords.

Table 4.

System Prompt Examples by Dimension Category.

Dimension	Basic Elements	WELL Integration	WELL-Based Visual Keywords	Visual Goal	Prompt Example (as Design Direction)
Color	Earth-tone façade, vivid signage palette, pastel entrance walls	L01, L02, V03, Biophilia I & II (v1)	Visual comfort, color psychology, color harmony, low-saturation schemes	Guide AI to explore color harmony, warmth, and emotional tone in façade design	Prompt Highlights: Natural color palette, earth tones, warm hue, low-saturation pastel wall Direction: Encourage psychologically comfortable, biophilic color schemes aligned with WELL visual aesthetics
Material	Wood-clad walls, stone façade textures, green wall panels	X06, X07, V03, C13, Biophilia I & II (v1)	Material health, low-VOC, natural finishes, non-toxic surfaces	Encourage selection of natural, low-emission materials, visible texture, and sensory warmth	Prompt Highlights: Natural wood texture, stone surface, green wall panels, low-VOC façade finish Direction: Enhance sensory connection and material authenticity
Pattern	Biophilic overlays, leaf-inspired motifs, modular grid, branch-like lines	C01, V02, V03, Biophilia I & II (v1)	Spatial identity, visual rhythm, support patterns, stress reduction, natural symbolism	Improve spatial recognition and aesthetic richness via natural or structured compositional schemes	Prompt Highlights: Modular grid, leaf silhouettes, branch-line tracery, fractal foliage Direction: Introduce intuitive, rhythmic, or nature-inspired patterns for wayfinding and symbolic identity

Image Generation Process

To generate façade images of automated retail stores that align with the WELL Building Standard and reflect differentiated design intentions across the core visual dimensions of Color, Material, and Pattern, this study adopts a structured workflow based on the Stable Diffusion XL (SDXL) model using img2img local inpainting, combined with ControlNet control modules.

Building upon the three-layered prompt structure introduced in Section 3.4, this system consists of three components: core elements such as façade types and architectural components, descriptive modifiers including textural and chromatic attributes, and WELL-based labels that reflect principles of health, comfort, and biophilic design. Together, these components translate abstract design goals into concrete semantic inputs for generative AI.

Unlike conventional text-to-image generation approaches, this study uses real-world façade photographs as structural seed images to retain the spatial logic and contextual layout of the original buildings within their urban environments. Through localized image-to-image generation, only specific regions of the façade are transformed in style or visually enhanced, thereby preserving the overall structure and avoiding unintended distortions.

In practice, the original façade image of the Super Swift automated retail store was selected as the base input. A Depth-based ControlNet module (using the control_v11f1p_sd15_depth model with the MiDaS v3 preprocessor) was applied to extract a depth map of the façade (see Table 7 in Section 4.1.1), which provided edge-preserving control during inpainting. This approach ensured that architectural boundaries and spatial proportions were maintained, while enabling targeted manipulation of visual attributes. For the sake of transparency and reproducibility, the standardized parameter configuration of the ControlNet Depth module—including resolution, weight, and conditioning steps—is reported in detail in Section 4.1.2, together with the complete negative prompts list used to suppress artifacts.

These control maps work in tandem with the prompt strategies to guide generation in a structured and replicable manner. Each image undergoes iterative refinement through prompt modification, ControlNet parameter adjustment, and manual visual inspection. ChatGPT is used to assist in semantic restructuring and prompt recomposition, helping ensure that the resulting images reflect WELL-oriented design intentions more precisely.

This study emphasizes that the goal of image generation is not to deliver finalized design outputs. Rather, it serves as an AI-assisted tool that offers direction and inspiration aligned with healthy-building principles. The AI-generated images present designers with diverse expressive possibilities—such as natural materials, warm tones, and rhythmic compositions—within localized areas, encouraging exploratory and adaptable design thinking.

To ensure the representativeness of the images used for evaluation, approximately 20 images per visual condition are generated. A panel of five experts with backgrounds in architecture and spatial design participate in an initial selection process. Within each image set, they reach a consensus to select the image that best exemplifies the intended visual dimension, which is then included in the expert evaluation phase.

Overall, under the guidance of WELL certification logic and prompt-controlled visual semantics, this study constructs a façade design workflow that integrates generative AI with structural control. This approach enables precise manipulation of visual dimensions and provides a systematic and assessable design support tool for future architectural practices. Figure 2 illustrates the workflow of AI-assisted façade design for automated retail stores, integrating visual dimension identification, prompt-based image generation, and expert evaluation.

Figure 2.

Workflow of AI-Assisted Façade Design for Automated Retail Stores.

Image Evaluation Methods

To assess the performance of AI-generated images in façade design, this study adopts a dual-method evaluation system combining quantitative technical indicators and expert-based scoring with a Likert scale. This integrated approach ensures comprehensive evaluation from both visual quality and design applicability perspectives.

Technical Evaluation Indicators

The following metric is used in this study.

LPIPS score:

Used to evaluate the perceptual similarity between image variants under the same design dimension. It is more sensitive to structural and semantic differences than pixel-level indicators, and is well-suited for comparing images modified via SD with ControlNet. A lower LPIPS score indicates higher visual similarity between images with minimal unintended noise or distortion.
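For reference, the standard formulation of the LPIPS distance can be written as below. The symbols are not taken from this study and are introduced here only for illustration: x is the original façade image, x_0 a generated variant, \hat{y}^{l} and \hat{y}^{l}_{0} their channel-wise unit-normalized activations at layer l of a pretrained network, H_l and W_l the spatial size of that layer, and w_l a learned per-channel weight vector. The backbone network used in this study is not stated in this section, so none is assumed here.

```latex
d(x, x_0) \;=\; \sum_{l} \frac{1}{H_l W_l} \sum_{h,w}
\left\lVert\, w_l \odot \left( \hat{y}^{\,l}_{hw} - \hat{y}^{\,l}_{0,hw} \right) \right\rVert_2^2
```

Because the distance is a sum of squared feature differences, d = 0 for identical images and grows as the generated façade departs perceptually from the original.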

Expert Evaluation Using Visual Dimension Scoring Table

To evaluate the subjective visual quality of the AI-generated façade images, a structured evaluation form was developed based on key visual dimensions. These criteria (façade material, transparency, color, design complexity, and natural features) were derived from research on the impact of exterior design on retail business performance and customer attraction (Majid, 2022). In addition, elements from the WELL Building Standard and biophilic design theory were referenced to ensure alignment with established wellness-related architectural principles (Browning et al., 2014; Kellert, 2008). The retained evaluation dimensions and their justification are summarized in Table 5, mapping visual aspects to corresponding WELL features.

Table 5.

Evaluation Dimension Scoring Table.

Evaluation Dimension	Original Dimension	Related to WELL Standard	Retain or Not	Reason
Façade Material	Façade Material	X06, X07, C13	Yes	A core material-related dimension directly aligned with WELL features such as low-VOC, natural, and non-toxic materials.
Transparency	Transparency	V03, L01	Yes	Transparency impacts daylight access and visual comfort, aligning with WELL's visual light standards.
Color	Color	L01, L02, V03	Yes	Directly linked to color temperature, visual clarity, and environmental perception.
Complexity of the Design	Complexity of the Design	Indirect (C01: Legibility)	Yes	Excessively complex patterns may reduce visual clarity; can be a secondary reference but not a primary rating criterion.
	Lighting	L01 (Daylight Quality)	No	The current image generation does not include light source control; color/light expression is handled via color dimension.
	Sign Placement	Not included	No	Not directly related to façade design in this study or WELL standards; excluded.
Natural Feature		Biophilia I & II	Yes	Evaluates the presence of natural elements (e.g., leaf patterns, trees, fractals) in biophilic expression.

The structure of the evaluation framework was also informed by previous façade design assessment studies to ensure theoretical rigor and practical applicability (Kim & Park, 2025; Shan & Junghans, 2023). An expert panel of five professionals, each with over ten years of experience in architectural, interior, or biophilic design, independently evaluated the generated images using the standardized assessment table. All panel members held professional backgrounds in architecture but represented diverse areas of expertise: two professors specializing in architectural design and spatial planning with a focus on biophilic design, one WELL certification expert, and two PhD researchers specializing in environmental psychology and spatial perception in WELL-oriented design. This breadth of experience ensured that the evaluation incorporated expertise in design, WELL-related knowledge, and user-centered perceptual insights.

A 7-point Likert scale evaluation form was developed based on the retained visual dimensions aligned with WELL standards and biophilic design principles. Experts scored each image on the following scale: −3 = Very Poor, −2 = Poor, −1 = Slightly Poor, 0 = Neutral, +1 = Slightly Good, +2 = Good, +3 = Excellent. The higher the score, the more closely the image aligns with the described design feature. The final set of evaluation criteria and scoring framework is presented in Table 6.

In addition to the quantitative Likert-scale evaluations, this study collected open-ended responses from the five expert participants to capture more nuanced perceptions and personalized design suggestions. These qualitative responses were analyzed thematically to identify common evaluative criteria, emotional impressions, and critiques that are not fully represented by numerical ratings. This mixed-method approach aims to enrich the evaluation framework and arrive at deeper insights into the semantic impact of AI-generated façade designs.

Table 6.

Evaluation Dimension Scoring Table.

Evaluation Dimension	Description	Likert Scale
Façade Material	Quality and appropriateness of façade material (naturalness, texture, WELL-aligned)	−3 → 3
Transparency	Degree of visual openness, daylight access, and comfort	−3 → 3
Color	Application of color in terms of clarity, temperature, and perceptual harmony	−3 → 3
Complexity of the Design	Overall pattern complexity and its effect on visual clarity	−3 → 3
Natural Feature	Presence of biophilic elements such as leaf motifs, natural textures, and fractals	−3 → 3

Data Analysis Methods

To evaluate whether the AI-generated façade images demonstrate effective visual differentiation under WELL-based design strategies, this study employed a two-level mixed-methods analysis.

Quantitative analysis was conducted as follows.

Perceptual similarity between original and generated images was assessed using the LPIPS metric. This helped quantify the visual divergence across different prompt strategies and control conditions.

Expert evaluations were collected through a 7-point Likert scale ranging from −3 (Very Poor) to +3 (Excellent), across five dimensions: Façade Material, Transparency, Color, Design Complexity, and Natural Features. Descriptive statistics, including mean scores, standard deviations, and trend charts, were first computed to compare the performance of each image category.
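As an illustration of the descriptive step, the following minimal sketch computes the mean and standard deviation for one Image × Dimension cell. The five scores are invented placeholders, not data from this study, and the use of the sample (n − 1) standard deviation is an assumption; the function name `summarize_cell` is likewise hypothetical.

```python
from statistics import mean, stdev

LIKERT_MIN, LIKERT_MAX = -3, 3  # 7-point scale used in the evaluation form


def summarize_cell(scores):
    """Return (mean, sample SD) for one Image x Dimension cell of expert scores."""
    for s in scores:
        if not LIKERT_MIN <= s <= LIKERT_MAX:
            raise ValueError(f"score {s} is outside the -3..+3 Likert range")
    return float(mean(scores)), stdev(scores)


# Placeholder ratings from five hypothetical experts (NOT study data).
example_scores = [2, 3, 1, 2, 2]
cell_mean, cell_sd = summarize_cell(example_scores)
print(f"mean = {cell_mean:.2f}, SD = {cell_sd:.2f}")
```

Repeating this for every image and dimension yields the table of means and standard deviations from which trend charts can be drawn.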

Prior to inferential testing, Shapiro–Wilk tests with Q–Q plot inspection were performed for each Image × Dimension cell (n = 5 per cell) to assess the normality assumption. Normality was evaluated at the α = .05 level, and potential deviations were further checked by false discovery rate (FDR) correction and visual inspection of distributions. Sphericity was examined using Mauchly's test, and when violated, Greenhouse–Geisser corrections were applied. In addition, repeated measures ANOVAs were conducted to test for statistically significant differences across conditions and dimensions. All statistical analyses were performed using IBM SPSS Statistics version 27.
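The FDR step can be sketched as follows. The text does not name the specific procedure, so the Benjamini–Hochberg method is assumed here; the function name and the example p-values are hypothetical and are not results from this study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical Shapiro-Wilk p-values for four cells (NOT study data).
raw_p = [0.01, 0.04, 0.03, 0.005]
adj_p = benjamini_hochberg(raw_p)
flagged = [i for i, p in enumerate(adj_p) if p < 0.05]
```

In the full design, the list would hold one p-value per Image × Dimension cell, and a cell would be treated as deviating from normality only if its adjusted value stays below α = .05.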

To complement the numerical scores and explore more nuanced design perceptions, this study then conducted thematic analysis on the experts’ open-ended responses, following the six-step approach proposed by Braun and Clarke (2006):

Familiarization with the data: All textual feedback from experts was read several times to gain an initial understanding of recurring perceptions.

Generating initial codes: Key phrases related to visual comfort, naturalness, material perception, and biophilic effects were systematically coded.

Searching for themes: Codes were clustered into broader themes such as “natural integration,” “visual coherence,” “material authenticity,” and “aesthetic inconsistency.”

Reviewing themes: The emergent themes were cross-checked against the raw data and adjusted for consistency and distinctiveness.

Defining and naming themes: Each theme was refined and given a clear operational definition to reflect its design relevance and connection to WELL-based visual dimensions.

Producing the report: The finalized themes were integrated into the results discussion to explain expert preferences and design implications beyond quantitative scores.

This multi-layered analytical approach enables the systematic examination of visual differentiation and perceptual responses to WELL-based AI design strategies.

Results

Façade Image Generation Results

Local Redrawing and Structural Control

To preserve the spatial integrity and real-world context of the automated retail store “Super Swift” façade, this study uses a photographic image of the original building as the structural seed input for AI-based regeneration. Given the functional nature of automated retail stores, safety and visibility are critical considerations, making transparency an essential design requirement. Therefore, when redesigning the façade, we incorporate transparency by retaining key elements such as the entrance door and clear glass panels, while modifying other aspects to enhance visual quality and aesthetic appeal by integrating WELL and biophilic design principles.

A two-step generation strategy is adopted: Local inpainting masks are manually applied to designate editable façade regions. ControlNet modules are then used to guide structural constraints and ensure consistent alignment with the base image during style transformation.

Three modification zones are defined based on architectural features and design flexibility (see Table 7):

Signage Zone (Green Area): This upper section includes the storefront signage and adjacent wall surfaces. It accommodates adjustments in both material and color, such as wood panels, biophilic overlays, or soft tone finishes, along with potential pattern applications to enhance rhythm and identity.

Framing System (Yellow Area): Covering the window and door frames, this zone supports the replacement or enhancement of materials (e.g., wood or aluminum) and color accents, allowing modulation of warmth, contrast, or tactile appeal.

Glazing Zone (Blue Area): Representing the transparent glass façades and doors, this area is vital for ensuring interior visibility and retail openness. Only minimal and non-obstructive interventions—such as light-permeable patterns or ambient lighting effects—are introduced to maintain transparency and user trust.

Table 7.

Region-Specific Inpainting for Localized Generation Based on ControlNet Structure.

Seed Image	Local Inpainting Mask	Edge Structure via ControlNet (Depth)

The entrance zone remains unmodified across all generation conditions to safeguard accessibility and spatial legibility. These targeted redrawing strategies help ensure that WELL-based design intentions are precisely mapped onto appropriate façade components without disrupting the overall architectural coherence.
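The zone rules described above can be summarized as a small lookup table. This is a simplified sketch rather than the study's actual masking procedure: the names `ZONE_EDITS` and `zones_for` are hypothetical, and the glazing zone is reduced to "Pattern" only, since ambient lighting is not one of the three controlled dimensions.

```python
# Which visual dimensions each façade zone accepts (simplified assumption).
ZONE_EDITS = {
    "Signage": {"Color", "Material", "Pattern"},  # green mask area
    "Framing": {"Color", "Material"},             # yellow mask area
    "Glazing": {"Pattern"},                       # blue area, light-permeable patterns only
    "Entrance": set(),                            # kept unmodified in every condition
}


def zones_for(condition):
    """List the zones that may be inpainted for a condition such as 'Color + Pattern'."""
    active = {name.strip() for name in condition.split("+")}
    return [zone for zone, allowed in ZONE_EDITS.items() if allowed & active]


print(zones_for("Pattern"))           # ['Signage', 'Glazing']
print(zones_for("Color + Material"))  # ['Signage', 'Framing']
```

Keeping the entrance mapped to an empty set makes the accessibility constraint explicit: no condition can select it for redrawing.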

Visual Outcomes Under Different Prompt and Dimension Controls

This study evaluates the visual effectiveness of WELL-based façade design strategies by comparing two approaches. The first is a Baseline Design with a conventional commercial appearance, while the second is a WELL-Based Design guided by WELL v2 principles across the visual dimensions of Color, Material, and Pattern.

Both strategies are applied to the same base structure—the Super Swift store—using Stable Diffusion XL with localized inpainting and structure-preserving control through ControlNet. The WELL-Based strategy involves seven controlled generation conditions: three single-dimension cases (Color only, Material only, Pattern only), three dual-dimension combinations (Color + Material, Color + Pattern, Material + Pattern), and one full integration (Color + Material + Pattern).
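The seven conditions are simply all non-empty combinations of the three dimensions (2³ − 1 = 7). A minimal sketch of that enumeration is shown below; the function and variable names are illustrative only.

```python
from itertools import combinations

DIMENSIONS = ("Color", "Material", "Pattern")


def generation_conditions(dimensions=DIMENSIONS):
    """Return every non-empty combination of dimensions, single-dimension cases first."""
    conditions = []
    for size in range(1, len(dimensions) + 1):
        for combo in combinations(dimensions, size):
            conditions.append(" + ".join(combo))
    return conditions


CONDITIONS = generation_conditions()
for label in CONDITIONS:
    print(label)
```

Together with the original façade, these seven conditions give the eight images submitted to expert evaluation.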

The prompt formulation follows a structured three-layered logic:

Core Elements: Key architectural targets such as “storefront,” “glass façade,” or “signage band”

Descriptive Modifiers: Adjectives expressing visual qualities, including “natural,” “textured,” “modular,” or “warm-lit”

WELL-Based Labels: Semantic phrases derived from WELL principles, such as “visual comfort,” “natural material,” or “biophilic expression”

These components are compiled into operative prompts with selective emphasis weights, guiding AI-based image generation while maintaining spatial consistency. For example, a Color + Pattern prompt may emphasize natural tones and curved botanical graphics applied to signage or glazing areas.
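A minimal sketch of how the three layers might be compiled into one weighted prompt string is given below. It follows the "(term:weight)" emphasis notation reported in Table 9; the function name `compose_prompt` and the example terms and weights are hypothetical, not the exact prompts used in this study.

```python
def compose_prompt(core, modifiers, well_labels, emphasis=None):
    """Join the three prompt layers in order, wrapping emphasized terms as (term:weight)."""
    emphasis = emphasis or {}
    parts = []
    for term in [*core, *modifiers, *well_labels]:
        weight = emphasis.get(term)
        parts.append(f"({term}:{weight})" if weight is not None else term)
    return ", ".join(parts)


# Illustrative Color + Pattern example (terms and weights are assumptions).
prompt = compose_prompt(
    core=["storefront", "signage band"],
    modifiers=["natural tones", "curved botanical graphics"],
    well_labels=["visual comfort", "biophilic expression"],
    emphasis={"natural tones": 1.2, "curved botanical graphics": 1.1},
)
print(prompt)
```

Keeping the layers as separate lists makes it easy to swap one dimension's vocabulary while holding the others fixed, which is what the controlled conditions require.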

Table 8 summarizes the seven façade generation strategies and their associated evaluation dimensions.

Table 8.

Comparison of Baseline and WELL-Based Design Strategies.

Prompt Type	Color	Material	Pattern	Strategy Description
Baseline Design	Neutral lighting, standard exterior tones	Basic geometric textures, uniform surfaces	Generic commercial layout, minimal façade articulation	No WELL criteria considered. Focuses on conventional retail appearance and basic clarity.
WELL-Based Design	Circadian-inspired lighting, warm natural tones, daylight-responsive color palette	Natural textures (e.g., timber, bamboo), low-VOC finishes, material diversity	Clear visual hierarchy, bio-inspired geometry, rhythmic patterning (e.g., modular grids)	Integrates WELL v2 features (L01, L02, V03, X06, C01). Aims to enhance health, comfort, and visual recognizability through targeted manipulations of color tones, surface materials, and glazing patterns.

Table 9 presents a consolidated overview of each condition's prompt structure, emphasis phrases, generation parameters, and the representative output image. The “Prompt Highlights” field illustrates a representative example of the composed input prompt used in the generation process, reflecting the integrated logic of the three-layer structure. Meanwhile, the “Selection Logic” summarizes the rationale behind the expert-based selection of each output.

Unless otherwise stated, all images were generated using the following standardized configuration: Stable Diffusion XL model, DPM++ SDE sampler, 20 sampling steps, CFG scale of 7, and denoising strength between 0.5 and 0.6. The output resolution was fixed at 1125 × 844 pixels. To maintain spatial alignment and architectural structure, ControlNet (Depth) modules were employed to extract edge-preserving features from the original Super Swift façade image for controlled inpainting.
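For reproducibility, the reported settings can be captured in a single immutable record, as in the sketch below. The class and field names are hypothetical and do not correspond to any particular interface or library; only the values come from the configuration stated above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GenerationConfig:
    """Standardized generation settings as reported in the text."""
    model: str = "Stable Diffusion XL"
    sampler: str = "DPM++ SDE"
    steps: int = 20
    cfg_scale: float = 7.0
    denoise_min: float = 0.5
    denoise_max: float = 0.6
    width: int = 1125
    height: int = 844

    def check_denoise(self, strength: float) -> float:
        """Return the strength if it lies in the reported 0.5-0.6 range, else raise."""
        if not self.denoise_min <= strength <= self.denoise_max:
            raise ValueError(f"denoising strength {strength} is outside the reported range")
        return strength


config = GenerationConfig()
print(config.sampler, config.steps, config.check_denoise(0.55))
```

Freezing the record prevents accidental drift in parameters between conditions, so that differences in output can be attributed to the prompts rather than to the sampler settings.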

Table 9.

Original Image vs. Generated Images by Dimension Display.

Visual Control Condition	Prompt Structure (Three-Layered)	Prompt Highlights (example)	Selection Logic
Baseline (Original Façade)	—	—	Original image for reference.
Color only	Core Elements: storefront, transparent glass façade, indoor lighting, signage | Descriptive Modifiers: natural color palette, green accent, visual openness, calming tone | WELL-Based Labels: visual comfort, daylight access, healthy material expression	"transparent glass façade, green glass signage, soft lighting, natural texture, storefront view" (Emphasis: (green glass:1.2), (transparent material:1.1))	Selected for its strong alignment with WELL principles, particularly in emphasizing transparency and visual comfort through a green-accented, open-glass façade.
Material only	Core Elements: storefront, wooden façade, wooden frame, “Super Swift” logo | Descriptive Modifiers: natural timber texture, warm tone, material richness | WELL-Based Labels: material authenticity, biophilic expression, non-toxic and safe finishes	"wooden wall texture, timber cladding, storefront signage, biophilic wood material, warm-toned natural surface" (Emphasis: (wood wall:1.2), (timber surface:1.1))	Chosen for its warm and authentic material expression. Experts highlighted its biophilic wooden textures and visual comfort as key WELL-aligned features.
Pattern only	Core Elements: transparent glass façade, “Super Swift” logo, storefront frame | Descriptive Modifiers: minimalist, nature-inspired pattern, line-based graphics | WELL-Based Labels: biophilic expression, perceptual rhythm, visual identity	"glass façade with white biophilic line patterns, minimalist plant motif, clean storefront with logo" (Emphasis: (biophilic pattern:1.2), (transparent glass:1.1))	This version was favored for its clean application of biophilic patterns and minimalistic plant motifs, enhancing identity and emotional comfort without compromising transparency.
Color + Material	Core Elements: Storefront with wooden frame, transparent glass surface, grass-textured panel, “Super Swift” logo | Descriptive Modifiers: Warm natural tones, organic texture, soft lighting | WELL-Based Labels: Material authenticity, visual comfort, natural finishes, biophilic integration	"wooden frame storefront, transparent glass, green grass texture, soft lighting, earth-toned natural surface" (Emphasis: (wood wall:1.2), (green grass:1.1))	Selected for its cohesive integration of warm tones and natural materials. The image demonstrates biophilic qualities that reinforce WELL-based principles of material health and visual warmth.
Color + Pattern	Core Elements: Storefront with transparent glass panel, green canopy, minimalistic signage, biophilic façade pattern (leaf motif) | Descriptive Modifiers: Harmonious natural colors, curved botanical linework, soft green gradients, abstract nature-inspired geometry | WELL-Based Labels: Visual harmony, emotional comfort, stress-reducing color scheme, biophilic identity, connection to nature	"Transparent storefront, green color panel, abstract leaf pattern, curved line graphics, soft green tones" (Emphasis: (natural pattern:1.3), (curved botanical motif:1.2), (pastel tone:1.2))	Experts agreed on this version for its calming green palette combined with abstract leaf patterns, promoting nature connection and visual harmony in a commercial setting.
Material + Pattern	Core Elements: Storefront with wooden frame, transparent glass façade, “Super Swift” signage. Vertical wall surface featuring natural material (e.g., wood, stone), patterned surface overlay | Descriptive Modifiers: Textured wood or stone material, matte finish, biophilic geometric pattern. Organic motifs embedded into material layering, minimalistic rhythmic linework | WELL-Based Labels: Low-VOC and safe finish, tactile comfort, natural material authenticity. Biophilic expression through texture and motifs, emotional engagement via patterned material surface	"Natural wooden wall, organic geometric pattern, matte surface finish, healthy material expression" (Emphasis: (wood wall:1.2), (organic pattern:1.3), (natural surface finish:1.1))	Chosen for its layered use of healthy materials and refined biophilic motifs, offering both tactile and visual depth aligned with WELL's comfort and natural engagement goals.
Color + Material + Pattern	Core Elements: Storefront with wooden frame, “Super Swift” logo, large transparent glass panels. Wall surface featuring composite textures (wood + grass overlay), structure preserved from original base image | Descriptive Modifiers: Warm earthy tones, organic surface finish, ambient lighting. Minimalistic bio-friendly geometric pattern applied on glass. Rhythmic layering of natural elements to enhance spatial depth | WELL-Based Labels: Balanced façade composition aligned with WELL principles (material health, daylight access, visual harmony). Use of low-VOC, eco-conscious materials; emphasis on stress-reducing colors and textures. Biophilic aesthetics through form integration and soft natural palette	"Wood texture, natural grass wall, warm natural tones, geometric bio-friendly pattern on glass, ambient lighting" (Emphasis: (wood wall:1.2), (natural texture:1.1), (bio-friendly geometric pattern:1.3), (warm tones:1.1))	Selected for its balanced integration of earthy colors, natural textures, and biophilic patterns, creating a visually rich and emotionally comforting façade per WELL standards.

The ControlNet module was configured using the control_v11f1p_sd15_depth model with the MiDaS v3 preprocessor. Unless otherwise stated, all parameters were kept at their default settings. The control resolution was fixed at 512, which is the default balance between feature extraction accuracy and computational efficiency. The control weight was set to 1.0, and the guidance start/stop steps followed the default values of 0 and 1, ensuring full-step conditioning across the generation process. The control mode was configured as Balanced, and the resize mode was left at the default Crop and Resize option, which preserves spatial consistency between the control map and the input image. No additional ControlNet variants were applied, since only the Depth module was employed in this study.
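To make the reported settings easier to audit and reuse, they can be collected in a single configuration record. The sketch below is a tool-neutral Python representation: the class and field names are assumptions chosen for readability and do not correspond to the schema of any particular interface, and the default denoising strength of 0.55 is simply the midpoint of the reported 0.5 to 0.6 range.

```python
from dataclasses import asdict, dataclass, field
from typing import Optional


@dataclass(frozen=True)
class ControlNetConfig:
    """Depth ControlNet settings as reported in the text (field names are illustrative)."""
    model: str = "control_v11f1p_sd15_depth"
    preprocessor: str = "MiDaS v3"
    control_resolution: int = 512
    control_weight: float = 1.0
    guidance_start: float = 0.0
    guidance_end: float = 1.0
    control_mode: str = "Balanced"
    resize_mode: str = "Crop and Resize"


@dataclass(frozen=True)
class GenerationConfig:
    """Standardized generation settings as reported in the text."""
    base_model: str = "Stable Diffusion XL"
    sampler: str = "DPM++ SDE"
    steps: int = 20
    cfg_scale: float = 7.0
    denoising_strength: float = 0.55  # assumed midpoint of the reported 0.5-0.6 range
    width: int = 1125
    height: int = 844
    seed: Optional[int] = None  # no fixed seed was applied
    controlnet: ControlNetConfig = field(default_factory=ControlNetConfig)

    def __post_init__(self) -> None:
        # Guard against values outside the range reported for this study.
        if not 0.5 <= self.denoising_strength <= 0.6:
            raise ValueError("denoising_strength outside the reported 0.5-0.6 range")


config = GenerationConfig()
print(asdict(config))
```

Storing the record alongside each output image (for example as JSON) would allow later studies to reproduce a condition even when seeds are not fixed.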

In all generation conditions, a standardized set of negative prompts was applied to suppress visual artifacts and enhance image clarity:

over sharpening, dirt, bad color matching, graying, wrong perspective, distorted person, Twisted Car, NSFW, (worst quality:2), (low quality:2), (normal quality:2), lowres, (monochrome), (grayscale), blurry, signature, drawing, sketch, text, word, logo, cropped, out of frame, nsfw.

This negative prompt configuration was consistently used to minimize undesired visual outputs and ensure higher fidelity aligned with architectural realism.

No fixed seed was manually applied during generation, but the consistency of visual intent across attempts was ensured through carefully controlled prompt content and spatial constraints such as ControlNet masking.

Visual Comparison with Original Designs

To assess the visual perceptual differences between the AI-generated façades and the original baseline design, this study applies LPIPS analysis. LPIPS measures perceptual similarity by computing feature distances from a pre-trained neural network, allowing for a more human-aligned assessment of visual changes.

The analysis compares each generated image—based on seven WELL-based visual control conditions—with the original Super Swift façade. A higher LPIPS score represents a greater visual deviation from the baseline; a lower score suggests closer resemblance.
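For reference, the LPIPS distance in its standard formulation can be written as follows. The notation is introduced here for illustration: x denotes the original façade image, x_0 a generated variant, \hat{y}^{l} and \hat{y}^{l}_{0} the channel-normalized feature maps of the two images at layer l of the pre-trained network, H_l and W_l the spatial size of that layer, and w_l the learned per-channel weights.

```latex
d(x, x_0) \;=\; \sum_{l} \frac{1}{H_l W_l} \sum_{h,w}
\left\lVert\, w_l \odot \left( \hat{y}^{\,l}_{hw} - \hat{y}^{\,l}_{0,hw} \right) \right\rVert_2^2
```

Because the distance sums squared feature differences over layers and spatial positions, it is zero for identical images and grows as the generated façade departs from the original in the network's feature space.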

Table 10 summarizes the LPIPS scores for each condition, highlighting how design strategies affect the perceptual outcome. Notably, the Material only condition achieved the lowest score (0.1601), indicating minimal change and high visual similarity with the original. Conversely, the Color + Material + Pattern condition produced the highest score (0.3807), reflecting the most significant transformation across all visual aspects.

Table 10.

LPIPS Scores and Interpretation of Visual Deviation.

Visual Control Condition	LPIPS Score	Interpretation
Color only	0.1977	Moderate deviation due to single-dimension color change.
Material only	0.1601	Minimal deviation; material-only control yields the closest resemblance to the original.
Pattern only	0.2177	Noticeable change introduced by added structural motifs.
Color + Material	0.2638	Increased variation from the combined effect of color and material transformation.
Color + Pattern	0.2995	Pronounced deviation resulting from simultaneous color and pattern control.
Material + Pattern	0.3046	Elevated perceptual complexity from texture-pattern synergy.
Color + Material + Pattern	0.3807	Highest deviation observed; full-spectrum control leads to the most significant visual transformation.

To illustrate this trend, Figure 3 presents the LPIPS scores in a bar chart. It clearly shows a progressive increase in perceptual deviation as more visual dimensions (Color, Material, Pattern) are combined. This trend quantitatively supports the idea that multidimensional WELL-based façade modifications lead to higher visual impact and distinguishability. These findings reinforce the controllability and sensitivity of prompt-based AI design strategies in modulating the perceptual outcomes.

Figure 3.

LPIPS Scores Between Original and Generated Façade Images.
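The progressive increase can also be checked directly from the values in Table 10. The short Python sketch below groups the seven conditions by the number of visual dimensions they combine and averages the LPIPS scores in each group; the grouping rule (counting "+" signs in the condition name) is a convenience introduced here, not part of the original analysis.

```python
from statistics import mean

# LPIPS scores transcribed from Table 10.
lpips_scores = {
    "Color only": 0.1977,
    "Material only": 0.1601,
    "Pattern only": 0.2177,
    "Color + Material": 0.2638,
    "Color + Pattern": 0.2995,
    "Material + Pattern": 0.3046,
    "Color + Material + Pattern": 0.3807,
}


def n_dimensions(condition: str) -> int:
    """Number of visual dimensions combined in a condition name."""
    return condition.count("+") + 1


# Collect scores per number of combined dimensions.
by_count = {}
for condition, score in lpips_scores.items():
    by_count.setdefault(n_dimensions(condition), []).append(score)

mean_by_count = {k: round(mean(v), 4) for k, v in sorted(by_count.items())}
print(mean_by_count)
```

The group means rise from roughly 0.19 (one dimension) to 0.29 (two dimensions) and 0.38 (three dimensions), and every two-dimension score exceeds every single-dimension score.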

Beyond numerical comparison, the LPIPS results also offer design-relevant insights. Conditions driven primarily by color adjustments, such as the Color-only variant, produced a clearly measurable perceptual deviation (LPIPS = 0.1977) but did not lead to meaningful improvements in nature-related qualities. This indicates that noticeable visual change alone is insufficient to enhance restorative potential if the intervention lacks biophilic grounding. In contrast, experts perceived material- and pattern-based modifications, which produced moderate LPIPS deviations, as more coherent and beneficial. These findings suggest that façade design practice should not equate higher perceptual deviation with better outcomes. Instead, effective WELL-oriented interventions require a calibrated balance: introducing enough change to differentiate the design from a conventional baseline, while maintaining perceptual coherence and selectively enhancing natural attributes. Further details on how these deviations align with expert perceptions are elaborated in Section 4.3.

Expert Evaluation Results

To complement the perceptual similarity analysis, this study conducted expert evaluations on eight façade images. These included one original and seven AI-generated variants. The evaluations were based on five key visual dimensions: façade material, transparency, color, design complexity, and natural features. The evaluation employed a 7-point Likert scale ranging from −3 (Very Poor) to +3 (Excellent), and was completed by five experts with backgrounds in architecture and environmental psychology. The internal consistency of the expert evaluation was verified using Cronbach's alpha, which yielded a value of 0.837, indicating a high level of reliability across the five dimensions.
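For readers who wish to reproduce the reliability check, Cronbach's alpha can be computed from a ratings matrix as sketched below. The layout is an assumption: each row is taken to be one rating case (for example, one expert rating one image) and each column one of the five dimensions. The toy data are hypothetical and are not the study's ratings.

```python
from statistics import variance
from typing import List


def cronbach_alpha(rows: List[List[float]]) -> float:
    """Cronbach's alpha for rows = rating cases, columns = items (dimensions)."""
    k = len(rows[0])
    if k < 2:
        raise ValueError("at least two items are required")
    # Sample variance of each item (column) across cases.
    item_vars = [variance([row[j] for row in rows]) for j in range(k)]
    # Sample variance of the per-case total score.
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical toy ratings on the -3..+3 scale (not data from this study).
toy = [
    [-1, 0, -1, -2, -1],
    [1, 1, 0, 1, 2],
    [2, 3, 2, 1, 3],
    [0, -1, 0, 0, 1],
]
print(round(cronbach_alpha(toy), 3))
```

Alpha approaches 1 when the items move together across cases, which is the sense in which the reported value of 0.837 indicates consistency across the five dimensions.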

In addition, Kendall's W tests were conducted to assess inter-rater agreement within each visual dimension. Results showed moderate and statistically significant agreement in Material (W = 0.373, p = 0.018) and Naturalness (W = 0.405, p = 0.011), and high agreement in Transparency (W = 0.599, p < 0.001). In contrast, agreement in Color (W = 0.170, p = 0.246) and Design Complexity (W = 0.190, p = 0.194) was low and not statistically significant. These findings suggest that while experts held relatively consistent views on materiality, naturalness, and transparency, their perceptions of color and complexity were more divergent.
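Kendall's W for one dimension can be sketched as follows, treating the five experts as raters and the eight images as the items being ranked. Because Likert ratings frequently tie, the sketch uses average ranks and the standard tie-corrected denominator; the function names and the toy input are illustrative, and the significance test reported above is not reproduced here.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def kendalls_w(ratings: Sequence[Sequence[float]]) -> float:
    """Tie-corrected Kendall's W; ratings[r][i] is rater r's score for item i."""
    m, n = len(ratings), len(ratings[0])
    ranked = [average_ranks(row) for row in ratings]
    rank_sums = [sum(r[i] for r in ranked) for i in range(n)]
    mean_sum = m * (n + 1) / 2
    s = sum((rs - mean_sum) ** 2 for rs in rank_sums)
    # Tie correction: sum of (t^3 - t) over groups of tied scores per rater.
    ties = sum(row.count(v) ** 3 - row.count(v) for row in map(list, ratings) for v in set(row))
    denom = m ** 2 * (n ** 3 - n) - m * ties
    if denom == 0:
        raise ValueError("W is undefined when every rater gives all items the same score")
    return 12 * s / denom


# Hypothetical ratings: 3 raters x 4 items (not data from this study).
print(kendalls_w([[1, 2, 3, 3], [0, 2, 2, 3], [1, 1, 2, 3]]))
```

W ranges from 0 (no agreement in how raters order the images) to 1 (identical orderings).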

Table 11 presents the mean scores and standard deviations for each image across the five evaluation dimensions. The baseline image (Image 1) received a relatively high score in Transparency (M = 2.40, SD = 0.894) but notably low scores in Natural Features (M = −2.60, SD = 0.548) and Color (M = −0.80, SD = 1.924), indicating a lack of biophilic qualities in the original real-world design. This underscores the necessity of integrating WELL-based and biophilic design principles into the visual enhancement of automated retail façades.

Table 11.

Expert Evaluation Scores by Visual Control Condition.

Condition	Material	Transparency	Color	Complexity	Natural Features
Baseline (Original)	−0.40 (±1.140)	2.40 (±0.894)	−0.80 (±1.924)	0.00 (±1.871)	−2.60 (±0.548)
Color only	−0.20 (±1.095)	1.80 (±0.837)	1.00 (±1.871)	0.40 (±2.191)	−0.20 (±2.049)
Material only	1.40 (±1.517)	1.00 (±0.707)	0.60 (±1.517)	1.00 (±1.225)	1.60 (±1.140)
Pattern only	0.00 (±1.225)	2.00 (±0.707)	−0.20 (±1.924)	0.20 (±1.643)	0.60 (±1.140)
Color + Material	0.80 (±2.049)	0.40 (±1.140)	0.60 (±2.074)	−0.60 (±0.894)	1.20 (±2.049)
Color + Pattern	0.00 (±0.707)	2.00 (±1.225)	1.40 (±1.140)	−0.60 (±1.140)	0.80 (±1.095)
Material + Pattern	1.80 (±0.837)	1.20 (±1.095)	1.60 (±0.548)	0.40 (±1.517)	2.00 (±0.707)
Color + Material + Pattern	1.20 (±1.643)	0.80 (±1.304)	0.80 (±2.280)	−0.60 (±1.949)	1.60 (±1.949)

Among the AI-generated images, the Material + Pattern condition (Image 7) achieved the highest ratings in both Façade Material (M = 1.80, SD = 0.837) and Natural Features (M = 2.00, SD = 0.707), suggesting that the combination of natural textures and biomorphic patterns is particularly effective in enhancing expert-perceived design quality. At the same time, experts noted that the strong visual presence of this combination may risk “visual overload,” highlighting the importance of balancing biophilic richness with visual simplicity to maintain comfort and prevent cognitive strain. In contrast, the Color + Material + Pattern condition (Image 8), while aiming for comprehensive optimization, did not achieve the highest scores in any individual dimension. This result implies that overloading multiple visual interventions may introduce perceptual trade-offs or cognitive strain.
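The pattern described above can be verified directly from the means in Table 11. The following Python sketch lists, for each dimension, the AI-generated condition or conditions with the highest mean rating; variable names are illustrative.

```python
DIMENSIONS = ["Material", "Transparency", "Color", "Complexity", "Natural Features"]

# Mean expert ratings of the seven AI-generated conditions, transcribed from Table 11.
generated_means = {
    "Color only": [-0.20, 1.80, 1.00, 0.40, -0.20],
    "Material only": [1.40, 1.00, 0.60, 1.00, 1.60],
    "Pattern only": [0.00, 2.00, -0.20, 0.20, 0.60],
    "Color + Material": [0.80, 0.40, 0.60, -0.60, 1.20],
    "Color + Pattern": [0.00, 2.00, 1.40, -0.60, 0.80],
    "Material + Pattern": [1.80, 1.20, 1.60, 0.40, 2.00],
    "Color + Material + Pattern": [1.20, 0.80, 0.80, -0.60, 1.60],
}

top_by_dimension = {}
for index, dimension in enumerate(DIMENSIONS):
    best = max(scores[index] for scores in generated_means.values())
    # Keep every condition that reaches the best mean, so ties stay visible.
    top_by_dimension[dimension] = [
        name for name, scores in generated_means.items() if scores[index] == best
    ]

for dimension, leaders in top_by_dimension.items():
    print(f"{dimension}: {', '.join(leaders)}")
```

Material + Pattern leads in Material and Natural Features (and also in Color), while Color + Material + Pattern does not lead in any dimension.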

Normality checks using Shapiro–Wilk tests with Q–Q plots indicated no systematic violations after FDR correction, supporting the use of repeated-measures ANOVAs. To evaluate whether the observed differences across conditions and dimensions were statistically significant, a series of repeated-measures ANOVAs was conducted with Condition (8 levels) and Dimension (5 levels) as within-subject factors. The analysis revealed a significant main effect of Dimension, F(4,16) = 3.46, p = 0.032, indicating that experts’ ratings differed across perceptual dimensions. The main effect of Condition alone was not significant, F(7,28) = 1.51, p = 0.205. Importantly, the Condition × Dimension interaction was significant, F(28,112) = 3.62, p < 0.001, suggesting that the influence of design interventions varied depending on the evaluation dimension. Follow-up one-way ANOVAs for each dimension revealed significant effects of Condition for Naturalness, Transparency, and Material, but not for Color or Complexity. These results indicate that design strategies strongly influenced perceptions of restorative qualities, materiality, and transparency, whereas evaluations of color and complexity were more divergent. Given the small expert sample (n = 5), these inferential results should be interpreted as exploratory. Taken together, the normality checks and the repeated-measures ANOVA results provide a consistent basis for interpreting expert evaluations. Table 12 reports the per-dimension one-way repeated-measures ANOVA statistics and exact p-values, indicating which dimensions exhibited significant condition effects.

Table 12.

Results of Repeated Measures ANOVAs for Expert Evaluations Across Five Dimensions.

Dimension	F (7,28)	p-value	Significance
Material	2.63	0.032	*
Transparency	5.75	<0.001	**
Color	1.12	0.377	n.s.
Complexity	0.72	0.659	n.s.
Naturalness	7.39	<0.001	**

Note: n.s. = not significant; * p < 0.05, ** p < 0.001. Tests are based on one-way repeated measures ANOVAs with Condition (8 levels) as the within-subject factor for each dimension (n = 5 experts).
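The degrees of freedom reported for these tests follow from the fully within-subject design. As a brief check, the Python sketch below derives them from the number of factor levels and experts, assuming no sphericity correction (such as Greenhouse–Geisser) was applied; the function names are illustrative.

```python
from typing import Tuple


def within_effect_df(levels: int, n_subjects: int) -> Tuple[int, int]:
    """(effect df, error df) for one within-subject factor."""
    effect = levels - 1
    return effect, effect * (n_subjects - 1)


def within_interaction_df(levels_a: int, levels_b: int, n_subjects: int) -> Tuple[int, int]:
    """(effect df, error df) for the interaction of two within-subject factors."""
    effect = (levels_a - 1) * (levels_b - 1)
    return effect, effect * (n_subjects - 1)


N_EXPERTS = 5
print("Condition:", within_effect_df(8, N_EXPERTS))
print("Dimension:", within_effect_df(5, N_EXPERTS))
print("Condition x Dimension:", within_interaction_df(8, 5, N_EXPERTS))
```

The results, (7, 28), (4, 16), and (28, 112), match the F(7,28), F(4,16), and F(28,112) values reported in the text and in Table 12.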

Figure 4 provides a radar chart visualization of the expert ratings, offering a comparative perspective across all eight design strategies. The chart highlights the distinctive perceptual impacts of each visual control. For instance, the Color-only image (Image 2) shows a notable improvement in the Color dimension, while exhibiting minimal influence on Natural Features. The baseline image excels in Transparency but demonstrates significantly lower performance in nature-related dimensions.

Figure 4.

Expert Evaluation Trend Across Visual Strategies.

When considered together with the LPIPS results reported in Section 4.2, clear patterns emerge. The Material-only condition showed the smallest deviation and provided little improvement in wellness-related qualities, indicating that minimal change rarely enhances restorative attributes. In contrast, the Color + Material + Pattern condition produced the largest deviation but was not rated most effective by experts, suggesting that excessive transformation may undermine coherence and risk visual overload. The Material + Pattern condition lay between these extremes: it showed a moderate LPIPS score yet received the highest ratings for naturalness and material quality, demonstrating that balanced modifications can enhance restorative potential while preserving legibility.

These correspondences translate into actionable guidance for façade practice.

Prioritize material–pattern synergies to introduce biophilic cues with controlled complexity. Designers should adjust the scale and density of patterns to enrich restorative qualities while avoiding the risk of visual overload.

Use color strategically to convey warmth and harmony. Color interventions should ideally be combined with authentic materials or restrained patterning, since color alone can produce noticeable visual change but contributes little to restorative potential.

Protect transparency and legibility by placing interventions in secondary façade zones, such as signage bands or frames, or by applying partial or fritted treatments instead of large opaque overlays.

Adopt balanced interventions that create perceptible distinction from the baseline while maintaining contextual coherence. In this study, the Material + Pattern condition exemplified such balance and achieved the highest ratings in wellness-related attributes.

Taken together, these findings indicate that WELL-oriented visual prompt strategies can substantially shape expert perception of AI-generated façade designs. They also suggest that the most effective façade interventions do not necessarily involve maximal visual richness but instead require a careful balance between biophilic attributes and perceptual simplicity. Finally, the results support the use of structured environmental cues as a basis for prompt engineering, reinforcing the potential of an EBD approach in guiding visually and emotionally responsive generative outputs.

Qualitative Insights from Expert Feedback

In addition to the quantitative Likert-scale evaluations, the open-ended responses from experts provided valuable qualitative insights into the perceived strengths and limitations of the AI-generated façade designs. Thematic analysis was used to extract recurring patterns and opinions from the responses (Braun & Clarke, 2006). Three key themes emerged:

Theme 1: Appreciation for Natural Features

Several experts emphasized the positive impact of natural elements, using expressions such as “green textures,” “natural materials,” and “biophilic imagery.” One expert remarked that “adding biophilic features makes the space feel more harmonious and welcoming,” affirming the effectiveness of AI-generated designs that integrate nature-based elements in enhancing visual appeal and user comfort.

Theme 2: Caution Toward Overuse of Patterns

While the use of patterns contributed to visual richness, some experts expressed concerns about visual overload. One noted, “The pattern in [Image 6] feels too intense, which may lead to sensory fatigue if applied extensively.” This suggests that pattern dimensions, especially when combined with material modifications, require careful modulation to avoid disrupting visual balance.

Theme 3: Feedback on Semantic Clarity and Realism

Several experts pointed out inconsistencies or ambiguities in the AI-generated outputs. Phrases such as “some images lack realistic material textures” and “the design intent is unclear in certain façades” were mentioned. These comments highlight current limitations in AI's ability to fully capture architectural detail and intent, underscoring the need for continued refinement of prompt strategies and post-processing techniques.

Taken together, the open-ended responses deepen our understanding of expert perceptions and provide qualitative validation for the visual strategies implemented in the study. These insights complement the quantitative findings and support the iterative development of WELL-aligned, biophilic façades using generative AI tools.

Discussion

Key Findings

This study proposes a structured AI-assisted design workflow for generating façade designs of automated retail stores by integrating WELL Building Standard principles and biophilic design principles. Through the identification and application of three core visual dimensions—Color, Material, and Pattern—the study operationalizes abstract wellness-oriented concepts into controllable, generative visual features.

The image generation process employed Stable Diffusion XL (SDXL) with ControlNet for localized modification of real-world façade images, enabling precise integration of health-related design cues. The evaluation framework included a 7-point Likert scale applied to five perceptual dimensions—material, transparency, color, design complexity, and natural features—assessed by a panel of design experts.

In addition to quantitative scores, qualitative feedback was collected through open-ended expert responses. Using thematic analysis, the study extracted deeper insights into expert perceptions, emotional impressions, and practical concerns, enriching the understanding of AI-generated design effects.

The analysis revealed the following key findings.

Color emerged as the most semantically aligned and visually effective dimension. AI-generated outputs focusing on color adjustment consistently received higher scores for aesthetic warmth and environmental harmony, indicating that color is a powerful medium for conveying WELL-aligned design intent.

Material-focused prompts introduced elements like wood and other nature-resembling textures, which enhanced perceptions of naturalness and tactile warmth. However, their visual integration with the original architectural context varied, suggesting that material changes require careful calibration to avoid inconsistencies or visual detachment.

Pattern, which included biophilic motifs such as leaves and waves, introduced the greatest variability across generated images. While this dimension enhanced the natural feature ratings, it was more sensitive to over-decoration and sometimes introduced visual clutter, requiring balanced deployment to preserve coherence.

The combined strategy of Color + Material + Pattern did not yield the highest scores in any single dimension but performed consistently across all, suggesting a trade-off between expressive richness and perceptual clarity. By contrast, the Material + Pattern strategy achieved the highest scores in both the Material (M = 1.80, SD = 0.837) and Natural Feature (M = 2.00, SD = 0.707) dimensions, demonstrating its effectiveness in enhancing biophilic and health-related qualities without overloading visual complexity.

The baseline (original) image scored relatively high on Transparency but received low ratings in Natural Features and Material, reinforcing the research rationale: the absence of biophilic and WELL-aligned cues in existing façade designs underscores the need for design intervention.

Importantly, qualitative expert insights validated these findings. Experts praised the incorporation of natural elements, while also cautioning against visual overload due to excessive patterning. Some also pointed out semantic ambiguity and unrealistic rendering in certain images, highlighting areas for prompt refinement and post-processing improvement.

Overall, these findings support the feasibility of using AI prompt engineering to infuse WELL and biophilic principles into façade design, while also highlighting the perceptual impact of individual and combined visual dimensions. The results validate the effectiveness of a layered prompt structure and expert-in-the-loop evaluation—both quantitative and qualitative—as a method for refining and guiding AI-generated architectural solutions.

Implications

This study offers both theoretical and practical implications. From a theoretical perspective, this study extends the theory of EBD, originally proposed by Zeng and Cheng (1991), by applying it to the architectural scale and advancing its implementation through the integration of AI-based generative design tools. EBD emphasizes a recursive design reasoning process grounded in environmental understanding. This study operationalizes it through three iterative stages: visual variable extraction, generative prompt construction, and expert-based perceptual evaluation. However, this process remains conceptually recursive rather than algorithmically recursive, since the generative loop is not yet technically closed or automated. In doing so, the research not only reaffirms the adaptability of EBD across design domains but also demonstrates its compatibility with emerging computational workflows in the context of wellness-oriented façade design.

Building upon this theoretical foundation, the study links computational design workflows to architectural certification standards through a mapping mechanism that translates WELL principles into three visual design dimensions: Color, Material, and Pattern. This structured approach operationalizes wellness-oriented strategies, such as biophilic connections and material health, into prompt-level design interventions that can be visually expressed and evaluated. Our previous study (Yun & Kim, 2025) explored biophilic façade strategies in the context of urban infrastructure, using eye-tracking and subjective evaluation methods. While that research focused on energy-related public facilities, this study extends the scope to automated retail environments and introduces the WELL Building Standard as a structured evaluation framework for wellness-driven design. This expansion allows for a more standardized and certifiable approach to façade enhancement in commercial settings, bridging user-centered perception studies with AI-assisted generative workflows.

Taken together, aligning EBD with WELL establishes a dual framework in which environmental reasoning and health-oriented benchmarks jointly guide the three visual design variables. This integration clarifies how environmental cues can be systematically translated into perceptual prompts for façade generation, thereby reinforcing methodological validity and enhancing the framework's adaptability to broader architectural contexts. Nevertheless, as EBD validation often benefits from linking environmental attributes to measurable perceptual and physiological responses, future integration of behavioral and physiological outcome metrics will be essential to further substantiate the empirical grounding of this framework.

In practice, the proposed AI-assisted image generation process—based on Stable Diffusion XL and ControlNet—offers a flexible and low-cost tool for early-phase façade ideation. It enables designers to visualize directional design alternatives that reflect WELL-aligned features without relying on finalized design proposals. The use of img2img local redraws helps retain the original architectural context while selectively enhancing façade components, making the process especially suitable for adaptive renovation or incremental design tasks. These capabilities are particularly relevant in the context of rapidly expanding uncrewed retail spaces, where spatial appeal, user comfort, and psychological engagement are critical to attracting foot traffic and supporting user well-being.

In addition to the quantitative evaluations, qualitative feedback from experts enriched the interpretation of perceptual outcomes. Experts frequently emphasized the importance of material authenticity, natural detail subtlety, and façade-context harmony when assessing the images. Comments also pointed to issues such as excessive visual complexity and lack of realism in certain AI-generated outputs. These insights reinforced the strengths observed in the color and material dimensions while highlighting the need for refined prompt strategies that balance innovation with environmental coherence. The inclusion of such expert perspectives demonstrates the value of integrating both quantitative and qualitative judgment into the iterative AI design process.

Limitations and Suggestions for Future Research

This study presents a preliminary exploration into applying generative AI tools—guided by selected visual aspects of the WELL Building Standard—for the façade design of automated retail stores. While the findings suggest promising potential, several methodological and contextual limitations were identified during the research process, pointing to future directions for refinement and expansion.

First, the current design strategies did not adopt the full scope of the WELL Building Standard. Instead, this study selectively focused on visual aspects of color, material, and pattern that are particularly relevant to façade appearance. While these dimensions were operationalized through EBD reasoning and WELL alignment, the framework has not yet encompassed other health-related variables. Moreover, given the high transparency and spatial openness required by automated retail formats, only small, localized regions of the original façade were modified. These constraints inevitably limit the transformative impact of the redesign. Future studies should extend the mapping to additional WELL-related dimensions such as lighting ergonomics or spatial proportions, thereby enhancing theoretical rigor and broadening the applicability of the framework. This extension could work in parallel with the immersive and multimodal evaluation strategies outlined later in this section, ensuring that theoretical mapping and empirical validation advance together.

Second, the image generation process produced only seven façade variations corresponding to distinct prompt control strategies. Although these conditions were sufficient to demonstrate visual differentiation across design dimensions, the limited sample size restricts the generalizability and richness of findings. Expanding the visual dataset to include more diverse prompt combinations, structural variations, and context-specific adaptations could strengthen the design repertoire and support more nuanced analysis.

Third, although this study frames its workflow as a recursive reasoning process, the implementation remains linear and expert-driven. Future work should incorporate algorithmic recursion by integrating user feedback into automated prompt regeneration, potentially through reinforcement learning, LLM-based prompt tuning, or human-in-the-loop optimization. Such developments would more rigorously operationalize the recursive principles that lie at the core of EBD and computational design.
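One simple form such a closed loop could take is sketched below in Python: emphasis weights are nudged upward when the mean expert rating of the linked dimension falls below a target. This is a hypothetical heuristic offered only to make the idea concrete. The function name, the term-to-dimension mapping, the target, the step size, the weight cap, and the example ratings are all assumptions and were not part of this study.

```python
from typing import Dict


def update_emphasis(
    weights: Dict[str, float],
    term_dimension: Dict[str, str],
    mean_ratings: Dict[str, float],
    target: float = 1.0,
    step: float = 0.1,
    max_weight: float = 1.5,
) -> Dict[str, float]:
    """Raise the weight of each phrase whose linked dimension is rated below target."""
    updated = {}
    for term, weight in weights.items():
        rating = mean_ratings[term_dimension[term]]
        if rating < target:
            # Strengthen the cue, but never beyond the assumed cap.
            weight = min(max_weight, weight + step)
        updated[term] = round(weight, 2)
    return updated


# Hypothetical round of feedback (illustrative values only).
weights = {"wood wall": 1.2, "organic pattern": 1.3}
term_dimension = {"wood wall": "Material", "organic pattern": "Natural Features"}
mean_ratings = {"Material": 1.8, "Natural Features": 0.4}
print(update_emphasis(weights, term_dimension, mean_ratings))
```

A fuller implementation would also need a rule for lowering weights when complexity ratings or expert comments indicate visual overload, and a stopping criterion for the iteration.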

Fourth, the expert evaluation primarily drew on the expertise of five professionals in the fields of architecture and wellness design. Their insights were constructive and aligned with the study's objectives. Nevertheless, the relatively small panel size (n = 5) may introduce bias and limit the generalizability of the findings. Future research could expand the panel to include a larger and more diverse set of experts, and where appropriate, complement expert judgment with input from actual users to capture a wider range of perceptual and emotional responses. Moreover, experts noted that certain combinations, such as Material + Pattern, while highly rated, may approach the threshold of “visual overload.” This suggests the need for future studies to investigate the complexity threshold at which biophilic enrichment transitions from being perceived as positive to becoming visually overstimulating. Identifying such a balance point would provide valuable guidance for calibrating façade design strategies to maximize restorative qualities while avoiding cognitive strain.

Fifth, although this study included open-ended comments in addition to quantitative ratings, applying more robust qualitative methods such as semi-structured interviews or evaluation workshops may reveal deeper insights into participants’ aesthetic judgments and design preferences. These methods are particularly helpful in interpreting subtle or ambiguous visual outcomes.

Sixth, the use of static images limits the ability to assess real-world user engagement. Immersive scenarios can be generated through virtual reality (VR) or augmented reality (AR) simulations. When combined with physiological and behavioral measures such as eye-tracking, galvanic skin response, and facial expression analysis, these methods offer a more comprehensive understanding of spatial perception and emotional responses. Prior research has shown that immersive eye-tracking can effectively capture attention patterns in virtual environments (Kim, 2024), indicating promising applications for future architectural design studies. Future research should situate these psychophysiological methods more explicitly within existing human–computer interaction and architectural psychology paradigms, so that measures of attention, cognitive load, and affective state can be directly interpreted as indicators of wellness-oriented design impact. In practice, insights from eye-tracking, electrodermal activity, and heart rate variability can be mapped onto higher-level perceptual constructs such as comfort, complexity, and restorative potential. These constructs can then be employed as feedback to refine prompt-generation strategies within recursive algorithmic processes.

Seventh, although LPIPS was employed as a perceptual similarity metric to quantify deviations between the generated façades and the original baseline, it has several limitations. This study implemented the metric with the VGG backbone, which is commonly used in computer vision tasks for its sensitivity to feature-level perceptual differences. However, while LPIPS effectively measures image divergence at the visual level, it does not directly indicate architectural quality, semantic fidelity, or functional appropriateness. A façade design with a higher LPIPS score may simply represent a more visually distinct alternative, rather than a failure in design logic. Future work should therefore complement LPIPS with additional evaluation metrics, such as semantic segmentation accuracy, realism ratings, or functional alignment measures, to capture both perceptual change and architectural validity.
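As one illustration of a complementary metric, the sketch below computes mean intersection-over-union (IoU) between the semantic label maps of an original and a generated façade, which indicates whether elements such as windows and doors stayed in place. This metric was not used in the study. The function `mean_iou`, the label names, and the tiny 2 × 3 label maps are hypothetical; real label maps would come from a segmentation model applied to each image.

```python
def mean_iou(reference, generated, labels):
    """Mean intersection-over-union between two equally sized label maps."""
    scores = []
    for label in labels:
        inter = 0
        union = 0
        for ref_row, gen_row in zip(reference, generated):
            for ref_px, gen_px in zip(ref_row, gen_row):
                in_ref = ref_px == label
                in_gen = gen_px == label
                inter += int(in_ref and in_gen)
                union += int(in_ref or in_gen)
        if union:  # ignore labels that appear in neither map
            scores.append(inter / union)
    return sum(scores) / len(scores) if scores else 0.0


# Toy label maps (placeholders, not study data).
original = [
    ["wall", "window", "wall"],
    ["wall", "door", "wall"],
]
variant = [
    ["wall", "window", "window"],
    ["wall", "door", "wall"],
]

score = mean_iou(original, variant, ["wall", "window", "door"])
print(round(score, 2))
```

Read alongside LPIPS, a variant with a high LPIPS score but a high mean IoU would be visually distinct while keeping the façade layout intact, whereas a low mean IoU would flag structural drift that LPIPS alone cannot distinguish from intended stylistic change.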

Lastly, while this study focused on three visual dimensions, future research could extend to additional WELL-related visual features such as form, shape articulation, light quality, façade transparency, signage clarity, or natural view access, depending on the architectural context. At the same time, the methodology relied exclusively on Stable Diffusion XL (SDXL) with fixed parameters, which ensured internal consistency but also limited the exploration of potential variability across other generative models. Since prompt interpretation, controllability, and visual consistency may differ between architectures such as SD 1.5, SD 2.1, or fine-tuned variants, expanding future investigations to multiple models and parameter settings would enhance the robustness and generalizability of the findings. Such methodological diversification would further improve the adaptability and practical impact of prompt-based AI tools across various façade typologies. Future research should also reflect more deeply on the balance between design controllability and AI creativity. While WELL-based prompts ensure semantic rigor and alignment with certification standards, excessive control may suppress generative diversity. Exploring hybrid strategies that safeguard both semantic integrity and creative exploration will add conceptual depth to the emerging discourse on AI-assisted design.

In summary, although the proposed methodology demonstrates structured integration of WELL-inspired design prompts with generative AI tools, further development is necessary. Key next steps are expanding visual variables, diversifying evaluation perspectives, and leveraging immersive and physiological assessment methods to build a more comprehensive and adaptive framework for AI-assisted façade design in automated retail environments.

Conclusion

This study proposes and validates a generative AI-assisted design methodology that translates health-oriented architectural standards into tangible visual strategies for façade design in automated retail environments. Grounded in the EBD framework, the research integrates principles from the WELL Building Standard and biophilic design to establish an iterative workflow that links environmental intention, prompt-based image generation, and expert visual evaluation.

Focusing on the three key visual dimensions of Color, Material, and Pattern, the study constructs structured prompt strategies that guide AI-generated designs toward wellness-aligned and visually engaging outcomes. The generation process employs Stable Diffusion XL and ControlNet, enabling localized modifications of real façade images while preserving architectural context and spatial continuity.

Expert evaluations validate the effectiveness of this approach. Among the AI-generated variations, the Material + Pattern strategy achieved particularly high ratings in perceptual qualities such as Natural Features and Façade Material, illustrating the visual value of incorporating natural textures and biomorphic patterns. In contrast, the original design, while performing well in Transparency, lacked biophilic attributes, reinforcing the importance of health-driven visual integration. Qualitative feedback from experts contextualized these results. Despite praising naturalistic materials for enhancing warmth and approachability, some participants cautioned against excessive visual complexity. These insights reiterate the importance of visual balance in applying wellness-oriented design elements.

Overall, this research demonstrates a structured and scalable pathway for enhancing the visual and psychological qualities of automated retail façades through generative AI. Future work should broaden the design scope by incorporating additional WELL-related dimensions such as form, signage clarity, light quality, and transparency, and it should also extend toward multisensory experiences in real-world settings. Furthermore, evaluation methods need to be advanced through multimodal user experience assessments, including neuroimaging, eye-tracking, electrodermal activity, and other physiological measures. These extensions will help ensure that AI-generated designs are not only aligned with expert aesthetic standards but also resonate with users across diverse spatial and cultural contexts. Beyond its empirical findings, the study contributes by articulating a methodological framework that operationalizes health-oriented design principles through generative AI, offering a replicable process for both design research and practice. Although demonstrated in the context of automated retail, the proposed framework holds potential for broader application in workplaces, learning environments, healthcare facilities, and public spaces, where visual quality and user well-being remain central to architectural performance.

Footnotes

Acknowledgments

The authors would like to express their sincere gratitude to Dr. Yong Zeng for his valuable advice on the application of Environment-Based Design (EBD) theory. The authors also thank the anonymous reviewers for their insightful comments and constructive suggestions, which have significantly improved the quality of this paper.

ORCID iDs

Jie Yun

Nayeon Kim

Funding

This work was supported by the Ministry of Education of the Republic of Korea and the National Research Foundation of Korea (NRF) grant (NRF-2025S1A5A8007949), and by the Korea government (Ministry of Science and ICT) (RS-2025-23523874).

Declaration of Conflicting Interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Data Availability Statement

The data supporting the findings of this study are available upon request.

References

1. Ali, A. K., & Lee, O. J. (2023). Façade style mixing using artificial intelligence for urban infill. Architecture, 3(2), 258–269. https://doi.org/10.3390/architecture3020015

2. Braun & Clarke (2006). Using thematic analysis in psychology. Qualitative Research in Psychology, 3(2), 77–101. https://doi.org/10.1191/1478088706qp063oa

3. Browning, W. D., Ryan, C. O., & Clancy, J. O. (2014). 14 patterns of biophilic design: Improving health and well-being in the built environment. Terrapin Bright Green LLC. https://www.terrapinbrightgreen.com/reports/14-patterns/

4. Cao, Abdul Aziz, & Mohd Arshard, W. N. R. (2025). Stable diffusion in architectural design: Closing doors or opening new horizons? International Journal of Architectural Computing, 23(2), 339–357. https://doi.org/10.1177/14780771241270257

5. International WELL Building Institute. (2020). The WELL Building Standard v2. International WELL Building Institute. https://www.wellcertified.com/certification/v2/

6. Lee, J. K., Lee, Y. C., & Choo (2024). Generative artificial intelligence and building design: Early photorealistic render visualization of façades using local identity-trained models. Journal of Computational Design and Engineering, 11(2), 85–105. https://doi.org/10.1093/jcde/qwae017

7. Jun & Jia, W. Z. (2025). A novel architectural design paradigm: Collaborative construction with design thinking semantic networks and large language models.

8. Jung, Kim, D. I., & Kim (2023). Bringing nature into hospital architecture: Machine learning-based EEG analysis of the biophilia effect in virtual reality. Journal of Environmental Psychology, 89, 102033. https://doi.org/10.1016/j.jenvp.2023.102033

9. Kellert, S. R. (2008). Dimensions, elements, and attributes of biophilic design. In Biophilic design: The theory, science, and practice of bringing buildings to life (pp. 3–19).

10. Kellert, S. R., & Calabrese, E. F. (2015). The practice of biophilic design. https://www.biophilic-design.com

11. Kim (2022). Quantifying emotions in architectural environments using biometrics. Applied Sciences, 12(19), 9998. https://doi.org/10.3390/app12199998

12. Kim, J. Y., & Park, S. J. (2025). AI-driven biophilic façade design for senior multi-family housing using LoRA and stable diffusion. Buildings, 15(9), 1546. https://doi.org/10.3390/buildings15091546

13. Kim (2024). Capturing initial gaze attraction in branded spaces through VR eye-tracking technology. International Journal of Human–Computer Interaction, 41(7), 4392–4405. https://doi.org/10.1080/10447318.2024.2463832

14. Kim & Gero, J. S. (2022, July). Neurophysiological responses to biophilic design: A pilot experiment using VR and EEG. In International Conference on Design Computing and Cognition (pp. 235–253). Springer International Publishing.

15. Kim & Lee (2021). Assessing consumer attention and arousal using eye-tracking technology in virtual retail environment. Frontiers in Psychology, 12, 665658. https://doi.org/10.3389/fpsyg.2021.665658

16. Kokatnur, Faris, Gunay, O’Brien, & Azar (2025). The WELL Building Standard: A literature review and bibliometric analysis of a nascent field. Journal of Building Engineering, 103, 112121. https://doi.org/10.1016/j.jobe.2025.112121

17. Küller, Ballal, Laike, Mikellides, & Tonello (2006). The impact of light and colour on psychological mood: A cross-cultural study of indoor work environments. Ergonomics, 49(14), 1496–1507. https://doi.org/10.1080/00140130600858142

18. Majid, Z. K. (2022). Exterior façade design and its impact on boosting business and attracting customers in retail sectors. Journal of Design, Business & Society, 8(1), 69–86. https://doi.org/10.1386/dbs_00033_1

19. Marberry, S. O., Guenther, & Berry, L. L. (2022). Advancing human health, safety, and well-being with healthy buildings. Journal of Hospital Management and Health Policy, 6, 18. https://doi.org/10.21037/jhmhp-21-63

20. Nam & Lee (2025). Consumer preferences for unmanned stores: A choice experiment study. Journal of Retailing and Consumer Services, 82, 104061. https://doi.org/10.1016/j.jretconser.2024.104061

21. Nguyen, T. A., & Zeng (2012). A theoretical model of design creativity: Nonlinear design dynamics and mental stress-creativity relation. Journal of Integrated Design and Process Science, 16(3), 65–88. https://doi.org/10.3233/jid-2012-0007

22. Petráková & Šimkovič (2023). Architectural alchemy: Leveraging artificial intelligence for inspired design – A comprehensive study of creativity, control, and collaboration. Architectural Papers of the Faculty of Architecture and Design STU, 28(4), 3–14. https://doi.org/10.2478/alfa-2023-0020

23. Podell, English, Lacey, Blattmann, Dockhorn, Müller, & Rombach (2023). SDXL: Improving latent diffusion models for high-resolution image synthesis. arXiv preprint arXiv:2307.01952.

24. Shan & Junghans (2023). Multi-objective optimization for high-performance building façade design: A systematic literature review. Sustainability, 15(21), 15596. https://doi.org/10.3390/su152115596

25. Shih (2024). The role of research funders in providing directions for managing responsible internationalization and research security. Technological Forecasting and Social Change, 201, 123253. https://doi.org/10.1016/j.techfore.2024.123253

26. Sourek. Artificial intelligence in architecture and built environment development 2024: A critical review and outlook.

27. Stöckl (2023, February). Evaluating a synthetic image dataset generated with stable diffusion. In International Congress on Information and Communication Technology (pp. 805–818). Springer Nature Singapore.

28. Sun, Zeng, & Zhou (2011). Environment-based design (EBD) approach to developing quality management systems: A case study. Journal of Integrated Design and Process Science, 15(2), 53–70.

29. Tabassum, R. R., & Park (2024). Development of a building evaluation framework for biophilic design in architecture. Buildings, 14(10), 3254. https://doi.org/10.3390/buildings14103254

30. Thampanichwat, Wongvorachan, Sirisakdi, Somngam, Petlai, Singkham, Bhutdhakomut, & Jinjantarawong (2025). The architectural language of biophilic design after architects use text-to-image AI. Buildings, 15(5), 662. https://doi.org/10.3390/buildings15050662

31. Veloso (2025). (In) forming the new building envelope: A pedagogical study in generative design with precedents and multimodal large language models. International Journal of Architectural Computing, 23(1), 96–121. https://doi.org/10.1177/14780771241254634

32. Viliunas & Grazuleviciute-Vileniske (2022). Shape-finding in biophilic architecture: Application of AI-based tool. Architecture and Urban Planning, 18(1), 68–75. https://doi.org/10.2478/aup-2022-0007

33. Wang & Zeng (2009). Asking the right questions to elicit product requirements. International Journal of Computer Integrated Manufacturing, 22(4), 283–298. https://doi.org/10.1080/09511920802232902

34. Wang, Bao, Zhou, Chen, & Yuan (2022). Semantic image synthesis via diffusion models. arXiv preprint arXiv:2207.00050.

35. Wei & Herr, C. M. Applying multimodal large language models in ecological facade design.

36. Zhang (2024). Knowledge-driven and diffusion model-based methods for generating historical building façades: A case study of traditional Minnan residences in China. Information, 15(6), 344. https://doi.org/10.3390/info15060344

37. Yang, Dou, & Zeng (2023). Environment-based design (EBD): Using only necessary knowledge for designer creativity. Proceedings of the Design Society, 3, 1675–1684. https://doi.org/10.1017/pds.2023.168

38. Yang, Quan, & Zeng (2022). Implementation barriers: A TASKS framework. Journal of Integrated Design and Process Science, 25(3-4), 134–147. https://doi.org/10.3233/JID-210011

39. Yun, Chen, Lee, & Kim (2024). Visual attention and emotional response to biophilic design in unmanned stores. Proceedings of the Korean Institute of Interior Design Conference, 26(3), 395–400.

40. Yun & Kim (2025). Biophilic building envelopes in urban infrastructure: Visual attention and subjective evaluation of façade designs for power and energy facilities. The Journal of the Korean Institute of Interior Design, 34(2), 40–54. https://doi.org/10.14774/JKIID.2025.34.2.040

41. Zeng (2002). Axiomatic theory of design modeling. Journal of Integrated Design and Process Science, 6(3), 1–28. https://doi.org/10.3233/jid-2002-6301

42. Zeng (2011, August). Environment-based design (EBD). In Proceedings of the ASME 2011 international design engineering technical conferences and computers and information in engineering conference. Volume 9: 23rd international conference on design theory and methodology; 16th design for manufacturing and the life cycle conference (Vol. 54860, pp. 237–250). https://doi.org/10.1115/DETC2011-48263

43. Zeng (2015). Environment-based design (EBD): A methodology for transdisciplinary design. Journal of Integrated Design and Process Science, 19(1), 5–24. https://doi.org/10.3233/jid-2015-0004

44. Zeng & Cheng, G. D. (1991). On the logic of design. Design Studies, 12(3), 137–141. https://doi.org/10.1016/0142-694X(91)90022-O

45. Zhang & Huang (2024). Development of a method for commercial style transfer of historical architectural façades based on stable diffusion models. Journal of Imaging, 10(7), 165. https://doi.org/10.3390/jimaging10070165