Abstract
This pedagogical study delves into integrating established and emerging computational methods into architectural education, with a specific focus on building envelope design within a B.Arch. course. Students employ parametric modeling (PM), design optimization (DO), and multimodal large language models (MLLMs) to analyze and reinterpret building envelope precedents. Parametric design and optimization are utilized to explore envelope variations based on parametric logic and performance evaluation. In the case of MLLMs, students leverage visual patterns from precedents as a form-giving construct for new 3D envelope proposals. While students adeptly integrate MLLMs into their design process, generating successful 3D models, challenges arise in control and translation across representations, leading to unclear scale and tectonics in some design proposals. Survey results reveal that students perceive MLLMs as a valuable, uncomplicated method for rapid design ideation and refinement, but challenges persist in addressing real architectural constraints. Parametric modeling is viewed as a tool for structuring design and DO is seen as a later stage for refining designs based on metrics. The study underscores the importance of evolving user interfaces for MLLMs in specific design tasks, addressing challenges in precision and design scale through prompts and guiding images. It also discusses the potential to combine MLLMs with various generative methods and modeling software during transitions between design media to support future initiatives integrating computational methods into the design process.
Introduction
The consolidation of computational design in architecture and the rise of deep learning tools offer fresh prospects and challenges for architectural education. In this context, this study assesses how computational methods can foster creative design of innovative building envelopes within a B.Arch. program. It leverages a precedent-driven pedagogical approach to integrate diverse computational methods, including multimodal large language models (MLLMs). The research employs a mixed methodology, analyzing design exercises within the pedagogical context and surveying students.
The emergence of new building envelopes
Over the past two centuries, the role and form of the building envelope have been radically revised. With the development of modern load-bearing materials, such as steel frames and reinforced concrete, it has been freed from the duty of supporting building loads. Advances in building science and material research have transformed the envelope into a performative surface composed of specialized layers and components, each serving distinct functions.1,2 Examples include curtain walls and innovative glazing materials for transparency and efficient assembly, metal cladding and plastic membranes for watertightness, synthetic materials for insulation, rainscreens to prevent pressure-driven moisture infiltration, integrated photovoltaics for energy production, kinetic assemblies for adaptation, and media facades for communication. In sum, the envelope has evolved into a specialized niche within building design, with dedicated consultants, fabricators, and contractors.
In parallel with technical advances, designers have consistently pushed the boundaries of the contemporary building envelope, moving beyond the conventional tectonics of wall and roof assemblies, codified in stylistic conventions as a classical façade. Early 20th-century modern architects explored the dichotomy between structure and envelope, leading to new aesthetics marked by transparency, abstract geometry, and functional expression. Later, post-modern architects responded to modernism’s abstraction by reimagining the envelope as a canvas for communication, blending elements from popular culture and history.3–5 In recent decades, designers have embraced unconventional geometry, bespoke patterns, and intricate tessellations, utilizing digital modeling, simulation, and fabrication to mediate environmental factors and create novel spatial and material experiences.
Contemporary architectural discourses emphasize a diagrammatic approach to design, moving away from the orderly assembly of figurative elements and compositional arrangements in favor of non-compositional aesthetic principles.6–8 Some contemporary interpretations of envelope aesthetics liken them to a “skin” due to their ability to heighten sensorial and tactile interactions or to effectively respond to environmental and contextual influences.9–11 Envelopes are perceived as mediums for external narratives, often incorporating scientific imagery and metaphors,12,13 or allowing for multiple layers of interpretations anchored in enigmatic signifiers. 5 They have also been linked to an innovative integral approach to ornamentation, which engages with the boundless scale of computational geometry to support open-ended interpretations and generate novel effects through customized architectural assemblies.14–19
Computing building envelopes in the age of AI
Over the past decades, Computer-Aided Architectural Design (CAAD) research has witnessed significant growth in building envelope-related publications, spanning topics like generative design, parametric design, and performance evaluation (see Figure 1). The integration of parametric modeling (PM), optimization, and simulation tools into CAD and BIM software has supported specialized research on pavilions and building envelopes. These tools enable rapid exploration of material and geometric envelope variations while assessing their impact on building efficiency and sustainability.
Figure 1. Left: Graph with the number of publications on building envelopes in the CumInCAD repository by year. This is based on a search for the terms “envelope”, “façade”, “facade”, “architectural skin”, and “building skin” in the summaries of the publications of the repository. Right: word cloud with the keywords of these publications.
Building envelopes have consistently been the subject of research in visual and textual programming,20–26 digital fabrication,27,28 architectural geometry,29,30 and demonstrations of digital design.31–33 Recent research also includes pedagogical initiatives such as courses in textual programming for architects focusing on combinatorial envelopes 34 and the development of adaptable frameworks and user-friendly algorithms to make computational envelope design more accessible to designers.35,36
The current landscape of generative design prominently features a strong integration of deep learning methods, typically involving the training of neural networks composed of differentiable mathematical functions to generate new design representations. 37 Ongoing deep learning research has explored the application of models to different representations, such as graphs, point clouds, and surfaces. While those applications can support important inquiry for future generative design workflows, in this study we focus on problems related to image-based methods for architectural design. Given the importance of visual media for architectural ideation, the spotlight is on MLLMs such as Midjourney, OpenAI’s DALL-E, and StabilityAI’s Stable Diffusion.
Multimodal large language models undergo extensive training using various computational mechanisms to efficiently learn visual concepts from natural language and create novel representations based on text prompts and image inputs. For example, they employ a diffusion process that predicts Gaussian noise incrementally added to training data. 38 This enables them not only to reverse the stochastic process to estimate original data but also to generate entirely new data samples. Due to the training process with vast and diverse datasets, often comprising millions or even billions of images, MLLMs seamlessly incorporate and merge visual concepts from different sources. As a result, they emerge as highly expressive and easy to use generative models that can facilitate aesthetic exploration of architectural elements, including the building envelope.
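The noising step described above can be sketched in a few lines. This is a minimal illustration of the standard denoising-diffusion formulation, not the implementation of any of the tools named in this study: the linear schedule, its start and end values, and all function names are assumptions made for the example, and production models typically operate on image latents rather than on three raw pixel values.

```python
import math
import random

def linear_beta_schedule(steps, beta_start=1e-4, beta_end=0.02):
    """Assumed noise schedule: the variance added at each diffusion step."""
    if steps == 1:
        return [beta_start]
    span = beta_end - beta_start
    return [beta_start + span * t / (steps - 1) for t in range(steps)]

def cumulative_alphas(betas):
    """Fraction of the original signal that survives after each step."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars

def add_noise(x0, alpha_bar, rng):
    """Forward process: blend clean data with Gaussian noise in one jump."""
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    signal, noise = math.sqrt(alpha_bar), math.sqrt(1.0 - alpha_bar)
    xt = [signal * x + noise * e for x, e in zip(x0, eps)]
    return xt, eps  # eps is the target a network would learn to predict

def recover_clean(xt, eps, alpha_bar):
    """Invert the blend; exact only when the true noise is known."""
    signal, noise = math.sqrt(alpha_bar), math.sqrt(1.0 - alpha_bar)
    return [(x - noise * e) / signal for x, e in zip(xt, eps)]

rng = random.Random(0)
pixels = [0.1, 0.5, 0.9]  # hypothetical grayscale values
alpha_bars = cumulative_alphas(linear_beta_schedule(1000))
noisy, target_noise = add_noise(pixels, alpha_bars[499], rng)
restored = recover_clean(noisy, target_noise, alpha_bars[499])
```

In training, a network only sees `noisy` and the step index and is asked to estimate `target_noise`; generation then starts from pure noise and applies such estimates step by step.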
This trend has given rise to a growing number of tutorials, workshops, and publications. For instance, DigitalFUTURES offered the AI Series with workshops like “An Introduction to AI for Designers,” while PAACademy 39 supported workshops such as “Taking Control: Midjourney x ControlNet,” “Spatial Effects with Midjourney,” “Midjourney Design,” and “Midjourney Architecture.” Computer-Aided Architectural Design conferences have also hosted related workshops, like “Diffusion: Architecture, Artificial Intelligence & Synthetic Imaginations” at ACADIA, 40 “Architectural Intelligence: Multimodal Machine Learning Applications in Design” at SIGraDi, 41 and “Neural 3D Synthesis” at CAADRIA. 42
Recent CAAD research has begun to investigate the potential of MLLMs in architectural design and education, particularly focusing on enhancing design creativity during the early stages of the design process. It is important to acknowledge that some of the research cited below was published during the development and revision of this article, which prevented the integration of its metrics and findings into our own research method.
Koh 43 investigated the use of GANs (CycleGAN, VQGAN-CLIP) and MLLMs (DALL-E 2, Midjourney and DreamFusion) to generate architectural images inspired by food references. The author analyzes and compares the images produced by the MLLMs considering variations in the usage of key prompt elements: architectural terms, architectural styles, architects, colors, and forms.
Guida 44 developed and described three text-to-image-to-3D workflows that rely on MLLMs (Stable Diffusion or DALL-E 2) to create a 3D proof-of-concept for the MAXXI Grande Extension competition in Rome. The author discusses the combination of different modeling processes together in the workflows and how their varied degrees of design agency can increase feedback in the design process and support subjective interpretation and curation.
Turchi et al. 45 investigated the impact of Stable Diffusion and DALL-E 2 on design creativity through a 4-hour workshop where participants responded to the design brief for a sensory-rich water repository in a park. The study involved collecting and analyzing participants’ opinions on the models’ influence on their creative process. The survey, featuring open-ended and five-point scale questions, revealed perceived creative value in swiftly producing diverse and unpredictable images. However, participants highlighted limitations in model explainability and expressed challenges in replicating specific design ideas due to limited control over results.
Paananen et al. 46 conducted a laboratory study with 17 architecture students to understand how MLLMs support divergent creativity and ideation. Each session focused on experimenting with one generative AI tool – Midjourney, DALL-E, or Stable Diffusion – to generate visual concepts for a regional cultural center’s floorplans, façade material samples, and indoor views. Data collected through standardized questionnaires, semi-structured group interviews, and participant comments revealed that MLLMs facilitated a “flow-state creative experience” and had a strong influence on the design process. However, challenges arose in generating technical images, such as footprints and façade materials, and some participants missed having constraints in the system.
Dortheimer et al. 47 explored the potential of MLLMs in architectural design through a workshop involving 25 architecture students. The students, in their 3rd to 5th year, used MLLMs like Midjourney, DALL-E, Stable Diffusion, and Lexica to design a two-story multicultural community building. Data from presentation slides and screen recordings of usage of MLLMs and CAAD software underwent quantitative and qualitative analysis. The authors concluded that MLLMs are most beneficial during ideation, leveraging computational creativity to explore new aesthetic possibilities. However, they observed students limiting creative efforts and being less critical when MLLMs produced more artistic images, a phenomenon termed “design fixation.” In later design phases, MLLMs were deemed less useful due to resulting images not aligning with ongoing design or deviating from the engineering logic of a building representation.
Pedagogical research method
While the B.Arch. students at the Fay Jones School of Architecture and Design learn about building envelopes in later design studios and in building science courses, there are currently no courses on computational design in the core curriculum. Thus, students typically depend on studio instructors’ desk critiques and on online tutorials to learn and apply computational techniques, while they are involved in a time-consuming process of designing a site-specific project. To bridge the gap between undergraduate education and CAAD and assess the role of computation in the curriculum, we offered the elective (In)Forming the New Building Envelope in the Spring of 2023 to a group of seven students. Tailored for B.Arch. students from the third to the fifth year, the course explores the integration of computational design methods into architectural building envelope design to support students’ future design initiatives.
It comprises five modules, each lasting about 3 weeks:
1. Diagramming (Di)
2. Parametric modeling (PD)
3. Design optimization (DO)
4. Generative learning with MLLMs
5. Exhibition (Ex)
The initial diagramming module begins with an introduction to architectural theory and contemporary exploration of building envelopes. It culminates with the selection, understanding, and analysis of three existing building envelopes per student. The students classified the selected envelopes using categories from a subset of the proposed readings—depth, material, and affect by Moussavi, 15 and assemblage types by Zaera-Polo & Anderson 2 —and explained their geometric logic using diagrams.
In contrast to prior studies, which focus solely on MLLMs, this course integrates instructional lectures on various computational methods with hands-on exercises that result in 3D proposals. So, the three subsequent modules delve into PM, DO, and generative learning (utilizing MLLMs), selected as a subset from the eight generative modeling categories outlined by Veloso and Krishnamurti. 37 The criteria for selection were their direct applicability to building envelopes and accessibility for students without a programming background. The motivation is to explore their potential for design generation and to cultivate critical thinking about their differences in terms of transparency, control, and expressiveness (see Figure 2).
Figure 2. Elements of generative design addressed in the course with their respective opacity based on existing taxonomy. 37 (A) parametric modeling and design optimization as glass box methods; (B) generative learning with MLLMs as a black box method.
In each module, students delve into one of these methods, reinterpreting the formal logic and geometry of selected precedents for design derivations. This approach, combining computational instructions and hands-on exercises, aligns with established practices in workshops and courses dedicated to coding and generative design.20,34,48–50 Rooted in experiential learning principles, this methodology emphasizes the pivotal role of students’ active engagement in practical experiences as a crucial pedagogical element. 51 It also resonates with Oxman’s digital design theory for architectural education, highlighting the significance of reconceptualizing design as a hands-on, exploratory, and research-based activity where various computational methods are integral to the design process. 52 Additionally, the use of precedents introduces a lexicon of successful forms tailored to specific problems and provides a qualified starting point for design ideation. Therefore, they alleviate the cognitive load of design exercises, allowing students to focus on the relationship between computational methods and envelope generation in short pedagogical modules.
The final module included a presentation in the school’s gallery and the compilation of the course portfolio. Both incentivized students to utilize various resources, including digital fabrication and augmented reality, to document and display the semester’s design results to a broader audience. Additionally, to elucidate the pedagogical structure for both students and the audience, students created a map illustrating relationships between selected precedents and the derivations produced in various exercises.
Based on this course structure, we use a mixed-methods approach to analyze the opportunities and challenges in using MLLMs in architectural education. We use the maps produced in Module 5 as an initial overview of semester projects, then analyze the design proposals generated with MLLMs in Module 4. In addition, we administered an online survey after the semester’s end to evaluate students’ perspectives and attitudes toward the pedagogical approach, the capacity and relevance of different generative design methods, and their incorporation into a B.Arch. curriculum.
Results
Selecting precedents
Projects selected by seven students for the different assignments.

Location and year of the projects selected by the students.
Parametric modeling and optimization
In the PM assignment, students interpreted the internal logic of the envelope of two of the precedents. They then proposed five distinct parametric variations utilizing techniques such as panelization, component customization, morphing, attractors, image sampling, aggregations, etc. This process was facilitated through the integration of add-ons such as Lunchbox, PanelingTools, and Weaverbird. The objective was to introduce PM and underscore how it allows designers to operate on an explicit design space that supports the creative exploration of designs based on parameters.
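To make the idea of an explicit, parameter-driven design space concrete, the following sketch shows one of the listed techniques, a point attractor, in plain Python. It is an illustration only and does not reproduce the students’ Grasshopper definitions: the grid size, the scale range, and the function name `attractor_scales` are hypothetical.

```python
import math

def attractor_scales(columns, rows, attractor, min_scale=0.2, max_scale=0.9):
    """Return one opening scale per panel, growing with distance from the attractor."""
    # Panel centers on a unit grid, listed row by row.
    centers = [(u + 0.5, v + 0.5) for v in range(rows) for u in range(columns)]
    distances = [math.dist(center, attractor) for center in centers]
    farthest = max(distances) or 1.0  # avoid dividing by zero on a single panel
    return [min_scale + (max_scale - min_scale) * d / farthest for d in distances]

# Two variations of the same 4 x 3 grid: only the attractor parameter changes.
variation_a = attractor_scales(4, 3, attractor=(0.5, 0.5))
variation_b = attractor_scales(4, 3, attractor=(3.5, 2.5))
```

Moving the attractor, or changing the scale range, yields a different envelope from the same logic, which is the sense in which the design space is explicit.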
For optimization (DO), students selected one of the projects modeled in the preceding assignment and devised eight variations of a performative envelope using optimization algorithms — one of which was also translated into a physical model. Although students had the freedom to create their own fitness functions and use different optimization algorithms, we shared fully functional workflows for the optimization of dimensions, sunlight exposure hours, and visibility of external elements using the Genetic and Simulated Annealing algorithms implemented in Galapagos. The goal was to enable students to explore innovative design expressions rooted in the interplay between custom parametric and performance spaces.
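The logic of a simulated annealing search over envelope parameters can be sketched as follows. This is a generic illustration, not the Galapagos implementation or the workflows shared in class: `exposure_proxy` is a made-up stand-in for a sun-hours simulation, and the parameter names, bounds, target value, and cooling settings are all assumptions.

```python
import math
import random

def exposure_proxy(depth, angle_deg):
    """Made-up stand-in for a simulation: deeper, more tilted fins admit less sun."""
    return 8.0 * math.exp(-1.5 * depth) * math.cos(math.radians(angle_deg)) ** 2

def fitness(params, target_hours=3.0):
    """Cost to minimize: distance between the proxy exposure and a target."""
    depth, angle = params
    return abs(exposure_proxy(depth, angle) - target_hours)

def clamp(value, low, high):
    return max(low, min(high, value))

def anneal(start, bounds, steps=2000, temp=1.0, cooling=0.995, seed=1):
    rng = random.Random(seed)
    current, current_cost = start, fitness(start)
    best, best_cost = current, current_cost
    for _ in range(steps):
        # Perturb every parameter by a small Gaussian step, kept within bounds.
        candidate = tuple(
            clamp(v + rng.gauss(0.0, 0.05 * (hi - lo)), lo, hi)
            for v, (lo, hi) in zip(current, bounds)
        )
        cost = fitness(candidate)
        # Always accept improvements; accept worse moves with a probability
        # that shrinks as the temperature cools.
        if cost < current_cost or rng.random() < math.exp((current_cost - cost) / temp):
            current, current_cost = candidate, cost
            if cost < best_cost:
                best, best_cost = candidate, cost
        temp *= cooling
    return best, best_cost

bounds = [(0.1, 1.0), (0.0, 60.0)]  # hypothetical fin depth (m) and tilt (degrees)
start = (0.1, 0.0)
best_params, best_cost = anneal(start, bounds)
```

Swapping `fitness` for a different objective, such as a view percentage or a panel count, changes what the search rewards without changing the search itself, which mirrors the freedom students had to define their own fitness functions.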
Student 1 (Figure 4) created parametric diagrams for the components of the Chemnitz City Hall (Rudolf Weisser and Hubert Schiefelbein, 1974), the Suzhou Cultural and Arts Center (Paul Andreu and Studio505, 2006), and the dynamic envelope of the Bund Finance Center (Foster + Partners, Heatherwick Studio, 2017). In the PM assignment, the student explored variations of the Bund Finance Center, including adjustments to the distribution of the magnesium alloy tassels and the shape of its guiding curve. Regarding the Suzhou Center, the parameters for variation encompassed panel width, distribution, and the thickness of the different layers. This project was also expanded upon in the DO assignment, incorporating different objectives such as maximizing or minimizing sun exposure within the interior, maximizing the percentage of views, and optimizing geometric properties of the frame such as thickness and depth.
Figure 4. Example of a map by student 1 (L. Butler) showing the relationship between the precedents and the different design exercises.
Student 2 (Figure 5) investigated the use of curve attractors to generate the brick facade of the Revolving Bricks Office Building (A. P. Pars Architects & Associates, 2015), the distribution of random lines to form the web-like pattern of the Italy Pavilion (Nemesi, 2015), and the tessellation of the façade proposal for the Pushkinsky Cinema (Synthesis Design + Architecture, 2011). For the PM assignment, the student developed parametric variations of the first two projects, incorporating a building mass and adding more details to the envelope components. In the DO assignment, the student used a genetic algorithm to investigate how variations in brick dimensions, distribution, and angles impacted the minimization or maximization of direct sun-hours within the interior. The maximization of the views to the exterior and the maximization or minimization of the number of panels were also used as complementary objectives.
Figure 5. Example of a map by student 2 (S. Park) showing the relationship between the precedents and the different design exercises.
Student 3 (Figure 6) delved into the grid transformation and the custom panelization of the Sejong M-Bridge (Morphosis, 2019), the modeling and regular distribution of the envelope components of the Kolon One (Morphosis, 2018), and the varied textures defined by different brick components for the Gallery House (Abin Design Studio, 2020). In the PM assignment, the student created variations of the Sejong M-Bridge based on varied grid configuration, on non-uniform scaling, and on mirroring. The student also explored Gallery House variations based on the distribution of the different brick patterns. For the DO assignment, the student investigated how changes in panel depth, aperture size, and orientation within the Sejong M-Bridge project, while maintaining a fixed percentage of flat panels, yield different interior sun hour distributions.
Figure 6. Example of a map by student 3 (S. Cutlip) showing the relationship between the precedents and the different design exercises.
Building envelopes patterns and metaphors
The fourth assignment (AI) is a multi-step pedagogical exercise exploring the potential of MLLMs in contemporary architectural design.
Initially, students interpreted architectural precedents using Charles Jencks's metaphorical analyses of post-modern and contemporary buildings as references. 5 They created matrices with 18 hand-drawn diagrams paired with textual prompts to capture visual aspects of architectural precedents. Then, they employed MLLMs like DALL-E 2, Midjourney, and Stable Diffusion to create a matrix with 18 analogous diagrams. Stable Diffusion, our recommended choice, enables local execution with web interfaces such as the one provided by Automatic1111. 53 It stood out due to its lightweight, open-source nature, and the availability of add-ons like ControlNet, which supports incorporating conditional inputs from reference images, including edge maps, depth maps, and segmentation maps. 54
In the second phase, students explored the use of architectural signs and natural language as creative resources for designing building envelopes. Starting from their matrices, they refined a subset of diagrams and related concepts to form the basis of their envelope proposals (see Figure 7).
Figure 7. Images of the projects developed in class with MLLMs. Each row contains projects developed by a different student and is numbered for quick reference in the text.
This approach aligns with the literal analogy design heuristic, which involves “borrowing a known or found form-giving construct as a point of departure for structuring a design problem”. 3 To facilitate this process, we introduced techniques such as projection and parameter mapping, allowing students to translate concepts between images, depth maps, and geometric structures such as meshes and surfaces. These methods were complemented by the ControlNet add-on within the localized Stable Diffusion environment, enabling the generation and utilization of depth estimation maps as conditional inputs. Additionally, students had access to Grasshopper add-ons, including Ambrosinus, 55 for the seamless integration of MLLMs and other AI tools into the Grasshopper platform and SurfaceRelief 56 to create 3D relief models from images.
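The translation from a depth map to a relief surface can be illustrated with a short sketch. It is a simplified stand-in and not the algorithm used by SurfaceRelief or ControlNet: the function name `relief_mesh`, the cell size, the maximum height, and the sample depth values are hypothetical.

```python
def relief_mesh(depth_map, cell_size=1.0, max_height=0.5):
    """Turn a grayscale depth map (0-255) into grid vertices and quad faces."""
    rows, cols = len(depth_map), len(depth_map[0])
    # One vertex per pixel; brightness is mapped linearly to relief height.
    vertices = [
        (c * cell_size, r * cell_size, depth_map[r][c] / 255.0 * max_height)
        for r in range(rows)
        for c in range(cols)
    ]
    # One quad per group of four neighboring pixels, as vertex indices.
    faces = [
        (r * cols + c, r * cols + c + 1, (r + 1) * cols + c + 1, (r + 1) * cols + c)
        for r in range(rows - 1)
        for c in range(cols - 1)
    ]
    return vertices, faces

depth_map = [
    [0, 128, 255],
    [64, 192, 128],
    [0, 64, 0],
]  # hypothetical 3 x 3 depth estimate
vertices, faces = relief_mesh(depth_map)
```

Because `max_height` is set independently of the image, the same depth map can produce a shallow surface texture or a deep deformation of the building mass, which is one way the question of scale enters this translation.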
Student 1 drew inspiration from the metal screen patterns of the Suzhou Cultural and Arts Center, focusing on concepts like aggregation, overlapping layers, and tessellation. This effort yielded a matrix of images featuring honeycombs, cords, neurons, nests, spiderwebs, and more (Figure 8(A)–(D)). While the sketches express these ideas with simple iconic drawings, the corresponding AI-generated images are enriched with detailed prompts, including adjectives like “abstract” and “greyscale,” leading to more intricate and dynamic representations. In the second part of the exercise, the student explored the images of “chords and wire,” “honeycomb,” and “neural network” (Figure 8(D4), (A5), and (D5)) combined with specific architectural descriptors in the prompt as potential form-giving constructs for building envelopes. The student particularly concentrated on “chords and wires” using the description “architecture façade, GFRC, openings,” resulting in unconventional fins, walls, voids, and openings (Figure 8(E1)–(E2)), which were then translated into a 3D model of a curved building envelope at an intersection (Figure 8(F)–(G)).
Figure 8. Matrices of patterns produced by student 1 (L. Butler) based on the visual interpretation of the Suzhou Cultural and Arts Center. The representations used for the complete development in 3D are highlighted in yellow. The numbers and letters in this figure are alphanumeric coordinates used for quick reference in the text.
Student 1 also developed matrices based on the magnesium alloy tassels of the Bund Finance Center. These diagrams portrayed various round-like objects arranged in hexagonal grids or clusters, such as pencils, honeycombs, soccer nets, nuts & bolts, windchimes, pipe organs, fences, and plastic straws. The sketches used simple representations, like a few tubes for “windchime” (Figure 9(A2)), a window elevation for “curtains” (Figure 9(B1)), or an icon of a “pipe organ” (Figure 9(B2)). By adding descriptive adjectives to the prompts, corresponding AI-generated images depicted arrangements like interconnected windchimes with cords (“abstract, greyscale, windchimes,” Figure 9(A5)), swaying curtains with geometric patterns (“abstract, greyscale, curtains,” Figure 9(B4)), and a dynamic view of an elaborate pipe organ (“abstract, greyscale, pipe organ,” Figure 9(B5)). In the next phase of the exercise, the student explored how windchime, curtains, and pipe organ images could serve as design concepts for building envelopes. This involved creating new prompts with detailed architectural descriptions and using these images as inputs. For example, combining the prompt “architectural façade, glass, metal” with A5 resulted in an image portraying a multi-story building with openings and opaque elements arranged similarly to the windchime pattern (Figure 9(D2)), which was translated into a 3D model and conceptual renderings (Figure 9 (E)).
Figure 9. Matrices of patterns produced by student 1 (L. Butler) based on the visual interpretation of the Bund Finance Center. The representations used for the complete development in 3D are highlighted in yellow. The numbers and letters in this figure are alphanumeric coordinates used for quick reference in the text.
Student 2 concentrated on abstract and textural qualities of natural and artificial entities to describe the Revolving Bricks Office Building and the Italy Pavilion Milan Expo. Her matrices (Figures 10 and 11) encompass patterns like ripples in water, sand dunes, fish scales, reptile skin, top view of trees, rain in the wind, hair, worms, swarm of birds, trees, fabric, lights, grass, lasers, vines on a wall, street, bale of hay, and glue. With a few exceptions, the hand-made and AI-generated images look like 2D-textures with abstract qualities that are open to reinterpretation.
Figure 10. Matrices of patterns produced by student 2 (S. Park) based on the visual interpretation of the Revolving Bricks Office Building. The representations used for the complete development in 3D are highlighted in yellow. The numbers and letters in this figure are alphanumeric coordinates used for quick reference in the text.
Figure 11. Matrices of patterns produced by student 2 (S. Park) based on the visual interpretation of the Italy Pavilion for the Milan Expo (columns 7–12). The representations used for the complete development in 3D are highlighted in yellow. The numbers and letters in this figure are alphanumeric coordinates used for quick reference in the text.

For the Revolving Bricks Office Building, the student focused on curvilinear patterns with granularity, such as “drawing of curves in the pattern of fish scales”, “abstract pattern of texture that looks like reptile skin”, and “abstract pattern of texture that looks like hair” (Figure 10(A6), (B4), and (C4)). These were further developed by incorporating architectural specifications into new AI-generated envelope images (Figure 10(D)). The fine-grained hair pattern was employed to design the 3D envelope for a box-shaped corner building. The resultant shape operates primarily at the scale of the envelope rather than the building mass, suggesting a tactile rainscreen characterized by curvilinear flows, ridges, and valleys, potentially constituting a kinetic envelope.
Regarding the Italy Pavilion Milan Expo, student 2 generated patterns like “texture like fabric with diagonal lines,” “2D drawing of lines that look like streets,” and “texture that looks sticky like glue” (Figure 11(A5), (C4), and (C6)), producing envelope images with concrete materials (Figure 11(D)). The latter was further developed into an envelope for an isolated building, where irregular ridges delineate regions with opaque panels and openings (Figure 11(E)). Because of the size of the ridges in this image, the envelope begins to alter the shape of the original building block, leading to an irregular profile.
Based on the curvilinear shape of the envelope components of the Kolon One, student 3 focused almost exclusively on biological forms and natural patterns. These included shark teeth, birds, water, dolphins, people dancing, hummingbirds, liver, and a shoulder blade. In the second part of the exercise, the student investigated natural patterns using more sophisticated prompts, weights, and parameters to achieve seamless tiling with high-quality renderings (Figure 12(A4)–(C6) and (D1)–(F3)). Three images were used to produce custom envelopes with curved ridges, crumples, and openings that reveal their inner structure and curtain wall (Figure 12(D4)–(F6)):
• “Facade that looks like human shoulder blade bones, realistic, 3d --tile --v 5 --q 2” (Figure 12 (D3))
• “Shoulder blades but as a facade component abstracted painting --tile --v 5 --q 2” (Figure 12 (E3))
• “Abstracted rib cage represented as flowers --tile --v 5 --q 2” (Figure 12 (F3))
Figure 12. Matrices of patterns produced by student 3 (S. Cutlip) based on the visual interpretation of the Kolon One and Only Tower. The representations used for the complete development in 3D are highlighted in yellow. The numbers and letters in this figure are alphanumeric coordinates used for quick reference in the text.

Survey
In contrast to the existing research reviewed in the introduction, our survey takes a unique approach by evaluating students’ attitudes towards distinct computational methods—namely, PM, DO, and MLLMs—and their perceived integration into architectural design. The survey delves into considerations such as design process, creativity, usefulness, and curriculum integration, aiming to offer a comprehensive exploration of these methods. The inclusion of multiple methods serves a dual purpose: to establish a benchmark for comparison and to uncover potential relationships between MLLMs and other design tools.
The initial segment of the survey comprises five questions for each design technology (denoted as ‘x’ below):
Q1. x… (1 …restricts creativity – 5 …augments creativity)
Q2. How likely would you use x in your next project? (1 not likely – 5 likely)
Q3. Given your current background, you consider learning x… (1 …easy – 5 …hard)
Q4. What year should x be introduced to architecture students? (1st year – 5th year)
Q5. Share your thoughts about x.
The first three questions utilize a five-point scale, reflecting the impact of computational methods on creativity and usefulness, and the perceived ease of learning. The statements at the scale’s endpoints act as antonyms, defining the spectrum of responses. The midpoint signifies a neutral attitude, while the second and fourth points indicate a slight inclination towards one of the extremes.
Q4 solicits students’ input on the optimal year for introducing each technology.
Results of the survey with minimum, maximum, and average values for the questions with a numerical scale from 1 to 5.
Cumulative results with the votes for the two multiple-choice grids used in the survey.
Based on the responses, PM is generally perceived as a tool to enhance creativity (Q1 4.57/5) and support innovative design (G1.1 7/7). It is deemed useful for real design challenges (G1.2 4/7 votes) and likely to be adopted in future projects (Q2 4.57/5). Parametric modeling is considered most relevant in predesign (G2.1 4/7), schematic design (G2.2 7/7), and design development (G2.3 5/7). It is considered complicated (G1.3 0/7) with a high level of uncertainty about its ease of learning (Q3 2.71/5), and it is viewed as valuable for architectural education (G1.4 6/7), with a recommended introduction within the first 3 years of the curriculum (Q4).
Design optimization is seen as a facilitator of innovative design (G1.1 5/7) that is highly useful for real design challenges (G1.2 7/7) and is primarily applicable in schematic design (G2.2 6/7) and design development (G2.3 4/7). While on average students believe it can augment creativity (Q1 3.86/5), responses vary from 2 to 5, indicating uncertainty. Deemed relevant for education (G1.4 7/7) and likely to be used in future projects (Q2 4.14/5), DO is considered complicated (G1.3 0/7) and challenging to learn (Q3 3.29/5). Students suggested that it should be introduced in the curriculum between the 2nd and 4th year (Q4). According to one of the students, it is a great tool, but it can lead to premature decision-making:
(…) Even though this tool is great, I can image that this is something that could be dangerous to some students who are less mature in their “design-ability.” (…) I can easily see students falling on this tool way to[o] early in the design phase. It is way to[o] easy to plug in variables straight from an analysis phase, without digesting and establishing a system of meanings.
Multimodal large language models are generally perceived to enhance creativity (Q1 4.14/5) and support innovative design (G1.1 5/7). They are likely to be used in future projects (Q2 4.14/5) and considered valuable in predesign (G2.1 5/7) and schematic design (G2.2 4/7). Multimodal large language models are seen as relevant for architectural education (G1.4 6/7), but contrary to other technologies, they are regarded as uncomplicated (G1.3 5/7) and relatively easy to learn (Q3 1.29/5). Opinions on the year of introduction vary (Q4 2.29/5, ranging from 1 to 4), with some caution about introducing them to an audience without sufficient architectural knowledge. While some suggest starting in the first year, others advocate waiting until later when students have a more solid design foundation. One student explicitly refers to the risk of introducing it to students without proper architectural knowledge: “by third year, students have a good enough design background to know what kind of words to use to generate successful images.” Still, the student thinks that “...these tools COULD be used during the end of 1st year to allow students to adapt to different technologies with the early studios artistic and fundamental approach to design.”
Students acknowledged the potential of MLLMs to quickly suggest design alternatives in the early stages of design. Similarly to the findings of Turchi et al. [45], students recognized the value of stochastic image generation for divergent thinking but noted challenges in realizing specific ideas, possibly contributing to the perception of limited utility for real design issues (G1.2 3/7). Some comments highlighted difficulties in controlling the output to match mental images: the “(…) hardest part was trying to use the right words for it to generate the picture that I vaguely had in mind” or “AI tools took time to really get the result that I wanted, but it was still faster to iterate with AI than without.” Despite this lack of fine-grained control, fast image generation was seen as aiding ideation and design concept development. One student mentioned that the AI model “(…) creates somewhat unrealistic designs, but I think that’s the perfect way to inspire creativity,” while for another one “(…) the unexpected outcomes it gave you prompted (…) you to see [the] assignment in a different way.”
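As a reading aid, the averages reported above for Q1 to Q4 can be cross-checked against the sample of seven respondents implied by the vote counts (for example, 7/7). The sketch below uses only the means quoted in the text. It assumes that every student answered every question with a whole number, and the function name `implied_total` is hypothetical. It recovers the sum of responses implied by each mean; it does not reconstruct individual answers.

```python
N_RESPONDENTS = 7  # assumption: all seven students answered each question

# Mean scores exactly as quoted in the survey paragraphs.
reported_means = {
    ("PM", "Q1"): 4.57,
    ("PM", "Q2"): 4.57,
    ("PM", "Q3"): 2.71,
    ("DO", "Q1"): 3.86,
    ("DO", "Q2"): 4.14,
    ("DO", "Q3"): 3.29,
    ("MLLM", "Q1"): 4.14,
    ("MLLM", "Q2"): 4.14,
    ("MLLM", "Q3"): 1.29,
    ("MLLM", "Q4"): 2.29,
}


def implied_total(mean, n=N_RESPONDENTS):
    """Return the integer response sum consistent with a two-decimal mean.

    Returns None when no integer sum over n respondents rounds to `mean`.
    """
    total = round(mean * n)
    # A mean reported to two decimals is within 0.005 of the exact value.
    return total if abs(total / n - mean) < 0.005 else None


implied = {key: implied_total(mean) for key, mean in reported_means.items()}

for (method, question), total in implied.items():
    print(f"{method} {question}: sum of responses = {total}")
```

Under these assumptions, a mean of 4.57 corresponds to a response sum of 32 out of a possible 35, and a mean of 1.29 to a sum of 9, which illustrates how a single response can shift an average in a sample this small.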
Synthesis of the sequencing and roles of the design technologies, based on comments provided by five students.
Limitations
This pedagogical study is limited by its reliance on the experiences of seven undergraduate students participating in an elective course. The classroom setting offers the advantage of maintaining the students’ familiar learning environment with time to reflect on their design decisions. However, it introduces challenges in controlling the conditions under which students engage with computational methods in their design proposals and restricts the capacity to observe the entire design process.
The data collection process involved analyzing and discussing the work submitted in class, supplemented by classroom observations and an online survey. While this hybrid approach facilitates a comprehensive understanding of the topic, the small sample size and potential sampling bias limit the generalizability of the findings. Enrollment in a generative design course likely attracts students already predisposed to an interest in generative design, potentially skewing the findings. Furthermore, relying solely on qualitative analysis of data from course deliverables and a survey may not fully capture the nuances of students’ perspectives, attitudes, and design processes.
To address these shortcomings, future research should incorporate follow-up methods to build upon these initial findings, capturing more accurate information about the students’ design process and perspectives. It should also establish a dialogue with other researchers working on the same topic to identify shared standards and metrics.
Discussions
Design interfaces
Through their exploration of form-making with PM, DO, and MLLMs, students incorporated design principles and visual patterns drawn from precedents. Overall, students recognized the potential of these methods to enhance design creativity and facilitate innovative solutions. MLLMs were perceived by students as particularly user-friendly computational tools for ideation and concept development. This perception is likely attributable to the fact that students do not need to understand the internal workings or development of MLLMs. They can concentrate on design exploration exclusively by manipulating a user interface, as with conventional CAD software. However, unlike CAD software, MLLMs are generative models that produce output indirectly from guiding images, text prompts, and model parameters, supported by other functions and extensions.
To further enhance human-computer interaction, ongoing efforts involve integrating new algorithms and tools into MLLM interfaces. These enhancements aim to improve generation capabilities, prompt recommendations, and image processing. Notable additions include image editing features, parallel output production, a history of prompts and results, and tools for prompt generation, such as textual inversion. Moreover, there are now more accessible methods for customizing MLLMs, such as importing and combining specialized models and extensions, or fine-tuning the model using a few reference images.
In this scenario, designers should be encouraged to view the software interface as a tool for integrating expert knowledge, fostering collaboration, and systematically exploring design history and variations. The matrix of diagrams introduced in the course serves as a pedagogical example, demonstrating how students organized a visual interface with multiple patterns from a precedent for further design exploration. This facilitated the production, transformation, and combination of references and diagrams using MLLMs. In an advanced course, this approach could be expanded to the development of a custom interface prototype fully integrated with MLLMs, emphasizing the importance of continual engagement with evolving design technologies.
Design control
Despite their user-friendly nature, MLLMs presented challenges in developing design solutions with strict constraints due to a lack of control and precision. This inherent difficulty led to students perceiving MLLMs as potentially unsuitable for addressing real-world design challenges. For designers utilizing MLLMs, the imperative lies in developing strategies to customize input information to align the design output with expectations and design constraints. The course exercise exemplified this challenge, particularly regarding building scale and tectonic elements. Students interpreted precedents based on visual patterns, employing them as form-giving constructs within the scale-less space of pixel matrices with MLLMs.
To surmount this challenge, students adopted various approaches. Some delved into intricate prompts with refined parameters, while others integrated conditional images. Notably, certain students creatively described and utilized AI-generated images as tiles or textures, effectively addressing panel shapes and ornamental relief at different scales (as demonstrated in projects 2.1, 3.1, 3.2, 3.3, 4.2, 6.1, 6.2, and 7.1 in Figure 7). Conversely, students who explored imagery from other domains to craft custom building forms produced provocative building shapes. However, these shapes would require subsequent revision and rationalization in a real design process, where considerations such as system integration, structural design, and interior layout must be addressed, as evident in projects 1.1, 1.2, 5.1, and 7.2 in Figure 7.
The need for a balance between creative exploration and practical application emerges, underscoring the importance of refining MLLM use to align with the complexities of building design.
Design translations
The seamless translation between various design media formats and software platforms is a crucial aspect within image-based workflows employing MLLMs. This process encompasses both the creation of images intended for use as input in MLLMs and the utilization of synthesized images as a foundation for subsequent modeling procedures. It is essential to underscore that the translation between representations in the design process is both a technical and creative undertaking that can be explored computationally.
In this context, numerous computational methods, including PM, rule-based modeling, and agent-based modeling, provide avenues not only to generate conditional images for MLLMs but also to use their output images as guiding diagrams. In the course exercise, the output images from MLLMs played a pivotal role in developing 3D models. This was achieved through either labor-intensive direct geometric modeling or by utilizing images as input for other generative methods.
Notably, the course introduced approaches to leverage resultant images for controlling the distribution, shape, and transformations of building components within a parametric model. However, the students predominantly focused on projecting depth images onto meshes and surfaces, constraining their parametric control to position and depth. In scenarios where images are directly employed to inform the construction of 3D models, future research could explore more robust digital modeling software with procedural texturing capabilities, such as Blender and KeyShot. These could enhance students’ parametric control when translating MLLM images into 3D forms and reduce the need for post-production and geometric rationalization.
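To make concrete what projecting a depth image onto a surface involves, the following is a minimal sketch under stated assumptions: the depth image is a small grid of values between 0 and 1, the target surface is a flat rectangular panel, and displacement occurs along the panel normal. The function name `displace_panel` and its parameters are hypothetical; the sketch does not reproduce the software or workflow used in the course.

```python
def displace_panel(depth, width=1.0, height=1.0, max_depth=0.1):
    """Turn a depth image into a displaced quad mesh on a flat panel.

    depth: list of rows (top to bottom), each a list of values in [0, 1].
    Returns (vertices, faces): vertices are (x, y, z) tuples and faces are
    4-tuples of vertex indices. All names here are illustrative assumptions.
    """
    rows = len(depth)
    cols = len(depth[0])
    if rows < 2 or cols < 2:
        raise ValueError("depth image needs at least 2 x 2 samples")

    vertices = []
    for r in range(rows):
        for c in range(cols):
            x = width * c / (cols - 1)
            y = height * (1 - r / (rows - 1))  # image row 0 is the panel top
            value = min(max(depth[r][c], 0.0), 1.0)  # clamp to [0, 1]
            vertices.append((x, y, value * max_depth))

    faces = []
    for r in range(rows - 1):
        for c in range(cols - 1):
            a = r * cols + c  # top-left vertex of this quad
            faces.append((a, a + 1, a + cols + 1, a + cols))
    return vertices, faces


# Tiny 2 x 2 depth image used only to exercise the function.
sample_depth = [[0.0, 0.5],
                [1.0, 0.25]]
vertices, faces = displace_panel(sample_depth, width=2.0, height=1.0, max_depth=0.2)
print(len(vertices), "vertices,", len(faces), "face")
```

In this sketch the only quantities the designer controls are the panel size and `max_depth`, which mirrors the limitation noted above: the image determines where and how far the surface moves, but not the shape or distribution of discrete components.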
Conclusion
In this article, we presented a pedagogical experience focused on generative design for building envelopes, integrating various computational methods, including PM, DO, and MLLMs. Initially, students selected contemporary building envelopes as their design precedent and employed PM to decipher their internal logic. Subsequently, they explored envelope variations using optimization algorithms and custom metrics. Finally, in the AI assignment, students explored the potential of MLLMs within contemporary architectural design methods and aesthetics. They created matrices consisting of hand-drawn and AI-generated diagrams, accompanied by textual prompts to capture visual patterns and qualities from the precedent buildings. They then used a subset of these visual patterns and textual descriptions as the basis for creating novel envelope designs.
Through exposure to PM, optimization, and MLLMs, students aimed to gain a nuanced understanding of transparency, control, and expressiveness within glass-box and black-box methodologies. Besides learning about the potential of the different technologies for different design tasks and stages, they specifically sought to comprehend and harness the potential of MLLMs as a design tool for ideation. Overall, students seamlessly incorporated MLLMs into their design exploration, successfully developing 3D models for their proposals. While MLLMs facilitated the transformation of visual concepts into new envelope forms, they also presented challenges related to control and translation across different representations, resulting in design proposals with unclear scale and tectonics. In the survey, students considered MLLMs easy to learn, relevant for undergraduate education, and effective for creative tasks but acknowledged difficulties when addressing real architectural constraints.
In the discussion section, we examined the pivotal role of software interfaces, control, and translations in design exploration with MLLMs. We highlighted the prominence of user interfaces, noting their ongoing development with new algorithms, functions, and accessible workflows for fine-tuning MLLMs in specific design tasks. Emphasizing challenges in design precision, we addressed issues of building scale and tectonics by exploring prompts, parameters, and guiding images in the design exercises. Additionally, we discussed the challenge of transitioning between design media as an opportunity to combine and chain MLLMs with various generative methods and modeling software. These insights aim to support future initiatives integrating computational methods, including MLLMs, into the design process.
Footnotes
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
