Abstract
As the main part of design display and evaluation, the three-dimensional (3D) form of a product is the core object of affective product design. However, previous research has not yet developed technical models and methods that use complete 3D surface data, and thus cannot guarantee the quality of affective product design. Using triangular mesh models, spherical harmonics and a conditional variational auto-encoder, this paper proposes a data-driven affective product design method composed of several technical models that use complete 3D surface data. These models include a mathematical model for quantifying 3D form, a recognition model for recognizing customers’ affective responses, and a generative model for generating new 3D forms. For affective product design, the mathematical model acquires and processes complete 3D surface data, the recognition model improves the objectivity and accuracy of recognition by integrating the 3D form data into the emotion recognition calculation, and the generative model automatically generates new 3D forms in response to emotional data based on the recognition results. Each model provides technical support for acquiring, processing and generating complete 3D surface data of product form, and ensures that the proposed method is systematic and complete for affective product design involving 3D form innovation. The feasibility of the method is verified by a car design example, and the results show that it is an effective affective product design method involving 3D form innovation.
Introduction
Affective product design has a decisive influence on consumers’ emotional preferences for, and choice of, products [1–3]. By integrating customers’ cognitive behavior and emotional needs into product form design, it offers better applicability and higher practical value, and it is attracting growing scholarly attention [4–6]. Data are the basis for ensuring that design is scientific and efficient, and data-driven design has become an inevitable trend in the development of affective product design. As an integrated technical system, a data-driven affective product design method usually includes design processes such as product form quantification, emotion recognition and product form generation, and current research mainly focuses on these processes and their corresponding technical models. The related work is therefore presented in three parts below, each covering the technical research status and open problems of one process.
(1) Product form quantification
Product form quantification is the basis and prerequisite for data-driven affective product design. It mainly involves quantifying product form with a mathematical model to obtain complete data that reflect the features of the product form. Because complete data are easier to obtain and analyze for two-dimensional (2D) forms, the 2D form is the main object of current research and of its mathematical models. Compared with the 2D form, the three-dimensional (3D) form is the actual carrier through which consumers perceive products. It represents shapes more comprehensively and authentically, and delivers richer emotional information. Therefore, the 3D form is a more necessary and valuable research object for affective product design. However, because its complete surface data are difficult to obtain and process, current research mainly focuses on constructing mathematical models that acquire only incomplete surface data of the 3D form. Hsiao et al. [7] studied the basic dimensions of the affective response triggered by 3D form based on a mathematical model constructed by the morphological analysis technique. Chang and Chen [8] also used a mathematical model established by morphological analysis to achieve an incomplete quantification of a steering wheel’s 3D form and to determine consumers’ affective preferences. However, because the morphological analysis technique yields only classification data for part of the surface of the 3D form, rather than the complete surface data, a mathematical model based on this technique cannot guarantee the diversity and flexibility of 3D form innovation in affective product design.
Among the techniques used in other fields to build mathematical models that capture complete 3D surface information, the point cloud [9] and the volumetric grid [10] are the most representative, but both have problems that make them unsuitable for data-driven affective product design. Point cloud data are discrete and irregular [11], large in volume and difficult to process. Volumetric grid data have low output accuracy and incur extra computation cost because of their computational and spatial complexity, which restricts their application to complex 3D forms [12]. In contrast, the triangular mesh model is applicable to affective product design. It captures the complete surface data of complex 3D forms as mesh data, which convert readily to and from the Non-Uniform Rational B-Spline (NURBS) surface data common in 3D form design, and it has the potential to simplify data analysis through data conversion and dimensionality reduction [13, 14]. In the authors’ previous studies [15, 16], we showed that the elliptical Fourier technique can convert the 2D coordinate data of a product form into harmonic coefficients that are easier to analyze. As an extension of this technique to 3D objects, spherical harmonics (SPHARM) can convert mesh data into SPHARM coefficients [17, 18]. After further extraction of data features, the dimension-reduced SPHARM coefficients are more suitable for the quantitative analysis of affective product design. In this process, the conditional variational auto-encoder (CVAE), a representative deep learning technique with strong capabilities for large-scale data processing and feature mining, is used to process the SPHARM coefficients. Therefore, in this study a mathematical model that combines the triangular mesh model, SPHARM and CVAE is formed to achieve product form quantification and obtain complete 3D surface data.
(2) Emotion recognition
Emotion recognition mainly involves quantifying consumers’ affective responses to product forms. Its output consists of several adjectives, which represent the type of emotion, and their evaluation data, which represent its degree. The quantitative results provide design goals and data sources for data-driven affective product design [19]. Many scholars have studied techniques for constructing emotion recognition models for 2D form. Shieh et al. [20] employed factor analysis to construct an emotion recognition model for 2D car shape. Guo et al. [21] completed the emotion recognition of 2D camera form through expert interviews and cluster analysis, and in a follow-up study, Guo et al. [22] built an emotion recognition model for 2D car form by combining the event-related potential method with physiological measurement. Sutono [23] proposed a similar emotion recognition model based on a questionnaire survey, the semantic differential method and cluster analysis. However, these emotion recognition models are limited to 2D objects and need to be extended to 3D form. Because their modeling ideas and techniques are separated from the actual form data, they are prone to recognition problems such as inaccurate recognition and incomplete coverage of key emotions. These problems need to be addressed when constructing an emotion recognition model for 3D form. In the authors’ previous studies, we incorporated 2D form data into the recognition process, which improved the accuracy and completeness of emotion recognition for 2D form. Therefore, this research further explores techniques for incorporating complete 3D surface data into emotion recognition, so as to establish a more effective emotion recognition model for 3D form.
(3) Product form generation
Emotion-oriented form generation is the final technical process of affective product design, and the generative model is the key to determining the outcome of the process. Current studies often build generative models based on hybrid algorithms. Hsiao et al. [7] established a generative model based on quantification theory type I and a genetic algorithm, and verified the feasibility of the model through the shape generation of a coffee machine. Deng and Wang [24] adopted a neural network and an interactive genetic algorithm to construct the generative model. Lin and Zhao [25] combined a neural network and a particle swarm algorithm to build an emotion-oriented form generative model for cultural creative products. Although generative models based on hybrid algorithms have certain performance advantages, these studies are still at the stage of constructing 2D rather than 3D generative models. In addition, affective product design already involves many design processes, so introducing a generative model based on a hybrid algorithm into the product form generation process makes the operation and debugging of the overall method more complex and leaves it more exposed to the performance bottlenecks of the individual algorithms. For example, quantification theory type I and shallow neural networks are not the best choices for mining data features and ensuring generation accuracy, so a hybrid algorithm composed of them cannot give the generative model the best performance.
As mentioned above, CVAE provides the technique required to construct a mathematical model for complete quantification of 3D form. At the same time, it also provides favorable conditions for constructing an emotion-oriented 3D generative model. As a representative deep learning technology that uses multi-layer neural networks, CVAE has a complete algorithm structure and can build a widely used generative model with good generalization ability without mixing other algorithms [26, 27]. Its multi-layer structure gives it the advantage of more effective expression of complex functional relationships, and enables it to more efficiently extract data features from a large amount of product form and emotional data. Therefore, CVAE has the ability to generate new 3D form data in response to emotional data. To reduce the technical complexity of the method that combines different models, in view of the multiple roles of CVAE in different design processes of affective product design, this study proposes a generative model based on this technique.
In summary, by combining the above models in different design processes, a data-driven affective product design method using complete 3D surface data is proposed. Figure 1 shows the overall research framework of the method, which includes three parts. In the first part, the complete quantification of 3D form is realized by using the techniques of expert interview, triangular mesh model and SPHARM, and the reduced-dimensional SPHARM coefficient matrix is obtained by using the CVAE technique. In the second part, emotion recognition and quantification are achieved by using the techniques of expert interview, questionnaire survey, recognition model, cluster analysis and factor analysis. In the last part, new 3D forms that respond to emotional data are generated by using the generative model.

Research framework of the proposed method.
The innovations of this paper are summarized as follows:
(1) A mathematical model for the complete quantification of 3D form is first constructed, which achieves the quantitative description, data acquisition and processing of 3D form. This lays a solid foundation for quantitative research techniques in affective product design involving 3D form innovation.
(2) A more effective emotion recognition model for 3D form, driven by its complete surface data, is proposed, which forms a systematic and effective emotion recognition technique for 3D form.
(3) A 3D form generation and evaluation technique based on CVAE and effect evaluation indexes is proposed, which generates rich and novel 3D form design schemes that meet the design goals.
The rest of this study is organized as follows. Section 2 provides a theoretical framework and background overview of the related theories and techniques. Section 3 details the implementation procedures of the proposed method, and Section 4 discusses the results. Finally, Section 5 presents brief conclusions.
Mathematical model construction
3D mesh data acquisition
NURBS is an advanced 3D surface modeling method. A NURBS surface is a free-form surface, comparable to a thin square rubber sheet with unlimited flexibility. Designers usually use it for 3D product modeling in practical design work. After the picture samples of the target product are obtained, the NURBS surface modeling software Rhinoceros 3D is used to model the 3D form of each sample, and the triangle mesh models are then saved in the OBJ file format through data format conversion. Figure 2 shows the process of acquiring 3D mesh data for a product.
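As a minimal sketch of what the exported OBJ mesh data contain, the following Python snippet reads vertex (`v`) and face (`f`) records from OBJ text. The helper name `parse_obj` and the inline tetrahedron are illustrative assumptions rather than part of the authors' toolchain, and only triangular faces with positive indices are handled.

```python
def parse_obj(text):
    """Parse vertices and triangular faces from Wavefront OBJ text.

    Illustrative helper (not from the paper): handles only `v` and `f`
    records and assumes every face is a triangle with positive indices.
    """
    vertices, faces = [], []
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue
        if parts[0] == "v":
            # Vertex record: "v x y z"
            vertices.append(tuple(float(c) for c in parts[1:4]))
        elif parts[0] == "f":
            # Face record: "f i j k" or "f i/ti/ni ..."; OBJ indices are 1-based
            idx = [int(p.split("/")[0]) - 1 for p in parts[1:]]
            if len(idx) != 3:
                raise ValueError("only triangular faces are supported")
            faces.append(tuple(idx))
    return vertices, faces


# Hypothetical example: a tetrahedron written as OBJ text.
TETRA_OBJ = """
v 0 0 0
v 1 0 0
v 0 1 0
v 0 0 1
f 1 3 2
f 1 2 4
f 2 3 4
f 1 4 3
"""
vertices, faces = parse_obj(TETRA_OBJ)
```

Each face is stored as three zero-based vertex indices, which is the vertex/face representation used in the later preprocessing steps.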

3D mesh data acquisition process.
The SPHARM technique is well suited to general surface manipulation and analysis applications of the triangle mesh model [28]. As a parametric technique for surface representation, SPHARM takes the angular coordinates of 3D surface points and the radius from an internal center point as parameters to expand the surface into SPHARM functions, and hierarchically describes the global and local form properties of the 3D surface through spatial frequency components [29]. It first defines a 3D form by three SPHARM functions, then converts them into three sets of SPHARM coefficients by Fourier transform [30], which transfers the morphological information from the spatial domain to the frequency domain and yields standard spectrum information that can be used to analyze the 3D form quantitatively. The whole process generally consists of three steps: (1) spherical parameterization, (2) spherical Fourier transform, and (3) SPHARM coefficient normalization.
Spherical parameterization is the process of obtaining a bijective mapping between each point p on a 3D surface and a pair of unit spherical coordinates ω (polar coordinate) and Φ (azimuthal coordinate) [31], which is expressed as
Where $\omega \in [0, \pi]$, $\Phi \in [0, 2\pi]$, and $u^2(\omega, \Phi) + v^2(\omega, \Phi) + w^2(\omega, \Phi) = 1$.
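For orientation, the unit-norm constraint can be checked under the usual spherical-coordinate convention. The block below assumes that u, v and w are the Cartesian coordinates of the point on the unit sphere with polar angle ω and azimuthal angle Φ; this is a standard convention stated here as an assumption, not necessarily the exact form of Equation (1).

```latex
\begin{aligned}
u(\omega, \Phi) &= \sin\omega \cos\Phi, \\
v(\omega, \Phi) &= \sin\omega \sin\Phi, \\
w(\omega, \Phi) &= \cos\omega,
\end{aligned}
\qquad
u^2 + v^2 + w^2
  = \sin^2\omega \,(\cos^2\Phi + \sin^2\Phi) + \cos^2\omega
  = 1 .
```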
Since any continuous function P(ω, Φ) defined on a spherical surface can be expanded in SPHARM functions, a 3D surface is expanded into a complete SPHARM series by the spherical Fourier transform as
Where
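For reference, the standard form of the SPHARM expansion in the literature is given below. It is stated as an assumption about notation rather than as the authors' exact Equations (2) and (3): the coefficient symbol $c_l^m$ is a placeholder name, $P_l^m$ denotes the associated Legendre function (distinct from the surface function $P$), and the overline denotes complex conjugation.

```latex
P(\omega, \Phi) = \sum_{l=0}^{\infty} \sum_{m=-l}^{l} c_l^m \, Y_l^m(\omega, \Phi),
\qquad
Y_l^m(\omega, \Phi) = \sqrt{\frac{2l+1}{4\pi}\,\frac{(l-m)!}{(l+m)!}}\;
                      P_l^m(\cos\omega)\, e^{i m \Phi},
\qquad
c_l^m = \int_0^{2\pi}\!\!\int_0^{\pi} P(\omega, \Phi)\,
        \overline{Y_l^m(\omega, \Phi)}\, \sin\omega \; d\omega \, d\Phi .
```

Because a 3D surface is described by three coordinate functions, each $c_l^m$ has one component per coordinate, and in practice the outer sum is truncated at a maximum degree.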
To allow group analysis and comparison between samples, the SPHARM coefficients of each sample need to be normalized to remove the effects of scaling, rotation, and translation. The first-order ellipsoid technique [32] is adopted to obtain the normalized SPHARM coefficients of each sample, and the SPHARM coefficient matrix of the overall sample is established from these coefficients.
CVAE is further used to perform dimensionality reduction and data feature extraction on the SPHARM coefficient matrix. Its network structure is shown in Fig. 3, which includes a recognition network (encoder) and a generative network (decoder). The output of the encoder provides the dimensionality reduction of the SPHARM coefficient matrix and assists in emotion recognition, while the output of the generative network provides new 3D form data in combination with external conditions (emotional data). Through the multi-layer neural network of the encoder, CVAE can extract low-dimensional deep structural features of the high-dimensional input data as a representation of the data. The lowest-dimensional vector $x_n$ of the latent space is the output data after dimensionality reduction.

Network structure of CVAE.
After the network coding of CVAE, the SPHARM coefficient matrix with the total sample size of s will be output as a t-dimensional data matrix as
The steps of using the reduced-dimensional SPHARM coefficient matrix to further participate in the construction of the recognition model are as follows: After collecting and screening adjectives, a focus group composed of industry experts is invited to select the relevant adjectives; Questionnaires are designed by combining representative 3D form samples with relevant adjectives and conducted to get the evaluation scores of all test subjects, which is given as
Where: i is the sample number; j is the number of relevant adjective; The evaluation mean score (EMS) matrix
Where: K is the total number of test subjects. The Pauta criterion, a method for eliminating abnormal values from a data set, is used to determine the validity of the subjects’ evaluation data. It takes as σ the unbiased estimate of the standard deviation calculated by the Bessel formula. It states that almost all sample values lie within 3σ of the mean, so highly abnormal sample values that deviate by more than 3σ are deleted. Evaluation data of individual test subjects that exceed this limit are therefore treated as abnormal and removed. Correlation analysis is then conducted to obtain the correlation coefficient matrix R, a multiplier coefficient α (α = 0, 1, 2) is assigned to the different significance levels of the correlation coefficients in R, and the recognition model is constructed as
Where: u is the index of the relevant adjective; v is the coefficient index; $v_{\max}$ is the maximum value of v; $R_{uv}$ and $P_{uv}$ are the correlation coefficient and significance level between the u-th adjective and the v-th coefficient, respectively; $S_u$ is the final emotional score (FES). Representative adjectives and their rankings are determined according to the score, and the FESs of representative adjectives constitute the recognition result; Factor analysis and clustering analysis are performed to get the sample emotional label.
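The outlier screening and averaging steps can be sketched in Python as follows. This is a minimal illustration that assumes a single screening pass per sample and adjective; the function names and the ratings are hypothetical, and `statistics.stdev` supplies the Bessel-corrected standard deviation.

```python
import statistics


def pauta_filter(scores):
    """Drop scores whose deviation from the mean exceeds 3 sigma.

    sigma is the sample standard deviation with Bessel's correction
    (statistics.stdev divides by n - 1).
    """
    if len(scores) < 2:
        return list(scores)
    mean = statistics.mean(scores)
    sigma = statistics.stdev(scores)
    return [x for x in scores if abs(x - mean) <= 3 * sigma]


def evaluation_mean_score(scores):
    """EMS of one sample for one adjective: mean of the retained scores."""
    return statistics.mean(pauta_filter(scores))


# Hypothetical ratings from 25 subjects on the -3..3 scale:
# 24 ratings of 2 and one rating of -3.
ratings = [2] * 24 + [-3]
kept = pauta_filter(ratings)
ems = evaluation_mean_score(ratings)
```

In this example the mean is 1.8 and σ is 1, so the rating of -3 deviates by 4.8 and is removed, leaving an EMS of 2.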
CVAE is currently one of the most advanced deep learning generative models. Its technical principle is based on the variational auto-encoder (VAE). A VAE can generate output data with the same distribution as the input data but with different numerical values; however, its generation cannot be steered, so it cannot generate new 3D form data matching the specified target emotion, as affective product design requires. By introducing label data as external information, CVAE makes up for this deficiency of the VAE in affective product design. It can generate output data in a specified direction and is an ideal generative model technique. Thus, to meet the technical needs of both the mathematical model and the generative model, the generative model is constructed with the CVAE technique. As shown in Fig. 3, the decoder structure gives CVAE its data generalization capability. By encoding the probability density distributions of input data and the sample emotional label data
Implementation procedures
Because cars are typical products with complex 3D surfaces that convey a variety of emotional types, the feasibility and effectiveness of the proposed method are illustrated in detail with an example of car form design in the following sections.
3D form quantification
Taking brand and visual differentiation into account, we collected 300 pictures of car samples with a three-box shape from websites, car magazines and brochures. The 3D form features of each sample were displayed with three-view drawings for sample screening. A focus group of nine experts (five men and four women), each with more than two years of car design experience, was established. The main functions of the focus group were sample screening and adjective screening. To better accomplish these tasks, the experts were drawn from both academia and industry in car form design: six are university teachers of industrial design, and three are senior designers from car design companies. They all have rich practical experience in car form design and keen insight into the similarity and emotion recognition of car form, which helps ensure that representative samples and adjectives are selected accurately. The focus group then screened the samples based on visual similarity, and 256 samples were retained as the research objects. The 3D solid modeling of each sample was completed to build the overall sample database. The 3D forms of all samples are shown in Fig. 4.

3D forms of all samples.
The 3D forms of all samples were first preprocessed to obtain their mesh data, which contain vertices and faces. In this process, the surface of each sample needs to be closed first, to meet the SPHARM requirement that the parameterized object be a genus-zero surface [31]. To normalize the morphological data of different samples for data analysis, the numbers of vertices and faces were set close to 20,000 vertices and 40,000 faces, while preserving the product form features of each sample. Each vertex is given an identifying number, and each face is defined by three vertices with different identifying numbers. Figure 5 shows the result of the data preprocessing of sample No. 1. Then, the mesh data of all samples were extracted using SPHARM software. Table 1 lists the mesh data of sample No. 1, which include 19,923 vertices and 39,842 faces.
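The genus-zero requirement can be checked from the mesh counts with the Euler characteristic V - E + F, which equals 2 for a closed, connected genus-zero surface. The sketch below is an illustrative check, not the authors' preprocessing code; the function names are hypothetical, and the edge count for sample No. 1 is derived under the assumption that the mesh is a closed manifold, so that every edge is shared by exactly two triangles.

```python
def euler_characteristic(num_vertices, faces):
    """Return V - E + F for a triangle mesh given as vertex-index triples."""
    edges = set()
    for a, b, c in faces:
        for u, v in ((a, b), (b, c), (c, a)):
            edges.add((min(u, v), max(u, v)))  # store each undirected edge once
    return num_vertices - len(edges) + len(faces)


def genus_of_closed_mesh(num_vertices, faces):
    """Genus g of a closed, connected, orientable mesh, from chi = 2 - 2g."""
    chi = euler_characteristic(num_vertices, faces)
    return (2 - chi) // 2


# Hypothetical check on a tetrahedron (4 vertices, 6 edges, 4 faces).
tetra_faces = [(0, 2, 1), (0, 1, 3), (1, 2, 3), (0, 3, 2)]
chi_tetra = euler_characteristic(4, tetra_faces)
genus_tetra = genus_of_closed_mesh(4, tetra_faces)

# Counts reported for sample No. 1. In a closed triangle mesh every edge is
# shared by two faces, so E = 3F / 2 (assumption: closed manifold mesh).
V, F = 19923, 39842
E = 3 * F // 2
chi_sample = V - E + F
```

With E = 59,763, the reported counts give 19,923 - 59,763 + 39,842 = 2, which is consistent with a genus-zero surface.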

Data preprocessing of sample No. 1.
Mesh data of sample No. 1
Then, all samples were subjected to spherical parameterization using Equation (1). The surface mesh and the unit sphere surface mesh of sample No. 1 after spherical parameterization are shown in Fig. 6. After spherical parameterization, the spherical vertex coordinates of all samples, mapped to the unit sphere, were obtained. Table 2 shows the coordinates of sample No. 1.

Spherical parameterization of sample No. 1.
Spherical vertex coordinates of sample No. 1
The spherical Fourier transform was performed on the spherical vertex coordinates of each sample using Equations (2) and (3) to obtain its SPHARM coefficients. In this process, we tested the visual effect of 3D form reconstruction for each sample at different degrees, and determined that the degree l suitable for representing the main features of the different samples was 30. Figure 7 displays the 3D form reconstruction of sample No. 1 at different degrees. As the degree increases, the reconstructed surface features become richer and more detailed. When l = 30, the reconstruction is fine enough to represent the main form features, and the shape noise is well controlled. Consequently, the SPHARM coefficients of all samples were obtained at this degree. The SPHARM coefficients of sample No. 1 are shown in the left column of Table 3, and its normalized SPHARM coefficients are shown in the right column of Table 3. Subsequently, the SPHARM coefficient matrix of the overall sample, shown in Table 4, was established.

3D form reconstruction of sample No. 1 under different degrees.
SPHARM coefficients of sample No. 1
SPHARM coefficient matrix of the overall sample
After using CVAE for data feature extraction, the dimension of SPHARM coefficient matrix was reduced from 5400 to 8 (Table 5).
Reduced-dimensional SPHARM coefficient matrix
Emotion evaluation and recognition
Firstly, the 36 adjectives identified in Wang et al. [15] that are closely related to the emotional attributes of cars with three-box shape are taken as the preliminary relevant adjectives in this study (Table 6).
Preliminary relevant adjectives
To allow the focus group to observe the samples’ 3D forms from all directions, a standardized observation video was made for each sample. Figure 8 shows several screenshots of the video for sample No. 1.

Screenshots of video for sample No. 1.
All videos were produced to a uniform standard, with the same length and shooting angle. Each sample was set to rotate once, within the same unit time (10 seconds), around a rotational axis that coincides with the central axis of its shape. The focus group experts were invited to browse the videos of 50 samples randomly selected from all samples, and then to select, from the 36 preliminary relevant adjectives, those that best represented the emotion of each sample’s 3D form. After all expert opinions were assembled, the adjectives were sorted by occurrence frequency and the top 15 were retained as relevant adjectives. Subsequently, 100 senior college students majoring in industrial design (50 males and 50 females) were invited to complete a questionnaire survey on the 3D form emotions of the overall sample. Because of the large sample size, the samples and the test subjects were each randomly divided into 4 groups to reduce the burden on test subjects and improve evaluation accuracy. The 25 test subjects in each group (with men and women in the same proportion) evaluated 64 samples using the 7-point semantic differential method (-3 to 3), where -3 is the lowest score and 3 is the highest score for an adjective. The Z-score normalization method was used to check for abnormal data in each group. After data verification, the EMSs of all samples were calculated (Table 7).
EMS matrix
Then, the correlation coefficient matrix was calculated (Table 8), and the recognition model was used to obtain the FESs and rankings of the relevant adjectives (Table 9). Finally, a total of 12 adjectives with a score greater than 1 were selected as representative adjectives.
Correlation coefficient matrix
Note: The value of significance level P is the data in bracket, * indicates P < 0.05 and significantly correlated. ** indicates P < 0.01 and very significantly correlated.
FES and rankings of representative adjectives
Since the reduced-dimensional SPHARM coefficient matrix extracts the 3D form data features of the overall sample, the correlation coefficient and FES objectively represent the quantitative relationship between 3D form features and its emotional attributes, so the proposed recognition model has higher objectivity and accuracy. In addition, it quantitatively calculates and ranks the importance of all representative adjectives, fully retains the key emotional information, and helps to set scientific and flexible 3D form design goals.
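How significance-weighted correlations could be combined into an FES is sketched below. Both the scoring formula (a sum of α times the absolute correlation over all coefficients) and the mapping of α = 2, 1, 0 to P < 0.01, P < 0.05 and non-significant results are assumptions made for illustration. The text states only that α takes the values 0, 1 and 2 according to the significance level, so this is not the authors' exact recognition model, and all numbers are hypothetical.

```python
def multiplier(p_value):
    """Assumed mapping from significance level to the multiplier alpha."""
    if p_value < 0.01:
        return 2  # very significantly correlated
    if p_value < 0.05:
        return 1  # significantly correlated
    return 0      # not significant


def final_emotional_score(r_row, p_row):
    """Assumed FES for one adjective: sum over coefficients of alpha * |R_uv|."""
    return sum(multiplier(p) * abs(r) for r, p in zip(r_row, p_row))


# Hypothetical correlations of one adjective with four reduced coefficients.
r_row = [0.50, -0.25, 0.125, 0.0625]
p_row = [0.001, 0.03, 0.20, 0.60]
score = final_emotional_score(r_row, p_row)  # 2 * 0.5 + 1 * 0.25
```

Under these assumptions, only correlations that reach significance contribute to the score, and highly significant ones count double.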
Based on the recognition of representative adjectives, factor analysis was performed on the EMS matrix of the 12 representative adjectives using SPSS Statistics software. According to the variance contribution rates in the factor analysis results shown in Table 10, four principal factors were determined, whose cumulative variance contribution rate reached 83.78%.
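The variance contribution rate and its cumulative value follow directly from the eigenvalues of the correlation matrix, as the sketch below illustrates. The eigenvalues and the 80% threshold are hypothetical and chosen only to make the arithmetic easy to follow; they are not the values behind Table 10, and the selection rule used by the authors is not stated in the text.

```python
def variance_contribution(eigenvalues):
    """Return (rates, cumulative rates) in percent, largest eigenvalue first."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    rates = [100.0 * ev / total for ev in ordered]
    cumulative = []
    running = 0.0
    for rate in rates:
        running += rate
        cumulative.append(running)
    return rates, cumulative


def factors_needed(eigenvalues, threshold_percent):
    """Smallest number of factors whose cumulative rate reaches the threshold."""
    _, cumulative = variance_contribution(eigenvalues)
    for count, value in enumerate(cumulative, start=1):
        if value >= threshold_percent:
            return count
    return len(cumulative)


# Hypothetical eigenvalues for 12 standardized variables (they sum to 12).
eigs = [4.5, 3.0, 1.5, 1.5, 0.5, 0.25, 0.25, 0.125, 0.125, 0.125, 0.0625, 0.0625]
rates, cumulative = variance_contribution(eigs)
n_factors = factors_needed(eigs, 80.0)
```

Here the first four hypothetical factors contribute 37.5%, 25%, 12.5% and 12.5%, giving a cumulative rate of 87.5%.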
Total variance of interpretation
According to the maximum factor loading, as shown in Table 11, the four principal factors were named as “high-grade”, “dynamic”, “organic” and “unique” respectively, and they were set as sample emotional labels of this study.
Rotated component matrix
Then, k-means clustering analysis was performed on the scores of the four principal factors (left column of Table 12) extracted after the factor analysis, and all samples were classified into four clusters according to the four emotional labels, so that each sample had its own emotional label number (i.e., cluster number, right column of Table 12). These numbers are the label data, which together with the SPHARM coefficient matrix make up the input data of CVAE. The total number of samples in each cluster is shown in Table 13.
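The clustering step can be illustrated with a plain implementation of Lloyd's k-means algorithm. This is a minimal sketch, not the SPSS procedure used in the study: the factor scores and initial centers below are hypothetical, and the initialization is fixed so that the result is deterministic.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, initial_centers, max_iter=100):
    """Plain Lloyd's algorithm; returns (labels, centers)."""
    centers = [tuple(c) for c in initial_centers]
    labels = None
    for _ in range(max_iter):
        # Assignment step: index of the nearest center for every point.
        new_labels = [
            min(range(len(centers)), key=lambda k: squared_distance(p, centers[k]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for k in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # keep the old center if a cluster is empty
                centers[k] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return labels, centers


# Hypothetical scores on four principal factors for eight samples.
points = [
    (2, 0, 0, 0), (3, 0, 0, 0),
    (0, 2, 0, 0), (0, 3, 0, 0),
    (0, 0, 2, 0), (0, 0, 3, 0),
    (0, 0, 0, 2), (0, 0, 0, 3),
]
labels, centers = kmeans(points, [points[0], points[2], points[4], points[6]])
```

The resulting cluster indices play the role of the emotional label numbers that are passed to the CVAE together with the SPHARM coefficient matrix.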
Scores of principal factors and result of k-means clustering analysis
Sample size and proportion in four clusters
Model structure and parameter
After comparison and analysis, a CVAE encoder and decoder structure with three hidden layers was finally adopted (Fig. 9).

Network structure of CVAE.
As shown in Fig. 9, the sample emotional label data were input into the second fully connected layer (FCL) of the encoder and the third FCL of the decoder, respectively. The output layer uses the Sigmoid function, all other layers use the ReLU activation function, and the Adam optimization algorithm is used as the model’s optimizer. The dimensions of the hidden layers and the other parameters of the generative model are listed in Table 14.
Network structure and parameters of the generative model
Then, the generative model was implemented in Python with the Keras API according to the above parameters and run independently on a ThinkStation P410 (Xeon E5-1620 v3) with the Windows 10 operating system. After the data were input into the generative model, the SPHARM coefficients of 8 novel 3D form schemes (2 schemes for each emotional label) were generated. Figure 10 displays the 3D forms reconstructed from the SPHARM coefficients of these schemes. All test subjects were invited to evaluate these schemes using the four labels, and their EMSs were then calculated (Table 15). These schemes score well on the emotional label to which they belong, which indicates that the generative model generates novel schemes that conform to the target emotion.

Novel schemes reconstructed from newly generated SPHARM coefficients.
EMSs of 8 novel schemes
The following is a further discussion of the core techniques, technical parameters and design effects of the proposed method.
Values of SPHARM
SPHARM is the core technique behind the 3D form quantification function of the method. Its most important value is that it converts multi-type, difficult-to-process mesh data into single-type, easy-to-process SPHARM coefficient data, which greatly enhances the analyzability of 3D form data and its applicability to affective product design. SPHARM coefficients show good properties for feature extraction and feature reconstruction of 3D forms, which helps to construct a recognition model that incorporates 3D form data and a generative model based on a single high-performance algorithm. The variance contribution rates in Table 16 illustrate that most data features of the SPHARM coefficients can be extracted by a small number of principal components. The reduced-dimensional SPHARM coefficients after feature extraction improve the objectivity and accuracy of emotion recognition and the generative quality of 3D form. In addition, the SPHARM coefficients give the method high operational flexibility and applicability for affective product design involving 3D form innovation across different types of products. The reconstruction of 3D forms can be adjusted flexibly according to actual engineering needs, using high-frequency harmonics to represent complex and irregular surfaces and low-frequency harmonics to describe simple and continuous surfaces. Meanwhile, SPHARM has the advantages of fixed interpolation and precise scaling for 3D forms and is suitable for data compression and statistical modeling [31, 33].
Variance contribution rate of each PC
Because of its global role, CVAE is a key factor in determining the overall performance of the method, and a reasonable choice of its network structure is crucial. Many factors influenced the determination of its network structure. Firstly, to allow the data distribution of the latent space to be observed, the dimension of the CVAE bottleneck layer is usually set to 2. Testing showed that when the number of hidden layers in the model’s encoder and decoder is 3, calculation precision and efficiency can both be guaranteed, and that further increasing the number of hidden layers does not improve the model’s convergence but adds computational complexity and time, so the number of hidden layers in both networks was set to 3. Then, to extract a low-dimensional SPHARM matrix for use in generating the sample emotional labels, an appropriate low dimension was determined for the second FCL, which lies in front of the bottleneck layer. Principal component analysis showed that 4–12 principal components are enough to extract sufficient information from the original data. This dimension range also helps to reduce the data processing burden of emotional label recognition. Therefore, we tested the number of neurons in the second FCL within this range. In this process, a total loss index was used to describe the overall performance of the model. On the one hand, this index uses the mean squared error to measure the reconstruction error of the new data relative to the original data; on the other hand, it uses the KL divergence to represent the difference between the distribution of latent variables and the distribution of the recognition network $q_\phi(z|x)$. Its value is the sum of the reconstruction error and the KL divergence divided by the number of harmonic features. The smaller the index value, the better the model’s overall performance.
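The total loss index described above can be sketched as follows. The sketch assumes a diagonal Gaussian posterior $q_\phi(z|x)$ parameterized by a mean and a log-variance, and a standard normal prior for the KL term. These are common VAE choices rather than details confirmed by the text, and the function names and numbers are illustrative.

```python
import math


def mean_squared_error(original, reconstructed):
    """Mean squared error between an input vector and its reconstruction."""
    return sum((x - y) ** 2 for x, y in zip(original, reconstructed)) / len(original)


def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ) in closed form."""
    return -0.5 * sum(1.0 + lv - m ** 2 - math.exp(lv) for m, lv in zip(mu, log_var))


def total_loss(original, reconstructed, mu, log_var):
    """(reconstruction error + KL divergence) / number of harmonic features."""
    n_features = len(original)
    mse = mean_squared_error(original, reconstructed)
    kl = kl_to_standard_normal(mu, log_var)
    return (mse + kl) / n_features


# Hypothetical 4-feature input with a 2-dimensional bottleneck.
x = [0.0, 1.0, 0.5, 0.25]
perfect = total_loss(x, x, mu=[0.0, 0.0], log_var=[0.0, 0.0])
shifted = total_loss(x, x, mu=[1.0, 0.0], log_var=[0.0, 0.0])
```

A perfect reconstruction with a posterior equal to the prior gives a loss of 0, while shifting one latent mean to 1 adds a KL term of 0.5, which becomes 0.125 after division by the four features.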
Figures 11 and 12 show how the mean and variance of the loss over 30 independent runs change when the second and first FCLs, respectively, take different numbers of neurons. The translucent area around each polyline indicates the range of the loss variance. From the trend in Fig. 11, the number of neurons in the second FCL was first set to 8. Figure 12 shows further test results for different structures of the first FCL, which indicate that the effect is best when this layer has 200 neurons. The optimal structure of CVAE was therefore determined as 5400-200-8. Compared with the other structures, this structure reduces the mean loss of the model to the global optimal value rapidly and stably within 20 iterations, and minimizes the variance of the loss across multiple independent runs, which shows that it has the best convergence and stability.
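The comparison across independent runs described above can be sketched as follows, assuming each candidate structure has one loss curve per run. The selection rule (lowest final mean loss, ties broken by lowest final variance) and the loss values are hypothetical illustrations, not the authors' procedure or data.

```python
from statistics import mean, pvariance


def summarize_runs(loss_curves):
    """Per-iteration mean and variance of the loss across independent runs.

    `loss_curves` is a list of runs; each run is a list of per-iteration
    losses, and all runs have the same length.
    """
    per_iteration = list(zip(*loss_curves))
    means = [mean(values) for values in per_iteration]
    variances = [pvariance(values) for values in per_iteration]
    return means, variances


def best_structure(curves_by_structure):
    """Pick the candidate with the lowest final mean loss.

    Ties are broken by the lowest final variance (an assumed rule).
    """
    def final_stats(name):
        means, variances = summarize_runs(curves_by_structure[name])
        return (means[-1], variances[-1])

    return min(curves_by_structure, key=final_stats)


# Hypothetical curves: 3 runs x 3 iterations per candidate layer size.
curves = {
    4: [[0.9, 0.6, 0.5], [1.0, 0.7, 0.5], [0.8, 0.5, 0.5]],
    8: [[0.9, 0.4, 0.25], [1.0, 0.5, 0.25], [0.8, 0.3, 0.25]],
    12: [[0.9, 0.5, 0.5], [1.0, 0.5, 0.25], [0.8, 0.4, 0.5]],
}
print(best_structure(curves))
```

The per-iteration means and variances returned by `summarize_runs` correspond to the polyline and the translucent band in a trend chart of this kind.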

Trend chart of the mean and variance of loss for different structures of the second FCL.

Trend chart of the mean and variance of loss for different structures of the first FCL.
Emotion-oriented form generation is a crucial characteristic of affective product design, so a close relationship must be established between form data and emotional data. Figure 13 shows the change in the mean and variance of the loss after running the CVAE independently 30 times with different emotional labels. As the figure shows, the losses for all labels decrease gradually as the number of iterations increases, and all settle at a value within the acceptable error range. This result indicates that the selected network structure is accurate and stable when generating 3D forms adapted to different emotional labels.

Trend chart of the mean and variance of loss under different emotional labels.
The novel schemes generated under the four emotional labels shown in Fig. 10, together with the EMSs obtained from the evaluation results of all test subjects recorded in Table 15, show to a certain extent that each scheme fits its target emotion well. To further observe the design effect of the method, two indexes were introduced: form diversity and emotion matching degree. These indexes are calculated as
\[ D(i) = \frac{S(D(i))}{S(i)}, \qquad M(i) = \frac{S(M(i))}{S(i)} \]
where D(i) and M(i) are the form diversity and emotion matching degree of the design schemes generated under the i-th emotional label; S(i) is the total number of schemes under the i-th emotional label; S(D(i)) is the number of schemes in S(i) with visual differences; and S(M(i)) is the number of schemes in S(i) that conform to the target emotion.
CVAE was then run independently with each of the four emotional labels in turn, generating 10 design schemes per label, and all test subjects were invited to evaluate the affective response and visual similarity of the schemes. The statistical results were used to calculate the index values (Table 17).
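A minimal sketch of this calculation follows, assuming each index is the proportion of qualifying schemes among all schemes generated under a label, which is consistent with the symbol definitions above. The label names and counts are hypothetical and are not the values of Table 17.

```python
def design_effect_indexes(total, distinct, matching):
    """Return (form diversity, emotion matching degree) for one label.

    total    -- number of schemes generated under the label, S(i)
    distinct -- number of those schemes with visual differences, S(D(i))
    matching -- number of those schemes matching the target emotion, S(M(i))
    """
    if total <= 0:
        raise ValueError("total must be positive")
    if not (0 <= distinct <= total and 0 <= matching <= total):
        raise ValueError("counts must lie between 0 and total")
    return distinct / total, matching / total


# Hypothetical evaluation counts for two placeholder labels.
counts = {"label_a": (10, 8, 9), "label_b": (10, 7, 8)}
for label, (total, distinct, matching) in counts.items():
    diversity, match_degree = design_effect_indexes(total, distinct, matching)
    print(label, diversity, match_degree)
```

Both indexes lie between 0 and 1, so values for different labels can be compared directly even if the number of generated schemes differs.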
Design effect evaluation index of method
Referring to the sample size of each cluster (Table 13) and the index values, the effect of the method under different emotional labels can be examined further. As shown in Table 17, the “dynamic” emotional label has high index values, which suggests that sample size is an important factor affecting generative quality: relatively comprehensive and abundant sample form information gives the method better feature mining and generalization ability. In addition, although the sample sizes of the “organic” and “unique” clusters are close, their index values differ. Compared with the sample forms in the “organic” cluster, those in the “unique” cluster differ more in their overall form features but less in their partial form features. The index values of “unique” are higher than those of “organic”, which shows that the method has a better design effect for database samples with obvious differences in overall form features. The overall index values under the four emotional labels indicate that the method has established a 3D form generation mechanism that conforms to different emotional labels and can generate 3D forms in line with the target emotion. In effect, it fits the distributions of the 3D form data of multiple emotions simultaneously. Once the distribution of the 3D form data of samples with the same emotional label has been learned, novel 3D form design schemes matching the target emotion can be generated by random sampling, which increases the chance of obtaining abundant and varied schemes and reduces the workload of designers.
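The random sampling step can be sketched as follows. The sketch assumes a standard normal prior over the 2-D latent space and assumes the emotional label is supplied to the decoder as a one-hot vector concatenated after the latent code; neither detail is stated in the text, and the trained decoder itself is not shown. All names are hypothetical.

```python
import random

LATENT_DIM = 2  # bottleneck dimension stated in the text
NUM_LABELS = 4  # four emotional labels


def one_hot(index, size=NUM_LABELS):
    """Encode a label index as a one-hot list of floats."""
    if not 0 <= index < size:
        raise ValueError("label index out of range")
    return [1.0 if k == index else 0.0 for k in range(size)]


def sample_decoder_inputs(label_index, n_samples, seed=None):
    """Draw latent codes from N(0, I) and append the one-hot label.

    Each returned vector would be fed to a trained decoder to obtain one
    candidate 3D form under the chosen label.
    """
    rng = random.Random(seed)
    label = one_hot(label_index)
    return [
        [rng.gauss(0.0, 1.0) for _ in range(LATENT_DIM)] + label
        for _ in range(n_samples)
    ]


# Ten decoder inputs for the (hypothetical) label with index 2.
batch = sample_decoder_inputs(label_index=2, n_samples=10, seed=0)
print(len(batch), len(batch[0]))
```

Because only the latent part of each vector is random, repeated sampling under a fixed label yields different candidate forms that share the same target emotion.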
To improve design quality and address the shortcomings of existing design methods involving 3D form innovation in affective product design, this paper proposes a data-driven affective product design method composed of several technical models that use complete 3D surface data. Based on the triangular mesh model, SPHARM and CVAE, it constructs a mathematical model for quantifying 3D form, which acquires the complete surface data of a 3D object and lays a foundation for objective and accurate emotion recognition and for the generation of 3D forms. In addition, by bringing complete 3D surface data into the emotion recognition process, it establishes a new recognition model for determining the customer’s affective response, which improves the objectivity and accuracy of recognition. Finally, it uses CVAE to build a single-algorithm generative model that responds accurately to emotional data, so as to generate novel 3D form design schemes that meet the design goals efficiently.
In general, this study proposes a new concept of data-driven affective product design using complete 3D surface data, and forms a systematic and effective method that addresses the deficiencies of existing design methods in affective product design involving 3D form innovation. The method is in line with the future development trend of product design and its technical requirements in the context of Industry 4.0. Industry 4.0 is based on intelligent design, intelligent manufacturing and intelligent operation, in which data-driven digital twins and related technologies play an important role [34], and the work in this paper reflects this point. It starts from the product model in the real world, maps that model to an image in the virtual space and converts it into data, and then improves the real product through data-driven intelligent design, which is the workflow Industry 4.0 envisions. The key performances of the method are summarized as follows:
The mathematical model for the complete quantification of 3D form solves the problem that 3D form is difficult to quantify and analyze completely. It uses l = 30 to achieve a fine reconstructed visual effect of the 3D form and to obtain its complete surface data.
Compared with the traditional approach, which performs emotion recognition entirely separately from the form data, the emotion recognition model calculates emotion quantitatively from the correlation between the complete surface data of the 3D form and the emotional data, and can therefore identify and rank target emotions more objectively and comprehensively. Based on this model, four emotional labels were accurately determined.
The generative model based on CVAE, together with the effect evaluation indexes, realizes the generation and evaluation of 3D forms well.
The key design effect evaluation index values of the proposed method, form diversity and emotion matching degree, are not less than 0.7 and 0.8, respectively, which can effectively ensure the efficiency and quality of the actual design.
However, the method still has some limitations. The preprocessing of 3D forms still requires manual intervention; this process is not efficient, and its accuracy needs further improvement. Because the method cannot generate 3D forms that respond to multiple emotions or that contain color information, developing a multi-objective affective product design method that integrates 3D form and color will be a focus of follow-up research. In addition, because human emotions are vague and uncertain, future research can use fuzzy evaluation methods to better accommodate these characteristics. Finally, actual product form design must not only meet emotional requirements but also satisfy constraints on product function and performance. How to consider these factors together to obtain a form design result that better matches actual requirements is therefore a problem that needs in-depth study in the future.
Acknowledgments
This work is supported by the National Natural Science Foundation of China (Grant No. 71661023) and Jiangxi Province Humanities and Social Science Research Projects of Universities (Grant No. YS19234). We thank the editors and anonymous reviewers for their valuable comments and suggestions. We also thank all test subjects for their participation and assistance in the experimental study, and Dr. Qu Min from Nanchang University for her help with the language. Readers interested in the original data can contact the author via email.
