A dynamic network-based decision architecture for performance evaluation and improvement

Abstract

This study introduces a dynamic decision architecture for corporate performance forecasting, because poor performance has been widely recognized as the main trigger for a financial crisis. The architecture involves three steps: (1) performance evaluation and integration; (2) forecasting model construction; and (3) knowledge generation. First, the decision making trial and evaluation laboratory (DEMATEL) is incorporated with balanced scorecards (BSC) to discover the complicated/intertwined relationships among BSC’s four perspectives. To overcome the problem that BSC cannot yield a specific direction, the study then employs data envelopment analysis (DEA). Unlike previous studies that utilize an all-embracing one-stage model, this set-up extends it to a two-stage model that calculates the performance scores for each BSC perspective. By doing so, users can realize a company’s weaknesses and strengths and identify possible paths toward efficiency. VIKOR is subsequently used to summarize all scores into a synthesized one. Second, the analyzed outcomes are then fed into random vector functional-link (RVFL) networks to establish the forecasting model. To handle the opaque nature of RVFL, the instance learning method is conducted to extract the implicit decision logics. Finally, the introduced architecture, tested by real cases, offers a promising alternative for performance evaluation and forecasting.

Keywords

Artificial intelligence, decision-making, performance evaluation

1 Introduction

In today’s highly volatile and competitive business environment, companies are encountering more severe challenges (such as the demand for quick order fulfillment, fast delivery, and short lead time) than ever before [44]. With growing interdependence among companies in their supply chain, risks of all types - including financial difficulties, depressed economic atmospheres, operational troubles, cultural dissimilarities, labor cost problems, and political issues - can promptly cascade throughout the whole supply chain and cause financial troubles or, even worse, turn into a financial crisis [8, 15]. Due to its harsh and negative impacts on the whole society, financial crisis prediction has attracted a high degree of attention over the decades, especially when the financial market is full of uncertainties.

Compared to the widely examined research fields of financial crisis prediction and credit default forecasting, works on company performance evaluation are quite rare. Kamei [25] indicated that bad operating performance can be used to account for 99% of financial crises - that is, bad operating performance is a stage leading up to a financial crisis eruption. Thus, an inevitable task appears in how to establish a performance evaluation architecture that not only provides synthetic information about efficiency and effectiveness of firm operations, but also generates a comparative assessment that is intuitive, simplified, and standardized [42].

The nature of performance evaluation is unfortunately multi-dimensional, and the traditional evaluation architecture tends to focus merely on financial indicators, such as return on investment (ROI) and return on assets (ROA), which only represent past outcomes and reveal little regarding expectations of future trends. Thus, there is an urgent requirement to construct an over-arching architecture that goes beyond financial measures and takes other measures into consideration, such as employee skills, customer satisfaction, and culture of innovation, which are fundamental for business development [27]. Balanced scorecard (BSC), introduced by Kaplan and Norton [26], is one of the best-known and most widely used frameworks for performance evaluation. The aim of BSC is to translate a company’s strategic objectives into a set of operational measures and to illustrate the cause-and-effect relations between strategies and processes through four perspectives: financial, customer, internal business process, and learning and growth.

The relations among these four perspectives play a critical role in performance evaluation, but most researchers consider them in a simple manner (i.e., unidirectional) [46]. However, it must be mentioned that other essential relations between these perspectives may exist. Looking at the possible relations among the four perspectives (i.e., network-based structure) may yield a more reliable and comprehensive outcome for decision makers to form their own judgments [1]. We believe that the network-based structure better captures the dynamics of the production processes and sub-processes as well as more clearly depicts the company’s real operational status.

The network-based structure is established with the decision making trial and evaluation laboratory (DEMATEL), which was introduced by the Science and Human Affairs Program of the Battelle Memorial Institute of Geneva in 1972 [20] to solve groups of complex and intertwined social problems. This technique helps decision makers to recognize the interrelated and entwined relations among evaluation criteria, which can assist them in identifying the influential directions and weights of the utilized criteria when making a judgment. For example, Petrović and Kankaraš [41] and Liu et al. [34] used it to identify the relations among attributes and eliminate less significant attributes so as to prevent the problem of information overload. By joint utilization of DEMATEL and BSC, we can introduce a network-based decision architecture for advanced management.

Although BSC is one of the most comprehensive and intuitive performance evaluation frameworks, in practice this method suffers some challenges, including a lack of strategic focus, failure to identify inefficiency in the usage of resources, forcing managers to chase local optimization rather than looking for continuous improvement, and the inability to generate a final aggregated outcome without a benchmarking exercise [46]. Data envelopment analysis (DEA) can be applied to overcome these problems. The advantages of the joint approach (one stage model: BSC+DEA) are summarized as follows: (1) BSC eliminates the impact of information overload by focusing attention on the key successful elements, which may suggest the main attributes in the DEA model [3]; (2) DEA deals with multiple inputs and outputs of BSC at the same time without pre-determined cost functions, thus handling the problems coming from market participants; and (3) DEA assists BSC in generating operational standards by translating qualitative measures into quantitative ratios [10, 17].

Although the one-stage model tends to summarize well in yielding an overall performance score, it can also bury essential messages and obscure the needed action of users [19]. To combat this, we aim to move away from a unique, all embracing one-stage model by introducing a two-stage complementary model that calculates a performance score for each BSC perspective. By doing that, users can highlight the strongest and weakest dimensions of performance and identify relevant benchmarks for continuous learning toward efficiency.

How to convert the performance scores from each BSC perspective into a synthesized measure is a critical issue. This situation turns into a classical multiple-criteria decision-making (MCDM) problem - that is, decision makers often have to consider multiple criteria at the same time with conflicting outcomes on different criteria [35, 46]. One can apply an MCDM technique, VlseKriterijumska Optimizacija I Kompromisno Resenje (VIKOR), to handle this task.

After performing VIKOR, we obtain the aggregated outcome that can be used to arrange the companies from efficient to inefficient. The analyzed outcome is then fed into an artificial intelligence (AI)-based approach to construct the forecasting model. This study employs random vector functional link (RVFL) networks as they present numerous advantages, such as ease of use, superior approximation capability, and fast convergence [7]. However, RVFL is a neural network (NN)-based algorithm, and one critical challenge of this approach is representing the embedded decision logics in a human-readable format - that is, the inherent decision logic is a black box. If users cannot examine or judge the model’s decision logics (that is, the model’s decision judgment is vague), then its practical applications are impeded. To overcome this obstacle, we utilize an instance learning strategy to open up the black box and represent the knowledge in a transparent and easy-to-use manner. The contributions of this study are as follows.

We introduce a dynamic decision architecture that involves three steps: (1) performance evaluation and integration, (2) forecasting model construction, and (3) knowledge generation, for performance forecasting in today’s highly competitive environment.

In the first stage, we extend the original BSC with a unidirectional structure into network-based BSC by performing DEMATEL so as to describe the highly intertwined and interconnected business relations. By doing that the users can realize which element contributes more to the final outcome as well as assist in scarce resource allocation.

Although BSC offers numerous advantages and is widely used, it still comes with some difficulties: it does not specify how tradeoffs are to be made between different perspectives, and it cannot provide a specific direction for users to follow. To combat this, DEA is conducted. Unlike prior one-stage models (BSC+DEA) that merely focus on an aggregated outcome, this study calculates a performance score for each BSC perspective. By doing that, we can identify where there is room for improving operation performance and point out opportunities for reciprocal learning between companies. We also can help with the interests of multiple stakeholders. An MCDM method called VIKOR is implemented to convert performance scores from each BSC perspective into a synthesized one. Inspired by the principle of a hybrid model, we believe that the two-stage complementary model (DEMATEL+BSC+DEA+VIKOR) performs better in capturing the dynamics of the production processes and sub-processes.

In the second stage, the performance score derived from a two-stage complementary model is used to divide the companies into two groups: efficient and inefficient. The analyzed results are then injected into RVFL to construct the model for performance forecasting.

In the third stage, the instance learning method is utilized to extract the decision logics from RVFL and represent them in a human-readable format. If the users can examine or justify the decision logics derived from the black-box model (i.e., RVFL), then they will have greater incentive to accept this model as well as increase its practical applications.

The remainder of the paper runs as follows. Section 2 briefly surveys related works on performance evaluation. Section 3 details the methodologies used. Section 4 demonstrates the experiments and comparisons. Section 5 concludes the paper.

2 Literature review

2.1 The applications of DEMATEL and its extensions in an uncertain environment

DEMATEL has demonstrated its effectiveness in aggregating expert opinions about a problem and using them to overcome the complicated and multifaceted tasks that decision makers encounter [4, 35]. It is also one of the most powerful approaches for identifying the interrelations among criteria, determining the central criteria to express the usefulness of factors, and preventing the problem of over-fitting in the performance evaluation task. However, it is widely recognized that human perceptions on decision criteria are normally full of subjectivities. In real-life applications, human preferences are uncertain, and a decision maker might be unable to assign a crisp number to depict a preference [49]. To combat this, fuzzy logic, with its ability to handle information full of uncertainty and vagueness, is incorporated into DEMATEL to handle intrapersonal uncertainty [51]. Fuzzy DEMATEL has been successfully applied in many research fields: Lin and Wu [32] used it in the R&D project selection of a Taiwanese company, Chang et al. [9] utilized it to identify the key success factors in supplier selection, Karaşan and Kahraman [28] adopted it for freight village location selection, and Tan and Zhang [48] applied it to assess the risk elements associated with typhoon disaster management.

2.2 The integration of BSC and DEA

Balanced scorecard (BSC) [27], which can convert a company’s business objectives into numerous practical criteria distributed among four perspectives, is one of the most widely implemented frameworks for performance evaluation. Although this approach has been enriched by different scholars, some gaps still exist in certain aspects. It lacks a solid assessment of the criteria and an approach for combining information from different sources that can yield an aggregated outcome [38]. To combat this, DEA is considered. Rouse et al. [43] is the first study to emphasize the potential of complementing DEA analysis with performance assessment architectures based on BSC. They analyzed the productivity of the engineering service division in the airline industry. Based on the hybrid model, they could speed up the process of identifying the source of inefficiencies. Grounded on the same idea, Chen and Chen [11] studied efficiency in Taiwan’s semiconductor industry. Chiang and Lin [12] developed a hybrid model that integrated DEA with BSC to assess the performance of auto companies and commercial banks. One of the possible reasons for the aforementioned researchers’ willingness to work with hybrid models is that they all believe the hybrid model can capture the dynamics of production processes and sub-processes as well as provide deeper insights to assist them in forming better decisions. Thus, this study is also based on this concept to develop a hybrid model for performance assessment.

3 Methodologies

3.1 Decision making trial and evaluation laboratory: DEMATEL

DEMATEL, first proposed in the 1970s, is a commonly utilized approach to analyze and visualize the structure of complicated systems and to represent the direct and indirect relationships between factors. The strength of this approach is in revealing useful information about the structure of the task and assisting in identifying factors that play an essential role, which would otherwise be neglected [2]. This approach can be summarized in the following steps [16, 47].

Step 1: Collect expert opinions.

This method starts with gathering opinions from g experts E = {E₁, E₂, …, E_g} regarding the impact of the assessment criteria D = {D₁, D₂, …, D_n} on each other, utilizing a scale that ranges from 0 (“no influence”) to 4 (“very high influence”) to establish a pair-wise comparison matrix. The k^th expert generates an individual direct-influence matrix $Z_{k} = {[z_{ij}^{k}]}_{n \times n}$ whose principal diagonal entries are all zero, and $z_{ij}^{k}$ denotes expert E_k’s opinion regarding the strength with which factor D_i influences factor D_j.

Step 2: Calculating the average matrix Z.

The opinions of the g experts are summarized by the average matrix $Z = {[z_{ij}]}_{n \times n}$: $z_{ij} = \frac{1}{g} \sum_{k = 1}^{g} z_{ij}^{k}, \quad i, j = 1, 2, \dots, n$ (1)

Step 3: Calculating normalized influence matrix X.

Based on the average matrix Z, the normalized matrix $X = {[x_{ij}]}_{n \times n}$ is obtained from Equation (2), where f is the larger of the maximum row sum and the maximum column sum of Z: $X = \frac{Z}{f}, \quad f = \max \left( \max_{1 \leqslant i \leqslant n} \sum_{j = 1}^{n} z_{ij}, \; \max_{1 \leqslant j \leqslant n} \sum_{i = 1}^{n} z_{ij} \right)$ (2)

Step 4: Calculating total influence matrix T.

The total influence matrix $T = {[t_{ij}]}_{n \times n}$ can be determined from X by utilizing the transition rule and aggregating all direct and indirect effects. $T = X^{1} + X^{2} + \dots + X^{h} = X {(I - X)}^{- 1}, \quad h \to \infty$ (3)

Step 5: Identifying the influence relation map (IRM).

IRM can be derived from summing up the rows and columns of total relation matrix T. $R = {[r_{i}]}_{n \times 1} = {\left( \sum_{j = 1}^{n} t_{ij} \right)}_{n \times 1}, \quad C = {[c_{j}]}_{1 \times n} = {\left( \sum_{i = 1}^{n} t_{ij} \right)}_{1 \times n}^{T}$ (4)

Here, r_i denotes the sum of the i^th row of matrix T and can be used to indicate the direct and indirect impacts that are dispatched from factor D_i to other factors; c_j represents the sum of the j^th column of matrix T and can be used to illustrate the direct and indirect impacts that factor D_j receives from the other factors.

The causal diagram utilizes $(r + c, r - c)$ as ordered pairs. The horizontal axis (r + c) indicates the total degree of influence that an element gives and receives, and the vertical axis (r - c) depicts the net influence of one element on the others: a positive value marks a net cause, and a negative value marks a net effect. By using the causal diagram, the opaque and sophisticated causality among elements can be accessed in a transparent and intuitive way, and managers or decision makers can take this structure as a guide to form appropriate judgments.
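Steps 2 to 5 can be sketched in a few lines of NumPy. This is a minimal illustration rather than the implementation used in the study: the function name `dematel` and the two expert matrices are hypothetical, and the sketch assumes that the row sums of X stay below 1 for at least the rows needed to make I - X invertible, which holds for the toy data here.

```python
import numpy as np

def dematel(expert_matrices):
    """Run DEMATEL Steps 2-5 on a list of n x n expert matrices (0-4 scale)."""
    Z = np.mean(np.asarray(expert_matrices, dtype=float), axis=0)  # Eq. (1): average matrix
    f = max(Z.sum(axis=1).max(), Z.sum(axis=0).max())              # Eq. (2): largest row/column sum
    X = Z / f                                                      # normalized influence matrix
    n = X.shape[0]
    T = X @ np.linalg.inv(np.eye(n) - X)                           # Eq. (3): total influence matrix
    r = T.sum(axis=1)                                              # Eq. (4): row sums (influence given)
    c = T.sum(axis=0)                                              # Eq. (4): column sums (influence received)
    return T, r + c, r - c

# Hypothetical scores from two experts for three factors (illustrative only).
experts = [
    [[0, 3, 2], [1, 0, 4], [2, 1, 0]],
    [[0, 2, 3], [2, 0, 3], [1, 2, 0]],
]
T, prominence, relation = dematel(experts)
for i, (p, q) in enumerate(zip(prominence, relation), start=1):
    print(f"D{i}: r+c = {p:.3f}, r-c = {q:.3f}")
```

Because every entry of T is counted once in a row sum and once in a column sum, the r - c values always add up to zero, which is a quick sanity check on any DEMATEL computation.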

3.2 Data envelopment analysis: DEA

Data envelopment analysis (DEA), proposed in the second half of the 1970s [8], is a non-parametric linear-programming technique that assesses the efficiency of economic entities, also termed decision making units (DMUs), in converting multiple inputs into multiple outputs. Compared with other performance evaluation techniques, DEA offers numerous benefits: (1) it can handle multiple inputs and multiple outputs; (2) it does not require a pre-determined cost function; and (3) it works well for a small dataset; thus, it has become widely accepted and gained momentum in numerous research fields [22, 30].

By measuring DMUs with multiple inputs and outputs, DEA generates an efficient frontier. If the DMUs are on the efficient frontier, then they are considered efficient and hold an efficiency score of 1; if they are not on the efficient frontier, then they are considered inefficient and given a non-negative efficiency score below 1. Assuming s DMUs with e inputs and f outputs, the CCR efficiency of the k-th DMU is determined by: $\max \; g_{k} = \frac{\sum_{j = 1}^{f} o_{j} y_{jk}}{\sum_{i = 1}^{e} q_{i} x_{ik}}$ (5)

The solution to Equation (5) can be reached by transforming it into a linear programming format as in: $\begin{aligned} \max \quad & g_{k} = \sum_{j = 1}^{f} o_{j} y_{jk} \\ \text{s.t.} \quad & \sum_{i = 1}^{e} q_{i} x_{ik} = 1, \\ & \sum_{j = 1}^{f} o_{j} y_{jp} - \sum_{i = 1}^{e} q_{i} x_{ip} \leqslant 0, \quad p = 1, \dots, s, \\ & o_{j}, q_{i} \geqslant φ \quad \forall j, i \end{aligned}$ (6)

where φ is a positive infinitesimal value, $y_{jp}$ is the quantity of the j-th output of DMU_p, $x_{ip}$ is the quantity of the i-th input of DMU_p, $o_{j}$ is the weight of the j-th output, and $q_{i}$ is the weight of the i-th input.

As for model orientation, we assume that a company generally uses the resources at hand in order to reach its goals. Under this assumption, an output orientation seems a natural choice (that is, this study handles the problem of maximization of g_k instead of the minimization of g_k).
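The linear program in Equation (6) can be handed to any LP solver. The sketch below uses `scipy.optimize.linprog` and reads the last constraint as the lower bounds o_j ≥ φ and q_i ≥ φ; the function name `ccr_efficiency`, the value chosen for φ, and the three toy DMUs are assumptions made for illustration and are not taken from the study.

```python
import numpy as np
from scipy.optimize import linprog

def ccr_efficiency(X, Y, k, phi=1e-6):
    """CCR efficiency of DMU k from the multiplier form in Eq. (6).

    X: (s, e) input quantities, Y: (s, f) output quantities; rows are DMUs.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    s, e = X.shape
    f = Y.shape[1]
    # Decision vector is [o_1..o_f, q_1..q_e]; linprog minimizes, so negate the objective.
    cost = np.concatenate([-Y[k], np.zeros(e)])
    A_ub = np.hstack([Y, -X])                                   # o.y_p - q.x_p <= 0 for every DMU p
    b_ub = np.zeros(s)
    A_eq = np.concatenate([np.zeros(f), X[k]]).reshape(1, -1)   # q.x_k = 1
    b_eq = [1.0]
    bounds = [(phi, None)] * (f + e)                            # all weights >= phi
    res = linprog(cost, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                  bounds=bounds, method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return -res.fun

# Toy data (hypothetical): one input and one output per DMU.
inputs = [[2.0], [4.0], [5.0]]
outputs = [[2.0], [2.0], [5.0]]
scores = [ccr_efficiency(inputs, outputs, k) for k in range(3)]
print([round(v, 3) for v in scores])
```

With a single input and a single output, the CCR score reduces to each DMU's output/input ratio divided by the best ratio in the sample, so the first and third DMUs score about 1 and the second about 0.5. Running the same function once per BSC perspective, with that perspective's measures as inputs and outputs, would give one score per perspective.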

3.3 Random vector functional-link network: RVFL

With its advantage of universal approximation capability, the artificial neural network (ANN) with a back-propagation (BP) supervised learning mechanism is one of the most popular machine learning algorithms, but it has some weaknesses, such as slow convergence, extreme sensitivity to the choice of learning rate, and difficulty in escaping from local minima [45]. To handle these challenges, a randomized NN, called the random vector functional-link (RVFL) network, was introduced that assigns weights randomly and connects the input and output layers by a functional link [18, 40]. Igelnik and Pao [23] and Georgopoulos and Hasler [21] indicated that randomly generating the weights from the input layers to the hidden layers can enhance the model’s forecasting performance.

Assume that the approximation of g(x) is represented as g^*(x) and can be used to map input data x = [x₁, x₂, …, x_n] to a target value Y = [y₁, y₂, …, y_n]. We present a mathematical representation of RVFL as: $g^{*} (x) = \sum_{j} α_{j} \cdot b (F_{j}^{T} x + g_{j}),$ (7)

where b(·) is the activation function, F_j is the vector of weights connecting the input to the j^th enhancement node, and the bias term and the weight connected to the output are g_j and α_j, respectively. Since the weights from the input layer to the enhancement nodes and the bias terms can be randomly determined in an appropriate range and kept constant in the learning procedure, the only computation task is to determine the output weight α, which can be realized via handling the following task: $Y_{i} = e_{i}^{T} α, \quad i = 1, 2, \dots, N,$ (8)

where N denotes the total number of research targets, and e_i is the vector that concatenates the initial and random features of the i^th target.

In order to avoid the problem of over-fitting, we apply the Moore-Penrose pseudoinverse. Zhang and Suganthan [53] further indicated that ridge regression can perform a satisfactory job in handling this task. Thus, we apply ridge regression, which minimizes the following objective: $\sum_{i} {(Y_{i} - e_{i}^{T} α)}^{2} + β {∥ α ∥}^{2},$ (9)

where $α = E {(E^{T} E + βI)}^{- 1} Y$ denotes the solution of the aforementioned task, β depicts the regularization value, and the input and output matrices are E and Y, respectively. For a more detailed illustration of RVFL, one may refer to Zhang and Suganthan [53] and Katuwal et al. [29].
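A compact NumPy sketch of Equations (7) to (9) may make the training procedure concrete. It is an illustration under stated assumptions, not the configuration used in the study: the tanh activation, the uniform range [-1, 1] for the random weights, the number of enhancement nodes, and the names `rvfl_fit` and `rvfl_predict` are all hypothetical choices. The code stores samples as rows of E, so the ridge solution is written as (EᵀE + βI)⁻¹EᵀY, which is the same minimizer of Equation (9) expressed for that layout.

```python
import numpy as np

def rvfl_fit(X, Y, n_nodes=10, beta=1e-4, seed=0):
    """Fit RVFL output weights by ridge regression; random weights stay fixed."""
    rng = np.random.default_rng(seed)
    F = rng.uniform(-1.0, 1.0, size=(X.shape[1], n_nodes))  # random input-to-enhancement weights
    g = rng.uniform(-1.0, 1.0, size=n_nodes)                # random bias terms
    E = np.hstack([X, np.tanh(X @ F + g)])                  # initial features + random features
    # Closed-form ridge solution of Eq. (9) with samples as rows of E.
    alpha = np.linalg.solve(E.T @ E + beta * np.eye(E.shape[1]), E.T @ Y)
    return F, g, alpha

def rvfl_predict(X, F, g, alpha):
    """Apply the fixed random mapping and the learned output weights."""
    E = np.hstack([X, np.tanh(X @ F + g)])
    return E @ alpha

# Synthetic data (hypothetical): a linear target that the direct link can represent.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 2))
Y = 2.0 * X[:, 0] - X[:, 1]
F, g, alpha = rvfl_fit(X, Y)
pred = rvfl_predict(X, F, g, alpha)
max_err = float(np.max(np.abs(pred - Y)))
print(f"largest training error: {max_err:.4f}")
```

Only `alpha` is learned, and it comes from a single linear solve, which is why RVFL training is fast compared with back-propagation.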

3.4 The introduced dynamic network-based decision architecture

This study introduces a dynamic network-based decision architecture (see Fig. 1) that involves three steps for corporate operating performance forecasting: (1) performance evaluation and integration, (2) forecasting model construction, and (3) knowledge generation. Undoubtedly, one of the best-known and most widely used frameworks for performance evaluation in recent years is balanced scorecard (BSC). It not only can translate a company’s vision and strategic objectives into a set of performance measures distributed among four perspectives, but also can explicitly form the links between different perspectives in a single system. The original BSC assumes that the relations among four perspectives are unidirectional, but this cannot depict today’s highly intertwined business relations. To combat this, DEMATEL can help determine the causal relations and mutual influence among perspectives. While prior research merely focuses on an aggregated outcome, this study aims to go beyond an all-embracing outcome and advances to calculate the performance score for each BSC perspective so as to gain much deeper insights of business operations. Realizing the performance of a company in each BSC perspective may be useful, because it can help top managers to identify the relative efficiency of their firm’s operations with respect to different focal points represented by the perspectives.

Fig. 1

The dynamic network-based decision architecture.

How to transform the performance scores derived from each BSC perspective into a synthesized outcome is an essential task. This task is a classical MCDM task, and the MCDM method can be used to solve it. Thus, VIKOR (one type of MCDM method) is considered. After going through the DEMATEL+BSC+DEA+VIKOR procedure, all the companies can be discriminated into two groups: efficient and inefficient. Next, the analyzed results are injected into RVFL to construct the forecasting model. To prevent the impact of over-fitting, five-fold cross-validation is conducted. The inherent parameters of RVFL are decided randomly. The calculation process is terminated when the performance of RVFL reaches the stopping criterion.

RVFL belongs to neural network (NN)-based models, which have been criticized for lacking interpretability. To combat this, the instance learning algorithm grounded on a pedagogical structure is utilized to extract the decision rules from RVFL and represent them in a human-readable format. Users can rely on the model to assist them in forming their own judgments as well as maximize their business profits under anticipated risk exposure.
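The pedagogical idea can be illustrated with a deliberately small example: the trained network is treated as an oracle, its predicted labels are collected, and a readable rule is fitted to reproduce those labels. The text does not spell out the rule learner at this point, so the single-threshold rule below is only a stand-in chosen for brevity; the function `extract_stump`, the data, and the stand-in black box are hypothetical.

```python
import numpy as np

def extract_stump(X, black_box_labels):
    """Find the rule 'x[j] > t -> class p' that best mimics the black-box labels.

    Returns (fidelity, feature index j, threshold t, class p assigned above t).
    """
    best = (0.0, None, None, None)
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j]):
            for polarity in (1, 0):
                pred = np.where(X[:, j] > t, polarity, 1 - polarity)
                fidelity = np.mean(pred == black_box_labels)  # agreement with the oracle
                if fidelity > best[0]:
                    best = (fidelity, j, t, polarity)
    return best

# Stand-in for a trained forecasting model (hypothetical): it only looks at feature 0.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(100, 3))
oracle_labels = (X[:, 0] > 0.3).astype(int)

fidelity, feature, threshold, polarity = extract_stump(X, oracle_labels)
print(f"IF x[{feature}] > {threshold:.3f} THEN class {polarity} (fidelity {fidelity:.2f})")
```

Fidelity, the share of samples on which the rule agrees with the black box, is the natural quality measure here, because the rule is meant to explain the model rather than to replace it.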

4 Experimental results and sensitivity tests

4.1 The research target and data

Because the attention among members of the Organisation for Economic Co-operation and Development (OECD) has shifted from a labor-based economy to a knowledge-based economy, knowledge-intensive industries have turned into an essential center of economic growth and sustainable development. Knowledge-intensive industries in the United States sprang up dramatically in the early 1990s, reaching nearly 40% of GDP (National Science Board 2012 [39]). This situation not only exists in developed countries, but also in developing countries. Many developing economies have made a considerable effort to become the main producers/manufacturers of knowledge-intensive goods and services [14]. For example, the knowledge-intensive industries in Taiwan contributed to 20.4% of its GDP in 2012, and these specific industries have also become the main suppliers in the global supply chain, such as IC chips, laptops, and liquid crystal displays [13]. The Taiwan government has also provided numerous tax incentives and preferential loans for these specific industries, turning them into its economic backbone as well as important capital markets for global investors. In fact, the trading volumes of these specific industries make up over 60% of domestic stock turnover. Thus, we choose them as our research target.

All the data are collected from the Taiwan Economic Journal (TEJ), which provides detailed company-level data of companies listed and traded on the Taiwan Stock Exchange (TSE) and Taipei Exchange (TE). The research period ranges from 2016 to 2018. After deleting missing and extreme values, 1120 data points remain.

4.2 The independent attribute

BSC has demonstrated its usefulness in guiding successful strategy implementation, because it focuses on financial and non-financial perspectives, long-term and short-term strategies, and internal and external business measures [26]. However, Makhijani and Creelman [37] indicated that some companies have diluted the effectiveness of their BSC systems as a result of basic mistakes in mapping - that is, there is a lack of articulation about the cause-and-effect relationships between some of the suggested areas of assessment in BSC [36, 37]. To address this problem, DEMATEL is implemented to obtain the relationships among BSC perspectives.

One of the critical weaknesses of BSC is that it does not provide an aggregated outcome. To overcome this, we use DEA to measure company performance for each BSC perspective. By realizing this performance in each BSC perspective, managers can prioritize their own focus on reaching their goals. Lastly, we apply VIKOR to synthesize the performance result of the various BSC perspectives into an aggregated final outcome.

4.3 The dependent attribute

Company operating performance evaluation is highly related to the issue of financial crisis prediction, whereby the chosen attributes are taken as the surrogates for joining up with the dependent attributes. Based on previous related works, this study employs 10 attributes which are A1: CA/CL (Current assets to current liabilities), A2: TD/TA (Total debts to total assets), A3: WC/TA (Working capital to total assets), A4: OI/TA (Operating income to total assets), A5: S/TA (Sales to total assets), A6: OI/S (Operating income to sales), A7: LTD/TA (Long-term debts to total assets), A8: EBIT/IE (Earnings before interest and tax to interest expense), A9: C/S (Cash to sales), and A10: NI/(TA-TL) (Net income to (total assets – total liabilities)).

4.4 The results

After surveying the related works on performance evaluation by BSC, this study obtains 9 variables (see Table 1). Subsequently, the 9 variables are used to extract the implicit knowledge from specialists regarding the performance evaluation task. This study gathered 10 questionnaires from specialists as inputs to obtain the influential weights for the 9 variables. In our questionnaires, we presented specialists with a brief description of the purpose of this study. All of the specialists have more than 15 years of working experience in banks, CPA firms, or the financial industry (including one senior manager from Bank of Taiwan, two CPAs and four senior managers from the Big 4: Deloitte, Ernst & Young, PriceWaterhouseCoopers, and KPMG, and one financial analyst from Fubon Financial Holding Company).

Table 1
The variables of performance evaluation on BSC

Perspective	AVG.	S.D.	Rank	Weight
Financial perspective: FP
F1: Return on assets	6.4	0.55	1	0.20
F2: Return on equity	5.6	0.89	4	0.13
F3: Gross profit ratio	5.4	1.52	5	0.11
Customer perspective: CP
C1: Sales return rate	5.8	0.45	6	0.09
C2: Market share	4.4	1.14	8	0.04
Internal business process perspective: IBPP
I1: R&D expenditure rate	5.0	0.71	3	0.16
I2: Staff productivity	6.2	0.45	2	0.18
Learning and growth perspective: LGP
L1: Employee work seniority	4.6	0.89	7	0.07
L2: Employee education degree	4.2	0.84	9	0.02

^*Average is abbreviated to AVG., and standard deviation is abbreviated to S.D. ^**This study asks the specialists to mark the importance of measures, utilizing a 7-point Likert scale that ranges from 1 “extremely unimportant” to 7 “extremely important”. ^***The weight is determined by a trapezoidal method.

To obtain the relationships among BSC perspectives, it is necessary to undertake the following processes. First, we have to obtain the initial direct-relation matrix and normalize it to get the total relation (direct/indirect) matrix (see Table 2). In order to realize the influential relation between the measures, we take the mean as a threshold. If the values reach or exceed the threshold, then that perspective is considered to be more influential than the others. Subsequently, we add up the values in each row and the values in each column as r_i and c_j, respectively. The horizontal axis (r_i + c_j) depicts the degree of influential relations between perspectives, while the vertical axis (r_i - c_j) displays the degree of influential relations between one perspective and the other perspectives. Figure 2 represents the total-relation matrix with r_i + c_j and r_i - c_j. By performing DEMATEL, we move beyond the unidirectional relationships among the original BSC to a network-based structure so as to provide more information for users to make better decisions.

Table 2

The total relation (direct/indirect) matrix

Perspective	LGP	IBPP	CP	FP	r	r+c	r-c
LGP	2.52	2.72	2.70	2.49	10.43	21.39	–0.53
IBPP	3.00	2.68	2.95	2.71	11.34	22.07	0.61
CP	2.74	2.73	2.48	2.48	10.43	21.20	–0.34
FP	2.70	2.60	2.64	2.21	10.15	20.04	0.26
c	10.96	10.73	10.77	9.89

^*The mean (2.65) is taken as the threshold. If the value exceeds the threshold, then the value has an influence; if the value is less than the threshold, then the value has no influence.
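
The row sums, column sums, and threshold reported in Table 2 can be reproduced directly from the matrix entries. The sketch below uses only the published values; the labels "cause" and "effect" follow the usual DEMATEL reading of the sign of r - c and are an interpretive assumption, as is treating an entry above the mean as an influence edge.

```python
labels = ["LGP", "IBPP", "CP", "FP"]
# Total-relation matrix copied from Table 2 (rows influence columns).
T = [[2.52, 2.72, 2.70, 2.49],
     [3.00, 2.68, 2.95, 2.71],
     [2.74, 2.73, 2.48, 2.48],
     [2.70, 2.60, 2.64, 2.21]]

n = len(T)
r = [sum(row) for row in T]                               # influence given
c = [sum(T[i][j] for i in range(n)) for j in range(n)]    # influence received
threshold = sum(r) / (n * n)                              # mean of all entries

roles = {}
for i, name in enumerate(labels):
    roles[name] = "cause" if r[i] > c[i] else "effect"
    print(name, round(r[i] + c[i], 2), round(r[i] - c[i], 2), roles[name])

# Keep only the relations whose strength exceeds the threshold.
edges = [(labels[i], labels[j])
         for i in range(n) for j in range(n) if T[i][j] > threshold]
print(round(threshold, 2), edges)
```

With these values IBPP and FP come out as net causes and LGP and CP as net effects, in line with the signs in the r-c column.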

Fig. 2

The network structure of BSC.

The performance evaluation of all BSC perspectives is useful, because it may help managers to identify the relative efficiency of company operations with respect to the different focal points expressed by the perspectives [3]. The outcomes from each BSC perspective are synthesized by VIKOR to form an overall performance rank. We rank each company’s performance from superior to inferior by the synthesized outcome (i.e., overall performance rank). The highest quintile (top 20%) is designated as companies with efficient operations, and the lowest quintile (bottom 20%) is designated as companies with inefficient operations.
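
The synthesis and grouping step can be sketched as below, using the standard VIKOR index (group utility S, individual regret R, and compromise index Q, where a lower Q is better). The ten companies, their perspective scores, the equal weights, and the trade-off parameter v = 0.5 are all hypothetical choices made for illustration; the study's actual settings are not reproduced here.

```python
def vikor(scores, weights, v=0.5):
    """Return the VIKOR index Q per alternative (lower is better).

    scores[i][j] is the benefit-type score of alternative i on criterion j.
    """
    m = len(weights)
    best = [max(row[j] for row in scores) for j in range(m)]
    worst = [min(row[j] for row in scores) for j in range(m)]

    def rescale(value, low, high):
        return (value - low) / (high - low) if high > low else 0.0

    S, R = [], []
    for row in scores:
        # Weighted, normalized distance to the best value on each criterion.
        gaps = [weights[j] * rescale(best[j] - row[j], 0.0, best[j] - worst[j])
                for j in range(m)]
        S.append(sum(gaps))   # group utility
        R.append(max(gaps))   # individual regret
    return [v * rescale(s, min(S), max(S)) + (1 - v) * rescale(r, min(R), max(R))
            for s, r in zip(S, R)]


# Hypothetical scores per company in the order (LGP, IBPP, CP, FP).
companies = {
    "A": [0.95, 0.92, 0.90, 0.97], "B": [0.80, 0.85, 0.70, 0.90],
    "C": [0.60, 0.75, 0.65, 0.70], "D": [0.55, 0.50, 0.60, 0.65],
    "E": [0.70, 0.40, 0.55, 0.60], "F": [0.45, 0.60, 0.50, 0.40],
    "G": [0.40, 0.35, 0.45, 0.50], "H": [0.35, 0.45, 0.30, 0.35],
    "I": [0.30, 0.30, 0.35, 0.30], "J": [0.20, 0.25, 0.20, 0.22],
}
names = list(companies)
q = dict(zip(names, vikor([companies[n] for n in names], [0.25] * 4)))

ranking = sorted(names, key=q.get)          # superior to inferior
k = max(1, round(0.2 * len(names)))         # size of one quintile
efficient, inefficient = ranking[:k], ranking[-k:]
print(efficient, inefficient)
```

In this toy data set company A is best on every perspective and company J is worst on every perspective, so they receive Q = 0 and Q = 1, respectively.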

Table 3 presents the results for the two groups. We can see that the efficient group usually has good business operations (e.g., simplified workflows and eradicated redundancies), appropriate and quick customer reaction (e.g., enhanced communication and sound customer loyalty), and superior profit potential (e.g., selling profitable products and reducing product cost). Whether a company is efficient or inefficient, both groups place little emphasis on LGP among the four perspectives. In order to overcome this problem, the government announced the “Enterprise Human Resource Improvement Program” to improve employee capability and skills and to spur the restructuring of business operating procedures so as to increase competitiveness and further upgrade the nation’s industrial level.

Table 3

The results of two different groups (AVG)

Module	Efficient company	Inefficient company	p-value
Module 1: LGP	0.73	0.36	0.000
Module 2: IBPP	0.84	0.32	0.000
Module 3: CP	0.82	0.35	0.000
Module 4: FP	0.87	0.37	0.000
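
To illustrate how a per-perspective efficiency score of the kind averaged in Table 3 can be read, the sketch below uses the simplest special case of the CCR model: with one input and one output, a unit's efficiency is its output/input ratio divided by the best ratio in the sample. The companies, inputs, and outputs are hypothetical, and the general multi-input, multi-output case would require solving a linear program per unit, which is not shown.

```python
def ccr_efficiency(units):
    """Single-input, single-output CCR score relative to the best ratio."""
    ratios = {name: out / inp for name, (inp, out) in units.items()}
    best = max(ratios.values())
    return {name: ratio / best for name, ratio in ratios.items()}


# Hypothetical (input, output) pairs per company for two perspectives.
perspectives = {
    "LGP": {"A": (5.0, 2.0), "B": (4.0, 4.0), "C": (8.0, 6.0)},
    "FP": {"A": (10.0, 8.0), "B": (20.0, 10.0), "C": (10.0, 4.0)},
}
scores = {p: ccr_efficiency(units) for p, units in perspectives.items()}


def weakest_perspective(company):
    """Return the lowest-scoring perspective, its score, and efficient peers."""
    name = min(scores, key=lambda p: scores[p][company])
    peers = [unit for unit, score in scores[name].items() if score == 1.0]
    return name, scores[name][company], peers


print(weakest_perspective("A"))
```

Here company A is efficient on FP but weakest on LGP, and company B is the efficient peer on that perspective from which A could learn.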

The analyzed outcomes are then fed into the AI-based technique to construct the forecasting model. Before model construction, we first have to select the predictors. Based on related works, 10 predictors are taken. However, not all of the collected predictors are informative and useful: a great proportion of the data is derived from financial statements that may be contaminated by some degree of error arising from, for example, selective accounting principles and different estimation methods. To avoid biased outcomes, we apply rough set theory (RST), which has the merit of handling uncertainty, vagueness, and incompleteness [5]. In addition, ensemble learning is a prolific field in pattern recognition and data mining; it is grounded on the assumption that integrating the outcomes of multiple models is better than relying on a single model, and it usually achieves superior results.

West [52] indicated that even a little improvement in a model’s forecasting accuracy can translate into a large amount of future savings for market participants. Thus, this study performs RST with an ensemble structure to identify the most essential and relevant attributes, which are A1: CA/CL, A2: TD/TA, A5: S/TA, A6: OI/S, and A9: C/S. We see that efficient companies usually have sound risk-absorbing ability, appropriate capital structure, superior profitability, and sufficient free cash flow.
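
The core idea behind RST-based attribute selection can be sketched with the dependency degree, i.e., the share of objects whose condition-attribute values determine the decision unambiguously, combined with a greedy forward search in the spirit of QuickReduct. The discretized toy table, the attribute levels, and the greedy search are assumptions for illustration; the ensemble structure used in this study is not reproduced.

```python
from collections import defaultdict


def dependency(rows, attrs, decision):
    """Share of rows whose equivalence class under attrs has one decision."""
    classes = defaultdict(list)
    for row in rows:
        classes[tuple(row[a] for a in attrs)].append(row[decision])
    consistent = sum(len(ds) for ds in classes.values() if len(set(ds)) == 1)
    return consistent / len(rows)


def quick_reduct(rows, attrs, decision):
    """Greedily add the attribute that raises the dependency degree most."""
    target = dependency(rows, attrs, decision)
    chosen = []
    while dependency(rows, chosen, decision) < target:
        best = max((a for a in attrs if a not in chosen),
                   key=lambda a: dependency(rows, chosen + [a], decision))
        chosen.append(best)
    return chosen


# Hypothetical discretized ratios; "status" is the decision attribute.
rows = [
    {"A1": "high", "A2": "low",  "A5": "high", "status": "efficient"},
    {"A1": "high", "A2": "high", "A5": "high", "status": "efficient"},
    {"A1": "high", "A2": "high", "A5": "low",  "status": "inefficient"},
    {"A1": "low",  "A2": "low",  "A5": "high", "status": "inefficient"},
    {"A1": "low",  "A2": "high", "A5": "low",  "status": "inefficient"},
    {"A1": "high", "A2": "low",  "A5": "low",  "status": "inefficient"},
]
reduct = quick_reduct(rows, ["A1", "A2", "A5"], "status")
print(reduct)
```

In this toy table A2 carries no information about the decision, so the search keeps only A1 and A5.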

To examine the usefulness of feature selection, we divide the experiments into three scenarios: (1) no feature selection, (2) feature selection by RST, and (3) feature selection by RST with an ensemble structure. To rule out outcomes that arise merely by chance, we conduct a statistical examination that considers four measures: overall accuracy, precision, recall, and F-measure. The results are in Table 4. We can see that the models with feature selection (scenarios 2 and 3) perform better than the model without it (scenario 1). This finding is in accordance with the prior work of Wang et al. [51], who pointed out that feature selection reduces computational complexity and also increases a model’s generalization ability. The comparison between scenarios 2 and 3 demonstrates the usefulness of ensemble learning. This outcome is consistent with Bolón-Canedo and Alonso-Betanzos [6], who stated that an ensemble-structured model can add stability and transparency in the procedure of knowledge discovery.

Table 4

Comparison of results

Measures (AVG.)	Classifier: RVFL
	Scenario 1	Scenario 2	Scenario 3
Accuracy	63.33	70.89	83.33
Precision	63.21	70.94	82.70
Recall	64.00	71.11	84.44
F-Measure	63.55	71.00	83.50
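
For reference, the four measures reported in Table 4 can be computed from predicted and actual labels as sketched below. The labels are hypothetical, and the choice of label 1 (efficient) as the positive class is an assumption.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall, and F-measure for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = len(pairs) - tp - fp - fn

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    return accuracy, precision, recall, f_measure


# Hypothetical labels: 1 = efficient, 0 = inefficient.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Averaging each measure over the cross-validation folds gives figures comparable in form to those in the table.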

To test the effectiveness of the introduced model, we take it as a benchmark and compare it with three other models: decision tree (DT), support vector machine (SVM), and neural network (NN). Five-fold cross-validation (CV) is performed to reduce the risk of over-fitting. The results are in Table 5, indicating that the introduced model outperforms the other three models on all assessment measures.

Table 5

Comparison of results in the four models

Model	Assessing criterion (AVG.) (Rank)
	OA	Pre.	Rec.	F-m
RVFL	83.33 (1)	82.70 (1)	84.44 (1)	83.50 (1)
DT	65.56 (4)	66.02 (4)	64.44 (4)	65.15 (4)
SVM	74.00 (2)	73.28 (2)	75.56 (2)	74.38 (2)
NN	69.11 (3)	68.90 (3)	68.90 (3)	69.30 (3)
p-value	0.000	0.000	0.000	0.000

Notes: OA: overall accuracy; Pre.: precision; Rec.: recall, and F-m.: F-measure.
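
A minimal sketch of an RVFL classifier evaluated with five-fold cross-validation is given below. It follows the general RVFL idea (random, fixed hidden weights; direct links from inputs to the output; output weights solved in closed form by ridge regression), but the number of hidden nodes, the tanh activation, the ridge parameter, and the synthetic data are all assumptions and do not reproduce the configuration behind Table 5.

```python
import numpy as np


class RVFL:
    """Random vector functional-link network for binary labels (0/1)."""

    def __init__(self, n_hidden=20, ridge=1e-2, seed=0):
        self.n_hidden, self.ridge, self.seed = n_hidden, ridge, seed

    def _features(self, X):
        hidden = np.tanh(X @ self.W + self.b)
        # Direct links (X), random hidden features, and a bias column.
        return np.hstack([X, hidden, np.ones((X.shape[0], 1))])

    def fit(self, X, y):
        rng = np.random.default_rng(self.seed)
        # Hidden weights are drawn once and never trained.
        self.W = rng.uniform(-1, 1, size=(X.shape[1], self.n_hidden))
        self.b = rng.uniform(-1, 1, size=self.n_hidden)
        D = self._features(X)
        target = 2.0 * y - 1.0  # map labels {0, 1} to {-1, +1}
        A = D.T @ D + self.ridge * np.eye(D.shape[1])
        self.beta = np.linalg.solve(A, D.T @ target)
        return self

    def predict(self, X):
        return (self._features(X) @ self.beta > 0).astype(int)


def cross_val_accuracy(X, y, k=5, seed=0):
    """Mean test accuracy over k shuffled folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    accuracies = []
    for i in range(k):
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        model = RVFL(seed=seed).fit(X[train], y[train])
        accuracies.append(np.mean(model.predict(X[folds[i]]) == y[folds[i]]))
    return float(np.mean(accuracies))


# Synthetic data standing in for the five selected ratios.
data_rng = np.random.default_rng(42)
X = data_rng.normal(size=(200, 5))
y = (X[:, 0] + X[:, 1] > 0).astype(int)
cv_accuracy = cross_val_accuracy(X, y)
print(round(cv_accuracy, 3))
```

Because only the output weights are estimated, training reduces to one linear solve per fold, which is the main practical appeal of this network family.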

RVFL is an NN-based algorithm, and one of its critical weaknesses is an inability to provide an intuitive and comprehensible judgment for users to examine and test the forecasted outcome; that is, the inherent decision process is a black box. To open up the black box, this study implements the instance learning method to extract the implicit decision rules and represent them in a human-readable manner so as to increase their practical applications. The decision rules are in Table 6. We can see that a company with efficient operations usually has a suitable capital structure, higher profitability, and sufficient cash flow so as to react quickly to a severe financial shock and to survive in a turbulent business atmosphere.

Table 6

The decision rules from RVFL

No	If “Condition”	Then “Decision”
1	If “A1:CA/CL [0.85, 1.21]”, and “A5:S/TA [0.97, 1.33]”	Efficient
2	If “A1:CA/CL [0.05, 0.27]”, “A2:TD/TA [0.54, 0.73]”, and “A6:OI/S [0.03, 0.11]”	Inefficient
3	If “A2:TD/TA [0.54, 0.73]”, “A6:OI/S [0.18, 0.27]”, and “A9:C/S [0.04, 0.13]”	Inefficient
4	If “A2:TD/TA [0.54, 0.73]”, “A5:S/TA [0.97, 1.33]”, and “A9:C/S [0.21, 0.35]”	Efficient
5	If “A5:S/TA [0.38, 0.63]”, and “A9:C/S [0.21, 0.35]”	Efficient
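
The rules in Table 6 can be applied mechanically, as the following sketch shows. The interval bounds are copied from the table; treating the intervals as closed, checking the rules in their listed order, and returning "Unclassified" when no rule fires are assumptions, as are the example companies.

```python
# Each rule: ({attribute: (lower, upper)}, decision), in the order of Table 6.
RULES = [
    ({"A1": (0.85, 1.21), "A5": (0.97, 1.33)}, "Efficient"),
    ({"A1": (0.05, 0.27), "A2": (0.54, 0.73), "A6": (0.03, 0.11)}, "Inefficient"),
    ({"A2": (0.54, 0.73), "A6": (0.18, 0.27), "A9": (0.04, 0.13)}, "Inefficient"),
    ({"A2": (0.54, 0.73), "A5": (0.97, 1.33), "A9": (0.21, 0.35)}, "Efficient"),
    ({"A5": (0.38, 0.63), "A9": (0.21, 0.35)}, "Efficient"),
]


def classify(company):
    """Return (rule number, decision) for the first rule that matches."""
    for number, (conditions, decision) in enumerate(RULES, start=1):
        if all(attr in company and low <= company[attr] <= high
               for attr, (low, high) in conditions.items()):
            return number, decision
    return None, "Unclassified"


# Hypothetical companies described by the five selected ratios.
firm_x = {"A1": 1.00, "A2": 0.40, "A5": 1.10, "A6": 0.20, "A9": 0.10}
firm_y = {"A1": 0.10, "A2": 0.60, "A5": 0.50, "A6": 0.05, "A9": 0.10}
firm_z = {"A1": 0.50, "A2": 0.30, "A5": 0.80, "A6": 0.15, "A9": 0.15}
print(classify(firm_x), classify(firm_y), classify(firm_z))
```

Since the rules cover only part of the attribute space, some companies match no rule; a deployed system would need a fallback such as the RVFL prediction itself.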

4.5 Sensitivity test

Most studies have reached their conclusions by relying on only one pre-determined dataset, which is not reliable, because different inputs lead to different outputs. To make our research findings robust and to examine the applicability of the proposed model, we consider three situations: (1) a one-stage model (BSC+DEA) [11], (2) a two-stage model (BSC+DEA+DEA) [3], and (3) the proposed model (DEMATEL+BSC+DEA+VIKOR). Figure 3 illustrates the results. We can see that our proposed mechanism has better discriminant ability (lower mean and higher variance) than the other two models. Table 7 presents the forecasting outcomes when performance ranks are determined by these two models. We can see that our RVFL-based forecasting model still performs better than the other three models (DT, SVM, and NN) on all assessment measures.

Fig. 3

The results of three different mechanisms.

Table 7

The results from two different situations

Model	Assessing criterion (AVG.) (Rank) in situation 1
	OA	Pre.	Rec.	F-m
RVFL	80.78 (1)	80.80 (1)	80.78 (1)	80.77 (1)
DT	63.67 (4)	64.07 (4)	62.56 (4)	63.18 (4)
SVM	74.11 (2)	74.14 (2)	74.11 (2)	74.12 (2)
NN	71.44 (3)	71.11 (3)	72.33 (3)	71.69 (3)
p-value	0.001	0.000	0.002	0.000
	Assessing criterion (AVG.) (Rank) in situation 2
RVFL	81.58 (1)	84.64 (1)	80.41 (1)	82.47 (1)
DT	66.35 (4)	70.74 (4)	65.35 (4)	67.89 (4)
SVM	75.19 (2)	78.90 (2)	74.41 (2)	76.58 (2)
NN	73.90 (3)	78.25 (3)	72.29 (3)	75.14 (3)
p-value	0.002	0.032	0.003	0.006

Notes: OA: overall accuracy; Pre.: precision; Rec.: recall, and F-m.: F-measure.

4.6 Managerial implications

Moving away from an all-embracing one-stage model toward a hybrid two-stage model can be advantageous for performance evaluation and improvement. One reason is that, by calculating the performance score for each BSC perspective, managers can identify the strongest and weakest perspectives of their company and the relevant benchmark for continuous learning toward efficiency. Furthermore, in line with a network-based learning structure, we believe that this model better captures the dynamics of production processes and sub-processes, provides more insights, and assists in allocating resources to suitable places.

Taking the efficient group as an example, if this group wants to improve its LGP in the subsequent period, then it should set “variables in FP” as the top priority, because the aggregated performance gap on perspective LGP is the highest among the four perspectives (see Fig. 4). By understanding the cause-and-effect relations among the four perspectives, firm managers can allocate valuable resources to appropriate places so as to fulfill their mission and reach the goal of both profits and sustainable development.

Fig. 4

The performance gaps to the optimal level with cause-and-effect influences.

5 Conclusions and future works

One of the essential issues in a company’s performance evaluation is identifying the weak points in the inputs and taking internal and intertwined relations into consideration. The goal of top executives is to form a balance between dissimilar visions; hence, balanced scorecard (BSC), which can handle financial and non-financial perspectives, long-term and short-term strategies, and internal and external business measures, seems like an appropriate approach to handle this balancing task, but the issue is that relations between different perspectives are designed as unidirectional. Unfortunately, in today’s highly interconnected and collaborative environment, unidirectional relations among the four BSC perspectives are unable to precisely describe the whole picture of a company’s operations.

To handle the above task, we apply DEMATEL to extract the causal relationships and mutual influences among the perspectives. In other words, this study considers advancing BSC with a network structure instead of utilizing the traditional BSC with a unidirectional structure in order to provide users with more overarching considerations to form better judgments. However, BSC is unable to indicate inefficiency in the utilization of a resource, and it cannot discriminate the difference between efficient and inefficient companies.

We therefore add DEA into our approach. Unlike prior related studies that combined BSC and DEA into a one-stage (BSC+DEA) procedure to get an aggregated outcome, the aggregated outcome derived herein follows a two-stage procedure. In the first stage, DEA is performed to compute the performance score for each BSC perspective. By doing so, managers can see their company’s performance on each BSC perspective, identify where there is room for improvement, and determine opportunities for reciprocal learning between DMUs. Moreover, managers can further allocate resources to suitable places to reduce the gap on each criterion and to promote a continuous learning procedure for reaching the optimal level.

In the second stage, VIKOR is applied to synthesize the performance scores of the various BSC perspectives into an overall performance measure. The analyzed outcome is then fed into RVFL to construct the model for operating performance forecasting. However, while RVFL offers superior forecasting capability, it also comes with a critical weakness, as it does not provide any intuitive and transparent decision logic - that is, the inherent decision judgment is a black-box. To handle this task, we execute instance learning, grounded on a pedagogical structure, to extract the decision knowledge from RVFL, to present it in a human-readable manner, and to increase its practical applications.

Future works can consider two potential research directions. First, we have worked on the target sample in Taiwan’s electronics industry, which suggests that the ability to generalize the results could be limited. Future studies can look into other industries or conduct cross-country analysis. Second, future research can extend a singular mechanism into much more sophisticated mechanisms, such as fuzzy DEMATEL, in order to handle an uncertain environment and human subjectivity as well as to gain more insight from the outcome.

Disclosure statement

No conflicts of interest exist in the submission of this manuscript.

Acknowledgments

The authors would like to thank the Ministry of Science and Technology, Taiwan, R.O.C. for financially supporting this work under contracts No. 106-2410-H-034-011-MY3, No. 108-2410-H-034-056-MY2, No. 108-2410-H-034-050-MY2, and No. 109-2410-H-034-034-MY2.

References

1. Amado C.A.F., Santos S.P. and Marques P.M., Integrating the Data Envelopment Analysis and the Balanced Scorecard approaches for enhanced performance assessment, Omega 40 (2012), 390–403.

2. Asan, Kadaifci, Bozdag, Soyer and Serdarasan, A new approach to DEMATEL based on interval-valued hesitant fuzzy sets, Applied Soft Computing 66 (2018), 34–49.

3. Basso, Casarin and Funari, How well is the museum performing? A joint use of DEA and BSC to measure the performance of museums, Omega 81 (2018), 67–84.

4. Bavafa, Mahdiyar and Marsono A.K., Identifying and assessing the critical factors for effective implementation of safety programs in construction projects, Safety Science 106 (2018), 47–56.

5. Bhattacharya, Goswami R.T. and Mukherjee, A feature selection technique based on rough set and improvised PSO algorithm (PSORS-FS) for permission based detection of Android malwares, International Journal of Machine Learning and Cybernetics 10 (2019), 1893–1907.

6. Bolón-Canedo and Alonso-Betanzos, Ensembles for feature selection: A review and future trends, Information Fusion 52 (2019), 1–12.

7. Bisoi, Dash P.K. and Mishra S.P., Modes decomposition method in fusion with robust random vector functional link network for crude oil price forecasting, Applied Soft Computing 80 (2019), 475–493.

8. Charnes, Cooper W.W. and Rhodes, Measuring the efficiency of decision making units, European Journal of Operational Research 2 (1978), 429–444.

9. Chang, Chang C.W. and Wu C.H., Fuzzy DEMATEL Method for developing supplier selection criteria, ications 38 (2011), 1850–1858.

10. Chang T.M., Hsu M.F. and Lin S.J., Integrated news mining technique and AI-based mechanism for corporate performance forecasting, Information Sciences 424 (2018), 273–286.

11. Chen and Chen, DEA performance evaluation based on BSC indicators incorporated: the case of semiconductor industry, International Journal of Productivity and Performance Management 56 (2007), 335–357.

12. Chiang C.Y. and Lin, An integration of Balanced Scorecards and data envelopment analysis for firm’s benchmarking management, Total Quality Management 20 (2009), 153–172.

13. Chien M.S., Lawler J.S. and Uen J.F., Performance-based pay, procedural justice and job performance for R&D professionals: Evidence from the Taiwan high-tech sector, The International Journal of Human Resource Management 21 (2010), 2234–2248.

14. Chi C.W., Lieu P.T., Hung and Cheng H.W., Do industry or firm effects drive performance in Taiwanese knowledge-intensive industries? Asia Pacific Management Review 21 (2016), 170–179.

15. Cruz J.M. and Liu, Modeling and analysis of the multiperiod effects of social relationship on supply chain networks, European Journal of Operational Research 214 (2011), 39–52.

16. Dalvi-Esfahani, Niknafs, Kuss D.J., Nilashi and Afrough, Social media addiction: Applying the DEMATEL approach, Telematics and Informatics 43 (2019), doi.org/10.1016/j.tele.2019.101250

17. Eilat, Golany and Shtub, Constructing and evaluating balanced portfolios of R&D projects with interactions: A DEA based methodology, European Journal of Operational Research 172 (2006), 1018–1039.

18. Friedman, Hastie and Tibshirani, The elements of statistical learning, second ed., Springer (2009).

19. Fitzgerald and Storbeck E.J., Distinguishing interests in the performance of regulated water: the UK experience, Centre for Business Performance (2002), 197–203.

20. Gabus and Fontela, World Problems, an Invitation to Further Thought within the Framework of DEMATEL, Battelle Geneva Research Centre, Geneva, Switzerland (1972).

21. Georgopoulos and Hasler, Distributed machine learning in networks by consensus, Neurocomputing 124 (2014), 2–12.

22. Hsu M.F., Yeh C.C. and Lin S.J., Integrating dynamic Malmquist DEA and social network computing for advanced management decisions, Journal of Intelligent & Fuzzy Systems 35 (2018), 231–241.

23. Igelnik and Pao Y.H., Stochastic choice of basis functions in adaptive function approximation and the functional-link net, IEEE Transactions on Neural Networks 6 (1995), 1320–1329.

24. Jiang, Zou, Lio and Li, The uncertain DEA models for specific scale efficiency identification, Journal of Intelligent & Fuzzy Systems (2019), DOI: 10.3233/JIFS-190662

25. Kamei, Risk management (in Japanese), Dobunkan, Tokyo (1997).

26. Kaplan R.S. and Norton D.P., The balanced scorecard-measures that drive performances, Harvard Business Review 70 (1992), 71–79.

27. Kaplan R.S. and Norton D.P., Having trouble with your strategy? Then map it. Focus your organization on strategy - with the Balanced Scorecard, Harvard Business Review 78 (2000), 167–176.

28. Karaşan and Kahraman, A novel intuitionistic fuzzy DEMATEL – ANP – TOPSIS integrated methodology for freight village location selection, Journal of Intelligent & Fuzzy Systems 36 (2019), 1335–1352.

29. Katuwal, Suganthan P.N. and Zhang, An ensemble of decision trees with random vector functional link networks for multi-class classification, Applied Soft Computing 70 (2018), 1146–1153.

30. Khan I.U. and Karam F.W., Intelligent business analytics using proposed input/output oriented data envelopment analysis DEA and slack based DEA models for US-airlines, Journal of Intelligent & Fuzzy Systems, DOI: 10.3233/JIFS-190641

31. Karaşan and Kahraman, A novel intuitionistic fuzzy DEMATEL – ANP – TOPSIS integrated methodology for freight village location selection, Journal of Intelligent & Fuzzy Systems 36 (2019), 1335–1352.

32. Lin C.J. and Wu W.W., A causal analytical method for group decision-making under fuzzy environment, ications 34 (2008), 205–213.

33. Lin C.L., Shih Y.H., Tzeng G.H. and Yu H.C., A service selection model for digital music service platforms using a hybrid MCDM approach, Applied Soft Computing 48 (2016), 385–403.

34. Liu, Aiwu, Lukovac and Vukić, A multicriteria model for the selection of the transport service provider: A single valued neutrosophic DEMATEL multicriteria model, Decision Making: Applications in Management and Engineering 1 (2018), 121–130.

35. Mahdiyar, Tabatabaee, Abdullah and Marto, Identifying and assessing the critical criteria affecting decision-making for green roof type selection, y 39 (2018), 772–783.

36. Malmi, Balanced scorecards in Finnish companies: A research note, Management Accounting Research 12 (2001), 207–220.

37. Makhijani and Creelman, How leading organizations successfully implement corporate strategy with the balanced scorecard, The OTI Thought Leadership Series 1 (2008), 1–16.

38. Min and Joo S.J., A data envelopment analysis-based balanced scorecard for measuring the comparative efficiency of Korean luxury hotels, International Journal of Quality & Reliability Management 25 (2008), 349–365.

39. National Science Board, Science and engineering indicators 2012, NSB 12-01, National Science Foundation, Arlington VA (2012).

40. Pao Y.H. and Takefuji, Functional-link net computing: theory, system architecture, and functionalities, Computer 25 (1992), 76–79.

41. Petrović I.B. and Kankaraš, DEMATEL-AHP multi-criteria decision making model for the determination and evaluation of criteria for selecting an air traffic protection aircraft, Decision Making: Applications in Management and Engineering 1 (2018), 93–110.

42. Quezada L.E., López-Ospina H.A., Palominos P.I. and Oddershede A.M., Identifying causal relationships in strategy maps using ANP and DEMATEL, Computers & Industrial Engineering 118 (2018), 170–179.

43. Rouse, Putterill and Ryan, Integrated performance measurement design: insights from an application in aircraft maintenance, Management Accounting Research 13 (2002), 229–248.

44. Saberi, Cruz J.M., Sarkis and Nagurney, A competitive multiperiod supply chain network model with freight carriers and green technology investment option, European Journal of Operational Research 266 (2018), 934–949.

45. Schmidt W.F., Kraaijveld M.A. and Duin R.P.W., Feedforward neural networks with random weights, in: 11th IAPR International Conference on Pattern Recognition (1992), 1–4.

46. Shafiee, Lotfi F.H. and Saleh, Supply chain performance evaluation with data envelopment analysis and balanced scorecard approach, Applied Mathematical Modelling 38 (2014), 5092–5112.

47. Sumrit and Anuntavoranich, Using DEMATEL Method to Analyze the Causal Relations on Technological Innovation Capability Evaluation Factors in Thai Technology-Based Firms, International Transaction Journal of Engineering, Management, & Applied Sciences & Technologies 4 (2012), 81–103.

48. Tan and Zhang, Multiple attribute decision making method based on DEMATEL and fuzzy distance of trapezoidal fuzzy neutrosophic numbers and its application in typhoon disaster evaluation, Journal of Intelligent & Fuzzy Systems, DOI: 10.3233/JIFS-191758

49. Tseng M.L. and Lin Y.H., Application of fuzzy DEMATEL to develop a cause and effect model of municipal solid waste management in Metro Manila, Monitoring and Assessment 158 (2008), 519–533.

50. W.W. and Lee Y.T., Developing global managers’ competencies using the fuzzy DEMATEL method, ications 32 (2007), 499–507.

51. Wang, Wu, Wang, Xiang and Huang, A feature selection approach for hyperspectral image based on modified ant lion optimizer, Knowledge-Based Systems 168 (2019), 39–48.

52. West, Neural network credit scoring models, Computers & Operations Research 27 (2000), 1131–1152.

53. Zhang and Suganthan P.N., A comprehensive evaluation of random vector functional link networks, Information Sciences 367–368 (2016), 1094–1105.