A new case retrieval method based on double frontiers data envelopment analysis

Abstract

Case retrieval is a major step in case-based reasoning (CBR), which seeks the most similar historical case to correspond to the target case. However, the first step of similarity measurement is to determine the weights of attributes, which would affect the accuracy of the similarity calculation results. In this study, we propose a new method, called DEA-CBR that integrates the double frontiers data envelopment analysis (DEA) to determine the most similar historical case based on the similarity efficiency of each historical case. This proposed method is different from the traditional distance-based similarity measurement methods in that attribute weights are determined by DEA models without the need to be specified. The proposed DEA-CBR approach first defines attribute distances between each historical case and target case to calculate attribute similarity for each attribute. The maximum and the minimum similarity efficiencies of each historical case are then measured with DEA models and are geometrically averaged to measure the overall similarity efficiency of each historical case, based on which the most similar historical case can be determined. Two numerical examples are provided to illustrate the potential applications and benefits of the proposed DEA-CBR method.

Keywords

Data envelopment analysis; case-based reasoning; overall similarity efficiency; similarity efficiency ranking

1 Introduction

Case-based reasoning (CBR) is an intelligent method that solves new problems by referring to similar previous cases. It involves retrieving the most similar historical case from the case base to provide a solution for a new problem. In the past decades, CBR has been widely used in various areas, such as medicine [1], business [2], mechanical design [3], engineering design [4], and emergency decision-making [5]. For example, a coal mine gas explosion is an unexpected event that causes serious harm to society, so it is very important to generate alternatives to deal with it. Because such explosions occur suddenly, alternatives are often generated from experts' experience in emergency rescue. CBR is therefore well suited to generating alternatives in emergency decision-making situations. Case retrieval is a crucial step in CBR. If the retrieved historical case is closely related to the target case, solving the target case will be easy and effective; otherwise, it will be difficult. Usually, the retrieval of historical case(s) is based on the weighted sum of attribute similarities or distances. Thus, attribute weights have significant effects on the retrieval results. For example, suppose that both the historical cases and the target case are described by two attributes, that the attribute similarities between target case A and historical case B are 0.6 and 0.8, respectively, and that the attribute similarities between target case A and historical case C are 0.7 and 0.7, respectively. If the attribute weights are 0.6 and 0.4, then the similarities between the target case and the two historical cases are 0.68 and 0.7, respectively. If the attribute weights are 0.4 and 0.6, then the similarities are 0.72 and 0.7, respectively.
It can be seen clearly that the two sets of weights lead to different retrieval results: case C is retrieved under the first set and case B under the second. Case similarities are thus affected by the attribute weights, so the attribute weights need to be determined objectively. However, almost all existing case retrieval methods are based on distance or similarity and require the attribute weights to be determined in advance, which is not easy. Moreover, decisions are expected to be made objectively. In this paper, a new case retrieval method based on double frontiers data envelopment analysis (DEA) is proposed, which can determine the attribute weights automatically and objectively.
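The two-attribute example above can be checked with a few lines of Python. This is only an illustrative sketch: the function names `weighted_similarity` and `retrieve` are hypothetical and not part of the proposed method, and the numbers are those quoted in the example.

```python
def weighted_similarity(attribute_sims, weights):
    """Weighted sum of attribute similarities for one historical case."""
    return sum(w * s for w, s in zip(weights, attribute_sims))


def retrieve(cases, weights):
    """Return the name of the historical case with the largest weighted similarity."""
    return max(cases, key=lambda name: weighted_similarity(cases[name], weights))


# Attribute similarities to target case A, as given in the example.
cases = {"B": (0.6, 0.8), "C": (0.7, 0.7)}

for weights in [(0.6, 0.4), (0.4, 0.6)]:
    scores = {name: round(weighted_similarity(sims, weights), 2)
              for name, sims in cases.items()}
    print(weights, scores, "retrieved:", retrieve(cases, weights))
```

Swapping the two weights is enough to change the retrieved case, which is the sensitivity that motivates determining the weights inside the model rather than in advance.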

DEA, developed by Charnes et al. [6], is a method for objectively evaluating the best relative efficiencies of a group of peer decision-making units (DMUs) by automatically determining the most favorable input and output weights for each DMU. Chin et al. [7] used DEA to determine the risk priorities of failure modes, which not only measures the risk of each failure mode effectively but also avoids determining the attribute weights. Wang et al. [8] used DEA to determine priorities in the analytic hierarchy process (AHP) and extended it to group AHP. Wang et al. [9] proposed an integrated AHP-DEA methodology to assess bridge risk. Han et al. [10] developed a DEA-integrated artificial neural network method to optimize and predict the energy usage of complex petrochemical systems. Geng et al. [11] proposed an improved DEA cross-model to analyze energy and environment efficiency and applied it to a case study [12]. In this paper, we focus on case retrieval and employ DEA to evaluate the case similarities between the historical cases and the target case.

The primary contribution of this paper is a method, called DEA-CBR, that determines the historical case most similar to the target case and thereby supports generating a solution alternative for the target case. In case retrieval, determining the attribute weights is very important because the weights strongly affect the retrieval results. Existing case retrieval methods based on similarity or distance require the attribute weights to be determined in advance, whereas the proposed DEA-CBR method generates them automatically through DEA models. Meanwhile, the DEA-CBR method measures case similarities based on double frontiers DEA, which has been shown to be reasonable [13], while most existing similarity measures determine the similarity from a single angle only, i.e., distance or similarity. Furthermore, DEA is an objective evaluation method [14], so the proposed DEA-CBR method, which combines similarity measurement with DEA models, can improve the objectivity of case retrieval.

The rest of this paper is organized as follows. The next section introduces some background information on case retrieval and DEA. In Section 3, we develop a retrieval method by incorporating DEA to obtain the most similar historical case. In Section 4, a case study on emergency response is conducted to illustrate the use and effectiveness of the proposed method. Discussion and conclusions of this study are presented in Section 5 and Section 6, respectively.

2 Research background

2.1 Case retrieval methods: an overview

Case retrieval is the core of CBR systems. The most common method in case retrieval is similarity measurement, which evaluates the similarities between the target case and the historical cases. Similarity is commonly assessed with a distance function, such as the Euclidean distance [15], Manhattan distance [16], or Gaussian distance [17]. Other approaches can also be used for case retrieval, such as neural networks [18], rule bases [19], decision trees [20], and outranking relations [21]. The relation between the distance and the attribute weights is expressed as follows: $\mathrm{Sim}_{i} = F (w_{j}, d_{ij})$ (1) where F is a function of w_j and d_ij, Sim_i represents the similarity between the target case C₀ and the historical case C_i (i = 1, 2, …, m), w_j is the weight of the jth attribute satisfying $\sum_{j = 1}^{n} w_{j} = 1$ and w_j ≥ 0, and d_ij represents the distance of the jth attribute between C₀ and C_i. In Eq. (1), the attribute weights must be determined in advance; the attribute distances between the target case and the historical cases are then calculated, and the case similarities are finally generated. The determination of attribute weights influences the result of case retrieval [22]. Specifically, different attribute weights result in different similarities, so the retrieved cases, and hence the solution of the target case, may differ.

Many researchers have studied the determination of attribute weights. In existing studies, the methods for determining attribute weights fall mainly into two types: subjective methods [15, 20] and objective methods [23, 24]. Subjective methods are usually based on the decision-makers' judgments or preferences, so different decision-makers produce different attribute weights. Objective methods are based on soft computing techniques and also have drawbacks. For example, the genetic algorithm tends to become trapped in local minima, and artificial neural networks (ANNs) require a large number of interconnected neurons to allocate connection weights that represent specific weight information. Therefore, new retrieval methods that evaluate case similarities objectively and determine attribute weights automatically are worth studying.

2.2 DEA models

The classic DEA model was first proposed by Charnes et al. [6] to evaluate the relative efficiency of DMUs with multiple inputs and outputs. Over the past decades, many important models have been proposed, such as the BCC model [25], the super-efficiency model [26], and the cross-efficiency model [27]. However, traditional DEA models measure only the optimistic efficiencies of DMUs, which are often difficult to discriminate from each other. Therefore, Wang et al. [13] proposed the pessimistic efficiency model and suggested a geometric average efficiency model, which integrates the optimistic and pessimistic efficiencies of a DMU into an overall efficiency measure. Using the geometric average efficiency model, all DMUs can be fully ranked and discriminated. We briefly introduce the geometric average efficiency model as follows.

Assume that there are p DMUs and that the fth DMU (f = 1, …, p) has s outputs y_rf (r = 1, …, s) and t inputs x_ef (e = 1, …, t), all of which are known and nonnegative. The efficiency of the DMU under evaluation (denoted by the subscript 0) relative to the other DMUs is determined using the CCR model [6] as follows:

$\begin{aligned} \text{Maximize} \quad & \theta_{0} = \sum_{r = 1}^{s} u_{r} y_{r 0} \\ \text{s.t.} \quad & \sum_{r = 1}^{s} u_{r} y_{rf} - \sum_{e = 1}^{t} v_{e} x_{ef} \leq 0, \quad f = 1, \dots, p, \\ & \sum_{e = 1}^{t} v_{e} x_{e 0} = 1, \\ & u_{r}, v_{e} \geq \varepsilon, \quad r = 1, \dots, s; \; e = 1, \dots, t \end{aligned}$ (2) where u_r and v_e are the weights of the rth output and the eth input, respectively, $\theta_{0}$ represents the efficiency of the DMU under evaluation, y_r0 and x_e0 represent the outputs and inputs of the DMU under evaluation, respectively, and $\varepsilon$ is the non-Archimedean infinitesimal.

If there exists a set of positive weights that makes $\theta_{0}^{*} = 1$, then DMU₀ is said to be optimistically efficient; otherwise, it is said to be optimistically inefficient.

The above CCR model is called the optimistic efficiency model; it measures the optimistic efficiency of a DMU by maximization within the range of less than or equal to one. The pessimistic efficiency model suggested by Wang et al. [13] measures the pessimistic efficiency of a DMU by minimization within the range of greater than or equal to one and is constructed as follows [28, 29]:

$\begin{aligned} \text{Minimize} \quad & \psi_{0} = \sum_{r = 1}^{s} u_{r} y_{r 0} \\ \text{s.t.} \quad & \sum_{r = 1}^{s} u_{r} y_{rf} - \sum_{e = 1}^{t} v_{e} x_{ef} \geq 0, \quad f = 1, \dots, p, \\ & \sum_{e = 1}^{t} v_{e} x_{e 0} = 1, \\ & u_{r}, v_{e} \geq \varepsilon, \quad r = 1, \dots, s; \; e = 1, \dots, t \end{aligned}$ (3)

If there exists a set of positive weights that makes $\psi_{0}^{*} = 1$, then DMU₀ is referred to as pessimistically inefficient; otherwise, DMU₀ is referred to as pessimistically efficient.

Based on the above optimistic and pessimistic efficiency models, Wang et al. [13] proposed the geometric average efficiency determined by $\varphi_{f}^{*} = \sqrt{\psi_{f}^{*} \theta_{f}^{*}}$ (4) where $\theta_{f}^{*}$ and $\psi_{f}^{*}$ are the optimal optimistic and pessimistic efficiencies of DMU_f (f = 1, …, p), respectively. The geometric average efficiency considers both the optimistic efficiency and the pessimistic efficiency of a DMU and is therefore more meaningful and more comprehensive than either of them alone. Such a method for measuring the overall performance of a DMU by considering both its optimistic and pessimistic efficiencies is referred to as double frontiers DEA [30].

3 The proposed case retrieval method

In this section, we present a new case retrieval method based on double frontiers DEA, which can not only scientifically evaluate case similarities between target case and historical cases, but also avoid the determination of attribute weights. First, the formula to measure the attribute similarity is provided. Then, the similarity efficiency model is constructed using DEA models to obtain similarity efficiencies between historical cases and the target case. Afterward, the proper historical case(s) can be obtained by ranking the overall similarity efficiencies. In this study, we consider three formats of attribute values. The proposed case retrieval method for each format is presented as follows.

3.1 Similarity efficiency measure for real numbers

Suppose there are m historical cases denoted by C_i (i = 1, …, m) and one target case denoted by C₀. Both target and historical cases are described by multiple attributes. Let $\{C_{1}^{P}, \dots, C_{n}^{P}\}$ be the set of n attributes, q_ij (j = 1, …, n) be the value of case C_i on attribute $C_{j}^{P}$, q_0j be the value of target case C₀ on attribute $C_{j}^{P}$, and w = (w₁, …, w_n) be the vector of attribute weights, where w_j is the weight of attribute $C_{j}^{P}$ such that $\sum_{j = 1}^{n} w_{j} = 1$ and 0 ≤ w_j ≤ 1.

For attribute $C_{j}^{P}$ , q_ij and q_0j are real numbers. We define attribute distance d_ij between historical case C_i and target case C₀ on attribute $C_{j}^{P}$ based on Tsai et al. [31]: $d_{ij} = | q_{0 j} - q_{ij} |$ (5)

Let Sim_ij denote the attribute similarity between historical case C_i and target case C₀ with regard to attribute $C_{j}^{P}$, and let d_jmax denote the maximum value of d_ij with regard to attribute $C_{j}^{P}$, i.e., $d_{j\max} = \max_{1 \leq i \leq m} \{d_{ij}\}$. According to the similarity measurement in the study of Fan et al. [5], similarity decreases as the ratio of d_ij to d_jmax grows, which means that the farther d_ij lies below d_jmax, the greater the attribute similarity. Therefore, we use the value obtained by subtracting d_ij from d_jmax to express the attribute similarity. The formula of Sim_ij is given by $\mathrm{Sim}_{ij} = d_{j\max} - d_{ij}$ (6)

If every Sim_ij (j = 1, …, n) of historical case C_i is equal to 0, which means that there is a maximum dissimilarity between the target case and historical case C_i, we will not consider this historical case C_i when ranking the historical cases according to the similarity efficiencies.

Furthermore, the similarity function can be constructed based on the attribute similarity Sim_ij. Let Sim_i denote the similarity between historical case C_i and target case C₀; then Sim_i is defined, following [32], as $\mathrm{Sim}_{i} = \sum_{j = 1}^{n} w_{j} \mathrm{Sim}_{ij}$ (7)

From Eq. (7), the weights of attributes need to be determined in advance. In this study, we determine the attribute weights using DEA models objectively.

The traditional DEA often produces too many zero weights for inputs and outputs, which leads to overestimated optimistic efficiency and underestimated pessimistic efficiency. To overcome this drawback, Chin et al. [7] proposed imposing a constraint condition as follows: $w_{j} - 9 w_{k} \leq 0, j, k = 1, \dots, n; k \neq j$ (8)

According to Chin et al. [7], we view each historical case as a DMU, its similarity with target case as an output, and assume a dummy input value of one for all the DMUs. Then, DEA models can be built to measure the maximum and minimum similarity efficiencies between the historical cases C_i and the target case C₀. By the study of Chin et al. [7], the similarity efficiency model can be formulated as

$\begin{aligned} \mathrm{Sim}_{i\max} = \text{Maximize} \quad & \mathrm{Sim}_{i} \\ \text{s.t.} \quad & \mathrm{Sim}_{l} \leq 1, \quad l = 1, \dots, m, \\ & w_{j} - 9 w_{k} \leq 0, \quad j, k = 1, \dots, n; \; k \neq j, \end{aligned}$ (9)

$\begin{aligned} \mathrm{Sim}_{i\min} = \text{Minimize} \quad & \mathrm{Sim}_{i} \\ \text{s.t.} \quad & \mathrm{Sim}_{l} \geq 1, \quad l = 1, \dots, m, \\ & w_{j} - 9 w_{k} \leq 0, \quad j, k = 1, \dots, n; \; k \neq j, \end{aligned}$ (10) where Sim_i is the similarity between the historical case C_i under evaluation and the target case C₀, Sim_l is the similarity between the historical case C_l (l = 1, 2, …, m) and the target case C₀, Sim_imax is the maximum similarity efficiency, and Sim_imin is the minimum similarity efficiency. Model (9) compares the similarities of all the historical cases with the maximum case similarity to calculate the case similarity of each historical case from the optimistic perspective, while model (10) compares the similarities of all the cases with the minimum case similarity to calculate the case similarity of each historical case from the pessimistic perspective.

Taking into account the similarities in the optimistic and pessimistic situations, the overall similarity efficiency $\mathrm{Sim}_{i}$ of every historical case is defined, following Eq. (4), as the geometric average of the maximum and minimum similarity efficiencies. That is,

$\mathrm{Sim}_{i} = \sqrt{\mathrm{Sim}_{i\max} \cdot \mathrm{Sim}_{i\min}}, \quad i = 1, \dots, m.$ (11)

The larger the geometric average similarity efficiency, the higher the case similarity. We can then rank the historical cases according to $\mathrm{Sim}_{i}$. The historical case ranked first is the most similar to the target case.
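To make models (9)–(11) concrete, the following Python sketch solves both linear programs for a toy data set. It is a minimal illustration under stated assumptions, not the authors' implementation: the attribute similarities for cases A, B and C are hypothetical, the helper names `solve_square` and `similarity_efficiencies` are invented here, and the programs are solved by brute-force vertex enumeration with exact fractions, which is practical only for very small instances. A real application would call a linear programming solver instead.

```python
from fractions import Fraction
from itertools import combinations, permutations
from math import sqrt


def solve_square(rows, rhs):
    """Solve rows . w = rhs exactly by Gauss-Jordan elimination; None if singular."""
    n = len(rows)
    aug = [[Fraction(x) for x in row] + [Fraction(b)] for row, b in zip(rows, rhs)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            return None
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [x / pivot_value for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n] for row in aug]


def similarity_efficiencies(sims, ratio=9):
    """Return {case: (Sim_max, Sim_min)} following models (9) and (10).

    sims maps a case name to its attribute similarities Sim_ij.
    Each constraint is stored as (coeffs, bound), meaning coeffs . w <= bound.
    """
    n = len(next(iter(sims.values())))
    # Weight-ratio constraints w_j - ratio * w_k <= 0; for n >= 2 they imply w >= 0.
    cone = []
    for j, k in permutations(range(n), 2):
        row = [0] * n
        row[j], row[k] = 1, -ratio
        cone.append((row, 0))
    upper = [(list(s), 1) for s in sims.values()]            # Sim_l <= 1
    lower = [([-x for x in s], -1) for s in sims.values()]   # Sim_l >= 1

    def vertices(constraints):
        # A vertex is a feasible point where n constraints hold with equality.
        for subset in combinations(constraints, n):
            w = solve_square([c for c, _ in subset], [b for _, b in subset])
            if w is not None and all(
                sum(ci * wi for ci, wi in zip(c, w)) <= b for c, b in constraints
            ):
                yield w

    def score(s, w):
        return sum(si * wi for si, wi in zip(s, w))

    v_max = list(vertices(upper + cone))
    v_min = list(vertices(lower + cone))
    return {
        name: (max(score(s, w) for w in v_max), min(score(s, w) for w in v_min))
        for name, s in sims.items()
    }


# Hypothetical attribute similarities for three cases and two attributes.
toy = {"A": (4, 2), "B": (2, 4), "C": (1, 1)}
eff = similarity_efficiencies(toy)
overall = {name: sqrt(hi * lo) for name, (hi, lo) in eff.items()}   # Eq. (11)
for name in sorted(overall, key=overall.get, reverse=True):
    print(name, eff[name], round(overall[name], 4))
```

In this toy instance case C is dominated by A and B on both attributes, so it receives the lowest overall similarity efficiency, while the symmetric cases A and B tie. Note that no weights are supplied by the user: each linear program chooses its own weights subject only to the ratio constraint (8).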

3.2 Similarity efficiency measure for interval numbers

We use interval numbers to express the uncertainty of an attribute value. For example, if the value of 'the concentration of the residual O₂' in a coal mine gas explosion emergency lies in the range of 23 to 28, it can be expressed as [23, 28]. Let $[p_{ij}^{L}, p_{ij}^{U}]$ be the interval value of attribute $C_{j}^{P}$ for historical case C_i and $[p_{0 j}^{L}, p_{0 j}^{U}]$ be the interval value of attribute $C_{j}^{P}$ for target case C₀. If the attribute value p_ij is a real number, it can also be written as the interval number $[p_{ij}^{L}, p_{ij}^{U}] = [p_{ij}, p_{ij}]$. When the attribute value is an interval number, the distance d_ij is given by

$\begin{aligned} d_{ij} &= [d_{ij}^{L}, d_{ij}^{U}] \\ &= [\min (| p_{0 j}^{L} - p_{ij}^{L} |, | p_{0 j}^{U} - p_{ij}^{U} |), \; \max (| p_{0 j}^{L} - p_{ij}^{L} |, | p_{0 j}^{U} - p_{ij}^{U} |)] \end{aligned}$ (12)

Then, the maximum distance formula of d_jmax is given by

$\begin{aligned} d_{j\max} &= [d_{j\max}^{L}, d_{j\max}^{U}] \\ &= [\max_{1 \leq i \leq m} \{d_{ij}^{L}\}, \; \max_{1 \leq i \leq m} \{d_{ij}^{U}\}] \end{aligned}$ (13)

Furthermore, according to the interval distance proposed in [33], the formula of Sim_ij is given by

$\begin{aligned} \mathrm{Sim}_{ij} &= [\mathrm{Sim}_{ij}^{L}, \mathrm{Sim}_{ij}^{U}] \\ &= [\min \{d_{j\max}^{L} - d_{ij}^{L}, \; d_{j\max}^{U} - d_{ij}^{U}\}, \; \max \{d_{j\max}^{L} - d_{ij}^{L}, \; d_{j\max}^{U} - d_{ij}^{U}\}] \end{aligned}$ (14)

If all Sim_ij = 0 (j = 1, …, n), then the historical case C_i has the maximum distance from the target case C₀ on all attributes $C_{j}^{P} (j = 1, \dots, n)$ and we will not consider the similarity between the historical case C_i and target case C₀ in similarity efficiency ranking in this situation.
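Eqs. (12)–(14) can be sketched for a single interval-valued attribute as follows. The target interval [23, 28] is the residual O₂ example from the text; the three historical intervals and the function names `interval_distance` and `interval_similarities` are hypothetical and serve only as an illustration.

```python
def interval_distance(target, case):
    """Eq. (12): interval distance between two interval-valued attribute values."""
    lower_gap = abs(target[0] - case[0])
    upper_gap = abs(target[1] - case[1])
    return (min(lower_gap, upper_gap), max(lower_gap, upper_gap))


def interval_similarities(target, cases):
    """Eqs. (13)-(14) for one attribute: [Sim^L, Sim^U] for every historical case."""
    dists = [interval_distance(target, c) for c in cases]
    d_max = (max(d[0] for d in dists), max(d[1] for d in dists))   # Eq. (13)
    sims = []
    for d_low, d_up in dists:
        a, b = d_max[0] - d_low, d_max[1] - d_up
        sims.append((min(a, b), max(a, b)))                        # Eq. (14)
    return sims


target = (23, 28)                            # interval from the text
historical = [(20, 26), (25, 30), (23, 28)]  # hypothetical historical values
sims = interval_similarities(target, historical)
print(sims)
```

The third historical interval coincides with the target, so it receives the largest similarity interval, while the first one attains the maximum distance at both bounds and is assigned [0, 0].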

Let $[\mathrm{Sim}_{i}^{L}, \mathrm{Sim}_{i}^{U}]$ denote the interval similarity between the historical case C_i and the target case C₀; then $[\mathrm{Sim}_{i}^{L}, \mathrm{Sim}_{i}^{U}]$ is given by

$[\mathrm{Sim}_{i}^{L}, \mathrm{Sim}_{i}^{U}] = \left[\sum_{j = 1}^{n} w_{j} \mathrm{Sim}_{ij}^{L}, \; \sum_{j = 1}^{n} w_{j} \mathrm{Sim}_{ij}^{U}\right], \quad i = 1, \dots, m$ (15)

As aforementioned, each historical case can be viewed as a DMU, each case similarity is assumed as an output, and one is assumed as a dummy input value for all the DMUs. According to the DEA models introduced in Section 2 and the DEA models proposed in Chin et al. [7], the maximum and minimum similarity efficiency models can be constructed as

$\begin{aligned} [\mathrm{Sim}_{i\max}^{L}, \mathrm{Sim}_{i\max}^{U}] = \text{Maximize} \quad & [\mathrm{Sim}_{i}^{L}, \mathrm{Sim}_{i}^{U}] \\ \text{s.t.} \quad & [\mathrm{Sim}_{l}^{L}, \mathrm{Sim}_{l}^{U}] \leq 1, \quad l = 1, \dots, m, \\ & w_{j} - 9 w_{k} \leq 0, \quad j, k = 1, \dots, n; \; k \neq j, \end{aligned}$ (16)

$\begin{aligned} [\mathrm{Sim}_{i\min}^{L}, \mathrm{Sim}_{i\min}^{U}] = \text{Minimize} \quad & [\mathrm{Sim}_{i}^{L}, \mathrm{Sim}_{i}^{U}] \\ \text{s.t.} \quad & [\mathrm{Sim}_{l}^{L}, \mathrm{Sim}_{l}^{U}] \geq 1, \quad l = 1, \dots, m, \\ & w_{j} - 9 w_{k} \leq 0, \quad j, k = 1, \dots, n; \; k \neq j, \end{aligned}$ (17) which can be broken down into the following linear programming models:

$\begin{aligned} \mathrm{Sim}_{i\max}^{L} = \text{Maximize} \quad & \mathrm{Sim}_{i}^{L} \\ \text{s.t.} \quad & \mathrm{Sim}_{l}^{U} \leq 1, \quad l = 1, \dots, m, \\ & w_{j} - 9 w_{k} \leq 0, \quad j, k = 1, \dots, n; \; k \neq j, \end{aligned}$ (18)

$\begin{aligned} \mathrm{Sim}_{i\max}^{U} = \text{Maximize} \quad & \mathrm{Sim}_{i}^{U} \\ \text{s.t.} \quad & \mathrm{Sim}_{l}^{U} \leq 1, \quad l = 1, \dots, m, \\ & w_{j} - 9 w_{k} \leq 0, \quad j, k = 1, \dots, n; \; k \neq j, \end{aligned}$ (19)

$\begin{aligned} \mathrm{Sim}_{i\min}^{L} = \text{Minimize} \quad & \mathrm{Sim}_{i}^{L} \\ \text{s.t.} \quad & \mathrm{Sim}_{l}^{L} \geq 1, \quad l = 1, \dots, m, \\ & w_{j} - 9 w_{k} \leq 0, \quad j, k = 1, \dots, n; \; k \neq j, \end{aligned}$ (20)

$\begin{aligned} \mathrm{Sim}_{i\min}^{U} = \text{Minimize} \quad & \mathrm{Sim}_{i}^{U} \\ \text{s.t.} \quad & \mathrm{Sim}_{l}^{U} \geq 1, \quad l = 1, \dots, m, \\ & w_{j} - 9 w_{k} \leq 0, \quad j, k = 1, \dots, n; \; k \neq j, \end{aligned}$ (21)

Based on $\mathrm{Sim}_{i\max}^{L}$, $\mathrm{Sim}_{i\min}^{L}$, $\mathrm{Sim}_{i\max}^{U}$ and $\mathrm{Sim}_{i\min}^{U}$, the geometric average similarity efficiency defined by Eq. (11) can be determined for interval numbers by

$[\mathrm{Sim}_{i}^{L}, \mathrm{Sim}_{i}^{U}] = \left[\sqrt{\mathrm{Sim}_{i\max}^{L} \cdot \mathrm{Sim}_{i\min}^{L}}, \; \sqrt{\mathrm{Sim}_{i\max}^{U} \cdot \mathrm{Sim}_{i\min}^{U}}\right], \quad i = 1, \dots, m.$ (22)

The minimax regret approach (MRA) developed by Wang et al. [14] can be used to rank the interval numbers. According to the ranking, we can obtain the historical case most similar to the target case.
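The text does not spell out the MRA, so the following Python sketch rests on an assumption that should be checked against [14]: it takes the maximum regret of an interval $[a_{i}^{L}, a_{i}^{U}]$ to be $\max(\max_{k \neq i} a_{k}^{U} - a_{i}^{L}, 0)$, selects the interval with the smallest maximum regret as the best, removes it, and repeats on the remainder. The function name `mra_ranking` and the three intervals are hypothetical.

```python
def mra_ranking(intervals):
    """Rank intervals from best to worst by repeatedly picking the smallest maximum regret.

    intervals maps a case name to (lower, upper). The regret definition used here
    is an assumption about the MRA, not taken from the text.
    """
    remaining = dict(intervals)
    order = []
    while remaining:
        def max_regret(name):
            lower = remaining[name][0]
            other_uppers = [up for other, (_, up) in remaining.items() if other != name]
            # With no competitors left, the regret is zero.
            return max(max(other_uppers, default=lower) - lower, 0)

        best = min(remaining, key=max_regret)
        order.append(best)
        del remaining[best]
    return order


# Hypothetical overall similarity efficiency intervals for three cases.
intervals = {"C1": (1.0, 1.4), "C2": (1.2, 1.3), "C3": (0.8, 1.1)}
order = mra_ranking(intervals)
print(order)
```

Under this assumed definition, C2 is ranked first even though C1 has the larger upper bound, because choosing C1 risks a larger loss if its lower bound is realised.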

3.3 Historical cases ranking for fuzzy linguistic variables

Emergencies are also characterized by fuzziness, so it is natural to use fuzzy linguistic variables to express the attributes. For example, the value of 'the concentration of CO₂' in a coal mine gas explosion emergency may be 'high', 'medium' or 'low', which can be expressed as fuzzy linguistic variables. Let E = {e_h | h = 0, 1, …, T} be the pre-established ordered linguistic term set, where e_h denotes the (h + 1)th linguistic variable of the set E, and T + 1 is the number of linguistic variables. Then, according to [34], the linguistic variable e_h can be expressed as the triangular fuzzy number $\tilde{g}_{h} = (g_{h}^{a}, g_{h}^{b}, g_{h}^{c})$, which is obtained by

$\tilde{g}_{h} = (g_{h}^{a}, g_{h}^{b}, g_{h}^{c}) = (\max ((h - 1) / T, 0), \; h / T, \; \min ((h + 1) / T, 1))$ (23) where $g_{h}^{a}$, $g_{h}^{b}$ and $g_{h}^{c}$ are real numbers with $0 \leq g_{h}^{a} \leq g_{h}^{b} \leq g_{h}^{c}$; $g_{h}^{a}$ and $g_{h}^{c}$ bound the range of values for the linguistic variable e_h, and $g_{h}^{b}$ indicates its most likely value. For the linguistic variable set E = {e₀ = DB: definitely bad, e₁ = VB: very bad, e₂ = B: bad, e₃ = M: medium, e₄ = G: good, e₅ = VG: very good, e₆ = D: definitely good}, the linguistic variables {e₀, …, e₆} can be expressed as triangular fuzzy numbers by Eq. (23) with T = 6, i.e., $\tilde{g}_{0} = (0, 0, 0.17)$, $\tilde{g}_{1} = (0, 0.17, 0.33)$, $\tilde{g}_{2} = (0.17, 0.33, 0.5)$, $\tilde{g}_{3} = (0.33, 0.5, 0.67)$, $\tilde{g}_{4} = (0.5, 0.67, 0.83)$, $\tilde{g}_{5} = (0.67, 0.83, 1)$, $\tilde{g}_{6} = (0.83, 1, 1)$.
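The seven triangular fuzzy numbers listed above follow directly from Eq. (23) and can be regenerated with a short Python sketch. The function name `triangular_fuzzy_number` is hypothetical; the labels are those of the term set E in the text.

```python
def triangular_fuzzy_number(h, T):
    """Eq. (23): triangular fuzzy number of the (h+1)th term on a (T+1)-term scale."""
    return (max((h - 1) / T, 0), h / T, min((h + 1) / T, 1))


scale = ["DB", "VB", "B", "M", "G", "VG", "D"]   # term labels from the text
T = len(scale) - 1
for h, label in enumerate(scale):
    rounded = tuple(round(x, 2) for x in triangular_fuzzy_number(h, T))
    print(label, rounded)
```

Rounded to two decimals, the output agrees with the values quoted for $\tilde{g}_{0}$ to $\tilde{g}_{6}$.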

If the value of p_ij is a fuzzy linguistic variable, it can be expressed as the triangular fuzzy number $\tilde{p}_{ij} = (p_{ij}^{a}, p_{ij}^{b}, p_{ij}^{c})$. Let d_ij denote the distance between $\tilde{p}_{0 j}$ and $\tilde{p}_{ij}$; then d_ij is given by Fan et al. [5] as

$d_{ij} = \frac{\sqrt{(p_{0 j}^{a} - p_{ij}^{a})^{2} + (p_{0 j}^{b} - p_{ij}^{b})^{2} + (p_{0 j}^{c} - p_{ij}^{c})^{2}}}{\max_{1 \leq i \leq m} \left\{\sqrt{(p_{0 j}^{a} - p_{ij}^{a})^{2} + (p_{0 j}^{b} - p_{ij}^{b})^{2} + (p_{0 j}^{c} - p_{ij}^{c})^{2}}\right\}}$ (24)

Because the logarithm to base 1/2 is a decreasing function, it reproduces the inverse relation between attribute distance and attribute similarity. The formula of Sim_ij is therefore given by

$\mathrm{Sim}_{ij} = \begin{cases} \log_{\frac{1}{2}} (d_{ij} + \alpha), & d_{ij} < 1, \\ 0, & d_{ij} = 1, \end{cases}$ (25) where α is the non-Archimedean infinitesimal.
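Eqs. (24)–(25) can be sketched for one linguistic attribute as follows. Two assumptions are made and should be read as such: the zero branch of Eq. (25) is taken to apply at the maximum normalised distance d_ij = 1, and the infinitesimal α is replaced by the small constant 1e-6. The function names and the three historical values are hypothetical; the triangular fuzzy numbers are those of M, G and D from the seven-term scale.

```python
from math import log, sqrt


def tfn_gap(p, q):
    """Euclidean gap between two triangular fuzzy numbers (numerator of Eq. (24))."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def fuzzy_similarities(target, cases, alpha=1e-6):
    """Eqs. (24)-(25) for one attribute; alpha stands in for the infinitesimal (assumption)."""
    gaps = [tfn_gap(target, c) for c in cases]
    largest = max(gaps)
    sims = []
    for gap in gaps:
        d = gap / largest if largest > 0 else 0.0      # Eq. (24): distance in [0, 1]
        # Assumed reading of Eq. (25): zero similarity at the maximum distance.
        sims.append(0.0 if d >= 1 else log(d + alpha, 0.5))
    return sims


medium = (1 / 3, 1 / 2, 2 / 3)
good = (1 / 2, 2 / 3, 5 / 6)
definitely_good = (5 / 6, 1.0, 1.0)

sims = fuzzy_similarities(medium, [medium, good, definitely_good])
print([round(s, 3) for s in sims])
```

A historical case whose linguistic value equals that of the target obtains a large but finite similarity, bounded by the choice of α, and the most distant case obtains zero.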

Furthermore, we use Eq. (7) to express the case similarity Sim_i. Then, we employ the similarity efficiency models (9) and (10) to obtain the maximum and minimum similarity efficiencies. We use Eq. (11) to obtain the geometric average similarity efficiency $\mathrm{Sim}_{i}$ and identify the most similar historical case(s) by ranking the geometric average similarity efficiencies.

3.4 Hybrid similarity ranking

An emergency case is usually described by a mixture of real numbers, interval numbers, and fuzzy linguistic variables. For example, in a coal mine gas explosion, the value of 'the number of trapped personnel' is [47, 53], the value of 'the affected area of the blast' is 17 m², and the value of 'the concentration of CO' is 'very good'. It is therefore necessary to rank hybrid similarities. When the attribute value is a real number or a fuzzy linguistic variable, we obtain the value of Sim_ij using Eq. (6) or Eq. (25), respectively, and Sim_ij is a real number. When the attribute value is an interval number, we obtain the value of Sim_ij using Eq. (14), and Sim_ij is an interval number. To unify all Sim_ij into one data format, we express all Sim_ij as interval numbers. If Sim_ij is a real number, it can be expressed as Sim_ij = [Sim_ij, Sim_ij]. Furthermore, we can obtain $\mathrm{Sim}_{i\max}^{L}$, $\mathrm{Sim}_{i\min}^{L}$, $\mathrm{Sim}_{i\max}^{U}$ and $\mathrm{Sim}_{i\min}^{U}$ using Eqs. (18)–(21). Finally, the geometric average similarity efficiency can be obtained by Eq. (22), and the ranking of the historical cases can be achieved by using the MRA.

In summary, the steps of the proposed method for case retrieval are given as follows:

Step 1. For attributes of real numbers, calculate the attribute similarity Sim_ij using Eqs. (5)–(6), express the case similarity Sim_i using Eq. (7), calculate the maximum similarity efficiency Sim_imax using Eq. (9) and the minimum similarity efficiency Sim_imin using Eq. (10), and then calculate the overall similarity efficiency $\mathrm{Sim}_{i}$ using Eq. (11).

Step 2. For attributes of interval numbers, calculate the attribute similarity Sim_ij using Eqs. (12)–(14), express the case similarity Sim_i using Eq. (15), calculate the maximum similarity efficiency Sim_imax using Eqs. (18)–(19) and the minimum similarity efficiency Sim_imin using Eqs. (20)–(21), and then calculate the overall similarity efficiency $\mathrm{Sim}_{i}$ using Eq. (22).

Step 3. For attributes of fuzzy linguistic variables, transform the fuzzy linguistic variable into a triangular fuzzy number using Eq. (23), calculate the attribute similarity Sim_ij using Eqs. (24)–(25), express the case similarity Sim_i using Eq. (7), calculate the maximum similarity efficiency Sim_imax using Eq. (9) and the minimum similarity efficiency Sim_imin using Eq. (10), and then calculate the overall similarity efficiency $\mathrm{Sim}_{i}$ using Eq. (11).

Step 4. Rank the overall similarity efficiencies $\mathrm{Sim}_{i}$. If the value of $\mathrm{Sim}_{i}$ is an interval, the ranking is obtained using the MRA. The most similar case is then identified from the ranking.

4 Illustrative examples

In this section, we provide two examples to illustrate the effectiveness of the proposed method. In recent years, coal mine gas explosion emergencies have occurred frequently in China, bringing serious losses of life and property. Coal mine companies and their related departments pay close attention to the problem of how to deal with such emergencies rapidly and effectively. When an emergency occurs, the decision makers often respond according to their historical experience. Hence, CBR is very suitable for assisting the decision makers in generating alternatives.

Company A is a coal mine company in China. When a coal mine gas explosion emergency occurs, the company uses the CBR method to retrieve the historical cases and quickly produces an emergency alternative according to the most similar historical case. For this purpose, the company has created a historical case base from mine gas explosions in other companies in recent years. Company A has collected 21 historical cases and identified four attributes of the main problem, namely, the number of trapped personnel (X₁, unit: person), the affected area of the blast (X₂, unit: m²), the concentration of the residual O₂ (X₃, unit: %), and the concentration of CO (X₄, unit: %). Next, we give two examples to illustrate the proposed method. In Example 1, the collected attribute data are real numbers. In Example 2, the collected attribute data are mixed, i.e., real numbers, interval numbers and fuzzy linguistic variables.

Example 1. Table 1 shows the attribute values of the historical cases C_i (i = 1, 2, …, 21) and the target case C₀. To find the historical case most similar to the target case, the proposed DEA-CBR method is used to rank the similarity efficiency between each historical case and the target case. According to the ranking, the desirable historical case can be obtained. The computation processes and results are presented as follows.

Table 1
Attribute values of the historical cases and the target case

Cases	X ₁	X ₂	X ₃	X ₄
C ₁	50	14	26	36
C ₂	54	30	13	32
C ₃	77	28	12	26
C ₄	45	20	30	30
C ₅	80	17	27	45
C ₆	73	35	18	19
C ₇	37	40	29	24
C ₈	41	27	30	26
C ₉	67	20	20	29
C ₁₀	48	30	25	32
C ₁₁	42	29	28	35
C ₁₂	39	18	20	28
C ₁₃	40	23	23	33
C ₁₄	43	28	27	41
C ₁₅	60	32	30	42
C ₁₆	65	42	23	38
C ₁₇	73	37	29	37
C ₁₈	78	32	19	43
C ₁₉	32	16	32	30
C ₂₀	57	19	31	36
C ₂₁	49	22	32	28
C ₀	50	43	25	32

Step 1: The attribute distance d_ij, the maximum distance d_jmax, and the attribute similarity Sim_ij with regard to each attribute $C_{j}^{P}$ are calculated by Eqs. (5)–(6). The computation results are shown in Table 2.

Table 2

Computational results of Sim_ij concerning each historical case

Cases	Sim _i1	Sim _i2	Sim _i3	Sim _i4
C ₁	30	0	12	9
C ₂	26	16	1	13
C ₃	3	14	0	7
C ₄	25	6	8	11
C ₅	0	3	11	0
C ₆	7	21	6	0
C ₇	17	26	9	5
C ₈	21	13	8	7
C ₉	13	6	8	10
C ₁₀	28	16	13	13
C ₁₁	22	15	10	10
C ₁₂	19	4	8	9
C ₁₃	20	9	11	12
C ₁₄	23	14	11	4
C ₁₅	20	18	8	3
C ₁₆	15	28	11	7
C ₁₇	7	23	9	8
C ₁₈	2	18	7	2
C ₁₉	12	2	6	11
C ₂₀	23	5	7	9
C ₂₁	29	8	6	9
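The entries of Table 2 can be reproduced from Table 1 with Eqs. (5)–(6). The sketch below is only a check of Step 1; the function name `attribute_similarities` and the dictionary layout are illustrative choices, while the data are those of Table 1.

```python
# Attribute values (X1, X2, X3, X4) of the 21 historical cases in Table 1.
table1 = {
    "C1": (50, 14, 26, 36), "C2": (54, 30, 13, 32), "C3": (77, 28, 12, 26),
    "C4": (45, 20, 30, 30), "C5": (80, 17, 27, 45), "C6": (73, 35, 18, 19),
    "C7": (37, 40, 29, 24), "C8": (41, 27, 30, 26), "C9": (67, 20, 20, 29),
    "C10": (48, 30, 25, 32), "C11": (42, 29, 28, 35), "C12": (39, 18, 20, 28),
    "C13": (40, 23, 23, 33), "C14": (43, 28, 27, 41), "C15": (60, 32, 30, 42),
    "C16": (65, 42, 23, 38), "C17": (73, 37, 29, 37), "C18": (78, 32, 19, 43),
    "C19": (32, 16, 32, 30), "C20": (57, 19, 31, 36), "C21": (49, 22, 32, 28),
}
target = (50, 43, 25, 32)   # target case C0


def attribute_similarities(cases, target):
    """Eq. (5): d_ij = |q_0j - q_ij|; Eq. (6): Sim_ij = d_jmax - d_ij."""
    n = len(target)
    dists = {name: [abs(t - q) for t, q in zip(target, values)]
             for name, values in cases.items()}
    d_max = [max(d[j] for d in dists.values()) for j in range(n)]
    return {name: [d_max[j] - d[j] for j in range(n)] for name, d in dists.items()}


sims = attribute_similarities(table1, target)
for name, row in sims.items():
    print(name, row)
```

The maximum distances are 30, 29, 13 and 13 for X₁ to X₄, and the resulting rows agree with Table 2.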

Step 2: According to Eq. (7) and models (9)–(10), we build models for measuring the maximum and minimum similarity efficiencies between historical cases C_i and target case C₀. Then, we obtain the maximum similarity efficiency Sim_imax and the minimum similarity efficiency Sim_imin. The computation results are shown in Table 3.

Table 3

Similarities for the coal mine gas explosion by the DEA-CBR method

Cases	Sim _imax	Sim _imin	${Sīm}_{i}$	Ranking
C ₁	0.9945	1.8039	1.3394	6
C ₂	0.9168	1.5556	1.1942	12
C ₃	0.1614	1.0000	0.4018	20
C ₄	0.8544	1.6797	1.1980	11
C ₅	0.0305	1.5000	0.2139	21
C ₆	0.3126	2.2386	0.8366	17
C ₇	0.6652	3.0654	1.4279	4
C ₈	0.7457	2.1078	1.2537	9
C ₉	0.4695	1.5229	0.8456	16
C ₁₀	1.0000	2.9935	1.7302	1
C ₁₁	0.7941	2.4935	1.4072	5
C ₁₂	0.6517	1.4641	0.9768	14
C ₁₃	0.7130	2.1732	1.2448	10
C ₁₄	0.8132	2.5556	1.4416	3
C ₁₅	0.7252	2.4379	1.3296	7
C ₁₆	0.6160	3.4118	1.4497	2
C ₁₇	0.3401	2.7288	0.9634	15
C ₁₈	0.1474	2.0850	0.5544	19
C ₁₉	0.4210	1.0000	0.6489	18
C ₂₀	0.7813	1.4673	1.0707	13
C ₂₁	0.9824	1.6340	1.2670	8

Step 3: The geometric average of the maximum and minimum similarity efficiencies ${Sīm}_{i}$ can be calculated by Eq. (11), and the computation results are shown in Table 3.

Step 4: According to the geometric average similarity efficiencies ${Sīm}_{i}$ , we rank the historical cases as shown in Table 3.

The higher a historical case C_i ranks (i.e., the smaller its rank number), the more similar it is to the target case C₀. Consequently, C₁₀ is the most similar case according to the obtained ranking.
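
Steps 3 and 4 can be reproduced from the Sim_imax and Sim_imin columns of Table 3. The sketch below assumes that Eq. (11) is the plain geometric mean of the two efficiencies, which is consistent with the tabulated values. The names `EFFICIENCIES` and `overall_similarity` are illustrative and not part of the method's notation.

```python
import math

# (Sim_imax, Sim_imin) pairs copied from Table 3, keyed by case label.
EFFICIENCIES = {
    "C1": (0.9945, 1.8039), "C2": (0.9168, 1.5556), "C3": (0.1614, 1.0000),
    "C4": (0.8544, 1.6797), "C5": (0.0305, 1.5000), "C6": (0.3126, 2.2386),
    "C7": (0.6652, 3.0654), "C8": (0.7457, 2.1078), "C9": (0.4695, 1.5229),
    "C10": (1.0000, 2.9935), "C11": (0.7941, 2.4935), "C12": (0.6517, 1.4641),
    "C13": (0.7130, 2.1732), "C14": (0.8132, 2.5556), "C15": (0.7252, 2.4379),
    "C16": (0.6160, 3.4118), "C17": (0.3401, 2.7288), "C18": (0.1474, 2.0850),
    "C19": (0.4210, 1.0000), "C20": (0.7813, 1.4673), "C21": (0.9824, 1.6340),
}


def overall_similarity(sim_max, sim_min):
    """Geometric average of the maximum and minimum similarity efficiencies."""
    return math.sqrt(sim_max * sim_min)


overall = {case: overall_similarity(*pair) for case, pair in EFFICIENCIES.items()}
# Rank 1 goes to the case with the largest overall similarity efficiency.
ranking = sorted(overall, key=overall.get, reverse=True)

for rank, case in enumerate(ranking, start=1):
    print(f"{rank:2d}  {case:<3}  {overall[case]:.4f}")
```

Because the inputs are the rounded values printed in Table 3, the last digit of a recomputed average may differ slightly from the table.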

To demonstrate the feasibility and validity of the proposed DEA-CBR method, we compare it with several other methods. First, the entropy weight method [35] is used to calculate the attribute weights. Then, the attribute similarities or distances are calculated using three methods, i.e., the Euclidean distance method (EDM), the Gaussian distance method (GDM), and Fan's method (FANM) [5]. Furthermore, the case similarities are obtained by the weighted average method and the linear weighted method. Figs. 1 and 2 show the rankings of the historical cases produced by the four CBR methods, i.e., EDM, GDM, FANM, and DEA-CBR, under average weights and entropy weights, respectively.

Fig.1

The ranking order of the historical cases under average weights.

Fig.2

The ranking order of the historical cases under entropy weights.

Based on the results in Figs. 1 and 2, there is not much difference among the four sets of case similarity rankings. The major difference between DEA-CBR and the other three methods lies in historical cases C₄, C₁₄ and C₁₅. All the other historical cases are either ranked in the same order or differ only slightly. From Table 3, the minimum similarity efficiency Sim_4min = 1.6797 is relatively small, whereas the minimum similarity efficiencies Sim_imin of historical cases C₁₄ and C₁₅ (2.5556 and 2.4379, respectively) are relatively large. Because DEA-CBR considers both the minimum and the maximum similarity efficiencies, its ranking differs slightly from those of the other three methods. These observations show the applicability of DEA-CBR.

Example 2. Since emergencies occur suddenly and their data cannot be obtained accurately, a case is often represented by hybrid data, such as real numbers, intervals, and fuzzy linguistic variables. In a coal mine gas explosion, data acquisition is likewise subject to uncertainty and fuzziness, so the case information takes a hybrid form. Table 4 shows the information on the 22 coal mine gas explosion emergency cases (21 historical cases and the target case), which include hybrid attribute types. To find the historical case most similar to the target case, the DEA-CBR method is used. The desirable historical case is then obtained from the similarity efficiency ranking. The computation process and results are presented as follows.

Table 4

Attribute values of the historical cases and target case for hybrid attributes

Cases	X ₁	X ₂	X ₃	X ₄
C ₁	[47,53]	17	[23,28]	VG
C ₂	[51,57]	28	[12,15]	DG
C ₃	[75,79]	30	[10,15]	G
C ₄	[43,48]	22	[28,]	G
C ₅	[77,83]	19	[25,29]	VG
C ₆	[71,75]	33	[16,20]	M
C ₇	[35,38]	38	[27,31]	DG
C ₈	[40,43]	29	[29,31]	M
C ₉	[30,35]	18	[30,35]	G
C ₁₀	[45,50]	35	[23,28]	VG
C ₁₁	[40,44]	27	[25,30]	VG
C ₁₂	39	20	20	M
C ₁₃	[38,42]	25	23	DG
C ₁₄	[41,45]	26	[25,28]	VB
C ₁₅	[58,63]	29	[28,33]	B
C ₁₆	[63,68]	42	[21,25]	G
C ₁₇	[71,75]	36	[28,30]	DG
C ₁₈	[75,80]	35	[18,20]	M
C ₁₉	[65,68]	23	20	G
C ₂₀	[55,58]	20	[30,32]	VG
C ₂₁	[47,51]	25	[30,35]	VB
C ₀	[48,51]	42	[23,27]	G

Step 1: For the attribute in the format of real numbers, i.e., X₂, Sim_ij is calculated by Eqs. (5)–(6).

Step 2: For the attributes in the format of interval numbers, i.e. X₁ and X₃, Sim_ij is calculated by Eqs. (12)–(14).

Step 3: For the attribute in the format of fuzzy linguistic variables, i.e. X₄, Sim_ij is calculated by Eqs. (23)–(25).

Step 4: By Eqs. (16)–(21), the interval maximum and minimum similarity efficiency ${Sīm}_{imax}^{L}$ , ${Sīm}_{imin}^{L}$ , ${Sīm}_{imax}^{U}$ and ${Sīm}_{imin}^{U}$ are obtained, i ∈ {1, 2, …, m}, and the computational results are shown in Table 5.

Table 5

Similarity efficiencies and ranking

Cases	${Sīm}_{imax}^{L}$	${Sīm}_{imax}^{U}$	${Sīm}_{imin}^{L}$	${Sīm}_{imin}^{U}$	$[{Sīm}_{i}^{L}, {Sīm}_{i}^{U}]$	Ranking
C ₁	0.9324	0.9775	1.2340	1.2531	[1.0726,1.1067]	7
C ₂	0.8435	0.8435	1	1.0000	0.9184	13
C ₃	0.7441	0.7583	1	1.0226	[0.8626,0.8806]	17
C ₄	0.9872	1.0000	1.5469	1.5579	[1.2357,1.2482]	3
C ₅	0.6013	0.6583	1	1.0237	[0.7754,0.8209]	20
C ₆	0.6000	0.6115	1	1.1229	[0.7746,0.8287]	21
C ₇	0.8658	0.9158	1.4342	1.5814	[1.1143,1.2035]	5
C ₈	0.7193	0.8140	1.4506	1.5873	[1.0215,1.1367]	9
C ₉	0.7734	0.7810	1	1.0093	[0.8794,0.8879]	16
C ₁₀	0.9943	1.0000	2.0616	2.0827	[1.4317,1.4432]	2
C ₁₁	0.7912	0.8222	1.6342	1.6724	[1.1371,1.1726]	4
C ₁₂	0.6551	0.6919	1	1.2077	[0.8094,0.9141]	18
C ₁₃	0.7012	0.8864	1.3354	1.5748	[0.9677,1.1815]	10
C ₁₄	0.8366	0.8570	1.3993	1.4416	[1.0820,1.1115]	6
C ₁₅	0.6569	0.6821	1.2262	1.2473	[0.8975,0.9224]	14
C ₁₆	0.9941	1	2.3842	2.4939	[1.5395,1.5792]	1
C ₁₇	0.7158	0.7473	1.1579	1.2808	[0.9104,0.9784]	12
C ₁₈	0.6575	0.6651	1	1.1014	[0.8109,0.8559]	19
C ₁₉	0.8081	0.8629	1.3139	1.6123	[1.0304,1.1795]	8
C ₂₀	0.7159	0.8106	1.017	1.1170	[0.8534,0.9516]	15
C ₂₁	0.9365	0.9980	1.099	1.1237	[1.0146,1.0590]	11

Step 5: By Eq. (22), the geometric average similarity efficiency $[{Sīm}_{i}^{L}, {Sīm}_{i}^{U}]$ is obtained, i ∈ {1, 2, …, m}, and the computational results are shown in Table 5.

Step 6: The intervals of the geometric average similarity efficiencies are ranked using the MRA, and the results are shown in Table 5.

The smaller the ranking is, the more similar the historical case and the target case will be. Consequently, according to the obtained ranking, C₁₆ is the most similar case.
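
Steps 5 and 6 can be sketched in a few lines. The code below assumes that Eq. (22) takes the geometric mean of the lower bounds and of the upper bounds separately, which agrees with the intervals in Table 5, and that MRA denotes the minimax regret approach for ranking intervals used with interval DEA efficiencies [14]: the maximum regret of an interval is the largest upper bound among its rivals minus its own lower bound (floored at zero), the interval with the smallest maximum regret is ranked next, and the procedure repeats on the remainder. Both points are assumptions, since the corresponding equations are not restated here, and the names `ROWS`, `geometric_interval` and `mra_rank` are illustrative.

```python
import math

# Table 5 columns per case: (max L, max U, min L, min U).
ROWS = {
    "C1": (0.9324, 0.9775, 1.2340, 1.2531), "C2": (0.8435, 0.8435, 1.0, 1.0000),
    "C3": (0.7441, 0.7583, 1.0, 1.0226), "C4": (0.9872, 1.0000, 1.5469, 1.5579),
    "C5": (0.6013, 0.6583, 1.0, 1.0237), "C6": (0.6000, 0.6115, 1.0, 1.1229),
    "C7": (0.8658, 0.9158, 1.4342, 1.5814), "C8": (0.7193, 0.8140, 1.4506, 1.5873),
    "C9": (0.7734, 0.7810, 1.0, 1.0093), "C10": (0.9943, 1.0000, 2.0616, 2.0827),
    "C11": (0.7912, 0.8222, 1.6342, 1.6724), "C12": (0.6551, 0.6919, 1.0, 1.2077),
    "C13": (0.7012, 0.8864, 1.3354, 1.5748), "C14": (0.8366, 0.8570, 1.3993, 1.4416),
    "C15": (0.6569, 0.6821, 1.2262, 1.2473), "C16": (0.9941, 1.0, 2.3842, 2.4939),
    "C17": (0.7158, 0.7473, 1.1579, 1.2808), "C18": (0.6575, 0.6651, 1.0, 1.1014),
    "C19": (0.8081, 0.8629, 1.3139, 1.6123), "C20": (0.7159, 0.8106, 1.017, 1.1170),
    "C21": (0.9365, 0.9980, 1.099, 1.1237),
}


def geometric_interval(max_l, max_u, min_l, min_u):
    """Interval geometric average: lower bounds and upper bounds averaged separately."""
    return (math.sqrt(max_l * min_l), math.sqrt(max_u * min_u))


def mra_rank(intervals):
    """Rank intervals by repeatedly picking the smallest maximum regret (assumed MRA)."""
    remaining = dict(intervals)
    order = []
    while remaining:
        regrets = {}
        for name, (lower, _) in remaining.items():
            rival_uppers = [u for k, (_, u) in remaining.items() if k != name]
            # Regret is zero when no rival's upper bound exceeds this lower bound.
            regrets[name] = max(max(rival_uppers, default=lower) - lower, 0.0)
        best = min(regrets, key=regrets.get)
        order.append(best)
        del remaining[best]
    return order


intervals = {case: geometric_interval(*row) for case, row in ROWS.items()}
order = mra_rank(intervals)
print(order[:5])
```

With the rounded inputs of Table 5, this sketch places C₁₆, C₁₀, C₄, C₁₁ and C₇ in the first five positions, in agreement with the Ranking column; it has been checked here only against the top of the ranking.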

In what follows, we use the hybrid similarity measurement [5] to calculate the similarities between the historical cases and the target case. First, let the attribute weights be {0.25, 0.25, 0.25, 0.25}, {0.5, 0.1, 0.3, 0.1} and {0.2, 0.1, 0.3, 0.4}. Then, we calculate the case similarities using the FAN method; the resulting methods are named FAN1, FAN2 and FAN3, respectively, according to the attribute weights used. Finally, the rankings of the historical cases are shown in Fig. 3.

Fig.3

The ranking order of the historical cases under different attribute weights.

We can see from Fig. 3 that there is not much difference among the four sets of case similarity rankings. However, a major difference between DEA-CBR and the three FAN methods lies in historical cases C₁ and C₁₂; all the other historical cases differ only slightly in ranking. From Table 5, the minimum similarity efficiency $[{Sīm}_{1 \min}^{L}, {Sīm}_{1 \min}^{U}]$ is relatively small, so the geometric average similarity efficiency $[{Sīm}_{1}^{L}, {Sīm}_{1}^{U}]$ is also relatively small. For C₁₂, both the minimum similarity efficiency $[{Sīm}_{12 \min}^{L}, {Sīm}_{12 \min}^{U}]$ and the maximum similarity efficiency $[{Sīm}_{12 \max}^{L}, {Sīm}_{12 \max}^{U}]$ are relatively small, so the geometric average similarity efficiency $[{Sīm}_{12}^{L}, {Sīm}_{12}^{U}]$ is relatively small as well. The DEA-CBR method considers the minimum and maximum similarity efficiencies simultaneously and yields interval-valued results, which accounts for the differences in ranking. This shows that the DEA-CBR method can deal with hybrid data types effectively.

5 Discussions

First, the DEA-CBR method is proposed to retrieve the historical case most similar to the target case. Unlike other methods, the proposed method generates the attribute weights automatically. As a result, it is more reasonable and objective for case retrieval.

Second, the proposed method can provide decision support for decision makers. The most similar historical case is obtained by geometrically averaging the maximum and minimum similarity efficiencies, and decision makers can devise a better alternative by referring to it, as the illustrative examples show. The proposed method also handles three formats of attribute values, namely real numbers, interval numbers and fuzzy linguistic variables. It therefore takes a more comprehensive account of the formats of attribute values in emergencies and can better deal with case retrieval in emergencies.

Third, although the proposed method retrieves similar historical cases effectively, it does not take into account other fuzzy formats, such as intuitionistic fuzzy numbers and hesitant fuzzy numbers.

6 Conclusions

This paper presents a new case retrieval method based on double frontiers DEA. In this method, the maximum and the minimum similarity efficiencies are obtained first. The overall similarity efficiencies are then calculated, and the proper historical case(s) can be retrieved according to them. Illustrative examples are provided to demonstrate the practical use of the proposed method. Compared with the existing methods for case retrieval, the proposed method has the distinct characteristics discussed below.

The proposed method measures the case similarities by the geometric average similarity efficiencies and, to our knowledge, is the first to use DEA models for evaluating case similarities. The results obtained with the proposed method are more objective because DEA is an objective evaluation method [30]. In particular, DEA models generate the attribute weights automatically, so the proposed method does not need the attribute weights to be determined in advance, whereas most of the existing retrieval methods do. In addition, the proposed method handles three formats of attribute values, namely real numbers, interval numbers and fuzzy linguistic variables. It therefore takes a more comprehensive account of the formats of attribute values in emergencies and can better deal with case retrieval in emergencies. Finally, two emergency cases have been examined using the proposed DEA-CBR method, demonstrating its feasibility and validity. It is expected that the method developed in this study may find more applications in the near future.

For future research, some issues remain to be explored. For example, there are other formats of fuzzy data in practice, such as intuitionistic fuzzy numbers and hesitant fuzzy numbers. In addition, the decision maker's psychological behavior may also need to be considered in case retrieval.

Acknowledgments

This work was partly supported by the National Natural Science Foundation of China under Grant No. 61773123, the Humanities and Social Science Foundation of the Chinese Ministry of Education under Grant No. 16YJC630008, and the Fujian Natural Science Foundation of China under Grant No. 2017J01513.

References

1. Zhuang Z.Y., Churilov, Burstein and Sikaris, Combining data mining and case-based reasoning for intelligent decision support for pathology ordering by general practitioners. European Journal of Operational Research 195(3) (2009), 662–675.

2. and Sun, Predicting business failure using multiple case-based reasoning combined with support vector machine. Expert Systems with Applications 36(6) (2009), 10085–10096.

3. , Hu and Peng, Hybrid weighted mean for CBR adaptation in mechanical design by exploring effective, correlative and adaptative values. Computers in Industry (2016), 58–66.

4. Guo, Peng and Hu, Research on high creative application of case-based reasoning system on engineering design. Computers in Industry 64(1) (2013), 90–103.

5. Fan, Li, Wang and Liu, Hybrid similarity measure for case retrieval in CBR and its application to emergency response towards gas explosion. Expert Systems with Applications 41(5) (2014), 2526–2534.

6. Charnes, Cooper W.W. and Rhodes, Measuring the efficiency of decision making units. European Journal of Operational Research 2(6) (1978), 429–444.

7. Chin K.S., Wang Y.M., Poon G.K.K. and Yang J.B., Failure mode and effects analysis by data envelopment analysis. Decision Support Systems 48(1) (2009), 246–256.

8. Wang Y.M. and Chin K.S., A new data envelopment analysis method for priority determination and group decision making in the analytic hierarchy process. European Journal of Operational Research 195(1) (2009a), 239–250.

9. Wang Y.M., Liu and Elhag T.M.S., An integrated AHP-DEA methodology for bridge risk assessment. Computers & Industrial Engineering 54(3) (2008), 513–525.

10. Han Y.M., Geng Z.Q. and Zhu Q.X., Energy optimization and prediction of complex petrochemical industries using an improved artificial neural network approach integrating data envelopment analysis. Energy Conversion & Management 124 (2016), 73–83.

11. Geng Z.Q., Dong J.G., Han Y.M., et al., Energy and environment efficiency analysis based on an improved environment DEA cross-model: Case study of complex chemical processes. Applied Energy 205 (2017), 465–476.

12. Geng Z.Q., Dong J.G., Han Y.M., Zhu Q.X., et al., Energy and environment efficiency analysis based on an improved environment DEA cross-model: Case study of complex chemical processes. Applied Energy 205 (2017), 465–476.

13. Wang Y.M., Chin K.S. and Yang J.B., Measuring the performances of decision-making units using geometric average efficiency. Journal of the Operational Research Society 58(7) (2007), 929–937.

14. Wang, Greatbanks and Yang, Interval efficiency assessment using data envelopment analysis. Fuzzy Sets and Systems 153(3) (2005), 347–370.

15. Chang P.C., Liu C.H. and Lai R.K., A fuzzy case-based reasoning model for sales forecasting in print circuit board industries. Expert Systems with Applications 34(3) (2008), 2049–2058.

16. W.D. and Liu Y.C., Hybridization of CBR and numeric soft computing techniques for mining of scarce construction databases. Automation in Construction 15(1) (2006), 33–46.

17. and Sun, Gaussian case-based reasoning for business failure prediction with empirical data in China. Information Sciences 179(1) (2009), 89–108.

18. Florentino F.R. and Juan M.C., CBR based system for forecasting red tides. Knowledge-Based Systems 16 (2003), 321–328.

19. Xiong, Learning fuzzy rules for similarity assessment in case-based reasoning. Expert Systems with Applications 38(9) (2011), 10780–10786.

20. Wiratunga, Craw, Taylor and Davis, Case-based reasoning for matching SMARTHOUSE technology to people's needs. Knowledge-Based Systems 17 (2004), 139–146.

21. , Sun and Sun, Financial distress prediction based on OR-CBR in the principle of k-nearest neighbors. Expert Systems with Applications 36(1) (2009), 643–659.

22. Yan, Shao and Guo, Weight optimization for case-based reasoning using membrane computing. Information Sciences (2014), 109–120.

23. Ahn, Kim and Han, Global optimization of feature weights and the number of neighbors that combine in a case-based reasoning system. Expert Systems 23(5) (2006), 290–301.

24. K.H. and Ha S.H., A personalized counseling system using case-based reasoning with neural symbolic feature weighting (CANSY). Applied Intelligence 29(3) (2008), 289–289.

25. Banker R.D., Charnes and Cooper W.W., Some models for estimating technical and scale inefficiencies in data envelopment analysis. Management Science 30(9) (1984), 1078–1092.

26. Andersen P.V. and Petersen N.C., A procedure for ranking efficient units in data envelopment analysis. Management Science 39(10) (1993), 1261–1264.

27. Sexton T.R., Silkman R.H. and Hogan A.J., Data envelopment analysis: Critique and extensions. New Directions for Program Evaluation 1986(32) (1986), 73–105.

28. Parkan and Wang Y.M., The worst possible relative efficiency analysis based on inefficient production frontier, Working Paper, Department of Management Sciences, City University of Hong Kong, 2000.

29. Paradi J.C., Asmild and Simak P.C., Using DEA and worst practice DEA in credit risk evaluation. Journal of Productivity Analysis 21(2) (2004), 153–165.

30. Wang Y.M. and Chin K.S., A new approach for selection of advanced manufacturing technologies: DEA with double frontiers. International Journal of Production Research 47(23) (2009b), 6663–6679.

31. Tsai, Chiu C.C. and Chen, A case-based reasoning system for PCB defect prediction. Expert Systems with Applications 28(4) (2005), 813–822.

32. Watson, Case-based reasoning is a methodology not a technology. Knowledge-Based Systems 12(5) (1999), 213–223.

33. Zeng W.Y. and Zhao Y.B., Relationship among the normalized distance, the similarity measure, the entropy and the inclusion measure of interval-valued fuzzy sets based on interval-number measurement. Fuzzy Systems and Mathematics 26(2) (2012), 81–90. (in Chinese).

34. Jiang, Fan and Ma, A method for group decision making with multi-granularity linguistic assessment information. Information Sciences 178(4) (2008), 1098–1109.

35. Wang, Xu, Ai, et al., An integrated CBR model for predicting endpoint temperature of molten steel in AOD. Transactions of the Iron & Steel Institute of Japan 52(1) (2012), 80–86.

36. Ramón, Ruiz J.L. and Sirvent, On the choice of weights profiles in cross-efficiency evaluations. European Journal of Operational Research 207(3) (2010), 1564–1572.