Applying data mining on customer relationship management system to discover forgotten effects

Abstract

Companies need to know customer preferences to make decisions. For this reason, they rely on Customer Relationship Management (CRM) systems, which are information systems whose objective is to support and enable the management of customer data. Nevertheless, it is possible to overlook causal relationships that are not always explicit, obvious, or observable. The aim of this study is to explore new methodologies for finding causal relationships. This research applied a data analysis methodology to CRM data. The traditional analysis method, the Theory of Forgotten Effects (TFE), is considered in this work. The new approach proposed in this article uses Data Mining Algorithms (DMA) such as Association Rules (AR) to discover causal relationships. This study analyzed 5,000 user comments and opinions about a Chilean food industry company. The results show that the DMA used in this work obtains the same values as the TFE. Consequently, DMA can be used to identify non-obvious comments about products and services.

Keywords

Management CRM system data mining on customer forgotten effects theory methodology effects of the causal food industry Chile

1 Introduction

Nowadays, social networks have a high impact on companies. Every day, more companies obtain information from multiple data sources. This information is used to monitor users’ comments about products and services [1]. Consequently, Business Intelligence (BI), marketing, and Data Mining Algorithms (DMA) have grown in popularity as topics of computer science research. In the main BI and DMA studies, the methods used have been sentiment analysis, text mining, ontologies, and opinion mining [2–4].

The events, phenomena, and acts that interact with people are elements of some system. That is, cause-effect relationships influence many of people’s activities [51]. Although companies have precise control systems, there is always the possibility of not considering or forgetting causal relationships. These cause-effect relationships are not always explicit, obvious, or observable, and they are usually not detected directly. Commonly, the effects of these relationships remain hidden and therefore cause an accumulation of effects [5].

Human intelligence needs the support of tools and models that can create a technical basis for all the information. The tools and models must show all direct and indirect causal relationships. Causes and effects are significant to business strategies because these elements allow informed decision-making to improve customer service. For these reasons, companies developed Customer Relationship Management (CRM), which aims to support a client-focused business strategy and to enable the management of customer data. In this study, the customer data of interest to companies are preferences and impacts [6–8].

Companies value customers’ feedback about the products or services they offer. This topic can be studied, for example, using the Forgotten Effects Methodology (FEM) [47–50] and DMA. The customer-oriented approach is therefore fundamental to organizations that focus on achieving customer satisfaction. It is essential to provide products and services that meet customer expectations [9].

According to the literature, sentiment analysis is one of the most commonly used techniques. This method allows pattern analysis of customer data. BI studies customer satisfaction with the products and services that organizations offer. The most used DMA are Association Rules (AR) obtained with the Apriori algorithm, applied in e-commerce and care service studies [6, 10–12]. In addition, other data extraction methods, such as Naïve Bayes and ontologies, are used to investigate service quality and sentiment analysis [3, 13].

The service quality of companies like Walmart and Kmart was recently studied in [14] through social media analytics of Twitter and Facebook. Social media analytics refers to obtaining data from social media platforms and analyzing it to help decision-makers [15]. This process analyzes information about customers and competitors [15]. For example, [16] analyzed the social media competition among the three largest drugstore chains in the United States: Walgreens, CVS, and Rite Aid.

One of the most used DMA is sentiment analysis [2, 17]. The main applications of sentiment analysis in BI have been financial market prediction [18–20] and business review analysis [21–26]. In addition, [27] studied opinion mining for CRM using fuzzy formal concept analysis and sentiment analysis, and [28] presented an approach to CRM based on Data Mining (DM), in which the DMA used were fuzzy c-means and AR. However, previous research does not provide a method to retrieve forgotten information.

DM is a set of techniques for searching for correlations in data. These techniques generate strategic information for the business. This research paper proposes that DMA can be used to determine cause-effect relationships and forgotten effects. Hence, DMA and FEM can determine the cause-effect relationships in text data obtained from user opinions in social networks. This information serves as decision-making support for areas such as marketing [4, 29]. The methods used in this manuscript provide additional knowledge for companies; that is, they serve to recover the forgotten knowledge of a company [30]. We therefore present a practical case study of the CRM data of a Chilean food industry company. In this research, we compare the results of FEM and DMA. In addition, we evaluate the feasibility of automating the CRM data analysis methodology with DMA [31].

The main contributions of this manuscript can be summarized as follows: (1) FEM is the methodology currently used to determine cause-effect elements in BI. The FEM and DMA results are very similar in our approach. Hence, AR can be used to determine cause-effect elements. (2) This manuscript describes the main steps to determine cause-effect elements by FEM and DMA. These two methodologies are applied to a data set of a Chilean food industry company. The cause-effect results can be applied as customer recommendations. (3) The support, confidence, and lift of AR have been used to measure cause and effect. As AR are DMA, we present a first approximation to discovering forgotten effects with DMA. These algorithms are implemented in high-level programming languages such as R and Python. Therefore, the methodology can be easily automated.

The rest of the paper is organized as follows: Section 2 details the conceptual framework proposed for FEM. Section 3 presents fundamental concepts of DM and explains AR and the performance measures used in this research. Section 4 presents the results and discussion of the case study. Finally, Section 5 summarizes the main conclusions.

2 Forgotten effects methodology

The Theory of Forgotten Effects (TFE) makes it possible to obtain the relationships between data and to recover the forgotten effects. According to [5], the events that interact with people are part of a system, which can be described in terms of “causes” and “effects”. Although companies have good control systems, there is always the possibility of forgetting causal relationships that are not always explicit, obvious, or observable. This work adapted the FEM to identify non-obvious content.

Figure 1 shows the FEM steps used: (1) obtaining coincidences between the causes; (2) obtaining coincidences between the effects; (3) obtaining coincidences between the causes and effects; (4) obtaining similarities using coincidences; (5) obtaining similarities between causes and effects using TFE; and (6) association map. These steps are described in the following sub-sections.

Fig.1

Steps of the forgotten effects methodology.

2.1 Obtaining coincidences between the causes

Table 1 shows the cause matrix. The index A_ij represents the closeness between the causes a_i and a_j; a high index corresponds to close causes. The cause matrix is obtained as follows. The first step is to define the object of study, which corresponds to a keyword about the research topic. Then, the list of causes is set, and an expert (a marketing manager) validates it. The coincidences correspond to the intersections between the causes. For example, the formula for the coincidence between a₁ and a₂ is: “keywords” + a₁ + a₂.

Table 1
Cause matrix

	a₁	a₂	...	a_j
a₁	A_11	A_12	...	A_1j
a₂	A_21	A_22	...	A_2j
...	...	...	...	...
a_i	A_i1	A_i2	...	A_ij

2.2 Obtaining coincidences between the effects

Table 2 shows the effect matrix. It is obtained with the same method as the cause matrix (Table 1). In this step, the keywords associated with the effects are defined. The index B_ij represents the closeness between the effects b_i and b_j. For example, the formula for the coincidence between b₁ and b₂ is: “keywords” + b₁ + b₂.

Table 2
Effect matrix

	b₁	b₂	...	b_j
b₁	B_11	B_12	...	B_1j
b₂	B_21	B_22	...	B_2j
...	...	...	...	...
b_i	B_i1	B_i2	...	B_ij

2.3 Obtaining coincidences between causes and effects

Table 3 shows the cause and effect matrix. The index C_ij represents the closeness between causes and effects. As in the previous steps, the coincidences correspond to the intersections between causes and effects. For example, the formula for the coincidence between the cause a₁ and the effect b₁ is: “keywords” + a₁ + b₁.

Table 3
Cause and effect matrix

	a₁	a₂	...	a_j
b₁	C_11	C_12	...	C_1j
b₂	C_21	C_22	...	C_2j
...	...	...	...	...
b_i	C_i1	C_i2	...	C_ij

2.4 Obtaining similarities using coincidences

The Jaccard similarity coefficient is used to calculate the similarities from the coincidences [7]. Consider two matrices, A and B, where A represents causes and B represents effects, each with n binary attributes. Each element of A and B is either 0 or 1. The total number of each combination of attributes is specified as follows:

M₁₁ represents the total number of attributes, where A and B both have a value of 1.

M₁₀ represents the total number of attributes, where the attribute of A is 1 and the attribute of B is 0.

M₀₁ represents the total number of attributes, where the attribute of A is 0 and the attribute of B is 1.

M₀₀ represents the total number of attributes, where A and B both have a value of 0.

The Jaccard similarity coefficient, J, is given as: $J = \frac{M_{11}}{M_{10} + M_{01} + M_{11}} .$ (1)

The similarity index (S_C) corresponds to the proximity between the causes and effects, where C_cicj is the number of coincidences between the cause a_i and the effect b_j, C_i is the number of matches for the cause a_i, and C_j is the number of matches for the effect b_j. It is defined as: $S_{C} = \frac{C_{cicj}}{C_{i} + C_{j} - C_{cicj}} .$ (2)
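
Equations (1) and (2) can be sketched in a few lines of Python. The function names and the toy vectors below are hypothetical and are not taken from the case study; the sketch only illustrates that the similarity index is the Jaccard coefficient written in terms of counts, since the matches of the cause equal M₁₁ + M₁₀ and the matches of the effect equal M₁₁ + M₀₁.

```python
from typing import Sequence


def jaccard(a: Sequence[int], b: Sequence[int]) -> float:
    """Jaccard coefficient of two equal-length binary vectors, Eq. (1)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    m11 = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    m10 = sum(1 for x, y in zip(a, b) if x == 1 and y == 0)
    m01 = sum(1 for x, y in zip(a, b) if x == 0 and y == 1)
    denom = m10 + m01 + m11
    # Assumed convention: similarity is 0 when neither vector has a 1.
    return m11 / denom if denom else 0.0


def similarity_index(c_ij: int, c_i: int, c_j: int) -> float:
    """Similarity index S_C of Eq. (2), computed from coincidence counts."""
    denom = c_i + c_j - c_ij
    return c_ij / denom if denom else 0.0


# Toy data (hypothetical): presence of a cause and an effect in five records.
cause = [1, 1, 0, 1, 0]
effect = [1, 0, 0, 1, 1]

j = jaccard(cause, effect)
# The cause matches 3 records, the effect 3 records, and both match 2.
s = similarity_index(c_ij=2, c_i=3, c_j=3)
print(j, s)
```

Both calls return the same value for this toy input, which is the expected behavior when the counts are derived from the same binary data.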

2.5 Obtaining similarities between causes and effects using TFE

TFE is used to obtain all similarities between causes and effects. The following summarizes the rectangular matrix that will be used [5].

Let A be a set of elements, A = {ai/i = 1, 2, … , n}, which will be called causes.

Let B = {bj/j = 1, 2, … , m}, which will be called effects.

Let C be a third set of elements, C = {ck/k = 1, 2, …, p}, which act as effects of set B.

The expression v (a_i, c_k) corresponds to the set of pairs of elements, which is known as the “direct incidence matrix.” This matrix is denoted [M ∼].

The max-min composition of matrices is the mathematical operator that establishes the incidences of A on C. Formula (3) represents the incidence of a_i on c_k, where the maximum is taken over all b_j: $v (a_{i}, c_{k}) = \max_{b_{j}} (\min (v (a_{i}, b_{j}), v (b_{j}, c_{k}))) .$ (3)

The cause-cause incidence matrix is denoted [A ∼], and the effect-effect incidence matrix is denoted [B ∼]. A new matrix of incidences between the elements of A and B is obtained by the composition [M ∼ *] = [A ∼] ° [M ∼] ° [B ∼]. The resulting composition represents indirect causal relationships.
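
The max-min composition of formula (3) and the chained composition of the three incidence matrices can be sketched as follows. This is a minimal illustration with hypothetical 2×2 values, not the matrices of the case study. The final difference between the composed and the direct matrix is an added assumption: it is a common way to read off the forgotten effect in the TFE literature, but the text above does not spell it out.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def max_min(p: Matrix, q: Matrix) -> Matrix:
    """Max-min composition: r[i][k] = max over j of min(p[i][j], q[j][k])."""
    if not p or not q or len(p[0]) != len(q):
        raise ValueError("inner dimensions must agree")
    return [
        [max(min(p[i][j], q[j][k]) for j in range(len(q)))
         for k in range(len(q[0]))]
        for i in range(len(p))
    ]


def forgotten_effects(a: Matrix, m: Matrix, b: Matrix) -> Tuple[Matrix, Matrix]:
    """Return the composed matrix A o M o B and its gap to the direct matrix M."""
    m_star = max_min(max_min(a, m), b)
    # Assumption: the gap M* - M is read as the forgotten (indirect) part.
    gap = [[round(s - d, 10) for s, d in zip(row_s, row_d)]
           for row_s, row_d in zip(m_star, m)]
    return m_star, gap


# Hypothetical incidence values in [0, 1].
A = [[1.0, 0.8], [0.2, 1.0]]  # causes on causes
M = [[0.1, 0.6], [0.9, 0.3]]  # direct incidence of causes on effects
B = [[1.0, 0.4], [0.7, 1.0]]  # effects on effects

M_star, gap = forgotten_effects(A, M, B)
print(M_star)
print(gap)
```

In this toy input, the first cause has a weak direct incidence on the first effect (0.1), but a strong indirect one through the second cause, so the corresponding gap is large.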

2.6 Association map

An association map is a graphic representation of the elements in a Cartesian plane [32]. The causes and effects are represented by multivariable analysis, which is a technique to adjust the distribution of the items [33]. The distribution evaluates the distance between the different elements. [M ∼ *] adjusts the causes, and its transpose [M ∼ *]′ adjusts the effects. The tool used for multivariable analysis was the Statistical Package for the Social Sciences (SPSS) [34].

The distances between the causes were observed. The closest causes are the most similar causes. Also, the effects were correlated. The distance between the causes and effects allowed interpreting aspects of cause-effect similarity.

3 Data mining

DMA extract concise and useful knowledge from available data. [35] defines these methods as a “set of techniques used for information discovery in a large dataset.” There are two kinds of DM tasks: (1) predictive tasks and (2) descriptive tasks. Predictive tasks infer unknown or future values; examples are classification and regression. Descriptive tasks find patterns that describe the data and reveal relevant information; examples are AR and clustering.

3.1 Association rules

AR is a technique that links data set attributes through some association. The method examines the data set and identifies frequent co-occurrences [36]. Although AR are not specifically cause-and-effect rules, in our case study the cause-and-effect relationship is validated by the results of FEM.

Consider two itemsets X and Y. The AR is written X → Y, where X is called the antecedent and Y the consequent. AR use three performance measures: support, confidence, and lift. The support is the proportion of records that contain both itemsets X and Y; this measure determines the overall impact of the AR. The confidence is the conditional probability that a record contains Y given that it also contains X [37, 38]. Lift denotes how effective the rule is compared with a random selection of items. If the lift is greater than 1, the antecedent and consequent occur together more often than expected from a random match [39]. AR have been used in BI for itemset mining, which generates the more profitable itemsets and the associations among them [40].
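
The three measures can be written down directly from their definitions. The following Python sketch uses hypothetical records and function names; each record is modeled as a set of "field: value" items, which is an assumption about the data layout rather than the format used in the study.

```python
from typing import FrozenSet, Iterable, List


def support(transactions: List[FrozenSet[str]], items: Iterable[str]) -> float:
    """Fraction of transactions that contain every item in `items`."""
    wanted = frozenset(items)
    return sum(wanted <= t for t in transactions) / len(transactions)


def confidence(transactions: List[FrozenSet[str]], x: set, y: set) -> float:
    """Estimate of P(Y | X): support of X and Y together over support of X."""
    return support(transactions, x | y) / support(transactions, x)


def lift(transactions: List[FrozenSet[str]], x: set, y: set) -> float:
    """Confidence of X -> Y relative to the baseline support of Y."""
    return confidence(transactions, x, y) / support(transactions, y)


# Hypothetical records (not the CRM data of the case study).
records = [
    frozenset({"region: Maule", "category: work with us"}),
    frozenset({"region: Maule", "category: work with us"}),
    frozenset({"region: Maule", "category: marketing"}),
    frozenset({"region: Metropolitana", "category: quality"}),
    frozenset({"region: Metropolitana", "category: work with us"}),
]

x, y = {"region: Maule"}, {"category: work with us"}
rule_support = support(records, x | y)
rule_confidence = confidence(records, x, y)
rule_lift = lift(records, x, y)
print(rule_support, rule_confidence, rule_lift)
```

Note that support and lift are symmetric in X and Y, whereas confidence depends on the direction of the rule.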

The Apriori algorithm has been used in this study. This algorithm makes multiple passes over the data set and uses the frequency of itemsets to obtain the AR [37]. A rule is defined as an implication of the form X → Y, where X ∩ Y = ∅.
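
As an illustration of the multiple passes mentioned above, the following is a minimal pure-Python sketch of a level-wise frequent itemset search followed by rule generation. It is a simplified teaching version with hypothetical names and toy records, not the implementation used in this study, and it recomputes supports naively instead of using the optimizations found in production libraries.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List, Tuple

Itemset = FrozenSet[str]


def apriori(transactions: List[Itemset], min_support: float) -> Dict[Itemset, float]:
    """Level-wise search for frequent itemsets; returns itemset -> support."""
    n = len(transactions)

    def supp(itemset: Itemset) -> float:
        return sum(itemset <= t for t in transactions) / n

    # Level 1: frequent single items.
    items = {i for t in transactions for i in t}
    level = {frozenset([i]) for i in items if supp(frozenset([i])) >= min_support}
    frequent = {s: supp(s) for s in level}
    k = 2
    while level:
        # Join step: unions of frequent (k-1)-itemsets that have exactly k items.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in level for sub in combinations(c, k - 1))
        }
        level = {c for c in candidates if supp(c) >= min_support}
        frequent.update({c: supp(c) for c in level})
        k += 1
    return frequent


def rules(frequent: Dict[Itemset, float],
          min_confidence: float) -> List[Tuple[Itemset, Itemset, float, float]]:
    """Rules X -> Y with X and Y disjoint and X together with Y frequent."""
    out = []
    for itemset, s in frequent.items():
        for r in range(1, len(itemset)):
            for x in map(frozenset, combinations(itemset, r)):
                conf = s / frequent[x]  # subsets of frequent itemsets are frequent
                if conf >= min_confidence:
                    out.append((x, itemset - x, s, conf))
    return out


# Hypothetical records (not the CRM data of the case study).
records = [
    frozenset({"region: Maule", "category: work with us"}),
    frozenset({"region: Maule", "category: work with us"}),
    frozenset({"region: Maule", "category: marketing"}),
    frozenset({"region: Metropolitana", "category: quality"}),
    frozenset({"region: Metropolitana", "category: work with us"}),
]

found = apriori(records, min_support=0.4)
found_rules = rules(found, min_confidence=0.6)
for x, y, s, c in found_rules:
    print(sorted(x), "->", sorted(y), round(s, 4), round(c, 4))
```

Because the antecedent and consequent are taken as complementary parts of the same frequent itemset, every generated rule satisfies X ∩ Y = ∅ by construction.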

4 Results and discussion

This section shows the results of a case study using FEM and DMA. First, the case study is presented. Then, the results of applying FEM are given: coincidences between the Chilean regions, coincidences between categories, coincidences between regions and categories, similarities using coincidences, similarities using TFE, and the association map. Finally, this section shows the application of DMA.

4.1 Case study

The first step was to get comments and opinions from the CRM data of a Chilean food industry company. The users (customers, consumers, suppliers, and applicants) enter requests, inquiries, or complaints. Therefore, positive and negative user comments about products and services are known. For the application of the methodology, the region and the category have been defined. A region is a first-level administrative division of Chile. The regions represent the causes, and the categories are the effects. Table 4 describes the fields of the users’ comments.

Table 4
Field description data set

Column	Type	Description
Reference	Char	Case identifier within the web, this field contains the creation date concatenated with a numerical sequence that starts at zero every day.
Created	Date	Case Creation Date. Date when the user entered a case into the system.
Finalized	Date	Case Finalization Date. Date when a company operator considers the settled case.
Month	Number	Month in numerical format. Case Admission Month.
Year	Number	Case Admission Year.
Matter	Char	Case matter.
Status	Char	Case status (entered, updated, and solved).
Category	Char	Case company area.
Contact	Number	Contact identifier. It is who entered the case.
Commune	Char	Contact commune. Commune is the smallest Chilean territorial division for administrative purposes.
Region	Char	Contact Region. The Region is the country’s first-level administrative division.
Sentiment	Number	A numeric value that represents when the text is negative, positive, or neutral.
Text	Char	Body case. It is the text that users type when they enter a case.

In this study, a Chilean food industry company provided 5,867 user comments for this BI analysis. These records were obtained from the CRM of “Productos Fernández S.A.”

4.2 Application of the forgotten effects theory

The following sections show the results of the Forgotten Effects Theory.

4.3 Application of the FEM

The following sub-sections show the results of the FEM: coincidences between the Chilean Regions, coincidences between categories, coincidences between regions and categories, similarities using coincidences, similarities using TFE and association map.

4.3.1 Coincidences between the Chilean regions

This work answers the following research question: how do users interact with the company’s website? To answer it, the interactions with the site in each region and the coincidences between regions have been counted. For this analysis, an Excel pivot table has been used, which is a table of statistics that summarizes the data of a more extensive table. This study considers 13 regions. Table 5 shows the code and name of each region.

Table 5
Chilean regions. Roman numeral code

Code	Name (Region)
I	Tarapacá
II	Antofagasta
III	Atacama
IV	Coquimbo
V	Valparaíso
VI	Libertador Bernardo O’Higgins
VII	Maule
VIII	BioBío
IX	Araucanía
X	Los Lagos
XI	Aysén del Gral. Carlos Ibañez del Campo
XII	Magallanes
XIII	Metropolitana

Table 6 shows the coincidences between the 13 Chilean regions. The Maule region (VII) and the Metropolitana region (XIII) are important for this study: the food industry company is located in the VII region, and the XIII region contains the capital of Chile and has more than a third of the Chilean population. For example, the matches between the VII and XIII regions are 2,176 (marked cell). A large number of coincidences corresponds to a high similarity between the regions. This result indicates that people in these regions (VII and XIII) interact heavily with the company’s website. The table also shows that the coincidences between the XI and XII regions are 25. In contrast, the coincidences of the VII region are 3,712, and the coincidences of the XIII region are 4,246.

Table 6

Coincidences between Chilean Regions

Regions	I	II	III	IV	V	VI	VII	VIII	IX	X	XI	XII	XIII
I	2923	96	74	98	262	146	853	210	126	104	37	52	1387
II	96	2955	106	130	294	178	885	242	158	136	69	84	1419
III	74	106	2933	108	272	156	863	220	136	114	47	62	1397
IV	98	130	108	2957	296	180	887	244	160	138	71	86	1421
V	262	294	272	296	3121	344	1051	408	324	302	235	250	1585
VI	146	178	156	180	344	3005	935	292	208	186	119	134	1469
VII	853	885	863	887	1051	935	3712	999	915	893	826	841	2176
VIII	210	242	220	244	408	292	999	3069	272	250	183	198	1533
IX	126	158	136	160	324	208	915	272	2985	166	99	114	1449
X	104	136	114	138	302	186	893	250	166	2963	77	92	1427
XI	37	69	47	71	235	119	826	183	99	77	2896	25	1360
XII	52	84	62	86	250	134	841	198	114	92	25	2911	1375
XIII	1387	1419	1397	1421	1585	1469	2176	1533	1449	1427	1360	1375	4246

Table 6. Coincidences between Chilean Regions. The marked cell corresponds to the matches between the VII and XIII region. The company studied belongs to the VII region, and the capital of Chile is the XIII region.

4.3.2 Coincidences between categories

Table 7 shows the list of categories defined by a marketing manager. There are 13 categories because the number of elements must equal the number of regions (matrix definition for TFE, Section 2.5). Table 8 presents the coincidence matrix between these categories, which has been calculated by a procedure analogous to that of Table 6. For example, the matches between Quality (1) and Work with us (11) are 3868 (marked cell). The coincidences in Collaborators (3) are 36, the same value as in Transportation (12).

Table 7
Categories list. These categories have been defined by a marketing manager

Identifier	Category
1	Quality
2	Call Center
3	Collaborators
4	Commercial
5	Credits and Payments
6	Congratulations
7	Marketing
8	Supplier
9	I want to be a customer
10	I want to be a supplier
11	Work with Us
12	Transportation
13	Plant Visit

Table 8

Coincidence matrix between categories. Matches between Quality (1) and Work with us (11) are 3868 (marked cell)

Categories	1	2	3	4	5	6	7	8	9	10	11	12	13
1	1000	1036	1016	1104	1046	1054	1845	1098	1735	1117	3868	1016	1050
2	1036	36	72	52	140	82	90	881	134	771	153	2904	52
3	1016	72	16	120	62	70	861	114	751	133	2884	32	66
4	1104	52	120	104	150	158	949	202	839	221	2972	120	154
5	1046	140	62	150	46	100	891	144	781	163	2914	62	96
6	1054	82	70	158	100	54	899	152	789	171	2922	70	104
7	1845	90	861	949	891	899	845	943	1580	962	3713	861	895
8	1098	881	114	202	144	152	943	98	833	215	2966	114	148
9	1735	134	751	839	781	789	1580	833	735	852	3603	751	785
10	1117	771	133	221	163	171	962	215	852	117	2985	133	167
11	3868	153	2884	2972	2914	2922	3713	2966	3603	2985	2868	2884	2918
12	1016	2904	32	120	62	70	861	114	751	133	2884	16	66
13	1050	52	66	154	96	104	895	148	785	167	2918	66	50

4.3.3 Coincidences between regions and categories

Table 9 shows the matrix of coincidences between regions and categories. As in the previous steps, the coincidences between regions and categories have been counted. For example, the matches between VII (Maule) and Quality are 107 (marked cell). The XIII region and I want to be a customer have 403 coincidences. The columns for the XI and XII regions are 0 in most categories.

Table 9
Coincidences between regions and categories. Marked cell shows the matches between Quality and VII (Maule)

	I	II	III	IV	V	VI	VII	VIII	IX	X	XI	XII	XIII
Quality	4	12	3	9	50	14	107	23	14	15	0	6	314
Call center	0	0	0	0	0	0	8	0	0	0	0	0	1
Collaborators	0	0	0	0	0	0	14	0	0	0	0	0	0
Commercial	1	3	3	4	6	0	18	12	1	3	0	1	39
Credits and payments	2	2	0	0	0	0	13	0	0	0	0	0	22
Congratulations	0	1	0	1	2	2	12	5	0	0	0	0	11
Marketing	1	8	4	12	53	15	212	32	14	11	0	0	211
Supplier	1	1	0	0	0	3	12	4	0	2	0	0	23
I want to be a customer	10	25	16	26	66	27	39	53	23	9	4	5	403
I want to be a supplier	0	0	1	1	1	7	4	3	1	1	0	0	20
Work with us	12	12	15	13	51	43	341	43	40	31	1	8	295
Transportation	0	0	0	0	0	0	8	0	0	0	0	0	8
Plant visit	1	0	0	0	1	3	33	3	1	0	0	0	8

4.3.4 Similarities using coincidences

Table 10 shows the similarities obtained from the coincidences. The similarity index (S_C) is calculated with formula (2). Values closer to 1 correspond to higher similarity, and values closer to 0 correspond to lower similarity. The marked cell is the highest similarity, which is between the XIII region and I want to be a customer (9). As in Table 9, the columns for the XI and XII regions are 0 in most categories.

Table 10
Similarities between regions and categories. Marked cell shows the similarity between I want to be a customer and XIII (Metropolitana)

	I	II	III	IV	V	VI	VII	VIII	IX	X	XI	XII	XIII
1	0.001	0.011	0.003	0.008	0.043	0.005	0.089	0.013	0.011	0.012	0.000	0.006	0.148
2	0.000	0.000	0.000	0.000	0.000	0.000	0.007	0.000	0.000	0.000	0.000	0.000	0.001
3	0.000	0.000	0.000	0.000	0.000	0.000	0.044	0.000	0.000	0.000	0.000	0.000	0.000
4	0.001	0.017	0.013	0.001	0.018	0.000	0.042	0.012	0.002	0.008	0.000	0.005	0.025
5	0.002	0.005	0.000	0.000	0.000	0.000	0.024	0.000	0.000	0.000	0.000	0.000	0.013
6	0.000	0.004	0.000	0.003	0.001	0.001	0.028	0.005	0.000	0.000	0.000	0.000	0.007
7	0.000	0.008	0.002	0.007	0.030	0.003	0.123	0.013	0.007	0.006	0.000	0.000	0.074
8	0.001	0.001	0.000	0.000	0.000	0.002	0.004	0.004	0.000	0.004	0.000	0.000	0.014
9	0.005	0.094	0.018	0.027	0.071	0.011	0.037	0.014	0.021	0.009	0.001	0.006	0.220
10	0.000	0.000	0.004	0.003	0.003	0.004	0.009	0.003	0.002	0.000	0.000	0.000	0.013
11	0.003	0.057	0.005	0.004	0.017	0.010	0.121	0.012	0.013	0.010	0.000	0.003	0.074
12	0.000	0.000	0.000	0.000	0.000	0.000	0.026	0.000	0.000	0.000	0.000	0.000	0.006
13	0.000	0.000	0.000	0.000	0.001	0.001	0.020	0.001	0.001	0.000	0.000	0.000	0.002

4.3.5 Similarities using TFE and association map

Table 11 shows the incidence matrix [M ∼ *]. The max-min composition of matrices establishes the indirect causal relationships. For each region, every category has the same value.

Table 11
Similarities using TFE. The rows 1 to 13 have the same value. This table represents [M ∼ *]

Regions	Values of the rows 1 to 13
I	0.148
II	0.007
III	0.044
IV	0.042
V	0.024
VI	0.028
VII	0.028
VIII	0.014
IX	0.220
X	0.013
XI	0.121
XII	0.026
XIII	0.020

In the following, the results are presented graphically. An association map presents the elements of the study in a plane. Figure 2 shows the association map obtained with the IBM SPSS statistics tool and correspondence analysis. The map displays the positions of regions and categories. The distribution evaluates the distance between the different elements. [M ∼ *] adjusts the causes, and its transpose [M ∼ *]′ adjusts the effects. The distance indicates the similarity: the nearest regions have a higher similarity, and close categories have a higher correlation [41]. An example is the pair of categories Transportation and Credits and payments.

Fig.2

Association map. Categories marked in blue, and regions marked in red.

The distance between a cause and an effect indicates how closely they are related. For example, I want to be a customer is close to the XIII region. The association map has two dimensions. Dimension 1 is mostly defined by the categories, or “incident categories,” and dimension 2 is determined by the regions.

4.4 Application of data mining

The free statistical computing software R (2018) [42] has been used, specifically the Apriori algorithm [43] through the packages arules and arulesViz [44, 45]. This algorithm finds associations between many variables by using a level-wise search for frequent itemsets. Based on [46], a minimum support of 0.01 and a minimum confidence of 0.35 have been employed. Table 12 shows the first 10 AR obtained by this algorithm, ordered by support. The first rule is the set of items that appears most frequently in the data set, namely the co-occurrence of the category I want to be a customer with the Metropolitana region. The second and third rules contain the same items and therefore have the same support and the same frequency, which is what the TFE quantifies. Table 13 shows the confidence, in which these two rules differ. In Table 13, the first rule is plant visit in the Maule region, and the second rule is I want to be a customer in the Metropolitana region. Confidence reflects cause-effect relationships better than support. Therefore, this performance measure can be used to compare the results of the Apriori algorithm with the TFE.

Table 12
AR ordered by support

AR	Support
category: I want to be a customer → region: Metropolitana	0.1296
region: Maule → category: work with us	0.1149
category: work with us → region: Maule	0.1149
category: quality → region: Metropolitana	0.0954
category: marketing → region: Maule	0.0707
category: marketing → region: Metropolitana	0.0656
region: O’higgins → category: work with us	0.0147
region: Araucania → category: work with us	0.0136
category: commercial → region: Metropolitana	0.0129
category: plant visit → region: Maule	0.0102

Table 13

AR ordered by confidence

AR	Confidence
category: plant visit → region: Maule	0.6382
category: I want to be a customer → region: Metropolitana	0.5631
category: quality → region: Metropolitana	0.5470
region: Araucania → category: work with us	0.4705
category: commercial → region: Metropolitana	0.4418
region: Maule → category: work with us	0.4200
category: marketing → region: Maule	0.3847
category: work with us → region: Maule	0.3818
region: O’higgins → category: work with us	0.3805
category: marketing → region: Metropolitana	0.3568
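
The support and confidence values reported in Tables 12 and 13 can be illustrated with a minimal Python sketch. It is not the arules implementation used in the study: it assumes that every record holds exactly one category and one region, so the frequent itemsets are at most pairs and the level-wise search of Apriori reduces to counting pairs. The records and the function name `mine_pair_rules` are hypothetical; only the thresholds (0.01 and 0.35) come from the text.

```python
from collections import Counter

# Hypothetical toy records (category, region); NOT the study's data.
records = [
    ("work with us", "Maule"),
    ("work with us", "Maule"),
    ("work with us", "Metropolitana"),
    ("quality", "Metropolitana"),
    ("quality", "Metropolitana"),
    ("marketing", "Maule"),
    ("marketing", "Metropolitana"),
    ("plant visit", "Maule"),
    ("I want to be a customer", "Metropolitana"),
    ("I want to be a customer", "Metropolitana"),
]


def mine_pair_rules(records, min_support=0.01, min_confidence=0.35):
    """Return (lhs, rhs, support, confidence) for category/region pairs."""
    n = len(records)
    pair_counts = Counter(records)
    category_counts = Counter(cat for cat, _ in records)
    region_counts = Counter(reg for _, reg in records)

    rules = []
    for (cat, reg), count in pair_counts.items():
        # Support: share of all records that contain both items.
        support = count / n
        if support < min_support:
            continue
        # Each frequent pair yields two candidate rules, one per direction.
        candidates = (
            (f"category: {cat}", f"region: {reg}", category_counts[cat]),
            (f"region: {reg}", f"category: {cat}", region_counts[reg]),
        )
        for lhs, rhs, lhs_count in candidates:
            # Confidence: share of records with the antecedent that
            # also contain the consequent.
            confidence = count / lhs_count
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    # Order by confidence, as in Table 13.
    return sorted(rules, key=lambda rule: rule[3], reverse=True)


rules = mine_pair_rules(records)
for lhs, rhs, support, confidence in rules:
    print(f"{lhs} -> {rhs}  support={support:.2f}  confidence={confidence:.2f}")
```

Both directions of a pair share the same support but generally differ in confidence, which is the behavior noted for the second and third rules of Table 12.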

Table 14 shows the lift measure. As in Table 13, the first rule is plant visit in the Maule region. Only the last rule has a lift below 1; that is, it occurs less frequently than would be expected if its items were independent. This last rule is therefore suppressed, because it does not exceed the threshold of 1. Rule 5 is also deleted, because it contains the same items as rule 4. The eight remaining rules are distributed as follows. By region: (a) three rules about Maule; (b) three rules about Metropolitana; (c) one rule about Araucania; and (d) one rule about O’Higgins. By category: (a) three rules about work with us; (b) one rule about plant visit; (c) one rule about marketing; (d) one rule about I want to be a customer; (e) one rule about quality; and (f) one rule about commercial.

Table 14

AR ordered by lift

AR	Lift
category: plant visit → region: Maule	2.3329
region: Araucania → category: work with us	1.5636
category: marketing → region: Maule	1.4062
category: work with us → region: Maule	1.3955
region: Maule → category: work with us	1.3955
category: I want to be a customer → region: Metropolitana	1.3079
category: quality → region: Metropolitana	1.2705
region: O’higgins → category: work with us	1.2644
category: commercial → region: Metropolitana	1.0262
category: marketing → region: Metropolitana	0.8288
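
The pruning applied to Table 14 can be sketched in Python as follows. Lift is the confidence of a rule divided by the support of its consequent, so a value above 1 indicates that the items co-occur more often than under independence. The rules and lift values below are copied from Table 14; the helper names `prune` and `count_by` are hypothetical, and the sketch only reproduces the filtering and counting step, not the mining itself.

```python
from collections import Counter

# (antecedent, consequent, lift), copied from Table 14.
table14 = [
    ("category: plant visit", "region: Maule", 2.3329),
    ("region: Araucania", "category: work with us", 1.5636),
    ("category: marketing", "region: Maule", 1.4062),
    ("category: work with us", "region: Maule", 1.3955),
    ("region: Maule", "category: work with us", 1.3955),
    ("category: I want to be a customer", "region: Metropolitana", 1.3079),
    ("category: quality", "region: Metropolitana", 1.2705),
    ("region: O’higgins", "category: work with us", 1.2644),
    ("category: commercial", "region: Metropolitana", 1.0262),
    ("category: marketing", "region: Metropolitana", 0.8288),
]


def prune(rules, min_lift=1.0):
    """Drop rules with lift <= min_lift and rules repeating an itemset."""
    seen = set()
    kept = []
    for lhs, rhs, lift in rules:
        itemset = frozenset((lhs, rhs))  # direction is ignored
        if lift <= min_lift or itemset in seen:
            continue
        seen.add(itemset)
        kept.append((lhs, rhs, lift))
    return kept


def count_by(rules, prefix):
    """Count how many rules mention each item that starts with prefix."""
    counts = Counter()
    for lhs, rhs, _ in rules:
        for item in (lhs, rhs):
            if item.startswith(prefix):
                counts[item[len(prefix):]] += 1
    return counts


kept = prune(table14)
regions = count_by(kept, "region: ")
categories = count_by(kept, "category: ")
print(len(kept), dict(regions), dict(categories))
```

Under these assumptions, eight rules remain, and the counts per region and per category match the distribution described in the text.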

Figure 3 shows a graph of the AR by lift. It highlights the following rules: (1) I want to be a customer in the Metropolitana region; (2) Work with us in the Maule region; (3) Quality in the Metropolitana region; and (4) Marketing in the Maule region.

Fig. 3

Graph of the AR by lift. Bubble size represents the support, and color intensity denotes the lift.

Figure 4 shows the bubble chart of the ten rules with the highest confidence. The biggest bubble represents the rule I want to be a customer in the Metropolitana region.

Fig. 4

Matrix of the 10 rules with the highest confidence. As in Fig. 3, bubble size represents the support, and color denotes the lift.

Figure 5 shows an association map that is very similar to the earlier Fig. 2. Dimension 1 of this graphic describes the majority of the variables. The association map was obtained by an analysis of similarities, a statistical technique that evaluates graphically the dependence and independence relations among a set of categorical variables on the basis of a contingency table. This analysis is a DMA that strengthens the results of the Apriori algorithm (Figs. 3 and 4).

Fig. 5

Graph of the correlation between variables. Dimension 1 describes 71.2% of the data, and dimension 2 describes 9.5% of the data. Blue circles represent categories, and red triangles denote regions.

5 Conclusions and future works

The traditional BI analysis of causal relationships is the TFE. Causes and effects, which are not always explicit, obvious, or observable, are useful for discovering new knowledge. These causes, effects, and incidences support decision-making and the creation of indicators not previously considered, such as the elements highlighted by customers. This information is useful for organizations that need to analyze customers’ opinions and shift their focus to determine where problems exist in the company. In recent years, the literature has presented several DMA studies for different BI tasks. The common research topics of DMA for BI have been e-commerce studies, care service, sentiment analysis, and service quality. However, current research on DMA for BI does not address forgotten effects. Therefore, we proposed a new approach for discovering forgotten effects. For this task, AR were used, although they do not necessarily represent causes and effects. In this study, the AR algorithm found the same relationships as the FEM. Therefore, AR serve to find forgotten effects.

This document presented an analysis of comments and opinions about a Chilean food industry company. The FEM analyzed coincidences and similarities between Chilean regions and categories. The method shows the regions and categories with the most coincidences. The analysis of this information shows the most relevant relationships between regions and categories in Tables 6, 8, 9, and 10. The association map (Fig. 2) denotes relationships between categories and regions and also shows the distances between the different variables. This analysis shows that the FEM yields the same results as AR. The AR were ordered by support, confidence, and lift. These results yield rules that determine the most important regions and categories for the company. These rules have been represented by an association graph and a matrix of rules (Figs. 3 and 4). The graph of the correlation between variables is presented as an association map in Fig. 5, which is highly similar to the association map obtained for the FEM (Fig. 2). In this study, these results strengthen the DMA. The preceding algorithms are implemented in high-level programming languages such as R and Python. Therefore, the proposal is the first DMA approach for forgotten effects that is easily automated.

For future investigations, we propose to continue reviewing other DMA that support the results obtained by applying the FEM, and thereby to detect new indicators or strengthen existing ones. An additional task is to find DMA that support the FEM in each step of the methodology.

The main limitations of the investigation stem from the characteristics of CRM systems’ databases, which usually do not contain standardized, cleaned, and classified text data. Consequently, the data mining process requires cleaning and classifying the data in the database, then performing an ETL (extract, transform, load) process, then filtering and extracting the classified data, and finally creating a groundwork on which the Theory of Forgotten Effects (TFE) methodology can be applied. Errors or loss of information can occur during text classification.

Author contributions

Conceptualization, A.U.-S. and C.N.-A.; Data curation, F.R.-O.; Formal analysis, F.R.-O.; Funding acquisition, A.U.-S.; Investigation, R.A.-G., A.U.-S., F.R.-O. and C.N.-A.; Methodology, F.R.-O.; Project administration, A.U.-S. and C.N.-A.; Resources, A.U.-S.; Supervision, A.U.-S. and C.N.-A.; Validation, C.N.-A.; Writing - original draft, A.U.-S., F.R.-O. and C.N.-A.; Writing - review & editing, R.A.-G. and A.U.-S.

Funding

This research was funded by the Department of Computer Science and Industry, Faculty of Engineering Science, Universidad Católica del Maule, and the School of Economics and Business, Universidad Santo Tomás.

Conflicts of interest

The authors declare no conflict of interest.

Appendix: Abbreviations

The following abbreviations are used in this manuscript:

AR	Association Rules
BI	Business Intelligence
CRM	Customer Relationship Management
DM	Data Mining
DMA	Data Mining Algorithms
FEM	Forgotten Effects Methodology
TFE	Theory of Forgotten Effects
SPSS	Statistical Package for the Social Sciences

Acknowledgments

This project is supported by “Red Iberoamericana para la Competitividad, Innovación y Desarrollo” (REDCID) Project NO. 616RT0515 in “Programa Iberoamericano de Ciencia y Tecnología para el Desarrollo” (CYTED).

References

1. Zavala and Ramirez-Marquez J.E., Visual analytics for identifying product disruptions and effects via social media, Int J Prod Econ 208 (2019), 544–559.

2. Saura and Bennett, A Three-Stage method for Data Text Mining: Using UGC in Business Intelligence Analysis, Symmetry (Basel) 11(4) (2019), 519.

3. Krauss and Arbanowski, Social preference ontologies for enriching user and item data in recommendation systems, in IEEE International Conference on Data Mining Workshops, ICDMW, (2014), pp. 365–372.

4. […], Zhang and Yin, Research on the evaluation of product quality perceived value based on text mining and fuzzy comprehensive evaluation, in Proceedings - International Conference on Identification, Information and Knowledge in the Internet of Things, IIKI 2016, (2016), pp. 563–566.

5. Linares-Mustarós, Gil-Lafuente A.M., Corominas Coll and Ferrer-Comalat J.C., Premises for the theory of forgotten effects, in Advances in Intelligent Systems and Computing 894 (2020), 206–215.

6. Usmani Z.A., Manchekar, Malim and Mir, A predictive approach for improving the sales of products in e-commerce, in Proceedings of the 3rd IEEE International Conference on Advances in Electrical and Electronics, Information, Communication and Bio-Informatics, AEEICB 2017, (2017), pp. 188–192.

7. Nicolás, Valenzuela and Gutierrez, Temas claves en Investigación de Mercados. Santiago: Ediciones Copygraph Ltda., 2015.

8. Foltean F.S., Trif S.M. and Tuleu D.L., Customer relationship management capabilities and social media technology use: Consequences on firm performance, J Bus Res 104 (2019), 563–575.

9. Alsac, Colak and Keskin G.A., An integrated customer relationship management and Data Mining framework for customer classification and risk analysis in health sector, in 6th International Conference on Industrial Technology and Management, ICITM 2017, (2017), pp. 41–46.

10. Jacob D.W., Fudzee M.F.M., Salamat M.A., Saedudin, Abdullah and Herawan, Mining significant association rules from on information and system quality of indonesian e-government dataset, in Advances in Intelligent Systems and Computing, (2017), pp. 608–618.

11. Riaz, Arooj, Hassan M.T. and Kim J.B., Clustering based association rule mining on online stores for optimized cross product recommendation, in International Conference on Control, Automation and Information Sciences, ICCAIS 2014, (2014), pp. 176–181.

12. hsien Liao and Tasi Y.S., Big data analysis on the business process and management for the store layout and bundling sales, Bus Process Manag J 25(7) (2019), 1783–1801.

13. Siryani, Mazzuchi and Sarkani, Framework using Bayesian belief networks for utility effective management and operations, in Proceedings - IEEE 1st International Conference on Big Data Computing Service and Applications, BigDataService 2015, (2015), pp. 72–78.

14. […], Tian, Hung, Akula and Zhang, Measuring and comparing service quality metrics through social media analytics: a case study, Inf Syst E-bus Manag 16(3) (2018), 579–600.

15. Lee, Social media analytics for enterprises: Typology, methods, and processes, Bus Horiz 61(2) (2018), 199–210.

16. […], Chen, Tian and Chong, Actionable social media competitive analytics for understanding customer experiences, J Comput Inf Syst 56(2) (2016), 145–155.

17. Yadav and Vishwakarma D.K., Sentiment analysis using deep learning architectures: a review, Artif Intell Rev (2019).

18. Bhardwaj, Narayan, Pawan and Dutta, Sentiment Analysis for Indian Stock Market Prediction Using Sensex and Nifty, in Procedia Computer Science 70 (2015), 85–91.

19. Napitu, Bijaksana M.A., Trisetyarso and Heryadi, Twitter opinion mining predicts broadband internet’s customer churn rate, in IEEE International Conference on Cybernetics and Computational Intelligence, CyberneticsCOM 2017 - Proceedings, (2018), pp. 141–145.

20. […] and Kešelj, Collective sentiment mining of microblogs in 24-hour stock price movement prediction, in Proceedings - 16th IEEE Conference on Business Informatics, CBI 2014, 2 (2014), 60–67.

21. Zvarevashe and Olugbara O.O., A framework for sentiment analysis with opinion mining of hotel reviews, in Conference on Information Communications Technology and Society, ICTAS 2018 - Proceedings, (2018), pp. 1–4.

22. Singla, Randhawa and Jain, Statistical and sentiment analysis of consumer product reviews, in 8th International Conference on Computing, Communications and Networking Technologies, ICCCNT, 2017.

23. Hegde and Padma S.K., Sentiment analysis using random forest ensemble for mobile product reviews in kannada, in Proceedings - 7th IEEE International Advanced Computing Conference, IACC, (2017), pp. 777–782.

24. Haque T.U., Saber N.N. and Shah F.M., Sentiment analysis on large scale Amazon product reviews, in IEEE International Conference on Innovative Research and Development, ICIRD 2018, (2018), pp. 1–6.

25. Xiong, Wang, Ji and Wang, A short text sentiment-topic model for product reviews, Neurocomputing 297 (2018), 94–102.

26. Mataoui, Bendali Hacine T.E., Tellache, Bakhtouchi and Zelmati, A new syntax-based aspect detection approach for sentiment analysis in Arabic reviews, in 2nd International Conference on Natural Language and Speech Processing, ICNLSP 2018, (2018), pp. 1–6.

27. Ravi, Ravi and Prasad P.S.R.K., Fuzzy formal concept analysis based opinion mining for CRM in financial services, Appl Soft Comput J 60 (2017), 786–807.

28. Chiang W.Y., Applying data mining for online CRM marketing strategy: An empirical case of coffee shop industry in Taiwan, Br Food J 120(3) (2018), 665–675.

29. Nicolas, Urrutia Sepúlveda, Valenzuela-Fernández and Gil-Lafuente, Systematic mapping on social media and its relation to business, Eur Res Manag Bus Econ 24(2) (2018), 104–113.

30. Arroyo and Cassú, Application of the forgotten effects model to the agency theory, in Advances in Intelligent Systems and Computing, (2015), pp. 67–79.

31. Ciarapica, Bevilacqua and Antomarioni, An approach based on association rules and social network analysis for managing environmental risk: A case study from a process industry, Process Saf Environ Prot 128 (2019), 50–64.

32. Babaian, Lucas and Chircu, Mapping Data Associations in Enterprise Systems, in Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 11491 LNCS (2019), 254–268.

33. Bozdag, Yanmaz and Kadaifci, A hesitant fuzzy correspondence analysis, in Advances in Intelligent Systems and Computing 1029 (2020), 362–368.

34. Wagner W.E., Using IBM SPSS statistics for research methods and social science statistics. Sage Publications, 2019.

35. Pérez, Minería de datos a través de Ejemplos. México D.F.: Alfaomega Grupo Editor S. A., 2015.

36. Zhan, Zhu, Zhang, Wang and Liu, Summary of Association Rules, in IOP Conference Series: Earth and Environmental Science 252(3) (2019).

37. Saxena and Gadhiya, A Survey on frequent pattern mining methods-Apriori, Eclat, FP growth, Int J Eng Dev Res 2(1) (2014).

38. Mostafa M.M., More than words: Social networks’ text mining for consumer brand sentiments, Expert Syst Appl 40(10) (2013), 4241–4251.

39. Aggarwal C.C., Data Mining: The Textbook. New York: Springer, Cham, 2015.

40. […] J.M.-T., Zhan and Chobe, Mining Association rules for Low-Frequency itemsets, PLoS One 13(7) (2018), e0198066.

41. Speight K.C., Schiano A.N., Harwood W.S. and Drake M.A., Consumer insights on prepackaged Cheddar cheese shreds using focus groups, conjoint analysis, and qualitative multivariate analysis, J Dairy Sci 102(8) (2019), 6971–6986.

42. Etaati and Etaati, Descriptive Analysis in Power Query with R, in Machine Learning with Microsoft Technologies, Apress, (2019), pp. 121–135.

43. Olson D.L. and Lauhoff, Association Rules, in Descriptive Data Mining, Springer, (2019), pp. 67–76.

44. Hahsler, Grün and Hornik, Arules - A computational environment for mining association rules and frequent item sets, J Stat Softw 14(1) (2005), 1–25.

45. Hahsler and Chelluboina, Visualizing Association Rules: Introduction to the R-extension Package arulesViz, R Proj Modul (2011), pp. 223–238.

46. Ishibuchi, Kuwajima and Nojima, Multiobjective association rule mining, in PPSN Workshop on Multiobjective Problem Solving from Nature 12, 2006.

47. Blanco-Mesa, Leon-Castro, Velázquez-Cázeres, Cifuentes-Valenzuela and Sánchez-Ovalle V.G., Medición de las Capacidades de Innovación en Tres Sectores Primarios en Colombia. Efectos Olvidados de las Capacidades de Innovación de la Quínoa, la Guayaba y Apícola en Boyacá y Santander, (A. M. Gil-Lafuente, ed.), Barcelona, España: Real Academia de Ciencias Económicas y Financieras, 2019.

48. Gil-Lafuente A.M. and Barcellos de Paula, Una Aplicación de la Metodología de los Efectos Olvidados: Los Factores que Contribuyen al Crecimiento Sostenible de la Empresa, Cuadernos Del CIMBAGE 12 (2010), 23–34.

49. Gil-Lafuente A.M. and Luis Bassa, The Forgotten Effects Model in a CRM Strategy, Fuzzy Economic Review 16(1) (2011), 3–19.

50. Gil-Lafuente A.M., Blanco F.R. and Castillo, Forgotten Effects of Sport, in Gil-Lafuente, Anna M. Gil-Lafuente and Merigó-Lindahl J.M. (Eds.), Soft Computing in Management and Business Economics, (2012), pp. 375–391.

51. Maqueda Lafuente J.F., Gil-Lafuente A.M., Guzman-Parra V.F. and Gil-Lafuente, Key Factors for Entrepreneurial Success, Management Decision 51(10) (2013), 1932–1944.

52. Nicolás and Gil-Lafuente, Customer Experience Assessment: Forgotten Effects, Journal of Computational Optimization in Economics and Finance 4(2–3) (2012), 77–88.

Table 1

Cause matrix

	a₁	a₂	...	a_j
a₁	A11	A12	...	A1j
a₂	A21	A22	...	A2j
...	...	...	...	...
a_i	Ai1	Ai2	...	Aij

Table 2

Effect matrix

	b₁	b₂	...	b_j
b₁	B11	B12	...	B1j
b₂	B21	B22	...	B2j
...	...	...	...	...
b_i	Bi1	Bi2	...	Bij

Table 3

Cause and effect matrix

	a₁	a₂	...	a_j
b₁	C11	C12	...	C1j
b₂	C21	C22	...	C2j
...	...	...	...	...
b_i	Ci1	Ci2	...	Cij

Table 5

Chilean regions. Roman numeral code

Code	Name (Region)
I	Tarapacá
II	Antofagasta
III	Atacama
IV	Coquimbo
V	Valparaíso
VI	Libertador Bernardo O’Higgins
VII	Maule
VIII	BioBío
IX	Araucanía
X	Los Lagos
XI	Aysén del Gral. Carlos Ibañez del Campo
XII	Magallanes
XIII	Metropolitana

Table 11

Similarities using TFE. The rows 1 to 13 have the same value. This table represents [M ∼ *]

Regions	Values of the rows 1 to 13
I	0.148
II	0.007
III	0.044
IV	0.042
V	0.024
VI	0.028
VII	0.028
VIII	0.014
IX	0.220
X	0.013
XI	0.121
XII	0.026
XIII	0.020