From indoor paths to gender prediction with soft clustering

Abstract

Customer-oriented practices benefit organizations in competitive markets. Offering individualized proposals increases customer loyalty and helps companies stay afloat. Understanding customers is therefore essential for making personalized recommendations. Gender, a key demographic feature, usually cannot be captured by human tracking technologies. Hence, several procedures have been developed to predict unknown gender information. In this research, the indoor paths followed in a shopping mall are used to predict customer genders with fuzzy c-medoids, a soft clustering technique. A Levenshtein-based fuzzy classification methodology is proposed that treats the followed paths as string data. Although some studies have focused on gender prediction, no research has centered on path-oriented gender prediction. The novelty of the investigation is the analysis of customer path data for the gender classes.

Keywords

Gender prediction, string classification, soft clustering, path classification, Levenshtein, fuzzy c-medoids

1 Introduction

Smartphones have played an essential role in people’s daily lives since their usage grew after the millennium. They are frequently used for unobtrusive data collection in both academic and industrial projects [1]. Smart devices with various technologies communicate with smartphones through sensors. Although this enables spatiotemporal data collection, demographic information is not included in the captured data. Because demographic information is meaningful in personalized systems, researchers have increasingly focused on gender prediction in recent years. Demographic information such as age, location, gender, education level, and marital status is employed in recommendation systems [2], prediction of the future [3], and purchasing choices [4]. Gender information plays a vital role in decision-making, especially in the marketing domain [5–7].

Although the technology developed with Industry 4.0 makes it possible to collect a vast amount of data, much sensor data does not contain gender information. Hence, gender prediction studies have gained importance [8]. Sensors integrated into smart devices capture spatiotemporal data from customers’ smartphones as long as the customers permit communication between the devices. The event logs created from sensor data make it possible to find customers’ paths, backgrounds, and interests [9]. For instance, female shoppers may visit more gym stores than male shoppers, and male consumers may spend more time in entertainment stores than female consumers. Gender prediction is one of the most challenging issues owing to highly dynamic human behaviors [10–12]. In previous studies, researchers used various sources such as face [13], speech [14], and handwriting [15].

Understanding customer needs from their indoor paths is a fascinating research field [16–18]. Studies on path analysis for various purposes have recently grown in prominence [19, 20]. Dogan et al. [21] examined male and female shoppers’ paths to explain changes in behavior. Their investigation shows that male and female consumers act differently with respect to their paths in the shopping mall. Some studies adopted descriptive statistics to represent buyer behaviors [22], yet understanding user behaviors needs an overall insight into customer pathways [1, 24]. Hence, process mining methods are applied to create and evaluate user paths [1, 26]. The aim of process mining includes not only discovering customer paths for an overall insight but also producing understandable solutions.

In this study, data from one of the biggest shopping malls are used. The data are collected through a Beacon network built for location tracking and other services by the Turkish start-up Blesh. The company has a Beacon network involving more than 100,000 beacons in over 50 cities in Turkey. Because of the company’s business model and related regulations, only a unique ID is collected from each user. The Beacons send Bluetooth low energy signals. Any nearby mobile device can receive such a signal, and if an appropriate application is installed on the device, it forwards the signal to the servers. The server, which receives the signal ID and the ID of the mobile device, can process the data and send messages to the mobile device. The decision process that runs on the server provides better results as more information is known about the owner of the mobile device. Any demographics or interests of the users help the algorithms to deliver better campaigns or actions. However, the system only has an ID representing each user.

At this point, the research question of the paper is defined as the prediction of gender from the location data of the users. The focal problem differs from classical classification problems in that the input data are more complex and not in tabular form. To overcome this challenge, a novel approach, the Levenshtein distance-based Fuzzy c-Medoids Clustering (L-FCMd) method, is proposed. One of the challenges of the problem is that numerous stores exist in the shopping mall, and using them in the model with their names or identifiers causes the loss of some patterns. The problem is simplified by converting the stores to store types such as Clothing, Catering, and Entertainment. Visitor paths are therefore represented by the order of the visited store types. Levenshtein distance, a string metric for measuring the difference between two sequences, is used to obtain the distances between the paths. The distance values are then embedded into Fuzzy c-Medoids clustering, which defines clusters of paths by using fuzzy principles and the distances between the paths.

This study is structured as follows. Section 2 presents related works on both gender classification and user paths. Furthermore, studies about fuzzy-based k-nearest neighbor are given. The section reveals the literature gap. Section 3 introduces the developed algorithm, Levenshtein distance-based Fuzzy c-Medoids Clustering. Section 4 presents the case study together with the comparison and the results of the proposed algorithm. Finally, the conclusions and limitations are discussed in Section 5.

2 Related works

The literature provides various studies on the behavior analysis of individuals. Yoshimura et al. [22] used museum visitors’ data, collected with Bluetooth devices, to detect the frequent paths used in the Louvre Museum. In another branch of studies, De Leoni et al. [27] used event log data to diagnose process bottlenecks and optimize hospital processes. Frisby et al. [28] focused on activities in the emergency room of a hospital; unlike the previous study, this study uses Bluetooth signals to define and optimize the operations. Arrayo et al. [29] concentrated on suspicious behaviors in a shopping mall. The researchers used a video-surveillance system and video processing techniques to detect suspicious events. Popa et al. [30] studied shopping behaviors by using trajectories and shopping-related activities. Wu et al. [31] investigated in-store events of customers to provide a customer flow analysis within a store. In another study, Yim et al. [32] used the local area network system to detect customer locations and estimate the next location customers will visit. Oosterlinck et al. [33] also focused on human behavior in a shopping mall and showed the efficiency of using Bluetooth devices for human tracking.

Classification, a data analysis method, groups data samples into predefined classes. A classification algorithm assigns data points to the existing classes according to their similarities.

Table 1
Path studies with study purposes

Study	Goal^*	Detail
[34]	1	A new pattern discovery method is developed to capture frequent user movements.
[35]	1	A decision tree method is built using trajectory patterns extracted from the spatio-temporal data.
[36]	1	A novel location prediction, which aims to estimate the probability of visiting a location, is proposed to find frequent patterns.
[37]	1	A new clustering-oriented prediction technique is proposed, which considers not only the geographic but also the semantic properties of user paths.
[38]	2	A new retailer segmentation strategy is presented considering multi-criteria decision-making integrated with soft clustering.
[39]	2	A recommendation system based on path-based and collaborative filtering is proposed to provide individualized offers.
[40]	2	Similarity among people considering their visited locations in the past is computed by developing a hierarchical-graph-based approach.
[41]	2	The maximal travel match method is applied to measure user similarity from GPS routes and semantic past location data.
[42]	2	A novel approach is developed for touristic tour planning to present personalized offers by modelling it as a generalized maximum coverage problem.
[43]	2	Both generic and personalized recommendations are created. Generic recommendations are created with a tree-based hierarchical method. A collaborative filtering model is proposed for personalized recommendations.
[44]	2	A method is developed to recommend individualized friend and location offers on the Web by estimating personal interests from location history.
[45]	2	A fuzzy logic-based recommendation system is developed to apply restaurant offerings in various social media platforms.
[46]	3	A novel algorithm is developed, which cuts the path duration into groups and later employs the k-means clustering method.
[47]	3	Association rule mining is applied to shoppers’ location data collected by Bluetooth-based iBeacon devices.
[48]	3	A graph-based mining algorithm is developed to analyze frequent movements, which outperforms the Apriori-based and PrefixSpan-based methods.
[49]	3	The discovery of frequently visited places is studied by developing the Apriori method.
[50]	3	A grid partition technique for vertical projection distance is proposed to overcome difficulties of some other methods.
^* 1: Path-based prediction, 2: Path-based recommendation, 3: Mining frequent patterns

Gender classification is one of the well-known application domains in classification research. Several sources have been adopted to predict gender information in previous studies, such as face [51, 52], speech [14], handwriting [15, 54], and video-based gait [55, 56]. In addition, smartphones are a new tool for collecting human data that can be used to predict gender. Choi et al. [57] investigated mobile text data for users’ gender prediction. Current gender classification techniques using smartphones are mainly implemented in two forms: visual and audio. Danisman et al. [13] proposed a fuzzy-based inference scheme that utilizes facial characteristics such as the mustache and the inner face. Agneessens et al. [58] gathered audio signals from mobile phones to determine the gender of speakers.

Studies on user path analysis have various research goals, such as path-based prediction, path-based recommendation, and mining frequent patterns. Table 1 presents a summary of the literature review. Naserian et al. [34] predicted individualized positions while considering users’ trip types. To accomplish this aim, they introduced a new pattern discovery procedure based on the spatial-temporal trajectories of users. User position data show individual interests and visit purposes. Salkin and Oztaysi [38] presented a new retailer segmentation strategy considering multi-criteria decision-making integrated with soft clustering. Shaw and Gopalan [46] developed a novel algorithm that cuts the path duration into groups and later employs the k-means clustering method. Their study aims to reveal the frequent paths of moving objects using the clustering method and sequential pattern mining.

Dogan et al. [21] mention that no research predicts gender information using customer paths. Previous research on gender prediction is not related to customer paths, and research focusing on customer paths has not predicted gender information. Dogan and Oztaysi [7] filled this gap in a recent study by developing a Levenshtein-based fuzzy kNN method. In this study, a similar algorithm is adopted with a different clustering method, the fuzzy c-medoids algorithm.

3 Proposed methodology

This paper proposes a novel technique to predict the gender of customers in a shopping mall by combining Levenshtein distance and fuzzy c-Medoids. Figure 1 depicts the general view of the proposed technique.

Fig.1

Flowchart of the proposed method for gender prediction from paths.

The introduced technique commences with data collection from beacon devices through users’ mobile phones. The data are transformed into store group information. Various data preprocessing steps are implemented to prepare the data for analysis. Then, the individual indoor paths are discovered for each customer. The paths are classified into gender patterns. Levenshtein distance computes the distances between a gender-unknown path and all gender-known paths. Finally, fuzzy c-Medoids predicts the gender of customers.

3.1 Levenshtein distance

Levenshtein distance is a special kind of sequential alignment technique. It can compute the distance between two string variables such as customer paths. Traditional similarity methods based on calculating numerical results are not applicable in string-based data [59]. Levenshtein distance is one of the well-known algorithms that can cover variations among string data [60]. In this algorithm, the distances are computed with three kinds of transformations: substitution, deletion, and insertion [61].

The first and second customer paths are denoted by f and s, respectively. The transformations are applied to single elements of f and s. An element with index zero is placed at the start of each path to initialize the distance calculation. Hence, the similarity comparison matrix comprises l_f + 1 rows and l_s + 1 columns.

f = f (0) , f (1) , f (2) , …, f (l_f)

s = s (0) , s (1) , s (2) , …, s (l_s)

where l_f and l_s refer to the lengths of the first and second customer paths, and f (i) and s (j) give the ith element of the first path and the jth element of the second path, respectively. Lev (i, j) denotes the Levenshtein distance between the first i elements of f and the first j elements of s, so that Lev (f, s) = Lev (l_f, l_s). It is determined by Equation 1.

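$Lev (i, j) = \min \left\{ \begin{matrix} Lev (i - 1, j) + 1 \\ Lev (i, j - 1) + 1 \\ Lev (i - 1, j - 1) + 1_{[f (i) \neq s (j)]} \end{matrix} \right.$ (1)

where Lev (i - 1, j) + 1 is the deletion, Lev (i, j - 1) + 1 is the insertion, and Lev (i - 1, j - 1) + 1_{[f (i) ≠ s (j)]} is the substitution operation. The indicator 1_{[f (i) ≠ s (j)]} equals 1 when the two elements differ and 0 when they match, so matching elements add no cost. The boundary values are Lev (i, 0) = i and Lev (0, j) = j.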
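The recurrence can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation used in the study. It assumes that each path is encoded as a string with one letter per store group and that a substitution costs nothing when the two letters match, which is the standard Levenshtein convention. The function name `levenshtein` is illustrative.

```python
def levenshtein(f: str, s: str) -> int:
    """Levenshtein distance between two paths encoded as strings."""
    # prev[j] holds Lev(i - 1, j); the first row is the boundary Lev(0, j) = j.
    prev = list(range(len(s) + 1))
    for i, f_i in enumerate(f, start=1):
        curr = [i]  # boundary column: Lev(i, 0) = i
        for j, s_j in enumerate(s, start=1):
            cost = 0 if f_i == s_j else 1  # a match adds no substitution cost
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution
            ))
        prev = curr
    return prev[-1]
```

With this sketch, `levenshtein("HD", "DNCLH")` evaluates to 5 and `levenshtein("HD", "DFD")` to 2, which matches the operation counts quoted for the path HD in Section 4.2. Only two rows of the comparison matrix are kept in memory, since each row depends on the previous one alone.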

3.2 Fuzzy c-medoids clustering

Fuzzy clustering allows an individual data element to be partially classified into more than one cluster. In crisp clustering, each individual is declared a member of only one cluster. When the number of clusters is set to K, the algorithm provides a set of variables m_i1, m_i2, . . . , m_iK, where m_ik represents the degree to which data element i belongs to cluster k. In classical crisp clustering algorithms, only one of these values is one and the rest are zero, since a data element is assigned to only one cluster. In fuzzy clustering, each data element has membership values spread over all clusters. Each m_ik lies between zero and one, with the condition that the sum of m_ik over the clusters is one. The advantage of fuzzy clustering is that it does not force data objects into a specific cluster, and thus it provides much more information to be interpreted.

The fuzzy c-medoids algorithm was proposed by Krishnapuram [62] as a modified version of the fuzzy c-means algorithm. Fuzzy c-means is a popular clustering technique, but it can be affected by outliers. Thus, Krishnapuram modified the original algorithm by replacing the means with medoids [62]. The steps of the fuzzy c-medoids algorithm are as follows:

Step 1: Set the number of clusters (c)

Step 2: Randomly pick an initial set of medoids V = {v₁, v₂, v₃, . . . , v_c} from X, where X = {x_i | i = 1, . . . , n} is the set of n objects. Set iter = 0.

Step 3: For each cluster and data object compute u_ik values using Equation 2

$u_{ik} = \frac{{(\frac{1}{(1 - K (x_{k}, v_{i}))})}^{1 / (m - 1)}}{\sum_{j = 1}^{c} {(\frac{1}{(1 - K (x_{k}, v_{j}))})}^{1 / (m - 1)}}$ (2)

Step 4: Store the medoid values (V^old).

Step 5: Calculate the new medoids (V) values by using Equation 3.

$\begin{matrix} x_{i}^{*} = \underset{i}{argmin} (d_{ik}^{2} u_{ik}^{m}) \\ v_{i} = x_{i}^{*} \end{matrix}$ (3)

where $d_{ik}^{2} = (x_{k} - v_{i}) A_{i} {(x_{k} - v_{i})}^{T}$

Step 6: Stop if V^old - V = 0 or the maximum iteration limit is reached. Otherwise, go to Step 3.
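The six steps can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' code: it works on a precomputed dissimilarity matrix `D` (any non-negative dissimilarity can be plugged in), it updates each medoid to the object that minimizes the membership-weighted sum of dissimilarities over all objects, which is the common formulation of the medoid update, and it adds an optional `init` argument so that runs can be reproduced. All function and parameter names are illustrative.

```python
import random


def memberships(D, medoids, m=2.0):
    """u[i][k]: membership of object k in cluster i (Step 3)."""
    c, n = len(medoids), len(D)
    u = [[0.0] * n for _ in range(c)]
    for k in range(n):
        dists = [D[k][v] for v in medoids]
        zeros = [i for i, d in enumerate(dists) if d == 0]
        if zeros:
            # The object coincides with a medoid: it belongs fully to it.
            for i in zeros:
                u[i][k] = 1.0 / len(zeros)
            continue
        weights = [(1.0 / d) ** (1.0 / (m - 1.0)) for d in dists]
        total = sum(weights)
        for i in range(c):
            u[i][k] = weights[i] / total
    return u


def fuzzy_c_medoids(D, c, m=2.0, max_iter=100, seed=0, init=None):
    """Return (medoid indices, membership matrix) for dissimilarities D."""
    n = len(D)
    # Step 2: random initial medoids unless an explicit start is given.
    medoids = list(init) if init is not None else random.Random(seed).sample(range(n), c)
    for _ in range(max_iter):
        u = memberships(D, medoids, m)        # Step 3
        old = list(medoids)                   # Step 4

        def cost(i, j):
            # Weighted dissimilarity of all objects to candidate medoid j.
            return sum(u[i][k] ** m * D[k][j] for k in range(n))

        # Step 5: each medoid becomes the object with the lowest cost.
        medoids = [min(range(n), key=lambda j, i=i: cost(i, j)) for i in range(c)]
        if medoids == old:                    # Step 6
            break
    return medoids, memberships(D, medoids, m)
```

For example, with six points on a line at 0, 1, 2, 10, 11, 12, absolute differences as `D`, and `init=[0, 5]`, the sketch settles on the middle point of each group as the medoids. Two clusters may end up sharing a medoid in degenerate cases, and a production implementation would need to guard against that.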

3.3 Levenshtein distance-based fuzzy c-medoids (L-FCMd) clustering

The Levenshtein distance-based classification method is adopted from D’Urso and Massari [59]. Let X = (x₁, x₂, …, x_N) be the dataset to be clustered, which consists of ordered sequences in this research, and let V = (v₁, v₂, …, v_C) be a subset of X with cardinality C. Levenshtein distance-based fuzzy c-medoids clustering is formalized as in Equation 4.

$\min \sum_{i = 1}^{N} \sum_{k = 1}^{C} u_{k} (x_{i})^{m} Lev (x_{i}, v_{k})$ (4)

where u_k (x_i) is the membership value of data point x_i in fuzzy cluster c_k. The fuzzifier m must be greater than 1. The clusters approach the crisp format as m approaches 1. The Levenshtein distance between data point x_i and cluster center v_k is represented by Lev (x_i, v_k).

The optimal solution is obtained by minimizing Equation 4 [63]. $u_{k} (x_{i}) = {[\sum_{j = 1}^{C} {(\frac{Lev (x_{i}, v_{k})}{Lev (x_{i}, v_{j})})}^{\frac{1}{(m - 1)}}]}^{- 1}$ (5)
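As a worked illustration, the membership of a single path can be computed from its Levenshtein distances to the medoids as the reciprocal of the summed distance ratios, which is the usual fuzzy c-medoids membership. The sketch below assumes a fuzzifier of m = 2; the text does not state the value, but m = 2 reproduces the membership values listed in Table 5. The function name and the handling of a zero distance are assumptions of this sketch.

```python
def membership(distances, m=2.0):
    """Membership of one path in each cluster, from its distances to the medoids."""
    zeros = [k for k, d in enumerate(distances) if d == 0]
    if zeros:
        # The path is identical to a medoid: give that cluster full membership.
        return [1.0 / len(zeros) if k in zeros else 0.0
                for k in range(len(distances))]
    exponent = 1.0 / (m - 1.0)
    return [
        1.0 / sum((d_k / d_j) ** exponent for d_j in distances)
        for d_k in distances
    ]


# Path FH from Table 5: d_m = 4, d_f = 2.
u_m, u_f = membership([4, 2])
print(round(u_m, 4), round(u_f, 4))  # 0.3333 0.6667
```

The smaller distance yields the larger membership, and the values always sum to one. When both distances are equal, the memberships are both 0.5, which is the case reported as unclassified.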

Figure 2 presents the proposed L-FCMd algorithm for string data. It can determine the optimum number of clusters without a priori knowledge by calculating the Xie-Beni Index (XBI), a performance indicator for fuzzy clustering methods.

Fig.2

The integrated L-FCMd algorithm.

4 Case study

Predicting users’ gender from smart devices is a difficult task that needs a vast amount of location, time, and text data [27]. iBeacon devices, a Bluetooth-based technology, are installed to collect customer data. Four hundred eighty-two (482) gender-known indoor paths, which are 96 male and 385 female, and 1124 gender-unknown paths are collected. Store groups, instead of stores, are considered to enhance the understandability of the customer paths. The stores in the shopping mall are combined according to their most dominant product or service to decrease the number of considered locations. Figure 3 shows the flowchart of the proposed system.

Fig.3

An overview of the gender prediction methodology.

In the data preparation step, because the shopping mall is open between 10 am and 10 pm, paths outside this period are ignored. When the time gap between two visits of the same shopper is greater than ninety minutes, they are treated as different visits. The minimum visit duration is set to 1 minute to ignore data from people who are only walking past. Since beacons send signals every three seconds, customers can be seen at every location during their visit. One-location visits are ignored because they may decrease classification accuracy. Two hundred ninety-three (293) stores in the shopping mall are grouped into store groups with respect to their dominant product or service. Eleven store groups are obtained, which are alphabetically Accessory (A), Catering (C), Clothing (D), Common area (E), Electronics (F), Entertainment (G), Entrance (H), Home (I), Mother&Baby (J), Personal Care (K), and Supermarket (L). The path lengths range from 2 to 18 locations. Visits with between 2 and 10 locations constitute 99.07% of all customer paths. The most visited store groups are C and D. Table 2 gives an example of customer paths. Each letter indicates a consecutive visit to a specific location. For example, customer C4 visits three different places; first Clothing (D), then Personal Care (K), and finally Catering (C).
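The preparation rules above can be sketched as a small Python function. This is an illustration under several assumptions that the text does not specify: the input is a list of `(customer_id, timestamp, store_group_letter)` detections, a stay is a run of consecutive detections at the same store group whose duration is the time between its first and last detection, and repeated letters that become adjacent after short stays are dropped are merged. The function name `build_paths` and the `id_visitN` key format (modeled on the identifiers in Table 5) are illustrative.

```python
from datetime import datetime, timedelta
from itertools import groupby

OPEN_HOUR, CLOSE_HOUR = 10, 22        # mall opening hours (10 am to 10 pm)
VISIT_GAP = timedelta(minutes=90)     # a larger gap starts a new visit
MIN_STAY = timedelta(minutes=1)       # shorter stays count as walking past


def build_paths(detections):
    """Map '<customer_id>_visit<n>' to a path string of store-group letters."""
    by_customer = {}
    for cid, ts, group in detections:
        if OPEN_HOUR <= ts.hour < CLOSE_HOUR:      # drop out-of-hours signals
            by_customer.setdefault(cid, []).append((ts, group))

    paths = {}
    for cid, rows in by_customer.items():
        rows.sort()
        # Split the detections into visits at gaps longer than VISIT_GAP.
        visits, current = [], [rows[0]]
        for prev, row in zip(rows, rows[1:]):
            if row[0] - prev[0] > VISIT_GAP:
                visits.append(current)
                current = []
            current.append(row)
        visits.append(current)

        for n, visit in enumerate(visits, start=1):
            letters = []
            for group, run in groupby(visit, key=lambda r: r[1]):
                run = list(run)
                if run[-1][0] - run[0][0] >= MIN_STAY:   # keep real stays only
                    letters.append(group)
            # Merge repeats that became adjacent after dropping short stays.
            path = "".join(g for g, _ in groupby(letters))
            if len(path) >= 2:                           # drop one-location visits
                paths[f"{cid}_visit{n}"] = path
    return paths
```

The mapping from the 293 individual stores to the eleven store-group letters is assumed to have been applied before this step, for instance with a lookup table from store identifier to letter.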

Table 2

A sample of customer paths

Customer	S1	S2	S3	S4	S5	S6
C1	H	D
C2	A	C	A	K	D	I
C3	C	D	L
C4	D	K	C
C5	H	K	J	A
C6	C	N	C	D

Table 3

10-fold cross-validation results

k =	1	2	3	4	5
Accuracy	0.9583	0.7292	0.5833	1.0000	0.8750
k =	6	7	8	9	10
Accuracy	0.8958	0.9375	0.9375	0.5417	0.9592
Average	84.16 %

Table 4

Comparison of two methods

	L-FCMd		k-Medoids
	Female	Male	Female	Male
Female	176	21	186	11
Male	22	167	93	106
Accuracy	84.16%		75.65%

Table 5

A sample of L-FCMd results

Customer_ID	Path	d_m	d_f	u_m (x)	u_f (x)	Path Prediction
10548435_visit2	FH	4	2	0.3333	0.6667	Female
34281840_visit2	IDJCLH	2	5	0.7143	0.2857	Male
10624475_visit1	NAC	4	3	0.4286	0.5714	Female
10625272_visit1	DAN	4	2	0.3333	0.6667	Female
11323351_visit1	DCLI	2	3	0.6000	0.4000	Male
13155707_visit1	DCNC	3	3	0.5000	0.5000	Equal
Medoids for the classes		DNCLH		DFD

In predictive models, methods such as cross-validation are used to assess and improve correctness. 10-fold cross-validation is applied to the training dataset to guard against obstacles such as underfitting and overfitting [64]. Eight hundred ninety-three (893) customer paths, including 189 male and 704 female paths, are investigated. The introduced method is run ten times. The training dataset is changed in each run, and the remaining gender-known elements are predicted. Table 3 presents the 10-fold cross-validation results for the Levenshtein distance-based fuzzy c-medoids clustering method. Then, the average accuracy of the introduced method is calculated. An accuracy between 59% and 70% is considered acceptable for validating the robustness of a classification model [53]. In this study, the average accuracy is 84.16% (Table 3), which indicates that more than eight out of ten shopper paths are correctly classified.
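The validation loop can be sketched generically as follows. The sketch assumes a hypothetical callback `fit_predict(train_paths, train_labels, test_paths)` that returns one predicted label per test path; how the folds were actually drawn in the study is not stated, so the shuffled round-robin split used here is an assumption, as are all names. It also assumes there are at least as many paths as folds.

```python
import random


def k_fold_indices(n, k=10, seed=0):
    """Shuffle the indices 0..n-1 and deal them into k folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cross_validate(paths, labels, fit_predict, k=10, seed=0):
    """Return the per-fold accuracies and their average."""
    accuracies = []
    for held_out in k_fold_indices(len(paths), k, seed):
        held = set(held_out)
        train = [i for i in range(len(paths)) if i not in held]
        predicted = fit_predict(
            [paths[i] for i in train],
            [labels[i] for i in train],
            [paths[i] for i in held_out],
        )
        correct = sum(p == labels[i] for p, i in zip(predicted, held_out))
        accuracies.append(correct / len(held_out))
    return accuracies, sum(accuracies) / k
```

Each gender-known path is held out exactly once, so every fold accuracy is measured on paths that were not used to build the medoids for that run.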

4.1 Comparison

k-medoids is a clustering method based on the centroid model. The centroid must be a real data point, which is called the medoid. Therefore, the means do not need to be calculated at each step. Additionally, k-medoids is advantageous in terms of execution time and is not sensitive to noise points and outliers. The k-medoids algorithm starts by randomly selecting samples from the dataset as the initial medoids. Then, it computes the distances between each data point and the medoids. Each data point is assigned to its nearest medoid. The medoid of each cluster is re-defined by considering only the elements assigned to that cluster. The distance calculation step is repeated with the new medoids. The loop continues until a predetermined number of iterations is reached or the medoids no longer change between subsequent iterations.
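The crisp procedure described above can be sketched on a precomputed distance matrix. This is a minimal illustration, not the implementation used for the comparison: the initial medoids are passed in explicitly instead of being drawn at random so that the run is reproducible, and the function and parameter names are illustrative.

```python
def k_medoids(D, initial_medoids, max_iter=100):
    """Crisp k-medoids on a distance matrix D; returns (medoids, assignment)."""
    n = len(D)
    medoids = list(initial_medoids)

    def assign(meds):
        # Each object goes to the cluster of its nearest medoid.
        return [min(range(len(meds)), key=lambda i: D[k][meds[i]]) for k in range(n)]

    for _ in range(max_iter):
        labels = assign(medoids)
        new_medoids = []
        for i, old in enumerate(medoids):
            members = [k for k in range(n) if labels[k] == i]
            if not members:
                new_medoids.append(old)  # keep the medoid of an empty cluster
                continue
            # New medoid: the member with the smallest total distance
            # to the other members of its cluster.
            new_medoids.append(min(members, key=lambda j: sum(D[k][j] for k in members)))
        if new_medoids == medoids:       # no change: converged
            break
        medoids = new_medoids
    return medoids, assign(medoids)
```

Unlike the fuzzy variant, every object contributes to exactly one medoid update, so a path that lies halfway between two clusters is forced into one of them.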

Fuzzy c-medoids improves on the accuracy obtained by the crisp k-medoids method in most classification problems [65]. In this study, the accuracy results of the proposed method and the crisp method are compared. Table 4 gives the empirical outcomes on the training dataset for both methods. As expected, fuzzy c-medoids outperforms k-medoids by producing more reliable performance.

4.2 Results

Because the proposed L-FCMd approach predicts gender information better, it is applied to the paths of the gender-unknown customers. Table 5 presents a small part of the results obtained in one iteration for paths like those given in Table 2. d_m and d_f indicate the minimum Levenshtein distance to the male and female clusters, respectively. u_m (x) and u_f (x) present the membership values of the path in the male and female clusters, respectively. Path Prediction shows the predicted class. For instance, the path HD requires five Levenshtein operations to become the same as the male cluster medoid, DNCLH, and two Levenshtein operations to become the female class medoid, DFD. The number of Levenshtein operations defines the similarity. The membership values of the classes are determined by Equation 5.

The developed L-FCMd algorithm classifies 225 paths as male and 623 paths as female, and 45 customer paths are not classified because their membership values u_m (x) and u_f (x) are equal. Some extra data processing steps are required to obtain the gender information. Because one consumer can visit the shopping mall more than once, the path classifications are turned into a gender prediction per customer (Figure 4). For example, customer 34612045 is classified as male in visit 4 and visit 12, and as female in the other five visits. Hence, customer 34612045 is decided to be female. At the end of the study, 494 gender-unknown customers are classified as 121 male and 373 female shoppers using path prediction.

Fig.4

Gender prediction from classified path.
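The step from path classifications to a customer-level gender can be sketched as a majority vote over a customer's visits. The text does not say how unclassified paths or tied vote counts are handled, so skipping paths labeled Equal and returning Unknown on a tie are assumptions of this sketch, as is the function name.

```python
from collections import Counter


def customer_gender(path_predictions):
    """Majority vote over the per-visit path predictions of one customer."""
    # Paths with equal memberships carry no vote.
    votes = Counter(p for p in path_predictions if p in ("Male", "Female"))
    if votes["Male"] == votes["Female"]:
        return "Unknown"
    return "Male" if votes["Male"] > votes["Female"] else "Female"


# Customer 34612045: two visits classified as male, five as female.
print(customer_gender(["Male"] * 2 + ["Female"] * 5))  # Female
```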

5 Conclusion

Gender information has great significance for companies, especially in the marketing domain, for improving user-oriented systems such as recommendations, discounts, and campaigns. Nevertheless, data collected by sensors do not include gender data. There are some ways to predict customers’ gender, such as handwriting, voice, and face. However, none of them is suitable for sensor data. Predicting shoppers’ gender from the indoor paths they follow is an exciting field. In this research, users’ gender is predicted from their paths, which are classified using fuzzy clustering. The developed Levenshtein distance-based Fuzzy c-Medoids (L-FCMd) Clustering algorithm is trained with the gender-known paths. Then, it is tested and compared with the crisp k-medoids clustering method. It is confirmed that L-FCMd produces a stronger classification performance, with an accuracy of 84.16%. Later, L-FCMd is applied to the gender-unknown paths to predict genders. The proposed L-FCMd method classifies 225 paths as male and 623 paths as female. Since shoppers can visit the shopping mall more than once, the classified paths are used to obtain genders. Consequently, 121 male and 373 female customers are identified among the gender-unknown customers. Because 45 user paths have equal membership values for the male and female clusters, they cannot be classified. Other clustering techniques could be modified to decrease this information loss by minimizing the number of equal membership values during path classification. The study has a limitation: only 1-location paths are ignored, since they may cause low accuracy. Disregarding 2-location customer paths as well may improve the classification performance.

Acknowledgment

This work was supported by Research Fund of the Istanbul Technical University. Project Number: MGA-2019-41949

References

Fernandez-Llatas

, Lizondo

, Monton

, Benedi

J.-M.

, Traver

, Process mining methodology for health process tracking using real-time indoor location systems, Sensors 15(12) (2015), 29821–29840.

Choi

I.Y.

, Oh

M.G.

, Kim

J.K.

, Ryu

Y.U.

, Collaborative filtering with facial expressions for online video recommendation, International Journal of Information Management 36(3) (2016), 397–402.

Boland

, Riggs

K.J.

, Anderson

R.J.

, A brighter future: The effect of positive episodic simulation on future predictions in non-depressed, moderately dysphoric &highly dysphoric individuals, Behaviour research and therapy 100 (2018), 7–16.

Lim

, Choi

J.-G.

, Akhmedov

, Chung

, Predicting future trends of media elements in hotel marketing by using change propensity analysis, International Journal of Hospitality Management (2018), (In Press, Corrected Proof).

Chong

A.Y.-L.

, Predicting m-commerce adoption determinants: A neural network approach, Expert Systems with Applications 40(2) (2013), 523–530.

Yeh

C.-H.

, Wang

Y.-S.

, Yieh

, Predicting smartphone brand loyalty: Consumer value and consumer-brand identification perspectives, International Journal of Information Management 36(3) (2016), 245–257.

Dogan

, Oztaysi

, Genders prediction from indoor customer paths by levenshtein-based fuzzy knn, Expert Systems with Applications 136 (2019), 42–49.

Dogan

, Oztaysi

, Gender prediction from classified indoor customer paths by fuzzy c-medoids clustering, in: Intelligent and Fuzzy Techniques in Big Data Analytics and Decision Making, Springer, 2019, pp. 160–169.

Zhong

, Tan

, Mo

, Yang

, User demographics prediction based on mobile data, Pervasive and mobile computing 9(6) (2013), 823–837.

10.

Moeini

, Mozaffari

, Gender dictionary learning for gender classification, Journal of Visual Communication and Image Representation 42 (2017), 1–13.

11.

Dogan

, Martinez-Millana

, Rojas

, Sepúlveda

, Munoz-Gama

, Traver

, Fernandez-Llatas

, Individual behavior modeling with sensors using process mining, Electronics 8(7) (2019), 766.

12.

Dogan

, Fernandez-Llatas

, Oztaysi

, Process mining application for analysis of customer’s different visits in a shopping mall, In: International Conference on Intelligent and Fuzzy Systems, Springer, 2019, pp. 151–159.

13.

Danisman

, Bilasco

I.M.

, Martinet

, Boosting gender recognition performance with a fuzzy inference system, Expert Systems with Applications 42(5) (2015), 2772–2784.

14.

Bisio

, Delfino

, Lavagetto

, Marchese

, Sciarrone

, Gender-driven emotion recognition through speech signals for ambient intelligence applications, IEEE Transactions on Emerging Topics in Computing 1(2) (2013), 244–257.

15.

Ahmed

, Rasool

A.G.

, Afzal

, Siddiqi

, Improving handwriting based gender classification using ensemble classifiers, Expert Systems with Applications 85 (2017), 158–168.

16.

Abedi

, Bhaskar

, Chung

, Miska

, Assessment of antenna characteristic effects on pedestrian and cyclists traveltime estimation based on bluetooth and wifi mac addresses, Transportation Research Part C: Emerging Technologies 60 (2015), 124–141.

17.

Mazimpaka

J.D.

, Timpf

, Trajectory data mining: A review of methods and applications, Journal of Spatial Information Science 2016(13) (2016), 61–99.

18. Dogan, Ayyar, Cagil, Process-oriented evaluation of customer satisfaction: Process mining application in a call center, in: Proceedings of 10th International Symposium on Intelligent Manufacturing and Service Systems, 2019, pp. 172–181.

19. Brun, Saggese, Vento, Dynamic scene understanding for behavior analysis based on string kernels, IEEE Transactions on Circuits and Systems for Video Technology 24(10) (2014), 1669–1681.

20. Saini, Kumar, Roy P.P., Dogra D.P., An efficient approach for trajectory classification using FCM and SVM, in: 2017 IEEE Region 10 Symposium (TENSYMP), IEEE, 2017, pp. 1–4.

21. Dogan, Bayo-Monton J.-L., Fernandez-Llatas, Oztaysi, Analyzing of gender behaviors from paths using process mining: A shopping mall application, Sensors 19(3) (2019), 557.

22. Yoshimura, Sobolevsky, Ratti, Girardin, Carrascal J.P., Blat, Sinatra, An analysis of visitors’ behavior in the Louvre museum: A study using Bluetooth data, Environment and Planning B: Planning and Design 41(6) (2014), 1113–1131.

23. Dogan, Discovering customer paths from location data with process mining, European Journal of Engineering Science and Technology 3(1) (2020), 139–145.

24. Dogan, Oztaysi, Fernandez-Llatas, Segmentation of indoor customer paths using intuitionistic fuzzy clustering: Process mining visualization, Journal of Intelligent & Fuzzy Systems 38(1) (2020), 675–684.

25. Dogan, Process mining for check-up process analysis, IIOABJ 9(6) (2018), 56–61.

26. Hwang, Jang Y.J., Process mining to discover shoppers’ pathways at a fashion retail store using a WiFi-base indoor positioning system, IEEE Transactions on Automation Science and Engineering 14(4) (2017), 1786–1792.

27. De Leoni, van der Aalst W.M. and Dees, A general process mining framework for correlating, predicting and clustering dynamic behavior based on event logs, Information Systems 56 (2016), 235–257.

28. Frisby, Smith, Traub, Patel V.L., Contextual computing: a Bluetooth based approach for tracking healthcare providers in the emergency room, Journal of Biomedical Informatics 65 (2017), 97–104.

29. Arroyo, Yebes J.J., Bergasa L.M., Daza I.G., Almazán, Expert video-surveillance system for real-time detection of suspicious behaviors in shopping malls, Expert Systems with Applications 42(21) (2015), 7991–8005.

30. Popa M.C., Rothkrantz L.J., Shan, Gritti, Wiggers, Semantic assessment of shopping behavior using trajectories, shopping related actions, and context information, Pattern Recognition Letters 34(7) (2013), 809–819.

31. Y.-k., Wang H.-C., Chang L.-C., Chou S.-C., Customer’s flow analysis in physical retail store, Procedia Manufacturing 3 (2015), 3506–3513.

32. Yim, Jeong, Gwon, Joo, Improvement of Kalman filters for WLAN based indoor tracking, Expert Systems with Applications 37(1) (2010), 426–433.

33. Oosterlinck, Benoit D.F., Baecke, Van de Weghe, Bluetooth tracking of humans in an indoor environment: An application to shopping mall visits, Applied Geography 78 (2017), 55–65.

34. Naserian, Wang, Dahal, Wang, Personalized location prediction for group travellers from spatial–temporal trajectories, Future Generation Computer Systems 83 (2018), 278–292.

35. Monreale, Pinelli, Trasarti, Giannotti, WhereNext: a location predictor on trajectory pattern mining, in: Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, ACM, 2009, pp. 637–646.

36. Ying J.J.-C., Lee W.-C., Tseng V.S., Mining geographic-temporal-semantic patterns in trajectories for location prediction, ACM Transactions on Intelligent Systems and Technology (TIST) 5(1) (2013), 2.

37. Ying J.J.-C., Lee W.-C., Weng T.-C., Tseng V.S., Semantic trajectory mining for location prediction, in: Proceedings of the 19th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, ACM, 2011, pp. 34–43.

38. Oner S.C., Oztaysi, An interval type 2 hesitant fuzzy MCDM approach and a fuzzy c means clustering for retailer clustering, Soft Computing 22(15) (2018), 4971–4987.

39. del Carmen Rodríguez-Hernández, Ilarri, Hermoso and Trillo-Lado, Towards trajectory-based recommendations in museums: evaluation of strategies using mixed synthetic and real data, Procedia Computer Science 113 (2017), 234–239.

40. , Zheng, Xie, Chen, Liu, Ma W.-Y., Mining user similarity based on location history, in: Proceedings of the 16th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, ACM, 2008, 34.

41. Xiao, Zheng, Luo, Xie, Finding similar users using category-based location history, in: Proceedings of the 18th SIGSPATIAL International Conference on Advances in Geographic Information Systems, ACM, 2010, pp. 442–445.

42. Brilhante, Macedo J.A., Nardini F.M., Perego, Renso, Where shall we go today?: planning touristic tours with TripBuilder, in: Knowledge Management, ACM, 2013, pp. 757–762.

43. Zheng, Xie, Learning travel recommendations from user-generated GPS traces, ACM Transactions on Intelligent Systems and Technology (TIST) 2(1) (2011), 2.

44. Zheng, Zhang, Ma, Xie, Ma W.-Y., Recommending friends and locations based on individual location history, ACM Transactions on the Web (TWEB) 5(1) (2011), 5.

45. Oner S.C., Oztaysi, Oner, An interval valued intuitionistic fuzzy location based recommendation system utilizing social platforms, in: Data Science and Knowledge Engineering for Sensing Decision Support, World Scientific, 2018, pp. 1143–1151.

46. Shaw A.A., Gopalan, Finding frequent trajectories by clustering and sequential pattern mining, Journal of Traffic and Transportation Engineering (English Edition) 1(6) (2014), 393–403.

47. Dogan, Gurcan O.F., Oztaysi, Gokdere, Analysis of frequent visitor patterns in a shopping mall, in: Industrial Engineering in the Big Data Era, Springer, 2019, pp. 217–227.

48. Lee A.J., Chen Y.-A., Ip W.-C., Mining frequent trajectory patterns in spatial–temporal databases, Information Sciences 179(13) (2009), 2218–2231.

49. Cao, Mamoulis, Cheung D.W., Mining frequent spatiotemporal sequential patterns, in: Fifth IEEE International Conference on Data Mining (ICDM’05), IEEE, 2005, pp. 8.

50. Chen, Yuan, Qiu, Pi, An indoor trajectory frequent pattern mining algorithm based on vague grid sequence, Expert Systems with Applications 118 (2019), 614–624.

51. Chen, Ross, Evaluation of gender classification methods on thermal and near-infrared face images, in: 2011 International Joint Conference on Biometrics (IJCB), IEEE, 2011, pp. 1–8.

52. , Chen, Jain A.K., Multimodal facial gender and ethnicity identification, in: Advances in Biometrics, Springer, 2005, pp. 554–561.

53. Bouadjenek, Nemmour, Chibani, Local descriptors to improve off-line handwriting-based gender prediction, in: 2014 6th International Conference of Soft Computing and Pattern Recognition (SoCPaR), IEEE, 2014, pp. 43–47.

54. Al Maadeed and Hassaine, Automatic prediction of age, gender, and nationality in offline handwriting, EURASIP Journal on Image and Video Processing 2014(1) (2014), 10.

55. , Wang, Zhang, Gait-based gender classification using mixed conditional random field, IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics) 41(5) (2011), 1429–1439.

56. , Maybank S.J., Yan, Tao, Xu, Gait components and their application to gender recognition, IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews) 38(2) (2008), 145–155.

57. Choi, Kim, Park, An on-device gender prediction method for mobile users using representative wordsets, Expert Systems with Applications 64 (2016), 423–433.

58. Agneessens, Bisio, Lavagetto, Marchese, Design and implementation of smartphone applications for speaker count and gender recognition, in: The Internet of Things, Springer, 2010, pp. 187–194.

59. D’Urso, Massari, Fuzzy clustering of human activity patterns, Fuzzy Sets and Systems 215 (2013), 29–54.

60. Klomsae, Auephanwiriyakul, Theera-Umpon, A string grammar fuzzy-possibilistic c-medians, Applied Soft Computing 57 (2017), 684–695.

61. Levenshtein V.I., Binary codes capable of correcting deletions, insertions, and reversals, Soviet Physics Doklady 10(8) (1966), 707–710.

62. Krishnapuram, Joshi, Yi, A fuzzy relative of the k-medoids algorithm with application to web document and snippet clustering, in: FUZZ-IEEE’99. 1999 IEEE International Fuzzy Systems. Conference Proceedings (Cat. No. 99CH36315), Vol. 3, IEEE, 1999, pp. 1281–1286.

63. Krishnapuram, Joshi, Nasraoui, Yi, Low-complexity fuzzy relational clustering algorithms for web mining, IEEE Transactions on Fuzzy Systems 9(4) (2001), 595–607.

64. Nikoo M.R., Kerachian, Alizadeh M.R., A fuzzy kNN-based model for significant wave height prediction in large lakes, Oceanologia 60(2) (2018), 153–168.

65. Maillo, Luengo, García, Herrera, Triguero, A preliminary study on hybrid spill-tree fuzzy k-nearest neighbors for big data classification, in: 2018 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE), IEEE, 2018, pp. 1–8.