Multi-dimensional features models and compacted clustering for ILBD (Indoor Location Big Data)

Abstract

LBS (Location Based Services) have become a “killer application” for ongoing and upcoming internet services. ILBD (Indoor Location Big Data) are extremely large multimedia data sets, and indoor location data are more complicated than outdoor data. The lack of a unified representation model and the presence of data redundancies make ILBD hard to cluster and mine for location-based value. This paper therefore proposes a new multi-dimensional features model and a compacted clustering method for ILBD. The unified ILBD model combines spatial and time features of different scales and states, and employs normalized data frames to pre-process the original data. A scalable Euclidean extending distance is designed to characterize relationships between heterogeneous data and to represent connections across different dimensions. To reduce ILBD redundancies and flaws, a compacted clustering method is proposed, which constructs a location ontology and sensation parameters to determine the main affecting elements of ILBD; sluggish elements are filtered and shrunk to decrease the amount of ILBD. The new multi-dimensional features model is applied in an LBS framework. Tests and simulations show that the proposed model improves the RMSE of convergence estimation by 36.7% and the accuracy of regional flow estimation by 12.3%, which improves the accuracy of ILBD mining and reduces ILBD redundancies and flaws.

Keywords

LBS, ILBD, data processing, clustering and mining, data models

1 Introduction

1.1 Burst demand of LBS and ILBD big multimedia data

Benefiting from the rapid development of LTE (Long Term Evolution) and big data, LBS (Location Based Services) have become a “killer application” for ongoing and upcoming internet services [1]. In particular, accurate indoor positioning is opening up much wider new internet service modes, such as O2O (Online to Offline). Imagine that you can book a discount e-voucher for an indoor restaurant found by location, that firemen can locate you near a target room in a building when a fire breaks out, or that you can monitor your baby in a hospital to prevent lapses in care. According to statistics, over 120 billion service requests were launched from online car booking in China in 2016 alone [2]. The rapidly increasing demand for indoor LBS is taking up more network throughput.

ILBD (Indoor Location Big Data) are extremely large multimedia data sets. First, the great number of service requests demands high throughput and bandwidth. It is hard to handle the transmission, analysis and mining of billions of data items, or even the protocols carried in the network, as shown in Fig. 1. Second, LBS data consist of different sequences, files, pictures and videos, which are unstructured and vary widely in size [3]. Current data models have difficulty representing the complicated correlations between locations and other information. Third, the information in LBS data is often redundant: a location can be found in the calculated coordinates, in pictures users took or in what users said. LBS data therefore need extensive clustering and simplification. Finally, LBS data from heterogeneous networks are not delay tolerant: a moving terminal has already left the location coordinate obtained a few seconds earlier, which causes low LBS QoS (Quality of Service) and vague recommendations.

Fig.1

ILBD big multimedia data.

1.2 LBD data values and data processing

Research on LBS big multimedia data has attracted considerable attention from academia and institutions worldwide. Outdoor LBS data were the earliest cases to be studied. The location of a target is the first thing users want to know, so coordinates are calculated as the basic data [4]. With the development of GNSS (Global Navigation Satellite System), coordinate information has been combined with geographic information to serve outdoor driving navigation, geospatial mapping and missile guidance. The introduction of customer relationships and payment information turned the original LBS data into business services [5]. Gradually, groups of business services have been bound to these data to make more profit. Product and service recommendations, combined with sellers’ information, shopping mall discounts and the advantages of goods, give merchants smarter and more convenient access to customers around offline locations. After a period of location tracking, the behavior of an active customer is recorded; combined with an indoor location map showing the distribution of shops and corridors, the patterns of customer shopping behavior can be worked out. With the development of new internet applications and services such as WeChat, microblogging and mobile payment, mobile social networks increasingly fuse text, pictures and video with location to provide more valuable information. Such ILBD are usually unstructured, and their value is sparse because the location information is fragmented.

In order to further mine the value of ILBD, more effort has been devoted to compensating for unsatisfactory source location information [5, 14]. Data pre-processing was first exploited to strengthen the associations between location attributes by increasing the detail of maps. Grid-based region partition groups all information within specified coordinate ranges, which was an early attempt at ILBD clustering. A data filtering scheme based on road distribution patterns was proposed to relate paths and enhance navigation performance, improving the accuracy of location estimation through path prediction. DBSCAN (Density-Based Spatial Clustering of Applications with Noise) for indoor LBS makes it easier to mine data into popular shopping areas, taking the attributes of data around crowds into account [6]. Thiessen diagrams were introduced for location data to calculate the optimal partition scheme for spatial clustering. Many geographical features are taken into account to enhance the relevance between ILBD, which improves data utilization and reduces redundancy. Deeper features can be fused with several original features to achieve more precise LBS [7]. Regional static characteristics can be grouped within specified coordinate ranges, covering not only map information but also roads, buildings, traffic facilities and gardens. Common features are extracted from those static characteristics to strengthen the correlation and provide an early foundation for handling sparse data. Regional transportation characteristics combine the movement features of target humans or vehicles to generate important dynamic information that represents crowd movement patterns [8]. Movement behavior information can be widely used in monitoring and management. Individual movement patterns describe how one person moves within a crowd, expressed as the mean value of the state transition probabilities.

Dimensionality reduction makes ILBD mining more efficient; it exploits hyperreal numbers or principal component regression to reduce the order and rank of the feature matrix. Combining spatial and time features contributes to better prediction performance: time feature correlation methods such as hidden Markov models, conditional Markov models and conditional random fields supplement the messages lost in discrete data collection, while spatial feature correlations such as the Euclidean distance and the Mahalanobis distance provide spatial feature convergence to reduce the number of spatial samples. These data processes can cluster certain ILBD to reduce matrix dimensions and enhance data value [9, 11, 13].

However, indoor location data are more complicated than outdoor data. Indoor positioning requests are thousands of times more numerous than before, which demands more ILBD processing and computing resources. The differences in scale among regional static characteristics are greatly amplified, and heterogeneous spatial feature correlations are hard to analyze in a unified model. There are so many data redundancies and flaws that much of the ILBD has to be re-clustered and re-mined to improve the data value.

1.3 Brief state of art in proposed ILBD models

Therefore, focusing on the serious problems caused by the lack of a unified representation model and by data redundancies in ILBD, this paper proposes a new multi-dimensional features model and a compacted clustering method for ILBD. The unified ILBD model combines spatial and time features of different scales and states, and employs normalized data frames to pre-process the original data. A scalable Euclidean extending distance is designed to characterize relationships between heterogeneous data and to represent connections across different dimensions. To reduce ILBD redundancies and flaws, a compacted clustering method is proposed, which constructs a location ontology and sensation parameters to determine the main affecting elements of ILBD; sluggish elements are filtered or shrunk to decrease the amount of ILBD. The new multi-dimensional features model is applied in an LBS framework. Tests and simulations verify that the proposed model performs better, improving the accuracy of ILBD mining and reducing ILBD redundancies and flaws.

2 Multi-dimensional features model with normalized data frames and scalable Euclidean extending distance

Unlike outdoor data feature models, ILBD models must cope with far more heterogeneous spatial and time data, because indoor location data are more complicated than outdoor data. New ILBD models therefore have to satisfy the demands below.

2.1 ILBD models challenges and demands

Great capacity of data features: numerous devices generate thousands of LBS requests. When IoT (Internet of Things) devices such as cellphones, RFID tags, sensors, Bluetooth devices, computers and household appliances connect to each other through location finding, great capacity is necessary to hold the numerous data features.

Extensible dimensional distance: the same feature can be extracted from different data types in different dimensions. In LBS, coordinates certainly show locations, but locations can also be worked out by looking at pictures or described in speech. The relationships between locations and other features can therefore be extended to different semantic dimensions.

Scalable data representation: like a map, LBS are represented in a full-information map. Users can view a panorama of ILBD, which is rough but brief. At a larger scale, more detailed information, not only locations, is shown in the full-information map [14].

2.2 Multi-dimensional features model

In order to describe heterogeneous, unstructured and numerous ILBD, the multi-dimensional features model is designed as a collection of feature sets. The multi-dimensional features are summarized by the following concepts.

Definition 1. Denote the indoor location big data set as IBLD = {L, T, S, E_S, A, E_A}. Location information L can be extracted from coordinates, pictures, videos or voices. Time features T provide specified periods; a series of recorded coordinate points lines up as a moving trail. Correlated subject features S are a necessary part of LBS and can be grouped to deliver complete services. E_S denotes the relationships between locations and subject features, which together make up the different contents of LBS. A is the set of attributes of subject features, and E_A denotes the relationships between subject features and attributes. The multi-dimensional features model is shown in Fig. 3.

Fig.2

ILBD models challenges and demands.

Fig.3

Multi-dimensional Features Model.

As shown in Fig. 3, the multi-dimensional features model is constructed in the shape of a tree. Location information L is treated as the domain ontology. That is, the main body of LBS is location information; once the location is determined, the basic structure of the ILBD is fixed. One location node can have many features. Subject features S link directly to location information L through relationships E_S; the correlation distance and angle of E_S are designed to characterize the relationships between locations and subject features, which provides a three-dimensional features plane. Each subject feature has several attributes A that describe its current condition, and E_A represents the relationships between each subject feature and its attributes. The correlation distance and angle of E_A extend the representation dimensions of ILBD. Above all, the multi-dimensional features model captures both exact relationships and extensible dimensions of ILBD at the same time. The concept ontology of the big data sets is described as follows.

Location information L = {L₁, L₂, ⋯, L_n} denotes n different types of location information, extracted from coordinates, pictures, videos or voices. Adjacent locations are also recorded.

Time features T = {T₁, T₂, ⋯, T_n}: all situations change over the different periods T_n. In addition, location information must be real-time data for ILBD to be valuable. ILBD are therefore transformed into the time-indexed sequence $IBLD = \langle IBLD_{T_{1}}, IBLD_{T_{2}}, \cdots, IBLD_{T_{n}} \rangle$.

Relationships E_S = {E_S (d), E_S (θ)}: distance d and angle θ are the correlation degree and deviation degree between locations and subject features. E_S can be described by relation operators, such as the compose operator, inclusion operator and effect operator.

Subject features S = {S₁, S₂, ⋯, S_n} comprise different heterogeneous information, such as users, customers, merchants, goods, services and maps, which can be divided into sub-features.

Attributes A = {A₁, A₂, …, A_n} contain the size, capacity, number, speed, direction, height, weight and so on. The attributes are mutually independent, but each is related to the central location node.

Relationships E_A = {E_A (d), E_A (θ)} can likewise be a control operator, need operator, assistance operator, interference operator and so on.
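The set {L, T, S, E_S, A, E_A} of Definition 1 can be pictured as a small tree-shaped record rooted at a location node. The following is only a minimal sketch under stated assumptions: the class names `Relation`, `Subject` and `ILBDRecord`, the field names, and the helper `group_by_period` are hypothetical and do not come from the paper; locations are simplified to 2-D coordinates and time to a single timestamp.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Relation:
    """A relationship E_S or E_A: correlation distance d, deviation angle theta, operator label."""
    d: float
    theta: float
    operator: str  # e.g. "compose", "inclusion", "effect" (operator names from the text)


@dataclass
class Subject:
    """A subject feature S_i with its attributes A and relationships E_A (keyed by attribute name)."""
    name: str
    attributes: Dict[str, float] = field(default_factory=dict)
    attr_relations: Dict[str, Relation] = field(default_factory=dict)


@dataclass
class ILBDRecord:
    """One ILBD item rooted at a location node (hypothetical layout)."""
    location: Tuple[float, float]                                        # L
    timestamp: float                                                     # T, in seconds
    subjects: List[Subject] = field(default_factory=list)                # S
    subj_relations: Dict[str, Relation] = field(default_factory=dict)    # E_S, keyed by subject name


def group_by_period(records, period):
    """Split records into time slices, one list per period index."""
    slices = {}
    for rec in records:
        slices.setdefault(int(rec.timestamp // period), []).append(rec)
    return slices


# Usage: three records grouped into 60-second periods.
shop = Subject("shop", {"size": 120.0}, {"size": Relation(0.2, 0.1, "control")})
records = [
    ILBDRecord((1.0, 2.0), 10.0, [shop], {"shop": Relation(0.5, 0.3, "inclusion")}),
    ILBDRecord((1.5, 2.0), 70.0),
    ILBDRecord((3.0, 4.0), 80.0),
]
by_period = group_by_period(records, 60.0)
```

Grouping by period mirrors the time-indexed sequence described for the time features T; a real system would likely index by location node first.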

2.3 Services to elements models transforming methods

The multi-dimensional features model is abstracted through progressive logic layers. From services to elements, the model transforming method consists of three steps. The procedure is illustrated in Fig. 4.

Fig.4

Services to elements models transforming methods.

As shown in Fig. 4, the multi-dimensional features model is divided into four layers: the services layer, data layer, feature layer and element layer. The services layer directly presents the composition of LBS, which have complicated service flows from which basic elements are captured. Each LBS service comprises words, pictures, videos and voices. We first extract data from LBS services and classify all types of data [16].

After the services-to-data transformation, we apply the data-to-features mapping. Each data item contains several features such as location, speed, operation, target and result, which are essential for the multi-dimensional features model. For example, speed is the average distance covered between two locations per unit time. The features-to-elements mapping is the last step, which separates the important ontology elements from the feature layer. The multi-dimensional features model is then described in terms of subject, attribute, relationship and extension, in order to represent ILBD.
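As an illustration of the data-to-features mapping, the speed feature mentioned above can be derived from two timestamped positions. This is a sketch under assumptions: each sample is taken to be an `(x, y, t)` tuple in meters and seconds, and the function names `average_speed` and `trail_speeds` are hypothetical.

```python
import math


def average_speed(p1, p2):
    """Average speed in m/s between two samples (x, y, t)."""
    (x1, y1, t1), (x2, y2, t2) = p1, p2
    dt = t2 - t1
    if dt <= 0:
        raise ValueError("timestamps must be strictly increasing")
    return math.hypot(x2 - x1, y2 - y1) / dt


def trail_speeds(trail):
    """Speed feature for every consecutive pair of samples in a moving trail."""
    return [average_speed(a, b) for a, b in zip(trail, trail[1:])]


# Usage: a short trail of three samples gives two speed values.
speeds = trail_speeds([(0.0, 0.0, 0.0), (3.0, 4.0, 5.0), (3.0, 4.0, 10.0)])
```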

3 Compacted clustering method for reducing ILBD redundancies and flaws

Thanks to global efforts in data clustering and mining, the density of data value has been greatly improved. In contrast to outdoor LBS, however, the lack of a unified representation model and data redundancies make ILBD hard to cluster and mine for location-based value. The unified ILBD model combines spatial and time features of different scales and states, and employs normalized data frames to pre-process the original data [17]. A scalable Euclidean extending distance is designed to characterize relationships between heterogeneous data and to represent connections across different dimensions. To reduce ILBD redundancies and flaws, a compacted clustering method is proposed, which constructs a location ontology and sensation parameters to determine the main affecting elements of ILBD; sluggish elements are filtered or shrunk to decrease the amount of ILBD.

3.1 Scalable Euclidean extending distance

We design a scalable Euclidean extending distance D_i,k for ILBD clustering. For the indoor location big data set IBLD = {L, T, S, E_S, A, E_A}, let IBLD_u and IBLD_v be two specific ILBD items. Location information L and relationships E_S and E_A are measured by Euclidean distances with weighting parameters, as shown in Fig. 5.

Fig.5

Scalable Euclidean extending distance for ILBD clustering.

The Euclidean extending distance D_i,k for ILBD is calculated as $D_{i, k} = α_{1} d_{l} (L) + α_{2} d_{l} (E_{S}) + α_{3} d_{l} (E_{A})$ (1)

α₁, α₂, α₃ are weighting parameters that adjust the influence of the three features. Location information L is densely distributed and highly similar. We therefore apply semantic analysis to refine the mapping of data items to location information L; the root mean square variation of the whole correlation distance is $δ (L) = \frac{1}{n} \sqrt{\sum_{i = 0}^{n} \sum_{j = 0}^{n} {\left( \int_{0}^{T} {(L_{l} (t) * S_{i} (t))}^{2} dt - \int_{0}^{T} {(L_{l} (t) * S_{j} (t))}^{2} dt \right)}^{2}}$ (2)

It is also widely understood that the time period T plays a significant role in ILBD, so location information L is no longer effective after a certain delay. We calculate the total correlation value between location and subject features in one ILBD case, and take the sensitive information entropy ratio of δ (L) under the correlation probability p_l, that is, $d_{l} (L) = \frac{p_{l} \log_{2} δ (L)}{- \sum_{0}^{l} p_{l} \log_{2} δ (L)}$ (3)

The root mean square variation of the relationships d_l (E_S) between locations and subject features is closely tied to the chained target function J_m,w [18]. We set a sensitivity parameter to calculate an appropriate d_l (E_S), $d_{l} (E_{S}) = f^{- 1} (\frac{\partial^{2} J_{m, w}}{\partial L \partial t} = 0 | min_{L, T} δ (L))$ (4)

In LBS, what customers care about changes the value of ILBD; in addition, the suitable relationships change with different targets, regions or periods. We take the extremum of the target function J_m,w, which contains several customer aims under location L and time t. Constrained by the minimum of d_l (L), d_l (E_S) is obtained from the inverse function of the extremum of the target function.

Relationships E_A are limited to specific subject features S, and the self-correlation between attributes contributes more to the cross-correlation between features and attributes. We therefore define the mixed root mean square d_l (E_A) as $d_{l} (E_{A}) = \sqrt{1 - \exp ({∥ A_{i} - A_{j} ∥}^{2})}$ (5)
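A numerical sketch of Equation (5) follows. It rests on one explicit assumption that the text does not state: the exponent is taken with a negative sign, i.e. the Gaussian-kernel form $\sqrt{1 - \exp(-{∥ A_{i} - A_{j} ∥}^{2})}$, so that the radicand stays in [0, 1) for real-valued attribute vectors. The function name `attribute_distance` is hypothetical.

```python
import math


def attribute_distance(a_i, a_j):
    """Kernel-style distance between two attribute vectors.

    ASSUMPTION: negative exponent, sqrt(1 - exp(-||a_i - a_j||^2)),
    chosen so the value is real and lies in [0, 1).
    """
    if len(a_i) != len(a_j):
        raise ValueError("attribute vectors must have the same length")
    sq_norm = sum((x - y) ** 2 for x, y in zip(a_i, a_j))
    return math.sqrt(1.0 - math.exp(-sq_norm))


# Usage: identical attributes give 0; distant attributes approach 1.
same = attribute_distance([1.0, 2.0], [1.0, 2.0])
far = attribute_distance([0.0, 0.0], [3.0, 4.0])
```

Under this assumption the distance saturates for very different attribute vectors, which keeps a single outlying attribute from dominating the weighted sum in Equation (1).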

Finally, d_l (L), d_l (E_S) and d_l (E_A) are substituted into Equation (1) to obtain the value of the Euclidean extending distance for ILBD.
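Equation (1) is a plain weighted sum, which can be sketched as below. The function name `extending_distance` and the default weights (0.5, 0.3, 0.2) are illustrative assumptions; the paper does not report the values of α₁, α₂, α₃.

```python
def extending_distance(d_L, d_ES, d_EA, alphas=(0.5, 0.3, 0.2)):
    """Equation (1): D = a1*d_l(L) + a2*d_l(E_S) + a3*d_l(E_A).

    The default weights are hypothetical placeholders, not values from the paper.
    """
    a1, a2, a3 = alphas
    return a1 * d_L + a2 * d_ES + a3 * d_EA


# Usage: three component distances combined with the placeholder weights.
D = extending_distance(1.0, 2.0, 3.0)
```

Raising one weight relative to the others makes the corresponding feature dominate the clustering distance, which is how the weights adjust the influence of the three features.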

3.2 Target function with extending location relationships

The value of ILBD always varies with different targets. In a shopping mall, not all customers are eager to buy; going to dinner, using the toilet or monitoring are also essential purposes. A target function with extending location relationships is therefore proposed to construct the clustering threshold. Based on the location ontology, ILBD have a higher data density; that is, farther location points are filtered first, and near location points are used in the construction of the chained target function J_m,w.

We take the chained target function J_m,w as a matrix of hyperplane features,

$\begin{array}{l} \min J_{m, w} (L, T, U, V) = \\ \sum_{i = 1}^{l} (1 - \exp (| | l_{i} - x_{j} | |^{2})) \\ (\sum_{i = 1}^{c} \sum_{k = 1}^{n} u_{i k}^{m} t_{i k}^{w} D_{i, k}) \\ + \sum_{i = 1}^{c} Z_{i} \sum_{k = 1}^{n} u_{i k}^{m} {(1 - t_{i k})}^{w} {(1 - l_{i})}^{w} \end{array}$ (6)

For m > 1 and 0 ≤ u_ik, t_ik ≤ 1, where u_ik and t_ik are the spatial and time memberships of ILBD, we have $\sum_{i = 1}^{c} u_{ik} = 1$. We construct the Lagrange equation to expand the minimum of the chained target function J_m,w, $\begin{array}{l} \min J_{m, w} \hat{(L, T, U, V)} = \\ \sum_{i = 1}^{c} \sum_{k = 1}^{n} \sum_{i = 1}^{l} (1 - \exp (| | l_{i} - x_{j} | |^{2})) u_{i k}^{m} t_{i k}^{w} D_{i, k} \\ + \sum_{i = 1}^{c} Z_{i} \sum_{k = 1}^{n} u_{i k}^{m} (1 - t_{i k})^{w} {(1 - l_{i})}^{w} \\ \cdot \sum_{i = 1}^{l} (1 - \exp (| | l_{i} - x_{j} | |^{2})) - T \sum_{i = 1}^{c} Z_{i} u_{i k} l_{i} - 1 \end{array}$ (7)

Setting the partial derivatives of Equation (7) to zero, we obtain $u_{ik} = {[\sum_{i = 1}^{c} {(\frac{t_{ik}^{w - 1} (D_{i, k})^{2}}{t_{jk}^{w - 1} (1 - exp (| | l_{i} - x_{j} | |^{2}))})}^{\frac{1}{m - 1}}]}^{- 1}$ (8) $t_{ik} = {[1 + {(\frac{{(D_{i, k})}^{2}}{Z_{i}})}^{\frac{1}{w - 1}}]}^{- 1}$ (9) $l_{i} = \frac{\sum_{k = 1}^{n} u_{ik}^{m} t_{ik}^{w} {(D_{i, k})}^{2}}{\sum_{k = 1}^{n} u_{ik}^{m} t_{ik}^{w} (1 - exp (| | l_{i} - x_{j} | |^{2}))}, \forall i, j$ (10)
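Of the three update rules, Equation (9) is the simplest to evaluate directly. The sketch below assumes D_i,k and Z_i are already available as plain numbers; the function name `time_membership` and the default w = 2 are illustrative assumptions.

```python
def time_membership(D_ik, Z_i, w=2.0):
    """Equation (9): t_ik = 1 / (1 + (D_ik^2 / Z_i)^(1 / (w - 1)))."""
    if w <= 1.0:
        raise ValueError("w must be greater than 1")
    if Z_i <= 0.0:
        raise ValueError("Z_i must be positive")
    return 1.0 / (1.0 + (D_ik ** 2 / Z_i) ** (1.0 / (w - 1.0)))


# Usage: membership is 1 at zero distance and decreases as the distance grows.
values = [time_membership(d, Z_i=1.0) for d in (0.0, 1.0, 2.0)]
```

In this form Z_i acts as a scale: the membership equals 0.5 exactly when the squared distance equals Z_i.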

3.3 Clustering method for reducing ILBD redundancies and flaws

Equations (3–5) are substituted into Equations (8–10) for all values of i and k. On this basis we design a clustering method for reducing ILBD redundancies and flaws. The procedure of compacted clustering for ILBD is as follows.

Step 1. Randomly select several location points from the ILBD and set {L₁, L₂, …, L_l} as the initial centers of the clustering spheres. m and w are configured as fixed values [21], with 1 < m, w < +∞. The initial iteration number is r = 1, and the maximum number of iterations is r_max = d_l (E_S (d) · tan θ) * d_l (E_A (d) · tan θ).

Step 2. Add a neighbor set {L_1+q, L_2+q, ⋯ , L_l+q} to the initial set, so that we have {L₁, L₂, L_1+q, L_2+q, ⋯ , L_l+q, ⋯ , L_l}, which is a new ILBD set. In addition, initialize the spatial and time memberships of ILBD, and obtain the value of the Euclidean extending distance for ILBD from Equation (1). D_i,k takes the heterogeneous ILBD data structure into account. If the target distance $argmin \sum_{i = 1}^{l} (1 - exp (| | l_{i} - x_{j} | |^{2}))$ is larger than D_i,k, the target location points [23] are ignored as isolated by long distance. Location points within the scope of the effective clustering range are retained.

Step 3. According to the changes caused by the added location points and the new ILBD set, the center of the clustering sphere is updated, that is, $Z_{i} = K \frac{\sum_{k = 1}^{n} l_{i} u_{ik}^{m} {(D_{i, k})}^{2}}{\sum_{k = 1}^{n} u_{ik}^{m}}$ , K > 0; the spatial and time memberships of ILBD thus provide a new center of the clustering sphere.

Step 4. Use Equation (8) to update the spatial membership U, and Equation (9) to update the time feature T and the attribute feature V. Constrained by the chained target function J_m,w, we iterate from Step 2 to Step 3 until Z_i converges and the center of the clustering sphere no longer moves.
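The overall shape of Steps 1 to 4 (random initial centers, membership update, center update, repeat until the centers stop moving) can be sketched with a plain fuzzy c-means loop. This is a simplified stand-in, not the compacted clustering method itself: it uses the squared Euclidean distance between coordinates in place of D_i,k, and it omits the time membership t_ik, the neighbor-set expansion of Step 2 and the long-distance filtering. The function names and default parameters are hypothetical.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two coordinate tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def fuzzy_cluster(points, c, m=2.0, r_max=100, tol=1e-6, seed=0):
    """Plain fuzzy c-means; returns (centers, memberships u[k][i])."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, c)]  # random initial centers
    dim = len(points[0])
    u = []
    for _ in range(r_max):
        # Membership update: u_ik proportional to (1 / d_ik^2)^(1 / (m - 1)).
        u = []
        for p in points:
            inv = [(1.0 / max(sq_dist(p, z), 1e-12)) ** (1.0 / (m - 1.0)) for z in centers]
            total = sum(inv)
            u.append([v / total for v in inv])
        # Center update: mean of the points weighted by u_ik^m.
        new_centers = []
        for i in range(c):
            weights = [row[i] ** m for row in u]
            w_sum = sum(weights)
            new_centers.append(
                [sum(wk * p[j] for wk, p in zip(weights, points)) / w_sum for j in range(dim)]
            )
        shift = max(sq_dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:  # centers no longer move
            break
    return centers, u


# Usage: two well-separated groups of indoor positions (hypothetical coordinates).
pts = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centers, memberships = fuzzy_cluster(pts, c=2)
labels = [row.index(max(row)) for row in memberships]
```

Swapping `sq_dist` for a function that evaluates the extending distance of Equation (1) would be the natural extension point for heterogeneous ILBD.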

4 Experiment and analysis

4.1 Experimental environment

We have implemented indoor positioning systems in several shopping malls in China, which exploit TC-OFDM (Time-Code Orthogonal Frequency Division Multiplexing) [19] to provide precise indoor location information, accurate to within 3 meters. All location information is collected by LBS data center servers along with other shopping and customer information. We group five servers into a cluster computing environment, with Hadoop installed to cluster and mine location-based value. The master server has a 4.2 GHz CPU and 8 GB of memory; the four slave servers each have a 2.7 GHz CPU and 4 GB of memory. Each server has 300 GB of hard disk capacity. The experimental environment is shown in Fig. 6.

Fig.6

ILBD clustering experimental environment.

We collected about 500 million ILBD records in a shopping mall over 3 days. The unified ILBD model, which combines spatial and time features of different scales and states, is applied, and the compacted clustering method reduces ILBD redundancies and flaws. Crowd situation analysis and regional passenger flow analysis are observed in our tests.

4.2 Convergence & divergence situation analysis

Convergence and divergence situation analysis aims at calculating the distribution of locations where customers gather. The precise indoor location information is organized by the multi-dimensional features model. After data redundancies and flaws are reduced by the compacted clustering method, ILBD are divided into N regions. The crowd situation model is based on the customer visit probability $P_{r} (p_{j}) = N_{i}^{j} / \sum_{p_{j} \in r_{i}} N_{i}^{j}$ , where $N_{i}^{j}$ is the observed number of moving customers p_j in a specific region r_i. Convergence and divergence are limited to k periods, and the convergence and divergence ratio is $CTD = \left( \sum_{day_{i}} P_{r} (p_{i}) C^{day_{i}} / D^{day_{i}} \right) / day_{i}$ , where $C^{day_{i}}$ and $D^{day_{i}}$ are the numbers of converging and diverging customers. The ILBD convergence and divergence situations are shown in Fig. 7.
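The visit probability and the convergence and divergence ratio can be sketched as follows. Two assumptions are made here: the trailing division by day_i in the CTD formula is read as an average over the number of observed days, and each day is summarized by a triple (P_r, C, D). The function names and the sample numbers are hypothetical.

```python
def visit_probability(counts):
    """P_r(p_j) = N_i^j / sum_j N_i^j for the observations in one region."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("region has no observations")
    return {customer: n / total for customer, n in counts.items()}


def ctd(daily):
    """Average over days of P_r * C / D.

    ASSUMPTION: `daily` is a list of (visit probability, converging count,
    diverging count) triples, one per day, and the result is divided by the
    number of days.
    """
    if not daily:
        raise ValueError("at least one day is required")
    return sum(p * c / d for p, c, d in daily) / len(daily)


# Usage with made-up counts for one region over two days.
probs = visit_probability({"p1": 3, "p2": 1})
ratio = ctd([(0.5, 40, 20), (0.25, 10, 20)])
```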

Fig.7

ILBD Convergence & Divergence Situation.

In Fig. 7, we can see the different convergence and divergence situations at three different moments: weekday daytime, weekday night and weekend night. Darker regions denote more crowded convergence situations. On weekday daytime, customers tend to gather closer to restaurants.

Likewise, clothing shops, watch stores, cafés and restaurants are the most popular according to ILBD data clustering and mining analysis. We also compare with other location-based data clustering methods, namely the fusion of uncertain location data from heterogeneous sources in reference [10] and home location inference from sparse and noisy data in reference [11]. Over a large number of experimental samples, the accuracy of the convergence and divergence situation estimations is shown in Fig. 8.

Fig.8

Performance of ILBD convergence & divergence situation estimation.

As shown in Fig. 8(a), the convergence and divergence situations differ markedly at different times. We take manual statistics as the reference value. At the same moment, the multi-dimensional features model for ILBD estimates the convergence and divergence situation better than the method in reference [10]. The real-time estimates of the proposed model are closer to the reference value from manual statistics.

This is because the scalable data representation of the multi-dimensional features model for ILBD provides a direct transformation from customer situations to the data model, which reduces the processing delay. Figure 8(b) shows the estimation RMSE (Root Mean Square Error) of different ILBD models and clustering methods for different numbers of experimental samples. With the same 20 million experimental samples, the estimation RMSE of the proposed compacted clustering method is 0.179, which is 36.7% and 52.6% better than that of reference [10] and reference [11], respectively. The reason is that the compacted clustering method constructs a location ontology and sensation parameters to determine the main affecting elements of ILBD, which reduces the data redundancies.
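For readers who want to reproduce this kind of comparison on their own data, the RMSE against a manually counted reference and the relative improvement between two methods can be computed as sketched below. The function names are hypothetical, the sample values are made up, and the relative improvement is assumed to mean a relative reduction in RMSE.

```python
import math


def rmse(estimates, reference):
    """Root mean square error between estimates and reference values."""
    if len(estimates) != len(reference) or not estimates:
        raise ValueError("inputs must be non-empty and of equal length")
    return math.sqrt(sum((e - r) ** 2 for e, r in zip(estimates, reference)) / len(estimates))


def relative_reduction(baseline_rmse, proposed_rmse):
    """Fraction by which the proposed RMSE is lower than the baseline RMSE."""
    return (baseline_rmse - proposed_rmse) / baseline_rmse


# Usage with made-up head counts: manual reference vs. two estimators.
manual = [100.0, 120.0, 80.0]
err_a = rmse([110.0, 130.0, 90.0], manual)
err_b = rmse([105.0, 125.0, 85.0], manual)
gain = relative_reduction(err_a, err_b)
```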

4.3 Regional flow analysis

Regional flow analysis aims at obtaining statistical data to clearly analyze how customers flow and linger in each region. In Fig. 7, darker regions also denote longer lingering time in a specific region, which means a lower flowing probability of customers. In fact, the shopping and resting areas attract more customers to linger, while customers do not want to stay around the entrances and exits. For comparison with other location-based data clustering methods, namely the annotation of activity purposes using machine learning algorithms in reference [15] and the new spatial transformation scheme in reference [20], the experimental estimation data of regional flow analysis are collected and processed by the Monte Carlo method, and the root mean square error of subjective speech quality is analyzed, as shown in Fig. 9.

Fig.9

Performance of ILBD regional flow analysis estimation.

In Fig. 9(a), we can see the degree of deviation error in the estimates produced by the different data models and clustering methods. The proposed multi-dimensional features model for ILBD has a lower degree of deviation error than those of reference [15] and reference [20] in regions r_1, r_2 and r_3. There are few deviated location estimation points in the ILBD model. This is because the scalable Euclidean extending distance effectively adjusts the influence of subject, attribute and relationship, which improves the representation of complicated ILBD memberships. The estimation accuracy of regional flow analysis is shown in Fig. 9(b). As the number of experimental samples increases, the estimation accuracy of all three methods degrades. With the same number of experimental samples, the proposed multi-dimensional features model for ILBD has better estimation accuracy. When the number of experimental samples is 40 million, the proposed method achieves 55.61% estimation accuracy, which is 12.3% higher than that of reference [20]. The reason is that the target function with extending location relationships can reduce the data flaws in ILBD.

5 Conclusion

This paper contributes a novel multi-dimensional features model and a compacted clustering method for ILBD. The unified ILBD model combines spatial and time features of different scales and states, and employs normalized data frames to pre-process the original data. A scalable Euclidean extending distance is designed to characterize relationships between heterogeneous data and to represent connections across different dimensions. To reduce ILBD redundancies and flaws, a compacted clustering method is proposed, which constructs a location ontology and sensation parameters to determine the main affecting elements of ILBD; sluggish elements are filtered or shrunk to decrease the amount of ILBD. The new multi-dimensional features model is applied in an LBS framework. Tests and simulations verify that the proposed model performs better, improving the accuracy of ILBD mining and reducing ILBD redundancies and flaws.

Acknowledgments

This work was supported by the China National support program (Grant No. 2014BAD10B06-03) and the National Natural Science Foundation of China (Grant No. 61271182, 61401040 and 61372110).

References

1. …, Zhang and Huang, A down-to-earth cloud computing: Location-based service, Acta Electronica Sinica 42(4) (2014), 786–790.

2. Guo, Liu, Fang, et al., Value extraction and collaborative mining methods for location big data, Journal of Software 25(4) (2014), 713–730.

3. …, Li, Bao, et al., Top-k spatio-textual similarity join, IEEE Transactions on Knowledge and Data Engineering 28(2) (2016), 551–565.

4. Wang and Li, Representation and organization for spatial data in LBS, Journal of Earth Science 25(3) (2014), 544–549.

5. Liu, Zhou and Zhang, Estimating users’ home and work locations leveraging large-scale crowd-sourced smartphone data, IEEE Communications Magazine 53(3) (2015), 71–79.

6. Milojevic and Rakocevic, Location aware data aggregation for efficient message dissemination in vehicular ad hoc networks, IEEE Transactions on Vehicular Technology 64(12) (2015), 5575–5583.

7. Komai, Sasaki, Nara and Nishio, k nearest neighbor search for location-dependent sensor data in MANETs, IEEE Access 3 (2015), 942–954.

8. Shan and Zhu, Camera location for real-time traffic state estimation in urban road network using big GPS data, Neurocomputing 169 (2015), 134–143.

9. Meek, Jackson and Leibovici, A BPMN solution for chaining OGC services to quality assure location-based crowdsourced data, Computers and Geosciences 87 (2016), 76–83.

10. Zhang, Trajcevski and Liu, Towards fusing uncertain location data from heterogeneous sources, Geoinformatica 20(2) (2016), 179–212.

11. …, Luo, Kautz and Sadilek, Home location inference from sparse and noisy data: Models and applications, Frontiers of Information Technology and Electronic Engineering 17(5) (2016), 389–402.

12. Xiong, Zhu, Niu and Li, A differentially private algorithm for location data release, Knowledge and Information Systems 47(3) (2016), 647–669.

13. Park, An efficient scalable spatial data search for location-aware mobile services, Journal of Information Science and Engineering 31(1) (2015), 165–178.

14. Kelly, Smyth and Caulfield, Uncovering measurements of social and demographic behavior from smartphone location data, IEEE Transactions on Human-Machine Systems 43(2) (2013), 188–198.

15. Liu, Janssens, Wets and Cools, Annotating mobile phone location data with activity purposes using machine learning algorithms, Expert Systems with Applications 40(8) (2013), 3299–3311.

16. Peng, Wang and Ouyang, Approximation of discrete spatial data for continuous facility location design, Integrated Computer Aided Engineering 21(4) (2014), 311–320.

17. Martinez, Yamamoto and Takahashi, Pace-based clustering of GPS data for inferring visit locations and durations on a trip, IEICE Transactions on Information and Systems E97D(4) (2014), 663–672.

18. Mun, Kim, Shilton, et al., PDVLoc: A personal data vault for controlled location data sharing, ACM Transactions on Sensor Networks 10(4) (2014), 58.

19. Deng, Yu, Yuan, Wan and Yang, Situation and development tendency of indoor positioning, China Communications 10(3) (2013), 42–55.

20. Yoon, Kim, Jang and Chang, A new spatial transformation scheme for preventing location data disclosure in cloud computing, International Journal of Data Warehousing and Mining 10(4) (2014), 26–49.

21. Jamil M.K., Computing eccentric version of second Zagreb index of polycyclic aromatic hydrocarbons PAHk, Applied Mathematics & Nonlinear Sciences 2(1) (2016), 247–252.

22. Jiang, Zhang, Gao and Wu, Fault detection, identification, and location in smart grid based on data-driven computational methods, IEEE Transactions on Smart Grid 5(6) (2014), 2947–2956.

23. Costamagna, A model for the operations to render epidemic-free a hog farm infected by the Aujeszky disease, Applied Mathematics & Nonlinear Sciences 1(1) (2016), 207–208.