Mining spatial colocations from image-objects: A tensor factorization approach

Abstract

The spatial colocation problem differs fundamentally from the traditional association rule problem, as it operates on spatial data rather than on conventional transaction data. In this work, a spatial colocation mining framework is proposed that mines spatial colocations of image-objects present in images using a tensor factorization approach. The framework takes in image data directly, tensorizes it and performs the mining task, thus eliminating the need to convert the data into a transaction-based representation. An interestingness measure called spatial dominance is also proposed in this work. This measure is an indicator of the prevalence of the mined colocation pattern. Algorithms are designed in this framework, first to map the classified pixels to image-objects, which is a pre-stage before mining, and second to find spatial colocation patterns. Experimental results are provided to show the strength of the spatial colocation mining algorithm.

Keywords

Data mining, spatial colocation, tensors, image-objects

1. Introduction

Colocation pattern mining is a member of the data mining family and is the process of finding patterns that are located together or in close proximity. Mining of colocations in the spatial domain is referred to as spatial colocation pattern mining. Colocation pattern mining yields important insights in application domains like environment monitoring [1], earth science [2], mobile services [3], public safety [4] and urban facility analysis [5]. The task of colocation pattern mining is challenging for two reasons: (a) the features of the data under study are embedded in continuous space, in contrast to the traditional discretized transaction structure, and (b) a number of spatial relationships exist between the features of the data, resulting in considerable computation time for finding the significant colocation instances and patterns. These challenges become more difficult when the data varies from ordinary text data to complex media data.

Generally, spatial colocations are mined in transaction databases, where each spatial instance is modeled as a row in a table. The usual approach to finding spatial colocation patterns is to find spatial colocation instances and generalize them to patterns, based on interestingness measures. This results in enormous computation time, as the instances have to be discovered, modeled appropriately and then reduced to obtain patterns. The complexity increases further for image data, for which no transaction databases exist. Building a transaction database for image data requires manual intervention. Hence there is a crucial need to find spatial colocations from image data without the aid of a transaction database. Another reason for considering image data is that today’s world is witnessing an overflow of image data from various domains. Analyzing the patterns present in this image data is therefore an important task for the spatial research community.

The spatial colocation mining algorithm proposed in this research is based on the concept of tensors, which are multi-way arrays. The tensor data structure can capture the various kinds of spatial relationships that exist between objects or entities. The objects or entities in this research come from images and are termed image-objects. Hence a pre-stage of finding image-objects from pixel-wise classified images is necessary, which is also taken care of in this work. To the best of our knowledge, this is the first work that proposes to mine spatial colocations in images.

The advantages of modelling image-objects as tensors are: (i) managing huge amounts of data with tensors is easy (scalable images/image-objects), (ii) tensors are easily reducible to lower dimensions, which makes latent information easier to understand, and (iii) extraction from tensor data to lower dimensions yields more components of information than ordinary matrix-based methods.

The paper is organized as follows. Section 2 reviews the relevant and related work in colocation pattern mining. Section 3 provides the basics of the concept of tensors used in this work. Section 4 builds the framework to mine spatial colocation patterns and explains the proposed algorithms. The evaluation and discussion of the proposed algorithms with respect to other relevant algorithms in the literature are given in Section 5. A summary of this work and suggested future directions are given in Section 6.

2. Background

The first attempt to perform spatial colocation mining is seen in [6]. This work captures a subset of spatial features for a particular class, which differs from the colocation mining concept as it is understood today. An approach is presented in [7] that uses a space partitioning method for identifying neighbourhood regions which contain instances of colocations. The algorithm may miss colocations that span different neighbourhoods, because the partitions are distinct. A join-based colocation mining algorithm is presented in [8] which works similarly to Apriori [9]. This process becomes computationally expensive as the number of colocation instances increases. An approach that performs a partial join to increase computational efficiency is seen in [10]. In this work, the spatial data is modeled as clique neighbourhoods and the cut in the neighbourhood determines the colocation instances mined. The joinless approach [11] reduces the computational time by introducing an instance look-up scheme instead of the regular join operation. The algorithm does not miss any colocation patterns, although the computational time depends on the size of the data. A framework for spatial colocation pattern mining based on association analysis and maximal clique representation of the spatial data is presented in [12]. However, in this case, the spatial data has to be modelled as transaction type data to perform mining of the spatial colocation instances. Computing spatial colocations using candidate-based approaches is expensive due to the exponential number of candidate subsets. For this reason, approaches have been attempted in the literature that avoid the generation of candidate sets when mining spatial colocation patterns.

Representative Colocation Pattern (RCP) mining is introduced in [13] to reduce the exponential number of patterns that arise as the data size increases. Instead of the distance measure, a new prevalence measure is introduced in this work to find the covering relationship among spatial colocation instances. In [14], maximal colocations are identified through a maximal clique based approach in which a sparse undirected graph is used. Each instance clique of a maximal colocation is further stored in a condensed tree for verification and to reduce the storage size. This algorithm is hereafter referred to as the Sparse Graph Condensed Tree (SGCT) algorithm. However, the algorithm performs redundant computations when the instances generated have a large number of object types.

Tensor-based approaches have been used in temporal data networks [15] to find the activity patterns of time series data, where huge amounts of data are generated, with satisfactory experimental results. This motivated us to apply tensor factorization techniques to spatial colocation pattern mining problems. Tensor-based approaches are nowadays seen in big data processing, in understanding and analyzing signal processing applications [16, 17] and in information retrieval [22]. Tensors have helped to extend the standard flat-view matrix models of data to high-dimensional and versatile models [18, 23].

3. Tensor model for pattern discovery in image-objects

Image-objects are entities in an image, which are groups of pixels with similar digital values. Image-objects possess size and shape in addition to pixel value and location; that is, an image-object holds spatial as well as non-spatial attributes. Non-spatial attributes are characteristic features holding nominal values, like the label or name of an image-object. Spatial attributes describe spatial characteristics like spatial location (longitude and latitude), spatial extent (area, perimeter, size), spatial shape (point, extended, polygon) and even spatial elevation [19]. As the non-spatial attributes and their relationships are explicit, we focus on discovering the implicit spatial attributes and their relationships. The spatial relationships or patterns that exist among image-objects can be sought in set, topological, metric or distance space. Pattern discovery of image-objects in all these spaces is therefore expected to yield knowledge beneficial for decision making systems. In our approach, a tensor model for pattern discovery of image-objects is proposed.

Tensors are arrays that can represent more than two dimensions. This is the reason for choosing tensors in our approach, as the attributes and relationships that exist among image-objects can be of any dimension. Tensor modeling thus enables a shift from two-way to multi-way analysis of the spatial image-objects.

3.1 Tensorization

The multifaceted/multidimensional data has to be stacked as a tensor. The N spatial relationships between the image-objects can be modeled as a tensor to start with. In this work, tensors are denoted in bold-faced calligraphic font, matrices in bold-faced capital letters and vectors in bold-faced small letters. Let the tensor be represented as 𝒮. The tensor is to be stacked with the spatial relationships between all image-objects. Assume there are K image-objects in the datasets (I₁, I₂, I₃, …, I_K) and N spatial relationships (S₁, S₂, S₃, …, S_N). The image-objects can then be looked upon as a tensor as follows. $\mathcal{S} \in \mathbb{R}^{K \times K \times S_{1} \times S_{2} \times S_{3} \times \dots \times S_{N}}$ (1)

Depending on the spatial relationship under set, topological, metric or distance space chosen for study, the set S₁, S₂, S₃, …, S_N can be decided. Thus the process of tensorization can be defined as follows.

Definition - Tensorization of Image-Objects – The process of converting or stacking all spatial relationships in the defined space existing between the image-objects into a tensor.

3.2. Spatial pattern discovery from tensorized image-objects

The modeled tensor contains information about the patterns of spatial relationships that exist between the image-objects. The question posed here is how to uncover the patterns that exist inside the tensor. The spatial patterns that exist between the image-objects have to be discovered by extracting the lower dimensional factors of the tensor. This can be achieved by canonical decomposition of the tensor [17]. The terms ‘decomposition’ and ‘factorization’ are used synonymously for tensors, both referring to the process of factorizing a tensor to generate decomposed factors.

Consider a 3-order tensor $\mathcal{S} \in \mathbb{R}^{X \times Y \times Z}$. The tensor in factorized form can be expressed as the sum of component rank-1 tensors as follows. $\mathcal{S} = \sum_{r = 1}^{R_{S}} a_{r} \circ b_{r} \circ c_{r}$ (2)

The symbol ∘ represents the outer product of vectors and R_S is the number of components in this model; the smallest value of R_S for which Equation (2) holds exactly is the rank of the tensor 𝒮. Determining the rank of a tensor is an NP-hard problem, so no efficient algorithm for computing it is known. A typical use of a 3-order tensor is to model the interaction between three entities X, Y and Z. An entry s_ijk of the tensor denotes the interaction pattern of (x_i, y_j, z_k). In accordance with the factorization model described above, each entry in the tensor is a sum, over the R_S components, of the product of the corresponding entries of the three latent vectors, where a_ir, b_jr and c_kr are the i-th, j-th and k-th entries of a_r, b_r and c_r. $s_{ijk} = \sum_{r = 1}^{R_{S}} a_{ir} b_{jr} c_{kr}$ (3)

Thus the tensor contains a latent feature representation of the image-objects under consideration. The interaction pattern of the image-objects can be recovered once the decomposition is done successfully.
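The component-wise form of the factorization can be illustrated with a minimal NumPy sketch. The function name `cp_reconstruct` and the toy factor values are assumptions made for illustration only; the sketch simply checks that the entry-wise sum of products agrees with the sum of rank-1 outer products.

```python
import numpy as np

def cp_reconstruct(A, B, C):
    """Rebuild a 3-order tensor from CP factor matrices.

    A is X x R, B is Y x R, C is Z x R; entry (i, j, k) of the result is
    sum_r A[i, r] * B[j, r] * C[k, r], i.e. a sum of R rank-1 tensors.
    """
    return np.einsum("ir,jr,kr->ijk", A, B, C)

# Toy factors (illustrative values only, not taken from the experiments).
A = np.array([[1.0, 0.0], [2.0, 1.0]])               # X = 2, R = 2
B = np.array([[1.0, 1.0], [0.0, 3.0], [1.0, 0.0]])   # Y = 3
C = np.array([[2.0, 1.0], [1.0, 0.0]])               # Z = 2
S = cp_reconstruct(A, B, C)

# The same tensor written as an explicit sum of outer products a_r o b_r o c_r.
S_outer = sum(
    np.multiply.outer(np.multiply.outer(A[:, r], B[:, r]), C[:, r])
    for r in range(A.shape[1])
)
```

Both constructions give the same 2 × 3 × 2 array, which is the sense in which the matrices of latent vectors fully describe the tensor.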

To generalize the pattern mining, continuing with Equation (2), the sets of vectors a_r, b_r and c_r (r = 1, 2, …, R_S) can each be written as a matrix, where each of the R_S vectors is a column of the matrix. The factorization of a 3-order tensor can thus be represented in terms of three matrices, say A, B and C. To conclude, an effective factorization minimizes the difference between 𝒮 and [[A, B, C]] as $\min_{A, B, C} \| \mathcal{S} - [[A, B, C]] \|_{F}^{2}$ (4) where A, B and C have dimensions X × R, Y × R and Z × R respectively and R < R_S.

The way to solve this problem is to find the R rank-1 tensors that best approximate the tensor. The choice of R determines which patterns between image-objects are found. A lower value of R yields only the strongest underlying patterns, whereas a higher value of R also produces weaker patterns and is prone to the risk of over-fitting. Choosing R is thus an optimization problem, and the resulting R components yield the spatial patterns that exist between image-objects.
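The objective in Equation (4) can be evaluated directly, as in the following sketch. The helper name `cp_objective` and the random toy factors are assumptions for illustration. Dropping a column from each factor is used here only to show that fewer components leave a residual; it is not, in general, the best fit for the smaller R.

```python
import numpy as np

def cp_objective(S, A, B, C):
    """Squared Frobenius norm of S - [[A, B, C]], the quantity minimized in Eq. (4)."""
    S_hat = np.einsum("ir,jr,kr->ijk", A, B, C)
    return float(np.sum((S - S_hat) ** 2))

rng = np.random.default_rng(0)
A, B, C = rng.random((4, 3)), rng.random((4, 3)), rng.random((5, 3))
S = np.einsum("ir,jr,kr->ijk", A, B, C)       # tensor built from exactly 3 components

exact = cp_objective(S, A, B, C)               # all 3 components: no residual
truncated = cp_objective(S, A[:, :2], B[:, :2], C[:, :2])  # only 2 components kept
```

With all three components the residual vanishes, while keeping only two leaves a strictly positive residual equal to the energy of the discarded rank-1 term.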

Definition – Spatial Pattern Discovery from Tensorized Image-Objects – The process of finding explicit patterns in the latent space that exists between image-objects through decomposition of tensorized data.

3.3. Advantages of tensor based model

The high dimensionality associated with the spatial relationships is not a curse in our model; rather, it turns out to be a blessing that helps to find different compact spatial patterns.

The tensor model can be used for finding the spatial pattern in any space that exists between the image-objects as long as the relationship patterns can be appropriately represented. Hence the model can find patterns that capture multiple interactions in addition to standard pairwise interactions.

4. Spatial colocation pattern mining framework

The spatial pattern discovery using the tensor based model is attempted in the metric space. There are two kinds of relationships in the metric space, namely, distance and topological. Mining the distance relationship that exists between image-objects helps to discover spatial colocation patterns. In this framework, the generalized tensor model is adopted for finding spatial colocation patterns. The workflow of the framework is presented in Fig. 1.

Fig.1

Spatial Colocation Mining Framework.

The proposed framework operates in two stages: (i) a neighborhood growing technique finds image-objects from the pixel-wise classified image and (ii) spatial colocation patterns are discovered by tensorizing the image-objects. The first stage maps pixels to appropriate image-objects through a neighborhood growing technique and is named the “Pixel Mapping to Image-Objects” (PMIO) phase of the framework. The second stage consists of two phases, (a) tensorization of image-objects and (b) tensor factorization to mine spatial colocation patterns, and is named SCLP-TF.

4.1. Pixel mapping to image objects

The first phase, abbreviated as PMIO, is the phase in which a neighborhood growing technique is applied on a window of classified pixels. The objective of this phase is to extract image-objects from the classified image. The proposed heuristic selects a window of random size, say W. The window size is generally set to a power of 2, for better computational results. From the centroid pixel of the window, the neighborhood is examined in groups of log₂ W pixels, each of which is referred to as a sub-window. On examining the neighborhood, the most frequently occurring class label is found and assigned to the entire sub-window. The growing technique terminates when the threshold limit in terms of size (from the knowledge base input) is reached. The entire set of image-objects in the image can be identified when this algorithm is applied throughout the image. The algorithm also determines the position of each image-object from the centroid pixel.

The choice of window (and sub-window) size is the deciding criterion for the extraction of image-objects. An appropriate window size helps to find the image-objects accurately, whereas an under-fitting window will not identify all image-objects and an over-fitting window will result in more computational complexity.

Algorithm 1
Pixel Mapping to Image-Objects (PMIO)
Input
Classified Image, I
Window Size, W
Threshold value, α
Output
Labels of Image-Objects I₁, I₂, …, I_K
Positions of Image-Objects P_{I₁}, P_{I₂}, …, P_{I_K}
1. Input the pixel-wise classified image I and window size W
2. Choose the threshold value for the size of image-objects, α, from the knowledge base
3. Repeat
a. From the centroid pixel of the window, mark a sub-window of size log₂ W
b. Find the class label of each pixel of the sub-window, say C_j, where j = 1, 2, 3
c. Find count(C_j)
d. Reassign the label of the sub-window to the C_j with the maximum count(C_j)
4. Until the size of the sub-window ≥ α
5. Return C_j as the class label of image-object I_k and the centroid position of the region limited by α as P_{I_k}
6. Repeat steps 3–5 for non-overlapping windows over the whole image and return all image-objects and their positions
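The steps of Algorithm 1 can be sketched in Python as follows. This is one possible reading, not a definitive implementation: the function name `pmio`, the assumption that the sub-window grows by log₂ W pixels per step, the use of α as a side length in pixels, and the toy 8 × 8 image are all assumptions introduced for illustration.

```python
import numpy as np

def pmio(labels, W, alpha):
    """Sketch of PMIO: one (label, centroid) pair per non-overlapping W x W window.

    labels : 2-D array of non-negative integer class labels, one per pixel
    W      : window side length, assumed to be a power of 2
    alpha  : size threshold (side length in pixels) at which growing stops
    """
    step = max(1, int(np.log2(W)))             # initial sub-window side, log2(W)
    objects = []
    rows, cols = labels.shape
    for r0 in range(0, rows - W + 1, W):
        for c0 in range(0, cols - W + 1, W):
            cr, cc = r0 + W // 2, c0 + W // 2   # centroid pixel of the window
            side = step
            while True:
                half = side // 2
                # Sub-window around the centroid, clipped to the current window.
                sub = labels[max(r0, cr - half):min(r0 + W, cr + half + 1),
                             max(c0, cc - half):min(c0 + W, cc + half + 1)]
                majority = int(np.bincount(sub.ravel()).argmax())
                if side >= alpha or side >= W:
                    break
                side += step                    # grow the neighborhood (assumed rule)
            objects.append((majority, (cr, cc)))
    return objects

# Toy classified image: four 4 x 4 regions with one mislabeled pixel.
img = np.zeros((8, 8), dtype=int)
img[:4, :4], img[:4, 4:], img[4:, :] = 1, 2, 3
img[0, 0] = 0
objs = pmio(img, W=4, alpha=4)
```

In the toy image the single mislabeled pixel is outvoted by its neighbors, so each window is reported with its dominant class label and the centroid of the window as its position.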

4.2. Spatial colocation pattern mining using tensor factorization (SCLP-TF)

After obtaining the image-objects and their corresponding positions, the next objective, in the second stage, is to mine spatial colocation patterns. The spatial colocation patterns are mined using the tensor factorization method explained in Section 3.

4.2.1 Tensorization of image-objects

The image-objects (I₁, I₂, …, I_K) and their corresponding positions P_{I₁}, P_{I₂}, …, P_{I_K} in the images under study have to be stacked as a tensor, and the process is referred to as tensorization. As the intention is to find spatial colocation patterns, the spatial relationship has to be sought in metric space, and we fix the Euclidean distance as the relationship type. From here onwards, whenever we refer to distance, it is the Euclidean distance that is meant. The tensor stack has to model the distance relation among all image-objects. The tensor is built with the labels of image-objects in the 1st and 2nd dimensions (say, N image-objects) and the Euclidean distance between image-objects in the 3rd dimension (say, of size S). Let the tensor be represented as 𝒮.
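A minimal sketch of this stacking step is given below. The text does not specify how distances index the third dimension, so the sketch assumes that distances are normalized by the image diagonal and binned into S equal-width bins, and that each entry counts how often a pair of labels occurs at that distance bin. The function name `tensorize`, the bin scheme and the toy labels and positions are all hypothetical.

```python
import numpy as np

def tensorize(images, label_index, n_bins, diag):
    """Stack label-pair/distance co-occurrences into an N x N x S tensor.

    images      : list of images, each a list of (label, (row, col)) pairs
    label_index : dict mapping label name -> index in 0..N-1
    n_bins      : S, number of distance bins (an assumption of this sketch)
    diag        : image diagonal in pixels, used to normalize distances
    """
    N = len(label_index)
    S = np.zeros((N, N, n_bins))
    for objs in images:
        for a, (la, pa) in enumerate(objs):
            for lb, pb in objs[a + 1:]:
                d = np.linalg.norm(np.subtract(pa, pb)) / diag  # normalized Euclidean distance
                k = min(int(d * n_bins), n_bins - 1)            # distance bin
                i, j = label_index[la], label_index[lb]
                S[i, j, k] += 1
                S[j, i, k] += 1                                 # keep modes 1 and 2 symmetric
    return S

# Toy input: two images with hypothetical labels and centroid positions.
label_index = {"desk": 0, "chair": 1, "poster": 2}
images = [
    [("desk", (10, 10)), ("chair", (10, 13)), ("poster", (90, 80))],
    [("desk", (50, 50)), ("chair", (54, 53))],
]
S = tensorize(images, label_index, n_bins=4, diag=np.hypot(100, 100))
```

Here the desk and chair lie close together in both images, so the entry for that pair in the nearest distance bin accumulates a count of 2, while the distant poster contributes only to farther bins.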

4.2.2 Tensor factorization to find SCLP

The latent spatial patterns present in the tensorized data have to be discovered using tensor factorization. Tensor factorization yields components of the tensorized image-objects and their distance relationships. The tensor 𝒮 obtained after stacking is a 3-order tensor. The canonical decomposition is applied on 𝒮 for factorization using the Alternating Least Squares (ALS) method [17]. Since determining the rank of a tensor is NP-hard, the general solution is to try different numbers of components until the factorization fits within a defined error ratio.

The tensor 𝒮 is of size N × N × S. It has to be factorized to obtain the decomposed components (matrices) A, B and C of dimensions N × R, N × R and S × R respectively, where R is the rank of the tensor. To start the factorization, initialize R as R_min and randomly initialize any two components, say A and B. Find C using Eq. (5), where S_{3} is the mode-3 unfolding (matricization) of 𝒮, the symbol ⊙ indicates the Khatri-Rao product, × is taken here to be the element-wise (Hadamard) product of the two R × R matrices, as in the standard ALS update, and ‡ is the Moore-Penrose pseudoinverse. See [17] for further details. $C = S_{3} (B \odot A) ((B^{T} B) \times (A^{T} A))^{\ddagger}$ (5)
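The update in Eq. (5) can be sketched with NumPy as follows, assuming that the product between the two Gram matrices is element-wise, as in the standard ALS update, and that unfoldings let earlier modes vary fastest. The helper names `khatri_rao`, `unfold` and `update_C` are introduced here for illustration.

```python
import numpy as np

def khatri_rao(U, V):
    """Column-wise Kronecker product: (I x R) and (J x R) -> (I*J x R)."""
    R = U.shape[1]
    return np.einsum("ir,jr->ijr", U, V).reshape(-1, R)

def unfold(S, mode):
    """Mode-n unfolding of a tensor, with earlier modes varying fastest."""
    return np.reshape(np.moveaxis(S, mode, 0), (S.shape[mode], -1), order="F")

def update_C(S, A, B):
    """One least-squares update of C with A and B held fixed, following Eq. (5)."""
    gram = (B.T @ B) * (A.T @ A)               # element-wise product of Gram matrices
    return unfold(S, 2) @ khatri_rao(B, A) @ np.linalg.pinv(gram)

# Sanity check on a toy tensor built from known factors.
rng = np.random.default_rng(1)
A_true, B_true, C_true = rng.random((4, 2)), rng.random((4, 2)), rng.random((3, 2))
S = np.einsum("ir,jr,kr->ijk", A_true, B_true, C_true)
C_est = update_C(S, A_true, B_true)
```

When A and B are the true factors and have full column rank, a single update recovers C, which is a convenient check that the unfolding and the Khatri-Rao ordering agree.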

Update the entries of A, B and C in turn and iterate this process a fixed number of times, where the deciding factor is the difference between the entries in the original tensor and the tensor recovered from the components A, B and C. The optimal rank is selected by measuring the fit between the original tensor and the decomposed components. The process terminates when R reaches R_max or the difference between the original tensor (𝒮) and the recovered tensor ($\hat{\mathcal{S}}$) meets the error ratio ɛ. Thus the decomposed R components for A, B and C are obtained. The matrices A and B are the same, as the first two dimensions of the tensor are both indexed by the labels of image-objects. Each element in the recovered tensor is calculated as the inner product of a_i, b_j, c_k, and the value s_ijk is taken as the association between (a_i, b_j, c_k), which is expressed as follows. $s_{ijk} = f (a_{i}, b_{j}, c_{k}) = \sum_{r = 1}^{R} a_{ir} b_{jr} c_{kr}$ (6)

Algorithm 2
Spatial Colocation Pattern Mining
Input
Image-Objects and Positions (I₁, I₂, …, I_K, P_{I₁}, P_{I₂}, …, P_{I_K})
Minimal rank R_min, Maximal rank R_max
Error Ratio ɛ
Thresholds – Colocation t_CL, Dominance t_D
Output
Decomposed components A, B, C
Spatial Colocation Pattern a_ir
Spatial Dominance value for each SCLP c_kr
1. For all images, tensorize (I₁, I₂, …, I_K, P_{I₁}, P_{I₂}, …, P_{I_K}) to form a 3-order tensor 𝒮
2. Initialize A and B randomly, R as R_min
3. Repeat
a. C = S₃ (B ⊙ A) ((B^T B) × (A^T A))^‡
b. B = S₂ (C ⊙ A) ((C^T C) × (A^T A))^‡
c. A = S₁ (C ⊙ B) ((C^T C) × (B^T B))^‡
d. Find the approximate tensor $\hat{\mathcal{S}}$ = [[A, B, C]] and compute diff_(A,B,C) as s_ijk − ŝ_ijk
4. Until diff_(A,B,C) ceases to improve
5. Increase R
6. Repeat 3–5 until R = R_max or ‖𝒮 − $\hat{\mathcal{S}}$‖ ≤ ɛ
7. Return the decomposed components A, B, C
8. For r = 1 to R, select a_ir such that $\prod_{r = 1}^{R} a_{ir} \geq t_{CL}$
9. For each a_ir find the corresponding spatial dominance value c_kr
10. Find image-objects from a_ir whose c_kr ≥ t_D
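Steps 2–7 of Algorithm 2 can be sketched as a plain ALS loop with an outer search over the rank. The sketch makes several assumptions not stated in the text: the error ratio is measured as the relative Frobenius error, the product of Gram matrices is element-wise, R is increased by one at a time, and the inner loop is capped by a hypothetical `n_iter`. All function names are illustrative.

```python
import numpy as np

def khatri_rao(U, V):
    """Column-wise Kronecker product: (I x R) and (J x R) -> (I*J x R)."""
    return np.einsum("ir,jr->ijr", U, V).reshape(-1, U.shape[1])

def unfold(S, mode):
    """Mode-n unfolding of a tensor, with earlier modes varying fastest."""
    return np.reshape(np.moveaxis(S, mode, 0), (S.shape[mode], -1), order="F")

def cp_als(S, R, n_iter=200, tol=1e-9, seed=0):
    """Rank-R CP decomposition by alternating least squares (steps 2-4)."""
    rng = np.random.default_rng(seed)
    A, B, C = (rng.random((n, R)) for n in S.shape)   # C is overwritten immediately
    prev = np.inf
    for _ in range(n_iter):
        C = unfold(S, 2) @ khatri_rao(B, A) @ np.linalg.pinv((B.T @ B) * (A.T @ A))
        B = unfold(S, 1) @ khatri_rao(C, A) @ np.linalg.pinv((C.T @ C) * (A.T @ A))
        A = unfold(S, 0) @ khatri_rao(C, B) @ np.linalg.pinv((C.T @ C) * (B.T @ B))
        S_hat = np.einsum("ir,jr,kr->ijk", A, B, C)
        err = np.linalg.norm(S - S_hat) / np.linalg.norm(S)
        if prev - err < tol:                   # the difference has ceased to improve
            break
        prev = err
    return A, B, C, err

def search_rank(S, r_min, r_max, eps):
    """Increase R from r_min until the relative error meets eps or R = r_max (steps 5-6)."""
    for R in range(r_min, r_max + 1):
        A, B, C, err = cp_als(S, R)
        if err <= eps:
            break
    return R, A, B, C, err

# Toy tensor with two non-overlapping groups of objects (illustrative only).
a1, a2 = np.array([1.0, 1.0, 0.0, 0.0]), np.array([0.0, 0.0, 1.0, 1.0])
c1, c2 = np.array([1.0, 0.0, 0.0]), np.array([0.0, 1.0, 1.0])
S = np.einsum("i,j,k->ijk", a1, a1, c1) + np.einsum("i,j,k->ijk", a2, a2, c2)
R, A, B, C, err = search_rank(S, r_min=1, r_max=3, eps=0.05)
```

Because the toy tensor contains two separate groups, a single component cannot meet the error ratio, so the search moves past R = 1 before returning.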

The decomposed components (a_ir, b_jr, c_kr) show the interaction with respect to image-objects and the distance relation. The R components of A show the interaction among image-objects (or frequent image-objects) with respect to the spatial pattern under consideration. The prominent/frequent image-objects are chosen from A as $\prod_{r = 1}^{R} a_{ir} \geq t_{CL}$ (7)

The R components of C show the degree of spatial dominance for each image-object pattern identified from A.

Definition – Spatial Dominance – An interestingness measure in the spatial domain that determines the degree of domination of a particular spatial colocation pattern in the set of images and indicates how strong the pattern is in the given set of images. $\prod_{r = 1}^{R} a_{ir} \geq t_{CL} \text{ and } c_{kr} \geq t_{D}$ (8) The prominent image-objects chosen are termed a spatial colocation pattern if and only if the corresponding spatial dominance in c_kr is greater than or equal to the threshold dominance value t_D.
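The thresholding described above can be sketched as follows. This is only one possible reading of the selection rule: it assumes that each component r yields one candidate pattern, that the columns of A are rescaled to a maximum of 1 before comparison with t_CL, and that the dominance of a pattern is the largest entry of the corresponding column of C. The function name `extract_patterns`, the labels and the factor values are hypothetical.

```python
import numpy as np

def extract_patterns(A, C, labels, t_cl, t_d):
    """Read candidate colocation patterns off the factors, one per component r."""
    scale = np.abs(A).max(axis=0)
    scale[scale == 0] = 1.0                     # avoid division by zero for empty columns
    A_scaled = np.abs(A) / scale                # each column rescaled to a maximum of 1
    patterns = []
    for r in range(A.shape[1]):
        members = [labels[i] for i in np.flatnonzero(A_scaled[:, r] >= t_cl)]
        dominance = float(C[:, r].max())        # assumed dominance of component r
        if len(members) >= 2 and dominance >= t_d:
            patterns.append((members, dominance))
    return patterns

# Toy factors (hypothetical values, not experimental results).
labels = ["desk", "chair", "shelf", "cup"]
A = np.array([[0.9, 0.1], [0.8, 0.0], [0.7, 0.2], [0.1, 0.6]])
C = np.array([[0.8, 0.3], [0.2, 0.4]])
patterns = extract_patterns(A, C, labels, t_cl=0.5, t_d=0.5)
```

In this toy case the first component groups three objects with a dominance above the threshold and is kept, while the second contains a single object and a weak dominance value and is discarded.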

4.3. Correctness of the algorithm

The decomposed components from the tensorized data, namely A, B and C, contain the latent structure of the data under study. In our work, A and B are equal. The component A is of dimension N × R, where N is the number of image-objects and R is the rank of the tensor. The R columns of A contain a weighted assignment of the image-objects and their spatial relationship. On examining the R components, the weights vary from a minimum to a maximum value, indicating weak or strong association between image-objects. A threshold value called the colocation threshold (t_CL) has to be chosen so that the strong associations of image-objects are extracted as patterns from the R columns of A.

The decomposed component C is of dimension S × R, where S is the size of the modeled spatial relationship (distance) dimension and R is the rank of the tensor. The R components contain a weighted component of the spatial relationship that exists between image-objects of all images under study. Hence this component gives a clear indication of the spatial measure under consideration. This is termed ‘spatial dominance’ in our study. This interestingness measure gives an indication of the relevance of the spatial colocation pattern obtained from A. Associating the spatial dominance with the pattern from A, we obtain a set of image-objects termed a spatial colocation pattern, with a prevalence measure called spatial dominance.

5. Results and discussions

5.1. Dataset

Sparse and dense datasets are being used in this study for finding the spatial colocation patterns. Data_1 consists of 10103 images and is a sparse kind of dataset. Data_2 [21] consists of 2873 images and is a dense kind of dataset. These pixel-wise classified images are first run through PMIO (stage 1) and image-objects and their positions are identified. After identification of image-objects, the SCLP-TF (stage 2) is applied to find the spatial patterns.

5.2. Experiment setup

The first stage in the framework maps classified pixels to image-objects. As the output of stage 1, the labels of image-objects and their corresponding positions in the images are obtained. Examples of classified image-objects from Data_1 and Data_2 are shown in Fig. 2. The threshold sizes for the datasets are fixed through manual intervention. After stage 1, 59 and 107 image-objects were extracted from Data_1 and Data_2 respectively. The image-objects are labeled semantically and their positions in the images are also obtained as the output of stage 1. According to the ground truth data, Data_1 contains 68 image-objects and Data_2 contains 146 image-objects. Thus some image-objects are missed when the PMIO method is applied: 9 of 68 (about 13%) in Data_1 and 39 of 146 (about 27%) in Data_2. The higher miss rate in Data_2 is attributed to two reasons: (a) Data_2 is a highly dense dataset and contains overlapping objects and (b) Data_2 contains different types of image-objects which vary considerably in size (pointing to the single threshold value as the cause of the 39 objects missed in the mapping).

Fig.2

Image-Objects from the Dataset after PMIO.

In stage 2, the image-objects in all images and the spatial relationship between them (distance) are tensorized. The tensor thus holds the association between image-objects and the distance between them. The distance between image-objects is computed with the help of the image resolution, and the distance value is tensorized only after normalization with respect to the resolution of the image.

After tensorization, the tensor is decomposed by applying the ALS method, for which the minimum and maximum rank of the tensor have to be supplied. The range is chosen from 2 to 24. For each dataset, the rank converges at a different point. For Data_1, the rank of the approximate tensor is 7 and for Data_2, the rank is 11. At these rank values, the decomposed components resulting from factorization are used to find spatial colocation patterns as per Algorithm 2.

5.3. Evaluation

The objective of the proposed framework is to find spatial colocation patterns. Sample patterns mined from the datasets are summarized in Table 1. The threshold value for spatial dominance is set at 0.5. Following the antimonotone property, the subsets of spatial colocation patterns are also colocated.

Table 1
Sample Spatial Colocation Patterns Mined

Sl No.	Dataset	Sample Spatial Colocation Patterns with Spatial Dominance
1	Data_1	tvmonitor, cabinet, sofa [0.89]
		computer, cup, person [0.78]
		bicycle, person, road, sidetrack [0.56]
2	Data_2	deskpart, doorside, screen [0.83]
		desk, chairpart, chairwhole, bookshelf [0.77]
		mousepad, deskwhole, keypad, mouse [0.77]
		chairpart, table, stand, personsitting, shelf [0.68]
		telephone, personstanding, poster [0.56]

The number of spatial colocation patterns mined by the proposed system and by [13, 14] are compared to understand their significance. The number of image-objects involved in each mined spatial colocation pattern is chosen as the performance parameter. It is observed that patterns containing a larger number of image-objects are mined by the proposed system. Since longer patterns are less understandable, the threshold value used to choose from the decomposed components is fixed so as to obtain a maximum of eight image-objects in a pattern. Fig. 3 shows the number of patterns (indicating the count of image-objects) for Data_1 and Data_2.

Fig.3

Number of SCLP mined vs Number of Image-Objects.

The execution time of the algorithm is also compared with [13] and [14] to assess computational efficiency and is depicted in Fig. 4. Note that the time for feature extraction is not taken into account in the comparisons. Execution time is measured as a function of the number of input images. The influence of the input dataset size on computation time is compared across the algorithms for Data_1 and Data_2. The computation time for SCLP-TF increases as the size of the dataset increases, as it does for the other algorithms. The exponential increase in computation time levels off beyond a particular feature/image size, which is attributed to the fact that very few colocation patterns exist.

Fig.4

Computation time vs Size of Images.

To summarize the discussions on the experiments, the following points are noted.

The tensor based model to find spatial colocation patterns results in scalable computation time as compared with other algorithms and exhibits less sensitivity in dense data environments.

The colocation patterns yielded by the proposed model are more significant in terms of the number of image-objects they contain.

The tensor modeling supports the image data without the need of conversion to transaction type data.

6. Conclusion

In this work, a spatial colocation pattern mining framework is proposed. The framework first maps the pixels of the classified image to image-objects and their corresponding locations in the images. The image-objects and their positions are tensorized to stack the objects and their spatial relationship in a 3-order tensor. The tensorized data is decomposed to obtain the association between the image-objects in terms of the spatial colocation relationship existing between them. The significant colocated patterns are identified with the aid of the spatial dominance factor from the decomposed component of the tensor. On analysis of the spatial colocation patterns, it is observed that patterns containing more than 3 image-objects are obtained from the proposed system and that the computation time of the proposed system is on par with the existing systems. Thus the proposed work attains the objective of mining spatial colocation patterns consisting of all kinds of image-objects.

Similarly, the tensor model can be used to define different spatial relationships and find patterns accordingly. In our future research, we propose to extend the tensor factorization methodology to obtain spatiotemporal colocation patterns, where the temporal relation between image-objects is the decisive factor instead of the spatial relationship. A temporal analysis of the tensor will help to discover patterns, predict evolutions and find anomalies.

Acknowledgements

The authors acknowledge the computing facilities built under projects funded by UGC-RUSA and DST-PURSE for this work, in the Department of Computer Science, Cochin University of Science and Technology.

References

1. Akbari, Samadzadegan and Weibel, A generic regional spatio-temporal co-occurrence pattern mining model: A case study for air pollution, Journal of Geographical Systems 17(3) (2015), 249–274.

2. S.K. Kim, et al., A framework of spatial co-location pattern mining for ubiquitous GIS, Multimedia Tools and Applications 71(1) (2014), 199–218.

3. J.S. Yoo, Shekhar, Kim and Celik, Discovery of co-evolving spatial event sets. In Proceedings of the 2006 SIAM International Conference on Data Mining, 2006, pp. 306–315.

4. D.G. Leibovici, et al., Local and global spatio-temporal entropy indices based on distance-ratios and co-occurrences distributions, International Journal of Geographical Information Science 28(5) (2014), 1061–1084.

5. Yu, Spatial co-location pattern mining for location-based services in road networks, Expert Systems with Applications 46 (2016), 324–335.

6. Koperski and Han, Discovery of spatial association rules in geographic information databases. In International Symposium on Spatial Databases, Springer, Berlin, Heidelberg, 1995.

7. Morimoto, Mining frequent neighboring class sets in spatial databases. In Proceedings of the Seventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2001, pp. 353–358.

8. Shekhar and Huang, Co-location rules mining: A summary of results. In Proc. Spatio-temporal Symposium on Databases, 2001.

9. Agrawal and Srikant, Fast algorithms for mining association rules. In Proc. of the 20th VLDB Conference, 1994, pp. 487–499.

10. J.S. Yoo, Shekhar and Celik, A join-less approach for co-location pattern mining: A summary of results. In Fifth IEEE International Conference on Data Mining (ICDM’05), 2005, pp. 4.

11. Xiong, Shekhar, Huang, Kumar, Ma and J.S. Yoo, A framework for discovering co-location patterns in data sets with extended spatial objects. In Proceedings of the 2004 SIAM International Conference on Data Mining, Society for Industrial and Applied Mathematics, 2004, pp. 78–89.

12. R.G. Cromley, D.M. Hanink and G. Bentley, Geographically weighted colocation quotients: Specification and application, The Professional Geographer 66(1) (2014), 138–148.

13. Liu, Chen, Liu, Zhang and Qiu, RCP mining: Towards the summarization of spatial co-location patterns. In International Symposium on Spatial and Temporal Databases, Springer, Cham, 2015, pp. 451–469.

14. Yao, et al., A fast space-saving algorithm for maximal co-location pattern mining, Expert Systems with Applications 63 (2016), 310–323.

15. Gauvin, Panisson and Cattuto, Detecting the community structure and activity patterns of temporal networks: A non-negative tensor factorization approach, PLoS ONE 9(1) (2014), e86028.

16. Cichocki, Mandic, De Lathauwer, Zhou, Zhao, Caiafa and H.A. Phan, Tensor decompositions for signal processing applications: From two-way to multiway component analysis, IEEE Signal Processing Magazine 32(2) (2015), 145–163.

17. T.G. Kolda and B.W. Bader, Tensor decompositions and applications, SIAM Review 51(3) (2009), 455–500.

18. Cichocki, Era of big data processing: A new approach via tensor networks and tensor decompositions, arXiv preprint arXiv:1403.2048 (2014).

19. Shekhar, Jiang, R.Y. Ali, Eftelioglu, Tang, Gunturi and Zhou, Spatiotemporal data mining: A computational perspective, ISPRS International Journal of Geo-Information 4(4) (2015), 2306–2338.

20. Mottaghi, Chen, Liu, N.G. Cho, S.W. Lee, Fidler and Yuille, The role of context for object detection and semantic segmentation in the wild. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2014, pp. 891–898.

21. Torralba, K.P. Murphy and W.T. Freeman, Sharing features: Efficient boosting procedures for multiclass object detection. In Computer Vision and Pattern Recognition, 2004, Vol. 2, pp. II-II, IEEE.

22. Wang, Zhang and Yuan, Semantically enhanced medical information retrieval system: A tensor factorization based approach, IEEE Access 5 (2017), 7584–7593.

23. Padia, Kalpakis and Finin, Inferring relations in knowledge graphs with tensor decompositions. In 2016 IEEE International Conference on Big Data (Big Data), IEEE, 2016, pp. 4020–4022.