Abstract
Classifying the motion patterns of marine targets is important for improving the efficiency of target surveillance and sea-area management and for ensuring the safety of sea routes. This paper proposes a moving target classification algorithm model based on channel extraction-segmentation-LCSCA-
Keywords
Introduction
With economic globalization and the arrival of the “ocean century”, marine transportation, which offers large freight volume, low cost and low energy consumption, has become an important link in world trade [1, 2]. As the number of ships grows, frequent shipping activity makes the global maritime environment more complicated and changeable, so exploring ship motion patterns and analyzing ship behaviors helps to improve marine surveillance, ship safety and management technology [3]. For example, knowing the motion patterns of the various ships near a port helps maritime authorities recognize abnormal behaviors [4]; monitoring ships that share a motion pattern makes it possible to detect dangerous sailing behavior in time, warn the ships involved and prevent accidents; and distinguishing the motion patterns of different ships helps identify illegal fishing activities in time [5]. Therefore, more and more scholars are paying attention to the classification of ship motion patterns.
In recent years, automatic identification system (AIS) base stations and coastal radar stations have been established in many coastal areas, generating large amounts of dynamic and static ship information. Under normal conditions, a ship’s real-time state can easily be determined from this information, while marine traffic and ship behavior patterns can be inferred from historical data to support marine supervision, the discovery of abnormal behavior and the construction of port infrastructure. However, some illegal ships may evade marine supervision by turning off their transponders, tampering with their location or deliberately transmitting wrong data; in addition, AIS and radar devices still suffer from poor data quality and difficulty in identifying targets. For these reasons, it is often hard to determine the real type or identity of some ships from historical data, which challenges not only marine surveillance and security defense but also data mining. Thus, correctly classifying and identifying ship motion patterns has become a crucial task.
In research on ship motion pattern classification, one of the most common approaches is detection based on ship pictures or videos collected by monitoring equipment [6]. However, monitoring equipment has a limited field of view and cannot deliver timely information, so collecting ship pictures and video in the sea area of interest in time is difficult and costly [7]. With the rise of trajectory data collection systems, exploring the characteristic patterns of marine targets by mining spatial-temporal trajectories has become a research hotspot [8, 9, 10]. For example, trajectories can be used to find attractive marine regions and to extract and analyze frequently used traffic routes [11, 12]; to discover potential periodic activity patterns of ships [13]; to detect abnormal ships from the ship characteristics learned, including abnormal switching on and off of AIS devices [14], aberrant risk assessment [15] and changes in ship traffic due to extreme weather [16]; and to predict marine traffic flow [17].
The studies above show that, when mining the characteristic patterns of target trajectories, targets with similar properties exhibit spatial-temporal similarity in their tracks, which supports classifying target motion patterns by trajectory mining. To date, trajectory motion pattern classification has mainly been applied to traffic planning in smart cities and to identifying users’ traffic trajectories and motion behavior [18, 19]. Examples include identifying whether a user travels by bike, by bus or by car [20]; whether a taxi is carrying passengers, is empty or is parked [21]; and human actions such as walking, jogging and clapping [22]. Gonzalez et al. [23] used a neural network to identify travelling mode automatically: they filtered out unnecessary GPS data while retaining the more important trace points, and took average speed and stay time as network inputs. Sun et al. built a test program Hidden Markov Model (BP-HMM) to identify various human motion patterns [24]. Fielda et al. used a Gaussian mixture clustering algorithm to dynamically time-align trajectories, and developed an unsupervised classification method to describe and classify human motion trajectories [25]. Santos et al. used a Dynamic Bayesian Network (DBN) model to classify human behaviors from real human motion trajectories, and applied a sliding-window method to improve model performance and validity [26]. Chen et al. used the Matching Pursuit-Fletcher Reeves (MPFR) method to classify ship motion patterns in inland waterways [27].
The research above informs our work on classifying marine target motion patterns. However, unlike a road network, which constrains vehicle and human motion, the sea is a free moving space with no fixed lanes for ships [28], so it is more difficult to discover valuable knowledge from marine trajectories. Based on this idea, this paper extracts the channel distribution in the sea area, proposes an
Problem modeling
A ship trajectory is a sequence of sampled track points ordered by time. Suppose there are N targets within the selected time interval and region R, in which the ship trajectory of target
Schematic diagram of similar sub-segments.
In practice, because ships differ in sailing time, route and position-report frequency, the motion trajectories of sailing segments have different lengths. Moreover, it is generally hard to find highly similar representative tracks along a whole channel; similar tracks usually exist only in sub-segments that satisfy similarity constraints in time and space [29], as shown in Fig. 1. To form training samples for the various standard motion patterns, we need to describe each representative similar sub-segment accurately and obtain fixed-length vectors that serve as an over-complete basis. This paper adopts Least-squares Cubic Spline Curves Approximation (LCSCA) based on secondary segmentation to describe the sub-trajectories in the samples. The details are given in Section 2.2.
Framework of classifying motion pattern of maritime target.
Given the obtained training sample sets, which contain a great number of motion trajectory labels, and the unclassified tracks, we build a marine target motion pattern classification model based on trajectory dictionary lookup: each unknown track is approximated by a linear combination of labelled training samples, and the minimum residual of the unknown track under the over-complete basis is computed through
The overall framework of motion pattern classification is shown in Fig. 2. The classification process can be summarized as follows: first, obtain the common channels in the selected sea area, and determine the motion pattern categories in this region and the start and end boundaries of the trajectories; second, use the secondary segmentation method to obtain sub-tracks with similar characteristics in each trajectory category, describe each trajectory segment with the LCSCA algorithm, and obtain fixed-length training sample sets as the over-complete basis of the trajectory dictionary; third, obtain the residual sum between the to-be-classified trajectories and each kind of over-complete basis by
Wavelet clustering based sea area channel extraction
Difference maps between road network and sea area traffic trajectory distribution.
At present, research on classifying target motion patterns is widely applied to the analysis of taxi passenger-carrying behavior [30], bicycle-sharing behavior patterns [31], traffic congestion patterns, etc. A precondition for analyzing the spatial-temporal trajectory big data generated by such moving objects is knowledge of the road network they travel on. As shown in Fig. 3, unlike on land, there are no man-made channels at sea, and the marine situation is more complex because of external factors such as climate, ocean currents and accidents [34]. So, to analyze the motion patterns of marine targets, the first step is to acquire the channel distribution in the sea area. Channel extraction generally faces two problems: 1) ships in offshore regions and near ports are required to report to the navigation management department or to give way to other ships, while the high seas are very broad, so the volume of trajectory data over the same period differs markedly between regions, which complicates channel pattern extraction; 2) in a big data environment, trajectory clustering suffers from a large computational load, long response time and the need to predefine the number of clusters, which hinders fast channel pattern extraction. This section proposes a sea area channel extraction algorithm based on wavelet clustering, as shown in Fig. 4. It consists of four steps: 1) data preprocessing; 2) meshing; 3) wavelet transform; 4) searching for connected regions and constructing clusters.
Steps of traffic pattern extraction based on wavelet clustering.
Track preprocessing mainly removes large-range track jumps, track points that fall on land and other noise points caused by physical factors such as equipment failure and abnormal startup or shutdown, and by external factors such as geographic environment and weather. It also detects the stay points of ships at berths and anchorages, so that only the tracks generated during sailing are analyzed, which ensures the accuracy of subsequent channel pattern extraction.
(1) Remove wrong data
Remove illogical track points caused by mechanical equipment, etc., which include track points distributed on land; track points with the longitude
(2) Detect and remove mooring
When a ship docks at a port or a mooring, a large number of data points with minimal positional difference may be generated, and they are denser than those generated during sailing. These points not only interfere with the selection of thresholds in channel extraction, but also affect region segmentation precision and increase the computational load. So, we deem the track points of the same target with navigational speed less than 3 knots and adjacent-point distance less than 0.1
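The mooring filter described above can be illustrated with a short sketch. It is not the authors' implementation: the point layout `(lat, lon, speed_knots)`, the function names and the use of the haversine distance are assumptions made for illustration, and because the unit of the 0.1 distance threshold is not recoverable from the text, the threshold is left as an explicit parameter (kilometres are assumed here).

```python
import math

def haversine_km(p, q):
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(a))

def drop_mooring_points(track, max_speed_knots, max_step_km):
    """Return the track without points flagged as mooring.

    track: list of (lat, lon, speed_knots) for ONE target, in time order
           (hypothetical layout, not taken from the paper).
    A point is flagged when its speed is below max_speed_knots AND it lies
    closer than max_step_km to the previous point of the same track.
    """
    kept = []
    prev = None
    for lat, lon, speed in track:
        is_slow = speed < max_speed_knots
        is_close = prev is not None and haversine_km(prev, (lat, lon)) < max_step_km
        if not (is_slow and is_close):
            kept.append((lat, lon, speed))
        prev = (lat, lon)
    return kept

# Usage: 3 knots comes from the text; 0.1 km is an ASSUMED unit.
track = [(38.00, 120.0000, 10.0), (38.05, 120.0000, 9.0),
         (38.05, 120.0001, 0.5), (38.05, 120.0002, 0.3),
         (38.10, 120.0000, 8.0)]
sailing = drop_mooring_points(track, max_speed_knots=3.0, max_step_km=0.1)
```

Both conditions must hold for a point to be dropped, so a slow ship that is still covering distance, or a single noisy near-duplicate at cruising speed, is kept.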
Meshing processing
The data structure suitable for wavelet clustering is shown in Eq. (3.1.2). Track points organized as vector chains are not directly suitable for wavelet clustering. Meanwhile, to improve processing efficiency and reduce data volume, the regional data must be rasterized to generate a quantized feature space for further analysis.
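As a rough illustration of the rasterization step, the sketch below counts track points per grid cell to form a density matrix. The function name, the argument layout and the choice of a uniform longitude/latitude grid are assumptions for illustration only.

```python
def rasterize(points, lon_min, lon_max, lat_min, lat_max, n_cols, n_rows):
    """Count track points per grid cell.

    points: iterable of (lon, lat) pairs (hypothetical layout).
    Returns an n_rows x n_cols list of integer counts; row 0 is the
    southernmost band, column 0 the westernmost.
    """
    grid = [[0] * n_cols for _ in range(n_rows)]
    cell_w = (lon_max - lon_min) / n_cols
    cell_h = (lat_max - lat_min) / n_rows
    for lon, lat in points:
        # Points outside the analysed region are ignored.
        if not (lon_min <= lon <= lon_max and lat_min <= lat <= lat_max):
            continue
        # Points exactly on the upper boundary fall into the last cell.
        col = min(int((lon - lon_min) / cell_w), n_cols - 1)
        row = min(int((lat - lat_min) / cell_h), n_rows - 1)
        grid[row][col] += 1
    return grid

# Usage on a toy 2 x 2 grid.
pts = [(120.1, 37.1), (120.2, 37.2), (121.9, 38.9), (125.0, 40.0)]
density = rasterize(pts, 120.0, 122.0, 37.0, 39.0, n_cols=2, n_rows=2)
```

The resulting count matrix plays the role of the quantized feature space on which the wavelet transform operates.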
The to-be-analyzed region
in which,
Wavelet transform is an essential step in wavelet clustering. It maps the mesh feature space into a frequency-domain transform space, which reduces the data scale and facilitates clustering. A mesh feature space M, composed of the density quantization feature matrix, undergoes a one-level two-dimensional discrete wavelet transform, as shown in Fig. 5. Each unit in mesh space M keeps a same-position, constant-proportion mapping relationship with the low-frequency unit LL
Two-dimensional discrete wavelet transform.
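To make the role of the low-frequency sub-band concrete, here is a minimal sketch of a one-level two-dimensional transform that keeps only the LL part. It assumes the orthonormal Haar wavelet (the wavelet named in the experiments) and an even-sized grid; the function name is hypothetical.

```python
def haar_ll(grid):
    """One-level 2-D Haar transform, low-frequency (LL) sub-band only.

    With the orthonormal Haar filter, each LL unit equals the sum of a
    2 x 2 block divided by 2, so LL unit (i, j) corresponds to mesh cells
    (2i..2i+1, 2j..2j+1) of the input.
    """
    rows, cols = len(grid), len(grid[0])
    if rows % 2 or cols % 2:
        raise ValueError("grid dimensions must be even")
    return [[(grid[r][c] + grid[r][c + 1]
              + grid[r + 1][c] + grid[r + 1][c + 1]) / 2.0
             for c in range(0, cols, 2)]
            for r in range(0, rows, 2)]

# Usage: a 4 x 4 density grid shrinks to a 2 x 2 transform space.
m = [[1, 1, 0, 0],
     [1, 1, 0, 0],
     [0, 0, 4, 4],
     [0, 0, 4, 4]]
q = haar_ll(m)
```

Because each LL unit summarizes a fixed 2 x 2 block, labels found in the transform space can be carried back to the original mesh cells by position.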
Suppose the selected basis function and scaling function are
in which,
After obtaining the transform space Q, we search for connected regions, which serve as regional channels, based on the idea of density, and build channel clusters from them. Here, it is required to set the transform unit density threshold
The connected channel clusters in transform space Q are labelled in sequence. Because units in the transform space map to units in the original feature space, we can easily obtain the cluster label of the data in the original feature space, and treat points without a cluster label as abnormal or noise points, which completes the extraction of channel distribution features in the region.
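The search for connected regions can be sketched as a breadth-first flood fill over the units whose density exceeds the threshold. This is an illustrative sketch only: the function name is hypothetical, and 8-connectivity is an assumption, since the text does not state which neighbourhood is used.

```python
from collections import deque

def label_clusters(q, threshold):
    """Label connected regions of units whose density exceeds threshold.

    Returns a grid of the same shape: 0 marks noise, 1..k are cluster
    labels assigned in scan order. 8-connectivity is ASSUMED.
    """
    rows, cols = len(q), len(q[0])
    labels = [[0] * cols for _ in range(rows)]
    next_label = 0
    for r0 in range(rows):
        for c0 in range(cols):
            if q[r0][c0] <= threshold or labels[r0][c0]:
                continue
            next_label += 1
            labels[r0][c0] = next_label
            queue = deque([(r0, c0)])
            while queue:
                r, c = queue.popleft()
                for dr in (-1, 0, 1):
                    for dc in (-1, 0, 1):
                        rr, cc = r + dr, c + dc
                        if (0 <= rr < rows and 0 <= cc < cols
                                and q[rr][cc] > threshold
                                and not labels[rr][cc]):
                            labels[rr][cc] = next_label
                            queue.append((rr, cc))
    return labels

# Usage on a toy transform space with two separate dense regions.
q_space = [[5, 5, 0, 0],
           [0, 0, 0, 0],
           [0, 0, 0, 7],
           [0, 0, 7, 7]]
cluster_map = label_clusters(q_space, threshold=1)
```

Units left at label 0 correspond to the abnormal or noise points mentioned above; no cluster count has to be fixed in advance.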
Secondary segmentation based least square cubic spline curve trajectory description
Basic motion mode of vessel trajectory.
Ship motion trajectories of different lengths cannot be used directly to classify ship motion patterns. To obtain fixed-length vectors of motion trajectories, we use the Least-squares Cubic Spline Curves Approximation (LCSCA) technique [36] to extract the same number of descriptive sample points from each trajectory. As shown in Fig. 6, a ship trajectory generally exhibits three behaviors: berthing, steering and straight-line navigation. When the steering area of a moving target or its turning angle is too large, curve fitting tends to over-fit, and the fitted curve fails to describe the real trajectory precisely. Thus, we conduct secondary segmentation on all trajectories in the region: first, set speed and time thresholds according to the trajectory features of the mooring region, divide the trajectory
Take sub-trajectory
in which,
in which,
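The idea of turning sub-trajectories of different lengths into fixed-length vectors can be sketched as follows. This is a generic least-squares cubic spline fit, not necessarily the exact formulation of [36]: the truncated power basis, the uniformly spaced interior knots, the normalization of time to [0, 1] and the parameters `n_knots` and `n_samples` are all assumptions made for illustration.

```python
import numpy as np

def lsq_cubic_spline_descriptor(t, xy, n_knots=3, n_samples=8):
    """Fit x(t), y(t) with a least-squares cubic spline and resample it.

    t:  (n,) strictly increasing timestamps.
    xy: (n, 2) positions of one sub-trajectory.
    Returns a vector of length 2 * n_samples, whatever n is.
    """
    t = np.asarray(t, dtype=float)
    xy = np.asarray(xy, dtype=float)
    u = (t - t[0]) / (t[-1] - t[0])                    # normalise time to [0, 1]
    knots = np.linspace(0.0, 1.0, n_knots + 2)[1:-1]   # interior knots only

    def basis(s):
        # Truncated power basis: 1, s, s^2, s^3 and (s - k)_+^3 per knot.
        cols = [np.ones_like(s), s, s ** 2, s ** 3]
        cols += [np.clip(s - k, 0.0, None) ** 3 for k in knots]
        return np.column_stack(cols)

    # One least-squares solve fits both coordinates at once.
    coef, *_ = np.linalg.lstsq(basis(u), xy, rcond=None)
    sample_u = np.linspace(0.0, 1.0, n_samples)
    return (basis(sample_u) @ coef).ravel()            # x0, y0, x1, y1, ...

# Usage: two sub-trajectories with 20 and 50 points give equal-length vectors.
t_a = np.arange(20.0)
u_a = t_a / 19.0
desc_a = lsq_cubic_spline_descriptor(t_a, np.column_stack([u_a, u_a ** 3]))
t_b = np.linspace(0.0, 100.0, 50)
desc_b = lsq_cubic_spline_descriptor(t_b, np.column_stack([t_b, np.sin(t_b / 30.0)]))
```

Fitting each sub-segment separately keeps the spline smooth within a segment, which is consistent with the motivation given for secondary segmentation.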
This section mainly introduces the
Suppose there are ships with
All test trajectories
in which,
in which,
Nevertheless, in practice, solving model (10) is an NP-hard problem; that is, a lot of time is spent searching for the minimum value [27], which is unsuitable for practical application. At the same time, the regional analysis we selected is a small-sample classification problem: the samples cannot represent all motion patterns in this region within the study period, and different types of trajectories always share some local similarity. Therefore, we consider converting
In the equation,
Trajectory moving target classification based on channel extraction-segmentation-LCSCA-
Experimental environment
The stock spatial-temporal trajectory data used in the experiment are all AIS information broadcast by marine ships including location information. The data region is the Bohai Sea-Yellow Sea region in China (with east longitude of 120
To study the motion patterns of targets, we first need to identify the channel distribution in the region of interest and extract representative periodic trajectories in each channel. As required for trajectory classification, every training sample target is segmented into several sub-trajectories according to voyage number and number of maneuvers, and the trajectory description method turns them into the fixed-length over-complete basis of the classification dictionary. By consulting this “dictionary”, we obtain the residual sum of a test trajectory under the over-complete basis of each target type; the type with the minimum residual sum is taken as the class of the test target. The experimental results are shown in Section 3.2.
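The dictionary lookup described above can be illustrated with a small sketch of minimum-residual classification. The regularized model the paper actually solves is not recoverable from this copy of the text, so ordinary least squares is used here as a stand-in; the function name and argument layout are hypothetical as well.

```python
import numpy as np

def classify_by_residual(dictionary, labels, y):
    """Return the class whose atoms reconstruct y with the smallest residual.

    dictionary: (d, n) array, one fixed-length training descriptor per column.
    labels:     length-n list with the class label of each column.
    y:          (d,) descriptor of the trajectory to classify.
    NOTE: plain least squares is an ASSUMED stand-in for the paper's solver.
    """
    dictionary = np.asarray(dictionary, dtype=float)
    y = np.asarray(y, dtype=float)
    residuals = {}
    for cls in sorted(set(labels)):
        mask = np.array([lab == cls for lab in labels])
        atoms = dictionary[:, mask]
        coef, *_ = np.linalg.lstsq(atoms, y, rcond=None)
        residuals[cls] = float(np.linalg.norm(y - atoms @ coef))
    best = min(residuals, key=residuals.get)
    return best, residuals

# Usage: two classes with two atoms each in a 4-dimensional descriptor space.
D = np.eye(4)
atom_labels = ["A", "A", "B", "B"]
best_cls, res = classify_by_residual(D, atom_labels, [1.0, 2.0, 0.0, 0.1])
```

Returning the full residual table alongside the winning label makes it easy to see how close the runner-up classes are.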
Comparison of LCSCA accuracy between primary and secondary segmentation
Comparison of LCSCA accuracy between primary and secondary segmentation
Experimental data distribution and channel distribution.
Four motion mode trajectories and trajectory distribution after one or two LCSCA algorithm trajectory descriptions.
In the experiment, we divide the region into 100 meshes, use the Haar wavelet to conduct scale transformation twice, and take 20% of the maximum number of trace points in a mesh as the deviation threshold to delineate the regional channel distribution, as shown in Fig. 8c. In the figure, the meshes containing blue points are the identified channel meshes, and red points are deviation points. Ignoring two passing channels that extend from northwest to southeast, the region has four channels stretching across the Bohai Sea-Yellow Sea area, which can represent four motion patterns. From the segments of these channels we select 289 representative target trajectories in total to compose the training sample set. Their distribution in the region is shown in Fig. 8d. The trajectory distribution of the four motion patterns is shown in Fig. 9a.
The trajectory distributions obtained with the primary and secondary segmentation LCSCA trajectory description methods are shown in Fig. 9b. The figure shows that the LCSCA description after secondary segmentation at steering points is more accurate than that after primary segmentation. Especially for the third motion pattern, where maneuvers are very frequent, the accuracy of describing the whole voyage trajectory with primary segmentation alone is questionable. We compare the accuracy of the two methods by computing the root-mean-square error (RMSE) between the fitted trajectory and the real trajectory, as shown in Table 1. The results show that LCSCA after secondary segmentation is more accurate than after primary segmentation, so secondary segmentation is essential for building the over-complete basis.
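For reference, an RMSE between a fitted and a real trajectory can be computed as in the sketch below. It assumes that the two tracks are sampled at the same instants, so that points can be paired one to one, and that the error of a pair is its Euclidean distance; the text does not spell out either choice.

```python
import math

def trajectory_rmse(fitted, real):
    """RMSE of point-wise Euclidean distances between two equal-length tracks.

    fitted, real: lists of (x, y) pairs sampled at the same instants
                  (an ASSUMPTION; the pairing rule is not given in the text).
    """
    if len(fitted) != len(real) or not real:
        raise ValueError("tracks must be non-empty and of equal length")
    squared = [(fx - rx) ** 2 + (fy - ry) ** 2
               for (fx, fy), (rx, ry) in zip(fitted, real)]
    return math.sqrt(sum(squared) / len(squared))

# Usage: one point is off by a distance of 5, the other matches exactly.
err = trajectory_rmse([(0.0, 0.0), (1.0, 1.0)], [(3.0, 4.0), (1.0, 1.0)])
```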
Performance analysis
To evaluate the performance of the proposed algorithm, classification accuracy is measured by the true positive rate (TPR):

$$\mathrm{TPR} = \frac{TP}{TP + FN}$$

in which TP is the number of type-A ship motions correctly classified as type A, and FN is the number of type-A ship motions wrongly assigned to other categories. TPR thus measures how accurately type-A motion patterns are recognized. Table 2 compares the proposed algorithm with the nearest neighbor classification algorithm (knn), the support vector machine with a linear kernel (svm-linear) and the support vector machine with a Gaussian radial basis kernel (svm-rbf). In the table, the proposed algorithm achieves the highest classification accuracy on the motion patterns, and its overall classification accuracy is higher than that of the other three algorithms. Compared with these methods, the proposed algorithm shows better performance and stability in classifying the various motion patterns.
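The per-class TPR defined above can be computed directly from true and predicted labels, as in this small sketch; the function name and the toy labels are illustrative only and do not reproduce the paper's data.

```python
def true_positive_rate(y_true, y_pred, cls):
    """TPR of one class: TP / (TP + FN)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p != cls)
    if tp + fn == 0:
        raise ValueError("class %r does not occur in y_true" % (cls,))
    return tp / (tp + fn)

# Usage with toy labels (not the paper's results).
truth = ["A", "A", "A", "B"]
predicted = ["A", "B", "A", "B"]
tpr_a = true_positive_rate(truth, predicted, "A")
tpr_b = true_positive_rate(truth, predicted, "B")
```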
Algorithm classification accuracy comparison
Table 3 compares the running time of the four algorithms. The proposed algorithm outperforms the others in running time. When applied to motion pattern classification with massive data, its advantage in computation time should become more evident, which suits the requirements of spatial-temporal big data mining.
Comparison of running time of different algorithms
From the comparative experiments above, we draw the following conclusions:
It is feasible to use spatial-temporal trajectory data to classify motion patterns of marine targets; The proposed trajectory moving object classification algorithm model based on channel extraction-segmentation-LCSCA-
This paper builds on the idea of spatial-temporal trajectory data mining to solve the classification problem of marine target motion patterns. The proposed trajectory moving object classification algorithm model based on channel extraction-segmentation-LCSCA-
This classification method can be applied to trajectory analysis and behavior recognition of targets in complicated sea area environments. By learning the motion patterns of normal traffic flow in a region, it can monitor and flag marine targets with special or abnormal behavior patterns, and assist relevant departments in daily sea area control.
