A semantic-aware collaborative filtering recommendation method for emergency plans in response to meteorological hazards

Abstract

Meteorological hazards have great influence all over the world. An emergency plan is an important means of coping with meteorological hazards. The preparation of emergency plans needs to refer to historical emergency plans, but these are too numerous and are of uneven quality. We can alleviate these problems by means of recommender systems, which are very useful tools in many domains; however, they suffer from information overload. In this paper, we propose a Semantic-Aware Collaborative Filtering method, called SACF, for emergency plan recommendation to address the aforementioned challenges. It is designed to present a highly targeted emergency plan recommendation list and to recommend the most appropriate emergency plans for a target meteorological hazard event. Specifically, we use semantic knowledge to represent scenario-based meteorological hazards, including target and previous events. A collaborative filtering recommendation algorithm is then adopted, based on a search for similar events (i.e., neighbors). The recommendation process is improved by avoiding both the generation of fake neighbors and the omission of true neighbors. Finally, extensive experiments are conducted on a real-world dataset, and the results demonstrate that SACF improves the accuracy of emergency plan recommendations.

Keywords

Meteorological hazards, emergency plan, semantics, collaborative filtering, recommendation

1. Introduction

Meteorological hazards (MHs) cause numerous losses in life and infrastructure [1], but we can mitigate the impacts of these unfortunate events through emergency management [2]. Emergency planning for MHs can play a role in rapid responses [3]. Plan preparation teams need to consult and reference a large number of emergency plans that have been released for help and guidance. However, in recent years, the number of emergency plans in the world has been explosively increasing [4], and these emergency plans are of uneven quality. Finding emergency plans corresponding to the scenario of a target MH event quickly and accurately is a challenge [5] for emergency management. The solution to such a challenge is an emergency plan recommendation system. Selecting and offering event emergency plans that meet the target MHs can considerably increase the plan’s efficiency and quality.

One of the most popular recommendation strategies for emergency plan recommendations is collaborative filtering [6, 7], which provides the target event with the plans recommended for previous events (neighbors) that have similar characteristics (including hazards, the natural environment, and social economic scenarios [8]) and similar basic regional information. The key factor when making a recommendation based on collaborative filtering is the formation of the neighborhood because it determines how well the recommendation matches the target event [9]. The great diversity of the MHs that are available to program staff and the current abundance of null values in the data make the search for similar events prone to the generation of fake neighbors [10] and the omission of true neighbors. Fake neighbors arise when events share similar values (including null values) for numerous properties, and are therefore considered neighbors, but differ in the properties that are strongly relevant to the type of the target event. Meanwhile, some true neighbors, which differ in numerous properties but share similar values for the most relevant properties, are excluded from the neighborhood. For example, suppose that a flood event arises in Place A and another flood event arises in Place B. If the two places have similar proportions of ethnic minorities, acreages and population densities but different terrain, the two events are fake neighbors because terrain is critical to flood responses while the other three properties are less important. In addition, utilizing semantic knowledge can improve the accuracy of the similarity computation [11] and in turn the recommendation process; therefore, we take advantage of semantic knowledge in our recommendation algorithm.

The problems mentioned above can be summarized as two types: (a) the data sparsity of the event-plan effect matrix, and (b) the insufficient description of emergency plans and rough similarity calculation methods, which make it easy to generate fake neighbors and omit true neighbors.

Motivated by these observations, in this paper, we address these challenges through the following contributions.

(1) A Semantic-Aware Collaborative Filtering method for emergency plan recommendations (SACF) is proposed in this paper. It uses the event-plan effect matrix and introduces semantic knowledge to capture more profound features for MH events. Specifically, a semantic model of the event emergency scenario is established. It is composed of a disaster hazard factor, a natural environment scenario and a social economic scenario, and it considers the basic geographic information.
(2) In SACF, to avoid the generation of fake neighbors and the omission of true neighbors and thereby improve the recommendation process, several measures have been adopted, such as the generalization of the properties, the reasonable treatment of the short text properties and our Flexible Mixed Weight-Calculation method.

The remainder of the paper is organized as follows. Section 2 presents related works and Section 3 gives some basic definitions in this paper. The Semantic-Aware Collaborative Filtering method for emergency plans (SACF) is described in Section 4. Section 5 describes the experiments that are designed and analyzed to evaluate the accuracy of SACF. Section 6 concludes the paper and provides an outlook on the possible continuation of our work.

2. Related works

To respond to MHs in a fast and effective manner, it is of critical importance to have efficient plans that lead to minimum losses [12]. Researchers have paid an increasing amount of attention to gaining knowledge and experience from past plans and utilizing computer and database technology to manage and assist the development of emergency plans and the recommendation of emergency plans, which have become a new concern in the field of emergency plan research [13, 14].

The collaborative filtering algorithm is a popular recommendation technique for information filtering in all fields. The most important aspect of collaborative filtering algorithms is how they differ in their computing similarity, i.e., how they select neighbors [15]. Traditional object recommendations (such as those for products and literature) usually utilize a user-item rating matrix to select neighbors. For example, [16] compares the active user’s rating and the neighbor’s rating to recommend products for purchase to the active user. The emergency plan recommendation is different because the buying behavior of a user can occur many times while a disaster event occurs only once, and it adopts or refers to the corresponding emergency plan once. In this way, the event-plan effect matrix will be inevitably sparse, and it is infeasible to select neighbors using only the effect matrix. With respect to the data sparsity problem in recommendations, auxiliary information is introduced to calculate the similarity and obtain the neighbors. The model in [17] not only considers the local contextual information of user ratings but also considers the global preferences of users to improve the recommendation performance. In this paper, we utilize the semantic knowledge of event emergency scenarios and basic regional information to select neighbors and combine the event-plan effect matrix to generate recommendations.

In the field of natural disaster emergency plan recommendations, the existing method for finding neighbors determines the similarities of the properties of the events [13]. However, different properties have different importance in finding neighbors, and [17] explains how they can improve the recommendation system by removing some minor properties. In addition, [13] considers only a simple quantitative representation of the disaster hazard factor of the target event when computing event similarity, such as the wind force and wind speed of a typhoon, in the search for a suitable plan. Considering only the hazard factors leads to very low similarities for some disaster events that are of different types but have high reference values for the target event, which omits true neighbors. Considering only a simple quantitative representation as the disaster hazard factor results in a rough similarity computation and the generation of fake neighbors [10]. In summary, an insufficient description of emergency plans and a rough weight distribution when calculating the similarity cause inferior neighborhood quality, which leads to poor recommendations for target events.

Generally, compared to existing methods, SACF considers event emergency scenario information, establishes a semantic model of an event emergency scenario, and utilizes semantic knowledge during the recommendation to avoid the generation of fake neighbors and the omission of true neighbors, which makes the recommendations more accurate.

3. Basic definitions

This section introduces several fundamental definitions and some descriptions related to this research. The proposed SACF will be described later based on these basic backgrounds.

Definition 1: Event. In this paper, an event refers to a specific MH arising in a specific geographical location (province, city or county). The event information contains the basic geographic information (BG) and the emergency scenario (ES), which includes the hazard scenario (SH), natural environment scenario (SN) and social economic scenario (SE). Events are stored in an event base, which typically includes the event emergency scenario information (for events that have already occurred and for artificially designed ones), the emergency plans that were used or referenced, and the scores of the emergency plans: the effect score is the score of a used emergency plan, and the assistance score is the score of a referenced emergency plan. The latter may reflect either a positive or a negative result, since even an emergency plan that did not help to resolve the problem is still useful knowledge. Table 1 shows the internal structure of the event base.

Definition 2: Semantic feature. Semantic features are used to indicate an MH’s emergency scenario with respect to the target and previous events. In this scenario, the semantic features include the background, people, transportation, infrastructure, potential secondary disasters, weather, economy, basic services, housing, etc.

Table 1
The internal structure of the event base

Event id	Emergency scenario	Place name	Plans that were used	Effect score	Plans that were referenced	Assistance score
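
As a rough illustration of Table 1, one record of the event base could be sketched in Python as follows. The class name, field names and types are assumptions made for this sketch; the text only fixes the columns of Table 1 and the convention in Eq. (7) that a plan that was neither used nor referenced has a score of 0.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class EventRecord:
    """One row of the event base (Table 1); all field names are hypothetical."""
    event_id: str
    place_name: str
    # Emergency scenario: leaf property name -> value (None marks a null value).
    emergency_scenario: Dict[str, Optional[object]] = field(default_factory=dict)
    # Plans that were used: plan id -> effect score.
    effect_scores: Dict[str, float] = field(default_factory=dict)
    # Plans that were referenced: plan id -> assistance score.
    assistance_scores: Dict[str, float] = field(default_factory=dict)

    def score_for(self, plan_id: str) -> float:
        """Return the score for a plan, or 0 if it was neither used nor referenced."""
        if plan_id in self.effect_scores:
            return self.effect_scores[plan_id]
        return self.assistance_scores.get(plan_id, 0.0)


# Illustrative record; the values are invented.
event = EventRecord(
    event_id="E1",
    place_name="Place A",
    emergency_scenario={"number_of_victims": 1_000_000, "diameterOfHail": None},
    effect_scores={"plan_1": 8.0},
    assistance_scores={"plan_2": 6.5},
)
print(event.score_for("plan_1"), event.score_for("plan_3"))
```

Keeping the effect scores and assistance scores in separate mappings preserves the distinction between used and referenced plans, which matters later when the two types of scores are weighted differently.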

A collaborative filtering recommendation algorithm is adopted to generate appropriate recommendations. SACF calculates the ratings of some previous emergency plans for a target MH event in a logical way, then presents an emergency plan recommendation list and recommends the most appropriate emergency plans.

Moreover, one of our important contributions here is our neighborhood formation approach in which we use semantics to avoid the generation of fake neighbors and the omission of true neighbors, as discussed in Section 1. Specifically, we undertake the following.

(i) We generalize the properties (that are embodied in the model). For a target event such as a typhoon, the model considers the concrete effects of the disaster events, such as the number of houses that collapsed instead of the wind-force, to avoid the omission of true neighbors that are not typhoons.

(ii) Short text properties are processed into composite simple properties and the semantic knowledge is incorporated into the similarity computation.

(iii) A Flexible Mixed Weight-Calculation method (FMWC) is proposed.

Table 2

Basic symbols and notations

Symbol	Definition
E	Event
ES	Event emergency scenario (SH, SN, and SE)
R	Basic geographic information (BG)
P	ES semantic feature list; $P^{i}=\{P^{i}_{1},P^{i}_{2},\ldots,P^{i}_{n}\}$ , where $i=0,1,2,\ldots,m$
$\text{sim}\left(\textit{ES}_{0},\textit{ES}_{i}\right)$	Emergency scenario (SH, SN, and SE) similarity between the target event $E_{0}$ and the previous event $E_{i}$
$\text{sim}\left(R_{0},R_{i}\right)$	Region intrinsic (BG) similarity between the target event and previous event
$\text{sim}\left(E_{0},E_{i}\right)$	Event (SH, SN, SE, and BG) similarity between the target event and previous event
Z	Semantic feature theoretical weight vector $Z=\{z_{1},z_{2},\ldots,z_{n}\}$ of the target event
V	Semantic feature data worth weight vector $V=\{v_{1},v_{2},\ldots,v_{n}\}$ of the target event
W	Semantic feature importance weight vector $W=\left\{w_{1},w_{2},\ldots,w_{n}\right\}$ of the target event

Table 2 summarizes the basic symbols and notations that are used in this paper. $n$ indicates the number of semantic features of an emergency scenario event, and $m$ indicates the number of previous events. When $i=0$ , the target event is indicated.

4. Semantic-aware collaborative filtering recommendation method for emergency plans

After the introduction of the basic definitions in Section 3, this section describes the proposed SACF in detail. We first present the whole process of SACF. Then, Section 4.1 explains the semantic model of the event emergency scenario for MHs, Section 4.2 introduces the neighborhood formation that avoids the generation of fake neighbors and the omission of true neighbors, and Section 4.3 gives the details of calculating ratings and generating recommendations.

Our strategy can be divided into two parts. First, we compute the event similarity to form a neighborhood. Second, we obtain the prediction score of the emergency plan for the target event and form the emergency plan recommendation list. Figure 1 shows the whole process of SACF.

Figure 1.

SACF algorithm flow chart.

4.1 Event emergency scenario semantic model of MH

The same event is handled differently when there are different geographical features. For example, the regional proportion of ethnic minorities is related to the treatment of the people in a disaster area. Therefore, we think that basic geographic information (BG) is regionally indigenous special information that must be considered when selecting neighbors, and it is independent of the event. To formalize the MH event domain (all types), we build an emergency scenario semantic model – the event emergency scenario ontology – a brief excerpt of which is shown in Fig. 2. In Fig. 2, white ellipses, white squares, gray squares and gray ellipses denote the MH event classes, the MH event, the property value and the property value classes, respectively. In addition, we utilize the geographic ontology as presented in [19] as shown in Fig. 3. From these two ontologies’ metadata specifications, the atomic semantic features of events are extracted. The two ontologies are linked together by the place (where the event occurs) name property. The event emergency scenario ontology and geographic ontology are combined to form the event ontology.

Figure 2.

A micro-excerpt from our ontology of the MH event emergency scenario.

Figure 3.

Geographic ontology introduced in the event ontology.

Due to the great diversity of MHs, our ontology contains multiple hierarchies of classes that represent the domain concepts (typhoon, flood, hail, etc.), as well as specific instances (flood event 1, typhoon event 1, etc.). In addition to the classes and instances of MHs, the ontology includes their properties. It is worth noting that some properties will be typical of a specific hierarchy of events and others will be general. For example, hail events will include the hail diameter while typhoon events do not, whereas the number of victims (data property) is important for all kinds of MHs. Each one of these properties is related to the corresponding MH event by conveniently labeled properties (water_lev_rising_degree, water_lev_declining_degree, regionOfOccurence, diameterOfHail, etc.). All properties (including the data property and object property) are added to the top class – MH – which means that all events share a set of properties.

Figure 4.

A part of a MH event emergency scenario property layered system.

Through the establishment of an event emergency scenario semantic model, we summarized the MH event emergency scenario property layered system, as shown in Fig. 4. The first layer shows the semantic features, the last layer contains the properties (leaf properties) that are the atomic semantic features in the ontology, and a semantic feature includes multiple atomic semantic features. We have more than 200 leaf properties, including simple properties and complex properties. Short text property processing of a complex property is especially important.

4.2 Neighborhood formation – Avoiding the generation of fake neighbors and the omission of true neighbors

We establish the relationships between the target event and the previous event ontology using the shared properties between their instances, that is, between the target event and previous events presenting common atomic semantic features. We start from a set of previous events for the target event. Our goal is to discover events that are similar to the target event. Namely, we identify the previous events whose emergency scenarios and basic geographic information of the area of the accident are similar to the target event needing emergency plan recommendations.

To this end, we propose a metric that quantifies the semantic similarities between a target event and previous events. Our semantic similarity metric is based on discovering the implicit relations between a target event and previous events to match. Then, the stronger the connections are between the two of them, the greater their similarity will be.

We discover the relationships between a target event and previous events that share semantic features (e.g., people, houses, economy, traffic, and infrastructure), each of which includes a series of leaf properties. In particular, our metric distinguishes two types of connections: simple and complex. Simple connections are of two types: those established through a common property value (the two events share a property value, i.e., the same instance in the ontology), and those established through a sibling property value (the two events have property values belonging to the same class in the ontology). For example, between a flood disaster event whose number of disaster victims is 1.2 million and a typhoon event whose number of disaster victims is 1 million, there is a relation through a common property value because the two values agree within the numerical statistical error, whereas between the target event “Fishing vessel is damaged” and a previous event “Seafood production has a loss”, there is a relation through a sibling property value (seafood and fishing vessel belong to the same class in the ontology, which is fishery (loss)). Complex connections are established through short text properties, whose values are short texts that need to be split and processed into composite simple properties. Our metric treats a relation through a common property value as stronger than one through a sibling property value.

Furthermore, a semantic feature affects different target events differently. Consequently, the more important (for the target event) the relations that we can infer between the target event and a previous event are, and the more important those among them that are established through a common property value are, the greater their semantic similarity will be.

Next, we will analyze the strategy of selecting a neighbor from a specific and quantitative point of view. In our method, data preprocessing and our flexible mixed weight-calculation method are used to avoid the generation of fake neighbors and the omission of true neighbors.

4.2.1 Data preprocessing – Short text property processing

The data that are obtained from real-world problems are rarely complete [20] and cannot be directly used. The properties of the events that we obtained are mostly short text properties, and it is very important to split and process a short text property into composite simple properties.

We exploit the semantics when computing events’ short text property similarities, and with the help of a semantic property thesaurus, we split short text (property value) into a content phrase and degree phrase. For a content phrase, we use the phrase’s ancestor information instead of the short text property information itself. Thereby, even if there is no intersection between the two property value phrase sets, they may have high similarity. For example, in computing the similarity between “many oysters damaged” and “a large number of crabs died of anoxia”, there is no common word in these two short text property values, so the similarity is 0 according to a traditional statistical method; however, what they both declare is that seafood suffered heavy losses, and so their similarity is high.

A. Split short text property value

In our method, two data structures, “keyword-candidate list” and “property thesaurus”, are introduced to help split short text properties.

Definition 3: Keyword-candidate list. The keyword-candidate list is a set of keywords about the values for a property $p_{i}$ , which can be denoted as $K_{i}=\{k^{i}_{1},k^{i}_{2},\ldots,k^{i}_{\text{e}}\}$ , where e is the number of the keywords in the keyword-candidate list of $p_{i}$ . Every short text property has its own keyword-candidate list.

An example of a simple keyword-candidate list of the special barriers to the event processing property is described in Table 3. The keywords in the keyword-candidate list can be a word or multiple words that are related to a description of the corresponding short text properties.

Table 3
Keyword-candidate list of special barriers to the event processing property

No.	Keyword	No.	Keyword
1	Low-lying	5	Disaster prone places
2	Downstream	6	Pileup
3	Collapse	7	Toxin
4	Gale sand	8	Mountain area

In this paper, different short text properties extract different keyword-candidate lists from their respective property value sets. Some of the words in a short text property cannot exactly match the corresponding keywords in the keyword-candidate list since they have the same semantics but different expressions. Therefore, we assume that specific property thesauruses are built to support the keyword extraction, and different property thesauruses are built for different short text properties.

Definition 4: Property thesaurus. A property thesaurus is built on the keyword-candidate list and groups words together according to the similarity of their meanings to the keywords. A traditional domain thesaurus includes related words and antonyms [21], whereas our property thesauruses are more specific and contain only words whose similarity exceeds a certain (high) threshold. For example, for “traffic location” and “train station”, the latter is a subset of the former, but here we consider them as different keywords. An example of a simple property thesaurus of special barriers to the event processing property is shown in Fig. 5. The words in the gray rectangles are the keywords in the corresponding keyword-candidate list, and the words in the ovals are the related words of the keywords. Property thesauruses are regularly updated to ensure the timeliness of the words.
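
To make the role of the two data structures more concrete, the following minimal sketch maps a short text property value onto keywords of the keyword-candidate list via a property thesaurus. The keywords come from Table 3, but the related words, the function name `extract_keywords` and the plain substring matching are assumptions for illustration only; the paper does not specify how matching is implemented.

```python
from typing import Dict, List, Set

# Keywords taken from Table 3; the related words are invented for illustration.
PROPERTY_THESAURUS: Dict[str, Set[str]] = {
    "low-lying": {"low-lying", "lowland"},
    "collapse": {"collapse", "cave-in"},
    "mountain area": {"mountain area", "mountainous region"},
}


def extract_keywords(short_text: str, thesaurus: Dict[str, Set[str]]) -> List[str]:
    """Return the keywords whose own form or related words occur in the text."""
    text = short_text.lower()
    found = []
    for keyword, related_words in thesaurus.items():
        # Naive substring test; a real system would tokenize the text first.
        if any(word in text for word in related_words):
            found.append(keyword)
    return found


print(extract_keywords("Roads blocked by a cave-in in a mountainous region",
                       PROPERTY_THESAURUS))
```

In this sketch, two differently worded property values that mention related words of the same keyword are mapped to the same keyword, which is what allows them to be compared even when they share no literal word.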

B. Short text property similarity computation

There are two types of property thesauruses: content and degree. Figure 5 is an example of a content property thesaurus. Degree property thesauruses measure the degree of the content property according to a property description. For a combination of multiple content phrases and a mixed combination of content and degree phrases, we handle these as follows.

Figure 5.

An example of a simple property thesaurus of special barriers to the event processing property.

B.1. Pure content type – Multiple content keywords

According to the links among keywords, there are two types of structures for pure content phrases: coordination and tree. In a coordination structure, the content phrases are juxtaposed, and we change the original property into a composite property including multiple juxtaposed subproperties whose value is yes or no. In a tree structure, there is a whole-part relationship among the content phrases. For example, for the special circumstances of events, traffic locations include railway stations, highways, and subway stations. Because all these words can be organized from top to bottom into a tree-like structure, any word can be reached from any other word through a path in the tree. The relation between two words can be reflected by their path length and their respective depth information. For such a property, we select only the phrase with the maximum depth from the property value phrase set to participate in the property similarity calculation.

Generally, when the length of the path from one word to another is much shorter, the relation of the two words is much closer, and their similarity is greater [22]. For example, the semantic similarity between the two phrases “harbor” and “railway station” is greater than that between “harbor” and “traffic location”. We evaluate the semantic similarity between two words or two concepts as follows:

$\displaystyle\text{sim}(\textit{concept}_{1},\textit{concept}_{2})=\frac{2\times\textit{depth}(\textit{LCS})}{\textit{depth}\left(\textit{concept}_{1}\right)+\textit{depth}\left(\textit{concept}_{2}\right)}$ (1)

where concept indicates a word, and LCS indicates the least common ancestor node of two content phrases.
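
Eq. (1) can be sketched on a toy hierarchy as follows. The hierarchy (a hypothetical root "special circumstance" above the traffic-location words mentioned in the text), the function names and the convention that the root has depth 1 are assumptions of this sketch, not details given in the paper.

```python
from typing import Dict, List, Optional

# child -> parent; the root has parent None. The hierarchy is a toy example.
PARENT: Dict[str, Optional[str]] = {
    "special circumstance": None,
    "traffic location": "special circumstance",
    "railway station": "traffic location",
    "subway station": "traffic location",
    "highway": "traffic location",
}


def path_to_root(concept: str) -> List[str]:
    """Return the concept followed by all of its ancestors up to the root."""
    path = [concept]
    while PARENT[path[-1]] is not None:
        path.append(PARENT[path[-1]])
    return path


def depth(concept: str) -> int:
    """Depth of a concept, counting the root as depth 1 (an assumption)."""
    return len(path_to_root(concept))


def concept_similarity(c1: str, c2: str) -> float:
    """Eq. (1): 2 * depth(LCS) / (depth(c1) + depth(c2))."""
    ancestors_of_c2 = set(path_to_root(c2))
    # The first node on c1's path to the root that is also on c2's path is the LCS.
    lcs = next(node for node in path_to_root(c1) if node in ancestors_of_c2)
    return 2 * depth(lcs) / (depth(c1) + depth(c2))


print(concept_similarity("highway", "railway station"))
```

With the root at depth 1, "highway" and "railway station" both have depth 3 and their LCS "traffic location" has depth 2, so the sketch yields 2 × 2 / (3 + 3) = 2/3. The absolute values depend on the chosen depth convention.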

Figure 6.

Example of composite property – fishery loss property.

B.2. Composite type – Content and degree keywords

A property value may simultaneously refer to content and degree phrases, as shown in Fig. 6. For the property of fishery industry losses (under the semantic feature “economic loss”), the following provides some descriptive information on it.

Event 1: More than 30 fishing boats had to lay off employees.

Event 2: Some oysters had been destroyed.

Event 3: Aquaculture facilities were severely damaged, and aquatic products died.

In this example, the keyword-candidate list of content information has two words: fishing boats and seafood. In addition, the levels of damage to seafood and fishing boats are different.

We divided this property into two parts: content property and degree property.

First, the similarity between content phrases is calculated in the same way as the similarity between pure content properties. Finally, we obtain the similarity of the original property by multiplying the similarities of the two subproperties.

4.2.2 Flexible mixed weight-calculation method

A Flexible Mixed Weight-Calculation method is proposed for the event similarity computation.

Definition 5: Importance weight vector. In the FMWC method, the theoretical importance and data worth of a target event’s semantic features are transformed into an $n$-dimensional weight vector, namely, an importance weight vector, which can be denoted as $W=\left\{w_{1},w_{2},\ldots,w_{n}\right\}$ , where $n$ is the number of semantic features, $p_{i}$ indicates a semantic feature, and $w_{i}$ is the weight of $p_{i}$ for the target event. If the target event does not have any descriptions about the semantic feature (i.e., all of its leaf properties’ values are null), then the weight of the semantic feature is 0, i.e., $w_{i}=0$ . Different target events have different importance weight vectors.

In this paper, we use 3 steps to determine the importance weight of the semantic feature in the semantic feature set for the target event.

Step 1: We use the Analytic Hierarchy Process (AHP) [23] model to determine the theoretical weight vector $Z=\left\{z_{1},z_{2},\ldots,z_{n}\right\}$ of the semantic feature in the semantic feature set for the target event. The Zs of the events of the same disaster type are the same. The theoretical weight computation based on the AHP model is determined as follows.

We construct the pair-wise comparison matrix in terms of the relative theoretical importance between each pair of semantic features. The pair-wise comparison matrix $A=(a_{ij})_{n\times n}$ must satisfy the following properties, where $a_{ij}$ represents the relative importance of semantic feature $i$ with respect to semantic feature $j$:

(1) $a_{ij}=1$ , where $i=j=1,2,\ldots,n$ ;
(2) $a_{ij}=\frac{1}{a_{ji}}$ , where $i,j=1,2,\ldots,n$ and $i\neq j$ ; and
(3) $a_{ij}=\frac{a_{ik}}{a_{jk}}$ , where $i,j,k=1,2,\ldots,n$ and $i\neq j$ .

After checking the consistency of the matrix, we then calculate the theoretical weight using the following function:

$\displaystyle z_{i}=\frac{1}{n}\sum^{n}_{j=1}{\frac{a_{ij}}{\sum^{n}_{k=1}{a_{kj}}}}$ (2)

where $a_{ij}$ is the relative theoretical importance between two semantic features, and $n$ is the number of semantic features.
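
Eq. (2) amounts to normalizing each column of the pair-wise comparison matrix so that it sums to 1 and then averaging each row. A minimal sketch follows; the function name `ahp_weights` and the example matrix are illustrative assumptions, and the consistency check mentioned in the text is deliberately left out.

```python
from typing import List


def ahp_weights(a: List[List[float]]) -> List[float]:
    """Eq. (2): normalize each column to sum to 1, then average each row."""
    n = len(a)
    column_sums = [sum(a[k][j] for k in range(n)) for j in range(n)]
    return [sum(a[i][j] / column_sums[j] for j in range(n)) / n for i in range(n)]


# A perfectly consistent 3x3 matrix built from the importance ratios 4 : 2 : 1.
A = [
    [1.0, 2.0, 4.0],
    [0.5, 1.0, 2.0],
    [0.25, 0.5, 1.0],
]
z = ahp_weights(A)
print(z)
```

For this consistent matrix the resulting weights are proportional to 4 : 2 : 1, i.e., 4/7, 2/7 and 1/7, and they sum to 1. For a matrix that is only approximately consistent, the row averages smooth out the disagreement between columns.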

Step 2: We should consider a target event’s semantic feature’s value worth. Every semantic feature consists of multiple leaf properties, and we treat the percentage of leaf properties with nonempty property values as the semantic feature data worth of the target event. Then, we can obtain a target event’s data worth vector $V=\{v_{1},v_{2},\ldots,v_{n}\}$ .

Step 3: We obtain the semantic feature importance weight vector W of the target event as follows:

$\displaystyle W=\left\{w_{1},w_{2},\ldots,w_{n}\right\}=\frac{Z\cdot V}{\sum^{n}_{i=1}z_{i}\times v_{i}}$ (3)

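where $n$ is the number of event scenario semantic features and $Z\cdot V$ denotes the element-wise product of $Z$ and $V$, so that $w_{i}=z_{i}v_{i}/\sum^{n}_{j=1}z_{j}v_{j}$ .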
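
Steps 2 and 3 can be sketched together as follows. The function names, the example leaf properties and the theoretical weights are hypothetical; the handling of a target event with no described feature at all (all weights set to 0) is an assumption of this sketch, since Eq. (3) is undefined in that case.

```python
from typing import Dict, List, Optional


def data_worth(leaf_values: Dict[str, Optional[object]]) -> float:
    """Step 2: fraction of a semantic feature's leaf properties that are non-null."""
    if not leaf_values:
        return 0.0
    filled = sum(1 for value in leaf_values.values() if value is not None)
    return filled / len(leaf_values)


def importance_weights(z: List[float], v: List[float]) -> List[float]:
    """Step 3 / Eq. (3): w_i = z_i * v_i / sum_j(z_j * v_j)."""
    products = [z_i * v_i for z_i, v_i in zip(z, v)]
    total = sum(products)
    if total == 0:
        # Assumption: with no described feature at all, every weight is 0.
        return [0.0] * len(products)
    return [p / total for p in products]


# Hypothetical target event with three semantic features.
features = [
    {"victims": 1200000, "evacuated": None},          # people: half filled
    {"collapsed_houses": 300, "damaged_houses": 50},  # housing: fully filled
    {"road_damage": None},                            # transportation: empty
]
Z = [0.5, 0.3, 0.2]  # illustrative theoretical weights, e.g. from the AHP step
V = [data_worth(f) for f in features]
W = importance_weights(Z, V)
print(V, W)
```

Here the transportation feature has no described leaf property, so its data worth and importance weight are both 0, and the remaining weight is redistributed over the features that the target event actually describes.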

4.2.3 Neighbor formation – Event similarity computation

Then, the event emergency scenario similarity $\text{sim}\left(\textit{ES}_{0},\textit{ES}_{i}\right)$ based on the FMWC is defined as follows:

$\displaystyle\text{sim}\left(\textit{ES}_{0},\textit{ES}_{i}\right)=\sum^{n}_{j=1}w_{j}\times\text{sim}(P^{0}_{j},P^{i}_{j})$ (4)

where $\text{sim}(P^{0}_{j},P^{i}_{j})$ indicates semantic feature $P_{j}$ ’s similarity (which is determined using all leaf properties’ similarities under semantic feature $P_{j}$ ) between the target event $E_{0}$ and a previous event $E_{i}$ , and $w_{j}$ indicates semantic feature $P_{j}$ ’s weight for the target event (the $j$-th component of $W$, obtained using the FMWC method).

The calculation process of a region’s intrinsic similarity $\text{sim}\left(R_{0},R_{i}\right)$ is similar; however, the weight calculation method is relatively simple (because the rate of missing geographic data is not high), and we directly use a regional semantic feature theoretical weight vector as weight vector.

Finally, we can obtain the event similarity $\text{sim}\left(E_{0},E_{i}\right)$ :

$\displaystyle\text{sim}\left(E_{0},E_{i}\right)=\alpha\,\text{sim}\left(\textit{ES}_{0},\textit{ES}_{i}\right)+\left(1-\alpha\right)\text{sim}\left(R_{0},R_{i}\right)$ (5)

where $\alpha$ is the proportionality coefficient. For the merged processing of $\text{sim}\left(\textit{ES}_{0},\textit{ES}_{i}\right)$ and $\text{sim}\left(R_{0},R_{i}\right)$ , we calculate $\text{sim}\left(E_{0},E_{i}\right)$ by adjusting the coefficient $\alpha$ .
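
Eqs. (4) and (5) can be sketched as a weighted sum followed by a linear blend. All numeric values below, including $\alpha=0.5$, are invented for illustration, and the per-feature similarities are taken as given inputs rather than computed from leaf properties.

```python
from typing import List


def weighted_similarity(weights: List[float], feature_sims: List[float]) -> float:
    """Eq. (4): sum over semantic features of weight_j * sim(P_j^0, P_j^i)."""
    return sum(w * s for w, s in zip(weights, feature_sims))


def event_similarity(sim_es: float, sim_r: float, alpha: float) -> float:
    """Eq. (5): alpha * sim(ES_0, ES_i) + (1 - alpha) * sim(R_0, R_i)."""
    return alpha * sim_es + (1 - alpha) * sim_r


# Emergency scenario: the third feature has weight 0 and is therefore ignored.
sim_es = weighted_similarity([0.5, 0.5, 0.0], [0.8, 0.4, 0.9])
# Region: weights play the role of the regional theoretical weight vector.
sim_r = weighted_similarity([0.25, 0.75], [1.0, 0.5])
sim_e = event_similarity(sim_es, sim_r, alpha=0.5)
print(sim_es, sim_r, sim_e)
```

Because a semantic feature that the target event does not describe receives weight 0, a high similarity on that feature (0.9 in the sketch) cannot inflate the scenario similarity, which is one way in which fake neighbors caused by shared null values are avoided.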

4.3 Calculate ratings and generate recommendations

Based on the similarities of the target event and previous events, further filtering will be conducted. Given a threshold $S_{1}$ , if $\text{sim}\left(E_{0},E_{i}\right)<S_{1}$ , the previous event $E_{i}$ will be filtered out, and otherwise $E_{i}$ will be retained.
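
The filtering step can be sketched as follows. The threshold rule is taken from the text; keeping only the $k$ most similar retained events is an assumption about how the $k$ neighbors used in Eq. (6) are chosen, and the function name and values are hypothetical.

```python
from typing import Dict, List, Tuple


def select_neighbors(similarities: Dict[str, float], s1: float, k: int) -> List[Tuple[str, float]]:
    """Drop events with sim(E_0, E_i) < s1, then keep the k most similar ones."""
    retained = [(event_id, sim) for event_id, sim in similarities.items() if sim >= s1]
    retained.sort(key=lambda pair: pair[1], reverse=True)
    return retained[:k]


# Illustrative similarities between the target event and four previous events.
sims = {"E1": 0.82, "E2": 0.35, "E3": 0.67, "E4": 0.91}
neighbors = select_neighbors(sims, s1=0.5, k=2)
print(neighbors)
```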

Once the set of the most similar events is found, the personalized ratings of each candidate emergency plan for the target event can be calculated. Finally, a highly targeted emergency plan recommendation list will be presented for the target event and the emergency plan(s) with the highest rating(s) will be recommended.

Here, we use a weighted average approach to calculate the rating preScore of an emergency plan for the target event.

$\displaystyle\text{preScore}(\text{Plan}_{x})=\sum^{k}_{i=1}\beta_{\textit{top}_{i}x}\,\text{sim}\left(E_{0},E_{\textit{top}_{i}}\right)\times\textit{score}_{\textit{top}_{i}x}$ (6)

where $E_{\textit{top}_{i}}$ ($i=1,2,\ldots,k$) denotes the $k$ neighbors of the target event, $\text{sim}\left(E_{0},E_{\textit{top}_{i}}\right)$ is the event similarity between the target event and its neighbor, and $\textit{score}_{\textit{top}_{i}x}$ is the score that the previous event $E_{\textit{top}_{i}}$ gave to emergency plan $x$. In Eq. (6), $\textit{score}_{\textit{top}_{i}x}$ and $\beta_{\textit{top}_{i}x}$ are calculated using Eqs (7) and (8), respectively. Specifically, $\beta_{\textit{top}_{i}x}$ is the normalization coefficient.

$\displaystyle\textit{score}_{\textit{top}_{i}x}=\left\{\begin{array}{ll}0, & \text{if event}\ E_{\textit{top}_{i}}\ \text{did not reference or adopt plan}\ x\\ \text{the score (0--10) given by}\ E_{\textit{top}_{i}}\ \text{to plan}\ x, & \text{if event}\ E_{\textit{top}_{i}}\ \text{referenced or adopted plan}\ x\end{array}\right.$ (7)

$\displaystyle\beta_{\textit{top}_{i}x}=\frac{\lambda_{i}}{\sum^{k}_{i=1}\lambda_{i}\,\text{sim}\left(E_{0},E_{\textit{top}_{i}}\right)\times\xi_{\textit{top}_{i}x}}$ (8)

where $\lambda_{i}$ is the proportionality coefficient reflecting the different importance of the two types of scores (referenced and adopted). If the previous event did not reference emergency plan $x$, then $\xi_{\textit{top}_{i}x}=0$; otherwise, $\xi_{\textit{top}_{i}x}=1$.
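Substituting Eq. (8) into Eq. (6) gives a similarity-weighted average over the neighbors that referenced or adopted plan $x$. The sketch below illustrates this under stated assumptions: the function name `pre_score` is hypothetical, a missing score is encoded as `None` (so that $\xi=0$), all $\lambda_{i}$ are set to 1 in the example, and the numbers are invented for illustration.

```python
from typing import Optional, Sequence


def pre_score(
    sims: Sequence[float],
    scores: Sequence[Optional[float]],
    lambdas: Sequence[float],
) -> Optional[float]:
    """Predicted rating of one plan following Eqs (6)-(8).

    sims[i]    -- sim(E_0, E_top_i) for neighbor i
    scores[i]  -- neighbor i's score (0-10) for the plan, or None if the
                  neighbor did not reference or adopt it (xi = 0)
    lambdas[i] -- proportionality coefficient lambda_i
    """
    numerator = 0.0
    denominator = 0.0
    for sim, score, lam in zip(sims, scores, lambdas):
        if score is None:
            continue  # score = 0 and xi = 0: no contribution to either sum
        numerator += lam * sim * score
        denominator += lam * sim  # xi = 1 for this neighbor
    if denominator == 0.0:
        return None  # no neighbor referenced or adopted the plan
    return numerator / denominator


# Hypothetical example: three neighbors, the second one never used the plan.
example_score = pre_score(
    sims=[0.9, 0.8, 0.6],
    scores=[8.0, None, 6.0],
    lambdas=[1.0, 1.0, 1.0],
)
```

In this example the prediction is (0.9 × 8 + 0.6 × 6) / (0.9 + 0.6) = 7.2, so the neighbor that did not use the plan neither raises nor lowers the rating.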

Repeating these steps, we can calculate the predicted ratings of all candidate emergency plans for the target event. We then rank the emergency plans by their ratings and present a highly targeted emergency plan recommendation list. Without loss of generality, we assume that plans with higher ratings are more useful for the event. Thus, the plan(s) with the highest rating(s) are recommended for the target event. Alternatively, we can recommend the Top-p plans for the event. Table 4 shows the basic algorithm of SACF.

Table 4

The process of SACF

Input: The property values of the target event $E_{0}$; candidate emergency plans $\textit{EP}=\{ep_{1},ep_{2},\ldots,ep_{Z}\}$; previous events $\textit{PE}=\{pe_{1},pe_{2},\ldots,pe_{z}\}$ that adopted or referenced the corresponding emergency plans; previous event $pe_{i}$'s scores for the emergency plans $\textit{Score}=\{s_{i1},s_{i2},\ldots,s_{iq}\}$; threshold $S_{1}$ in the filtering phase

Output: The Top-k emergency plans with the highest ratings $\textit{TP}=\{tp_{1},tp_{2},\ldots,tp_{k}\}$

Process begin:

1. Draw event emergency scenario semantic feature importance weight vector $\text{W1}=\{\omega_{11},\omega_{12},\ldots,\omega_{1n}\}$ for $E_{0}$
2. Draw regional semantic feature weight vector $\text{W2}=\{\omega_{21},\omega_{22},\ldots,\omega_{2m}\}$
3. $\text{W}=\{\text{W1},\text{W2}\}$
4. for each $ep_{i}\in\textit{EP}$ ($i=1,\ldots,Z$) do
5. Neighbor $=\Phi$, $r=0$, $\textit{sw}=0$
6. for each $E_{t}\in\textit{PE}$ do
7. $\text{sim}\left(E_{0},E_{t}\right)=\sum^{n+m}_{j=1}\omega_{j}\times\text{sim}(P^{0}_{j},P^{t}_{j})$
8. if $\text{sim}\left(E_{0},E_{t}\right)\geq S_{1}$, then
9. insert $\{E_{t},s_{ti}\}$ into Neighbor
10. end if
11. end for
12. for each $\{E_{t},s_{ti}\}\in$ Neighbor do
13. $r=r+\text{sim}\left(E_{0},E_{t}\right)\times s_{ti}$
14. $\textit{sw}=\textit{sw}+\text{sim}\left(E_{0},E_{t}\right)$
15. end for
16. $\text{preScore}_{i}=\frac{r}{\textit{sw}}$
17. end for
18. EP $\leftarrow$ RankByPreScoreDescent(EP)
19. TP $\leftarrow$ SelectTopk(EP, k)
20. return $\textit{TP}=\{tp_{1},tp_{2},\ldots,tp_{k}\}$

end process
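The process in Table 4 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not our implementation: the event similarities are assumed to be precomputed, the function name `recommend_top_k` and the dictionary layout are hypothetical, events at or above the threshold $S_{1}$ are retained (as in Section 4.3), and the example data are invented.

```python
from typing import Dict, List, Tuple


def recommend_top_k(
    event_sims: Dict[str, float],
    event_scores: Dict[str, Dict[str, float]],
    threshold: float,
    k: int,
) -> List[Tuple[str, float]]:
    """Rank candidate plans by the similarity-weighted average of neighbor scores.

    event_sims   -- sim(E_0, E_t) for every previous event (assumed precomputed)
    event_scores -- for every previous event, the scores of the plans it
                    referenced or adopted
    threshold    -- S_1; events below it are filtered out
    """
    # Filtering phase: keep only sufficiently similar previous events.
    neighbors = {e: s for e, s in event_sims.items() if s >= threshold}

    # Candidate plans are those referenced or adopted by at least one neighbor.
    plans = sorted({p for e in neighbors for p in event_scores.get(e, {})})

    ranked: List[Tuple[str, float]] = []
    for plan in plans:
        r = 0.0   # accumulated sim * score
        sw = 0.0  # accumulated sim (sum of weights)
        for event, sim in neighbors.items():
            score = event_scores.get(event, {}).get(plan)
            if score is None:
                continue  # this neighbor did not use the plan
            r += sim * score
            sw += sim
        if sw > 0.0:
            ranked.append((plan, r / sw))

    # Rank by predicted score in descending order and keep the top k.
    ranked.sort(key=lambda item: item[1], reverse=True)
    return ranked[:k]


# Hypothetical data: three previous events and three candidate plans.
sims = {"pe1": 0.9, "pe2": 0.7, "pe3": 0.3}
scores = {
    "pe1": {"ep1": 9.0, "ep2": 4.0},
    "pe2": {"ep2": 8.0, "ep3": 6.0},
    "pe3": {"ep3": 10.0},
}
top_plans = recommend_top_k(sims, scores, threshold=0.5, k=2)
```

With the assumed threshold of 0.5, `pe3` is filtered out, so its score of 10 for `ep3` does not influence the ranking. The two recommended plans are `ep1` followed by `ep3`.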

5. Experiment

For each MH event in the MH case library, our algorithm produced a list of five ranked emergency plans. We then randomized the order within each group of five to obscure the algorithm's ranking decisions. Three domain experts independently evaluated the relevance and quality of each emergency plan with respect to its MH event. This process induced a human-generated ranking, which we then compared to the algorithm's rank order. The experts used a rating scale from 1 to 5, where a score of 5 is the best and means that the recommended emergency plan is very useful for the event and of very high quality.

The evaluation factors include two parts: the quality of the emergency plan (integrity, maneuverability, etc.) and the correlation degree between the emergency plan and the target event.

5.1 Metrics

We used two metrics to evaluate the relevancy and quality of our recommendations: the NDCG and k-precision.

The discounted cumulative gain (DCG) [24] is a measure of ranking quality. Highly relevant and high-quality emergency plans are more useful when they appear earlier in the list, and in that case the DCG is higher. The principle of the DCG is that if high-quality results fall behind low-quality results, their contribution to the score is discounted. The DCG accumulated over $p$ results is defined as follows [25]:

$\displaystyle\textit{DCG}_{p}=\sum^{b-1}_{i=1}\textit{rel}_{i}+\sum^{p}_{i=b}\frac{\textit{rel}_{i}}{\log_{b}i}$ (9)

where $\textit{rel}_{i}$ is the gain associated with the emergency plan at position $i$ . The base $b$ of the logarithm determines how sharp the applied discount is, and here we set $b$ to 2.

We use a graded relevance and quality scale of 1, 2, 3, 4, and 5, and compute the DCG of the ranked recommendations for each MH event. The ideal value of the DCG (IDCG) is defined as the DCG of the ideal ranking as judged by the experts. To obtain the IDCG, we sort the recommendations in decreasing order of the relevance and quality scores given by the experts and compute the DCG of the sorted ranking [25]. This corresponds to the maximum theoretically possible DCG of any ranking of the recommendations for that event. We normalize the DCG of our ranking using the IDCG to obtain the Normalized DCG (NDCG):

$\displaystyle\text{NDCG}(i)=\frac{\text{DCG}(i)}{\text{IDCG}(i)}$ (10)

Here, we have 5 results, and we report NDCG(5) as the NDCG, which is the overall rating for the ranking. In the worst case, the 5 emergency plans recommended for a target event receive expert grades of 1, 2, 3, 4, and 5 in the order in which our method ranked them (where 5 is the best), whereas the ideal order is 5, 4, 3, 2, and 1. In this case, $\min(\text{NDCG})=\text{NDCG}(5)=0.734060709$. To show the results more clearly, we rescale the NDCG to the range from 0 to 1:

$\displaystyle\text{NDCG}(i)=\frac{\frac{\text{DCG}(i)}{\text{IDCG}(i)}-\min(\text{NDCG})}{1-\min(\text{NDCG})}$ (11)
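The following Python sketch illustrates Eqs (9) to (11) with $b=2$ and reproduces the worst-case value quoted above. The function names are hypothetical and the snippet is only a worked check of the arithmetic, not the evaluation code used in our experiments.

```python
import math
from typing import Sequence


def dcg(gains: Sequence[float], base: float = 2.0) -> float:
    """Eq. (9): ranks below `base` are not discounted; later ranks are
    divided by log_base(rank)."""
    total = 0.0
    for rank, gain in enumerate(gains, start=1):
        if rank < base:
            total += gain
        else:
            total += gain / math.log(rank, base)
    return total


def ndcg(gains: Sequence[float], base: float = 2.0) -> float:
    """Eq. (10): DCG of the given order divided by the DCG of the ideal order."""
    ideal = dcg(sorted(gains, reverse=True), base)
    return dcg(gains, base) / ideal if ideal > 0 else 0.0


def adjusted_ndcg(gains: Sequence[float], min_ndcg: float, base: float = 2.0) -> float:
    """Eq. (11): rescale NDCG so that the worst ranking maps to 0 and the
    ideal ranking maps to 1."""
    return (ndcg(gains, base) - min_ndcg) / (1.0 - min_ndcg)


# Worst case for five plans graded 1..5: the best plan is ranked last.
worst_ndcg = ndcg([1, 2, 3, 4, 5])

# After rescaling, the worst ranking scores 0 and the ideal ranking scores 1.
worst_adjusted = adjusted_ndcg([1, 2, 3, 4, 5], worst_ndcg)
ideal_adjusted = adjusted_ndcg([5, 4, 3, 2, 1], worst_ndcg)
```

Here `worst_ndcg` evaluates to approximately 0.73406, which agrees with the min(NDCG) value used in Eq. (11).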

The precision of the top-$k$ list is $P=\frac{\textit{TP}}{k}$, where $k$ is the length of the recommended list and TP is the number of plans among the recommended top-$k$ plans whose reference value exceeds a certain threshold.
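A minimal sketch of this precision computation is given below. The function name `precision_at_k`, the example ratings, and the threshold of 3 are illustrative assumptions, and "exceeds" is interpreted as strictly greater than the threshold.

```python
from typing import Sequence


def precision_at_k(reference_values: Sequence[float], k: int, threshold: float) -> float:
    """Fraction of the top-k recommended plans whose reference value
    strictly exceeds the threshold (P = TP / k)."""
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = reference_values[:k]
    tp = sum(1 for value in top_k if value > threshold)
    return tp / k


# Hypothetical expert ratings for a recommended list, in ranked order.
ratings = [5, 4, 2, 3, 1]
p_at_5 = precision_at_k(ratings, k=5, threshold=3)
p_at_3 = precision_at_k(ratings, k=3, threshold=3)
```

In this example, two of the five plans exceed the assumed threshold, so the top-5 precision is 0.4, and two of the first three do, so the top-3 precision is 2/3.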

5.2 Results

We compare SACF with a variant of SACF that does not use the FMWC method (SACF_NW in the results figures). As shown in Fig. 7, the precision of SACF is generally greater than 0.7 (except for sandstorms) and is higher than that of SACF_NW under the same conditions. For sandstorm events, the precision is low because emergency plans that are useful for sandstorm events are rare in the event database. Nevertheless, most of the useful emergency plans appear near the front of the SACF recommendation list, which indicates that the SACF recommendations are good.

Figure 7.

Comparison of the SACF and SACF_NW precision of the Top-p ( $p=$ 3, 5, 7, and 9) recommendation lists under different disaster types.

We group the target events by different properties: the number of disaster victims, the number of collapsed houses, and the number of destroyed houses. Then, we calculate the top-10 precision of every group. As shown in Fig. 8, the precision of SACF is generally high across the groups.

Figure 8.

SACF precision of the Top-10 recommendation list for different groups.

In Fig. 9, SACF's NDCG of the Top-5 recommendation list under different disaster types is high, which indicates that the order of the plans recommended by SACF is reasonable. Furthermore, SACF's NDCG is higher than that of SACF_NW, which does not use the FMWC method. This suggests that the FMWC method is useful for improving the recommended results.

Figure 9.

SACF and SACF_NW’s NDCGs of the Top-5 recommendation list under different disaster types.

We cannot consider recall since we do not know the entire set of relevant emergency plans.

If only the features of disaster hazard factors are considered while searching for neighbors, only neighbors of the same type as the target event can be found. In a SACF recommendation, however, the neighbors of a flood event may be typhoon events, whereas a flood event in a large city may not be a neighbor of a flood event in a small village because the difference between the basic regional information of the two events is too great. This shows that SACF can avoid both the generation of fake neighbors and the omission of true neighbors.

In addition, removing some properties makes it impossible to find hidden referential neighbors. For example, agricultural loss is a relatively minor property, but after removing it, it is difficult to find a neighbor whose degree of agricultural loss is similar to that of the target event. In a SACF recommendation, however, the "Emergency plan for agricultural natural disasters" that was adopted by a typhoon event (a neighbor of the target event with a similar degree of agricultural loss) is recommended to a snow event (i.e., the target event) that occurred in Hengyang.

In the detailed recommendation results, SACF recommended Longyan's preventative flood emergency plan to a typhoon event that occurred in Nantong. Although the two places are in different regions and the event types are also different, the flood emergency plan clearly has some referential value for the typhoon event. A traditional keyword retrieval method and the case-based reasoning method of Zhang cannot obtain these results, which shows the advantage of SACF.

6. Conclusions and future work

In this paper, we have proposed a semantic-aware collaborative filtering recommendation method for emergency plans, named SACF. In SACF, a semantic model is used to represent an MH event emergency scenario, and a collaborative filtering algorithm is adopted to generate appropriate recommendations. More specifically, we avoid omitting true neighbors by using property generalization, and we provide keyword-candidate lists, a property thesaurus, and semantic knowledge to help compute short text property similarities. A weight calculation method called the FMWC is proposed to avoid the generation of fake neighbors and the omission of true neighbors and to improve the accuracy and rationality of the recommendation algorithm. Our method aims to provide a highly targeted emergency plan recommendation list and recommend the most appropriate plan(s) for the events. Finally, the experimental results demonstrate that SACF has satisfactory accuracy.

The work presented here is a first step, and many opportunities for future work remain. First, the development of an event is dynamic, and the time period over which data are collected has a great impact on the results. Nevertheless, we consider only a static emergency scenario at a certain point in time. A follow-up study can focus on the representation and similarity computation of a dynamic emergency scenario. Second, the content of the emergency plan is also related to the recommendation. In practice, lessons are drawn from only part of an emergency plan. Therefore, our follow-up research will split the MH emergency plan and recommend the appropriate portion of the emergency plan.

Acknowledgments

This work was supported by the National Natural Science Foundation of China under Grant No. 61672102, the National Social Science Foundation of China under Grant No. BCA150050, and the Program for New Century Excellent Talents in University of the Ministry of Education of China under Grant No. NCET-10-0239.

References

1. Ribault and Wainer, Simulation processes in the cloud for emergency planning, Proceedings of the 2012 12th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (ccgrid 2012), 2012.

2. Sergio and Pennock, Social media in emergency management advances, challenges and future directions, 2015 Annual IEEE Systems Conference (SysCon) Proceedings, 2015.

3. Zhong S.B. et al., Emergency Management System based generation of Incident Action Plans, 2013 10th International Conference on Fuzzy Systems and Knowledge Discovery (FSKD), 2013.

4. Chen, Cui B.G. and Zhang, A Method of XML Containing Data State in Emergency Plan Management Information System, 2012 International Conference on Management of e-Commerce and e-Government, 2012.

5. Wang W.J. et al., Emergency plan process ontology and its application, 2010 2nd International Conference on Advanced Computer Control, 2010.

6. Burke, Integrating knowledge-based and collaborative-filtering recommender systems, Proceedings of the Workshop on AI and Electronic Commerce, 1999.

7. Yang W.J., Wang Z.B. and You M.Y., An improved collaborative filtering method for recommendations generation, 2004 IEEE International Conference on Systems, Man and Cybernetics, 2004.

8. Gallina et al., A review of multi-risk methodologies for natural hazards: Consequences and challenges for a climate change impact assessment, Journal of Environmental Management 168 (2016), 123–132.

9. Martin-Vicente M.I., Gil-Solla, Ramos-Cabrer et al., A semantic approach to avoiding fake neighborhoods in collaborative recommendation of coupons through digital TV, IEEE Transactions on Consumer Electronics 56(1) (2010), 54–62.

10. Liu and Li X.F., A hybrid collaborative filtering method for multiple-interests and multiple-content recommendation in E-Commerce, Expert Systems with Applications 8(1) (2005), 67–77.

11. Harispe et al., Semantic similarity from natural language and ontology analysis, Synthesis Lectures on Human Language Technologies 8(1) (2015), 1–254.

12. Muhammad Usman Shahid et al., Macroserv: A route recommendation service for large-scale evacuations, IEEE Transactions on Services Computing 10(4) (2015), 589–602.

13. Zhang Y.J., Research on aid-design and evaluation Problems of contingency Plans, Ph.D. Dissertation, Dalian University of Technology, 2011.

14. Research on Construction of Emergency Plans and Emergency Decision Based on Icons and Workflows, Doctoral Dissertation, Zhejiang University, 2016.

15. Verstrepen and Goethals, Unifying nearest neighbors collaborative filtering, Proceedings of the 8th ACM Conference on Recommender systems, ACM, 2014.

16. Kim Y.S. and Yum B.J., Recommender system based on click stream data using association rule mining, Expert Systems with Applications 38(10) (2011), 13320–13327.

17. Liu H.F. et al., A new user similarity model to improve the accuracy of collaborative filtering, Knowledge-Based Systems 56 (2014), 156–166.

18. Estola, When Recommendation Systems Go Bad, the 10th ACM Conference ACM, 2016.

19. Jing D.S., S.W. and Wu, Geospatial information services on the basis of agent and OWL-S, Proceedings. 2005 IEEE International Geoscience and Remote Sensing Symposium 2, 2005.

20. García, Luengo and Herrera, Tutorial on Practical Tips of the Most Influential Data Preprocessing Algorithms in Data Mining, Knowledge-Based Systems (2016), 1–29.

21. G.W. et al., Keyword extraction of web pages based on domain thesaurus, 2014 IEEE 3rd International Conference on Cloud Computing and Intelligence Systems, 2014.

22. Z.B. and Palmer, Verb Semantics and Lexical Selection, Acl Proceedings of Annual Meeting on Association for Computational Linguistics (1994), 133–138.

23. Saaty T.L., Analytic hierarchy process, Wiley StatsRef: Statistics Reference Online, 2014.

24. Dupret, Discounted Cumulative Gain and User Decision Models. String Processing and Information Retrieval, Springer Berlin Heidelberg, 2011.

25. Jayashree and Christy, Improving the Enhanced Recommended System Using Bayesian Approximation Method and Normalized Discounted Cumulative Gain, Procedia Computer Science 50 (2015), 216–222.
