Abstract
In recent times, frequent natural disasters have caused widespread disruption to life and property. Although preventing such disasters may be a lost cause, emerging technologies can help minimize their impact. This study proposes a deep learning-based computer vision and crowdsourcing methodology for the detection and estimation of flood depths, floods being among the most disruptive disasters. State-of-the-art flood detection systems rely on satellite or radar images. This research instead processes images captured at random from flood-ravaged zones by smartphones or digital cameras. The crowdsourced collection of flood-scene images affords better coverage and more diverse perspectives for assessing flood devastation. This paper proposes a fuzzy logic-based algorithm, together with color-based image segmentation, to estimate the extent of flooding from crowdsourced images. These methods classify flooded areas into high, medium, or low levels of flooding, to facilitate cost-effective, time-critical rescue operations. The algorithm yielded an accuracy of 83.1% on our dataset.
Introduction
Recent times have seen exponential increases in the incidence of natural disasters such as cyclones, earthquakes and floods [1]. Human losses due to flooding alone are anticipated to increase by 70 to 80 percent [2] over pre-industrial levels, driven by global warming [3]. There is an urgent need for reliable surveillance systems that can support disaster management before and after natural disasters occur. For instance, an automated flood monitoring system [4], [5] could analyze data acquired from the affected areas and provide timely, relevant information to forewarn local residents about the surveilled area's susceptibility to floods. Doing so entails the collection and analysis of crowdsourced flood image data, which enable an in-depth study of the flood incident. Assessment of flood events includes estimation of dynamic factors such as water depth, velocity of floodwater flow, and projected duration of the flood event. For this purpose, a large data set of images was collected at random from various flood events and the internet, and analyzed using computer vision [6] algorithms. A computer vision algorithm [7] was developed to estimate flood depth from image data. Unlike synthetic aperture radar (SAR) or unmanned aerial vehicle (UAV) images, crowdsourced images contain reference objects, such as vehicles and humans, that serve as parameters for estimating the depth of floodwater. These reference objects help rank flooded areas by the need for urgent action. During flood emergencies, such ranking information helps disaster relief and rescue teams act immediately in the worst affected areas. Accurate, timely notification of the level of flooding in the affected areas helps rescue teams plan the overall relief, rescue and recovery logistics.
Stratification of floodwater data is imperative for the appraisal of the effectiveness of mitigation efforts – clearing the debris and widening the water-flow channels.
Contribution
The key contributions of this paper are the following:
1) Common objects such as humans and vehicles are used as reference objects for estimating water levels, i.e. water depths.
2) A deep learning approach is applied for improved accuracy.
3) The concept of the Golden ratio [8] is introduced for accurate estimation of human height.
4) The decision-making algorithm is implemented in fuzzy logic, using linguistic variables.
This paper is organized as follows. Related work is covered in Section 2, and Section 3 discusses the study area and dataset. Section 4 describes the four algorithms whose results are combined to estimate the extent of flooding. Sections 5 and 6 discuss the experimental results and the performance evaluation, respectively. Section 7 concludes the paper with closing remarks and scope for future work.
Related work
The ubiquity of smartphones [9] has made crowdsourcing a handy tool for timely acquisition of vast amounts of textual, graphic and visual information that afford valuable raw data for subsequent processing. Various flood monitoring and management systems have been developed, but very few use crowdsourced data [10, 11]. [12] suggested a novel crowdsourcing approach for urban flood management, through which pertinent information on rescue, relief and flood status can be disseminated in real time. Researchers have investigated the prevalence and importance of crowdsourcing in diverse scenarios, but leveraging such information to determine flood depth is new. [13] proposed an algorithm for the extraction and analysis of social media posts during a flood event. Analysis of crowdsourced images to estimate flood extent, using deep learning and fuzzy logic-based computer vision algorithms, is a novel idea. In a recent study, [4] estimated the extent of flooding by analyzing crowdsourced images; the algorithm involved two major steps: 1) water segmentation and 2) face detection. [14] proposed a water ruler method that detects water level by comparing virtual markers in images captured by closed-circuit television (CCTV). In [5], geotagged images of flood scenes were feature-matched with a reference image, using the scale-invariant feature transform (SIFT) algorithm, to estimate the flood line. The major drawback of these methods is that they are tailored to a specific scene and cannot be generalized.
Recent approaches used in the design of flood detection were based on machine learning and deep learning, investigated by [15, 16]. [15] reviewed 300 images of natural scenes that constituted a dataset, for training deep neural networks. The major drawback of [15] is that it is restricted to natural scenes. [16] compared machine learning algorithms, for the classification of partially submerged vehicles in flood scene images, towards the estimation of flood depth.
Various mobile applications are being developed for monitoring and management of natural disasters such as cyclones, floods and earthquakes. These apps take visual data captured on smartphones as input. Visual scene understanding [17], [18] is a key component in the processing of such data. Image segmentation is a prerequisite for identifying objects [19] in a visual scene. Image segmentation algorithms have evolved from edge detection and edge linking [20] schemes to those based on deep learning.
Study area and data set
Our model for estimating flood extent needs to be tested and validated using crowdsourced images. A total of 320 crowdsourced images were processed to identify whether they belonged to flooded or non-flooded scenarios; 180 images showed flooded areas, and the remaining 140 showed non-flooded areas. The 180 flood images are a mix of scenes: a few with neither humans nor vehicles present, some with humans only, others with vehicles only, and some with both humans and vehicles present.
Methodology
The proposed method involves application of computer vision and deep learning techniques to crowdsourced images, for estimation of the extent of floods. The results of brown color intensity, largest brown-colored area, and depth of water with respect to the detected reference parameters i.e. human and vehicle, have been combined to estimate the extent of flood. Our flood extent estimation model is primarily comprised of four modules:
1) Largest brown-colored area segmentation
2) Estimation of floodwater depth, using human height as reference
3) Floodwater depth estimation with vehicle height as reference
4) Fuzzy module for estimation of flood extent
Figure 1 illustrates the components involved in estimating the floodwater depth. The proposed method analyzes the input image (Fig. 2) to examine the major objects in the scene, such as humans, vehicles, and the brown-colored area. Gamma correction [21] is used to vary the brightness of the image, as shown in Fig. 3; gamma-encoded images store tones more efficiently. The largest brown-colored area is then segmented, and the floodwater depth is estimated using the average heights of the humans and vehicles present in the flood scene as reference benchmarks. The average brown color intensity and the largest brown-colored area in the input image are then analyzed using the rule set generated in Subsection 4.4. Finally, the extent of flood is estimated by fuzzy-based decision making, via the procedure proposed in Algorithm 4.
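The gamma correction step mentioned above can be sketched as a lookup table over 8-bit values. This is only an illustrative sketch: the power-law convention (output = 255 * (input / 255)^(1 / gamma), so that gamma > 1 brightens) and the function names `gamma_lut` and `gamma_correct` are assumptions, and the gamma value actually used in this work is not stated.

```python
def gamma_lut(gamma):
    """Build a 256-entry lookup table for 8-bit gamma correction.

    Assumed convention: out = 255 * (in / 255) ** (1 / gamma),
    so gamma > 1 brightens mid-tones and gamma < 1 darkens them.
    """
    if gamma <= 0:
        raise ValueError("gamma must be positive")
    return [round(255 * (v / 255) ** (1.0 / gamma)) for v in range(256)]


def gamma_correct(image, gamma):
    """Apply the lookup table to a gray-scale image given as a list of rows."""
    lut = gamma_lut(gamma)
    return [[lut[v] for v in row] for row in image]


# Hypothetical 2x3 gray-scale patch, brightened with an assumed gamma of 2.0.
patch = [[0, 64, 128], [192, 255, 32]]
brightened = gamma_correct(patch, 2.0)
```

Black and white are left unchanged by the table, while intermediate tones are shifted, which is why the step changes apparent brightness without clipping the range.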
Algorithm: Extent of Flood. Input: image img. Functions: Preprocess, LargestBrownAreaSegmentation, AvgIntensity, WaterDepth.
Proposed method for estimation of extent of flood.
Input image.
Gamma corrected.
The brown-colored area in the input image, which is assumed to be the region of water, has to be segmented out. This region has to be further taken for estimation of brown color intensity. This is the region that is compared with humans and vehicles detected in the image for estimation of the extent of water. The detailed process steps for this module are given in Algorithm 4.1.
Water Area
Water segmentation.
Thresholding.
Largest contour.
The brown-colored area in the input image is masked by setting a threshold value for the pixels corresponding to the brown color, as shown in Fig. 4. The largest area within the detected brown-colored area is then segmented for further processing. Thresholding [22] is one of the simplest segmentation techniques for gray-scale images. Both binary and Otsu's thresholding schemes [23] were evaluated, as shown in Fig. 5. If the image is bimodal, i.e. its histogram has two peaks, Otsu's method automatically selects a threshold between those peaks by minimizing the within-class variance of the two resulting pixel groups; otherwise binary thresholding is used, with the threshold value provided by the user.
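To make the Otsu step concrete, the following is a minimal pure-Python sketch of threshold selection on 8-bit gray values. It maximizes the between-class variance, which is equivalent to minimizing the within-class variance. The function names `otsu_threshold` and `binarize` are hypothetical, and an image library would normally be used instead of this hand-written loop.

```python
def otsu_threshold(pixels):
    """Return the 8-bit threshold t that maximizes between-class variance.

    Pixels with value <= t form one class and pixels > t form the other.
    On ties, the lowest such t is returned.
    """
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0      # number of pixels in the lower class
    sum0 = 0    # sum of gray values in the lower class
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, variance undefined
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        between_var = w0 * w1 * (m0 - m1) ** 2
        if between_var > best_var:
            best_var, best_t = between_var, t
    return best_t


def binarize(pixels, t):
    """Binary thresholding with a given (user-provided or Otsu) threshold."""
    return [1 if p > t else 0 for p in pixels]


# Hypothetical bimodal sample: a dark cluster and a bright cluster.
sample = [10] * 50 + [200] * 50
t = otsu_threshold(sample)
mask = binarize(sample, t)
```

For a histogram with two well-separated peaks, any threshold between the peaks gives the same class split, so the exact value returned depends on the tie-breaking rule.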
Find contours and the largest contour
Human body segmentation.
On completion of image thresholding, sets of continuous points are found, each set forming a contour, as shown in Fig. 6. Horizontal, vertical and diagonal segments are not stored point by point; contours are formed using only the end points of the line segments. The external contours obtained from the thresholded image are stored as a list. The area of each contour is calculated, and the contour with the largest area is selected. The ratio of the largest contour area to the overall size of the image is defined as the water area of the image. If the largest contour area is less than a threshold of 15000 pixels (preset by trial and error), the system directly predicts no flood and exits, because water area is taken as the most significant parameter in flood estimation. Otherwise, the average of the segmented image pixels is defined as the average brown color intensity and fed to the fuzzy module for estimation of flood extent. The details are shown in Algorithm 7.
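The water-area check can be illustrated with a small sketch. As a simplification, it uses the pixel count of the largest 4-connected region of the binary mask as a stand-in for the area of the largest external contour; this substitution, and the names `largest_region_area` and `water_area`, are assumptions made for illustration. Only the 15000-pixel threshold comes from the text.

```python
from collections import deque

MIN_WATER_PIXELS = 15000  # preset threshold quoted in the text


def largest_region_area(mask):
    """Pixel count of the largest 4-connected region of 1s in a binary mask."""
    rows = len(mask)
    cols = len(mask[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    best = 0
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill from this unvisited water pixel.
            seen[r][c] = True
            queue = deque([(r, c)])
            size = 0
            while queue:
                y, x = queue.popleft()
                size += 1
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            best = max(best, size)
    return best


def water_area(mask, min_pixels=MIN_WATER_PIXELS):
    """Largest region as a fraction of the image, or None for 'no flood'."""
    rows = len(mask)
    cols = len(mask[0]) if rows else 0
    area = largest_region_area(mask)
    if rows == 0 or cols == 0 or area < min_pixels:
        return None
    return area / (rows * cols)


# Hypothetical 4x5 mask with a 6-pixel region and a 2-pixel region.
demo = [
    [1, 1, 0, 0, 1],
    [1, 1, 0, 0, 1],
    [1, 1, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
```

With the default threshold the tiny demo mask is rejected as a non-flood scene; lowering `min_pixels` shows the ratio that would otherwise be passed on as the water area.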
Brown Color Intensity
This subsection describes the estimation of water level with respect to the height of human beings [19] in the input image. To segment the human body, the faces are detected first, as illustrated in Fig. 7. The proportionately drawn rectangles below the detected human faces [24] are assumed to be the bodies belonging to those faces. The human body is divided into three regions, and the region with the maximum water area decides where the water level stands relative to the body. The details are given in Algorithm 4.2.
Water Depth w.r.t Human
Human body segmentation
Tallest human segmented into levels.
After all the faces in a scene are detected, the face with the maximum y coordinate (the tallest face) is found, as shown in Fig. 7. The person with this face is segmented as the tallest human in the image. A list of the mean pixel intensity of each detected face is constructed. If the list is empty, no face was detected, and the water depth with respect to humans is set to zero. The Golden ratio concept [8] states that the height of a human body is eight times the height of the face. If a human face is detected, a box eight times the face bounding box is drawn beneath it to represent the body and mark the feet of the human, as shown in Fig. 8. The human body is thus divided into three regions, torso, waist and knee, which correspond to high, medium and low flood levels respectively.
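The body-box construction can be sketched as follows. This is one possible reading of the description, not the paper's implementation: the body box is assumed to start at the bottom edge of the face box and extend eight face heights downward, and the three regions are assumed to be equal bands. The function name `body_regions` and the `(x, y, w, h)` box layout with a top-left origin are also assumptions.

```python
def body_regions(face, image_height, ratio=8):
    """Split the assumed body box below a face into three row bands.

    face: (x, y, w, h) bounding box, origin at the top-left of the image.
    Returns a dict mapping flood level to a (top, bottom) row range,
    or None if no part of the body box lies inside the image.
    Assumption: the box spans `ratio` face heights below the face and
    the bands (torso, waist, knee) are of equal height.
    """
    x, y, w, h = face
    top = y + h                                   # body starts beneath the face
    bottom = min(top + ratio * h, image_height)   # clip the feet to the image
    span = bottom - top
    if span <= 0:
        return None
    cut1 = top + span * 1 // 3
    cut2 = top + span * 2 // 3
    return {
        "HIGH": (top, cut1),       # torso region
        "MEDIUM": (cut1, cut2),    # waist region
        "LOW": (cut2, bottom),     # knee region
    }


# Hypothetical 40x40 face detected at (100, 50) in a 600-row image.
regions = body_regions((100, 50, 40, 40), image_height=600)
```

Clipping to the image height matters in practice, because a face near the bottom of the frame would otherwise produce a body box that extends outside the image.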
The following steps are used to estimate the area under water in each of the 3 human regions:
1) The tallest human is selected (when there are multiple faces in a scene) and divided into three regions.
2) The brown-color segmented image, i.e. the water-segmented image, is trimmed to the three regions, per the preset thresholds, so that the contours and the sum of contours in each region can be found.
3) The areas of all contours (the sum of brown-colored area in each region) are computed.
4) The system finds the maximum water area among the three regions. The water level in the region where the water area is largest is compared against the human height and fed into the final fuzzy module.
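The steps above can be sketched on a binary water mask. For simplicity the sketch counts water pixels per region instead of summing contour areas, and it omits the per-region preset thresholds, whose values are not given; both simplifications, and the name `water_level`, are assumptions.

```python
def water_level(mask, regions, x0, x1):
    """Pick the region (flood level) containing the most water pixels.

    mask:    binary water mask as a list of rows (1 = brown/water pixel).
    regions: dict mapping level name to a (top, bottom) row range.
    x0, x1:  column range of the reference object's bounding box.
    Returns (level, counts); level is None when no water overlaps the object.
    """
    counts = {}
    for level, (top, bottom) in regions.items():
        counts[level] = sum(sum(row[x0:x1]) for row in mask[top:bottom])
    best = max(counts, key=counts.get)
    if counts[best] == 0:
        return None, counts
    return best, counts


# Hypothetical 6x4 mask: water fills the bottom two rows and half of row 3.
demo_mask = [
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [1, 1, 0, 0],
    [1, 1, 1, 1],
    [1, 1, 1, 1],
]
demo_regions = {"HIGH": (0, 2), "MEDIUM": (2, 4), "LOW": (4, 6)}
level, counts = water_level(demo_mask, demo_regions, 0, 4)
```

The same routine applies unchanged to a vehicle bounding box split into three parts, since only the row ranges differ.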
Segmenting vehicles, mainly cars, from the image is a challenge due to the following reasons:
1) The image may show the front, side or rear view of the vehicle.
2) Since the image is of a flood scene, the vehicle can be partially or fully submerged in water.
3) The usual vehicle segmentation based on detecting the vehicle's license plate is not feasible.
Consequently, the vehicle is segmented using deep learning technology.
Conditional Random Fields as Recurrent Neural Network (CRFasRNN) is used for semantic segmentation. CRFasRNN is a kind of a Convolutional Neural Network (CNN), with the strengths of both CNN and CRF based probabilistic, graphical modelling.
Semantic scene segmentation.
Vehicle segmentation.
Vehicle division into levels.
The CRFasRNN algorithm segments over 20 classes of objects, including vehicles, humans, bottles, cats, chairs, cows, and birds. For vehicle segmentation, only cars were considered, since the majority of the scenes contained submerged cars. In Fig. 9, both vehicle and human are segmented out. Thereafter, only the vehicle (car) is segmented out for further estimation, as shown in Fig. 10. Since the car class is rendered in gray, the gray region of the image is extracted using a further color-based segmentation, as displayed in Fig. 10.
After the largest contour in the gray-color segmented image is found, its area is checked against a threshold value, which was fixed empirically. If the largest contour area is less than the threshold, there is not enough vehicle area in the image to estimate water depth; in that case no vehicle is considered detected, and the depth with respect to vehicle is set to zero. Otherwise, the water depth is estimated with respect to the detected vehicle. If the image has multiple vehicles, the vehicle with the largest contour is considered. The steps involved are given in Algorithm 11.
Water Depth w.r.t Vehicle
After the vehicle is segmented out from the image successfully, a bounding box is drawn along the contour of the detected vehicle and subdivided into three equal parts as shown in Fig. 11. From the boundaries of the bounding boxes, the three regions of the vehicle in the scene are trimmed out, to carry out the same process as pursued in water level estimation with respect to humans. The brown color segmented image i.e. the water segmented image is trimmed out to these three regions. Each region is assigned with a heuristic threshold value, based on which, the contours and sum of contours in each region are computed.
Generating membership functions.
The region where the water area is the largest is taken to be the level of water with respect to the segmented vehicle. The algorithm finds this maximum among the three region areas. The water area corresponding to that region is taken as the water depth compared against vehicles in the scene, and is passed as a parameter to the final fuzzy module.
All four parameters (water area, brown color intensity, water depth with respect to human, and water depth with respect to vehicles) are required for the fuzzy module's estimation of the flood extent. The flood extent is approximated as a value between 0% and 100%.
Estimated Extent of Flood
Fuzzy membership functions
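Water area, brown color intensity, and water depth with respect to humans and to vehicles are the four input parameters to the fuzzy system. The input parameters are first fuzzified in the range 0 to 100, along with the output, the extent of flood. A triangular model of the fuzzy system, which defines the range of fuzzy values in the form of a triangle, was considered. The triangular curve is a function of a vector, x, and depends on three scalar parameters a, b and c:
\[ f(x; a, b, c) = \max\left(\min\left(\frac{x - a}{b - a}, \frac{c - x}{c - b}\right), 0\right) \]
The parameters a and c locate the feet of the triangle, and the parameter b locates its peak.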
After generation of the corresponding membership functions, the degree of membership is determined for each input parameter. After this step, the presence of each parameter should be checked, and the rules are written for each condition. If all the four parameters are available in an image, then a set of fuzzy rules are defined for the situation.
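A triangular membership function and the fuzzification of one input can be sketched as below. The breakpoints chosen for the LOW, MEDIUM and HIGH sets on the 0 to 100 range are purely illustrative assumptions, since the actual membership parameters are not listed; `trimf` and `fuzzify` are hypothetical helper names.

```python
def trimf(x, a, b, c):
    """Triangular membership with feet at a and c and peak at b (a <= b <= c)."""
    if not a <= b <= c:
        raise ValueError("parameters must satisfy a <= b <= c")
    if x == b:
        return 1.0          # peak, also covers shoulders where a == b or b == c
    if x <= a or x >= c:
        return 0.0          # outside the triangle
    if x < b:
        return (x - a) / (b - a)   # rising edge
    return (c - x) / (c - b)       # falling edge


# Assumed (illustrative) linguistic sets on the 0..100 universe.
SETS = {
    "LOW": (0, 0, 50),
    "MEDIUM": (0, 50, 100),
    "HIGH": (50, 100, 100),
}


def fuzzify(value):
    """Degree of membership of a crisp value in each linguistic set."""
    return {name: trimf(value, *abc) for name, abc in SETS.items()}


degrees = fuzzify(25)
```

With these assumed breakpoints, a crisp value of 25 belongs equally to LOW and MEDIUM and not at all to HIGH, which is the kind of graded input the rule sets operate on.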
If there is no human in the scene, 'water depth with respect to humans' is set to zero. If there is no vehicle in the scene, 'water depth with respect to vehicles' is set to zero. If neither a human nor a vehicle is present in the image, both parameters are set to zero.
All the rules are generated through empirical research. There are different cases according to the presence of humans and vehicles. For each case, separate sets of fuzzy rules are defined. Also, the fuzzy variables are assigned as: fuz1 for ‘water area’, fuz2 for ‘water intensity’, fuz3 for ‘water depth using human height as reference’ and fuz4 for ‘water depth using vehicle as reference’.
Case 1: neither human nor vehicle present
Rule 1: IF water area is LOW or brown color intensity is LOW, then flood extent is LOW
Rule 2: IF brown color intensity is MEDIUM, then flood extent is MEDIUM
Rule 3: IF water area is HIGH or brown color intensity is HIGH, then flood extent is HIGH

Case 2: vehicle present, no human
Rule 1: IF water area is LOW or brown color intensity is LOW or water depth with respect to vehicle is LOW, then flood extent is LOW
Rule 2: IF water depth with respect to vehicle is MEDIUM, then flood extent is MEDIUM
Rule 3: IF water area is MEDIUM, brown color intensity is MEDIUM and water depth with respect to vehicle is HIGH, then flood extent is HIGH
Rule 4: IF water area is HIGH or brown color intensity is HIGH or water depth with respect to vehicle is HIGH, then flood extent is HIGH

Case 3: human present, no vehicle
Rule 1: IF water area is LOW or brown color intensity is LOW or water depth with respect to human is LOW, then flood extent is LOW
Rule 2: IF water depth with respect to human is MEDIUM, then flood extent is MEDIUM
Rule 3: IF water area is MEDIUM, brown color intensity is MEDIUM and water depth with respect to human is HIGH, then flood extent is HIGH
Rule 4: IF water area is HIGH or brown color intensity is HIGH or water depth with respect to human is HIGH, then flood extent is HIGH

Case 4: both human and vehicle present
Rule 1: IF water area is HIGH, then flood extent is HIGH
Rule 2: IF water area is MEDIUM and brown color intensity is MEDIUM, then flood extent is LOW
Rule 3: IF water area is LOW and brown color intensity is HIGH, then if water depth with respect to human is HIGH or water depth with respect to vehicle is LOW, then flood extent is HIGH
Rule 4: IF water area is LOW and brown color intensity is HIGH, then if water depth with respect to human is LOW or water depth with respect to vehicle is HIGH, then flood extent is HIGH
Rule 5: IF water area is LOW and brown color intensity is HIGH, then if water depth with respect to human is MEDIUM or water depth with respect to vehicle is MEDIUM, then flood extent is HIGH
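As an illustration of how such a rule set can be evaluated, the sketch below fires the three rules that use only water area and brown color intensity (the set with no depth terms). It assumes the common convention that OR is taken as the maximum and AND as the minimum of the membership degrees; the operators actually used are not stated, and `fire_rules_no_reference` is a hypothetical name.

```python
def fire_rules_no_reference(area, intensity):
    """Rule strengths for the rule set that uses only area and intensity.

    area, intensity: dicts of membership degrees for LOW / MEDIUM / HIGH.
    Assumed operators: OR = max, AND = min (standard Mamdani-style choice).
    """
    return {
        # Rule 1: area LOW or intensity LOW -> extent LOW
        "LOW": max(area["LOW"], intensity["LOW"]),
        # Rule 2: intensity MEDIUM -> extent MEDIUM
        "MEDIUM": intensity["MEDIUM"],
        # Rule 3: area HIGH or intensity HIGH -> extent HIGH
        "HIGH": max(area["HIGH"], intensity["HIGH"]),
    }


# Hypothetical membership degrees for one image.
area_deg = {"LOW": 0.2, "MEDIUM": 0.8, "HIGH": 0.0}
intensity_deg = {"LOW": 0.0, "MEDIUM": 0.4, "HIGH": 0.6}
strengths = fire_rules_no_reference(area_deg, intensity_deg)
```

Each resulting strength caps the corresponding output membership function before the outputs are aggregated and defuzzified.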
All of the rules defined for low flood, medium flood and high flood are aggregated together to improve the overall estimation accuracy. The fuzzy membership functions of all the three levels of floods, are aggregated, to be subsequently defuzzified. Defuzzification is the process of converting the degrees of membership of output linguistic variables into numerical value. It is a process of providing quality result in crisp logic. Defuzzification of a membership function, returning a defuzzified value of the function, using various defuzzification methods is performed using the defuzz method provided by the
Centroid defuzzification, the most commonly used option, is used here. It returns the center of area under the curve. In centroid defuzzification method, all of these triangles are superimposed one upon another, forming a single geometric shape. Then, the centroid of this shape, called the fuzzy centroid, is calculated. The
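Centroid defuzzification can be sketched numerically as follows. This is a hand-written approximation over a sampled 0 to 100 universe, not the library routine referred to in the text; the output set breakpoints, the clip-then-max aggregation, and the names `tri`, `OUTPUT_SETS` and `centroid_defuzz` are assumptions made for illustration.

```python
# Assumed (illustrative) output sets for flood extent on the 0..100 universe.
OUTPUT_SETS = {
    "LOW": (0.0, 0.0, 50.0),
    "MEDIUM": (0.0, 50.0, 100.0),
    "HIGH": (50.0, 100.0, 100.0),
}


def tri(x, a, b, c):
    """Triangular membership with feet at a and c and peak at b."""
    if x == b:
        return 1.0
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)


def centroid_defuzz(strengths, samples=101):
    """Centroid of the aggregated output shape, or None if no rule fired.

    Each output triangle is clipped at its rule strength (min), the clipped
    shapes are superimposed (max), and the centre of area is approximated
    by a weighted average over evenly spaced sample points.
    """
    num = 0.0
    den = 0.0
    for i in range(samples):
        x = 100.0 * i / (samples - 1)
        mu = max(
            min(strengths.get(name, 0.0), tri(x, *abc))
            for name, abc in OUTPUT_SETS.items()
        )
        num += x * mu
        den += mu
    return num / den if den > 0 else None


# Hypothetical rule strengths for one image.
extent = centroid_defuzz({"LOW": 0.2, "MEDIUM": 0.4, "HIGH": 0.6})
```

Because the HIGH rule is the strongest in this example, the centroid is pulled above the midpoint of the range, giving a crisp flood extent on the 0 to 100 scale.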
Experimental results
The results from the algorithm were manually analyzed and verified for accuracy. Different cases were considered, during analysis of the results:
1) When there is neither vehicle nor human present in the scene.
2) When there is no vehicle present in the scene.
3) When there is no human present in the scene.
4) When both human and vehicle are present in the scene.
Images with no human and vehicle present (IM1, IM2 and IM3). (i)–(iii) Input image. (iv)–(vi) Gamma correction. (vii)–(ix) Water Segmentation. (x)–(xii) Thresholding. (xiii)–(xv) Largest contour.
Images with no vehicle present (IMG1 and IMG2) and images with no human present (IMG3 and IMG4). (i)–(iv) Input image. (v)–(viii) Gamma correction. (ix)–(xii) Water Segmentation. (xiii)–(xvi) Thresholding. (xvii)–(xx) Largest contour. (xxi), (xxii) Face detection and human body divisions into regions. (xxiii), (xxiv) Vehicle segmentation. (xxv), (xxvi) Tallest human segmentation. (xxvii), (xxviii) Vehicle division into regions.
Images with both human and vehicle present. (i), (ii) Input image. (iii), (iv) Gamma correction. (v), (vi) Water Segmentation. (vii), (viii) Thresholding. (ix), (x) Largest contour. (xi), (xii) Human body divisions into regions. (xiii), (xiv) Vehicle segmentation. (xv), (xvi) Vehicle division into regions.
Classifications and Mis-classifications. (i), (ii) classified as flooded w.r.t only vehicles. (iii) and (v) classified as flooded w.r.t only humans. (vii) and (ix) classified as flooded using only water area and brown color intensity. (vi) and (viii) classified as flooded w.r.t both human and vehicle. (iv) and (x) are not flood scene images. Red bordered images are mis-classified.
When neither a human nor a vehicle is present in the scene, the factors available for estimating the flood extent are the intensity of brown color and the area of water. In such a case, gamma correction, color segmentation, thresholding and largest contour detection take place, whereas water depth estimation with respect to human and vehicle does not. The processing stages of three such images are given in Fig. 13.
Since there were no human faces to be detected and no vehicles to be segmented in the scene, water depth with respect to humans and water depth estimation with respect to vehicles were set to zero. Consequently, the corresponding fuzzy rule set worked, and the extent of flood was printed on our display screen and saved.
No vehicle or no human in the scene
If there is no vehicle present in the image, the image is put through the following processes: gamma correction, brown color segmentation, thresholding, largest contour detection, human body segmentation and segmentation of the tallest human. IMG1 and IMG2 in Fig. 14 depict these stages for a case where no vehicle is present. For both IMG1 and IMG2, water depth with respect to vehicle was not available. The corresponding rule set was executed, and the extent of flood was printed and saved.
If there is no human being present in the image, the image undergoes all the processes from gamma correction through finding the largest contour. These are followed by vehicle segmentation and subdivision of the segmented vehicle into different levels. IMG3 and IMG4 in Fig. 14 depict this pictorially.
Both vehicle and human in the scene
In this case, all the four fuzzy input parameters were present, and the set of fuzzy rules generated for this particular scenario were executed. The image was processed through all the steps in the model – gamma correction, brown color segmentation, thresholding, detection of the largest contour, human body segmentation, tallest human segmentation, vehicle segmentation and vehicle level division. Therefore, it also estimated water depth with respect to humans and water depth with respect to vehicle. The water area, brown color intensity and two depth estimations together contributed to the estimation of the extent of flood as shown in Fig. 15.
No flood in the scene
Whether a given image was designated a flood scene was deduced solely from the computed water area. When the water area calculated from the largest contour was less than a heuristic threshold value, the scene was classified as a non-flood scene. Input images can be correctly or incorrectly classified into the flood or non-flood category. Figure 16 shows a few correctly and incorrectly classified images: red-bordered images were incorrectly classified by the system, whereas green-bordered images were correctly classified.
The confusion matrix depicting the number of correctly classified images and incorrectly classified images is given in Table 1 indicative of the accuracy and efficiency of the proposed method. From the set of 180 images of flood scene, 146 were properly classified into flood category and the rest were mis-classified as non-flooded. Among the set of 140 non-flood scene images, 120 were correctly classified as non-flooded, with the remaining incorrectly classified as flooded.
Confusion matrix
Flood extent classification
The extent of flooding was estimated from randomly collected crowdsourced images, using color-based segmentation, water depth analysis and fuzzy-based decision making. The results are useful for detecting and estimating flood severity in the affected areas. Table 2 summarizes the flood extent classification for the test images used in this study, in terms of area of flood water, average brown color intensity of the water, water depth with respect to human beings, and water depth with respect to vehicles. It also records whether the system classified each test image as a flood or non-flood scene. All the images from Fig. 16 are used here as examples. Images (i) and (ii) contain vehicles only, with no human beings in the scene, so Table 2 has no values of water depth with respect to humans for them. Conversely, images (iii) and (v) contain only humans, with no vehicles, so their water depth with respect to vehicles is nil. In images (vii) and (ix), neither a human nor a vehicle is present, and the flood extent is computed from water area and brown color intensity alone. Images (vi) and (viii) contain both humans and vehicles, and therefore have water depth estimates with respect to both. All these images were correctly classified as “Flood Scenes”. In contrast, the non-flood scene images (iv) and (x) have largest contour areas below the preset threshold, so both were correctly classified as “Not a Flood Scene”. Despite the large number of correctly classified images, a few mis-classifications warrant review.
Images may be incorrectly classified as flood scenes if they contain other brown-colored objects, for example the wooden roof of a house, as shown in Fig. 16. Some flood-scene images were mis-classified as non-flood scenes because of light reflecting off the water, which hinders proper discernment of contours, as illustrated in Fig. 16.
Out of the total of 320 images, 266 were correctly classified. The accuracy was calculated using Eq. (6) and came to 83.1%.
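The reported accuracy can be reproduced from the counts quoted above (146 of 180 flood images and 120 of 140 non-flood images correctly classified). The sketch assumes that accuracy is the usual ratio of correctly classified images to all images; the per-class rates it also returns are derived from the same counts and are not figures reported in this paper.

```python
def classification_metrics(tp, fn, tn, fp):
    """Accuracy and per-class rates from confusion-matrix counts.

    tp: flood images classified as flood      fn: flood images missed
    tn: non-flood images classified correctly fp: non-flood flagged as flood
    """
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,
        "flood_recall": tp / (tp + fn),       # share of flood scenes detected
        "non_flood_recall": tn / (tn + fp),   # share of non-flood scenes kept
    }


# Counts taken from the text: 180 flood images, 140 non-flood images.
metrics = classification_metrics(tp=146, fn=180 - 146, tn=120, fp=140 - 120)
accuracy_percent = round(metrics["accuracy"] * 100, 1)
```

The accuracy works out to 266 / 320, matching the 83.1% stated in the text.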
Conclusion
Crowdsourced images, collected at random, were analyzed using a deep learning and fuzzy-based system for the estimation of flood extent. The extent or degree of flood, within the range of 0 to 100, was estimated from the input image. The investigation can be extended in future by considering more reference objects, such as two-wheelers, bridges, and street-lamp posts, to estimate flood extent with enhanced accuracy. The water segmentation algorithm can be improved to handle images with sunlight falling on the water and with flowing water, where brown color cannot be accurately segmented. Such work can be of immense help in furnishing technical assistance to disaster rescue and recovery teams. This study uncovered the following gaps that need to be addressed:
1) Low-resolution images can lead to failure of the face and vehicle detection algorithms.
2) In low lighting conditions, our algorithm may fail.
3) When humans are used as reference, the images should be frontal, and the human beings should be in a standing posture.
4) When vehicles are used as reference, their side-view images should be factored in.
Acknowledgments
We are extremely grateful to our beloved Chancellor, Dr. Mata Amritanandamayi Devi, also known as Amma, for always being a source of guidance and inspiration, and for providing a supportive environment in which to work on this paper.
