Abstract
This paper introduces CurvS, a web-based tool for researchers and analysts that automatically extracts, visualizes, and analyzes roadway horizontal alignment information using readily available geographic information system roadway centerline data. The functionalities of CurvS are presented along with a brief background on its methodology. Its estimation results are validated using actual horizontal alignment data from two different roadway types: Route 83, a two-lane two-way rural roadway in New Jersey, and I-80, a freeway segment in Nevada. Three groups of metrics are used for validation: identification rates of curved and tangent sections, overlap ratios of curved and tangent sections between estimated and actual horizontal alignment data, and percent fit of curve radii. The validation results show that CurvS identifies all the curves on these two roadways and that the estimated section lengths are close to the actual alignment data, especially for the I-80 freeway segment, where 90% of curved length and 94% of tangent section length are correctly matched. Even when curves have small central angles, such as those on Route 83, CurvS’s estimations cover 71% of curved length and 96% of tangent section length.
The average crash rate for horizontal curves is nearly three times higher than that of other road sections ( 1 ). To alleviate this risk and deploy proper safety countermeasures, one needs to locate these sections and obtain their geometric properties. Yet, more often than not, this information is not readily available in states’ roadway inventory databases. The need for horizontal alignment data and the difficulty of obtaining them have become more apparent since the publication of the Highway Safety Manual (HSM) in 2010, after which many state departments of transportation (DOTs) have shown increased interest in calibrating the safety performance functions (SPFs) provided in the manual. The locations of horizontal curves, their radii, and their lengths are required to calculate the crash modification factor (CMF) used in the HSM’s crash prediction model for two-lane two-way rural roadways and freeways.
The available methods for collecting and extracting horizontal alignment data include field surveys, Global Positioning Systems (GPS) methods, light detection and ranging (LiDAR), satellite imagery, manual extraction via online mapping services, automated image processing, Geographic Information System (GIS) methods, and as-built plan sheets ( 2 , 4 ).
Although the most accurate horizontal curvature data can be obtained by field surveys or data extraction from as-built plans, these methods are impractical when the analysis scope is large, as it usually is with the SPF calibration process. Mobile asset vehicles (MAVs) are often used to extract roadway alignment data for such purposes. These vehicles are equipped with a variety of technologies including high-resolution video cameras, scanners, Ball Bank Indicator (BBI), GPS systems, LiDAR technology, and distance measurement instruments ( 5 – 12 ). However, in addition to being time consuming, labor intensive, and expensive, requiring comprehensive post-processing of collected data, this method has its shortcomings as discussed in Findley et al. ( 8 ), in which curvature data collected by three different commercial vendors utilizing MAV are found to have contrasting results.
A similar but less costly option is proposed by Wood and Zhang ( 13 ), in which they presented a novel approach to automatically identify and measure horizontal curves using smartphone sensor data including the GPS, accelerometer, and gyroscope readings. This approach is proposed as a low-cost mobile road inventory system for two-lane horizontal curves based on off-the-shelf smartphones. The proposed approach uses Butterworth low-pass filtering to reduce sensor noise, extended Kalman filtering to improve GPS accuracy, and K-means clustering to identify curve locations. Only simple curves with radii less than 5,730 ft are used for validation. The estimated curve radii of 21 curves are compared with the design radii. The results indicate that this approach can measure the radius of sharp horizontal curves with a satisfactory degree of accuracy. However, this approach requires, preferably multiple, runs through survey locations, and it is time and labor intensive and not applicable for a large-scale horizontal curvature extraction.
Other methods include processing of high-resolution photos ( 14 ) and still images from digital cameras ( 15 ). These methods, however, are also not applicable to a large-scale curvature extraction process.
Another method, utilized by many studies ( 16 – 18 ), is manual extraction of horizontal curvature information from satellite images using web-based mapping tools, such as Google Earth™. However, extracting this information, especially for a multitude of roadways, demands a significant amount of time, and the quality of the extracted data is often questionable. For example, Bartin et al. tested the sensitivity of manually extracted curvature data on an 8-mi-long rural two-lane two-way roadway using 15 different analysts ( 19 ). The number of detected curves ranged from 10 to 23 across analysts, with an average of 17.0 curves.
The key conclusion drawn from the review of these methods is that curvature data obtained from the available methods differ significantly and that, for the purpose of calculating CMF for horizontal curvature, a quick and reliable method is needed.
Among the available methods, GIS-based ones appear to be ideal in relation to cost and time, as they utilize GIS roadway centerline shapefiles, which are already available in state DOTs’ data repositories. However, there are only a few publicly available tools that can be utilized by researchers and analysts. The most notable GIS-based tools are Curve Calculator extension by ArcGIS software tool ( 20 ), Curve Finder ( 21 ), Curvature Extension, a toolbar in ArcGIS software developed by the FLDOT ( 22 ), and ROCA (Road Curvature Analyst) ( 23 ). Curve Calculator requires users to identify a curve manually by defining its limits. The tool then automatically calculates the radius and curve length. Similarly, Curvature Extension requires manual identification of each curve by specifying its limits. As stated in Xu and Wei, the accuracy of the extracted curvature information depends highly on users’ judgment and experience ( 24 ). Curve Finder, however, automatically detects curves on a network of roadways by calculating the heading angle between consecutive GIS data points. If the angle is greater than a threshold, a curve begins; otherwise, the section is assumed to be tangent ( 21 ). However, Curve Finder is not publicly available. ROCA is an ESRI ArcGIS toolbox application. The tool can automatically process a network of roadways and extract horizontal curvature information. The identification of curves is based on the naïve Bayes classifier. ROCA is available on request ( 23 ). Currently its training dataset is based on 32 roadways and 43 mi of rural roadways in the Czech Republic. Analysts who intend to utilize this tool for another region are urged to prepare their own training dataset, which is a shortcoming, as generating a training dataset demands time and is often done via manual identification of curves.
Therefore, as mentioned in Bartin et al. ( 19 ), there is a need for a computer tool designed for researchers that embodies the following properties:
Utilizes GIS roadway centerline shapefiles,
Detects different types of curves and calculates curvature information quickly and accurately,
Performs network-wide analysis,
Produces output files that can easily be integrated to an existing roadway feature database,
Is accessible to other researchers and improves the efficiency of their safety-related work.
The study by Bartin et al. ( 19 ) was the first step toward such a tool with the above-listed properties, presenting a reliable and efficient method for identifying and measuring curvature data. They used clustering analysis of the approximated curvatures of discrete data points from GIS roadway centerline shapefiles. This paper is intended to be the next step toward this goal. Thus, the objective of this paper is to introduce CurvS, an online tool that encompasses the desired properties listed above, and validate its accuracy using ground truth data. CurvS is based on an improved version of the clustering approach presented by Bartin et al. ( 19 ). CurvS is conceived as a tool to allow researchers and analysts to extract, analyze, and visualize horizontal alignment information using readily available GIS roadway centerline data.
The next section presents the methodology used by CurvS to detect horizontal curvature. The functionalities of CurvS are then demonstrated. The validation of its estimation results is conducted by using two different roadway types, a rural two-lane two-way roadway in New Jersey (NJ) and a freeway segment in Nevada (NV). Conclusions are presented in the last section.
Methodology
CurvS’s horizontal alignment estimation is based on a clustering approach. The use of clustering for horizontal alignment estimation stems from the idea that vertices within a distinct road section, either curved or tangent, have similar explanatory variables. CurvS detects these sections by identifying the clusters of these variables.
Let us consider a roadway represented by discrete points (vertices) in a GIS shapefile. Suppose that the roadway is discretized by
where
Using the approximated first and second-order derivatives, the curvature at vertex i can be calculated as:
Although
This study extends the clustering algorithm used in Bartin et al. ( 19 ) by utilizing multiple explanatory variables to detect horizontal curved sections. The selected explanatory variables are:
Heading angle (
Curvature (
Change in heading angle (
Distance between consecutive vertices (
Milepost (MP)
Radius calculated using the chord method (R)
Radius of a circumscribed circle (RCC)
Cumulative change in heading angles at three consecutive points
The difference in cumulative heading angles around each vertex
Figure 1 illustrates the key explanatory variables. These can be described as follows. The heading angle
where
Cumulative change in heading angle at three consecutive points,
The difference in cumulative heading angles around each vertex,
These explanatory variables are selected to represent quantitatively how each vertex is positioned with respect to its neighboring vertices. Many of these explanatory variables are also used in the literature. Change in heading angle,

Illustration of the key explanatory variables.
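As a rough illustration of how a few of these per-vertex variables could be computed, the Python sketch below derives the heading angle, the change in heading angle, the distance to the next vertex, and the radius of the circumscribed circle from three consecutive vertices. It assumes planar coordinates in feet (not the WGS 84 coordinates CurvS takes as input), and the function and key names are hypothetical; it is not the C implementation used by CurvS.

```python
import math

def heading(p, q):
    """Heading of segment p -> q in radians, counterclockwise from the +x axis."""
    return math.atan2(q[1] - p[1], q[0] - p[0])

def wrap(angle):
    """Wrap an angle difference into [-pi, pi)."""
    return (angle + math.pi) % (2 * math.pi) - math.pi

def circumradius(a, b, c):
    """Radius of the circle through three points; inf if they are collinear."""
    # cross is twice the signed area of triangle abc
    cross = (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
    if cross == 0:
        return math.inf
    # R = (|ab| * |bc| * |ca|) / (4 * area), with area = |cross| / 2
    return math.dist(a, b) * math.dist(b, c) * math.dist(c, a) / (2 * abs(cross))

def vertex_features(points):
    """Feature dict for every interior vertex of a polyline (hypothetical layout)."""
    features = []
    for i in range(1, len(points) - 1):
        prev, cur, nxt = points[i - 1], points[i], points[i + 1]
        h_out = heading(cur, nxt)
        features.append({
            "heading": h_out,
            "d_heading": wrap(h_out - heading(prev, cur)),
            "dist_next": math.dist(cur, nxt),
            "rcc": circumradius(prev, cur, nxt),
        })
    return features
```

On a straight run of vertices the change in heading is zero and the circumscribed radius is infinite, which is what separates tangent vertices from curve vertices in this feature space.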
Each vertex, therefore, has a vector of explanatory variables, denoted by
where
Because the common K-means algorithm often fails to find the optimal clustering, the modified global K-means algorithm by Bartin et al. ( 19 ) is utilized. Readers are referred to that study for further details of the algorithm and of how its results are used in detecting distinct sections on a roadway.
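To make the clustering idea concrete, the sketch below runs plain K-means (Lloyd's algorithm) on standardized per-vertex feature vectors. This is deliberately the common K-means, not the modified global K-means of Bartin et al. ( 19 ); the evenly spaced initialization, the function names, and the two-feature example are all assumptions made for illustration.

```python
import math

def standardize(rows):
    """Scale each column to zero mean and unit variance (constant columns become 0)."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c))
            for c, m in zip(cols, means)]
    return [[(v - m) / s if s > 0 else 0.0 for v, m, s in zip(r, means, stds)]
            for r in rows]

def kmeans(rows, k, iters=100):
    """Plain Lloyd's K-means; returns (labels, centers)."""
    n = len(rows)
    # Simplified deterministic start: k rows evenly spaced along the roadway.
    centers = [list(rows[(i * (n - 1)) // max(k - 1, 1)]) for i in range(k)]
    labels = None
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        new_labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(r, centers[c])))
            for r in rows
        ]
        if new_labels == labels:
            break  # converged
        labels = new_labels
        # Update step: each center moves to the mean of its members.
        for c in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers

# Hypothetical vertices described by [curvature, change in heading].
vertices = [[0.0, 0.0], [0.0, 0.001], [0.0001, 0.0],
            [0.002, 0.05], [0.0021, 0.052], [0.002, 0.049]]
labels, _ = kmeans(standardize(vertices), k=2)
```

Plain K-means depends on its starting centers, which is the weakness that motivates a global variant in the first place.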
CurvS Tool
In this section, the functionalities of the CurvS tool are demonstrated. Its estimation results are also validated by using horizontal alignment data of one rural roadway in NJ and one freeway in NV.
Functionalities of CurvS
CurvS is a web-based tool that adopts a typical client-server model. The server component of the tool is written in the PHP programming language and provides data analysis and processing services to one or many clients, which initiate requests for the service. The client-side interface is developed using JavaScript (JS), HTML, CSS, and the Google Maps™ Application Programming Interface. The interface accepts the inputs for the analysis and uploads the data to the server via an AJAX (Asynchronous JavaScript and XML) request without interfering with the display of the interface. On receiving the input parameters and the data, the server executes the C code developed by Bartin et al. ( 19 ) using these inputs. The server continuously monitors the progress of the execution and, on completion, automatically converts the output to a JSON (JavaScript Object Notation) object, a convenient data format for AJAX queries. The output is then sent to the client, where built-in JS functions convert it into curves that can be mapped on the Google Maps view of the interface.
CurvS utilizes the methodology explained in the previous section. The tool’s main interface provides users with four main modules: (1) Input Module, (2) Scope of Analysis, (3) Analysis Preferences, and (4) Visualization & Reporting. Below are the descriptions of each module.
Input Module
The main input files to the tool are the geometric information for the roadway(s) of interest. CurvS requires users to upload this information in two separate files in *.csv format: nodes and links, as shown in Figure 2. The required fields in the links file are highway name, link id number, and coordinates of the start and end nodes of each link. The required fields in the nodes file are highway name, link id number of the node, node index, and node coordinates. These fields can be easily extracted from roadway centerline shape files using any available GIS software. The required coordinate referencing system for CurvS is World Geodetic System (WGS) 84.

Demonstration of CurvS modules.
It should be noted that users can upload the geometric information for a single roadway, multiple roadways, or an entire roadway network.
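For illustration only, a nodes file with the fields described above might be read and grouped into ordered link geometries as in the following Python sketch. The column names (`highway`, `link_id`, `node_index`, `lon`, `lat`) are hypothetical; the actual header names expected by CurvS are not given here.

```python
import csv
import io
from collections import defaultdict

def read_nodes(text):
    """Group node rows by (highway, link id) and order them by node index."""
    links = defaultdict(list)
    for row in csv.DictReader(io.StringIO(text)):
        key = (row["highway"], row["link_id"])
        links[key].append(
            (int(row["node_index"]), float(row["lon"]), float(row["lat"]))
        )
    # Sort each link's nodes by index and keep only the WGS 84 coordinates.
    return {key: [(lon, lat) for _, lon, lat in sorted(nodes)]
            for key, nodes in links.items()}

# Hypothetical two-link sample; the coordinates are made up.
sample = """highway,link_id,node_index,lon,lat
Route 83,101,1,-74.8100,39.1010
Route 83,101,0,-74.8110,39.1000
Route 83,102,0,-74.8100,39.1010
"""
geometry = read_nodes(sample)
```

Sorting by node index means the rows do not have to arrive in order in the exported file.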
Scope of Analysis Module
Once the input files are read, users are asked to define the scope of analysis. If users have a list of roadways for analysis, it can be uploaded as an input file, as shown in Figure 2. The required fields in this input file are roadway name and start and end mileposts. If users prefer not to upload a list of roadways, the tool generates a table of roadway names from the links file. Users can select the ones to be included in the analysis by simply clicking on the check boxes, as shown in Figure 2.
Analysis Preferences Module
This module provides the users with the ability to modify the default options and parameter values used in the analysis. These are (1) minimum cluster size, (2) radius threshold, (3) curve sensitivity, (4) radius sensitivity, (5) Monte Carlo simulation toggle on/off, and (6) number of Monte Carlo simulation runs.
A brief explanation is warranted here in relation to how CurvS applies the results of the clustering algorithm for identifying curves.
Once the clustering results are obtained, CurvS calculates the radius of each section using the chord method and determines whether the section is curved or tangent based on the radius threshold. The default value of the radius threshold parameter is 12,000 ft; any value above is identified as a tangent section. If there are contiguous curved sections as depicted in Figure 3, CurvS applies the “patch-up routine” and calculates the distance between the centers of contiguous curves, if

Example of two contiguous clusters.
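The radius-threshold step described above can be sketched as follows. The sketch uses the standard chord and middle-ordinate relation R = c²/(8m) + m/2 and the 12,000 ft default stated in the text. Approximating the middle ordinate by the largest perpendicular offset of the section's vertices from its chord is an assumption, as are the function names and the planar coordinates in feet; CurvS's exact chord-method implementation may differ.

```python
import math

RADIUS_THRESHOLD_FT = 12_000  # default radius threshold stated in the text

def chord_radius(points):
    """Radius (ft) from chord length c and middle ordinate m: R = c^2/(8m) + m/2."""
    (x1, y1), (x2, y2) = points[0], points[-1]
    chord = math.hypot(x2 - x1, y2 - y1)
    if chord == 0:
        return math.inf  # degenerate section: no usable chord
    # Assumption: the middle ordinate is approximated by the largest
    # perpendicular offset of the interior vertices from the chord.
    m = max(
        (abs((x2 - x1) * (y1 - y) - (x1 - x) * (y2 - y1)) / chord
         for x, y in points[1:-1]),
        default=0.0,
    )
    if m == 0:
        return math.inf  # all vertices lie on the chord
    return chord ** 2 / (8 * m) + m / 2

def classify(points, threshold=RADIUS_THRESHOLD_FT):
    """Label a section 'tangent' when its radius exceeds the threshold."""
    return "tangent" if chord_radius(points) > threshold else "curve"

# Hypothetical section: three vertices on a 1,000 ft radius arc.
arc = [(1000 * math.cos(math.radians(a)), 1000 * math.sin(math.radians(a)))
       for a in (-10, 0, 10)]
label = classify(arc)
```

When a vertex happens to sit at the midpoint of the arc, as in this example, the relation recovers the radius exactly; with sparse or uneven vertices the offset understates the middle ordinate and the radius is overestimated.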
During the digitization of orthophotos, tangent sections are usually represented by few vertices. Depending on the accuracy of digitization, wider gaps between vertices are bound to lead to inaccurate estimation of the point of curvature (PC) and point of tangency (PT) locations. Note that in Figure
The default clustering runs are conducted by assigning equal weights to each explanatory variable listed in the Methodology section. One could suggest that curvature
Visualization and Reporting Module
Once the analysis is complete, the results are tabulated in an interactive output table as shown in Figure 4. Each line in the result table includes the following information: (1) highway name, (2) start and end milepost of section, (3) section length, (4) curve flag (1: curve, 0: tangent), (5) average radius, (6) lower and upper bound for radius, (7) coefficient of variation (CV) of radius estimation, (8) section start and end coordinates, and (9) center coordinates (if curved).

Reporting module of CurvS.
As shown in Figure 4, users can click on each line in the output table and display the section on the map along with its relevant information. The tool provides users the option to save the output table in a variety of formats such as *.csv and *.xlsx.
Validation of CurvS
The validation of CurvS’s results is performed using two different roadways: (1) Route 83, a two-lane two-way rural roadway in Cape May County, NJ, and (2) a freeway segment on I-80, between mileposts 15 and 25.7, in Eureka County, NV, in the eastbound direction. The horizontal alignment information for Route 83 is extracted from as-built plan sheets provided by the NJDOT; the information for I-80 is presented thoroughly in the analysis results of Xu and Wei ( 24 ). The GIS roadway centerline shapefiles for Route 83 and I-80 are available at the NJDOT ( 30 ) and NDOT ( 31 ) websites, respectively.
Let us first define the notation used in the formulation of the metrics for the comparison of the actual horizontal alignment data, denoted by
Five different comparison metrics are employed:
Metric 1 (
Metric 2 (
Metric 3 (
Metric 4 (
Metric 5 (
where
where
where
where
To elucidate the calculations of these metrics, let us consider the hypothetical sets of actual and estimated horizontal alignment data of a 1-mi roadway shown in Figure 5. There are three curved and two tangent sections in the actual alignment data, that is,

Hypothetical sets of actual and estimated horizontal alignment data.
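As a hedged illustration of this kind of comparison, the Python sketch below computes an identification rate and an overlap ratio for sections given as (start milepost, end milepost) pairs. The definitions used here are one plausible reading, not necessarily the paper's exact formulation: a section counts as identified if any estimated section of the same type overlaps it, and the overlap ratio is the matched length divided by the total actual length. The function names and the sample mileposts are hypothetical and are not the values in Figure 5.

```python
def overlap_length(a, b):
    """Length of the milepost overlap between intervals a and b (0 if disjoint)."""
    return max(0.0, min(a[1], b[1]) - max(a[0], b[0]))

def identification_rate(actual, estimated):
    """Share of actual sections overlapped by at least one estimated section."""
    hits = sum(any(overlap_length(a, b) > 0 for b in estimated) for a in actual)
    return hits / len(actual) if actual else 0.0

def overlap_ratio(actual, estimated):
    """Matched length over total actual length.

    Assumes the estimated sections do not overlap one another.
    """
    total = sum(end - start for start, end in actual)
    matched = sum(overlap_length(a, b) for a in actual for b in estimated)
    return matched / total if total else 0.0

# Hypothetical curved sections on a 1-mi roadway, as milepost pairs.
actual_curves = [(0.10, 0.30), (0.50, 0.60)]
estimated_curves = [(0.15, 0.32), (0.70, 0.80)]
rate = identification_rate(actual_curves, estimated_curves)
ratio = overlap_ratio(actual_curves, estimated_curves)
```

The same two functions apply to tangent sections by passing the tangent intervals instead of the curved ones.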
Table 1 shows the actual horizontal alignment and the ones estimated by CurvS for Route 83 in NJ and I-80 in NV, along with the comparison metrics results. The table rows are arranged to align the actual sections with the estimated ones for presentation purposes.
Comparison Results
These sections are
When calculating the
Route 83, being a rural roadway, is a more challenging one for horizontal alignment detection than I-80 because there are more curves per route length, and the curves are shorter in length and larger in radius than those of I-80. For example, the central angles, Δ, of the curves on Route 83 vary from 3.0° to 24.2° with an average of 11.4°, whereas the ones on I-80 vary from 15.7° to 48.6° with an average of 25.1°.
Route 83 is 3.79 mi long with 11 curves, whereas the I-80 segment is 10.7 mi long with 10 curves. As shown in Table 1, CurvS detected all curves, and thus
Several remarks are in order as to the other comparison metrics:
The start and end points of curves do not match precisely. Therefore, the overlap ratio of estimated curves with the actual ones,
These discrepancies, however, are more conspicuous between mileposts 0.29 and 0.75 on Route 83. Whereas two curves are observed in the plan sheets (i = 4, 6), CurvS detected three curves within the same milepost range (j = 3, 5, and 6) with significantly different curve lengths and radii. This difference stems from the apparently incomplete information provided in the plan sheets. Namely, there are three additional curves detected in the as-built plan sheets, not indexed as a curve but only marked by their point of intersection (PI) and its station numbers, without further details. A quick examination of satellite images shows that these points in fact coincide with horizontal curves. Their PI station numbers correspond to mileposts 0.25, 0.34, and 0.42 on Route 83. Therefore, sections between mileposts 0.29 and 0.75 are not included in the calculation of metrics reported in Table 1. In addition, there is a short curve at the end of Route 83, which is also not shown in the plan sheets. Therefore, this section too is not included in the calculation of comparison metrics.
The compound curve on Route 83 (i = 8, 9) is estimated as a single curve by CurvS (j = 8). The estimation results are based on the default value of ∝ = 1.0. When the sensitivity is increased (that is, with a lower ∝ value), this specific compound curve can be detected by CurvS. In general, however, unless the digitization of orthophotos is conducted meticulously, it is often difficult to discern compound or spiral curves using discrete data points.
Two tangent sections on Route 83 (i = 1, 5) and two tangent sections on I-80 (i = 7, 11) are not identified by CurvS. This is because few vertices are used to represent these sections in the GIS shapefile and, because of the minimum cluster size parameter, they are not considered separate sections (i.e., clusters) by CurvS. Therefore
Estimation of curve radius is highly dependent on the correct detection of PC and PT points, vertex resolution, and how closely the vertices follow the roadway centerline. The results show that
Route 83 is one of the roadways used in the analysis conducted by Bartin et al. ( 19 ). Because only the curved sections were of interest in that study, only the curvature information of Route 83 is reported as obtained via the MAV method and the manual extraction method. These results also support remark (2) above, as both methods indicate a different horizontal alignment between mileposts 0.29 and 0.75. Also, in relation to remark (3) above, both methods labeled the compound curve (i = 8, 9) as a single curve. Comparing these results with the actual curvature information shown in Table 1, the metrics for the MAV method are calculated as
Compared with the other available methods, it can be asserted that CurvS provides valid horizontal roadway alignment information.
In addition, to explore the impact of vertex resolution on the estimation accuracy, the centerline shapefile of Route 83 is regenerated using high-resolution satellite images downloaded from the US Geological Survey website ( 33 ) as overlays. The vertex resolution of Route 83 is increased by reducing the average distance between two vertices to 68 ft (0.013 mi) from its original value of 120 ft (0.022 mi). Using CurvS results, the validation metrics are computed as
Finally, to demonstrate the degree of accuracy of CurvS’s results in reference to another available tool, the same roadways are processed in ROCA. As mentioned earlier, ROCA is the only other available tool that processes horizontal curvature at a network-wide scale. ROCA requires jurisdiction-specific training data for its naïve Bayes classifier, and thus a training dataset specific to U.S. roadways is created, consisting of nearly 4,000 data points. Although a dataset composed of two roadways with a total of 21 curves cannot be deemed sufficient for a comparative analysis, it can be used to demonstrate the level of accuracy provided by CurvS in comparison to ROCA. The same validation metrics are computed using the ROCA outputs for Route 83 and I-80. For Route 83 the metrics are calculated as
Summary and Conclusions
This paper demonstrates CurvS, a web-based tool that allows researchers and analysts to automatically extract, visualize, and analyze roadway horizontal alignment information using readily available GIS roadway centerline data. The functionalities of CurvS are presented along with a brief background on its methodology, which is an improved version of the clustering approach presented in Bartin et al. ( 19 ). In addition, a thorough validation of its estimation results is presented using the actual horizontal alignment data from two different roadway types, a two-lane two-way rural roadway in NJ, and a freeway segment in NV.
The validation results in Table 1 indicate that CurvS provides satisfactory horizontal roadway alignment estimations. As discussed, some estimation errors are inherent and inevitable because of (1) the space discretization of continuous lines and (2) errors during the digitization of orthophotos. The review of the literature indicates that such errors also exist in other available methods for collecting horizontal alignment data ( 8 , 34 ). CurvS is able to identify all the curves on these two roadways, and the estimated section lengths are close to the actual alignment data, especially for the I-80 freeway segment, where 90% of curved length and 94% of tangent section length are correctly matched. Even when curves have small central angles, such as those on Route 83, CurvS’s estimations cover 71% of curved length and 96% of tangent section length. In addition, using the data generated by the MAV method and manual extraction on Route 83 reported in Bartin et al. ( 19 ), it is found that the MAV method correctly identified 82% of the curves, covering 57% of curved length, and the manual extraction process correctly identified 91% of the curves, covering 77% of the curved length. Based on these metrics, CurvS performs comparably to or better than these two time-consuming and costly methods on Route 83.
In addition, it is shown that when the vertex resolution of Route 83 GIS shape file is enhanced, all curved and tangent sections are identified by CurvS, and 94% of both curved and tangent lengths are correctly matched.
It should be underlined that these results are obtained in a matter of seconds in CurvS and can be easily utilized in various safety analyses, such as computing CMF values for roadway curvature, required by the HSM’s crash prediction models.
Currently, the main disadvantage of CurvS is that it requires users to preprocess the nodes and links input files in *.csv format. Although this process can be performed easily in any GIS software, future work will include modifying CurvS so that it can directly import shapefiles (*.shp) and conduct the preprocessing internally.
Footnotes
Author Contributions
The authors confirm contribution to the paper as follows: Study conception and design: Bekir Bartin, Sami Demiroluk, Kaan Ozbay; data collection: Bekir Bartin, Mojibulrahman Jami, Sami Demiroluk; analysis and interpretation of results: Bekir Bartin, Kaan Ozbay, Mojibulrahman Jami; draft manuscript preparation: Bekir Bartin, Kaan Ozbay, Sami Demiroluk, Mojibulrahman Jami. All authors reviewed the results and approved the final version of the manuscript.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The study is supported by the NJDOT (FHWA-NJ-2017-001) and partially by C2SMART, a Tier 1 UTC at New York University funded by the U.S. DOT.
The contents of this paper only reflect the views of the authors who are responsible for the facts and do not represent any official views of any sponsoring organizations or agencies.
