Abstract
VANET has been an area of great interest and exploration for researchers to solve several challenging issues regarding communication, topology, security etc. in the last few years. Currently, a lot of work has been done and is looked after to establish effective communication and message passing among the vehicles (V2V and V2I routing) with a number of algorithmic models developed. The paper presents a survey of the routing algorithms proposed to have communication inside a VANET among the nodes. Since V2V and V2I interactions is a complex combinatorial problem which falls under the class of NP-Complete set of problems. The paper here presents a modified mobicast routing version using genetic algorithm with certain considerations for mutation and crossover operator for the algorithm in order to achieve more accuracy for the results. The method shows a great enhancement in the execution timing for a considerable number of vehicles where the traditional algorithms may hang up to produce a route. But still the serial version stucks up for heavy density vehicle scenarios for message passing in real time. So, the efficiency of the proposed method is enhanced using parallel processing power of multi-core and many-core processors using OpenMP and Computationally Unified Device Architecture (CUDA) API. The enhanced results show a great improvement in the performance in terms of execution time when compared with the serial algorithm, especially for the cases where the solutions cannot be obtained in real time. The results over GPU based architecture suggests that the proposed method has a huge potential to scale up with the vehicles on the road, thus, reducing the road side units for providing a larger range of coverage.
Introduction
Vehicular Adhoc Network (VANET) is an intelligent transportation system to manage the rapidly growing and complex traffic environment on roads. VANET being the future for the road traffic management can considerably help in avoiding severe accidents, traffic jams etc. which certainly is a loss to the mankind as well as resources in terms of time and life. VANET modelled from MANET (Mobile Adhoc Networks) is a network where the moving vehicles take the role of broadcaster, forwarder, switching nodes [1] to spread information in the network. The On Board Unit (OBU) on the vehicles let the wireless networks to be formed with short – range frequencies [2]. Global Positioning System (GPS) as well as Differential Global Positioning System (DGPS) are also mounted over the vehicles to facilitate Vehicle to Vehicle Interaction (V2I). Similarly, these moving nodes can interact with the road side units to spread the information. VANETs are meant to provide safeguard to the moving vehicles on the road with other applications, some of being crucial services like smart navigation system, location-based service like indicating fuel station at minimum distance, restaurant or travel lodge [3, 4] and entertainmentapplications.

[3] Inter-vehicle communication.
Since, majority of communications in VANET are broadcasting messages, they are of concern to all the vehicles on the road like an accident occurred with a roadside vehicle, bad weather conditions, heavy traffic jam etc. In all these scenarios, the network is created as soon as the vehicles come in a formidable range. The challenge here is to find out an efficient route as quickly as possible in order to best utilize the resources and bandwidth because of high mobility of nodes in the network owing to topology change. This Intelligent Transportation System follow two broadcasting mechanism: naïve and intelligent broadcasting [4]. Figure 2 shows a single hop broadcasting procedure with road- side unit being the broadcaster to all the vehicles in the range. Figure 3 shows a multiple hop unicast technique where a message moves through multiple hops till the vehicle having the designated data is not reached to the intended destination.

[3] Vehicle-to-roadside.

[3] Routing-based communication.
In naïve broadcasting technique, broadcast messages are sent periodically on a regular basis. The vehicle discards the message if it is not from some vehicle in front of it. Otherwise the recipient vehicle broadcast it to vehicles behind it. This guarantees the delivery of broadcast message to all the vehicles in the network. But the bottleneck is the bandwidth consumed due to high frequency of broadcast messages in the network resulting in collision lowering the rate of message delivery and increasing the delivery time. In contrast, intelligent broadcasting limits the number of messages broadcasted. If the already acknowledged vehicle again gets the same message, it is considered by default that at least one vehicle in the back trace has the message and, hence, does not broadcast further as the vehicle in the back further move the message behind.
VANETs (Fig. 5) [38] are distributed, autonomous communication characterized by very high entropy of vehicles and restricted vehicle movement patterns [5]. Broadcasting plays an important role in VANETs in disseminating time-sensitive and critical alert messages within the entire network. These broadcast messages need to be disseminated in an appropriate passion which demands the routing algorithm with high delivery ratio and less time to move the packets on the route. In past few years [6–9] various routing algorithms have been proposed and has been enhanced in order to make the best outcome fir as per the need of the network. Since VANETs are a special form of ad hoc networks, and the operational tendency similarizes to MANET (Mobile Ad-hoc Networks) which uses address based and topology based routing mechanisms to assign unique addresses to the nodes in the network. The drawback being of these approaches is that they do not guarantee duplicate addresses allocation in the network [10].

Heavy traffic Jam in New-Delhi in rainy season (source taken from https://www.telegraphindia.com/1120829/jsp/frontpage/story_15911284.jsp).
Proactive routing protocols like Destination-Sequenced Distance- Vector (DSDV) routing or Optimized Link State Routing protocol (OLSR) and Topology Broadcast-based on Reverse-Path Forwarding (TBRPF) keeps the route information and updates are made periodically to the routing table even when the paths are not in use. The maintenance of unused paths consumes the bandwidth due to frequent topology changes and highly dynamic nature of VANET makes this protocol unsuitable.
Reactive routing protocols
Reactive routing protocols like DSR (Dynamic Source Routing), and AODV (Ad hoc On-demand Distance Vector) runs the routing algorithm on demand basis, thus, reducing burden on network for producing a limited number of routes for vehicles to communicate making it suitable for this application scenario.
Position-based routing protocols
These class of protocols and algorithms have gained a high significance in a highly mobile environment of VANET [6–9]. It makes decision on the basis of information about destination’s position [11] which is known through a location service extracted through periodically transmitted beacons by a vehicle to its neighbor. At the same time, they do not require creation or maintenance of routes. GPSR (Greedy Perimeter Stateless Routing) [12] and DREAM (Distance Routing Effect Algorithm for Mobility) [13] are its prime example. Karp et al. [13] discusses a greedy mechanism for forwarding the packets which are moved along the nodes that are positioned closer to the destination geographically as compared to the previous node. Thus the next hop will always be close to destination in comparison to the current hop. GPSR’s “perimeter routing” phase look for alternate paths and do not consider the routes that are geographically far away as in case of highways where road sizes are within the transmission range. Position-based routing gives a more scalable and efficient mechanism for forwardinginformation in highly volatile networks in VANETs [14]. Naumov et al. [15] presents a Connectivity Aware Routing (CAR) protocol for VANETs which is capable of finding connected paths between the nodes. Leon- tiadis et al. [16] gives a geographical opportunistic based routing protocol which exploits the topology of VANETs alongwith geographical routing information. Hartenstein [17] uses a unique identifier like IP address for vehicle identification along with its current GPS coordinates. The scheme shows highly adaptive and scalable behavior for topological uncertainties as this only requires every vehicle to know its own address and destination address which is binded in the packet forwarded.
Geographic location based messageforwarding
A geographic unicast transmit packets among nodes through a couple of wi-fi hops. Once any node has a unicast packet to send, it identifies the location of the destination node through watching the location table. A greedy forwarding algorithmic program is then accustomed forwards the packet (see Fig. 4), describing the minimal last distance to the destination car and this method repeats at every vehicle on the forwarding path until the packet reaches its vacation spot. A geographic broadcasting mechanism floods the packets and are the packets are rebroadcasted if they are in the geographic transmission range. However, flooding of broadcast message and further rebroadcast may significantly affect the network traffic due to high flooding of messages in the channel.

[3] Cached Greedy Geographic Unicast (CGGC) Example of a greedy unicast transmission based on knowledge of the destination’s position.
Biswas et al. [18] speak extraordinary kinds of forwarding and broadcasting mechanism- naïve and intelligent broadcast. Naive huge- solid forwarding calls for a broadcast message to be send periodically. The automobile receiving the message drops the message with admire to its course of movement if it comes from the past. If it is from the vehicle in front, it’s far believed to be an emergency in front and broadcast messages periodically. A challenge of this approach effects from the quantity of forwarded messages. A result of that is that the range of 802.11 mac message collisions can increase which in flip lowers the message delivery price and will increase the shipping time. This takes place if the message is dropped and forces the occasion-detecting automobile to periodically retransmit it. So as a way to this, intelligent broadcast forwarding mechanism become proposed by means of Biswas et al. [18] describing a more efficient broadcast protocol with auto acknowledgement to deal this. This protocol improves system overall performance through restricting the wide variety of messages wide- cast in the platoon for a given emergency. If the event- detecting car receives the equal message from at the back of, it assumes that at least one car in the back has acquired it and ceases broadcasting.
Mobicasting
Chen et al. [19] proposed a spatiotemporal routing protocol known as mobicast routing protocol. The protocol is mainly designed to disseminate the safety and comfort broadcast messages to the vehicles. The protocol has two peculiar features: Zone of relevance (ZOR) and Vehicles in ZOR at instant t refered to as ZORt. The aim is to forward the information to the vehicles in ZOR within a specified time t before it moves out of the range due to accelerated speed. The technique is well suited for the VANET environment for the purpose of disseminating an important message triggered off first from a vehicle to all the others in ZOR in time t + k where t is the time at which the vehicle remains within ZOR and k is the constrained delay expected to deliver the message. The paper uses this approach in hybrid with genetic algorithm in section 4 of paper to further enhance the efficiency and to prohibit the channel being overburdened with multiple broadcast and forwarding messages in the network resulting in packet drops and delay in transmission.
Need of heuristics for VANET and GAbased optimized routing solutionsfor MANET
VANET [39] being the future of vehicular management system on roads, the challenge is to provide an effective and secure communication between the vehicles within the adhoc network as well as with the roadside infrastructure. The major concern being the dissemination of important message alerts, warnings etc. as a broadcast to all the vehicles in the network. As per the increasing load on the roads of vehicles which is expected to expand quite rapidly in future (see Fig. 5), this task will become even more challenging. The routing protocols discussed in the previous section have been used which have their limitations in terms of latency, channel and bandwidth utilization, secure communication etc. and, thus, there stands out a need for enhancing the proposed routing algorithms for the cause. The state of art for developing heuristic approaches for modification of routing algorithms have been discussed in [20, 21]. There are a number of heuristics that can be used and the two papers present a consolidated survey over such techniques. The paper uses one such heuristic popularly known as Genetic Algorithm to the Mobicasting technique discussed in section 4. The routing solution will be a hybrid approach where GA is used to obtain the most optimized path to forward packets in network within a given specified time. But before discussing about the technique let us first examine how the GA based heuristic and hybrid approach based routing solutions work for MANET.
VANET is a specialized adhoc network which have been derived from MANETs [22] which are another special class of wireless ad-hoc networks. B. Lorenzo [23] et al. used multi-objective function based GA to solve the NP-hard problem of optimizing relay topology for route exploration minimizing the cell interference. Apetroaei et al. [24] explains the generation of a minimum spanning tree for a WSN using GA approach in order to minimize the route coverage time and maximizing the network usage. Yan et al. [25] used a genetic set of rules method to discover the energy efficient routes through retaining the strength balanced nodes at some point of the distribution community. Badia et al. [26] addressed the routing and hyperlink scheduling trouble for computational complexity of large WMNs. Chiang et al. [27] employed topology crossovers to remedy the multicast routing constraints efficiently. In the same stream, there are a lot of GA approach based metaheuristics applied to a MANET environment in order to find the most optimized route for data forwarding so as to account for the energy issues in the network, reducing network traffic, increasing reliability etc. Based on the same philosophy, the idea can be mapped to the VANET scenario in order to fulfil other objectives about which we will discuss in next section.
Proposing a GA based optimized routingsolution for VANET
Genetic algorithm being a stochastic process [28] and an evolutionary algorithm based upon the principles of natural selection provides an approximate solution near to the best optimum in situations where it is almost impossible to go for exact solutions or it is being a costly affair. GA represents an iterative method. Every generation is synonymous to an iterative cycle. A typical range of generations for a simple GA can range from 50 to over 500. The entire set of generations is known as a run. Using GA as a searching heuristic, process will be initiated by a set of population or possible solutions from the entire solution space. These populations will be refered to as chromosome buildup of a sequence of bit or numeric string where each member of the sequence will be refered to as genes. Based on the above setup, we will try to propose an optimal routing algorithm for a VANET environment in combination with Mobicasting to broadcast, unicast or multicast a message as per the requirement. For this let us map the road scenario shown in Fig. 5 to a graphical network. For simplicity, let us consider a more simplified network as shown in Fig. 6. Here each node represents a vehicle on the road with one of the nodes identified as the road side infrastructure interacting with the vehicles. Our objective will be to find an optimized route at an instant and then forward the packets on the route identified in order to ensure that the message reaches the maximum vehicles in a very quick span of time without flooding the messages unnecessarily consuming the network resources. Roadside infrastructure ‘I’ is responsible for disseminating the information to the vehicles in the range as well as to specific vehicle through other vehicles acting as intermediate nodes. Since all the vehicles in the range of ‘I’ are uniquely identified based on some identifier assigned to the vehicle while it is in range, as soon as the intended destination will receive the message it will stop forwarding the message.

Diagrammatic Representation of a VANET scenario.
It is assumed for the process that the topology of the network for node I to disseminate information holds for time ‘t’. So, ‘I’ has time t to find an optimal route and k is the time taken to forward the message over the path for the intended destination to reach. As the time exceeds this ‘t+k’ quanta, the last node in the route that has the message will start broadcasting the message as per the Zone of relevance as in mobicast routing algorithm. Based on the no. of vehicles (let us say n), there can be n! possible routes out of which a set of routes can be chosen serving as the population set for the process. The objective function will be to find the minimum cost of travel between the nodes that is the distance between two vehicles denoted by the edge weights in the graphical network as: F (x) = Min∑distance (i, j) with i and j being the nearest adjacent nodes. In other form the problem can be formulated as:

Order Crossover for the routes.
Here, N represents the vertices count, dij represents distance between node i and j, and the xij is the sequence values for route. For eg. considering 10 vehicles in the zone of relevance of I, there are 10! routes for message transfer between the vehicles. These routes can be:
1 - 2 - 3 - 4 - 5 - 6 - 7 - 8 - 9 - 10
1 - 3 - 2 - 4 - 5 - 6 - 7 - 8 - 9 - 10………and so on.
The idea here is to find the most optimum route based on the distance between the moving vehicles. The algorithm can be briefed out as:
As per the above algorithm, we will be able to find the most optimized path over which the information can be disseminated to reach the intended destination in a heavily populated VANET environment. The intended destination can be the last vehicle within the range or zone of relevance of ‘I’. The quanta for which the topology of the network is assumed to be maintained is t+k where ‘t’ is the time to find the route and ‘k’ is the time taken for the nodes of the network to get the message. If this time quantum expires before all the node in the route are covered, the node currently holding the message will start broadcasting the message as shown in Fig. 8.

Message delivery to vehicles out of Zone of relevance.
The following results have been calculated for vehicles within the zone of relevance of roadside infrastructure acting as the source node for the network. If the case is to make a message delivery to the node using intermediate vehicles in the network, then, the proposed approach results are shown in the table. The experimental results in Table 1 shows a scaled speed up with a significant reduction in execution time. As per Fig. 9, it is clearly visible that time gap between the two approaches will continue to increase as the vehicles keep on increasing.

Comparison of proposed algorithm with a no heuristic based approach to establish the route.
Performance evaluation for the proposed algorithm v/s the traditional approach
The y-axis of the graph in Fig. 9 indicates the execution timing of the algorithm (in seconds) to find the optimized route of the network and x-axis represents the vehicle count. The results have been obtained by test running the algorithms several times for dynamic scenarios through randomly generated adjacency matrix. However, the results of the proposed algorithm outperform the traditional approach in each case as for some cases results could not be produced after waiting for an indefinite amount of time. However, for constructing the graph the non-optimized curve has been plotted with some hypothetical values. In any case, the time for producing result for this much vehicle count is par above or the machine may fail or hang up to process the data. Also it can be understood that due to message flooding or broadcasting, as a result of high resource consumption, the network performance is expected to degrade resulting in increased latency in message delivery which may cause timer to expire and the message delivery may fail in middle. The curve of Fig. 10 shows the trend of increasing number of messages in the network as a result of broadcast due to increased control messages, acknowledgements etc. with the vehicle count.

Increase in message generation rate with the vehicle count.
Although the above GA based heuristic version of mobicast routing algorithm produces optimized execution timing results for route generation, for significant increase in the number of vehicles, the proposed algorithm’s efficiency tends to lower down and after a certain time it also meets the computational limitations. Thus, comes the need to map the algorithmic structure to today’s modern computing architectures and programming platforms of the machine. In this section, we will produce a parallel version of the algorithm for the task independent functions and will see how the algorithms scales up [40] for increased number of vehicles. But before proceeding, let us discuss the two programming platforms OpenMP for multi-core programming and CUDA (Computationally Unified Device Architecture) for many-core programming in brief.
Compute unified device architecture
The NVIDIA (Quadro-600) processor based on the GeForce four hundred assortment Fermi-architecture implements a 4-stage pipeline. All the 128 cores are of the identical kind that turns into giant parallel processor which can perform numerous homogeneous tasks parallely (SIMD). It is associated with enriched library functions that has an extension of ordinary c programing language. Programmer views CUDA as a multicore co-processor which allows them to take advantage of the parallel energy of the processor for traditional purpose computing in an additional higher manner.
OpenMP
OpenMP (Open multi-handling) is an API that helps shared memory multiprocessing in C, C++ and FORTRAN. It comprises of an immovable of compiler orders, library schedules, and condition factors that affect run-time lead. OpenMP moved toward becoming presented in 1997 for Fortran and in 1998 it move toward becoming included for C/C++ stage. OpenMP prerequisites make the utilization of various CPU centers to give parallelization improvement to a calculation in the meantime as CUDA makes utilization of numerous joined centers over the pictures processor to do additional calculations per unit time. In a nutshell the size up change for a calculation through OpenMP [34–37] is improved the situation multi-center design which may be the dormancy arranged centers and works at the thoughts of reserve hit and store advancements. Then again, CUDA has throughput situated centers where numerous running units inside the type of several string offer algorithmic streamlining by cutting down the computational time. Each the working stages goes under SIMD kind of Flynn’s scientificclassification.
OpenMP optimization
A fitness operator is employed to pick the offsprings from the pool of solutions available for generating new population. The fitness operator applied to each generation is given as: average of Maximum which indicates the costliest route the and Minimum which indicates the least path length within the current possible solutions. Initial populace generation is done using a straightforward permute operator with limit on higher and edge ids of routes in C++. The optimal route calculation based on this is as follows:
Begin
#pragma omp parallel for
/* spawning over n threads*/
For 10 vehicles considering 10000 routes from pool
{
Route_min = Initial_route
If (Route_min <Cost of route taken in the current iteration)
Min = Current route’s cost
}
End A similar pragma can be written to find the route with maximum path length. Chromosomes (routes) having path length below threshold are preserved for next generation
Begin
# pragma omp parallel for
/*for n threads*/
for 10000 routes having 10 vehicles
{
/* Testing each route cost*/
If (Cost of current route < (Route_min + Route_max)/2)
Cost vector [thread id] = Cost of current route.
}
End
{ 0,3,4,1,2,5} { 5,3,4,0,2,1} { 1,2,3,4,0,5} and so on ...
The possible routes are held as vector blocks with value of every route related to it as per Fig. 11. After successful determination of the possible routes on the premise of objective test function, a mating of the routes is done. This is needed to make a new more fir generation of solutions. Two chromosomes (possible solutions) apply the crossover operator between them to breed a replacement generation. Here ordinal crossover operator has been used to induce new off springs.

Ordinal Crossover method.
The parallelization approach remains however over associate in nursing accrued process and computing power over thousands of processors. each parallel improvement is written in CUDA kernel perform and given a launch within the host or computer hardware code with needed variety of threads. The fitness perform analysis procedure for CUDA is as follows:
__global__void optimal_fitness()
{
If(current cost < (Route_min + Route_max)/2)
Store route explored by current thread id in optimal route array.
}
Kernel launch: optimal_fitness< < <no.of blocks, no. of threads> > > (arguments to the kernel);
Aside from this the course cluster in which the cost of each course is put away as bit string is exchanged to GPU from CPU through CudaMemcpy () work and again the outcomes are exchanged back to CPU through same capacity. Additionally, the hybrid capacity can be created where hybrid between two chromosomes is again taken care of by a string. Here the underlying bit dispatch contains number of strings comparable to the quantity of courses considered.
Experimental results enhancement
The results reported below have been test run over OpenMP 4.0 specification on a quad-core computing machine and CUDA 8.0 runtime environment over Quadro-600 graphics processor. Here the results have been obtained by taking an average of the runs for various number of cities. Best results are obtained by running the implementation over the number of threads equal to the number of cores. Running the code over larger number of threads, might not provide a spare improvement as the limit to growth of the optimization function is reached. A comparative enhancement of the parallel execution over serial is shown in Table 2. Figure 12 shows the enhancement achieved for various cases.

Serial v/s OpenMP implementations. X-axis: Execution timing in seconds; Y-axis: Number of tours.
Performance Evaluation for the proposed algorithm over OpenMP
CUDA [30, 31] suggests the fact that because of amalgamation of the many integrated cores thus known as coprocessors, helps in lowering the computational time significantly as output quantitative relation gets improved or in different sense, compute to memory access operation (CGMA ratio) has been improved considerably. Results obtained justify the reason because the execution timing of the algorithm has been optimized considerably. The execution time to search out the optimum or near-optimal answer decreases by around seven times in case of CUDA. The time needed for determination for top range of routes additionally decreases as shown in Fig. 13. The outcomes got here are equivalent to [32] where creators utilized a multi-processing group setting including up to sixteen virtual machines. For a populace size of 1024 hubs, the execution fleeting course of action for the algorithmic technique run is around 20–30 seconds which lessens to around to under a second in our proposed approach over CUDA including a huge number of co-processor units on one machine. Creators in [33] talks about a brisk parallel hereditary algorithmic approach for TSP that includes copy chromosomes distinguishing proof and expulsion inside the technique. Contrasted with the present, the approach anticipated amid this paper doesn’t fabricate utilization of change administrator or isn’t required on account of determination of decision of hybrid as requested hybrid that guarantees that despite a copy open produces it’s known and isn’t taken into next ages. In [33], the outcomes were gotten over a multi-center registering setting more than one machine that really bears a breaking point when a significant populace estimate is to be dealt with inferable from processing equipment confinements. For this to counter, the anticipated approach has revised the condition of illicit relationships to deal with the work out force for monster data over GPU processors.

Performance graph for CUDA.
The paper discusses various routing protocols and standards that currently prevails for VANETs. Simple flooding or broadcasting at each node may bring down the network performance heavily which may result in packet drops or slow delivery time especially if the message is intended for a particular node. The paper here discusses a genetic algorithm based heuristic for mobicast routing protocol, a hybrid approach to solve the problem. The success of such heuristic based routing algorithms for MANETs in section 3 opens the gate of exploration for VANETs. The experimental results for the approach proposed in section strengthens the significance for the need of an intelligent routing algorithm for the cause. Further the approach has been clubbed with the mobicast routing model which ensures the reliability of the message to be delivered to the node.
However, the time taken by the proposed approach to find an optimal path shoots up with the increase in the number of vehicles and thus the future aspect can be worked out with the high end computational power of modern systems. Thus, a parallel version of GA approach is used to explore the route. Using OpenMP and GPU CUDA programming platform, the execution time of the algorithm has been significantly cut down. This implies that a system can be built with high reliability and quick response to real time topology changes where a large number of vehicles can be covered up with high range and less number Road Side Units.
Further, here the investigation has been done only with respect to the cut down in the execution time for route discovery so that the message can be forwarded over the optimized route calculated. However, no analysis has been done that by how much factor the traffic reduction has been done as instead of broadcast messages, the proposed technique will have unicast message to forward the information to the vehicles. This can be a further scope of investigation in order to further evaluate the efficiency of proposed methodology.
