Abstract
Unusual biological events and outbreaks require rapid epidemiologic investigation and contact tracing procedures, allowing optimal handling of resources. Currently, these are resource intensive, time consuming, and extremely complex, requiring large teams of trained and prepared personnel. The goal of this study was to determine whether a technological alternative to the classic systems, based on the use of mobile phones and a unique algorithm, could perform a complete epidemiologic investigation in a setting of a bioterrorism scenario. The system was tested with 32 volunteers during a bioterrorism simulation drill, with quantitative assessment of key outcome measures: perform a complete analysis of the scenario, determine the fundamental biological attributes of the scenario, distinguish between related and unrelated cases, and identify possible exposed people among a known group of participants. The system fully achieved the objectives in just under 5 hours from the beginning of the simulation with only 3 false-positive “exposed” participants, while identifying all 11 true-positive “exposed” participants (overall accuracy of 85%). We find the system advantageous over currently used tools in a way that could be integrated in conjunction with current outbreak epidemiologic investigation tools and syndromic surveillance efforts to shorten the response time of national authorities in handling adverse biological events.
Introduction
The aim of epidemiologic investigation is to accurately define and assess the scenario of such an event: the origin of the event (“ground zero”), time frame, infectivity, possible spread of the agent, and so on. Some of the data needed are purely clinical, but most need the combined expertise of medical staff and public security personnel. Currently, the predominant method is to conduct a full-scale investigation based on interviews with infected patients and their family and friends, trying to discern a recurring theme. This is a time-consuming and resource-intensive method that usually spans well over 24 hours from the initiation of the investigation.
We now have a technological opportunity to tackle this ever-growing challenge using widely available tracking devices: our personal cell phones.9 Cellular transmitting devices are everywhere; they include GPS and rely on widespread transmission antennas, and thus can help to improve the traditional epidemiologic investigation tools used by national health systems worldwide. In addition, they allow 2-way communication, potentially helpful in performing a more rapid and accurate epidemiologic investigation: both a retrospective effort, defining time, place, and person, and a prospective effort in contact tracing.9-12
The aim of the current study was to evaluate a novel analytical platform based on cell phone transmission, developed by BioCell EPI as an epidemiologic investigation system, and to see whether it could help with the unmet need of performing fast and accurate epidemiologic investigation, in this case a simulated bioterrorism event conducted by the Israeli Ministry of Defense (IMOD) and the Israeli Ministry of Health. An IMOD team together with the developers evaluated the outcome measures. We compared them to currently used epidemiologic investigation tools, as seen in both real-life events worldwide and in large-scale bioterrorism drills conducted in Israel in the past several decades.13-18 The assessment included both quantitative and qualitative measures of the added benefit of such a technology. Among them, the IMOD team followed the time from first notice until a preliminary assessment was made. It also looked at the time from alert until the assessment was well established (including time, place, and chain of infection). Since the developers did not know the nature of the event, unlike the IMOD team, they also looked at the time from alert until the realization of the presence of a suspected infectious agent, and whether the system would be able to determine the time, place, chain of infection, and infectious agent.
Methods
Study Design
The study was a simulation drill with embedded quantitative assessment of key outcome measures of epidemiologic investigation and contact management in a bioterrorism epidemic scenario.
Setting and Scenario
A small-scale simulation was designed and performed by the IMOD. It was based on a drill performed as part of a multi-year preparedness activity of the IMOD and the Israeli Ministry of Health, which had performed several large-scale bioterrorism drills in the past 2 decades involving hospitals, health maintenance organization (HMO) clinics, public health authorities, and IDF Medical Corps clinics.13,16 The specific scenario started 3 days after the beginning of a drill and 7 days prior to the epidemiologic investigation system test onset. In this scenario, an alleged radical terrorist group prepared an improvised dispersion device containing Bacillus anthracis spores. They activated it inside a museum located in a small town in the center of the country. According to the setup, 21 people were exposed or infected by the spores. At the starting point of the simulation, 7 days from the time of dispersion, 5 patients presented within 12 hours in 2 hospitals with pulmonary and CNS illness. The team was tasked with performing an epidemiologic investigation to identify the probable cause, determine whether it was an infectious or noninfectious agent, define the timeline, identify unrecognized ill people and unaware exposed individuals, and perform contact tracing. The timeline of the scenario is shown in Figure 1.

Figure 1. Timeline of the Scenario. Volunteers were instructed to follow strict routes during a 3-day period in order to create the scenario. The simulation in which the epidemiologic investigation system algorithm was evaluated took place on the 7th day of the scenario.
The dispersion scenario was confined to the interior of the museum. Near the museum, there were several relevant locations, including a shopping mall, a large parking area, and a railway station, through which large numbers of people pass every day. The developers of the epidemiologic investigation system were asked to test their system without knowing the exact nature of the scenario and the timeframe of the simulation.
The BioCell Epidemiologic Investigation System
The epidemiologic investigation system of BioCell EPI is based on automating geometric computations and scaling them to big data. There are 4 defined stages: (1) collecting geolocation samples of patients' phones (Figures 2 and 3); (2) computing and analyzing trajectories of patients based on the samples collected in the previous step; (3) analyzing an infection tree (Figure 4); and (4) extending the analysis to the entire population to detect potentially infected individuals.

Figure 2. Samples Received from a 1-Hour Train Ride, with 1 Train Switch. Although a person's trajectory is continuous in space and time, the samples are relatively sporadic, only once every few minutes, and might have low resolution. This could be improved once the large cellular companies provide their database.

Figure 3. An Example of a Joint Ride of Two Devices.

Figure 4. Infection Trees. Each node represents a person (labeled with his or her phone number), and 2 nodes are connected by a directed edge from A to B if A infected B. The width of an edge correlates with the likelihood that its encounter led to an infection (long or intimate encounters). The X-axis serves as the time axis: the earlier the infection occurred, the farther left the node is drawn.
The Collection Phase
Geolocation data can be collected either by handheld cellular devices or by antennas. A cellular application can get a fine-resolution location with an accuracy of several meters, or a coarse location, using a GPS service or nearby antennas, respectively. Cellular devices also communicate continuously with the networks. The list of devices intercepted by each antenna is recorded, together with the time of interception, for various reasons such as monitoring or optimizing the network operation. The radius of communication of an antenna is set by the operator and varies greatly, from a few meters to thousands of meters. The radius of communication is inversely related to the number of people in the area, so antennas in densely populated areas (eg, shopping malls) have small communication radii, while antennas in sparsely populated areas (eg, countryside) have large communication radii.
Analyzing Trajectories
Although a person's trajectory is continuous in space and time, the intercepted samples are relatively sporadic (once every few minutes) and might have low resolution. Figure 2 shows the samples received from a 1-hour train ride, with 1 train switch. Computing the similarity between a set of samples and a given route has already been studied.19-21 For the sake of an accurate forensic analysis, we were interested in determining whether 2 devices rode in the same car or train. This is challenging, since the devices communicate at different times, with different antennas that are centered at different locations and have different communication radii. See Figure 3 for an example of a joint ride of 2 devices. In road segments where the car's speed is high, a phone may travel a long distance between 2 signals. The antennas intercepting those signals may be so far apart that the areas they cover do not overlap. To address this issue, we developed a set of algorithms and heuristics that allowed us to determine whether 2 sample sets belong to the same car ride.
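The algorithms and heuristics themselves are not described here, so the following is only a minimal sketch of one plausible compatibility test, not the BioCell method. It assumes each sample carries the antenna position in planar meters and its communication radius, and it treats 2 samples as compatible when the distance between the antenna centers can be explained by the 2 radii plus the distance a vehicle could cover in the time between the signals. The names `Sample`, `compatible`, and `joint_ride_score`, as well as the `max_speed` and `window` defaults, are hypothetical.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class Sample:
    t: float       # time of interception, seconds
    x: float       # antenna position (east), meters
    y: float       # antenna position (north), meters
    radius: float  # antenna communication radius, meters


def compatible(a: Sample, b: Sample, max_speed: float) -> bool:
    """True if a single vehicle could have produced both samples.

    The antenna centers may be farther apart than the sum of the radii,
    as long as the time gap lets the vehicle cover the difference.
    """
    gap = hypot(a.x - b.x, a.y - b.y)
    slack = a.radius + b.radius + max_speed * abs(a.t - b.t)
    return gap <= slack


def joint_ride_score(dev_a, dev_b, max_speed=35.0, window=300.0):
    """Fraction of time-adjacent sample pairs that are compatible.

    max_speed (m/s) and window (s) are assumed tuning parameters.
    """
    pairs = [(a, b) for a in dev_a for b in dev_b
             if abs(a.t - b.t) <= window]
    if not pairs:
        return 0.0
    hits = sum(compatible(a, b, max_speed) for a, b in pairs)
    return hits / len(pairs)
```

A score near 1 only says the 2 sample sets are consistent with a shared ride; it is a necessary condition rather than proof, so in practice it would have to be combined with further evidence such as ride duration or known road and rail routes.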
Building an Infection Chain
On completing the trajectory analysis, we obtain a set of joint car rides, joint walks, and encounters between pairs of devices, each with a score proportional to the likelihood that the samples indeed correspond to an encounter. To build an infection tree, we need to consider the likelihood of an encounter, the “quality” of the encounters the users had on the relevant days, as well as the estimated infection time of each patient based on the time symptoms appeared, and the incubation period of the pathogen. (See Figure 4 for an example.) Since the pathogen is not always known, the incubation period may be unknown as well. In such a case, one can iterate over all possible incubation periods and build all possible infection chains. It is likely that, if the pathogen is contagious, setting the incubation period to its true value will yield the longest infection chain, thus providing a tool to predict the incubation period of the pathogen.
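The iteration over candidate incubation periods can be illustrated with a short sketch. It is a simplified reading of the idea above, not the system's implementation: it ignores the encounter likelihood and “quality” scores, assumes a single incubation period shared by all patients, and accepts an edge from A to B only when an encounter between them falls within a tolerance of B's estimated infection time. The function names, the `tol` parameter, and the day-based time units are assumptions.

```python
def longest_chain(onset, encounters, incubation, tol=0.5):
    """Length (in people) of the longest infection chain.

    onset:      {patient: day symptoms appeared}
    encounters: iterable of (patient_a, patient_b, day)
    incubation: candidate incubation period, days
    tol:        allowed gap between encounter and infection time, days
    """
    infected_at = {p: day - incubation for p, day in onset.items()}
    edges = {p: [] for p in onset}
    for a, b, day in encounters:
        for src, dst in ((a, b), (b, a)):
            if src not in onset or dst not in onset:
                continue  # only known patients can form a chain
            if (infected_at[src] < infected_at[dst]      # src infected first
                    and infected_at[src] <= day          # src already infected
                    and abs(day - infected_at[dst]) <= tol):
                edges[src].append(dst)
    # Edges always point forward in time, so the graph is acyclic.
    # Visit the latest-infected patients first so successors are ready.
    depth = {}
    for p in sorted(onset, key=lambda q: infected_at[q], reverse=True):
        depth[p] = 1 + max((depth[d] for d in edges[p]), default=0)
    return max(depth.values(), default=0)


def estimate_incubation(onset, encounters, candidates, tol=0.5):
    """Candidate incubation period that yields the longest chain."""
    return max(candidates,
               key=lambda p: longest_chain(onset, encounters, p, tol))


# Toy data: A infects B on day 10, B infects C on day 13.
onset = {"A": 10, "B": 13, "C": 16}
encounters = [("A", "B", 10), ("B", "C", 13)]
best = estimate_incubation(onset, encounters, range(1, 8))
```

With this toy data only a 3-day incubation period links all 3 patients into one chain, so `best` is 3. For a noncontagious agent, as in this drill, no candidate period produces a chain longer than 1, which is itself informative.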
Participants
A total of 32 people took part in the simulation. Ten days prior to the simulation, each participant received a new cell phone with a tracking app for their coarse location as estimated by the location of the closest cellular antennas. The tracking data from the app were sent to a closed database, to be used by the epidemiologic investigation system at the end of the simulation. The participants were divided into 3 groups: sick patients (n = 10), exposed only (n = 11), and people who were not exposed (n = 11). The participants were instructed to be at specific points in specific times during the entire duration of the experiment, based on their assigned group, but they were blinded to their role in the scenario.
The volunteers simulated a group of tourists visiting the museum during the dispersion phase (5 sick, 4 exposed), locals visiting the museum at a later time during the same day (2 sick), museum workers (3 sick), people passing by outside the museum (7 exposed), people walking in the nearby parking area (3 nonexposed), workers at the shopping mall (3 nonexposed), and people walking in the town, far from the museum (4 nonexposed). In a real-life scenario, many bioterror agents lack unique, easily identifiable symptoms (especially in the early stages), and thus “false patients” create “noise” in any epidemiologic investigation procedure. To evaluate the epidemiologic investigation system's ability to identify such cases, we designated 1 of the nonexposed participants as a “mock patient” dataset.
Key Measures and Performance Indicators
Prior to the drill, 2 key measures and 3 performance indicators were defined in order to evaluate the efficacy of the epidemiologic investigation system. The first key measure was the time required to complete cross comparison of the routes of the cell phone owners, as essential background data for further analysis. The second measure was the time required to identify the place and time of the simulated attack and the accuracy of this estimation. Three values were defined as performance indicators: (1) sensitivity: the percentage of correctly identified exposed cases (11 total, true positive, TP) within the frame of the scenario should be over 90%; (2) specificity: the percentage of correctly identified nonexposed cases (10 total, true negative, TN) should be over 50%; and (3) accuracy: the combined TP and TN out of the total cases should be over 75%. The threshold for each indicator, though somewhat lenient, represents the relative gravity of each measurement for decision makers in a case of a bioterror attack.
Results
The timeline of the epidemiologic investigation system evaluation test is detailed in Table 1. Immediately after receiving the cellular numbers of the first 5 patients (T), a large number of encounters were detected, followed by the drill-down analyses of the patients' behavior, specifically in terms of time and place. At T + 1hr, 3 sites were tagged as “highly probable” candidates for place of dispersion. Six more places were detected not far from the museum and tagged as “lower probability.” At T + 1hr, 45min, the BioCell operators were notified of 6 more patients with similar complaints in the same emergency departments (one of them was actually not related to the event, but the team was blinded to that). By T + 3hr, they had completed analyzing the data of all 11 patients, and possible “ground zero” sites had been narrowed to 2 locations. At T + 4hr, out of the 2 possible locations and based on the total number of correlating data points, the epidemiologic investigation system developers correctly identified the museum as the place of attack. Further analysis correctly yielded the time of the attack (both date and time range). They also noted that they had identified a pattern characteristic of a noninfectious bacterial disease outbreak (rather than a toxin or a viral agent). Following these definitions, they started to analyze the data of all remaining cell phones, aiming to identify all those among the volunteers who were in or near the museum at the relevant period and were suspected to be either sick or exposed. At T + 5hr, they had completed the analysis of all cell phones. Fifteen minutes later, the drill was stopped by the organizers. The scenario was correctly characterized by the system in less than 5 hours from the beginning of the test. All patients as well as the nonrelated patient were correctly identified (sensitivity = 100%).
Three participants from the healthy nonexposed group were falsely identified as being exposed (3 false positives; specificity = 70%). No false-negative cases were found (overall accuracy = 85%).
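The reported indicators follow directly from the counts given in the text: 11 exposed participants all identified (TP = 11, FN = 0), and 3 of the 10 evaluated nonexposed participants flagged as exposed (FP = 3, TN = 7). The short check below uses the standard definitions; the function name `indicators` is illustrative only.

```python
def indicators(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from a confusion matrix."""
    sensitivity = tp / (tp + fn)               # exposed correctly flagged
    specificity = tn / (tn + fp)               # nonexposed correctly cleared
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


sens, spec, acc = indicators(tp=11, fn=0, tn=7, fp=3)
print(f"sensitivity={sens:.0%} specificity={spec:.0%} accuracy={acc:.1%}")
# prints: sensitivity=100% specificity=70% accuracy=85.7%
```

The accuracy of 18/21 is about 85.7%, which matches the 85% figure reported above when truncated, and all 3 values exceed the predefined thresholds of 90%, 50%, and 75%.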
Table 1. Timeline of the Epidemiologic Investigation System Evaluation Test
Discussion
An unusual biological event or an outbreak entails major challenges for both local and national health systems, especially with regard to epidemiologic investigation.22-24 Health systems may find it difficult to cope with the resulting high morbidity and mortality, as well as with the rate of newly diagnosed patients and the need for rapid identification of newly exposed individuals. Traditional methods of epidemiologic investigation are cumbersome and involve relatively large numbers of personnel who need to be knowledgeable about how to properly perform the investigation and who, in many cases, are themselves afraid of becoming sick or infecting their family members.25 Even when equipped and trained personnel are available, a full epidemiologic investigation usually requires well over 24 hours, even for smaller-scale cases.14,15,17,26,27 As a result, the medical response may be adjusted too late, and the same applies to directing and enhancing relevant research efforts and funds for timely development of medical countermeasures against the threat agent.6 The existing gap leads to a high potential for damage because of late identification of the biological agent and delayed response,14,26,28 especially in the case of biological warfare agents.
Therefore, national health systems, when facing an unidentified outbreak, should have full situational awareness as fast as possible, which in turn will allow decision makers and operational organizations and agencies to successfully cope, each at its own level, with the developing scenario.3,4,25 Some major events highlighting these issues from recent history include the anthrax envelopes in 2001, the SARS epidemic in 2003, the E. coli outbreak in 2011, and the 2014-15 Ebola virus disease outbreak in West Africa.15,17,27,28-30 All of these events had a high international impact, but all could have been dealt with more efficiently if decision makers had had a better understanding of the scenario before taking any action and, more important, before publishing misinformed proclamations.
In this drill, which had no support from any cellular companies, we demonstrated that the BioCell-based epidemiologic investigation system has the capabilities and the potential to reinforce, complement, and focus the efforts of traditional techniques of epidemiologic investigation. Our goal was achieved, and the system showed the ability to correctly identify the scenario in a relatively short period, less than 5 hours from the moment the developers received the cellular numbers of the first patients. The data used by the epidemiologic investigation system were coarse (ie, the resolution of the phone location was based on antennas with a range of 60 m to 1500 m and depended on a specifically designed application installed on the phones). However, more accurate locations could be extracted and computed using the cellular networks.
Three participants were falsely identified as having been exposed. This was the result of using the application instead of the databases of the large cellular companies. We regard this as a methodological bias, since we did not want to approach the cellular companies at this stage for the sake of the experiment. In a real-life scenario, we may apply for judicial approval to receive the data, allowing for a more accurate and much faster location of all people involved. Based on the developers' data, we estimate this could be achieved in about 1 hour.
By the end of the study (5 hours from the drill initiation), the developers clearly identified the area near the museum as the relevant area of interest, advising us to notify the people who were walking there during the time of the attack to seek medical assistance, by using text messages. They also advised sending sampling teams to that same area. They added that, in case of a contagious agent, there were several more people (contacts) who should have been included and notified as early as possible, but they stated that this was probably not the case (based mainly on no evidence for secondary infections). They correctly identified 1 of the patients as not being involved in this event.
When evaluated against the key measures and performance indicators, the BioCell-based epidemiologic investigation system demonstrated the ability to shorten the timeline of investigation. It narrowed down the possible locations of “ground zero” in merely 1 hour, and within 4 hours it completed route comparisons and identified the place and time of attack. An additional hour allowed completion of the analysis of all phone routes and identification of possible “exposed” individuals. The performance indicators were met: 100% of the patients were tracked within 5 hours, and the scenario was accurately reconstructed.
The difference that 20 hours can make to the response to an unusual biological event is important not only on the biomedical level (prophylactic treatment of exposed populations) but also for national response and crisis management. Dealing with an unknown agent in an unknown scenario puts a lot of strain on decision makers, mainly through the media and the public, who demand answers and solutions. A fast and reliable epidemiologic investigation system can lower this pressure.
There are several limitations related to the use of such a platform. There is a need to study and discuss legal issues related to this specific setting, taking into account issues of civil rights and privacy on one hand, and welfare and biosecurity issues of the whole population on the other hand. The fact that we approach patients, explain what we intend to do with their phone numbers, and ask them (or their kin in case they are unable to respond) to provide us with the numbers answers only part of the privacy issues. Another question is what we do with all those we identify as being in close contact with the victims, exposed, and even infected. Do we send them text messages? Do we call them for symptom investigation and instructions? Do we approach the media? If so, what do we say? These unresolved issues have not been thoroughly dealt with so far and deserve attention.
Conclusions
In this study, we compared the use of a cell phone–based epidemiologic investigation system to traditional methods in a bioterrorism scenario. A technological tool based on retrospective and prospective geolocation data analysis from cell phones and advanced analytics was shown to allow rapid epidemiologic investigation and contact tracing, 2 critical and clinically significant components of coping with an outbreak. For the first time since we started to perform large-scale national bioterrorism drills, we had the full scenario analyzed within a staggering 5 hours. This shows the potential for this technology to become a “game changer” in the field of rapid epidemiologic investigation, including all major components of fast and accurate identification of the event and contact tracing.
Advantages of the BioCell system over traditional methods include speed of analysis, the ability to characterize the agent as contagious or not, and the ability to define those who are most likely exposed and need to seek medical help. These rest on accurate cell phone information relating to time and place, which also points to contacts in case of a contagious agent. The system should be tested in a large-scale scenario, involving more individuals, dealing with a contagious agent, and using the large cellular companies' data, before it can be defined as operational. In addition, issues of personal privacy and the legal implications of using such tools should be addressed by the appropriate authorities as part of any future development of the system.
Despite these limitations, we find the BioCell system to be advantageous over currently used tools. It could be integrated in conjunction with outbreak epidemiologic investigations and syndromic surveillance efforts, significantly shortening the response time and contributing dramatically to life-saving efforts.
