Assessing the Continuum of Event-Based Biosurveillance Through an Operational Lens

Abstract

This research follows the Updated Guidelines for Evaluating Public Health Surveillance Systems, Recommendations from the Guidelines Working Group, published by the Centers for Disease Control and Prevention nearly a decade ago. Since then, models have been developed and complex systems have evolved with a breadth of disparate data to detect or forecast chemical, biological, and radiological events that have a significant impact on the One Health landscape. How the attributes identified in 2001 relate to the new range of event-based biosurveillance technologies is unclear. This article frames the continuum of event-based biosurveillance systems (that fuse media reports from the internet), models (ie, computational models that forecast disease occurrence), and constructs (ie, descriptive analytical reports) through an operational lens (ie, aspects and attributes associated with operational considerations in the development, testing, and validation of the event-based biosurveillance methods and models and their use in an operational environment). A workshop was held in 2010 to scientifically identify, develop, and vet a set of attributes for event-based biosurveillance. Subject matter experts were invited from 7 federal government agencies and 6 different academic institutions pursuing research in biosurveillance event detection. We describe 8 attribute families for the characterization of event-based biosurveillance: event, readiness, operational aspects, geographic coverage, population coverage, input data, output, and cost. Ultimately, the analyses provide a framework from which the broad scope, complexity, and relevant issues germane to event-based biosurveillance useful in an operational environment can be characterized.

Early recognition and warning of disease outbreaks, spanning human, animal, and plant species, enable health authorities and the public to prepare, intervene, control, and ultimately mitigate an outbreak's potential consequences. In support of this goal, the International Health Regulations (IHRs) were revised in 2005.^1-4 IHR (2005) provides an international legal framework for the early detection and reporting of and response to outbreaks of infectious disease. World Health Organization (WHO) member nations are obligated to develop and maintain surveillance, reporting, notification, verification, and response capabilities. Any nation with knowledge of a disease outbreak of international concern is obligated to report it to WHO within 24 hours. IHR (2005) is designed to ensure timely recognition of disease outbreaks of international public health significance and to promote effective containment before they spread.^2,3,5

This research follows the Centers for Disease Control and Prevention's “Updated Guidelines for Evaluating Public Health Surveillance Systems; Recommendations from the Guidelines Working Group,” published nearly a decade ago.⁶ This influential document described system attributes that could be used to evaluate the usefulness of public health surveillance systems, including system simplicity, flexibility, data quality, acceptability, sensitivity, predictive value positive, representativeness, timeliness, and stability. After the guidelines were published, formal frameworks were developed to assess public health surveillance systems.^7,8 The assessment of public health surveillance systems is progress toward a One Health landscape, which is a worldwide strategy for expanding interdisciplinary collaboration and communications in all aspects of health care for humans, animals, plants, and environment.

Since the public health surveillance system assessments nearly a decade ago, models have been developed and complex systems have evolved with a breadth of disparate data to detect or forecast chemical, biological, and radiological events that have a significant impact in the One Health landscape.^1,4,9,10 Moreover, the development of internet-based biosurveillance systems has added a new dimension to the field. Today, systems must deal with data collected with different methods and quality standards, statistically ill-defined and nonstationary data, and data several steps removed from clinical, syndromic, or test-based data.

New and emerging biosurveillance approaches often use qualitative variables—for example, alerts such as high, medium, or low event severity; 1%, 10%, or 100% disease risk; or categorical variables such as biosurveillance event type. Or they may use quantitative variables (eg, numerical values) to infer the health status of a population or set of populations. How the attributes identified in 2001 relate to current systems for biosurveillance is unclear.

This article frames the continuum of event-based biosurveillance systems (that fuse media reports from the internet), models (computational models that forecast disease occurrence), and constructs (ie, descriptive analytical reports) through an operational lens (ie, aspects and attributes associated with operational considerations in the development, testing, and validation of the event-based biosurveillance methods and models and their use in an operational environment). This science-based assessment framework will ensure that potential users and stakeholders understand the capabilities and limitations of event-based biosurveillance.

Event-Based Biosurveillance

Homeland Security Presidential Directive 21 (HSPD-21) defines the term biosurveillance as the process of active data gathering, with appropriate analysis and interpretation of biosphere data that might relate to disease activity and threats to human or animal health—whether infectious, toxic, metabolic, or otherwise, and regardless of intentional or natural origin—in order to achieve early warning of health threats, early detection of health events, and overall situational awareness of disease activity.¹¹ U.S. Government Accountability Office Report-10-645 expands the scope of biosurveillance to include “pathogens in plants, animals, and humans; food; and the environment.”^12(p6) Historically, biosurveillance has relied on human, animal, and, to a lesser extent, plant populations as early warning sensors for outbreaks. While this practice continues today,¹⁰ additional surveillance options have been developed, further expanding the event-based biosurveillance continuum.^5,13

Surveillance data and results may be quickly and easily distributed globally, a critical development in an age of widespread air travel and international commerce.^14,15 Open-source media analysis may detect evidence of biological threats and/or social disruption, leading to direct and indirect indicators of outbreaks.¹⁶ Moreover, this analysis may provide warning of outbreaks sooner than traditional surveillance methods.¹⁷

Sophisticated analytic techniques have been applied to existing data sources to better distinguish anomalies from baseline data. These include the development of Markov switching models,¹⁸ epidemiologic network models,¹⁹ and Bayesian disease detection models,^20,21 as well as the application of scan statistic analysis.²² The receiver operating characteristic curve is another analytic technique; it is the plot of sensitivity versus 1 − specificity or, equivalently, a detector's true-positive rate versus its false-positive rate as the discrimination threshold is varied. Although receiver operating characteristic curves were unable to distinguish between the clinical presentation of H1N1 (2009) and seasonal influenza, they may have application in discriminating between other diseases.²³
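
As an illustration of the receiver operating characteristic construction described above, the following minimal Python sketch sweeps a discrimination threshold over a set of detector scores and records the false-positive rate (1 − specificity) and true-positive rate (sensitivity) at each step. The scores, labels, and function names (roc_points, auc) are hypothetical and are not taken from any of the cited systems.

```python
def roc_points(scores, labels):
    """Return (false_positive_rate, true_positive_rate) pairs, one per threshold.

    labels use 1 for a true event and 0 for a non-event; both classes
    must be present so that the rates are defined.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]  # threshold above every score: nothing is flagged
    # Lower the threshold step by step; a score at or above it is flagged.
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical detector scores; label 1 marks a week with a confirmed event.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
labels = [1, 1, 0, 1, 0, 0, 1, 0]
curve = roc_points(scores, labels)
print(curve)
print(auc(curve))
```

An area near 0.5 would indicate a detector that discriminates no better than chance, whereas an area near 1.0 would indicate near-perfect separation of events from non-events.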

Existing data sources have been combined in new ways—for example, melding diverse data including pharmaceutical sales, ambulance call logs, school and workplace absenteeism, and emergency room chief complaint records in syndromic surveillance systems.²⁴ However, concerns have been raised about the low specificity of syndromic surveillance, and it remains unclear if syndromic surveillance will provide a temporal advantage over clinical surveillance.²⁵ Data on global avian migration dynamics have been combined with phylogenetic analysis to explain the evolution and dispersion of influenza virus strains.²⁶ Global migration patterns may be useful in predicting the global spread of other diseases.

New data sources have allowed the expansion of biosurveillance activities. Satellite imagery and data have detected crop pests and diseases,²⁷ predicted outbreaks of cholera²⁸ and Rift Valley fever,^29,30 and been used to develop risk maps for malaria.³¹ Improved data processing methods have increased the utility and application of remotely sensed data as well.³² In addition, radio frequency identification and environmental sensors have been applied to outbreak detection.³³ Internet-based disease detection methods have also been developed,³⁴ including search-based detection,³⁵ unstructured event detection,³⁶ and crowd-sourced event reporting³⁷—though the accuracy of these methods is still being evaluated.³⁸ A recent addition to the biosurveillance landscape is the analysis of social media messages.^39-41

Despite these advances in biosurveillance, their performance in an operational environment is yet to be determined. Lloyd-Smith et al⁴² describe a relative paucity of zoonotic disease models, and particularly multihost and vectorborne disease models; this may have an impact on the quality of biosurveillance models for zoonotic diseases. It was noted, however, that a “sense of urgency and global risk” due to an outbreak or the emergence of a new pathogen is contributing to the rapid development and publication of models.⁴² Frameworks to evaluate surveillance systems have tended to focus on a particular type of system.^43,44 Hartley et al⁵ identify a lack of techniques and methods for the evaluation of event-based biosurveillance systems. Therefore, this article proposes a characterization framework that may accommodate the entire continuum of event-based biosurveillance.

Methods

In August 2010, a 2-day subject matter expert workshop was held at the Pacific Northwest National Laboratory (PNNL) in Richland, Washington, to scientifically identify, develop, and vet a set of attributes for event-based biosurveillance. These subject matter experts were invited from 7 federal government agencies and 6 academic institutions researching biosurveillance detection. (One federal expert and 1 academic expert were unable to attend.) The workshop was led and facilitated by scientific staff from PNNL with professional backgrounds in informatics, microbiology, epidemiology, operations research, and biosurveillance system development.

The workshop participants discussed general features of biosurveillance models and systems and lessons learned from previous and current efforts. Consensus was reached on the attributes and features of the continuum of event-detection biosurveillance. The workshop conversation was moderated, and the results of the workshop were reviewed and vetted by participants over a 2-month period. In all, 8 attribute families and 40 attributes within those families are defined to evaluate and catalog event-detection biosurveillance models and systems. The families of attributes are: event, readiness, operational aspects, geographic coverage, population coverage, input data, output, and cost.

Common Terminology

Some terms are used throughout this article and are defined here to help build a vocabulary with which the event-based biosurveillance continuum may be characterized.

We define a biosurveillance event to be a chemical, biological, radiological, nuclear, or high-yield explosive event with focus on the all-hazards and One Health landscape. These events are characterized by evidence of condition and risk. These are neither mutually exclusive nor limited to the following examples for evidence of condition: person-to-person transmission (eg, Mycobacterium tuberculosis), zoonoses (eg, Francisella tularensis), foodborne pathogens (eg, Salmonella), vectorborne pathogens (eg, equine encephalitis virus), waterborne pathogens (eg, Vibrio cholerae), airborne pathogens (eg, influenza), animal-to-animal transmission (eg, Aphthae epizooticae), and plant pathogens (eg, soybean and wheat rusts). Examples for evidence of risk are accidental or deliberate events affecting air or water quality (eg, volcanic ash, pesticide runoff), economically motivated adulteration of the food and pharmaceutical supply, and intentional release.

The continuum of event-based biosurveillance is a scientific discipline in which diverse sources of data (eg, clinical activity, syndromic surveillance, internet and media reports) are characterized prospectively (eg, in a networked information system or a biosurveillance model) to provide information on infectious disease events. Biosurveillance complements traditional public health surveillance to provide both early warning of infectious disease events and situational awareness. This approach can also be less specific than traditional public health surveillance, though such trade-offs may be appropriate for a network designed to provide early warning.¹⁰

A biosurveillance model is broadly defined as an abstract computational, algorithmic, statistical, or mathematical representation that produces informative output related to biological, chemical, radiological, and nuclear event risk or event detection. The model is formulated with a priori knowledge and may ingest, process, and analyze data. A biosurveillance model may be proactive or anticipatory (eg, used to detect or forecast an event, respectively), it may assess risk (eg, contextualized products arising from disparate data sources, surrogate markers, and subject matter expert analysis), or it may be descriptive in nature (eg, used to understand the dynamics or drivers of an event).

Results

We identified a comprehensive set of attributes for the characterization of event-based biosurveillance; they are described below by attribute family.

Event Attribute Family

The event family of attributes is concerned with providing a high-level characterization of the events that can be monitored by the event-based biosurveillance. This family characterizes the event type, the source of the event, and the detection mode of the event-based biosurveillance. The event-type attribute provides an overall classification, by causative agent, of the incidents that may be detected by the biosurveillance model or system. These incidents could be biological, chemical, or radiological.

The source attribute helps characterize the event origin (eg, point source or distributed), specify the status of the etiology of the event (known or unknown), and clarify the motivation (intentional or unintentional). For example, consider a contagious severe acute respiratory syndrome prior to identification of the causative virus. The source attribute would be characterized as a distributed respiratory disease of unknown etiology. An example of a point-source event with known etiology and motivation could be melamine contamination in imported powdered milk products. The source attribute can also characterize a radiological threat such as cobalt-60 (⁶⁰Co) contamination of a city block or americium-241 (²⁴¹Am) contamination of agricultural products.
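
One way the source attribute might be recorded in a catalog of models and systems is sketched below in Python. The class and field names are hypothetical, and the option of leaving motivation unset is an assumption added for events that have not yet been characterized; the workshop did not prescribe a data format.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Origin(Enum):
    POINT_SOURCE = "point source"
    DISTRIBUTED = "distributed"


class Etiology(Enum):
    KNOWN = "known"
    UNKNOWN = "unknown"


class Motivation(Enum):
    INTENTIONAL = "intentional"
    UNINTENTIONAL = "unintentional"


@dataclass(frozen=True)
class SourceAttribute:
    """Hypothetical record of the three facets of the source attribute."""
    origin: Origin
    etiology: Etiology
    motivation: Optional[Motivation] = None  # None means not yet characterized

    def describe(self) -> str:
        motive = self.motivation.value if self.motivation else "undetermined"
        return f"{self.origin.value}, {self.etiology.value} etiology, {motive} motivation"


# Severe acute respiratory syndrome before the causative virus was identified.
sars_early = SourceAttribute(Origin.DISTRIBUTED, Etiology.UNKNOWN)

# Melamine in powdered milk: the text states only that the motivation is
# known; INTENTIONAL is used here as an illustrative assumption.
melamine = SourceAttribute(Origin.POINT_SOURCE, Etiology.KNOWN, Motivation.INTENTIONAL)

print(sars_early.describe())
print(melamine.describe())
```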

The operational capability of event-based biosurveillance to identify or forecast an event is characterized by the detection mode attribute (eg, event detection, establish baseline, or event detection expected with no baseline). For example, are events detected that deviate from the established baseline in frequency and/or magnitude? It should be noted that the baseline fluctuates and that any excess or anomalous occurrence should be statistically characterized and should include some measure of abnormality or confidence.^45-47 Ideally, the event-based biosurveillance should also discriminate anticipated occurrences from unanticipated occurrences in order to identify and distinguish between rare natural disease events and intentionally caused novel disease events. To illustrate, in developed countries foot-and-mouth disease is a rare but expected and naturally occurring disease event for which the baseline incidence is typically set to zero. Surveillance activities are predicated on the assumption that the event must be caught as early as possible (event recognition) against a background in which no disease should occur, yet it is an essentially normal and expected occurrence. This is juxtaposed against the prospect of a hemorrhagic filovirus arising in a population and ecologic context in which the occurrence is not normal (eg, Marburg hemorrhagic fever in Colorado⁴⁸).
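
The requirement that an excess over baseline carry some measure of abnormality or confidence can be illustrated with a short Python sketch. It assumes, purely for illustration, that counts under baseline conditions follow a Poisson distribution whose mean is estimated from recent history; the function names, the example counts, and the 0.01 alert threshold are hypothetical and are not drawn from the cited methods.

```python
import math


def poisson_tail(observed, baseline_mean):
    """Probability of seeing `observed` or more cases under a Poisson baseline."""
    if observed <= 0:
        return 1.0
    below = sum(
        math.exp(-baseline_mean) * baseline_mean ** k / math.factorial(k)
        for k in range(observed)
    )
    return max(0.0, 1.0 - below)


def flag_excess(history, observed, alpha=0.01):
    """Compare an observed count with the mean of historical counts."""
    baseline_mean = sum(history) / len(history)
    p_value = poisson_tail(observed, baseline_mean)
    return {"baseline_mean": baseline_mean, "p_value": p_value, "alert": p_value < alpha}


# Hypothetical weekly case counts for an endemic condition.
history = [3, 5, 4, 6, 2, 4, 5, 3]
print(flag_excess(history, 12))  # large excess over a baseline mean of 4
print(flag_excess(history, 5))   # within ordinary fluctuation

# A zero baseline, as for foot-and-mouth disease in developed countries:
# a single case is already an excess.
print(flag_excess([0, 0, 0, 0], 1))
```

The zero-baseline case shows why event recognition against a background in which no disease should occur differs from detection of deviations from a fluctuating baseline: with a baseline mean of zero, the tail probability of even one case is zero under this assumed model.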

Readiness Attribute Family

Readiness describes the willingness or ability of stakeholders, users, and/or organizations to leverage the event-based biosurveillance in an operational environment and can be either a subjective judgment or an objective measure. Readiness is characterized in 2 contexts: policy (eg, ability to share event-based biosurveillance outputs as needed) and technology (eg, validation and verification of the event-based biosurveillance).

Policy refers to the willingness or ability of stakeholders, users, and/or organizations to share outputs as needed for effective surveillance and response. Considerations of policy readiness include whether:

• event-based biosurveillance output is publicly available or classified;

• event-based biosurveillance output is shared with built-in time lag or clearance processes;

• data sharing is prohibited by regulation;

• a memorandum of understanding or similar document currently exists or may exist in the future.

The technology attribute refers to a technology's readiness for operational use. The most widely used measure for this is assigning a technology readiness level, a measurement system that supports assessments of the maturity of a particular technology and enables the consistent comparison of maturity between different technologies. Technology readiness levels were originally developed and used by the National Aeronautics and Space Administration (NASA) for technology planning and have been widely adopted in government and industry. Some key aspects of technology readiness include:

• verification and validation of the algorithms against historical data;

• degree of rigor in testing from software, usability, and workflow perspectives;

• level of use within an operational environment rather than in a prototype pilot or development setting;

• basic principles and concepts of the event-based biosurveillance are defined and demonstrated.

Operational Aspect Attribute Family

The operational aspect family of attributes focuses on quantifying the overall operational characteristics of the biosurveillance detection model or system. Its attributes are system requirements, ability to continue tracking the event, system redundancy/reliability, operational mode (always on vs. activation required), scalability, robustness, and higher order effects.

The system requirements attribute specifies the requirements needed to operate the event-based biosurveillance. These requirements can be defined as information technology infrastructure (eg, desktop or supercomputer), operating systems, specific software (eg, SQL, MATLAB^®, registered trademark of The MathWorks), complexity (eg, qualitative requirements of components under standard operating conditions), activation (eg, real-time, 24/7, normal or nonholiday hours, during response to a crisis), and subject matter expertise (eg, infectious disease medicine, modeling, foreign language translation, plant and environmental health).

The ability to continue tracking an event relates to the ability of the biosurveillance model or system to follow an event throughout the event life cycle. For example, an event-based biosurveillance may be quite successful in detecting or forecasting an outbreak of disease, but it may lose its sensitivity as the event progresses (eg, background level of event indicators is too high due to a media announcement).

The system redundancy attribute measures the ability of the biosurveillance model or system to operate in the face of component loss or degradation. These components compose the set of requirements for event-based biosurveillance function and could include staff, information technology infrastructure, data inputs, or interagency agreements, and so forth. Without system redundancy, the loss of any one of these components would negatively affect event-based biosurveillance performance.

Scalability relates to how well event-based biosurveillance may be applied to various volumes of data input, levels of geographic coverage, or numbers of targets. It also refers to the maximum potential effectiveness of a model were it to be transitioned to another volume of data input, geographic coverage, number of targets, and so forth.

Robustness measures the event-based biosurveillance's fidelity under significant departures from its assumptions—that is, the degree to which the event-based biosurveillance accurately reproduces features of a real-world system. Can the event-based biosurveillance cope with low-quality data? Does it use multiple component algorithms (such models or systems may be resistant or sensitive to deviations in data quality, fitness, assumptions) or multiple data in aggregate with potential overlap? Is the system able to adapt or be adapted to as yet unknown events? Is the event-based biosurveillance able to accept real-world data with possible inconsistencies and incompleteness? How compromised (if at all) is the analysis by breaks in the historical maintenance of the system?

The first-order effects of an event include all of the consequences directly related to that event. Consideration of higher order effects, however, includes impacts that are indirectly caused by the event and that have some dependency on consequences directly affected by the event. This may include the cleanup of a radiological attack or the loss of consumer confidence in a food product or pharmaceutical as a result of a contamination event. Note that there might be cases when sensitivity to higher order effects conflicts with sensitivity to first-order effects.

Geographic Coverage Attribute Family

Geographic coverage refers to the physical domain in which the event-based biosurveillance operates. The geographic coverage family of attributes evaluates the geospatial domain associated with detecting or forecasting a biosurveillance event. This family of attributes characterizes density and the input and output geography.

Density, in terms of geographic coverage, represents the number of “sensors” per domain (eg, the number of sensors per square mile, surveys). This drives confidence in a forecast of risk or condition in a geographic domain.

Input geography refers to the domain of the data that the event-based biosurveillance uses for event detection. For example, remotely sensed sea surface temperature increases and chlorophyll levels are inputs used to predict cholera outbreaks in Bangladesh.²⁸ This attribute could answer whether the reports come from the local level or are geographically unidentifiable. For example, it will help determine if a regional report of a specific event came from that region or if the report is about something that happened in that region.

The output geography attribute describes the resulting domain where the event is detected or forecast. This helps determine the location of the point source for the event detected and whether inputs need to be from the same region as the requested output, and it helps determine what region-specific events may occur.

Population Coverage Attribute Family

The population coverage attribute family describes the underlying population that is covered by the event-based biosurveillance. Attributes include quantitative and qualitative measures of a population's size, density, and species.

The size attribute can be either a quantitative value (eg, total spore count, size of the population of Chicago, number of homeless people exposed to tuberculosis) or a qualitative one (eg, flock, herd, family, community).

The density attribute refers to the quantitative or qualitative value for the population count per geographic area or the sampling scheme to drive the event-based biosurveillance (eg, number of samples polled to represent the population at large or the number of deaths due to disease to drive a forecast model).

The species attribute characterizes the specific population included in the event-based biosurveillance to distinguish among models and systems for human, animal, and plant health. The population's physiological status would be characterized here (eg, immune status).

Input Data Attribute Family

The overarching goal of the input data family of attributes is the characterization of event-based biosurveillance according to its general and specific data requirements. This includes determining the characteristics required of the data, the availability of such data, and whether an input requirement is relevant. Input data have great bearing on operational utility and are summarized using the following attributes: accessibility, content, granularity, indicators and warnings, latency, longevity, quality, and utility.

The input data accessibility attribute refers to the availability of the data for use in event-based biosurveillance as well as the data-gathering and management processes (eg, commodity trading logs from China). When data are accessible, they can be consistently accessed as often as needed for the event-based biosurveillance and do not prevent the scheduled operation of the event-based biosurveillance. Human and technological resources also need to be considered when evaluating input data accessibility. The accessibility of the input data may be affected by the data's availability to the event-based biosurveillance, whether the data are formatted for use in the event-based biosurveillance, and the frequency at which the data are available.

Input data content is the data specification required by the event-based biosurveillance, such as the quantity and class of data needed to produce meaningful results. The content attribute also spans the quantity and class of other data required of the event-based biosurveillance (eg, demographic, behavioral, and exposure information for the acute emergent event). Additionally, the dependence of the data on other systems or constructs may be included. The manner of data collection is also of interest, including the number and type of sources and the requirement to follow up or otherwise update the data. Some typical questions asked about data content include: Are the data simple case counts over time? Are the data complex (eg, consisting of large numbers of variables and associated metadata with specific properties such that any deviation would invalidate the event-based biosurveillance results)? Examples of input data content may be one or more of, but are not limited to, the following:

• Diagnostic—Does the event-based biosurveillance use an independent empirical confirmation?

• Syndromic—Does it employ surveillance using health-related data that precede or that are a proxy for diagnosis? Does it describe analysis of diagnostic result data and consider the approach to be one that looks for anomalies in “case counts” versus seasons and geography?

• Curated data—Is the event-based biosurveillance based on data maintained in a curated format (eg, a database or data warehouse) such that the event-based biosurveillance results are the most recent available? Does the event-based biosurveillance account for historical changes made to the underlying data (eg, updating case definitions in archives of curated data)? Does the event-based biosurveillance automatically update based on changes to these data?

• Dynamic data—Is the event-based biosurveillance focused on a predetermined target of interest, or can it discover items of interest unconstrained by a particular target set (ie, anomalies)? Are the results in a static or dynamic format (eg, disease frequency updated dynamically in a geographic domain, real-time assessment of febrile airline passengers arriving on international flights)?

The input data granularity attribute describes the level of detail expected in the input data sources. This attribute can be based on a single input data field, an entire data table, or the total body of all input data sources. A mismatch between the input data granularity required by the event-based biosurveillance and that of the data available for input will affect the results of the event-based biosurveillance. Data with greater detail may, with some effort, be aggregated to a less detailed level, but the reverse is not always true.
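
The asymmetry noted above, in which detailed data can be rolled up but coarse data cannot generally be broken back down, can be seen in a brief Python sketch that aggregates daily case counts to ISO weeks. The counts and the function name aggregate_to_weeks are hypothetical.

```python
from collections import defaultdict
from datetime import date


def aggregate_to_weeks(daily_counts):
    """Sum daily counts into (ISO year, ISO week) totals."""
    weekly = defaultdict(int)
    for day, count in daily_counts.items():
        iso_year, iso_week, _ = day.isocalendar()
        weekly[(iso_year, iso_week)] += count
    return dict(weekly)


# Hypothetical daily case counts.
daily = {
    date(2010, 8, 2): 3,
    date(2010, 8, 3): 1,
    date(2010, 8, 8): 2,
    date(2010, 8, 9): 4,
}
weekly = aggregate_to_weeks(daily)
print(weekly)
```

Once only the weekly totals are retained, the day on which each case occurred cannot be recovered from them, so a model that requires daily input could not be driven by a source that reports only at the weekly level.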

There are 2 different types of indicators and warnings attributes. Direct indicators and warnings are unambiguous information that an event is indeed occurring.¹⁶ Indirect indicators and warnings are proxy data that indicate the circumstances wherein a biosurveillance event is likely to occur.

Input data latency is the time between the current date and the most recent date for which data are available to the event-based biosurveillance. Latency is a function of inherent time lags in the processes of collecting, transmitting, editing, and preparing data for analysis. Additional delays may be incurred in order to obtain independent confirmation of observed manifestations of the emergent event. Input data latency is a property of the data source and associated collection methods. It should not be used to identify the refresh rate required by a system or model.

The following is an example scenario for the latency attribute: an environmental sensor collects aerosol samples on a filter at a physical site, as input samples. These filter samples are then processed in the laboratory at another location to produce a signal related to the microorganisms that are present on the filter sample. The time it takes for the sample to be collected until the sample results are available is the latency.
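
The two views of latency given above, the lag of the most recent available data behind the current date and the elapsed time from sample collection to result, reduce to simple time differences. The Python sketch below uses hypothetical timestamps and function names.

```python
from datetime import datetime, timedelta


def sample_latency(collected_at: datetime, results_available_at: datetime) -> timedelta:
    """Time from sample collection until results are available."""
    if results_available_at < collected_at:
        raise ValueError("results cannot precede collection")
    return results_available_at - collected_at


def data_latency(now: datetime, most_recent_data_at: datetime) -> timedelta:
    """Time between the current date and the most recent available data."""
    return now - most_recent_data_at


# Hypothetical filter sample: collected on site, processed at a remote laboratory.
collected = datetime(2010, 8, 16, 8, 0)
reported = datetime(2010, 8, 17, 14, 30)
latency = sample_latency(collected, reported)
print(latency.total_seconds() / 3600, "hours")
```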

Input data longevity measures how long the input data are considered valid. In many cases, the data on which the event-based biosurveillance operates are specific to a particular time period. These data are often first reported as preliminary and then become current before finally becoming historical data that may be revised. The longevity of input data is tied to their intended use and therefore is attributed to the event-based biosurveillance that employs them. Input data longevity helps determine whether the input data expire and the length of time that the data are useful.

Input data quality may be appraised in terms of consistency, comprehensiveness, accuracy, timeliness, and validity. Poor quality data might contain simple errors such as misspellings, multiple addresses for a single entity, or missing information, or they may come from a biased source. Many factors contribute to data quality, but perhaps the 2 most important are the validity and completeness of the data. Degradation of these 2 factors will negatively affect event-based biosurveillance performance. Example considerations include whether the data elements satisfy the needs of their intended use, whether the data are a complete and accurate portrayal of the actual phenomenon, whether there are any internal conflicts in the data, and how significantly data quality issues affect outputs.
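
Completeness and validity can each be summarized as a simple proportion. The Python sketch below is one possible scoring approach under stated assumptions: completeness is taken as the share of required fields that are populated, and validity as the share of populated values that pass a field-specific check. The records, field names, and checks are hypothetical.

```python
def completeness(records, required_fields):
    """Fraction of required fields that are populated across all records."""
    total = len(records) * len(required_fields)
    if total == 0:
        return 0.0
    filled = sum(
        1
        for record in records
        for field in required_fields
        if record.get(field) not in (None, "")
    )
    return filled / total


def validity(records, validators):
    """Fraction of populated values that pass their field's validator."""
    checked = 0
    passed = 0
    for record in records:
        for field, is_valid in validators.items():
            value = record.get(field)
            if value in (None, ""):
                continue  # missing values count against completeness, not validity
            checked += 1
            if is_valid(value):
                passed += 1
    return passed / checked if checked else 0.0


# Hypothetical case reports with a blank county, a missing date,
# and an impossible negative count.
records = [
    {"date": "2010-08-02", "county": "Benton", "cases": 3},
    {"date": "2010-08-03", "county": "", "cases": 1},
    {"date": "2010-08-04", "county": "Franklin", "cases": -2},
    {"date": None, "county": "Benton", "cases": 0},
]
required = ["date", "county", "cases"]
checks = {"cases": lambda value: isinstance(value, int) and value >= 0}

print(completeness(records, required))
print(validity(records, checks))
```

Keeping the two measures separate makes it possible to see whether a degraded output is more likely attributable to missing information or to values that are present but implausible.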

Input data utility is the suitability of the input data to the expected event-based biosurveillance outputs. This attribute includes the degree of preprocessing that input data require to be employed by an event-based biosurveillance. Many event-based biosurveillances rely on supplementary data beyond simple case counts, such as demographic attributes of affected populations, event details, and other exogenous threat indicators. It is the sum total of these data that determines their utility. Example considerations include whether these data are a proxy for the optimal but unobtainable data or whether the data are a subset of other easily obtained data.

Output Attribute Family

The output family of attributes describes event-based biosurveillance with respect to the information it produces. While any event-based biosurveillance will produce some form of output, the fit, form, and function of that output must be relevant and usable for the specific analysis or monitoring scenario. The following attributes are key facets of this important family: accessibility, confidence, content, granularity, latency, longevity, quality, and utility.

The degree to which the results from an event-based biosurveillance are available to end users is captured by the output accessibility attribute. Accessibility refers to the ease of dissemination of event-based biosurveillance output. When the output information is accessible, it could potentially enable the use of model results by decision makers and planners. This attribute includes an assessment of the channels through which the findings are published or distributed. The accessibility of the output data may be assessed by permissions and/or rights to share the data, interagency data access, sensitivity of data (eg, personally identifiable information, Health Insurance Portability and Accountability Act information), the postprocessing that output data require prior to consumption, and the frequency of publication of output data.

Output confidence relates to the accuracy of the event-based biosurveillance results. It is used to evaluate the likelihood that the confidence interval contains the true result. This attribute helps evaluate whether the event-based biosurveillance generates estimates at specific intervals with some degree of probability. Confidence also facilitates the evaluation of estimates for frequency and probability distributions.

The output content attribute defines what the event-based biosurveillance generates as a product. The output may be qualitative or quantitative and may include probabilities, forecasts, digests, counts, warnings, or lists of outlying observations. The content attribute also spans the quantity and class of data produced by the event-based biosurveillance. If the event-based biosurveillance output lends guidance to other systems, this should be included. Output may be an analytic report or document arising from subject matter expert analysis of raw and potentially disparate data types that leverage qualitative models, computational models, or both. Output content might be binary, categorical, or interval responses, and it may contain structured warnings or graded alerts. The output may also contain forecasts, which may take the form of numeric forecasts, probabilities that discrete outcomes will occur, projections based on assumed scenarios, or some combination thereof.

Output granularity describes the level of detail provided in the output data. This attribute is based on the spectrum of output that an event-based biosurveillance may generate. The output may be a single data field, an output table, or a graphic representation of the output data. It is not always possible to disaggregate low-detail output to a higher level of detail. This attribute can help the model forecast a region of infection, a specific population of infection, a specific number of individuals infected (adults, children), and so forth. It also helps evaluate the level of detail for the resultant data.

Output latency describes the time between the current date and the most recent date for which output is available from the event-based biosurveillance. Output latency consists of the delay between the event-based biosurveillance system's first published output and the time required to produce additional published output. An example consideration is the length of time before the results are refreshed.

Output longevity is a measure of how long event-based biosurveillance output may be considered valid. In most cases, the information produced by the event-based biosurveillance is specific to a given time (as well as the place and set of circumstances). The output may also go through a life cycle wherein it is preliminarily reported, revised, and then completed. This attribute also addresses whether the expiration time is clearly indicated on products.

Resultant information that fulfills the purpose of an event-based biosurveillance is measured by the output quality attribute. This event-based biosurveillance attribute encompasses the concepts of scientific accuracy and precision as well as the related concepts of sensitivity, specificity, and negative predictive value. This attribute addresses output data bias, the relevance of the output, and the usefulness of data to other event-based biosurveillances.
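The accuracy-related concepts named above have standard definitions in terms of true and false positives and negatives. The Python sketch below computes them from alert counts; it assumes that each alert or non-alert has been compared against a reference standard of confirmed events, and the counts shown are hypothetical.

```python
from typing import Dict, Optional


def _ratio(numerator: int, denominator: int) -> Optional[float]:
    """Return numerator/denominator, or None when the ratio is undefined."""
    return numerator / denominator if denominator else None


def output_quality(tp: int, fp: int, tn: int, fn: int) -> Dict[str, Optional[float]]:
    """Standard accuracy measures from true/false positive and negative counts."""
    return {
        # Share of real events that produced an alert.
        "sensitivity": _ratio(tp, tp + fn),
        # Share of non-event periods that correctly produced no alert.
        "specificity": _ratio(tn, tn + fp),
        # Share of "no alert" outputs that were truly non-events.
        "negative_predictive_value": _ratio(tn, tn + fn),
    }


# Hypothetical evaluation period: 8 detected events, 2 missed events,
# 5 false alerts, and 85 correctly quiet periods.
for name, value in output_quality(tp=8, fp=5, tn=85, fn=2).items():
    print(name, value)
```

These measures cover only the accuracy component of output quality; bias, relevance, and usefulness to other event-based biosurveillances would need to be assessed separately.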

The output utility attribute gauges the usefulness of the output in analysis, situational awareness, or decision support. Examples of such output include projected attack rates or the probability of a potential outcome. Many acute emergent events evolve over time. It is the event-based biosurveillance system's output at critical junctures of this evolution that determines its utility (eg, do the output data lend themselves to further use by decision makers?). The following are examples of the utility of an event-based biosurveillance:

• analysis of consequences (ie, what-if scenarios);

• situational awareness (the output data describe events that are proximal in time and space);

• decision support (biosurveillance system/results may be leveraged alongside raw data, domain knowledge, or heuristic rules to inform decisions);

• the output data's usefulness as input data to other or subsequent event-based biosurveillances.

Cost Attribute Family

Cost is a largely objective measure of the total necessary input of time, money, and other resources into event-based biosurveillance. It is also a measure of whether the source of those resources can be expected to continue to provide those resources to the event-based biosurveillance. The cost family of attributes comprises 3 main attributes: sustainment of funding, research and development, and operations and human capital.

Sustainment of funding refers to the ability of a biosurveillance model or system to continue to receive necessary resources from its funding source. The willingness of the funding source to sustain funding may be a result of the success of the event-based biosurveillance, the policy readiness of the event-based biosurveillance, or a number of other reasons. It is important to note that even the very best of biosurveillance models and systems cannot reliably function without a steady stream of resources. As a result, the sustainment of funding is important for all event-based biosurveillances since an organization that plans to use the event-based biosurveillance will invest much of its own time and resources in training and integrating the event-based biosurveillance with its internal work processes. This attribute can be measured both in units of time and resources as well as by the future prospects for funding of the event-based biosurveillance.

The research, development, testing, and evaluation costs relate to constructing event-based biosurveillance and preparing it for an operational setting. This operational setting will likely differ from the setting in which the event-based biosurveillance was developed, so additional resources will likely need to be allocated to prepare the event-based biosurveillance for this new environment. Research, development, testing, and evaluation also include the costs of transitioning a model or system to use expanded data volume, including hiring staff and acquiring necessary hardware and software. Considerations for these costs include:

• the cost to develop a prototype biosurveillance model or system from a research concept;

• the cost to make the prototype event-based biosurveillance operational;

• the cost to increase the technology readiness level.

Operations costs are those related to the operational aspects (eg, data gathering, data processing) of a biosurveillance model or system. This attribute also includes the acquisition and retention of trained individuals and field experts and updating the model or system in order to remain current. It includes the cost of the data used by the event-based biosurveillance, the cost to retain valued staff, and the cost to update the event-based biosurveillance system to maintain performance.

Discussion and Summary

Authorities with oversight of One Health^49 priorities are concerned with the forecast and detection of biological, chemical, radiological, and nuclear events that have significant impact in their jurisdictional purview. Calls for collaboration among veterinarians, physicians, plant experts, and public health professionals to improve biosurveillance effectiveness are rising.^12,50 The research highlights the interconnection among human and animal disease with ecology and crop and plant science (eg, famine caused by crop disease and drought drives malnutrition, a critical risk factor in many human and animal diseases).

Recent biosurveillance research, such as the Institute of Medicine/National Research Council's BioWatch and Public Health Surveillance: Evaluating Systems for the Early Detection of Biological Threats, highlights the importance of better collaboration between BioWatch, a biosurveillance program that seeks to detect the release of airborne pathogens in American cities, and various public health surveillance systems.^49,50 The report evaluates the relative merits and potential capabilities of the BioWatch system (Generation 2 and Generation 3). It further describes characteristics of an “enhanced national surveillance system” that relies on hospitals and public health systems to provide a rapid response to bioterrorist attacks or other biothreats. The report emphasizes that the 2 systems are complementary, both in their current configuration and the conceptual “enhanced” configuration. Finally, the report states the importance of conducting systematic testing and evaluation of current and future biosurveillance efforts, thus underscoring the importance of the research presented herein to identify and describe attributes for the characterization of event-based biosurveillance.

Previous work to characterize biosurveillance established attributes of public health surveillance systems by providing descriptive summaries of their characteristics.^6-8 This novel work forms an analytical framework to view and characterize the event-based biosurveillance continuum. This article represents the culmination of this effort and the first major product of the workshop. Future work will include developing specific evaluation metrics from the proposed attributes and further elucidating connections between the attributes (eg, data input latency and output latency are strongly tied to policy).

Eight families of attributes and more than 40 individual attributes have been presented to describe event-based biosurveillance. We posit these attributes are sufficient to characterize the event-based biosurveillance continuum. This scientifically driven process will ensure that potential users and stakeholders use common attributes to describe event-based biosurveillance, and they can use these attributes to further evaluate event-based biosurveillance and understand its capabilities and limitations. Ultimately, the evaluation of biosurveillance models and data sources will provide key information to enable the identification of gaps and recommendations for improvements to existing models, which will optimize the utility of the models in an operational environment.

Authors' Contributions

RW, CC, and RB designed the experiments and organized the biosurveillance workshop. CC, JC, ML, and RB drafted the manuscript. AC, GD, DH, EL, CN, NN, JO, and CT revised the document for critical intellectual content. All authors participated in the workshop and contributed to the analysis and interpretation of the biosurveillance continuum. All authors read and approved the final manuscript.

Acknowledgments

This study was supported through a contract to Pacific Northwest National Laboratory from the Science and Technology Directorate, Chemical and Biological Division, Threat Characterization and Attribution Branch, of the U.S. Department of Homeland Security (DHS). The authors thank Christine Noonan and Andrew Cowell for information analysis and helpful discussions during manuscript development. The authors are grateful for the operational biosurveillance subject matter expertise provided by management and analysts at DHS's National Biosurveillance Integration Center. The authors' opinions do not necessarily reflect those of their organizations. Pacific Northwest National Laboratory is operated by Battelle for the U.S. Department of Energy under contract DE-AC05-76RLO 1830.

References

1. World Health Organization. International Health Regulations. Geneva: World Health Organization, 1969.

2. Baker, Fidler. Global public health surveillance under new international health regulations. Emerg Infect Dis. 2006 Jul;12(7):1058–1065.

3. Sturtevant, Anema, Brownstein. The new International Health Regulations: considerations for global public health surveillance. Disaster Med Public Health Prep. 2007 Nov;1(2):117–121.

4. World Health Organization. Frequently Asked Questions about the International Health Regulations. Geneva: World Health Organization, 2005. http://www.who.int/csr/ihr/howtheywork/faq/en/print.html. Accessed November 9, 2011.

5. Hartley, Nelson, Walters, et al. The landscape of international event-based biosurveillance. Emerg Health Threats J. 2010;3:e3.

6. CDC Guidelines Working Group. Updated guidelines for evaluating public health surveillance systems. MMWR Recomm Rep. 2001;50(RR-13):1–35.

7. Buehler, Hopkins, Overhage, Sosin, Tong; CDC Working Group. Framework for evaluating public health surveillance systems for early detection of outbreaks: recommendations from the CDC Working Group. MMWR Recomm Rep. 2004;53(RR-5):1–11.

8. European Centre for Disease Prevention and Control. Framework for a Strategy for Infectious Disease Surveillance in Europe (2006-2008). 2005. http://www.ecdc.europa.eu/en/activities/surveillance/documents/0806_framework_surveillance_strategy_in_europe.pdf. Accessed November 9, 2011.

9. Mazet, Clifford, Coppolillo, Deolalikar, Erickson, Kazwala. A “one health” approach to address emerging zoonoses: the HALI project in Tanzania. PLoS Med. 2009 Dec;6(12):e1000190.

10. Kahn. Animals: the world's best (and cheapest) biosensors. Bull At Sci. 2007. http://www.thebulletin.org/web-edition/columnists/laura-h-kahn/animals-the-worlds-best-and-cheapest-biosensors. Accessed November 9, 2011.

11. The White House. Homeland Security Presidential Directive/HSPD-21: Public Health and Medical Preparedness. 2007. http://www.fas.org/irp/offdocs/nspd/hspd-21.htm. Accessed November 9, 2011.

12. U.S. Government Accountability Office. Biosurveillance: Efforts to Develop a National Biosurveillance Capability Need a National Strategy and a Designated Leader. Report No. GAO-10-645. June 2010. http://www.gao.gov/new.items/d10645.pdf. Accessed November 9, 2011.

13. Walters, Harlan, Nelson, Hartley, Voeller. Data Sources for Biosurveillance. Wiley Handbook of Science and Technology for Homeland Security. New York: John Wiley & Sons, 2008.

14. National Electronic Disease Surveillance System Working Group. National Electronic Disease Surveillance System (NEDSS): a standards-based approach to connect public health and clinical medicine. J Public Health Manag Pract. 2001 Nov;7(6):43–50.

15. Widdowson M-A, Bosman, van Straten, et al. Automated, laboratory-based system using the internet for disease outbreak detection, the Netherlands. Emerg Infect Dis. 2003;9(9):1046–1052.

16. Wilson, Polyak, Blake, Collmann. A heuristic indication and warning staging model for detection and assessment of biological events. J Am Med Inform Assoc. 2008 Mar-Apr;15(2):158–171.

17. Nelson, Brownstein, Hartley. Event-based biosurveillance of respiratory disease in Mexico, 2007-2009: connection to the 2009 influenza A(H1N1) pandemic? Euro Surveill. 2010 Jul;15(30):19626.

18. H-M, Daniel, Chen. Bioterrorism event detection based on the Markov switching model: a simulated anthrax outbreak study. Intelligence and Security Informatics, 2008 ISI IEEE International Conference, June 17-20, 2008.

19. Reis, Kohane, Mandl. An epidemiological network model for disease outbreak detection. PLoS Med. 2007;4(6):e210.

20. Cooper, Dash, Levander, Wong W-K, Hogan, Wagner. Bayesian biosurveillance of disease outbreaks. Proceedings of the 20th Conference on Uncertainty in Artificial Intelligence. Banff, Canada: AUAI Press, 2004.

21. Jiang, Cooper. A Bayesian spatio-temporal method for disease outbreak detection. J Am Med Inform Assoc. 2010;17(4):462–471.

22. Burkom, Elbert. Biosurveillance applying scan statistics with multiple, disparate data sources. J Urban Health. 2003;80:i131–i2.

23. Ong, Chen, Lin, et al. Improving the clinical diagnosis of influenza—a comparative analysis of new influenza A (H1N1) cases. PLoS One. 2009;4(12):e8453.

24. Hurt-Mullen, Coberly. Syndromic surveillance on the epidemiologist's desktop: making sense of much data. MMWR Morb Mortal Wkly Rep. 2005;54:141–146.

25. Buehler, Berkelman, Hartley, Peters. Syndromic surveillance and bioterrorism-related epidemics. Emerg Infect Dis. 2003;9(10):1197–1204.

26. Bedford, Cobey, Beerli, Pascual. Global migration dynamics underlie evolution and persistence of human influenza A (H3N2). PLoS Pathog. 2010;6(5):e1000918.

27. Riedell, Osborne, Hesler. Insect pest and disease detection using remote sensing techniques. Proceedings of the 7th International Conference on Precision Agriculture, Minneapolis, MN, 2005.

28. Constantin de Magny, Murtugudde, Sapiano, et al. Environmental signatures associated with cholera epidemics. Proc Natl Acad Sci U S A. 2008;105:17676–17681.

29. Anyamba, Chretien J-P, Small, et al. Prediction of a Rift Valley fever outbreak. Proc Natl Acad Sci U S A. 2009;106(3):955–959.

30. Linthicum, Anyamba, Tucker, Kelley, Myers, Peters. Climate and satellite indicators to forecast Rift Valley fever epidemics in Kenya. Science. 1999;285(5426):397–400.

31. Omumbo, Hay, Snow, Tatem, Rogers. Modelling malaria risk in East Africa at high-spatial resolution. Trop Med Int Health. 2005;10(6):557.

32. Scharlemann JPW, Benz, Hay, et al. Global data for ecology and epidemiology: a novel algorithm for temporal Fourier processing MODIS data. PLoS One. 2008;3(1):e1408.

33. Chowdhury, Chowdhury, Sultana. Real-time early infectious outbreak detection systems using emerging technologies. International Conference on Advances in Recent Technologies in Communication and Computing (ARTCom '09); October 27-28, 2009.

34. Boulos MNK, Sanfilippo, Corley, Wheeler. Social web mining and exploitation for serious applications: technosocial predictive analytics and related technologies for public health, environmental and national security surveillance. Comput Methods Programs Biomed. 2010;100(1):16–23.

35. Butler. Web data predict flu. Nature. 2008;456:287–288.

36. Keller, Blench, Tolentino, et al. Use of unstructured event-based reports for global infectious disease surveillance. Emerg Infect Dis. 2009 May;15(5):689–695.

37. Brownstein, Freifeld, Madoff. Digital disease detection—harnessing the web for public health surveillance. New Engl J Med. 2009 May 21;360(21):2153–2157.

38. Google Flu Trends estimates off, study finds. Science Daily. May 17, 2010. http://www.sciencedaily.com/releases/2010/05/100517101714.htm. Accessed November 10, 2011.

39. Chew, Eysenbach. Pandemics in the age of Twitter: content analysis of tweets during the 2009 H1N1 outbreak. PLoS One. 2010;5(11):e14118.

40. Culotta. Towards detecting influenza epidemics by analyzing Twitter messages. 1st Workshop on Social Media Analytics (SOMA '10). ACM, 2010.

41. Corley, Cook, Mikler, Singh. Text and structural data mining of influenza mentions in web and social media. Int J Environ Res Public Health. 2010;7(2):596–615.

42. Lloyd-Smith, George, Pepin, et al. Epidemic dynamics at the human-animal interface. Science. 2009;326(5958):1362–1367.

43. Sosin. Draft framework for evaluating syndromic surveillance systems. J Urban Health. 2003;80(Suppl 1):i8–i13.

44. Siegrist, Pavlin. Bio-ALIRT biosurveillance detection algorithm evaluation. MMWR Morb Mortal Wkly Rep. 2004;53(Suppl):152–158.

45. Shea, Lister. The BioWatch Program: Detection of Bioterrorism. Congressional Research Service Report No. RL 32152. Washington, DC: CRS, 2003.

46. Cohen. Swine flu outbreak out of Mexico? Scientists ponder swine flu's origins. Science. 2009 May 8;324(5928):700–702.

47. Cohen, Enserink. Infectious diseases: as swine flu circles globe, scientists grapple with basic questions. Science. 2009 May 1;324(5927):572–573.

48. Imported case of Marburg hemorrhagic fever—Colorado, 2008. MMWR Morb Mortal Wkly Rep. 2009;58(49):1377–1381.

49. Kahn. Confronting zoonoses, linking human and veterinary medicine. Emerg Infect Dis. 2006 Apr;12(4):556–561.

50. Committee on Effectiveness of National Biosurveillance Systems. BioWatch and Public Health Surveillance: Evaluating Systems for the Early Detection of Biological Threats. Washington, DC: National Academies Press, 2010.