Public health in digital cities
Transcription
Public health in digital cities
Knowledge & Innovation Community EIT ICT Labs Technical Report TR2011-003 Innovation Radar Public health in digital cities December 2011 Public health in digital cities: Syndromic surveillance 1 Introduction The sustained migration towards cities, for work and for improved living conditions, places increasingly high demands on urban infrastructures. Cities of the future will require more efficient means of transportation, equal distribution of resources such as water and electricity, and robust healthcare networks to accommodate larger populations. To aid the growing numbers of city inhabitants in collectively asserting their will in issues that affect their life chances, the cities of the future will also require inclusive means of political participation, and tools for increasing the transparency of decision making processes [1]. Digital cities of the future is a European Institute of Innovation & Technology (EIT) thematic action line that investigates the future of urban environments, with a focus on the information technologies employed in these settings. This report focuses on the intersection of the digital city, and an informationintensive form of public health surveillance called syndromic surveillance, where data gathered for non-diagnostic purposes are used to identify signals relevant to public health. Urban surveillance systems can be found in almost any location linked to consumption: supermarkets, gas stations, shopping centres, cinemas, restaurants, etc. The most common form is a security camera that monitors the premises. Camera-based surveillance is also present in traffic, to enforce speed limits, detect red light violations, and control toll roads. The motivation behind the development of these systems, or of any surveillance system in general, often focuses on permutations of three issues: efficiency, safety, and convenience [2, p.96]. In contemporary societies, systems of surveillance take the form of information systems; collecting, storing, and distributing information on populations. The digital city thematic action line proposes to create new, robust, and in many cases, invisible information systems that are part of the urban infrastructure. This invisibility is imagined to be a result of the seamless integration of new information systems with the existing methods of travel, recreation, and consumption within the city. Currently, technologies such as electronic travel cards, credit cards, and personal identification cards are used by city dwellers in their daily activities. Attempts to integrate these systems to combine the benefits, and increase convenience at the same time, occur often. Examples include credit cards with photos of their owners that can be used for identification [3], and travel cards with built-in credit [4]. Proposals for new syndromic surveillance systems can also be situated within the same trend, aiming to take advantage of increased data collection for public health purposes. The digital 1 city, envisioned as a central hub for data collection activities, fulfils the functional requirements for the employment of syndromic surveillance systems. 1.1 Defining syndromic surveillance Syndromic surveillance is a recent addition to the set of methods collectively described as disease surveillance. Disease surveillance is an epidemiological practice where the activities of a population are monitored for predefined signs. These signs are then interpreted in an attempt to prevent or minimise the spread of the disease. Disease surveillance is performed for both communicable diseases (influenza, chlamydia, salmonella, etc.) and non-communicable diseases (asthma, cancer, diabetes, etc.). The surveillance of the former is called infectious disease surveillance, and it most often involves analysing case reports or lab reports filed after doctors’ visits. In this analysis, the lab-verified results are often used as highly accurate indicators of the disease, but the delay between the onset of symptoms and the verification of the diagnosis through the reports may be several days to weeks depending on the disease, the diagnosis and the local infrastructure available to the practitioners. To reduce the delay, data originally collected for other purposes have been proposed as additional indicators to aid the understanding of infectious diseases. This approach is called syndromic surveillance, and its practitioners collect and analyse data from different data sources including pre-diagnostic case reports, number of hospital visits, over-the-counter drug sales and Web search queries, among many other sources. One of the most comprehensive and influential definitions of syndromic surveillance was given by the United States Centers for Disease Control and Prevention (CDC): Syndromic surveillance for early outbreak detection is an investigational approach where health department staff, assisted by automated data acquisition and generation of statistical signals, monitor disease indicators continually (real-time) or at least daily (near real-time) to detect outbreaks of diseases earlier and more completely than might otherwise be possible with traditional public health methods (e.g., by reportable disease surveillance and telephone consultation). The distinguishing characteristic of syndromic surveillance is the use of indicator data types. [5, p.2] Practitioners have criticised the usage of syndromic surveillance as imprecise and misleading, because many of the systems described by the term do not actually monitor syndromes, the association of signs and symptoms often observed together, but other non-health related data sources such as over-thecounter medication or ambulance dispatches [6, 7]. Despite its shortcomings, the term remains the most widely recognised among the alternatives. Other terms that describe similar or equivalent activities include: early warning systems, prodrome surveillance, pre-diagnostic surveillance, outbreak detection systems, information system-based sentinel surveillance, biosurveillance systems, health indicator surveillance, nontraditional surveillance, and symptom-based surveillance. Some developers of syndromic surveillance systems have argued for broader terms, such as biosurveillance, to describe their work, in an attempt 2 to unify outbreak detection and outbreak characterisation, claiming that epidemiologists may consider outbreak characterisation to be separate from public health surveillance [8, p.3]. 3 2 Background Syndromic surveillance is a recent development in the long history of disease surveillance. While a full review of developments in disease surveillance is outside the scope of this report, a brief look at current disease surveillance systems from the perspective of information technologies may be helpful [9]. The unifying property of complex disease surveillance systems as information and communication systems is that they are designed to function without human intervention, performing statistical analyses at regular intervals to discover aberrant signals that match the parameters set by their operators. Recent advances in information and communication technology (ICT) have made the development and operation of such systems technically feasible, and many systems have been proposed to interpret multiple data sources, including those containing non-health related information, for disease surveillance. The introduction of these systems to the public health infrastructure has been accompanied by significant criticism regarding the diverting of resources from public health programs to the development of the systems [10, 11], the challenges of investigating the alerts raised by the systems [6], and the claims of rapid detection [12, 13]. The motivation behind developing complex ICT systems for disease surveillance can be partially explained by the observation that epidemiologists tasked with monitoring communicable diseases are expected to maintain an awareness of multiple databases during their daily work. To provide the experts with a rapid overview of all available data, and to equip them with additional information to make decisions, development of ICT systems are proposed. Efficiently interpreting the combined output of these systems, however, remains a technical challenge. In many cases, the populations represented in the data sources monitored by the systems differ significantly, preventing the application of traditional statistical methods to analyse the collected data. In theory, syndromic surveillance complements traditional disease surveillance in order to increase the sensitivity and specificity of outbreak detection and public health surveillance efforts. However, the syntactic and semantic diversity of the syndromic data sources complicates such efforts. 2.1 Constituents of disease surveillance systems Conceptually, disease surveillance systems may be partitioned into collection, analysis, and notification. The collection component contains lists of available data sources, collection strategies for data sources, instructions for formatting the collected data, and storage solutions. The analysis component stores a wide variety of computational methods used to extract significant signals from the collected data. The final component, notification, contains the procedures for communicating analysis results to interested parties. The results may be presented in many forms: numerical output from statistical analysis, incident plots displaying exceeded thresholds, maps coloured to indicate different levels of observed activity, or simply as messages advising the experts to check a data source for further information. 4 2.1.1 Collection The set of accessible data sources is the most important factor in determining the capabilities of a disease surveillance system. Once again, following the growth of syndromic surveillance, a wide variety of sources have been proposed for monitoring. They may be divided into three groups based on when they become visible to the system relative to the patients’ status: pre-clinical, clinical prediagnostic, and diagnostic [14]. An alternative method of categorising the data is to group according the type of patient behaviour that produces the data: information seeking after onset of symptoms; care seeking where the patient attempts to contact the healthcare provider or decides to purchase medication; and post-contact, when the patient becomes visible in traditional public health surveillance systems. Data sources most often used for syndromic surveillance, ordered by availability, from earliest to latest, are as follows [13, 15]: • • • • • • • • • • • • over-the-counter drug sales triage nurse line calls work and school absenteeism prescription drug sales emergency hotline calls emergency department visit chief complaints laboratory test orders ambulatory visit records veterinary health records hospital admissions and discharges laboratory test results case reports In a survey of operational syndromic surveillance systems [16], it is reported that the 52 respondents monitor the following data sources: emergency department visits (84%), outpatient clinic visits (49%), over-the-counter medication sales (44%), calls to poison control centres (37%), and school absenteeism (35%). Another review by [17] examines 56 systems and presents a comparable distribution of data source usage. The availability of data sources depends on the local context of the project: jurisdiction of the organisation responsible for the system, diagnoses to be monitored, existing laws regulating data access, and technical concerns such as ensuring sustained connectivity to the data sources. Recent research suggests that additional sources such as Web search queries [18, 19], and Twitter posts [20] can also contain indicators for disease surveillance. The timeliness of a data source is often inversely proportional to its reliability [14]. Sources with immediate availability such as Twitter posts or search queries often contain large amounts of false signals, and usually lack geographic specificity. In contrast, laboratory test results provide definitive diagnostic information, but they are not available early. An example between the two extremes is chief complaint records from emergency departments. These records are available on the same day as the visit, contain specific signs and symptoms as well as geographic information, but initially lack diagnoses [21]. 5 2.1.2 Analysis In traditional disease surveillance systems, the data forwarded by the collection component is associated with a diagnosis directly, and analysis begins. In syndromic surveillance systems, the data may contain signals for multiple diagnoses. Therefore, every data stream is assigned a syndrome category before it can be investigated for statistically significant signals. Syndrome categories are lists of signs and symptoms that indicate specific diseases; examples include respiratory, gastrointestinal, influenza-like, and rash. The assignment proceeds in two steps: first, the information relevant to the categorisation is extracted from the collected data, and second, the extracted information is used to associate the data with the syndrome category. The extraction procedure is trivial for pre-formatted data sources such as over-the-counter drug sales (the data are already categorised by drug type), but may require complex methods for free-text data sources such as emergency department chief complaints. The extracted information is then associated with one or more syndrome categories, either by using a static mapping of data sources to diseases, or by an automated decisionmaking mechanism. An example of the former is CDC’s syndrome categories in BioSense, and of the latter, BioStorm’s ontology-driven assignments. Both systems are described in more detail in the next section. In existing syndromic surveillance systems, Bayesian, rule-based, and ontology-based classifiers have been used to assign syndrome categories [17, p.53]. After the categories are assigned, statistical analysis is used to detect significant signals. These signals may be short-term changes such as sharp increases or decreases in the number of cases, indicating emerging outbreaks or effects interventions; or long-term shifts, indicating the appearance of the disease in previously unaffected age groups or geographical regions. The literature on statistical analysis of disease surveillance data is vast, and interested readers are recommended to refer to previously published reviews [22, 23, 24, 25, 26] for a more thorough analysis of existing methods. Most of the algorithms used in disease surveillance are adapted from other fields such as industrial process control or econometrics [27], but some have been developed specifically for disease surveillance. Time series methods, meanregression methods, auto-regressive integrated moving average (ARIMA) models, hidden Markov models (HMMs), Bayesian HMMs, and scan statistics [28] are among the most commonly used algorithm classes. The spatial and spacetime statistics, specifically, have gained popularity among practitioners as methods for the detection of disease clusters [29]. The detection algorithm is chosen based on the needs of the users, and the available data sources. To aid the decision, the performance of aberrancydetection algorithms are often expressed in terms of sensitivity, specificity, and timeliness [30]. Sensitivity (true positive rate) is the probability that an alarm is raised given that an outbreak occurs. Specificity (true negative rate) is the probability that no alarm is raised given that no outbreak occurs. Timeliness is the difference in time between the event and the raised alarm. Additionally, to be able to understand and describe the detection algorithms better, researchers have proposed a classification scheme algorithms that considers the types of information and the amount of information processed by the algorithms: number of accessible data sources, number of covariates in each source, and the availability of spatial information [31]. 6 2.1.3 Notification The results of the analyses are visualised and communicated to the users by the notification component. The simplest method of communication is by providing the statistical output from the analysis method directly to the user, as a table or in plain text. Although the output contains all the essential information, understanding these reports is often time-consuming, and they quickly become overwhelming if many data sources, diagnoses, or geographic regions are involved in the analysis. The most common way of summarising the results is by displaying their values at different time points, using line charts or bar graphs. The variable may be case reports, ambulance dispatches, drug sales, or any other indicator used in surveillance. If previously computed historical baselines exist, they may also be plotted on the same graph, to put the current results in a larger context. Scatter plots and pie charts may also be used to summarise non-temporal components of the analysis. Figure 1: Sample output graph from an outbreak detection system in use at the Swedish Institute of Communicable Disease Control. When a spatial analysis method is used, the same variable may be displayed using a map where colours, shades, or patterns illustrate the differences between geographical regions [32]. Results of clustering methods, such as spatial scan statistics, may also be visualised on maps, often using geometric shapes or grids drawn on the map in addition to the regional borders [33]. Visualising the results 7 of hybrid spatio-temporal analysis methods may be achieved by animating the map, or presenting snapshots from the same map at different time points sideby-side. Alternatively, Geographic Information Systems (GIS) may be used to visualise spatial or spatio-temporal analysis results if the data source contains detailed geographical information. These systems are commonly used in disease surveillance [34], and epidemiologists are more likely to be familiar with GIS software given the long tradition of their usage [35]. In some cases, GIS include their own analysis tools [36], but these may be bypassed by importing the analysis results directly to the visualisation component. A simpler system, Google Earth [37], also provides similar functionality for spatial visualisation. The result reports and visualisations are communicated to the users periodically through email, SMS, automated phone calls, Web sites, or a dedicated display unit placed at the institution tasked with disease surveillance. 2.2 State-of-the-art Considering the extensive history of public health literature, the development of complex information systems for disease surveillance, mainly in the form of syndromic surveillance, is a recent addition. The first systems that proposed to monitor non-health related data sources for indicators of public health appeared in late 1990s [38], and the development of larger systems began after 2001, most of them in the United States [39]. The last ten years have seen the development of a surprisingly large number of systems with diverse functionality. Four directions taken by developers of syndromic surveillance can be identified from existing literature on the topic. These systems are described further in the following subsection. • Integrating data collected from many institutions tasked with public health response to provide an overview of events concerning public health at the national level. BioSense, one of the largest syndromic systems ever deployed, accomplishes this by combining the data from diverse health facilities in 26 states in the U.S. [40]. • Understanding signals in many public health data sources in relation to each other, during the collection process, at the institute tasked with collection. RODS achieves this task by providing a self-contained system that can be deployed independently at multiple public health facilities [41]. • Structuring the collected data and the methods of analysis in order to ease the difficulty of adding new sources or methods to existing systems. BioStorm provides ontologies that can classify analysis methods based on the goal of the analysis. It matches available data sources with suitable analysis methods [42]. • Increasing the visibility of results of disease surveillance analyses. The Web-based HealthMap is accessible over the Internet without any additional authentication [43]. Syndromic surveillance literature published in English in the last decade is dominated by systems developed in and intended to be deployed in the United States, with a few exceptions [44, 45]. More systems, developed in European states have 8 been documented in the broader field of disease surveillance and outbreak detection [46]. Additionally, the European Commission has sponsored several large projects on Europe-wide systems for awareness and monitoring of pandemics, but the scope has been very wide and the software output has been modest; see, e.g., the INFTRANS (Transmission modelling and risk assessment for released or newly emergent infectious disease agents) project on the sixth framework programme, 2002–2006 [47]. The most recent development in European syndromic surveillance at the time of writing is an ongoing project titled Triple-S. It is co-financed by the European Commission through the executive agency for health and consumers, and aims to create “an inventory of existing and proposed syndromic surveillance systems in Europe, and provide an in-depth understanding of selected systems” [48]. The three-year project, initiated in September 2010, involves twenty-four organisations from thirteen countries. Personal communication with the project revealed that an investigation about the existing systems had already been performed, and additional information about the systems would be provided by them at a later, unspecified date. 2.3 Implementations of syndromic surveillance Four implementations are presented in this section to illustrate different properties of existing syndromic surveillance systems. For additional information on other systems, the reader is recommended to refer to [17], which includes an overview of 50 syndromic surveillance systems and examines eight in further detail. An earlier review of 115 disease surveillance systems, including nine syndromic surveillance systems, by [49] is also informative. 2.3.1 BioSense BioSense is a CDC (Centers for Disease Control and Prevention) initiative that aims to “support enhanced early detection, quantification, and localisation of possible biologic terrorism attacks and other events of public health concern on a national level” [40, p.1]. The software component of the initiative is called the BioSense application. The development of the application started in 2003, and the first version was released in 2004. Initially BioSense included three national data sources: United States Department of Defence military treatment facilities, United States Department of Veterans Affairs treatment facilities, and test orders from Laboratory Corporation of America. In a later technical report, the BioSense data sources were reported to also include state/regional surveillance systems, private hospitals and hospital systems, and outpatient pharmacies [50]. As of May 2008, 454 hospitals from 26 US states were sending data to BioSense. The BioSense application classifies incoming data into eleven syndrome categories: botulism-like, fever, gastrointestinal, hemorrhagic illness, localised cutaneous lesion, lymphadenitis, neurologic, rash, respiratory, severe illness and death, and specific infection. The daily statistical analysis is performed using CUSUM [51], SMART [52], and W2 (a modified version of the C2 method [51] for anomaly detection). The data reporting component displays the results of 9 the analyses as spreadsheets of observed case counts, time series graphs, patient maps, or detailed case reports. In 2010, CDC started redesigning the BioSense program. The redesign aims to expand the scope of BioSense beyond early detection to contribute information for “public health situational awareness, routine public health practise, [and] improved health outcomes and public health” [53]. Earlier presentations about the future of the project have noted additional goals about improving the usability of biosurveillance tools and “reducing excessive features which miss the needs of the users” [54, p.19]. Open sourcing of the system is also included as a possibility for the redesigned BioSense project [55]. The initial motivation for the development and operation of the BioSense application was expressed primarily in terms of preventing biologic terrorism [40]. As part of the redesign, the motivation for developing the system is broadened considerably: The goal of the redesign effort is to be able to provide nationwide and regional situational awareness for all-hazard health-related threats (beyond bioterrorism) and to support national, state, and local responses to those threats. [53] The BioSense program has also contributed to the International Society for Disease Surveillance report on developing syndromic surveillance standards and guidelines for meaningful use [56]. The current BioSense application is one of the largest syndromic surveillance systems in existence, and the scope of its next iteration is likely to be influential in defining what is viable in the field of disease surveillance systems. 2.3.2 RODS The development of the Real-Time Outbreak and Disease Surveillance system (RODS) began in 1999 at the University of Pittsburgh for the purpose of detecting the large-scale release of anthrax [41]. The sixth iteration of the software is currently reported to be under development and the source code for several versions licensed under GNU GPL or Affero GPL are available from the RODS Open Source Project Web site [57]. The first implementation of RODS in Pittsburgh, Pennsylvania collected chief complaints data from eight hospitals, classified them into syndrome categories, and analysed the data for anomalies [58]. The system was then expanded to collect additional data types and deployed in multiple states. It was also used as a user-interface to the American National Retail Data Monitor [59], which collects over-the-counter medication sales. The most recent publicly available version of RODS supports user-defined syndrome categories. Implementations of the recursive least-squared (RLS) algorithm [60] and an initial implementation of the wavelet-detection algorithm [61] are also included. The results of the analyses can be displayed as time series graphs, or work with a GIS to create maps of the spatial distribution. From a data collection perspective, RODS is the decentralised counterpart of BioSense. Unlike BioSense, which collects data from a large number of sources centrally within a single implementation, RODS is designed to be installed at facilities on the sub-national level to collect and analyse the available data locally. In 2009, more than 300 healthcare facilities in 15 states in the U.S., more than 10 200 in Taiwan, and an unspecified number in Canada were being monitored by independent RODS implementations [57]. At the time of writing, no updates to the RODS open source project have been committed to the code repository for the last two years, and the latest available RODS publications date back to 2008. It is unclear if RODS 6 will be released in the future, but the availability of the source code for many earlier versions makes RODS an important resource for developers of disease surveillance systems. 2.3.3 BioStorm The BioStorm system (Biological spatio-temporal outbreak reasoning module) has been developed at the Stanford Center for Biomedical Informatics Research in collaboration with McGill University. The goal of the project is to “develop fundamental knowledge about the performance of aberrancy detection algorithms used in public health surveillance” [62]. The source code for the system is available at [63]. The aim of the BioStorm project is to create a scalable system that integrates multiple data sources, includes support for many problem solvers, and provides flexible configuration options [42]. The defining feature of the project is the central use of ontologies. A data-source ontology is used to describe data sources. The descriptions are then used to map to suitable analysis methods available in the system’s library of problem solvers [64]. Intermediate components such as a data-broker, a mapping interpreter, and a controller are used to connect the data sources to the analysis methods. The use of ontologies is intended to ease the process of adding new data sources and new analysis methods to an existing BioStorm implementation. No existing syndrome categories or visualisation components are provided, but any category or visualiser can be added to the system according to the needs of its users. The BioStorm project differs from the majority of disease surveillance systems primarily due to its highly complex mechanism for classifying data sources and problem solvers. The developers reflect on the high overhead of this approach, but state that the overhead is acceptable for systems that connect to many data sources and require diverse analysis methods [65]. In contrast, most of the existing syndromic surveillance systems do not suffer from this overhead, but require additional programming to accommodate new data sources or methods. The complexity of the BioStorm project creates a significant obstacle for implementation in a public health facility for day-to-day monitoring. However, the feature set provided by the project is ideal for systematically comparing and evaluating the performance of different analysis methods on different data sources. The possibility of categorising not only the syndromes, but also the data sources and the analysis methods [66] promises to simplify experiment design for evaluating detection algorithms. The BioStorm source code has not been updated since 2010, and the most recent publication related to the project dates back to 2009, but the source code continues to be available from the Stanford Center for Biomedical Informatics Research [63]. 11 2.3.4 HealthMap HealthMap is a freely accessible Web site that integrates data from electronic sources, and visualises the aggregated information onto the world map, classified by infectious disease agent, geography, and time. The project aims to deliver real-time information for emerging infectious diseases. It has been online since 2006, and its current data sources include Google News, the ProMED mailing list, World Health Organisation announcements and Eurosurveillance publications, among others [67]. HealthMap uses automated text processing to classify incoming alerts and to create or update points of interest on the world map based on the classification results [43]. The time-frame of alerts, the number of alerts, and the number of sources providing information are reflected by the colour of the markers for the points of interest on the world map. HealthMap also includes an interface for users to report missing outbreaks. HealthMap’s reliability, much like any other system, depends on the reliability of its data sources. Since it accesses less reliable sources compared to the systems discussed previously, different weights are assigned to different sources based on their credibility when creating reports to offset the influence of less reliable reports [68]. HealthMap is unique among the systems discussed so far because its analysis results are available to all World Wide Web users instead of a small group of experts. The results can be made available without major privacy concerns because all of the incoming data are also publicly available on the Web. Another notable system that employs a similar approach is EpiSpider [69]. A recent HealthMap feature, Outbreaks Near Me [70], provides the users with mobile tools to report and view outbreaks. Accessing the system without a standard browser requires a smart-phone which limits its availability, but it is argued that such limitations will eventually be overcome with cheaper devices [71]. The development of HealthMap-like systems signifies the presence of a different perspective in public health surveillance, where a larger group of users are able to influence the surveillance process and access the results of statistical analyses. 12 3 Employing syndromic surveillance in the digital city The analysis of the possibilities of employment for syndromic surveillance in this section begins with a theory of surveillance, which examines the drivers of surveillance in societies. These drivers are used to reveal future directions that may be taken by urban applications of syndromic surveillance. The analysis then proceeds to look at current problems faced by, or posed by, systematic applications of surveillance, and attempts to extrapolate them into the future. 3.1 A theory of surveillance “(S)urveillance is a focused attention to personal life details with a view to managing or influencing those whose lives are monitored. It may involve care, and more often, control. It is, in other words, the power of classification, of social sorting.” [72, p.152] The demand for surveillance appears increasingly often as societies reconfigure themselves to emphasise the privacy of the individual. This seemingly paradoxical phenomenon is due to the necessity of recognising participants in social exchanges; the disappearance of local networks of identification (the village, the extended family, the household) leads to an increase in the demand for “tokens of trustworthiness” [72, p.66] about the individual. Surveillance is uniquely positioned to generate these tokens, in different forms depending on the contents of the social exchange: credit cards in financial transactions, access cards when negotiating physical access, or identification documents when crossing national borders. A significant driver for the employment of surveillance technologies is fear: “Fear of attack, of intrusion, of violence, prompts efforts to ward off danger, to insulate against risk” [72, p.58]. Risk management, in diverse forms, aims to alleviate this fear with promises of awareness provided by surveillance. Communicable disease control is one example in which populations are protected from undesirable biological agents. 3.2 Trends Current trends that drive surveillance can be understood through five concepts: rationalisation, technology, sorting, knowledgeability, and urgency [2, p.26], all of which are present in syndromic surveillance: • An attempt to understand the behaviour of infectious disease, a seemingly chaotic phenomenon, • through the development of new systems that include previously unused data sources, • to be able to identify, contact, and respond to the infected individuals, • (without requiring the awareness of the subjects of surveillance), • more rapidly than methods used for traditional disease surveillance. 13 These five concepts provide directions in which syndromic surveillance are most likely to develop in. Advances in syndromic surveillance technologies are taken for granted within this report, and the four remaining concepts are used to trace what shape those technologies may take. From the rationalisation perspective, it would be possible to observe a movement towards methods that provide better detection algorithms. These better detection algorithms are likely to require more computational power to tackle the ever-expanding data sources, and take into account more variables to make their predictions. The sorting perspective drives the search for more data sources that can be used in syndromic surveillance. Any large-scale information system intersecting with urban populations can be seen as an immediate candidate for syndromic surveillance, and therefore of public health interest in urban environments, while remaining within the limits of established data protection laws and regulations. Knowledgeability, referring to the awareness among the subjects of surveillance about the existence and the functions of surveillance systems, would continue to influence how individuals react to newly developed systems, leading to explicit acceptance or rejection if the systems are revealed from the beginning, or a gradual awareness and eventual reaction if the systems attempt to conceal their presence. The final concept, urgency, would drive developers of syndromic surveillance systems to seek data sources that contain indicators for increasingly rapid detection. These extremely early indicators may be sought in genetic information, purchase habits, or any other source that records an individual before her contact with the infectious agent. 3.3 Current issues Computerised systems for public health surveillance have clear advantages compared to their earlier, paper-based versions. When the required information infrastructure for the operation of the system is in place, these systems reduce the number of humans required to detect and communicate health risks, and they can even automate daily monitoring and detection tasks. Additionally, the systems function without any additional input from individuals whose data are used. No one has to opt in to be included in an outbreak investigation, and individuals are very rarely identified during such investigations. However, these systems require considerable infrastructure as prerequisites before they can even be implemented. For example, areas without reliable communication networks rarely benefit from the implementation of such systems. Also following directly from the previous advantage, it is not possible to opt out of many public health surveillance systems, and as a result the majority of current implementations are undemocratic. Individuals, whose data inform public health, have no say in how their data are used, or whether it should be used at all. Contemporary syndromic surveillance systems have a tendency to generate more false alarms than traditional disease surveillance methods. Practitioners of public health tasked with monitoring such alarms have stated that without the resources to investigate detected events, the detection capability is useless [6]. In future systems, this may lead to several new research directions: Syndromic surveillance algorithms may aim to have higher specificity and lower sensitivity, or they may require more data sources to confirm the detected event before an alert is raised. To accomplish this cross-checking, more data sources that are 14 representative of the same population would have to be monitored. Quantifying the representativity of a data source, and judging how similar two separate data sources are in terms of representativity is yet another investigation that is likely to take place in institutions that employ syndromic surveillance methods. Surveillance of data sources that include multiple activities linked to the same identity are likely candidates for syndromic surveillance. Since these systems link diverse data sources, they require a complex network of users and machines to support their daily functions; the maintenance of syndromic surveillance systems are non-trivial. They connect to software that may change over time, and as the number of sources increases, compatibility issues also increase. While there are standards for storing and communicating health information digitally, the field of syndromic surveillance today is far from standardisation. If a single project were to gain enough momentum, such a standard could be established. Factors that affect the adoption of one project are fairly similar to other software standards, with one important difference: Public health is performed at the national level, each nation state exercises its public health duties in different ways, and those ways are rigidly encoded in the laws of the state. For this reason, the standardisation of syndromic surveillance, much like any other information system dealing with health at the national level, would require consensus at the international level. 3.3.1 Commerce and public health Syndromic surveillance, as well as other forms of surveillance for public health purposes, are performed with the explicit aim of raising the health of populations. The information gathered is used to establish norms of disease, in order to be able to detect when those norms are violated, and to judge if the costs of intervention serve the aim of raising the health of the population. This calculation of costs is inherently incompatible with the notion of profit. Since the aim is to raise the health of the whole population indiscriminately, any potential gains, be it knowledge, or accumulation of resources, are committed to the same aim. Creating other venues where capital can be generated for those who operate these mechanisms is bound to encounter an internal inconsistency. This inconsistency may take either the form of withholding, where resources generated from the surveillance process are not completely committed to the improvement of that same process, or it may take, in the worst case, the form of active deception, where the primary aim of public health is subverted to benefit not the whole but only certain parts of the population without acknowledging the change in recipients of the benefits explicitly. For this reason, a commercial entity, predicated by the motive of creating additional value for its shareholders, very rarely functions in the public health domain without critical conflicts. A deeper theoretical investigation of the causes of this incompatibility is described in a report from the United Nations Research Institute for Social Development [73], which uses cross-country comparable data to illustrate, among many other comparisons, a negative relationship between life expectancy and private health care expenditure. Regarding commercialisation, public health surveillance does not differ from health care significantly. In a practice where subjects rarely volunteer to be included, problems of redistribution arise immediately when commercial interests appear. Attempts to convince individuals to volunteer, to offer them other 15 advantages in exchange for their data, have succeeded in domains of social communication through information systems. Google’s automatic mining of emails to provide personalised ads, or Facebook’s initiative to insert individuals to ads personalised for the viewing of that individual’s friends are existing examples of such technology. These measures have been contested [74, 75, 76], and continue to be contested, but their presence has not led to a rejection of the services as a whole by their users. In public health, however, attempts to convince individuals to volunteer must be held to much closer scrutiny. Existing research on ethics, such as Gary Marx’s ethics of surveillance [77], may provide an idea of what forms this scrutiny should take. The issue is ensuring that the governed population receives equal benefit from public health measures, including health surveillance, and that they are provided with opportunities to exercise such benefits. Emphasising individual choice leads naturally to the possibility that the wrong choice will be made by some, and that they will be responsible for the consequences of those wrong choices. The ability to make choices, especially of making the right choice given a particular circumstance, is strongly influenced by prior privilege; those equipped with more privilege are more likely to make the better choice. The logic of choice is incompatible with the imperative of public health, because the outcomes linked to individual choice are strongly linked to prior privilege, and the only part prior privilege can play in the discourse of public health is that of an obstacle to be overcome. Commercial attempts at entering the domain of public health surveillance vary based on how the expertise of the commercial organisation is positioned. Using the collection, analysis, and notification framework for syndromic surveillance, these entry points can be made slightly more visible. At the collection stage, a company can propose to handle data collection and storage tasks, by proposing similarities to data warehousing that is very common in companies that work with information technologies. The arguments that such companies would propose are likely to mention the difficulty of storing large databases in hardware, managing backup methods, or distributing resources to optimise the speed of retrieval. In the analysis stage, the role of the company can be stated as a consultancy, bringing expertise (statistical, computational, etc.) to analysis tasks. For example, companies with data mining technologies may present proposals where their methods are used, or companies that manage large groups of experts may provide the actual worker who would perform and interpret the statistical analyses chosen by the operators of the surveillance systems. Finally, at the notification stage, companies may provide novel visualisation methods that are inaccessible to health agencies, due to laws regulations surrounding intellectual property for example, or they may claim to have more suitable technological solutions for the reliable communication of messages such as secure cellular networks. 3.4 The future of syndromic surveillance It is possible to provide a rudimentary method for inventing new applications of syndromic surveillance: Proposals for future information systems or newly implemented systems are searched for specifications of the data they will collect or generate. Then, the behaviour of infectious diseases of interest are crossreferenced one by one to with these specifications of data sources, to see if there are any intersections. These intersections may occur as transmission mechanics, 16 symptoms of the diseases, treatment methods, etc. If any intersection exists, an investigation can be made regarding the applicability of syndromic surveillance to the data source. This investigation would have to take into account what part of the population the data source represents, and how reliable (in terms of sensitivity, specificity, and timeliness) the signals derived from the data source would be. Some example scenarios are as follows: • Records of energy usage in households (the smart home scenario): Does sickness change the energy consumption behaviour of the inhabitants in ways that are visible to the smart home data sources? • Transportation: Electrical cars are plugged to a grid for charging. Given a car user that drives to work every day, would it be possible to assume the presence of sickness if the car remains plugged in to the grid during work hours? • Supermarket customer loyalty cards: Can personal purchase records be analysed real-time to detect the presence of communicable diseases from the consumption of remedies? As stated previously, data collected by a wide variety of information systems in urban environments can be used in syndromic surveillance with the aim of extracting health-related information. Current examples include ambulance dispatch records and over-the-counter medicine sales that are analysed statistically to detect unexpected changes. Following the same trend, other data sources may be used in the near future for disease surveillance tasks. One particularly complicated scenario links cellphone logs to disease spread. Assuming full availability of cellphone handover logs between base stations, it would be possible to trace potential contacts of infected individuals in order to warn them about the effects of exposure, and recommend them to take precautionary measures to avoid spreading the disease further. The privacy implications of this scenario depend primarily on the roles of the actors who would define the practical effects of the precautionary measures to be taken, and how those measures should be enforced. Given a wealth of real-time or near-real-time data sources, many detection methods previously thought impossible become possible. However, these implementations should be accompanied with a discussion of their implications: how do these methods shape society, who enjoys the benefits, who suffers the costs, and finally, which possible futures become impossible, or highly unlikely, once such systems are in place? 3.4.1 Exclusion Syndromic surveillance systems must strive to avoid excluding parts of the population. Two factors contribute to this requirement: First, the imperative of public health places strong emphasis on equal inclusion, and to deviate from it would make syndromic surveillance less compatible with the practice of public health. Second, communicable diseases spread indiscriminately. Not taking into account parts of the population would only lead to partitions where communicable disease can spread unchecked among available hosts, which would not only make syndromic surveillance less efficient, but also increase the human suffering caused by communicable diseases that syndromic surveillance aims to alleviate. 17 Where is exclusion observed in the methods and systems of syndromic surveillance? It appears, most importantly, in the data sources. If parts of the population are not represented in the data sources, syndromic surveillance will not say anything about their suffering caused by communicable diseases. Exclusion may also appear during the analysis stage where the intersection of multiple data sources are investigated. If those sources do not represent identical segments of the population, which is the usual case, individuals outside of that intersection would not benefit from the surveillance even though their data appears in some of the data sources. Urban environments do not have rigid borders; they grow, recede, and shift as the cities evolve. While this is, from a humanitarian perspective, a property highly worth preserving, it introduces complications for a public health practice that aims to be non-exclusive. New inhabitants of the city do not become visible to public health surveillance as easily as already established inhabitants, and consequently, delivery of interventions or preventive measures may not encompass the recently settled. Under the assumption that the future influx of populations to cities will tend to increase, new systems of surveillance, and particularly of syndromic surveillance, must be designed to take into account the necessity of including those that have recently arrived in the city. 3.5 A candidate for future case studies The London 2012 Olympics is one example where urban syndromic surveillance systems of the future will become observable. Relatively short in duration, but very large in terms of attendance, the Olympics creates a unique opportunity where extreme migration challenges existing infrastructure, when an urban event becomes highly complex due to the sudden increase in the population. Proposals for temporary syndromic surveillance systems, constructed with the aim of monitoring the population during the event, would be very valuable for observing the trends discussed in this report in action. The discussions surrounding the hosting of the Olympics in a highly populated capital are informative, because they illustrate what the organising institutions consider to be inadequate for the influx of visitors in the near future. These discussions can be extended into the further future, under the assumption that problems caused by increased migration are likely to take similar forms. After the event, evaluations of the proposed remedies can serve as additional indicators of the future issues. 3.6 Conclusion No information system stands independent of the population it informs or monitors. It is of utmost importance that the potential social effects of surveillance systems be put under close scrutiny before any resources are committed to realising them, but the complete picture, with regards to the redistribution of privilege due to the operation of the system, will never be possible to predict in its entirety. In fact, the act of prediction modifies the outcome, especially if it leads to media attention, political scrutiny, or any form of campaigning for, or against, the proposal. The best outcome that planners and developers of syndromic surveillance can hope to achieve, while staying committed to the imperative of public health, is to minimise the tendencies of their systems to selectively privilege only certain parts of the population. 18 The main author of this report is Baki Cakici, Ph.L., KTH. The editing of the report was done by Magnus Boman, professor, KTH. 19 References [1] EIT ICT Labs. Digital cities of the future. http://eit.ictlabs.eu/ ict-labs/thematic-action-lines/digital-cities-of-the-future, 2011. Accessed 2011-10-06. [2] David Lyon. Surveillance Studies: An Overview. Polity, first edition, August 2007. ISBN 074563592X. [3] Bank of America. Bank of America - Photo Security, 2011. URL http://www.bankofamerica.com/creditcards/index.cfm?template= cc_features_photo_security. Accessed 2011-11-24. [4] Transport for London. Pay as you go - Transport for London, 2011. URL http://www.tfl.gov.uk/tickets/19798.aspx. Accessed 2011-11-24. [5] J. W Buehler, R. S Hopkins, J. M Overhage, D. M Sosin, and V. Tong. Framework for evaluating public health surveillance systems for early detection of outbreaks. MMWR Morbidity and Mortality Weekly Report, 53: 1–11, 2004. [6] F. Mostashari. Syndromic surveillance: a local perspective. Journal of Urban Health: Bulletin of the New York Academy of Medicine, 80:i1–i7, 2003. ISSN 1099-3460. [7] Kelly J Henning. What is syndromic surveillance? MMWR. Morbidity and Mortality Weekly Report, 53 Suppl:5–11, September 2004. [8] Michael M. Wagner, Andrew W. Moore, and Ron M. Aryel. Handbook of Biosurveillance. Academic Press, first edition, June 2006. ISBN 0123693780. [9] Baki Cakici. Disease surveillance systems. Licentiate dissertation, Royal Institute of Technology (KTH), 2011. [10] V.W. Sidel, R.M. Gould, and H.W. Cohen. Bioterrorism preparedness: cooptation of public health? Medicine & Global Survival, 7(2):82–89, 2002. [11] K. C Dowling and R. I Lipton. Bioterrorism preparedness expenditures may compromise public health. American Journal of Public Health, 95 (10), 2005. [12] A. Reingold. If syndromic surveillance is the answer, what is the question? Biosecurity and Bioterrorism: Biodefense Strategy, Practice, and Science, 1(2):77–81, 2003. [13] Magdalena Berger, Rita Shiau, and June M Weintraub. Review of syndromic surveillance: implications for waterborne disease detection. Journal of Epidemiology and Community Health, 60(6):543–550, June 2006. [14] David L. Buckeridge, Justin Graham, Martin J. O’Connor, Michael K. Choy, Samson W. Tu, and Mark A. Musen. Knowledge-based bioterrorism surveillance. Proceedings of the AMIA Symposium, pages 76–80, 2002. 20 [15] Steven Babin, Steven Magruder, Shilpa Hakre, Jacqueline Coberly, and Joseph S. Lombardo. Understanding the data: Health indicators in disease surveillance. In Joseph S. Lombardo and David L. Buckeridge, editors, Disease Surveillance: A Public Health Informatics Approach, chapter 2, pages 43–90. Wiley-Interscience, first edition, apr 2007. [16] J. W Buehler, A. Sonricker, M. Paladini, P. Soper, and F. Mostashari. Syndromic surveillance practice in the united states: findings from a survey of state, territorial, and selected local health departments. Advances in Disease Surveillance, 6(3):1–20, 2008. [17] Hsinchun Chen, Daniel Zeng, and Ping Yan. Infectious Disease Informatics: Syndromic Surveillance for Public Health and Bio-Defense. Springer, first edition, December 2009. ISBN 1441912770. [18] Anette Hulth, Gustaf Rydevik, and Annika Linde. Web queries as a source for syndromic surveillance. PloS One, 4(2), 2009. ISSN 1932-6203. [19] Jeremy Ginsberg, Matthew H. Mohebbi, Rajan S. Patel, Lynnette Brammer, Mark S. Smolinski, and Larry Brilliant. Detecting influenza epidemics using search engine query data. Nature, 457(7232):1012–1014, February 2009. [20] Vasileios Lampos, Tijl Bie, and Nello Cristianini. Flu detector - tracking epidemics on twitter. In José Luis Balcázar, Francesco Bonchi, Aristides Gionis, and Michèle Sebag, editors, Machine Learning and Knowledge Discovery in Databases, volume 6323, pages 599–602. Springer Berlin Heidelberg, Berlin, Heidelberg, 2010. [21] D. Travers, C. Barnett, A. Ising, and A. Waller. Timeliness of emergency department diagnoses for syndromic surveillance. In AMIA Annual Symposium Proceedings, volume 2006, page 769, 2006. [22] Ron Brookmeyer and Donna F. Stroup. Monitoring the Health of Populations: Statistical Principles and Methods for Public Health Surveillance. Oxford University Press, USA, 1 edition, October 2003. ISBN 0195146492. [23] Christian Sonesson and David Bock. A review and discussion of prospective statistical surveillance in public health. Journal of the Royal Statistical Society: Series A (Statistics in Society), 166(1):5–21, February 2003. [24] Andrew B. Lawson and Ken Kleinman. Spatial and Syndromic Surveillance for Public Health. Wiley, May 2005. ISBN 0470092483. [25] W.K. Wong and A.W Moore. Classical time-series methods for biosurveillance. In Michael M. Wagner, Andrew W. Moore, and Ron M. Aryel, editors, Handbook of biosurveillance, chapter 14. Academic Press, jun 2006. [26] H. Burkom. Alerting algorithms for biosurveillance. In Joseph S. Lombardo and David L. Buckeridge, editors, Disease Surveillance: A Public Health Informatics Approach, chapter 4. Wiley-Interscience, first edition, apr 2007. 21 [27] David L Buckeridge, Anna Okhmatovskaia, Samson Tu, Martin O’Connor, Csongor Nyulas, and Mark A Musen. Understanding detection performance in public health surveillance: Modeling aberrancy-detection algorithms. Journal of the American Medical Informatics Association, 15(6):760–769, November 2008. [28] Anita Pelecanos, Peter Ryan, and Michelle Gatton. Outbreak detection algorithms for seasonal disease data: a case study using ross river virus disease. BMC Medical Informatics and Decision Making, 10(1):74, 2010. [29] Martin Kulldorff, Farzad Mostashari, Luiz Duczmal, W. Katherine Yih, Ken Kleinman, and Richard Platt. Multivariate scan statistics for disease surveillance. Statistics in Medicine, 26(8):1824–1833, 2007. [30] K. P. Kleinman and A. M. Abrams. Assessing surveillance using sensitivity, specificity and timeliness. Statistical methods in medical research, 15(5): 445, 2006. [31] David L. Buckeridge, Howard Burkom, Murray Campbell, William R. Hogan, and Andrew W. Moore. Algorithms for rapid outbreak detection: a research synthesis. Journal of Biomedical Informatics, 38(2):99–113, April 2005. [32] R. G Cromley and E. K Cromley. Choropleth map legend design for visualizing community health disparities. International Journal of Health Geographics, 8(1), 2009. [33] Francis P. Boscoe, Colleen McLaughlin, Maria J. Schymura, and Christine L. Kielb. Visualization of the spatial scan statistic using nested circles. Health & Place, 9(3):273–277, September 2003. [34] Candace I. J. Nykiforuk and Laura M. Flaman. Geographic information systems (GIS) for health promotion and public health: A review. Health Promotion Practice, 12(1):63–73, January 2011. [35] K. C. Clarke, S. L. McLafferty, and B. J. Tempalski. On epidemiology and geographic information systems: a review and discussion of future directions. Emerging Infectious Diseases, 2(2):85–92, 1996. [36] Kyusuk Chung, Duck-Hye Yang, and Ralph Bell. Health and GIS: toward spatial statistical analyses. Journal of Medical Systems, 28(4):349–360, 2004. [37] Google. Google earth. http://earth.google.com, 2011. Accessed 201104-22. [38] R. Heffernan et al. New york city syndromic surveillance systems. MMWR. Morbidity and mortality weekly report, 53:25–27, 2004. [39] D. M. Sosin and J. DeThomasis. Evaluation challenges for syndromic surveillance – Making incremental progress. MMWR. Morbidity and mortality weekly report, 53:125, 2004. 22 [40] C. A. Bradley, H. Rolka, D. Walker, and J. Loonsk. BioSense: implementation of a national early event detection and situational awareness system. MMWR. Morbidity and Mortality Weekly Report, 54:11, 2005. [41] Fu-Chiang Tsui, Jeremy U Espino, Virginia M Dato, Per H Gesteland, Judith Hutman, and Michael M Wagner. Technical description of RODS: a real-time public health surveillance system. Journal of the American Medical Informatics Association: JAMIA, 10(5):399–408, 2003. ISSN 10675027. [42] Martin J. O’Connor, David L. Buckeridge, Michael Choy, Monica Crubézy, Zachary Pincus, and Mark A. Musen. BioSTORM: a system for automated surveillance of diverse data sources. AMIA Annu Symp Proc, 2003:1071– 1071, 2003. [43] Clark C Freifeld, Kenneth D Mandl, Ben Y Reis, and John S Brownstein. HealthMap: global infectious disease monitoring through automated classification and visualization of internet media reports. Journal of the American Medical Informatics Association, 15(2):150–157, March 2008. [44] L Josseran, J Nicolau, N Caillère, P Astagneau, and G Brücker. Syndromic surveillance based on emergency department activity and crude mortality: two examples. Eurosurveillance, 11(12):225–229, 2006. [45] C. van den Wijngaard, W. van Pelt, N. Nagelkerke, M. Kretzschmar, and M. Koopmans. Evaluation of syndromic surveillance in the netherlands: its added value and recommendations for implementation. Eurosurveillance, 16(9), 2011. [46] A. Hulth, N. Andrews, S. Ethelberg, J. Dreesman, D. Faensen, W. van Pelt, and J. Schnitzler. Practical usage of computer-supported outbreak detection in five european countries. Eurosurveillance, to appear, 15(36), 2010. [47] European Commission. INFTRANS – infectious diseases: modelling for control. http://ec.europa.eu/research/fp6/ssp/inftrans_en.htm, 2008. Accessed 2011-04-27. [48] The Triple-S project. Triple-S – The syndromic surveillance project. http: //www.syndromicsurveillance.eu, 2011. Accessed 2011-10-06. [49] Dena M. Bravata, Kathryn M. McDonald, Wendy M. Smith, Chara Rydzak, Herbert Szeto, David L. Buckeridge, Corinna Haberland, and Douglas K. Owens. Systematic review: Surveillance systems for early detection of Bioterrorism-Related diseases. Annals of Internal Medicine, 140(11): 910–922, June 2004. [50] CDC. Biosense technical overview of data collection, analysis, and reporting. Technical report, Centers for Disease Control and Prevention, June 2008. [51] L Hutwagner, W Thompson, G M Seeman, and T Treadwell. The bioterrorism preparedness and response early aberration reporting system (EARS). Journal of Urban Health: Bulletin of the New York Academy of Medicine, 80:i89–i96, 2003. 23 [52] Ken Kleinman, Ross Lazarus, and Richard Platt. A generalized linear mixed models approach for detecting incident clusters of disease in small areas, with an application to biological terrorism. American Journal of Epidemiology, 159(3):217 –224, February 2004. [53] CDC. CDC BioSense Home. http://www.cdc.gov/biosense, 2011. Accessed 2011-04-01. [54] Taha A. Kass-Hout. BioSense: next generation. http://www.cdc. gov/biosense/files/PHIN_PowerPoint_BioSense_Next_Generation_ FINAL_508.pdf, 2009. Accessed 2011-04-19. [55] Taha A. Kass-Hout. BioSense: Going forward. http://www.cdc.gov/ biosense/files/BioSense_ISDS.ppt, December 2009. Accessed 2011-0419. [56] ISDS. Final Recommendation: Core Processes and EHR Requirements for Public Health Syndromic Surveillance. Technical report, International Society for Disease Surveillance, January 2010. URL \url{http://www. syndromic.org/projects/meaningful-use}. [57] RODS. The RODS Open Source Project. sourceforge.net, 2009. Accessed 2011-04-04. http://openrods. [58] J. U Espino, M. Wagner, C. Szczepaniak, F-C. Tsui, H. Su, R. Olszewski, Z. Liu, W. Chapman, X. Zeng, L. Ma, Z. Lu, and J. Dara. Removing a barrier to computer-based outbreak and disease surveillance—The RODS open source project. Syndromic Surveillance, page 32, 2004. [59] Michael M. Wagner, J.Michael Robinson, Fu-Chiang Tsui, Jeremy U. Espino, and William R. Hogan. Design of a national retail data monitor for public health surveillance. Journal of the American Medical Informatics Association, 10(5):409–418, 2003. [60] Monson H. Hayes. Statistical digital signal processing and modeling. John Wiley & Sons, 1996. ISBN 0471594318. [61] Jun Zhang, Fu -Chiang Tsui, Michael M Wagner, and William R Hogan. Detection of outbreaks from time series data using wavelet transform. AMIA Annual Symposium Proceedings, pages 748–752, 2003. [62] BioSTORM. Welcome to BioSTORM. http://biostorm.stanford.edu/ doku.php, 2009. Accessed 2011-04-06. [63] BioSTORM. BioSTORM SVN Repository. https://bmir-gforge. stanford.edu/gf/project/biostorm/scmsvn, 2010. Accessed 2011-0406. [64] Monica Crubézy, Martin O’Connor, David L. Buckeridge, Zachary Pincus, and Mark A. Musen. Ontology-Centered syndromic surveillance for bioterrorism. IEEE Intelligent Systems, 20(5):26–35, 2005. [65] D. L. Buckeridge, M. A. Musen, P. Switzer, and M. Crubézy. A modular framework for automated Space-Time surveillance analysis of public health data. In AMIA Annual Symposium, Washington, DC, pages 120–124, 2003. 24 [66] Zachary Pincus and Mark A Musen. Contextualizing heterogeneous data for integration and inference. In AMIA 2003 Symposium Proceedings, pages 514–518, 2003. [67] HealthMap. Global health, local information. http://healthmap.org/en, 2011. Accessed 2011-04-07. [68] John S Brownstein, Clark C Freifeld, Ben Y Reis, and Kenneth D Mandl. Surveillance sans frontières: Internet-based emerging infectious disease intelligence and the healthmap project. PLoS Med, 5(7):e151, 2008. [69] Herman Tolentino, Raoul Kamadjeu, Paul Fontelo, Fang Liu, Michael Matters, Marjorie Pollack, and Larry Madoff. Scanning the Emerging Infectious Diseases Horizon-Visualizing ProMED Emails Using EpiSPIDER. Advances in Disease Surveillance, 2:169, 2007. [70] HealthMap. Outbreaks near me. outbreaksnearme, 2011. Accessed 2011-04-08. http://healthmap.org/ [71] Clark C. Freifeld, Rumi Chunara, Sumiko R. Mekaru, Emily H. Chan, Taha Kass-Hout, Anahi Ayala Iacucci, and John S. Brownstein. Participatory epidemiology: Use of mobile phones for community-based health reporting. PLoS Med, 7(12), 12 2010. [72] David Lyon. Surveillance Society: Monitoring Everyday Life. Open University Press, first edition, February 2001. ISBN 0335205461. [73] Maureen Mackintosh. Health care commercialisation and the embedding of inequality, RUIG/UNRISD health project synthesis paper. Geneva, Switzerland: United Nations Research Institute for Social Development, 2003. [74] Electronic Frontier Foundation. Lawsuit demands answers about socialnetworking surveillance, 2009. URL https://www.eff.org/press/ archives/2009/11/30. Accessed 2011-11-30. [75] Tor Project. Tor Project: Anonymity Online, 2011. URL https://www. torproject.org/. Accessed 2011-11-30. [76] Adblock Plus. Adblock Plus – for annoyance-free web surfing, 2011. URL http://adblockplus.org/en/. Accessed 2011-11-30. [77] Gary T. Marx. Ethics for the new surveillance. The Information Society, 14:171–185, August 1998. doi: 10.1080/019722498128809. URL http: //www.tandfonline.com/doi/abs/10.1080/019722498128809. 25 Imprint: EIT ICT Labs IVZW 22 Rue d'Arlon, 1050 Brussels BELGIUM ISBN 978-91-87253-04-1