Complex Networks Applications to Study of Disease Transmission
In the year 542, Bubonic Plague swept across Europe and brought the Roman Empire to its knees. Eight hundred years later, in the winter of 1347, a Genoese fleet returning from Crimea arrived at the Sicilian port of Messina. On board were rats carrying the plague-bearing bacterium Yersinia pestis. The ships’ crews were already decimated: most were infected and many were dead. From Sicily the plague entered continental Europe, and during the next three years the population of Europe was slashed by around one-third (estimates range from one-quarter to one-half). Prior to this reintroduction, plague had been absent from Europe since the fall of the Roman Empire. Historical records clearly indicate that the plague of 1347-1350 spread through Europe as a wave (see Figure 1). In fact, the large-scale dynamics of an infection such as Bubonic Plague are fairly easy to model mathematically. Standard textbooks in mathematical biology [1] show that simple calculations based on the population density of Europe in the fourteenth century provide very good estimates for the rate of diffusion of plague through Europe.
To model the spread of infectious diseases one usually applies the spatially extended, diffusive version of the standard compartmental model of disease transmission. This model, described by Kermack and McKendrick in a series of papers starting in 1927 [2], has been very successful in modelling a wide range of infectious diseases. However, both the original compartmental model and its spatial extension are deterministic. Stochastic generalisations exist, but these do not do a good job of modelling the sporadic spread of relatively uncommon infections. Data from the outbreak of atypical pneumonia in Hong Kong in 2003 show particularly bursty behaviour (Figure 2). Of course, stochastic explanations can be provided that match this data, but to achieve this, rather complex probability distributions must be inferred from a very limited data set. That is, the complexity of the stochastic model exceeds the information provided by the data. Paradoxically, complex network models offer a much simpler explanation. We have recently been working on developing such models and showing that they offer good explanatory power for two specific (and topical) diseases: Avian Influenza and the aforementioned atypical pneumonia.
At the end of 2002 several disturbing reports of an unusual outbreak of pneumonia began to emerge in the Hong Kong press. The cases related to individuals in Guangdong province (the region commonly known as Canton, neighbouring Hong Kong to the north) contracting a fatal form of pneumonia that was not responsive to the usual treatments. By early 2003 the disease had emerged in Hong Kong, and it was spreading, apparently rapidly. Standard mathematical epidemiology shows that a disease will spread faster in large, densely packed populations than in rural areas (where it may in fact not sustain itself). Hence, the arrival of this previously unknown disease from the rural city of Foshan was potentially very dangerous. Because of the excellent air transport links from Hong Kong, outbreaks were soon reported elsewhere: Vietnam, Singapore, Taiwan, Mongolia, the Philippines, Canada and the United States. This emerging outbreak was given the redundant moniker Severe Acute Respiratory Syndrome (SARS) and was hence forevermore linked to the Special Administrative Region (SAR) of Hong Kong.
Reports in the elite medical and scientific literature soon surfaced describing the pathology and epidemiology of this new disease [3,4]. But two unusual factors quickly became clear: as the disease spread it exhibited “Super-Spreader Events” (SSE) and strong clustering. Super-spreader events were defined as isolated outbreaks in which a large number of secondary infections could be traced back to a single primary case. Notable examples include a single hospitalised individual who infected a large number of visiting medical students, and the original case of a single doctor staying at a Hong Kong hotel and infecting 16 other guests. Clusters occur when a large number of cases (not necessarily from the same source) emerge at the same geographical location. This occurred in Hong Kong at the Prince of Wales Hospital and the Amoy Gardens housing estate. For existing mathematical models to explain these features, one needs to build them into the model explicitly: either with detailed geographical features or with peculiar probability distributions. By using complex network topologies we have been successful in modelling these features without inserting them into the model a priori.
The standard compartmental model for disease transmission has three, or four, compartments. For the case of SARS it is appropriate to consider the four-compartment version. An individual may be either susceptible (S), exposed (E), infected (I) or removed (R). Susceptible individuals do not have the disease, but can acquire it with some probability p through contact with someone who is infected. Once they acquire the infection, the susceptible individual moves to the exposed state, and from there transitions to the infected state with some rate q. The exposed state models the latent period of the disease: during this time the individual is carrying the disease but is not yet infectious. Finally, infected individuals become removed with some further rate parameter r. The balance between the parameters r, p and q dictates whether the disease will continue to spread, or not. The derivation is straightforward, and subject to the standard assumption that individuals are universally mixed. This assumption implies that all susceptible individuals have contact with the infected and that infection between them occurs with some (presumably rather small) rate p. In our model we use a complex network to explicitly trace the contacts between individuals, and in this model the parameter p becomes the probability of acquiring the infection, given that contact with an infected person has occurred.
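The well-mixed four-compartment model described above can be illustrated with a short numerical sketch. This is only a minimal illustration, not the code behind any result reported here: it assumes the usual deterministic SEIR equations (new exposures per unit time equal to p S I / N), integrates them with a simple forward Euler step, and the function name, step size and all parameter values are hypothetical.

```python
def simulate_seir(p, q, r, s0, e0, i0, removed0, days, dt=0.1):
    """Integrate the well-mixed SEIR equations with forward Euler steps.

    p: infection rate under universal mixing (assumed form p * S * I / N)
    q: rate of leaving the exposed (latent) state
    r: rate of removal of infected individuals
    Returns one (S, E, I, R) tuple per day, starting with day 0.
    """
    n = s0 + e0 + i0 + removed0          # total population, conserved
    s, e, i, removed = s0, e0, i0, removed0
    history = [(s, e, i, removed)]
    steps_per_day = round(1 / dt)
    for _ in range(days):
        for _ in range(steps_per_day):
            new_exposed = p * s * i / n * dt   # S -> E
            new_infected = q * e * dt          # E -> I
            new_removed = r * i * dt           # I -> R
            s -= new_exposed
            e += new_exposed - new_infected
            i += new_infected - new_removed
            removed += new_removed
        history.append((s, e, i, removed))
    return history


if __name__ == "__main__":
    # Hypothetical values chosen only to contrast the two regimes.
    growing = simulate_seir(0.5, 0.2, 0.1, 9999.0, 0.0, 1.0, 0.0, days=300)
    fading = simulate_seir(0.05, 0.2, 0.1, 9999.0, 0.0, 1.0, 0.0, days=300)
    print("removed when p > r:", round(growing[-1][3]))
    print("removed when p < r:", round(fading[-1][3], 2))
```

In this well-mixed setting the outcome is governed by the ratio p/r: when it exceeds one the infection takes off, and when it is below one the outbreak dies away, whatever the value of q.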
Fig. 3: Computer simulation of disease transmission on a complex network lattice (100-by-100 nodes). Black, red and blue points indicate removed, infected and exposed individuals. Note the formation of clusters and the apparent explosion in the number of cases (corresponding to the model’s burstiness). Each frame is for the same simulation, 20 days apart

Complex networks have arisen as a good model for a wide variety of phenomena [5], and in particular we apply them to model the social connectivity between individuals [6]. The disease then moves through this social connection network. Let the individuals in the society be nodes on the network; a connection between two nodes represents personal contact between the corresponding individuals that is “close enough” to transmit the disease. We further subdivide social contact into two discrete groups: close familial contact and incidental contact. The first type of contact models cohabitation; the second describes incidental acquaintances (for example through the workplace or public transport). We build the network of familial contacts as a regular two-dimensional rectangular lattice: connections exist only between adjacent nodes, and therefore traversing the network will be very slow. The network of incidental contacts provides short-cuts through that network: random connections between points in the lattice. This structure induces what is commonly referred to as the small-world effect, and means that the number of steps required to traverse the network becomes very small; nonetheless the lattice structure ensures that the network is still highly clustered. Computer simulations (see Figure 3) of this structure exhibit the clustering of cases (because of the lattice) and the sudden large jump in outbreaks (because one encounters a single node with a large number of incidental contacts) [7].
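The lattice-plus-short-cuts structure can be sketched in a few lines. The code below is a simplified illustration and not the simulation reported in [7]: it assumes four nearest neighbours per lattice node, a fixed number of uniformly random short-cuts, and one update per day in which each infected node infects each susceptible neighbour with probability p. All function names and parameter values are hypothetical.

```python
import random


def build_small_world_lattice(side, n_shortcuts, rng):
    """Adjacency sets for a side-by-side lattice plus random short-cuts."""
    neighbours = {(x, y): set() for x in range(side) for y in range(side)}
    # Familial contacts: links between adjacent lattice sites only.
    for (x, y) in list(neighbours):
        for dx, dy in ((1, 0), (0, 1)):
            other = (x + dx, y + dy)
            if other in neighbours:
                neighbours[(x, y)].add(other)
                neighbours[other].add((x, y))
    # Incidental contacts: random long-range links (assumed uniform here).
    nodes = list(neighbours)
    for _ in range(n_shortcuts):
        a, b = rng.sample(nodes, 2)
        neighbours[a].add(b)
        neighbours[b].add(a)
    return neighbours


def step(state, neighbours, p, q, r, rng):
    """Advance every node by one day using the S -> E -> I -> R rules."""
    new_state = dict(state)
    for node, status in state.items():
        if status == "I":
            for other in neighbours[node]:
                if state[other] == "S" and rng.random() < p:
                    new_state[other] = "E"
            if rng.random() < r:
                new_state[node] = "R"
        elif status == "E" and rng.random() < q:
            new_state[node] = "I"
    return new_state


def run(side=30, n_shortcuts=20, p=0.2, q=0.3, r=0.1, days=100, seed=1):
    """Seed one infection at the lattice centre and record daily counts."""
    rng = random.Random(seed)
    neighbours = build_small_world_lattice(side, n_shortcuts, rng)
    state = {node: "S" for node in neighbours}
    state[(side // 2, side // 2)] = "I"
    daily_infected = []
    for _ in range(days):
        state = step(state, neighbours, p, q, r, rng)
        daily_infected.append(sum(1 for s in state.values() if s == "I"))
    return state, daily_infected


if __name__ == "__main__":
    final_state, daily_infected = run()
    print("peak number infected on one day:", max(daily_infected))
```

Running it with n_shortcuts set to zero and then to a larger value gives a feel for how much the incidental contacts speed up the spread; drawing the number of short-cuts per node from a broad distribution, rather than uniformly as assumed here, would be closer to the highly social individuals discussed in the text.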
Fancy simulations aside, we can also learn several things about the underlying outbreak from our models. We can show that the occurrence of super-spreader events need not depend on highly infectious (that is, highly pathological) individuals, only on highly social ones. The variation in the extent of social contact between people (already demonstrated in other fields and observed experimentally) is sufficient to explain the sudden outbreaks of infection due to a single primary infection. Of course, we can also say something about the effect that restrictions on movement will have on the spread of the disease. In Hong Kong in 2003 several government-level control measures were implemented. Residents of buildings with large clusters (notably Amoy Gardens) were isolated. Schools were closed and employers were encouraged to let their employees work from home. These measures effectively reduce the degree of incidental contact and reduce the underlying speed of transmission (in the model) from exponential to quadratic. By contrast, further public health campaigns (encouraging good hygiene and the wearing of surgical face masks in public) had some effect in reducing the overall rates of infection to smaller values, without changing the speed of transmission from its fundamentally exponential form. Finally, on careful analysis of the available data we have been able to show that the transmission of SARS was only self-sustaining because of poor infection control in hospitals [8]. If we remove hospital infection (technically, nosocomial transmission) from our model, the rate at which the infection spreads is very slow and a small perturbation (an increase in control) is sufficient to cause the pathogen to become extinct (in effect, this was verified in Hong Kong, albeit over a three-and-a-half-month period).
Because data relating to the actual chain of infection is difficult to obtain (for both privacy and political reasons) we still have no direct evidence that the actual chain of infection is consistent with what we generated in our computer simulations. While the computer models reproduce the qualitative distribution and the quantitative time dynamics of the actual reported SARS cases from Hong Kong, we have been unable to confirm that the transmission pathways are also consistent. In fact, while the physics literature is full of studies of infection dynamics on networks, there has been (until now) no experimental observation consistent with this hypothesis. To address this, we have turned our attention to the current global distribution of outbreaks of Avian Influenza (primarily H5N1) among wild and domestic birds. The data is freely available online [9], and is even in a form compatible with Google Earth (Figure 4). The available data comprises detailed information about the size, time, location and response for each case of Avian Influenza recorded by the World Organisation for Animal Health (the animal-health counterpart of the World Health Organisation). From this data we construct a network. The nodes in the network are outbreak cases (a node may be a single dead wild bird or an entire poultry farm; we do not discriminate), and connections between outbreaks are drawn if they are geographically and temporally close. To be precise, we obtain from the data an estimate u of the rate of spread of the disease (in km/day). If the product of the number of days between two cases and u is greater than the distance between the two outbreaks (subject to some upper bound on either time or distance) then we infer a connection. Hence, the network represents outbreaks of Avian Influenza and potential infection pathways. It is precisely analogous to the network we constructed to model SARS. For Avian Influenza the model is derived from data; for SARS the model is intended to mimic data.
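The linking rule just described can be written down directly. The sketch below is an illustration under stated assumptions rather than the procedure used for the published analysis: it assumes each outbreak is recorded as a (day, latitude, longitude) tuple, measures distance along a great circle with the haversine formula, and applies the upper bound to the time gap only. The function names, the record layout and the example numbers are all hypothetical.

```python
import math
from itertools import combinations

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def build_outbreak_network(outbreaks, u_km_per_day, max_days):
    """Link outbreaks i and j when u * (days apart) exceeds their distance.

    outbreaks: list of (day, latitude, longitude) tuples (assumed layout).
    max_days: assumed upper bound on the time gap between linked outbreaks.
    """
    edges = []
    for (i, a), (j, b) in combinations(enumerate(outbreaks), 2):
        days_apart = abs(a[0] - b[0])
        if days_apart > max_days:
            continue
        distance = haversine_km(a[1], a[2], b[1], b[2])
        if u_km_per_day * days_apart > distance:
            edges.append((i, j))
    return edges


def degrees(n_nodes, edges):
    """Number of links attached to each outbreak."""
    counts = [0] * n_nodes
    for i, j in edges:
        counts[i] += 1
        counts[j] += 1
    return counts


if __name__ == "__main__":
    # Three made-up outbreaks: two close in space and time, one far away.
    outbreaks = [(0, 0.0, 0.0), (10, 0.0, 1.0), (11, 0.0, 50.0)]
    edges = build_outbreak_network(outbreaks, u_km_per_day=20.0, max_days=30)
    print(edges, degrees(len(outbreaks), edges))
```

The list returned by degrees is the quantity whose distribution is examined when asking whether the resulting network is scale-free.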
The surprising, and important, thing is that the network we obtain for Avian Influenza is scale-free [10]. In other words, the number of links that a given node has follows a power-law distribution. Despite significant work in the physics literature developing the theory, this is, to our knowledge, the first time that data have demonstrated that disease transmission can be scale-free. A subsequent report has also partially corroborated this by looking at Dengue infections in Singapore [11]. Significantly, the exponent of the power law for Avian Influenza is also very small (Figure 5). Mathematically, this has very important consequences [12]. The distribution has an infinite variance and, for the value of the exponent we observe, the mean does not exist. Hence, provided the rate of infection is non-zero, a disease spreading on this network will become endemic. We can see this mathematically, as the rate at which the disease continues to spread depends on the product of the rate of infection and the mean number of links for a node (which is infinite). The implications of this model and the advice for infection control are even more stark than before.
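The divergence of the mean and variance can be checked with a one-line calculation. As a sketch only, assume a continuous approximation in which the degree k has density proportional to k^{-\gamma} between a minimum degree k_min and a cut-off K (both symbols are introduced here for illustration). The m-th moment is then proportional to

```latex
\int_{k_{\min}}^{K} k^{m}\, k^{-\gamma}\, dk
  = \frac{K^{\,m+1-\gamma} - k_{\min}^{\,m+1-\gamma}}{m+1-\gamma},
  \qquad m + 1 - \gamma \neq 0 .
```

As K grows without bound this expression diverges whenever \gamma < m + 1 (and logarithmically when \gamma = m + 1). With the exponent of roughly 1.2 estimated in Figure 5, the first moment (m = 1) grows like K^{0.8} and the second moment (m = 2) like K^{1.8}, so neither the mean nor the variance is finite. Keeping K finite, that is, truncating the tail, or raising \gamma above 3, makes both moments finite.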
Avian Influenza cannot be eradicated by reducing the rate of infection (unless it can be reduced to zero). Instead, transmission of the disease must be addressed through structural change in the connectivity between outbreak sites. That is, the transmission pathways for poultry (and perhaps for wild birds) must be manipulated in such a way that the distribution of links between sites is no longer a power law. From a theoretical standpoint we must either truncate the tail of the distribution or increase the power-law exponent sufficiently (to provide a finite mean and variance). There are several ways in which one could imagine achieving this. Firstly, control must be rapid and stringent. In Hong Kong in 1997 an outbreak of Avian Influenza was terminated (or at least stalled) through an extensive cull of domestic poultry. Secondly, in terms of our model, a large number of small autonomous farming units in close proximity to one another is bad. Large farms, factories or co-operatives are better, provided that control is applied universally. Thirdly, long-distance transportation of live poultry should be avoided, or at least subject to strict infection control.
Of course, the specific control measures we suggest are only some possible solutions. The ultimate aim is to affect the underlying structure of the disease transmission network to make eradication more attainable. Nonetheless, modelling the transmission of diseases with complex networks has provided us with new insight into which types of control strategies are likely to be most effective. We are able to contrast the effect of general hygiene measures with that of limitations on personal movement on the spread of an infectious agent. We have shown that the severity of SARS in particular was due largely to transmission in hospital. And we have shown that Avian Influenza will require a stringent and co-ordinated response for it to be effectively controlled. The strength of complex network models is not that they replace existing techniques, but that they provide an efficient mechanism to model complicated patterns and structures when there is not enough information (data) to build these features into the model. Complex network models provide a general phenomenological explanation for observed behaviour rather than relying on a particular exact (and data-intensive) computer simulation of reality. We are currently working on more detailed mathematical and statistical analysis of the behaviour of diseases on networks and are looking at ways to incorporate these types of model into the very extensive computer simulation models popular in computational epidemiology.

[1] J.D. Murray, Mathematical Biology, Springer 1993.
[2] W.O. Kermack and A.G. McKendrick, “A Contribution to the Mathematical Theory of Epidemics”, Proc. Roy. Soc. Lond. A 115, 700-721, 1927.
[3] S. Riley, et al., “Transmission dynamics of the etiological agent of SARS in Hong Kong: impact of public health interventions”, Science 300, 1961-1966, 2003.
[4] C.A. Donnelly, et al., “Epidemiological determinants of spread of causal agent of severe acute respiratory syndrome in Hong Kong”, Lancet 361, 1761-1766, 2003. 
[5] D.J. Watts, Six Degrees: The Science of a Connected Age, Norton, New York, 2003. 
[6] S. Milgram, “The small world problem”, Psychol. Today 2, 60-67, 1967. 
[7] M. Small, C.K. Tse and D.M. Walker, “Super-spreaders and the rate of transmission of the SARS virus”, Physica D 215, 146-158, 2006.
[8] M. Small, C.K. Tse and D.M. Walker, “Clustering model for transmission of the SARS virus: application to epidemic control and risk assessment”, Physica A 351, 499-511, 2005.
[9] http://declanbutler.info/blog/?p=58
[10] M. Small, D.M. Walker and C.K. Tse, “Scale-free distribution of Avian Influenza Outbreaks”, Phys. Rev. Lett., 99, 188702, 2007.
[11] E. Massad, et al., “Scale-free network of a dengue epidemic”, Applied Mathematics and Computation, 195, 376-381, 2008.
[12] R. Pastor-Satorras and A. Vespignani, “Epidemic spreading in scale-free networks”, Phys. Rev. Lett. 86, 3200-3203, 2001.
Michael Small, Hong Kong Polytechnic University (Email: ensmall@polyu.edu.hk)
Fig. 1: Spread of Bubonic Plague across Europe between 1347 and 1351. Colour coded by date of outbreak. (Image from Wikipedia, released under the GNU Free Documentation License.)
Fig. 2: Number of reported cases of SARS in Hong Kong during 2003. The dashed line indicates the revised and estimated actual number of cases.
Fig. 4: Avian Influenza cases (circles for birds, triangles for human cases) depicted within Google Earth. Outbreaks are colour coded by date.
Fig. 5: The scale-free distribution of degree (outbreak connectivity) for the animal Avian Influenza data. The estimated exponent is 1.2028 and it is statistically significant.