By Shaina Eagle, Global Disease Biology, ’24
More than 300,000 cases of cholera were reported in 2020. This infectious disease is spread by water or seafood contaminated with the bacterium Vibrio cholerae. V. cholerae can survive in the open ocean within phytoplankton. The bacteria also spread into inland water sources such as rivers, where they can enter people’s drinking water. The spread of cholera is affected by climate variables such as precipitation, temperature, and oceanic conditions [1, 2, 5, 6, 7, 11, 13]. Climate patterns such as the El Niño Southern Oscillation (ENSO) and the Indian Ocean Dipole (IOD) influence local weather in coastal regions, causing more phytoplankton blooms [2, 11]. Climate change also disrupts water, sanitation, and hygiene (WASH) infrastructure, creating favorable environmental conditions for V. cholerae to thrive. As climate change causes fluctuations in weather patterns and coastal biology, researchers need a reliable method for tracking and predicting cholera. Early warning systems are key for health officials to take proper preventative measures, from vaccine deployment campaigns to emergency clean water storage, that reduce the prevalence and fatality of cholera.
Satellites are one way to measure the variables that affect the spread of cholera. By recording reflected electromagnetic radiation, satellites provide remotely sensed geophysical data on variables such as temperature, water quality, precipitation, and vegetation. Researchers use remotely sensed data in conjunction with algorithms and statistical analyses to model cholera outbreaks and predict how changing variables will alter disease spread. Satellite data is widely accessible, often free, and covers large spatial and temporal ranges [5, 10]. Researchers are able to compile their data without being physically near the areas they are studying. This review will analyze how researchers have developed methods for predicting cholera outbreaks using remotely sensed data, and demonstrate how refinement of these techniques will be crucial to combating cholera outbreaks amidst climate change.
Collection of Satellite Data
Natural disasters are increasing in intensity and frequency, raising the risk of cholera epidemics [2, 4]. Cholera epidemics have historically followed storms, as after Hurricane Matthew in Haiti. Hurricanes can destroy WASH infrastructure, allowing V. cholerae to seep into water supplies and leaving people vulnerable to drinking contaminated water. Detecting outbreaks and identifying their source are crucial steps in reducing deaths from cholera; it is also crucial to improve sanitation and access to clean drinking water and to expand vaccination campaigns. These steps can be aided by remotely sensed data that feeds into prediction models.
Remote sensing measures variables that are known to be connected to cholera incidence. Huq et al. (2017) published research using remotely sensed precipitation, wind swath, geophysical landmarks, and population density after Hurricane Matthew struck Haiti in October 2016. The researchers created a map of the areas at high risk for cholera and were able to predict where outbreak hotspots would occur up to four weeks after the hurricane.
Other useful variables include sea surface temperature (SST), sea surface salinity (SSS), land surface temperature (LST), precipitation, chlorophyll-a concentration (Ch-a), and soil moisture (SM) [1, 4, 5, 6, 7, 9, 11, 13]. SST and Ch-a are indicators of a habitat that is suitable for Vibrio cholerae growth [5, 6]. Flooding from extreme precipitation can flush seawater carrying V. cholerae into inland rivers, estuaries, or drinking water [4, 5, 6].
Satellites can provide data on climate variables in regions that health officials cannot access safely, or after a natural disaster when researchers cannot collect field data due to accessibility or time constraints. This data could help researchers identify particular regions at risk for a cholera outbreak after an extreme weather event and help policymakers make informed decisions about where to implement vaccination programs or establish WASH infrastructure. In districts where cholera is endemic, remotely sensed data could help identify outbreak sources or thresholds for when an outbreak becomes an epidemic. Satellite data on essential climate variables (ECVs) and WASH infrastructure needs to remain publicly and freely available, and will be particularly effective in identifying potential cholera outbreaks as climate change increases the intensity and incidence of natural disasters and of climate patterns that suit V. cholerae proliferation.
Turning Raw Data into Models
Tracking Essential Climate Variables
Satellites provide data across vast geospatial and temporal ranges about the Essential Climate Variables (ECVs) correlated with cholera outbreaks. Remote sensing systems allow researchers to build models of cholera dynamics based on these relationships. Fooladi et al. (2021) used precipitation data from 1983 to 2016 to compute a non-standardized precipitation index (nSPI) in the Gavkhooni basin in Iran. Their model demonstrates how previous understanding of the environmental conditions that precede cholera outbreaks can be combined with satellite data to make novel predictions about disease outbreaks. For example, an algal bloom is an exponential growth of phytoplankton, which rely on chlorophyll-a to photosynthesize, grow, and produce nutrients. Phytoplankton is a reservoir of V. cholerae and can be seen from space because of its green pigment. Therefore, Ch-a is a close enough proxy to phytoplankton for modeling the levels of V. cholerae bacteria in an area. In 2021, Lai et al. used Landsat images from NASA and Sentinel-2A images from the European Space Agency to measure Ch-a in the Guanting Reservoir, one of the main water supply sources for Beijing, China. They developed a model relating variables derived from the satellite images (bands, normalized difference vegetation index, surface algal bloom index, Apple algorithm values) to Ch-a. For 2016, 2017, and 2019, the Ch-a predicted by their model correlated with measured chlorophyll-a levels at the 0.05 significance level. This data allowed the researchers to model trends in Ch-a and water nutrition status, with applications to monitoring reservoir eutrophication and thus disease transmission.
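The idea of relating a satellite-derived index to measured Ch-a can be sketched in a few lines of Python. This is a minimal illustration, not the model of Lai et al.: it uses only the normalized difference vegetation index (NDVI) as a predictor, fits a straight line by ordinary least squares, and runs on made-up reflectance and Ch-a values. The names ndvi, fit_line, and predict_chl_a are hypothetical.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index from near-infrared and red reflectance."""
    return (nir - red) / (nir + red)


def fit_line(x, y):
    """Ordinary least-squares slope and intercept for y ~ slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Assumption: invented (nir, red) reflectances at four sampling points, paired
# with invented field measurements of Ch-a. Real studies use many more points.
pixels = [(0.30, 0.10), (0.25, 0.15), (0.40, 0.10), (0.20, 0.20)]
chl_a_measured = [30.0, 20.0, 34.0, 10.0]

index_values = [ndvi(nir, red) for nir, red in pixels]
slope, intercept = fit_line(index_values, chl_a_measured)


def predict_chl_a(nir, red):
    """Estimate Ch-a for a new pixel from the fitted line."""
    return slope * ndvi(nir, red) + intercept


print(round(predict_chl_a(0.35, 0.15), 2))
```

A published retrieval model would combine several band-derived variables and be validated against independent field samples; the single-predictor line here only shows the shape of the calculation.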
Variables such as LST and SM can be linked to cholera outbreaks through machine learning (ML) algorithms. ML elucidates complex relationships between variables, such as the risk of a cholera outbreak and ECVs. By statistically analyzing input data taken from satellites, ML allows researchers to build models that predict an output (i.e., an outbreak). Algorithms such as Random Forest (RF), XGBoost, K-Nearest Neighbors, and Linear Discriminant Analysis have been examined by researchers [1, 9]. Campbell et al. (2020) found RF to be the most effective classifier due to its superior performance in handling oversampled and imbalanced datasets. It yielded a high true positive rate (the proportion of actual positive cases that are correctly predicted) of 89.5% when fitting a model combining a season encoder, location encoder, LST, Ch-a, SSS, sea level anomalies, SM, and their lag values (past values of each variable used as predictors). Campbell et al.’s (2020) model combined five ECVs and drew on data across forty districts of India from 2010 to 2018.
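Two pieces of bookkeeping mentioned above, lag values and the true positive rate, can be illustrated with a short Python sketch. It is not the pipeline of Campbell et al.: the function names add_lags and true_positive_rate, the temperature series, and the labels are all invented for illustration, and no classifier is trained here. The rows produced by add_lags are the kind of input a classifier such as RF would consume.

```python
def add_lags(series, lags):
    """Build feature rows [x_t, x_(t-k1), x_(t-k2), ...] for every time step t
    that has all requested lagged values available."""
    max_lag = max(lags)
    rows = []
    for t in range(max_lag, len(series)):
        rows.append([series[t]] + [series[t - k] for k in lags])
    return rows


def true_positive_rate(y_true, y_pred):
    """Share of actual positives (label 1) that were predicted as positive."""
    true_pos = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    false_neg = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return true_pos / (true_pos + false_neg)


# Assumption: invented monthly land surface temperatures (degrees C).
lst_monthly = [28.0, 29.5, 31.0, 30.2, 27.8]
features = add_lags(lst_monthly, lags=[1, 2])
for row in features:
    print(row)  # current month, one month earlier, two months earlier

# Assumption: invented outbreak labels (1 = outbreak) and model predictions.
actual = [1, 0, 1, 1, 0, 1]
predicted = [1, 0, 0, 1, 1, 1]
print(true_positive_rate(actual, predicted))
```

Because outbreaks are rare in most datasets, the true positive rate is more informative than overall accuracy: a model that never predicts an outbreak can still be accurate most of the time while catching none of them.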
In a 2013 study, Jutla et al. acknowledged that Ch-a alone cannot serve as an accurate predictor of a cholera outbreak, as other organic matter and detritus not represented by a chlorophyll index can also contribute to the presence of cholera bacteria. To account for this, they developed the Satellite Water Marker (SWM) index, which uses wavelength radiance to identify coastal conditions and predict cholera outbreaks one to two months in advance. SWM is based on shifts in the difference between blue (412 nm) and green (555 nm) wavelengths, which indicate the turbidity (impurity) of water. They observed a high correlation between SWM in the early winter months in the Bay of Bengal and cholera peaks in the spring, likely reflecting multiple coastal conditions, not just Ch-a. Jutla et al. (2013) tested the Bay of Bengal SWM in Mozambique, where there is one annual cholera peak as opposed to two. They again found that the SWM was a more accurate indicator of cholera than Ch-a alone. Jutla’s index was used again by Ogata et al. (2021) to determine the specific environmental conditions in previous seasons that precede cholera outbreaks in northern coastal regions in the Bay of Bengal. They linked spring cholera to summer precipitation and the previous fall/winter SWM. Meanwhile, La Niña-driven SST deviations and floods caused by high summer rainfall preceded fall cholera outbreaks. Variability in climate conditions and SWM over decades indicates that predictive models cannot remain static. A clear understanding of shifts in climate patterns over time is thus integral to accurate forecasting.
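The lead-time relationship described above, where an index measured in one season anticipates cholera in a later one, amounts to a lagged correlation. The Python sketch below is illustrative only. It uses a plain blue-minus-green radiance difference as a stand-in, which is an assumption and not the published SWM formula, and all radiance values and case counts are invented. The function names are hypothetical.

```python
from math import sqrt


def blue_green_difference(r412, r555):
    """Stand-in index: blue (412 nm) minus green (555 nm) radiance.
    Assumption: NOT the published SWM formula, only the difference it builds on."""
    return r412 - r555


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def lagged_correlation(index, cases, lag):
    """Correlate index[t] with cases[t + lag], where lag is in time steps."""
    return pearson(index[:len(index) - lag], cases[lag:])


# Assumption: invented monthly radiances and invented monthly case counts,
# constructed so that cases follow the index two months later.
r412 = [2.0, 4.0, 3.0, 6.0, 5.0, 7.0]
r555 = [1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
index = [blue_green_difference(b, g) for b, g in zip(r412, r555)]
cases = [0, 0, 10, 30, 20, 50]

for lag in (0, 1, 2):
    print(lag, round(lagged_correlation(index, cases, lag), 3))
```

Scanning several lags and keeping the strongest one is a simple way to estimate how much warning an index provides, though with real data the result should be checked against seasons that were not used to choose the lag.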
Challenges and Limitations
Remotely sensed data is integral to developing timely and accurate predictive models and early warning systems for cholera outbreaks. However, there is no set of ECVs or specific ML technique that can be applied universally, especially when looking at endemic versus epidemic cholera [1, 2, 5, 6, 7, 9]. Many studies struggled with a lack of field data against which to test their models, particularly after extreme weather events, which may destroy existing data collection infrastructure. Researchers were also challenged by imbalanced datasets when training ML algorithms, even with particularly resilient algorithms like RF [1, 9]. Cholera is notoriously difficult to model because it can occur through multiple pathways of transmission, and cholera outbreaks are related to several climate variables through complex relationships [5, 6, 9]. Further testing in diverse regions, under various climate conditions, utilizing assorted ECVs, and employing numerous ML techniques is necessary to make these models as accurate as possible. Future studies should focus on long-term observations of variables known to be connected to cholera and V. cholerae, such as sea surface salinity [1, 11]. Future models also need to take socioeconomic data into account [1, 4].
The purpose of this review was to demonstrate how and why remotely sensed data is being used to predict cholera outbreaks, particularly as climate change makes local weather patterns more unpredictable. Researchers do not report that satellite or ML technology is a limiting factor in making satellite data-driven cholera prediction models commonplace. However, different regions around the world have different seasonal and interannual variability of cholera transmission, making it difficult to develop a universal model. Therefore, future studies should emphasize testing various ML methods with diverse ECVs worldwide. Future studies should also work to formulate indices such as the SWM that can be applied over different geographical regions with minimal alterations. As climate change intensifies, cholera prediction models are vital components of disease prevention. Cholera is unlikely to be eradicated, but there are steps that can be taken to control its transmission and minimize its mortality. These steps are more effective the more time officials have to deploy them, so models that can provide significant lead times are critical.
1. Campbell AM, Racault M-F, Goult S, Laurenson A. 2020. Cholera risk: a machine learning approach applied to essential climate variables. International Journal of Environmental Research and Public Health. 17(24):9378.
2. Christaki E, Dimitriou P, Pantavou K, Nikolopoulos GK. 2020. The impact of climate change on cholera: a review on the global status and future challenges. Atmosphere. 11(5):449.
3. Fooladi M, Golmohammadi MH, Safavi HR, Singh VP. 2021. Fusion-based framework for meteorological drought modeling using remotely sensed datasets under climate change scenarios: resilience, vulnerability, and frequency analysis. Journal of Environmental Management. 297:113283.
4. Huq A, Anwar R, Colwell R, McDonald MD, Khan R, Jutla A, Akanda S. 2017. Assessment of risk of cholera in Haiti following Hurricane Matthew. The American Journal of Tropical Medicine and Hygiene. 97(3):896–903.
5. Jutla AS, Akanda AS, Islam S. 2010. Tracking cholera in coastal regions using satellite observations. JAWRA Journal of the American Water Resources Association. 46(4):651–662.
6. Jutla A, Akanda AS, Huq A, Faruque ASG, Colwell R, Islam S. 2013. A water marker monitored by satellites to predict seasonal endemic cholera. Remote Sensing Letters. 4(8):822–831.
7. Khan R, Aldaach H, McDonald C, Alam M, Huq A, Gao Y, Akanda AS, Colwell R, Jutla A. 2019. Estimating cholera risk from an exploratory analysis of its association with satellite-derived land surface temperatures. International Journal of Remote Sensing. 40(13):4898–4909.
8. Lai Y, Zhang J, Song Y, Gong Z. 2021. Retrieval and evaluation of chlorophyll-a concentration in reservoirs with main water supply function in Beijing, China, based on Landsat satellite images. International Journal of Environmental Research and Public Health. 18(9):4419.
9. Leo J, Luhanga E, Michael K. 2019. Machine learning model for imbalanced cholera dataset in Tanzania. The Scientific World Journal. 2019:1–12.
10. Moore GK. 1979. What is a picture worth? A history of remote sensing / Quelle est la valeur d’une image? Un tour d’horizon de télédétection. Hydrological Sciences Bulletin. 24(4):477–485.
11. Ogata T, Racault M-F, Nonaka M, Behera S. 2021. Climate precursors of satellite water marker index for spring cholera outbreak in Northern Bay of Bengal coastal regions. International Journal of Environmental Research and Public Health. 18(19):10201.
12. World Health Organization. 2021. Cholera annual report 2020. Weekly Epidemiological Record. 96:445–460.
13. Xu M, Cao CX, Wang DC, Kan B, Xu YF, Ni XL, Zhu ZC. 2016. Environmental factor analysis of cholera in China using remote sensing and geographical information systems. Epidemiology and Infection. 144(5):940–951.