Image via Folding@Home
By Nathan Levinzon, Neurobiology, Physiology, and Behavior ‘23
Author’s Note: The purpose of this article is to introduce the UC Davis scientific community to Folding@Home, a distributed computing project that allows individuals and researchers to donate computing resources from their computers towards COVID-19 research.
Keywords: COVID-19, Folding@Home, Distributed Computing
Reports of localized viral pneumonia cases in the Chinese city of Wuhan began in December 2019 and initially drew little concern worldwide. Since then, the world has changed drastically as a result of COVID-19. As of September 16, 2020, almost thirty million cases of COVID-19 have been confirmed across the globe, claiming the lives of close to one million individuals. In California alone, there have been seven hundred seventy thousand cases with close to twenty-three thousand deaths as of September 28, 2020. Millions of individuals are currently under government-mandated shelter-in-place orders, forcing the lives of many to come to a standstill. In a statement made in March of this year, UC Davis Chancellor May said that “much of [UC Davis’] research is ramping down, but when it comes to the coronavirus, our efforts continue apace.” One such effort taking place at UC Davis is called Folding@Home (FAH), and it allows researchers to study the mechanisms of COVID-19 with nothing but a computer.
FAH originated as a project to study how protein structures interact with their environment. Currently, some proteins of particular interest to FAH are the constituents of SARS-CoV-2, the virus that causes COVID-19. Like other coronaviruses, SARS-CoV-2 has four types of structural proteins: the spike, envelope, membrane, and nucleocapsid proteins. Many copies of the spike protein protrude from the surface of the virus, where they wait to encounter Angiotensin-Converting Enzyme 2 (ACE2) on the surface of human cells. In order to develop therapeutic antibodies or small molecules for the treatment of COVID-19, scientists need to better understand the structure of the viral spike protein and how it binds to the human enzymes required for viral entry into cells. Before the spike proteins on SARS-CoV-2 can function, they must first take on a particular structure, known as a “conformation,” through a process known as “protein folding.” Because many factors influence protein folding, particularly the electrostatic interactions between amino acids and their environment, research into therapeutics against COVID-19 first requires intensive computation to resolve protein structure. Only after the proteins of SARS-CoV-2 are understood can the hunt for a cure begin.
FAH is able to study the complex phenomenon of protein folding thanks to the computational power of distributed computing. A distributed computing project is a piece of software that allows volunteers to “donate” computing time from the Central Processing Units (CPUs) and Graphics Processing Units (GPUs) in their personal computers towards solving problems that require significant computing power, like protein folding. In essence, FAH uses a personal computer’s computational resources while the computer is idle to perform calculations involving protein folding. Each donated computer becomes a “node” within a larger cluster of computers, an arrangement known as “cluster computing.” FAH uses the cluster’s resources to run complex biophysical computer simulations in order to understand the complexities and outcomes of protein folding. In this way, FAH brings together citizen scientists who volunteer to run simulations of protein dynamics on their personal computers.
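The idea of splitting one large job into independent pieces can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the threads stand in for volunteers' computers, a seeded random walk stands in for a short protein simulation, and the names `run_work_unit` and `run_project` are hypothetical and do not come from the Folding@Home software.

```python
from concurrent.futures import ThreadPoolExecutor
import random


def run_work_unit(seed, n_steps=1000):
    """Stand-in for one short simulation: a seeded 1-D random walk.

    Assumption: a real work unit would be a molecular dynamics run;
    the random walk only keeps the example small and reproducible.
    """
    rng = random.Random(seed)
    position = 0
    for _ in range(n_steps):
        position += rng.choice((-1, 1))
    return seed, position


def run_project(seeds, n_volunteers=4):
    """Hand independent work units to a pool of 'volunteers' and collect results."""
    with ThreadPoolExecutor(max_workers=n_volunteers) as pool:
        # Each work unit depends only on its own seed, so the units can
        # be processed in any order and on any worker.
        return dict(pool.map(run_work_unit, seeds))


results = run_project(range(8))
print(results)
```

Because no work unit depends on another, adding more workers shortens the wall-clock time without changing the collected results, which is the property that lets a project of this kind scale with the number of volunteers.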
Studying protein folding via distributed computing had humble beginnings, but it has grown into a technology with the potential to characterize even the most elusive proteins. FAH uses established protein conformations as starting points for a set of simulation trajectories through a technique called “adaptive sampling.” The theory behind adaptive sampling goes as follows: if a protein folds through the states A to B to C, researchers can calculate the transition time between A and C by simulating the A to B transition and the B to C transition separately. First, a computer simulates the initial conditions of a protein many times to determine the sample space of protein conformations. As the simulations discover more conformations, a Markov state model (MSM) is created and used to find the most dominant protein conformations. The MSM represents a master equation framework, meaning that, in theory, the complete dynamics of a protein can be described using a single MSM. The MSM method significantly increases the efficiency of simulation because it avoids unnecessary computation and allows for the statistical aggregation of short, independent simulation trajectories. The amount of time it takes to construct an MSM is inversely proportional to the number of parallel simulations running, i.e., the number of CPUs and GPUs available. At the end of computation, an aggregate final model of all the sampled states is generated. This final model illustrates the folding events and dynamics of the protein, which researchers can use to study and discover potential binding sites for novel therapeutic compounds.
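To make the aggregation step concrete, the sketch below builds a toy MSM from four short, independent trajectories over three states (0 = A, 1 = B, 2 = C). It is a minimal illustration, not FAH's actual pipeline: the trajectories are invented, the state labels are assumed to be already assigned, and the transition matrix is a plain row-normalized count matrix rather than the more careful estimators used in practice. The function names are hypothetical.

```python
import numpy as np


def count_transitions(trajectories, n_states, lag=1):
    """Count observed i -> j transitions at the given lag across all trajectories."""
    counts = np.zeros((n_states, n_states))
    for traj in trajectories:
        for i, j in zip(traj[:-lag], traj[lag:]):
            counts[i, j] += 1
    return counts


def transition_matrix(counts):
    """Row-normalize counts into transition probabilities."""
    T = counts.astype(float).copy()
    for i in range(len(T)):
        total = T[i].sum()
        if total > 0:
            T[i] /= total
        else:
            T[i, i] = 1.0  # state never left in the data: treat as absorbing
    return T


def stationary_distribution(T, n_iter=10_000, tol=1e-12):
    """Estimate long-run state populations by repeatedly applying T."""
    pi = np.full(len(T), 1.0 / len(T))
    for _ in range(n_iter):
        nxt = pi @ T
        if np.abs(nxt - pi).max() < tol:
            return nxt
        pi = nxt
    return pi


# Invented toy data: no single trajectory needs to cover A -> B -> C;
# the short runs are stitched together statistically.
trajectories = [
    [0, 0, 1, 1, 0],
    [1, 1, 2, 2, 2],
    [0, 1, 1, 2, 2],
    [2, 2, 1, 2, 2],
]

counts = count_transitions(trajectories, n_states=3)
T = transition_matrix(counts)
pi = stationary_distribution(T)
print("transition matrix:\n", T)
print("most populated state:", int(np.argmax(pi)))
```

In this toy data set the first trajectory never reaches state C, yet the combined model still assigns a population to every state, which is the sense in which an MSM aggregates short, independent simulations.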
The power of FAH’s distributed computing in the hunt for a cure to COVID-19 grows with each citizen scientist who donates the power of their idle computer to FAH’s network. Currently, pharmaceutical research on COVID-19 is hindered by the fact that there are no obvious drug binding sites on the surface of the SARS-CoV-2 virus. This makes developing therapeutic remedies for COVID-19 a long, expensive process of “guess and check.” However, there is promise: in the past, FAH’s simulations have captured motions in the proteins of the Ebola virus that create a potentially druggable site not otherwise observable. Using the same methodology as for Ebola, FAH has now found similar events in the spike protein of SARS-CoV-2 and hopes to use this and future results to one day produce a life-saving treatment for COVID-19. By downloading Folding@Home and selecting to contribute to “Any Disease,” anyone can help provide FAH-affiliated researchers with the computational power required to tackle this pandemic. For more information, refer to https://foldingathome.org/start-folding/.
- Smith, M., White, J., Collins, K., McCann, A., & Wu, J. (2020, June 28). Tracking Every Coronavirus Case in the U.S.: Full Map. The New York Times. https://www.nytimes.com/interactive/2020/us/coronavirus-us-cases.html
- May, G. S. (2020, March 20). Update on Our Response to COVID-19. UC Davis Leadership. https://leadership.ucdavis.edu/news/messages/chancellor-messages/update-on-our-response-to-covid19-march-20
- Astuti, I., & Ysrafil. (2020). Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2): An overview of viral structure and host response. Diabetes & Metabolic Syndrome: Clinical Research & Reviews. https://doi.org/10.1016/j.dsx.2020.04.020
- Everts, S. (2017, July 31). Protein folding: Much more intricate than we thought. Chemical & Engineering News, 95(31). https://cen.acs.org/articles/95/i31/Protein-folding-Much-intricate-thought.html
- About – Folding@home. (n.d.). Folding@Home. Retrieved June 28, 2020, from https://foldingathome.org/about/
- Bowman, G. R., Voelz, V. A., & Pande, V. S. (2011). Taming the complexity of protein folding. Current Opinion in Structural Biology, 21(1), 4–11. https://doi.org/10.1016/j.sbi.2010.10.006
- Husic, B. E., & Pande, V. S. (2018). Markov State Models: From an Art to a Science. Journal of the American Chemical Society, 140(7), 2386–2396. https://doi.org/10.1021/jacs.7b12191
- Sengupta, U., Carballo-Pacheco, M., & Strodel, B. (2019). Automated Markov state models for molecular dynamics simulations of aggregation and self-assembly. The Journal of Chemical Physics, 150(11), 115101. https://doi.org/10.1063/1.5083915
- Stone, J. E., Phillips, J. C., Freddolino, P. L., Hardy, D. J., Trabuco, L. G., & Schulten, K. (2007). Accelerating molecular modeling applications with graphics processors. Journal of Computational Chemistry, 28(16), 2618–2640. https://doi.org/10.1002/jcc.20829
- Cruz, M. A., Frederick, T. E., Singh, S., Vithani, N., Zimmerman, M. I., Porter, J. R., Moeder, K. E., Amarasinghe, G. K., & Bowman, G. R. (2020). Discovery of a cryptic allosteric site in Ebola’s ‘undruggable’ VP35 protein using simulations and experiments. https://doi.org/10.1101/2020.02.09.940510