Massive genetic study shows coronavirus mutating and potentially evolving amid rapid U.S. spread
Scientists in Houston on Wednesday released a study of more than 5,000 genetic sequences of the coronavirus that reveals the virus’s continual accumulation of mutations, one of which may have made it more contagious.
The new report, however, did not find that these mutations have made the virus deadlier or changed clinical outcomes. All viruses accumulate genetic mutations, and most are insignificant, scientists say.
Coronaviruses such as SARS-CoV-2 are relatively stable as viruses go, because they have a proofreading mechanism as they replicate. But every mutation is a roll of the dice, and with transmission so widespread in the United States — which continues to see tens of thousands of new, confirmed infections daily — the virus has had abundant opportunities to change, potentially with troublesome consequences, said study author James Musser of Houston Methodist Hospital.
“We have given this virus a lot of chances,” Musser told The Washington Post. “There is a huge population size out there right now.”
Scientists from Weill Cornell Medicine, the University of Chicago, Argonne National Laboratory and the University of Texas at Austin also contributed to the study.
The new study, which has not been peer-reviewed, was posted Wednesday on the preprint server MedRxiv. It appears to be the largest single aggregation of genetic sequences of the virus in the United States thus far. Scientists in the United Kingdom published a larger batch of sequences earlier this month and, like the Houston researchers, concluded that a mutation that changes the structure of the “spike protein” on the surface of the virus may be driving the outsized spread of that particular strain.
David Morens, a virologist at the National Institute of Allergy and Infectious Diseases, reviewed the new study and said the findings point to the strong possibility that the virus, as it has moved through the population, has become more transmissible, and that this “may have implications for our ability to control it.”
Morens noted that this is a single study, and “you don’t want to over-interpret what this means.” But the virus, he said Wednesday, could potentially be responding, through random mutations, to such interventions as mask-wearing and social distancing.
“Wearing masks, washing our hands, all those things are barriers to transmissibility, or contagion, but as the virus becomes more contagious it statistically is better at getting around those barriers,” said Morens, senior adviser to Anthony S. Fauci, the director of the NIAID.
This has implications for the formulation of vaccines, Morens said. As people gain immunity, either through infections or a vaccine, the virus could be under selective pressure to evade the human immune response.
“Although we don’t know yet, it is well within the realm of possibility that this coronavirus, when our population-level immunity gets high enough, this coronavirus will find a way to get around our immunity,” Morens said. “If that happened, we’d be in the same situation as with flu. We’ll have to chase the virus and, as it mutates, we’ll have to tinker with our vaccine.”
Peter Thielen, a molecular biologist at the Johns Hopkins University Applied Physics Laboratory, said scientists will need to continue studying the virus to see whether the new mutations identified by the Houston researchers change the “fitness” of the virus, “and if SARS-CoV-2 transmissibility is truly increased as a result of these mutations.”
Another scientist who has studied the coronavirus, Jeremy Luban, a virologist at the University of Massachusetts Medical School, said in an email Wednesday that “the Houston paper highlights the fact that, with respect to SARS-CoV-2, we need to remain vigilant, and increase our capacity to monitor the virus for mutations.”
Houston Methodist has its main hospital in the Texas Medical Center in central Houston and operates other hospitals around the city. Its scientists have been sequencing the 30,000-character genome of the coronavirus since early March, when the virus appears to have first arrived in the metropolitan area of 7 million. The paper documents 5,085 sequences.
The research shows that the virus disseminated across Houston neighborhoods in two waves, first striking wealthier and older individuals but then spreading, in the second wave, to younger people and lower-income neighborhoods — affecting many Latino city residents.
At the same time, as the virus spread Zip code by Zip code, it also compiled a catalogue of mutations, many affecting the spike protein. That structure on the surface of the virus, which resembles a tree decked with curled ribbons, enables the virus to enter cells.
The genetic data shows the virus arrived in Houston many separate times, presumably at first by air travel. Notably, 71 percent of the viruses that arrived initially were characterized by a now-famous mutation, which appears to have originated in China, that scientists increasingly suspect may give the virus a biological advantage in how it spreads. It is called D614G, referring to the replacement of an amino acid called aspartic acid (D) with one called glycine (G) in a region of the genome that encodes the spike protein.
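For readers curious about how a label such as D614G is put together, the short Python sketch below unpacks it: the first letter is the original amino acid, the number is its position in the spike protein, and the last letter is the amino acid that replaces it. The function names and the lookup table are illustrative assumptions written for this explanation; they are not taken from the Houston study.

```python
import re
from typing import NamedTuple

# Standard one-letter codes for the 20 amino acids.
AMINO_ACIDS = {
    "A": "alanine", "R": "arginine", "N": "asparagine", "D": "aspartic acid",
    "C": "cysteine", "E": "glutamic acid", "Q": "glutamine", "G": "glycine",
    "H": "histidine", "I": "isoleucine", "L": "leucine", "K": "lysine",
    "M": "methionine", "F": "phenylalanine", "P": "proline", "S": "serine",
    "T": "threonine", "W": "tryptophan", "Y": "tyrosine", "V": "valine",
}

# Pattern: one amino acid letter, a position number, one amino acid letter.
_LETTERS = "".join(AMINO_ACIDS)
_PATTERN = re.compile(rf"([{_LETTERS}])(\d+)([{_LETTERS}])")


class Substitution(NamedTuple):
    """Hypothetical container for one amino acid substitution."""
    original: str      # one-letter code of the amino acid being replaced
    position: int      # position within the protein
    replacement: str   # one-letter code of the new amino acid


def parse_substitution(label: str) -> Substitution:
    """Split a label such as 'D614G' into its three parts."""
    match = _PATTERN.fullmatch(label)
    if match is None:
        raise ValueError(f"not a substitution label: {label!r}")
    original, position, replacement = match.groups()
    return Substitution(original, int(position), replacement)


def describe(label: str) -> str:
    """Spell out a substitution label in plain words."""
    sub = parse_substitution(label)
    old = AMINO_ACIDS[sub.original]
    new = AMINO_ACIDS[sub.replacement]
    return (f"{old} ({sub.original}) at position {sub.position} "
            f"replaced by {new} ({sub.replacement})")


print(describe("D614G"))
```

Read this way, the letter before the number always names what was there originally and the letter after it names what took its place.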
The study found that by the second wave of the outbreak in Houston, this variant had leaped to 99.9 percent prevalence, completing its domination of the outbreak. The researchers found that people infected with the strain had higher loads of virus in their upper respiratory tracts, a potential factor in making the strain spread more effectively.
Kristian Andersen, an immunologist at the Scripps Research Institute in California, who was not involved in the new research, downplayed the significance of the new study. He said it “just confirms what has already been described — G increased in frequency over time.” As for the numerous other mutations the study finds, “they just catalogue them, but we don’t know if any of them have any functional relevance.”
Musser said his interpretation is that D614G has been increasingly dominant in Houston and other areas because it is better adapted to spreading among humans. He acknowledged that the scientific case is not closed on this matter.
“This isn’t a murder trial,” Musser said. “We’re not looking for beyond a reasonable doubt. This is a civil trial, and clearly, it’s the preponderance of the evidence that I think forces all of us into the same conclusion, which is there’s something biologically different about that strain, that family of strains.”
Recently, the even larger study of the spread of the coronavirus in the United Kingdom, based on some 25,000 genomes, also found evidence that this variant of the virus outdistances its competitors “in a manner consistent with a selective advantage.”
In general, scientists would expect natural selection to favor mutations that help the virus spread more effectively — since that allows it to make more copies of itself — but not necessarily ones that make it more virulent. Killing or incapacitating the host would generally not help the virus spread to more people.
The study found 285 separate mutation sites that change a physical building block of the spike protein, which is the most important part of the coronavirus in the sense that it is what allows it to infect and harm humans. Forty-nine of the changes at these sites had not been seen before in other genomes sequenced around the world.
The study characterizes some of the spike protein mutations as “disconcerting.” While the paper does not present strong proof that any additional evolution of the spike protein is occurring, it suggests that these repeated substitutions provide a hint that, as the virus interacts with our bodies and our immune systems, it may be learning new tricks that help it respond to its host.
“I think there’s pretty good evidence that’s consistent with immunologic selection acting on certain regions of the spike protein,” Musser said.
The actual mutations in the virus occur randomly as it makes mistakes trying to copy its genome within our cells. But every new case gives a chance for more mutations to occur, which in turn increases the chance that one of these mutations will be useful to the virus, just as D614G apparently already has been.
Given the changes that are already occurring to the genetic code of the virus, one key conclusion of Musser’s is that we are not sequencing it nearly enough if we want to be able to anticipate what the virus will do next.
While some large metropolitan areas in the United States, such as Seattle and Boston, are also doing a great deal of sequencing, the country as a whole is missing many areas — and many potential virus variants, as a result. Even in Houston, the study estimates that only about 10 percent of known coronavirus cases have been sequenced.
“I think we need to be doing this pretty aggressively in multiple locations on a real-time basis,” Musser said. “I think it’s shameful that we’re not doing that.”
Sarah Kaplan and Aaron Steckelberg contributed to this report.