Experts in bioinformatics from St Petersburg University help to assemble the genome sequence of coronavirus from Russia
Dmitry Antipov and Mikhail Rayko, research fellows at the Center for Algorithmic Biotechnology at St Petersburg University, have taken part in sequencing the genome of SARS-CoV-2. The viral RNA was extracted from a swab sample obtained from an infected resident of St Petersburg on 15 March. The complete genome of the SARS-CoV-2 coronavirus from a Russian patient has been sequenced by researchers from the Smorodintsev Research Institute of Influenza, St Petersburg.
At the moment, the attention of doctors and scientists around the world is focused on the coronavirus infection COVID-19, caused by the SARS-CoV-2 virus. Over just a few months, the outbreak of the new coronavirus infection has turned into a pandemic: tens of thousands of new cases are being reported daily. Furthermore, as the virus spreads across the globe, its genetic information inevitably changes: the ‘Italian’ and ‘Russian’ variants differ slightly from the original ‘Chinese’ virus.
Monitoring changes in the SARS-CoV-2 genome over time will help scientists to better understand where the coronavirus came from and how it is likely to change in the future. For this purpose, researchers are studying different variants of the virus. So far, more than 1,000 swab samples from around the world have been used for genome sequencing. Recently, the ‘Russian’ variant of the SARS-CoV-2 genome was added to the list of decoded viral variants. It was sequenced by scientists from the Smorodintsev Research Institute of Influenza with the assistance of bioinformatics experts from the Center for Algorithmic Biotechnology at St Petersburg University, who helped to assemble the short RNA fragments into a single sequence.
When we learnt that the genome of SARS-CoV-2 was being sequenced at the Smorodintsev Research Institute of Influenza, we offered to assist with the assembly. We were aware of the importance of the task, so we put all other projects on hold for two days and focused on it.
Dmitry Antipov, Junior Research Fellow at the Center for Algorithmic Biotechnology, St Petersburg University
Dmitry Antipov noted that the SARS-CoV-2 genomic sequence is relatively tiny – about 30,000 nucleotides. In comparison, human DNA is composed of 3.2 billion pairs of nucleotides. The experts had to work with a huge amount of raw data obtained after sequencing – 18 gigabytes of information. Nonetheless, the task proved to be well within their range of competence, as the development of software tools for genome assembly is one of the main areas of work of the center.
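To give a sense of what assembling short fragments into a single sequence involves, here is a minimal sketch in Python. It is a toy greedy overlap merger, not the software the researchers used (the article does not describe their tools), and the function names, the minimum overlap length and the example reads are all illustrative assumptions. Real assemblers also have to cope with sequencing errors, repeats and millions of reads.

```python
from typing import List


def overlap(a: str, b: str, min_len: int) -> int:
    """Length of the longest suffix of `a` that equals a prefix of `b`.

    Returns 0 if no overlap of at least `min_len` characters exists.
    """
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads: List[str], min_len: int = 3) -> List[str]:
    """Repeatedly merge the two fragments with the longest overlap.

    Toy illustration only: assumes error-free reads from one strand.
    """
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(contigs) > 1:
        best_k, best_i, best_j = 0, -1, -1
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # nothing left to merge
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [c for idx, c in enumerate(contigs)
                   if idx not in (best_i, best_j)]
        contigs.append(merged)
    return contigs


# Three made-up overlapping fragments of a 15-letter sequence.
fragments = ["ACGTTAGC", "ATGGCGTA", "GCGTACGT"]
print(greedy_assemble(fragments))
```

In this example the three fragments are merged back into one 15-letter sequence. The pairwise comparison used here grows quadratically with the number of reads, which is one reason gigabytes of raw data call for specialised assembly software.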
‘RNA viruses show a fairly high degree of genetic variability,’ explained Mikhail Rayko. ‘This is one of their essential features. It is crucial that we distinguish the mutations that arise spontaneously from sequencing errors. At the same time, the mutation rate of the virus is vital for vaccine development.’
The faster the virus mutates, the greater the likelihood that at some point the developed vaccine will become ineffective.
Mikhail Rayko, Research Fellow at the Center for Algorithmic Biotechnology, St Petersburg University
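The distinction Mikhail Rayko draws between real mutations and sequencing errors can be illustrated with a simple rule of thumb: a difference from the reference genome is more believable when many independent reads covering the same position agree on it. The Python sketch below is only an illustration of that idea. The function name, the thresholds (`min_depth`, `min_fraction`) and the example data are assumptions, not the procedure used by the researchers.

```python
from collections import Counter


def classify_site(pileup: str, reference_base: str,
                  min_depth: int = 10, min_fraction: float = 0.8) -> str:
    """Classify one genome position from the bases that reads report there.

    `pileup` is a string with one letter per read covering the position.
    The thresholds are illustrative assumptions, not published values.
    """
    counts = Counter(pileup.upper())
    depth = sum(counts.values())
    if depth < min_depth:
        return "low coverage"  # too few reads to judge either way
    base, support = counts.most_common(1)[0]
    if support / depth < min_fraction:
        return "ambiguous"  # reads disagree too much
    if base == reference_base.upper():
        return "matches reference"  # stray mismatches treated as errors
    return f"likely mutation to {base}"


# 48 of 50 reads show T where the reference has C: probably a real change.
print(classify_site("T" * 48 + "C" * 2, "C"))
# A single T among 50 reads is more likely a sequencing error.
print(classify_site("C" * 49 + "T", "C"))
```

A fixed threshold like this is a deliberate simplification; in practice the quality score of each base and the error profile of the sequencing instrument would also be taken into account.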
The genome sequences of SARS-CoV-2 obtained in different countries, including the ‘Russian’ one, are shared in the GISAID database, explains Mikhail Rayko. GISAID (Global Initiative on Sharing All Influenza Data) was originally developed to monitor new strains of the influenza virus. The database now also shows a phylogenetic tree of the new virus and the transmission patterns of the coronavirus disease. The virus is changing; therefore, it is important that scientists understand which variant they are dealing with and where it has come from.
The sequencing and assembly of the SARS-CoV-2 genome enabled researchers from the Smorodintsev Research Institute of Influenza and their colleagues from the Skolkovo Institute of Science and Technology to clarify some of the details. On the whole, the ‘Russian’ coronavirus differs little from the original ‘Chinese’ variants. Genetically, however, it is closer to the European samples. The researchers emphasise that, to learn more, it is imperative to continue sequencing the various strains of the virus found around the world.
‘I am very glad that Mikhail and Dmitry were able to help colleagues from the Smorodintsev Research Institute of Influenza. Sequencing and assembly errors lead to misinterpretation of the data and incorrect conclusions. To complicate matters further, in challenging situations there is always little time for re-analysis and corrections,’ stressed Alla Lapidus, Associate Director of the Center for Algorithmic Biotechnology of the Institute of Translational Biomedicine at St Petersburg University. ‘At present, bioinformatics expertise is extremely scarce. In these trying times, this has become all too obvious. Hence, it is vital to develop and implement undergraduate, graduate and postgraduate academic programmes in bioinformatics in order to overcome this scarcity and train professionals with the most in-demand skills.’