I am a Postdoctoral Fellow at the University of California San Diego, co-supervised by Rob Knight and Pavel Pevzner. I completed my PhD in September 2019 in the Computer Science and Engineering Department at UCSD, under the mentorship of Pavel Pevzner.
My research focus is bioinformatics. In particular, I am interested in algorithms for genome assembly using long reads, which enable high-quality reconstruction of the human genome sequence. I also work on tools for comparative genomics and computational proteomics.
Long-read assembly using Flye and metaFlye. New long-read sequencing technologies (such as Pacific Biosciences or Oxford Nanopore) have increased read lengths to tens of thousands of nucleotides and substantially improved the quality of many genome assemblies. These technologies, however, face the challenge of high error rates in reads. We created the Flye algorithm for the assembly of long, error-prone reads to address this challenge. Flye uses a novel repeat graph framework, which enables fast and accurate assemblies of various organisms. In particular, Flye is well suited to assembling human genomes from ultra-long Oxford Nanopore sequencing data (such as NA12878 or CHM13). We are now working on the new metaFlye algorithm for metagenome assembly using long reads.
Watch our Flye assembler presentation and discussion hosted by Long Read Club:
Comparative assembly using multiple references. Since many de novo assemblies of large genomes are still incomplete, one can use information from related reference genomes to order and orient the contig fragments. We have developed Ragout, which infers structural rearrangements between multiple input references and reconstructs the most probable architecture of a target genome. We used Ragout to produce chromosome assemblies of multiple mouse genomes, which gave insights into rodent genome evolution and novel functional loci. The mouse assemblies were generated as part of the Mouse genomes sequencing project, hosted by the Wellcome Sanger Institute.
Tools for assembly graph analysis. The analysis of genome graphs is helpful in studying the repeat structure of genomes (for example, mosaic segmental duplications in humans). To visualize large and complex assembly graphs, we developed AGB, an interactive graph visualization tool. We have also introduced a new Synteny Paths approach for comparing two related genomes in graph form, similar to synteny blocks for linear genomes. The tools were developed in collaboration with the Center for Algorithmic Biotechnology and Bioinformatics Institute in St. Petersburg, Russia.