Research | Sedlazeck Lab

1. The impact of SVs on gene expression / phenotypes

We have led studies to identify the role of SV across multiple organisms. In yeast, for example, we showed that SV appear and disappear rapidly across almost identical genomes, leading to gene expression changes. These transient SV arose much faster than SNV and improved the heritability of certain traits. Over the last two years, the Sedlazeck lab played a key role in identifying the impact of SV in tomatoes. Here we identified SV affecting fruit growth, the number of fruits, weakening of the fruit stem and other traits that are highly important for improving harvest and fruit quality. In addition, CRISPR-Cas9 was used to induce these alleles in sister clades to observe the traits re-emerging as predicted. More recently, the Sedlazeck lab co-led the detection of SV across 19,652 human genomes from mixed ethnicities, together with 4,000 protein measurements across 4,000 individuals. Here we associated SV with direct or indirect impacts on proteins important for cardiovascular diseases.

2. SV in human diseases and diagnostics

Structural variation plays a central role in polymorphic variation, pathogenic conditions and cancer, yet the robust detection of these events in human diseases remains challenging. We have led several analysis efforts to identify SV and their pathogenicity or importance across breast cancer, neurological diseases and Mendelian diseases. The Sedlazeck lab was key to two breast cancer studies, in which it identified SV surrounding the Her2 oncogene amplifications and innovated the comparison across different sequencing technologies. Here we and the team showed how SV are connected and lead to large copy number changes with underlying molecular structure, which led to new key insights into Her2+ breast cancer. With collaborators, we further pursued a more cost-efficient strategy that uses CRISPR-Cas9 to target genes in cancer samples and identify haplotype-resolved SNV, SV and methylation. For Parkinson's disease and MSA, the Sedlazeck lab co-developed a cost-efficient and highly accurate assay to target the gene GBA. The team is now able to sequence ~160 samples per day and haplotype-resolve all the SNV and SV. This approach is now being adopted in multiple hospitals. Lastly, in Mendelian diseases, we collaborated with Dr. James Lupski’s lab to help identify and phase pathogenic CNV and de novo variation and to assess their impact. The Sedlazeck lab now leads analysis efforts over 11,000 long-read genomes to obtain novel insights into multiple diseases.

3. The role of SVs in evolution, through studies in Comparative Genomics

We have led and been involved in collaborations to identify SV and their role across multiple model and non-model organisms (fungi, plants and animals). As an example, we studied transposon activity and its impact in crow and fish populations and identified its role in diversity and evolution. The Sedlazeck lab further identified an important duplication that explains how modern tomatoes compensate for a splicing deficiency caused by other SV affecting a MADS-box gene, which initially led to a harmful phenotype. Most recently, we were also able to identify small SV in SARS-CoV-2 samples from the NYC and Houston metro areas. These SV mainly targeted NSP11 and NSP12, with common SV co-occurring in SPIKE and other important genes. To improve the quality of reference genomes and thus the detection of SV, we also contributed to the development of the Falcon-Unzip and Shasta de novo assemblers, which are among the few assemblers that produce high-quality haplotype-resolved assemblies.

4. Development of novel methods for Structural Variant Detection

Structural variation (SV) remains hard to identify but plays an essential role in evolution and disease. We have spearheaded the detection of SV and the use of long reads in genomics and their transition into medical research. The Sedlazeck lab has developed highly cited, innovative methods that enable a more comprehensive analysis, such as NGM and NGMLR for aligning short and long reads to reference genomes. We are the lead developers of multiple methods, including Sniffles, the state-of-the-art long-read-based SV caller. Sniffles has been cited more than 344 times in 2 years, has significantly advanced the field of SV detection and interpretation, and is used in hundreds to thousands of projects. The Sedlazeck lab further led the development of SV calling methods for short reads, such as Parliament2, to enable accurate SV calling at scale. This is needed to obtain accurate allele frequencies, and we have used it across 200,000 human genomes (Topmed, CCDG). The Sedlazeck lab also led the development of methods to accurately genotype SV in a population and of others to compare SV. These methods are paving the way to include SV in comparative and medical research.

5. Benchmarks and standards in Genomics

We contributed to detailed analyses of sequencing biases, including technology-specific consensus errors that can be mistaken for SNPs, sequencing errors and nucleotide biases (e.g. GC). The Sedlazeck lab also played a key role in establishing genomic benchmark sets for SNV and SV in GIAB (NIST) and SEQC2 (FDA). In addition, we developed methods that facilitate the comparison of NGS mapping methods, giving non-expert users key insights into their individual advantages and disadvantages across multiple parameters. This reflects our detailed knowledge of biases across different sequencing technologies and their implications for variant calling and for establishing high-quality call sets. These call sets are key to enabling new biological and medical findings on the mechanisms and occurrence of complex variation.