
Publications

This study uses the G-DNABERT model to analyze live cell data, suggesting that G-quadruplexes play a key role in chromatin looping and gene expression regulation. By integrating KEx footprinting with other genomic datasets, the research provides high-confidence evidence linking G-quadruplexes to enhancer-promoter interactions.

Authors: Maria Poptsova, Alan Herbert, Dmitry Konovalov, Dmitry Umerenkov

This study uses deep learning (DeepZ) to map conserved Z-DNA flipons in human and mouse genomes, revealing their association with transcription regulation, chromatin organization, and promoter activity, suggesting a role in epigenetic state transitions. The findings highlight Z-DNA's potential influence on transcription initiation and genetic information readout.

Authors: Nazar Beknazarov, Dmitry Konovalov, Alan Herbert, Maria Poptsova

This study introduces DNABERT-Z, a transformer-based deep learning model that improves the prediction of Z-DNA regions in genomes by leveraging pretrained DNABERT models and fine-tuning them with experimental data from human and mouse studies, outperforming the previous CNN- and RNN-based DeepZ approach. The model demonstrates cross-species applicability, accurately predicting Z-DNA sites in the mouse genome when trained on human data.

Authors: Dmitry Umerenkov, Vladimir Kokh, Alan Herbert, Maria Poptsova

Here we describe an approach that uses deep learning neural networks such as CNN and RNN to aggregate information from DNA sequence; physical, chemical, and structural properties of nucleotides; and omics data on histone modifications, methylation, chromatin accessibility, and transcription factor binding sites and data from other available NGS experiments. We explain how with the trained model one can perform whole-genome annotation of Z-DNA regions and feature importance analysis in order to define key determinants for functional Z-DNA regions.

Authors: Nazar Beknazarov, Maria Poptsova
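
As a rough illustration of the aggregation step described above, the sketch below joins a one-hot encoding of the DNA sequence with per-position omics signals into a single feature matrix of the kind a CNN or RNN could consume. The function names, the track names (H3K4me3, ATAC), and the column layout are assumptions made for this example and are not taken from the publication.

```python
# Hypothetical sketch: names and feature layout are illustrative assumptions,
# not the published pipeline.
BASES = "ACGT"


def one_hot(seq):
    """One-hot encode a DNA string; unknown bases (e.g. N) become all zeros."""
    return [[1.0 if base == b else 0.0 for b in BASES] for base in seq.upper()]


def build_features(seq, omics_tracks):
    """Concatenate the sequence encoding with per-position omics signals.

    omics_tracks maps a track name to a list of numbers, one per position.
    Tracks are appended in sorted-name order so the column layout is stable.
    """
    names = sorted(omics_tracks)
    for name in names:
        if len(omics_tracks[name]) != len(seq):
            raise ValueError(f"track {name!r} length does not match sequence")
    rows = one_hot(seq)
    return [row + [omics_tracks[n][i] for n in names] for i, row in enumerate(rows)]


# Made-up signal values for a six-base window.
features = build_features(
    "CGCGNA",
    {"H3K4me3": [0.9, 0.8, 0.7, 0.6, 0.0, 0.1], "ATAC": [1, 1, 1, 0, 0, 0]},
)
```

Each row of `features` holds four sequence columns followed by one column per omics track, so a model sees sequence and epigenomic context for the same position side by side.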

Identifying roles for Z-DNA remains challenging given their dynamic nature. Here, we perform genome-wide interrogation with the DNABERT transformer algorithm trained on experimentally identified Z-DNA forming sequences (Z-flipons). The algorithm yields large performance enhancements (F1 = 0.83) over existing approaches and implements computational mutagenesis to assess the effects of base substitution on Z-DNA formation. We show Z-flipons are enriched in promoters and telomeres, overlapping quantitative trait loci for RNA expression, RNA editing, splicing, and disease-associated variants. We cross-validate across a number of orthogonal databases and define BZ junction motifs. Surprisingly, many effects we delineate are likely mediated through Z-RNA formation. A shared Z-RNA motif is identified in SCARF2, SMAD1, and CACNA1 transcripts, whereas other motifs are present in noncoding RNAs. We provide evidence for a Z-RNA fold that promotes adaptive immunity through alternative splicing of KRAB domain zinc finger proteins. An analysis of OMIM and presumptive gnomAD loss-of-function datasets reveals an overlap of Z-flipons with disease-causing variants in 8.6% and 2.9% of Mendelian disease genes, respectively, greatly extending the range of phenotypes mapped to Z-flipons.

Authors: Dmitry Umerenkov, Alan Herbert, Dmitrii Konovalov, Anna Danilova, Nazar Beknazarov, Vladimir Kokh, Aleksandr Fedorov, Maria Poptsova
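
The computational mutagenesis mentioned in the abstract above can be pictured as scoring every single-base substitution and comparing it with the unmutated sequence. The sketch below is a minimal version under stated assumptions: `toy_score` is a placeholder that only counts alternating purine/pyrimidine steps and stands in for a trained model; it is not the DNABERT-based predictor, and `mutagenesis_scan` is a hypothetical helper name.

```python
# Hypothetical sketch: toy_score is a stand-in for a trained model,
# NOT the published DNABERT-based predictor.
PURINES = {"A", "G"}


def toy_score(seq):
    """Fraction of adjacent base pairs that alternate purine/pyrimidine."""
    if len(seq) < 2:
        return 0.0
    alternating = sum((a in PURINES) != (b in PURINES) for a, b in zip(seq, seq[1:]))
    return alternating / (len(seq) - 1)


def mutagenesis_scan(seq, score=toy_score):
    """Return {(position, ref, alt): score change} for every single-base substitution."""
    baseline = score(seq)
    effects = {}
    for i, ref in enumerate(seq):
        for alt in "ACGT":
            if alt == ref:
                continue
            mutant = seq[:i] + alt + seq[i + 1:]
            effects[(i, ref, alt)] = score(mutant) - baseline
    return effects


effects = mutagenesis_scan("CGCGCG")
```

Substitutions with the most negative values in `effects` are the ones the scoring function treats as most disruptive; with a real model passed as `score`, the same loop would rank base changes by their predicted effect on Z-DNA formation.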


Pokrovsky Blvd, 11с4, Moscow, 109028

flipons.ai
