1. Association analysis in over 329,000 individuals identifies 116 independent variants influencing neuroticism (Nature Genetics)
Abstract
Neuroticism is a relatively stable personality trait characterized by negative emotionality (for example, worry and guilt) [1]; heritability estimated from twin studies ranges from 30 to 50% [2], and SNP-based heritability ranges from 6 to 15% [3,4,5,6]. Increased neuroticism is associated with poorer mental and physical health [7,8], translating to high economic burden [9]. Genome-wide association studies (GWAS) of neuroticism have identified up to 11 associated genetic loci [3,4]. Here we report 116 significant independent loci from a GWAS of neuroticism in 329,821 UK Biobank participants; 15 of these loci replicated at P
2. Single-cell DNA methylome sequencing of human preimplantation embryos (Nature Genetics)
Abstract
DNA methylation is a crucial layer of epigenetic regulation during mammalian embryonic development [1,2,3]. Although the DNA methylome of early human embryos has been analyzed [4,5,6], some of the key features have not been addressed thus far. Here we performed single-cell DNA methylome sequencing for human preimplantation embryos and found that tens of thousands of genomic loci exhibited de novo DNA methylation. This finding indicates that genome-wide DNA methylation reprogramming during preimplantation development is a dynamic balance between strong global demethylation and drastic focused remethylation. Furthermore, demethylation of the paternal genome is much faster and more thorough than that of the maternal genome. From the two-cell to the postimplantation stage, methylation of the paternal genome is consistently lower than that of the maternal genome. We also show that the genetic lineage of early blastomeres can be traced by DNA methylation analysis. Our work paves the way for deciphering the secrets of DNA methylation reprogramming in early human embryos.
3. Annotation-free quantification of RNA splicing using LeafCutter (Nature Genetics)
Abstract
The excision of introns from pre-mRNA is an essential step in mRNA processing. We developed LeafCutter to study sample and population variation in intron splicing. LeafCutter identifies variable splicing events from short-read RNA-seq data and finds events of high complexity. Our approach obviates the need for transcript annotations and circumvents the challenges in estimating relative isoform or exon usage in complex splicing events. LeafCutter can be used both to detect differential splicing between sample groups and to map splicing quantitative trait loci (sQTLs). Compared with contemporary methods, our approach identified 1.4–2.1 times more sQTLs, many of which helped us ascribe molecular effects to disease-associated variants. Transcriptome-wide associations between LeafCutter intron quantifications and 40 complex traits increased the number of associated disease genes at a 5% false discovery rate by an average of 2.1-fold compared with that detected through the use of gene expression levels alone. LeafCutter is fast, scalable, easy to use, and available online.
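The annotation-free idea in the abstract above can be illustrated with a toy sketch. It assumes that introns supported by split reads are grouped into clusters whenever they share a splice site, and that each intron is then quantified as its share of the reads in its cluster. The function names (`cluster_introns`, `usage_ratios`) and the read counts are hypothetical and chosen only for illustration; this is not LeafCutter's own code or interface.

```python
from collections import defaultdict


def cluster_introns(introns):
    """Group introns (chrom, start, end) into clusters.

    Assumption for this sketch: two introns belong to the same cluster
    when they share a start or an end coordinate on the same chromosome.
    Clusters are the connected components, found with union-find.
    """
    parent = {intron: intron for intron in introns}

    def find(x):
        # Follow parent links to the root, compressing the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Index introns by each of their two splice sites.
    by_site = defaultdict(list)
    for chrom, start, end in introns:
        by_site[(chrom, "start", start)].append((chrom, start, end))
        by_site[(chrom, "end", end)].append((chrom, start, end))

    # Introns that share a site are merged into one cluster.
    for members in by_site.values():
        for other in members[1:]:
            union(members[0], other)

    clusters = defaultdict(list)
    for intron in introns:
        clusters[find(intron)].append(intron)
    return [sorted(members) for members in clusters.values()]


def usage_ratios(cluster, counts):
    """Return each intron's fraction of the split reads in its cluster."""
    total = sum(counts.get(intron, 0) for intron in cluster)
    if total == 0:
        return {intron: 0.0 for intron in cluster}
    return {intron: counts.get(intron, 0) / total for intron in cluster}


# Hypothetical split-read counts per intron, for illustration only.
counts = {
    ("chr1", 100, 200): 30,
    ("chr1", 100, 300): 10,
    ("chr1", 250, 300): 20,
    ("chr1", 500, 600): 5,
}

clusters = cluster_introns(list(counts))
for cluster in clusters:
    print(cluster, usage_ratios(cluster, counts))
```

Because the clusters are built from observed splice junctions alone, no transcript annotation is needed; comparing the per-cluster ratios between sample groups or genotypes is the kind of analysis the abstract describes.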
4. Dynamic epigenomic landscapes during early lineage specification in mouse embryos (Nature Genetics)
Abstract
In mammals, all somatic development originates from lineage segregation in early embryos. However, the dynamics of transcriptomes and epigenomes acting in concert with initial cell fate commitment remain poorly characterized. Here we report a comprehensive investigation of transcriptomes and base-resolution methylomes for early lineages in peri- and postimplantation mouse embryos. We found allele-specific and lineage-specific de novo methylation at CG and CH sites that led to differential methylation between embryonic and extraembryonic lineages at promoters of lineage regulators, gene bodies, and DNA-methylation valleys. By using Hi-C experiments to define chromatin architecture across the same developmental period, we demonstrated that both global demethylation and remethylation in early development correlate with chromatin compartments. Dynamic local methylation was evident during gastrulation, which enabled the identification of putative regulatory elements. Finally, we found that de novo methylation patterning does not strictly require implantation. These data reveal dynamic transcriptomes, DNA methylomes, and 3D chromatin landscapes during the earliest stages of mammalian lineage specification.
5. Promoter-enhancer interactions identified from Hi-C data using probabilistic models and hierarchical topological domains (Nature Communications)
Abstract
Proximity-ligation methods such as Hi-C allow us to map physical DNA–DNA interactions along the genome, and reveal its organization into topologically associating domains (TADs). As the Hi-C data accumulate, computational methods were developed for identifying domain borders in multiple cell types and organisms. Here, we present PSYCHIC, a computational approach for analyzing Hi-C data and identifying promoter–enhancer interactions. We use a unified probabilistic model to segment the genome into domains, which we then merge hierarchically and fit using a local background model, allowing us to identify over-represented DNA–DNA interactions across the genome. By analyzing the published Hi-C data sets in human and mouse, we identify hundreds of thousands of putative enhancers and their target genes, and compile an extensive genome-wide catalog of gene regulation in human and mouse. As we show, our predictions are highly enriched for ChIP-seq and DNA accessibility data, evolutionary conservation, eQTLs and other DNA–DNA interaction data.
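The notion of "over-represented" interactions relative to a background model can be sketched in a simplified form. The example below assumes a single global background in which contact frequency decays with genomic distance as a power law, fits it by least squares in log-log space, and reports each pair's observed/expected ratio. PSYCHIC itself, per the abstract, fits a local background within hierarchically merged domains, which this sketch does not attempt; the function names and the contact values are hypothetical.

```python
import math


def fit_power_law(pairs):
    """Least-squares fit of log(freq) = a + b * log(distance).

    `pairs` is a list of (distance_bp, contact_frequency) tuples with
    strictly positive values. Returns the intercept a and slope b.
    """
    xs = [math.log(distance) for distance, _ in pairs]
    ys = [math.log(freq) for _, freq in pairs]
    n = len(pairs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def enrichment(pairs, a, b):
    """Observed / expected contact frequency under the fitted background."""
    return [freq / math.exp(a + b * math.log(distance))
            for distance, freq in pairs]


# Hypothetical normalized contact frequencies: a clean 1/distance decay,
# except that the 40 kb pair is three times higher than the trend.
pairs = [
    (10_000, 100.0),
    (20_000, 50.0),
    (40_000, 75.0),
    (80_000, 12.5),
    (160_000, 6.25),
]

a, b = fit_power_law(pairs)
scores = enrichment(pairs, a, b)
for (distance, freq), score in zip(pairs, scores):
    print(f"{distance:>7} bp  observed={freq:6.2f}  obs/exp={score:.2f}")
```

In this toy data the 40 kb pair stands out as the only contact well above the fitted decay, which is the kind of signal a promoter-enhancer caller would then test for significance.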
6. MICRA: an automatic pipeline for fast characterization of microbial genomes from high-throughput sequencing data (Genome Biology)
Abstract
The increase in available sequence data has advanced the field of microbiology; however, making sense of these data without bioinformatics skills is still problematic. We describe MICRA, an automatic pipeline, available as a web interface, for microbial identification and characterization through reads analysis. MICRA uses iterative mapping against reference genomes to identify genes and variations. Additional modules allow prediction of antibiotic susceptibility and resistance and comparison of the results of several samples. MICRA is fast and produces fewer false-positive annotations and variant calls than current methods, making it a tool of great interest for fully exploiting sequencing data.
7. OMSV enables accurate and comprehensive identification of large structural variations from nanochannel-based single-molecule optical maps (Genome Biology)
Abstract
We present a new method, OMSV, for accurately and comprehensively identifying structural variations (SVs) from optical maps. OMSV detects both homozygous and heterozygous SVs, SVs of various types and sizes, and SVs with or without creating or destroying restriction sites. We show that OMSV has high sensitivity and specificity, with clear performance gains over the latest method. Applying OMSV to a human cell line, we identified hundreds of SVs >2 kbp, with 68% of them missed by sequencing-based callers. Independent experimental validation confirmed the high accuracy of these SVs. The OMSV software is available at http://yiplab.cse.cuhk.edu.hk/omsv/.
8. SIDR: simultaneous isolation and parallel sequencing of genomic DNA and total RNA from single cells (Genome Research)
Abstract
Simultaneous sequencing of the genome and transcriptome at the single-cell level is a powerful tool for characterizing genomic and transcriptomic variation and revealing correlative relationships. However, it remains technically challenging to analyze both the genome and transcriptome in the same cell. Here, we report a novel method for simultaneous isolation of genomic DNA and total RNA (SIDR) from single cells, achieving high recovery rates with minimal cross-contamination, as is crucial for accurate description and integration of the single-cell genome and transcriptome. For reliable and efficient separation of genomic DNA and total RNA from single cells, the method uses hypotonic lysis to preserve nuclear lamina integrity and subsequently captures the cell lysate using antibody-conjugated magnetic microbeads. Evaluating the performance of this method using real-time PCR demonstrated that it efficiently recovered genomic DNA and total RNA. Thorough data quality assessments showed that DNA and RNA simultaneously fractionated by the SIDR method were suitable for genome and transcriptome sequencing analysis at the single-cell level. The integration of single-cell genome and transcriptome sequencing by SIDR (SIDR-seq) showed that genetic alterations, such as copy-number and single-nucleotide variations, were more accurately captured by single-cell SIDR-seq compared with conventional single-cell RNA-seq, although copy-number variations positively correlated with the corresponding gene expression levels. These results suggest that SIDR-seq is potentially a powerful tool to reveal genetic heterogeneity and phenotypic information inferred from gene expression patterns at the single-cell level.
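9. TF2Network: predicting transcription factor regulators and gene regulatory networks in Arabidopsis using publicly available binding site information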