Genetic variation and genome architecture in development, health and disease
This seminar integrates data on genome architecture into genetic studies, in order to identify functional variants that influence normal development, health and disease.
The genome provides not only the instructions that specify the structure and function of proteins, but also the programs that control how these instructions are implemented in different cell types and environments, and at different developmental stages. The expression of protein-coding genes is regulated by complex mechanisms, including epigenetic modification of DNA and histones (chromatin), and dynamic changes in the three-dimensional structure of chromatin through interactions between DNA (enhancers and promoters), RNA (long non-coding RNAs) and proteins (transcription factors). The elucidation of genome architecture has been made possible by recent advances in genomic technologies (e.g. RNA-seq, ChIP-seq, Hi-C, ChIA-PET, ATAC-seq). These technologies have revealed an underlying modular structure of the genome, comprising Topologically Associating Domains (TADs) in which DNA elements (such as enhancers and promoters) interact with each other and with proteins (transcription factors) to regulate gene expression. Large datasets on genome architecture from projects such as ENCODE and Roadmap Epigenomics have been deposited in public databases for the research community. Knowledge of genome regulation and architecture is crucially important for elucidating the causes and mechanisms of complex diseases, as recent genome-wide association studies have found the majority of disease-predisposing genetic variants to be located outside protein-coding sequences. Because neighboring single nucleotide polymorphisms (SNPs) are usually highly correlated with each other, it is often difficult to pinpoint the causal variant in an association region. Knowledge of genome architecture may help to identify functionally important sequences in which mutations are likely to be deleterious.
For example, deletions that disrupt the boundaries between TADs can result in disease because a gene is deprived of its normal enhancers and comes to be controlled by enhancers normally residing in another TAD. Single nucleotide variations in promoters or enhancers may have a smaller but nevertheless significant impact on gene regulation. Another important development is the increasing amount of human whole-genome sequencing data being generated. Projects such as Genomics England and Human Longevity Inc. have already sequenced tens of thousands of subjects. These massive datasets may help to identify non-coding DNA elements that show less variation than the rest of the genome, and are thus likely to be functionally important. The integration of data on human genome variation and on human genome architecture promises to provide detailed knowledge of the key regulatory elements in which variation may alter function and lead to disease. This knowledge will help to pinpoint the causal variants for a disease within an association region. More importantly, the integration of genetic and functional information will provide insights into how genetic variation in non-coding DNA influences normal development and disease risk through changes in gene regulation. The joint consideration of genome variation and genome architecture will be crucial for the success of genomic studies such as the Theme-based Research Scheme projects on skeletal development and intervertebral disc disease, and on enteric nervous system development and Hirschsprung disease. However, expertise in genome architecture is currently lacking in Hong Kong. This Croucher Advanced Study Institute provides education and training to local researchers.