Vsevolod J. Makeev, PhD
The Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow, Russia
Georgia Tech and Moscow Institute for Physics and Technology Joint PhD Program in Bioinformatics
November 24, 2014, 10:30-11:30am
Petit Institute Room 1128
DNA sequences responsible for transcription regulation comprise less than 1% of the total human genome length but control such important processes as tissue differentiation, maintenance of tissue identity, and response to signals from the cell environment. At least 50% of human genome mutations have been found in promoter and enhancer regions. Such variations probably affect gene expression regulation patterns by changing the binding affinity of transcription factors (TFs). Sequence patterns preferentially bound by specific TFs are currently known for about 35% of all human TFs. Such patterns are often obtained with experimental technologies applied either in vivo or in vitro; differences between techniques complicate the creation of a single scale of binding affinities for different TFs in the same regulatory region. Cooperative binding of TFs makes the situation even more complex. I will discuss current progress in the sequence analysis of TF binding sites (TFBS) in the context of large-scale functional genomics studies such as ENCODE and FANTOM5. The database of transcription factor binding motifs (HOCOMOCO) and software for binding specificity analysis (MacroAPE and PerfectosAPE) will be presented.
I will describe the strengths and weaknesses of the sequence analysis approach for assessing sequence variations in gene regulatory regions. Studying several sample sets of somatic mutations associated with cancer development allowed us to evaluate possible neutral and non-neutral somatic mutations in cancer cell lines that affect TF binding at TFBS.