Author Osada, Robert Radoslaw Zygmunt
Title Computational methods for predicting transcription factor binding sites
Descript 90 p
Note Source: Dissertation Abstracts International, Volume: 67-07, Section: B, page: 3911
Adviser: Mona Singh
Thesis (Ph.D.)--Princeton University, 2006
A major challenge in computational biology is to understand the mechanisms that control gene expression. Transcription factor proteins mediate this process by interacting with a cell's DNA. This thesis studies the problem of identifying the sequence-specific DNA binding sites of transcription factors through two complementary approaches: one based primarily on identifying sequence features, the other exploiting a transcription factor's structure.
The first approach considers the problem of developing a representation for the DNA binding sites known to be bound by a particular transcription factor, so that its other binding sites can be recognized. The effectiveness of several commonly used approaches is compared, including position-specific scoring matrices, consensus sequences, and match-mismatch based methods, and the differences in their performance are shown to be statistically significant. Furthermore, the use of per-position information content improves all of the basic approaches, and including local pairwise nucleotide dependencies within binding site models yields statistically significant improvements for approaches based on nucleotide matches. Based on this analysis, the best results when searching for the DNA binding sites of a transcription factor are obtained by methods that use both information content and local pairwise correlations.
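As an illustration only, a position-specific scoring matrix with optional per-position information content weighting might be sketched as follows. This is not the thesis's implementation: the function names, the pseudocount, the uniform background, and the choice to multiply each column's log-odds by its information content are all assumptions made for this example.

```python
import math

BASES = "ACGT"

def column_frequencies(sites, pseudocount=1.0):
    """Per-position base frequencies from equal-length aligned sites.

    The pseudocount (an assumed value) keeps every frequency above zero.
    """
    length = len(sites[0])
    freqs = []
    for i in range(length):
        counts = {b: pseudocount for b in BASES}
        for site in sites:
            counts[site[i]] += 1
        total = sum(counts.values())
        freqs.append({b: counts[b] / total for b in BASES})
    return freqs

def information_content(column):
    """Information content in bits, relative to a uniform background (max 2)."""
    return 2.0 + sum(p * math.log2(p) for p in column.values() if p > 0)

def pssm_score(freqs, candidate, background=0.25, use_ic=False):
    """Sum of per-position log-odds; optionally weight each column by its
    information content (one assumed way to combine the two ideas)."""
    score = 0.0
    for column, base in zip(freqs, candidate):
        log_odds = math.log2(column[base] / background)
        weight = information_content(column) if use_ic else 1.0
        score += weight * log_odds
    return score

# Hypothetical aligned sites, invented for this example.
example_sites = ["TGACTCA", "TGAGTCA", "TGACTCA", "TGACTAA"]
example_freqs = column_frequencies(example_sites)
```

With these hypothetical sites, `pssm_score(example_freqs, "TGACTCA", use_ic=True)` scores a candidate against the model; well-conserved columns contribute more to the total than variable ones.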
The second approach focuses on a particular structural class of transcription factors, the C2H2 zinc fingers, which make up the largest family of eukaryotic transcription factors. Zinc finger protein-DNA interactions are modeled by the pairwise residue-base interactions that make up their structural interface, and a modified support vector machine framework is used to estimate the favorability of each residue-base interaction. Unlike previous approaches, this framework includes not only examples of known interactions but also quantitative information about the relative binding affinities of different protein-DNA configurations. The resulting classifier performs well in a variety of cross-validation tests.
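To make the idea of combining known interactions with relative-affinity information concrete, here is a minimal sketch of a linear model trained on both kinds of constraint. It is not the modified support vector machine described in the thesis: it uses plain subgradient descent on an L2-regularized hinge loss without a bias term, and the contact list, feature encoding, hyperparameters, and toy data are all hypothetical.

```python
AMINO = "ACDEFGHIKLMNPQRSTVWY"
BASES = "ACGT"

def featurize(residues, bases, contacts):
    """Indicator vector with one slot per (contact, residue, base) combination.

    `contacts` is a hypothetical list of (residue_index, base_index) pairs
    naming which residue touches which base in the interface.
    """
    vec = [0.0] * (len(contacts) * len(AMINO) * len(BASES))
    for c, (ri, bi) in enumerate(contacts):
        slot = (c * len(AMINO) + AMINO.index(residues[ri])) * len(BASES)
        vec[slot + BASES.index(bases[bi])] = 1.0
    return vec

def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

def build_constraints(binders, nonbinders, stronger_weaker):
    """Turn every example into a vector d that should satisfy w . d >= 1."""
    cons = [list(x) for x in binders]                  # binders score >= 1
    cons += [[-xi for xi in x] for x in nonbinders]    # non-binders score <= -1
    cons += [[a - b for a, b in zip(s, t)]             # stronger outranks weaker
             for s, t in stronger_weaker]
    return cons

def train(constraints, dim, epochs=200, lr=0.1, reg=0.01):
    """Subgradient descent on hinge loss with L2 shrinkage (assumed settings)."""
    w = [0.0] * dim
    for _ in range(epochs):
        for d in constraints:
            w = [wi * (1.0 - lr * reg) for wi in w]
            if dot(w, d) < 1.0:                        # margin violated
                w = [wi + lr * di for wi, di in zip(w, d)]
    return w

# Toy data with a single contact, invented purely for illustration.
contacts = [(0, 0)]
x_rg = featurize("R", "G", contacts)
x_kg = featurize("K", "G", contacts)
x_dg = featurize("D", "G", contacts)
weights = train(build_constraints([x_rg], [x_dg], [(x_rg, x_kg)]), len(x_rg))
```

Each learned weight can be read as a favorability score for one residue-base pairing at one contact. The ranking constraints let a comparison such as "configuration A binds more strongly than configuration B" shape the weights even when neither configuration is labeled as a binder or non-binder.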
School code: 0181
Host Item Dissertation Abstracts International 67-07B
Subject Biology, Bioinformatics
Computer Science
Alt Author Princeton University