8th Annual Computational Molecular Biology Symposium at Carnegie Mellon
Department of Biological Sciences
Gene Regulatory Evolution in an Equivalence Class of Developmental Enhancers
Biological cells behave in complex ways by producing different RNA molecules in response to diverse conditions. These RNA molecules fold into specific functional RNA structures, or are translated into peptide sequences, which fold into proteins. RNA transcripts are encoded in and transcribed from DNA segments called genes, in a process called "gene expression". Gene expression is controlled by regulatory DNA sequences at each gene locus, which instruct the cell as to the conditions under which that gene should be expressed. Thus a gene encodes a potential RNA transcript as well as several instructions for when to produce the transcript. Regulatory DNAs are therefore critical for specifying the number of different gene expression states available to a cell, and the situations in which a cell transitions between these states. Regulatory DNAs are vastly more numerous and complex than the easily identifiable protein-coding DNAs that they regulate. Regulatory DNAs represent the latest frontier in biology.
In my talk, I will focus on the structure and evolution of an equivalence class of regulatory DNAs from a model developmental system. I will also discuss how such an example corpus can help guide a unified computational approach to the study of the native computational infrastructure of living cells.
Department of Biological Sciences
University of Pittsburgh
Vertical and Horizontal Inheritance: How Bacteria Rewrite Evolutionary History
The relationships among organisms are dictated by the rules they follow when exchanging genes. For typical diploid eukaryotes, organisms within a species have complex, ever-changing relationships to their history of gene exchange. Between species, relationships are simple and stable, since species boundaries limit gene exchange; the relationships between typical eukaryotic species reflect the time since genetic exchange ceased. But bacteria break both of these rules, since they exchange genes in a very different fashion. First, bacterial species are not clearly delineated, and the long process of lineage separation results in necessarily ambiguous relationships among sister taxa. Second, gene exchange across species boundaries results in groups whose genotypic and phenotypic similarity reflects not their common ancestry, but their tendency to exchange genes with each other.
Department of Biology
Pennsylvania State University
Studying Mutations in the Age of Statistical Genomics
Evolution starts with and is impossible without mutations. Yet, mutations are rather infrequent and therefore are difficult to study. However, with completely sequenced genomes accumulating at a growing pace, bioinformatic and statistical analyses of mutations are now feasible. It is known that mutation rates fluctuate greatly from locus to locus in mammalian genomes, and that the rates of some mutation types co-vary regionally. The causes of this variation and co-variation remain largely unexplored, and deciphering them computationally is expected to unravel the intricacies of mutagenesis. Compared with wet-lab experiments, computational analyses enable us to study mutations in their native genomic environment and on a whole-genome scale. In this presentation I will focus on our recent studies of regional variation in indel, microsatellite, and substitution mutation rates, carried out using advanced regression methods.
Associate Curator and Head of Botany
Section of Botany
Carnegie Museum of Natural History
Urban Tree Diversity: A Call for HELP Initiated by an AFLP Assessment of Genetic Variability Among Schenley Park London Planetree
London planetree (Platanus × acerifolia) is a popular street tree, which is a hybrid between the American sycamore and the Oriental planetree. Using AFLP markers, gene diversity and phenetic relationships were estimated in a collection of 38 London planetree samples from Schenley Park and several plant nurseries. Four selective primer combinations generated a total of 492 amplification products. The average number of scoreable fragments was 95 per primer combination. A total of 381 polymorphic markers was detected. Polymorphism ranged from 60% to 82%, with an average of 75%. The final phenetic trees were constructed using Nei and Li's coefficient of similarity with UPGMA. Other clustering algorithms were examined and all had similar co-phenetic correlation values. The phenetic tree separated the nursery trees from the Schenley Park samples. Bootstrap and jackknife analyses were completed and their values indicated strong support for this clade. We investigated the propagation techniques used for these samples and found that the Schenley Park plants were grown from seeds or seedlings, whereas the nursery plants were propagated from clones, which was consistent with the differences in similarity coefficients. The level of genetic variability detected within the London planetree samples with AFLP analysis suggests that it is a reliable, efficient, and effective marker technology for determining genetic differences that can be used for better management of rural tree plantings. Additional research examining urban biodiversity and our knowledge of nursery practices has led us to issue a call for help in managing our urban trees. This is especially urgent since we are currently planting billions of dollars' worth of trees to clean our cities without assessing the genetic diversity of these trees.
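The clustering step described above can be illustrated with a minimal sketch. It assumes AFLP fragments are scored as presence/absence (1/0) vectors, computes Nei and Li's similarity S = 2·n_ab / (n_a + n_b), where n_ab is the number of shared fragments and n_a, n_b are the fragment counts of each sample, and then builds a UPGMA tree on the distance 1 − S. The sample names and band profiles below are hypothetical toy data, not values from the study, and the abstract does not state which software was actually used.

```python
from itertools import combinations


def nei_li_similarity(a, b):
    """Nei-Li similarity between two 0/1 band profiles of equal length."""
    shared = sum(1 for x, y in zip(a, b) if x and y)
    total = sum(a) + sum(b)
    # Two profiles with no bands at all are treated as identical.
    return 2.0 * shared / total if total else 1.0


def upgma(names, profiles):
    """Build a UPGMA tree (nested tuples) from distance 1 - Nei-Li similarity."""
    # Each cluster id maps to (subtree, number of samples in the cluster).
    clusters = {i: (name, 1) for i, name in enumerate(names)}
    dist = {}
    for i, j in combinations(range(len(names)), 2):
        sim = nei_li_similarity(profiles[i], profiles[j])
        dist[frozenset((i, j))] = 1.0 - sim
    next_id = len(names)
    while len(clusters) > 1:
        # Merge the pair of clusters with the smallest average distance.
        pair = min(
            (frozenset(p) for p in combinations(clusters, 2)),
            key=lambda p: dist[p],
        )
        i, j = sorted(pair)
        (tree_i, n_i), (tree_j, n_j) = clusters.pop(i), clusters.pop(j)
        # Distance to the merged cluster is the size-weighted mean.
        for k in clusters:
            d_ik = dist[frozenset((i, k))]
            d_jk = dist[frozenset((j, k))]
            dist[frozenset((next_id, k))] = (n_i * d_ik + n_j * d_jk) / (n_i + n_j)
        clusters[next_id] = ((tree_i, tree_j), n_i + n_j)
        next_id += 1
    (tree, _), = clusters.values()
    return tree


# Hypothetical toy data: two seed-grown park trees, two clonal nursery trees.
names = ["park_1", "park_2", "nursery_1", "nursery_2"]
profiles = [
    [1, 1, 0, 1, 0, 0, 1, 0],
    [1, 0, 0, 1, 0, 1, 1, 0],
    [0, 1, 1, 0, 1, 0, 0, 1],
    [0, 1, 1, 0, 1, 0, 0, 1],
]
tree = upgma(names, profiles)
print(tree)
```

In this toy input the two nursery profiles are identical, as clones would be, so they join first; the two park samples then pair with each other before the two groups merge at the root.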
Assistant Professor, Biological Statistics and Computational Biology
Signatures of Selection in 29 Mammals
With the recent completion of 22 new genome assemblies, the number of sequenced placental mammalian (eutherian) species has more than quadrupled, and their total phylogenetic diversity (in neutral substitutions per site) has increased nearly five-fold. These new data provide unprecedented opportunities for phylogenomics, not only in mammals, but in any group of closely related species.
We have recently developed two new computational methods for detecting signatures of selection in deep comparative genomic data sets, and have applied them to genome-wide alignments of 29 eutherian genome sequences. The first method, called phyloP, can detect possible negative or positive selection at individual sites in mammalian genomes, either across all branches of the phylogeny or in individual clades or lineages. PhyloP supports several statistical tests for conservation or acceleration, and is the basis of a set of new tracks in the UCSC Genome Browser. Preliminary analyses of these tracks reveal new cases of apparent primate-specific acceleration and primate-specific conservation, as well as refined estimates of the share of the genome that is under selection.
The second method, called dmotif, uses a phylogenetic hidden Markov model to detect cases of transcription factor binding site gain or loss along the branches of a phylogeny. Inference is performed by Markov chain Monte Carlo sampling, so that full posterior distributions over binding site histories can be obtained. The method makes use of previously characterized binding site motifs, and is designed to be applied to regions of the genome identified in high-throughput chromatin immunoprecipitation experiments (ChIP-chip, ChIP-PET, or ChIP-seq). We will present preliminary results based on simulated and real data.