1999 Merck Participants - Department of Biological Sciences - Carnegie Mellon University

1999 Merck-supported Participants

Ryle Goodrich, Carnegie Mellon University
(Advisor: Dr. David Hackney)

The Effect of Molecular Shape on Brownian Dynamics Simulation
Brownian Dynamics (BD) 3-D computer simulations are used to model the movements of molecules and small particles in solution. Researchers often use these models to make a theoretical estimate of the bimolecular rate constant for a pair of molecules. Most, if not all, models use a single translational diffusion coefficient and a single rotational diffusion coefficient to describe molecular motion, even though non-spherical molecules can rotate about, and move laterally along, their different axes at very different rates. For simple geometric models of small particles, diffusion coefficients can be calculated for rotational and lateral motion with respect to each axis. The current simulation extends a previous study that calculated the bimolecular rate constant for small spheres reacting with each other (S. H. Northrup and H. P. Erickson, Proc. Natl. Acad. Sci. 89, pp. 3338-3342, 1992). Ours calculates the bimolecular rate constants for a sphere reacting with prolates (elongated, ovoid 3-D structures) of various lengths and equal volume. This will help determine the effects of shape on diffusion-dependent particle reactions. At present, the results are inconclusive as to the exact effect of shape, except that increasing the elongation of the prolates decreases the bimolecular rate constant.
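As an illustration of what using separate diffusion coefficients per axis can mean in practice, the following is a minimal sketch of a single translational Brownian step for an axially symmetric particle. It is not the simulation described above: the function name `bd_translate` and the parameters `d_par` and `d_perp` (diffusion coefficients along and perpendicular to the long axis) are assumptions made for this example, and rotational diffusion and reaction criteria are left out.

```python
import math
import random


def bd_translate(pos, axis, d_par, d_perp, dt, rng=random):
    """One translational Brownian step for an axially symmetric particle.

    pos    : (x, y, z) position of the particle centre
    axis   : unit vector along the particle's long axis
    d_par  : diffusion coefficient along the axis (hypothetical name)
    d_perp : diffusion coefficient perpendicular to the axis (hypothetical name)
    dt     : time step
    """
    # Three independent standard normal deviates.
    g = [rng.gauss(0.0, 1.0) for _ in range(3)]
    # Step sizes: each direction has variance 2 * D * dt.
    s_par = math.sqrt(2.0 * d_par * dt)
    s_perp = math.sqrt(2.0 * d_perp * dt)
    # Projection of the random vector onto the long axis.
    g_par = sum(gi * ai for gi, ai in zip(g, axis))
    # Scale the parallel and perpendicular parts separately and recombine.
    return tuple(
        p + s_perp * (gi - g_par * ai) + s_par * g_par * ai
        for p, gi, ai in zip(pos, g, axis)
    )
```

With `d_par == d_perp` this reduces to the usual isotropic step, which is the single-coefficient case most models use.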

Kiarri Kershaw, Carnegie Mellon University
(Advisors: Drs. Robert Murphy and Mark Craven)

Creation of a Knowledge Base for Protein Localization
A great deal of information has been made available to scientists over the Internet, but it is often difficult to access this information effectively. Keyword-based searching, used by most web databases, is often an ineffective way to answer queries. One reason is that keywords may have many synonyms; another is that answers may require connections more subtle than Boolean operations. For example, Eisenhaber and Bork found that searches with "intracell", "cytoplasm", "cytosol", "extracell", and "membran" classify only 22% of SWISS-PROT into intracellular, extracellular and membrane-associated proteins. Keyword searches are also inadequate when the answers involve integrating information from multiple sources. The goal of this project is to create a thorough, up-to-date, easy-to-use knowledge base focused on protein subcellular localization. The starting point has been the creation of an ontology describing all terms used to describe subcellular locations and all known relations between those terms (e.g., cis-Golgi is-connected-to medial Golgi). I participated in the development of this ontology during the Spring 1999 semester. This summer, I developed computer programs to analyze the extent to which the ontology includes all terms used in SWISS-PROT records on subcellular locations, and improved the ontology by adding missing terms identified by the programs. Future work will consist of using this ontology to populate a subcellular location knowledge base from MEDLINE and other Internet sources, and applying data mining techniques to the knowledge base to discover new knowledge about protein location and function.
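The coverage analysis described above could look roughly like the sketch below. It is only an illustration, not the programs written for the project: the function name `missing_terms` is hypothetical, and the input is assumed to be subcellular location strings already extracted from the records, split on simple punctuation.

```python
import re


def missing_terms(record_locations, ontology_terms):
    """Count location terms that appear in records but not in the ontology.

    record_locations : iterable of free-text location strings (assumed to be
                       already extracted from the database records)
    ontology_terms   : iterable of terms currently in the ontology
    Returns a dict mapping each missing term to its number of occurrences.
    """
    known = {term.lower() for term in ontology_terms}
    missing = {}
    for text in record_locations:
        # Split on common separators, e.g. "Nuclear; cytoplasmic."
        for term in re.split(r"[;,.]| and ", text.lower()):
            term = term.strip()
            if term and term not in known:
                missing[term] = missing.get(term, 0) + 1
    return missing
```

For example, `missing_terms(["Nuclear; cytoplasmic."], ["nuclear"])` reports `cytoplasmic` as a candidate term to add to the ontology.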

Christopher Mason, Carnegie Mellon University
(Advisors: Dr. Gordon Rule and Michael Erdmann)

Application of Homology to the Determination of Protein Structure by NMR
The determination of protein structure is a time-consuming but important process. Protein homology is an attractive resource for supplementing NMR spectroscopy of novel proteins. Our work proposes to accelerate the determination of protein structure by detecting and applying homology to unknown proteins from simple NMR data. It does so using amide NOESY data without requiring experimental NMR assignments. Filtering techniques allow the rapid creation of alignments between a protein of known structure and one of unknown structure based on distance matrix comparison. The techniques are implemented in C++ and applied to aspartate transcarbamoylase, calmodulin and troponin C.
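To make the idea of distance matrix comparison concrete, here is a much simplified sketch. It is not the filtering technique of this project: it assumes the NOE contacts are already indexed by residue position (which the assignment-free approach above does not require), it only tries ungapped offsets, and the function names and the 5.0 cutoff are assumptions chosen for the example.

```python
import math


def distance_matrix(coords):
    """Pairwise distances between points, e.g. amide proton positions."""
    n = len(coords)
    return [[math.dist(coords[i], coords[j]) for j in range(n)] for i in range(n)]


def score_offset(known_dist, contacts, offset, cutoff=5.0):
    """Fraction of contacts (i, j) that are also close in the known structure
    when position i of the unknown is placed at position i + offset."""
    n = len(known_dist)
    hits = 0
    total = 0
    for i, j in contacts:
        a, b = i + offset, j + offset
        if 0 <= a < n and 0 <= b < n:
            total += 1
            if known_dist[a][b] <= cutoff:
                hits += 1
    return hits / total if total else 0.0


def best_offset(known_dist, contacts, offsets, cutoff=5.0):
    """Return the trial offset whose contacts agree best with the known structure."""
    return max(offsets, key=lambda o: score_offset(known_dist, contacts, o, cutoff))
```

A real alignment would also need gaps and a way to handle unassigned peaks; the sketch only shows how a known structure's distance matrix can rank candidate alignments.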

W. Matt Vieta, Carnegie Mellon University
(Advisor: Dr. Frederick Lanni)

Computing Vector Fields from Image-Derived Deformation Data
The ability to determine the forces that a cell exerts on its surroundings has many applications in areas such as tissue engineering and wound healing. This summer I have worked on developing a program that determines the forces exerted by a cell on a collagen gel. The program takes as input a series of time-lapse images. Deformations in the gel must be detected in order to find the forces applied by the cell. Once these deformations have been found, the forces are calculated using a non-linear curve fitting method. Current work consists of interfacing the curve fitting methods with the deformation-finding algorithm. Non-linear curve fitting is a problem in itself. In the future we would like to improve the curve fitting method and, if possible, the underlying mathematical model.
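The abstract does not state which model or fitting algorithm is used, so the following is only a generic sketch of non-linear least-squares fitting by the Gauss-Newton method. The exponential decay model `y = a * exp(-b * x)` and the function name `gauss_newton_exp` are placeholders chosen for illustration, not the gel deformation model.

```python
import math


def gauss_newton_exp(xs, ys, a, b, iterations=20):
    """Fit y = a * exp(-b * x) by Gauss-Newton, starting from the guess (a, b).

    The model is a placeholder; only the fitting loop is the point here.
    """
    for _ in range(iterations):
        # Accumulate the 2x2 normal equations (J^T J) d = J^T r.
        s_aa = s_ab = s_bb = 0.0
        g_a = g_b = 0.0
        for x, y in zip(xs, ys):
            e = math.exp(-b * x)
            r = y - a * e        # residual at this data point
            ja = e               # derivative of the model with respect to a
            jb = -a * x * e      # derivative of the model with respect to b
            s_aa += ja * ja
            s_ab += ja * jb
            s_bb += jb * jb
            g_a += ja * r
            g_b += jb * r
        det = s_aa * s_bb - s_ab * s_ab
        if abs(det) < 1e-300:
            break                # singular system: stop rather than divide by zero
        # Solve the 2x2 system by Cramer's rule and update the parameters.
        a += (s_bb * g_a - s_ab * g_b) / det
        b += (s_aa * g_b - s_ab * g_a) / det
    return a, b
```

Gauss-Newton can diverge from a poor starting guess, which is one reason non-linear fitting is a problem in its own right; damped variants such as Levenberg-Marquardt are a common remedy.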

Daniel Vogel, Carnegie Mellon University
(Advisor: Dr. David Yaron)

Parallel Processing Issues in Semi-Empirical Quantum Calculations
The repulse library is a set of functions and methods that use semiempirical quantum chemistry to explore the energy states of large molecules. The methods are based on the Intermediate Neglect of Differential Overlap (INDO) and Pariser-Parr-Pople (PPP) theories. While considerable work has been done on developing parallel algorithms for quantum chemistry, these calculations present new computational bottlenecks that require different optimization strategies.

This library, written straight from theory over the past several years, has been heavily used by the Yaron group in the exploration of the photophysical properties of long conjugated polymers. The necessary calculations are computationally demanding and memory-intensive, which limits the problems to which the library can be applied.

Working from an existing, stable and tested object-oriented codebase, we have started to apply a number of optimization techniques, including loop optimization, compiler optimizations specific to the target platforms, class-based mutex protection, multithreading, and clustering. By starting from a theoretically accurate base, and through heavy code profiling, we have reduced a number of the library's computational bottlenecks and have therefore begun to allow the library to be applied to new classes of problems.
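As a rough illustration of class-based mutex protection combined with multithreading, the sketch below keeps a lock inside the class that owns the shared data, so callers never handle the lock themselves. It is not code from the repulse library: the names `SharedAccumulator` and `parallel_sum` are hypothetical, and in CPython threads will not speed up CPU-bound work, so only the locking pattern is meant to carry over.

```python
import threading


class SharedAccumulator:
    """Running total whose updates are protected by a lock owned by the object."""

    def __init__(self):
        self._lock = threading.Lock()
        self._total = 0.0

    def add(self, value):
        # Only one thread at a time may update the total.
        with self._lock:
            self._total += value

    @property
    def total(self):
        with self._lock:
            return self._total


def parallel_sum(term, n_terms, n_threads=4):
    """Sum term(k) for k in range(n_terms), splitting the work across threads."""
    acc = SharedAccumulator()

    def worker(start):
        # Each thread sums its own strided share, then takes the lock once.
        partial = 0.0
        for k in range(start, n_terms, n_threads):
            partial += term(k)
        acc.add(partial)

    threads = [threading.Thread(target=worker, args=(t,)) for t in range(n_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return acc.total
```

Accumulating a per-thread partial result and locking once per thread, rather than once per term, keeps contention on the mutex low.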