2003 Merck Participants-Department of Biological Sciences - Carnegie Mellon University

2003 Merck-supported Participants

Julianna Conde-Adorno, Universidad Metropolitana (Advisor: Dr. Robert Murphy)

Temporal Texture Features to Location Proteins

Proteomics, the systematic study of the proteins expressed in a given cell type or tissue, is a growing field that seeks to identify the protein differences between cell types that are responsible for their different behaviors. Knowledge of the subcellular location of proteins is important to understanding protein function. This project is part of research on developing methods that allow the numerical description and classification of the characteristic patterns of subcellular structures in fluorescence microscope images of eukaryotic cells. Previous work has been restricted to the analysis of static images. In this project we used temporal texture features to describe fluorescence microscope image time series. The goal was to classify proteins not only by their static patterns but also to subdivide those classes by considering their behavior over time. We calculated temporal texture features by a modification of the procedure we previously used to calculate Haralick spatial texture features. Specifically, we used a time-series gray level co-occurrence matrix in place of the gray level co-occurrence matrices for the various spatial directions. Element [i, j] of this matrix is the number of pixels with value i in the image at time t that are in the same position as pixels with value j in the image at time t+1, normalized by the sum of the matrix elements. Since this matrix is inherently dependent on the gray level resolution used, we explored converting images to various numbers of gray levels in order to find the best value for classifying a set of time series for various proteins. From the results obtained we can conclude that temporal texture features can be used to differentiate images from different subcellular locations, and that the number of gray levels used does not significantly affect the results of the classification.
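The time-series co-occurrence matrix described in this abstract can be illustrated with a minimal Python sketch. It is not the project's actual code: the function names, the linear quantization to a chosen number of gray levels, the 8-bit pixel range, and the choice of contrast as the example Haralick-style feature are all assumptions made for illustration, and only a single pair of consecutive frames is handled.

```python
def quantize(image, levels, max_value=255):
    """Map pixel values in [0, max_value] to gray levels 0..levels-1.

    Linear binning is an assumption; the abstract does not say how
    images were converted to a given number of gray levels.
    """
    return [[min(levels - 1, pixel * levels // (max_value + 1)) for pixel in row]
            for row in image]


def temporal_cooccurrence(frame_t, frame_t1, levels):
    """Element [i][j] is the fraction of pixel positions holding gray
    level i at time t and gray level j at time t+1."""
    counts = [[0] * levels for _ in range(levels)]
    for row_t, row_t1 in zip(frame_t, frame_t1):
        for i, j in zip(row_t, row_t1):
            counts[i][j] += 1
    total = sum(sum(row) for row in counts)
    if total == 0:
        return [[0.0] * levels for _ in range(levels)]
    # Normalize by the sum of the matrix elements, as in the abstract.
    return [[count / total for count in row] for row in counts]


def contrast(matrix):
    """Haralick-style contrast: sum over i, j of (i - j)^2 * p[i][j]."""
    return sum((i - j) ** 2 * p
               for i, row in enumerate(matrix)
               for j, p in enumerate(row))


# Toy example: two 2x2 frames already expressed in two gray levels.
frame_a = [[0, 1], [1, 0]]
frame_b = [[0, 1], [0, 1]]
matrix = temporal_cooccurrence(frame_a, frame_b, levels=2)
feature = contrast(matrix)
```

Calling `quantize` with different values of `levels` before building the matrix is one way to repeat the gray-level comparison the abstract mentions.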
Kuok Chiang Kim, Carnegie Mellon University (Advisor: Dr. Russell Schwartz)

Computational Analysis of Protein Coats and SNARE Complexes in the Maintenance and Functioning of the Golgi

Compartmentalization of the Golgi is the result of continuous protein flux and membrane exchanges between the ER and plasma membrane. Three types of vesicles are involved in these protein exchanges, but our study focuses on the coatomer-coated vesicles COPI and COPII. These transport vesicles have distinct protein coats and soluble N-ethylmaleimide-sensitive factor attachment protein receptor (SNARE) proteins that regulate their functions and pathways. ADP-ribosylation factor 1 (Arf1) and SAR1 are necessary regulatory proteins that recruit coat proteins before membrane vesiculation and fission of new vesicles. These G-proteins are active only when bound to GTP, and the activation is initiated by guanine-nucleotide exchange factors (GEFs). Given the recruitment of specific coat proteins and cargo receptors, coated vesicles selectively store cargo molecules through binding between coat proteins and cargo receptors in the compartment membrane. Interactions between complementary vesicle and target membrane SNAREs form tight oligomeric protein complexes; this ensures the specific docking of the vesicle with its target compartment for homotypic fusion. Such specific pairing of target-SNARE (tSNARE) and vesicle-SNARE (vSNARE) provides a mechanism for explaining the specificity of membrane fusion and membrane trafficking that we observe in the Golgi and ER. To develop an understanding of the protein dynamics in the Golgi complex, computational models were constructed for analysis. The underlying mechanisms in these models incorporated the SNARE machinery that regulates vesicular trafficking and the protein coat mediated sorting of proteins. The simulated protein coat mediated sorting is tied to the initiation of coat assembly by Arf1 and SAR1, for COPI and COPII vesicles respectively, during membrane vesiculation.
Two types of discrete event computational models that described the protein coat mediated sorting and the SNARE machinery in vesicular trafficking were constructed. The first model, the Steady State Model, tested whether the two primary mechanisms were sufficient to maintain the compartmentalization and identity of the Golgi compartments. The other model, the De Novo Biogenesis Model, was designed to test the de novo construction of a new Golgi based on these mechanisms. From our simulations, we noted that the protein coats alone do not suffice to explain the sorting of proteins and the compartmentalization of the Golgi structure. However, if protein coat mediated sorting is coupled with the SNARE machinery, the distinct vesicular pathways and compartmentalization of the Golgi can be sustained. Furthermore, for the system of protein flux to effectively sustain the Golgi structure and functionality, a retrieval pathway analogous to retrograde transport is imperative for correcting imbalances in protein content or leakage of marker proteins. Because of these corrective pathways, protein retention in the compartments is not needed to minimize the leakage of marker proteins. However, this does not rule out a role for protein retention in the compartmentalization of the Golgi complex. In the De Novo Biogenesis Model, a template of matrix proteins that specifies the Golgi structure was included in the simulation. Large stabilized Golgi compartments were constructed and their functionality was maintained. In another variant of the model, however, a template was not provided. Although new distinct Golgi compartments developed, they were unable to maintain steady state, and the simulation produced an increasing collection of free vesicles. The collapse of the system is attributed to the imbalance of protein content in the transport vesicles.
Despite the drawbacks in this model, there is potential in continuing research on this De Novo Biogenesis model. Further refinements are necessary before more conclusions can be drawn. After a thorough examination of the protein dynamics from our computational models, we can then combine the Steady State and De Novo Biogenesis models to design a more representative discrete event model that describes the protein dynamics in the Golgi.
Daniel Kleinbaum, Carnegie Mellon University (Advisors: Drs. William Brown and Gordon Rule)

Structure-based Analysis of ScFvs for Transducer Placement

The goal of my project this summer was to write a computer program based on knowledge of the structure of a single chain variable fragment (ScFv) to identify potential residues for signal transducer placement. If correctly placed, such a signal transducer should make the ScFv a biosensor that highlights where and when an antigen binds. Thus, the residues the program selects must not be involved with binding to the antigen, but must be close enough to the binding pocket to recognize when the antigen is bound. The program uses information from the given ScFv's total structure and location of its hyper-variable region to determine which amino acids are the best candidates for transducer placement. This is done by evaluating residues based on their position in the Cartesian plane, the location of the residue in the molecule, and the CA-CB angle of each amino acid. The program has selected four common positional amino acids in four different ScFv structures as candidates for transducer coupling. Although the program was written generally so that it can be used on any ScFv, a specific molecule is being used to test the validity of the program's selections. ScFv12 is a molecule that targets an isocyanate hapten bound to a protein carrier. Four of the residues on ScFv12 selected by the program are currently undergoing site-directed mutagenesis and binding affinity studies to see whether or not the program is selecting appropriate residues.
Yenixsa Rivera Sierra, Universidad Metropolitana (Advisor: Dr. Russell Schwartz)

Computational Methods for Detecting Alternative Splicing Involved in Neural Plasticity

The determination of complete genome sequences has greatly facilitated proteomics, the study of all the proteins produced by a given cell type and organism. This project is part of a broader effort that uses proteomics analysis methods to understand the mechanisms involved in learning, with Drosophila melanogaster as the subject. The present work involved one step in the generation of the computational model: finding sequence motifs predictive of differential splicing or expression. The first step was the acquisition of data sets of DNA sequences around the splice sites of genes exhibiting alternative splicing. The genes under study were those that experiments showed to be involved in neural plasticity, either by genetic evidence or by their biochemical functions. The regions taken from the transcripts were divided into four data sets: 5' splice sites of introns exhibiting alternative splicing, 5' splice sites of introns not exhibiting alternative splicing, 3' splice sites of introns exhibiting alternative splicing, and 3' splice sites of introns not exhibiting alternative splicing. We developed a computer program that searched the entire data set for short sequence patterns and calculated chi-square significance scores relating alternative splicing behavior to the presence of the patterns tested. Through this method we found several significant patterns that may be binding sites of unidentified splicing factors. We also found a significant negative relationship between alternative splicing and the pattern GTAAG, a component of the consensus 5' splice site determined by point mutations that prevent splicing in vivo and in vitro. This result suggests that alternatively spliced introns tend to favor weaker, non-consensus splice sites, which serves as a validation of our method.
We plan to follow up the investigation by exploring the identified patterns further with other methods such as Gibbs sampling, hidden Markov models, and modifications of our own methods. One way to evaluate our results in the future would be to alter the predicted motifs and see whether that eliminates alternative splicing.
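The pattern search described in this abstract can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the program the project developed: it assumes every pattern of a fixed length k is tested, that a pattern is scored by its presence or absence in each sequence, and that the score is a Pearson chi-square on a 2x2 table without continuity correction. The function names and the toy sequences are hypothetical.

```python
from itertools import product


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic for the table [[a, b], [c, d]].

    a, b: alternatively spliced sequences with / without the pattern
    c, d: non-alternatively spliced sequences with / without the pattern
    """
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        return 0.0  # pattern present everywhere or nowhere: no information
    return n * (a * d - b * c) ** 2 / denominator


def score_patterns(alt_seqs, const_seqs, k):
    """Score every DNA pattern of length k, highest chi-square first."""
    results = []
    for letters in product("ACGT", repeat=k):
        pattern = "".join(letters)
        with_alt = sum(pattern in seq for seq in alt_seqs)
        with_const = sum(pattern in seq for seq in const_seqs)
        score = chi_square_2x2(with_alt, len(alt_seqs) - with_alt,
                               with_const, len(const_seqs) - with_const)
        results.append((pattern, score, with_alt, with_const))
    results.sort(key=lambda item: item[1], reverse=True)
    return results


# Toy data, invented for illustration only (not from the study).
alt_sites = ["AAGTCC", "TTGTCA", "CCGTCA"]
const_sites = ["GTAAGT", "GTAAGA", "GTAAGC"]
ranked = score_patterns(alt_sites, const_sites, k=5)
best_pattern, best_score, in_alt, in_const = ranked[0]
```

Comparing the two counts returned for a pattern shows the direction of the association: a pattern found mostly in the non-alternatively spliced set has a negative relationship with alternative splicing. Converting the statistic to a p-value, and correcting for the number of patterns tested, is left out of this sketch.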
Ruben Valas, Carnegie Mellon University (Advisors: Drs. Michael Erdmann and Gordon Rule)

PEPMORPH Reloaded: Rapid Protein Structure Determination from Sparse NMR Data

Determination of the 3D structure of a protein is currently a very time-consuming task, taking months. Knowledge of a protein's structure has many applications; for example, determining changes in structure due to ligand binding is extremely useful in drug design. The goal of this project is to be able to quickly determine a protein's 3D structure using sparse data. PEPMORPH is a program written by Dr. Michael Erdmann and Dr. Gordon Rule that takes sparse NMR data (NOESY and RDC) and assigns it to a known structure. This is done by creating a graph for the experimental data and a graph for the known structure. The polytopes from the two graphs are aligned to come up with the assignments. My work this summer focused on refinement of PEPMORPH. There are still many questions about the best way to do some of the underlying steps of PEPMORPH. The first step of this project was to write an extremely streamlined version in C. Rigorous testing was then used to determine when and why the program fails. Over time, PEPMORPH will become more robust and handle such cases better. PEPMORPH will only become a useful tool after its capabilities are thoroughly understood. The current version of PEPMORPH Reloaded performs assignments quite well in systems with no noise. The program is a work in progress; the next step will be development of more robust methods for determining the orientation of the graphs when noise is present.
Ling Wang, Carnegie Mellon University (Advisor: Dr. Nathan Urban)

Graphical Representation of Processing in the Olfactory Bulb through Image Processing with Filters

Sensing an odor involves binding of an odorant to nasal receptor cells, which project their axons to glomeruli in the olfactory bulb. Axons of olfactory receptor neurons expressing a particular odorant receptor converge on single glomeruli. Presentation of even a pure odor results in the activation of glomeruli distributed widely across the bulb. In contrast with this poor spatial specificity, neurons in the olfactory bulb are connected laterally, allowing nearby cells to inhibit each other. What purpose such a circuit serves, if not to process spatial information, is unclear. One proposal is that the pattern of inhibition and excitation observed in the olfactory bulb serves to reduce noise and facilitate discrimination of odors. The goal of this project was to enable simulation of noise reduction in odor processing through the use of filters on graphical representations of patterns of odor-evoked activation. The first stage of the project involved taking a database of existing mapped odors (http://leonlab.bio.uci.edu) and converting the graphical representations into a form appropriate for image filtering meant to simulate odor processing. Image filters were designed to incorporate known physiological characteristics of odor processing, such as the breadth of the area affected by each glomerulus, the ratio of excitation to inhibition, and the overall shape of the filter. After testing several filters, we chose for this analysis a filter possessing a flat central peak and an inhibitory surround that decayed exponentially. This filter was applied to three groups of odor maps: 1) odor maps for different concentrations of the same odor, 2) odor maps representing the same concentration of structurally related odors, and 3) odor maps of random, unrelated odors at similar concentrations. Differences between odor maps within these groups were calculated by computing a pairwise thresholded percent difference.
This measure of map similarity was then used to compare filtered and unfiltered images within these three groups. Our analysis resulted in several findings. First, the patterns of activation produced by different concentrations of the same odor, or by organic acids differing only in the length of their carbon chain, were more similar to one another than the patterns generated by random odor pairs, and the difference was larger for larger differences in concentration. Second, application of the physiologically derived filter reduced both the differences between the patterns of odor-evoked activity from compounds that differed only in their carbon chain length and the differences between the patterns of activity produced by different concentrations.
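The filter and map comparison described in this abstract can be sketched in Python under several loudly stated assumptions. The abstract gives only the filter's qualitative shape (a flat central peak with an exponentially decaying inhibitory surround) and names the comparison a "pairwise thresholded percent difference" without defining it. The parameter names, the zero padding at the map edges, and the definition of the difference as the percentage of pixels whose above-threshold status differs between two maps are all hypothetical choices made for illustration.

```python
import math


def make_kernel(radius, center_radius, inhibition, decay):
    """Square kernel: +1 within center_radius of the middle (flat peak),
    and a negative surround that decays exponentially with distance."""
    kernel = []
    for dy in range(-radius, radius + 1):
        row = []
        for dx in range(-radius, radius + 1):
            distance = math.hypot(dx, dy)
            if distance <= center_radius:
                row.append(1.0)
            else:
                row.append(-inhibition * math.exp(-(distance - center_radius) / decay))
        kernel.append(row)
    return kernel


def apply_filter(image, kernel):
    """Filter a 2D map with zero padding at the edges (an assumption).

    The kernel is radially symmetric, so correlation and convolution
    give the same result here.
    """
    height, width = len(image), len(image[0])
    r = len(kernel) // 2
    output = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            total = 0.0
            for ky, kernel_row in enumerate(kernel):
                for kx, weight in enumerate(kernel_row):
                    yy, xx = y + ky - r, x + kx - r
                    if 0 <= yy < height and 0 <= xx < width:
                        total += weight * image[yy][xx]
            output[y][x] = total
    return output


def thresholded_percent_difference(map_a, map_b, threshold):
    """Percent of pixels that are above threshold in exactly one map.

    This definition is hypothetical; the abstract does not give one.
    """
    differing = 0
    total = 0
    for row_a, row_b in zip(map_a, map_b):
        for a, b in zip(row_a, row_b):
            total += 1
            if (a > threshold) != (b > threshold):
                differing += 1
    return 100.0 * differing / total


# Toy 3x3 "odor maps" invented for illustration.
kernel = make_kernel(radius=1, center_radius=0, inhibition=0.5, decay=1.0)
map_a = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
map_b = [[0.0, 0.0, 0.0], [0.0, 1.0, 1.0], [0.0, 0.0, 0.0]]
raw_diff = thresholded_percent_difference(map_a, map_b, threshold=0.5)
filtered_diff = thresholded_percent_difference(
    apply_filter(map_a, kernel), apply_filter(map_b, kernel), threshold=0.5)
```

Comparing `raw_diff` with `filtered_diff` for each pair of maps in a group mirrors the filtered versus unfiltered comparison the abstract reports; the toy values here say nothing about the study's actual results.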