2005 Merck Participants-Department of Biological Sciences - Carnegie Mellon University

2005 Merck-supported Participants

Orhan Ayasli, Carnegie Mellon University
Mentor: Dr. Robert Murphy

Classification and Presentation of Subcellular Images with Confidence using Data Extracted Automatically from Online Journals
Knowledge of subcellular protein localization patterns and their dynamics gives us insight into how the proteins in a cell interact with each other, and thus, how these proteins function. Such information is critical in many real-world applications such as designing disease diagnostics and therapies. There has been extensive work done in automating the collection, organization, and analysis of subcellular localization patterns from text and Fluorescence Microscope Images (FMI) in online journals as part of the Subcellular Location Image Finder (SLIF) project. The SLIF program currently has nine major parts: 1) downloading articles that match a query, 2) extracting figure images and the corresponding captions from the articles, 3) splitting figures into individual panels, 4) identifying Fluorescence Microscope Images among the panel images, 5) removing annotations from images, 6) extracting scale information from the images and captions, 7) identifying protein references from analysis of the captions, 8) classifying images into localization classes based on image features, and 9) providing a user interface to explore the data generated above. Although many parts of the SLIF project relied on assertions from algorithms with varying levels of accuracy, the project did not communicate these confidence levels to the user. The goals of my project were to improve the FMI classifier, to generate confidence levels for the images classified as FMI, and to improve the user interface so that confidence levels are communicated to the user. The FMI classifier was changed from a k-Nearest Neighbor algorithm to a Support Vector Machine (SVM) supervised learning algorithm that was trained using several gray-level image features for each image. Using this new FMI classifier, the images in the SLIF data were reclassified, and confidence levels were generated for each image based on the score given to each image by the classifier.
The user interface was improved by creating a clone of the SLIF server for backup purposes, and by incorporating confidence levels in the output displayed to the user. These confidence levels will help users determine the accuracy of the results returned by the SLIF service. As SLIF is still in its early stages, it is especially important that users interpret the SLIF results in their proper context.
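
As an illustration of how a classifier score can be turned into a confidence level, here is a minimal sketch in Python. The abstract does not say how SLIF derived confidence from the SVM score, so the logistic mapping, the function name score_to_confidence, and the slope and offset parameters are all assumptions made for illustration. In practice such parameters would be fitted on held-out labeled images.

```python
import numpy as np

def score_to_confidence(scores, slope=1.0, offset=0.0):
    """Map signed classifier scores to confidences in (0, 1).

    Assumption: a logistic curve is used. slope and offset are
    hypothetical parameters, not values from the SLIF project.
    """
    scores = np.asarray(scores, dtype=float)
    return 1.0 / (1.0 + np.exp(-(slope * scores + offset)))

# Hypothetical decision scores: positive means "classified as FMI".
scores = np.array([-2.0, -0.1, 0.0, 0.4, 3.0])
confidence = score_to_confidence(scores)

for s, c in zip(scores, confidence):
    label = "FMI" if s > 0 else "not FMI"
    print(f"score={s:+.1f}  label={label}  confidence of FMI={c:.2f}")
```

Scores far from zero map to confidences near 0 or 1, while scores near the decision boundary map to values near 0.5, which is the kind of information a user interface could display next to each image.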

T.J. Corrigan, Carnegie Mellon University
Mentor: Dr. Gordon Rule

Purification and Characterization of Mutant Glutathione Transferase T2-2
Glutathione S-Transferases (GSTs) are involved in the breakdown and trafficking of both electrophilic and genotoxic compounds. Currently, some drugs used to treat cancer are rendered ineffective by GSTs. The goal of this research project is to better understand both the structure and dynamics of the GST isoform T2-2 in order to aid in the development of new drugs that are not affected by GSTs. Past studies on GST T2-2 were limited by disulfide linkages between protein molecules, which caused aggregation. This aggregation made it impossible to use NMR to characterize the protein. Therefore, the project focused on determining the viability of a mutant form of T2-2 as a model system for studying the protein. The mutant differs from the wildtype form in that the three cysteine residues are replaced with serines in order to prevent the formation of the disulfide bonds that were suspected to cause the aggregation. The mutant was first purified using ion-exchange chromatography and affinity chromatography, as is the standard procedure for purifying the wildtype protein. This purification scheme led to unsatisfactory yields, so alternative approaches, including the use of a cation rather than an anion exchange column and the use of size exclusion chromatography, are being investigated. Enzymatic assays on the protein purified by the standard method, using 4-nitrobenzyl chloride (NBC) as a substrate, indicated that the mutant protein retained the same activity as the wildtype. An SDS-PAGE gel was run to confirm that the mutant was not cross-linking. Since the mutant protein is active and does not cross-link, it can effectively model the wildtype.

Andrew Larson Erb, Carnegie Mellon University
Mentor: Dr. Tomasz Kowalewski

Development of Streamlined Signal Management and Processing Code for Scanning Probe Acceleration Microscopy
Tapping Mode Atomic Force Microscopy (TMAFM) is a method of rendering high resolution images of nanoscale structures through the use of a harmonically oscillating cantilever that is driven to collide with the surface of the sample. Deviations from the standard amplitude represent elevation changes that can be compiled into composite topographies of the sample. As the tip of the cantilever is intermittently driven into contact with the surface of the sample, highly anharmonic deflection trajectories are observed. These trajectories can then be filtered through "de-spiking" and "de-noising" algorithms, and the second derivative of the cleaned data yields a force analysis of the contact between the cantilever and the sample. The amplitudes of the peaks in this second derivative correlate with variations in the sample modulus. It is from this principle that scanning probe acceleration microscopy (SPAM) derives. Currently, the major issues facing SPAM are the lack of a single software suite that unifies the elements necessary for signal management and processing, and the lack of streamlined, near real-time algorithms for signal transform analysis and "de-noising". We hypothesized that by rewriting the acquisition and analysis algorithms in C, rather than their current MATLAB, we could improve performance and move closer to real-time analysis of image data. Although this yielded only an approximately 15% performance improvement, it refocused our efforts on adjusting the sampling method used by SPAM to dramatically reduce the data set size while preserving data integrity. With the data set reduced tenfold, calculations can be performed much more easily in a near real-time environment. Ongoing work will continue to improve the efficiency of the acquisition algorithms until analysis can be performed on dynamic samples with changing pH and temperature.
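
The de-noising and second-derivative step described above can be sketched in Python as follows. This is not the project's MATLAB or C code. The Fourier low-pass filter, the function names lowpass_fft and acceleration, and the sample rate, drive frequency, and noise level are all assumptions chosen for illustration, and no "de-spiking" step is shown.

```python
import numpy as np

def lowpass_fft(signal, sample_rate, cutoff_hz):
    """Zero all Fourier components above cutoff_hz (a simple de-noising step)."""
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    spectrum[freqs > cutoff_hz] = 0.0
    return np.fft.irfft(spectrum, n=len(signal))

def acceleration(deflection, sample_rate):
    """Second time derivative of the deflection via repeated finite differences."""
    dt = 1.0 / sample_rate
    velocity = np.gradient(deflection, dt)
    return np.gradient(velocity, dt)

# Hypothetical acquisition parameters (not taken from the abstract).
sample_rate = 1.0e6   # samples per second
drive_freq = 10.0e3   # cantilever drive frequency in Hz
omega = 2.0 * np.pi * drive_freq

t = np.arange(1000) / sample_rate            # exactly 10 drive periods
rng = np.random.default_rng(0)
clean = np.sin(omega * t)                    # idealized harmonic deflection
noisy = clean + 0.01 * rng.standard_normal(t.size)

# Keep only the first five harmonics of the drive, then differentiate twice.
filtered = lowpass_fft(noisy, sample_rate, cutoff_hz=5 * drive_freq)
accel = acceleration(filtered, sample_rate)
```

For a purely harmonic deflection the recovered acceleration is close to the deflection scaled by the negative squared angular frequency. Tip-sample contact would appear as sharp departures from that harmonic baseline, which is why the filter must retain higher harmonics rather than only the drive frequency.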

Sarah Pennie, Carnegie Mellon University
Mentor: Dr. Roberto Gil

Structural Analysis of Solution Aggregates of a Modified β-Cyclodextrin Derivative
A new amphiphilic mono-substituted cyclodextrin derivative, consisting of two isomers that cannot be separated, has been synthesized from native β-cyclodextrin. Previous research raised the suspicion that the derivative aggregates in solution into one of three types of structure, which may include self-inclusion. The purpose of this research is to determine the structure of these aggregates in solution. Cyclodextrins are bucket-shaped, cyclic α-(1,4)-linked oligosaccharides of α-D-glucopyranose containing a relatively hydrophobic central cavity and a hydrophilic outer surface. As a result of their molecular structure and shape, they possess a unique ability to act as molecular containers by entrapping guest molecules in their internal cavity. The resulting inclusion complexes offer a number of potential advantages. In the pharmaceutical industry, cyclodextrins have mainly been used as complexing agents to increase the aqueous solubility of poorly water-soluble drugs, and to increase their bioavailability and stability. The goals of this research are to collect structural information on the compound as a function of variables such as its concentration, temperature, and solvent. Structural analysis of these compounds by combined use of 1D and 2D nuclear magnetic resonance (NMR) spectroscopy indicates that the aliphatic chain is inside the cavity. Although the self-inclusion of this compound is very strong, as the concentration and pH increase, the cyclodextrin derivative also undergoes intermolecular inclusion. The strength of the interaction between the aliphatic chain and the cyclodextrin cavity has been evaluated by a competitive method using adamantamine, a strongly binding guest for cyclodextrin. Few studies have examined amphiphilic cyclodextrin derivatives, so this research could contribute to improving pharmaceutical formulations and drug delivery.

Daniel Smith, Carnegie Mellon University
Mentor: Dr. Justin Crowley

Optimizing in vivo Two-Photon Laser Scanning Microscopy by Motion Artifact Correction
The synapse is thought to be the fundamental unit of computation in the brain. Monitoring experience-dependent changes in the synaptic connections between neurons, such as the addition or elimination of a synapse or a change in synapse morphology, is key to understanding the mechanism of synaptic change in developing and mature animals. In vivo two-photon fluorescence imaging of a neuron allows optical sectioning on the order of a few microns using a laser scanning microscope. The resolution of this technique allows dendritic spines and axonal boutons (diameter of ~2 microns) to be captured even in deep-tissue imaging (up to 500 microns in depth). In vivo imaging of individual neural processes is made more challenging by dynamic pressure waves from biological events such as heartbeat and respiration, which cause motion on the order of 1 to 10 microns in the features being imaged. This makes the comparison of features across different time points difficult and complicates signal averaging of multiple scans. To minimize the effect of cardiac-associated motion, I developed a real-time data acquisition and control system to trigger each planar scan of the microscope from the electrocardiogram of the animal with sub-millisecond precision. We find that the laser scanning microscope can begin a scan with a delay of 4.6 msec. (SD = 0.3 msec.) after a trigger. This method qualitatively increased the correlation between multiple scans of the same location; however, residual motion is still evident in the images, which may be the result of the animal's respiration. We are presently quantifying the improvement in between-image correlation produced by our triggering method using 2D cross-correlation analysis. In addition, we are determining the extent to which small variability in the timing of cardiac events limits our method's ability to remove cardiac-associated motion.
The correlation between the residual motion and the respiration of the animal is currently being determined by monitoring the contraction of the diaphragm muscle during imaging. Post hoc processing of the images to remove additional noise will be performed using statistical methods developed in collaboration with Dr. William Eddy in the Statistics Department.
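
The 2D cross-correlation analysis mentioned above can be illustrated with a short Python sketch. This is not the project's analysis code. The function name estimate_shift, the FFT-based circular correlation, and the synthetic 64 x 64 test image are assumptions made for illustration, and real scans would also need handling of image borders and sub-pixel motion.

```python
import numpy as np

def estimate_shift(reference, moved):
    """Estimate the (row, col) displacement of `moved` relative to `reference`.

    Uses circular 2D cross-correlation computed with FFTs. The position of
    the correlation peak gives the displacement in whole pixels.
    """
    ref = reference - reference.mean()
    mov = moved - moved.mean()
    corr = np.fft.ifft2(np.fft.fft2(mov) * np.conj(np.fft.fft2(ref))).real
    peak = np.unravel_index(np.argmax(corr), corr.shape)
    # Peak indices past the midpoint correspond to negative displacements.
    return tuple(int(p) if p <= n // 2 else int(p) - n
                 for p, n in zip(peak, corr.shape))

def image_correlation(a, b):
    """Pearson correlation coefficient between two images of equal shape."""
    return float(np.corrcoef(a.ravel(), b.ravel())[0, 1])

# Synthetic stand-in for two scans of the same location (hypothetical data).
rng = np.random.default_rng(1)
scan_a = rng.random((64, 64))
scan_b = np.roll(scan_a, shift=(3, -5), axis=(0, 1))  # simulated motion

shift = estimate_shift(scan_a, scan_b)
aligned_b = np.roll(scan_b, shift=(-shift[0], -shift[1]), axis=(0, 1))

before = image_correlation(scan_a, scan_b)
after = image_correlation(scan_a, aligned_b)
print(f"estimated shift={shift}, correlation before={before:.2f}, after={after:.2f}")
```

Comparing the correlation coefficient between scans with and without triggering, or before and after alignment, gives one number per image pair that can be tracked across time points.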