Carnegie Mellon University

Pre-College Computational Biology provides extensive training in both the cutting-edge laboratory experiments that generate biological data and the computational analysis of those data.

Statement on COVID-19

Because Carnegie Mellon University is holding its summer session online this year, we are transitioning the 2020 offering of the Pre-College program in Computational Biology to an online format.

The overarching focus of our program is to provide an unparalleled learning experience for our students by helping them explore the beautiful computational approaches that can be used to analyze biological data.  This primary goal has not changed.

We are disappointed that we will not get to interact with our students in person this year, but we see the transition to online delivery as an opportunity. For one, we will reach out to each of our students soon so that conversations with our dedicated instructors can start as early as possible. We are also rethinking best practices for delivering our program online, given that our students will be spread across different time zones, and we describe our plan below.

We will provide an asynchronous, self-paced coding bootcamp for our students before the program begins. Because students will complete the bootcamp on their own time, we will be able to hit the ground running on the first day of classes. The program itself will transition from a traditional classroom format to a high-octane hackathon format in which students work in breakout groups to solve real problems in biological data analysis with instructor guidance. Instructors will also provide mini-lectures connecting the different parts of each day’s hackathon, along with further asynchronous materials that students complete between sessions. Even though we will not have access to a lab, some lectures will describe the beautiful experimental approaches needed to generate the data that we analyze. Furthermore, we will show just how much analysis can be done using publicly available data, including data generated by students in last year’s program.

Finally, biological research is at the forefront of today’s news, as researchers scramble to study the coronavirus and build a vaccine.  We will see how computational analysis can be used to identify the origin of the virus and track how it is mutating as it spreads throughout the human population.

In short, we will need to make changes to our program to ensure a great student experience, but we are optimistic and very excited about getting to train a wonderful group of young learners.  We can’t wait until this summer.


Biological and medical research have become fully fledged computational disciplines. Tomorrow's life scientists will need deep knowledge of not only the laboratory techniques for generating experimental data but also the rigorous computational techniques necessary to analyze and model these data. The Pre-College Computational Biology program offers an unparalleled experience for high school students to explore this relationship in a university setting.

After sampling water from one of Pittsburgh’s three rivers, students will isolate the bacterial DNA from the water and break the DNA strands into millions of tiny fragments that are then sequenced. Students will then build algorithms to identify the species of microorganisms present in the water samples and construct an evolutionary tree showing how they have evolved.



Students in the Pre-College Program in Computational Biology conduct cutting-edge experiments in molecular biology and build computational approaches from the ground up to analyze the datasets that they obtain from these experiments. The synergy between experimentation and computation is a hallmark of modern biological research, and for the first time, there is a program designed explicitly for high school students that lets them explore this relationship while solving a real research problem.

Students will complete laboratory and coding bootcamps, as well as collect their own biological samples; in 2019, we sampled river water from Pittsburgh's three rivers to unlock the hidden bacterial ecosystems inhabiting these rivers throughout the year.  Students will then explore the following topics:


Laboratory

  • Bacterial colonization and genome sequencing
  • DNA extraction
  • Genome assembly
  • Polymerase chain reaction
  • 16S ribosomal RNA sequencing

Computational

  • Genome assembly
  • Downstream genome analysis, such as gene finding
  • Sequence alignment and its applications to species identification, genome annotation, and gene comparison
  • Evolutionary tree construction
  • Metagenomics analysis
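
To give a flavor of how sequence alignment can support species identification, here is a minimal Python sketch. It is an illustration only, not code from the program: the function names `alignment_score` and `closest_species`, the scoring values, and the toy DNA sequences are all assumptions made for this example. It scores a sequenced fragment against each reference sequence with a simple global alignment (the Needleman-Wunsch approach) and reports the best-scoring reference.

```python
def alignment_score(a: str, b: str, match: int = 1,
                    mismatch: int = -1, gap: int = -1) -> int:
    """Global (Needleman-Wunsch) alignment score of two sequences.

    The scoring values are illustrative assumptions, not fixed standards.
    """
    # prev[j] is the best score for aligning the processed prefix of a with b[:j].
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # aligning a[:i] against an empty prefix of b
        for j in range(1, len(b) + 1):
            # Either pair a[i-1] with b[j-1], or open a gap in one sequence.
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def closest_species(read: str, references: dict[str, str]) -> str:
    """Return the name of the reference whose sequence aligns best to the read."""
    return max(references, key=lambda name: alignment_score(read, references[name]))


# Hypothetical toy data: real reference genes are far longer than this.
references = {
    "species_A": "ACGTACGTAC",
    "species_B": "TTGACCGATA",
}
print(closest_species("ACGTTCGTAC", references))  # prints species_A
```

Real analyses typically rely on established alignment tools and curated reference databases; the sketch only shows the underlying idea of ranking candidate species by alignment score.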


To be eligible for the Pre-College Computational Biology program, students must:

  • Be at least 16 years old by the program start date (to participate in the residential program).
  • Be a current sophomore or junior in high school.
    • The program is typically for juniors; however, some sophomores will be accepted.
  • Have an academic average of B (3.0/4.0) or better.
  • Have a strong interest in math, biology, and computer science.
    • Previous exposure to computational biology isn't required.
  • Have programming experience; no specific language is required.

Application Requirements

The complete application for the Pre-College Computational Biology program consists of the following:

  • Completed Online Application
  • Unofficial Transcript
  • Standardized Test Scores (required)
  • One Letter of Recommendation
  • Responses to the following essay prompts (300-500 words):
    • What do you hope to gain from participating in the Carnegie Mellon Pre-College program?
    • Why are you interested in studying Computational Biology?

Frequently Asked Questions

What is computational biology?

Great question! The short answer is the application of high-powered computational approaches to analyze biological or medical datasets. For a lengthier explanation, check out our post from Dr. Bob Murphy.

What do students need to bring?

We will provide all supplies needed for the pre-college program. The computational components of the program will be completed in university computer clusters, so students will not need to bring their own computers.