Fall 2022
September 9: Lukas Heinrich (Technical University Munich)
Title: Systematic Uncertainties in Frequentist Analysis at the LHC
[Heinrich Talk Recording] [Heinrich Talk Slides]
Abstract: The precise measurement of the basic building blocks of matter - the particles of the Standard Model of Particle Physics - as well as the search for potential new particles beyond the Standard Model presents a formidable statistical challenge. The complexity and volume of the data collected at the Large Hadron Collider require careful modeling not only of the primary sought-after phenomenon but also a precise assessment of uncertainties related to the modeling of possible backgrounds. This work is complicated by two facts: particle physics at its core does not admit a closed-form likelihood model and therefore requires likelihood-free approaches, and the highly distributed nature of large-scale collaborations calls for collaborative statistical modeling tools. In this talk I will give a broad overview of how statistical modeling is currently performed at the LHC and discuss ideas that exploit recent advances in Machine Learning to go beyond the existing methodology.
Bio: Lukas Heinrich is a particle physicist and professor for data science in physics at the Technical University of Munich. He is a long-time member of the ATLAS Collaboration at the Large Hadron Collider at CERN, where he is searching for phenomena beyond the Standard Model of Particle Physics and is engaged in computational, statistical and machine-learning methods research. He is one of the main developers of the statistics tool pyhf, which paved the way for a public release of the complex statistical models that underpin the data analyses at the LHC.
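For readers unfamiliar with pyhf, below is a minimal sketch of the kind of binned statistical model it builds and the frequentist hypothesis test it performs. The bin counts and uncertainties are illustrative placeholders, not values from any real ATLAS analysis, and the API shown assumes pyhf ≥ 0.6.

```python
import pyhf

# A simple two-bin counting model: signal plus a background whose per-bin
# uncertainty is modeled with constrained nuisance parameters.
# The numbers are illustrative only.
model = pyhf.simplemodels.uncorrelated_background(
    signal=[5.0, 10.0],
    bkg=[50.0, 52.0],
    bkg_uncertainty=[3.0, 7.0],
)

# Observed counts in each bin, plus the auxiliary data encoding the
# background constraints.
observations = [53.0, 65.0] + model.config.auxdata

# Frequentist hypothesis test of the nominal signal strength (mu = 1),
# returning the observed CLs value.
cls_obs = pyhf.infer.hypotest(1.0, observations, model, test_stat="qtilde")
print("Observed CLs for mu=1:", float(cls_obs))
```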
October 14: Benjamin Nachman (Lawrence Berkeley National Laboratory)
Title: Building Robust Deep Learning Methods for High Energy Physics
[Nachman Talk Recording] [Nachman Talk Slides]
Abstract: Deep Learning is becoming a widely used tool in High Energy Physics to enhance the search for new elementary particles and forces of nature. However, great care is required to design new methods that are robust in order to perform reliable statistical tests with our data (e.g., preventing false claims of discovery!). In this talk, I will provide examples from high energy physics related to uncertainty- and inference-aware deep learning and draw a connection to algorithmic fairness and related topics in the statistics and machine learning literature.
Bio: Ben Nachman is a Staff Scientist in the Physics Division at LBNL where he is the group leader of the cross-cutting Machine Learning for Fundamental Physics group. He was a Churchill Scholar at Cambridge University and then received his Ph.D. in Physics and Ph.D. minor in Statistics from Stanford University. After graduating, he was a Chamberlain Fellow in the Physics Division at Berkeley Lab. Nachman develops, adapts, and deploys machine learning algorithms to enhance data analysis in high energy physics. He is a member of the ATLAS Collaboration at CERN.
October 27: STAMPS-NSF AI Planning Institute Joint Seminar - Kaze Wong (Flatiron Institute)
Title: Challenges and Opportunities from Gravitational Waves: Data Scientists on a Diet
[Wong Talk Recording]
Abstract: The gravitational wave (GW) community has made numerous exciting discoveries in the past 7 years, from the first detection to a catalog of ~80 GW events containing all sorts of surprises, such as binary neutron stars and neutron star-black hole mergers. In the coming decade, next-generation facilities such as the third-generation GW detector network and space-based GW observatories will provide many more surprising events. Quite a number of open modeling and data analysis problems in GW remain to be solved in order to unlock the full potential of next-generation detections. Despite the recent rapid development of machine learning and efforts to apply it to these problems, GW data have a number of traits that make applying machine learning difficult. In this talk, I will discuss a number of challenges and opportunities in GW, and some insights from GW on how we should apply modern techniques such as machine learning to physical science in general.
Bio: Kaze Wong is a research fellow studying black holes through gravitational waves at the Flatiron Institute. He graduated from the physics and astronomy department of Johns Hopkins University in 2021, and he is the recipient of the 2021 GWIC-Braccini Prize. Kaze's research centers around the intersection between physical science and deep learning. He is particularly interested in building production-grade hybrid methods to take on challenges in physical science.
December 9: STAMPS-ISSI Joint Seminar - Rebecca Willett (University of Chicago)
Title: Machine Learning for Inverse Problems in Climate Science
[Willett Talk Recording] [Willett Talk Slides]
Abstract: Machine learning has the potential to transform climate research. This fundamental change cannot be realized through the straightforward application of existing off-the-shelf machine learning tools alone. Rather, we need novel methods for incorporating physical models and constraints into learning systems. In this talk, I will discuss two inverse problems central to climate science, data assimilation and simulator model fitting, and how machine learning yields methods with high predictive skill and computational efficiency. First, I will describe a machine learning framework for learning dynamical systems in data assimilation. Our auto-differentiable ensemble Kalman filters blend ensemble Kalman filters for state recovery with machine learning tools for learning the dynamics. In doing so, our methods leverage the ability of ensemble Kalman filters to scale to high-dimensional states and the power of automatic differentiation to train high-dimensional surrogate models for the dynamics. Second, I will describe learning emulators of high-dimensional climate forecasting models for parameter estimation with uncertainty quantification. We assume access to a computationally expensive climate simulator that takes a candidate parameter as input and outputs a corresponding multichannel time series. Our task is to accurately estimate a range of likely values of the underlying parameters that best fit the data. Our framework learns feature embeddings of observed dynamics jointly with an emulator that can replace high-cost simulators for parameter estimation. These methods build upon insights from inverse problems, data assimilation, stochastic filtering, and optimization, highlighting how theory can inform the design of machine learning systems in the natural sciences.
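For context on the ensemble Kalman filters mentioned in the abstract, below is a minimal NumPy sketch of a single stochastic (perturbed-observation) analysis step, the classical building block that auto-differentiable filters wrap in a differentiable framework. The linear observation operator, toy dimensions, and random data are illustrative assumptions; the learned-dynamics machinery discussed in the talk is not shown.

```python
import numpy as np

def enkf_analysis(ensemble, y_obs, H, R, rng):
    """One stochastic (perturbed-observation) EnKF analysis step.

    ensemble : (n_state, n_members) forecast ensemble
    y_obs    : (n_obs,) observation vector
    H        : (n_obs, n_state) linear observation operator
    R        : (n_obs, n_obs) observation-noise covariance
    """
    n_state, n_members = ensemble.shape

    # Ensemble mean and anomalies in state and observation space.
    x_mean = ensemble.mean(axis=1, keepdims=True)
    A = ensemble - x_mean
    HA = H @ A

    # Sample cross- and observation-space covariances, then the Kalman gain.
    P_xy = A @ HA.T / (n_members - 1)
    P_yy = HA @ HA.T / (n_members - 1) + R
    K = P_xy @ np.linalg.inv(P_yy)

    # Perturb the observations for each member and update the ensemble.
    obs_noise = rng.multivariate_normal(np.zeros(len(y_obs)), R, size=n_members).T
    innovations = (y_obs[:, None] + obs_noise) - H @ ensemble
    return ensemble + K @ innovations

# Toy usage: 4-dimensional state, 2 observed components, 20 ensemble members.
rng = np.random.default_rng(0)
ens = rng.normal(size=(4, 20))
H = np.array([[1.0, 0.0, 0.0, 0.0],
              [0.0, 1.0, 0.0, 0.0]])
R = 0.1 * np.eye(2)
updated = enkf_analysis(ens, np.array([0.5, -0.2]), H, R, rng)
```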
Bio: Rebecca Willett is Professor of Statistics and Computer Science & Director of AI at the Data Science Institute at the University of Chicago, with a courtesy appointment at the Toyota Technological Institute at Chicago. She is also the faculty lead of the AI+Science Postdoctoral Fellow program. Her work in machine learning and signal processing reflects broad and interdisciplinary expertise and perspectives. She is known internationally for her contributions to the mathematical foundations of machine learning, large-scale data science, and computational imaging.
In particular, Prof. Willett studies methods to learn and leverage hidden structure in large-scale datasets; representing data in terms of these structures allows ML methods to produce more accurate predictions when data contain missing entries, are subject to constrained sensing or communication resources, correspond to rare events, or reflect indirect measurements of complex physical phenomena. These challenges are pervasive in science and technology data, and Prof. Willett’s work in this space has had important implications in national security, medical imaging, materials science, astronomy, climate science, and several other fields. She has published nearly two hundred book chapters and scientific articles in top-tier journals and conferences at the intersection of machine learning, signal processing, statistics, mathematics, and optimization. Her group has made contributions both in the mathematical foundations of signal processing and machine learning and in their application to a variety of real-world problems.
For her full bio, please see her webpage.