May 16, 2025
Brian Nord - Fermilab
[Nord Talk Recording] [Nord Talk Slides]
Location: Zoom
Title: Simulation-based Inference and the Design and Operation of Science Experiments
Abstract: Simulation-based inference (SBI) is a highly efficient approach for inferring expressive density distributions in many areas of research – economics, climate, population genetics, physics, astronomy. Moreover, SBI has potential application not only in measuring properties of nature in those areas, but also in the design of experiments – including instruments and data acquisition. However, some key challenges remain before SBI will be ready for these tasks – e.g., domain adaptation, trustworthy/credible uncertainty quantification, and high-dimensional parameter space sampling. In this talk, I will discuss some work by my group and the rest of the community in working toward these goals.
Bio: Brian Nord’s work focuses on how to improve the ways in which we make scientific discoveries: developing algorithms, building statistical models, and auto-designing experiments. Brian started his career in large-scale structure cosmology, analyzing galaxy clusters and strong gravitational lenses. More recently, he has been exploring the potential of AI algorithms to address critical challenges in cosmological data analysis. Currently, he is integrating AI with rigorous statistical methods and using this to aid in the design of scientific experiments.
April 25, 2025
Stefano Castruccio - University of Notre Dame
[Castruccio Talk Recording] [Castruccio Talk Slides]
Location: Zoom
Title: New Perspectives on Balancing Physics with Data-Driven Models: the Case for Physics Informed Neural Networks in Environmental Statistics
Abstract: The idea of performing data analysis by leveraging physical information with a data-driven model has a long history in environmental statistics. Physical-statistical models are predicated on the idea that hierarchical Bayesian models could have a spatio-temporal process informed by a partial differential equation (PDE) which expresses some well-known physical information about the system. The machine learning literature has recently focused on the same problem by proposing a different yet related solution: instead of devising a purely data-driven neural network, inference can be penalized by means of a PDE expressing the physics of the system. This approach allows for “soft” constraints on the model instead of the “hard” specification of the process dynamics used in physical-statistical models. In this talk, I will discuss two recent works on this topic developed by my research group, and discuss the relative merits of this new approach from the perspective of a statistician. The first work will focus on a deep double reservoir model informed by the two-dimensional incompressible Navier-Stokes equations, while the second one will discuss the link between a PDE-driven penalty and physics-informed priors. I will also briefly discuss some of my recent work on physics-informed convolutional autoencoders and transformers with attention mechanisms.
Bio: Stefano Castruccio is the Notre Dame Collegiate Associate Professor in Statistics at the University of Notre Dame. He obtained his PhD in 2013 at the University of Chicago, and was later a postdoctoral fellow at King Abdullah University of Science and Technology (Saudi Arabia) and a Lecturer at Newcastle University (UK) before moving to his current institution. He works on spatio-temporal models for complex environmental problems, from air pollution to climate change.
April 25, 2025
Joel Leja - The Pennsylvania State University
[Leja Talk Recording] [Leja Talk Slides]
Location: Steinberg Auditorium (BH A53) + Zoom
Title: Rapid inference of galaxy properties in the age of deep and large-scale surveys of the universe
Abstract: The inference of the physical properties of galaxies at cosmological distances requires modeling a wide range of physics, including e.g. stellar evolution and atmospheres; dust attenuation and re-emission; nebular physics; and AGN emission. Bayesian inference is often used to map the inevitable degeneracies, and the large amount of physics and wide parameter space mean these codes are typically not fast (~1-10 hours/object). Yet current and near-future surveys of the universe will yield spectra for millions of galaxies and imaging for billions. I will discuss the tactics employed to speed up these codes, ranging from neural net emulators of key physics (photoionization modeling; stellar spectra) to efficient gradient-enhanced GPU-accelerated high-dimensional sampling to rapid simulation-based inference. These yield speed-ups of somewhere between 100x and 100,000x, with unavoidable trade-offs in flexibility and accuracy. I will discuss applications of these techniques to model modern astronomical data, including both industrial-scale modeling of galaxy observations and newly possible directions such as spatially resolved galaxy modeling. Finally, time permitting, I will discuss some of the exciting new discoveries made with these techniques in the very distant universe seen by JWST.
Bio: Joel Leja is the Dr. Keiko Miwa Ross Early Career Endowed Faculty Chair and an Assistant Professor of astronomy and astrophysics at Penn State University. His research aims to understand how galaxies form using large ground- and space-based telescopes, large surveys, and fast computers. He specializes in modeling observations of distant galaxies and in data-intensive astrophysical methodologies. Joel was named a Clarivate Highly Cited Researcher in 2023 and in 2024 (top 1% of cited researchers in astrophysics), and was awarded Yale University's Brouwer Prize in 2019 for a PhD thesis of unusual merit.