Carnegie Mellon University

STAMPS@CMU

STAtistical Methods for the Physical Sciences Research Center

Fall 2021

September 10: Doug Nychka (Department of Applied Mathematics and Statistics, Colorado School of Mines)

Title: Climate models, large spatial datasets, and harnessing deep learning for a statistical computation
[Nychka Talk Recording] [Nychka Talk Slides]

Abstract: Numerical simulations of the motion and state of the Earth’s atmosphere and ocean yield large and complex data sets that require statistics for their interpretation. Typically, climate and weather variables are in the form of space-time fields, and it is useful to describe their dependence using methods from spatial statistics. Running through these problems is the need to estimate covariance functions over space and time and to account for the fact that the covariance may not be stationary. This talk focuses on a new computational technique for fitting covariance functions using maximum likelihood. Estimating local covariance functions is a useful way to represent spatial dependence but is computationally intensive because it requires optimizing a local likelihood over many windows of the spatial field. Thus the problem we tackle here involves numerous (tens of thousands of) small spatial estimation problems, in contrast to other research that attempts a single, global estimate for a massive spatial data set. In this work we show how a neural network (aka deep learning) model can be trained to give accurate maximum likelihood estimates based on the spatial field or its empirical variogram. Why train a neural network to reproduce a statistical estimate? The advantage is that the neural network model evaluates very efficiently, giving speedups on the order of a factor of a hundred or more. In this way computations that could take hours are reduced to minutes or tens of seconds, facilitating a more flexible and iterative approach to building spatial statistical models. An example of local covariance modeling is given using the large ensemble experiment created by the National Center for Atmospheric Research.

See: Gerber, Florian, and Douglas Nychka. “Fast covariance parameter estimation of spatial Gaussian process models using neural networks.” Stat 10.1 (2021): e382.
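As a rough, simplified illustration of the idea (and not the authors' implementation), the sketch below trains a small network to map empirical variograms of simulated 10 x 10 Gaussian-process windows to a single exponential-covariance range parameter; the covariance model, window size, and use of scikit-learn's MLPRegressor are assumptions made for brevity.

```python
# A hypothetical, simplified sketch of the approach: train a small network to map
# empirical variograms of simulated spatial windows to a covariance range parameter,
# so that one fast forward pass stands in for a local maximum-likelihood fit.
# The exponential covariance, 10 x 10 window, and MLPRegressor are assumptions.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
n = 10                                            # one 10 x 10 spatial window
xy = np.array([(i, j) for i in range(n) for j in range(n)], dtype=float)
d = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=-1)    # pairwise distances
bins = np.linspace(0.5, 8.5, 9)                   # variogram distance bins

def empirical_variogram(z):
    """Binned empirical semivariogram of one simulated field."""
    sq = 0.5 * (z[:, None] - z[None, :]) ** 2
    return np.array([sq[(d >= lo) & (d < lo + 1)].mean() for lo in bins])

# Simulate training fields with known range parameters (exponential covariance).
thetas = rng.uniform(0.5, 5.0, size=400)
X, y = [], []
for theta in thetas:
    C = np.exp(-d / theta)
    z = np.linalg.cholesky(C + 1e-8 * np.eye(n * n)) @ rng.standard_normal(n * n)
    X.append(empirical_variogram(z))
    y.append(theta)

net = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000, random_state=0)
net.fit(np.array(X), np.log(y))                   # predict log-range for stability

# One forward pass now replaces one local likelihood optimization per window.
z_new = np.linalg.cholesky(np.exp(-d / 2.5) + 1e-8 * np.eye(n * n)) @ rng.standard_normal(n * n)
print("estimated range:", np.exp(net.predict([empirical_variogram(z_new)])[0]))
```

Once trained, evaluating the network on each of the tens of thousands of windows costs only a forward pass, which is where the reported factor-of-a-hundred speedup over repeated local likelihood optimization comes from.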



Bio: Douglas Nychka is a statistician and data scientist whose areas of research include the theory, computation, and application of curve and surface fitting, with a focus on geophysical and environmental applications. Currently he is a Professor in the Department of Applied Mathematics and Statistics at the Colorado School of Mines and Senior Scientist Emeritus at the National Center for Atmospheric Research (NCAR), Boulder, Colorado. Before moving to Mines he directed the Institute for Mathematics Applied to Geosciences at NCAR. His current research focuses on efficient computation of spatial statistics methods for large data sets and the migration of these methods into easy-to-use R packages. He is a Fellow of the American Statistical Association and the Institute of Mathematical Statistics.


October 8: Yang Chen (Department of Statistics, University of Michigan)

Title: Matrix Completion Methods for the Total Electron Content Video Reconstruction
[Chen Talk Recording] [Chen Talk Slides]

Abstract: Total electron content (TEC) maps can be used to estimate the signal delay of GPS due to the ionospheric electron content between a receiver and a satellite. This delay can result in GPS positioning error, so it is important to monitor the TEC maps. The observed TEC maps have large patches of missingness over the oceans and scattered small areas of missingness over land. In this work, we propose several extensions of existing matrix completion algorithms to achieve TEC map reconstruction, accounting for spatial smoothness and temporal consistency while preserving important structures of the TEC maps. We call the proposed method Video Imputation with SoftImpute, Temporal smoothing and Auxiliary data (VISTA). Numerical simulations that mimic patterns of real data are given. We show that our proposed method achieves better reconstructed TEC maps than existing methods in the literature. Our proposed computational algorithm is general and can be readily applied to other problems besides TEC map reconstruction. If time allows, ongoing efforts on prediction models for TEC maps will be briefly discussed.
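For readers unfamiliar with the SoftImpute building block that VISTA extends, here is a minimal sketch of plain SoftImpute on a single matrix; the temporal-smoothing and auxiliary-data terms of VISTA are omitted, and the function name and toy data are hypothetical.

```python
# Minimal SoftImpute-style sketch (not the VISTA implementation): iteratively fill
# missing entries with a soft-thresholded SVD reconstruction. VISTA's temporal-
# smoothing and auxiliary-data terms are omitted; the toy data are hypothetical.
import numpy as np

def soft_impute(X, lam=1.0, n_iter=200, tol=1e-5):
    """Complete matrix X (NaN = missing) by soft-thresholding singular values."""
    mask = ~np.isnan(X)
    Z = np.zeros_like(X)
    for _ in range(n_iter):
        filled = np.where(mask, X, Z)                  # observed data + current fill
        U, s, Vt = np.linalg.svd(filled, full_matrices=False)
        Z_new = (U * np.maximum(s - lam, 0.0)) @ Vt    # soft-threshold singular values
        if np.linalg.norm(Z_new - Z) < tol * max(np.linalg.norm(Z), 1.0):
            Z = Z_new
            break
        Z = Z_new
    return np.where(mask, X, Z)                        # keep observed entries exactly

# Toy example: a low-rank "map" with one large missing block, as in the ocean gaps.
rng = np.random.default_rng(1)
truth = rng.standard_normal((30, 2)) @ rng.standard_normal((2, 40))
obs = truth.copy()
obs[5:15, 10:25] = np.nan
rec = soft_impute(obs, lam=0.5)
print("RMSE on missing block:",
      np.sqrt(np.mean((rec[5:15, 10:25] - truth[5:15, 10:25]) ** 2)))
```

The soft-thresholded SVD is what imposes the low-rank structure on the reconstruction; as its name indicates, VISTA adds temporal-smoothing and auxiliary-data terms on top of this basic iteration.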

Bio: Yang Chen received her Ph.D. (2017) in Statistics from Harvard University and joined the University of Michigan as an Assistant Professor of Statistics and Research Assistant Professor at the Michigan Institute for Data Science (MIDAS). She received her B.A. in Mathematics and Applied Mathematics from the University of Science and Technology of China. Her research interests include computational algorithms in statistical inference and applied statistics in the fields of biology and astronomy.


November 12: Glen Cowan (Department of Physics, Royal Holloway, University of London)

Title: Errors on Errors: Refining Particle Physics Analyses with the Gamma Variance Model
[Cowan Talk Recording] [Cowan Talk Slides]

Abstract: In a statistical analysis in Particle Physics, one faces two distinct challenges: the limited number of particle collisions and imperfections in the model itself, corresponding to “statistical” and “systematic” errors in the result. To combat the modeling uncertainties one includes nuisance parameters, whose best estimates are often treated as Gaussian distributed with given standard deviations. The appropriate values for these standard deviations are, however, often the subject of heated argument, which is to say that the uncertainties themselves are uncertain.

A type of model is presented in which estimates of the systematic variances are modeled as gamma-distributed variables. The resulting confidence intervals show interesting and useful properties. For example, when averaging measurements to estimate their mean, the size of the confidence interval increases for decreasing goodness-of-fit, and averages have reduced sensitivity to outliers. The basic properties of the model are presented and several examples relevant for Particle Physics are explored.
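A small numerical toy of this behavior is sketched below. This is not Cowan's code, and the particular gamma parameterization (shape 1/r², so the reported variance has relative width r) is a simplifying assumption; the point is only that maximizing over the per-measurement variances gives an average that is pulled less by an outlier than the ordinary inverse-variance weighted average, which corresponds to the r → 0 limit.

```python
# Numerical toy, not Cowan's code: y_i ~ Gauss(mu, sigma_i^2), and the reported
# variance v_i is gamma distributed with mean sigma_i^2 and relative width r
# (shape 1/r^2 is an assumed parameterization). Maximizing over the sigma_i^2
# numerically shows the reduced pull of an outlier relative to the usual average.
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm, gamma

y = np.array([1.0, 1.1, 0.9, 3.0])        # last measurement is an outlier
v = np.full(4, 0.1 ** 2)                  # reported variances (std = 0.1)
r = 0.2                                   # relative uncertainty on each variance

def neg_log_like(params):
    mu, *log_sig2 = params
    sig2 = np.exp(log_sig2)
    a = 1.0 / r ** 2                      # gamma shape: relative std of v_i is r
    ll = norm.logpdf(y, mu, np.sqrt(sig2)).sum()
    ll += gamma.logpdf(v, a, scale=sig2 / a).sum()
    return -ll

start = np.concatenate([[y.mean()], np.log(v)])
fit = minimize(neg_log_like, start, method="Nelder-Mead",
               options={"maxiter": 20000, "xatol": 1e-8, "fatol": 1e-8})
print("gamma-variance average:", fit.x[0])                            # pulled less toward 3.0
print("inverse-variance average:", np.average(y, weights=1.0 / v))    # r -> 0 limit
```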

Bio: Glen Cowan received his Ph.D. in Physics in 1988 from the University of California, Berkeley, followed by postdoc positions in Munich and Siegen working on electron-positron collisions at LEP (CERN), with a research focus on Quantum Chromodynamics (multijet production, measurements of alpha_s, properties of hadronic Z decays). Since 1998 he has been a faculty member in the Department of Physics, Royal Holloway, University of London. His research in High Energy Physics has involved experiments on proton-proton collisions at the Large Hadron Collider (CERN), with a focus on the application and development of statistical methods.


December 3: Elizabeth Barnes (Department of Atmospheric Science, Colorado State University)

Title: Benefits of saying “I Don’t Know” when analyzing and modeling the climate system with ML
[Barnes Talk Recording] [Barnes Talk Slides]

Abstract: The atmosphere is chaotic. This fundamental property of the climate system makes forecasting the weather incredibly challenging: weather models cannot be expected to provide perfect predictions of the Earth system beyond timescales of approximately two weeks. Instead, atmospheric scientists look for specific states of the climate system that lead to more predictable behavior than others. Here, we demonstrate how neural networks can be used not only to leverage these states to make skillful predictions, but also to identify the climatic conditions that lead to enhanced predictability. We introduce a novel loss function, termed “abstention loss”, that allows neural networks to identify forecasts of opportunity for regression and classification tasks. The abstention loss works by incorporating uncertainty in the network’s prediction to identify the more confident samples and abstain (say “I don’t know”) on the less confident samples. Once the more confident samples are identified, explainable AI (XAI) methods are then applied to explore the climate states that exhibit more predictable behavior.
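As a loose illustration of how such a loss can work for classification (a generic sketch, not necessarily the exact loss used in the talk), the toy below gives the network an extra "I don't know" output: confident samples are trained on a renormalized cross-entropy, while a penalty term keeps the network from abstaining on everything.

```python
# Generic abstention-loss sketch for classification (illustrative only; the exact
# loss used in the talk may differ). The network emits K class probabilities plus
# one extra "I don't know" probability; confident samples are trained on a
# renormalized cross-entropy, and the penalty alpha discourages always abstaining.
import numpy as np

def abstention_loss(probs, labels, alpha=0.5, eps=1e-12):
    """probs: (n, K+1) softmax outputs, last column = abstain; labels: (n,) ints."""
    p_abs = probs[:, -1]
    p_true = probs[np.arange(len(labels)), labels]
    cross_entropy = -np.log(p_true / (1.0 - p_abs) + eps)   # renormalized over K classes
    penalty = -np.log(1.0 - p_abs + eps)                    # cost of saying "I don't know"
    return np.mean((1.0 - p_abs) * cross_entropy + alpha * penalty)

# Three toy samples, all with true class 0: a confident correct prediction, an
# abstaining prediction, and a confident wrong prediction.
probs = np.array([[0.90, 0.05, 0.05],
                  [0.30, 0.20, 0.50],
                  [0.30, 0.65, 0.05]])
labels = np.array([0, 0, 0])
for row, lab in zip(probs, labels):
    print(round(abstention_loss(row[None, :], np.array([lab])), 3))
```

In the toy evaluation, abstaining on an ambiguous sample costs less than confidently committing to the wrong class but more than a confident correct answer, which is the incentive structure that lets a network flag forecasts of opportunity.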

Bio: Dr. Elizabeth (Libby) Barnes is an associate professor of Atmospheric Science at Colorado State University. She joined the CSU faculty in 2013 after obtaining dual B.S. degrees (Honors) in Physics and Mathematics from the University of Minnesota, obtaining her Ph.D. in Atmospheric Science from the University of Washington, and spending a year as a NOAA Climate & Global Change Fellow at the Lamont-Doherty Earth Observatory. Professor Barnes' research is largely focused on climate variability and change and the data analysis tools used to understand it. Topics of interest include earth system predictability, jet-stream dynamics, Arctic-midlatitude connections, subseasonal-to-decadal (S2D) prediction, and data science methods for earth system research (e.g. machine learning, causal discovery).

She teaches graduate courses on fundamental atmospheric dynamics and on data science and statistical analysis methods. Professor Barnes is involved in a number of research community activities. In addition to being a lead of the US CLIVAR Working Group: Emerging Data Science Tools for Climate Variability and Predictability, a member of the National Academies’ Committee on Earth Science and Applications from Space, a funded member of the NSF AI Institute for Research on Trustworthy AI in Weather, Climate and Coastal Oceanography (AI2ES), and a member of the Steering Committee of the CSU Data Science Research Institute, she recently completed her term as lead of the NOAA MAPP S2S Prediction Task Force (2016-2020).