Snigdha Panigrahi

Assistant Professor of Statistics

University of Michigan

About me

I am a statistician with wide-ranging interests, directed towards understanding big, diverse and complex data. I joined the University of Michigan in 2018, the same year I completed my doctoral thesis at Stanford University under the supervision of Jonathan Taylor.

Broadly, my work explores the potential of new methodology at the intersections of machine learning, probabilistic modeling, statistical inference and genomics. A special emphasis is on reconciling inferential coherence with adaptive modeling of data, a tension made inevitable by the richness and scale of modern data analyses. I delivered a tutorial on recent advances in this rapidly evolving field of research at the Simons Institute for the Theory of Computing in 2018; Tutorial-I, Tutorial-II. I gratefully acknowledge support through NSF-DMS 1951980 and NSF-DMS 2113342 for some of my current investigations.

In my pursuit of scalable, interpretable and reproducible methods, I greatly enjoy exciting interdisciplinary collaborations. Some of my recent work with multidisciplinary teams of data scientists is listed here: Projects.

Ongoing research themes

Structure-aware inference anchored in data-assisted learning

An overarching focus is on prudently reusing data in order to exploit complex patterns in big data for reproducible modeling and meaningful inferential reports. My recent activities draw direct motivation from how massive structured data are collected, stored and deployed in realistic pipelines. Specifically, methods dubbed data carving offer a data-efficient path forward when adaptive inference must be updated upon the availability of fresh samples or the arrival of a new dataset. Proposals investigating mathematical properties and computational algorithms around data carving are highlighted in blue.
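As a point of reference for what carving improves upon, the simpler data-splitting strategy reserves part of the sample for selection and the rest for classical inference. The sketch below is my own minimal illustration on simulated data, not an implementation of carving itself; the marginal screening rule and its threshold of 0.8 are arbitrary choices made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 200, 10
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:2] = [2.0, -1.5]                      # two true signals
y = X @ beta + rng.standard_normal(n)

# Stage 1 (selection): screen variables on the first half of the sample.
X1, y1 = X[: n // 2], y[: n // 2]
X2, y2 = X[n // 2 :], y[n // 2 :]
score = np.abs(X1.T @ y1) / (n // 2)        # marginal association with y
selected = np.flatnonzero(score > 0.8)      # arbitrary screening threshold

# Stage 2 (inference): classical OLS on the held-out half, which is
# valid because the holdout never influenced the selection.
Xs = X2[:, selected]
coef, *_ = np.linalg.lstsq(Xs, y2, rcond=None)
resid = y2 - Xs @ coef
sigma2 = resid @ resid / (Xs.shape[0] - Xs.shape[1])
se = np.sqrt(sigma2 * np.diag(np.linalg.inv(Xs.T @ Xs)))
for j, b, s in zip(selected, coef, se):
    print(f"X{j}: estimate {b:.2f}, 95% CI ({b - 1.96*s:.2f}, {b + 1.96*s:.2f})")
```

Carving, by contrast, conditions on the selection event so that the stage-1 data can be reused at the inference stage rather than discarded, which is where its data efficiency comes from.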

Stochastic processes and their geometric properties

My work in this domain explores the behavior of stochastic processes through relevant probabilistic and geometric characteristics, such as maximal moments, path regularity properties and their phase-transition boundaries, and integral geometric formulae. This understanding finds applications in the analysis of cosmological data and in heavy-tailed modeling of real data. My contributions here rely on a combination of ergodic, group-theoretic and probabilistic techniques.

Interdisciplinary investigations in data science 

Multidisciplinary projects with a specific focus on statistical genomics are explored under this theme. These projects arise from collaborative efforts that extend well beyond the walls of my home department. Identifying biological underpinnings, accurately assessing their uncertainties in data-mined models, and efficiently consolidating confirmatory reports in integrative domains define some of my active interests here.

Publications, preprints and software

Sifan Liu and Snigdha Panigrahi. Selective Inference with Distributed Data. 2023. [arxiv]

Snigdha Panigrahi, Kevin Fry, and Jonathan Taylor. Exact Selective Inference with Randomization. 2022. [arxiv]

Snigdha Panigrahi, Shariq Mohammad, Arvind Rao, and Veerabhadran Baladandayuthapani. Integrative Bayesian models using Post-selective Inference: a case study in Radiogenomics. 2022. Biometrics (Forthcoming). [arxiv]

Snigdha Panigrahi, Natasha Stewart, Chandra Sripada, and Elizaveta Levina. Selective Inference for Sparse Multitask Regression with Applications in Neuroimaging. 2022. [arxiv]

Snigdha Panigrahi and Jonathan Taylor. Approximate selective inference via maximum likelihood. 2022. Journal of the American Statistical Association [arxiv]; [publication]

Snigdha Panigrahi, Jingshen Wang, and Xuming He. Treatment Effect Estimation via Efficient Data Aggregation. 2022. [arxiv]

Snigdha Panigrahi, Jonathan Taylor, and Asaf Weinstein. Integrative methods for post-selection inference under convex constraints. 2021. Annals of Statistics. [arxiv]; [publication]

Snigdha Panigrahi, Parthanil Roy, and Yimin Xiao. Maximal Moments and Uniform Modulus of Continuity for Stable Random Fields. 2021. Stochastic processes and their applications. [arxiv]; [publication]

Snigdha Panigrahi, Peter W. Macdonald, and Daniel Kessler. Inference post selection of Group-sparse Regression Models. 2020. [arxiv]

Basil Saeed, Snigdha Panigrahi, and Caroline Uhler. Causal Structure Discovery from Distributions Arising from Mixtures of DAGs. 2020. International Conference on Machine Learning. [arxiv]; [publication]

Snigdha Panigrahi, Junjie Zhu, and Chiara Sabatti. Selection-adjusted inference: an application to confidence intervals for cis-eQTL effect sizes. 2019. Biostatistics. [arxiv]; [publication]

Snigdha Panigrahi. Carving model-free inference. 2019. [arxiv]

Qingyuan Zhao and Snigdha Panigrahi. Selective Inference for Effect Modification: An Empirical Investigation. 2019. Observational Studies: Special issue devoted to ACIC. [arxiv]; [publication]

Snigdha Panigrahi, Nadia Fawaz, and Ajith Pudhiyaveetil. Temporal Evolution of Behavioral User Personas via Latent Variable Mixture models. 2019. IUI Workshops on Exploratory Search and Interactive Data Analytics. [arxiv]; [publication]

Snigdha Panigrahi, Jonathan Taylor, and Sreekar Vadlamani. Kinematic Formula for Heterogeneous Gaussian Related Fields. 2018. Stochastic processes and their applications. [arxiv]; [publication]

Snigdha Panigrahi and Jonathan Taylor. Scalable methods for Bayesian selective inference. 2018. Electronic Journal of Statistics. [arxiv]; [publication]

Snigdha Panigrahi, Jelena Markovic, and Jonathan Taylor. An MCMC-free approach to post-selective inference. 2017. [arxiv]

Xiaoying Tian Harris, Snigdha Panigrahi, Jelena Markovic, Nan Bi, and Jonathan Taylor. Selective sampling after solving a convex problem. 2016. [arxiv]

Software

My contributions to software development in the field of selective inference can be tracked here: [Github]



STATS 600 - Linear Models

Graduate Level Course - Core class for the first-year Ph.D. program

Aug 2020 – Present

The course covers the following topics.

  • A comprehensive treatment of linear models for independent observations using least squares estimation, including both simple and multiple regression
  • Some regression methods for dependent data
  • Automated variable selection among a competing class of linear models, including L1- and L2-penalized methods
  • Permutation tests and bootstrapped regression methods
  • Challenges from a modern regression (learning) perspective: multiple comparisons, multiple testing, and post-selection inference.
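The contrast between L1 and L2 penalties in the topics above can be seen in a small simulation. The sketch below is my own illustration, not course material: it fits a hand-rolled coordinate-descent lasso alongside a closed-form ridge solution, with the penalty levels (0.3 and 10.0) chosen arbitrarily for the example.

```python
import numpy as np

def soft_threshold(z, t):
    """Soft-thresholding operator, the building block of lasso updates."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_cd(X, y, lam, n_sweeps=200):
    """Coordinate descent for (1/2n)||y - Xb||^2 + lam * ||b||_1."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0)
    for _ in range(n_sweeps):
        for j in range(p):
            r = y - X @ beta + X[:, j] * beta[j]   # partial residual
            beta[j] = soft_threshold(X[:, j] @ r, n * lam) / col_sq[j]
    return beta

rng = np.random.default_rng(1)
n, p = 100, 8
X = rng.standard_normal((n, p))
beta_true = np.array([3.0, 0, 0, 1.5, 0, 0, 0, 0])   # sparse truth
y = X @ beta_true + rng.standard_normal(n)

beta_l1 = lasso_cd(X, y, lam=0.3)
beta_l2 = np.linalg.solve(X.T @ X + 10.0 * np.eye(p), X.T @ y)  # ridge
print("lasso support:", np.flatnonzero(np.abs(beta_l1) > 1e-8))
print("ridge coefficients:", np.round(beta_l2, 2))
```

The L1 penalty sets many coefficients exactly to zero, performing selection, while the L2 penalty only shrinks coefficients towards zero without eliminating any of them.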

STATS 280 - Honors Introduction to Statistics & Data Analysis

Lower-level Undergraduate Course

Aug 2020 – Present

Offered previously: Fall 2018-19, Fall 2019-20

The course is co-taught in the Winter semesters (and was largely developed) by Johann Gagnon-Bartsch. The course is organized around the following main topics.

  • Summarizing data: Histograms, Boxplots etc., Measures of location and spread
  • Randomized experiments, Observational studies and Confounding
  • Measures of Association, Regression and Regression Fallacy
  • Probability basics, rules and examples
  • Random Variables and Discrete probability distributions: Expectation, Covariance and Standard Errors
  • Law of Large Numbers and Normal approximation to sums, Box models
  • Inference: Confidence intervals, Hypothesis testing: Z-test and T-test
  • Prediction: Bias-Variance tradeoff, Cross-validation, In-sample and out-of-sample errors, Multiple regression, Logistic regression.
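The in-sample versus out-of-sample distinction in the prediction topics above can be illustrated with a toy simulation of my own (not course code): a straight-line fit against a needlessly flexible degree-10 polynomial, compared by 5-fold cross-validation.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 60
x = rng.uniform(-1.0, 1.0, n)
y = 1.0 + 2.0 * x + rng.normal(0.0, 0.5, n)   # truth is a straight line

def fit_mse(x_tr, y_tr, x_te, y_te, degree):
    """Fit a polynomial of the given degree; return MSE on test points."""
    coefs = np.polyfit(x_tr, y_tr, degree)
    return float(np.mean((np.polyval(coefs, x_te) - y_te) ** 2))

def cv_mse(x, y, degree, k=5):
    """k-fold cross-validated MSE: hold out each fold in turn."""
    folds = np.array_split(np.arange(len(x)), k)
    errs = []
    for f in folds:
        mask = np.ones(len(x), bool)
        mask[f] = False
        errs.append(fit_mse(x[mask], y[mask], x[f], y[f], degree))
    return float(np.mean(errs))

for d in (1, 10):
    print(f"degree {d}: in-sample MSE {fit_mse(x, y, x, y, d):.3f}, "
          f"5-fold CV MSE {cv_mse(x, y, d):.3f}")
```

The in-sample error necessarily drops as flexibility grows, since the larger model nests the smaller one; cross-validation instead estimates out-of-sample error, which does not reward overfitting.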

STATS 605 - Statistical methods for Adaptive Data Analysis

Special Topics Course - Advanced class for Ph.D. program

Jan 2019 – Apr 2019

Offered previously: Winter 2019-20.

This course is centered around modern topics in selective inference.

  • Relevance of selective inference in a modern scientific realm: Distinguishing between exploratory and confirmatory statistics, we will discuss the notions of family-wise error rate and false discovery rate.
  • Is a statistician’s job done after exploratory analyses, once we report a (principled) list of potential discoveries?: After introducing false coverage rate control, we will visit two main approaches to selective inferential problems in the modern literature, a simultaneous take and a conditional take.
  • What if we collect new data? After all, science is iterative in nature!: A realistic pipeline in science proceeds via multiple complicated steps and calibrates the next steps based upon previous results. We will look at new methods to combine fresh data with previously collected data for powerful inference.
  • Powerful science redefined through integrative analyses: Distributed computing environments, multiple data sources playing diverse roles in a pipeline, etc. are part of this discourse. We will think of tools to adapt to this emerging class of problems.
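A small simulation conveys the phenomenon that motivates the course. In this illustrative sketch of my own (the settings of 10 null effects, 25 observations each, and 2000 replications are arbitrary), a naive 95% confidence interval is computed for the effect that happened to look most extreme, and its coverage of the truth falls well short of 95%.

```python
import numpy as np

rng = np.random.default_rng(3)
K, m, reps = 10, 25, 2000        # K null effects, m observations each
covered = 0
for _ in range(reps):
    data = rng.normal(0.0, 1.0, (K, m))   # every true effect is zero
    means = data.mean(axis=1)
    j = np.argmax(np.abs(means))          # select the most extreme effect
    se = data[j].std(ddof=1) / np.sqrt(m)
    lo, hi = means[j] - 1.96 * se, means[j] + 1.96 * se
    covered += (lo <= 0.0 <= hi)          # does the naive CI cover 0?
print(f"naive 95% CI coverage for the selected effect: {covered / reps:.2f}")
```

Because the interval ignores that the effect was chosen for being extreme, its coverage is roughly that of the maximum of K independent tests rather than a single one; conditional and simultaneous corrections restore the nominal level.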