Research - Avi Bagchi

Featured

Watermarking Discrete Diffusion Language Models (2025)

Avi Bagchi, Akhil Bhimaraju, Moulik Choraria, Daniel Alabi, and Lav R. Varshney

Watermarking has emerged as a promising technique to track AI-generated content and differentiate it from authentic human creations. While prior work extensively studies watermarking for autoregressive large language models (LLMs) and image diffusion models, none of it addresses discrete diffusion language models, which are becoming popular due to their high inference throughput. We introduce the first watermarking method for discrete diffusion models by applying the distribution-preserving Gumbel-max trick at every diffusion step and seeding the randomness with the sequence index to enable reliable detection.
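The seeded Gumbel-max idea in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the secret key string, the SHA-256 seeding, and the function names `gumbel_noise` and `gumbel_max_sample` are assumptions made for illustration, and the diffusion process itself is not modeled. The sketch only shows that adding position-seeded Gumbel noise to log-probabilities and taking the argmax yields a sample from the original distribution while remaining reproducible for anyone who holds the key.

```python
import hashlib
import math
import random


def gumbel_noise(key: str, position: int, vocab_size: int) -> list[float]:
    """Return Gumbel(0, 1) noise determined by a (hypothetical) key and a position."""
    digest = hashlib.sha256(f"{key}:{position}".encode()).digest()
    rng = random.Random(int.from_bytes(digest[:8], "big"))
    noise = []
    for _ in range(vocab_size):
        # Clamp away from 0 and 1 so both logarithms are defined.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        noise.append(-math.log(-math.log(u)))
    return noise


def gumbel_max_sample(log_probs: list[float], key: str, position: int) -> int:
    """Pick argmax(log p + g); over random g this follows the distribution p."""
    noise = gumbel_noise(key, position, len(log_probs))
    scores = [lp + g for lp, g in zip(log_probs, noise)]
    return max(range(len(scores)), key=scores.__getitem__)


# Toy distribution over a three-token vocabulary (illustrative values only).
probs = [0.7, 0.2, 0.1]
log_probs = [math.log(p) for p in probs]
tokens = [gumbel_max_sample(log_probs, "secret-key", i) for i in range(5000)]
```

Because the noise depends only on the key and the position, a party holding the key can regenerate it and check whether the observed tokens line up with it, while the token frequencies in `tokens` stay close to `probs`.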

Under submission. Presentation to the UIUC Information and Intelligence Group.

[arXiv PDF] | [Slides]

Doppler Invariant CNN for Signal Classification (2025)

Avi Bagchi, Dwight Hutchenson

Radio spectrum monitoring in contested environments motivates the need for reliable automatic signal classification technology. Prior work highlights deep learning as a promising approach, but existing models depend on brute-force Doppler augmentation to achieve real-world generalization, which undermines both training efficiency and interpretability. In this paper, we propose a convolutional neural network (CNN) architecture with complex-valued layers that exploits convolutional shift equivariance in the frequency domain. To establish provable frequency bin shift invariance, we use adaptive polyphase sampling (APS) as pooling layers followed by a global average pooling layer at the end of the network. Using a synthetic dataset of common interference signals, experimental results demonstrate that unlike a vanilla CNN, our model maintains consistent classification accuracy with and without random Doppler shifts despite being trained on no Doppler-shifted examples. Overall, our method establishes an invariance-driven framework for signal classification that offers provable robustness against real-world effects.
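The invariance argument in the abstract can be sketched on a toy one-dimensional example. This is a simplified illustration, not the paper's architecture: it uses real-valued data instead of complex-valued layers, a single hand-picked kernel, circular boundary handling, and a stride of 2, all of which are assumptions. The helper names are hypothetical. A circular convolution is shift-equivariant, adaptive polyphase sampling keeps whichever polyphase component has the larger energy so that the choice follows the shift, and global average pooling then removes the remaining shift.

```python
def circular_conv(x, kernel):
    """Circular convolution: shifting x circularly shifts the output the same way."""
    n = len(x)
    return [sum(kernel[k] * x[(i - k) % n] for k in range(len(kernel)))
            for i in range(n)]


def relu(x):
    return [max(0.0, v) for v in x]


def energy(x):
    return sum(v * v for v in x)


def aps_downsample(x):
    """Stride-2 downsampling that keeps the higher-energy polyphase component."""
    even, odd = x[0::2], x[1::2]
    return even if energy(even) >= energy(odd) else odd


def global_average(x):
    return sum(x) / len(x)


def tiny_network(x, kernel):
    return global_average(aps_downsample(relu(circular_conv(x, kernel))))


def circular_shift(x, s):
    """Shift x by s bins with wrap-around, standing in for a frequency-bin shift."""
    n = len(x)
    return [x[(i - s) % n] for i in range(n)]


# Toy "spectrum" and kernel (illustrative values only).
spectrum = [0.0, 1.0, 3.0, -2.0, 5.0, 0.0, -1.0, 2.0]
kernel = [1.0, -2.0, 1.0]
outputs = [tiny_network(circular_shift(spectrum, s), kernel) for s in range(8)]
```

Every entry of `outputs` is the same value, whereas replacing `aps_downsample` with plain `x[0::2]` would generally make the result depend on whether the shift is even or odd.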

Under submission. Full report and slides distributed internally at MIT Lincoln Laboratory.

[arXiv PDF]

In Progress

Polynomial Flow Matching

Polynomial Flow Matching Diagram

Preliminary work for ESE 5460 Final Project under Professor Pratik Chaudhari

Continuing with Sourya Basu, Lav R. Varshney, Daniel Alabi

[PDF]

Towards Efficient and Trustworthy Discrete Diffusion Models

Avi Bagchi

Senior Thesis under Professors Weijie Su and Surbhi Goel

Diffusion Factor Models

Avi Bagchi, Om Shastri, Michael Tesfaye

Extension of “Diffusion Factor Models: Generating High-Dimensional Returns with Factor Structure” (Chen et al. 2025)

Smaller Projects

On the PAC Learnability of Distortion-Free Language Model Watermarks

Avi Bagchi, Michael Tesfaye

CIS 6250 Final Project under Professor Michael Kearns

[PDF]

Elliptic Curve Cryptography (2024)

Avi Bagchi

Directed Reading Program. Research on elliptic curve cryptography and its applications.

Presentation to the Penn Department of Mathematics

[Slides]

Auditing the Use of Language Models to Guide Hiring Decisions (2024)

Johann D. Gaebler, Sharad Goel, Aziz Huq, and Prasanna Tambe

This paper acknowledges my research assistance at The Wharton School (Operations, Information, and Decisions Department) under Professor Prasanna Tambe.

[arXiv PDF]

MOSQUITO EDGE: An Edge-Intelligent Real-Time Mosquito Threat Prediction Using an IoT-Enabled Hardware System (Sensors 2022)

Shyam Polineni^†, Om Shastri^†, Avi Bagchi^†, Govind Gnanakumar^†, Sujay Rasamsetti^†, Prabha Sundaravadivel

^† Equal contribution

MOSQUITO EDGE System

Species distribution models (SDMs) using climate variables effectively predict mosquito niches under current and future conditions. We use NOAA climate data matched to mosquito presence and absence points from NASA’s GLOBE Observer and the National Ecological Observatory Network to train an 86%-accurate Random Forest classifier that predicts mosquito threat. Temperature increases threat up to about 28 °C, producing high-threat clusters in warm, humid regions and low-threat clusters in cold, dry ones. We develop a low-cost IoT edge device that collects local climate data and automatically queries the model, enabling real-time predictions in remote or resource-limited settings and supplying new data for future SDM training.

Published in Sensors (2022). Cited 14 times.

[Paper]

Archival & Policy Research

The South Sea Bubble (The Concord Review 2021)

Avi Bagchi

The South Sea Bubble by William Hogarth

“I can calculate the motions of heavenly bodies, but not the madness of people.” —Isaac Newton

Through an investigation within the British Archives, this paper uncovers the British government’s corrupt involvement in the fraudulent South Sea Company.

Published in The Concord Review (a premier international history journal)

[PDF]

Water Insecurity in Forgotten Nations (World Food Prize 2020)

Avi Bagchi

Uzbekistan Water Insecurity

Photo taken in Mongolia (Penn Global Seminar 2025), where I continued my water insecurity research.

Institutional fragmentation, contested borders, and “ninja mining” drive water insecurity in Uzbekistan and Mongolia.

Published in The Global Youth Institute World Food Prize Conference. Cited once.

[PDF]