Soumajyoti Sarkar

Background

I am an Applied Scientist at Amazon AGI Foundations, where I am part of a team focused on understanding how to scale language model pretraining. My own research centers on developing new techniques for scaling up pretraining through conditional computation (sparse mixture-of-experts models), adaptive compute for mitigating inference inefficiencies, and understanding how scaling laws behave in these regimes. Along those lines, I also work on algorithm-system co-design for architectures that not only scale along the compute axis but whose compute optimality can be practically achieved on modern accelerators.

Prior to this, I was part of the AI Research and Education (AIRE) group at AWS, working on foundation models for structured knowledge grounding and on training text embedding models that scale in distributed training environments. Before that, I worked as both an ML engineer and a researcher on search and recommendations at Twitter in San Francisco. On the Tweet Search Ranking team, I prototyped and deployed Twitter's first content-based search relevance model, which used explicit user survey feedback in Twitter's Search service.

I obtained my PhD in computer science from Arizona State University, Tempe, where I was advised by Paulo Shakarian. My thesis focused on measuring the impact of social network interactions using observational and experimental studies. During my graduate studies, I spent summers at Amazon A9 in Palo Alto and Nokia Bell Labs in New Jersey. I completed my undergraduate studies at the Indian Institute of Engineering Science and Technology (IIEST), Shibpur. I live in and work remotely from San Francisco, California.

Research

My research interests span large-scale machine learning, including data- and model-efficient pretraining, scaling in the data-limited, infinite-compute regime, and distributed ML optimization. In ancient times, I worked in computational social science and its applications in search and recommendation systems. I also enjoy doing independent research at the intersection of economics and machine learning, mainly on decision making in peer lending platforms and reinforcement learning.

You can find an updated list of published papers and preprints on my Google Scholar page. Feel free to reach out by email about anything related to my research, paper reviews, or potential collaborations. For more detail on my past work, please see the Interests and Unpublished sections in the navigation bar.

News

Check out the technical report on the Amazon Nova 2024 models.

Check out our recent papers on contrastive learning and sparse pretraining: EMC^2: Efficient MCMC Negative Sampling for Contrastive Learning with Global Convergence, accepted at ICML 2024, and Revisiting SMoE Language Models by Evaluating Inefficiencies with Task Specific Expert Pruning (code and models to be released).