Hi! I am Nitin, a Pre-Doctoral Research Fellow in the AI Infrastructure group at Microsoft Research, India, where I am fortunate to work with Dr. Ramachandran Ramjee, Dr. Jayashree Mohan, Dr. Ashish Panwar and Dr. Nipun Kwatra.
My research interests lie at the intersection of computer systems and machine learning, with a current focus on optimizing Large Language Model (LLM) inference.
Previously, I was a Senior Software Engineer at Zeta, a unicorn startup modernizing legacy banking systems with a cloud-native stack. During my three years there, I scaled the Web Application Infrastructure, delivered it to HDFC Bank (India's largest private bank), and supported its expansion to the US. I also developed Zeta's API Playground. Earlier, I interned at Flipkart on the ads app team.
I graduated with a B.Tech. in Computer Science and Engineering from IIT Guwahati in 2020. I have fond memories of competing in programming contests on Codeforces. See this repo for fast plug-n-play implementations of data structures and algorithms in C++.
I'm looking for PhD positions. Please feel free to reach out if you think I might be a good fit.
Etalon: Holistic Performance Evaluation Framework for LLM Inference Systems
Amey Agrawal, Anmol Agarwal, Nitin Kedia, Jayashree Mohan, Souvik Kundu, Nipun Kwatra, Ramachandran Ramjee, Alexey Tumanov
Preprint
URL | PDF | Code
Taming Throughput-Latency Tradeoff in LLM Inference with Sarathi-Serve
Amey Agrawal, Nitin Kedia, Ashish Panwar, Jayashree Mohan, Nipun Kwatra, Bhargav S. Gulavani, Alexey Tumanov, and Ramachandran Ramjee
OSDI'24 (The 18th USENIX Symposium on Operating Systems Design and Implementation) Conference
URL | PDF | Code
Vidur: A Large Scale Simulation Framework For LLM Inference
Amey Agrawal, Nitin Kedia, Jayashree Mohan, Ashish Panwar, Nipun Kwatra, Bhargav S. Gulavani, Ramachandran Ramjee, Alexey Tumanov
MLSys'24 (The 7th Annual Conference on Machine Learning And Systems) Conference
URL | PDF | Code