Siddhant Bhambri

Senior AI Research Engineer · Samsung Research America  |  PhD, Arizona State University

"To live life is to take decisions, and to take decisions is to reason and plan."

I build AI systems that use personalization and agentic memory to support reasoning and decision-making. My work spans LLMs, Large Reasoning Models, and Reinforcement Learning, with a focus on Human-AI Interaction. PhD from the Yochan Lab at ASU, advised by Dr. Subbarao Kambhampati.


Experience

2025 – Present
Senior AI Research Engineer
Samsung Research America
On the Language & Personalized Intelligence team, building agentic AI memory systems that support personalization and reasoning for end-user applications through cloud and on-device LLM-based architectures.
2021 – 2025
Graduate Research Associate · PhD in Computer Science
Arizona State University — Yochan Lab
Advised by Dr. Subbarao Kambhampati. Researched the intersection of LLMs & RL to optimize agent behavior in human-centric scenarios, with a GPA of 4.0/4.0. Dissertation: "Role of Large Language Models in Human-AI Interaction: A Critical Appraisal".

Publications

∗ denotes equal contribution. Full list on Google Scholar.

Interpretable Traces, Unexpected Outcomes: Investigating the Disconnect in Trace-Based Knowledge Distillation
Siddhant Bhambri, Upasana Biswas, Subbarao Kambhampati
ACL 2026 · NeurIPS 2025 Workshop (LAW + CogInterp)
LLMs · LRMs · Fine-tuning
Who is Helping Whom? Analyzing Inter-dependencies to Evaluate Cooperation in Human-AI Teaming
Upasana Biswas, Vardhan Palod, Siddhant Bhambri, Subbarao Kambhampati
AAAI 2026
Human-AI · RL
Do Cognitively Interpretable Reasoning Traces Improve LLM Performance?
Siddhant Bhambri∗, Upasana Biswas∗, Subbarao Kambhampati
NeurIPS 2025 Workshop (CogInterp + LAW)
LLMs · LRMs · Human-AI
Stop Anthropomorphizing Intermediate Tokens as Reasoning/Thinking Traces!
Subbarao Kambhampati, Kaya Stechly, Karthik Valmeekam, Lucas Saldyt, Siddhant Bhambri, Vardhan Palod, Atharva Gundawar, Soumya Rani Samineni, Durgesh Kalwar, Upasana Biswas
NeurIPS 2025 Workshop (LAW + CogInterp)
LLMs · LRMs
Local Coherence or Global Validity? Investigating RLVR Traces in Math Domains
Soumya Rani Samineni, Durgesh Kalwar, Vardaan Gangal, Siddhant Bhambri, Subbarao Kambhampati
NeurIPS 2025 Workshop (Math Reasoning + Efficient Reasoning)
LRMs · RL
Do Think Tags Really Help LLMs Plan? A Critical Evaluation of ReAct-Style Prompting
Siddhant Bhambri∗, Mudit Verma∗, Subbarao Kambhampati
TMLR 2025 · NeurIPS 2024 Workshop (Adaptive Foundation Models)
LLMs · Human-AI
Extracting Heuristics from Large Language Models for Reward Shaping in Reinforcement Learning
Siddhant Bhambri, Amrita Bhattacharjee, Durgesh Kalwar, Lin Guan, Huan Liu, Subbarao Kambhampati
NeurIPS 2024 Workshop (Open-World Agents)
LLMs · RL · Human-AI
Robust Planning with LLM-Modulo Framework: Case Study in Travel Planning
Atharva Gundawar, Mudit Verma, Lin Guan, Karthik Valmeekam, Siddhant Bhambri, Subbarao Kambhampati
arXiv 2024
LLMs
LLMs Can't Plan, But Can Help Planning in LLM-Modulo Frameworks
Subbarao Kambhampati, Karthik Valmeekam, Lin Guan, Kaya Stechly, Mudit Verma, Siddhant Bhambri, Lucas Saldyt, Anil Murthy
ICML 2024 Spotlight
LLMs
Theory of Mind Abilities of Large Language Models in Human-Robot Interaction: An Illusion?
Mudit Verma∗, Siddhant Bhambri∗, Subbarao Kambhampati
HRI 2024
LLMs · Human-AI
Benchmarking Multi-Agent Preference-Based Reinforcement Learning for Human-AI Teaming
Siddhant Bhambri, Mudit Verma, Anil Murthy, Subbarao Kambhampati
arXiv 2023
RL · Human-AI
Exploiting Unlabeled Data for Feedback Efficient Human Preference Based Reinforcement Learning
Mudit Verma, Siddhant Bhambri, Subbarao Kambhampati
AAAI 2023 Workshop (R2HCAI) · ICML 2023 Workshop
RL · Human-AI
Preference Proxies: Evaluating Large Language Models in Capturing Human Preferences in Human-AI Tasks
Mudit Verma, Siddhant Bhambri, Subbarao Kambhampati
ICML 2023 Workshop (Theory of Mind + Preference Learning)
LLMs · Human-AI
Reinforcement Learning Methods for Wordle: A POMDP/Adaptive Control Approach
Siddhant Bhambri, Amrita Bhattacharjee, Dimitri Bertsekas
IEEE CoG 2023
RL
Using Deception in Markov Game to Understand Adversarial Behaviors Through a Capture-The-Flag Environment
Siddhant Bhambri, Purv Chauhan, Frederico Araujo, Adam Doupé, Subbarao Kambhampati
GameSec 2022
RL
Contrastively Learning Visual Attention as Affordance Cues from Demonstrations for Robotic Grasping
Yantian Zha, Siddhant Bhambri, Lin Guan
IROS 2021
RL
Multi-Objective Reinforcement Learning Based Approach for User-Centric Power Optimization in Smart Home Environments
Saurabh Gupta, Siddhant Bhambri, Karan Dhingra, Arun Balaji Buduru, Ponnurangam Kumaraguru
IEEE SMDS 2020
RL
A Survey of Black-Box Adversarial Attacks on Computer Vision Models
Siddhant Bhambri, Sumanyu Muku, Avinash Tulasi, Arun Balaji Buduru
arXiv 2019
Adversarial ML

Talks & Outreach

Oct 2025 · Ai2 — Allen Institute for AI
"Role of LLMs in Human-AI Interaction: A Critical Appraisal"
Mar 2025 · Podcast — Ones Changing The World (1CW)
"Beyond ChatGPT: The Future of Human-Aware AI Agents"
Jan 2025 · Podcast — Turning Turing
"CS PhD in the USA Demystified"
Fall 2023 · ASU School of Computing & AI — AI Day
"Large Language Models for Human-Aware AI"
Invited talk at the School of Computing & AI, Arizona State University.

Reviewing & Service

Reviewer: NeurIPS (2023–26), ICML (2023–26), ICLR (2024–25), ACL (2026), Interspeech (2026), PLOS ONE (2026), AISTATS (2025), RLC (2024)
PC Member: IJCAI (2024), GameSec (2023–24), ICAPS (2023–24), RA-L (2022–24), IROS (2021–22)