Rupak Sarkar

I am a fourth-year PhD student in the Department of Computer Science at the University of Maryland, College Park, working in the amazing CLIP lab with Prof. Philip Resnik. My primary area of research is computational pragmatics, with a special interest in the role of implicit content. Current NLP methods focus (too) heavily on the surface form of text, but a large part of what humans intend to communicate is implicit. I am interested in modeling and interpreting these implicit aspects of human communication, with specific applications in modeling common ground in conversations and understanding the role of implicit content in text.

Email: rupak@umd.edu, rupaksarkar.cs@gmail.com

More Info: [CV], [Google Scholar]

Recent updates

2025: My paper on using pairwise comparisons to estimate text polarity was accepted at NAACL 2025 Findings! Link to [pdf] and [code].

2025: I am heading back to Microsoft Research as a Research Intern! I'll be working jointly with MSR and the OfficeAI team on LLM applications.

2024: I spent a wonderful summer at Microsoft Research as a Research Intern working on making conversational AI respond better to a broad range of user prompts. (Link to Paper)

Selected Publications

Conversational User-AI Intervention: A Study on Prompt Rewriting for Improved LLM Response Generation. Rupak Sarkar, Bahereh Sarrafzadeh, Nirupama Chandrasekharan, Nagu Rangan, Philip Resnik, Longqi Yang, Sujay Kumar Jauhar. (Under Review at ARR) [pdf]
Understanding Common Ground Misalignment in Goal-Oriented Dialog: A Case-Study with Ubuntu Chat Logs. Rupak Sarkar, Neha Srikanth, Taylor Hudson, Claire Bonial, Rachel Rudinger, Philip Resnik. (Under Review at ARR) [pdf]
PairScale: Analyzing Attitude Change with Pairwise Comparisons. Rupak Sarkar, Patrick Wu, Kristina Miler, Alexander Hoyle, Philip Resnik. NAACL 2025 Findings [pdf]
Pregnant Questions: The Importance of Pragmatic Awareness in Maternal Health Question Answering. Neha Srikanth*, Rupak Sarkar*, Rachel Rudinger, Jordan Boyd-Graber. NAACL 2024 Main [pdf]
Natural Language Decompositions of Implicit Content Enable Better Text Representations. Alexander Hoyle*, Rupak Sarkar*, Pranav Goel, Philip Resnik. EMNLP 2023 Main [pdf]
We Don't Speak the Same Language: Interpreting Polarization Through Machine Translation. A. R. KhudaBukhsh*, Rupak Sarkar*, Mark S. Kamlet, Tom M. Mitchell. AAAI 2021 [slides]
Are chess discussions racist? An Adversarial Hate Speech Data Set (Student Abstract). Rupak Sarkar, A. R. KhudaBukhsh. 35th AAAI Conference on Artificial Intelligence (AAAI 2021), Best Student Abstract [poster]
Social Media Attributions in the Context of Water Crisis. Rupak Sarkar*, Sayantan Mahinder*, Hirak Sarkar, A. R. KhudaBukhsh. Empirical Methods in Natural Language Processing (EMNLP), 2020
The Non-native Speaker Aspect: Indian English in Social Media. Rupak Sarkar, Sayantan Mahinder, A. R. KhudaBukhsh. 6th Workshop on Noisy User-generated Text (W-NUT), EMNLP 2020 [slides]

Selected Media Coverage

The Science of Political Polarization. [CMU CS Magazine Cover Story]
Why a YouTube Chat About Chess Got Flagged for Hate Speech. [Wired]
AI May Mistake Chess Discussions as Racist Talk. [CMU Press Release]
Even our language is polarized. [CMU Press Release]
The Left and the Right Speak Different Languages - Literally. [Wired]
Mask Vs. Muzzle: Even Words Are Now Polarized. [Futurity]
Fox News viewers write about 'BLM' the same way CNN viewers write about 'KKK'. [The Conversation], [Yahoo News]