Rupak Sarkar
I am a third-year PhD student in the Department of Computer Science at the University of Maryland,
College Park, working in the amazing CLIP lab
with Prof. Philip Resnik.
My primary area of research is computational pragmatics, with a special interest in the role of implicit content. Current NLP methods focus (too) heavily on the surface form of text, but a large part of the communicative intent of humans is implicit. I am interested in understanding how to model and interpret these implicit aspects of human communication, with specific applications in modeling common ground in conversations, and understanding the role of implicit content in political polarization.
In the past, I worked on evaluating topic models to help domain experts pick the best model for their real-world use cases (with the FDA). Prior to my PhD, I served as a Course Research Engineer for the course 11-865/11-665 Tracking Political Sentiments Using Machine Learning
(Fall 2020) at CMU with instructors Dr. Ashiqur KhudaBukhsh, Prof. Tom Mitchell, and Prof. Mark Kamlet. My previous research mentor is Prof. Ashiqur R. KhudaBukhsh.
Email: rupak@umd.edu, rupaksarkar.cs@gmail.com
Recent updates
- 2023: Our paper on inducing pragmatic behavior in question answering for maternal and infant health was accepted at NAACL 2024!
We will also soon release a dataset of assumption-laden questions, along with expert-annotated assumptions and answers. Stay tuned!
- 2023: Our paper on creating natural language decompositions for understanding implicit content was accepted at EMNLP 2023!
Selected Publications
- Natural Language Decompositions of Implicit Content Enable Better Text Representations. Alexander Hoyle*, Rupak Sarkar*, Pranav Goel, Philip Resnik
Empirical Methods in Natural Language Processing (EMNLP), 2023 [pdf]
- Towards Pragmatic Awareness in Question Answering: A Case Study in Maternal and Infant Health. Neha Srikanth*, Rupak Sarkar*, Rachel Rudinger, Jordan Boyd-Graber
North American Chapter of the Association for Computational Linguistics (NAACL), 2024 [pdf]
- Fringe News Networks: Dynamics of US News Viewership following the 2020 Presidential Election. A. R. KhudaBukhsh*, Rupak Sarkar*, Mark S. Kamlet, Tom M. Mitchell
14th ACM Web Science Conference (WebSci), 2022
- We Don't Speak the Same Language: Interpreting Polarization Through Machine Translation. A. R. KhudaBukhsh*, Rupak Sarkar*, Mark S. Kamlet, Tom M. Mitchell
35th AAAI Conference on Artificial Intelligence (AAAI), 2021. [slides]
- Are chess discussions racist? An Adversarial Hate Speech Data Set (Student Abstract). Rupak Sarkar, A. R. KhudaBukhsh
35th AAAI Conference on Artificial Intelligence (AAAI), 2021 (SA-21) [poster]
- Social Media Attributions in the Context of Water Crisis. Rupak Sarkar*, Sayantan Mahinder*, Hirak Sarkar, A. R. KhudaBukhsh
Empirical Methods in Natural Language Processing (EMNLP), 2020
- The Non-native Speaker Aspect: Indian English in Social Media. Rupak Sarkar, Sayantan Mahinder, A. R. KhudaBukhsh
6th Workshop on Noisy User-generated Text (W-NUT), EMNLP 2020 [slides]
Selected Media Coverage