MindDial: Enhancing Conversational Agents with Theory-of-Mind for Common Ground Alignment and Negotiation
ExoViP: Step-by-step Verification and Exploration with Exoskeleton Modules for Compositional Visual Reasoning
LangSuit⋅E: Controlling, Planning, and Interacting with Large Language Models in Embodied Text Environments