Machine Learning

From Objects to Anywhere: A Holistic Benchmark for Multi-level Visual Grounding in 3D Scenes

Iterative Tool Usage Exploration for Multimodal Agents via Step-wise Preference Tuning

NEP: Autoregressive Image Editing via Next Editing Token Prediction

Embodied VideoAgent: Persistent Memory from Egocentric Videos and Embodied Sensors Enables Dynamic Scene Understanding

CLOVER: Cross-Layer Orthogonal Vectors Pruning and Fine-Tuning

Falcon: Fast visuomotor policy via partial denoising

MCU: An Evaluation Framework for Open-Ended Game Agents

SYNERGAI: Perception Alignment for Human-Robot Collaboration

Multi-modal Agent Tuning: Building a VLM-Driven Agent for Efficient Tool Usage

MMKE-Bench: A Multimodal Editing Benchmark for Diverse Visual Knowledge

Copyright © 2023 BIGAI: Beijing Institute for General Artificial Intelligence
