From VQA to VLN: Recent Advances in Vision-and-Language Research

In conjunction with CVPR 2021

July 19th - July 25th 2021 (Full Day)

Location: Virtual


CVPR 2021 Tutorial on "From VQA to VLN: Recent Advances in Vision-and-Language Research"

A long-term goal of AI research is to build intelligent agents that can see the rich visual environment around us, communicate this understanding in natural language to humans and other agents, and act in a physical or embodied environment. To this end, research at the nexus of Computer Vision and Natural Language Processing has made tremendous progress -- from generating natural language descriptions of images and videos, to answering questions about them, to holding free-form conversations about visual content.

Most recently, Embodied AI, in which agents are trained to perform various tasks from egocentric perception, has attracted a surge of interest within the computer vision, natural language processing, and robotics communities. Vision-and-Language Navigation (VLN), proposed by Anderson and Wu et al., is one fundamental topic in Embodied AI.

In this tutorial, we will not only cover the latest approaches and principles at the frontier of vision-and-language research, but also present a comprehensive overview of the field of VLN.

Program (PDT)

Our program is divided into two sub-sessions: (1) Morning Session: Vision-and-Language Understanding and (2) Afternoon Session: Vision-and-Language Navigation. Recordings and slides will be made available after the tutorial.

9:00-9:10 Opening Remarks Jingjing Liu and Xiaodong He
9:10-10:00 Visual Question Answering and Reasoning Zhe Gan
10:00-10:50 Video-and-Language Understanding Luowei Zhou
Coffee Break
11:10-12:00 Vision-and-Language Pre-training Linjie Li
Lunch Break
13:00-13:45 Introduction to VLN Qi Wu
13:45-14:30 Key Methodologies to VLN Xin Eric Wang
Coffee Break
14:50-15:35 Towards Realistic VLN Yoav Artzi
15:35-16:20 Challenges and Trend in VLN Peter Anderson


Peter Anderson

Google Research

Yoav Artzi

Cornell University

Zhe Gan

Microsoft D365 AI

Xiaodong He


Linjie Li

Microsoft D365 AI

Jingjing Liu

Microsoft D365 AI

Xin (Eric) Wang

UC Santa Cruz

Qi Wu

University of Adelaide

Luowei Zhou

Microsoft D365 AI


Contact the Organizing Committee: