VQA2VLN Tutorial 2021

CVPR 2021 Tutorial on "From VQA to VLN: Recent Advances in Vision-and-Language Research"

A long-term goal of AI research is to build intelligent agents that can see the rich visual environment around us, communicate this understanding in natural language to humans and other agents, and act in a physical or embodied environment. To this end, research at the nexus of Computer Vision and Natural Language Processing has made tremendous progress -- from generating natural language descriptions of images/videos, to answering questions about them, and to holding free-form conversations about visual content.

Most recently, Embodied AI, where embodied agents are trained to perform various tasks from egocentric perception, has attracted a surge of interest within the computer vision, natural language processing, and robotics communities. Vision-and-Language Navigation (VLN) is one fundamental topic in Embodied AI that was proposed by Anderson and Wu et al.

In this tutorial, we will not only cover the latest approaches and principles at the frontier of vision-and-language research, but also present a comprehensive overview of the field of VLN. The tutorial will be a full-day event (9:00 am to 5:00 pm) with several breaks.

Program (PDT, UTC-7)

Our program is divided into two sub-sessions: (1) Vision-and-Language Pre-training and (2) Vision-and-Language Navigation. A recording of the panel discussion will be available after the tutorial.

Prerecorded Sessions
4min	Opening Remarks [Video]	Jingjing Liu and Xiaodong He
50min	Representations and Training Strategies for VLP [Video] [Slides]	Zhe Gan
40min	Robustness, Efficiency and Extensions for VLP [Video] [Slides]	Linjie Li
40min	Video-and-Language Pre-training [Video] [Slides]	Luowei Zhou
42min	Introduction to VLN [Video] [Slides]	Qi Wu
55min	Generalizable VLN Methods [Video] [Slides]	Xin Eric Wang
58min	Forward to Realistic VLN [Video] [Slides]	Yoav Artzi and Peter Anderson
15min	VLN Summary [Video] [Slides]	Qi Wu
Live Session
16:00-17:00	Panel Discussion LIVE on Zoom [Video]	All speakers