Analyzing Generalization of Vision and Language Navigation to Unseen Outdoor Areas  ACL 2022

Raphael Schumann and Stefan Riezler

Info

TBA

Task

In Vision and Language Navigation, an embodied agent is instructed in natural language to follow a route in a given environment. In our project, the environment is built from Street View panoramas of Manhattan.

The agent is a machine learning model that is given navigation instructions and tries to follow them by deciding its next action from observing the current panorama image. The agent thus navigates the environment freely and stops when it predicts that it has reached the goal location.
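
The observe-decide-act loop described above can be sketched roughly as follows. This is a minimal illustration and not the project's implementation: the toy graph, the action names (`forward`, `left`, `right`, `stop`), the `run_episode` function, and the scripted policy are all assumptions made for this example. In the real setting the observation would be a panorama image and the policy a trained model.

```python
from typing import Callable, Dict, List

# Toy stand-in for the panorama graph (assumption): node id -> neighbouring node ids.
Graph = Dict[str, List[str]]

# Hypothetical policy signature: (instructions, observation, step) -> action name.
Policy = Callable[[str, str, int], str]


def run_episode(graph: Graph, start: str, instructions: str,
                policy: Policy, max_steps: int = 20) -> List[str]:
    """Roll out one episode and return the list of visited nodes."""
    node, heading = start, 0  # heading indexes into graph[node]
    path = [node]
    for step in range(max_steps):
        # A real agent would observe the panorama image here; this sketch
        # uses a string naming the current node and the neighbour it faces.
        observation = f"{node}->{graph[node][heading]}"
        action = policy(instructions, observation, step)
        if action == "stop":
            break  # the agent believes it has reached the goal
        if action == "forward":
            node = graph[node][heading]
            heading = 0
            path.append(node)
        elif action == "left":
            heading = (heading - 1) % len(graph[node])
        elif action == "right":
            heading = (heading + 1) % len(graph[node])
        else:
            raise ValueError(f"unknown action: {action}")
    return path


def scripted_policy(instructions: str, observation: str, step: int) -> str:
    """Placeholder for a learned model: replays a fixed action sequence."""
    script = ["forward", "forward", "stop"]
    return script[step] if step < len(script) else "stop"


toy_graph: Graph = {"A": ["B"], "B": ["C", "A"], "C": ["B"]}
path = run_episode(toy_graph, "A", "Go straight for two blocks.", scripted_policy)
print(path)  # ['A', 'B', 'C']
```

The `max_steps` cap is a design choice of this sketch: it ends the episode even if the policy never emits `stop`.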

Demo

The demo uses a seq2seq model trained from scratch.

Citation

@inproceedings{schumann-riezler-2022-analyzing,
    title = "Analyzing Generalization of Vision and Language Navigation to Unseen Outdoor Areas",
    author = "Raphael Schumann and Stefan Riezler",
    year = "2022",
    publisher = "Association for Computational Linguistics"
}