Author: Christina Kouridi

Objective: Showcase prominent agent behaviours observed when testing on instruction sets designed to probe language understanding.


Agent Strategies

<aside> 💃 Pirouetting

The agent scans the grid without manipulating any objects. Most frequently, it pivots around its starting position, collecting observations of the environment through its partially observable view. This occurs when the agent is uncertain about how to act.

</aside>

<aside> 👊 Insistent

The agent insists on putting the target object next to the destination object, repeatedly placing it at different positions around the destination object. This occurs when the agent is confident that following the instruction should yield a reward, but it does not receive one.

</aside>

<aside> 👓 Semi-myopic

The agent picks up the correct target object immediately, but myopically places it next to random objects in the grid instead of the correct destination object. This agent appears to understand only the target object and the action that must be performed, without relating the target to the destination object.

</aside>

<aside> 🌪️ Random exploration

The agent randomly picks up objects and places them next to others, even when the objects referenced in the instruction are not present in the grid.

</aside>
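
The strategies above are described qualitatively from the videos. As a rough illustration only, the sketch below shows how some of them could be flagged from a logged episode. The action names (`"pickup"`, `"drop"`), the trace format, and all three helper functions are hypothetical; they are not taken from the experiments' code.

```python
from typing import Sequence, Tuple

# Assumed action names for object manipulation (hypothetical).
MANIPULATIONS = {"pickup", "drop"}


def is_pirouetting(actions: Sequence[str]) -> bool:
    """True if the episode contains actions but no object manipulation."""
    return len(actions) > 0 and not any(a in MANIPULATIONS for a in actions)


def pivots_in_place(positions: Sequence[Tuple[int, int]]) -> bool:
    """True if the agent never leaves its starting cell."""
    return len(positions) > 0 and all(p == positions[0] for p in positions)


def count_placements(actions: Sequence[str]) -> int:
    """Number of drop actions; a high count may hint at the insistent pattern."""
    return sum(a == "drop" for a in actions)
```

A trace of turns only, with a constant position, would be flagged by both `is_pirouetting` and `pivots_in_place`. Telling the insistent, semi-myopic and random strategies apart would additionally require knowing which objects were picked up and where they were dropped, which this sketch does not model.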


Task: PutNextLocal-d0 (contains no distractor objects)

<aside> ✅ ORIGINAL INSTRUCTION

Given: "put the grey ball next to the blue key"

Agent sees: "put the grey ball next to the blue key"

</aside>

GIE-GCN (ours)


https://s3-us-west-2.amazonaws.com/secure.notion-static.com/f1fb5c53-72b3-4e61-a530-d2d8c607b1ba/openaigym.video.0.14458.video000000.mp4

Return: 0.83 Success rate achieved: 100%

GIE-GAT (ours)


https://s3-us-west-2.amazonaws.com/secure.notion-static.com/6a3e81d2-72bc-4b0d-b102-23ea69a6ad84/openaigym.video.0.12829.video000000.mp4

Return: 0.85 Success rate achieved: 100%

GRU (baseline)


https://s3-us-west-2.amazonaws.com/secure.notion-static.com/02ecc6c6-dab1-4643-92f9-cfa185d506cf/openaigym.video.0.31854.video000000.mp4

Return: 0.82 Success rate achieved: 100%

ATT-GRU (baseline)


https://s3-us-west-2.amazonaws.com/secure.notion-static.com/50e14cdb-11dd-4f10-8c9c-00ce22e00787/openaigym.video.0.15955.video000001.mp4

Return: 0.83 Success rate achieved: 100%

<aside> 🔴 FALSE INSTRUCTION

Given: "put the grey ball next to the blue key"

Agent sees: "put the yellow key next to the purple box"

</aside>

GIE-GCN (ours)


https://s3-us-west-2.amazonaws.com/secure.notion-static.com/73af1d1a-4261-4893-9596-e4935da7bbc2/openaigym.video.2.14458.video000000.mp4

Return: 0.59 Success rate achieved: 100%

GIE-GAT (ours)


https://s3-us-west-2.amazonaws.com/secure.notion-static.com/aedee0ac-fcab-46d0-a20f-b2f9f60cb007/openaigym.video.2.12829.video000000.mp4

Return: 0.44 Success rate achieved: 100%

GRU (baseline)


https://s3-us-west-2.amazonaws.com/secure.notion-static.com/0ef8a054-f490-4c35-81d4-b8da26af1e68/openaigym.video.2.31854.video000008.mp4

Return: 0.81 Success rate achieved: 100%

ATT-GRU (baseline)


https://s3-us-west-2.amazonaws.com/secure.notion-static.com/e82319bb-c4a3-4026-ac86-35e00e4717a9/openaigym.video.2.15955.video000000.mp4

Return: 0.80 Success rate achieved: 100%

<aside> ↔️ FALSE PERMUTED INSTRUCTION

Given: "put the grey ball next to the blue key"

Agent sees: "put the blue key next to the grey ball"

</aside>

GIE-GCN (ours)


https://s3-us-west-2.amazonaws.com/secure.notion-static.com/fb8eaab3-5322-4381-a145-0cdcaf8fcb94/openaigym.video.1.14458.video000000.mp4

Return: 0.73 Success rate achieved: 100%

GIE-GAT (ours)


https://s3-us-west-2.amazonaws.com/secure.notion-static.com/4f12f9fe-fea2-471c-9bfa-0f5532eb8158/openaigym.video.1.12829.video000000.mp4

Return: 0.69 Success rate achieved: 100%

GRU (baseline)


https://s3-us-west-2.amazonaws.com/secure.notion-static.com/5c8e6dc4-115a-41a4-a971-08f100345ddf/openaigym.video.1.31854.video000000.mp4

Return: 0 Success rate achieved: 0%

ATT-GRU (baseline)


https://s3-us-west-2.amazonaws.com/secure.notion-static.com/1c563a82-d05c-4680-a837-a2fdba050dc5/openaigym.video.1.15955.video000000.mp4

Return: 0 Success rate achieved: 0%

<aside> 🔄 TEXT INVERTED INSTRUCTION

Given: "put the grey ball next to the blue key"

Agent sees: "next to the blue key put the grey ball"

</aside>

GIE-GCN (ours)


https://s3-us-west-2.amazonaws.com/secure.notion-static.com/7b135e4b-3ea0-4704-9677-14bf7bd79426/openaigym.video.3.14458.video000064.mp4

Return: 0 Success rate achieved: 0%

GIE-GAT (ours)


https://s3-us-west-2.amazonaws.com/secure.notion-static.com/ca2e91f4-ec25-48ae-b0e8-622cc0bc51cc/openaigym.video.3.12829.video000000.mp4

Return: 0.85 Success rate achieved: 100%

GRU (baseline)


https://s3-us-west-2.amazonaws.com/secure.notion-static.com/3f242c71-a270-4965-9f46-0a2a59988094/openaigym.video.3.31854.video000000.mp4

Return: 0 Success rate achieved: 0%

ATT-GRU (baseline)


https://s3-us-west-2.amazonaws.com/secure.notion-static.com/ec66b41c-08ef-448f-bf5c-4ed8f434f75e/openaigym.video.3.15955.video000000.mp4

Return: 0.27 Success rate achieved: 100%
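
The four instruction variants tested on this task follow a simple textual pattern, which can be sketched as below. This is a minimal illustration, assuming every instruction has the form "put the X next to the Y". The function `make_variants` and the regular expression are hypothetical and not taken from the experiments' code. Because the notes do not say how the false instruction is chosen, it is passed in explicitly rather than generated.

```python
import re
from typing import Dict

# Assumed instruction template: "put the <target> next to the <destination>".
PATTERN = re.compile(r"^put the (?P<target>.+) next to the (?P<dest>.+)$")


def make_variants(instruction: str, false_instruction: str) -> Dict[str, str]:
    """Build the four instruction variants shown to the agent."""
    match = PATTERN.match(instruction)
    if match is None:
        raise ValueError(f"unexpected instruction format: {instruction!r}")
    target, dest = match["target"], match["dest"]
    return {
        # Unchanged instruction.
        "original": instruction,
        # Instruction referring to different objects (supplied by the caller).
        "false": false_instruction,
        # Target and destination objects swapped.
        "false_permuted": f"put the {dest} next to the {target}",
        # Same meaning, with the two clauses in reverse order.
        "text_inverted": f"next to the {dest} put the {target}",
    }


variants = make_variants(
    "put the grey ball next to the blue key",
    false_instruction="put the yellow key next to the purple box",
)
```

With the inputs above, `variants["false_permuted"]` and `variants["text_inverted"]` reproduce the "Agent sees" strings listed for this task.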


Task: PutNextLocal-d2 (contains two distractor objects)

<aside> ✅ ORIGINAL INSTRUCTION

Given: "put the yellow box next to the grey key"

Agent sees: "put the yellow box next to the grey key"

</aside>