<aside> 💡 code : https://som-gpt4v.github.io/

</aside>

1. Introduction

1) About Author

2) Task addressed by the paper

3) limitations of previous studies

Earlier LLMs (as of 2023.12) used for VQA (Visual Question Answering) did not achieve strong visual grounding performance.

WHY?)

4) Solution approaches

스크린샷 2025-07-01 오후 7.46.31.png

Adding marks to the image before running VQA improved the LLM's visual grounding and reasoning performance.
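The idea of marking an image before asking a question can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the region boxes are assumed to come from some upstream segmentation or region-proposal step that is not shown, and the names `Mark`, `assign_marks`, and `build_prompt` are hypothetical. Actually drawing the labels onto the pixels (with an imaging library) is left out.

```python
from dataclasses import dataclass


@dataclass
class Mark:
    label: int  # numeric ID that would be drawn on the image
    x: int      # pixel column where the label is placed
    y: int      # pixel row where the label is placed


def assign_marks(boxes):
    """Give each region a numeric mark placed at the center of its box.

    boxes: list of (x0, y0, x1, y1) tuples, assumed to come from an
    upstream segmentation / region-proposal step (hypothetical input).
    """
    marks = []
    for i, (x0, y0, x1, y1) in enumerate(boxes, start=1):
        marks.append(Mark(label=i, x=(x0 + x1) // 2, y=(y0 + y1) // 2))
    return marks


def build_prompt(question, marks):
    """Compose a text prompt telling the model the image carries numbered marks."""
    ids = ", ".join(str(m.label) for m in marks)
    return (
        f"The image is annotated with numbered marks: {ids}. "
        "Refer to regions by their mark number when answering.\n"
        f"Question: {question}"
    )


# Hypothetical example regions; real ones would come from a segmentation model.
boxes = [(0, 0, 100, 50), (120, 40, 200, 160)]
marks = assign_marks(boxes)
prompt = build_prompt("Which object is on the left?", marks)
```

The marked image and the prompt would then be sent together to the multimodal model, so its answer can point to a region by number instead of describing a location in free text.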