With deep learning, perception is now largely solved. However, an AI with real cognitive capacity also needs to know how to link together multiple actions.
Most approaches to this require agency: systems that act in the world in pursuit of some goal. A sophisticated agent can plan ahead; a simple agent has only reflexes.
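A minimal sketch of the two kinds of agent, assuming a toy world model and a made-up action set (all names here are illustrative, not from any particular system):

```python
from dataclasses import dataclass, field

@dataclass
class ReflexAgent:
    """Maps each observation directly to an action via fixed rules."""
    rules: dict  # observation -> action

    def act(self, observation):
        return self.rules.get(observation, "noop")

@dataclass
class PlanningAgent:
    """Searches ahead over a model of the world before committing to an action."""
    model: callable  # (state, action) -> predicted next state
    actions: list = field(default_factory=lambda: ["left", "right", "noop"])

    def act(self, state, goal, depth=3):
        # Depth-limited lookahead: pick the action starting the best path to the goal.
        best_action, best_score = "noop", float("-inf")
        for action in self.actions:
            score = self._rollout(self.model(state, action), goal, depth - 1)
            if score > best_score:
                best_action, best_score = action, score
        return best_action

    def _rollout(self, state, goal, depth):
        if state == goal:
            return 0
        if depth == 0:
            return -1  # goal not reached within the planning horizon
        # Each extra step costs 1, so shorter paths to the goal score higher.
        return max(self._rollout(self.model(state, a), goal, depth - 1)
                   for a in self.actions) - 1
```

For example, with a one-dimensional world model such as `lambda s, a: s + {"left": -1, "right": 1}.get(a, 0)`, `PlanningAgent(model).act(0, goal=2)` picks `"right"` by looking ahead, whereas a ReflexAgent can only ever replay its fixed rules.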
We can set a goal and check whether the agent is able to achieve it. If the goal is too complicated, we can break it down into subtasks. Crucially, the agent must demonstrate to the user what it is attempting to do, so that the user can intervene.
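A minimal sketch of that loop, assuming a hand-written decomposition table and a console prompt as the intervention point (both hypothetical):

```python
def decompose(goal: str) -> list[str]:
    """Break a goal into subtasks when it is too complicated to attempt directly."""
    known = {
        "make tea": ["boil water", "add tea bag", "pour water", "steep"],
    }
    return known.get(goal, [goal])  # unknown goals are attempted as-is

def run(goal: str) -> None:
    for subtask in decompose(goal):
        # Demonstrate the intended action so the user can intervene before it runs.
        answer = input(f"About to attempt: {subtask!r}. Proceed? [y/n] ")
        if answer.strip().lower() != "y":
            print(f"Stopped by user before {subtask!r}.")
            return
        print(f"Executing {subtask!r}...")
    print(f"Goal {goal!r} achieved.")

if __name__ == "__main__":
    run("make tea")
```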
Machine vision started to work in 2012, with the deep-learning breakthrough on ImageNet. Language understanding is now starting to work. Conversational AI is the final piece needed to enable AI as a platform.
Voice UI has enabled hands-free interaction in the home and while travelling. Multimodal interaction (tasting, touching, hearing, seeing) is the natural way to communicate with computers. It enables everyone to achieve what was previously possible only for those fluent in computer languages.
Safe AI relies upon models that do what we intend them to do (specification), operate reliably under unforeseen conditions (robustness), and can be monitored and controlled during operation (assurance).
Together, these three areas map out the space of safety problems (Ortega et al., 2019).
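A loose sketch of how the three areas might map onto a single agent wrapper, assuming a toy agent interface; `SafetyWrapper`, `allowed_actions`, and `reward_fn` are hypothetical names, not from Ortega et al.:

```python
import logging

logging.basicConfig(level=logging.INFO)

class SafetyWrapper:
    def __init__(self, agent, allowed_actions, reward_fn):
        self.agent = agent
        self.allowed_actions = allowed_actions  # specification: what we intend
        self.reward_fn = reward_fn              # specification: intended objective
        self.interrupted = False                # assurance: we retain control

    def step(self, observation):
        if self.interrupted:
            logging.info("Agent interrupted; returning no-op.")
            return "noop"
        action = self.agent.act(observation)
        # Robustness: block out-of-bounds actions rather than fail silently.
        if action not in self.allowed_actions:
            logging.warning("Unsafe action %r blocked.", action)
            return "noop"
        # Assurance: every decision is logged so behaviour can be monitored.
        logging.info("obs=%r action=%r reward=%r",
                     observation, action, self.reward_fn(observation, action))
        return action

    def interrupt(self):
        # Assurance: a human can halt the agent at any time.
        self.interrupted = True
```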