https://arxiv.org/pdf/1909.00596.pdf

The short of it:

The authors propose a new architecture for multiple-choice question answering that uses attention to learn to rank supporting documents.

The long of it:

Current multiple-choice question answering systems follow a two-step process: an information retrieval step, followed by model prediction based on the supporting text retrieved for each candidate answer. The information retrieval engine performs a lexical search over a corpus such as Wikipedia to find related documents. However, retrieval does not always return relevant documents. This is a significant problem, since the model learns only from these supporting documents, and irrelevant documents introduce noise. The authors propose the Attentive Ranker architecture to tackle this problem and show that it improves performance on downstream tasks by up to 7%.
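To make the idea concrete, here is a minimal sketch (not the paper's actual architecture) of attention-based document ranking: a question/answer encoding attends over the retrieved documents, producing a soft relevance weight per document that can re-rank or re-weight the evidence before the answer-prediction step. The class name, dimensions, and bilinear-free dot-product scoring are all illustrative assumptions.

```python
import torch
import torch.nn as nn


class AttentiveDocumentRanker(nn.Module):
    """Illustrative sketch: score retrieved documents by attention
    against a question + candidate-answer encoding."""

    def __init__(self, hidden_dim: int):
        super().__init__()
        # Project the query so the dot-product score is learned, not fixed.
        self.query_proj = nn.Linear(hidden_dim, hidden_dim)

    def forward(self, query_enc: torch.Tensor, doc_encs: torch.Tensor) -> torch.Tensor:
        # query_enc: (batch, hidden)            -- question + candidate answer
        # doc_encs:  (batch, num_docs, hidden)  -- retrieved supporting documents
        q = self.query_proj(query_enc)                              # (batch, hidden)
        scores = torch.bmm(doc_encs, q.unsqueeze(-1)).squeeze(-1)   # (batch, num_docs)
        # Softmax turns raw scores into attention weights over documents,
        # i.e. a soft ranking of how relevant each document looks.
        return torch.softmax(scores, dim=-1)


# Usage: down-weight (or drop) documents the ranker considers irrelevant
# before feeding them to the answer-prediction model.
ranker = AttentiveDocumentRanker(hidden_dim=128)
query = torch.randn(2, 128)       # 2 question/answer pairs
docs = torch.randn(2, 10, 128)    # 10 retrieved documents each
weights = ranker(query, docs)     # (2, 10) relevance distribution per pair
```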