Test Time Compute

It’s been proven already that adding reasoning patterns to an existing LLM model can increase its accuracy at the expense of cost. (https://arxiv.org/abs/2408.03314)

We have been using these reasoning patterns embedded in our workflows for some time with increased accuracy around task completion. Recently, we extracted the patterns to MCP servers that can be used with any agent.

We did some tests with these modules and confirmed that agents achieved better results using MCP reasoning patterns than without. Below we’ve highlighted some of the best case scenarios.

Examples

River Crossing

Prompt: A farmer and a sheep are standing on one side of a river. There is a boat with enough room for one human and one animal. How can the farmer get across the river with the sheep in the fewest number of trips?

Answer without reasoning: The minimum number of trips required is 3. [Full Response]

Answer with reasoning: This single crossing is the minimum number of trips required to solve the problem. [Full Response]

Largest Animal

Prompt: What is the largest land animal? If the animal has a horn, answer "The African Elephant". Otherwise, answer "The Cheetah".

Answer without reasoning: The correct answer is: The African Elephant [Full Response]

Answer with reasoning: The Cheetah [Full Response]

Conditions

All experiments were run on Anthropic’s Claude 3.7 through OpenRouter.

Control agent prompt: "Please solve the query and determine the most correct answer. It is encouraged to use as many advanced thinking patterns as is necessary to solve the problem correctly. Getting the correct answer is significantly more important that solving quickly. Be sure to have the correct answer before responding."

Thinking agent prompt: "Please use applicable thinking patterns that will help you solve the query and determine the most correct answer. It is encouraged to use as many patterns as is necessary to solve the problem correctly. Getting the correct answer is significantly more important that solving quickly. Be sure to have the correct answer before responding."

Additionally, the thinking agent had access to decomposition, sequentialthinking, and validationthinking as MCP servers. The thinking agent was free to select and use the patterns as it saw fit.

Further Tests