New Thinking Method for Agents | Better than reasoning models

AI Applied

In this article I’ll present a new method of thinking designed specifically for agents: why it works better than existing agent setups, what it does internally, and most importantly how to implement it in your own projects.

You’ll need access to this repository if you want to look at some of the examples I created during this work. Prefer video? A video version is available on my YouTube channel here.

Why Reasoning Models are Terrible For Agentic Use

If you’ve used reasoning models before, you know they can be a poor fit for agentic use. From ignoring the provided tools, to leaving out important details, to outright failing to call tools in the right sequence, reasoning models have proven unreliable in agentic applications, especially when it comes to tool use, which is a core part of modern agent systems.

Why

The answer lies in how these models reason: they think once, up front, while an agentic process keeps changing along the way.

The user defines the original problem, but additional information then enters the sequence through tool calls and can change the problem completely. This makes the model’s original thinking stale and causes the resulting output to be subpar.

The new information that comes in might be unbelievably simple. Say a user claims they missed their flight, but the tool call that looks up booked passengers can’t find them. The right response might be something as easy as pointing out that the person probably booked a different flight, but if the reasoning model has already made up its mind to try and reschedule for them, it will call that function regardless.

The Right Way

A better solution is to let the model think through the problem again after tool call information comes in. This would enable it to rethink its plan before proceeding to reschedule a flight that didn’t exist in the first place.

Introducing the Thinking Tool

Extending this idea into something general means giving the model the ability to think as many times as it wants during the generation process.

To do this, among all the different tools we give a model, we include an additional think tool with a schema that looks something like this.

{
    "name": "think",
    "description": """Use this tool to think about something.
    It will not obtain new information or change the database, but just append the thought to the log.
    Use it when complex reasoning or some cache memory is needed.""",
    "input_schema": {
        "type": "object",
        "properties": {
            "thought": {
                "type": "string",
                "description": "A thought to think about."
            }
        },
        "required": ["thought"]
    }
}

As you can see, all it does is tell the model that the tool exists and when to use it. With this tool available, the model can stop to rethink whenever it wants.
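
For context, here is a minimal sketch of how the schema above might be registered alongside an agent’s real tools using the Anthropic Messages API. In this sketch, think_tool holds the schema shown above, other_tools stands in for whatever real tools your agent uses, and the model name is illustrative.

import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# think_tool is the schema shown above; other_tools stands in for the
# agent's real tools (order lookup, refunds, etc.)
response = client.messages.create(
    model="claude-3-7-sonnet-latest",  # illustrative model name
    max_tokens=1024,
    tools=[think_tool, *other_tools],
    messages=[{"role": "user", "content": "I'd like a refund on order #1234."}],
)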

Structure of The Tool

The tool itself isn’t particularly implementation heavy, as you can imagine by now; it only exists as a place for the model to recollect the ongoing process and write out its thoughts. In practice we don’t even need to log the thoughts back to the model, since function inputs remain in message memory.

Here is an example handler for the tool, wrapped in a dispatcher for context (viz and logger are helpers defined elsewhere in the repo):

   # The "think" tool - nothing to process, just acknowledge
elif tool_name == "think":
# We don't need to do anything with the "thought" input
# It's just Claude's notes to itself
thought = tool_input.get("thought", "")
# Use our rich visualizer for displaying the thought
viz.display_think_process(thought)
logger.info("Claude used the 'think' tool")
return None

Like I said, we don’t have to log the output of the function (the thought) back to the model, since the thought is already part of the model’s input once we append the assistant message to the messages array.
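
To make that concrete, here is a rough sketch of the surrounding agent loop, assuming the Anthropic Messages API and the handle_tool_call dispatcher from the snippet above; the exact shape will depend on your own setup.

# Keep calling the model while it wants to use tools
while response.stop_reason == "tool_use":
    # The assistant message already contains the tool_use block, including
    # the "thought" input, so the thought stays in message memory for free
    messages.append({"role": "assistant", "content": response.content})

    tool_results = []
    for block in response.content:
        if block.type == "tool_use":
            result = handle_tool_call(block.name, block.input)  # None for "think"
            # The API expects a tool_result for every tool_use, so for
            # "think" we just send back an empty acknowledgement
            tool_results.append({
                "type": "tool_result",
                "tool_use_id": block.id,
                "content": result or "",
            })

    messages.append({"role": "user", "content": tool_results})
    response = client.messages.create(
        model="claude-3-7-sonnet-latest",
        max_tokens=1024,
        tools=[think_tool, *other_tools],
        messages=messages,
    )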

In Practice

Models tend to use this tool to make forward-looking plans. Typically, Claude 3.7 will use it as a chance to:

  1. Outline the progress until now
  2. Outline a way forward
  3. Justify the way forward (and address any apparent mistakes)
  4. Outline the next steps

As you can imagine this single step can be crucial in saving a derailed agentic execution process.

In the following example (you can access this file yourself through the GitHub repository linked at the top of the post), the model is given access to a set of customer order data. The model is then asked to attend to a customer who wants a return on their order.

The user asking Claude for a refund

Following this, Claude retrieves the user’s order from the database and checks their customer status. But a massive point of confusion arises when the model realizes that the buyer can’t get a refund because the refund period has expired. The situation is made even more confusing by the company policy stating that “certain premium customers can get special discounts”, and this user is a premium customer. To think over this situation, the model uses our tool to stop, recollect, and think.
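
A thought at this point might look something like the following. This is an illustrative sketch of the pattern, not verbatim output from the repo.

Progress so far:
- Retrieved the order and confirmed it belongs to the customer
- The refund window on the order has expired
- Customer status check shows they are a premium member
Conflict:
- Policy says certain premium customers can get special discounts,
  but says nothing explicit about extending the refund window
Plan:
- Don't assume premium status overrides the refund deadline
- Offer what the policy clearly allows (e.g. a discount) instead
- Confirm with the customer before taking any action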

As you can see, the model uses this tool to outline the entire process and plan a way forward, which is exactly what we designed the tool for.

Guidelines

Importance of Instructions

The secret sauce for the think tool is the system instruction. In their examples, Anthropic outlines numerous ways to prompt the model to work correctly with the think tool. I’ve presented one of their examples below just to make it clear exactly how much prompting is required when working with this tool.

## Using the think tool

Before taking any action or responding to the user after receiving tool results, use the think tool as a scratchpad to:
- List the specific rules that apply to the current request
- Check if all required information is collected
- Verify that the planned action complies with all policies
- Iterate over tool results for correctness

Here are some examples of what to iterate over inside the think tool:
<think_tool_example_1>
User wants to cancel flight ABC123
- Need to verify: user ID, reservation ID, reason
- Check cancellation rules:
* Is it within 24h of booking?
* If not, check ticket class and insurance
- Verify no segments flown or are in the past
- Plan: collect missing info, verify rules, get confirmation
</think_tool_example_1>

<think_tool_example_2>
User wants to book 3 tickets to NYC with 2 checked bags each
- Need user ID to check:
* Membership tier for baggage allowance
* Which payment methods exist in profile
- Baggage calculation:
* Economy class × 3 passengers
* If regular member: 1 free bag each → 3 extra bags = $150
* If silver member: 2 free bags each → 0 extra bags = $0
* If gold member: 3 free bags each → 0 extra bags = $0
- Payment rules to verify:
* Max 1 travel certificate, 1 credit card, 3 gift cards
* All payment methods must be in profile
* Travel certificate remainder goes to waste
- Plan:
1. Get user ID
2. Verify membership level for bag fees
3. Check which payment methods in profile and if their combination is allowed
4. Calculate total: ticket price + any bag fees
5. Get explicit confirmation for booking
</think_tool_example_2>

As you can see, it’s necessary to include some pretty detailed prompting to get the most out of this tool.

While some level of guidance can be included in the function schema, for this specific tool it’s essential to include detailed instructions in the system prompt, letting the model know when to use the tool and including examples.
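
Concretely, that just means passing the guidance through the system prompt. A minimal sketch with the Anthropic SDK, where THINK_TOOL_GUIDELINES would hold the text shown earlier:

THINK_TOOL_GUIDELINES = """## Using the think tool
... the guidance and examples shown above ...
"""

response = client.messages.create(
    model="claude-3-7-sonnet-latest",  # illustrative model name
    max_tokens=1024,
    system=THINK_TOOL_GUIDELINES,      # when-to-think guidance lives here
    tools=[think_tool, *other_tools],
    messages=messages,
)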

Because this adds an extra step to the model’s generation process, there are times when the cost of running it isn’t worth the benefit, especially at scale.

When to Add The “think” Tool

The most reasonable times to add the think tool are in cases that fall into any of the following categories.

  1. Agents need to navigate complex guidelines for tool use (like when the agent can only do certain things in very intricate situations)
  2. Agents need to call tools in very specific (sequential) orders
  3. Functions return complex data that needs to be carefully analysed and used to make important decisions before proceeding

In contrast, it’s a good idea to avoid adding the “think” tool in cases where agents don’t have to follow intricate guidelines when making tool calls, or where the tools are very simple in terms of both input and output.

In these cases the agent will very often not use the thinking tool at all, or will use it when there isn’t much reason to.

Further Work

Over the next couple of days I’ll continue to explore the functionality of new reasoning methods like this. If you can, I’d highly appreciate a follow, and let me know in the comments if there’s anything you want to point out. Thanks for reading, and I’ll catch you in the next one.
