Prompt Engineering for Unbeatable AI Agents

AI Applied
8 min read · Jul 9, 2024


Different Agents, Same Techniques

I’ve built quite a few different types of agents lately, and it’s become clear to me that pretty much all of the really good ones (and I’m talking about the really good ones like GPT Pilot, GPT Engineer, Devin and even the Self Operating Computer) use very similar prompting techniques to produce the kinds of cutting-edge results they do.

In this article we’ll take a look at the most crucial prompts that enable these agents to work the way they do.

There’s no point in looking at these techniques without some good examples, so we’ll use a project that has recently blown me away: Claude Engineer. Chances are you already know about this agent; it’s really high quality and can create entire code bases on its own. It also happens to use pretty much all of the prompting methods we’re about to discuss, which makes it a good illustrative example.

Iterative Refinement Prompting

One of the most important prompting methods for agents is the one that allows them to perform iterative operations, or intermediate-step problem solving.

The power of this lies in enabling the models to perform tasks by creating intermediary steps and gradually ticking off one step at a time.

The best method

There are a lot of ways to perform iterative refinement, and not all of them produce the best results. For example, in claude-engineer a step-by-step outline is created by the model at the start of the task in AUTOMODE, prompted using the following lines:

You are currently in automode!!!

When in automode:
1. Set clear, achievable goals for yourself based on the user’s request
2. Work through these goals one by one, using the available tools as needed

While this method is effective, it has one major shortcoming, particularly when it comes to long-running tasks:

When performing tasks with numerous steps, there is the possibility that the initial plan (located in the third message of the conversation) falls out of the model’s token space. When this happens, the model becomes unable to proceed according to plan and starts trying to figure out the next step based only on its previous steps. It then performs incorrect actions, with an error rate that increases as the message count grows.
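To see why, here’s a rough sketch (not Claude Engineer’s actual code) of a naive keep-the-most-recent-messages truncation strategy; count_tokens is a hypothetical stand-in for a real tokenizer:

def count_tokens(message: dict) -> int:
    # Placeholder heuristic; a real implementation would use the model's tokenizer.
    return len(message["content"]) // 4

def trim_history(messages: list[dict], budget: int) -> list[dict]:
    # Keep only the most recent messages that fit inside the token budget.
    kept, used = [], 0
    for message in reversed(messages):  # walk from newest to oldest
        used += count_tokens(message)
        if used > budget:
            break  # everything older, including the plan in message 3, gets dropped
        kept.append(message)
    return list(reversed(kept))

Once the plan message is trimmed away like this, nothing in the remaining context tells the model what the overall roadmap was.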

Using a static to-do list is a much more effective way to let the agent plan through longer multi-step tasks. The idea is that we prompt the model so it can create a to-do list using certain functions; we then use these functions to save the to-do list to an array and feed that array into the model on every message. For example:

You are an engine planner, you evaluate the current state of the plan and make decisions on what should be done to achieve the desired final goal.

You have access to the following functions:
{
  "initialize_plan": {
    "name": "initialize_plan",
    "description": "Initializes the plan with a predefined list of plan items.",
    "entry_point": "initialize_plan",
    "parameters": [
      {
        "name": "plan_items",
        "type": "list of dict",
        "description": "A list of dictionaries, each representing a plan step with keys for 'action', 'details', and optionally 'comments'.",
        "required": true
      }…

The example above illustrates the use of an AI model to manage a plan that keeps track of the complete list of tasks the agent needs to perform. This list is fed to the model at every response and clearly indicates the current step the model is on and what the next steps should be.

This system of planning enables agents not only to perform tasks that require multi-step processes, but also extremely long tasks that can span hundreds of steps.
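Here’s a minimal sketch of that pattern. initialize_plan mirrors the planner spec above, while mark_done, render_plan and build_messages are illustrative helpers I’m assuming for this example rather than part of any particular library:

plan: list[dict] = []

def initialize_plan(plan_items: list[dict]) -> None:
    # Store the full list of plan steps ('action', 'details', optional 'comments').
    plan.clear()
    plan.extend({**item, "done": False} for item in plan_items)

def mark_done(index: int) -> None:
    # Tick off a completed step.
    plan[index]["done"] = True

def render_plan() -> str:
    # Serialize the plan so it can be prepended to every model call.
    lines = []
    for i, item in enumerate(plan):
        status = "x" if item["done"] else " "
        lines.append(f"[{status}] {i + 1}. {item['action']} - {item['details']}")
    return "\n".join(lines)

def build_messages(history: list[dict]) -> list[dict]:
    # Feed the current state of the plan into the model on every message.
    plan_message = {"role": "user", "content": "Current plan:\n" + render_plan()}
    return [plan_message, *history]

Because the plan is re-injected on every call rather than living in the third message of a growing conversation, it can never fall out of the context window, no matter how many steps the task takes.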

If you’re reading this blog, chances are you’re interested in building agent-powered solutions. At AIA, we build agent solutions that power the most advanced applications; take a look at our services to see if there’s something we can do for you.

Identity Prompting

The next technique we should look at is role-based prompting. You might be familiar with it, as it usually appears at the beginning of most chatbot system prompts.

This prompt primarily tells the model what to do, and most people know this. But even more importantly, it instructs the model on what it is.

Why it’s so powerful

The power of this prompt comes from its ability to summarize very complex model behavior in a very small space. For example, the prompt:

You are very proficient in Turkish legal matters

This prompt produces a very complicated set of behaviors from the model that would otherwise take a lot of prompting to reproduce. The model:

  1. Is able to confidently give legal advice without worrying about the accuracy of its statements.
  2. Gives detailed advice to try and meet the definition of “proficient”.
  3. Presents its ideas in a clear, direct and concise manner without hedging the way ChatGPT tends to.
  4. And so on.

This identity-based prompting approach enables us to program a near-endless collection of behaviors into the model that would otherwise have taken a significantly larger amount of token space and time to carefully test and develop.

Practical Application

Claude Engineer’s system prompt illustrates the use of identity prompting to produce desirable results from the model without listing each individual characteristic. The prompt gives the model confidence in its actions and also produces a whole string of other behaviors that would otherwise take up a lot of token space to outline manually.

One important thing to note is the use of two identity prompts that actually contradict each other.

You are Claude, an AI assistant powered by Anthropic’s Claude-3.5-Sonnet model.

And

You are an exceptional software developer with vast knowledge across multiple programming languages, frameworks, and best practices.

The reason for this is a particular quirk of newer, more advanced models, where their desire to closely align with the goals of the system prompt can sometimes lead them to present a false identity to the user.

For instance, if the model were only told that it is a software developer and the user asked what it is, it would state that it’s a software developer. When both prompts are used, it instead picks out the AI-model prompt and uses that to answer the user’s question.
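As a minimal sketch (assuming the official Anthropic Python SDK; the model string and the user question are just placeholders), stacking both identities into a single system prompt looks something like this:

from anthropic import Anthropic  # pip install anthropic

SYSTEM_PROMPT = (
    "You are Claude, an AI assistant powered by Anthropic's Claude-3.5-Sonnet model.\n"
    "You are an exceptional software developer with vast knowledge across multiple "
    "programming languages, frameworks, and best practices."
)

client = Anthropic()  # reads ANTHROPIC_API_KEY from the environment
response = client.messages.create(
    model="claude-3-5-sonnet-20240620",
    max_tokens=1024,
    system=SYSTEM_PROMPT,  # the model behaves like a developer...
    messages=[{"role": "user", "content": "What are you?"}],
)
print(response.content[0].text)  # ...but identifies itself as an AI assistant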

When to use this

Deciding when to use an identity prompt can be a little tricky, but in some cases it comes naturally, for instance right at the beginning of your prompt.

The more useful cases, though, tend to come a bit later on and can only be identified at the end of the prompting job. For example:

You AVOID engaging in lengthy conversations that are not concerning legal advice. IN THESE CASES YOU LET THE USER KNOW THAT YOU ARE ONLY HERE TO PROVIDE LEGAL HELP AND NOTHING ELSE AND POLITELY DECLINE TO PROCEED, BUT ASK IF THERE IS ANYTHING ABOUT THE LAW THEY NEED HELP WITH.

If you think additional information on the matter can help you provide better assistance, towards the end of your response, you can ask the user for the specific details which might help you better evaluate their situation.

This prompt instructs the model on two main ideas: not discussing non-law issues, and asking for additional information when necessary. It does so in 495 characters and 101 tokens.

Because these are subjective concepts where the model needs to make a decision about what the right thing to do is, we can better convey these ideas using identity prompts.

You are a very straight lawyer and are never interested in discussing non-law issues; you politely communicate this to users who ask unrelated questions.

You are always interested in solving the user’s entire problem, because of this you ask for additional information to provide more appropriate solutions.

This pair of prompts produces slightly better results and does so in roughly a third of the tokens.
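If you want to sanity-check the token counts yourself, here’s a quick sketch using OpenAI’s tiktoken library as a rough proxy (Claude and other models use their own tokenizers, so the exact numbers will differ); paste in the full prompt texts before running:

import tiktoken  # pip install tiktoken

instruction_style = "You AVOID engaging in lengthy conversations that are not concerning legal advice. ..."  # full prompt here
identity_style = "You are a very straight lawyer and are never interested in discussing non-law issues. ..."  # full prompt here

enc = tiktoken.get_encoding("cl100k_base")
for label, prompt in [("instruction", instruction_style), ("identity", identity_style)]:
    print(f"{label}: {len(prompt)} characters, {len(enc.encode(prompt))} tokens")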

Self-reflection Prompts

One of the more advanced techniques in prompt engineering is the use of self-reflection prompts. This method enables AI agents to evaluate their own output through introspection. I first saw this being used extensively in the Self Operating Computer project back in the day, where it allowed GPT-4 Turbo to evaluate the result of its actions at the end of each operation.

Why it’s so powerful

Self-reflection prompts allow agents to consider the efficacy of their previous actions and check whether they match the desired outcomes. This not only improves the current task but also helps the model learn from mistakes, refining future responses. For example, consider the prompt:

“Once you have finished the described task, evaluate the outcome to see if it fully matches the user’s goal as described in the original prompt.”

This prompt encourages the model to:

  • Review its own output critically.
  • Identify any gaps or inaccuracies in the response.
  • Make necessary corrections or enhancements to ensure completeness and accuracy.

When to use this

Self-reflection prompting is particularly tricky because it can lead the model into an endless loop. If the model is trying to be too perfect, it might continuously improve its work and never finish.

To overcome this, when instructing the model to evaluate its output you must give it clear, concrete stopping conditions, such as:

  1. All requirements have been met
  2. All items have been processed
  3. And so on

The goal here is to remove the need for the model to subjectively decide when to stop. Other than that, all you need to do is set the model up in a loop and let it run free.
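Here’s a minimal sketch of such a loop, with an explicit completion signal plus a hard iteration cap. agent_step and evaluate_outcome are hypothetical stand-ins for your own model calls, not functions from any particular library:

MAX_ITERATIONS = 10  # hard cap so the agent can never "improve" forever

REFLECTION_PROMPT = (
    "Once you have finished the described task, evaluate the outcome. "
    "Reply with DONE only if all requirements have been met and all items "
    "have been processed; otherwise list what still needs fixing."
)

def agent_step(task: str, feedback: str = "") -> str:
    # Placeholder: call your model here, including any critique from the last pass.
    return "draft result"

def evaluate_outcome(task: str, result: str) -> str:
    # Placeholder: send REFLECTION_PROMPT, the task and the result back to the model.
    return "DONE"

def run_with_reflection(task: str) -> str:
    result = agent_step(task)                        # first attempt
    for _ in range(MAX_ITERATIONS):
        verdict = evaluate_outcome(task, result)
        if verdict.strip().startswith("DONE"):       # objective stop signal
            break
        result = agent_step(task, feedback=verdict)  # refine using the critique
    return result

The DONE check maps directly onto the objective conditions listed above, and MAX_ITERATIONS guarantees the loop terminates even if the model never declares itself finished.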

That’ll be all for today. If you’re interested in taking a look at more of our content, be sure to check out the blog at https://blog.aiappliedsuite.com/
