Leverage OpenAI Tool calling: Building a reliable AI Agent from Scratch | by Lukasz Kowejsza | Mar, 2024


Created with DALL·E

Step-by-step workflow for developing and refining an AI agent while dealing with errors

When we think about the future of AI, we envision intuitive everyday helpers seamlessly integrating into our workflows and taking on complex, routine tasks. We have all found touchpoints that relieve us from the tedium of mental routine work. Yet the main tasks currently tackled involve text creation, correction, and brainstorming, underlined by the significant role RAG (Retrieval-Augmented Generation) pipelines play in ongoing development. We aim to provide Large Language Models with better context to generate more valuable content.

Thinking about the future of AI conjures images of Jarvis from Iron Man or Rasputin from Destiny (the game) for me. In both examples, the AI acts as a voice-controlled interface to a complex system, offering high-level abstractions. For instance, Tony Stark uses it to manage his research, conduct calculations, and run simulations. Even R2D2 can respond to voice commands to interface with unfamiliar computer systems and extract data or interact with building systems.

In these scenarios, AI enables interaction with complex systems without requiring the end user to have a deep understanding of them. This could be likened to an ERP system in a large corporation today. It’s rare to find someone in a large corporation who fully knows and understands every facet of the in-house ERP system. It’s not far-fetched to imagine that, in the near future, AI might assist nearly every interaction with an ERP system. From the end user managing customer data or logging orders to the software developer fixing bugs or implementing new features, these interactions could soon be facilitated by AI assistants familiar with all aspects and processes of the ERP system. Such an AI assistant would know which database to enter customer data into and which processes and code might be relevant to a bug.

To achieve this, several challenges and innovations lie ahead. We need to rethink processes and their documentation. Today’s ERP processes are designed for human use, with specific roles for different users, documentation for humans, input masks for humans, and user interactions designed to be intuitive and error-free. The design of these aspects will look different for AI interactions. We need specific roles for AI interactions and different process designs to enable intuitive and error-free AI interaction. This is already evident in our work with prompts. What we consider to be a clear task often turns out not to be so straightforward.

However, let’s first take a step back to the concept of agents. Agents, or AI assistants that can perform tasks using the tools provided and make decisions on how to use these tools, are the building blocks that could eventually enable such a system. They are the process components we’d want to integrate into every facet of a complex system. But as highlighted in a previous article, they are challenging to deploy reliably. In this article, I will demonstrate how we can design and optimize an agent capable of reliably interacting with a database.

While the grand vision of AI’s future is inspiring, it’s crucial to take practical steps towards realizing this vision. To demonstrate how we can start building the foundation for such advanced AI systems, let’s focus on creating a prototype agent for a common task: expense tracking. This prototype will serve as a tangible example of how AI can assist in managing financial transactions efficiently, showcasing the potential of AI in automating routine tasks and highlighting the challenges and considerations involved in designing an AI system that interacts seamlessly with databases. By starting with a specific and relatable use case, we can gain valuable insights that will inform the development of more complex AI agents in the future.

This article will lay the groundwork for a series of articles aimed at developing a chatbot that can serve as a single point of interaction for a small business to support and execute business processes, or a chatbot that organizes everything you need to keep track of in your personal life. From data, routines, and files to pictures, we want to simply chat with our assistant, allowing it to figure out where to store and retrieve our data.

Transitioning from the grand vision of AI’s future to practical applications, let’s zoom in on creating a prototype agent. This agent will serve as a foundational step towards realizing the ambitious goals discussed earlier. We will embark on developing an “Expense Tracking” agent, a straightforward yet essential task, demonstrating how AI can assist in managing financial transactions efficiently.

This “Expense Tracking” prototype will not only showcase the potential of AI in automating routine tasks but also illuminate the challenges and considerations involved in designing an AI system that interacts seamlessly with databases. By focusing on this example, we can explore the intricacies of agent design, input validation, and the integration of AI with existing systems, laying a solid foundation for more complex applications in the future.

To bring our prototype agent to life and identify potential bottlenecks, we’re venturing into testing the tool call functionality of OpenAI. Starting with a basic example of expense tracking, we’re laying down a foundational piece that mimics a real-world application. This stage involves creating a base model and transforming it into the OpenAI tool schema using the langchain library’s convert_to_openai_tool function. Additionally, crafting a report_tool enables our future agent to communicate results or highlight missing information or issues:

from pydantic.v1 import BaseModel, validator
from datetime import datetime
from langchain_core.utils.function_calling import convert_to_openai_tool

# Input schema for the expense tool
class Expense(BaseModel):
    description: str
    net_amount: float
    gross_amount: float
    tax_rate: float
    date: datetime

# Input schema for the report tool (named ReportTool, as it is referenced later)
class ReportTool(BaseModel):
    report: str

add_expense_tool = convert_to_openai_tool(Expense)
report_tool = convert_to_openai_tool(ReportTool)

With the data model and tools set up, the next step is to use the OpenAI client SDK to initiate a simple tool call. In this initial test, we intentionally provide the model with insufficient information to see if it can correctly indicate what’s missing. This approach tests not only the functional capability of the agent but also its interactive and error-handling capacities.

from openai import OpenAI
from langchain_core.utils.function_calling import convert_to_openai_tool

SYSTEM_MESSAGE = """You are tasked with completing specific objectives and
must report the outcomes. At your disposal, you have a variety of tools,
each specialized in performing a distinct type of task.

For successful task completion:
Thought: Consider the task at hand and determine which tool is best suited
based on its capabilities and the nature of the work.

Use the report_tool with an instruction detailing the results of your work.
If you encounter an issue and cannot complete the task:

Use the report_tool to communicate the challenge or reason for the
task's incompletion.
You will receive feedback based on the outcomes of
each tool's task execution or explanations for any tasks that
couldn't be completed. This feedback loop is crucial for addressing
and resolving any issues by strategically deploying the available tools.
"""
user_message = "I have spent 5$ on a coffee today please track my expense. The tax rate is 0.2."

client = OpenAI()
model_name = "gpt-3.5-turbo-0125"

messages = [
    {"role": "system", "content": SYSTEM_MESSAGE},
    {"role": "user", "content": user_message}
]

response = client.chat.completions.create(
    model=model_name,
    messages=messages,
    tools=[
        convert_to_openai_tool(Expense),
        convert_to_openai_tool(ReportTool)]
)

Next, we’ll need a new function to read the arguments of the function call from the response:

import json

def parse_function_args(response):
    # The arguments of the first tool call arrive as a JSON string
    message = response.choices[0].message
    return json.loads(message.tool_calls[0].function.arguments)

print(parse_function_args(response))

{'description': 'Coffee',
 'net_amount': 5,
 'gross_amount': None,
 'tax_rate': 0.2,
 'date': '2023-10-06T12:00:00Z'}

As we can observe, we have encountered several issues in the execution:

  1. The gross_amount is not calculated.
  2. The date is hallucinated.

With that in mind, let’s try to resolve these issues and optimize our agent workflow.

To optimize the agent workflow, I find it crucial to prioritize workflow over prompt engineering. While it might be tempting to fine-tune the prompt so that the agent learns to use the provided tools perfectly and makes no mistakes, it’s more advisable to first adjust the tools and processes. When a typical error occurs, the first consideration should be how to fix it in code.
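As a minimal sketch of what fixing an error in code can look like, the uncalculated gross_amount from the first test could be derived in plain Python instead of asking the model to do the arithmetic. The helper name complete_expense_args is hypothetical and not part of the original code; it assumes the stated amount is the net amount and that tax_rate is a fraction such as 0.2.

```python
def complete_expense_args(args: dict) -> dict:
    """Return a copy of args with gross_amount derived when it is absent.

    Hypothetical helper: assumes net_amount is the pre-tax amount and
    tax_rate is a fraction (0.2 means 20 %).
    """
    completed = dict(args)
    has_inputs = (
        completed.get("net_amount") is not None
        and completed.get("tax_rate") is not None
    )
    if completed.get("gross_amount") is None and has_inputs:
        gross = completed["net_amount"] * (1 + completed["tax_rate"])
        completed["gross_amount"] = round(gross, 2)
    return completed


model_args = {"description": "Coffee", "net_amount": 5, "tax_rate": 0.2}
print(complete_expense_args(model_args)["gross_amount"])
```

A value the model did supply is left untouched, so the helper only fills the gap and never overrides the model's input.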

Handling missing information effectively is an essential topic for creating robust and reliable agents. In the previous example, providing the agent with a tool like “get_current_date” is a workaround for specific scenarios. However, we must assume that missing information will occur in various contexts, and we cannot rely solely on prompt engineering and adding more tools to prevent the model from hallucinating missing information.

A simple workaround for this scenario is to modify the tool schema to treat all parameters as optional. This approach ensures that the agent only submits the parameters it knows, preventing unnecessary hallucination.

Therefore, let’s take a look at the OpenAI tool schema:

add_expense_tool = convert_to_openai_tool(Expense)
print(add_expense_tool)

{'type': 'function',
 'function': {'name': 'Expense',
  'description': '',
  'parameters': {'type': 'object',
   'properties': {'description': {'type': 'string'},
    'net_amount': {'type': 'number'},
    'gross_amount': {'type': 'number'},
    'tax_rate': {'type': 'number'},
    'date': {'type': 'string', 'format': 'date-time'}},
   'required': ['description',
    'net_amount',
    'gross_amount',
    'tax_rate',
    'date']}}}

As we can see, the schema contains a special key, required, which we need to remove. Here’s how you can adjust the add_expense_tool schema to make parameters optional by removing the required key:

del add_expense_tool["function"]["parameters"]["required"]
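The effect of this deletion can be checked on a plain dictionary, without calling langchain. The schema below is copied by hand from the printed output above and trimmed to two fields; the helper name make_all_optional is an assumption for illustration only.

```python
import copy

# Hand-written stand-in for the output of convert_to_openai_tool(Expense),
# shortened to two parameters.
expense_schema = {
    "type": "function",
    "function": {
        "name": "Expense",
        "description": "",
        "parameters": {
            "type": "object",
            "properties": {
                "description": {"type": "string"},
                "net_amount": {"type": "number"},
            },
            "required": ["description", "net_amount"],
        },
    },
}


def make_all_optional(schema: dict) -> dict:
    """Return a deep copy of the tool schema without the 'required' list."""
    optional = copy.deepcopy(schema)
    # pop() with a default does not fail if the key is already absent.
    optional["function"]["parameters"].pop("required", None)
    return optional


optional_schema = make_all_optional(expense_schema)
print("required" in optional_schema["function"]["parameters"])  # False
print("required" in expense_schema["function"]["parameters"])   # True
```

Working on a copy keeps the original list of required fields available, which is useful if the same list is later used to validate the arguments the model returns.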

Next, we can design a Tool class that first checks the input parameters for missing values. We create the Tool class with two methods, .run() and .validate_input(), and a property openai_tool_schema, where we manipulate the tool schema by removing the required parameters. Additionally, we define the ToolResult BaseModel with the fields content and success to serve as the output object for each tool run.

# Same pydantic.v1 namespace as Expense, so Type[BaseModel] accepts it
from pydantic.v1 import BaseModel
from typing import Type, Callable, Dict, Any, List

class ToolResult(BaseModel):
    content: str
    success: bool

class Tool(BaseModel):
    name: str
    model: Type[BaseModel]
    function: Callable
    validate_missing: bool = False

    class Config:
        arbitrary_types_allowed = True

    def run(self, **kwargs) -> ToolResult:
        # Optionally report missing parameters instead of executing the tool
        if self.validate_missing:
            missing_values = self.validate_input(**kwargs)
            if missing_values:
                content = f"Missing values: {', '.join(missing_values)}"
                return ToolResult(content=content, success=False)
        result = self.function(**kwargs)
        return ToolResult(content=str(result), success=True)

    def validate_input(self, **kwargs) -> List[str]:
        # Collect every model field that was not passed as an argument
        missing_values = []
        for key in self.model.__fields__.keys():
            if key not in kwargs:
                missing_values.append(key)
        return missing_values

    @property
    def openai_tool_schema(self) -> Dict[str, Any]:
        # Drop "required" so the model may omit parameters it does not know
        schema = convert_to_openai_tool(self.model)
        if "required" in schema["function"]["parameters"]:
            del schema["function"]["parameters"]["required"]
        return schema

The Tool class is a crucial component in the AI agent’s workflow, serving as a blueprint for creating and managing the various tools that the agent can use to perform specific tasks. It is designed to handle input validation, execute the tool’s function, and return the result in a standardized format.

The Tool class key components:

  1. name: The name of the tool.
  2. model: The Pydantic BaseModel that defines the input schema for the tool.
  3. function: The callable function that the tool executes.
  4. validate_missing: A boolean flag indicating whether to validate missing input values (default is False).

The Tool class two main methods:

  1. run(self, **kwargs) -> ToolResult: This method is responsible for executing the tool’s function with the provided input arguments. It first checks if validate_missing is set to True. If so, it calls the validate_input() method to check for missing input values. If any missing values are found, it returns a ToolResult object with an error message and success set to False. If all required input values are present, it proceeds to execute the tool’s function with the provided arguments and returns a ToolResult object with the result and success set to True.
  2. validate_input(self, **kwargs) -> List[str]: This method compares the input arguments passed to the tool with the expected input schema defined in the model. It iterates over the fields defined in the model and checks if each field is present in the input arguments. If any field is missing, it appends the field name to a list of missing values. Finally, it returns the list of missing values.

The Tool class also has a property called openai_tool_schema, which returns the OpenAI tool schema for the tool. It uses the convert_to_openai_tool() function to convert the model to the OpenAI tool schema format. Additionally, it removes the "required" key from the schema, making all input parameters optional. This allows the agent to provide only the available information without the need to hallucinate missing values.

By encapsulating the tool’s functionality, input validation, and schema generation, the Tool class provides a clean and reusable interface for creating and managing tools in the AI agent’s workflow. It abstracts away the complexities of handling missing values and ensures that the agent can gracefully handle incomplete information while executing the appropriate tools based on the available input.
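One case the key-presence check does not cover is a model that sends a parameter explicitly as null, as happened with gross_amount in the first test. A possible hardening, sketched here as a standalone function rather than as part of the Tool class, treats None like an absent key. The name find_missing and the hard-coded field list are assumptions for illustration.

```python
from typing import Any, Dict, Iterable, List


def find_missing(expected_fields: Iterable[str], kwargs: Dict[str, Any]) -> List[str]:
    """List expected fields that are absent from kwargs or set to None."""
    return [name for name in expected_fields if kwargs.get(name) is None]


EXPENSE_FIELDS = ["description", "net_amount", "gross_amount", "tax_rate", "date"]

args_from_model = {
    "description": "Coffee",
    "net_amount": 5,
    "gross_amount": None,  # explicit null sent by the model
    "tax_rate": 0.2,
}
print(find_missing(EXPENSE_FIELDS, args_from_model))  # ['gross_amount', 'date']
```

Because the comparison is against None rather than truthiness, legitimate values such as 0 or an empty string are not reported as missing.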

Next, we’ll extend our OpenAI API call. We want the client to use our tool and our response object to directly trigger a tool.run(). For this, we need to initialize our tools with our newly created Tool class. We define two dummy functions which return a success message string.

def add_expense_func(**kwargs):
    return f"Added expense: {kwargs} to the database."

add_expense_tool = Tool(
    name="add_expense_tool",
    model=Expense,
    function=add_expense_func,
    validate_missing=True  # needed so that missing values are reported
)

def report_func(report: str = None):
    return f"Reported: {report}"

report_tool = Tool(
    name="report_tool",
    model=ReportTool,
    function=report_func
)

tools = [add_expense_tool, report_tool]

Next, we define our helper functions, each of which takes the client response as input and helps to interact with our tools.

def get_tool_from_response(response, tools=tools):
    # Look up the Tool whose name matches the first tool call
    tool_name = response.choices[0].message.tool_calls[0].function.name
    for t in tools:
        if t.name == tool_name:
            return t
    raise ValueError(f"Tool {tool_name} not found in tools list.")

def parse_function_args(response):
    message = response.choices[0].message
    return json.loads(message.tool_calls[0].function.arguments)

def run_tool_from_response(response, tools=tools):
    tool = get_tool_from_response(response, tools)
    tool_kwargs = parse_function_args(response)
    return tool.run(**tool_kwargs)
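These helpers can be tried without an API key by feeding them an object that has the same attribute path as a chat completion response. The sketch below builds such a stand-in with types.SimpleNamespace; fake_response, first_tool_call, and the registry dictionary are hypothetical and only mirror the fields the helpers read (choices[0].message.tool_calls[0].function).

```python
import json
from types import SimpleNamespace


def fake_response(tool_name: str, arguments: dict) -> SimpleNamespace:
    """Build a stand-in with the attribute path the helpers read."""
    function = SimpleNamespace(name=tool_name, arguments=json.dumps(arguments))
    tool_call = SimpleNamespace(id="call_0", function=function)
    message = SimpleNamespace(tool_calls=[tool_call])
    return SimpleNamespace(choices=[SimpleNamespace(message=message)])


def first_tool_call(response):
    """Return (name, parsed arguments) of the first tool call."""
    call = response.choices[0].message.tool_calls[0]
    return call.function.name, json.loads(call.function.arguments)


# Plain functions stand in for Tool instances in this sketch.
registry = {"report_tool": lambda report=None: f"Reported: {report}"}

name, kwargs = first_tool_call(fake_response("report_tool", {"report": "done"}))
print(registry[name](**kwargs))  # Reported: done
```

A stand-in like this makes it possible to unit-test the dispatch logic separately from the model's behavior.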

Now, we can execute our client with our new tools and use the run_tool_from_response function.

response = client.chat.completions.create(
    model=model_name,
    messages=messages,
    tools=[tool.openai_tool_schema for tool in tools]
)

tool_result = run_tool_from_response(response, tools=tools)
print(tool_result)

content='Missing values: gross_amount, date' success=False

Perfect, we now see our tool indicating that missing values are present. Thanks to our trick of sending all parameters as optional, we now avoid hallucinated parameters.

Our process, as it stands, doesn’t yet represent a true agent. So far, we’ve only executed a single API tool call. To transform this into an agent workflow, we need to introduce an iterative process that feeds the results of tool execution back to the client. The basic process should look like this:

Image by author
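The iterative process described above can be sketched without any API access by replacing the model with a scripted list of tool calls. Everything in this sketch (scripted_calls, the dummy tools, the tuple-based history) is a hypothetical stand-in; it only illustrates how each tool result is recorded before the next step is taken.

```python
from typing import Callable, Dict, List, Tuple

REQUIRED = ["description", "net_amount", "gross_amount"]


def add_expense(**kwargs) -> Tuple[bool, str]:
    """Dummy tool: fails with feedback while required values are missing."""
    missing = [key for key in REQUIRED if key not in kwargs]
    if missing:
        return False, f"Missing values: {', '.join(missing)}"
    return True, f"Added expense: {kwargs}"


def report(**kwargs) -> Tuple[bool, str]:
    """Dummy tool: reporting ends the run."""
    return True, f"Reported: {kwargs.get('report')}"


TOOLS: Dict[str, Callable[..., Tuple[bool, str]]] = {
    "add_expense_tool": add_expense,
    "report_tool": report,
}

# Stand-in for the model: one planned tool call per step.
scripted_calls = [
    ("add_expense_tool", {"description": "Coffee", "net_amount": 5}),
    ("add_expense_tool", {"description": "Coffee", "net_amount": 5, "gross_amount": 6}),
    ("report_tool", {"report": "Expense tracked."}),
]


def run_agent(max_steps: int = 5) -> List[Tuple[str, bool, str]]:
    history = []
    for name, kwargs in scripted_calls[:max_steps]:
        success, content = TOOLS[name](**kwargs)
        # Record the result; a real agent would send it back to the model.
        history.append((name, success, content))
        if name == "report_tool":
            break
    return history


for step in run_agent():
    print(step)
```

In a real agent the second call is not scripted: the model produces it after reading the "Missing values" feedback from the first step.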

Let’s get started by creating a new OpenAIAgent class:

from colorama import Fore, Style

class StepResult(BaseModel):
    event: str
    content: str
    success: bool

class OpenAIAgent:

    def __init__(
            self,
            tools: list[Tool],
            client: OpenAI,
            system_message: str = SYSTEM_MESSAGE,
            model_name: str = "gpt-3.5-turbo-0125",
            max_steps: int = 5,
            verbose: bool = True
    ):
        self.tools = tools
        self.client = client
        self.model_name = model_name
        self.system_message = system_message
        self.step_history = []
        self.max_steps = max_steps
        self.verbose = verbose

    def to_console(self, tag: str, message: str, color: str = "green"):
        if self.verbose:
            # Look up the colorama color code by name, e.g. Fore.GREEN
            color_prefix = getattr(Fore, color.upper())
            print(color_prefix + f"{tag}: {message}{Style.RESET_ALL}")

Like our ToolResult object, we’ve defined a StepResult as an object for each agent step. We then defined the __init__ method of the OpenAIAgent class and a to_console() method to print our intermediate steps and tool calls to the console, using colorama for colorful printouts. Next, we define the heart of the agent, the run() and run_step() methods.

class OpenAIAgent:

    # ... __init__...

    # ... to_console ...

    def run(self, user_input: str):

        openai_tools = [tool.openai_tool_schema for tool in self.tools]
        self.step_history = [
            {"role": "system", "content": self.system_message},
            {"role": "user", "content": user_input}
        ]

        step_result = None
        i = 0

        self.to_console("START", f"Starting Agent with Input: {user_input}")

        while i < self.max_steps:
            step_result = self.run_step(self.step_history, openai_tools)

            # No step emits "finish" yet, so the loop currently runs until max_steps
            if step_result.event == "finish":
                break
            elif step_result.event == "error":
                self.to_console(step_result.event, step_result.content, "red")
            else:
                self.to_console(step_result.event, step_result.content, "yellow")
            i += 1

        self.to_console("Final Result", step_result.content, "green")
        return step_result.content

In the run() method, we start by initializing the step_history, which will serve as our message memory, with the predefined system_message and the user_input. Then we start our while loop, where we call run_step during each iteration, which returns a StepResult object. We identify whether the agent finished its task or whether an error occurred, which is passed to the console as well.

class OpenAIAgent:

    # ... __init__...

    # ... to_console ...
    # ... run ...
    def run_step(self, messages: list[dict], tools):

        # plan the next step
        response = self.client.chat.completions.create(
            model=self.model_name,
            messages=messages,
            tools=tools
        )

        # add message to history
        self.step_history.append(response.choices[0].message)

        # check if tool call is present
        if not response.choices[0].message.tool_calls:
            return StepResult(
                event="Error",
                content="No tool calls were returned.",
                success=False
            )

        tool_name = response.choices[0].message.tool_calls[0].function.name
        tool_kwargs = parse_function_args(response)

        # execute the tool call
        self.to_console(
            "Tool Call", f"Name: {tool_name}\nArgs: {tool_kwargs}", "magenta"
        )
        tool_result = run_tool_from_response(response, tools=self.tools)
        tool_result_msg = self.tool_call_message(response, tool_result)
        self.step_history.append(tool_result_msg)

        if tool_result.success:
            step_result = StepResult(
                event="tool_result",
                content=tool_result.content,
                success=True)
        else:
            step_result = StepResult(
                event="error",
                content=tool_result.content,
                success=False
            )

        return step_result

    def tool_call_message(self, response, tool_result: ToolResult):
        # Format the tool output so the model can match it to its tool call
        tool_call = response.choices[0].message.tool_calls[0]
        return {
            "tool_call_id": tool_call.id,
            "role": "tool",
            "name": tool_call.function.name,
            "content": tool_result.content,
        }

Now we’ve defined the logic for each step. We first obtain a response object from our previously tested client API call with tools. We append the response message object to our step_history. We then verify whether a tool call is included in the response object; otherwise, we return an error in our StepResult. Then we log our tool call to the console and run the selected tool with our previously defined function run_tool_from_response(). We also need to append the tool result to our message history. OpenAI has defined a specific format for this purpose, so that the model knows which tool call refers to which output: a tool_call_id is passed in the message dict. This is done by our method tool_call_message(), which takes the response object and the tool_result as input arguments. At the end of each step, we assign the tool result to a StepResult object, which also indicates whether the step was successful, and return it to our loop in run().
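To make the message format concrete, the sketch below writes out one round trip as plain dictionaries and checks that every tool call in the history has a matching tool message. The ID, the message texts, and the helper unanswered_tool_calls are made up for illustration; the field names (role, tool_calls, tool_call_id) follow the Chat Completions message format.

```python
import json

history = [
    {"role": "system", "content": "You are an expense assistant."},
    {"role": "user", "content": "Track 5$ for a coffee."},
    {
        "role": "assistant",
        "content": None,
        "tool_calls": [
            {
                "id": "call_abc123",  # made-up ID
                "type": "function",
                "function": {
                    "name": "add_expense_tool",
                    "arguments": json.dumps({"description": "Coffee", "net_amount": 5}),
                },
            }
        ],
    },
    {
        "role": "tool",
        "tool_call_id": "call_abc123",  # links the result to the call above
        "name": "add_expense_tool",
        "content": "Missing values: gross_amount, tax_rate, date",
    },
]


def unanswered_tool_calls(messages: list) -> set:
    """Return IDs of tool calls that have no matching tool message."""
    requested = {
        call["id"]
        for msg in messages
        for call in (msg.get("tool_calls") or [])
    }
    answered = {msg["tool_call_id"] for msg in messages if msg.get("role") == "tool"}
    return requested - answered


print(unanswered_tool_calls(history))       # set()
print(unanswered_tool_calls(history[:-1]))  # {'call_abc123'}
```

A check like this is a cheap guard before the next API call, since a history with an unanswered tool call is typically rejected by the API.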

Now we can test our agent with the previous example, directly equipping it with a get_current_date tool as well. Here, we can set our previously defined validate_missing attribute to False, since the tool doesn’t need any input argument.

class DateTool(BaseModel):
    x: str = None

get_date_tool = Tool(
    name="get_current_date",
    model=DateTool,
    function=lambda: datetime.now().strftime("%Y-%m-%d"),
    validate_missing=False
)

tools = [
    add_expense_tool,
    report_tool,
    get_date_tool
]

agent = OpenAIAgent(tools, client)
agent.run("I have spent 5$ on a coffee today please track my expense. The tax rate is 0.2.")

START: Starting Agent with Input:
"I have spent 5$ on a coffee today please track my expense. The tax rate is 0.2."

Tool Call: get_current_date
Args: {}
tool_result: 2024-03-15

Tool Call: add_expense_tool
Args: {'description': 'Coffee expense', 'net_amount': 5, 'tax_rate': 0.2, 'date': '2024-03-15'}
error: Missing values: gross_amount

Tool Call: add_expense_tool
Args: {'description': 'Coffee expense', 'net_amount': 5, 'tax_rate': 0.2, 'date': '2024-03-15', 'gross_amount': 6}
tool_result: Added expense: {'description': 'Coffee expense', 'net_amount': 5, 'tax_rate': 0.2, 'date': '2024-03-15', 'gross_amount': 6} to the database.
Error: No tool calls were returned.

Tool Call: Name: report_tool
Args: {'report': 'Expense successfully tracked for coffee purchase.'}
tool_result: Reported: Expense successfully tracked for coffee purchase.

Final Result: Reported: Expense successfully tracked for coffee purchase.

Following the successful execution of our prototype agent, it’s worth emphasizing how effectively the agent used the designated tools according to plan. Initially, it invoked the get_current_date tool, establishing a foundational timestamp for the expense entry. Subsequently, when attempting to log the expense via the add_expense_tool, our Tool class identified a missing gross_amount, a crucial piece of information for accurate financial tracking. Impressively, the agent autonomously resolved this by calculating the gross_amount using the provided tax_rate.

It’s important to mention that in our test run, the nature of the input expense (whether the $5 spent on coffee was net or gross) wasn’t explicitly specified. At this point, such specificity wasn’t required for the agent to perform its task successfully. However, this brings to light a valuable insight for refining our agent’s understanding and interaction capabilities: incorporating such detailed information into our initial system prompt could significantly improve the agent’s accuracy and efficiency in processing expense entries. This adjustment would ensure a more comprehensive grasp of financial data right from the outset.

  1. Iterative Development: The project underscores the critical nature of an iterative development cycle, fostering continuous improvement through feedback. This approach is paramount in AI, where variability is the norm, necessitating an adaptable and responsive development strategy.
  2. Handling Uncertainty: Our journey highlighted the importance of elegantly managing ambiguities and errors. Innovations such as optional parameters and rigorous input validation have proven instrumental in enhancing both the reliability and user experience of the agent.
  3. Customized Agent Workflows for Specific Tasks: A key insight from this work is the importance of customizing agent workflows to suit particular use cases. Beyond assembling a suite of tools, the strategic design of tool interactions and responses is essential. This customization ensures the agent effectively addresses specific challenges, leading to a more focused and efficient problem-solving approach.

The journey we have embarked upon is just the beginning of a larger exploration into the world of AI agents and their applications in various domains. As we continue to push the boundaries of what’s possible with AI, we invite you to join us on this exciting journey. By building upon the foundation laid in this article and staying tuned for the upcoming enhancements, you will see firsthand how AI agents can change the way businesses and individuals handle their data and automate complex tasks.

Together, let us embrace the power of AI and unlock its potential to transform the way we work and interact with technology. The future of AI is bright, and we are at the forefront of shaping it, one reliable agent at a time.

As we continue our journey in exploring the potential of AI agents, the upcoming articles will focus on expanding the capabilities of our prototype and integrating it with real-world systems. In the next article, we will dive into designing a robust project structure that allows our agent to interact seamlessly with SQL databases. By leveraging the agent developed in this article, we will demonstrate how AI can efficiently manage and manipulate data stored in databases, opening up a world of possibilities for automating data-related tasks.

Building upon this foundation, the third article in the series will introduce advanced query features, enabling our agent to handle more complex data retrieval and manipulation tasks. We will also explore the concept of a routing agent, which will act as a central hub for managing multiple subagents, each responsible for interacting with specific database tables. This hierarchical structure will allow users to make requests in natural language, which the routing agent will then interpret and direct to the appropriate subagent for execution.

To further enhance the practicality and security of our AI-powered system, we will introduce a role-based access control system. This will ensure that users have the appropriate permissions to access and modify data based on their assigned roles. By implementing this feature, we can demonstrate how AI agents can be deployed in real-world scenarios while maintaining data integrity and security.

Through these upcoming enhancements, we aim to showcase the potential of AI agents in streamlining data management processes and providing a more intuitive and efficient way for users to interact with databases. By combining natural language processing, database management, and role-based access control, we will be laying the groundwork for the development of sophisticated AI assistants that can change the way businesses and individuals handle their data.

Stay tuned for these developments as we continue to push the boundaries of what’s possible with AI agents in data management and beyond.

Source Code

Additionally, the entire source code for the projects covered is available on GitHub. You can access it at https://github.com/elokus/AgentDemo.
