Build an AI Assistant with OpenAI + Python | by Shaw Talebi | Feb, 2024

Before diving into the example code, I want to briefly differentiate an AI chatbot from an assistant. While these terms are often used interchangeably, here, I use them to mean different things.

A chatbot is an AI you can have a conversation with, while an AI assistant is a chatbot that can use tools. A tool can be things like web browsing, a calculator, a Python interpreter, or anything else that expands the capabilities of a chatbot [1].

For example, if you use the free version of ChatGPT, that's a chatbot because it only comes with basic chat functionality. However, if you use the premium version of ChatGPT, that's an assistant because it comes with capabilities such as web browsing, knowledge retrieval, and image generation.

While building AI assistants (i.e., AI agents) is not a new idea, OpenAI's new Assistants API provides a straightforward way to create these types of AIs. Here, I'll use the API to make a YouTube comment responder equipped with knowledge retrieval (i.e., RAG) from one of my Medium articles. The following example code is available at this post's GitHub repository.

Vanilla Assistant

We start by importing Python libraries and setting up communication with the OpenAI API.

from openai import OpenAI
from sk import my_sk # import secret key from .py file

client = OpenAI(api_key=my_sk)

Note that for this step, you need an OpenAI API key. If you don't have an API key or don't know how to get one, I walk through how to do that in a previous article. Here, I have my secret key defined in a separate Python file called sk.py, which was imported in the above code block.

Now we can create a basic assistant (technically a chatbot, since it has no tools yet). This can be done in a single line of code, but I use a few more for readability.

instructions_string = """ShawGPT, functioning as a virtual data science \
consultant on YouTube, communicates in clear, accessible language, escalating \
to technical depth upon request. \
It reacts to feedback aptly and concludes with its signature '–ShawGPT'. \
ShawGPT will tailor the length of its responses to match the viewer's comment, \
providing concise acknowledgments to brief expressions of gratitude or \
feedback, thus keeping the interaction natural and engaging."""

assistant = client.beta.assistants.create(
    name="ShawGPT",
    description="Data scientist GPT for YouTube comments",
    instructions=instructions_string,
    model="gpt-4-0125-preview"
)

As shown above, we can set the assistant's name, description, instructions, and model. The inputs most relevant to the assistant's performance are the instructions and model. Developing good instructions (i.e., prompt engineering) is an iterative process but worth spending some time on. Additionally, I use the latest available version of GPT-4; however, older (and cheaper) models are also available [2].

With the assistant set up, we can send it a message to generate a response. This is done in the code block below.

# create thread (i.e. object that handles conversation between user and assistant)
thread = client.beta.threads.create()

# add a user message to the thread
message = client.beta.threads.messages.create(
    thread_id=thread.id,
    role="user",
    content="Great content, thank you!"
)

# send message to assistant to generate a response
run = client.beta.threads.runs.create(
    thread_id=thread.id,
    assistant_id=assistant.id,
)

A few things are happening in the above code block. First, we create a thread object, which handles message passing between user and assistant and saves us from writing boilerplate code to do that. Next, we add a user message to the thread; these are the YouTube comments for our use case. Finally, we send the thread to the assistant to generate a response via the run object.
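The run is asynchronous, so before reading the reply we need to wait for it to finish and then pull the latest message from the thread. Below is a minimal polling sketch; the helper name wait_for_assistant is my own, not part of the API.

import time

def wait_for_assistant(thread, run):
    # poll the run until it reaches a terminal status
    while run.status not in ("completed", "failed", "cancelled", "expired"):
        time.sleep(0.25)
        run = client.beta.threads.runs.retrieve(
            thread_id=thread.id,
            run_id=run.id
        )
    # messages are returned newest-first
    return client.beta.threads.messages.list(thread_id=thread.id)

messages = wait_for_assistant(thread, run)
print(messages.data[0].content[0].text.value)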

After a few seconds, we get the following response from the assistant:

You're welcome! I'm glad you found it helpful. If you have any more questions
or topics you're curious about, feel free to ask. –ShawGPT

While this might seem like a nice response, it's not something I would ever say. Let's see how we can improve the assistant via so-called few-shot prompting.

Few-shot Prompting

Few-shot prompting is where we include input-output examples in the assistant's instructions from which it can learn. Here, I append 3 (real) comments and responses to the previous instruction string.

instructions_string_few_shot = """ShawGPT, functioning as a virtual data \
science consultant on YouTube, communicates in clear, accessible language, \
escalating to technical depth upon request. \
It reacts to feedback aptly and concludes with its signature '–ShawGPT'. \
ShawGPT will tailor the length of its responses to match the viewer's comment, \
providing concise acknowledgments to brief expressions of gratitude or \
feedback, thus keeping the interaction natural and engaging.

Here are examples of ShawGPT responding to viewer comments.

Viewer comment: This was a very thorough introduction to LLMs and answered many questions I had. Thank you.
ShawGPT: Great to hear, glad it was helpful :) -ShawGPT

Viewer comment: Epic, very useful for my BCI class
ShawGPT: Thanks, glad to hear! -ShawGPT

Viewer comment: Honestly the most straightforward explanation I've ever watched. Super excellent work Shaw. Thank you. It's so rare to find good communicators like you!
ShawGPT: Thanks, glad it was clear -ShawGPT"""

Notice that I added the sign-off "-ShawGPT" to each example (even though these weren't part of my actual responses). This is important to get the desired behavior from the assistant.

With the updated instructions, we can create a new assistant in the same way as before.

assistant = client.beta.assistants.create(
    name="ShawGPT",
    description="Data scientist GPT for YouTube comments",
    instructions=instructions_string_few_shot,
    model="gpt-4-0125-preview"
)

Passing the same test comment (i.e., "Great content, thank you!") to this assistant as before, we now get the following response:

You're welcome, happy to hear you found it useful! -ShawGPT

Although this is (still) a little longer than my typical response, it's much better than what we saw with the vanilla assistant.

Let's try another (more technical) question. We can use the same syntax as before to pass the user message to the assistant, but now let's ask, "What is fat-tailedness?"
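Concretely, that looks like the following: a sketch that reuses the thread pattern from earlier, starting a fresh thread for the new comment.

# start a new thread for the new comment
thread = client.beta.threads.create()

# add the technical question as a user message
message = client.beta.threads.messages.create(
    thread_id=thread.id,
    role="user",
    content="What is fat-tailedness?"
)

# generate a response
run = client.beta.threads.runs.create(
    thread_id=thread.id,
    assistant_id=assistant.id,
)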

Here's the assistant's response.

Fat-tailedness is a concept from statistics that refers to the properties of a
particular type of probability distribution. In a fat-tailed distribution, the
tails (the far ends of the distribution, representing extreme values far from
the mean) are heavier or "fatter" than they would be in a normal (Gaussian)
distribution. This means there's a higher likelihood of extreme outcomes
occurring compared to what you would expect if the data followed a normal
distribution.

In practical terms, if a dataset is described as having a fat-tailed
distribution, it indicates that rare, extreme events (the ones in the tails)
happen more frequently than would be predicted by models based on normal
distributions. This concept is extremely important in fields like finance,
economics, and risk management, where underestimating the likelihood of extreme
events can have significant consequences.

Fat-tailed distributions contrast with thin-tailed distributions, like the
normal distribution, where extreme values are less likely, and most of the
probabilities are concentrated around the mean. Understanding whether a dataset
follows a fat-tailed distribution can significantly impact how we model and
predict phenomena, especially when it comes to assessing risks and anticipating
rare but impactful events. -ShawGPT

While (again) this response is reasonable, it reads more like something from ChatGPT than what I would say. In particular, it uses a different definition of fat tails (i.e., a distribution with heavier tails than a normal distribution) than the one I use in my series on the topic (i.e., a distribution driven by extreme events).

To improve this aspect of the assistant's responses, we can give it the ability to access knowledge from a Medium article I've written on fat-tailedness.

Knowledge Retrieval

While setting up a RAG system consists of a handful of nontrivial steps (i.e., chunking documents, setting up a vector database, building a semantic search function, fusing search results into the context window), the entire process is reduced to two steps in the Assistants API.

First, we upload our documents for retrieval. Second, we add the retrieval capability to our assistant.

In the code block below, I upload my article on quantifying fat tails to my OpenAI account. To do this, we need to specify the file path/name and the purpose of the file. Here, we set the purpose as "assistants". The other option is "fine-tune", which we will see soon.

file = client.files.create(
    file=open("articles/4 Ways to Quantify Fat Tails with Python _ by Shaw Talebi _ Towards Data Science.pdf", "rb"),
    purpose="assistants"
)

Next, we can create a new assistant similar to before, but now enabling retrieval and specifying the files available to the assistant. While I only use one document for retrieval here, we can specify up to 20 files by including the relevant file IDs in the file_ids list.

assistant = client.beta.assistants.create(
    name="ShawGPT",
    description="Data scientist GPT for YouTube comments",
    instructions=instructions_string_few_shot,
    tools=[{"type": "retrieval"}],
    file_ids=[file.id],
    model="gpt-4-0125-preview"
)

Passing the same technical question to the assistant, we get the following response.

Fat-tailedness describes the degree to which rare events significantly
contribute to the aggregate statistics of a distribution. This concept
highlights that in certain distributions, a small number of observations can
disproportionately influence the overall picture. A classic example is the
Pareto distribution, often referenced in the context of wealth distribution,
where a small proportion of individuals can hold the majority of wealth. In
statistical terms, distributions range from thin-tailed (e.g., Gaussian, where
extreme events are highly improbable) to very fat-tailed (e.g., Pareto, where
extreme events are less rare than in Gaussian distributions).

Fat-tailedness is not about a binary categorization of distributions as either
"Power Law" or not but acknowledges a continuum or spectrum of "tail heaviness."
This understanding offers a more nuanced view, allowing for more accurate and
precise categorization of data distributions. Identifying and quantifying the
fat-tailed nature of a dataset is important in fields like finance, insurance,
risk management, and social sciences, where understanding the likelihood and
impact of extreme events is crucial.

In practical terms, fat-tailedness can be quantified using different heuristics,
which may involve mathematical computations such as the Power Law Tail Index,
Kurtosis, Log-normal's σ, and Taleb's κ metric. Each of these measures offers
a different lens through which to assess the fat-tailed nature of a
distribution, thereby providing useful insights into the behavior of extreme
events within the dataset -ShawGPT

This response is much closer to the way I think about (and explain) fat-tailedness. The assistant did a seamless job of incorporating key concepts from the article into its response, for instance, defining fat-tailedness in terms of rare events, fat-tailedness living on a spectrum, and four heuristics for measuring it.

Up to this point, we've gotten pretty far using prompt engineering and knowledge retrieval to create our assistant. However, the responses still don't perfectly read like something I would write. To further improve this aspect of the assistant, we can turn to fine-tuning.

While prompt engineering can be an easy way to program an assistant, it's not always obvious how to best instruct the model to demonstrate the desired behavior. In these situations, it can be advantageous to fine-tune the model.

Fine-tuning is when we train a pre-existing model with additional examples for a particular task. In the OpenAI Fine-tuning API, this consists of providing example user-assistant message pairs [3].

For the YouTube comment responder use case, this means gathering pairs of viewer comments (i.e., user messages) and their associated responses (i.e., assistant messages).

Although this additional data-gathering process makes fine-tuning more work upfront, it can lead to significant improvements in model performance [3]. Here, I walk through the fine-tuning process for this particular use case.

Data Preparation

To generate the user-assistant message pairs, I manually went through past YouTube comments and copy-pasted them into a spreadsheet. I then exported this spreadsheet as a .csv file (available at the GitHub repo).
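For reference, the .csv has two columns, a viewer comment and my response. The rows below are invented for illustration; the header row is what the parsing code further down looks for.

Comment,Response
"Great content, thank you!","Glad you enjoyed it!"
"What is fat-tailedness?","Great question! In short, it describes how much extreme events drive a distribution."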

While this .csv file has all the essential data needed for fine-tuning, it can't be used directly. We must first transform it into a particular format to pass into the OpenAI API.

More specifically, we need to generate a .jsonl file, a text file where each line corresponds to a training example in the JSON format. If you are a Python user unfamiliar with JSON, you can think of it like a dictionary (i.e., a data structure consisting of key-value pairs) [4].
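To make that concrete, a single training example in the .jsonl file looks something like the line below (contents shortened and invented for illustration; in the actual file, each example sits on its own line).

{"messages": [{"role": "system", "content": "ShawGPT, functioning as..."}, {"role": "user", "content": "Great content, thank you!"}, {"role": "assistant", "content": "Glad you enjoyed it! -ShawGPT"}]}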

To get our .csv into the necessary .jsonl format, I first create Python lists for each type of comment. This is done by reading the raw .csv file line by line and storing each message in the appropriate list.

import csv
import json
import random

comment_list = []
response_list = []

with open('data/YT-comments.csv', mode='r') as file:
    file = csv.reader(file)

    # read file line by line
    for line in file:
        # skip first line
        if line[0] == 'Comment':
            continue

        # append comments and responses to respective lists
        comment_list.append(line[0])
        response_list.append(line[1] + " -ShawGPT")

Next, to create the .jsonl file, we need to create a list of dictionaries where each element corresponds to a training example. The key for each of these dictionaries is "messages", and the value is (yet another) list of dictionaries corresponding to the system, user, and assistant messages, respectively. A visual overview of this data structure is given below.

Overview of fine-tuning training data format [4]. Image by author.

The Python code for taking our comment_list and response_list objects and creating the list of examples is given below. This is done by going through comment_list and response_list, element by element, and creating three dictionaries at each step.

These correspond to the system, user, and assistant messages, respectively, where the system message is the same instructions we used to make our assistant via few-shot prompting, and the user/assistant messages come from their respective lists. These dictionaries are then stored in a list that serves as the value for that particular training example.

example_list = []

for i in range(len(comment_list)):
    # create dictionaries for each role/message
    system_dict = {"role": "system", "content": instructions_string_few_shot}
    user_dict = {"role": "user", "content": comment_list[i]}
    assistant_dict = {"role": "assistant", "content": response_list[i]}

    # store dictionaries into list
    messages_list = [system_dict, user_dict, assistant_dict]

    # create dictionary for ith example and add it to example_list
    example_list.append({"messages": messages_list})

At the end of this process, we have a list with 59 elements corresponding to 59 user-assistant example pairs. Another step that helps evaluate model performance is to split these 59 examples into two datasets, one for training the model and the other for evaluating its performance.

This is done in the code block below, where I randomly sample 9 out of the 59 examples from example_list and store them in a new list called validation_data_list. These examples are then removed from example_list, which will serve as our training dataset.

# create train/validation split
validation_index_list = random.sample(range(0, len(example_list)-1), 9)

validation_data_list = [example_list[index] for index in validation_index_list]

for example in validation_data_list:
    example_list.remove(example)

Finally, with our training and validation datasets prepared, we can write them to .jsonl files. This can be done in the following way.

# write examples to file
with open('data/training-data.jsonl', 'w') as training_file:
    for example in example_list:
        json.dump(example, training_file)
        training_file.write('\n')

with open('data/validation-data.jsonl', 'w') as validation_file:
    for example in validation_data_list:
        json.dump(example, validation_file)
        validation_file.write('\n')

Fine-tuning job

With the data preparation done, we can run the fine-tuning job in two steps. First, we upload the training and validation files to our OpenAI account. Second, we run the training process [3].

We upload files like we did when setting up document retrieval for an assistant, but now setting the file purpose as "fine-tune". This is done for both the training and validation datasets below.

# upload fine-tuning files
training_file = client.files.create(
    file=open("data/training-data.jsonl", "rb"),
    purpose="fine-tune"
)

validation_file = client.files.create(
    file=open("data/validation-data.jsonl", "rb"),
    purpose="fine-tune"
)

Now, we can run the fine-tuning job. For this, we need to specify the training files and the model we wish to fine-tune. The most advanced model available for fine-tuning is gpt-3.5-turbo, which I use here. I also set a suffix that adds a user-defined string to the fine-tuned model's ID. The code for this is shown below.

client.fine_tuning.jobs.create(
    training_file=training_file.id,
    validation_file=validation_file.id,
    suffix="ShawGPT",
    model="gpt-3.5-turbo"
)

The training process will take around 15 minutes. While we wait, we can check on the job's status, as sketched below.
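This status check is my own addition, not part of the original walkthrough; it assumes the job we just created is the most recent one on the account. Once the job succeeds, the fine_tuned_model field holds the model ID to use for inference.

# check the status of the most recent fine-tuning job
jobs = client.fine_tuning.jobs.list(limit=1)
print(jobs.data[0].status)            # e.g. "running" or "succeeded"
print(jobs.data[0].fine_tuned_model)  # model ID, populated once training succeeds

When the job completes, we can use the fine-tuned model via the completions API, as shown below.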

test_comment = "Great content, thank you!"

response = client.chat.completions.create(
    model="ft:gpt-3.5-turbo-0613:personal:shawgpt:8mUeVreo",
    messages=[
        {"role": "system", "content": instructions_string_few_shot},
        {"role": "user", "content": test_comment}
    ]
)
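We can then read the model's reply from the response object:

print(response.choices[0].message.content)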

Notice that the way we generated the response in the code block above differs from how we did it previously. This is because fine-tuned models are not currently supported in the Assistants API.

The biggest downside is that if we want to augment the fine-tuned model with tools (i.e., make it an assistant), we must build out those functionalities via libraries like LangChain or LlamaIndex (for example). A minimal sketch of that wiring is below.
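As an illustration only (not from the original article), here is how the fine-tuned model could be wrapped with LangChain's langchain_openai package, which is the usual starting point before composing it with tools; this assumes the package is installed and the model ID matches yours.

from langchain_openai import ChatOpenAI

# wrap the fine-tuned model so it can later be composed into a tool-using agent
llm = ChatOpenAI(model="ft:gpt-3.5-turbo-0613:personal:shawgpt:8mUeVreo", api_key=my_sk)
reply = llm.invoke("Great content, thank you!")
print(reply.content)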

While this makes developing a fine-tuned assistant more work, the responses we see out of the box are closer to something that I would say. Here are a few responses to the test comment.

Thanks, I appreciate it! -ShawGPT
Thanks, glad you liked it! -ShawGPT
Glad it was helpful! -ShawGPT

Let's see how it handles the same technical question as before, i.e., "What is fat-tailedness?"

Good question! The fat tail phenomenon represents the size of outlier (extreme)
events relative to a normal (Gaussian) distribution. In other words, there's a
greater chance of extreme events occurring compared to a normal
distribution. -ShawGPT

Although the model defines fat tails in different terms than I would, the length and style of the response are much better than what we saw with the Assistants API pre-RAG. This suggests that if we were to add RAG to this fine-tuned model, it would generate significantly better responses than what we saw before.

Building a custom AI assistant is easier than ever before. Here, we saw a simple way to create an AI assistant via OpenAI's Assistants API and how to fine-tune a model via their Fine-tuning API.

While OpenAI currently has the most advanced models for developing the type of AI assistant discussed here, these models are locked behind their API, which limits what/how we can build with them.

A natural question, therefore, is how we might develop similar systems using open-source solutions. This will be covered in the next articles of this series, where I'll discuss how to fine-tune a model using QLoRA and augment a chatbot via RAG.
