How to Train an AI Chatbot With Custom Knowledge Base Using ChatGPT API
The post How to Train an AI Chatbot With Custom Knowledge Base Using ChatGPT API appeared first on Ferdja.

In our earlier article, we demonstrated how to build an AI chatbot with the ChatGPT API and assign a role to personalize it. But what if you want to train the AI on your own data? For example, you may have a book, financial data, or a large set of databases, and you wish to search them with ease. In this article, we bring you an easy-to-follow tutorial on how to train an AI chatbot with your custom knowledge base using LangChain and the ChatGPT API. We’re deploying LangChain, GPT Index, and other powerful libraries to train the AI chatbot using OpenAI’s Large Language Model (LLM). So on that note, let’s check out how to train and create an AI chatbot using your own dataset.
Train an AI Chatbot With Custom Knowledge Base Using ChatGPT API, LangChain, and GPT Index (2023)
In this article, we have explained the steps to teach the AI chatbot with your own data in greater detail. From setting up tools and software to training the AI model, we have included all the instructions in easy-to-understand language. It’s highly recommended to follow the instructions from top to bottom without skipping any part.
Note: The tutorial was updated with bug fixes on April 6. You should no longer get a “BaseGPTIndex” or “llm_predictor” error while running the code.
Notable Points Before You Train AI with Your Own Data
1. You can train the AI chatbot on any platform, whether Windows, macOS, Linux, or ChromeOS. In this article, I’m using Windows 11, but the steps are nearly identical for other platforms.
2. The guide is meant for general users, and the instructions are explained in simple language. So even if you have only a cursory knowledge of computers and don’t know how to code, you can easily train and create a Q&A AI chatbot in a few minutes. If you followed our previous ChatGPT bot article, it will be even easier to understand the process.
3. Since we are going to train an AI chatbot based on our own data, it’s recommended to use a capable computer with a good CPU and GPU. However, you can use any low-end computer for testing purposes, and it will work without any issues. I used a Chromebook to train the AI model using a book with 100 pages (~100MB). However, if you want to train on a large set of data running into thousands of pages, it’s strongly recommended to use a powerful computer.
4. Finally, the data set should be in English to get the best results, but according to OpenAI, it will also work with popular international languages like French, Spanish, German, etc. So go ahead and give it a try in your own language.
Set Up the Software Environment to Train an AI Chatbot
Like our previous article, you should know that Python and Pip must be installed along with several libraries. In this article, we will set up everything from scratch so new users can also understand the setup process. To give you a brief idea, we will install Python and Pip. After that, we will install Python libraries, which include OpenAI, GPT Index, Gradio, and PyPDF2. Along the way, you will learn what each library does. Again, do not fret over the installation process; it’s pretty straightforward. On that note, let’s jump right in.
Install Python
1. First off, you need to install Python (which includes Pip) on your computer. Open this link and download the setup file for your platform.

2. Next, run the setup file and make sure to enable the checkbox for “Add Python.exe to PATH.” This is an extremely important step. After that, click on “Install Now” and follow the usual steps to install Python.

3. To check if Python is properly installed, open the Terminal on your computer. I’m using Windows Terminal on Windows, but you can also use Command Prompt. Once here, run the below command, and it will output the Python version. On Linux and macOS, you may have to use python3 --version instead of python --version.
python --version
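If you prefer to confirm the version from inside Python itself, a minimal sketch like the one below reports the same information. The helper name describe_python is made up for this illustration and is not part of the tutorial's code.

```python
import sys


def describe_python():
    """Return the running interpreter's version as 'major.minor.micro'."""
    info = sys.version_info
    return f"{info.major}.{info.minor}.{info.micro}"


if __name__ == "__main__":
    # This should match what `python --version` printed in the Terminal
    print("Python", describe_python())
    # The path shows which installation the Terminal is actually picking up
    print("Interpreter path:", sys.executable)
```

Printing the interpreter path is handy when more than one Python installation is present and you are not sure which one is on PATH.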

Upgrade Pip
When you install Python, Pip is installed simultaneously on your system. So let’s upgrade it to the latest version. For those who are unaware, Pip is the package manager for Python. Basically, it enables you to install thousands of Python libraries from the Terminal. With Pip, we can install the OpenAI, gpt_index, gradio, and PyPDF2 libraries. Here are the steps to follow.
1. Open the Terminal of your choice on your computer. I’m using the Windows Terminal, but you can also use Command Prompt. Now, run the below command to update Pip. Again, you may have to use python3 and pip3 on Linux and macOS.
python -m pip install -U pip

2. To check if Pip was properly installed, run the below command. It will output the version number. If you get any errors, follow our dedicated guide on how to install Pip on Windows to fix PATH-related issues.
pip --version

Install OpenAI, GPT Index, PyPDF2, and Gradio Libraries
Once we have set up Python and Pip, it’s time to install the essential libraries that will help us train an AI chatbot with a custom knowledge base. Here are the steps to follow.
1. Open the Terminal and run the below command to install the OpenAI library. We will use it as the LLM (Large Language Model) to train and create an AI chatbot. The code will also import the LangChain framework, which is a separate open-source project rather than part of the OpenAI library; it should be pulled in as a dependency of GPT Index in the next step, and if it is not, you can install it with pip install langchain. Note that Linux and macOS users may have to use pip3 instead of pip.
pip install openai

2. Next, let’s install GPT Index, which is also called LlamaIndex. It allows the LLM to connect to the external data that is our knowledge base. Here, we are installing an older version of gpt_index that is compatible with the code below. This will ensure that you don’t get any errors while running the code. If you have already installed gpt_index, run the below command again and it will override the latest version.
pip install gpt_index==0.4.24

3. After that, install PyPDF2 to parse PDF files. If you want to feed your data in PDF format, this library will help the program read the data effortlessly. Apart from that, install PyCryptodome by running the second command below. This is again done to avoid any errors while parsing PDF files.
pip install PyPDF2
pip install PyCryptodome
4. Finally, install the Gradio library. This is meant for creating a simple UI to interact with the trained AI chatbot. We are now done installing all the required libraries to train an AI chatbot.
pip install gradio
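Before moving on, it can help to confirm that every library installed above is visible to Python. The following is a minimal sketch using only the standard library; the helper name find_missing and the module list are assumptions for illustration (note that PyCryptodome is imported under the name Crypto).

```python
import importlib.util

# Import names used by this tutorial; "Crypto" is the import name of PyCryptodome
REQUIRED_MODULES = ["openai", "gpt_index", "langchain", "PyPDF2", "Crypto", "gradio"]


def find_missing(modules):
    """Return the names from `modules` that Python cannot locate for import."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are installed.")
```

If a module shows up as missing, rerun the matching pip command from the steps above, using pip3 on Linux and macOS if needed.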

Download a Code Editor
Finally, we need a code editor to edit some of the code. On Windows, I would recommend Notepad++ (Download). Simply download and install the program via the attached link. You can also use VS Code on any platform if you are comfortable with powerful IDEs. Other than VS Code, you can install Sublime Text (Download) on macOS and Linux.
For ChromeOS, you can use the excellent Caret app (Download) to edit the code. We are almost done setting up the software environment, and it’s time to get the OpenAI API key.

Get the OpenAI API Key For Free
Now, to train and create an AI chatbot based on a custom knowledge base, we need to get an API key from OpenAI. The API key will allow you to use OpenAI’s model as the LLM to study your custom data and draw inferences. Currently, OpenAI is offering free API keys with $5 worth of free credit for the first three months to new users. If you created your OpenAI account earlier, you may have $18 of free credit in your account. After the free credit is exhausted, you will have to pay for the API access. But for now, it’s available to all users for free.
1. Head to platform.openai.com/signup and create a free account. If you already have an OpenAI account, simply log in.

2. Next, click on your profile in the top-right corner and select “View API keys” from the drop-down menu.

3. Here, click on “Create new secret key” and copy the API key. Do note that you can’t copy or view the entire API key later on. So it’s strongly recommended to copy and paste the API key into a Notepad file immediately.

4. Next, go to platform.openai.com/account/usage and check if you have enough credit left. If you have exhausted all your free credit, you can buy API access from here. In case you want to get more free credit, you can create a new OpenAI account with a new mobile number and get free API access (up to $5 worth of free tokens). This will prevent you from facing Error 429 (You exceeded your current quota, please check your plan and billing details) while running the code.

5. Finally, do not share or display the API key in public. It’s a private key meant only for access to your account. You can also delete API keys and create multiple private keys (up to five).
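One way to keep the key out of files you might share is to store it in an environment variable and read it at runtime. The sketch below assumes the variable is named OPENAI_API_KEY, the same name the tutorial's code sets later; the helpers load_api_key and mask_key are hypothetical names used only for illustration.

```python
import os


def load_api_key(env_var="OPENAI_API_KEY"):
    """Read the API key from the environment and fail loudly if it is absent."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first.")
    return key


def mask_key(key, visible=4):
    """Return a masked form of the key that is safe to print or log."""
    if len(key) <= visible:
        return "*" * len(key)
    return "*" * (len(key) - visible) + key[-visible:]


if __name__ == "__main__":
    try:
        print("Using API key:", mask_key(load_api_key()))
    except RuntimeError as err:
        print(err)
```

With this approach the key is set once in the Terminal session (or in your system settings) rather than pasted into the script, so the script can be shared without exposing it.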
Train and Create an AI Chatbot With Custom Knowledge Base
Now that we have set up the software environment and got the API key from OpenAI, let’s train the AI chatbot. Here, we will use the “gpt-3.5-turbo” model because it’s cheaper and faster than other models. If you want to use the latest “gpt-4” model, you must have access to the GPT-4 API, which you get by joining the waitlist here. With that out of the way, let’s jump to the instructions.
Add Your Documents to Train the AI Chatbot
1. First, create a new folder called docs in an accessible location like the Desktop. You can choose another location as well according to your preference. However, keep the folder name docs.

2. Next, move the documents you wish to use for training the AI inside the “docs” folder. You can add multiple text or PDF files (even scanned ones). If you have a large table in Excel, you can export it as a CSV or PDF file and then add it to the “docs” folder. You can even add SQL database files, as explained in this Langchain AI tweet. I haven’t tried many file formats besides the mentioned ones, but you can add and check on your own. For this article, I’m adding one of my articles on NFT in PDF format.
Note: If you have a large document, it will take a longer time to process the data, depending on your CPU and GPU. In addition, it will quickly use up your free OpenAI tokens. So in the beginning, start with a small document (30-50 pages or < 100MB files) to understand the process.
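To see how much data you are about to feed in, a small sketch like the following counts the files in the “docs” folder and adds up their sizes. The helper name summarize_docs is hypothetical, and the sketch only looks at files directly inside the folder, not in subfolders.

```python
from pathlib import Path


def summarize_docs(folder="docs"):
    """Return (file_count, total_megabytes) for files directly inside `folder`."""
    files = [p for p in Path(folder).iterdir() if p.is_file()]
    total_bytes = sum(p.stat().st_size for p in files)
    return len(files), total_bytes / (1024 * 1024)


if __name__ == "__main__":
    if Path("docs").is_dir():
        count, size_mb = summarize_docs("docs")
        print(f"{count} file(s), {size_mb:.2f} MB in total")
    else:
        print("No 'docs' folder found in the current directory.")
```

Run it from the location that contains the “docs” folder. If the total is well above the size suggested in the note, consider starting with fewer files.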

Make the Code Ready
1. Now, launch Notepad++ (or your choice of code editor) and paste the below code into a new file. Once again, I have taken great help from armrrs on Google Colab and tweaked the code to make it compatible with PDF files and create a Gradio interface on top.
from gpt_index import SimpleDirectoryReader, GPTListIndex, GPTSimpleVectorIndex, LLMPredictor, PromptHelper
from langchain.chat_models import ChatOpenAI
import gradio as gr
import sys
import os

# Replace the placeholder with your own OpenAI API key
os.environ["OPENAI_API_KEY"] = 'Your API Key'

def construct_index(directory_path):
    # Limits that control how the documents are split and how long answers can be
    max_input_size = 4096
    num_outputs = 512
    max_chunk_overlap = 20
    chunk_size_limit = 600

    prompt_helper = PromptHelper(max_input_size, num_outputs, max_chunk_overlap, chunk_size_limit=chunk_size_limit)

    llm_predictor = LLMPredictor(llm=ChatOpenAI(temperature=0.7, model_name="gpt-3.5-turbo", max_tokens=num_outputs))

    # Read every file in the folder and build the index from it
    documents = SimpleDirectoryReader(directory_path).load_data()

    index = GPTSimpleVectorIndex(documents, llm_predictor=llm_predictor, prompt_helper=prompt_helper)

    # Save the index next to app.py so the chatbot can load it later
    index.save_to_disk('index.json')

    return index

def chatbot(input_text):
    # Load the saved index and answer the question from it
    index = GPTSimpleVectorIndex.load_from_disk('index.json')
    response = index.query(input_text, response_mode="compact")
    return response.response

# Simple web UI with a text box for the question and a text output for the answer
iface = gr.Interface(fn=chatbot,
                     inputs=gr.components.Textbox(lines=7, label="Enter your text"),
                     outputs="text",
                     title="Custom-trained AI Chatbot")

index = construct_index("docs")
iface.launch(share=True)
2. This is what the code looks like in the code editor.

3. Next, click on “File” in the top menu and select “Save As…” from the drop-down menu.

4. After that, set the file name to app.py and change “Save as type” to “All types” from the drop-down menu. Then, save the file to the location where you created the “docs” folder (in my case, it’s the Desktop). You can change the name to your liking, but make sure .py is appended.

5. Make sure the “docs” folder and “app.py” are in the same location, as shown in the screenshot below. The “app.py” file should be outside the “docs” folder and not inside.

6. Come back to the code again in Notepad++. Here, replace Your API Key with the one generated on OpenAI’s website above.

7. Finally, press “Ctrl + S” to save the code. You are now ready to run the code.
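The chunk_size_limit and max_chunk_overlap values in the code control how documents are cut into pieces before indexing. The sketch below illustrates the general idea of overlapping chunks only; it counts words for simplicity, whereas GPT Index works with tokens and uses its own splitter, so treat split_with_overlap as a hypothetical helper and not as the library's implementation.

```python
def split_with_overlap(words, chunk_size=600, overlap=20):
    """Split `words` into chunks of at most `chunk_size` items, where each
    chunk starts with the last `overlap` items of the previous chunk."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(words[start:start + chunk_size])
        if start + chunk_size >= len(words):
            break  # this chunk already reaches the end of the text
    return chunks


if __name__ == "__main__":
    sample = "one two three four five six seven eight nine ten".split()
    for chunk in split_with_overlap(sample, chunk_size=4, overlap=1):
        print(chunk)
```

The overlap means a sentence that straddles a chunk boundary still appears largely intact in at least one chunk, which tends to help when the relevant passage is looked up later.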

Create ChatGPT AI Bot with Custom Knowledge Base
1. First, open the Terminal and run the below command to move to the Desktop. It’s where I saved the “docs” folder and “app.py” file. If you saved both items in another location, move to that location via the Terminal.
cd Desktop

2. Now, run the below command. Linux and macOS users may have to use python3.
python app.py

3. Now, it will start analyzing the document using the OpenAI LLM and start indexing the information. Depending on the file size and your computer’s capability, it will take some time to process the document. Once it’s done, an “index.json” file will be created on the Desktop. If the Terminal is not showing any output, do not worry; it might still be processing the data. For your information, it takes around 10 seconds to process a 30MB document.

4. Once the LLM has processed the data, you will find a local URL. Copy it.

5. Now, paste the copied URL into the web browser, and there you have it. Your custom-trained ChatGPT-powered AI chatbot is ready. To start, you can ask the AI chatbot what the document is about.

6. You can ask further questions, and the ChatGPT bot will answer from the data you provided to the AI. So this is how you can build a custom-trained AI chatbot with your own dataset. You can now train and create an AI chatbot based on any kind of information you want. The possibilities are endless.
7. You can also copy the public URL and share it with your friends and family. The link will be live for 72 hours, but you also need to keep your computer turned on since the server instance is running on your computer.
8. To stop the custom-trained AI chatbot, press “Ctrl + C” in the Terminal window. If it does not work, press “Ctrl + C” again.

9. To restart the AI chatbot server, simply move to the Desktop location again and run the below command. Keep in mind, the local URL will be the same, but the public URL will change after every server restart.
python app.py

10. If you want to train the AI chatbot with new data, delete the files inside the “docs” folder and add new ones. You can also add multiple files, but make sure to feed clean data to get a coherent response.

11. Now, run the code again in the Terminal, and it will create a new “index.json” file. Here, the old “index.json” file will be replaced automatically.
python app.py

12. To keep track of your tokens, head over to OpenAI’s online dashboard and check how much free credit is left.

13. Finally, you don’t need to touch the code unless you want to change the API key or the OpenAI model for further customization.
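Because app.py calls construct_index("docs") every time it starts, the documents are processed again on each run, which uses tokens. As a rough sketch of one way to tell whether a rebuild is actually needed, the hypothetical helper below compares file modification times; it is an assumption for illustration and not part of the original code, and it does not notice files that were deleted from the folder.

```python
import os


def index_is_stale(docs_dir="docs", index_path="index.json"):
    """Return True if the index file is missing or older than any file in docs_dir."""
    if not os.path.exists(index_path):
        return True
    index_mtime = os.path.getmtime(index_path)
    for name in os.listdir(docs_dir):
        path = os.path.join(docs_dir, name)
        # A document modified after the index was written means the index is outdated
        if os.path.isfile(path) and os.path.getmtime(path) > index_mtime:
            return True
    return False


if __name__ == "__main__":
    if os.path.isdir("docs"):
        print("Rebuild needed:", index_is_stale("docs", "index.json"))
    else:
        print("No 'docs' folder found in the current directory.")
```

In app.py, one could then call construct_index("docs") only when index_is_stale() returns True, since the chatbot function already loads “index.json” from disk for each question.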
Build a Custom AI Chatbot Using Your Own Data
So this is how you can train an AI chatbot with a custom knowledge base. I have used this code to train the AI on medical books, articles, data tables, and reports from old archives, and it has worked flawlessly. So go ahead and create your own AI chatbot using OpenAI’s Large Language Model and ChatGPT. Anyway, that is all from us. If you are looking for the best ChatGPT alternatives, head to our linked article. And to use ChatGPT on your Apple Watch, follow our in-depth tutorial. Finally, if you are facing any kind of issues, do let us know in the comment section below. We will definitely try to help you out.