Create a Personalized Chatbot 🎙🤖 with Langchain 🦜🔗 in Three simple steps ⚡️


Let’s make a chatbot that acts like you in three simple steps.

If you’d like to witness the abilities of a customized chatbot, check out AmjadGPT, a chatbot that speaks like the CEO of Replit.

That being said, let’s dive in 🌊!

Setup 🔧

Clone this repo, or smash the button below to get started.

Run on Replit

Create an API key 🔑

Navigate to your OpenAI account and then to your API Keys. Create a new API key and assign it to an environment variable named OPENAI_API_KEY (on Replit, add it as a Secret).

Adding a secret
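Once the secret is set, your Node.js code can read it from process.env. A quick sanity check (a hypothetical snippet, not part of the repo) looks like this:

// Sanity check (hypothetical): make sure the key is available before calling OpenAI
if (!process.env.OPENAI_API_KEY) {
  throw new Error("OPENAI_API_KEY is not set. Add it as a secret or environment variable.");
}
console.log("OpenAI API key loaded");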

Customize the Base Prompt 🎮

The base prompt is the master control of how your chatbot will behave. You can edit it in lib/basePrompt.js.

After playing around with the base prompt for a while, I found that this template gave the most accurate and consistent results.

You are <name>, <description of yourself>.

Use the following pieces of MemoryContext to answer the human. ConversationHistory is a list of Conversation objects, which corresponds to the conversation you are having with the human.

ConversationHistory: {history}

MemoryContext: {context}

Human: {prompt}
<name>:

Quick Tip 💡: When you refer to the person that is talking to the chatbot, use a consistent term such as “human”. This can improve the accuracy and performance of your LLM and decrease confusion.
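For reference, lib/basePrompt.js simply exports this template as a string. A minimal sketch (the name, bio, and export shape here are made up; adjust them to match the repo and yourself):

// lib/basePrompt.js (sketch): export the base prompt as a plain template string
const basePrompt = `You are Jane Doe, a software engineer who writes about web development.

Use the following pieces of MemoryContext to answer the human. ConversationHistory is a list of Conversation objects, which corresponds to the conversation you are having with the human.

ConversationHistory: {history}

MemoryContext: {context}

Human: {prompt}
Jane Doe:`;

export default basePrompt;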

Prompt Template Variables ⚙️

Notice the variables in the base prompt: {history}, {context}, etc. Langchain’s PromptTemplate module allows us to provide custom inputs and format a prompt from a string of text. In lib/generateResponse.js, we are passing in the base prompt and specifying the input variables for it.

// lib/generateResponse.js (the import path may differ depending on your langchain version)
import { PromptTemplate } from "langchain/prompts";

const prompt = new PromptTemplate({
  template: promptTemplate,
  inputVariables: ["history", "context", "prompt"]
});

Prompt variables can take any name (preferably alphanumeric). Try to specify what each one does in the base prompt and give each a relevant name to make it easier for your LLM to understand.
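If you want to see exactly what the LLM receives, you can format the template yourself. The values below are dummy placeholders, and format is async in the JS library:

// Fill the template with dummy values (run inside an async function or an ES module)
const formatted = await prompt.format({
  history: "Human: hi\nAI: Hello there!",
  context: "Relevant snippets pulled from the vector store...",
  prompt: "What are you working on right now?",
});
console.log(formatted); // the fully assembled prompt that gets sent to the LLM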

Add your data 📄

Aside from the base prompt, all of the data you pass to your chatbot goes through the training directory. At the moment, only markdown files are used for training. If you would like to use a different file format such as .txt, make sure you specify it in script/initializeStore.js.

After specifying your file format, start uploading or creating files and folders in the directory. Folder names and depth don’t matter since all files of the desired format will be iterated through.

Create a Vector Store 🤖

Run script/initializeStore.js. The initialization script will iterate through and read all files of the desired format in the training folder. After that, Langchain’s HNSWLib.fromTexts method will embed those documents and save the resulting index to the vectorStore directory.
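Here is a rough sketch of what an initialization script like script/initializeStore.js can do. The import paths and helper names are assumptions and may not match the actual script:

// script/initializeStore.js (sketch): build the vector store from the training folder
import fs from "fs";
import path from "path";
import { HNSWLib } from "langchain/vectorstores";
import { OpenAIEmbeddings } from "langchain/embeddings";

// Recursively collect the contents of every .md file under a directory
const collectMarkdown = (dir) =>
  fs.readdirSync(dir, { withFileTypes: true }).flatMap((entry) => {
    const fullPath = path.join(dir, entry.name);
    if (entry.isDirectory()) return collectMarkdown(fullPath);
    return entry.name.endsWith(".md") ? [fs.readFileSync(fullPath, "utf8")] : [];
  });

const texts = collectMarkdown("training");

// Embed the documents and save the index to the vectorStore directory
const store = await HNSWLib.fromTexts(
  texts,
  texts.map((_, i) => ({ id: i })), // one metadata object per document
  new OpenAIEmbeddings()
);
await store.save("vectorStore");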

Your chatbot should be working now ✨!

What is a Vector Store?

A vector store is a data structure used to represent and store large collections of high-dimensional vectors. Vectors are mathematical objects that have both magnitude and direction. In AI they are often used to represent features or attributes of data points.

A vector store is typically constructed by taking a large corpus of text or other types of data and extracting features from each data point. These features are then represented as high-dimensional vectors, where each dimension corresponds to a particular feature.

(Thank you, ChatGPT 😁)
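Under the hood, “similar meaning” comes down to vector math. Here is a toy illustration of cosine similarity between two tiny made-up embedding vectors (real embeddings have hundreds or thousands of dimensions):

// Toy example: cosine similarity between small made-up embedding vectors
const dot = (a, b) => a.reduce((sum, x, i) => sum + x * b[i], 0);
const norm = (a) => Math.sqrt(dot(a, a));
const cosineSimilarity = (a, b) => dot(a, b) / (norm(a) * norm(b));

console.log(cosineSimilarity([0.2, 0.8, 0.1], [0.25, 0.7, 0.05])); // ≈ 0.99, very similar
console.log(cosineSimilarity([0.2, 0.8, 0.1], [0.9, 0.05, 0.4]));  // ≈ 0.32, less similar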

Vector Store Usage

Whenever a response is generated, a similarity search is run against the vectorStore/hnswlib.index file. Here’s what the process looks like (see lib/generateResponse.js; a rough sketch follows the list):

  1. (on first run) We initialize our OpenAI LLM, PromptTemplate, LLMChain, and load our vector store from the vectorStore directory.
  2. A similarity search is conducted through the vector store to find any relevant Documents regarding the prompt.
  3. The relevant documents, the prompt, and (optionally) the conversation history are passed to Langchain to predict and generate a response.
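Put together, a stripped-down version of that flow might look something like this. The import paths, temperature value, and variable names are assumptions; see the real lib/generateResponse.js for the exact code:

// lib/generateResponse.js (stripped-down sketch)
import { OpenAI } from "langchain/llms";
import { LLMChain } from "langchain/chains";
import { PromptTemplate } from "langchain/prompts";
import { HNSWLib } from "langchain/vectorstores";
import { OpenAIEmbeddings } from "langchain/embeddings";
import basePrompt from "./basePrompt.js";

// 1. One-time setup: LLM, prompt template, chain, and the saved vector store
const model = new OpenAI({ temperature: 0.5 });
const prompt = new PromptTemplate({
  template: basePrompt,
  inputVariables: ["history", "context", "prompt"],
});
const chain = new LLMChain({ llm: model, prompt });
const vectorStore = await HNSWLib.load("vectorStore", new OpenAIEmbeddings());

export const generateResponse = async ({ history, question }) => {
  // 2. Similarity search for documents relevant to the incoming prompt
  const docs = await vectorStore.similaritySearch(question, 2);

  // 3. Pass the context, history, and prompt to the chain
  const { text } = await chain.call({
    prompt: question,
    context: docs.map((doc) => doc.pageContent).join("\n"),
    history: history.join("\n"),
  });
  return text;
};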

FAQ

Can I use other file formats like .pdf and .html?

Theoretically you can, but stick to UTF-8 text files. Passing a non-UTF-8 file like a PDF will either make your chatbot reply with garbled output, or just make it completely useless.

Is the process here actually training/fine-tuning the AI model, or just adding documents for context?

The data you pass to your chatbot will get compiled into a single vector store file and be used as context when generating responses. No actual fine-tuning or training processes are taking place.

Can GPT directly scrape text from a site? What is the advantage of using LangChain over referencing the documentation URLs directly?

OpenAI LLMs don’t have internet access and can’t provide information such as the current date and time. Everything GPT outputs is based on its training data and the context you give it, which is why you need LangChain to pull your documentation into the prompt instead of pointing GPT at a URL.

What is the advantage of creating the vectorStore directory over running the store initialization script?

It takes a lot of time and processing power to generate the vector store. Saving it to the vectorStore directory means it only has to be generated once; loading the saved index is much cheaper than rebuilding it on every run.

Why isn’t my OpenAI API key working in Replit?

Run kill 1 in the Repl’s shell to force a reboot. Also, make sure you’re using Node.js 18.

In conclusion, that’s all it takes to make a personalized chatbot. Where you take your chatbot next depends on your imagination and creativity – tell me what you’re making next in the comments 👇🔥!

If you’ve enjoyed this, drop some reactions and follow @IroncladDev on Twitter!

Keep your eyes open for more AI content 🤖!

Thanks for reading 🙏!
