In this complete guide, we'll dive deep into the essential components of LangChain and demonstrate how to harness its power in JavaScript.
LangChainJS is a versatile JavaScript framework that empowers developers and researchers to create, experiment with, and analyze language models and agents. It offers a rich set of features for natural language processing (NLP) enthusiasts, from building custom models to manipulating text data efficiently. As a JavaScript framework, it also allows developers to easily integrate their AI applications into web apps.
Prerequisites
To follow along with this article, create a new folder and install the LangChain npm package:
npm install -S langchain
After creating a new folder, create a new JS module file by using the .mjs suffix (such as test1.mjs).
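The .mjs suffix tells Node.js to treat the file as an ES module, which is what allows the top-level await used in the examples below. If you'd rather keep the .js suffix, a standard Node.js alternative (this is a Node.js option, not anything specific to LangChain) is to declare the module type in your package.json:
{
  "type": "module"
}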
Agents
In LangChain, an agent is an entity that can understand and generate text. Agents can be configured with specific behaviors and data sources, and trained to perform various language-related tasks, making them versatile tools for a wide range of applications.
Creating a LangChain agent
Agents can be configured to use "tools" to gather the data they need and formulate a response. Take a look at the example below. It uses SerpAPI (an internet search API) to search the Web for information relevant to the question or input, and uses that to formulate a response. It also uses the llm-math tool to perform mathematical operations, such as converting units or finding the percentage change between two values:
import { initializeAgentExecutorWithOptions } from "langchain/agents";
import { ChatOpenAI } from "langchain/chat_models/openai";
import { SerpAPI } from "langchain/tools";
import { Calculator } from "langchain/tools/calculator";

// Provide API keys through environment variables
process.env["OPENAI_API_KEY"] = "YOUR_OPENAI_KEY"
process.env["SERPAPI_API_KEY"] = "YOUR_SERPAPI_KEY"

// Give the agent a calculator and web search as tools
const tools = [new Calculator(), new SerpAPI()];
const model = new ChatOpenAI({ modelName: "gpt-3.5-turbo", temperature: 0 });

const executor = await initializeAgentExecutorWithOptions(tools, model, {
  agentType: "openai-functions",
  verbose: false,
});

const result = await executor.run("By searching the Internet, find how many albums has Boldy James dropped since 2010 and how many albums has Nas dropped since 2010? Find who dropped more albums and show the difference in percent.");
console.log(result);
After creating the model variable using modelName: "gpt-3.5-turbo" and temperature: 0, we create the executor that combines the created model with the specified tools (SerpAPI and Calculator). In the input, I've asked the LLM to search the Internet (using SerpAPI) and find which artist, Nas or Boldy James, has dropped more albums since 2010, and to show the percentage difference (using Calculator).
In this example, I had to explicitly tell the LLM "By searching the Internet…" to have it get data up to the present day using the Internet, instead of relying on OpenAI's default knowledge, which is limited to 2021.
Right here’s what the output seems like:
> node test1.mjs
Boldy James has launched 4 albums since 2010. Nas has launched 17 studio albums since 2010.
Subsequently, Nas has launched extra albums than Boldy James. The distinction in the variety of albums is 13.
To calculate the distinction in p.c, we are able to use the formulation: (Distinction / Complete) * 100.
On this case, the distinction is 13 and the whole is 17.
The distinction in p.c is: (13 / 17) * 100 = 76.47%.
So, Nas has launched 76.47% extra albums than Boldy James since 2010.
Models
There are three types of models in LangChain: LLMs, chat models, and text embedding models. Let's explore each type with some examples.
Language model
LangChain provides a way to use language models in JavaScript to produce a text output based on a text input. It's not as complex as a chat model, and it's best used for simple input-output language tasks. Here's an example using OpenAI:
import { OpenAI } from "langchain/llms/openai";

const llm = new OpenAI({
  openAIApiKey: "YOUR_OPENAI_KEY",
  modelName: "gpt-3.5-turbo",
  temperature: 0
});

const res = await llm.call("List all red berries");
console.log(res);
As you’ll be able to see, it makes use of the gpt-3.5-turbo
mannequin to listing all purple berries. On this instance, I set the temperature to 0 to make the LLM factually correct. Output:
1. Strawberries
2. Cranberries
3. Raspberries
4. Redcurrants
5. Red Gooseberries
6. Red Elderberries
7. Red Huckleberries
8. Red Mulberries
Chat model
If you want more sophisticated answers and conversations, you need to use chat models. How are chat models technically different from language models? Well, in the words of the LangChain documentation:
Chat models are a variation on language models. While chat models use language models under the hood, the interface they use is a bit different. Rather than using a "text in, text out" API, they use an interface where "chat messages" are the inputs and outputs.
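To make that concrete, here's a minimal sketch of the "chat messages in, chat message out" interface. I'm assuming the langchain/schema entry point and message classes from the same package version used throughout this article:
import { ChatOpenAI } from "langchain/chat_models/openai";
import { HumanMessage, SystemMessage } from "langchain/schema";

const chat = new ChatOpenAI({ openAIApiKey: "YOUR_OPENAI_KEY", temperature: 0 });

// The input is a list of chat messages; the output is a single AIMessage
const message = await chat.call([
  new SystemMessage("You are a helpful assistant."),
  new HumanMessage("What is the capital of France?"),
]);
console.log(message.content);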
Right here’s a easy (fairly ineffective however enjoyable) JavaScript chat mannequin script:
import { ChatOpenAI } from "langchain/chat_models/openai";
import { PromptTemplate } from "langchain/prompts";
const chat = new ChatOpenAI({
openAIApiKey: "YOUR_OPENAI_KEY",
mannequin: "gpt-3.5-turbo",
temperature: 0
});
const immediate = PromptTemplate.fromTemplate(`You're a poetic assistant that all the time solutions in rhymes: {query}`);
const runnable = immediate.pipe(chat);
const response = await runnable.invoke({ query: "Who is healthier, Djokovic, Federer or Nadal?" });
console.log(response);
As you’ll be able to see, the code first sends a system message and tells the chatbot to be a poetic assistant that all the time solutions in rhymes, and afterwards it sends a human message telling the chatbot to inform me who’s the higher tennis participant: Djokovic, Federer or Nadal. For those who run this chatbot mannequin, you’ll see one thing like this:
AIMessage.content:
'In the realm of tennis, they all shine bright,\n' +
'Djokovic, Federer, and Nadal, a great sight.\n' +
'Each with their unique style and skill,\n' +
'Choosing the best is a difficult thrill.\n' +
'\n' +
'Djokovic, the Serb, a master of precision,\n' +
'With agility and focus, he plays with determination.\n' +
'His powerful strokes and relentless drive,\n' +
"Make him a force that's hard to survive.\n" +
'\n' +
'Federer, the Swiss maestro, a true artist,\n' +
'Graceful and elegant, his game is the smartest.\n' +
'His effortless technique and magical touch,\n' +
'Leave spectators in awe, oh so much.\n' +
'\n' +
'Nadal, the Spaniard, a warrior on clay,\n' +
'His fierce determination keeps opponents at bay.\n' +
'With his relentless power and endless fight,\n' +
'He conquers the court, with all his might.\n' +
'\n' +
"So, who is better? It's a question of taste,\n" +
"Each player's greatness cannot be erased.\n" +
"In the end, it's the love for the game we share,\n" +
'That makes them all champions, beyond compare.'
Pretty cool!
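Note that the script logs the whole AIMessage object. If you only want the poem itself, you can print the message's content field instead:
console.log(response.content);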
Embeddings
Embedding models provide a way to turn the words and numbers in a text into vectors that can then be associated with other words or numbers. This may sound abstract, so let's look at an example:
import { OpenAIEmbeddings } from "langchain/embeddings/openai";

process.env["OPENAI_API_KEY"] = "YOUR_OPENAI_KEY"

const embeddings = new OpenAIEmbeddings();
const res = await embeddings.embedQuery("Who created the world wide web?");
console.log(res)
This will return a long list of floats:
[
0.02274114, -0.012759142, 0.004794503, -0.009431809, 0.01085313,
0.0019698727, -0.013649924, 0.014933698, -0.0038185727, -0.025400387,
0.010794181, 0.018680222, 0.020042595, 0.004303263, 0.019937797,
0.011226473, 0.009268062, 0.016125774, 0.0116391145, -0.0061765253,
-0.0073358514, 0.00021696436, 0.004896026, 0.0034026562, -0.018365828,
... 1501 more items
]
This is what an embedding looks like: all of those floats for just six words!
This embedding can then be used to associate the input text with potential answers, related texts, names, and more.
Now let's look at a use case for embedding models. Here's a script that will take the question "What is the heaviest animal?" and find the right answer from a provided list of possible answers by using embeddings:
import { OpenAIEmbeddings } from "langchain/embeddings/openai";

process.env["OPENAI_API_KEY"] = "YOUR_OPENAI_KEY"

const embeddings = new OpenAIEmbeddings();

// Compute the cosine similarity between two vectors
function cosinesim(A, B) {
  var dotproduct = 0;
  var mA = 0;
  var mB = 0;
  for (var i = 0; i < A.length; i++) {
    dotproduct += A[i] * B[i];
    mA += A[i] * A[i];
    mB += B[i] * B[i];
  }
  mA = Math.sqrt(mA);
  mB = Math.sqrt(mB);
  var similarity = dotproduct / (mA * mB);
  return similarity;
}

// Embed the candidate answers and the question
const res1 = await embeddings.embedQuery("The Blue Whale is the heaviest animal in the world");
const res2 = await embeddings.embedQuery("George Orwell wrote 1984");
const res3 = await embeddings.embedQuery("Random stuff");

const text_arr = ["The Blue Whale is the heaviest animal in the world", "George Orwell wrote 1984", "Random stuff"]
const res_arr = [res1, res2, res3]

const question = await embeddings.embedQuery("What is the heaviest animal?");

// Score each answer's relatedness to the question
const sims = []
for (var i = 0; i < res_arr.length; i++) {
  sims.push(cosinesim(question, res_arr[i]))
}

// Helper that returns the largest value in an array
Array.prototype.max = function() {
  return Math.max.apply(null, this);
};

// Print the text whose embedding is most similar to the question
console.log(text_arr[sims.indexOf(sims.max())])
This code uses the cosinesim(A, B) function to calculate the relatedness of each answer to the question. It adds an Array.prototype.max function that returns the largest value in an array, uses it to find the maximum value in the array of relatedness scores generated with cosinesim, and then finds the right answer by looking up which text in text_arr corresponds to that most related embedding: text_arr[sims.indexOf(sims.max())].
Output:
The Blue Whale is the heaviest animal in the world
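Hand-rolling cosine similarity like this is instructive, but LangChain also ships vector stores that handle the indexing and comparison for you. Here's a minimal sketch of the same lookup using the bundled in-memory vector store; I'm assuming the langchain/vectorstores/memory entry point from the package version used throughout this article:
import { OpenAIEmbeddings } from "langchain/embeddings/openai";
import { MemoryVectorStore } from "langchain/vectorstores/memory";

process.env["OPENAI_API_KEY"] = "YOUR_OPENAI_KEY"

// Embed the candidate answers and index them in memory
const store = await MemoryVectorStore.fromTexts(
  ["The Blue Whale is the heaviest animal in the world", "George Orwell wrote 1984", "Random stuff"],
  [{}, {}, {}],
  new OpenAIEmbeddings()
);

// Fetch the single closest match to the question
const [best] = await store.similaritySearch("What is the heaviest animal?", 1);
console.log(best.pageContent);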
Chunks
LangChain models can't handle large texts and use them to formulate responses. This is where chunks and text splitting come in. Let me show you two simple methods for splitting your text data into chunks before feeding it into LangChain.
Splitting chunks by character
To avoid abrupt breaks in chunks, you can split your texts by paragraph, splitting them at every occurrence of a newline:
import { Document } from "langchain/document";
import { CharacterTextSplitter } from "langchain/text_splitter";

const splitter = new CharacterTextSplitter({
  separator: "\n",
  chunkSize: 7,
  chunkOverlap: 3,
});
const output = await splitter.createDocuments([your_text]);
This is one useful way of splitting a text. However, you can use any character as a chunk separator, not just \n.
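For example, here's a sketch (reusing the placeholder your_text variable from above) that splits a text into sentence-like chunks by treating the period as the separator instead:
import { CharacterTextSplitter } from "langchain/text_splitter";

// Split at every period, keeping chunks to roughly sentence length
const splitter = new CharacterTextSplitter({
  separator: ".",
  chunkSize: 100,
  chunkOverlap: 0,
});
const output = await splitter.createDocuments([your_text]);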
Recursively splitting chunks
If you want to strictly split your text by a certain length of characters, you can do so using RecursiveCharacterTextSplitter:
import { RecursiveCharacterTextSplitter } from "langchain/text_splitter";
const splitter = new RecursiveCharacterTextSplitter({
chunkSize: 100,
chunkOverlap: 15,
});
const output = await splitter.createDocuments([your_text]);
In this example, the text gets split every 100 characters, with a chunk overlap of 15 characters.
Chunk size and overlap
By looking at these examples, you've probably started wondering exactly what the chunk size and overlap parameters mean, and what implications they have for performance. Well, let me explain it simply in two points.
- Chunk size decides the number of characters in each chunk. The bigger the chunk size, the more data is in the chunk, and the more time it will take LangChain to process it and produce an output, and vice versa.
- Chunk overlap is what shares information between chunks so that they share some context. The higher the chunk overlap, the more redundant your chunks will be; the lower the chunk overlap, the less context will be shared between the chunks. Generally, chunk overlap is between 10% and 20% of the chunk size, although the ideal overlap varies across different text types and use cases (see the sketch after this list).
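To make these two parameters concrete, here's a minimal sketch (with a made-up sample string) that prints the chunks a recursive splitter produces, so you can spot the shared text at the chunk boundaries:
import { RecursiveCharacterTextSplitter } from "langchain/text_splitter";

const sample = "LangChain splits long texts into chunks so that each piece fits into the model's context window.";

const splitter = new RecursiveCharacterTextSplitter({
  chunkSize: 40,
  chunkOverlap: 8, // 20% of the chunk size
});

const docs = await splitter.createDocuments([sample]);
// Adjacent chunks share up to 8 characters of context
docs.forEach((doc, i) => console.log(`Chunk ${i}: ${doc.pageContent}`));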
Chains
Chains are basically multiple LLM functionalities linked together to perform more complex tasks that couldn't otherwise be done in simple LLM input-->output fashion. Let's look at a cool example:
import { ChatPromptTemplate } from "langchain/prompts";
import { LLMChain } from "langchain/chains";
import { ChatOpenAI } from "langchain/chat_models/openai";
process.env["OPENAI_API_KEY"] = "YOUR_OPENAI_KEY"

const wiki_text = `
Alexander Stanislavovich 'Sasha' Bublik (Александр Станиславович Бублик; born 17 June 1997) is a Kazakhstani professional tennis player.
He has been ranked as high as world No. 25 in singles by the Association of Tennis Professionals (ATP), which he achieved in July 2023, and is the current Kazakhstani No. 1 player...
Alexander Stanislavovich Bublik was born on 17 June 1997 in Gatchina, Russia and began playing tennis at the age of four. He was coached by his father, Stanislav. On the junior tour, Bublik reached a career-high ranking of No. 19 and won eleven titles (six singles and five doubles) on the International Tennis Federation (ITF) junior circuit.[4][5]...
`

const chat = new ChatOpenAI({ temperature: 0 });

// The prompt takes two variables: an {action} for the system message and the {text} to act on
const chatPrompt = ChatPromptTemplate.fromMessages([
  [
    "system",
    "You are a helpful assistant that {action} the provided text",
  ],
  ["human", "{text}"],
]);

const chainB = new LLMChain({
  prompt: chatPrompt,
  llm: chat,
});

const resB = await chainB.call({
  action: "lists all important numbers from",
  text: wiki_text,
});
console.log({ resB });
This code takes a variable into its prompt and formulates a factually correct answer (temperature: 0). In this example, I asked the LLM to list all important numbers from a short Wiki bio of my favorite tennis player.
Here's the output of this code:
{
resB: {
text: 'Important numbers from the provided text:\n' +
'\n' +
"- Alexander Stanislavovich 'Sasha' Bublik's date of birth: 17 June 1997\n" +
"- Bublik's highest singles ranking: world No. 25\n" +
"- Bublik's highest doubles ranking: world No. 47\n" +
"- Bublik's career ATP Tour singles titles: 3\n" +
"- Bublik's career ATP Tour singles runner-up finishes: 6\n" +
"- Bublik's height: 1.96 m (6 ft 5 in)\n" +
"- Bublik's number of aces served in the 2021 ATP Tour season: unknown\n" +
"- Bublik's junior tour ranking: No. 19\n" +
"- Bublik's junior tour titles: 11 (6 singles and 5 doubles)\n" +
"- Bublik's previous citizenship: Russian\n" +
"- Bublik's current citizenship: Kazakhstan\n" +
"- Bublik's role in the Levitov Chess Wizards team: reserve member"
}
}
Pretty cool, but this doesn't really show the full power of chains. Let's take a look at a more practical example:
import { z } from "zod";
import { zodToJsonSchema } from "zod-to-json-schema";
import { ChatOpenAI } from "langchain/chat_models/openai";
import {
ChatPromptTemplate,
SystemMessagePromptTemplate,
HumanMessagePromptTemplate,
} from "langchain/prompts";
import { JsonOutputFunctionsParser } from "langchain/output_parsers";
process.env["OPENAI_API_KEY"] = "YOUR_OPENAI_KEY"

// Describe the desired output structure with Zod
const zodSchema = z.object({
  albums: z
    .array(
      z.object({
        name: z.string().describe("The name of the album"),
        artist: z.string().describe("The artist(s) that made the album"),
        length: z.number().describe("The length of the album in minutes"),
        genre: z.string().optional().describe("The genre of the album"),
      })
    )
    .describe("An array of music albums mentioned in the text"),
});

const prompt = new ChatPromptTemplate({
  promptMessages: [
    SystemMessagePromptTemplate.fromTemplate(
      "List all music albums mentioned in the following text."
    ),
    HumanMessagePromptTemplate.fromTemplate("{inputText}"),
  ],
  inputVariables: ["inputText"],
});

const llm = new ChatOpenAI({ modelName: "gpt-3.5-turbo", temperature: 0 });

// Force the model to call a function whose parameters match the Zod schema
const functionCallingModel = llm.bind({
  functions: [
    {
      name: "output_formatter",
      description: "Should always be used to properly format output",
      parameters: zodToJsonSchema(zodSchema),
    },
  ],
  function_call: { name: "output_formatter" },
});

const outputParser = new JsonOutputFunctionsParser();
const chain = prompt.pipe(functionCallingModel).pipe(outputParser);

const response = await chain.invoke({
  inputText: "My favorite albums are: 2001, To Pimp a Butterfly and Led Zeppelin IV",
});

console.log(JSON.stringify(response, null, 2));
This code reads an input text, identifies all mentioned music albums, identifies each album's name, artist, length, and genre, and finally puts all the data into JSON format. Here's the output given the input "My favorite albums are: 2001, To Pimp a Butterfly and Led Zeppelin IV":
{
"albums": [
{
"name": "2001",
"artist": "Dr. Dre",
"length": 68,
"genre": "Hip Hop"
},
{
"name": "To Pimp a Butterfly",
"artist": "Kendrick Lamar",
"length": 79,
"genre": "Hip Hop"
},
{
"name": "Led Zeppelin IV",
"artist": "Led Zeppelin",
"length": 42,
"genre": "Rock"
}
]
}
This is just a fun example, but this technique can be used to structure unstructured text data for countless other applications.
Going Beyond OpenAI
Even though I keep using OpenAI models as examples of the different functionalities of LangChain, it isn't limited to OpenAI models. You can use LangChain with a multitude of other LLMs and AI services. You can find the full list of LLMs that integrate with LangChain in JavaScript in their documentation.
For example, you can use Cohere with LangChain. After installing Cohere using npm install cohere-ai, you can make simple question-->answer code using LangChain and Cohere like this:
import { Cohere } from "langchain/llms/cohere";

const model = new Cohere({
  maxTokens: 50,
  apiKey: "YOUR_COHERE_KEY",
});
const res = await model.call(
  "Come up with a name for a new Nas album"
);
console.log({ res });
Output:
{
res: ' Here are a few possible names for a new Nas album:\n' +
'\n' +
"- King's Landing\n" +
"- God's Son: The Sequel\n" +
"- Street's Disciple\n" +
'- Izzy Free\n' +
'- Nas and the Illmatic Flow\n' +
'\n' +
'Do any'
}
Conclusion
In this guide, you've seen the different aspects and functionalities of LangChain in JavaScript. You can use LangChain in JavaScript to easily develop AI-powered web apps and experiment with LLMs. Be sure to refer to the LangChainJS documentation for more details on specific functionalities.
Happy coding and experimenting with LangChain in JavaScript! If you enjoyed this article, you might also like to read about using LangChain with Python.