Introduction
LLMs are all the fashion, and the tool-calling function has broadened the scope of massive language fashions. As a substitute of producing solely texts, it enabled LLMs to perform advanced automation duties that had been beforehand not possible, corresponding to dynamic UI technology, agentic automation, and so on.
These fashions are skilled over an enormous quantity of information. Therefore, they perceive and might generate structured information, making them excellent for tool-calling functions requiring exact outputs. This has pushed the widespread adoption of LLMs in AI-driven software program growth, the place tool-calling—starting from easy features to classy brokers—has turn out to be a focus.
On this article, you’ll go from studying the basics of LLM software calling to implementing it to construct brokers utilizing open-source instruments.
Studying Goals
- Study what LLM instruments are.
- Perceive the basics of software calling and use instances.
- Discover how software calling works in OpenAI (ChatCompletions API, Assistants API, Parallel software calling, and Structured Output), Anthropic fashions, and LangChain.
- Study to construct succesful AI brokers utilizing open-source instruments.
This text was revealed as part of the Information Science Blogathon.
Instruments are objects that permit LLMs to work together with exterior environments. These instruments are features made accessible to LLMs, which might be executed individually at any time when the LLM determines that their use is suitable.
Normally, there are three components of a software definition.
- Identify: A significant identify of the operate/software.
- Description: An in depth description of the software.
- Parameters: A JSON schema of parameters of the operate/software.
Instrument calling allows the mannequin to generate a response for a immediate that aligns with a user-defined schema for a operate. In different phrases, when the LLM determines {that a} software ought to be used, it generates a structured output that matches the schema for the software’s arguments.
For example, when you have offered a schema of a get_weather operate to the LLM and ask it for the climate of a metropolis, as a substitute of producing a textual content response, it returns a formatted schema of features arguments, which you should utilize to execute the operate to get the climate of a metropolis.
Regardless of the identify “software calling,” the mannequin doesn’t truly execute any software itself. As a substitute, it produces a structured output formatted in line with the outlined schema. Then, You may provide this output to the corresponding operate to run it in your finish.
AI labs like OpenAI and Anthropic have skilled fashions so to present the LLM with many instruments and have it choose the fitting one in line with the context.
Every supplier has a special method of dealing with software invocations and response dealing with. Right here’s the final stream of how software calling works while you move a immediate and instruments to the LLM:
- Outline Instruments and Present a Person Immediate
- Outline instruments and features with names, descriptions, and structured schema for arguments.
- Additionally embrace a user-provided textual content, e.g., “What’s the climate like in New York in the present day?”
- The LLM Decides to Use a Instrument
- The Assistant assesses if a software is required.
- If sure, it halts the textual content technology.
- The Assistant generates a JSON formatted response with the software’s parameter values.
- Extract Instrument Enter, Run Code, and Return Outputs
- Extract the parameters offered within the operate name.
- Run the operate by passing the parameters.
- Cross the outputs again to the LLM.
- Generate Solutions from Instrument Outputs
- The LLM makes use of the software outputs to formulate a basic reply.
Instance Use Instances
- Enabling LLMs to take motion: Join LLMs with exterior functions like Gmail, GitHub, and Discord to automate actions corresponding to sending an electronic mail, pushing a PR, and sending a message.
- Offering LLMs with information: Fetch information from data bases like the net, Wikipedia, and Climate APIs to supply area of interest data to LLMs.
- Dynamic UIs: Updating UIs of your functions primarily based on person inputs.
Completely different mannequin suppliers take completely different approaches to dealing with software calling. This text will talk about the tool-calling approaches of OpenAI, Anthropic, and LangChain. It’s also possible to use open-source fashions like Llama 3 and inference suppliers like Groq for software calling.
Presently, OpenAI has 4 completely different fashions (GPT-4o. GPT-4o-mini, GPT-4-turbo, and GPT-3.5-turbo). All these fashions assist software calling.
Let’s perceive it utilizing a easy calculator operate instance.
def calculator(operation, num1, num2):
if operation == "add":
return num1 + num2
elif operation == "subtract":
return num1 - num2
elif operation == "multiply":
return num1 * num2
elif operation == "divide":
return num1 / num2
Create a software calling schema for the Calculator operate.
import openai
openai.api_key = OPENAI_API_KEY
# Outline the operate schema (that is what GPT-4 will use to grasp how you can name the operate)
calculator_function = {
"identify": "calculator",
"description": "Performs primary arithmetic operations",
"parameters": {
"kind": "object",
"properties": {
"operation": {
"kind": "string",
"enum": ["add", "subtract", "multiply", "divide"],
"description": "The operation to carry out"
},
"num1": {
"kind": "quantity",
"description": "The primary quantity"
},
"num2": {
"kind": "quantity",
"description": "The second quantity"
}
},
"required": ["operation", "num1", "num2"]
}
}
A typical OpenAI operate/software calling schema has a reputation, description, and parameter part. Contained in the parameters part, you possibly can present the small print for the operate’s arguments.
- Every property has a knowledge kind and outline.
- Optionally, an enum which defines particular values the parameter expects. On this case, the “operation” parameter expects any of “add”, “subtract”, multiply, and “divide”.
- Required sections point out the parameters the mannequin should generate.
Now, use the outlined schema of the operate to get response from the chat completion endpoint.
# Instance of calling the OpenAI API with a software
response = openai.chat.completions.create(
mannequin="gpt-4-0613", # You should use any model that helps operate calling
messages=[
{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", "content": "What is 3 plus 4?"},
],
features=[calculator_function],
function_call={"identify": "calculator"}, # Instruct the mannequin to name the calculator operate
)
# Extracting the operate name and its arguments from the response
function_call = response.decisions[0].message.function_call
identify = function_call.identify
arguments = function_call.arguments
Now you can move the arguments to the Calculator operate to get an output.
import json
args = json.masses(arguments)
consequence = calculator(args['operation'], args['num1'], args['num2'])
# Output the consequence
print(f"Outcome: {consequence}")
That is the only method to make use of software calling utilizing OpenAI fashions.
Utilizing the Assistant API
It’s also possible to use software calling with the Assistant API. This offers extra freedom and management over your entire workflow, permitting you to” construct brokers to perform advanced automation duties.
Right here is how you can use software calling with Assistant API.
We are going to use the identical calculator instance.
from openai import OpenAI
consumer = OpenAI(api_key=OPENAI_API_KEY)
assistant = consumer.beta.assistants.create(
directions="You're a climate bot. Use the offered features to reply questions.",
mannequin="gpt-4o",
instruments=[{
"type":"function",
"function":{
"name": "calculator",
"description": "Performs basic arithmetic operations",
"parameters": {
"type": "object",
"properties": {
"operation": {
"type": "string",
"enum": ["add", "subtract", "multiply", "divide"],
"description": "The operation to carry out"
},
"num1": {
"kind": "quantity",
"description": "The primary quantity"
},
"num2": {
"kind": "quantity",
"description": "The second quantity"
}
},
"required": ["operation", "num1", "num2"]
}
}
}
]
)
Create a thread and a message
thread = consumer.beta.threads.create()
message = consumer.beta.threads.messages.create(
thread_id=thread.id,
position="person",
content material="What's 3 plus 4?",
)
Provoke a run
run = consumer.beta.threads.runs.create_and_poll(
thread_id=thread.id,
assistant_id="assistant.id")
Retrieve the arguments and run the Calculator operate
arguments = run.required_action.submit_tool_outputs.tool_calls[0].operate.arguments
import json
args = json.masses(arguments)
consequence = calculator(args['operation'], args['num1'], args['num2'])
Loop by way of the required motion and add it to an inventory
#tool_outputs = []
# Loop by way of every software within the required motion part
for software in run.required_action.submit_tool_outputs.tool_calls:
if software.operate.identify == "calculator":
tool_outputs.append({
"tool_call_id": software.id,
"output": str(consequence)
})
Submit the software outputs to the API and generate a response
# Submit the software outputs to the API
consumer.beta.threads.runs.submit_tool_outputs_and_poll(
thread_id=thread.id,
run_id=run.id,
tool_outputs=tool_outputs
)
messages = consumer.beta.threads.messages.checklist(
thread_id=thread.id
)
print(messages.information[0].content material[0].textual content.worth)
This may output a response `3 plus 4 equals 7`.
Parallel Perform Calling
It’s also possible to use a number of instruments concurrently for extra difficult use instances. For example, getting the present climate at a location and the probabilities of precipitation. To realize this, you should utilize the parallel operate calling function.
Outline two dummy features and their schemas for software calling
from openai import OpenAI
consumer = OpenAI(api_key=OPENAI_API_KEY)
def get_current_temperature(location, unit="Fahrenheit"):
return {"location": location, "temperature": "72", "unit": unit}
def get_rain_probability(location):
return {"location": location, "likelihood": "40"}
assistant = consumer.beta.assistants.create(
directions="You're a climate bot. Use the offered features to reply questions.",
mannequin="gpt-4o",
instruments=[
{
"type": "function",
"function": {
"name": "get_current_temperature",
"description": "Get the current temperature for a specific location",
"parameters": {
"type": "object",
"properties": {
"location": {
"type": "string",
"description": "The city and state, e.g., San Francisco, CA"
},
"unit": {
"type": "string",
"enum": ["Celsius", "Fahrenheit"],
"description": "The temperature unit to make use of. Infer this from the person's location."
}
},
"required": ["location", "unit"]
}
}
},
{
"kind": "operate",
"operate": {
"identify": "get_rain_probability",
"description": "Get the likelihood of rain for a particular location",
"parameters": {
"kind": "object",
"properties": {
"location": {
"kind": "string",
"description": "Town and state, e.g., San Francisco, CA"
}
},
"required": ["location"]
}
}
}
]
)
Now, create a thread and provoke a run. Primarily based on the immediate, this can output the required JSON schema of operate parameters.
thread = consumer.beta.threads.create()
message = consumer.beta.threads.messages.create(
thread_id=thread.id,
position="person",
content material="What is the climate in San Francisco in the present day and the probability it's going to rain?",
)
run = consumer.beta.threads.runs.create_and_poll(
thread_id=thread.id,
assistant_id=assistant.id,
)
Parse the software parameters and name the features
import json
location = json.masses(run.required_action.submit_tool_outputs.tool_calls[0].operate.arguments)
climate = json.loa"s(run.requir"d_action.submit_t"ol_out"uts.tool_calls[1].operate.arguments)
temp = get_current_temperature(location['location'], location['unit'])
rain_p"ob = get_rain_pro"abilit"(climate['location'])
# Output the consequence
print(f"Outcome: {temp}")
print(f"Outcome: {rain_prob}")
Outline an inventory to retailer software outputs
# Outline the checklist to retailer software outputs
tool_outputs = []
# Loop by way of every software within the required motion part
for software in run.required_action.submit_tool_outputs.tool_calls:
if software.operate.identify == "get_current_temperature":
tool_outputs.append({
"tool_call_id": software.id,
"output": str(temp)
})
elif software.operate.identify == "get_rain_probability":
tool_outputs.append({
"tool_call_id": software.id,
"output": str(rain_prob)
})
Submit software outputs and generate a solution
# Submit all software outputs without delay after accumulating them in tool_outputs:
strive:
run = consumer.beta.threads.runs.submit_tool_outputs_and_poll(
thread_id=thread.id,
run_id=run.id,
tool_outputs=tool_outputs
)
print("Instrument outputs submitted efficiently.")
besides Exception as e:
print("Didn't submit software outputs:", e)
else:
print("No software outputs to submit.")
if run.standing == 'accomplished':
messages = consumer.beta.threads.messages.checklist(
thread_id=thread.id
)
print(messages.information[0].content material[0].textual content.worth)
else:
print(run.standing)
The mannequin will generate a whole reply primarily based on the software’s outputs. `The present temperature in San Francisco, CA, is 72°F. There’s a 40% probability of rain in the present day.`
Confer with the official documentation for extra.
Structured Output
Just lately, OpenAI launched structured output, which ensures that the arguments generated by the mannequin for a operate name exactly match the JSON schema you offered. This function prevents the mannequin from producing incorrect or sudden enum values, holding its responses aligned with the required schema.
To make use of Structured Output for software calling, set strict: True. The API will pre-process the provided schema and constrain the mannequin to stick strictly to your schema.
from openai import OpenAI
consumer = OpenAI()
assistant = consumer.beta.assistants.create(
directions="You're a climate bot. Use the offered features to reply questions.",
mannequin="gpt-4o-2024-08-06",
instruments=[
{
"type": "function",
"function": {
"name": "get_current_temperature",
"description": "Get the current temperature for a specific location",
"parameters": {
"type": "object",
"properties": {
"location": {
"type": "string",
"description": "The city and state, e.g., San Francisco, CA"
},
"unit": {
"type": "string",
"enum": ["Celsius", "Fahrenheit"],
"description": "The temperature unit to make use of. Infer this from the person's location."
}
},
"required": ["location", "unit"],
"additionalProperties": False
},
"strict": True
}
},
{
"kind": "operate",
"operate": {
"identify": "get_rain_probability",
"description": "Get the likelihood of rain for a particular location",
"parameters": {
"kind": "object",
"properties": {
"location": {
"kind": "string",
"description": "Town and state, e.g., San Francisco, CA"
}
},
"required": ["location"],
"additionalProperties": False
},
// highlight-start
"strict": True
// highlight-end
}
}
]
)
The preliminary request will take a couple of seconds. Nonetheless, subsequently, the cached artefacts will likely be used for software calls.
Anthropic’s Claude household of fashions is environment friendly at software calling as effectively.
The workflow for calling instruments with Claude is just like that of OpenAI. Nonetheless, the vital distinction is in how software responses are dealt with. In OpenAI’s setup, software responses are managed beneath a separate position, whereas in Claude’s fashions, software responses are integrated straight throughout the Person roles.
A typical software definition in Claude consists of the operate’s identify, description, and JSON schema.
import anthropic
consumer = anthropic.Anthropic()
response = consumer.messages.create(
mannequin="claude-3-5-sonnet-20240620",
max_tokens=1024,
instruments=[
{
"name": "get_weather",
"description": "Get the current weather in a given location",
"input_schema": {
"type": "object",
"properties": {
"location": {
"type": "string",
"description": "The city and state, e.g. San Francisco, CA"
},
"unit": {
"type": "string",
"enum": ["celsius", "fahrenheit"],
"description": "The unit of temperature, both 'Celsius' or 'fahrenheit'"
}
},
"required": ["location"]
}
},
],
messages=[
{
"role": "user",
"content": "What is the weather like in New York?"
}
]
)
print(response)
The features’ schema definition is just like the schema definition in OpenAI’s chat completion API, which we mentioned earlier.
Nonetheless, the response differentiates Claude’s fashions from these of OpenAI.
{
"id": "msg_01Aq9w938a90dw8q",
"mannequin": "claude-3-5-sonnet-20240620",
"stop_reason": "tool_use",
"position": "assistant",
"content material": [
{
"type": "text",
"text": "<thinking>I need to call the get_weather function, and the user wants SF, which is likely San Francisco, CA.</thinking>"
},
{
"type": "tool_use",
"id": "toolu_01A09q90qw90lq917835lq9",
"name": "get_weather",
"input": {"location": "San Francisco, CA", "unit": "celsius"}
}
]
}
You may extract the arguments, execute the unique operate, and move the output to LLM for a textual content response with added data from operate calls.
response = consumer.messages.create(
mannequin="claude-3-5-sonnet-20240620",
max_tokens=1024,
instruments=[
{
"name": "get_weather",
"description": "Get the current weather in a given location",
"input_schema": {
"type": "object",
"properties": {
"location": {
"type": "string",
"description": "The city and state, e.g. San Francisco, CA"
},
"unit": {
"type": "string",
"enum": ["celsius", "fahrenheit"],
"description": "The unit of temperature, both 'celsius' or 'fahrenheit'"
}
},
"required": ["location"]
}
}
],
messages=[
{
"role": "user",
"content": "What's the weather like in San Francisco?"
},
{
"role": "assistant",
"content": [
{
"type": "text",
"text": "<thinking>I need to use get_weather, and the user wants SF, which is likely San Francisco, CA.</thinking>"
},
{
"type": "tool_use",
"id": "toolu_01A09q90qw90lq917835lq9",
"name": "get_weather",
"input": {"location": "San Francisco, CA", "unit": "celsius"}
}
]
},
{
"position": "person",
"content material": [
{
"type": "tool_result",
"tool_use_id": "toolu_01A09q90qw90lq917835lq9", # from the API response
"content": "65 degrees" # from running your tool
}
]
}
]
)
print(response)
Right here, you possibly can observe that we handed the tool-calling output beneath the person position.
For extra on Claude’s software calling, discuss with the official documentation.
Here’s a comparative overview of tool-calling options throughout completely different LLM suppliers.
Managing a number of LLM suppliers can shortly turn out to be troublesome whereas constructing advanced AI functions. Therefore, frameworks like LangChain have created a unified interface for dealing with software calls from a number of LLM suppliers.
Create a customized software utilizing @software decorator in LangChain.
from langchain_core.instruments import software
@software
def add(a: int, b: int) -> int:
"""Provides a and b.
Args:
a: first int
b: second int
"""
return a + b
@software
def multiply(a: int, b: int) -> int:
"""Multiplies a and b.
Args:
a: first int
b: second int
"""
return a * b
instruments = [add, multiply]
Initialise an LLM,
from langchain_openai import ChatOpenAI
llm = ChatOpenAI(mannequin="gpt-3.5-turbo-0125")
Use the bind software technique so as to add the outlined instruments to the LLMs.
llm_with_tools = llm.bind_tools(instruments)
Typically, it is best to power the LLMs to make use of sure instruments. Many LLM suppliers permit this behaviour. To acheive this in LangChain, use
always_multiply_llm = llm.bind_tools([multiply], tool_choice="multiply")
And if you wish to name any of the instruments offered
always_call_tool_llm = llm.bind_tools([add, multiply], tool_choice="any")
Schema Definition Utilizing Pydantic
It’s also possible to use Pydantic to outline software schema. That is helpful when the software has a posh schema.
from langchain_core.pydantic_v1 import BaseModel, Discipline
# Be aware that the docstrings listed below are essential, as they are going to be handed alongside
# to the mannequin and the category identify.
class add(BaseModel):
"""Add two integers collectively."""
a: int = Discipline(..., description="First integer")
b: int = Discipline(..., description="Second integer")
class multiply(BaseModel):
"""Multiply two integers collectively."""
a: int = Discipline(..., description="First integer")
b: int = Discipline(..., description="Second integer")
instruments = [add, multiply]
Guarantee detailed docstring and clear parameter descriptions for optimum outcomes.
Brokers are automated applications powered by LLMs that work together with exterior environments. As a substitute of executing one motion after one other in a series, the brokers can determine which actions to take primarily based on some situations.
Getting structured responses from LLMs to work with AI brokers was tedious. Nonetheless, software calling made getting the specified structured response from LLMs fairly easy. This major function is main the AI agent revolution now.
So, let’s see how one can construct a real-world agent, corresponding to a GitHub PR reviewer utilizing OpenAI SDK and an open-source toolset referred to as Composio.
What’s Composio?
Composio is an open-source tooling resolution for constructing AI brokers. To assemble advanced agentic automation, it provides out-of-the-box integrations for functions like GitHub, Notion, Slack, and so on. It helps you to combine instruments with brokers with out worrying about advanced app authentication strategies like OAuth.
These instruments can be utilized with LLMs. They’re optimized for agentic interactions, which makes them extra dependable than easy operate calls. In addition they deal with person authentication and authorization.
You should use these instruments with OpenAI SDK, LangChain, LlamaIndex, and so on.
Let’s see an instance the place you’ll construct a GitHub PR overview agent utilizing OpenAI SDK.
Set up OpenAI SDK and Composio.
pip set up openai composio
Login to your Composio person account.
composio login
Add GitHub integration by finishing the mixing stream.
composio add github composio apps replace
Allow a set off to obtain PRs when created.
composio triggers allow github_pull_request_event
Create a brand new file, import libraries, and outline the instruments.
import os
from composio_openai import Motion, ComposioToolSet
from openai import OpenAI
from composio.consumer.collections import TriggerEventData
composio_toolset = ComposioToolSet()
pr_agent_tools = composio_toolset.get_actions(
actions=[
Action.GITHUB_GET_CODE_CHANGES_IN_PR, # For a given PR, it gets all the changes
Action.GITHUB_PULLS_CREATE_REVIEW_COMMENT, # For a given PR, it creates a comment
Action.GITHUB_ISSUES_CREATE, # If required, allows you to create issues on github
]
)
Initialise an OpenAI occasion and outline a immediate.
openai_client = OpenAI()
code_review_assistant_prompt = (
"""
You might be an skilled code reviewer.
Your process is to overview the offered file diff and provides constructive suggestions.
Observe these steps:
1. Determine if the file accommodates important logic adjustments.
2. Summarize the adjustments within the diff in clear and concise English inside 100 phrases.
3. Present actionable options if there are any points within the code.
After you have selected the adjustments for any TODOs, create a Github difficulty.
"""
)
Create an OpenAI assistant thread with the prompts and the instruments.
# Give openai entry to all of the instruments
assistant = openai_client.beta.assistants.create(
identify="PR Assessment Assistant",
description="An assistant that will help you with reviewing PRs",
directions=code_review_assistant_prompt,
mannequin="gpt-4o",
instruments=pr_agent_tools,
)
print("Assistant is prepared")
Now, arrange a webhook to obtain the PRs fetched by the triggers and a callback operate to course of them.
## Create a set off listener
listener = composio_toolset.create_trigger_listener()
## Triggers when a brand new PR is opened
@listener.callback(filters={"trigger_name": "github_pull_request_event"})
def review_new_pr(occasion: TriggerEventData) -> None:
# Utilizing the knowledge from Set off, execute the agent
code_to_review = str(occasion.payload)
thread = openai_client.beta.threads.create()
openai_client.beta.threads.messages.create(
thread_id=thread.id, position="person", content material=code_to_review
)
## Let's print our thread
url = f"https://platform.openai.com/playground/assistants?assistant={assistant.id}&thread={thread.id}"
print("Go to this URL to view the thread: ", url)
# Execute Agent with integrations
# begin the execution
run = openai_client.beta.threads.runs.create(
thread_id=thread.id, assistant_id=assistant.id
)
composio_toolset.wait_and_handle_assistant_tool_calls(
consumer=openai_client,
run=run,
thread=thread,
)
print("Listener began!")
print("Create a pr to get the overview")
listener.pay attention()
Right here is what’s going on within the above code block
- Initialize Listener and Outline Callback: We outlined an occasion listener with a filter with the set off identify and a callback operate. The callback operate is known as when the occasion listener receives an occasion from the required set off, i,e. github_pull_request_event.
- Course of PR Content material: Extracts the code diffs from the occasion payload.
- Run Assistant Agent: Create a brand new OpenAI thread and ship the codes to the GPT mannequin.
- Handle Instrument Calls and Begin Listening: Handles software calls throughout execution and prompts the listener for ongoing PR monitoring.
With this, you’ll have a completely purposeful AI agent to overview new PR requests. Each time a brand new pull request is raised, the webhook triggers the callback operate, and eventually, the agent posts a abstract of the code diffs as a remark to the PR.
Conclusion
Instrument calling by the Massive Language Mannequin is on the forefront of the agentic revolution. It has enabled use instances that had been beforehand not possible, corresponding to letting machines work together with exterior functions as and when wanted, dynamic UI technology, and so on. Builders can construct advanced agentic automation processes by leveraging instruments and frameworks like OpenAI SDK, LangChain, and Composio.
Key Takeaways
- Instruments are objects that allow the LLMs interface with exterior functions.
- Instrument calling is the tactic the place LLMs generate structured schema for a required operate primarily based on person message.
- Nonetheless, main LLM suppliers corresponding to OpenAI and Anthropic supply operate calling with completely different implementations.
- LangChain provides a unified API for software calling utilizing LLMs.
- Composio provides instruments and integrations like GitHub, Slack, and Gmail for advanced agentic automation.
Steadily Requested Questions
A. Instruments are objects that allow the LLMs work together with exterior environments, corresponding to Code interpreters, GitHub, Databases, the Web, and so on.
A. LLMs, or Massive Language Fashions, are superior AI techniques designed to grasp, generate, and reply to human language by processing huge quantities of textual content information.
A. Instrument calling allows LLMs to generate the structured schema of operate arguments as and when wanted.
A. AI brokers are techniques powered by AI fashions that may autonomously carry out duties, work together with their surroundings, and make selections primarily based on their programming and the info they course of.
The media proven on this article shouldn’t be owned by Analytics Vidhya and is used on the Writer’s discretion.