Function Calling LLM Best Practices: Production Guide for 2026
Function calling transforms LLMs from text generators into action-taking agents. But production function calling requires more than just defining tools: you need robust schemas, error handling, and security guardrails.
What Is Function Calling in LLMs?
Function calling (also called tool use or function invocation) allows LLMs to:
- Decide when to call external functions
- Generate proper function arguments from natural language
- Interpret function results and continue conversation
Instead of just generating text, the LLM orchestrates actions:
User: "What's the weather in Lagos?"
LLM: [Calls get_weather(city="Lagos")]
Function: {"temp": 28, "condition": "Partly cloudy"}
LLM: "It's 28°C and partly cloudy in Lagos right now."
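Under the hood, your application code is what actually runs the function the model asks for. A minimal dispatch sketch, assuming a stubbed get_weather and a hand-rolled tool registry (all names here are illustrative):

```python
import json

# Stubbed local implementation the model can invoke.
def get_weather(city: str) -> dict:
    return {"temp": 28, "condition": "Partly cloudy"}

# Registry mapping tool names to callables.
TOOLS = {"get_weather": get_weather}

def dispatch_tool_call(name: str, arguments: str) -> str:
    """Execute a model-requested tool call and serialize the result.

    `arguments` arrives as a JSON string, which is how most providers
    deliver model-generated arguments.
    """
    result = TOOLS[name](**json.loads(arguments))
    return json.dumps(result)

print(dispatch_tool_call("get_weather", '{"city": "Lagos"}'))
```

The serialized result is what gets appended to the conversation so the model can phrase its final answer.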
Why Function Calling Matters
Function calling enables AI agents to:
- Access real-time data — APIs, databases, web search
- Take actions — Send emails, update records, trigger workflows
- Use specialized tools — Calculators, code interpreters, domain-specific APIs
- Extend capabilities — Break free from training data limitations
Function Calling Best Practices
1. Write Clear Function Descriptions
Bad:
def search(query):
    """Searches stuff"""
    pass
Good:
from typing import Any, Dict, List, Optional

def search_knowledge_base(
    query: str,
    max_results: int = 5,
    filters: Optional[Dict[str, str]] = None
) -> List[Dict[str, Any]]:
    """
    Search the company knowledge base for relevant documents.

    Use this when the user asks questions about company policies,
    procedures, or internal documentation.

    Args:
        query: Natural language search query
        max_results: Maximum number of results to return (default 5)
        filters: Optional filters like {"department": "engineering"}

    Returns:
        List of documents with title, content, and relevance score

    Examples:
        - User asks "What's our vacation policy?"
          → search_knowledge_base("vacation policy")
        - "Show me engineering docs about API design"
          → search_knowledge_base("API design", filters={"department": "engineering"})
    """
    pass
The LLM uses your description to decide when and how to call the function.
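Most SDKs and frameworks derive the JSON schema the model sees from exactly this signature and docstring. A rough standard-library sketch of that derivation (real frameworks handle nested types, enums, and per-field descriptions far more thoroughly):

```python
import inspect

def build_schema(func) -> dict:
    """Derive a minimal JSON-schema tool definition from a function
    signature; illustrative only, not a production implementation."""
    type_map = {str: "string", int: "integer", float: "number", bool: "boolean"}
    properties, required = {}, []
    for name, param in inspect.signature(func).parameters.items():
        properties[name] = {"type": type_map.get(param.annotation, "string")}
        # Parameters without defaults become required fields.
        if param.default is inspect.Parameter.empty:
            required.append(name)
    return {
        "name": func.__name__,
        "description": inspect.getdoc(func),
        "parameters": {"type": "object", "properties": properties, "required": required},
    }

def search_knowledge_base(query: str, max_results: int = 5):
    """Search the company knowledge base for relevant documents."""

schema = build_schema(search_knowledge_base)
```

Whatever tooling you use, it is worth printing the generated schema once to confirm the model sees what you intended.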
2. Use Strong Type Hints
Python with Pydantic:
from pydantic import BaseModel, Field
from typing import Literal

class EmailParams(BaseModel):
    recipient: str = Field(description="Email address of recipient")
    subject: str = Field(description="Email subject line")
    body: str = Field(description="Email body content")
    priority: Literal["low", "normal", "high"] = Field(
        default="normal",
        description="Email priority level"
    )

def send_email(params: EmailParams) -> dict:
    """Send an email to a recipient."""
    pass
TypeScript:
interface EmailParams {
  /** Email address of recipient */
  recipient: string;
  /** Email subject line */
  subject: string;
  /** Email body content */
  body: string;
  /** Email priority level */
  priority?: 'low' | 'normal' | 'high';
}

async function sendEmail(params: EmailParams): Promise<{success: boolean}> {
  // Implementation
  return {success: true};
}
3. Validate Function Arguments
Never trust LLM-generated arguments blindly:
from pydantic import BaseModel, validator

class TransferParams(BaseModel):
    from_account: str
    to_account: str
    amount: float

    @validator('amount')
    def amount_must_be_positive(cls, v):
        if v <= 0:
            raise ValueError('Amount must be positive')
        if v > 10000:
            raise ValueError('Amount exceeds transfer limit')
        return v

    @validator('from_account', 'to_account')
    def account_must_be_valid(cls, v):
        if not v.startswith('ACC-'):
            raise ValueError('Invalid account format')
        return v

def transfer_money(params: TransferParams):
    """Transfer money between accounts (requires validation)."""
    # Validation happens automatically via Pydantic
    execute_transfer(params)
4. Implement Function Security
Permission Checks:
def delete_user(user_id: str, requesting_user: str) -> dict:
    """
    Delete a user account (admin only).

    Args:
        user_id: ID of user to delete
        requesting_user: ID of user making the request
    """
    # Check permissions before executing
    if not is_admin(requesting_user):
        raise PermissionError("Only admins can delete users")
    # Additional confirmation for destructive actions
    if not user_exists(user_id):
        raise ValueError(f"User {user_id} not found")
    return perform_deletion(user_id)
Rate Limiting:
from functools import wraps
import time

def rate_limit(max_calls=10, period=60):
    """Rate limit function calls"""
    calls = []

    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            now = time.time()
            # Drop calls that fall outside the sliding window
            calls[:] = [c for c in calls if now - c < period]
            if len(calls) >= max_calls:
                raise Exception(f"Rate limit exceeded: {max_calls} calls per {period}s")
            calls.append(now)
            return func(*args, **kwargs)
        return wrapper
    return decorator

@rate_limit(max_calls=5, period=60)
def send_sms(phone: str, message: str):
    """Send SMS (rate limited to 5 per minute)"""
    pass
5. Return Structured Results
Bad:
def get_weather(city):
    return "It's sunny and 25 degrees"  # Free text is hard for the LLM to parse
Good:
from typing import TypedDict

class WeatherResult(TypedDict):
    city: str
    temperature_celsius: float
    condition: str
    humidity_percent: int
    timestamp: str

def get_weather(city: str) -> WeatherResult:
    """Get current weather for a city."""
    return {
        "city": city,
        "temperature_celsius": 25.0,
        "condition": "sunny",
        "humidity_percent": 65,
        "timestamp": "2026-03-13T01:00:00Z"
    }
LLMs handle structured JSON better than unstructured text.
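A structured result also drops straight into the tool-result message the model reads next, with no string parsing on either side. A sketch, restating the get_weather stub above for self-containment:

```python
import json

def get_weather(city: str) -> dict:
    # Stubbed values; a real implementation would call a weather API.
    return {
        "city": city,
        "temperature_celsius": 25.0,
        "condition": "sunny",
        "humidity_percent": 65,
        "timestamp": "2026-03-13T01:00:00Z",
    }

# The dict serializes directly into the tool-result message most
# providers expect; downstream code can also read fields by name.
tool_message = {
    "role": "tool",
    "content": json.dumps(get_weather("Lagos")),
}
```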
6. Handle Errors Gracefully
import logging
from typing import Optional
from pydantic import BaseModel

logger = logging.getLogger(__name__)

class FunctionResult(BaseModel):
    success: bool
    data: Optional[dict] = None
    error: Optional[str] = None

def safe_function_call(func, *args, **kwargs) -> FunctionResult:
    """Wrapper that catches errors and returns a structured result"""
    try:
        result = func(*args, **kwargs)
        return FunctionResult(success=True, data=result)
    except PermissionError as e:
        return FunctionResult(
            success=False,
            error=f"Permission denied: {e}"
        )
    except ValueError as e:
        return FunctionResult(
            success=False,
            error=f"Invalid input: {e}"
        )
    except Exception as e:
        logger.error(f"Function error: {e}")
        return FunctionResult(
            success=False,
            error="An unexpected error occurred"
        )
For comprehensive error handling patterns, see our AI agent error handling guide.
7. Optimize Token Usage
Function definitions consume tokens. Be concise but clear:
Verbose (wasteful):
"""
This function is used to search our company's knowledge base system.
You should call this function whenever a user asks a question that
might be answered by our internal documentation, policies, or procedures.
The function accepts a query parameter which should be a natural language
search string, and it also accepts an optional max_results parameter...
"""
Optimized:
"""
Search company knowledge base for policies and documentation.
Args: query (str), max_results (int, default 5)
Use when: user asks about company info, policies, procedures
"""
8. Design Composable Functions
Bad — Monolithic:
def send_report_email_to_team(report_type: str):
    """Generate report, format email, send to team"""
    # Does too much, hard to debug
    pass
Good — Composable:
def generate_report(report_type: str) -> dict:
    """Generate a specific type of report"""
    pass

def format_email(subject: str, body: str) -> dict:
    """Format email content"""
    pass

def send_email(recipient: str, subject: str, body: str) -> dict:
    """Send an email"""
    pass
Let the LLM orchestrate the composition.
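In production the model emits these calls one at a time, feeding each result into the next; run deterministically with illustrative stubs, the chain looks like this:

```python
# Illustrative stubs standing in for the real implementations.
def generate_report(report_type: str) -> dict:
    return {"title": f"{report_type} report", "body": "metrics go here"}

def format_email(subject: str, body: str) -> dict:
    return {"subject": subject, "body": body}

def send_email(recipient: str, subject: str, body: str) -> dict:
    return {"success": True, "recipient": recipient}

# The LLM would request each step in turn, using the previous
# function's structured result to fill the next call's arguments.
report = generate_report("weekly sales")
email = format_email(subject=report["title"], body=report["body"])
result = send_email("team@example.com", email["subject"], email["body"])
```

Because each step returns a small structured result, a failure at any link is easy to surface to the model (or the user) without rerunning the whole chain.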
9. Provide Usage Examples
def calculate_roi(
    initial_investment: float,
    returns: float,
    time_period_years: float
) -> dict:
    """
    Calculate return on investment (ROI).

    Args:
        initial_investment: Initial amount invested
        returns: Total returns received
        time_period_years: Investment duration in years

    Examples:
        User: "I invested $10k and got back $15k over 3 years, what's my ROI?"
        Call: calculate_roi(10000, 15000, 3)

        User: "What's the ROI if I put in 5000 and earn 6500 in 2 years?"
        Call: calculate_roi(5000, 6500, 2)
    """
    roi_percent = ((returns - initial_investment) / initial_investment) * 100
    annualized_roi = ((returns / initial_investment) ** (1 / time_period_years) - 1) * 100
    return {
        "roi_percent": round(roi_percent, 2),
        "annualized_roi_percent": round(annualized_roi, 2),
        "profit": returns - initial_investment
    }
10. Implement Confirmation for Destructive Actions
from functools import wraps

def requires_confirmation(func):
    """Decorator for actions requiring explicit user confirmation"""
    @wraps(func)
    def wrapper(*args, **kwargs):
        confirmation_required = {
            "action": func.__name__,
            "params": kwargs,
            "requires_confirmation": True,
            "message": f"Confirm: {func.__doc__}"
        }
        # LLM should ask user for confirmation before proceeding
        return confirmation_required
    return wrapper

@requires_confirmation
def delete_all_records(table: str) -> dict:
    """Delete all records from specified table (DESTRUCTIVE)"""
    # Only executes after user confirmation
    pass
Platform-Specific Best Practices
OpenAI Function Calling
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {
                        "type": "string",
                        "description": "City name, e.g. San Francisco"
                    },
                    "unit": {
                        "type": "string",
                        "enum": ["celsius", "fahrenheit"],
                        "description": "Temperature unit"
                    }
                },
                "required": ["city"]
            }
        }
    }
]

response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "What's the weather in Lagos?"}],
    tools=tools,
    tool_choice="auto"  # Let model decide when to use tools
)
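When the model decides to use a tool, the reply carries `tool_calls` that your code must execute and send back as `tool` messages before calling the API again. A sketch of that round trip, with plain dicts standing in for the SDK's response objects (the real objects expose the same fields as attributes) and a stubbed get_weather:

```python
import json

# Hypothetical local tool implementation.
def get_weather(city: str, unit: str = "celsius") -> dict:
    return {"city": city, "temp": 28, "unit": unit}

# Shaped like the assistant message in an OpenAI chat-completions reply.
response_message = {
    "role": "assistant",
    "tool_calls": [{
        "id": "call_123",
        "type": "function",
        "function": {"name": "get_weather",
                     "arguments": '{"city": "Lagos"}'},
    }],
}

tool_messages = []
for call in response_message["tool_calls"]:
    # Model-generated arguments arrive as a JSON string.
    args = json.loads(call["function"]["arguments"])
    result = {"get_weather": get_weather}[call["function"]["name"]](**args)
    # Each result is sent back tied to its tool_call_id; the conversation
    # then continues with another chat.completions.create call.
    tool_messages.append({
        "role": "tool",
        "tool_call_id": call["id"],
        "content": json.dumps(result),
    })
```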
Anthropic Tool Use
tools = [
    {
        "name": "get_weather",
        "description": "Get current weather for a city",
        "input_schema": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name"},
                "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]}
            },
            "required": ["city"]
        }
    }
]

message = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    tools=tools,
    messages=[{"role": "user", "content": "Weather in Lagos?"}]
)
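When the response's `stop_reason` is `"tool_use"`, its content includes a `tool_use` block; your code runs the tool and replies with a `tool_result` block inside a user message. A sketch with plain dicts standing in for the SDK objects and a stubbed get_weather:

```python
import json

# Hypothetical local tool implementation.
def get_weather(city: str, unit: str = "celsius") -> dict:
    return {"city": city, "temp": 28, "unit": unit}

# Shaped like the content of an Anthropic response that requested a tool.
content_blocks = [
    {"type": "tool_use", "id": "toolu_01", "name": "get_weather",
     "input": {"city": "Lagos"}},
]

tool_results = []
for block in content_blocks:
    if block["type"] != "tool_use":
        continue
    # Anthropic delivers arguments as an already-parsed object in `input`.
    result = {"get_weather": get_weather}[block["name"]](**block["input"])
    # Results go back inside a user message as tool_result blocks,
    # matched to the originating call by tool_use_id.
    tool_results.append({
        "type": "tool_result",
        "tool_use_id": block["id"],
        "content": json.dumps(result),
    })
```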
LangChain Tools
from langchain.tools import tool

@tool
def search_knowledge_base(query: str, max_results: int = 5) -> list:
    """Search company knowledge base

    Args:
        query: Search query
        max_results: Number of results to return
    """
    return perform_search(query, max_results)

# LangChain auto-generates the schema from the decorator + docstring
For more on building agents with tools, see our multi-agent orchestration guide.
Testing Function Calling
Unit Tests
import pytest

def test_weather_function():
    result = safe_function_call(get_weather, "Lagos")
    assert result.success is True
    assert "temperature_celsius" in result.data
    assert isinstance(result.data["temperature_celsius"], float)

def test_weather_invalid_city():
    result = safe_function_call(get_weather, "InvalidCityXYZ123")
    assert result.success is False
    assert result.error is not None
Integration Tests
@pytest.mark.asyncio
async def test_llm_function_calling():
    """Test that the LLM correctly calls functions"""
    agent = create_agent_with_tools([get_weather])
    response = await agent.run("What's the weather in Lagos?")

    # Verify the function was called
    assert agent.last_tool_call == "get_weather"
    assert agent.last_tool_args == {"city": "Lagos"}

    # Verify the response includes weather data
    assert "temperature" in response.lower() or "weather" in response.lower()
Chaos Testing
def test_function_with_invalid_llm_args():
    """Test handling of malformed LLM-generated arguments"""
    invalid_inputs = [
        {"city": 12345},                     # Wrong type
        {"city": ""},                        # Empty string
        {},                                  # Missing required field
        {"city": "Lagos", "extra": "field"}  # Extra fields
    ]
    for invalid_input in invalid_inputs:
        result = safe_function_call(get_weather, **invalid_input)
        assert result.success is False
        assert result.error is not None
Common Pitfalls
Pitfall 1: Ambiguous Function Names
# Bad
def get(): ...      # Get what?
def process(): ...  # Process what?
def handle(): ...   # Handle what?

# Good
def get_user_profile(): ...
def process_payment(): ...
def handle_webhook_event(): ...
Pitfall 2: Side Effects in Descriptions
# Bad
def log_message(msg: str):
    """Logs a message (also sends email notification)"""  # Hidden side effect!

# Good
def log_and_notify(msg: str):
    """Logs message and sends email notification to admins"""  # Explicit
Pitfall 3: No Dry-Run Mode
Support a dry-run flag so destructive or outbound actions can be exercised safely in tests:
def send_email(to: str, subject: str, body: str, dry_run: bool = False):
    """Send email (supports dry-run for testing)"""
    if dry_run:
        return {"success": True, "message": "Dry run - email not sent"}
    return actually_send_email(to, subject, body)
Conclusion
Production-grade function calling requires careful schema design, robust validation, security guardrails, and comprehensive testing. The difference between a demo and a production system lies in these details.
Key takeaways:
- Write clear, concise function descriptions with examples
- Use strong typing and validation
- Implement security checks and rate limiting
- Return structured results
- Handle errors gracefully
- Test with real and malformed inputs
Start with a small set of well-designed functions, validate thoroughly, then expand your tool ecosystem.
Build AI That Works For Your Business
At AI Agents Plus, we help companies move from AI experiments to production systems that deliver real ROI. Whether you need:
- Custom AI Agents — Autonomous systems that handle complex workflows, from customer service to operations
- Rapid AI Prototyping — Go from idea to working demo in days using vibe coding and modern AI frameworks
- Voice AI Solutions — Natural conversational interfaces for your products and services
We've built AI systems for startups and enterprises across Africa and beyond.
Ready to explore what AI can do for your business? Let's talk →
About AI Agents Plus Editorial
AI automation expert and thought leader in business transformation through artificial intelligence.



