FuncRoute
Intelligent Function/Tool Routing using Fine-tuned FunctionGemma
🌟 Why FuncRoute?
Problem: Modern AI agents need to route user queries to the right tool/function among dozens of options. Traditional approaches using massive LLMs are:
- 💸 Expensive ($0.10+ per 1000 queries)
- 🐌 Slow (1-3 seconds per query)
- 🎯 Inconsistent (hallucinations, wrong tool selection)
Solution: FuncRoute fine-tunes a compact 270M FunctionGemma model dedicated to routing, making it:
- 💰 99% cheaper (fine-tuned 270M model vs GPT-4)
- ⚡ 10-100x faster (50-200ms per query)
- 🎯 More accurate (98%+ with proper training)
- 🔒 Self-hosted (no API costs, full control)
📦 Installation
From PyPI
pip install funcroute
From Source
git clone https://github.com/yourusername/funcroute.git
cd funcroute
pip install -e .
Requirements
- Python 3.9+
- PyTorch 2.0+
- CUDA GPU (recommended, 8GB+ VRAM)
- CPU supported but 10x slower
🎯 Quick Start
Complete Workflow
from funcroute import FuncRoute, TrainingConfig
from funcroute.core.config import ToolDefinition
from funcroute.data.generator import SyntheticDataGenerator
from funcroute.data.splitter import PatternGroupSplitter
# Step 1: Define your tools
tools = [
    ToolDefinition(
        name="manage_order",
        signature="manage_order(order_id: str) -> dict",
        description="Track and manage customer orders",
        examples=["Where is my order?", "Track package"],
        keywords=["order", "track", "delivery"],
    ),
    # ... more tools
]
# Step 2: Generate synthetic data
generator = SyntheticDataGenerator(method="rule_based")
data = generator.generate(tools=tools, num_variations=50, num_samples=5000)
# Step 3: Split with anti-leakage
splitter = PatternGroupSplitter(seed=42)
train_data, val_data, test_data = splitter.split(data, verify_no_leakage=True)
# Step 4: Train the model
router = FuncRoute()
router.train(
    train_data=train_data,
    val_data=val_data,
    tools=tools,
    config=TrainingConfig(
        output_dir="./my_router",
        num_epochs=3,
        batch_size=4,
    ),
)
# Step 5: Make predictions
result = router.route("Where is my package?")
print(f"Tool: {result.tool}") # manage_order
print(f"Confidence: {result.confidence:.1%}") # 98.5%
Using Pre-trained Model
from funcroute import FuncRoute
# Load from Hugging Face Hub
router = FuncRoute.from_pretrained("scionoftech/functiongemma-e-commerce-tool-calling")
# Route queries
result = router.route("Where is my order?")
print(f"Tool: {result.tool}") # manage_order
🚀 Features
Easy Training
Fine-tune FunctionGemma with your own data in minutes using LoRA + 4-bit quantization.
Synthetic Data
Generate thousands of training samples automatically with rule-based pattern expansion.
Anti-Leakage
Pattern group splitting prevents data leakage and overfitting between train/val/test sets.
Fast Inference
Batch prediction, streaming, and async support for maximum throughput.
Caching
LRU cache with TTL provides 5-10x speedup for repeated queries.
REST API
Production-ready FastAPI server with OpenAPI documentation.
Training Models
Basic Training
from funcroute import FuncRoute, TrainingConfig
router = FuncRoute()
router.train(
    train_data="train.jsonl",
    val_data="val.jsonl",
    tools=tools,  # CRITICAL: Must provide tool definitions
    config=TrainingConfig(
        output_dir="./my_router",
        num_epochs=3,
        batch_size=4,
        learning_rate=2e-4,
    ),
)
TrainingConfig Parameters
| Parameter | Default | Description |
|---|---|---|
| output_dir | Required | Directory to save model |
| num_epochs | 3 | Number of training epochs |
| batch_size | 4 | Batch size per device |
| learning_rate | 2e-4 | Learning rate |
| eval_strategy | "epoch" | When to evaluate ("epoch" or "steps") |
| save_steps | None | Save checkpoint every N steps |
| lora_r | 16 | LoRA rank |
| quantization | "4bit" | Quantization mode ("4bit", "8bit", None) |
See API_REFERENCE_TRAININGCONFIG.md for the full list of 25 parameters.
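A fuller configuration exercising the parameters above might look like the sketch below; the parameter names and defaults come from the table, while the specific values (e.g. step-based evaluation every 500 steps) are illustrative assumptions, not recommended settings.
from funcroute import TrainingConfig
config = TrainingConfig(
    output_dir="./my_router",  # required
    num_epochs=3,
    batch_size=4,
    learning_rate=2e-4,
    eval_strategy="steps",     # evaluate on a step schedule instead of per epoch
    save_steps=500,            # save a checkpoint every 500 steps
    lora_r=16,                 # LoRA rank
    quantization="4bit",       # "4bit", "8bit", or None
)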
Data Generation
Synthetic Data Generation
from funcroute.data.generator import SyntheticDataGenerator
generator = SyntheticDataGenerator(method="rule_based")
data = generator.generate(
    tools=tools,
    num_variations=50,  # 50 variations per pattern
    num_samples=5000,   # Target ~5000 total samples
)
print(f"Generated {len(data)} samples")
# Generated 5000 samples
Data Format
# JSONL format
{"query": "Where is my order?", "tool": "manage_order"}
{"query": "Show me red dresses", "tool": "search_products"}
{"query": "Return this item", "tool": "process_return"}
Pattern Group Splitting (Anti-Leakage)
Problem: Random splitting can leak pattern variations between train/test, causing inflated accuracy.
Solution: FuncRoute groups similar queries and splits by groups, not individual samples.
from funcroute.data.splitter import PatternGroupSplitter
splitter = PatternGroupSplitter(seed=42)
train, val, test = splitter.split(
    data,
    train_ratio=0.7,
    val_ratio=0.15,
    test_ratio=0.15,
    verify_no_leakage=True,  # Automatic verification
)
# Output:
# Pattern Group Splitting:
# Total groups: 150
# Train groups: 105 (3500 samples)
# Val groups: 22 (750 samples)
# Test groups: 23 (750 samples)
# ✅ NO DATA LEAKAGE - Splits are clean!
Inference
Single Prediction
result = router.route("Where is my order?")
print(f"Tool: {result.tool}")
print(f"Confidence: {result.confidence:.1%}")
print(f"Latency: {result.latency_ms:.1f}ms")
Batch Prediction
from funcroute.inference import Predictor
predictor = Predictor(router)
results = predictor.predict_batch(
    queries=["Query 1", "Query 2", ...],
    max_workers=4,
    show_progress=True,
)
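Assuming predict_batch returns one RouteResult per query, in input order (this is not stated explicitly above), the results can be consumed like so:
queries = ["Where is my order?", "Show me laptops", "Return this item"]
results = predictor.predict_batch(queries=queries, max_workers=4)
for query, result in zip(queries, results):
    print(f"{query} -> {result.tool} ({result.confidence:.1%})")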
Async Prediction
import asyncio
# await must be used inside an async function (or an event loop such as asyncio.run)
result = await predictor.predict_async("Where is my order?")
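For several queries at once, here is a runnable sketch using asyncio.gather; it assumes only the predict_async method shown above and a predictor constructed as in the batch example:
import asyncio
async def main():
    queries = ["Where is my order?", "Show me laptops", "Return this item"]
    # Fire all predictions concurrently; gather preserves input order
    results = await asyncio.gather(*(predictor.predict_async(q) for q in queries))
    for query, result in zip(queries, results):
        print(f"{query} -> {result.tool}")
asyncio.run(main())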
Caching
LRU Cache with TTL
from funcroute.inference import RouteCache, Predictor
cache = RouteCache(max_size=1000, ttl_seconds=3600)
predictor = Predictor(router, cache=cache)
# First call: miss (150ms)
result1 = predictor._predict_single("Where is my order?")
# Second call: hit (< 1ms)
result2 = predictor._predict_single("Where is my order?")
# Cache statistics
stats = cache.get_stats()
print(f"Hit rate: {stats['hit_rate']:.1%}") # 50.0%
Cache Warmup
from funcroute.inference import WarmupCache
warmup = WarmupCache(predictor)
common_queries = ["Where is my order?", "Track package", ...]
warmup.warmup(common_queries)
Deployment
REST API Server
# Start server
funcroute serve --model ./my_router --port 8000
API Endpoints
| Endpoint | Method | Description |
|---|---|---|
| /route | POST | Single query routing |
| /route/batch | POST | Batch query routing |
| /health | GET | Health check |
| /stats | GET | Server statistics |
| /cache/stats | GET | Cache statistics |
cURL Examples
# Single prediction
curl -X POST http://localhost:8000/route \
-H "Content-Type: application/json" \
-d '{"query": "Where is my order?"}'
# Batch prediction
curl -X POST http://localhost:8000/route/batch \
-H "Content-Type: application/json" \
-d '{"queries": ["Where is my order?", "Show me laptops"]}'
📊 Examples
We provide 9 comprehensive examples demonstrating all features:
- simple_example.py - Complete workflow with 5000 samples
- batch_prediction_example.py - 7 batch processing patterns
- streaming_prediction_example.py - 7 streaming patterns
- async_prediction_example.py - 9 async/await patterns
- caching_example.py - 8 caching strategies
- evaluation_example.py - Metrics and cross-validation
- synthetic_data_example.py - Data generation
- server_example.py - REST API deployment
- test_imports.py - Import verification
# Run examples
cd examples
python simple_example.py
API Reference
FuncRoute Class
class FuncRoute:
    def __init__(self, device: str = "auto"):
        """Initialize FuncRoute router."""
    def train(
        self,
        train_data: Union[str, List[Dict]],
        val_data: Optional[Union[str, List[Dict]]] = None,
        tools: Optional[List[ToolDefinition]] = None,
        config: Optional[TrainingConfig] = None,
    ):
        """Train the routing model."""
    @classmethod
    def load(cls, model_path: str, device: str = "auto") -> "FuncRoute":
        """Load a trained model."""
    @classmethod
    def from_pretrained(cls, model_name: str, device: str = "auto") -> "FuncRoute":
        """Load a pre-trained model from Hugging Face Hub."""
    def route(self, query: str, config: Optional[InferenceConfig] = None) -> RouteResult:
        """Route a query to the best tool."""
🤝 Contributing
We welcome contributions! Please see CONTRIBUTING.md for guidelines.
Development Setup
git clone https://github.com/yourusername/funcroute.git
cd funcroute
pip install -e ".[dev]"
pytest tests/
Areas for Contribution
- 📝 Documentation improvements
- 🐛 Bug fixes and issue reports
- ✨ New examples and use cases
- 🔧 Performance optimizations
- 🌐 Framework integrations