Chat API
The Chat API is the core endpoint that powers Alfred402's AI conversations.
Endpoint
```
POST /api/chat
```
Overview
This endpoint receives user messages, processes them through Google's Gemini AI model with web search capabilities, and streams the response back to the client in real-time.
Configuration
Maximum Duration
```ts
export const maxDuration = 50;
```
The endpoint can run for up to 50 seconds. This aligns with the client-side rate limiting and ensures long-running AI tasks can complete.
Platform limits:
Vercel Hobby: 10 seconds max
Vercel Pro: 60 seconds max
Vercel Enterprise: 900 seconds max
Request Format
Headers
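The route expects a JSON body, so the only header that should be needed is the content type (an assumption based on the body format below; check the route handler for anything project-specific):

```
Content-Type: application/json
```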
Body
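A minimal request body, following the message structure documented in the next section (the example question is illustrative):

```json
{
  "messages": [
    { "role": "user", "content": "What is the current price of Bitcoin?" }
  ]
}
```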
Or with contract address:
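Assuming the route reads only the `messages` array (the only field documented below), the contract address is included directly in the message text; the placeholder stands in for a real address:

```json
{
  "messages": [
    { "role": "user", "content": "Analyze this token: <contract-address>" }
  ]
}
```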
Message Structure
| Field | Type | Required | Description |
| --- | --- | --- | --- |
| `messages` | Array | Yes | Array of message objects |
| `messages[].role` | String | Yes | Either `"user"` or `"assistant"` |
| `messages[].content` | String | Yes | Message text content |
Response Format
The endpoint returns a streaming response using Server-Sent Events (SSE).
Stream Format
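The stream follows the Vercel AI SDK's data stream conventions, so a response body looks roughly like the excerpt below (illustrative; exact prefixes and payloads depend on the AI SDK version in use):

```
0:"Bitcoin is currently trading around..."
0:" according to recent data from CoinGecko."
d:{"finishReason":"stop","usage":{"promptTokens":120,"completionTokens":85}}
```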
Response Events
| Prefix | Meaning |
| --- | --- |
| `0:` | Text content chunks |
| `d:` | Metadata (finish reason, etc.) |
| `e:` | Error messages |
Finish Reasons
stop - Normal completion
length - Max tokens reached
content-filter - Content filtered
tool-calls - Tool execution required
AI Model Configuration
Model Selection
Available models:
gemini-2.5-flash - Fast, efficient (default)
gemini-2.5-pro - More capable, slower
gemini-1.5-flash - Previous generation
gemini-1.5-pro - Previous generation pro
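A minimal sketch of how a model from this list is typically selected with the AI SDK's Google provider (assuming the route uses `@ai-sdk/google`):

```ts
import { google } from '@ai-sdk/google';

// Reads GOOGLE_GENERATIVE_AI_API_KEY from the environment.
// Swap the model id for any of the options listed above.
const model = google('gemini-2.5-flash');
```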
Parameters
Temperature
Controls response creativity:
0.0-0.3: Focused, deterministic
0.4-0.7: Balanced (recommended)
0.8-1.0: Creative, varied
Max Tokens
Maximum response length:
1000: Brief answers
4000: Detailed analysis (default)
8000: Very comprehensive
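A sketch of how these parameters plug into a `streamText` call (assumed API shape from the Vercel AI SDK; `maxTokens` may be named `maxOutputTokens` in newer SDK versions):

```ts
import { google } from '@ai-sdk/google';
import { streamText, type CoreMessage } from 'ai';

// Messages would normally come from the validated request body.
const messages: CoreMessage[] = [
  { role: 'user', content: 'What is the current price of Bitcoin?' },
];

const result = streamText({
  model: google('gemini-2.5-flash'),
  messages,
  temperature: 0.7, // balanced (see ranges above)
  maxTokens: 4000,  // detailed analysis (see sizes above)
});
```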
Tools Integration
The AI has access to two Google tools:
1. Google Search
Enables the AI to:
Search for current cryptocurrency prices
Find recent news and updates
Access DexScreener, CoinGecko data
Verify contract addresses
2. URL Context
Enables the AI to:
Fetch and analyze specific URLs
Extract data from blockchain explorers
Read token information from DEX platforms
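How these two tools are wired up depends on the AI SDK and provider versions. The following is only a sketch, assuming a recent `@ai-sdk/google` release that exposes them as provider-defined tools; check the provider docs for the exact names and option shapes in your version:

```ts
import { google } from '@ai-sdk/google';
import { streamText } from 'ai';

const result = streamText({
  model: google('gemini-2.5-flash'),
  prompt: 'What is the 24h volume for this token? <contract-address>',
  tools: {
    // Lets Gemini ground answers in live Google Search results.
    google_search: google.tools.googleSearch({}),
    // Lets Gemini fetch and read specific URLs (explorers, DEX pages).
    url_context: google.tools.urlContext({}),
  },
});
```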
System Prompt
The endpoint includes a comprehensive system prompt that defines:
Identity: "Alfred402" cryptocurrency oracle
Capabilities: Web search, token analysis, risk assessment
Security directives: Prompt injection protection
Personality traits: Wise, data-driven, cautious
Instructions: How to analyze tokens and cite sources
See AI System Prompt for full details.
Error Handling
Common Errors
400 Bad Request
Cause: Missing or malformed messages array
401 Unauthorized
Cause: Missing GOOGLE_GENERATIVE_AI_API_KEY environment variable
429 Too Many Requests
Cause: Too many requests to Google AI API
500 Internal Server Error
Cause: Gemini API error or network issue
Error Response Format
Errors are returned as JSON:
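The exact shape depends on the route implementation; a typical payload looks something like this (field name is illustrative):

```json
{
  "error": "Failed to process chat request"
}
```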
Example Usage
Using Fetch API
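A minimal client sketch using the standard Fetch API and a stream reader (the endpoint path and body match the request format above; parsing of the `0:`/`d:` prefixes is left out for brevity):

```ts
async function askAlfred(question: string): Promise<void> {
  const response = await fetch('/api/chat', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      messages: [{ role: 'user', content: question }],
    }),
  });

  if (!response.ok || !response.body) {
    throw new Error(`Chat request failed: ${response.status}`);
  }

  // Read the stream chunk by chunk as it arrives.
  const reader = response.body.getReader();
  const decoder = new TextDecoder();
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    console.log(decoder.decode(value, { stream: true }));
  }
}
```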
Using Vercel AI SDK (Recommended)
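With the AI SDK's React hook, streaming, message state, and prefix parsing are handled for you. A sketch assuming an AI SDK 4.x-style `useChat` (the hook's shape changed in later major versions, so check the SDK docs for yours):

```tsx
'use client';

import { useChat } from 'ai/react';

export function AlfredChat() {
  const { messages, input, handleInputChange, handleSubmit, isLoading } =
    useChat({ api: '/api/chat' });

  return (
    <form onSubmit={handleSubmit}>
      {messages.map((m) => (
        <p key={m.id}>
          <strong>{m.role}:</strong> {m.content}
        </p>
      ))}
      <input value={input} onChange={handleInputChange} disabled={isLoading} />
    </form>
  );
}
```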
Performance Considerations
Streaming Benefits
Faster perceived performance: Users see responses as they generate
Better UX: No long waits for complete responses
Efficient: Reduces memory usage on server
Response Times
Typical response times:
Simple queries: 2-5 seconds
With web search: 5-15 seconds
Complex analysis: 10-30 seconds
Optimization Tips
Use appropriate max tokens: Don't request more than needed
Adjust temperature: Lower values give focused, consistent answers; higher values give more varied ones
Enable tool use selectively: Tools add latency
Implement client-side caching: Cache common queries
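A minimal client-side cache sketch: a plain in-memory `Map` keyed by the question (anything more durable, such as localStorage or a TTL, is left to the application). The `ask` callback is a hypothetical helper that returns the assistant's full answer:

```ts
const answerCache = new Map<string, string>();

async function cachedAsk(
  question: string,
  ask: (q: string) => Promise<string>, // hypothetical helper
): Promise<string> {
  const cached = answerCache.get(question);
  if (cached !== undefined) return cached;

  const answer = await ask(question);
  answerCache.set(question, answer);
  return answer;
}
```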
Security Features
Prompt Injection Protection
The system prompt includes multiple layers of defense against prompt injection; see AI System Prompt for the specific directives.
Rate Limiting
Client-side: 50-second cooldown
Server-side: maxDuration limit
API-side: Google AI quota limits
Input Validation
The endpoint validates:
Request structure
Message format
Content safety
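A sketch of the kind of validation that belongs at the top of the route handler (the checks in the real handler may differ; content-safety checks are omitted here):

```ts
type ChatMessage = { role: 'user' | 'assistant'; content: string };

// Returns the messages array if the body is well formed, otherwise null.
function parseMessages(body: unknown): ChatMessage[] | null {
  if (typeof body !== 'object' || body === null) return null;
  const { messages } = body as { messages?: unknown };
  if (!Array.isArray(messages) || messages.length === 0) return null;

  const valid = messages.every(
    (m) =>
      typeof m === 'object' &&
      m !== null &&
      (m.role === 'user' || m.role === 'assistant') &&
      typeof m.content === 'string',
  );
  return valid ? (messages as ChatMessage[]) : null;
}
```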
Monitoring
Recommended Metrics
Track these metrics in production:
Request count
Average response time
Error rate
Token usage
Tool invocation frequency
Logging
Add logging for incoming requests, errors, and token usage so the recommended metrics above can be tracked in production.
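One place to hook this in is `streamText`'s `onFinish` callback, which reports usage and the finish reason once the stream completes (a sketch; property names follow AI SDK 4.x):

```ts
import { google } from '@ai-sdk/google';
import { streamText, type CoreMessage } from 'ai';

function chat(messages: CoreMessage[]) {
  return streamText({
    model: google('gemini-2.5-flash'),
    messages,
    onFinish: ({ usage, finishReason }) => {
      // Replace console.log with your logging/metrics pipeline.
      console.log('chat finished', {
        finishReason,
        promptTokens: usage.promptTokens,
        completionTokens: usage.completionTokens,
      });
    },
  });
}
```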
Cost Optimization
Gemini API Pricing
Free tier: 15 requests/minute
Paid tier: Higher limits, lower latency
Reducing Costs
Lower max tokens: Use 2000 instead of 4000
Implement caching: Cache frequent queries
Use Flash model: Cheaper than Pro
Rate limit users: Current 50s cooldown helps
Testing
Manual Testing
Automated Testing
Consider testing:
Valid request handling
Error responses
Streaming functionality
Tool invocations
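A sketch of an integration test against a running deployment, assuming Vitest as the test runner (any runner with `fetch` support works; the base URL is a placeholder):

```ts
import { describe, expect, it } from 'vitest';

const BASE_URL = process.env.CHAT_API_URL ?? 'http://localhost:3000';

describe('POST /api/chat', () => {
  it('rejects a malformed body', async () => {
    const res = await fetch(`${BASE_URL}/api/chat`, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ messages: 'not-an-array' }),
    });
    expect(res.status).toBe(400);
  });

  it('streams text for a valid request', async () => {
    const res = await fetch(`${BASE_URL}/api/chat`, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({
        messages: [{ role: 'user', content: 'Hello Alfred' }],
      }),
    });
    expect(res.ok).toBe(true);
    expect(await res.text()).not.toHaveLength(0);
  });
});
```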
Related Documentation
Need help? Check Troubleshooting or open an issue.