# Google Gemini

Google Gemini models offer excellent performance, some of the largest context windows available (1M tokens), and competitive pricing.
## Installation

```bash
npm install @doclo/providers-llm
```
## Basic Setup

### Via OpenRouter (Recommended)

```typescript
import { createVLMProvider } from '@doclo/providers-llm';

const provider = createVLMProvider({
  provider: 'google',
  model: 'google/gemini-2.5-flash-preview-09-2025',
  apiKey: process.env.OPENROUTER_API_KEY!,
  via: 'openrouter'
});
```
### Native API

```typescript
const provider = createVLMProvider({
  provider: 'google',
  model: 'gemini-2.5-flash',
  apiKey: process.env.GOOGLE_API_KEY!
});
```
## Available Models

| Model | Context | Reasoning | Best For |
|---|---|---|---|
| gemini-3-pro | 1M | Yes | Latest, most capable |
| gemini-2.5-pro | 1M | Yes | Complex extraction |
| gemini-2.5-flash | 1M | No | Fast, cost-effective |
| gemini-2.5-flash-lite | 1M | No | Lowest cost |
### OpenRouter Model IDs

When using OpenRouter, use the full model path:

```typescript
// OpenRouter
model: 'google/gemini-2.5-flash-preview-09-2025'

// Native
model: 'gemini-2.5-flash'
```
## Configuration Options

```typescript
createVLMProvider({
  provider: 'google',
  model: string,           // Model ID
  apiKey: string,          // API key
  via?: 'openrouter',      // Access method
  baseUrl?: string,        // Custom endpoint
  limits?: {
    maxFileSize?: number,    // Max file size (bytes)
    requestTimeout?: number, // Timeout (ms)
    maxJsonDepth?: number    // Max JSON nesting
  }
})
```
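As a sketch, a provider configured with explicit limits might look like the following. The limit values here are illustrative choices, not SDK defaults:

```typescript
import { createVLMProvider } from '@doclo/providers-llm';

// Illustrative limits only — tune these to your own workload.
const provider = createVLMProvider({
  provider: 'google',
  model: 'gemini-2.5-flash',
  apiKey: process.env.GOOGLE_API_KEY!,
  limits: {
    maxFileSize: 50 * 1024 * 1024, // 50MB, matching Gemini's PDF cap
    requestTimeout: 120_000,       // 2 minutes, for large documents
    maxJsonDepth: 10               // guard against runaway nesting
  }
});
```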
## Capabilities

| Feature | Support |
|---|---|
| Images | Yes (PNG, JPEG, WebP, GIF, BMP, TIFF, HEIF) |
| PDFs | Yes (up to 1000 pages) |
| Structured Output | Yes (responseMimeType: application/json) |
| Reasoning | Yes (thinking_config) |
| Streaming | Yes |
## Images

Gemini supports more image formats than other providers:

```typescript
// Supported: PNG, JPEG, WebP, GIF, BMP, TIFF, HEIF
{
  images: [{
    base64: 'data:image/jpeg;base64,...',
    mimeType: 'image/jpeg'
  }]
}
```

Image limits: 20MB per image, 3072x3072 max (auto-scaled).
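Since oversized images are rejected, it can help to pre-check the decoded size of a data URL before sending. A minimal sketch (the helper names here are ours, not part of the SDK):

```typescript
// Pre-check a data URL against Gemini's 20MB per-image cap.
const MAX_IMAGE_BYTES = 20 * 1024 * 1024;

function base64ByteLength(dataUrl: string): number {
  // Strip the "data:<mime>;base64," prefix, then undo base64's 4:3 expansion.
  const b64 = dataUrl.slice(dataUrl.indexOf(',') + 1);
  const padding = b64.endsWith('==') ? 2 : b64.endsWith('=') ? 1 : 0;
  return (b64.length * 3) / 4 - padding;
}

function fitsImageLimit(dataUrl: string): boolean {
  return base64ByteLength(dataUrl) <= MAX_IMAGE_BYTES;
}
```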
## PDFs

Gemini has the highest PDF capacity:

```typescript
{
  pdfs: [{
    base64: 'data:application/pdf;base64,...'
  }]
}

// Or via Files API (native only)
{
  pdfs: [{
    fileId: 'files/abc123' // Gemini Files API reference
  }]
}
```

PDF limits: 50MB per file, up to 1000 pages.
## Extended Thinking

Gemini models support thinking mode for complex reasoning:

```typescript
import { createFlow, extract } from '@doclo/flows';

const flow = createFlow()
  .step('extract', extract({
    provider: createVLMProvider({
      provider: 'google',
      model: 'google/gemini-2.5-pro-preview-06-05',
      apiKey: process.env.OPENROUTER_API_KEY!,
      via: 'openrouter'
    }),
    schema: complexSchema,
    reasoning: {
      enabled: true,
      effort: 'high'
    }
  }))
  .build();
```
Thinking budget limits:
- Gemini 2.5 Flash: 0-24576 tokens (default: auto, up to 8192)
- Gemini 2.5 Pro: Higher limits available
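The `effort` setting presumably translates into a thinking budget under the hood; the SDK's actual mapping is not documented here, but a plausible sketch for Gemini 2.5 Flash (0-24576 token budget) might look like this. The fractions are guesses for illustration:

```typescript
// Hypothetical mapping from effort level to a Gemini 2.5 Flash thinking budget.
// The 24576-token ceiling comes from the limits above; the fractions are ours.
type Effort = 'low' | 'medium' | 'high';

const FLASH_MAX_THINKING_TOKENS = 24576;

function thinkingBudget(effort: Effort): number {
  const fraction = { low: 0.25, medium: 0.5, high: 1.0 }[effort];
  return Math.floor(FLASH_MAX_THINKING_TOKENS * fraction);
}
```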
## Large Document Processing

Gemini's 1M token context makes it ideal for large documents:

```typescript
import { createFlow, extract } from '@doclo/flows';

// Process large documents without chunking
const flow = createFlow()
  .step('extract', extract({
    provider: createVLMProvider({
      provider: 'google',
      model: 'google/gemini-2.5-flash-preview-09-2025',
      apiKey: process.env.OPENROUTER_API_KEY!,
      via: 'openrouter'
    }),
    schema: largeDocSchema
  }))
  .build();

// Can handle 500+ page documents in a single request
```
## Production Setup

```typescript
import { buildLLMProvider } from '@doclo/providers-llm';

const provider = buildLLMProvider({
  providers: [
    {
      provider: 'google',
      model: 'google/gemini-2.5-flash-preview-09-2025',
      apiKey: process.env.OPENROUTER_API_KEY!,
      via: 'openrouter'
    },
    {
      provider: 'google',
      model: 'google/gemini-2.5-flash-lite',
      apiKey: process.env.OPENROUTER_API_KEY!,
      via: 'openrouter'
    }
  ],
  maxRetries: 2,
  retryDelay: 1000,
  useExponentialBackoff: true
});
```
## Pricing

Via OpenRouter (approximate):

| Model | Input (per 1k tokens) | Output (per 1k tokens) |
|---|---|---|
| gemini-3-pro | $0.00125 | $0.005 |
| gemini-2.5-pro | $0.00125 | $0.005 |
| gemini-2.5-flash | $0.00015 | $0.0006 |
| gemini-2.5-flash-lite | $0.000075 | $0.0003 |
Gemini is typically the most cost-effective option for high-volume processing.
## Example: Multi-Page Report

```typescript
import { createFlow, extract } from '@doclo/flows';
import { createVLMProvider } from '@doclo/providers-llm';

const provider = createVLMProvider({
  provider: 'google',
  model: 'google/gemini-2.5-flash-preview-09-2025',
  apiKey: process.env.OPENROUTER_API_KEY!,
  via: 'openrouter'
});

const reportSchema = {
  type: 'object',
  properties: {
    title: { type: 'string' },
    author: { type: 'string' },
    date: { type: 'string' },
    executiveSummary: { type: 'string' },
    sections: {
      type: 'array',
      items: {
        type: 'object',
        properties: {
          title: { type: 'string' },
          content: { type: 'string' },
          pageNumber: { type: 'number' }
        }
      }
    },
    keyFindings: {
      type: 'array',
      items: { type: 'string' }
    }
  }
};

const flow = createFlow()
  .step('extract', extract({
    provider,
    schema: reportSchema
  }))
  .build();

// Process a 200-page annual report
const result = await flow.run({
  base64: 'data:application/pdf;base64,...'
});

console.log(`Found ${result.output.sections.length} sections`);
console.log('Key findings:', result.output.keyFindings);
```
## Structured Output Notes

Gemini uses `responseMimeType: application/json` for JSON output. The SDK embeds the schema in the prompt for reliable structured output, as Gemini's native `responseSchema` has limitations with complex schemas.
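For reference, when calling the Gemini API directly (outside this SDK), JSON mode is requested through the generation config. A minimal fragment might look like this; `responseSchema` is shown as optional because, as noted above, it has limitations with complex schemas:

```typescript
// Direct Gemini API request fragment (not the SDK's interface).
const generationConfig = {
  responseMimeType: 'application/json',
  // responseSchema: { ... } // optional; limited for deeply nested schemas
};
```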
## Next Steps