Providers are the external services that power document processing. Doclo supports two types:
VLM Providers (Vision Language Models): Process documents visually, extract structured data, classify documents
OCR Providers: Convert documents to text with layout information
Provider Types
VLM Providers
VLM providers can see document images directly. Use them for:
Direct extraction from visually complex documents
Document classification and categorization
Splitting multi-document files
Quality assessment
import { createVLMProvider } from '@doclo/providers-llm';
const vlmProvider = createVLMProvider({
  provider: 'google',
  model: 'google/gemini-2.5-flash-preview-09-2025',
  apiKey: process.env.OPENROUTER_API_KEY!,
  via: 'openrouter'
});
OCR Providers
OCR providers convert documents to structured text. Use them for:
High-fidelity text extraction with bounding boxes
Processing text-heavy documents
Building RAG pipelines with chunking
import { createOCRProvider } from '@doclo/providers-datalab';
const ocrProvider = createOCRProvider({
  endpoint: 'https://www.datalab.to/api/v1/ocr',
  apiKey: process.env.DATALAB_API_KEY!
});
Supported Providers
VLM Providers
| Provider | Models | Vision | PDFs | Structured Output |
| --- | --- | --- | --- | --- |
| OpenAI | GPT-4.1, o3, o4-mini | Yes | Yes | Yes |
| Anthropic | Claude 4, Sonnet 4.5, Haiku 4.5 | Yes | Yes | Yes |
| Google | Gemini 2.5 Pro/Flash | Yes | Yes | Yes |
| xAI | Grok 4.1 | Yes | Yes | Yes |
OCR Providers
| Provider | Package | Features |
| --- | --- | --- |
| Surya | @doclo/providers-datalab | Text + bounding boxes |
| Marker | @doclo/providers-datalab | Markdown conversion |
| Mistral | @doclo/providers-mistral | Markdown, native extraction, handwriting |
| Reducto | @doclo/providers-reducto | Chunking, citations |
| Unsiloed | @doclo/providers-unsiloed | Parse, extract, classify, split |
Access Methods
VLM providers can be accessed in two ways:
Via OpenRouter (Recommended)
Single API key for all providers with unified billing:
const provider = createVLMProvider({
  provider: 'anthropic',
  model: 'anthropic/claude-sonnet-4.5',
  apiKey: process.env.OPENROUTER_API_KEY!,
  via: 'openrouter'
});
Benefits:
Single API key for all providers
Unified billing and usage tracking
Automatic cost tracking in responses
Provider fallback without multiple API keys
Native APIs
Direct access to provider APIs:
const provider = createVLMProvider({
  provider: 'openai',
  model: 'gpt-4.1',
  apiKey: process.env.OPENAI_API_KEY!
  // No 'via' parameter = native API
});
Use native APIs when:
You have existing API keys
You need provider-specific features
You want to avoid the OpenRouter intermediary
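The choice between the two access methods can also be made at startup, based on which API keys are present. The sketch below is illustrative only: `VLMConfig` is a local stand-in for the object passed to `createVLMProvider`, and `resolveVLMConfig` is a hypothetical helper, not part of the SDK. It assumes OpenRouter model IDs follow the `<provider>/<model>` pattern seen in the examples above, and it replaces the `!` non-null assertion with an explicit check.

```typescript
// Local stand-in for the config object used in the examples above (assumption).
interface VLMConfig {
  provider: string;
  model: string;
  apiKey: string;
  via?: "openrouter";
}

// Hypothetical helper: prefer OpenRouter when its key is set,
// otherwise fall back to the provider's native key.
function resolveVLMConfig(
  provider: string,
  model: string,
  env: Record<string, string | undefined>,
  nativeKeyName: string
): VLMConfig {
  const openRouterKey = env["OPENROUTER_API_KEY"];
  if (openRouterKey) {
    // Assumed ID convention for OpenRouter: "<provider>/<model>".
    return {
      provider,
      model: `${provider}/${model}`,
      apiKey: openRouterKey,
      via: "openrouter",
    };
  }
  const nativeKey = env[nativeKeyName];
  if (!nativeKey) {
    throw new Error(`Set OPENROUTER_API_KEY or ${nativeKeyName}`);
  }
  // No `via` field means the native API is used.
  return { provider, model, apiKey: nativeKey };
}
```

In an application this would be called as `resolveVLMConfig('openai', 'gpt-4.1', process.env, 'OPENAI_API_KEY')` and the result passed to `createVLMProvider`.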
Provider Selection
Choose based on your needs:
| Use Case | Recommended Provider |
| --- | --- |
| Fast extraction | Google Gemini 2.5 Flash |
| Complex documents | Anthropic Claude Sonnet 4.5 |
| Cost-sensitive | Google Gemini 2.5 Flash Lite |
| Reasoning required | OpenAI o3, Anthropic Claude |
| OCR + text extraction | Surya or Marker |
| RAG chunking | Reducto Parse |
Production Configuration
For production, use buildLLMProvider with fallback support:
import { buildLLMProvider } from '@doclo/providers-llm';
const provider = buildLLMProvider({
  providers: [
    {
      provider: 'google',
      model: 'google/gemini-2.5-flash-preview-09-2025',
      apiKey: process.env.OPENROUTER_API_KEY!,
      via: 'openrouter'
    },
    {
      provider: 'anthropic',
      model: 'anthropic/claude-sonnet-4.5',
      apiKey: process.env.OPENROUTER_API_KEY!,
      via: 'openrouter'
    }
  ],
  maxRetries: 2,
  retryDelay: 1000,
  useExponentialBackoff: true,
  circuitBreakerThreshold: 3
});
This configuration:
Retries failed requests up to 2 times
Falls back to the next provider if one fails
Uses a circuit breaker to skip failing providers
Applies exponential backoff between retries
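To make the interaction between these four behaviors concrete, here is a minimal, self-contained sketch of retry, backoff, fallback, and a circuit breaker. It is not Doclo's implementation: `createFallbackRunner`, `backoffDelay`, and the `sleep` option are hypothetical names, and the sketch assumes the breaker counts consecutive failures per provider and that the delay doubles on each retry.

```typescript
// Illustrative sketch only; all names are hypothetical, not SDK internals.
type Attempt<T> = () => Promise<T>;

interface FallbackOptions {
  maxRetries: number;              // retries per provider after the first attempt
  retryDelay: number;              // base delay in milliseconds
  useExponentialBackoff: boolean;  // double the delay on each retry
  circuitBreakerThreshold: number; // consecutive failures before a provider is skipped
  sleep?: (ms: number) => Promise<void>; // injectable so tests need not wait
}

// Delay before retry number `retry` (0-based): base, 2x base, 4x base, ...
function backoffDelay(retry: number, opts: FallbackOptions): number {
  return opts.useExponentialBackoff ? opts.retryDelay * 2 ** retry : opts.retryDelay;
}

function createFallbackRunner<T>(providers: Attempt<T>[], opts: FallbackOptions) {
  const failures = providers.map(() => 0); // consecutive failures per provider
  const sleep =
    opts.sleep ?? ((ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms)));

  return async function run(): Promise<T> {
    let lastError: unknown = new Error("no providers available");
    for (let i = 0; i < providers.length; i++) {
      // Circuit open: skip this provider and move to the next one.
      if (failures[i] >= opts.circuitBreakerThreshold) continue;
      for (let attempt = 0; attempt <= opts.maxRetries; attempt++) {
        try {
          const result = await providers[i]();
          failures[i] = 0; // a success resets the failure count
          return result;
        } catch (err) {
          lastError = err;
          failures[i]++;
          if (failures[i] >= opts.circuitBreakerThreshold) break;
          if (attempt < opts.maxRetries) await sleep(backoffDelay(attempt, opts));
        }
      }
    }
    throw lastError; // every provider failed or was skipped
  };
}
```

With `maxRetries: 2` and `retryDelay: 1000`, a failing first provider is attempted three times with waits of 1000 ms and 2000 ms before the next provider is tried. A production breaker would typically also reopen a skipped provider after a cool-down period, which this sketch omits.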
Cost Tracking
All providers return cost information:
const result = await flow.run(input);
console.log('Cost:', result.aggregated.totalCostUSD);
console.log('Tokens:', result.aggregated.totalInputTokens, 'in /',
  result.aggregated.totalOutputTokens, 'out');
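As a rough illustration of what the aggregated totals represent, the sketch below sums per-step metrics into the three fields read above. The `StepMetrics` shape and both function names are assumptions made for this example; only the `totalCostUSD`, `totalInputTokens`, and `totalOutputTokens` field names come from the snippet above.

```typescript
// Hypothetical per-step metrics; these field names are assumptions.
interface StepMetrics {
  costUSD: number;
  inputTokens: number;
  outputTokens: number;
}

// Mirrors the `result.aggregated` fields read in the example above.
interface AggregatedMetrics {
  totalCostUSD: number;
  totalInputTokens: number;
  totalOutputTokens: number;
}

// Sum cost and token counts across all steps of a run.
function aggregateMetrics(steps: StepMetrics[]): AggregatedMetrics {
  const totals: AggregatedMetrics = {
    totalCostUSD: 0,
    totalInputTokens: 0,
    totalOutputTokens: 0,
  };
  for (const step of steps) {
    totals.totalCostUSD += step.costUSD;
    totals.totalInputTokens += step.inputTokens;
    totals.totalOutputTokens += step.outputTokens;
  }
  return totals;
}

// Hypothetical budget guard built on the aggregated total.
function isOverBudget(aggregated: AggregatedMetrics, budgetUSD: number): boolean {
  return aggregated.totalCostUSD > budgetUSD;
}
```

A guard like `isOverBudget(result.aggregated, 0.05)` could be used to log or alert when a single run costs more than expected.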
The SDK exports utility functions for querying provider capabilities programmatically:
import {
  PROVIDER_METADATA,
  isImageTypeSupported,
  supportsPDFsInline,
  getProvidersForNode,
  isProviderCompatibleWithNode,
  estimateCost,
  getCheapestProvider,
  compareNativeVsOpenRouter
} from '@doclo/providers-llm';
Check Image Support
isImageTypeSupported('openai', 'image/png'); // true
isImageTypeSupported('openai', 'image/bmp'); // false
isImageTypeSupported('google', 'image/bmp'); // true (extended support)
Check PDF Support
supportsPDFsInline('openai'); // true
supportsPDFsInline('anthropic'); // true
supportsPDFsInline('google'); // true
Get Providers for Node Type
// Get all providers compatible with parse()
const parseProviders = getProvidersForNode('parse');
// Get providers compatible with extract()
const extractProviders = getProvidersForNode('extract');
// Check specific provider compatibility
isProviderCompatibleWithNode('openai', 'categorize'); // true
Estimate Costs
// Estimate cost for 1000 input + 500 output tokens
const cost = estimateCost('openai', 1000, 500);
console.log(`$${cost.toFixed(4)}`); // "$0.0125"
// Find cheapest provider for a workload
const cheapest = getCheapestProvider(10000, 1000);
console.log(cheapest.name); // "Google (Gemini)"
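The arithmetic behind a per-token estimate is simple enough to sketch. The example below assumes pricing is expressed per 1,000 tokens, as in the `inputPer1k` and `outputPer1k` fields of `PROVIDER_METADATA`; `estimateTokenCost` and `cheapestFor` are hypothetical stand-ins written for illustration, and the SDK's own `estimateCost` and `getCheapestProvider` may compute their results differently.

```typescript
// Pricing shape modeled on the `pricing` object in PROVIDER_METADATA (assumption).
interface Pricing {
  inputPer1k: number;  // USD per 1,000 input tokens
  outputPer1k: number; // USD per 1,000 output tokens
}

// cost = (input / 1000) * inputPer1k + (output / 1000) * outputPer1k
function estimateTokenCost(pricing: Pricing, inputTokens: number, outputTokens: number): number {
  return (inputTokens / 1000) * pricing.inputPer1k + (outputTokens / 1000) * pricing.outputPer1k;
}

// Return the name with the lowest estimated cost for a workload,
// or undefined when the price table is empty.
function cheapestFor(
  prices: Record<string, Pricing>,
  inputTokens: number,
  outputTokens: number
): string | undefined {
  let best: string | undefined;
  let bestCost = Infinity;
  for (const [name, pricing] of Object.entries(prices)) {
    const cost = estimateTokenCost(pricing, inputTokens, outputTokens);
    if (cost < bestCost) {
      best = name;
      bestCost = cost;
    }
  }
  return best;
}

// Example rates taken from the Anthropic pricing comment in this page.
const examplePricing: Pricing = { inputPer1k: 0.003, outputPer1k: 0.015 };
// 1000 input + 500 output tokens: 0.003 + 0.0075 = 0.0105 USD
console.log(estimateTokenCost(examplePricing, 1000, 500).toFixed(4)); // "0.0105"
```

Because output tokens are often priced several times higher than input tokens, the cheapest provider can change with the input-to-output ratio of the workload, which is why both token counts are part of the comparison.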
Compare Access Methods
const comparison = compareNativeVsOpenRouter('anthropic');
console.log(comparison.differences);
// ['Uses OpenAI-compatible format...', 'Response prefill trick...']
The PROVIDER_METADATA constant provides complete metadata for all providers:
const anthropic = PROVIDER_METADATA.anthropic;
console.log(anthropic.models); // ['claude-opus-4.5', 'claude-sonnet-4.5', ...]
console.log(anthropic.capabilities); // { supportsImages: true, supportsPDFs: true, ... }
console.log(anthropic.pricing); // { inputPer1k: 0.003, outputPer1k: 0.015, ... }
console.log(anthropic.limits); // { maxContextTokens: 200000, ... }
console.log(anthropic.inputFormats); // { images: {...}, pdfs: {...} }
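Metadata of this shape lends itself to filtering providers against a workload's requirements. The sketch below is a hedged illustration: `ProviderInfo` is a local stand-in covering only the `capabilities` and `limits` fields shown above, `providersMeeting` is a hypothetical helper, and the second sample entry is made up for contrast.

```typescript
// Local stand-in for part of a PROVIDER_METADATA entry (assumption).
interface ProviderInfo {
  capabilities: { supportsImages: boolean; supportsPDFs: boolean };
  limits: { maxContextTokens: number };
}

interface Requirements {
  needsPDFs: boolean;
  estimatedTokens: number;
}

// Keep providers that support PDFs when required and whose
// context window can hold the estimated token count.
function providersMeeting(
  metadata: Record<string, ProviderInfo>,
  req: Requirements
): string[] {
  return Object.entries(metadata)
    .filter(
      ([, info]) =>
        (!req.needsPDFs || info.capabilities.supportsPDFs) &&
        info.limits.maxContextTokens >= req.estimatedTokens
    )
    .map(([name]) => name);
}

// Sample data: the first entry reuses values printed above; the second is invented.
const sampleMetadata: Record<string, ProviderInfo> = {
  anthropic: {
    capabilities: { supportsImages: true, supportsPDFs: true },
    limits: { maxContextTokens: 200000 },
  },
  exampleTextOnly: {
    capabilities: { supportsImages: false, supportsPDFs: false },
    limits: { maxContextTokens: 32000 },
  },
};

// Only "anthropic" satisfies both the PDF and the context-size requirement.
console.log(providersMeeting(sampleMetadata, { needsPDFs: true, estimatedTokens: 50000 }));
```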
Next Steps
OpenAI: GPT-4.1, o3, o4-mini configuration
Anthropic: Claude models configuration
Google: Gemini models configuration
Mistral OCR: Mistral OCR 3 and Document AI setup
Surya OCR: Datalab Surya OCR setup