Providers are the external services that power document processing. Doclo supports two types:
  • VLM Providers (Vision Language Models): Process documents visually, extract structured data, classify documents
  • OCR Providers: Convert documents to text with layout information

Provider Types

VLM Providers

VLM providers can see document images directly. Use them for:
  • Direct extraction from visually complex documents
  • Document classification and categorization
  • Splitting multi-document files
  • Quality assessment
import { createVLMProvider } from '@docloai/providers-llm';

const vlmProvider = createVLMProvider({
  provider: 'google',
  model: 'google/gemini-2.5-flash-preview-09-2025',
  apiKey: process.env.OPENROUTER_API_KEY!,
  via: 'openrouter'
});

OCR Providers

OCR providers convert documents to structured text. Use them for:
  • High-fidelity text extraction with bounding boxes
  • Processing text-heavy documents
  • Building RAG pipelines with chunking
import { createOCRProvider } from '@docloai/providers-datalab';

const ocrProvider = createOCRProvider({
  endpoint: 'https://www.datalab.to/api/v1/ocr',
  apiKey: process.env.DATALAB_API_KEY!
});

Supported Providers

VLM Providers

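Provider  | Models                           | Vision | PDFs | Structured Output
----------|----------------------------------|--------|------|------------------
OpenAI    | GPT-4.1, o3, o4-mini             | Yes    | Yes  | Yes
Anthropic | Claude 4, Sonnet 4.5, Haiku 4.5  | Yes    | Yes  | Yes
Google    | Gemini 2.5 Pro/Flash             | Yes    | Yes  | Yes
xAI       | Grok 4.1                         | Yes    | Yes  | Yes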

OCR Providers

Provider | Package                    | Features
---------|----------------------------|----------------------
Surya    | @docloai/providers-datalab | Text + bounding boxes
Marker   | @docloai/providers-datalab | Markdown conversion
Reducto  | @docloai/providers-reducto | Chunking, citations

Access Methods

VLM providers can be accessed in two ways: through OpenRouter or through each provider's native API.

OpenRouter

A single API key for all providers with unified billing:
const provider = createVLMProvider({
  provider: 'anthropic',
  model: 'anthropic/claude-sonnet-4.5',
  apiKey: process.env.OPENROUTER_API_KEY!,
  via: 'openrouter'
});
Benefits:
  • Single API key for all providers
  • Unified billing and usage tracking
  • Automatic cost tracking in responses
  • Provider fallback without multiple API keys
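
Because one OpenRouter key covers every provider, a list of configs can be generated from model IDs alone. The following is a minimal sketch, not part of Doclo: the `OpenRouterConfig` type and `openRouterConfigs` helper are hypothetical names, and it assumes (as in the examples on this page) that the `provider` field matches the prefix of the OpenRouter model ID.

```typescript
// Hypothetical shape mirroring the config fields used in the examples on this page.
interface OpenRouterConfig {
  provider: string;
  model: string;
  apiKey: string;
  via: 'openrouter';
}

// Build one config per model ID, all sharing the same OpenRouter key.
// Assumes model IDs look like "<provider>/<model>".
function openRouterConfigs(models: string[], apiKey: string): OpenRouterConfig[] {
  const configs: OpenRouterConfig[] = [];
  for (const model of models) {
    const slash = model.indexOf('/');
    if (slash <= 0) {
      throw new Error(`Expected "<provider>/<model>", got "${model}"`);
    }
    configs.push({ provider: model.slice(0, slash), model, apiKey, via: 'openrouter' });
  }
  return configs;
}
```

Each resulting object could then be passed to `createVLMProvider`, or the whole array used as a fallback list.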

Native APIs

Direct access to provider APIs:
const provider = createVLMProvider({
  provider: 'openai',
  model: 'gpt-4.1',
  apiKey: process.env.OPENAI_API_KEY!
  // No 'via' parameter = native API
});
Use native APIs when:
  • You have existing API keys
  • You need provider-specific features
  • You want to avoid the OpenRouter intermediary
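
The choice between the two access methods can also be made at startup, depending on which keys are available. This is a sketch under stated assumptions: `VLMConfig`, `ModelChoice`, `Keys`, and `resolveConfig` are hypothetical names for illustration, and only the config fields shown above (`provider`, `model`, `apiKey`, `via`) are used. Native and OpenRouter model IDs can differ, so the caller supplies both.

```typescript
// Hypothetical shape mirroring the config fields used in the examples above.
interface VLMConfig {
  provider: string;
  model: string;
  apiKey: string;
  via?: 'openrouter';
}

interface ModelChoice {
  provider: string;
  nativeModel: string;      // e.g. 'gpt-4.1'
  openRouterModel: string;  // e.g. 'openai/gpt-4.1' (assumed ID, check OpenRouter)
}

interface Keys {
  native?: string;      // provider-specific key, if one exists
  openRouter?: string;  // shared OpenRouter key, if one exists
}

// Prefer the native API when a provider key is present; otherwise use OpenRouter.
function resolveConfig(choice: ModelChoice, keys: Keys): VLMConfig {
  if (keys.native) {
    // No 'via' field means native API, as in the example above.
    return { provider: choice.provider, model: choice.nativeModel, apiKey: keys.native };
  }
  if (keys.openRouter) {
    return {
      provider: choice.provider,
      model: choice.openRouterModel,
      apiKey: keys.openRouter,
      via: 'openrouter'
    };
  }
  throw new Error(`No API key available for ${choice.provider}`);
}
```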

Provider Selection

Choose based on your needs:
Use Case              | Recommended Provider
----------------------|-----------------------------
Fast extraction       | Google Gemini 2.5 Flash
Complex documents     | Anthropic Claude Sonnet 4.5
Cost-sensitive        | Google Gemini 2.5 Flash Lite
Reasoning required    | OpenAI o3, Anthropic Claude
OCR + text extraction | Surya or Marker
RAG chunking          | Reducto Parse
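
If the selection needs to live in code, the recommendations above can be encoded as a typed lookup. This is only an illustrative sketch: the `UseCase` keys and the `recommend` function are hypothetical names, and the values are the display names from the table, not model IDs.

```typescript
// Hypothetical keys, one per row of the recommendation table.
type UseCase =
  | 'fast-extraction'
  | 'complex-documents'
  | 'cost-sensitive'
  | 'reasoning'
  | 'ocr-text-extraction'
  | 'rag-chunking';

// Values are the provider names from the table, in the order listed.
const RECOMMENDED: Record<UseCase, string[]> = {
  'fast-extraction': ['Google Gemini 2.5 Flash'],
  'complex-documents': ['Anthropic Claude Sonnet 4.5'],
  'cost-sensitive': ['Google Gemini 2.5 Flash Lite'],
  'reasoning': ['OpenAI o3', 'Anthropic Claude'],
  'ocr-text-extraction': ['Surya', 'Marker'],
  'rag-chunking': ['Reducto Parse']
};

// Return the recommended providers for a use case.
function recommend(useCase: UseCase): string[] {
  return RECOMMENDED[useCase];
}
```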

Production Configuration

For production, use buildLLMProvider with fallback support:
import { buildLLMProvider } from '@docloai/providers-llm';

const provider = buildLLMProvider({
  providers: [
    {
      provider: 'google',
      model: 'google/gemini-2.5-flash-preview-09-2025',
      apiKey: process.env.OPENROUTER_API_KEY!,
      via: 'openrouter'
    },
    {
      provider: 'anthropic',
      model: 'anthropic/claude-sonnet-4.5',
      apiKey: process.env.OPENROUTER_API_KEY!,
      via: 'openrouter'
    }
  ],
  maxRetries: 2,
  retryDelay: 1000,
  useExponentialBackoff: true,
  circuitBreakerThreshold: 3
});
This configuration:
  • Retries failed requests up to 2 times
  • Falls back to the next provider if one fails
  • Uses a circuit breaker to skip providers that keep failing
  • Applies exponential backoff between retries
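
To make the behaviour above concrete, here is a self-contained sketch of the general retry, backoff, and circuit-breaker pattern. It is not Doclo's implementation: the function and class names are hypothetical, the delay is assumed to be in milliseconds and to double on each retry, and the breaker is assumed to count consecutive failures per provider. The library's actual schedule and thresholds may differ.

```typescript
// Delay before retry number `attempt` (1-based). With exponential backoff the
// base delay doubles on each retry; otherwise it stays constant.
function retryDelayMs(attempt: number, baseDelayMs: number, exponential: boolean): number {
  return exponential ? baseDelayMs * Math.pow(2, attempt - 1) : baseDelayMs;
}

// Counts consecutive failures per provider; the circuit is "open" (provider
// skipped) once the count reaches the threshold.
class CircuitBreaker {
  private threshold: number;
  private failures: Record<string, number> = {};

  constructor(threshold: number) {
    this.threshold = threshold;
  }

  recordFailure(name: string): void {
    this.failures[name] = (this.failures[name] || 0) + 1;
  }

  recordSuccess(name: string): void {
    this.failures[name] = 0; // a success resets the consecutive-failure count
  }

  isOpen(name: string): boolean {
    return (this.failures[name] || 0) >= this.threshold;
  }
}

// Return the first provider whose circuit is still closed, or undefined if all are open.
function nextAvailable(providers: string[], breaker: CircuitBreaker): string | undefined {
  for (const name of providers) {
    if (!breaker.isOpen(name)) {
      return name;
    }
  }
  return undefined;
}
```

Under these assumptions, `retryDelay: 1000` with `maxRetries: 2` gives waits of 1000 and then 2000 before the request moves on to the next provider, and a provider with three recorded failures is skipped until it succeeds again.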

Cost Tracking

All providers return cost information:
const result = await flow.run(input);

console.log('Cost:', result.aggregated.totalCostUSD);
console.log('Tokens:', result.aggregated.totalInputTokens, 'in /',
            result.aggregated.totalOutputTokens, 'out');
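
The aggregated fields can also feed a simple budget check across several runs. A minimal sketch, assuming only the three fields shown above; the `AggregatedUsage` type and both helper functions are hypothetical and not part of Doclo:

```typescript
// Hypothetical subset of `result.aggregated`, limited to the fields shown above.
interface AggregatedUsage {
  totalCostUSD: number;
  totalInputTokens: number;
  totalOutputTokens: number;
}

// Sum cost and token counts over several flow runs.
function sumUsage(runs: AggregatedUsage[]): AggregatedUsage {
  const total: AggregatedUsage = { totalCostUSD: 0, totalInputTokens: 0, totalOutputTokens: 0 };
  for (const run of runs) {
    total.totalCostUSD += run.totalCostUSD;
    total.totalInputTokens += run.totalInputTokens;
    total.totalOutputTokens += run.totalOutputTokens;
  }
  return total;
}

// True when the accumulated cost is strictly above the budget.
function exceedsBudget(usage: AggregatedUsage, budgetUSD: number): boolean {
  return usage.totalCostUSD > budgetUSD;
}
```

Calling `sumUsage` on the `aggregated` objects of a batch and then `exceedsBudget` on the total is one way to stop processing before costs run away.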
