Doclo is a document intelligence orchestration platform. It provides a unified interface for AI-powered document processing, from OCR to structured data extraction.

The Problem

Building document intelligence pipelines today is complex:
  • Vendor Lock-In: Teams stitch together multiple SDKs (OpenAI, Tesseract, etc.), resulting in brittle, provider-specific code. Switching providers requires significant rewrites.
  • No Automatic Failover: When a provider goes down or hits rate limits, your pipeline stops. There’s no automatic retry or fallback.
  • Lack of Visibility: Without built-in monitoring, you have no per-document cost tracking or accuracy metrics. Quality is a black box until something breaks.
  • Difficult Provider Switching: Testing a new provider means rewriting integration code. Comparing providers on your actual data is nearly impossible.

The Solution

Doclo addresses these challenges with a production-ready orchestration layer:

Provider-Agnostic Architecture

All providers implement unified interfaces. Swap OpenAI for Anthropic or Gemini with a single config change; no code rewrite required.
// Switch providers by changing config only
const provider = createVLMProvider({
  provider: 'google',  // 'openai', 'anthropic', 'xai'
  model: 'google/gemini-2.5-flash',  // 'openai/gpt-4o', 'anthropic/claude-sonnet-4'
  apiKey: process.env.OPENROUTER_API_KEY!,
  via: 'openrouter'  // Remove for native provider access
});

Automatic Fallback and Retry

Built-in resilience with circuit breakers, exponential backoff, and smart error handling:
const provider = buildLLMProvider({
  providers: [
    { provider: 'openai', model: 'openai/gpt-4.1', ... },
    { provider: 'anthropic', model: 'anthropic/claude-haiku-4.5', ... }
  ],
  maxRetries: 3,
  useExponentialBackoff: true
});
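Exponential backoff spaces retries out so a rate-limited or failing provider has time to recover before the next attempt. A minimal sketch of the delay schedule (the base delay and cap here are illustrative, not Doclo defaults):

```typescript
// Delay before retry number `attempt` (0-indexed): the base delay
// doubles on every attempt, capped so waits never grow unbounded.
function backoffDelayMs(attempt: number, baseMs = 500, capMs = 8000): number {
  return Math.min(baseMs * 2 ** attempt, capMs);
}
```

With these values, attempts 0 through 3 wait 500, 1000, 2000, and 4000 ms. Production implementations typically also add random jitter so many clients retrying at once don't hit the provider in lockstep.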

Consensus Voting

Run extraction multiple times and take the majority vote for each field. This dramatically improves accuracy for critical data.
extract({
  provider: vlmProvider,
  schema: invoiceSchema,
  consensus: {
    runs: 3,
    level: 'field',  // 'field' or 'object'
    strategy: 'majority'
  }
})
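Field-level consensus can be pictured as a per-field majority vote across the N runs. A minimal sketch of that vote (this is illustrative, not Doclo's internal implementation; values are compared by JSON serialization, and ties fall to the first value seen):

```typescript
// Pick the majority value for each field across N extraction runs.
function fieldMajority(runs: Record<string, unknown>[]): Record<string, unknown> {
  const result: Record<string, unknown> = {};
  const fields = new Set(runs.flatMap(run => Object.keys(run)));
  for (const field of fields) {
    // Count occurrences of each distinct value for this field.
    const counts = new Map<string, { count: number; value: unknown }>();
    for (const run of runs) {
      if (!(field in run)) continue;
      const key = JSON.stringify(run[field]);
      const entry = counts.get(key) ?? { count: 0, value: run[field] };
      entry.count += 1;
      counts.set(key, entry);
    }
    // Keep the most frequent value.
    let best: { count: number; value: unknown } | undefined;
    for (const entry of counts.values()) {
      if (!best || entry.count > best.count) best = entry;
    }
    if (best) result[field] = best.value;
  }
  return result;
}
```

Given runs of `{ total: 100 }`, `{ total: 100 }`, and `{ total: 10 }`, the vote keeps `total: 100`, where a single-run pipeline could have returned the outlier.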

Cost and Accuracy Visibility

Every extraction returns detailed metrics:
  • Per-document cost tracking in USD
  • Field-level accuracy metrics when using consensus
  • Provider performance monitoring
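Per-document cost in USD follows from token usage and the provider's per-token pricing. An illustrative calculation (the prices below are placeholders, not real provider rates):

```typescript
interface Usage { promptTokens: number; completionTokens: number; }
// Pricing expressed in USD per million tokens, as most providers quote it.
interface Pricing { inputPerMTok: number; outputPerMTok: number; }

// Cost of one extraction call, in USD.
function callCostUSD(usage: Usage, pricing: Pricing): number {
  return (usage.promptTokens / 1_000_000) * pricing.inputPerMTok
       + (usage.completionTokens / 1_000_000) * pricing.outputPerMTok;
}
```

Summing `callCostUSD` over every call a document triggers (including consensus runs and retries) yields the per-document figure.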

Architecture Overview

Documents flow through Flows (pipelines) composed of Nodes (operations) that use Providers (AI services) to produce Structured Data.
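The Flow/Node relationship can be sketched as ordinary async function composition. Doclo's real Flow API may differ, but the model is the same: each node is a stateless operation whose output feeds the next node. The stub nodes below are purely illustrative:

```typescript
type Node<I, O> = (input: I) => Promise<O>;

// Compose two nodes into a flow: the first node's output feeds the second.
function pipe<A, B, C>(first: Node<A, B>, second: Node<B, C>): Node<A, C> {
  return async (input) => second(await first(input));
}

// Illustrative stubs: "parse" a document to text, then "extract" a field.
const parse: Node<{ bytes: string }, { text: string }> =
  async (doc) => ({ text: doc.bytes.toUpperCase() });
const extractTotal: Node<{ text: string }, { total: string | null }> =
  async (ir) => ({ total: ir.text.match(/TOTAL: (\S+)/)?.[1] ?? null });

const flow = pipe(parse, extractTotal);
```

Because nodes are stateless and share only their input/output types, they can be reordered, reused across flows, and tested in isolation.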

Core Components

Data Flow Patterns

Direct VLM Extraction

Fastest path for simple documents:

OCR + LLM Pipeline

Most accurate for text-heavy documents:
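The value of this pattern is the DocumentIR boundary: any OCR provider that emits the intermediate representation can feed any LLM extractor. A hypothetical shape (the field names here are assumptions for illustration, not Doclo's actual IR):

```typescript
// Hypothetical DocumentIR: OCR output normalized to a provider-neutral shape.
interface DocumentIR {
  pages: { pageNumber: number; text: string }[];
}

// Flatten the IR into the prompt text an LLM extractor would consume.
// Page markers preserve document structure lost by naive concatenation.
function irToPrompt(ir: DocumentIR): string {
  return ir.pages
    .map(p => `--- page ${p.pageNumber} ---\n${p.text}`)
    .join("\n");
}
```

Because extraction consumes only the IR, swapping Datalab for Mistral OCR (or any other parser) leaves the LLM side of the pipeline untouched.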

Multi-Document Processing

Split and process document bundles:
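Splitting a bundle usually means classifying each page, then cutting at type boundaries so each resulting document is processed independently. A sketch of the boundary cut (page classification is assumed to have happened upstream; the types are illustrative):

```typescript
interface ClassifiedPage { pageNumber: number; docType: string; }

// Group consecutive pages of the same type into one document.
// A type change between adjacent pages starts a new document.
function splitBundle(pages: ClassifiedPage[]): ClassifiedPage[][] {
  const docs: ClassifiedPage[][] = [];
  for (const page of pages) {
    const current = docs[docs.length - 1];
    if (current && current[0].docType === page.docType) {
      current.push(page);
    } else {
      docs.push([page]);
    }
  }
  return docs;
}
```

Each group can then be run through its own extraction flow, so an invoice schema never sees receipt pages and vice versa.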

Key Terminology

  • Flow: A composable pipeline of nodes that transforms documents into structured data
  • Node: A reusable, stateless operation (parse, extract, categorize, etc.)
  • Provider: An AI service (OpenAI, Anthropic, Google) or OCR service (Datalab, Mistral, Reducto)
  • Model: A specific model from a provider (gpt-4o, gemini-2.5-flash, surya, marker)
  • DocumentIR: Intermediate representation format that decouples parsing from extraction
  • Schema: JSON Schema defining the structure of extracted data
  • Consensus: Run extraction N times and take majority vote for improved accuracy

Next Steps