The SDK includes pre-built flows for common document processing patterns. These flows handle provider configuration, error handling, and retries out of the box.

VLM Direct Flow

Skip OCR and send documents directly to a Vision Language Model:
import { buildVLMDirectFlow } from '@docloai/flows';

const flow = buildVLMDirectFlow({
  llmConfigs: [
    {
      provider: 'google',
      model: 'gemini-2.5-flash',
      apiKey: process.env.OPENROUTER_API_KEY!,
      via: 'openrouter'
    }
  ]
});

const result = await flow.run({ base64: documentData });
console.log(result.output);

Configuration

buildVLMDirectFlow({
  // Required: LLM provider configurations
  llmConfigs: [
    {
      provider: 'openai' | 'anthropic' | 'google' | 'xai',
      model: string,
      apiKey: string,
      via?: 'openrouter' | 'native',
      baseUrl?: string
    }
  ],

  // Optional: Retry settings
  maxRetries?: number,              // Default: 2
  retryDelay?: number,              // Default: 1000ms
  circuitBreakerThreshold?: number  // Default: 3
})
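
For example, a setup tuned for resilience might add a second config and tighten the retry budget. The values below are illustrative, and the fallback ordering assumes llmConfigs is treated as a priority list, as documented for the Multi-Provider flow:
import { buildVLMDirectFlow } from '@docloai/flows';

const flow = buildVLMDirectFlow({
  llmConfigs: [
    {
      provider: 'google',
      model: 'gemini-2.5-flash',
      apiKey: process.env.OPENROUTER_API_KEY!,
      via: 'openrouter'
    },
    {
      // Assumed fallback: tried only if the first config fails
      provider: 'openai',
      model: 'gpt-4.1',
      apiKey: process.env.OPENAI_API_KEY!,
      via: 'native'
    }
  ],
  maxRetries: 1,               // fewer retries per provider, faster failover
  retryDelay: 500,             // wait 500ms between retries
  circuitBreakerThreshold: 3   // mark a provider "open" after 3 failures
});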

When to Use

VLM Direct is ideal when:
  • Documents have complex visual layouts (tables, forms, charts)
  • Speed is more important than cost
  • OCR might introduce errors (handwritten text, unusual fonts)
It is a poorer fit when:
  • Documents are text-heavy with simple layouts
  • Cost is a primary concern (vision tokens are more expensive)
  • You need to process very large documents

Output

interface VLMDirectResult {
  output: {
    vessel: string | null;
    port: string | null;
    quantity_mt: number | null;
  };
  metrics: StepMetric[];
  artifacts: {
    vlm_extract: unknown;
  };
}
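
Every field in output is nullable, so a minimal consumer should guard before use:
const result = await flow.run({ base64: documentData });
const { output } = result;

// Each field may be null if the model could not find it
if (output.vessel !== null && output.quantity_mt !== null) {
  console.log(`${output.vessel}: ${output.quantity_mt} MT via ${output.port ?? 'unknown port'}`);
} else {
  console.warn('Extraction incomplete:', output);
}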

Multi-Provider Flow

OCR + LLM extraction with automatic provider fallback:
import { buildMultiProviderFlow } from '@docloai/flows';
import { createOCRProvider } from '@docloai/providers-llm';

const ocrProvider = createOCRProvider({
  provider: 'surya',
  endpoint: process.env.SURYA_ENDPOINT!,
  apiKey: process.env.SURYA_API_KEY!
});

const flow = buildMultiProviderFlow({
  ocr: ocrProvider,
  llmConfigs: [
    {
      provider: 'openai',
      model: 'gpt-4.1',
      apiKey: process.env.OPENAI_API_KEY!
    },
    {
      provider: 'anthropic',
      model: 'claude-haiku-4.5',
      apiKey: process.env.OPENROUTER_API_KEY!,
      via: 'openrouter'
    },
    {
      provider: 'google',
      model: 'gemini-2.5-flash',
      apiKey: process.env.GOOGLE_API_KEY!
    }
  ],
  maxRetries: 2
});

const result = await flow.run({ base64: pdfData });

Configuration

buildMultiProviderFlow({
  // Required: OCR provider
  ocr: OCRProvider,

  // Required: LLM provider configurations (in priority order)
  llmConfigs: [
    {
      provider: 'openai' | 'anthropic' | 'google' | 'xai',
      model: string,
      apiKey: string,
      via?: 'openrouter' | 'native',
      baseUrl?: string
    }
  ],

  // Optional: Retry and fallback settings
  maxRetries?: number,              // Default: 2
  retryDelay?: number,              // Default: 1000ms
  circuitBreakerThreshold?: number  // Default: 3
})

Provider Fallback

Providers are tried in priority order; if one exhausts its retries, the flow falls back to the next:
Request → OpenAI (retry 1) → OpenAI (retry 2) → Anthropic → Google → Error
The circuit breaker prevents repeated calls to failing providers (a simplified sketch follows this list):
  • After circuitBreakerThreshold failures, the provider is marked “open”
  • Open providers are skipped until the breaker resets
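
The SDK's internal breaker is not exposed, but the pattern is small enough to sketch. This standalone example (not the SDK's actual implementation) shows the two rules above:
class CircuitBreaker {
  private failures = 0;

  constructor(private threshold = 3) {}

  get isOpen(): boolean {
    return this.failures >= this.threshold;  // "open" = skip this provider
  }

  recordSuccess(): void {
    this.failures = 0;  // a success resets the breaker
  }

  recordFailure(): void {
    this.failures += 1;
  }
}

// Try providers in priority order, skipping any whose breaker is open
async function callWithFallback<T>(
  providers: { call: () => Promise<T>; breaker: CircuitBreaker }[]
): Promise<T> {
  for (const p of providers) {
    if (p.breaker.isOpen) continue;
    try {
      const result = await p.call();
      p.breaker.recordSuccess();
      return result;
    } catch {
      p.breaker.recordFailure();  // fall through to the next provider
    }
  }
  throw new Error('All providers failed or are open');
}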

Output

interface MultiProviderResult {
  ir: DocumentIR;                   // Parsed document
  output: {
    vessel: string | null;
    port: string | null;
    quantity_mt: number | null;
  };
  metrics: StepMetric[];
  artifacts: {
    parse: unknown;
    extract: unknown;
  };
}
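
Because metrics and artifacts are loosely typed (StepMetric[] and unknown), the safest way to inspect a run is to dump them rather than assume their shape:
const result = await flow.run({ base64: pdfData });

console.log('Extracted:', result.output);
console.table(result.metrics);  // one entry per flow step
console.log(JSON.stringify(result.artifacts.extract, null, 2));  // raw extract artifact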

Two-Provider Flow

Compare extraction results from two LLM providers:
import { buildTwoProviderFlow } from '@docloai/flows';
import { createOCRProvider, createLLMProvider } from '@docloai/providers-llm';

const ocrProvider = createOCRProvider({
  provider: 'surya',
  endpoint: process.env.SURYA_ENDPOINT!,
  apiKey: process.env.SURYA_API_KEY!
});

const llmA = createLLMProvider({
  provider: 'openai',
  model: 'gpt-4.1',
  apiKey: process.env.OPENAI_API_KEY!
});

const llmB = createLLMProvider({
  provider: 'anthropic',
  model: 'claude-sonnet-4',
  apiKey: process.env.ANTHROPIC_API_KEY!
});

const flow = buildTwoProviderFlow({
  ocr: ocrProvider,
  llmA: llmA,
  llmB: llmB
});

const result = await flow.run({ base64: pdfData });

console.log('Provider A:', result.outputA);
console.log('Provider B:', result.outputB);

Use Cases

  • Quality validation: Compare results to detect extraction errors
  • A/B testing: Evaluate provider performance on your documents
  • Consensus: Use matching results as high-confidence extractions (see the sketch below)
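
For the consensus case, an agreement check needs only the documented output fields. Exact equality is the simplest policy; fuzzy matching or per-field confidence thresholds are left to you:
const result = await flow.run({ base64: pdfData });
const { outputA, outputB } = result;

// Treat the extraction as high-confidence only when both providers agree
const agree =
  outputA.vessel === outputB.vessel &&
  outputA.port === outputB.port &&
  outputA.quantity_mt === outputB.quantity_mt;

if (agree) {
  console.log('High-confidence extraction:', outputA);
} else {
  console.warn('Providers disagree; flag for review', { outputA, outputB });
}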

Output

interface TwoProviderResult {
  ir: DocumentIR;
  outputA: {
    vessel: string | null;
    port: string | null;
    quantity_mt: number | null;
  };
  outputB: {
    vessel: string | null;
    port: string | null;
    quantity_mt: number | null;
  };
  metrics: StepMetric[];
  artifacts: {
    parse: unknown;
    extractA: unknown;
    extractB: unknown;
  };
}

Building Custom Flows

Use createFlow() to build your own reusable flows:
import { createFlow, parse, extract } from '@docloai/flows';
import { createVLMProvider, createOCRProvider } from '@docloai/providers-llm';

export function buildInvoiceFlow(config: {
  vlmApiKey: string;
  ocrEndpoint: string;
  ocrApiKey: string;
}) {
  const vlmProvider = createVLMProvider({
    provider: 'google',
    model: 'google/gemini-2.5-flash',
    apiKey: config.vlmApiKey,
    via: 'openrouter'
  });

  const ocrProvider = createOCRProvider({
    provider: 'surya',
    endpoint: config.ocrEndpoint,
    apiKey: config.ocrApiKey
  });

  const invoiceSchema = {
    type: 'object',
    properties: {
      invoiceNumber: { type: 'string' },
      date: { type: 'string' },
      vendor: { type: 'string' },
      total: { type: 'number' },
      lineItems: {
        type: 'array',
        items: {
          type: 'object',
          properties: {
            description: { type: 'string' },
            quantity: { type: 'number' },
            unitPrice: { type: 'number' },
            amount: { type: 'number' }
          }
        }
      }
    }
  };

  return createFlow()
    .acceptFormats(['application/pdf', 'image/jpeg', 'image/png'])
    .step('parse', parse({ provider: ocrProvider }))
    .step('extract', extract({
      provider: vlmProvider,
      schema: invoiceSchema,
      // Run the extraction 3 times and keep the majority result
      consensus: { runs: 3, strategy: 'majority' }
    }))
    .build();
}

// Usage
const invoiceFlow = buildInvoiceFlow({
  vlmApiKey: process.env.OPENROUTER_API_KEY!,
  ocrEndpoint: process.env.SURYA_ENDPOINT!,
  ocrApiKey: process.env.SURYA_API_KEY!
});

const result = await invoiceFlow.run({ base64: invoicePdf });

Comparing Approaches

Flow             OCR            Speed    Cost     Best For
VLM Direct       No             Fast     Higher   Visual documents, forms
Multi-Provider   Yes            Medium   Lower    Text documents, fallback
Two-Provider     Yes            Slower   Higher   Validation, comparison
Custom           Configurable   Varies   Varies   Specific requirements
