Documentation Index

Fetch the complete documentation index at: https://docs.doclo.ai/llms.txt

Use this file to discover all available pages before exploring further.
The SDK includes pre-built flows for common document processing patterns. These flows handle provider configuration, error handling, and retries out of the box.
VLM Direct Flow
Skip OCR and send documents directly to a Vision Language Model:
```typescript
import { buildVLMDirectFlow } from '@doclo/flows';

const flow = buildVLMDirectFlow({
  llmConfigs: [
    {
      provider: 'google',
      model: 'gemini-2.5-flash',
      apiKey: process.env.GOOGLE_API_KEY!,
      via: 'openrouter'
    }
  ]
});

const result = await flow.run({ base64: documentData });
console.log(result.output);
```
Configuration
```typescript
buildVLMDirectFlow({
  // Required: LLM provider configurations
  llmConfigs: [
    {
      provider: 'openai' | 'anthropic' | 'google' | 'xai',
      model: string,
      apiKey: string,
      via?: 'openrouter' | 'native',
      baseUrl?: string
    }
  ],

  // Optional: retry settings
  maxRetries?: number,              // Default: 2
  retryDelay?: number,              // Default: 1000ms
  circuitBreakerThreshold?: number  // Default: 3
})
```
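The retry settings can be read as: up to `maxRetries` additional attempts after the first, with `retryDelay` milliseconds between them. A minimal sketch of that semantics (assuming a fixed delay between attempts; the SDK's internal implementation may differ):

```typescript
// Illustrative retry helper -- not the SDK's internals.
// maxRetries = 2 means up to 3 attempts in total (1 initial + 2 retries).
async function withRetries<T>(
  fn: () => Promise<T>,
  maxRetries = 2,
  retryDelay = 1000
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      if (attempt < maxRetries) {
        // Wait before the next attempt
        await new Promise((resolve) => setTimeout(resolve, retryDelay));
      }
    }
  }
  throw lastError;
}
```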
When to Use
VLM direct is ideal when:

- Documents have complex visual layouts (tables, forms, charts)
- Speed is more important than cost
- OCR might introduce errors (handwritten text, unusual fonts)

VLM direct may not be the best choice when:

- Documents are text-heavy with simple layouts
- Cost is a primary concern (vision tokens are more expensive)
- You need to process very large documents
Output
```typescript
interface VLMDirectResult {
  output: {
    vessel: string | null;
    port: string | null;
    quantity_mt: number | null;
  };
  metrics: StepMetric[];
  artifacts: {
    vlm_extract: unknown;
  };
}
```
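Every output field is nullable, so guard before use. A small illustrative helper (`missingFields` and `VLMOutput` are not SDK exports) that reports which fields came back empty:

```typescript
// Mirrors the nullable output shape shown above.
interface VLMOutput {
  vessel: string | null;
  port: string | null;
  quantity_mt: number | null;
}

// Return the names of fields the model could not extract.
function missingFields(output: VLMOutput): string[] {
  return (Object.keys(output) as (keyof VLMOutput)[]).filter(
    (key) => output[key] === null
  );
}
```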
Multi-Provider Flow
OCR + LLM extraction with automatic provider fallback:
```typescript
import { buildMultiProviderFlow } from '@doclo/flows';
import { createOCRProvider } from '@doclo/providers-llm';

const ocrProvider = createOCRProvider({
  provider: 'surya',
  endpoint: process.env.SURYA_ENDPOINT!,
  apiKey: process.env.SURYA_API_KEY!
});

const flow = buildMultiProviderFlow({
  ocr: ocrProvider,
  llmConfigs: [
    {
      provider: 'openai',
      model: 'gpt-4.1',
      apiKey: process.env.OPENAI_API_KEY!
    },
    {
      provider: 'anthropic',
      model: 'claude-haiku-4.5',
      apiKey: process.env.ANTHROPIC_API_KEY!,
      via: 'openrouter'
    },
    {
      provider: 'google',
      model: 'gemini-2.5-flash',
      apiKey: process.env.GOOGLE_API_KEY!
    }
  ],
  maxRetries: 2
});

const result = await flow.run({ base64: pdfData });
```
Configuration
```typescript
buildMultiProviderFlow({
  // Required: OCR provider
  ocr: OCRProvider,

  // Required: LLM provider configurations (in priority order)
  llmConfigs: [
    {
      provider: 'openai' | 'anthropic' | 'google' | 'xai',
      model: string,
      apiKey: string,
      via?: 'openrouter' | 'native',
      baseUrl?: string
    }
  ],

  // Optional: retry and fallback settings
  maxRetries?: number,              // Default: 2
  retryDelay?: number,              // Default: 1000ms
  circuitBreakerThreshold?: number  // Default: 3
})
```
Provider Fallback
Providers are tried in order. If one fails after exhausting its retries, the next is used:

```
Request → OpenAI (retry 1) → OpenAI (retry 2) → Anthropic → Google → Error
```

The circuit breaker prevents repeated calls to failing providers:

- After `circuitBreakerThreshold` failures, the provider is marked "open"
- Open providers are skipped until the breaker resets
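The breaker behavior can be sketched in a few lines; `CircuitBreaker` here is an illustrative model of the semantics described above, not the SDK's internal class:

```typescript
// Illustrative circuit breaker: opens after `threshold` failures,
// and a success resets the failure count.
class CircuitBreaker {
  private failures = 0;

  constructor(private threshold: number = 3) {}

  // An "open" breaker means the provider should be skipped.
  isOpen(): boolean {
    return this.failures >= this.threshold;
  }

  recordFailure(): void {
    this.failures += 1;
  }

  recordSuccess(): void {
    this.failures = 0; // reset on success
  }
}
```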
Output
```typescript
interface MultiProviderResult {
  ir: DocumentIR;  // Parsed document
  output: {
    vessel: string | null;
    port: string | null;
    quantity_mt: number | null;
  };
  metrics: StepMetric[];
  artifacts: {
    parse: unknown;
    extract: unknown;
  };
}
```
Two-Provider Flow
Compare extraction results from two LLM providers:
```typescript
import { buildTwoProviderFlow } from '@doclo/flows';
import { createOCRProvider, createLLMProvider } from '@doclo/providers-llm';

const ocrProvider = createOCRProvider({
  provider: 'surya',
  endpoint: process.env.SURYA_ENDPOINT!,
  apiKey: process.env.SURYA_API_KEY!
});

const llmA = createLLMProvider({
  provider: 'openai',
  model: 'gpt-4.1',
  apiKey: process.env.OPENAI_API_KEY!
});

const llmB = createLLMProvider({
  provider: 'anthropic',
  model: 'claude-sonnet-4',
  apiKey: process.env.ANTHROPIC_API_KEY!
});

const flow = buildTwoProviderFlow({
  ocr: ocrProvider,
  llmA: llmA,
  llmB: llmB
});

const result = await flow.run({ base64: pdfData });
console.log('Provider A:', result.outputA);
console.log('Provider B:', result.outputB);
```
Use Cases
- **Quality validation**: Compare results to detect extraction errors
- **A/B testing**: Evaluate provider performance on your documents
- **Consensus**: Use matching results as high-confidence extractions
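The consensus use case reduces to a field-by-field comparison of `outputA` and `outputB`. A sketch of one way to do it (`consensus` and `Extraction` are illustrative names, not SDK exports):

```typescript
// Mirrors the nullable output shape both providers return.
interface Extraction {
  vessel: string | null;
  port: string | null;
  quantity_mt: number | null;
}

// Fields where both providers agree are high-confidence;
// disagreements are flagged for manual review.
function consensus(a: Extraction, b: Extraction) {
  const agreed: Partial<Extraction> = {};
  const disputed: (keyof Extraction)[] = [];
  for (const key of Object.keys(a) as (keyof Extraction)[]) {
    if (a[key] === b[key]) {
      (agreed as Record<string, unknown>)[key] = a[key];
    } else {
      disputed.push(key);
    }
  }
  return { agreed, disputed };
}
```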
Output
```typescript
interface TwoProviderResult {
  ir: DocumentIR;
  outputA: {
    vessel: string | null;
    port: string | null;
    quantity_mt: number | null;
  };
  outputB: {
    vessel: string | null;
    port: string | null;
    quantity_mt: number | null;
  };
  metrics: StepMetric[];
  artifacts: {
    parse: unknown;
    extractA: unknown;
    extractB: unknown;
  };
}
```
Building Custom Pre-built Flows
Use `createFlow()` to build your own reusable flows:
```typescript
import { createFlow, parse, extract, split, combine, categorize } from '@doclo/flows';
import { createVLMProvider, createOCRProvider } from '@doclo/providers-llm';

export function buildInvoiceFlow(config: {
  vlmApiKey: string;
  ocrEndpoint: string;
  ocrApiKey: string;
}) {
  const vlmProvider = createVLMProvider({
    provider: 'google',
    model: 'google/gemini-2.5-flash',
    apiKey: config.vlmApiKey,
    via: 'openrouter'
  });

  const ocrProvider = createOCRProvider({
    provider: 'surya',
    endpoint: config.ocrEndpoint,
    apiKey: config.ocrApiKey
  });

  const invoiceSchema = {
    type: 'object',
    properties: {
      invoiceNumber: { type: 'string' },
      date: { type: 'string' },
      vendor: { type: 'string' },
      total: { type: 'number' },
      lineItems: {
        type: 'array',
        items: {
          type: 'object',
          properties: {
            description: { type: 'string' },
            quantity: { type: 'number' },
            unitPrice: { type: 'number' },
            amount: { type: 'number' }
          }
        }
      }
    }
  };

  return createFlow()
    .acceptFormats(['application/pdf', 'image/jpeg', 'image/png'])
    .step('parse', parse({ provider: ocrProvider }))
    .step('extract', extract({
      provider: vlmProvider,
      schema: invoiceSchema,
      consensus: { runs: 3, strategy: 'majority' }
    }))
    .build();
}

// Usage
const invoiceFlow = buildInvoiceFlow({
  vlmApiKey: process.env.OPENROUTER_API_KEY!,
  ocrEndpoint: process.env.SURYA_ENDPOINT!,
  ocrApiKey: process.env.SURYA_API_KEY!
});

const result = await invoiceFlow.run({ base64: invoicePdf });
```
Comparing Approaches
| Flow | OCR | Speed | Cost | Best For |
|------|-----|-------|------|----------|
| VLM Direct | No | Fast | Higher | Visual documents, forms |
| Multi-Provider | Yes | Medium | Lower | Text documents, fallback |
| Two-Provider | Yes | Slower | Higher | Validation, comparison |
| Custom | Configurable | Varies | Varies | Specific requirements |
Next Steps
- **Creating Flows**: build your own custom flows
- **Flow Registry**: register flows for reuse