Quickstart

Build your first document extraction flow using the Doclo SDK.

Prerequisites

Node.js 18+ installed

pnpm, npm, or yarn

Basic TypeScript knowledge

An AI provider API key (OpenRouter, OpenAI, or Anthropic)

Installation

Install the core packages:

pnpm add @doclo/flows @doclo/providers-llm

These two packages include everything you need:

@doclo/flows - Flow builder and all processing nodes
@doclo/providers-llm - LLM/VLM provider integrations

API Key Setup

This guide uses OpenRouter as a gateway to multiple AI providers. You can also use native provider keys directly.

Get an OpenRouter API Key

Create environment file

Create a .env.local file in your project root:

OPENROUTER_API_KEY=sk-or-v1-your-key-here

Load environment variables

For Node.js scripts, install dotenv:

pnpm add dotenv

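Then load it at the top of your script. dotenv reads a file named .env by default, so pass the .env.local path explicitly:

import dotenv from 'dotenv';
dotenv.config({ path: '.env.local' });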
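A missing key otherwise only shows up later as an authentication error from the provider, so it can help to fail fast at startup. The following is a minimal sketch; `requireEnv` is a hypothetical helper written for this guide, not part of the Doclo SDK.

```typescript
// Hypothetical helper (not part of the Doclo SDK): read a required variable
// from an environment map and fail early with a readable message.
type Env = Record<string, string | undefined>;

function requireEnv(name: string, env: Env): string {
  const value = env[name];
  if (value === undefined || value.trim() === '') {
    throw new Error(`Missing environment variable ${name}. Check your .env.local file.`);
  }
  return value;
}
```

In a script you would call it as `requireEnv('OPENROUTER_API_KEY', process.env)` and pass the result as `apiKey` instead of using the non-null assertion.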

Your First Flow: Invoice Extraction

Create a file called invoice-extract.ts:

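import dotenv from 'dotenv';
import { createFlow, extract, categorize } from '@doclo/flows';
import { createVLMProvider } from '@doclo/providers-llm';
import fs from 'fs';

// dotenv reads .env by default, so point it at .env.local explicitly
dotenv.config({ path: '.env.local' });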

// Helper to convert a file to a base64 data URL (treats every non-PDF file as JPEG)
function fileToBase64(filePath: string): string {
  const fileBuffer = fs.readFileSync(filePath);
  const base64 = fileBuffer.toString('base64');
  const mimeType = filePath.endsWith('.pdf') ? 'application/pdf' : 'image/jpeg';
  return `data:${mimeType};base64,${base64}`;
}

// Providers for different document qualities
const proProvider = createVLMProvider({
  provider: 'google', model: 'google/gemini-2.5-pro',
  apiKey: process.env.OPENROUTER_API_KEY!, via: 'openrouter'
});
const flashProvider = createVLMProvider({
  provider: 'google', model: 'google/gemini-2.5-flash',
  apiKey: process.env.OPENROUTER_API_KEY!, via: 'openrouter'
});
const liteProvider = createVLMProvider({
  provider: 'google', model: 'google/gemini-2.5-flash-lite',
  apiKey: process.env.OPENROUTER_API_KEY!, via: 'openrouter'
});

// Schema for invoice extraction
const invoiceSchema = {
  type: 'object',
  properties: {
    invoiceNumber: { type: 'string' },
    vendor: { type: 'string' },
    date: { type: 'string' },
    total: { type: 'number' },
    currency: { type: 'string' }
  }
};

// Build the flow with quality-based routing
const flow = createFlow()
  // Assess document quality first
  .step('assess', categorize({
    provider: liteProvider,
    categories: ['low', 'medium', 'high'],
    additionalInstructions: 'Assess document quality: low = poor scan/handwritten, medium = decent quality, high = clean digital document'
  }))
  // Route to appropriate model based on quality
  .conditional('extract', (data) => {
    const options = { schema: invoiceSchema, consensus: { runs: 3, level: 'field', strategy: 'majority' } };
    switch (data.category) {
      case 'low':
        return extract({ provider: proProvider, ...options });
      case 'medium':
        return extract({ provider: flashProvider, ...options });
      default:
        return extract({ provider: liteProvider, ...options });
    }
  })
  .build();

// Run the flow
async function processInvoice(pdfPath: string) {
  const result = await flow.run({ base64: fileToBase64(pdfPath) });
  console.log(result);
  return result;
}

processInvoice('./invoice.pdf').catch(console.error);
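The fileToBase64 helper in the script labels every non-PDF file as image/jpeg, which would mislabel a PNG scan. One way to make that explicit is a small lookup table that rejects unknown extensions. This is a sketch only: `mimeTypeFor` and `toDataUrl` are hypothetical names, and the list of formats is illustrative rather than a statement of what Doclo or a given model accepts.

```typescript
// Illustrative lookup table; extend it with the formats you actually accept.
const MIME_TYPES: Record<string, string> = {
  pdf: 'application/pdf',
  jpg: 'image/jpeg',
  jpeg: 'image/jpeg',
  png: 'image/png',
  webp: 'image/webp',
};

// Hypothetical helper: map a file path to a MIME type by its extension.
function mimeTypeFor(filePath: string): string {
  const dot = filePath.lastIndexOf('.');
  const ext = dot === -1 ? '' : filePath.slice(dot + 1).toLowerCase();
  const mimeType = MIME_TYPES[ext];
  if (mimeType === undefined) {
    throw new Error(`Unsupported file extension: "${ext}"`);
  }
  return mimeType;
}

// Build a data URL from content that is already base64-encoded.
function toDataUrl(filePath: string, base64: string): string {
  return `data:${mimeTypeFor(filePath)};base64,${base64}`;
}
```

Inside fileToBase64 you could then return `toDataUrl(filePath, base64)` instead of building the string by hand.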

Run the Example

Add a test PDF

Save an invoice PDF as invoice.pdf in your project directory.

Run the script

npx tsx invoice-extract.ts

View the output

{
  output: {
    invoiceNumber: "INV-2024-001",
    vendor: "Acme Corporation",
    date: "2024-01-15",
    total: 1250.00,
    currency: "USD"
  },
  aggregated: {
    totalDurationMs: 2134,
    totalCostUSD: 0.0045,
    totalInputTokens: 2400,
    totalOutputTokens: 320,
    stepCount: 2
  },
  metrics: [
    { step: "assess", ms: 312, costUSD: 0.0004 },
    { step: "extract", ms: 1822, costUSD: 0.0041 }
  ],
  artifacts: {
    assess: { category: "high" },
    extract: { invoiceNumber: "INV-2024-001", ... }
  }
}

Understanding the Flow

This example demonstrates two key Doclo features:

Feature	What it does
Quality-based routing	Assesses document quality, routes to the right model for the job
Consensus voting	Runs extraction 3 times and votes on each field for accuracy
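To make field-level majority voting concrete, here is a sketch of the general idea. It is not Doclo's implementation, which this guide does not show: `majorityVote` is a hypothetical function, values are compared by their JSON representation, and ties go to the value seen first.

```typescript
type Extraction = Record<string, unknown>;

interface Tally { key: string; value: unknown; count: number }

// Illustrative only: for each field, keep the value that the most runs agree on.
function majorityVote(runs: Extraction[]): Extraction {
  // Collect every field name that appears in any run, in first-seen order.
  const fields: string[] = [];
  for (const run of runs) {
    for (const field of Object.keys(run)) {
      if (fields.indexOf(field) === -1) fields.push(field);
    }
  }

  const result: Extraction = {};
  for (const field of fields) {
    // Count how often each distinct value occurs for this field.
    const tally: Tally[] = [];
    for (const run of runs) {
      if (!(field in run)) continue;
      const key = JSON.stringify(run[field]);
      let existing: Tally | undefined;
      for (const entry of tally) {
        if (entry.key === key) existing = entry;
      }
      if (existing) existing.count += 1;
      else tally.push({ key, value: run[field], count: 1 });
    }
    // The field came from at least one run, so tally is never empty here.
    let best = tally[0];
    for (const entry of tally) {
      if (entry.count > best.count) best = entry;
    }
    result[field] = best.value;
  }
  return result;
}
```

With three runs where two report a total of 1250 and one reports 1520, the voted total is 1250; each field is decided independently, so one bad run cannot spoil the whole record.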

The routing logic:

Low quality (poor scans, handwritten) → gemini-2.5-pro for maximum accuracy
Medium quality (decent scans) → gemini-2.5-flash for balanced performance
High quality (clean digital docs) → gemini-2.5-flash-lite for speed and cost
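The same routing can be written as a plain lookup, which makes it easy to unit-test without calling any model. This is a sketch: `modelForCategory` is a hypothetical helper, and the model names are simply the ones used in this guide.

```typescript
type Quality = 'low' | 'medium' | 'high';

// Mirrors the switch statement in the flow: one model per quality category.
const MODEL_BY_QUALITY: Record<Quality, string> = {
  low: 'google/gemini-2.5-pro',
  medium: 'google/gemini-2.5-flash',
  high: 'google/gemini-2.5-flash-lite',
};

// Hypothetical helper: unknown categories fall back to the cheapest model,
// matching the `default` branch of the switch in the flow.
function modelForCategory(category: string): string {
  if (category === 'low' || category === 'medium' || category === 'high') {
    return MODEL_BY_QUALITY[category];
  }
  return MODEL_BY_QUALITY.high;
}
```

Writing it this way also surfaces a design choice in the example flow: an unexpected category is treated like a clean document. If misrouting a poor scan is costly for you, falling back to the strongest model may be the safer default.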

The result object contains:

Property	Description
`output`	Final extracted data matching your schema
`aggregated`	Totals: duration, cost, tokens, step count
`metrics`	Per-step timing and cost breakdown
`artifacts`	Intermediate outputs from each step
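If you want to work with the result in a typed way, you could describe its shape yourself. The interfaces below are inferred from the sample output in this guide and are assumptions, not the SDK's published types; `costliestStep` and `describeResult` are hypothetical helpers.

```typescript
// Shapes inferred from the sample output above; treat them as assumptions.
interface StepMetric {
  step: string;
  ms: number;
  costUSD: number;
}

interface FlowResultLike<T> {
  output: T;
  aggregated: {
    totalDurationMs: number;
    totalCostUSD: number;
    totalInputTokens: number;
    totalOutputTokens: number;
    stepCount: number;
  };
  metrics: StepMetric[];
  artifacts: Record<string, unknown>;
}

// Return the step with the highest cost, or undefined when there are no metrics.
function costliestStep(metrics: StepMetric[]): StepMetric | undefined {
  let costliest: StepMetric | undefined;
  for (const metric of metrics) {
    if (costliest === undefined || metric.costUSD > costliest.costUSD) {
      costliest = metric;
    }
  }
  return costliest;
}

// Produce a one-line summary suitable for logging.
function describeResult(result: FlowResultLike<unknown>): string {
  const costliest = costliestStep(result.metrics);
  const label = costliest ? costliest.step : 'none';
  const { stepCount, totalDurationMs } = result.aggregated;
  return `${stepCount} steps in ${totalDurationMs} ms, costliest step: ${label}`;
}
```

A summary like this is useful when tuning the routing, since it shows which step dominates the cost of a run.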

Alternative Providers

Use createVLMProvider for a single provider, or buildLLMProvider for fallback chains:

Single Provider

import { createVLMProvider } from '@doclo/providers-llm';

const provider = createVLMProvider({
  provider: 'anthropic',
  model: 'anthropic/claude-sonnet-4',
  apiKey: process.env.OPENROUTER_API_KEY!,
  via: 'openrouter'
});

With Fallback

import { buildLLMProvider } from '@doclo/providers-llm';

const provider = buildLLMProvider({
  providers: [
    { provider: 'anthropic', model: 'anthropic/claude-sonnet-4', apiKey: process.env.OPENROUTER_API_KEY!, via: 'openrouter' },
    { provider: 'openai', model: 'openai/gpt-5.1', apiKey: process.env.OPENROUTER_API_KEY!, via: 'openrouter' }
  ],
  maxRetries: 2
});

Native Keys

import { createVLMProvider } from '@doclo/providers-llm';

// Use provider directly without OpenRouter
const provider = createVLMProvider({
  provider: 'openai',
  model: 'gpt-4o',
  apiKey: process.env.OPENAI_API_KEY!
});
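For intuition about what a fallback chain does, the sketch below tries a list of async operations in order and returns the first success. It is a generic illustration, not how buildLLMProvider is implemented: `firstSuccessful` is a hypothetical function, and it assumes that maxRetries means extra tries per provider before moving on to the next, which the guide does not specify.

```typescript
type Attempt<T> = () => Promise<T>;

// Illustrative only: try each attempt in order. Each one gets one initial try
// plus up to `maxRetries` retries before the next attempt in the list is used.
async function firstSuccessful<T>(attempts: Attempt<T>[], maxRetries: number): Promise<T> {
  let lastError: unknown = new Error('No attempts were provided');
  for (const attempt of attempts) {
    for (let tries = 0; tries <= maxRetries; tries++) {
      try {
        return await attempt();
      } catch (error) {
        // Remember the failure and keep going.
        lastError = error;
      }
    }
  }
  // Every attempt failed: surface the most recent error.
  throw lastError;
}
```

The order of the list matters: put the provider you prefer first, and the one you are willing to fall back to after it.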

Troubleshooting

Cannot find module '@doclo/flows'

Make sure packages are installed:

pnpm add @doclo/flows @doclo/providers-llm

OPENROUTER_API_KEY is undefined

Check .env.local exists with your key
Make sure dotenv loads at the top of your file and points at .env.local (it reads .env by default)
Restart your dev server if using Next.js

429 Rate Limit Exceeded

Check your OpenRouter usage
Add credits to your account
Use buildLLMProvider() with retry logic for production
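If you prefer to handle rate limits yourself, a common pattern is to retry with exponentially growing delays. The sketch below is generic and makes one loud assumption: that the thrown error exposes the HTTP status as a `status` property. How Doclo surfaces a 429 is not shown in this guide, so adjust `isRateLimited` to the error you actually receive. All three function names are hypothetical.

```typescript
// Assumed error shape: many HTTP clients expose the status code as `status`.
interface HttpLikeError { status?: number }

// Delay doubles with each attempt (500, 1000, 2000, ...) and is capped at maxMs.
function backoffDelayMs(attempt: number, baseMs = 500, maxMs = 8000): number {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

function isRateLimited(error: unknown): boolean {
  return typeof error === 'object' && error !== null
    && (error as HttpLikeError).status === 429;
}

const defaultSleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Retry `fn` only on rate-limit errors, waiting longer before each retry.
// Other errors, or a rate limit after `maxRetries` retries, are rethrown.
async function withBackoff<T>(
  fn: () => Promise<T>,
  maxRetries = 3,
  sleep: (ms: number) => Promise<void> = defaultSleep,
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (error) {
      if (!isRateLimited(error) || attempt >= maxRetries) throw error;
      await sleep(backoffDelayMs(attempt));
    }
  }
}
```

You would wrap the flow call, for example `withBackoff(() => flow.run(input))`. The `sleep` parameter exists so tests can skip the real waiting.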

Schema validation failed

Make required fields optional if data might not exist
Check your schema uses valid JSON Schema format
Review the error message for which field failed
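In JSON Schema, a property is optional unless it is listed in the `required` array, so one way to apply the first tip is to list only the fields every invoice is certain to have. The sketch below shows such a schema plus a hypothetical `missingRequired` helper for checking extracted data yourself. The `ObjectSchema` type is a simplified assumption for illustration, and which fields to require is your call.

```typescript
// Simplified schema type for illustration; real JSON Schema allows much more.
interface ObjectSchema {
  type: 'object';
  properties: Record<string, { type: string }>;
  required?: string[];
}

// Only invoiceNumber and total are required; the other fields are optional.
const invoiceSchema: ObjectSchema = {
  type: 'object',
  properties: {
    invoiceNumber: { type: 'string' },
    vendor: { type: 'string' },
    date: { type: 'string' },
    total: { type: 'number' },
    currency: { type: 'string' },
  },
  required: ['invoiceNumber', 'total'],
};

// Hypothetical helper: list required fields that are absent or null in the data.
function missingRequired(schema: ObjectSchema, data: Record<string, unknown>): string[] {
  const required = schema.required ?? [];
  return required.filter((field) => data[field] === undefined || data[field] === null);
}
```

Running the helper on an extraction result before further processing gives you the names of the missing fields directly, rather than a single pass or fail.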

Next Steps

Concepts

Understand flows, nodes, and providers

Nodes Reference

Explore all processing nodes

Providers

Configure LLM and OCR providers

Consensus Voting

Improve accuracy with multi-provider voting
