When accuracy is critical, running extraction multiple times and comparing results can catch errors that a single pass might miss. This guide covers consensus voting strategies for high-stakes document processing.

Prerequisites

  • Node.js 18+
  • API keys for multiple providers (or OpenRouter for unified access)
  • A document to process

When to Use Consensus

Consensus voting is valuable when:
  • Processing high-value documents (financial, legal, medical)
  • Extraction errors have significant business impact
  • You need confidence scores for extracted values
  • Auditing requirements demand verification
Trade-off: Cost grows in proportion to the number of runs, and latency grows with it (fully in proportion if the runs execute one after another).
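To make the trade-off concrete, here is a minimal back-of-the-envelope sketch. The helper `estimateConsensusOverhead` and the per-run numbers are hypothetical and not part of the Doclo API. It assumes every run costs about as much as a single pass, and it treats sequential execution as the worst case for latency.

```typescript
// Hypothetical helper, not part of any library API.
// Assumes every run costs about the same as a single extraction pass.
interface ConsensusOverhead {
  totalCostUSD: number;
  worstCaseLatencyMs: number; // upper bound: runs executed one after another
}

function estimateConsensusOverhead(
  runs: number,
  costPerRunUSD: number,
  latencyPerRunMs: number
): ConsensusOverhead {
  return {
    totalCostUSD: runs * costPerRunUSD,
    worstCaseLatencyMs: runs * latencyPerRunMs
  };
}

// Illustrative numbers only: 3 runs at $0.002 and 4 seconds each.
const threeRunOverhead = estimateConsensusOverhead(3, 0.002, 4000);
```

If runs execute concurrently, actual latency should be closer to that of the slowest single run than to this upper bound.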

Basic Consensus Configuration

Enable consensus on the extract node:
import { createFlow, extract } from '@doclo/flows';
import { createVLMProvider } from '@doclo/providers-llm';

const vlmProvider = createVLMProvider({
  provider: 'google',
  model: 'google/gemini-2.5-flash',
  apiKey: process.env.OPENROUTER_API_KEY!,
  via: 'openrouter'
});

const flow = createFlow()
  .step('extract', extract({
    provider: vlmProvider,
    schema: invoiceSchema, // JSON Schema for the fields to extract; assumed to be defined elsewhere
    consensus: {
      runs: 3,              // Run extraction 3 times
      strategy: 'majority', // Use majority voting
      threshold: 0.6        // Require 60% agreement
    }
  }))
  .build();

Voting Strategies

Majority (Default)

Each field is voted on independently. The value with the most votes wins, provided its share of the votes meets the threshold:
consensus: {
  runs: 3,
  strategy: 'majority',
  threshold: 0.5  // More than half must agree
}
Example with 3 runs:
  • Run 1: { total: 1250.00 }
  • Run 2: { total: 1250.00 }
  • Run 3: { total: 1250.50 }
Result: { total: 1250.00 } (2/3 agreement)
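The per-field vote in this example can be sketched in a few lines. This is an illustration of the idea, not Doclo's implementation: `majorityVote` and its types are hypothetical, values are compared by their JSON form, and a field counts as agreed when its vote share is at least the threshold.

```typescript
type FieldValue = string | number | boolean | null;
type Extraction = Record<string, FieldValue>;

interface FieldConsensus {
  value: FieldValue;  // most common value across runs
  agreement: number;  // share of runs that returned that value
  votes: number;      // number of runs that returned that value
  uncertain: boolean; // true when agreement is below the threshold
}

function majorityVote(
  runs: Extraction[],
  threshold: number
): Record<string, FieldConsensus> {
  // Collect every field name that appears in at least one run.
  const fieldNames: string[] = [];
  for (const run of runs) {
    for (const fieldName of Object.keys(run)) {
      if (fieldNames.indexOf(fieldName) === -1) fieldNames.push(fieldName);
    }
  }

  const consensus: Record<string, FieldConsensus> = {};
  for (const field of fieldNames) {
    const tally: Record<string, { value: FieldValue; count: number }> = {};
    let best: { value: FieldValue; count: number } = { value: null, count: 0 };

    for (const run of runs) {
      const value = run[field] ?? null;  // a missing field counts as a vote for null
      const key = JSON.stringify(value); // keeps 1250 and "1250" distinct
      const entry = tally[key] ?? { value, count: 0 };
      entry.count += 1;
      tally[key] = entry;
      // On a tie, the value that reached the top count first is kept.
      if (entry.count > best.count) best = entry;
    }

    const agreement = best.count / runs.length;
    consensus[field] = {
      value: best.value,
      agreement,
      votes: best.count,
      uncertain: agreement < threshold
    };
  }
  return consensus;
}

// The three runs from the example above, with a 0.5 threshold.
const invoiceConsensus = majorityVote(
  [{ total: 1250.0 }, { total: 1250.0 }, { total: 1250.5 }],
  0.5
);
```

Here `invoiceConsensus.total` holds the value 1250 with 2 votes. Raising the threshold to 0.8 would leave the same winner but mark it as uncertain.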

Unanimous

All runs must agree for a value to be accepted:
consensus: {
  runs: 3,
  strategy: 'unanimous'
}
If any run disagrees, that field is marked as uncertain or null.

Weighted

Assign different weights to different providers or runs:
consensus: {
  runs: 3,
  strategy: 'weighted',
  weights: [0.5, 0.3, 0.2]  // First run weighted highest
}
Useful when you have a primary provider and backup providers.
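One way to picture weighted voting: each run's weight is added to the value it returned, and the value with the largest summed weight wins. The sketch below assumes exactly that rule. `weightedVote` is a hypothetical helper, and the library's actual tie-breaking and normalization may differ.

```typescript
type VoteValue = string | number | boolean | null;

interface WeightedOutcome {
  value: VoteValue;
  weightShare: number; // winner's summed weight divided by the total weight
}

function weightedVote(candidates: VoteValue[], weights: number[]): WeightedOutcome {
  if (candidates.length === 0 || candidates.length !== weights.length) {
    throw new Error('candidates and weights must be non-empty and equal in length');
  }

  // Sum the weight behind each distinct value.
  const weightByValue: Record<string, { value: VoteValue; weight: number }> = {};
  let totalWeight = 0;
  for (let i = 0; i < candidates.length; i++) {
    const value = candidates[i] ?? null;
    const weight = weights[i] ?? 0;
    const key = JSON.stringify(value);
    const entry = weightByValue[key] ?? { value, weight: 0 };
    entry.weight += weight;
    weightByValue[key] = entry;
    totalWeight += weight;
  }
  if (totalWeight <= 0) throw new Error('weights must sum to a positive number');

  // Pick the value with the largest summed weight (ties are resolved arbitrarily).
  let winner: { value: VoteValue; weight: number } | undefined;
  for (const key of Object.keys(weightByValue)) {
    const entry = weightByValue[key];
    if (entry && (!winner || entry.weight > winner.weight)) winner = entry;
  }
  if (!winner) throw new Error('no candidates to vote on');

  return { value: winner.value, weightShare: winner.weight / totalWeight };
}

// Runs 1 and 3 agree, so 1250 collects 0.5 + 0.2 of the weight.
const agreedTotal = weightedVote([1250.0, 1250.5, 1250.0], [0.5, 0.3, 0.2]);

// A dominant first run can outvote two agreeing runs.
const dominantRun = weightedVote(['INV-001', 'INV-007', 'INV-007'], [0.6, 0.2, 0.2]);
```

The second call shows the main design consequence: with a weight above 0.5, the primary run decides the result on its own, so the remaining runs only affect the reported share.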

Multi-Provider Consensus

Run extraction across different AI providers so that an error specific to one model is less likely to appear in every run:
import { createFlow, extract } from '@doclo/flows';
import { createVLMProvider } from '@doclo/providers-llm';

// Create providers from different vendors
const geminiProvider = createVLMProvider({
  provider: 'google',
  model: 'google/gemini-2.5-flash',
  apiKey: process.env.OPENROUTER_API_KEY!,
  via: 'openrouter'
});

const claudeProvider = createVLMProvider({
  provider: 'anthropic',
  model: 'anthropic/claude-sonnet-4.5',
  apiKey: process.env.OPENROUTER_API_KEY!,
  via: 'openrouter'
});

const gpt4Provider = createVLMProvider({
  provider: 'openai',
  model: 'openai/gpt-4.1',
  apiKey: process.env.OPENROUTER_API_KEY!,
  via: 'openrouter'
});

// Run extraction with each provider
const flow = createFlow()
  .step('extract', extract({
    providers: [geminiProvider, claudeProvider, gpt4Provider],
    schema: invoiceSchema,
    consensus: {
      strategy: 'majority',
      threshold: 0.66  // 2 of 3 must agree
    }
  }))
  .build();

Agreement Thresholds

Set minimum agreement levels:
consensus: {
  runs: 5,
  strategy: 'majority',
  threshold: 0.8  // 80% must agree (4 of 5 runs)
}
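| Threshold | 3 runs | 5 runs | Description |
|-----------|--------|--------|-------------|
| 0.5 | 2/3 | 3/5 | Simple majority |
| 0.66 | 2/3 | 4/5 | Two-thirds majority |
| 0.8 | 3/3 | 4/5 | Strong consensus |
| 1.0 | 3/3 | 5/5 | Unanimous |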
Fields that don’t meet the threshold can be:
  • Set to null
  • Flagged for human review
  • Returned with a confidence score
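The vote counts in the threshold table follow from rounding `runs × threshold` up to the next whole vote. The helper below is a hypothetical sketch that assumes a field passes when its agreement is greater than or equal to the threshold. Under a strict "more than half" reading, a 0.5 threshold with an even number of runs would need one more vote than this returns.

```typescript
// Hypothetical helper: smallest number of agreeing runs that satisfies the threshold,
// assuming a field passes when votes / runs >= threshold.
function requiredVotes(runs: number, threshold: number): number {
  // The small epsilon guards against floating-point noise in runs * threshold.
  return Math.max(1, Math.ceil(runs * threshold - 1e-9));
}

const votesFor3Runs = [0.5, 0.66, 0.8, 1.0].map((t) => requiredVotes(3, t));
const votesFor5Runs = [0.5, 0.66, 0.8, 1.0].map((t) => requiredVotes(5, t));
```

With three runs, any threshold above two-thirds is effectively unanimous, which is why 0.8 and 1.0 share the same 3/3 entry.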

Accessing Consensus Results

The output includes agreement information:
const result = await flow.run({ base64 });

console.log('Extracted data:', result.output.data);
console.log('Agreement scores:', result.output.agreement);

// Agreement per field
// {
//   invoiceNumber: { value: 'INV-001', agreement: 1.0, votes: 3 },
//   total: { value: 1250.00, agreement: 0.66, votes: 2 },
//   date: { value: '2024-01-15', agreement: 0.33, votes: 1, uncertain: true }
// }

Observability Hooks

Monitor consensus execution:
const flow = createFlow({
  observability: {
    onConsensusStart: (ctx) => {
      console.log(`Starting ${ctx.runsPlanned} consensus runs`);
      console.log(`Strategy: ${ctx.strategy}`);
    },
    onConsensusRunComplete: (ctx) => {
      console.log(`Run ${ctx.runIndex + 1} complete`);
      console.log(`  Provider: ${ctx.provider}`);
      console.log(`  Output: ${JSON.stringify(ctx.output)}`);
    },
    onConsensusComplete: (ctx) => {
      console.log('Consensus complete');
      console.log(`  Agreement: ${(ctx.agreement * 100).toFixed(0)}%`);
      console.log(`  Total cost: $${ctx.totalCost.toFixed(4)}`);
    }
  }
});

Complete Example: Financial Document Validation

Extract and validate financial data with high confidence:
import { createFlow, parse, extract } from '@doclo/flows';
import { createVLMProvider } from '@doclo/providers-llm';
import { createOCRProvider } from '@doclo/providers-datalab';
import { readFileSync } from 'fs';

// Types
interface FinancialExtraction {
  documentType: string;
  amount: number;
  currency: string;
  date: string;
  referenceNumber: string;
  parties: {
    payer: string;
    payee: string;
  };
}

// Schema
const financialSchema = {
  type: 'object',
  properties: {
    documentType: {
      type: 'string',
      enum: ['invoice', 'receipt', 'wire_transfer', 'check'],
      description: 'Type of financial document'
    },
    amount: {
      type: 'number',
      description: 'Transaction amount with full decimal precision'
    },
    currency: {
      type: 'string',
      description: '3-letter ISO currency code'
    },
    date: {
      type: 'string',
      description: 'Transaction date in YYYY-MM-DD format'
    },
    referenceNumber: {
      type: 'string',
      description: 'Reference, invoice, or transaction number'
    },
    parties: {
      type: 'object',
      properties: {
        payer: { type: 'string', description: 'Entity making payment' },
        payee: { type: 'string', description: 'Entity receiving payment' }
      },
      required: ['payer', 'payee']
    }
  },
  required: ['amount', 'currency', 'documentType']
};

// Providers
const ocrProvider = createOCRProvider({
  endpoint: 'https://www.datalab.to/api/v1/marker',
  apiKey: process.env.DATALAB_API_KEY!
});

const geminiProvider = createVLMProvider({
  provider: 'google',
  model: 'google/gemini-2.5-pro',
  apiKey: process.env.OPENROUTER_API_KEY!,
  via: 'openrouter'
});

const claudeProvider = createVLMProvider({
  provider: 'anthropic',
  model: 'anthropic/claude-sonnet-4.5',
  apiKey: process.env.OPENROUTER_API_KEY!,
  via: 'openrouter'
});

const gptProvider = createVLMProvider({
  provider: 'openai',
  model: 'openai/gpt-4.1',
  apiKey: process.env.OPENROUTER_API_KEY!,
  via: 'openrouter'
});

// Build flow with multi-provider consensus
const financialFlow = createFlow({
  observability: {
    onConsensusComplete: (ctx) => {
      console.log(`Consensus: ${(ctx.agreement * 100).toFixed(0)}% agreement`);

      // Flag low-agreement extractions for review
      if (ctx.agreement < 0.8) {
        console.warn('Low agreement - flagging for human review');
      }
    }
  }
})
  .step('parse', parse({ provider: ocrProvider }))
  .step('extract', extract<FinancialExtraction>({
    providers: [geminiProvider, claudeProvider, gptProvider],
    schema: financialSchema,
    inputMode: 'ir+source',
    consensus: {
      strategy: 'majority',
      threshold: 0.66
    },
    additionalInstructions: `
      - Amount must have exact decimal precision (e.g., 1234.56)
      - Currency must be a valid 3-letter ISO code
      - Date must be in YYYY-MM-DD format
      - Reference number should include any prefixes (INV-, REF-, etc.)
    `
  }))
  .build();

// Process document
async function validateFinancialDocument(filePath: string) {
  const fileBuffer = readFileSync(filePath);
  const base64 = `data:application/pdf;base64,${fileBuffer.toString('base64')}`;

  const result = await financialFlow.run({ base64 });

  console.log('\n--- Financial Document Validation ---');
  console.log('Type:', result.output.documentType);
  console.log('Amount:', result.output.currency, result.output.amount);
  console.log('Date:', result.output.date);
  console.log('Reference:', result.output.referenceNumber);
  console.log('Payer:', result.output.parties?.payer);
  console.log('Payee:', result.output.parties?.payee);

  console.log('\n--- Confidence ---');
  if (result.output.agreement) {
    for (const [field, info] of Object.entries(result.output.agreement)) {
      const status = info.uncertain ? '⚠️' : '✓';
      console.log(`${status} ${field}: ${(info.agreement * 100).toFixed(0)}%`);
    }
  }

  console.log('\n--- Cost ---');
  console.log(`Total: $${result.aggregated.totalCostUSD.toFixed(4)}`);

  return result.output;
}

validateFinancialDocument('./financial-doc.pdf').catch(console.error);

Cost Optimization Strategies

Consensus increases costs. Here are strategies to manage this:

Tiered Consensus

Use single extraction for low-value documents, consensus for high-value:
async function processDocument(doc: Document) {
  const estimatedValue = await estimateDocumentValue(doc);

  if (estimatedValue > 10000) {
    // High-value: 3-provider consensus
    return await highValueFlow.run(doc);
  } else if (estimatedValue > 1000) {
    // Medium-value: 2-run consensus
    return await mediumValueFlow.run(doc);
  } else {
    // Low-value: single extraction
    return await basicFlow.run(doc);
  }
}

Selective Field Consensus

Run consensus only on critical fields:
// First pass: extract all fields
const initialResult = await basicFlow.run({ base64 });

// Second pass: validate critical fields only
const criticalSchema = {
  type: 'object',
  properties: {
    amount: { type: 'number' },
    accountNumber: { type: 'string' }
  }
};

const validationResult = await createFlow()
  .step('validate', extract({
    providers: [provider1, provider2, provider3],
    schema: criticalSchema,
    consensus: { runs: 3, strategy: 'unanimous' }
  }))
  .build()
  .run({ base64 });

// Merge results
const finalResult = {
  ...initialResult.output,
  ...validationResult.output
};
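One caveat with a plain spread merge: every key in the validation output overrides the first pass, including fields that came back as null because the runs disagreed. A more defensive merge, sketched below with a hypothetical `mergeValidated` helper and flat records for simplicity, keeps the first-pass value in that case and reports which fields were not confirmed.

```typescript
type MergeValue = string | number | boolean | null | undefined;
type FlatRecord = Record<string, MergeValue>;

interface MergeOutcome {
  merged: FlatRecord;
  unverified: string[]; // validated fields that came back empty
}

function mergeValidated(initial: FlatRecord, validated: FlatRecord): MergeOutcome {
  const merged: FlatRecord = { ...initial };
  const unverified: string[] = [];
  for (const field of Object.keys(validated)) {
    const value = validated[field];
    if (value === null || value === undefined) {
      // Keep the first-pass value, but remember that it was not confirmed.
      unverified.push(field);
    } else {
      merged[field] = value;
    }
  }
  return { merged, unverified };
}

// Illustrative data: the validation pass confirmed amount but not accountNumber.
const mergeOutcome = mergeValidated(
  { amount: 1250.5, accountNumber: 'ACC-42', date: '2024-01-15' },
  { amount: 1250.0, accountNumber: null }
);
```

For high-stakes documents, the `unverified` list is a natural input for a human review step rather than something to ignore.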

Conditional Re-extraction

Run an additional consensus pass only for the fields whose agreement score from the initial run falls below a cutoff (0.8 here):
const result = await flow.run({ base64 });

// Check for fields that might need validation
const uncertainFields = Object.entries(result.output.agreement || {})
  .filter(([_, info]) => info.agreement < 0.8)
  .map(([field]) => field);

if (uncertainFields.length > 0) {
  console.log('Re-validating uncertain fields:', uncertainFields);

  // Create schema for just uncertain fields
  const validationSchema = createSchemaForFields(uncertainFields);

  const validation = await consensusFlow.run({ base64, schema: validationSchema });
  // Merge validated fields back
}
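The snippet above calls `createSchemaForFields` without defining it. A minimal sketch of such a helper follows. It is an assumption, not a Doclo function, and unlike the call above it takes the full schema as an explicit first argument so that it stays self-contained. It copies only the requested top-level properties and trims `required` to match.

```typescript
interface ObjectSchema {
  type: 'object';
  properties: Record<string, unknown>;
  required?: string[];
}

// Hypothetical helper: builds a sub-schema containing only the requested top-level fields.
function createSchemaForFields(fullSchema: ObjectSchema, fields: string[]): ObjectSchema {
  const hasOwn = (obj: object, key: string): boolean =>
    Object.prototype.hasOwnProperty.call(obj, key);

  const properties: Record<string, unknown> = {};
  for (const field of fields) {
    // Silently skip names that the full schema does not define.
    if (hasOwn(fullSchema.properties, field)) {
      properties[field] = fullSchema.properties[field];
    }
  }

  // Keep a field required only if it is part of the sub-schema.
  const required = (fullSchema.required ?? []).filter((field) => hasOwn(properties, field));
  return { type: 'object', properties, required };
}

const fullInvoiceSchema: ObjectSchema = {
  type: 'object',
  properties: {
    amount: { type: 'number' },
    currency: { type: 'string' },
    date: { type: 'string' }
  },
  required: ['amount', 'currency']
};

const subsetSchema = createSchemaForFields(fullInvoiceSchema, ['amount', 'date', 'notInSchema']);
```

In this example `subsetSchema` contains only `amount` and `date`, with `amount` still required.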

Handling Disagreement

When runs or providers disagree on a field, you can handle it in one of the following ways:

Return Null for Uncertain Fields

consensus: {
  runs: 3,
  strategy: 'majority',
  threshold: 0.66,
  onDisagreement: 'null'  // Set uncertain fields to null
}

Flag for Human Review

const result = await flow.run({ base64 });

const needsReview = Object.entries(result.output.agreement || {})
  .filter(([_, info]) => info.uncertain)
  .map(([field]) => field);

if (needsReview.length > 0) {
  await queueForHumanReview(result, needsReview);
}

Return All Candidate Values

consensus: {
  runs: 3,
  strategy: 'majority',
  threshold: 0.66,
  returnCandidates: true  // Include all extracted values
}

// Result includes candidates
// {
//   amount: {
//     value: 1250.00,
//     candidates: [1250.00, 1250.00, 1250.50],
//     agreement: 0.66
//   }
// }
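When candidates are returned, grouping them makes the disagreement easier to present to a reviewer. The sketch below assumes `candidates` is a flat array of primitive values, as in the commented result above. `summarizeCandidates` is a hypothetical helper.

```typescript
type CandidateValue = string | number | boolean | null;

interface CandidateGroup {
  value: CandidateValue;
  count: number; // how many runs returned this value
  share: number; // count divided by the number of candidates
}

function summarizeCandidates(candidates: CandidateValue[]): CandidateGroup[] {
  const groups: CandidateGroup[] = [];
  for (const candidate of candidates) {
    const existing = groups.filter((group) => group.value === candidate)[0];
    if (existing) {
      existing.count += 1;
    } else {
      groups.push({ value: candidate, count: 1, share: 0 });
    }
  }
  for (const group of groups) {
    group.share = group.count / candidates.length;
  }
  // Most common value first.
  return groups.sort((a, b) => b.count - a.count);
}

// The candidates from the example result above.
const amountGroups = summarizeCandidates([1250.0, 1250.0, 1250.5]);
```

For the example candidates this yields two groups: 1250 with two votes, then 1250.5 with one.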

Using Doclo Cloud with Consensus

Configure consensus for cloud-based flows:
import { DocloClient } from '@doclo/client';

const client = new DocloClient({
  apiKey: process.env.DOCLO_API_KEY!
});

// Run a pre-configured consensus flow
const result = await client.flows.run<FinancialExtraction>('financial-validation', {
  input: {
    document: { base64, filename: 'invoice.pdf', mimeType: 'application/pdf' }
  },
  wait: true,
  timeout: 120000  // Longer timeout for multi-run consensus
});

console.log('Extracted:', result.output);
console.log('Agreement:', result.output.agreement);
