Production document processing requires handling failures gracefully. This guide covers the error hierarchy, retry patterns, provider fallbacks, and strategies for building resilient extraction pipelines.

Error Hierarchy

The SDK provides a structured error hierarchy for precise error handling:
import {
  DocloError,           // Base error class
  AuthenticationError,  // Invalid API key (401)
  AuthorizationError,   // Insufficient permissions (403)
  NotFoundError,        // Resource not found (404)
  ValidationError,      // Invalid input (400)
  RateLimitError,       // Rate limit exceeded (429)
  TimeoutError,         // Request timed out (408)
  NetworkError,         // Connection issues
  ExecutionError,       // Flow execution failed
  InvalidApiKeyError    // API key format invalid
} from '@doclo/client';

Error Properties

All errors extend DocloError and include:
interface DocloError {
  name: string;                    // Error class name
  code: string;                    // Error code (e.g., 'RATE_LIMIT_EXCEEDED')
  message: string;                 // Human-readable message
  statusCode?: number;             // HTTP status code
  details?: Record<string, any>;   // Additional context
}
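The same fields are available on every SDK error, which makes centralized logging straightforward. A small helper you might use in any catch block (a sketch; adapt the logging to your own monitoring setup):
import { DocloError } from '@doclo/client';

function logDocloError(error: unknown): void {
  if (error instanceof DocloError) {
    // Structured fields make routing and alerting straightforward
    console.error({
      name: error.name,
      code: error.code,
      statusCode: error.statusCode,
      message: error.message,
      details: error.details
    });
  } else {
    console.error('Non-SDK error:', error);
  }
}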

Error Codes

Common error codes:
Code                     Description
INVALID_API_KEY          API key is invalid or malformed
API_KEY_REVOKED          API key has been revoked
FLOW_NOT_FOUND           Specified flow does not exist
EXECUTION_NOT_FOUND      Execution ID not found
INVALID_INPUT            Input validation failed
RATE_LIMIT_EXCEEDED      Too many requests
EXECUTION_TIMEOUT        Execution took too long
PROVIDER_ERROR           LLM/OCR provider failed
PROVIDER_RATE_LIMITED    Provider rate limit hit
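These codes also drive retry decisions: some indicate transient conditions, others mean the request itself must change. A small classifier sketch using the codes above (the split shown is a reasonable default, not an exhaustive rule):
function isRetryableCode(code: string): boolean {
  // Transient conditions: usually safe to retry with backoff
  return [
    'RATE_LIMIT_EXCEEDED',
    'PROVIDER_RATE_LIMITED',
    'EXECUTION_TIMEOUT',
    'PROVIDER_ERROR'
  ].includes(code);
}

function isPermanentCode(code: string): boolean {
  // Permanent conditions: fix the input or configuration instead of retrying
  return [
    'INVALID_API_KEY',
    'API_KEY_REVOKED',
    'FLOW_NOT_FOUND',
    'EXECUTION_NOT_FOUND',
    'INVALID_INPUT'
  ].includes(code);
}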

Basic Error Handling

Handle errors by type:
import {
  DocloClient,
  AuthenticationError,
  ValidationError,
  RateLimitError,
  TimeoutError,
  NotFoundError,
  ExecutionError
} from '@doclo/client';

const client = new DocloClient({
  apiKey: process.env.DOCLO_API_KEY!
});

async function processDocument(flowId: string, document: any) {
  try {
    const result = await client.flows.run(flowId, {
      input: { document },
      wait: true,
      timeout: 30000
    });
    return result.output;

  } catch (error) {
    if (error instanceof AuthenticationError) {
      // API key issues - likely configuration problem
      console.error('Authentication failed:', error.message);
      throw new Error('Service configuration error');
    }

    if (error instanceof ValidationError) {
      // Bad input - don't retry, fix the input
      console.error('Invalid input:', error.message);
      throw new Error(`Invalid document: ${error.message}`);
    }

    if (error instanceof RateLimitError) {
      // Rate limited - retry after delay
      const retryAfter = error.rateLimitInfo?.retryAfter || 60;
      console.warn(`Rate limited. Retry after ${retryAfter}s`);
      throw error;  // Let caller handle retry
    }

    if (error instanceof TimeoutError) {
      // Timeout - consider async processing
      console.warn('Processing timed out');
      throw new Error('Document processing timed out. Try async processing.');
    }

    if (error instanceof NotFoundError) {
      // Flow doesn't exist
      console.error('Flow not found:', flowId);
      throw new Error('Processing flow not configured');
    }

    if (error instanceof ExecutionError) {
      // Flow execution failed
      console.error('Execution failed:', error.executionId, error.message);
      throw error;
    }

    // Unknown error
    console.error('Unexpected error:', error);
    throw error;
  }
}

Retry with Exponential Backoff

Implement automatic retries for transient failures:
interface RetryOptions {
  maxAttempts: number;
  initialDelayMs: number;
  maxDelayMs: number;
  backoffMultiplier: number;
  retryableErrors: string[];
}

const defaultRetryOptions: RetryOptions = {
  maxAttempts: 3,
  initialDelayMs: 1000,
  maxDelayMs: 30000,
  backoffMultiplier: 2,
  retryableErrors: [
    'RATE_LIMIT_EXCEEDED',
    'PROVIDER_RATE_LIMITED',
    'NETWORK_ERROR',
    'TIMEOUT',
    'PROVIDER_ERROR'
  ]
};

async function withRetry<T>(
  fn: () => Promise<T>,
  options: Partial<RetryOptions> = {}
): Promise<T> {
  const config = { ...defaultRetryOptions, ...options };
  let lastError: Error | undefined;

  for (let attempt = 1; attempt <= config.maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (error) {
      lastError = error as Error;

      // Check if error is retryable
      const errorCode = (error as any).code;
      const isRetryable = config.retryableErrors.includes(errorCode);

      if (!isRetryable || attempt === config.maxAttempts) {
        throw error;
      }

      // Calculate delay with exponential backoff
      const delay = Math.min(
        config.initialDelayMs * Math.pow(config.backoffMultiplier, attempt - 1),
        config.maxDelayMs
      );

      // Add jitter to prevent thundering herd
      const jitter = delay * 0.1 * Math.random();
      const totalDelay = delay + jitter;

      console.warn(
        `Attempt ${attempt} failed with ${errorCode}. ` +
        `Retrying in ${Math.round(totalDelay)}ms...`
      );

      await sleep(totalDelay);
    }
  }

  throw lastError;
}

function sleep(ms: number): Promise<void> {
  return new Promise(resolve => setTimeout(resolve, ms));
}
Usage:
const result = await withRetry(
  () => client.flows.run(flowId, { input: { document }, wait: true }),
  { maxAttempts: 3, initialDelayMs: 2000 }
);

Rate Limit Handling

Handle rate limits with proper backoff:
import { RateLimitError } from '@doclo/client';

async function processWithRateLimitHandling(
  flowId: string,
  document: any
): Promise<any> {
  const maxRetries = 5;

  for (let attempt = 1; attempt <= maxRetries; attempt++) {
    try {
      return await client.flows.run(flowId, {
        input: { document },
        wait: true
      });
    } catch (error) {
      if (error instanceof RateLimitError) {
        if (attempt === maxRetries) {
          throw new Error('Rate limit exceeded after maximum retries');
        }

        // Use server-provided retry-after if available
        const retryAfter = error.rateLimitInfo?.retryAfter || (attempt * 30);

        console.warn(`Rate limited. Waiting ${retryAfter}s before retry ${attempt + 1}/${maxRetries}`);
        await sleep(retryAfter * 1000);
        continue;
      }
      throw error;
    }
  }
}

Provider Fallback

Use multiple providers with automatic fallback:
import { buildLLMProvider } from '@doclo/providers-llm';

// Create provider with fallback chain
const provider = buildLLMProvider({
  providers: [
    {
      provider: 'google',
      model: 'google/gemini-2.5-flash',
      apiKey: process.env.OPENROUTER_API_KEY!,
      via: 'openrouter'
    },
    {
      provider: 'anthropic',
      model: 'anthropic/claude-sonnet-4.5',
      apiKey: process.env.OPENROUTER_API_KEY!,
      via: 'openrouter'
    },
    {
      provider: 'openai',
      model: 'openai/gpt-4.1',
      apiKey: process.env.OPENROUTER_API_KEY!,
      via: 'openrouter'
    }
  ],
  maxRetries: 2,              // Retries per provider
  retryDelay: 1000,           // Base retry delay
  useExponentialBackoff: true,
  circuitBreakerThreshold: 3  // Failures before skipping provider
});
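The resulting provider drops into an extraction step exactly like a single provider; when one model keeps failing, the chain moves on to the next. A minimal usage sketch (schema and documentBase64 are placeholders for your own values):
import { createFlow, extract } from '@doclo/flows';

const flow = createFlow({
  observability: {
    onProviderRetry: (ctx) => {
      console.warn(`Retrying ${ctx.provider} (attempt ${ctx.attemptNumber})`);
    }
  }
})
  .step('extract', extract({ provider, schema }))
  .build();

const result = await flow.run({ base64: documentBase64 });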

Circuit Breaker Pattern

Prevent cascading failures by temporarily disabling failing providers:
interface CircuitBreaker {
  failures: number;
  lastFailure: number;
  isOpen: boolean;
}

class ProviderCircuitBreaker {
  private breakers: Map<string, CircuitBreaker> = new Map();
  private threshold: number;
  private resetTimeMs: number;

  constructor(threshold = 3, resetTimeMs = 60000) {
    this.threshold = threshold;
    this.resetTimeMs = resetTimeMs;
  }

  recordFailure(providerId: string): void {
    const breaker = this.breakers.get(providerId) || {
      failures: 0,
      lastFailure: 0,
      isOpen: false
    };

    breaker.failures++;
    breaker.lastFailure = Date.now();

    if (breaker.failures >= this.threshold) {
      breaker.isOpen = true;
      console.warn(`Circuit breaker opened for provider: ${providerId}`);
    }

    this.breakers.set(providerId, breaker);
  }

  recordSuccess(providerId: string): void {
    this.breakers.delete(providerId);
  }

  isAvailable(providerId: string): boolean {
    const breaker = this.breakers.get(providerId);

    if (!breaker) return true;
    if (!breaker.isOpen) return true;

    // Check if reset time has passed
    if (Date.now() - breaker.lastFailure > this.resetTimeMs) {
      // Reset and allow retry
      breaker.isOpen = false;
      breaker.failures = 0;
      return true;
    }

    return false;
  }
}

const circuitBreaker = new ProviderCircuitBreaker();

async function extractWithFallback(
  providers: string[],
  document: any,
  schema: any
): Promise<any> {
  for (const providerId of providers) {
    if (!circuitBreaker.isAvailable(providerId)) {
      console.log(`Skipping provider ${providerId} (circuit open)`);
      continue;
    }

    try {
      // extractWith is a user-supplied helper that calls the given provider (not shown here)
      const result = await extractWith(providerId, document, schema);
      circuitBreaker.recordSuccess(providerId);
      return result;
    } catch (error) {
      console.error(`Provider ${providerId} failed:`, error);
      circuitBreaker.recordFailure(providerId);
    }
  }

  throw new Error('All providers failed');
}
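A hypothetical usage of the fallback helper; the provider IDs, documentBase64, and invoiceSchema below are placeholders for whatever your extractWith helper understands:
const data = await extractWithFallback(
  ['gemini-2.5-flash', 'claude-sonnet-4.5', 'gpt-4.1'],
  documentBase64,
  invoiceSchema
);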

Partial Results Recovery

When a flow fails mid-execution, recover partial results:
import { FlowExecutionError } from '@doclo/core';

async function processWithPartialRecovery(
  flow: any,
  input: any
): Promise<any> {
  try {
    return await flow.run(input);
  } catch (error) {
    if (error instanceof FlowExecutionError) {
      console.error(`Flow failed at step: ${error.stepId}`);

      // Access results from completed steps
      const artifacts = error.artifacts;

      if (artifacts) {
        console.log('Completed steps:', Object.keys(artifacts));

        // Return partial results
        return {
          partial: true,
          failedAt: error.stepId,
          completedSteps: Object.keys(artifacts),
          results: artifacts
        };
      }
    }
    throw error;
  }
}
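Callers can then check the partial flag and decide whether the completed steps are worth keeping. A short usage sketch (flow and documentBase64 are placeholders for a flow built as elsewhere in this guide):
const outcome = await processWithPartialRecovery(flow, { base64: documentBase64 });

if (outcome.partial) {
  console.warn(`Partial result only, failed at step: ${outcome.failedAt}`);
  // e.g. persist the completed steps and queue the document for reprocessing
} else {
  console.log('Completed:', outcome.output);
}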

Graceful Degradation

Implement degraded modes when providers fail:
interface ExtractionResult {
  data: any;
  confidence: 'high' | 'medium' | 'low';
  mode: 'full' | 'degraded' | 'fallback';
}

async function extractWithDegradation(
  document: any,
  schema: any
): Promise<ExtractionResult> {
  // Attempt 1: Full extraction with VLM
  try {
    const result = await vlmFlow.run({ base64: document });
    return {
      data: result.output,
      confidence: 'high',
      mode: 'full'
    };
  } catch (error) {
    console.warn('VLM extraction failed, trying OCR + LLM');
  }

  // Attempt 2: Degraded mode - OCR then LLM
  try {
    const result = await ocrLlmFlow.run({ base64: document });
    return {
      data: result.output,
      confidence: 'medium',
      mode: 'degraded'
    };
  } catch (error) {
    console.warn('OCR + LLM failed, trying basic extraction');
  }

  // Attempt 3: Fallback - basic text extraction only
  try {
    const result = await basicOcrFlow.run({ base64: document });
    return {
      data: { rawText: result.output.text },
      confidence: 'low',
      mode: 'fallback'
    };
  } catch (error) {
    throw new Error('All extraction methods failed');
  }
}
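The three flows above (vlmFlow, ocrLlmFlow, basicOcrFlow) are assumed to exist already; how you build them depends on your providers. One possible sketch using the builder API shown elsewhere in this guide (vlmProvider, llmProvider, ocrProvider, and schema are placeholders, and createFlow is assumed to accept an empty options object):
import { createFlow, parse, extract } from '@doclo/flows';

// Full mode: a vision-capable model extracts directly from the document
const vlmFlow = createFlow({})
  .step('extract', extract({ provider: vlmProvider, schema }))
  .build();

// Degraded mode: OCR first, then an LLM extracts from the parsed text
const ocrLlmFlow = createFlow({})
  .step('parse', parse({ provider: ocrProvider }))
  .step('extract', extract({ provider: llmProvider, schema }))
  .build();

// Fallback mode: OCR only, returning raw text
const basicOcrFlow = createFlow({})
  .step('parse', parse({ provider: ocrProvider }))
  .build();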

Timeout Handling

Handle long-running extractions appropriately:
async function processWithTimeoutHandling(
  flowId: string,
  document: any
): Promise<any> {
  // Try sync first with short timeout
  try {
    return await client.flows.run(flowId, {
      input: { document },
      wait: true,
      timeout: 30000  // 30 seconds
    });
  } catch (error) {
    if (error instanceof TimeoutError) {
      console.log('Sync timeout, switching to async');

      // Fall back to async processing
      const execution = await client.flows.run(flowId, {
        input: { document },
        webhookUrl: process.env.WEBHOOK_URL
      });

      console.log('Started async execution:', execution.id);

      // Either wait for webhook or poll
      return await client.runs.waitForCompletion(execution.id, {
        interval: 5000,
        timeout: 300000  // 5 minutes
      });
    }
    throw error;
  }
}
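If you rely on the webhook rather than polling, you need an endpoint to receive the completion event. A minimal receiver sketch using Express; the payload shape is not documented in this guide, so the handler only acknowledges and logs the body:
import express from 'express';

const app = express();
app.use(express.json({ limit: '5mb' }));

app.post('/webhooks/doclo', (req, res) => {
  // Acknowledge quickly; do any heavy processing asynchronously
  console.log('Execution update received:', req.body);
  res.status(200).send('ok');
});

app.listen(3000);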

Observability Hooks for Error Tracking

Use hooks to track errors across your flows:
import { createFlow, extract } from '@doclo/flows';

const flow = createFlow({
  observability: {
    onFlowError: (ctx) => {
      // Log to monitoring service
      console.error('Flow error:', {
        flowId: ctx.flowId,
        executionId: ctx.executionId,
        error: ctx.error.message,
        errorCode: ctx.errorCode,
        failedAtStep: ctx.failedAtStepIndex
      });

      // Send to error tracking (Sentry, etc.)
      // Sentry.captureException(ctx.error, { extra: ctx });
    },

    onStepError: (ctx) => {
      console.error('Step error:', {
        stepId: ctx.stepId,
        stepIndex: ctx.stepIndex,
        error: ctx.error.message,
        willRetry: ctx.willRetry,
        retryAttempt: ctx.retryAttempt
      });
    },

    onProviderRetry: (ctx) => {
      console.warn('Provider retry:', {
        provider: ctx.provider,
        attempt: ctx.attemptNumber,
        error: ctx.error.message,
        nextRetryDelay: ctx.nextRetryDelay
      });
    }
  }
})
  .step('extract', extract({ provider, schema }))
  .build();

Complete Example: Resilient Pipeline

import { DocloClient, RateLimitError, TimeoutError } from '@doclo/client';
import { createFlow, parse, extract } from '@doclo/flows';
import { buildLLMProvider, createOCRProvider } from '@doclo/providers-llm';

// Resilient provider configuration
const vlmProvider = buildLLMProvider({
  providers: [
    {
      provider: 'google',
      model: 'google/gemini-2.5-flash',
      apiKey: process.env.OPENROUTER_API_KEY!,
      via: 'openrouter'
    },
    {
      provider: 'anthropic',
      model: 'anthropic/claude-sonnet-4.5',
      apiKey: process.env.OPENROUTER_API_KEY!,
      via: 'openrouter'
    }
  ],
  maxRetries: 2,
  useExponentialBackoff: true,
  circuitBreakerThreshold: 3
});

const ocrProvider = createOCRProvider({
  endpoint: 'https://www.datalab.to/api/v1/marker',
  apiKey: process.env.DATALAB_API_KEY!
});

// Resilient flow with error tracking
const resilientFlow = createFlow({
  observability: {
    onFlowError: (ctx) => {
      console.error(`Flow ${ctx.flowId} failed:`, ctx.error.message);
      // Track in monitoring
    },
    onStepError: (ctx) => {
      console.warn(`Step ${ctx.stepId} error (attempt ${ctx.retryAttempt}):`, ctx.error.message);
    }
  }
})
  .step('parse', parse({ provider: ocrProvider }))
  .step('extract', extract({
    provider: vlmProvider,
    schema,                   // your extraction schema, defined elsewhere
    inputMode: 'ir+source'
  }))
  .build();

// Processing function with full error handling
async function processDocumentResilient(
  document: { base64: string; filename: string; mimeType: string }
) {
  const maxRetries = 3;
  let lastError: Error | undefined;

  for (let attempt = 1; attempt <= maxRetries; attempt++) {
    try {
      const result = await resilientFlow.run({ base64: document.base64 });

      return {
        success: true,
        data: result.output,
        metrics: result.aggregated,
        attempts: attempt
      };

    } catch (error) {
      lastError = error as Error;
      const errorCode = (error as any).code;

      // Don't retry validation errors
      if (errorCode === 'INVALID_INPUT' || errorCode === 'SCHEMA_VALIDATION_FAILED') {
        return {
          success: false,
          error: 'Invalid document or schema',
          code: errorCode
        };
      }

      // Rate limit - wait and retry
      if (error instanceof RateLimitError) {
        const retryAfter = error.rateLimitInfo?.retryAfter || (attempt * 30);
        console.warn(`Rate limited. Waiting ${retryAfter}s...`);
        await sleep(retryAfter * 1000);
        continue;
      }

      // Timeout - maybe use async
      if (error instanceof TimeoutError && attempt === maxRetries) {
        return {
          success: false,
          error: 'Processing timed out',
          suggestion: 'Consider using async processing with webhooks'
        };
      }

      // Other errors - retry with backoff
      if (attempt < maxRetries) {
        const delay = 1000 * Math.pow(2, attempt - 1);
        console.warn(`Attempt ${attempt} failed. Retrying in ${delay}ms...`);
        await sleep(delay);
        continue;
      }
    }
  }

  return {
    success: false,
    error: lastError?.message || 'Processing failed after retries',
    attempts: maxRetries
  };
}

function sleep(ms: number): Promise<void> {
  return new Promise(resolve => setTimeout(resolve, ms));
}

Best Practices

  1. Classify errors - Know which errors are retryable vs permanent
  2. Use exponential backoff - Prevent overwhelming failing services
  3. Add jitter - Avoid thundering herd when services recover
  4. Implement circuit breakers - Stop calling failing providers
  5. Log comprehensively - Include context for debugging
  6. Set reasonable timeouts - Balance wait time vs failure detection
  7. Plan for partial failures - Extract value from completed steps
  8. Monitor error rates - Alert on anomalies

Next Steps