Surya OCR

Surya is a high-accuracy OCR provider from Datalab that extracts text with precise bounding boxes.

Installation

npm install @doclo/providers-datalab

Basic Setup

import { createOCRProvider } from '@doclo/providers-datalab';

const ocrProvider = createOCRProvider({
  endpoint: 'https://www.datalab.to/api/v1/ocr',
  apiKey: process.env.DATALAB_API_KEY!
});

Configuration Options

createOCRProvider({
  endpoint: string,           // API endpoint URL
  apiKey?: string,            // API key (optional for self-hosted)
  polling?: {
    maxAttempts?: number,     // Max polling attempts (default: 30)
    pollingInterval?: number  // Polling interval ms (default: 2000)
  }
})

Usage with Flows

import { createFlow, parse, extract } from '@doclo/flows';
import { createOCRProvider } from '@doclo/providers-datalab';

const ocrProvider = createOCRProvider({
  endpoint: 'https://www.datalab.to/api/v1/ocr',
  apiKey: process.env.DATALAB_API_KEY!
});

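// vlmProvider and invoiceSchema are assumed to be defined elsewhere;
// the full pipeline example at the end of this page shows both.
const flow = createFlow()
  .step('parse', parse({ provider: ocrProvider }))
  .step('extract', extract({
    provider: vlmProvider,
    schema: invoiceSchema
  }))
  .build();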

Output: DocumentIR

Surya returns a DocumentIR with text and bounding boxes:

interface DocumentIR {
  pages: {
    width: number;
    height: number;
    lines: {
      text: string;
      bbox?: {
        x: number;   // Left position
        y: number;   // Top position
        w: number;   // Width
        h: number;   // Height
      };
    }[];
  }[];
  extras?: {
    raw: object;        // Original API response
    costUSD: number;    // Processing cost
    pageCount: number;  // Total pages
    status: string;     // Processing status
    success: boolean;   // Success flag
  };
}
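
As a rough illustration of how the DocumentIR shape above might be consumed, the sketch below flattens the pages into positioned lines (useful for citations) and into plain text. The helper names `collectPositionedLines` and `documentText` are hypothetical and not part of the SDK, and the types are a local copy of only the fields used here.

```typescript
// Local copy of the DocumentIR fields used in this sketch.
interface BBox { x: number; y: number; w: number; h: number }
interface Line { text: string; bbox?: BBox }
interface Page { width: number; height: number; lines: Line[] }
interface DocumentIR { pages: Page[] }

// Hypothetical record type: one entry per line that carries a bounding box.
interface PositionedLine { page: number; text: string; bbox: BBox }

// Hypothetical helper: collect every line that has a bbox, with a 1-based page number.
function collectPositionedLines(doc: DocumentIR): PositionedLine[] {
  const out: PositionedLine[] = [];
  doc.pages.forEach((page, pageIndex) => {
    for (const line of page.lines) {
      // bbox is optional in DocumentIR, so lines without one are skipped.
      if (line.bbox) {
        out.push({ page: pageIndex + 1, text: line.text, bbox: line.bbox });
      }
    }
  });
  return out;
}

// Hypothetical helper: join line text with newlines, pages with a blank line.
function documentText(doc: DocumentIR): string {
  return doc.pages
    .map((page) => page.lines.map((line) => line.text).join('\n'))
    .join('\n\n');
}
```

Keeping the page number next to each bounding box means a citation can point back to both the page and the region the text came from.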

Supported Formats

Format	MIME Type
PDF	`application/pdf`
PNG	`image/png`
JPEG	`image/jpeg`
GIF	`image/gif`
TIFF	`image/tiff`
WebP	`image/webp`
DOCX	`application/vnd.openxmlformats-officedocument.wordprocessingml.document`
DOC	`application/msword`
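
It can be convenient to reject unsupported files before spending an API call. The sketch below checks a MIME type against the table above. The helper `isSupportedMimeType` is hypothetical and not part of the SDK, and the list is copied from this page rather than read from the provider.

```typescript
// MIME types copied from the Supported Formats table on this page.
const SUPPORTED_MIME_TYPES: ReadonlySet<string> = new Set([
  'application/pdf',
  'image/png',
  'image/jpeg',
  'image/gif',
  'image/tiff',
  'image/webp',
  'application/vnd.openxmlformats-officedocument.wordprocessingml.document',
  'application/msword'
]);

// Hypothetical helper: true when the MIME type appears in the table above.
function isSupportedMimeType(mimeType: string): boolean {
  // Drop parameters such as "; charset=binary" and normalize case.
  const base = (mimeType.split(';')[0] ?? '').trim().toLowerCase();
  return SUPPORTED_MIME_TYPES.has(base);
}
```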

Input Methods

URL

const result = await flow.run({
  url: 'https://example.com/document.pdf'
});

Base64

const result = await flow.run({
  base64: 'data:application/pdf;base64,...'
});
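
The Base64 example above passes a data URL rather than a bare Base64 string. A minimal sketch for building and sanity-checking that form follows, assuming the payload is already Base64-encoded. The helpers `toDataUrl` and `parseDataUrl` are hypothetical, and whether `flow.run` also accepts a bare Base64 string is not stated on this page.

```typescript
// Hypothetical helper: wrap an already Base64-encoded payload in a data URL.
function toDataUrl(mimeType: string, base64Payload: string): string {
  return `data:${mimeType};base64,${base64Payload}`;
}

// Matches "data:<type>/<subtype>;base64,<payload>" with standard Base64 characters.
const DATA_URL_PATTERN =
  /^data:([a-z0-9.+-]+\/[a-z0-9.+-]+);base64,([A-Za-z0-9+/]+={0,2})$/;

// Hypothetical helper: split a data URL into its MIME type and payload,
// or return null when the string is not a Base64 data URL.
function parseDataUrl(
  dataUrl: string
): { mimeType: string; payload: string } | null {
  const match = DATA_URL_PATTERN.exec(dataUrl);
  if (!match) return null;
  const [, mimeType, payload] = match;
  if (mimeType === undefined || payload === undefined) return null;
  return { mimeType, payload };
}
```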

Async Processing

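Surya processes large documents asynchronously, and the SDK polls for the result automatically. With the defaults (30 attempts, 2000 ms apart) the SDK waits roughly one minute; raise the polling limits for documents that need longer: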

const ocrProvider = createOCRProvider({
  endpoint: 'https://www.datalab.to/api/v1/ocr',
  apiKey: process.env.DATALAB_API_KEY!,
  polling: {
    maxAttempts: 60,      // Wait up to 2 minutes
    pollingInterval: 2000  // Check every 2 seconds
  }
});
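
To pick polling values for a given time budget, the arithmetic can be sketched as below. This assumes the SDK waits `pollingInterval` milliseconds between attempts, so the upper bound is approximately `maxAttempts` times `pollingInterval` and ignores request latency. The helper names are hypothetical; the defaults are the ones listed under Configuration Options.

```typescript
// Mirrors the polling options from Configuration Options.
interface PollingOptions {
  maxAttempts?: number;
  pollingInterval?: number; // milliseconds
}

// Defaults as documented on this page.
const DEFAULT_MAX_ATTEMPTS = 30;
const DEFAULT_POLLING_INTERVAL_MS = 2000;

// Hypothetical helper: approximate upper bound on polling time in ms.
function maxPollingWaitMs(polling: PollingOptions = {}): number {
  const attempts = polling.maxAttempts ?? DEFAULT_MAX_ATTEMPTS;
  const interval = polling.pollingInterval ?? DEFAULT_POLLING_INTERVAL_MS;
  return attempts * interval;
}

// Hypothetical helper: attempts needed to cover a target wait at a given interval.
function attemptsForWait(
  targetWaitMs: number,
  pollingInterval: number = DEFAULT_POLLING_INTERVAL_MS
): number {
  return Math.ceil(targetWaitMs / pollingInterval);
}
```

For example, a five-minute budget at the default interval works out to `attemptsForWait(5 * 60 * 1000)`, which is 150 attempts.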

Self-Hosted Deployment

Run Surya locally for reduced latency and cost:

const ocrProvider = createOCRProvider({
  endpoint: 'http://localhost:8000/ocr'
  // No API key needed for self-hosted
});

Self-hosted endpoints are detected automatically and don’t require API keys.
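
One way to switch between the hosted API and a local instance is to resolve the provider config from environment variables, as in the sketch below. The variable name `SURYA_ENDPOINT` and the helper `resolveOcrConfig` are assumptions made for illustration; only `DATALAB_API_KEY` and the two endpoint URLs appear on this page.

```typescript
// Subset of the createOCRProvider options used in this sketch.
interface OcrProviderConfig {
  endpoint: string;
  apiKey?: string;
}

// Hosted endpoint as shown in Basic Setup.
const HOSTED_ENDPOINT = 'https://www.datalab.to/api/v1/ocr';

// Hypothetical helper: prefer a self-hosted endpoint when one is configured,
// otherwise fall back to the hosted API, which needs an API key.
function resolveOcrConfig(
  env: Record<string, string | undefined>
): OcrProviderConfig {
  // SURYA_ENDPOINT is a made-up variable name for this example.
  const selfHosted = env['SURYA_ENDPOINT'];
  if (selfHosted) {
    return { endpoint: selfHosted }; // no API key for self-hosted
  }
  const apiKey = env['DATALAB_API_KEY'];
  if (!apiKey) {
    throw new Error('Set DATALAB_API_KEY or SURYA_ENDPOINT');
  }
  return { endpoint: HOSTED_ENDPOINT, apiKey };
}
```

The result could then be passed along as `createOCRProvider(resolveOcrConfig(process.env))`.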

Pricing

Service	Cost
Surya OCR	$0.01 per page

Each run reports its OCR cost in the parse step's extras:

const result = await flow.run({ base64: documentData });
console.log('OCR cost:', result.artifacts.parse?.extras?.costUSD);
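
For budgeting before a run, the per-page price above can be turned into a quick estimate. This is a sketch that assumes the flat $0.01 per page rate from the table; the helper `estimateOcrCostUSD` is hypothetical, and the `costUSD` value reported after a run remains the authoritative figure.

```typescript
// Price from the Pricing table, kept in whole cents to avoid
// floating-point drift when multiplying by the page count.
const CENTS_PER_PAGE = 1;

// Hypothetical helper: estimated OCR cost in USD for a page count.
function estimateOcrCostUSD(pageCount: number): number {
  if (!Number.isInteger(pageCount) || pageCount < 0) {
    throw new RangeError('pageCount must be a non-negative integer');
  }
  return (pageCount * CENTS_PER_PAGE) / 100;
}
```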

When to Use Surya

Use Surya when:

You need precise bounding boxes for citations
You are processing text-heavy documents
You are building RAG pipelines with positional data
You need OCR before LLM extraction

Consider sending documents directly to a VLM when:

Documents have complex visual layouts
Tables and forms are the primary content
Speed matters more than OCR accuracy

Example: OCR + Extraction Pipeline

import { createFlow, parse, extract } from '@doclo/flows';
import { createOCRProvider } from '@doclo/providers-datalab';
import { createVLMProvider } from '@doclo/providers-llm';

const ocrProvider = createOCRProvider({
  endpoint: 'https://www.datalab.to/api/v1/ocr',
  apiKey: process.env.DATALAB_API_KEY!
});

const llmProvider = createVLMProvider({
  provider: 'anthropic',
  model: 'anthropic/claude-sonnet-4.5',
  apiKey: process.env.OPENROUTER_API_KEY!,
  via: 'openrouter'
});

const invoiceSchema = {
  type: 'object',
  properties: {
    invoiceNumber: { type: 'string' },
    date: { type: 'string' },
    total: { type: 'number' },
    lineItems: {
      type: 'array',
      items: {
        type: 'object',
        properties: {
          description: { type: 'string' },
          amount: { type: 'number' }
        }
      }
    }
  }
};

const flow = createFlow()
  .step('parse', parse({ provider: ocrProvider }))
  .step('extract', extract({
    provider: llmProvider,
    schema: invoiceSchema
  }))
  .build();

const result = await flow.run({
  base64: 'data:application/pdf;base64,...'
});

console.log('Invoice:', result.output);
console.log('OCR cost:', result.artifacts.parse?.extras?.costUSD);
console.log('Total cost:', result.aggregated.totalCostUSD);
