Surya is a high-accuracy OCR provider from Datalab that extracts text with precise bounding boxes.

Installation

npm install @docloai/providers-datalab

Basic Setup

import { createOCRProvider } from '@docloai/providers-datalab';

const ocrProvider = createOCRProvider({
  endpoint: 'https://www.datalab.to/api/v1/ocr',
  apiKey: process.env.DATALAB_API_KEY!
});

Configuration Options

createOCRProvider({
  endpoint: string,           // API endpoint URL
  apiKey?: string,            // API key (optional for self-hosted)
  polling?: {
    maxAttempts?: number,     // Max polling attempts (default: 30)
    pollingInterval?: number  // Polling interval ms (default: 2000)
  }
})

Usage with Flows

import { createFlow, parse, extract } from '@docloai/flows';
import { createOCRProvider } from '@docloai/providers-datalab';

const ocrProvider = createOCRProvider({
  endpoint: 'https://www.datalab.to/api/v1/ocr',
  apiKey: process.env.DATALAB_API_KEY!
});

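// Assumes a VLM provider (vlmProvider) and a JSON schema (invoiceSchema) are
// already defined; the full pipeline example at the end of this page shows both.
const flow = createFlow()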
  .step('parse', parse({ provider: ocrProvider }))
  .step('extract', extract({
    provider: vlmProvider,
    schema: invoiceSchema
  }))
  .build();

Output: DocumentIR

Surya returns a DocumentIR with text and bounding boxes:
interface DocumentIR {
  pages: {
    width: number;
    height: number;
    lines: {
      text: string;
      bbox?: {
        x: number;   // Left position
        y: number;   // Top position
        w: number;   // Width
        h: number;   // Height
      };
    }[];
  }[];
  extras?: {
    raw: object;        // Original API response
    costUSD: number;    // Processing cost
    pageCount: number;  // Total pages
    status: string;     // Processing status
    success: boolean;   // Success flag
  };
}
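As an illustration of working with this structure, the sketch below flattens a DocumentIR into plain text and finds the lines that contain a search term, keeping their bounding boxes. The helper names (`toPlainText`, `findLines`, `toCorners`) and the sample data are hypothetical and not part of the SDK; the types are local copies of the documented shape, and the box is assumed to use a top-left origin as the comments above suggest.

```typescript
// Local copies of the documented shapes so the sketch is self-contained.
interface BBox { x: number; y: number; w: number; h: number }
interface OcrLine { text: string; bbox?: BBox }
interface OcrPage { width: number; height: number; lines: OcrLine[] }
interface OcrDocument { pages: OcrPage[] }

// Hypothetical helper: join all line text, with a blank line between pages.
function toPlainText(doc: OcrDocument): string {
  return doc.pages
    .map((page) => page.lines.map((line) => line.text).join('\n'))
    .join('\n\n');
}

// Hypothetical helper: case-insensitive search that keeps the page index
// and the (optional) bounding box of every matching line.
function findLines(
  doc: OcrDocument,
  term: string
): { pageIndex: number; text: string; bbox?: BBox }[] {
  const needle = term.toLowerCase();
  const hits: { pageIndex: number; text: string; bbox?: BBox }[] = [];
  doc.pages.forEach((page, pageIndex) => {
    for (const line of page.lines) {
      if (line.text.toLowerCase().includes(needle)) {
        hits.push({ pageIndex, text: line.text, bbox: line.bbox });
      }
    }
  });
  return hits;
}

// Hypothetical helper: convert x/y/w/h into edge coordinates.
function toCorners(bbox: BBox) {
  return {
    left: bbox.x,
    top: bbox.y,
    right: bbox.x + bbox.w,
    bottom: bbox.y + bbox.h
  };
}

// Made-up sample data for illustration only.
const sampleDoc: OcrDocument = {
  pages: [
    {
      width: 612,
      height: 792,
      lines: [
        { text: 'Invoice #1042', bbox: { x: 72, y: 64, w: 180, h: 18 } },
        { text: 'Total: $250.00' } // bbox is optional and may be missing
      ]
    }
  ]
};
```

Because `bbox` is optional, any code that draws highlights or builds citations should handle lines without a box, as `findLines` does by passing the field through unchanged.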

Supported Formats

Format    MIME Type
PDF       application/pdf
PNG       image/png
JPEG      image/jpeg
GIF       image/gif
TIFF      image/tiff
WebP      image/webp
DOCX      application/vnd.openxmlformats-officedocument.wordprocessingml.document
DOC       application/msword
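A pre-flight check can reject unsupported files before they are sent to the API. The sketch below maps a file extension to the MIME types listed above; `mimeTypeFor` is a hypothetical helper, not an SDK function, and the `jpg` and `tif` aliases are an assumption based on common file naming.

```typescript
// Extension-to-MIME lookup built from the table above.
// The 'jpg' and 'tif' aliases are assumptions, not taken from the table.
const SUPPORTED_MIME_TYPES = new Map<string, string>([
  ['pdf', 'application/pdf'],
  ['png', 'image/png'],
  ['jpeg', 'image/jpeg'],
  ['jpg', 'image/jpeg'],
  ['gif', 'image/gif'],
  ['tiff', 'image/tiff'],
  ['tif', 'image/tiff'],
  ['webp', 'image/webp'],
  ['docx', 'application/vnd.openxmlformats-officedocument.wordprocessingml.document'],
  ['doc', 'application/msword']
]);

// Hypothetical helper: returns the MIME type for a supported file name,
// or undefined when the extension is missing or not in the table.
function mimeTypeFor(filename: string): string | undefined {
  const dot = filename.lastIndexOf('.');
  if (dot < 0) return undefined;
  const extension = filename.slice(dot + 1).toLowerCase();
  return SUPPORTED_MIME_TYPES.get(extension);
}
```

The lookup trusts the file name only; it does not inspect file contents, so a renamed file would still pass this check.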

Input Methods

URL

const result = await flow.run({
  url: 'https://example.com/document.pdf'
});

Base64

const result = await flow.run({
  base64: 'data:application/pdf;base64,...'
});
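The `base64` value shown above is a data URL: a `data:` prefix, the MIME type, `;base64,`, and the encoded bytes. A minimal sketch for building one from raw file bytes in Node.js follows; `toDataUrl` is a hypothetical helper, not part of the SDK.

```typescript
import { Buffer } from 'node:buffer';

// Hypothetical helper: encode raw bytes as a base64 data URL.
function toDataUrl(bytes: Uint8Array, mimeType: string): string {
  const encoded = Buffer.from(bytes).toString('base64');
  return `data:${mimeType};base64,${encoded}`;
}

// Possible usage with a file on disk (path is illustrative):
//   import { readFile } from 'node:fs/promises';
//   const pdfBytes = await readFile('invoice.pdf');
//   const result = await flow.run({ base64: toDataUrl(pdfBytes, 'application/pdf') });
```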

Async Processing

Surya uses async processing for large documents. The SDK handles polling automatically:
const ocrProvider = createOCRProvider({
  endpoint: 'https://www.datalab.to/api/v1/ocr',
  apiKey: process.env.DATALAB_API_KEY!,
  polling: {
    maxAttempts: 60,      // Wait up to 2 minutes
    pollingInterval: 2000  // Check every 2 seconds
  }
});
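The comment "Wait up to 2 minutes" follows from multiplying the two polling settings: 60 attempts × 2000 ms = 120,000 ms. The sketch below makes that arithmetic reusable. The helper names are hypothetical, and the result is only an approximate upper bound, assuming the wait is dominated by the polling interval rather than by request latency.

```typescript
// Same shape as the documented polling options.
interface PollingOptions {
  maxAttempts?: number;
  pollingInterval?: number; // milliseconds
}

// Hypothetical helper: approximate maximum wait, using the documented
// defaults (30 attempts, 2000 ms) for any option that is omitted.
function maxWaitMs(polling: PollingOptions = {}): number {
  const { maxAttempts = 30, pollingInterval = 2000 } = polling;
  return maxAttempts * pollingInterval;
}

// Hypothetical helper: number of attempts needed to cover a target timeout.
function attemptsFor(timeoutMs: number, pollingInterval = 2000): number {
  return Math.ceil(timeoutMs / pollingInterval);
}
```

With the defaults, `maxWaitMs()` gives 60,000 ms (one minute); for a five-minute budget at the default interval, `attemptsFor(5 * 60 * 1000)` gives 150 attempts.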

Self-Hosted Deployment

Run Surya locally for reduced latency and cost:
const ocrProvider = createOCRProvider({
  endpoint: 'http://localhost:8000/ocr'
  // No API key needed for self-hosted
});
Self-hosted endpoints are detected automatically and don’t require API keys.

Pricing

Service      Cost
Surya OCR    $0.01 per page
Cost is included in the response:
const result = await flow.run({ base64: documentData });
console.log('OCR cost:', result.artifacts.parse?.extras?.costUSD);
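To budget before running a batch, the per-page price can be applied to a page count up front. This is a rough sketch assuming the flat $0.01 per page listed above with no minimums or volume tiers; `estimateOcrCostUSD` is a hypothetical helper, and the `costUSD` value in the response remains the authoritative figure.

```typescript
// Price from the table above, kept in whole cents to avoid
// floating-point drift when multiplying.
const SURYA_CENTS_PER_PAGE = 1;

// Hypothetical helper: estimated OCR cost in USD for a given page count.
function estimateOcrCostUSD(pageCount: number): number {
  if (!Number.isInteger(pageCount) || pageCount < 0) {
    throw new RangeError('pageCount must be a non-negative integer');
  }
  return (pageCount * SURYA_CENTS_PER_PAGE) / 100;
}
```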

When to Use Surya

Use Surya when:
  • You need precise bounding boxes for citations
  • You are processing text-heavy documents
  • You are building RAG pipelines with positional data
  • You need OCR before LLM extraction
Consider sending documents directly to a VLM (skipping OCR) when:
  • Documents have complex visual layouts
  • Tables and forms are primary content
  • Speed is more important than OCR accuracy

Example: OCR + Extraction Pipeline

import { createFlow, parse, extract } from '@docloai/flows';
import { createOCRProvider } from '@docloai/providers-datalab';
import { createVLMProvider } from '@docloai/providers-llm';

const ocrProvider = createOCRProvider({
  endpoint: 'https://www.datalab.to/api/v1/ocr',
  apiKey: process.env.DATALAB_API_KEY!
});

const llmProvider = createVLMProvider({
  provider: 'anthropic',
  model: 'anthropic/claude-sonnet-4.5',
  apiKey: process.env.OPENROUTER_API_KEY!,
  via: 'openrouter'
});

const invoiceSchema = {
  type: 'object',
  properties: {
    invoiceNumber: { type: 'string' },
    date: { type: 'string' },
    total: { type: 'number' },
    lineItems: {
      type: 'array',
      items: {
        type: 'object',
        properties: {
          description: { type: 'string' },
          amount: { type: 'number' }
        }
      }
    }
  }
};

const flow = createFlow()
  .step('parse', parse({ provider: ocrProvider }))
  .step('extract', extract({
    provider: llmProvider,
    schema: invoiceSchema
  }))
  .build();

const result = await flow.run({
  base64: 'data:application/pdf;base64,...'
});

console.log('Invoice:', result.output);
console.log('OCR cost:', result.artifacts.parse?.extras?.costUSD);
console.log('Total cost:', result.aggregated.totalCostUSD);
