Surya is a high-accuracy OCR provider from Datalab that extracts text with precise bounding boxes.
Installation
npm install @doclo/providers-datalab
Basic Setup
import { createOCRProvider } from '@doclo/providers-datalab';

const ocrProvider = createOCRProvider({
  endpoint: 'https://www.datalab.to/api/v1/ocr',
  apiKey: process.env.DATALAB_API_KEY!
});
Configuration Options
createOCRProvider({
  endpoint: string,            // API endpoint URL
  apiKey?: string,             // API key (optional for self-hosted)
  polling?: {
    maxAttempts?: number,      // Max polling attempts (default: 30)
    pollingInterval?: number   // Polling interval in ms (default: 2000)
  }
})
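Together, the two polling options bound how long the SDK waits for a result. Below is a minimal sketch of that arithmetic, assuming attempts are spaced exactly `pollingInterval` milliseconds apart and request latency is ignored. `maxWaitMs` and `PollingOptions` are hypothetical names for illustration, not part of `@doclo/providers-datalab`.

```typescript
// Hypothetical helper: rough upper bound on total polling time.
interface PollingOptions {
  maxAttempts?: number;     // documented default: 30
  pollingInterval?: number; // documented default: 2000 ms
}

function maxWaitMs(polling: PollingOptions = {}): number {
  const attempts = polling.maxAttempts ?? 30;
  const intervalMs = polling.pollingInterval ?? 2000;
  return attempts * intervalMs;
}

console.log(maxWaitMs());                    // 60000 (1 minute with defaults)
console.log(maxWaitMs({ maxAttempts: 60 })); // 120000 (2 minutes)
```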
Usage with Flows
import { createFlow, parse, extract } from '@doclo/flows';
import { createOCRProvider } from '@doclo/providers-datalab';

const ocrProvider = createOCRProvider({
  endpoint: 'https://www.datalab.to/api/v1/ocr',
  apiKey: process.env.DATALAB_API_KEY!
});

// vlmProvider and invoiceSchema are assumed to be defined elsewhere;
// the complete example later on this page builds both.
const flow = createFlow()
  .step('parse', parse({ provider: ocrProvider }))
  .step('extract', extract({
    provider: vlmProvider,
    schema: invoiceSchema
  }))
  .build();
Output: DocumentIR
Surya returns a DocumentIR with text and bounding boxes:
interface DocumentIR {
  pages: {
    width: number;
    height: number;
    lines: {
      text: string;
      bbox?: {
        x: number;  // Left position
        y: number;  // Top position
        w: number;  // Width
        h: number;  // Height
      };
    }[];
  }[];
  extras?: {
    raw: object;        // Original API response
    costUSD: number;    // Processing cost
    pageCount: number;  // Total pages
    status: string;     // Processing status
    success: boolean;   // Success flag
  };
}
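As an illustration of consuming this structure, the sketch below joins the text of a page and expresses a bounding box as fractions of the page size, which can be handy when drawing citation highlights at a different zoom level. The types are local copies of the shapes above, `pageText` and `normalizeBBox` are hypothetical helpers, and the sketch assumes `bbox` values use the same units as the page `width` and `height`.

```typescript
// Local copies of the shapes shown above, trimmed to the fields used here.
interface BBox { x: number; y: number; w: number; h: number }
interface Line { text: string; bbox?: BBox }
interface Page { width: number; height: number; lines: Line[] }

// Hypothetical helper: join the text of every line, one line per row.
function pageText(page: Page): string {
  return page.lines.map((line) => line.text).join("\n");
}

// Hypothetical helper: express a bbox as fractions of the page size.
// Assumes bbox values share units with page.width and page.height.
function normalizeBBox(page: Page, bbox: BBox): BBox {
  return {
    x: bbox.x / page.width,
    y: bbox.y / page.height,
    w: bbox.w / page.width,
    h: bbox.h / page.height,
  };
}

// Made-up sample data for illustration only.
const samplePage: Page = {
  width: 1000,
  height: 2000,
  lines: [
    { text: "Invoice #123", bbox: { x: 100, y: 200, w: 500, h: 40 } },
    { text: "Total: $42.00" }, // bbox is optional, so it may be absent
  ],
};

console.log(pageText(samplePage));
for (const line of samplePage.lines) {
  if (line.bbox) console.log(line.text, normalizeBBox(samplePage, line.bbox));
}
```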
| Format | MIME Type |
|---|---|
| PDF | application/pdf |
| PNG | image/png |
| JPEG | image/jpeg |
| GIF | image/gif |
| TIFF | image/tiff |
| WebP | image/webp |
| DOCX | application/vnd.openxmlformats-officedocument.wordprocessingml.document |
| DOC | application/msword |
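If inputs arrive as files, a small lookup can reject unsupported formats before a flow runs. This is a sketch, not SDK functionality: `mimeTypeFor` is a hypothetical helper, and the `jpg` and `tif` aliases are assumptions added for convenience rather than entries from the table.

```typescript
// Extension -> MIME type, taken from the table above.
// The "jpg" and "tif" aliases are assumptions, not listed in the table.
const SUPPORTED_MIME_TYPES = new Map<string, string>([
  ["pdf", "application/pdf"],
  ["png", "image/png"],
  ["jpeg", "image/jpeg"],
  ["jpg", "image/jpeg"],
  ["gif", "image/gif"],
  ["tiff", "image/tiff"],
  ["tif", "image/tiff"],
  ["webp", "image/webp"],
  ["docx", "application/vnd.openxmlformats-officedocument.wordprocessingml.document"],
  ["doc", "application/msword"],
]);

// Hypothetical helper: returns the MIME type, or undefined if unsupported.
function mimeTypeFor(filename: string): string | undefined {
  const dot = filename.lastIndexOf(".");
  if (dot < 0) return undefined; // no extension at all
  return SUPPORTED_MIME_TYPES.get(filename.slice(dot + 1).toLowerCase());
}

console.log(mimeTypeFor("scan.PDF"));  // application/pdf
console.log(mimeTypeFor("notes.txt")); // undefined
```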
URL
const result = await flow.run({
  url: 'https://example.com/document.pdf'
});
Base64
const result = await flow.run({
  base64: 'data:application/pdf;base64,...'
});
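The `base64` value above is a data URI. One way to build it from raw file bytes in Node.js is sketched below. `toDataUri` is a hypothetical helper, and the sample bytes stand in for a real file, which would typically be read with `readFile` from `node:fs/promises`.

```typescript
// Hypothetical helper: wrap raw bytes in a data URI of the form shown above.
function toDataUri(bytes: Uint8Array, mimeType: string): string {
  return `data:${mimeType};base64,${Buffer.from(bytes).toString("base64")}`;
}

// Stand-in bytes for illustration; a real document would be read from disk.
const sampleBytes = new TextEncoder().encode("%PDF-1.7");
console.log(toDataUri(sampleBytes, "application/pdf"));
```

The resulting string can then be passed as the `base64` field of `flow.run`.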
Async Processing
Surya processes large documents asynchronously, and the SDK polls for the result automatically. To allow more time, raise the polling limits:
const ocrProvider = createOCRProvider({
  endpoint: 'https://www.datalab.to/api/v1/ocr',
  apiKey: process.env.DATALAB_API_KEY!,
  polling: {
    maxAttempts: 60,       // Wait up to 2 minutes
    pollingInterval: 2000  // Check every 2 seconds
  }
});
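Conceptually, polling with `maxAttempts` and `pollingInterval` behaves like the loop below. This is an illustrative sketch of the general pattern only, not the SDK's actual implementation. `pollUntilDone`, `PollResult`, and the `check` callback are hypothetical names.

```typescript
// Illustrative only: a generic poll-until-done loop, not the SDK's code.
type PollResult<T> = { done: true; value: T } | { done: false };

const sleep = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

async function pollUntilDone<T>(
  check: () => Promise<PollResult<T>>, // hypothetical status check
  maxAttempts = 30,
  pollingInterval = 2000,
): Promise<T> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const result = await check();
    if (result.done) return result.value;
    // Wait before the next attempt, but not after the final one.
    if (attempt < maxAttempts) await sleep(pollingInterval);
  }
  throw new Error(`Not finished after ${maxAttempts} attempts`);
}

// Usage with a fake check that completes on its third call.
let calls = 0;
const fakeCheck = async (): Promise<PollResult<string>> =>
  ++calls < 3 ? { done: false } : { done: true, value: "complete" };

pollUntilDone(fakeCheck, 5, 10).then((status) => console.log(status));
```

If the job has not finished when the attempts run out, the sketch throws, which is why large documents may need a higher `maxAttempts`.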
Self-Hosted Deployment
Run Surya locally for reduced latency and cost:
const ocrProvider = createOCRProvider({
  endpoint: 'http://localhost:8000/ocr'
  // No API key needed for self-hosted
});
Self-hosted endpoints are detected automatically and don’t require API keys.
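To switch between the hosted API and a local instance without code changes, the options can be derived from environment variables. The sketch below is one possible approach. `ocrOptionsFromEnv` is a hypothetical helper, and `SURYA_ENDPOINT` is an assumed variable name that the SDK itself does not read.

```typescript
// Subset of the documented options used in this sketch.
interface OCRProviderOptions {
  endpoint: string;
  apiKey?: string;
}

const HOSTED_ENDPOINT = "https://www.datalab.to/api/v1/ocr";

// Hypothetical helper. SURYA_ENDPOINT is an assumed variable name.
function ocrOptionsFromEnv(
  env: Record<string, string | undefined>,
): OCRProviderOptions {
  const selfHosted = env["SURYA_ENDPOINT"];
  if (selfHosted) {
    return { endpoint: selfHosted }; // self-hosted: no API key
  }
  const apiKey = env["DATALAB_API_KEY"];
  if (!apiKey) {
    throw new Error("DATALAB_API_KEY is required for the hosted endpoint");
  }
  return { endpoint: HOSTED_ENDPOINT, apiKey };
}

console.log(ocrOptionsFromEnv({ SURYA_ENDPOINT: "http://localhost:8000/ocr" }));
```

The returned object can be passed straight to `createOCRProvider`, for example `createOCRProvider(ocrOptionsFromEnv(process.env))`.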
Pricing
| Service | Cost |
|---|---|
| Surya OCR | $0.01 per page |
Cost is included in the response:
const result = await flow.run({ base64: documentData });
console.log('OCR cost:', result.artifacts.parse?.extras?.costUSD);
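For budgeting before a run, the per-page price gives a quick estimate. The sketch below assumes a flat $0.01 per page with no minimums or volume discounts. `estimateOcrCostUSD` is a hypothetical helper, and the `costUSD` value in the response remains the authoritative figure.

```typescript
const SURYA_COST_PER_PAGE_USD = 0.01; // from the pricing table above

// Hypothetical helper: round to whole cents to avoid floating-point drift.
function estimateOcrCostUSD(pageCount: number): number {
  const cents = Math.round(pageCount * SURYA_COST_PER_PAGE_USD * 100);
  return cents / 100;
}

console.log(estimateOcrCostUSD(250)); // 2.5
```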
When to Use Surya
Use Surya when:
- You need precise bounding boxes for citations
- You are processing text-heavy documents
- You are building RAG pipelines that use positional data
- You need OCR before LLM extraction
Consider sending documents directly to a VLM when:
- Documents have complex visual layouts
- Tables and forms are primary content
- Speed is more important than OCR accuracy
import { createFlow, parse, extract } from '@doclo/flows';
import { createOCRProvider } from '@doclo/providers-datalab';
import { createVLMProvider } from '@doclo/providers-llm';

const ocrProvider = createOCRProvider({
  endpoint: 'https://www.datalab.to/api/v1/ocr',
  apiKey: process.env.DATALAB_API_KEY!
});

const llmProvider = createVLMProvider({
  provider: 'anthropic',
  model: 'anthropic/claude-sonnet-4.5',
  apiKey: process.env.OPENROUTER_API_KEY!,
  via: 'openrouter'
});

const invoiceSchema = {
  type: 'object',
  properties: {
    invoiceNumber: { type: 'string' },
    date: { type: 'string' },
    total: { type: 'number' },
    lineItems: {
      type: 'array',
      items: {
        type: 'object',
        properties: {
          description: { type: 'string' },
          amount: { type: 'number' }
        }
      }
    }
  }
};

const flow = createFlow()
  .step('parse', parse({ provider: ocrProvider }))
  .step('extract', extract({
    provider: llmProvider,
    schema: invoiceSchema
  }))
  .build();

const result = await flow.run({
  base64: 'data:application/pdf;base64,...'
});

console.log('Invoice:', result.output);
console.log('OCR cost:', result.artifacts.parse?.extras?.costUSD);
console.log('Total cost:', result.aggregated.totalCostUSD);
Next Steps
- Marker OCR: Markdown conversion
- Reducto: Chunking and splitting