Mistral provides VLM-based document processing through two services: OCR 3 for parsing documents to markdown, and Document AI for structured extraction directly from source documents.
Installation
npm install @doclo/providers-mistral
Providers Overview
Mistral offers two services through the same package:
| Service | Function | Use Case | Cost |
| --- | --- | --- | --- |
| OCR 3 | Document to markdown | Parsing for RAG, text extraction | $0.002/page |
| Document AI | Schema-based extraction | Direct extraction from source | $0.002/page |
Mistral OCR 3 is a VLM under the hood, not traditional OCR. It provides excellent handwriting recognition and handles complex layouts well.
OCR Provider
Use mistralOCRProvider for parsing documents to markdown/DocumentIR.
Basic Setup
import { mistralOCRProvider } from '@doclo/providers-mistral';

const ocrProvider = mistralOCRProvider({
  apiKey: process.env.MISTRAL_API_KEY!
});
Configuration Options
mistralOCRProvider({
  apiKey: string,                       // Required: Mistral API key

  // Table handling
  tableFormat?: 'html' | 'markdown',    // Table output format (default: html)

  // Header/footer extraction
  extractHeader?: boolean,              // Extract headers into separate field
  extractFooter?: boolean,              // Extract footers into separate field

  // Image handling
  includeImageBase64?: boolean,         // Include base64 images in response

  // Page selection
  pages?: string | number | number[],   // Specific pages: "0-5", 3, or [0, 2, 5]
})
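The `pages` option accepts three shapes, which can be easier to reason about once they are expanded into an explicit list. The sketch below is a hypothetical helper (`expandPages` is not part of `@doclo/providers-mistral`), and it assumes that a string such as "0-5" denotes an inclusive range of page indices, which is what the example in the option comment suggests but does not confirm.

```typescript
// Hypothetical helper for illustration; not part of @doclo/providers-mistral.
type PagesOption = string | number | number[];

// Expand a `pages` value into a sorted list of unique page indices.
// Assumption: "a-b" is an inclusive range, as the "0-5" example suggests.
function expandPages(pages: PagesOption): number[] {
  if (typeof pages === 'number') {
    return [pages];
  }
  if (Array.isArray(pages)) {
    // Drop duplicates, then sort numerically on the filtered copy.
    return pages
      .filter((p, i) => pages.indexOf(p) === i)
      .sort((a, b) => a - b);
  }

  const match = /^(\d+)-(\d+)$/.exec(pages.trim());
  if (!match) {
    throw new Error(`Unsupported pages value: ${pages}`);
  }
  const start = Number(match[1]);
  const end = Number(match[2]);
  if (start > end) {
    throw new Error(`Invalid page range: ${pages}`);
  }

  const expanded: number[] = [];
  for (let p = start; p <= end; p++) {
    expanded.push(p);
  }
  return expanded;
}
```

Expanding the value up front makes it straightforward to count how many pages a request will actually cover before sending it.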
Usage with Flows
import { createFlow, parse } from '@doclo/flows';
import { mistralOCRProvider } from '@doclo/providers-mistral';

const ocrProvider = mistralOCRProvider({
  apiKey: process.env.MISTRAL_API_KEY!,
  tableFormat: 'html'
});

const flow = createFlow()
  .step('parse', parse({ provider: ocrProvider }))
  .build();

const result = await flow.run({
  url: 'https://example.com/document.pdf'
});

// Access parsed content
console.log(result.output.pages[0].lines);
Output: DocumentIR
interface DocumentIR {
  pages: {
    width: number;
    height: number;
    lines: { text: string }[];
    images?: {
      id: string;
      bbox: {
        top_left_x: number;
        top_left_y: number;
        bottom_right_x: number;
        bottom_right_y: number;
      };
      base64?: string;   // If includeImageBase64 was true
    }[];
  }[];
  extras?: {
    raw: object;         // Raw Mistral response
    costUSD: number;
    pageCount: number;
    markdown: string;    // Full markdown output
  };
}
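As a rough illustration of how the DocumentIR shape might be consumed, the sketch below flattens the per-page `lines` into plain text and falls back to that text when `extras.markdown` is absent. `IRLike`, `irToPlainText`, and `irToMarkdown` are hypothetical names introduced here, and the types cover only the subset of fields the helpers read.

```typescript
// Minimal subset of the DocumentIR shape; names are illustrative only.
interface IRPage {
  lines: { text: string }[];
}

interface IRLike {
  pages: IRPage[];
  extras?: { markdown?: string };
}

// Join the line text of every page, separating pages with a blank line.
function irToPlainText(doc: IRLike): string {
  return doc.pages
    .map((page) => page.lines.map((line) => line.text).join('\n'))
    .join('\n\n');
}

// Prefer the full markdown output when present, else fall back to line text.
function irToMarkdown(doc: IRLike): string {
  if (doc.extras && typeof doc.extras.markdown === 'string') {
    return doc.extras.markdown;
  }
  return irToPlainText(doc);
}
```

Reading from `lines` keeps the code independent of `extras`, which the interface marks as optional.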
VLM Provider (Document AI)
Use mistralVLMProvider for structured extraction directly from source documents using JSON schema.
Mistral VLM always requires raw document input (URL or base64). It cannot extract from pre-parsed DocumentIR. Use it as the first step in a flow, not after a parse() step.
Basic Setup
import { mistralVLMProvider } from '@doclo/providers-mistral';

const vlmProvider = mistralVLMProvider({
  apiKey: process.env.MISTRAL_API_KEY!
});
Configuration Options
mistralVLMProvider({
  apiKey: string,                         // Required: Mistral API key

  // Annotation mode
  annotationMode?: 'document' | 'bbox',   // Extraction mode (default: document)

  // Image handling
  includeImageBase64?: boolean,           // Include base64 images in response

  // Page selection
  pages?: string | number | number[],     // Specific pages to process
})
Annotation Modes
| Mode | Description | Page Limit |
| --- | --- | --- |
| document | Single JSON output for entire document | 8 pages |
| bbox | Per-image annotations with bounding boxes | 1000 pages |
Usage with Flows
import { createFlow, extract } from '@doclo/flows';
import { mistralVLMProvider } from '@doclo/providers-mistral';

const vlmProvider = mistralVLMProvider({
  apiKey: process.env.MISTRAL_API_KEY!,
  annotationMode: 'document'
});

const invoiceSchema = {
  type: 'object',
  properties: {
    invoiceNumber: { type: 'string' },
    date: { type: 'string' },
    total: { type: 'number' },
    lineItems: {
      type: 'array',
      items: {
        type: 'object',
        properties: {
          description: { type: 'string' },
          quantity: { type: 'number' },
          price: { type: 'number' }
        }
      }
    }
  }
};

const flow = createFlow()
  .step('extract', extract({
    provider: vlmProvider,
    schema: invoiceSchema,
    inputMode: 'raw' // Required: Mistral needs raw document
  }))
  .build();

const result = await flow.run({
  base64: 'data:application/pdf;base64,...'
});

console.log(result.output);
// { invoiceNumber: "INV-001", date: "2025-01-15", total: 1250.00, ... }
Documents
| Format | MIME Type | Supported |
| --- | --- | --- |
| PDF | application/pdf | Yes |
| DOCX | application/vnd.openxmlformats-officedocument.wordprocessingml.document | Yes |
| PPTX | application/vnd.openxmlformats-officedocument.presentationml.presentation | Yes |
| TXT | text/plain | Yes |
| EPUB | application/epub+zip | Yes |
| RTF | application/rtf | Yes |
| ODT | application/vnd.oasis.opendocument.text | Yes |
| LaTeX | application/x-latex | Yes |
| Jupyter | application/x-ipynb+json | Yes |
| XLSX | application/vnd.openxmlformats-officedocument.spreadsheetml.sheet | No |
Images
| Format | MIME Type |
| --- | --- |
| JPEG | image/jpeg |
| PNG | image/png |
| WebP | image/webp |
| TIFF | image/tiff |
| GIF | image/gif |
| AVIF | image/avif |
| BMP | image/bmp |
| HEIC/HEIF | image/heic, image/heif |
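A caller may want to reject unsupported inputs before spending an API call. The sketch below copies the MIME types listed above into a lookup and is only an illustration: `isSupportedMimeType` is a hypothetical helper, not something the package is known to export, and the provider may perform its own validation.

```typescript
// MIME types copied from the Documents and Images tables.
// XLSX is deliberately omitted because it is listed as unsupported.
const SUPPORTED_MIME_TYPES: string[] = [
  'application/pdf',
  'application/vnd.openxmlformats-officedocument.wordprocessingml.document',
  'application/vnd.openxmlformats-officedocument.presentationml.presentation',
  'text/plain',
  'application/epub+zip',
  'application/rtf',
  'application/vnd.oasis.opendocument.text',
  'application/x-latex',
  'application/x-ipynb+json',
  'image/jpeg',
  'image/png',
  'image/webp',
  'image/tiff',
  'image/gif',
  'image/avif',
  'image/bmp',
  'image/heic',
  'image/heif',
];

// Hypothetical helper: case-insensitive membership check.
function isSupportedMimeType(mime: string): boolean {
  return SUPPORTED_MIME_TYPES.indexOf(mime.trim().toLowerCase()) !== -1;
}
```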
Limits
| Limit | Value |
| --- | --- |
| Max file size | 50 MB |
| Max pages (OCR) | 1000 |
| Max pages (Document AI - document mode) | 8 |
| Max pages (Document AI - bbox mode) | 1000 |
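These limits can be checked locally before a request is made. The following is a minimal sketch with hypothetical names (`findLimitViolations`, `Service`); it assumes "50 MB" means 50 × 1024 × 1024 bytes, which the source does not specify, and it assumes the caller already knows the page count.

```typescript
// Hypothetical pre-flight check; values copied from the Limits table.
// Assumption: "50 MB" is interpreted as 50 * 1024 * 1024 bytes.
const MAX_FILE_BYTES = 50 * 1024 * 1024;

type Service = 'ocr' | 'document-ai:document' | 'document-ai:bbox';

const MAX_PAGES: Record<Service, number> = {
  'ocr': 1000,
  'document-ai:document': 8,
  'document-ai:bbox': 1000,
};

// Return a human-readable message for each limit the input exceeds.
function findLimitViolations(
  service: Service,
  sizeBytes: number,
  pageCount: number,
): string[] {
  const problems: string[] = [];
  if (sizeBytes > MAX_FILE_BYTES) {
    problems.push(`File is ${sizeBytes} bytes; the limit is ${MAX_FILE_BYTES}.`);
  }
  if (pageCount > MAX_PAGES[service]) {
    problems.push(
      `Document has ${pageCount} pages; ${service} allows ${MAX_PAGES[service]}.`,
    );
  }
  return problems;
}
```

An empty array means neither limit is exceeded under these assumptions.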
Pricing
| Service | Cost | Batch Discount |
| --- | --- | --- |
| OCR 3 | $0.002/page | 50% off |
| Document AI | $0.002/page | 50% off |
$2 per 1000 pages makes Mistral one of the most cost-effective OCR options available.
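For budgeting, the per-page price translates into a simple estimate. This is a sketch only: `estimateCostUSD` is a hypothetical helper, and it assumes the 50% batch discount applies uniformly to the per-page price, which the pricing table implies but does not spell out. The `costUSD` value returned by the provider remains the number to rely on.

```typescript
// Prices taken from the pricing table: $0.002/page, i.e. $2 per 1000 pages.
const USD_PER_1000_PAGES = 2;
// Assumption: "50% off" halves the per-page price for batch requests.
const BATCH_DISCOUNT_FACTOR = 0.5;

// Hypothetical helper: estimate the cost of processing `pageCount` pages.
function estimateCostUSD(pageCount: number, batch: boolean = false): number {
  const isWholeNumber = isFinite(pageCount) && Math.floor(pageCount) === pageCount;
  if (!isWholeNumber || pageCount < 0) {
    throw new RangeError(`pageCount must be a non-negative integer, got ${pageCount}`);
  }
  // Multiply before dividing to keep the arithmetic on whole numbers.
  const base = (pageCount * USD_PER_1000_PAGES) / 1000;
  return batch ? base * BATCH_DISCOUNT_FACTOR : base;
}
```

Under these assumptions, 1000 pages would come to $2 at the standard rate and $1 with the batch discount.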
Mistral vs Other Providers
| Feature | Mistral | Reducto | Surya | Marker |
| --- | --- | --- | --- | --- |
| Document parsing | Yes | Yes | Yes | Yes |
| Structured extraction | Yes (native) | Yes (native) | Via LLM | Via LLM |
| Handwriting | Excellent | Good | Good | Limited |
| Format support | Extensive | Good | PDF/Images | PDF/Images |
| Bounding boxes | Image-level | Yes | Yes | No |
| Cost/page | $0.002 | $0.004+ | $0.01 | $0.002+ |
| Page limit | 1000 | Unlimited | Unlimited | Unlimited |
Choose Mistral when:
You are processing documents with handwriting
You need native structured extraction without a separate LLM
You are working with diverse document formats (DOCX, PPTX, EPUB, etc.)
Cost is a primary concern
Example: Parse and Extract Pipeline
For documents over 8 pages, use OCR to parse first, then an LLM to extract:
import { createFlow, parse, extract } from '@doclo/flows';
import { mistralOCRProvider } from '@doclo/providers-mistral';
import { createVLMProvider } from '@doclo/providers-llm';

const ocrProvider = mistralOCRProvider({
  apiKey: process.env.MISTRAL_API_KEY!
});

const llmProvider = createVLMProvider({
  provider: 'google',
  model: 'google/gemini-2.5-flash',
  apiKey: process.env.OPENROUTER_API_KEY!,
  via: 'openrouter'
});

// contractSchema is assumed to be a JSON schema object defined elsewhere,
// in the same style as invoiceSchema above.
const flow = createFlow()
  .step('parse', parse({ provider: ocrProvider }))
  .step('extract', extract({
    provider: llmProvider,
    schema: contractSchema,
    inputMode: 'ir' // Use parsed DocumentIR
  }))
  .build();
For documents of 8 pages or fewer (the document-mode limit), extract directly:
import { createFlow, extract } from '@doclo/flows';
import { mistralVLMProvider } from '@doclo/providers-mistral';

const vlmProvider = mistralVLMProvider({
  apiKey: process.env.MISTRAL_API_KEY!
});

// invoiceSchema is the JSON schema defined in the Document AI example above.
const flow = createFlow()
  .step('extract', extract({
    provider: vlmProvider,
    schema: invoiceSchema,
    inputMode: 'raw'
  }))
  .build();

// Single-step extraction
const result = await flow.run({
  url: 'https://example.com/invoice.pdf'
});
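The choice between the two pipelines comes down to the 8-page limit of document mode. A minimal sketch of that decision follows; `chooseRoute` and the route names are hypothetical, and the function assumes the page count is known before the flow is built.

```typescript
// Document-mode page limit, taken from the Limits table.
const DOCUMENT_MODE_MAX_PAGES = 8;

type Route = 'direct-extract' | 'parse-then-extract';

// Hypothetical helper: pick a pipeline from the page count.
// Up to 8 pages: single-step extraction with the Mistral VLM provider.
// More than 8 pages: parse with OCR first, then extract with an LLM.
function chooseRoute(pageCount: number): Route {
  if (pageCount < 1) {
    throw new RangeError(`pageCount must be at least 1, got ${pageCount}`);
  }
  return pageCount <= DOCUMENT_MODE_MAX_PAGES
    ? 'direct-extract'
    : 'parse-then-extract';
}
```

The result could then select which of the two flows shown above to build and run.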
Next Steps
Reducto: RAG-optimized chunking
Surya OCR: Text with bounding boxes