Mistral

Mistral provides document processing based on a vision-language model (VLM) through two services: OCR 3, which parses documents to markdown, and Document AI, which performs structured extraction directly from source documents.

Installation

npm install @doclo/providers-mistral

Providers Overview

Mistral offers two services through the same package:

Service	Function	Use Case	Cost
OCR 3	Document to markdown	Parsing for RAG, text extraction	$0.002/page
Document AI	Schema-based extraction	Direct extraction from source	$0.002/page

Mistral OCR 3 is a VLM under the hood, not traditional OCR. It provides excellent handwriting recognition and handles complex layouts well.

OCR Provider

Use mistralOCRProvider for parsing documents to markdown/DocumentIR.

Basic Setup

import { mistralOCRProvider } from '@doclo/providers-mistral';

const ocrProvider = mistralOCRProvider({
  apiKey: process.env.MISTRAL_API_KEY!
});

Configuration Options

mistralOCRProvider({
  apiKey: string,               // Required: Mistral API key

  // Table handling
  tableFormat?: 'html' | 'markdown',  // Table output format (default: html)

  // Header/footer extraction
  extractHeader?: boolean,      // Extract headers into separate field
  extractFooter?: boolean,      // Extract footers into separate field

  // Image handling
  includeImageBase64?: boolean, // Include base64 images in response

  // Page selection
  pages?: string | number | number[],  // Specific pages: "0-5", 3, or [0, 2, 5]
})
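
The `pages` option accepts three shapes, so code that needs a page count (for example, to estimate cost) may want to expand them into one form first. The sketch below is a hypothetical helper, not part of `@doclo/providers-mistral`. It assumes range strings such as "0-5" are inclusive and zero-based, which the examples above suggest but do not state.

```typescript
type PageSelection = string | number | number[];

// Hypothetical helper: expands a `pages` value into a sorted list of page indices.
// Assumption: "0-5" means pages 0 through 5 inclusive.
function normalizePages(pages: PageSelection): number[] {
  if (typeof pages === 'number') return [pages];

  if (Array.isArray(pages)) {
    const list = pages;
    // Drop duplicates, then sort ascending (filter returns a new array).
    return list.filter((p, i) => list.indexOf(p) === i).sort((a, b) => a - b);
  }

  const match = /^(\d+)-(\d+)$/.exec(pages.trim());
  if (!match) throw new Error('Unsupported page range: ' + pages);

  const start = Number(match[1]);
  const end = Number(match[2]);
  if (start > end) throw new Error('Range start exceeds end: ' + pages);

  const result: number[] = [];
  for (let page = start; page <= end; page++) result.push(page);
  return result;
}
```

With these assumptions, `normalizePages('0-5')` yields six indices and `normalizePages([5, 0, 2, 0])` yields `[0, 2, 5]`.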

Usage with Flows

import { createFlow, parse } from '@doclo/flows';
import { mistralOCRProvider } from '@doclo/providers-mistral';

const ocrProvider = mistralOCRProvider({
  apiKey: process.env.MISTRAL_API_KEY!,
  tableFormat: 'html'
});

const flow = createFlow()
  .step('parse', parse({ provider: ocrProvider }))
  .build();

const result = await flow.run({
  url: 'https://example.com/document.pdf'
});

// Access parsed content
console.log(result.output.pages[0].lines);

Output: DocumentIR

interface DocumentIR {
  pages: {
    width: number;
    height: number;
    lines: { text: string }[];
    images?: {
      id: string;
      bbox: { top_left_x: number; top_left_y: number; bottom_right_x: number; bottom_right_y: number };
      base64?: string;  // If includeImageBase64 was true
    }[];
  }[];
  extras?: {
    raw: object;        // Raw Mistral response
    costUSD: number;
    pageCount: number;
    markdown: string;   // Full markdown output
  };
}
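
When only plain text is needed, the `lines` arrays can be flattened page by page. The following is a minimal sketch with a hypothetical helper name. It redeclares only the fields it reads, so it does not depend on the package's own type exports.

```typescript
// Local subset of the DocumentIR shape shown above.
interface IRPage {
  lines: { text: string }[];
}

interface IRDocument {
  pages: IRPage[];
}

// Hypothetical helper: joins the lines of each page with newlines
// and separates pages with a blank line.
function documentText(ir: IRDocument): string {
  return ir.pages
    .map((page) => page.lines.map((line) => line.text).join('\n'))
    .join('\n\n');
}
```

If the full markdown is preferred over line-by-line text, `extras.markdown` already carries it when `extras` is present.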

VLM Provider (Document AI)

Use mistralVLMProvider for structured extraction directly from source documents using JSON schema.

Mistral VLM always requires raw document input (URL or base64). It cannot extract from pre-parsed DocumentIR. Use it as the first step in a flow, not after a parse() step.

Basic Setup

import { mistralVLMProvider } from '@doclo/providers-mistral';

const vlmProvider = mistralVLMProvider({
  apiKey: process.env.MISTRAL_API_KEY!
});

Configuration Options

mistralVLMProvider({
  apiKey: string,               // Required: Mistral API key

  // Annotation mode
  annotationMode?: 'document' | 'bbox',  // Extraction mode (default: document)

  // Image handling
  includeImageBase64?: boolean, // Include base64 images in response

  // Page selection
  pages?: string | number | number[],  // Specific pages to process
})

Annotation Modes

Mode	Description	Page Limit
`document`	Single JSON output for entire document	8 pages
`bbox`	Per-image annotations with bounding boxes	1000 pages

Usage with Flows

import { createFlow, extract } from '@doclo/flows';
import { mistralVLMProvider } from '@doclo/providers-mistral';

const vlmProvider = mistralVLMProvider({
  apiKey: process.env.MISTRAL_API_KEY!,
  annotationMode: 'document'
});

const invoiceSchema = {
  type: 'object',
  properties: {
    invoiceNumber: { type: 'string' },
    date: { type: 'string' },
    total: { type: 'number' },
    lineItems: {
      type: 'array',
      items: {
        type: 'object',
        properties: {
          description: { type: 'string' },
          quantity: { type: 'number' },
          price: { type: 'number' }
        }
      }
    }
  }
};

const flow = createFlow()
  .step('extract', extract({
    provider: vlmProvider,
    schema: invoiceSchema,
    inputMode: 'raw'  // Required: Mistral needs raw document
  }))
  .build();

const result = await flow.run({
  base64: 'data:application/pdf;base64,...'
});

console.log(result.output);
// { invoiceNumber: "INV-001", date: "2025-01-15", total: 1250.00, ... }
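
The source does not say whether the extracted output is validated against the schema, so a caller may want a light runtime check before using it. The sketch below is a hypothetical type guard for the invoice shape above. It treats every field as optional because `invoiceSchema` declares no `required` list.

```typescript
interface LineItem {
  description?: string;
  quantity?: number;
  price?: number;
}

interface Invoice {
  invoiceNumber?: string;
  date?: string;
  total?: number;
  lineItems?: LineItem[];
}

// Hypothetical guard: checks top-level field types only, not each line item.
function isInvoice(value: unknown): value is Invoice {
  if (typeof value !== 'object' || value === null) return false;
  const v = value as Record<string, unknown>;

  const optionalString = (x: unknown): boolean => x === undefined || typeof x === 'string';
  const optionalNumber = (x: unknown): boolean => x === undefined || typeof x === 'number';

  return (
    optionalString(v['invoiceNumber']) &&
    optionalString(v['date']) &&
    optionalNumber(v['total']) &&
    (v['lineItems'] === undefined || Array.isArray(v['lineItems']))
  );
}
```

A call such as `isInvoice(result.output)` would then narrow the value before fields like `total` are read.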

Supported Formats

Documents

Format	MIME Type	Supported
PDF	`application/pdf`	Yes
DOCX	`application/vnd.openxmlformats-officedocument.wordprocessingml.document`	Yes
PPTX	`application/vnd.openxmlformats-officedocument.presentationml.presentation`	Yes
TXT	`text/plain`	Yes
EPUB	`application/epub+zip`	Yes
RTF	`application/rtf`	Yes
ODT	`application/vnd.oasis.opendocument.text`	Yes
LaTeX	`application/x-latex`	Yes
Jupyter	`application/x-ipynb+json`	Yes
XLSX	`application/vnd.openxmlformats-officedocument.spreadsheetml.sheet`	No

Images

Format	MIME Type
JPEG	`image/jpeg`
PNG	`image/png`
WebP	`image/webp`
TIFF	`image/tiff`
GIF	`image/gif`
AVIF	`image/avif`
BMP	`image/bmp`
HEIC/HEIF	`image/heic`, `image/heif`
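
The two tables above can be turned into a pre-flight check that rejects unsupported uploads before any API call is made. This is a sketch with hypothetical names. The list is copied from the tables, and XLSX is deliberately left out because it is marked as unsupported.

```typescript
// MIME types marked as supported in the tables above.
const SUPPORTED_MIME_TYPES: readonly string[] = [
  'application/pdf',
  'application/vnd.openxmlformats-officedocument.wordprocessingml.document',
  'application/vnd.openxmlformats-officedocument.presentationml.presentation',
  'text/plain',
  'application/epub+zip',
  'application/rtf',
  'application/vnd.oasis.opendocument.text',
  'application/x-latex',
  'application/x-ipynb+json',
  'image/jpeg',
  'image/png',
  'image/webp',
  'image/tiff',
  'image/gif',
  'image/avif',
  'image/bmp',
  'image/heic',
  'image/heif',
];

// Hypothetical helper: ignores case and any parameters such as "; charset=utf-8".
function isSupportedMimeType(mime: string): boolean {
  const base = (mime.split(';')[0] ?? '').trim().toLowerCase();
  return SUPPORTED_MIME_TYPES.indexOf(base) !== -1;
}
```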

Limits

Limit	Value
Max file size	50 MB
Max pages (OCR)	1000
Max pages (Document AI - document mode)	8
Max pages (Document AI - bbox mode)	1000
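
These limits can be checked locally before a document is submitted. The sketch below uses hypothetical names and assumes "50 MB" means 50 × 1024 × 1024 bytes. The source does not say whether the limit is decimal or binary.

```typescript
type Service = 'ocr' | 'document' | 'bbox';

// Assumption: 50 MB is interpreted as 50 MiB.
const MAX_FILE_BYTES = 50 * 1024 * 1024;

// Page limits from the table above.
const MAX_PAGES: Record<Service, number> = { ocr: 1000, document: 8, bbox: 1000 };

// Hypothetical helper: returns a list of violated limits (empty when the input fits).
function limitViolations(service: Service, sizeBytes: number, pageCount: number): string[] {
  const problems: string[] = [];
  if (sizeBytes > MAX_FILE_BYTES) {
    problems.push('file exceeds 50 MB');
  }
  if (pageCount > MAX_PAGES[service]) {
    problems.push('page count exceeds ' + MAX_PAGES[service]);
  }
  return problems;
}
```

Returning a list rather than throwing lets a caller report every problem at once.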

Pricing

Service	Cost	Batch Discount
OCR 3	$0.002/page	50% off
Document AI	$0.002/page	50% off

At $0.002 per page, 1,000 pages cost $2 ($1 with the 50% batch discount), which makes Mistral one of the most cost-effective OCR options available.
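
The pricing table reduces to simple arithmetic. The sketch below is a hypothetical estimator that assumes the batch discount applies uniformly to every page. How batch processing is requested is not covered here.

```typescript
const COST_PER_PAGE_USD = 0.002; // From the pricing table
const BATCH_DISCOUNT = 0.5;      // 50% off

// Hypothetical helper: estimated cost in USD for a number of pages.
function estimateCostUSD(pageCount: number, batch: boolean = false): number {
  const perPage = batch ? COST_PER_PAGE_USD * (1 - BATCH_DISCOUNT) : COST_PER_PAGE_USD;
  // Round to six decimals to suppress floating-point noise.
  return Math.round(pageCount * perPage * 1e6) / 1e6;
}
```

For 1,000 pages this gives 2 at the standard rate and 1 with the batch discount, matching the table.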

Mistral vs Other Providers

Feature	Mistral	Reducto	Surya	Marker
Document parsing	Yes	Yes	Yes	Yes
Structured extraction	Yes (native)	Yes (native)	Via LLM	Via LLM
Handwriting	Excellent	Good	Good	Limited
Format support	Extensive	Good	PDF/Images	PDF/Images
Bounding boxes	Image-level	Yes	Yes	No
Cost/page	$0.002	$0.004+	$0.01	$0.002+
Page limit	1000	Unlimited	Unlimited	Unlimited

Choose Mistral when:

You process documents with handwriting
You need native structured extraction without a separate LLM
You work with diverse document formats (DOCX, PPTX, EPUB, etc.)
Cost is a primary concern

Example: Parse and Extract Pipeline

Document AI's document mode is limited to 8 pages. For longer documents, parse with OCR first, then extract from the parsed DocumentIR with an LLM:

import { createFlow, parse, extract } from '@doclo/flows';
import { mistralOCRProvider } from '@doclo/providers-mistral';
import { createVLMProvider } from '@doclo/providers-llm';

const ocrProvider = mistralOCRProvider({
  apiKey: process.env.MISTRAL_API_KEY!
});

const llmProvider = createVLMProvider({
  provider: 'google',
  model: 'google/gemini-2.5-flash',
  apiKey: process.env.OPENROUTER_API_KEY!,
  via: 'openrouter'
});

const flow = createFlow()
  .step('parse', parse({ provider: ocrProvider }))
  .step('extract', extract({
    provider: llmProvider,
    schema: contractSchema,  // Your JSON schema for the contract; define it before building the flow
    inputMode: 'ir'  // Use parsed DocumentIR
  }))
  .build();

Example: Direct Extraction (Short Documents)

For documents of 8 pages or fewer, extract directly:

import { createFlow, extract } from '@doclo/flows';
import { mistralVLMProvider } from '@doclo/providers-mistral';

const vlmProvider = mistralVLMProvider({
  apiKey: process.env.MISTRAL_API_KEY!
});

const flow = createFlow()
  .step('extract', extract({
    provider: vlmProvider,
    schema: invoiceSchema,  // Same schema as in the VLM provider's Usage with Flows example
    inputMode: 'raw'
  }))
  .build();

// Single-step extraction
const result = await flow.run({
  url: 'https://example.com/invoice.pdf'
});
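
The choice between the two example pipelines comes down to the 8-page limit of document mode. A minimal sketch of that decision, using hypothetical names, might look like this:

```typescript
const DOCUMENT_MODE_PAGE_LIMIT = 8; // From the Limits table

type Route = 'direct' | 'parse-then-extract';

// Hypothetical helper: picks a pipeline from the page count.
// Documents at or under the limit go straight to Document AI;
// longer ones are parsed with OCR first.
function chooseRoute(pageCount: number): Route {
  return pageCount <= DOCUMENT_MODE_PAGE_LIMIT ? 'direct' : 'parse-then-extract';
}
```

The comparison is inclusive, so an 8-page document still qualifies for direct extraction.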


Next Steps

Reducto

Surya OCR
