Mistral provides VLM-based document processing through two services: OCR 3 for parsing documents to markdown, and Document AI for structured extraction directly from source documents.

Installation

npm install @doclo/providers-mistral

Providers Overview

Mistral offers two services through the same package:
Service      Function                 Use Case                          Cost
OCR 3        Document to markdown     Parsing for RAG, text extraction  $0.002/page
Document AI  Schema-based extraction  Direct extraction from source     $0.002/page
Mistral OCR 3 is a VLM under the hood, not traditional OCR. It provides excellent handwriting recognition and handles complex layouts well.

OCR Provider

Use mistralOCRProvider for parsing documents to markdown/DocumentIR.

Basic Setup

import { mistralOCRProvider } from '@doclo/providers-mistral';

const ocrProvider = mistralOCRProvider({
  apiKey: process.env.MISTRAL_API_KEY!
});

Configuration Options

mistralOCRProvider({
  apiKey: string,               // Required: Mistral API key

  // Table handling
  tableFormat?: 'html' | 'markdown',  // Table output format (default: html)

  // Header/footer extraction
  extractHeader?: boolean,      // Extract headers into separate field
  extractFooter?: boolean,      // Extract footers into separate field

  // Image handling
  includeImageBase64?: boolean, // Include base64 images in response

  // Page selection
  pages?: string | number | number[],  // Specific pages: "0-5", 3, or [0, 2, 5]
})
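The pages option accepts three shapes, which can be awkward to reason about when logging or validating input. The sketch below normalizes all three into an explicit list of page indices. normalizePages is a hypothetical helper, not part of @doclo/providers-mistral, and it assumes that a string is either a single index or an inclusive range such as "0-5", and that indices are zero-based as the examples above suggest.

```typescript
// Hypothetical helper for illustration; not part of @doclo/providers-mistral.
type PagesOption = string | number | number[];

function normalizePages(pages: PagesOption): number[] {
  if (typeof pages === 'number') return [pages];
  if (Array.isArray(pages)) return [...pages];

  // Assumption: a string is a single index ("3") or an inclusive range ("0-5").
  const match = /^(\d+)(?:-(\d+))?$/.exec(pages.trim());
  if (!match) throw new Error(`Unrecognized pages value: ${pages}`);

  const start = Number(match[1]);
  const end = match[2] === undefined ? start : Number(match[2]);
  if (end < start) throw new Error(`Range end precedes start: ${pages}`);

  const result: number[] = [];
  for (let i = start; i <= end; i++) result.push(i);
  return result;
}
```

Under these assumptions, normalizePages("0-5") yields six indices, 0 through 5, while a number or an array passes through unchanged.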

Usage with Flows

import { createFlow, parse } from '@doclo/flows';
import { mistralOCRProvider } from '@doclo/providers-mistral';

const ocrProvider = mistralOCRProvider({
  apiKey: process.env.MISTRAL_API_KEY!,
  tableFormat: 'html'
});

const flow = createFlow()
  .step('parse', parse({ provider: ocrProvider }))
  .build();

const result = await flow.run({
  url: 'https://example.com/document.pdf'
});

// Access parsed content
console.log(result.output.pages[0].lines);

Output: DocumentIR

interface DocumentIR {
  pages: {
    width: number;
    height: number;
    lines: { text: string }[];
    images?: {
      id: string;
      bbox: { top_left_x: number; top_left_y: number; bottom_right_x: number; bottom_right_y: number };
      base64?: string;  // If includeImageBase64 was true
    }[];
  }[];
  extras?: {
    raw: object;        // Raw Mistral response
    costUSD: number;
    pageCount: number;
    markdown: string;   // Full markdown output
  };
}
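For RAG or plain-text use, the per-page lines usually need to be joined back into strings. The following is a minimal sketch that works against a local mirror of the DocumentIR fields shown above. IRLike, pageTexts, and documentText are hypothetical names introduced here, and preferring extras.markdown when it is present is a design choice of this sketch rather than documented behavior.

```typescript
// Minimal local mirror of the DocumentIR fields used below (illustrative only).
interface IRPage {
  lines: { text: string }[];
}
interface IRLike {
  pages: IRPage[];
  extras?: { markdown?: string };
}

// One string per page, with lines joined by newlines.
function pageTexts(ir: IRLike): string[] {
  return ir.pages.map((page) => page.lines.map((line) => line.text).join('\n'));
}

// Whole-document text: use the full markdown if available,
// otherwise fall back to the joined page lines.
function documentText(ir: IRLike): string {
  return ir.extras?.markdown ?? pageTexts(ir).join('\n\n');
}
```

In a flow like the one above, result.output could be passed to documentText, assuming it matches the DocumentIR shape.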

VLM Provider (Document AI)

Use mistralVLMProvider for structured extraction directly from source documents using JSON schema.
Mistral VLM always requires raw document input (URL or base64). It cannot extract from pre-parsed DocumentIR. Use it as the first step in a flow, not after a parse() step.

Basic Setup

import { mistralVLMProvider } from '@doclo/providers-mistral';

const vlmProvider = mistralVLMProvider({
  apiKey: process.env.MISTRAL_API_KEY!
});

Configuration Options

mistralVLMProvider({
  apiKey: string,               // Required: Mistral API key

  // Annotation mode
  annotationMode?: 'document' | 'bbox',  // Extraction mode (default: document)

  // Image handling
  includeImageBase64?: boolean, // Include base64 images in response

  // Page selection
  pages?: string | number | number[],  // Specific pages to process
})

Annotation Modes

Mode      Description                                Page Limit
document  Single JSON output for entire document     8 pages
bbox      Per-image annotations with bounding boxes  1000 pages
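Because the two modes have very different page limits, it can help to check a document's page count before choosing one. The sketch below encodes the limits from the table. fitsAnnotationMode and PAGE_LIMITS are hypothetical names, not part of the package, and the page count is assumed to be known in advance (for example from a PDF library).

```typescript
// Page limits copied from the table above; names are illustrative only.
type AnnotationMode = 'document' | 'bbox';

const PAGE_LIMITS: Record<AnnotationMode, number> = {
  document: 8,
  bbox: 1000,
};

// Returns true when a document with `pageCount` pages is within the
// page limit for the given annotation mode.
function fitsAnnotationMode(pageCount: number, mode: AnnotationMode = 'document'): boolean {
  return pageCount >= 1 && pageCount <= PAGE_LIMITS[mode];
}
```

A false result for document mode suggests either restricting the request with the pages option or parsing with the OCR provider first and extracting with a separate LLM.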

Usage with Flows

import { createFlow, extract } from '@doclo/flows';
import { mistralVLMProvider } from '@doclo/providers-mistral';

const vlmProvider = mistralVLMProvider({
  apiKey: process.env.MISTRAL_API_KEY!,
  annotationMode: 'document'
});

const invoiceSchema = {
  type: 'object',
  properties: {
    invoiceNumber: { type: 'string' },
    date: { type: 'string' },
    total: { type: 'number' },
    lineItems: {
      type: 'array',
      items: {
        type: 'object',
        properties: {
          description: { type: 'string' },
          quantity: { type: 'number' },
          price: { type: 'number' }
        }
      }
    }
  }
};

const flow = createFlow()
  .step('extract', extract({
    provider: vlmProvider,
    schema: invoiceSchema,
    inputMode: 'raw'  // Required: Mistral needs raw document
  }))
  .build();

const result = await flow.run({
  base64: 'data:application/pdf;base64,...'
});

console.log(result.output);
// { invoiceNumber: "INV-001", date: "2025-01-15", total: 1250.00, ... }
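The schema above declares no required fields, so any property may be missing from the extracted object. The sketch below mirrors the schema as a TypeScript type and adds a runtime guard before the result is used. Invoice, LineItem, and isInvoice are hypothetical names, and whether the flow already validates output against the schema is not stated here, so treat this as an optional extra check.

```typescript
// Illustrative types mirroring invoiceSchema; all fields optional because
// the schema lists no `required` properties.
interface LineItem {
  description?: string;
  quantity?: number;
  price?: number;
}
interface Invoice {
  invoiceNumber?: string;
  date?: string;
  total?: number;
  lineItems?: LineItem[];
}

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === 'object' && value !== null && !Array.isArray(value);
}

// True when the value is absent or has the expected primitive type.
function isOptional(value: unknown, type: 'string' | 'number'): boolean {
  return value === undefined || typeof value === type;
}

function isLineItem(value: unknown): value is LineItem {
  return (
    isRecord(value) &&
    isOptional(value['description'], 'string') &&
    isOptional(value['quantity'], 'number') &&
    isOptional(value['price'], 'number')
  );
}

function isInvoice(value: unknown): value is Invoice {
  if (!isRecord(value)) return false;
  const items = value['lineItems'];
  return (
    isOptional(value['invoiceNumber'], 'string') &&
    isOptional(value['date'], 'string') &&
    isOptional(value['total'], 'number') &&
    (items === undefined || (Array.isArray(items) && items.every(isLineItem)))
  );
}
```

After a run, a check such as isInvoice(result.output) narrows the type so that fields like total can be used as numbers without casts.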

Supported Formats

Documents

Format   Supported  MIME Type
PDF      Yes        application/pdf
DOCX     Yes        application/vnd.openxmlformats-officedocument.wordprocessingml.document
PPTX     Yes        application/vnd.openxmlformats-officedocument.presentationml.presentation
TXT      Yes        text/plain
EPUB     Yes        application/epub+zip
RTF      Yes        application/rtf
ODT      Yes        application/vnd.oasis.opendocument.text
LaTeX    Yes        application/x-latex
Jupyter  Yes        application/x-ipynb+json
XLSX     No         application/vnd.openxmlformats-officedocument.spreadsheetml.sheet

Images

Format     MIME Type
JPEG       image/jpeg
PNG        image/png
WebP       image/webp
TIFF       image/tiff
GIF        image/gif
AVIF       image/avif
BMP        image/bmp
HEIC/HEIF  image/heic, image/heif
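Rejecting unsupported files before making a request avoids a wasted API call. The sketch below checks a MIME type against the document and image tables above. SUPPORTED_MIME_TYPES and isSupportedMimeType are hypothetical names, not package exports, and the list is copied from this page rather than queried from the API.

```typescript
// MIME types listed as supported in the tables above (XLSX is omitted
// because it is marked unsupported). Illustrative only.
const SUPPORTED_MIME_TYPES: ReadonlySet<string> = new Set<string>([
  'application/pdf',
  'application/vnd.openxmlformats-officedocument.wordprocessingml.document',
  'application/vnd.openxmlformats-officedocument.presentationml.presentation',
  'text/plain',
  'application/epub+zip',
  'application/rtf',
  'application/vnd.oasis.opendocument.text',
  'application/x-latex',
  'application/x-ipynb+json',
  'image/jpeg',
  'image/png',
  'image/webp',
  'image/tiff',
  'image/gif',
  'image/avif',
  'image/bmp',
  'image/heic',
  'image/heif',
]);

function isSupportedMimeType(mime: string): boolean {
  // Drop parameters such as "; charset=utf-8" and normalize case.
  const base = (mime.split(';')[0] ?? '').trim().toLowerCase();
  return SUPPORTED_MIME_TYPES.has(base);
}
```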

Limits

Limit                                    Value
Max file size                            50 MB
Max pages (OCR)                          1000
Max pages (Document AI - document mode)  8
Max pages (Document AI - bbox mode)      1000
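The file-size and OCR page limits lend themselves to a simple pre-flight check. The sketch below returns a list of human-readable problems instead of throwing, so a caller can report all of them at once. checkOcrLimits is a hypothetical helper, and it assumes "50 MB" means 50 × 1024 × 1024 bytes, which this page does not specify.

```typescript
// Limits taken from the table above. Illustrative only.
const MAX_FILE_BYTES = 50 * 1024 * 1024; // Assumption: 50 MB interpreted as 50 MiB
const MAX_OCR_PAGES = 1000;

// Returns an empty array when the file is within both limits.
function checkOcrLimits(sizeBytes: number, pageCount: number): string[] {
  const problems: string[] = [];
  if (sizeBytes > MAX_FILE_BYTES) {
    problems.push(`File is ${sizeBytes} bytes; the limit is ${MAX_FILE_BYTES}.`);
  }
  if (pageCount > MAX_OCR_PAGES) {
    problems.push(`Document has ${pageCount} pages; the OCR limit is ${MAX_OCR_PAGES}.`);
  }
  return problems;
}
```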

Pricing

Service      Cost         Batch Discount
OCR 3        $0.002/page  50% off
Document AI  $0.002/page  50% off
At $0.002 per page ($2 per 1,000 pages), Mistral is one of the most cost-effective OCR options available.
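The pricing above reduces to simple arithmetic, sketched here for budgeting purposes. estimateCostUSD is a hypothetical helper; the rates are copied from this page, and the batch discount is assumed to apply uniformly to every page in the job.

```typescript
// Rates from the pricing table above. Illustrative only.
const USD_PER_1000_PAGES = 2; // $0.002 per page
const BATCH_DISCOUNT = 0.5;   // 50% off for batch processing

function estimateCostUSD(pageCount: number, batch: boolean = false): number {
  // Multiply before dividing to keep common page counts exact in floating point.
  const base = (pageCount * USD_PER_1000_PAGES) / 1000;
  return batch ? base * (1 - BATCH_DISCOUNT) : base;
}
```

For example, a 1,000-page job comes to $2 at the standard rate and $1 with the batch discount.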

Mistral vs Other Providers

Feature                Mistral       Reducto       Surya       Marker
Document parsing       Yes           Yes           Yes         Yes
Structured extraction  Yes (native)  Yes (native)  Via LLM     Via LLM
Handwriting            Excellent     Good          Good        Limited
Format support         Extensive     Good          PDF/Images  PDF/Images
Bounding boxes         Image-level   Yes           Yes         No
Cost/page              $0.002        $0.004+       $0.01       $0.002+
Page limit             1000          Unlimited     Unlimited   Unlimited

Choose Mistral when:
  • Processing documents with handwriting
  • You need native structured extraction without a separate LLM
  • Working with diverse document formats (DOCX, PPTX, EPUB, etc.)
  • Cost is a primary concern

Example: Parse and Extract Pipeline

For documents over 8 pages (the Document AI limit in document mode), parse with OCR first, then extract with a separate LLM:
import { createFlow, parse, extract } from '@doclo/flows';
import { mistralOCRProvider } from '@doclo/providers-mistral';
import { createVLMProvider } from '@doclo/providers-llm';

const ocrProvider = mistralOCRProvider({
  apiKey: process.env.MISTRAL_API_KEY!
});

const llmProvider = createVLMProvider({
  provider: 'google',
  model: 'google/gemini-2.5-flash',
  apiKey: process.env.OPENROUTER_API_KEY!,
  via: 'openrouter'
});

const flow = createFlow()
  .step('parse', parse({ provider: ocrProvider }))
  .step('extract', extract({
    provider: llmProvider,
    schema: contractSchema,  // Your JSON schema, defined like invoiceSchema above
    inputMode: 'ir'  // Use parsed DocumentIR
  }))
  .build();

Example: Direct Extraction (Short Documents)

For documents of 8 pages or fewer, extract directly:
import { createFlow, extract } from '@doclo/flows';
import { mistralVLMProvider } from '@doclo/providers-mistral';

const vlmProvider = mistralVLMProvider({
  apiKey: process.env.MISTRAL_API_KEY!
});

const flow = createFlow()
  .step('extract', extract({
    provider: vlmProvider,
    schema: invoiceSchema,
    inputMode: 'raw'
  }))
  .build();

// Single-step extraction
const result = await flow.run({
  url: 'https://example.com/invoice.pdf'
});
