Folio supports English (en) and French (fr) documents, with automatic language detection (auto) enabled by default. Use the language form field to influence OCR accuracy and extraction quality.

The language parameter

Value   Behaviour
auto    Folio detects the document language automatically. This is the default.
en      Force English OCR and extraction.
fr      Force French OCR and extraction.
Example request that forces French:
curl -X POST https://api.glialhealth.com/v1/documents \
  -H "Authorization: Bearer sk_test_..." \
  -F file=@ordonnance.pdf \
  -F language=fr
The detected (or forced) language is echoed back in the document object’s language field:
{
  "id": "doc_01j9xkqz...",
  "language": "fr",
  "status": "completed"
}

When to use auto vs. explicit

Use auto (default) when:
  • Your corpus mixes English and French documents and you don’t know each document’s language at submission time.
  • You want Folio to handle language detection without adding any logic on your side.
Use an explicit language (en or fr) when:
  • You know the document language in advance and want to avoid the small overhead of language detection.
  • OCR accuracy for a specific language is critical and you want to eliminate any ambiguity from the classifier.
  • Documents contain some text in another language but should be treated as being in one primary language (e.g. an English form with a few French words).
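The rule of thumb above can be sketched as a small helper that picks the value to send in the language form field. This is a minimal sketch and not part of Folio: the function name choose_language and the idea of passing in a language code from your own metadata are assumptions. Only the values auto, en, and fr come from this page.

```python
from typing import Optional

# Values accepted by the language form field, per this page.
SUPPORTED_LANGUAGES = {"en", "fr"}
DEFAULT_LANGUAGE = "auto"


def choose_language(known_language: Optional[str]) -> str:
    """Return the value to send in the `language` form field.

    Hypothetical helper: pass the language you already know for the
    document (for example from your own metadata), or None if unknown.
    """
    if known_language is None:
        return DEFAULT_LANGUAGE
    code = known_language.strip().lower()
    # Fall back to detection for anything this page does not list as supported.
    return code if code in SUPPORTED_LANGUAGES else DEFAULT_LANGUAGE
```

With this sketch, choose_language("FR") yields "fr", while choose_language(None) and choose_language("de") both fall back to "auto".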

Bilingual documents

Some documents (e.g. Canadian government forms, Quebec-regulated health records) contain content in both English and French. In these cases:
  • Use auto: Folio will detect the dominant language and apply the best OCR model for that language. Fields in the secondary language are still extracted with reasonable accuracy.
  • Extraction field values are returned in whichever language they appear in the source document; Folio does not translate values.
If bilingual accuracy is critical for your use case, test both auto and each explicit language against a representative sample of your documents and compare the per-field confidence scores in the result.
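The comparison described above could be scripted roughly as follows. This page does not define the shape of the result object, so the sketch assumes you have already reduced each result to a mapping of field name to confidence score. The function names and the sample numbers are illustrative only.

```python
from statistics import mean
from typing import Dict, List

# Assumed shape (not defined on this page): for each language setting, one
# dict per sample document mapping field name -> confidence score.
Runs = Dict[str, List[Dict[str, float]]]


def mean_confidence_by_field(docs: List[Dict[str, float]]) -> Dict[str, float]:
    """Average each field's confidence across the sample documents."""
    scores: Dict[str, List[float]] = {}
    for doc in docs:
        for field, confidence in doc.items():
            scores.setdefault(field, []).append(confidence)
    return {field: mean(values) for field, values in scores.items()}


def best_setting_per_field(runs: Runs) -> Dict[str, str]:
    """For each field, return the setting with the highest mean confidence."""
    averages = {s: mean_confidence_by_field(docs) for s, docs in runs.items()}
    fields = {f for per_field in averages.values() for f in per_field}
    best: Dict[str, str] = {}
    for field in sorted(fields):
        # Only compare settings that actually produced this field.
        candidates = {s: avg[field] for s, avg in averages.items() if field in avg}
        best[field] = max(candidates, key=candidates.get)
    return best


if __name__ == "__main__":
    # Made-up scores for illustration; substitute values from your own sample.
    sample_runs: Runs = {
        "auto": [{"patient_name": 0.90, "dosage": 0.70}],
        "en": [{"patient_name": 0.95, "dosage": 0.60}],
        "fr": [{"patient_name": 0.80, "dosage": 0.85}],
    }
    print(best_setting_per_field(sample_runs))
```

Comparing per field rather than per document keeps a setting that helps one critical field from being hidden by an unchanged overall average.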

Language and extraction schemas

Language does not affect how extraction schemas are defined or applied. Schema key names, type constraints, and pattern checks are language-agnostic. When using hint values in a schema, writing the hint in the same language as the target documents may improve extraction accuracy.