🔍 feat: Mistral OCR API / Upload Files as Text #6274

danny-avila · 2025-03-10T19:33:38Z

Summary

Closes #2755

Added an OCR capability check (using AgentCapabilities.ocr) during resource priming to ensure that OCR processing only occurs when enabled.
Integrated OCR configuration loading (via loadOCRConfig) in AppService and updated the endpoints configuration for agents.
Enhanced file processing by retrieving OCR file attachments in agent initialization using getFiles and refactored processAgentFileUpload to support OCR handling.
Implemented Mistral OCR functions (uploadDocumentToMistral, getSignedUrl, performOCR) with improved error handling and logging to provide clear diagnostic feedback.
Refactored initializeAgentOptions to incorporate OCR file attachments and replaced getConvoFiles with getToolFiles for streamlined file retrieval.
Updated frontend components by adding a hover card with contextual OCR information in FileContext and enhancing UI support for OCR file uploads.
Enhanced the ExtendedFile type, translations, and icon styling to support metadata and new OCR file context features.
Added a createAxiosInstance function that supports proxy configuration to standardize external HTTP requests.
Removed unused parameters (e.g., resendFiles) and cleaned up deprecated code for improved maintainability.
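
The capability gate and the OCR request described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from this PR: the helper names `isOCREnabled`, `buildOCRRequest`, and `pagesToText` are hypothetical, the capability is modeled as the plain string `'ocr'` rather than the real `AgentCapabilities.ocr` enum member, and the request and response shapes follow Mistral's public OCR API (a `document_url` document, an `include_image_base64` flag, and a `pages` array carrying `markdown`). The upload and signed-URL steps are omitted so the sketch stays free of network calls.

```typescript
// Hypothetical stand-in for AgentCapabilities.ocr (the real enum is not reproduced here).
const OCR_CAPABILITY = 'ocr';

interface OCRRequest {
  model: string;
  document: { type: 'document_url'; document_url: string };
  include_image_base64: boolean;
}

interface OCRPage {
  index: number;
  markdown: string;
}

interface OCRResponse {
  pages: OCRPage[];
}

/** Returns true only when the configured capabilities list includes OCR. */
function isOCREnabled(capabilities: readonly string[]): boolean {
  return capabilities.some((capability) => capability === OCR_CAPABILITY);
}

/** Builds the JSON body for the OCR call from a signed document URL. */
function buildOCRRequest(signedUrl: string, model = 'mistral-ocr-latest'): OCRRequest {
  return {
    model,
    document: { type: 'document_url', document_url: signedUrl },
    // Base64 images are left out, in line with the
    // "disable base64 image inclusion in OCR request" commit.
    include_image_base64: false,
  };
}

/** Joins per-page markdown, in page order, into one text block. */
function pagesToText(response: OCRResponse): string {
  return response.pages
    .slice() // copy so the caller's array is not reordered
    .sort((a, b) => a.index - b.index)
    .map((page) => page.markdown.trim())
    .filter((text) => text.length > 0)
    .join('\n\n');
}
```

A caller would check `isOCREnabled(...)` first and skip OCR entirely when it returns `false`, so disabled deployments never reach the external API.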

Other Changes

Fixes an issue with agents where "Resend Files" (resendFiles) was not being respected from the agent config.
Fixes an issue where all message attachments for the current chat were being included as part of the user's latest message.
Improved SourceIcon styling to better distinguish file sources in the agent form, file view, and file table.
Bumped the librechat-data-provider package and the custom config version.
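
The two attachment fixes above can be pictured with a small sketch. It assumes the intended behavior is that each message carries only its own attachments, and that attachments on earlier messages are sent again only when `resendFiles` is enabled in the agent config. The `ChatMessage` shape and the `selectAttachments` helper are hypothetical and do not appear in this PR.

```typescript
// Hypothetical message shape for illustration only.
interface ChatMessage {
  messageId: string;
  fileIds: string[];
}

/**
 * Returns, per message ID, the file IDs that should accompany that message.
 * - resendFiles = true: every message keeps its own attachments.
 * - resendFiles = false: only the latest message keeps its attachments.
 * Earlier attachments are never merged into the latest message.
 */
function selectAttachments(
  messages: readonly ChatMessage[],
  resendFiles: boolean,
): Record<string, string[]> {
  const result: Record<string, string[]> = {};
  const lastIndex = messages.length - 1;
  messages.forEach((message, index) => {
    const keep = resendFiles || index === lastIndex;
    result[message.messageId] = keep ? message.fileIds.slice() : [];
  });
  return result;
}
```

The bug described above corresponds to the latest message receiving the union of all `fileIds` in the chat; in this sketch that can never happen, because each entry is derived from a single message.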

Documentation:

📝 feat: File Context (OCR) or Upload Files as Text LibreChat-AI/librechat.ai#262

Checklist

My code adheres to this project's style guidelines
I have performed a self-review of my own code
I have commented in any complex areas of my code
I have made pertinent documentation changes
My changes do not introduce new warnings
I have written tests demonstrating that my changes are effective or that my feature works
Local unit tests pass with my changes
Any changes dependent on mine have been merged and published in downstream modules.
A pull request for updating the documentation has been submitted.

* refactor: move `loadAuthValues` to `~/services/Tools/credentials`
* feat: add createAxiosInstance function to configure axios with proxy support
* WIP: First pass mistral ocr
* refactor: replace getConvoFiles with getToolFiles for improved file retrieval logic
* refactor: improve document formatting in encodeAndFormat function
* refactor: remove unused resendFiles parameter from buildOptions function (this option comes from the agent config)
* fix: update getFiles call to include files with `text` property as well
* refactor: move file handling to `initializeAgentOptions`
* refactor: enhance addImageURLs method to handle OCR text and improve message formatting
* refactor: update message formatting to handle OCR text in various content types
* refactor: remove unused resendFiles property from compactAgentsSchema
* fix: add error handling for Mistral OCR document upload and logging
* refactor: integrate OCR capability into file upload options and configuration
* refactor: skip processing for text source files in delete request, as they are directly tied to database
* feat: add metadata field to ExtendedFile type and update PanelColumns and PanelTable components for localization and metadata handling
* fix: source icon styling
* wip: first pass, frontend file context agent resources
* refactor: add hover card with contextual information for File Context (OCR) in FileContext component
* feat: enhance file processing by integrating file retrieval for OCR resources in agent initialization
* feat: implement OCR config; fix: agent resource deletion for ocr files
* feat: enhance agent initialization by adding OCR capability check in resource priming
* ci: fix `~/config` module mock
* ci: add OCR property expectation in AppService tests
* refactor: simplify OCR config loading by removing environment variable extraction, to be done when OCR is actually performed
* ci: add unit test to ensure environment variable references are not parsed in OCR config
* refactor: disable base64 image inclusion in OCR request
* refactor: enhance OCR configuration handling by validating environment variables and providing defaults
* refactor: use file stream from disk for mistral ocr api

danny-avila added 23 commits March 9, 2025 18:24

refactor: move `loadAuthValues` to `~/services/Tools/credentials`

64ccecf

feat: add createAxiosInstance function to configure axios with proxy support

3aa6a7f

WIP: First pass mistral ocr

53e22d0

refactor: replace getConvoFiles with getToolFiles for improved file retrieval logic

9de32cd

refactor: improve document formatting in encodeAndFormat function

df594b1

refactor: remove unused resendFiles parameter from buildOptions function (this option comes from the agent config)

6fbf036

fix: update getFiles call to include files with `text` property as well

5d76cd6

refactor: move file handling to initializeAgentOptions

20364c9

refactor: enhance addImageURLs method to handle OCR text and improve message formatting

d65b8c4

refactor: update message formatting to handle OCR text in various content types

8462ba1

refactor: remove unused resendFiles property from compactAgentsSchema

712f5f3

fix: add error handling for Mistral OCR document upload and logging

bc379ba

refactor: integrate OCR capability into file upload options and configuration

6ffc791

refactor: skip processing for text source files in delete request, as they are directly tied to database

6525757

feat: add metadata field to ExtendedFile type and update PanelColumns and PanelTable components for localization and metadata handling

7c136d1

fix: source icon styling

43ec4d4

wip: first pass, frontend file context agent resources

01bf406

refactor: add hover card with contextual information for File Context (OCR) in FileContext component

d7d69fa

feat: enhance file processing by integrating file retrieval for OCR resources in agent initialization

81c650e

feat: implement OCR config; fix: agent resource deletion for ocr files

66ce0cb

feat: enhance agent initialization by adding OCR capability check in resource priming

999d20b

ci: fix `~/config` module mock

ccc0275

ci: add OCR property expectation in AppService tests

a72999a

danny-avila mentioned this pull request Mar 10, 2025

Enhancement: Upload Documents as input Context vs RAG Workflow #2755

Closed

1 task

danny-avila added 5 commits March 10, 2025 16:15

refactor: simplify OCR config loading by removing environment variable extraction, to be done when OCR is actually performed

86c7a56

ci: add unit test to ensure environment variable references are not parsed in OCR config

650122a

refactor: disable base64 image inclusion in OCR request

cc5f881

refactor: enhance OCR configuration handling by validating environmen­t variables and providing defaults

032eb2d

refactor: use file stream from disk for mistral ocr api

0e20232
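
The later commits in this group describe a two-step handling of OCR settings: the config loader leaves environment-variable references unparsed and fills in defaults, and the references are resolved only when OCR is actually performed. A minimal sketch of that idea follows. The names `loadOCRConfigSketch` and `resolveEnvRef`, the `${VAR}` reference syntax, the default values, and the choice to throw on a missing variable are all assumptions for illustration, not details confirmed by this PR.

```typescript
interface OCRConfig {
  apiKey: string;
  baseURL: string;
  mistralModel: string;
}

// Assumed defaults for illustration; the real defaults may differ.
const OCR_DEFAULTS: OCRConfig = {
  apiKey: '${OCR_API_KEY}',
  baseURL: '${OCR_BASEURL}',
  mistralModel: 'mistral-ocr-latest',
};

/** Load step: apply defaults but keep "${VAR}" references as literal strings. */
function loadOCRConfigSketch(custom: Partial<OCRConfig> = {}): OCRConfig {
  return {
    apiKey: custom.apiKey ?? OCR_DEFAULTS.apiKey,
    baseURL: custom.baseURL ?? OCR_DEFAULTS.baseURL,
    mistralModel: custom.mistralModel ?? OCR_DEFAULTS.mistralModel,
  };
}

/** Use step: resolve a whole-string "${VAR}" reference against the given environment. */
function resolveEnvRef(value: string, env: Record<string, string | undefined>): string {
  const trimmed = value.trim();
  if (!/^\$\{[A-Za-z_][A-Za-z0-9_]*\}$/.test(trimmed)) {
    return value; // a literal value, used as-is
  }
  const name = trimmed.slice(2, -1); // strip the leading "${" and trailing "}"
  const resolved = env[name];
  if (resolved === undefined || resolved === '') {
    throw new Error(`Environment variable ${name} is not set`);
  }
  return resolved;
}
```

Passing the environment in as a parameter (for example `process.env` in Node) keeps the loader free of side effects, which also makes the "references are not parsed at load time" property easy to unit test.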

danny-avila merged commit ded3cd8 into main Mar 10, 2025
7 checks passed

danny-avila deleted the feat/mistral-ocr branch March 10, 2025 21:23

danny-avila mentioned this pull request Mar 11, 2025

Enhancement: PDF to Image Transformation for Comprehensive Image Processing #4656

Closed

1 task

danny-avila mentioned this pull request Mar 19, 2025

🐞 fix: Agent "Resend" Message Attachments + Source Icon Styling #6408

Merged

4 tasks
