Title: | Experimental R interface to Mistral AI API |
---|---|
Description: | Currently implements only OCR capabilities for PDF conversion. |
Authors: | Sebastian Kranz |
Maintainer: | Sebastian Kranz <[email protected]> |
License: | GPL >= 2.0 |
Version: | 0.1.0 |
Built: | 2025-03-07 15:22:12 UTC |
Source: | https://github.com/skranz/rmistral |
This function retrieves the Mistral API key stored in the R options.
mistral_api_key()
The stored API key as a string.
Get a signed URL for file access
mistral_get_file_url(file_id, expiry = 24, api_key = mistral_api_key())
file_id | ID of the file to get URL for |
expiry | Expiry time in hours, default is 24 |
api_key | Your MistralAI API key |
The signed URL information as a list
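As a hedged illustration of how this function might be called, the sketch below wraps it in a hypothetical helper, `get_signed_url()`. It assumes that `file_id` comes from an earlier upload and that the returned list exposes the link in a field named `url`; the manual does not confirm that field name, so inspect the result with `str()` before relying on it.

```r
# Hypothetical helper (not part of rmistral): request a short-lived signed URL.
# Assumption: the returned list has a `url` field.
get_signed_url <- function(file_id, hours = 1) {
  stopifnot(is.character(file_id), length(file_id) == 1, hours > 0)
  info <- rmistral::mistral_get_file_url(file_id, expiry = hours)
  info$url
}
```

A shorter expiry than the 24-hour default limits how long the link stays usable, which may be preferable for one-off conversions.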
This function sends a document (either via URL or file upload) to the Mistral OCR API for text extraction.
mistral_ocr( url = NULL, file = NULL, model = "mistral-ocr-latest", include_images = TRUE, timeout_sec = 60 * 5, api_key = mistral_api_key() )
url | Optional. The URL of the document to be processed. |
file | Optional. A local file path or an uploaded file object. |
model | The OCR model to use. Defaults to '"mistral-ocr-latest"'. |
include_images | Logical. Whether to include images in the response. Defaults to 'TRUE'. |
timeout_sec | The timeout for the API request in seconds. Defaults to 300 (5 minutes). |
api_key | Character string. Your Mistral API key. Best set globally via calling |
A list containing the result. The field is_ok should be TRUE if the request succeeded. The element pages contains the extracted pages.
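A minimal sketch of a call on a local PDF, checking `is_ok` before using the result, could look as follows. The wrapper `run_ocr()` and the 120-second timeout are illustrative choices, not part of the package. Since the manual does not say whether `pages` is a list or a data frame, the page count uses `NROW()`, which handles both.

```r
# Hypothetical wrapper (not part of rmistral): OCR a local PDF and fail loudly.
run_ocr <- function(pdf_path) {
  stopifnot(file.exists(pdf_path))
  ocr <- rmistral::mistral_ocr(
    file = pdf_path,
    include_images = FALSE,  # skip images when only the text is needed
    timeout_sec = 120
  )
  if (!isTRUE(ocr$is_ok)) {
    stop("OCR request failed for ", pdf_path)
  }
  # NROW() counts list elements or data-frame rows, whichever `pages` is.
  message("Extracted ", NROW(ocr$pages), " page(s).")
  ocr
}
```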
This function extracts and saves images from OCR results to a specified directory.
mistral_ocr_save_images(ocr, img_dir, overwrite = FALSE)
ocr | A list containing OCR results, including extracted images. |
img_dir | Directory where images should be saved. |
overwrite | Logical. If 'TRUE', overwrites existing image files. Defaults to 'FALSE'. |
The number of images saved.
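The following sketch shows one way this function might be used on a result from `mistral_ocr()` that was requested with `include_images = TRUE`. The helper name `save_ocr_images()` and the default target directory are assumptions for illustration. The directory is created up front because the manual does not say whether the function creates it.

```r
# Hypothetical helper (not part of rmistral): write extracted images to disk.
save_ocr_images <- function(ocr, img_dir = file.path(tempdir(), "ocr_images")) {
  stopifnot(is.list(ocr))
  dir.create(img_dir, recursive = TRUE, showWarnings = FALSE)
  n <- rmistral::mistral_ocr_save_images(ocr, img_dir, overwrite = TRUE)
  message(n, " image(s) written to ", img_dir)
  invisible(n)
}
```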
This function saves the extracted OCR text in Markdown format, either as a single file or split by pages.
mistral_ocr_save_md( ocr, file, by_page = FALSE, save_images = TRUE, overwrite = FALSE, img_dir = dirname(file) )
ocr | A list containing OCR results, including extracted text and images. |
file | The file path where the Markdown content should be saved. |
by_page | Logical. If 'TRUE', saves each page as a separate file. Defaults to 'FALSE'. |
save_images | Logical. If 'TRUE', also saves extracted images. Defaults to 'TRUE'. |
overwrite | Logical. If 'TRUE', overwrites existing files. Defaults to 'FALSE'. |
img_dir | Directory where images should be saved. Defaults to the directory of 'file'. |
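A possible way to write the whole document to a single Markdown file is sketched below. The helper `ocr_to_markdown()` and the file name `document.md` are hypothetical. `img_dir` is left at its default so that images land next to the Markdown file; whether image links still resolve when `img_dir` points elsewhere is not stated in the manual.

```r
# Hypothetical helper (not part of rmistral): save OCR output as one Markdown file.
ocr_to_markdown <- function(ocr, out_dir, name = "document.md") {
  stopifnot(is.list(ocr), is.character(out_dir), length(out_dir) == 1)
  dir.create(out_dir, recursive = TRUE, showWarnings = FALSE)
  md_file <- file.path(out_dir, name)
  rmistral::mistral_ocr_save_md(
    ocr,
    file = md_file,
    by_page = FALSE,     # one file for all pages
    save_images = TRUE,  # images go to dirname(md_file) by default
    overwrite = TRUE
  )
  md_file
}
```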
Upload a file for OCR processing
mistral_upload_file(file_path, purpose = "ocr", api_key = mistral_api_key())
file_path | Path to the local file to upload |
purpose | Purpose of the upload, default is "ocr" |
api_key | Your MistralAI API key |
The file metadata as a list
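For illustration, an upload could be wrapped as sketched here. The helper `upload_for_ocr()` is hypothetical, and the `id` field on the returned metadata is an assumption that the manual does not confirm, so check the returned list with `str()` first.

```r
# Hypothetical helper (not part of rmistral): upload a file, return its ID.
# Assumption: the metadata list has an `id` field.
upload_for_ocr <- function(pdf_path) {
  stopifnot(file.exists(pdf_path))
  meta <- rmistral::mistral_upload_file(pdf_path, purpose = "ocr")
  meta$id
}
```

If that assumption holds, the returned ID is the kind of value `mistral_get_file_url()` expects as `file_id`.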
This function sets the Mistral API key, either from a provided string or by reading from a file.
set_mistral_api_key(key = NULL, file = NULL)
key | Optional. The API key as a string. |
file | Optional. A file path containing the API key. |
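One way to avoid hard-coding the key in scripts is to read it from an environment variable and pass it on, as in this sketch. The variable name `MISTRAL_API_KEY` and the helper `key_from_env()` are assumptions for illustration; the package itself does not prescribe them.

```r
# Hypothetical helper: return the key from an environment variable, or NULL.
key_from_env <- function(var = "MISTRAL_API_KEY") {
  key <- Sys.getenv(var, unset = "")
  if (nzchar(key)) key else NULL
}

key <- key_from_env()
# Only call into rmistral when a key was found and the package is installed.
if (!is.null(key) && requireNamespace("rmistral", quietly = TRUE)) {
  rmistral::set_mistral_api_key(key = key)
}
```

Alternatively, `set_mistral_api_key(file = ...)` can point at a file holding the key, which keeps the key out of both the script and the environment.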