Extract Handwritten & Typed Text with Jarvis OCR Agent
This video demos Jarvis's OCR agent, which extracts text from documents including handwritten content. To get started, select the OCR agent in Jarvis, drag a PDF with mixed typed and handwritten notes onto the interface, and choose upload as text. Within seconds the agent extracts all visible text from the document, handling both printed and handwritten content in the same pass.
Jarvis's human-in-the-loop design means recognition errors can be corrected conversationally. If the agent misreads a word, as shown when it initially misrecognizes a name, simply prompt it to fix the mistake. The agent reprocesses the input and immediately updates the result, keeping the extraction accurate without manual editing of the raw output.
Extracted text can also be routed to a downstream application directly from Jarvis. Under the mixture of agents setting, select a downstream app agent and click regenerate to send the results to a REST API, database, Google Drive, Amazon S3, or any other external system the workflow requires. Jarvis confirms when the result has been posted to the endpoint.
The video introduces Jarvis's OCR agent, designed to extract text from documents including handwritten content. Users select the OCR agent, drag a PDF with mixed typed and handwritten notes onto the interface, and choose upload as text to begin.
Once asked to analyze the uploaded file, the OCR agent extracts all visible text within seconds, processing both typed and handwritten content from the same PDF in a single pass.
When the agent misrecognizes a word, users simply prompt it to fix the error. The human-in-the-loop design allows the agent to reprocess the input and update the result immediately, without requiring any manual editing of the extracted output.
Extracted text can be sent to a downstream application by opening the right panel, navigating to advanced settings, selecting a downstream app agent under mixture of agents, and clicking regenerate. Jarvis confirms when the result has been posted to the endpoint.
The downstream integration supports REST APIs, databases, Google Drive, Amazon S3, or any external system the workflow requires. This makes the OCR agent well-suited for document processing pipelines that need to feed structured text into other business systems.
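The video does not show the payload Jarvis sends to a downstream endpoint, so the following is only a rough sketch of what posting extracted text to a REST API could look like on the wire. The endpoint URL, the field names (document, text), and the helper build_post_request are all hypothetical and are not part of Jarvis. The request is built but not sent, so the sketch runs without network access.

```python
import json
import urllib.request


def build_post_request(endpoint_url: str, document_name: str,
                       extracted_text: str) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request carrying OCR output.

    The payload shape is an assumption for illustration only; a real
    downstream system defines its own schema.
    """
    payload = {
        "document": document_name,  # hypothetical field name
        "text": extracted_text,     # hypothetical field name
    }
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        endpoint_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_post_request(
    "https://example.com/api/documents",  # placeholder endpoint, not a real service
    "notes.pdf",
    "Meeting notes: follow up with Dana on Friday.",
)

# To actually send it, one could call:
#   urllib.request.urlopen(request, timeout=10)
```

A receiving service would parse the JSON body and store or forward the text field. In Jarvis itself this step is handled by the downstream app agent, so no code is needed on the sending side.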


