Read Content Tool
Use the Read Content tool to enable AI-powered workflows to read the full text content of items in your VIDIZMO library. When you connect this tool to an AI node, the AI model can dynamically retrieve PDF document text, video and audio transcripts, and metadata based on user requests and workflow context.
In this article, you learn how to add, configure, and test the Read Content tool in your workflow.
Concept
The Read Content tool is a VIDIZMO-specific tool node that provides AI models with access to the text content of items in your library. For documents (PDF), it extracts and returns the full text content page by page. For videos and audio, it returns the timestamped transcript. For other formats, it returns available metadata. While the Search Mashup tool finds content through search queries, the Read Content tool retrieves the actual text for specific content items using UUIDs returned by Search Mashup. Unlike regular workflow nodes that execute in sequence, this tool is invoked by an AI node only when the model determines that detailed content retrieval is needed.
Key capabilities:
- Multi-format content reading - Extract PDF text page by page, retrieve timestamped video and audio transcripts, and return metadata for other formats
- UUID-based lookup - Retrieve content using UUIDs from Search Mashup results
- Pagination support - Control the portion of content retrieved using offset and limit parameters, with units based on content type (pages for PDFs, entries for transcripts)
- Regex filtering - Search within content using regex patterns to find specific information without reading everything
- Oversized content handling - Automatically extract relevant information from large content using a configurable extraction threshold and internal LLM call
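The per-format behavior described above can be pictured with a short Python sketch. Everything in it is hypothetical: the `LibraryItem` class, its fields, and the `read_units` function are illustrative names, not part of the VIDIZMO API. It only shows the idea that the unit of content returned depends on the item's type.

```python
from dataclasses import dataclass, field


@dataclass
class LibraryItem:
    """Hypothetical stand-in for a library item; not a VIDIZMO type."""
    uuid: str
    kind: str  # e.g. "pdf", "video", "audio", or any other format
    pages: list[str] = field(default_factory=list)
    transcript: list[tuple[str, str]] = field(default_factory=list)  # (timestamp, text)
    metadata: dict[str, str] = field(default_factory=dict)


def read_units(item: LibraryItem) -> list[str]:
    """Return the readable units for an item, chosen by content type."""
    if item.kind == "pdf":
        # Documents: one unit per page.
        return list(item.pages)
    if item.kind in ("video", "audio"):
        # Media: one unit per timestamped transcript entry.
        return [f"[{ts}] {text}" for ts, text in item.transcript]
    # Any other format: fall back to the available metadata.
    return [f"{key}: {value}" for key, value in sorted(item.metadata.items())]
```

In this sketch, the list returned by `read_units` is also what pagination would count: pages for a PDF, entries for a transcript.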
Understand How The Tool Works
This section explains how the Read Content tool operates within a workflow and how it connects to other nodes.
Tool Connectors
In the Workflow Designer, nodes use colored connectors to indicate the type of connection and data flow. The Read Content tool uses the green connector, which is specific to tool nodes.
Execution Flow
When a workflow runs, the Read Content tool operates in this sequence:
- The AI node receives user input from the chatbot.
- The AI model analyzes the input and determines whether content reading is needed.
- If content reading is needed, the AI invokes the Read Content tool through the green connector, passing a UUID from previous Search Mashup results along with optional offset, limit, and pattern parameters.
- The tool sends a request to the VIDIZMO API using the specified UUID and the user's authentication context.
- The VIDIZMO API returns the content text (PDF text page by page, timestamped transcript, or metadata).
- If the content exceeds the extraction threshold, the tool uses an internal LLM call to extract relevant information from the oversized content.
- The content is stored in the field specified in Output Field.
- The AI node accesses the content text and generates a response for the user.
┌─────────────┐      ┌─────────────┐      ┌──────────────────┐      ┌──────────────┐
│ User Query  │ ───► │   AI Node   │ ───► │   Read Content   │ ───► │ VIDIZMO API  │
└─────────────┘      └─────────────┘      │ (Green Connector)│      └──────────────┘
                            ▲             └──────────────────┘             │
                            │                                              │
                            └──── state.data.content_details ◄─────────────┘
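The on-demand nature of this flow can be sketched in Python. This is a simplified illustration under stated assumptions, not how the Workflow Designer is implemented: `run_turn` and the three `fake_*` stubs are hypothetical names, the oversized-content extraction step is omitted, and `content_details` is just the example Output Field name used in this article.

```python
from typing import Callable, Optional


def run_turn(
    user_input: str,
    decide_tool_call: Callable[[str], Optional[dict]],
    read_content: Callable[..., str],
    generate: Callable[[str, Optional[str]], str],
    state: dict,
    output_field: str = "content_details",
) -> str:
    """One chat turn: the tool runs only if the model asks for it."""
    call = decide_tool_call(user_input)  # the model decides whether to read
    if call is not None:
        # The tool result is stored in the configured Output Field.
        state["data"][output_field] = read_content(**call)
    return generate(user_input, state["data"].get(output_field))


# Stubs standing in for the AI model and the API (illustration only).
def fake_decide(text: str) -> Optional[dict]:
    if "document" in text:
        return {"uuid": "example-uuid", "offset": 0, "limit": 2}
    return None


def fake_read(uuid: str, offset: int = 0, limit: Optional[int] = None,
              pattern: Optional[str] = None) -> str:
    return f"text of {uuid} (offset={offset}, limit={limit})"


def fake_generate(text: str, content: Optional[str]) -> str:
    if content:
        return f"Answer based on: {content}"
    return "Answer without reading content"
```

Calling `run_turn("Hello", fake_decide, fake_read, fake_generate, {"data": {}})` never touches `fake_read`, which mirrors the point that the tool is not a sequential step in the workflow.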
When To Use This Tool
Use the Read Content tool when your workflow requires:
- Reading the text content of PDFs, video transcripts, or audio transcripts from your library
- Retrieving content text for items identified through Search Mashup results
- Providing the AI with full content text before performing operations such as summarization, translation, or chapter generation
- Extracting specific sections from large documents using regex patterns
- Paginated reading of large documents or transcripts
Add The Read Content Tool To Your Workflow
Follow these steps to add the Read Content tool to your workflow canvas.
- Go to Portal Settings > Chatbot > Workflow.
- Select an existing workflow or create a new workflow.
- In the Node Library, expand the Tools category.
- Drag Read Content Tool onto the canvas.
Connect The Tool To An AI Node
After you add the Read Content tool to the canvas, connect it to an AI node.
- Locate your AI node (such as an LLM node) on the canvas.
- Drag a connection line from the green connector on the AI node to the input connector on the Read Content tool node.
- Release to create the connection. A green connector (●) indicates a successful tool connection.
NOTE: The green connector indicates that the tool is available to the AI node for on-demand invocation. The tool doesn't execute in sequence with other nodes; it executes only when the AI model decides to invoke it.
Configure The Read Content Tool
Select the Read Content node to open the Node Configuration Panel. You can configure the following options:
Description
Instructions for the LLM on how and when to use this tool. The default description provides guidance including:
- When to invoke content reading (user asks about a specific item's content, requests transcripts, or needs document text)
- How to identify content from Search Mashup results using UUIDs
- What each content type returns (PDF text page by page, timestamped transcript, or metadata)
- The requirement to call Search Mashup first to obtain UUIDs
The AI uses this description to determine when content reading is appropriate. For example, when a user asks "What does the onboarding document say about benefits?", the AI reads the description to understand it should invoke the Read Content tool with the content's UUID and an appropriate pattern.
TIP: Keep the default description unless you need to customize the AI's content retrieval behavior for specific use cases.
Content Parameters
You can configure the following options:
- Offset: Skip the first N units of content. Units are based on content type: pages for PDF documents, entries for transcripts. Zero-indexed. Use together with Limit for pagination through large content. Enter a fixed value such as 0 to start from the beginning, or use ${state.data.read_offset} for dynamic values.
- Limit: Return at most N units after the offset. Useful for large documents where you want to read a section at a time rather than loading the full text. Enter a fixed value such as 10, or use ${state.data.read_limit} for dynamic values.
- Pattern: A regex pattern to search within the content. Returns matching lines with surrounding context. Use this to find specific information in large documents without reading everything. For example, enter patient.*name to find lines containing "patient" followed by "name". Enter a fixed pattern or use ${state.data.search_pattern} for dynamic values.
All content parameters support Fixed (static value) and Expression (dynamic value using ${variable} syntax) input modes.
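As a rough mental model, Offset and Limit behave like a zero-indexed slice over the content's units. The Python sketch below assumes exactly that; the `paginate` function is a hypothetical name, and treating an empty Limit as "everything after the offset" is an assumption rather than documented behavior.

```python
from typing import Optional


def paginate(units: list[str], offset: int = 0, limit: Optional[int] = None) -> list[str]:
    """Skip the first `offset` units, then return at most `limit` units."""
    if offset < 0:
        raise ValueError("offset must be zero or greater")
    if limit is not None and limit < 0:
        raise ValueError("limit must be zero or greater")
    # Assumption: no limit means "all remaining units".
    end = None if limit is None else offset + limit
    return units[offset:end]


pages = ["page 1", "page 2", "page 3", "page 4", "page 5"]
first_batch = paginate(pages, offset=0, limit=2)   # pages 1 and 2
second_batch = paginate(pages, offset=2, limit=2)  # pages 3 and 4
```

Reading a long document section by section then amounts to repeating the call with Offset increased by Limit each time.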
NOTE: The Read Content tool requires content UUIDs, which are returned by the Search Mashup tool. Use the Search Mashup tool first to find content, then pass the UUID to the Read Content tool. The AI determines the appropriate arguments (uuid, offset, limit, pattern) from the conversation context.
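To try out a Pattern value before using it, a small Python sketch of "matching lines with surrounding context" can help. The article does not state the regex dialect, case sensitivity, or how many context lines are returned, so this sketch assumes Python's `re` module, case-sensitive matching, and one line of context on each side; `grep_with_context` is a hypothetical helper.

```python
import re


def grep_with_context(lines: list[str], pattern: str, context: int = 1) -> list[str]:
    """Return lines matching `pattern`, plus `context` lines before and after each match."""
    regex = re.compile(pattern)
    keep: set[int] = set()
    for index, line in enumerate(lines):
        if regex.search(line):
            start = max(0, index - context)
            stop = min(len(lines), index + context + 1)
            keep.update(range(start, stop))
    # Emit each kept line once, in document order.
    return [lines[i] for i in sorted(keep)]


sample = [
    "Admission form",
    "patient full name: Jane Doe",
    "date of birth: 1990-01-01",
    "insurance: none",
    "end of form",
]
matches = grep_with_context(sample, r"patient.*name")
```

Here `matches` holds the first three lines of `sample`: the matching line and its two neighbors.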
Extraction Settings
- Extraction Threshold: The fraction of the context window that triggers chunked extraction. When the retrieved content size exceeds this fraction of the model's context window, the tool automatically splits the content into chunks and uses an internal LLM call to extract relevant information. Lower values trigger extraction more aggressively, while higher values allow more content to pass through without extraction. Default is 0.9.
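The threshold check is simple arithmetic, shown here as a Python sketch. It assumes content size is measured in tokens and that chunks are fixed-size slices; neither detail is stated in this article, and `needs_extraction` and `split_into_chunks` are hypothetical names.

```python
def needs_extraction(content_tokens: int, context_window: int, threshold: float = 0.9) -> bool:
    """True when the content exceeds `threshold` times the context window."""
    # Assumption: the threshold is a fraction in the range (0, 1].
    if not 0 < threshold <= 1:
        raise ValueError("threshold must be greater than 0 and at most 1")
    return content_tokens > threshold * context_window


def split_into_chunks(tokens: list[str], chunk_size: int) -> list[list[str]]:
    """Split a token list into consecutive chunks of at most `chunk_size` tokens."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [tokens[i:i + chunk_size] for i in range(0, len(tokens), chunk_size)]
```

For example, with a 128,000-token context window and the default of 0.9, the cutoff is 115,200 tokens: a 120,000-token transcript would go through chunked extraction, while a 100,000-token one would pass through unchanged. Lowering the threshold to 0.5 moves the cutoff to 64,000 tokens, so the same 100,000-token transcript would then be extracted.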
Model Settings
These optional settings let you override the default AI model used when content exceeds the extraction threshold and requires chunked extraction.
- System Prompt: Instructions that guide how the AI extracts relevant information from oversized content. The default prompt instructs the model to extract all information relevant to the user's query, including specific details, data points, quotes, and context. If nothing relevant is found, the model responds with "No relevant information in this section." Customize this when you need extraction focused on specific types of information.
- Model Provider: The AI provider to use for oversized content extraction. When left empty, the tool uses the default model configured for the workflow. Specify a provider when you need a particular model for content extraction quality or cost reasons.
- Model ID: The specific model identifier from the selected provider. Required if Model Provider is set. When left empty, the tool uses the provider's default model.
- Temperature: Controls the randomness of the model's output when extracting from oversized content. Lower values produce more deterministic and faithful extraction. Higher values produce more varied output. Default is 0.3.
- Max Token Limit: The maximum context window size in tokens. Overrides the default model configuration if set. Use this to control token consumption when processing large content.
- Reasoning: Enable reasoning or thinking mode for the internal content extraction LLM. When enabled, the model uses extended reasoning to produce more accurate extraction from complex content. Supports Fixed and Expression input modes.
Output Settings
- Output Field: Variable name where the read content result is stored for use by other nodes. Click to select an existing variable or create a new one. The AI node and subsequent workflow nodes access results using ${state.data.<variable_name>}. For example, if you name it content_details, access it as ${state.data.content_details}.
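The ${...} expression syntax can be thought of as a dotted lookup into the workflow state. The Python sketch below illustrates that idea only; `resolve` is a hypothetical function, and the real expression engine may support more syntax than plain dotted paths.

```python
import re

# Matches ${dotted.path} placeholders such as ${state.data.content_details}.
_EXPRESSION = re.compile(r"\$\{([A-Za-z_][\w.]*)\}")


def resolve(template: str, scope: dict) -> str:
    """Replace each ${dotted.path} in `template` with the value found in `scope`."""
    def lookup(match: re.Match) -> str:
        value = scope
        for key in match.group(1).split("."):
            value = value[key]  # raises KeyError if the variable does not exist
        return str(value)

    return _EXPRESSION.sub(lookup, template)


scope = {"state": {"data": {"content_details": "Page 1 text", "read_offset": 20}}}
prompt = resolve("Summarize: ${state.data.content_details}", scope)
```

With this scope, `prompt` becomes "Summarize: Page 1 text", and `${state.data.read_offset}` would resolve to "20" in the same way.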
Test The Configuration
After you configure the Read Content tool, test the workflow to verify correct behavior.
- Go to Portal Settings > Chatbot > Agents.
- Select the agent associated with your workflow, or create a new agent and assign your workflow.
- Open the chatbot interface in the portal.
- Enter a query that should trigger content retrieval. For example:
  - "What does the compliance document say about data retention?"
  - "Show me the transcript of the onboarding training video"
  - "Read the quarterly report and find the revenue figures"
- Verify that the agent returns accurate and complete content from the requested item.
- Check that the response includes the expected content type (PDF text, transcript, or metadata).
- For large documents, verify that pagination works correctly by asking follow-up questions about different sections.
- If results don't match expectations, return to the Workflow Designer and adjust your configuration.
Best Practices
- Always use the Search Mashup tool before the Read Content tool. The Read Content tool requires content UUIDs, which are returned by Search Mashup results.
- Pair the Read Content tool with the Summarize tool to retrieve content text and generate a summary in a single conversation turn.
- Use the Offset and Limit parameters for large documents to retrieve content in manageable portions rather than loading the full text at once.
- Use the Pattern parameter to extract specific sections from content when you need only a particular topic or keyword context, which avoids processing unnecessary content.
- Adjust the Extraction Threshold based on your content size. Lower values extract more aggressively for very large documents, while higher values preserve more original content for smaller items.
- The tool respects the user's content permissions. If a user doesn't have access to specific content, the tool returns an appropriate error.
Related Articles
- Tools in VIDIZMO Intelligence Hub
- Search Mashup tool
- Summarize tool
- Chapters tool
- Workflow Designer
- Variables
- Nodes