📄️ Concatenate
The Concatenate Node joins strings or arrays from multiple workflow variables into a single output with deduplication, formatting, and cleaning options. It supports both string and array output formats, handling mixed input types including strings, arrays, and objects. This is particularly useful for merging results from parallel workflow branches or combining data from different sources.
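A minimal sketch of the flatten, clean, deduplicate, and format behavior described above; the option names (`output_format`, `deduplicate`, `separator`) are assumptions for illustration, not the node's actual configuration keys:

```python
from typing import Any, Iterable, List, Union

def concatenate(inputs: Iterable[Any],
                output_format: str = "string",   # "string" or "array" (assumed names)
                deduplicate: bool = True,
                separator: str = "\n") -> Union[str, List[Any]]:
    """Flatten mixed string/array inputs, clean, dedupe, then format."""
    items: List[Any] = []
    for value in inputs:
        if isinstance(value, list):
            items.extend(value)          # arrays are flattened
        else:
            items.append(value)          # strings/objects kept as single items
    # Cleaning: strip whitespace from strings and drop empties.
    items = [v.strip() if isinstance(v, str) else v for v in items]
    items = [v for v in items if v != ""]
    if deduplicate:
        seen, unique = set(), []
        for v in items:
            key = repr(v)                # repr() keeps dicts/objects hashable
            if key not in seen:
                seen.add(key)
                unique.append(v)
        items = unique
    return items if output_format == "array" else separator.join(map(str, items))

print(concatenate(["a", ["b", "a"], "c "]))  # "a\nb\nc" -- duplicate "a" dropped
```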
📄️ Data Extractor
The Data Extractor Node extracts specific fields from structured data objects and converts them into text or a structured format for downstream processing. It supports dot notation for accessing nested fields and provides extensive formatting control through separators, metadata inclusion, and record identifiers. This selective extraction reduces data volume and focuses processing on relevant information.
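An illustrative dot-notation lookup like the one the node performs; the helper name and the convention that numeric path parts index into lists are assumptions:

```python
from typing import Any

def get_path(obj: Any, path: str, default: Any = None) -> Any:
    """Resolve a path like 'a.b.c' against nested dicts and lists."""
    current = obj
    for part in path.split("."):
        if isinstance(current, dict):
            current = current.get(part, default)
        elif isinstance(current, list) and part.isdigit():
            idx = int(part)
            current = current[idx] if idx < len(current) else default
        else:
            return default
    return current

record = {"user": {"name": "Ada", "tags": ["admin", "ops"]}}
print(get_path(record, "user.name"))    # Ada
print(get_path(record, "user.tags.1"))  # ops
```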
📄️ Data Operations
The Data Operations Node performs multiple data manipulation operations sequentially on workflow variables to transform, filter, and clean data. It supports operations including Set, Filter, Rename Keys, Remove, Clear, and Drop Duplicates. The sequential execution model allows chaining transformations within a single node, reducing workflow complexity.
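A rough model of the sequential execution described above; the operation names mirror the node's list, but the dict-based configuration format is an assumption for illustration:

```python
import json

def apply_operations(records, operations):
    """Apply each operation in order, feeding its output into the next."""
    for op in operations:
        kind = op["type"]
        if kind == "set":
            records = [{**r, op["key"]: op["value"]} for r in records]
        elif kind == "filter":
            records = [r for r in records if op["predicate"](r)]
        elif kind == "rename_keys":
            records = [{op["mapping"].get(k, k): v for k, v in r.items()}
                       for r in records]
        elif kind == "remove":
            records = [{k: v for k, v in r.items() if k != op["key"]}
                       for r in records]
        elif kind == "drop_duplicates":
            seen, out = set(), []
            for r in records:
                key = json.dumps(r, sort_keys=True)  # stable identity for dicts
                if key not in seen:
                    seen.add(key)
                    out.append(r)
            records = out
    return records

data = [{"a": 1}, {"a": 1}, {"a": 2}]
ops = [{"type": "drop_duplicates"},
       {"type": "set", "key": "ok", "value": True},
       {"type": "filter", "predicate": lambda r: r["a"] > 1}]
print(apply_operations(data, ops))  # [{'a': 2, 'ok': True}]
```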
📄️ Parallel Processor
The Parallel Processor Node processes collections of items concurrently using a specified processor node with automatic load balancing and result aggregation. It distributes workload across multiple workers for improved throughput while maintaining result ordering. Processing failures are handled gracefully with configurable error handling modes.
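A sketch of ordered concurrent processing with a per-item error mode, assuming a thread pool and a hypothetical `skip_failures` option; `executor.map` preserves input order, matching the node's ordering guarantee:

```python
from concurrent.futures import ThreadPoolExecutor

def process_parallel(items, processor, max_workers=4, skip_failures=True):
    """Run processor over items concurrently, keeping results in input order."""
    def safe(item):
        try:
            return processor(item)
        except Exception as exc:
            if skip_failures:
                return {"error": str(exc), "item": item}  # graceful failure
            raise
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(safe, items))  # map() yields results in order

print(process_parallel([1, 2, 3], lambda x: x * x))  # [1, 4, 9]
```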
📄️ Python Code
The Python Code Node executes custom Python code in isolated Docker containers with secure resource management and state variable integration. It supports both shared containers for performance and ephemeral containers for maximum isolation, with automatic idle timeout monitoring. The underscore notation system provides seamless integration with workflow state for reading and writing variables.
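A schematic of how underscore-named variables could be bridged between user code and workflow state; the mechanism and names below are assumptions for illustration, not the node's real API (the real node also executes inside a Docker container rather than via `exec`):

```python
def run_user_code(code: str, state: dict) -> dict:
    """Inject state as underscore variables, run code, collect them back."""
    scope = {f"_{k}": v for k, v in state.items()}   # read: state -> _vars
    exec(code, scope)
    return {k[1:]: v for k, v in scope.items()        # write: _vars -> state
            if k.startswith("_") and not k.startswith("__")}

new_state = run_user_code("_count = len(_items)", {"items": [1, 2, 3]})
print(new_state)  # {'items': [1, 2, 3], 'count': 3}
```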
📄️ Split in Batches
The Split in Batches Node divides arrays into fixed-size batches with optional range extraction and output limiting for controlled data processing. It supports flexible indexing with negative indices and dynamic batch sizing through variable interpolation. The consistent array-of-arrays output format simplifies downstream processing.
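A minimal batching sketch; the parameter names (`batch_size`, `start`, `end`, `limit`) are illustrative, and negative indices behave as in ordinary Python slicing:

```python
def split_in_batches(items, batch_size, start=None, end=None, limit=None):
    """Slice the input range, then emit an array of fixed-size batches."""
    selected = items[start:end]          # negative indices work as in Python
    batches = [selected[i:i + batch_size]
               for i in range(0, len(selected), batch_size)]
    return batches[:limit] if limit else batches

print(split_in_batches(list(range(7)), 3))           # [[0, 1, 2], [3, 4, 5], [6]]
print(split_in_batches(list(range(7)), 3, end=-1))   # [[0, 1, 2], [3, 4, 5]]
```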
📄️ Text Splitter
The Text Splitter Node divides large text documents into smaller, manageable chunks using intelligent strategies with token-accurate or character-based length measurement. It supports recursive (semantic) and fixed (simple) splitting strategies, automatically calculating optimal chunk sizes from model context windows. Cross-chunking mode combines small objects like subtitle segments into larger chunks while preserving metadata ranges.
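A sketch of deriving a chunk size from a model context window and performing a simple recursive split; the reserve ratio, characters-per-token estimate, and separator ladder are assumptions, not the node's actual defaults:

```python
def chunk_size_from_context(context_tokens, reserve_ratio=0.5, chars_per_token=4):
    """Leave room for prompt and response, then convert tokens to characters."""
    return int(context_tokens * reserve_ratio * chars_per_token)

def recursive_split(text, max_len, separators=("\n\n", "\n", ". ", " ")):
    """Split on the coarsest separator that works, recursing on oversized pieces."""
    if len(text) <= max_len:
        return [text]
    for sep in separators:
        parts = text.split(sep)
        if len(parts) > 1:
            chunks, current = [], ""
            for part in parts:
                candidate = current + sep + part if current else part
                if len(candidate) > max_len and current:
                    chunks.append(current)
                    current = part
                else:
                    current = candidate
            chunks.append(current)
            # Re-split any piece still over the limit using finer separators.
            return [c for chunk in chunks
                    for c in recursive_split(chunk, max_len, separators)]
    # No separator applies: fall back to a hard character cut.
    return [text[i:i + max_len] for i in range(0, len(text), max_len)]

print(chunk_size_from_context(8192))                # 16384 characters
print(recursive_split("one two three four", 8))     # ['one two', 'three', 'four']
```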