Version: V12

Configuring VIDIZMO Indexer for PII Detection and Redaction

The VIDIZMO Indexer uses AI models to provide detection and redaction capabilities. One key capability is detecting and redacting Personally Identifiable Information (PII) in transcribed audio and video files.

The VIDIZMO Indexer also supports visual PII detection through Optical Character Recognition (OCR), allowing you to detect PII in documents, images, and videos that contain text. This is especially useful for videos that contain on-screen text but no audio for transcription.

To learn more about this functionality, see Understanding PII Detection and Redaction using VIDIZMO Indexer.

If you are performing PII detection via transcription, your audio or video files must be transcribed. The VIDIZMO Indexer automatically generates transcriptions for your content when processing for PII detection. You can also add transcriptions manually by uploading a closed caption file. See How to Add Closed Captions.

Prerequisites

  • You belong to a group where the App Management feature is enabled.
  • You belong to a group where the PII Detection and Redaction feature is enabled.

To open the VIDIZMO Indexer settings:

  1. In VIDIZMO, select the menu icon in the top-left corner to open the navigation pane.
  2. Expand the Admin section and select Portal Settings.
  3. Go to Apps > Content Processing.
  4. Select the settings icon on the VIDIZMO Indexer app.

Configure PII Detection and Redaction

  1. Media Formats: Select the file formats in which you want to detect PII. In a DEMS Portal, this field appears as Evidence Formats.

  2. Insights: Select the PII entities you want to detect in your content. For the full list, see PII Entities. You can also create your own PII entities and include them in detection by selecting Custom PII. See How to Create Custom Patterns for details.

NOTE: You can also add other AI insights (such as Chaptering) to generate them alongside PII detection or redaction.

PII Detection Settings

  1. Confidence Score: Set the minimum confidence score the model requires to classify a detected term as a PII entity. A term is classified as PII only if its confidence score exceeds this value. You can set the score from 10 to 100. Recommended: 35.

  2. Excluded Words: Provide a list of words that must not be detected or marked as PII, even if they match a defined PII entity. For example, adding "John" to Excluded Words prevents the application from identifying it as PII. This field is case-sensitive, so adding "John" does not exclude "john".

  3. Context Keywords: Provide context words that the model analyzes to improve the confidence score for custom PII entities.
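
The interaction between Confidence Score and Excluded Words can be sketched in a few lines of Python. This is only an illustration of the rules described above, not VIDIZMO's implementation: the `Detection` class, the `filter_detections` function, and the entity labels are hypothetical names introduced for this example. Context Keywords are not modeled, because the page does not say how they change the score.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    """Hypothetical record for one detected term (not a VIDIZMO type)."""
    text: str          # the matched term
    entity: str        # PII entity label, e.g. "PERSON" (illustrative)
    confidence: float  # model confidence on a 0-100 scale


def filter_detections(detections, min_confidence=35, excluded_words=()):
    """Keep detections whose confidence exceeds the minimum and whose
    text is not in the excluded-word list (case-sensitive comparison)."""
    excluded = set(excluded_words)
    return [
        d for d in detections
        if d.confidence > min_confidence and d.text not in excluded
    ]


detections = [
    Detection("John", "PERSON", 80),           # dropped: listed in Excluded Words
    Detection("john", "PERSON", 80),           # kept: exclusion is case-sensitive
    Detection("Seattle", "LOCATION", 30),      # dropped: below the minimum
    Detection("555-0100", "PHONE_NUMBER", 35), # dropped: equal to, not above, 35
]

kept = filter_detections(detections, min_confidence=35, excluded_words=["John"])
print([d.text for d in kept])  # ['john']
```

The strict `>` comparison mirrors the wording "exceeds this value"; a detection scoring exactly the configured minimum is not classified as PII in this sketch.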

Automatic Object Redaction Settings

  1. Redaction Types: Select the PII entities you want to redact from your audio, video, or documents.

  2. Confidence Threshold for Redaction: Set the minimum confidence level (10 to 99) for automatic redaction. Only terms with a confidence score above this value are automatically redacted.

  3. Audio Redaction Type: Select how detected PII is handled in audio. The selected type determines what replaces the PII in the audio track:

    • Bleep — Replaces the PII with a short bleep sound. Available only for .wav files.
    • Mute — Silences the section containing the PII. Used by default for all other audio file formats.

NOTE: You can change these redaction settings later in the Process Modal or Studio Space when performing automatic PII redaction.
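
As a rough illustration of the automatic redaction settings above, the following Python sketch shows how the redaction threshold and the audio redaction type might combine. The function names and entity labels are hypothetical, and treating the `.wav` extension case-insensitively is an assumption of this example; the actual product logic may differ.

```python
from pathlib import PurePath


def should_auto_redact(entity, confidence, redaction_types, threshold):
    """True when the entity is selected under Redaction Types and its
    confidence is strictly above the redaction threshold."""
    return entity in redaction_types and confidence > threshold


def audio_redaction_type(filename, requested="Bleep"):
    """Resolve the audio redaction type for a file.

    Bleep is honored only for .wav files; every other format falls back
    to Mute. Case-insensitive extension matching is an assumption.
    """
    is_wav = PurePath(filename).suffix.lower() == ".wav"
    return "Bleep" if requested == "Bleep" and is_wav else "Mute"


selected = {"PERSON", "PHONE_NUMBER"}  # hypothetical entity labels

print(should_auto_redact("PERSON", 90, selected, threshold=80))  # True
print(should_auto_redact("PERSON", 80, selected, threshold=80))  # False
print(audio_redaction_type("interview.wav"))                     # Bleep
print(audio_redaction_type("interview.mp3"))                     # Mute
```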

Advanced Processing

  1. Action for Original File: Select how to handle the original content after PII detection or redaction processing finishes. This setting applies only when Automatic Processing is on.

    • Retain File — The original content remains unaffected, while a copy is created, processed for PII insights, and then published.
    • Delete and Move to Recycle Bin — The original content is deleted and moved to the Recycle Bin, while a copy is processed for PII insights and then published.
    • Override Original File — The original content is processed for PII insights while it remains published. No copies are made.
  2. Time Interval Threshold: Specify the time interval threshold (in milliseconds) used to redact PII more accurately. This value determines which PII detections are selected during audio activity detection. The configured Start Time and End Time corrections are then applied to those audio segments so that the PII is redacted completely.

  3. Start Time Correction: Specify (in milliseconds) how much the start time of a PII detection is adjusted. This helps when the beginning of a PII term is missed during detection. For example, if "John" is left out of the PII "John Doe," the start time correction helps capture the entire term.

  4. End Time Correction: Specify (in milliseconds) how much the end time of a PII detection is adjusted. This helps when the ending of a PII term is missed during detection. For example, if "Doe" is left out of the PII "John Doe," the end time correction helps capture the entire term.
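
The effect of the two corrections can be sketched as simple interval arithmetic. This is a minimal illustration under stated assumptions: it assumes the start correction moves the start earlier and the end correction moves the end later (which is what the "John Doe" examples suggest), and that the result is clamped to the media boundaries. The function name and parameters are hypothetical, and the Time Interval Threshold is not modeled because the page does not specify how detections are selected.

```python
def apply_time_corrections(start_ms, end_ms,
                           start_correction_ms, end_correction_ms,
                           media_duration_ms=None):
    """Widen a detected PII interval (all values in milliseconds).

    Assumption: the start is moved earlier and the end is moved later,
    then both are clamped to the bounds of the media file.
    """
    new_start = max(0, start_ms - start_correction_ms)
    new_end = end_ms + end_correction_ms
    if media_duration_ms is not None:
        new_end = min(new_end, media_duration_ms)
    return new_start, new_end


# "Doe" detected at 1200-1800 ms; widen to also cover the missed "John".
print(apply_time_corrections(1200, 1800, 300, 200))       # (900, 2000)

# Clamping near the edges of a 1000 ms clip.
print(apply_time_corrections(100, 900, 300, 200, 1000))   # (0, 1000)
```

Larger corrections reduce the risk of leaving part of a PII term audible, at the cost of redacting more of the surrounding speech.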

Save and Enable

  1. Automatic Processing: Select On to automatically detect and redact PII as soon as content is uploaded to your Portal, or Off to process content on demand.

  2. Select Save Changes to apply your settings.

  3. Turn on the toggle for the VIDIZMO Indexer app to enable processing.