Advanced Settings in the Recording Processing Pipeline
1. Transcription Adjustment
2. Replacement Setting
3. Start-sentence Feature
4. Number-of-speakers Setting
5. Do-not-save-voice-prints Option
6. Report Settings
7. Forced Language Transcription
8. Count Transcription Words
9. Force Identify One Participant
10. JSON Extraction from Reports
How to Create a Signal
This document provides clear and concise instructions on configuring advanced settings in the recording processing pipeline. These settings help enhance transcription accuracy, manage speaker diarization, and ensure compliance with specific company policies. Follow the guidelines below to set up each feature correctly.
1. Transcription Adjustment
Purpose: Improves transcription accuracy for specific words or phrases.
How to Set Up:
Create a signal named “Q-score”
Add the tracked expressions you want to emphasize during transcription.
This setting helps the transcription service transcribe those words more reliably. It is not language-specific and applies to all languages.
2. Replacement Setting
Purpose: Corrects misspelled words or normalizes alternative wordings.
How to Set Up:
Create a signal named “replacement”
Specify the desired replacements as tracked expressions. Examples:
Tracked expression 1: “UNIQ:UNIQUE” – replaces UNIQ with UNIQUE in all transcripts produced after the expression is added.
Tracked expression 2: “refp:RFP” – replaces refp with RFP in all transcripts produced after the expression is added.
This ensures consistent correction across transcripts.
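Conceptually, the replacement setting behaves like a post-processing pass over each finished transcript. The sketch below is an illustrative assumption, not the actual pipeline code: the function names and the `old:new` parsing simply mirror the tracked-expression format described above.

```python
import re

def parse_replacements(tracked_expressions):
    """Parse tracked expressions of the form 'old:new' into (old, new) pairs."""
    return [tuple(expr.split(":", 1)) for expr in tracked_expressions]

def apply_replacements(transcript, replacements):
    """Replace each whole-word occurrence of 'old' with 'new' in the transcript."""
    for old, new in replacements:
        transcript = re.sub(rf"\b{re.escape(old)}\b", new, transcript)
    return transcript

rules = parse_replacements(["UNIQ:UNIQUE", "refp:RFP"])
print(apply_replacements("Send the refp to UNIQ today.", rules))
# → Send the RFP to UNIQUE today.
```

Note the word-boundary match (`\b`): it prevents a rule like `UNIQ:UNIQUE` from re-matching inside the already-replaced word `UNIQUE`.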
3. Start-sentence Feature
Purpose: Enhances transcription accuracy at the beginning of calls.
How to Set Up:
Create a signal named “start-sentence”
Include one tracked expression to provide a starting hint to the model. For example:
tracked expression: “This call features a sales representative from Unique and their customer. The Unique salesperson will introduce themselves first:”
This setting is not language-specific and applies to all languages.
4. Number-of-speakers Setting
Purpose: Activates the in-house diarization model for uploaded and mobile recordings (live calls are excluded) to distinguish between different speakers.
How to Set Up:
Create a signal named “number-of-speakers”
Set the minimum and maximum number of speakers as tracked expressions. For example:
tracked expression 1: “min:2”
tracked expression 2: “max:3”
This setting applies company-wide rather than to individual calls; once the maximum number of speakers is set, it cannot be adjusted during a call.
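The min/max tracked expressions above could be parsed roughly as follows. This is a hedged sketch: the function name, the defaults, and the validation are assumptions for illustration, not the production parser.

```python
def parse_speaker_bounds(tracked_expressions):
    """Parse 'min:N' / 'max:N' tracked expressions into a (min, max) tuple.

    The fallback defaults (min 1, max 10) are illustrative assumptions.
    """
    bounds = {"min": 1, "max": 10}
    for expr in tracked_expressions:
        key, _, value = expr.partition(":")
        if key in bounds and value.isdigit():
            bounds[key] = int(value)
    if bounds["min"] > bounds["max"]:
        raise ValueError("min speakers must not exceed max speakers")
    return bounds["min"], bounds["max"]

print(parse_speaker_bounds(["min:2", "max:3"]))  # → (2, 3)
```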
5. Do-not-save-voice-prints Option
Purpose: Ensures that NO voice print is saved for a specific company, impacting speaker identification and re-diarization.
How to Set Up:
Create a signal named “do-not-save-voice-prints”
Any tracked expression added here will trigger the system NOT to save voice prints for the specified company.
6. Report Settings
Chat Engine and Region
Purpose: Forces the system to always use a specific model for all reports, letting you pin the model version and choose where it is hosted. The default is model gpt-4o-2024-11-20 in region CH.
How to Set Up:
Create a signal named “Chat-engine-and-region”
Add the tracked expressions with the model name and the region.
Options:
tracked expression 1: “gpt-4o-2024-05-13”; tracked expression 2: “swe”
tracked expression 1: “gpt-4o-2024-08-06”; tracked expression 2: “swe”
tracked expression 1: “gpt-35-turbo-0125”; tracked expression 2: “switzerlandnorth”
7. Forced Language Transcription
Purpose: This setting allows you to force all transcriptions within a specific company to use a predetermined language, overriding the automatic language detection of the transcription service. This is particularly useful in cases where you consistently expect the same language to be uploaded or recorded.
How to Set Up:
Create a signal named “forced-language-transcription”
Add the tracked expression for the desired language using lowercase language codes (e.g., "en" for English, "de" for German).
This will ensure that all transcriptions for the specified company are processed in the chosen language, regardless of the audio content.
Example:
Tracked expression: en – Forces all transcriptions to be in English.
Tracked expression: de – Forces all transcriptions to be in German.
8. Count Transcription Words
Purpose: This setting allows you to count specific words or phrases in the transcript. It is useful for tracking the occurrence of certain expressions that are important for analysis or reporting purposes. This is case-insensitive.
How to Set Up:
Create a signal whose name is identical to the report name, in lowercase, for accurate tracking.
Report Name Example: “banks count”
Signal Name: “banks count”
Tracked Expression: Add the words or phrases you want to track (e.g., “ubs,” “credit suisse,” “bank of america”).
This setting ensures that the specified words or phrases are consistently monitored across all relevant transcripts.
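The case-insensitive counting described above can be sketched as a simple pass over the transcript. The helper name and the exact matching rules (whole-word, lowercase comparison) are illustrative assumptions consistent with the examples given.

```python
import re

def count_tracked_expressions(transcript, expressions):
    """Count case-insensitive, whole-word occurrences of each tracked expression."""
    text = transcript.lower()
    return {
        expr: len(re.findall(rf"\b{re.escape(expr.lower())}\b", text))
        for expr in expressions
    }

counts = count_tracked_expressions(
    "UBS met Credit Suisse; later UBS called Bank of America.",
    ["ubs", "credit suisse", "bank of america"],
)
print(counts)  # → {'ubs': 2, 'credit suisse': 1, 'bank of america': 1}
```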
9. Force Identify One Participant
Purpose:
This setting ensures that at least one participant in every uploaded or mobile recording is identified.
How to Set Up:
Create a Deal/Coaching Room:
Create a dedicated room for each user or entity where recordings can be consistently uploaded. For optimal results, assign a separate room to each agent, ensuring customer calls are uploaded directly to their respective rooms.
Create a Signal Named “force-identify-one-participant”:
Add a single tracked expression for this signal: “yes”.
Label at Least One Voice in a Recording per room:
For at least one recording per room, open the recording's audio settings.
In the “Edit Participants” section, label at least one participant by selecting or creating a participant record. This links the voice to the identified participant.
Save the Voiceprint:
Once the participant's voice is labeled, press Reprocess so that the voiceprint is saved. This allows future recordings in the same room to automatically associate the participant's voiceprint with new recordings, ensuring that at least one voice is identified.
Automatic Participant Identification:
After the initial voice identification, future uploads or recordings in the same deal room will automatically apply the participant’s voiceprint, identifying at least one voice in the recording.
10. JSON Extraction from Reports
Purpose:
JSON extraction from reports allows structured data to be efficiently captured and processed directly from transcripts. This method is particularly useful for integrations with other systems.
How It Works
Initial Prompt: Outputs a Report with a single JSON structure from the transcript.
Grouping Prompt: Uses regex to extract the JSON, avoiding additional processing steps.
Key Considerations:
For transcripts over 1 hour, rely on the language model instead of JSON extraction.
Ensure only one JSON is generated; the last JSON in the output is extracted.
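The rule that “the last JSON in the output is extracted” can be sketched as below. Since the actual regex used by the grouping prompt is not specified here, this sketch takes a robustness-first approach as an assumption: it attempts a JSON decode at every opening brace and keeps the last successful parse.

```python
import json

def extract_last_json(report_text):
    """Return the last valid JSON object embedded in a report, or None.

    Tries to decode from every '{' and keeps the last successful parse;
    a sketch of the grouping step, not the production extraction regex.
    """
    decoder = json.JSONDecoder()
    last = None
    for i, ch in enumerate(report_text):
        if ch == "{":
            try:
                obj, _ = decoder.raw_decode(report_text[i:])
                last = obj
            except json.JSONDecodeError:
                continue
    return last

report = 'Summary... {"draft": true} Final result: {"score": 7, "topic": "RFP"}'
print(extract_last_json(report))  # → {'score': 7, 'topic': 'RFP'}
```

Using a real JSON decoder instead of a greedy regex avoids mis-extracting when braces are nested or when multiple JSON objects appear in the report.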
How to Create a Signal
Author: @Pascal Hauri
© 2025 Unique AG. All rights reserved.