Conversation Processors

A Conversation Processor represents a discrete task that can be performed on the recorded audio stored in a Conversation object. Each processor produces output that is accessible as a field on the Conversation object. Processors can perform tasks such as transcribing speech, extracting topics, and estimating sentiment.

Processors can be bound to a Conversation at creation time or after the fact via a call to the Process a conversation endpoint. Each processor has a unique name that is used when binding.

You may bind additional processors to a Conversation at any time. If a processor has already been bound to a Conversation, it simply remains bound. Each processor therefore runs at most once against a given Conversation, and the API user needn’t worry about performing redundant calculations.

Some processors depend on other processors. For example, the Find Topics processor uses the transcript from the Transcribe processor to generate its topics. If a dependent processor is bound to a Conversation, the system automatically binds all the processors it requires, regardless of whether the client bound them explicitly. This frees the client from having to know about the implementation details of individual processors.
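The binding rules above (a bound processor stays bound, and dependencies are bound automatically) can be sketched as a small set-based resolver. This is only an illustration of the behaviour described here, not the service's implementation: the `DEPENDENCIES` table is copied from the "dependencies" fields on this page, while the `dependencies_of` and `bind` functions are hypothetical names introduced for the example.

```python
# Dependency table copied from the "dependencies" fields in this document.
DEPENDENCIES = {
    "transcribe": [],
    "findtopics": ["transcribe"],
    "grade": ["transcribe"],
}


def dependencies_of(name):
    """Return the processors that `name` requires (hypothetical helper)."""
    # Classify processors are named "classify:{classifier}" and need a transcript.
    if name.startswith("classify:"):
        return ["transcribe"]
    return DEPENDENCIES[name]


def bind(bound, requested):
    """Return the set of processors bound after a request.

    `bound` is the set already bound to the Conversation; it is not modified.
    """
    result = set(bound)
    pending = list(requested)
    while pending:
        name = pending.pop()
        if name in result:
            continue  # already bound: it stays bound and is not added again
        result.add(name)
        pending.extend(dependencies_of(name))  # pull in required processors
    return result


bound = bind(set(), ["findtopics"])
print(sorted(bound))  # ['findtopics', 'transcribe']

# Re-binding "transcribe" changes nothing; "grade" is simply added.
bound = bind(bound, ["transcribe", "grade"])
print(sorted(bound))  # ['findtopics', 'grade', 'transcribe']
```

Because the bound processors are held in a set, requesting the same processor twice is harmless, which mirrors the "at most once" guarantee described above.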

Transcribe

name:	`transcribe`
dependencies:	None
output fields:	`transcript`

Transcribes the entire conversation. The transcript is provided as a list of segments, each containing the transcribed text, the timestamp, and the connection id of the speaker.
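As a rough illustration of working with such a list of segments, the sketch below groups the transcribed text by speaker. The key names `text`, `timestamp`, and `connection_id`, the numeric timestamps, and the sample values are all assumptions made for the example; this page does not specify the exact field names or formats.

```python
from collections import defaultdict

# Hypothetical transcript. The keys "text", "timestamp" and "connection_id"
# are assumed names; the timestamps are assumed to be seconds from the start.
transcript = [
    {"text": "Hello, how can I help?", "timestamp": 0.0, "connection_id": "agent"},
    {"text": "My order has not arrived.", "timestamp": 2.4, "connection_id": "caller"},
    {"text": "Let me look into that.", "timestamp": 5.1, "connection_id": "agent"},
]


def text_by_speaker(segments):
    """Join each speaker's segments, in timestamp order, into one string."""
    grouped = defaultdict(list)
    for segment in sorted(segments, key=lambda s: s["timestamp"]):
        grouped[segment["connection_id"]].append(segment["text"])
    return {speaker: " ".join(parts) for speaker, parts in grouped.items()}


for speaker, text in text_by_speaker(transcript).items():
    print(f"{speaker}: {text}")
```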

Find Topics

name:	`findtopics`
dependencies:	`transcribe`
output fields:	`topics`

Generates a list of topics of conversation. Selects words based on their information content, frequency of use, and relevance to the transcript as a whole.

Classify

name:	`classify:{classifier}`
dependencies:	`transcribe`
output fields:	`detected_classes`

Performs classification on a conversation using the classifier named in place of `{classifier}`. The model for that classifier must already be constructed.
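Since the processor name embeds the classifier, a client may want small helpers for building and parsing it. The sketch below follows the `classify:{classifier}` pattern shown above; the function names are hypothetical, and the classifier name `sales_intent` is made up for the example.

```python
PREFIX = "classify:"


def classify_processor_name(classifier):
    """Build a processor name of the form "classify:{classifier}"."""
    if not classifier:
        raise ValueError("classifier name must not be empty")
    return PREFIX + classifier


def classifier_from_name(name):
    """Return the classifier part of a name, or None for other processors."""
    if name.startswith(PREFIX) and len(name) > len(PREFIX):
        return name[len(PREFIX):]
    return None


print(classify_processor_name("sales_intent"))        # classify:sales_intent
print(classifier_from_name("classify:sales_intent"))  # sales_intent
print(classifier_from_name("transcribe"))             # None
```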

Grade

name:	`grade`
dependencies:	`transcribe`
output fields:	`call_grades`

Grades a call against a set of common quality metrics based on the voice and speech content.

The `call_grades` object is added to a Conversation when the `grade` processor is requested. Call grading looks at both the speech content (words) and the voice content (signal), so it will perform transcription if the Conversation hasn’t already been transcribed.

All of the following values range from 0.0 to 1.0. The meaning of that scale is defined for each item below.

Property	Type	Description
Required Properties
outcome	float	Whether the parties to the call appear to have reached a positive result in the conversation. Higher is more successful.
quality	float	A measure of whether the call was cordial and professional. Higher is more cordial.
experience	float	Whether the parties analyzed seemed confident and capable given the topics discussed. Higher is more competent.
proactivity	float	The extent to which problems were addressed before they escalated. Higher is more proactive.
trust	float	A measure of perceived honesty and trust given the tone and speech content. Higher is better.
empathy	float	A measure of the extent to which the parties reflexively react to each other's emotions. Higher is more empathetic.
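A client consuming `call_grades` might first check that all six required properties are present and fall within the documented 0.0 to 1.0 range, then look for the lowest score. The following is a minimal sketch under those assumptions: the property names come from the table above, while the function names and the sample values are invented for illustration and are not real API output.

```python
# Property names taken from the table above.
GRADE_FIELDS = ("outcome", "quality", "experience", "proactivity", "trust", "empathy")


def check_call_grades(call_grades):
    """Verify every required property is present and within 0.0 to 1.0."""
    for field in GRADE_FIELDS:
        if field not in call_grades:
            raise KeyError(f"missing required property: {field}")
        value = call_grades[field]
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{field} out of range: {value}")
    return call_grades


def weakest_metric(call_grades):
    """Return the name of the lowest-scoring property."""
    return min(GRADE_FIELDS, key=lambda field: call_grades[field])


# Made-up sample values, for illustration only.
example = {
    "outcome": 0.82,
    "quality": 0.91,
    "experience": 0.77,
    "proactivity": 0.40,
    "trust": 0.88,
    "empathy": 0.65,
}

check_call_grades(example)
print(weakest_metric(example))  # proactivity
```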