Call Transcripts in CSV Format
About Call Transcripts in CSV Format
XM Discover enables you to upload call transcripts via CSV files.
Unlike individual feedback, where each document is defined in a single row, a call transcript can span multiple rows. Here’s how it works:
- Each row contains an individual line of dialogue in a conversation along with participant info and a timestamp.
- Separate rows are rolled into a single conversation by sharing the same conversation ID.
- Conversation-wide field values (such as Document Date or custom attributes) are taken from the first row of the conversation.
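The roll-up rules above can be illustrated with a minimal Python sketch. It is not XM Discover's implementation: the function name `group_conversations`, the sample IDs, and the sample dialogue are hypothetical, and the sample uses only a subset of the columns.

```python
import csv
import io

# Hypothetical sample data: a subset of columns with made-up values.
SAMPLE = """conversationId,conversationTimestamp,participantId,text,start,end
call-001,2023-05-14T09:30:00Z,1,Thank you for calling.,0,1800
call-001,2023-05-14T09:30:00Z,2,Hi. I have a billing question.,2100,4300
call-002,2023-05-15T16:05:00Z,1,Good afternoon.,0,900
"""

def group_conversations(csv_text):
    """Roll rows that share a conversationId into one conversation."""
    conversations = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        conv_id = row["conversationId"]
        if conv_id not in conversations:
            # Conversation-wide values are taken from the first row for this ID.
            conversations[conv_id] = {
                "conversationTimestamp": row["conversationTimestamp"],
                "lines": [],
            }
        # Every row contributes one line of dialogue.
        conversations[conv_id]["lines"].append({
            "participantId": row["participantId"],
            "text": row["text"],
            "start": int(row["start"]),
            "end": int(row["end"]),
        })
    return conversations

conversations = group_conversations(SAMPLE)
for conv_id, conv in conversations.items():
    print(conv_id, conv["conversationTimestamp"], len(conv["lines"]), "lines")
```

In this sketch, later rows of a conversation contribute only their line of dialogue; their conversation-level columns are ignored, mirroring the first-row rule.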
Call Transcript Fields
The following table describes the fields that define a call transcript.
Element | Description |
conversationId
(Required) |
A unique ID for the entire conversation. Each row that has the same ID is treated as a separate line within a single conversation.
You can map this field to the natural_id attribute to use it as the document’s Natural ID. |
conversationTimestamp
(Required) |
The date and time of the entire conversation. Use the ISO 8601 format with seconds precision.
You can map this field to the document_date attribute to use it as Document Date. |
participantId
(Required) |
The ID of the participant. Must be unique per conversation (document). |
participantType
(Required) |
The type of the participant. Possible values:
These values are passed through to the CB Participant Type attribute for reporting and participants visualization. If unspecified, CB Participant Type will have no reportable value. |
is_ivr
(Optional) |
A Boolean field that indicates whether a participant is an Interactive Voice Response (IVR) bot or a person.
These values are passed through to the CB Kind of Participant attribute for reporting and participants visualization. If unspecified, CB Kind of Participant will have no reportable value. |
text
(Required) |
Speech transcript.
Attention: The combined length of all text elements in a conversation must not exceed 100,000 characters. If it does, the document is skipped. |
start
(Required) |
The time the speech starts (in milliseconds passed since the beginning of the conversation). |
end
(Required) |
The time the speech ends (in milliseconds passed since the beginning of the conversation). |
contentSegmentType
(Required) |
This parameter identifies the transcript format, which allows the Natural Language Processing (NLP) engine to process data correctly.
Possible values:
|
custom fields
(Optional) |
You can provide multiple fields to add structured attributes to the conversation. |
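Before uploading, it may help to check a conversation's rows against the field rules in the table. The following Python sketch is one possible pre-upload check, not part of XM Discover. The function name `check_conversation` is hypothetical, the timestamp pattern assumes a UTC offset or `Z` suffix is present, and the rule that `end` must not precede `start` is a sanity check added here rather than a documented requirement. Allowed values for `participantType` and `contentSegmentType` are not checked.

```python
from datetime import datetime

REQUIRED = ("conversationId", "conversationTimestamp", "participantId",
            "participantType", "text", "start", "end", "contentSegmentType")
MAX_TOTAL_TEXT = 100_000  # limit on the combined length of all text elements

def check_conversation(rows):
    """Return a list of problems found in the rows of one conversation."""
    problems = []
    for i, row in enumerate(rows, start=1):
        for field in REQUIRED:
            if not row.get(field):
                problems.append(f"row {i}: missing {field}")
        # ISO 8601 with seconds precision; assumes an offset such as Z or +00:00.
        ts = row.get("conversationTimestamp") or ""
        try:
            datetime.strptime(ts, "%Y-%m-%dT%H:%M:%S%z")
        except ValueError:
            problems.append(f"row {i}: bad timestamp {ts!r}")
        # start and end are millisecond offsets from the start of the conversation.
        try:
            start = int(row.get("start") or "")
            end = int(row.get("end") or "")
            if start < 0 or end < start:
                problems.append(f"row {i}: start/end out of order")
        except ValueError:
            problems.append(f"row {i}: start/end must be whole milliseconds")
    total = sum(len(row.get("text") or "") for row in rows)
    if total > MAX_TOTAL_TEXT:
        problems.append(f"text totals {total} characters; limit is {MAX_TOTAL_TEXT}")
    return problems

# Hypothetical row; the two *_PLACEHOLDER values stand in for real allowed values.
good_row = {
    "conversationId": "call-001",
    "conversationTimestamp": "2023-05-14T09:30:00Z",
    "participantId": "1",
    "participantType": "PARTICIPANT_TYPE_PLACEHOLDER",
    "text": "Thank you for calling.",
    "start": "0",
    "end": "1800",
    "contentSegmentType": "SEGMENT_TYPE_PLACEHOLDER",
}
print(check_conversation([good_row]))
```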
Call Transcript Example
Here is an example of several call transcripts in a CSV format. Rows 2 through 11 are lines in a single conversation; rows 12 and 13 are lines in another conversation.
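As a rough illustration of that row layout, the Python sketch below writes a header row followed by one row per line of dialogue, repeating the conversation-level values on every row. The helper name `to_csv`, the column order, the IDs, the dialogue, and the two `*_PLACEHOLDER` values are all hypothetical; substitute the values your project actually uses. The optional `is_ivr` column and custom fields are omitted.

```python
import csv
import io

# Column order is an assumption; optional columns are left out.
FIELDS = ["conversationId", "conversationTimestamp", "participantId",
          "participantType", "text", "start", "end", "contentSegmentType"]

def to_csv(conversations):
    """Write one CSV row per line of dialogue."""
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()  # row 1 of the file
    for conv in conversations:
        for line in conv["lines"]:
            writer.writerow({
                # Conversation-level values are repeated on every row.
                "conversationId": conv["id"],
                "conversationTimestamp": conv["timestamp"],
                "contentSegmentType": conv["segment_type"],
                **line,
            })
    return buffer.getvalue()

# Made-up conversations for illustration only.
calls = [
    {
        "id": "call-001",
        "timestamp": "2023-05-14T09:30:00Z",
        "segment_type": "SEGMENT_TYPE_PLACEHOLDER",
        "lines": [
            {"participantId": "1", "participantType": "PARTICIPANT_TYPE_PLACEHOLDER",
             "text": "Thank you for calling.", "start": 0, "end": 1800},
            {"participantId": "2", "participantType": "PARTICIPANT_TYPE_PLACEHOLDER",
             "text": "Hi, I have a billing question.", "start": 2100, "end": 4300},
        ],
    },
    {
        "id": "call-002",
        "timestamp": "2023-05-15T16:05:00Z",
        "segment_type": "SEGMENT_TYPE_PLACEHOLDER",
        "lines": [
            {"participantId": "1", "participantType": "PARTICIPANT_TYPE_PLACEHOLDER",
             "text": "Good afternoon.", "start": 0, "end": 900},
        ],
    },
]

csv_text = to_csv(calls)
print(csv_text)
```

Using the `csv` module rather than manual string joins means text that contains commas or quotes is escaped correctly.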
For an overview of all XM formats, please see XM Discover Data Formats Overview.