Call Transcripts Data Formats


About Call Transcripts Data Formats

XM Discover enables you to upload call transcripts (i.e., transcripts of audio conversations) in CSV, Excel, JSON, or WebVTT format. Call transcripts identify the participants in a conversation and attribute each message to a participant.

Typically, call transcripts contain a number of structured and unstructured data fields that represent a conversation between a customer and an entity at your company (for example, a transcript of a call with your automated phone service, or with a live support representative). Structured fields may contain dates, numbers, or highly organized text data (such as brand names, participant names, and product names). Unstructured fields contain notes, comments, and other open-ended text fields.

You can upload call transcripts in any of the following formats:

  • CSV
  • XLS or XLSX (Microsoft Excel)
  • JSON
  • WebVTT

CSV and Excel Formatting for Call Transcripts

This section covers call transcript formatting for CSV and Excel files. The formatting and requirements for both file types are the same. An example file appears at the end of this section.

In CSV and Excel files, call transcripts are defined using multiple rows. Here’s how it works:

  • Each row contains an individual line of dialogue in a conversation along with participant data and a timestamp.
  • Rows that share the same conversation ID are rolled into a single conversation.
  • Conversation-wide field values (such as Document Date or custom attributes) are taken from the first row of the conversation.

conversationId (Required)
A unique ID for the entire conversation. Each row that has the same ID is treated as a separate line within a single conversation.
You can map this field to the natural_id attribute to use it as the document’s Natural ID.

conversationTimestamp (Required)
The date and time of the entire conversation. Use the ISO 8601 format with seconds precision.
You can map this field to the document_date attribute to use it as Document Date.

participantId (Required)
The ID of the participant. Must be unique per conversation (document).

participantType (Required)
The type of the participant. Possible values:

  • AGENT: Indicates a company representative or a chatbot.
  • CLIENT: Indicates a customer.
  • TYPE_UNKNOWN: Indicates an unidentified participant.

These values are passed through to the CB Participant Type attribute for reporting and participants visualization.
If unspecified, CB Participant Type will have no reportable value.

is_ivr (Optional)
A Boolean field that indicates whether a participant is an Interactive Voice Response (IVR) bot or a person.

  • true: Indicates an IVR bot.
  • false: Indicates a person.

These values are passed through to the CB Kind of Participant attribute for reporting and participants visualization.
If unspecified, CB Kind of Participant will have no reportable value.

text (Required)
Speech transcript.
Attention: The total length of all text elements may not exceed 100,000 characters. If it does, the document is skipped.

start (Required)
The time the speech starts (in milliseconds elapsed since the beginning of the conversation).

end (Required)
The time the speech ends (in milliseconds elapsed since the beginning of the conversation).

contentSegmentType (Required)
This parameter identifies the transcript format, which allows the Natural Language Processing (NLP) engine to process data correctly. Possible values:

  • TOKEN: Transcribed data is provided one word at a time.
  • SENTENCE: Transcribed data is provided one sentence at a time.
  • TURN: Transcribed data is provided one speaker turn at a time.

custom fields (Optional)
You can provide multiple fields to add structured attributes to the conversation.
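
Example

Here is an illustrative CSV sketch (it is not taken from the product documentation) of the same two-turn conversation shown in the JSON example below, using TURN segmentation. The column headers simply reuse the element names from the table above, and the city and source columns stand in for optional custom fields; in practice, you map your own column names during the mapping step.

conversationId,conversationTimestamp,participantId,participantType,is_ivr,text,start,end,contentSegmentType,city,source
46289,2020-07-30T10:15:45.000Z,1,AGENT,false,"This is Emily, how may I help you?",22000,32000,TURN,Boston,Call Center
46289,2020-07-30T10:15:45.000Z,2,CLIENT,false,"Hi, I have a couple of questions.",32000,42000,TURN,Boston,Call Center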

JSON Formatting for Call Transcripts

This section covers JSON formatting for call transcripts.

Top-Level Objects

The following table describes the top-level objects of a document node.

conversationId
A unique ID for the entire conversation.
You can map this field to the natural_id attribute to use it as the document’s Natural ID.

conversationTimestamp
The date and time of the entire conversation. Use the ISO 8601 format with seconds precision.
You can map this field to the document_date attribute to use it as Document Date.

content
An object that contains the content of the conversation. Includes these nested objects:

  • participants
  • conversationContent
  • contentSegmentType

custom fields (attributes)
You can provide multiple key-value pairs to add structured attributes to the conversation.

content Object

The following table describes the objects nested inside the content object.

participants
An array of objects that provides information about the participants of the conversation. Includes these fields:

  • participant_id
  • type
  • is_ivr

conversationContent
An array of objects that contains the lines of the conversation. Includes these fields:

  • participant_id
  • text
  • start
  • end

contentSegmentType (Required)
This parameter identifies the transcript format, which allows the Natural Language Processing (NLP) engine to process data correctly. Possible values:

  • TOKEN: Transcribed data is provided one word at a time.
  • SENTENCE: Transcribed data is provided one sentence at a time.
  • TURN: Transcribed data is provided one speaker turn at a time.

participants Object

The following table describes the fields of each object in the participants array.

participant_id (Required)
The ID of the participant. Must be unique per conversation (document).

type (Required)
The type of the participant. Possible values:

  • AGENT: Indicates a company representative or a chatbot.
  • CLIENT: Indicates a customer.
  • TYPE_UNKNOWN: Indicates an unidentified participant.

These values are passed through to the CB Participant Type attribute for reporting and participants visualization.
If unspecified, CB Participant Type will have no reportable value.

is_ivr (Optional)
A Boolean field that indicates whether a participant is an Interactive Voice Response (IVR) bot or a person.

  • true: Indicates an IVR bot.
  • false: Indicates a person.

These values are passed through to the CB Kind of Participant attribute for reporting and participants visualization.
If unspecified, CB Kind of Participant will have no reportable value.

conversationContent Object

The following table describes the fields of each object in the conversationContent array.

participant_id (Required)
The ID of the participant who is speaking. Must match one of the IDs provided in the participants array.

text (Required)
Speech transcript.
Attention: The total length of all text elements may not exceed 100,000 characters. If it does, the document is skipped.

start (Required)
The time the speech starts (in milliseconds elapsed since the beginning of the conversation).

end (Required)
The time the speech ends (in milliseconds elapsed since the beginning of the conversation).

Example

Here is an example of a call transcript between an agent and a client.

[
  {
    "conversationId": "46289",
    "conversationTimestamp": "2020-07-30T10:15:45.000Z",
    "content": {
      "participants": [
        {
          "participant_id": "1",
          "type": "AGENT",
          "is_ivr": false
        },
        {
          "participant_id": "2",
          "type": "CLIENT",
          "is_ivr": false
        }
      ],
      "conversationContent": [
        {
          "participant_id": "1",
          "text": "This is Emily, how may I help you?",
          "start": 22000,
          "end": 32000
        },
        {
          "participant_id": "2",
          "text": "Hi, I have a couple of questions.",
          "start": 32000,
          "end": 42000
        }
      ],
      "contentSegmentType": "TURN"
    },
    "city": "Boston",
    "source": "Call Center"
  }
]

WebVTT Formatting for Call Transcripts

You can upload call transcripts using WebVTT formatting.

The Document Date is automatically taken from the file name if available. To set the Document Date automatically, make sure the file name starts with the following prefix:

<Timezone><YYYY><MM><DD>-

Example: GMT20201011-meeting.vtt (Document Date: October 11, 2020, in the GMT time zone)

If the file names use a different format, apply a date transformation to the Document Date field on the mappings step. For details, please see Setting a Specific Document Date.
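
As an illustration only, here is a minimal Python sketch (it is not part of XM Discover or its connectors, and the function name is hypothetical) showing how a Document Date can be read from a file name that follows the prefix convention above.

import re
from datetime import date

# Illustrative only: derive the Document Date from a WebVTT file name
# that follows the <Timezone><YYYY><MM><DD>- prefix convention.
def document_date_from_filename(filename):
    match = re.match(r"^([A-Za-z]+)(\d{4})(\d{2})(\d{2})-", filename)
    if match is None:
        return None  # the file name does not follow the prefix convention
    timezone_label, year, month, day = match.groups()
    return timezone_label, date(int(year), int(month), int(day))

print(document_date_from_filename("GMT20201011-meeting.vtt"))
# Prints: ('GMT', datetime.date(2020, 10, 11))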

Example

Here is an example of a Zoom call transcript in WebVTT format.

WEBVTT

1
00:00:00.599 --> 00:00:02.280
John Smith: Alright so let me

2
00:00:04.230 --> 00:00:05.339
John Smith: start sharing

3
00:00:12.809 --> 00:00:13.469
John Smith: My screen.

4
00:00:15.750 --> 00:00:18.119
John Smith: Can everybody see it.

5
00:00:19.050 --> 00:00:28.890
Paul Jones: Yes I can see it.