Survey Platform - Understand Your Data | Qualtrics

Understanding Your Dataset

Introduction

For additional analysis outside of Qualtrics, you can download a dataset file for any survey. This dataset includes all your survey’s raw response data, including respondent file submission information, question responses, embedded data, and more.

Attention: The file referenced on this page was downloaded in CSV format and opened in Excel (it has also been formatted using the Wrap Text feature for easier reading). Other file formats include the same data points, but may display the contents with a slightly different layout.

Respondent Information

The first eleven columns in the dataset include information about each respondent and their entry, such as their name, IP address, response submission dates, etc.

ResponseID

The ResponseID is the ID Qualtrics uses to identify each response in the database. This unique identifier is provided as a reference and generally does not have a use in data analysis.

ResponseSet

This field identifies which Response Set, or bucket of responses, each participant has been saved to. Note that Response Sets is a deprecated feature; you can now achieve the same results using Embedded Data. Typically you will see “Default Response Set” as the value in this column.

IPAddress

This column includes the respondent’s IP address. This data will not be available if responses have been completely anonymized.

StartDate

These date and time values indicate when the respondent first clicked the survey link.

Qtip: If your Start Date and End Date columns in Excel contain ######, try making the columns wider. Excel uses # symbols when a date won’t fit.

EndDate

These date and time values indicate when the respondent submitted their survey. If the entry is a partial response, this date will indicate the last time the respondent interacted with the survey.
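
As a rough illustration of working with these two columns, the sketch below computes how long a response took. It assumes the timestamps are exported as text in a `YYYY-MM-DD HH:MM:SS` layout; the format string, function name, and sample values are assumptions rather than anything the export guarantees, so adjust them to match your own file.

```python
from datetime import datetime

# Assumed timestamp layout; check it against your own export.
DATE_FORMAT = "%Y-%m-%d %H:%M:%S"

def response_duration_seconds(start_date: str, end_date: str) -> float:
    """Return the seconds elapsed between StartDate and EndDate."""
    start = datetime.strptime(start_date, DATE_FORMAT)
    end = datetime.strptime(end_date, DATE_FORMAT)
    return (end - start).total_seconds()

# Hypothetical values for one response.
duration = response_duration_seconds("2015-09-18 09:12:21", "2015-09-18 09:20:51")
print(duration)
```

For partial responses, keep in mind that EndDate reflects the last interaction, so the result is time until the respondent stopped rather than time to completion.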

RecipientLastName, RecipientFirstName, and RecipientEmail

If your survey was distributed through an Email distribution or Personal Links, or if respondents logged into the survey using an Authenticator, respondents’ names and email addresses will display in these columns. For all other responses, these columns will be blank.

Qtip: The respondents’ names will be drawn from the contact list that the Email distribution, Personal Links, or Authenticator were based on. If you delete the contact list, the name information will be permanently removed and this column will be blank.

ExternalDataReference

When uploading a contact list to Qualtrics for use in an Email distribution or Authenticator, an External Data Reference can be included for each participant. This is a generic field that can store any information you like (and is most often used for unique identifiers like employee or student IDs). If an External Data Reference was uploaded for respondents, it will display in this column.

Qtip: For greater flexibility, we now generally recommend uploading additional details as Embedded Data rather than using External Data Reference.

Finished

This column details whether the response was submitted or closed. A “1” indicates the respondent reached an end point in their survey (hitting the last Next/Submit button, being screened-out with Skip or Branch Logic, etc.). A “0” indicates the respondent left their survey before reaching an end point and the response was instead closed manually or due to session expiration.
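
A minimal sketch of separating submitted responses from closed ones using this column is shown below. The sample data is hypothetical and reduced to two columns; a real export contains many more columns and may include extra header rows that need to be skipped first.

```python
import csv
import io

# Hypothetical excerpt of an export, reduced to two columns.
SAMPLE_CSV = """ResponseID,Finished
R_001,1
R_002,0
R_003,1
"""

def split_by_finished(csv_text: str):
    """Return (finished, unfinished) lists of ResponseIDs."""
    finished, unfinished = [], []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # "1" means the respondent reached an end point in the survey.
        target = finished if row["Finished"].strip() == "1" else unfinished
        target.append(row["ResponseID"])
    return finished, unfinished

finished_ids, unfinished_ids = split_by_finished(SAMPLE_CSV)
print(finished_ids, unfinished_ids)
```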

Status

The value in the Status column indicates the type of response collected. There are six main possible statuses:

Responses are flagged as spam if multiple responses are received from the same IP address within a 24-hour period. In most cases there is no need for concern, as this simply indicates two users on the same computer network took the survey.

Qtip: You may occasionally see other status codes. Any other number represents the addition of two of the above statuses. For example, a 9 is a preview response (1) that has been flagged as spam (8).
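
The combined codes described above can be unpacked with a small lookup, sketched here under stated assumptions. Only the two codes named in the example (1 for preview, 8 for spam) are filled in; the table name and function name are hypothetical, and the remaining codes should be added from the status list for your own data.

```python
from itertools import combinations

# Only the two codes named in the example above are filled in.
KNOWN_STATUSES = {1: "preview", 8: "spam"}

def describe_status(code, statuses=KNOWN_STATUSES):
    """Return the status labels that a Status code stands for.

    A code found in the table maps to itself. Otherwise the function looks
    for two distinct table codes that add up to the given code.
    """
    if code in statuses:
        return [statuses[code]]
    for a, b in combinations(sorted(statuses), 2):
        if a + b == code:
            return [statuses[a], statuses[b]]
    return []  # Not explainable with the codes supplied.

print(describe_status(9))
```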

Scoring Results

For surveys using Scoring, scoring information is included in the set of columns following respondent information.

For each scoring category, a sum, weighted average, and weighted standard deviation are provided.

  • sum: The total number of points in that scoring category the respondent received for taking the survey.
  • weightedAvg: The total number of scoring category points the respondent received divided by the total number of questions scored in that scoring category.
  • weightedStdDev: The average difference between each individual question score and the weighted average score.
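
To make the three values concrete, here is a small sketch that derives them from one respondent's per-question scores in a single scoring category. The sum and average follow the definitions above. The standard deviation is computed as a population standard deviation, which is an assumption, since the exact formula used in the export is not spelled out here. The function name and sample scores are hypothetical.

```python
from statistics import mean, pstdev

def scoring_summary(question_scores):
    """Summarize one respondent's scores within one scoring category."""
    return {
        # Total points received in the category.
        "sum": sum(question_scores),
        # Total points divided by the number of scored questions.
        "weightedAvg": mean(question_scores),
        # Assumption: population standard deviation of the question scores.
        "weightedStdDev": pstdev(question_scores),
    }

# Hypothetical scores for eight questions in one category.
summary = scoring_summary([2, 4, 4, 4, 5, 5, 7, 9])
print(summary)
```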

Embedded Data

For surveys using Embedded Data, Embedded Data information is included in the columns following scoring information.

Only Embedded Data fields saved in the Survey Flow are included in the downloaded dataset. Embedded Data fields with values from a contact list or URL can be saved to the Survey Flow at any point before or after data collection.

Question Responses

After the respondent, scoring, and Embedded Data information come the question answer columns, which display the answers provided for each survey question. Columns are headed with the question labels and the beginning lines of the question text. Simple questions (Text Entry, Multiple Choice – Single Answer, etc.) are contained in one column, but more complex questions (Matrix Table, Side by Side, etc.) are broken across several columns.

Qtip: Different question types store data differently, so we recommend generating sample data and checking its downloaded form before launching your survey. That way you can be sure you’re obtaining the type of results you want.

By default, data is downloaded displaying the numeric values (called “recode values”) for each answer choice. For example, on a 5-point scale, “Strongly Agree” would display as “5”, making it easier to find a mean or do other statistical analysis.
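
As an example of why numeric recode values are convenient, the sketch below averages one question's column while ignoring blank cells. The column contents and function name are hypothetical, and treating an empty cell as a skipped question is an assumption.

```python
from statistics import mean

# Hypothetical recode values for one 5-point question; "" marks a blank cell.
q1_values = ["5", "4", "", "2", "5"]

def mean_recode(values):
    """Average the numeric recode values, ignoring blank cells."""
    numbers = [int(v) for v in values if v.strip()]
    return mean(numbers) if numbers else None

q1_mean = mean_recode(q1_values)
print(q1_mean)
```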

Qtip: To show answer choice text instead of numeric values, click More Options when exporting data and select Use choice text.

If the coding of your choices doesn’t match your expectations, you can always return to the Survey module to change them and then export your data again. You can also export your survey to Word to retrieve a Code Book outlining how each choice is coded in the dataset.

Randomization Data

If you select the checkbox for Export viewing order for randomized surveys in your More Options menu, you will see randomization columns following your question response data. There will be one column for each randomizer in your survey. For randomizers where you randomized the order of the items, this column will include the order in which respondents saw the items. For randomizers where you randomly presented one of many items, this column will show the name of the item that respondents saw.

In column AM, either Block 1 or Block 2 was shown to each respondent. In column AN, respondents saw questions 13–16 in a mixed order. The first respondent saw Block 2 and was shown question 13 first, followed by 15, 16, and then 14.

Location Data

The last three columns in your dataset are an estimate of the respondent’s geolocation.

If the respondent completed the survey using the Qualtrics Offline App on a GPS-enabled device, this data will be an accurate representation of the respondent’s location.

For all other respondents, the location is an approximation determined by comparing the participant’s IP address to a location database. Inside the United States, this data is typically accurate to the city level. Outside the United States, this data is typically only accurate to the country level.

LocationLatitude and LocationLongitude

These columns include the longitude and latitude of the respondent. Where location is approximated, the longitude and latitude presented are of the geographic center of the most accurate location available for the respondent. For example, if a respondent is predicted to be in Dallas, Texas, the longitude and latitude would reflect the geographic center of Dallas, Texas.

LocationAccuracy

The Location Accuracy indicates the level of accuracy of the provided longitude and latitude coordinates. For most respondents, the value in this column will be “-1”, indicating location is approximated based on their IP address and accuracy level cannot be determined.

For respondents taking the survey using the Qualtrics Offline App on a GPS-enabled device, the value represents a radius (in meters) from the reported longitude and latitude in which the respondent may be located. A larger number indicates a less accurate location.
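
The two cases described above can be told apart with a simple check, sketched here. The function name and the wording of the returned text are hypothetical; the logic relies only on the -1 convention and the radius in meters described in this section.

```python
def describe_location_accuracy(accuracy):
    """Interpret a LocationAccuracy value read from the dataset."""
    value = float(accuracy)  # Cells read from a CSV file arrive as text.
    if value == -1:
        return "approximated from IP address; accuracy unknown"
    # Otherwise the value is a radius in meters around the coordinates.
    return f"GPS reading, within about {value:g} meters"

print(describe_location_accuracy("-1"))
print(describe_location_accuracy("15"))
```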

File Format Differences

Though all file types download the same data fields described above, each features a layout that may be slightly different.

SPSS

The Data View in SPSS includes the exact same layout as the CSV file.

SPSS includes another view, called the Variable View. This view lists all the variables from your dataset with information about each, such as the variable type and the possible values.

XML

The XML file type is often used when integrating Qualtrics data with a third-party database. This file type can be parsed easily by common database software.

An XML element is provided for each response, with a child element for each piece of data stored in that response.
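
A structure like this can be read with a standard XML parser. The sketch below uses Python's built-in `xml.etree.ElementTree` and turns each response element into a dictionary of its child elements. The root element name, the `Response` element name, and the sample values are assumptions made for illustration; check them against your own file.

```python
import xml.etree.ElementTree as ET

# Hypothetical excerpt; real element names may differ.
SAMPLE_XML = """<Responses>
  <Response>
    <ResponseID>R_001</ResponseID>
    <Finished>1</Finished>
  </Response>
  <Response>
    <ResponseID>R_002</ResponseID>
    <Finished>0</Finished>
  </Response>
</Responses>"""

def parse_responses(xml_text):
    """Return one dict per response, keyed by child element name."""
    root = ET.fromstring(xml_text)
    return [
        {child.tag: (child.text or "") for child in response}
        for response in root
    ]

rows = parse_responses(SAMPLE_XML)
print(rows)
```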

HTML

HTML provides the same table as CSV, but can be viewed in a web browser for a quick look at the data.

Fixed Field Text

The TXT export is a lightweight file containing a row for each respondent, with each column padded with spaces to a fixed width. Though not generally used for statistical analysis, this file type is sometimes used when building an integration between Qualtrics and a third-party database.

In addition to the raw data, you will receive a data map file. This spreadsheet indicates all the variables downloaded and the number of spaces reserved for each variable in the TXT file.
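
To show how the data map and the TXT file fit together, the sketch below slices one line of fixed-width text using a list of variable names and widths. The in-code representation of the data map, the widths, and the sample line are all hypothetical; in practice you would read the names and widths from the data map spreadsheet.

```python
# Hypothetical data map: (variable name, number of characters reserved).
DATA_MAP = [("ResponseID", 8), ("Finished", 2), ("Q1", 3)]

def parse_fixed_width(line, data_map):
    """Split one fixed-width line into a dict using the data map widths."""
    record, position = {}, 0
    for name, width in data_map:
        # Take the reserved slice and drop the padding spaces.
        record[name] = line[position:position + width].strip()
        position += width
    return record

row = parse_fixed_width("R_001   1 5  ", DATA_MAP)
print(row)
```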