Sampling
From QualtricsWiki
Systematic Errors (Non-Sampling Errors)
Systematic errors result from decisions that bias the sample selection or response to your survey. Four common mistakes are:
- Population Specification Error: This error results from not understanding whom you should be surveying. As a simple example, imagine you are preparing a survey about the consumption of breakfast cereals. Whom do you survey? It might be the entire family, the mother, or the children. The family consumes the cereal, the mother purchases it, and the children influence her choice.
- Sample Frame Error: A frame error occurs when the sample is drawn from the wrong sub-population. A classic frame error occurred in predicting the 1936 presidential election between Roosevelt and Landon. The sample frame was built from car registrations and telephone directories. In 1936, car and telephone owners were largely Republicans. While the results may have reflected the sample, they did not reflect the US as a whole, and the poll wrongly predicted a Republican victory.
- Selection Error: Selection error results when respondents self-select their participation: those who are interested respond. Selection error can be controlled by going to extra lengths to get participation. Typical steps include pre-survey contact requesting cooperation, the actual survey, a post-survey follow-up if a response is not received, a second survey request, and finally interviews using alternate modes such as telephone or person to person.
- Non-Response: Non-response errors occur when non-respondents differ from those who respond. This may occur because the potential respondent either was not contacted (they didn't check their e-mail) or refused to respond (they were all grumpy old men or beautiful young women afraid of strangers). Again, the extent of this non-response error can be checked through follow-up surveys using alternate modes.
Non-Systematic Errors (Sampling Errors)
Sampling errors occur because the sample that responds varies by chance in its size and representativeness. Sampling errors can be controlled by (1) careful sample designs, (2) large samples, and (3) multiple contacts to assure representative response. Two types of samples may be drawn: a probability sample, where every person in the population has a known, non-zero probability of being selected, and a non-probability sample, where the probability of a person being selected is unknown. Non-probability samples include quota samples, referral samples, and convenience samples. Probability samples include simple random samples, systematic samples, and stratified samples.
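To illustrate why large samples help control sampling error, here is a minimal sketch of the usual margin-of-error calculation for a proportion. It assumes a simple random sample and the normal approximation with z = 1.96 (roughly 95% confidence); the function name `margin_of_error` is ours and is not part of any survey tool.

```python
import math

def margin_of_error(p, n, z=1.96):
    """Approximate margin of error for a sample proportion p with n respondents.

    Assumes a simple random sample and the normal approximation.
    """
    return z * math.sqrt(p * (1 - p) / n)

# p = 0.5 is the worst case (largest margin) for a proportion.
for n in (100, 400, 1600):
    print(n, margin_of_error(0.5, n))
```

Under these assumptions, quadrupling the sample size only halves the margin of error, which is why very large samples bring diminishing returns.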
Non-Probability Samples
Quota Sample
A quota sample assures that various subgroups of the population are represented on relevant sample characteristics. The quota sample does not have to follow population proportions. For example, a quota sample may be used to make sure you have at least 35 people who have an income of more than $250,000. Without a quota, respondents from this group would rarely be interviewed because of their small incidence in the population as a whole.
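As a rough sketch of how quotas might be filled, the following screens respondents in arrival order and accepts each one only while their group's quota is still open. The function `fill_quotas`, the group labels, and the quota sizes are illustrative assumptions, and treating each quota as a target to reach is only one possible design.

```python
def fill_quotas(respondents, quotas, group_of):
    """Accept respondents in arrival order until every group's quota is met.

    quotas maps a group label to its target count; group_of maps a
    respondent to a group label. Respondents in full groups are skipped.
    """
    counts = {group: 0 for group in quotas}
    accepted = []
    for person in respondents:
        group = group_of(person)
        if group in counts and counts[group] < quotas[group]:
            counts[group] += 1
            accepted.append(person)
        if all(counts[g] >= quotas[g] for g in quotas):
            break  # every quota is filled, stop screening
    return accepted, counts
```

For the income example, `group_of` could return "high" when income exceeds $250,000 and "other" otherwise, with quotas such as `{"high": 35, "other": 65}`.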
Referral Sample
A referral sample (also called a snowball sample) is used to locate members of a rare population by asking each respondent to refer others. Locating 100 adult croquet players may be difficult without referrals.
Convenience Sample
Convenience samples are drawn from whoever is easiest to reach and may include members of affiliation groups, interest groups, or random intercepts on your website. The objective is to collect as much data as possible, regardless of where the respondents come from.
Probability Samples
Simple Random Sample
In a simple random sample, every element of the population has a known and equal probability of being selected. A true random sample is rarely used because we rarely have a sample frame that lists every person we could sample.
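When a complete sample frame is available, drawing a simple random sample is straightforward. The sketch below uses Python's standard `random` module; the function name `simple_random_sample` and the frame of numbered customers are assumptions for illustration.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n distinct elements from frame, each equally likely to be chosen."""
    rng = random.Random(seed)    # a seed makes the draw reproducible
    return rng.sample(frame, n)  # sampling without replacement

frame = list(range(1, 3001))     # hypothetical frame: customer IDs 1..3000
sample = simple_random_sample(frame, 400, seed=42)
print(len(sample))               # 400
```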
Systematic Sample
In a systematic sample, every potential respondent has a known and equal chance of being selected. Typically a systematic sample selects every nth person from the list of potential respondents. For example, a systematic sample of about 400 customers (out of 3000 total customers) would be drawn by computing the selection interval (N/n) = 3000/400 = 7.5. This number is rounded to 8 and a random starting number is selected from 1-8 (suppose the result is 3). We would then select the third customer on the list and every 8th customer thereafter. Note that rounding the interval up to 8 yields about 375 customers rather than 400; rounding down to 7 yields about 428, so round down if at least 400 are required. The result approximates a simple random sample of respondents if the customer list is not systematically ordered in some way.
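The selection procedure for the customer example can be sketched as follows. The function `systematic_sample` and the numbered customer list are illustrative assumptions; the interval is passed in explicitly so the effect of rounding it up or down is easy to see.

```python
import random

def systematic_sample(frame, k, start=None, seed=None):
    """Select every k-th element of frame, beginning at a 1-based start position.

    If start is not given, it is drawn at random from 1..k.
    """
    if start is None:
        start = random.Random(seed).randint(1, k)
    return frame[start - 1::k]

customers = list(range(1, 3001))  # hypothetical customer IDs 1..3000
sample = systematic_sample(customers, k=8, start=3)
print(sample[:3], len(sample))    # [3, 11, 19] 375
```

With an interval of 8 the list is exhausted after 375 selections, so an interval of 7 is the safer choice if at least 400 customers are needed.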
Stratified Sample
A stratified sample is sometimes desirable if the population is to be broken up into different groups based on one or more characteristics of the population. In this case, the strata are identified first. Strata may be defined as any groups: credit card users vs non-credit card users, by gender, age, industry, purchasers vs non-purchasers, current customers vs past customers, etc. Once the strata are identified, a simple random sample is drawn within each stratum. Once the survey is completed, the strata are weighted back to the population proportions.
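A minimal sketch of stratified sampling and of weighting the strata back to population proportions is shown below. The function names, the equal sample size per stratum, and the use of a mean as the survey estimate are all assumptions made for illustration.

```python
import random
from collections import defaultdict

def stratified_sample(frame, stratum_of, n_per_stratum, seed=None):
    """Draw a simple random sample within each stratum of frame.

    Returns the sampled units per stratum and each stratum's population share.
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in frame:
        strata[stratum_of(unit)].append(unit)
    sample = {s: rng.sample(units, min(n_per_stratum, len(units)))
              for s, units in strata.items()}
    pop_share = {s: len(units) / len(frame) for s, units in strata.items()}
    return sample, pop_share

def weighted_mean(sample, pop_share, value_of):
    """Weight each stratum's sample mean by its share of the population."""
    return sum(pop_share[s] * sum(value_of(u) for u in units) / len(units)
               for s, units in sample.items())
```

For example, if current customers make up 90% of the frame and past customers 10%, sampling 50 of each over-represents past customers, and `weighted_mean` restores the 90/10 balance in the final estimate.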

