Randomized response is a technique for estimating the amount of some socially negative behavior in a population—things like shoplifting, extramarital sex, child abuse, being hospitalized for emotional problems, and so on. The technique was introduced by Warner in 1965 and is particularly well described by B. Williams (1978:73). It is a simple, fun, and interesting tool. Here’s how it works.

First, you formulate two questions, A and B, that can be answered ‘‘yes’’ or ‘‘no.’’ One question, A, is the question of interest (say, ‘‘Have you ever shoplifted?’’). The possible answers to this question (either ‘‘yes’’ or ‘‘no’’) do not have known probabilities of occurring. That is what you want to find out.

The other question, B, must be innocuous and the possible answers (again ‘‘yes’’ or ‘‘no’’) must have known probabilities of occurring. For example, if you ask someone to toss a fair coin and then ask, ‘‘Did you toss a heads?’’ then the probability of a ‘‘yes’’ is 50%. If the chances of being born in any given month were equal, then you could ask respondents: ‘‘Were you born in April, May, or June?’’ and the probability of getting a ‘‘yes’’ would be 25%. Unfortunately, births are seasonal, so the coin-toss question is preferable.

Let’s assume you use the coin toss for question B. You ask someone to toss the coin and to note the result without letting you see it. Next, have them pick a card from a deck of 10 cards, each marked with a single integer from 1 to 10. The respondent does not tell you what number he or she picked, either. The genuine secrecy associated with this procedure makes people feel secure about answering question A (the sensitive question) truthfully.

Next, hand the respondent a card with the two questions, marked A and B, written out. Tell them that if they picked a number from 1 to 4 from the deck of 10 cards, they should answer question A. If they picked a number from 5 to 10, they should answer question B.

That’s all there is to it. You now have the following: (1) Each respondent knows they answered ‘‘yes’’ or ‘‘no’’ and which question they answered; and (2) You know only that a respondent said ‘‘yes’’ or ‘‘no’’ but not which question, A or B, was being answered.
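The procedure can be sketched as a small simulation. This is a minimal illustration, not part of the original method description: the function names, the seed, and the assumption that every respondent answers truthfully are all hypothetical choices made for the example.

```python
import random

def simulate_response(has_trait, rng):
    """Return the one answer the interviewer hears (True means 'yes')."""
    coin_is_heads = rng.random() < 0.5   # question B: fair coin toss, kept secret
    card = rng.randint(1, 10)            # card drawn from a deck numbered 1 to 10
    if card <= 4:                        # cards 1-4: answer question A (assumed truthful)
        return has_trait
    return coin_is_heads                 # cards 5-10: answer question B

def simulate_survey(n, true_rate, seed=0):
    """Share of 'yes' answers among n simulated respondents.

    true_rate is the (hypothetical) share of the population for whom
    the honest answer to question A is 'yes'.
    """
    rng = random.Random(seed)
    yes_count = 0
    for _ in range(n):
        has_trait = rng.random() < true_rate
        if simulate_response(has_trait, rng):
            yes_count += 1
    return yes_count / n
```

With a large simulated sample, the share of ‘‘yes’’ answers should settle near (true rate times 0.4) plus (0.5 times 0.6), while the interviewer never learns which question any one person answered.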

If you run through this process with a sufficiently large, representative sample of a population, and if people cooperate and answer all questions truthfully, then you can calculate the percentage of the population that answered ‘‘yes’’ to question A. Here’s the formula:

The percentage of people who answer ‘‘yes’’ to either A or B = (the percentage of people who answer ‘‘yes’’ to question A) times (the percentage of times that question A is asked) plus (the percentage of people who answered ‘‘yes’’ to question B) times (the percentage of times question B is asked).

The only unknown in this equation is the percentage of people who answered ‘‘yes’’ to question A, the sensitive question. We know, from our data, the percentages of ‘‘yes’’ answers to either question. Suppose that 33% of all respondents said ‘‘yes’’ to something. Since respondents answered question A only if they chose a number from 1 to 4, then A was answered 40% of the time and B was answered 60% of the time. Whenever B was answered, there was a 50% chance of it being answered ‘‘yes’’ because that’s the chance of getting a heads on the toss of a fair coin. The problem now reads:

.33 = A(.40) + .50(.60)

which means that A = .075, or about .08. That is, given the parameters specified in this experiment, if 33% of the sample says ‘‘yes’’ to either question, then an estimated 8% of the people who answered question A said ‘‘yes.’’
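Solving the formula for A can be written as a short function. This is a sketch under the parameters of the example (question A asked 40% of the time, a fair coin for question B); the function and parameter names are made up for illustration.

```python
def estimate_sensitive_rate(yes_share, p_ask_a=0.4, p_yes_b=0.5):
    """Estimate the share answering 'yes' to question A.

    yes_share: observed share of 'yes' answers to either question
    p_ask_a:   probability that a respondent is directed to question A
    p_yes_b:   known probability of a 'yes' to the innocuous question B
    """
    if not 0 < p_ask_a <= 1:
        raise ValueError("p_ask_a must be greater than 0 and at most 1")
    p_ask_b = 1 - p_ask_a
    # Rearranged from: yes_share = A * p_ask_a + p_yes_b * p_ask_b
    return (yes_share - p_yes_b * p_ask_b) / p_ask_a

# The worked example: 33% of respondents said 'yes' to something.
estimate = estimate_sensitive_rate(0.33)
print(round(estimate, 3))
```

For the worked example this gives .075, which rounds to the .08 reported in the text. Note that in a small sample the observed share of ‘‘yes’’ answers can fall below 30%, in which case the estimate comes out negative and should be read as sampling noise.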

There are two problems associated with this technique. First, no matter what you say or do, some people will not believe that you can’t identify them and will therefore not tell the truth. Bradburn, Sudman et al. (1979) report that 35% of known offenders would not admit to having been convicted of drunken driving in a randomized response survey. Second, like all survey techniques, randomized response depends on large, representative samples. Because the technique is time-consuming to administer, getting large, representative samples is difficult.
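One way to see why sample size matters so much here is to look at the standard error of the estimate. The sketch below is an assumption layered on the text, not something the source derives: it treats the observed share of ‘‘yes’’ answers as a simple binomial proportion from a simple random sample of truthful respondents, and divides its standard error by the probability of being asked question A. The function names are hypothetical.

```python
import math

def randomized_response_se(yes_share, n, p_ask_a=0.4):
    """Approximate standard error of the randomized response estimate.

    Assumes simple random sampling and truthful answers, so the observed
    'yes' share has binomial standard error sqrt(p * (1 - p) / n).
    Dividing by p_ask_a reflects that only that fraction of answers
    carries information about question A.
    """
    return math.sqrt(yes_share * (1 - yes_share) / n) / p_ask_a

def direct_question_se(rate, n):
    """Standard error if everyone answered question A directly and truthfully."""
    return math.sqrt(rate * (1 - rate) / n)

# Hypothetical sample of 400 respondents, using the worked example's numbers.
rr_se = randomized_response_se(0.33, 400)
direct_se = direct_question_se(0.075, 400)
print(round(rr_se, 3), round(direct_se, 3))
```

Under these assumptions, the randomized response estimate is several times less precise than a direct question asked of the same number of people would be, which is the price paid for the secrecy.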

BOX 9.9

THE LIST EXPERIMENT

The list experiment was developed by Kuklinski et al. (1997) to unobtrusively measure socially undesirable attitudes. In this technique (which is closely related to the randomized response technique), two randomly selected samples of people—called the baseline group and the test group—are told:

Now I'm going to read you four [five] things that sometimes make people angry or upset. After I read all four [five] statements, just tell me how many of them upset you. I don't want to know which ones, just how many.

Then, the interviewer reads four statements to the baseline group and five to the test group. The four statements that get read to both groups are about:

One: the way gasoline prices keep going up.

Two: professional athletes getting million-plus salaries.

Three: requiring seat belts be used when driving.

Four: large corporations polluting the environment.

The test group gets a fifth statement, like ''a black family moving in next door'' (Kuklinski et al. 1997), or ''a Jewish candidate running for vice president'' (J. G. Kane et al. 2004), or ''a woman serving as president'' (Streb et al. 2008).

If both groups are chosen at random, then the average number of items that make people angry should be more or less the same in both groups. If the number is bigger for the people in the test group, it must be because of the extra statement. So, if the average number of items that make people angry in the baseline group is 2.5 and the average number in the test group is 3.0, the percentage of people who are angered by the extra item is (3.0 - 2.5) X 100 = 50%.
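The arithmetic of the list experiment is a difference in group means. Here is a minimal sketch; the function name and the tiny made-up samples are purely illustrative and were chosen only so that the group means match the 2.5 and 3.0 in the example.

```python
from statistics import mean

def list_experiment_estimate(baseline_counts, test_counts):
    """Estimated share of people angered by the extra item.

    Each list holds, per respondent, the number of statements that
    person said made them angry or upset.
    """
    return mean(test_counts) - mean(baseline_counts)

# Made-up counts: baseline group heard four items, test group heard five.
baseline = [2, 3, 2, 3]   # mean 2.5
test = [3, 3, 3, 3]       # mean 3.0
share = list_experiment_estimate(baseline, test)
print(f"{share * 100:.0f}%")
```

Because the estimate rests on the two groups being comparable, it is only as good as the random assignment, and real samples would need to be far larger than the four respondents per group shown here.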

Still, the evidence is mounting that for sensitive questions (Did you smoke dope in the last week? Have you ever bought a term paper? Have you stolen anything from your employer?), when you want the truth, the randomized response method is worth the effort. Every time I read in the newspaper that self-reported drug use among adolescents has dropped by such-and-such an amount since whenever-the-last-self-report-survey-was-done, I think about how easy it is for those data to be utter nonsense. And I wonder why the randomized response technique isn’t more widely used (Further Reading: randomized response) (box 9.9).