Survey respondents understandably don't like to answer questions about sensitive topics such as illegal drug use. If data scientists want to estimate the proportion of illegal drug users in a population, they have to devise methods of getting the information they need while maintaining the privacy of the individual respondents. Randomized response schemes are often used in such situations.

In one such scheme, each surveyed person is given a coin and asked to answer YES or NO after following these instructions out of sight of the surveyor:

• Toss the coin.
• If it lands heads, then truthfully answer, "Do you use illegal drugs?"
• If it lands tails, then toss it again and answer, "Did the second toss land heads?"

This way each respondent answers YES or NO but the surveyor doesn't know which question was answered. The data scientists then have to estimate the proportion of illegal drug users based on the overall proportion of YES answers, which includes the YES answers to the second question.

Let the unknown proportion of illegal drug users in a large population be p, and suppose a random sample of size n is surveyed using the scheme above. You can assume that the sampling is equivalent to drawing at random with replacement.

a) Let X be the proportion of sampled people who answer YES. Find E(X).
b) Use X to construct an unbiased estimator of p.
a) Assume the coin is fair. Each sampled person answers YES with the same probability, independently of the others. Conditioning on the first toss:

P(YES) = P(heads) P(respondent is a user) + P(tails) P(second toss lands heads) = (1/2)p + (1/2)(1/2) = p/2 + 1/4.

The number of YES answers in the sample is therefore binomial with parameters n and p/2 + 1/4. X is that count divided by n, so E(X) = p/2 + 1/4.

b) Solving E(X) = p/2 + 1/4 for p gives p = 2E(X) - 1/2. By linearity of expectation, E(2X - 1/2) = 2(p/2 + 1/4) - 1/2 = p, so 2X - 1/2 is an unbiased estimator of p.
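
As a sanity check on the scheme in the question, a short simulation can compare the long-run average of the estimator 2X - 1/2 with the true p. This is only a sketch under stated assumptions: the coin is fair, and the function names, the value p = 0.3, the sample size, the number of repetitions, and the seed are all illustrative choices that do not come from the original question.

```python
import random


def simulate_yes_proportion(p, n, rng):
    """Return X, the proportion of YES answers among n respondents.

    p is the (hypothetical) true proportion of users; the coin is assumed fair.
    """
    yes = 0
    for _ in range(n):
        if rng.random() < 0.5:
            # First toss heads: answer the sensitive question truthfully.
            yes += rng.random() < p
        else:
            # First toss tails: report whether the second toss landed heads.
            yes += rng.random() < 0.5
    return yes / n


def estimate_p(x):
    """Estimator 2X - 1/2, obtained by solving E(X) = p/2 + 1/4 for p."""
    return 2 * x - 0.5


rng = random.Random(0)   # fixed seed, chosen arbitrarily for repeatability
p_true = 0.3             # illustrative value, not from the question
n = 1000                 # respondents per simulated survey
repetitions = 500        # number of simulated surveys

estimates = [estimate_p(simulate_yes_proportion(p_true, n, rng))
             for _ in range(repetitions)]
mean_estimate = sum(estimates) / len(estimates)

print(f"true p = {p_true}, average estimate = {mean_estimate:.4f}")
```

The average of the estimates should land close to p_true, which is what unbiasedness predicts. A single survey's estimate can still fall outside [0, 1], because 2X - 1/2 ranges from -1/2 to 3/2.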