Sunday, December 16, 2007

“Four out of five doctors recommend Anacin for headaches.”

Con-census: small sample may hide reality

If we don’t take into account the sample size, the confidence with which we can accept a claim can vary dramatically

Let’s say you are in charge of running a call centre. You have two human resource (HR) managers, Vivek and Madhuri, who are independently responsible for recruiting customer service representatives for the call centre. You are well aware of the retention problem common to the call centre industry, with a significant proportion of new recruits switching jobs within the first few months of joining a firm.
Over the past few years, your two HR managers have recruited several hundred employees. You are interested in finding out which of the two managers’ recruits stay longer with your company. You decide to look at the recruiting records for the first week of December of last year for both managers, and count the number of people who are still working with the company, one year later.
It just so happens that Vivek had recruited four employees in that week, three of whom were still with the company. On the other hand, Madhuri had been on a roll that week, and had recruited 24 people, of whom only 14 were still with the company. It is evident from the two samples that both recruiters had a better than 50% retention record, but which of the two samples provides more convincing evidence that the retention rate is really more than 50%? Despite the fact that Vivek’s sample showed a 75% retention rate, Madhuri’s sample actually provides statistically more convincing evidence of a better than 50% retention rate.
Why is a sample with a retention rate of 58% better evidence of recruiting success than one with a rate of 75%? The answer lies in the difference in sample sizes.
In general, proportions computed from large samples are less likely to deviate far from the true underlying rate than those computed from small samples. Thus, Madhuri’s sample is more likely to reflect the true retention rate for all her recruits. Vivek’s 75% retention could merely be a chance occurrence that does not reflect his true retention rate.
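One way to quantify this is a simple one-sided binomial test (an assumption on our part; the article does not name a method): compute the probability of seeing a retention count at least this high if the true retention rate were only 50%. The smaller that probability, the stronger the evidence. A minimal sketch in Python:

```python
from math import comb

def tail_prob(successes, n, p0=0.5):
    """P(X >= successes) for X ~ Binomial(n, p0): the chance of a
    retention count this high if the true rate were only p0."""
    return sum(comb(n, k) * p0**k * (1 - p0)**(n - k)
               for k in range(successes, n + 1))

vivek = tail_prob(3, 4)      # 3 of 4 retained (75%)   -> 0.3125
madhuri = tail_prob(14, 24)  # 14 of 24 retained (58%) -> about 0.27
```

Madhuri’s smaller tail probability (about 0.27 versus about 0.31) means her 58% is harder to explain away as luck than Vivek’s 75%, despite the lower percentage.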
If we flip a fair coin four times, the likelihood of getting 75% or more heads (three or four of the four flips) is far greater than the likelihood of getting 75% or more heads when we flip the same coin 100 times.
The fact that people judge the likelihood of an event based merely on the proportion of outcomes, while paying little attention to the sample size on which that proportion is based, has been labelled “sample size insensitivity bias” by researchers. The implication of this bias is very clear: We can easily be persuaded by advertising that doesn’t report on sample sizes.
Consider this hypothetical example: “Four out of five doctors recommend Anacin for headaches.” This claim can carry quite different meanings, depending on whether it was based on asking 10 doctors or 1,000. If we don’t take into account the sample size, the confidence with which we can accept a claim can vary dramatically.
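To make “the confidence can vary dramatically” concrete, consider a confidence interval for the claimed proportion. Assuming the claim means 8 of 10 doctors versus 800 of 1,000 (hypothetical numbers), a rough normal-approximation (Wald) interval — crude for a sample as small as 10, but good enough to show the contrast — looks like this:

```python
from math import sqrt

def wald_ci(successes, n, z=1.96):
    """Rough 95% confidence interval for a proportion
    (normal approximation, clipped to [0, 1])."""
    p = successes / n
    half = z * sqrt(p * (1 - p) / n)
    return max(0.0, p - half), min(1.0, p + half)

print(wald_ci(8, 10))      # roughly (0.55, 1.00): almost uninformative
print(wald_ci(800, 1000))  # roughly (0.78, 0.82): pins the rate down
```

Both samples say “80% recommend”, but only the larger one narrows the plausible range to anywhere near 80%.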
For managerial decision-making, the sample size insensitivity bias can result in reliable and unreliable business information being treated equally. As the sample size goes down, the likelihood that observations are representative of the overall average also goes down.
You may feel that your secretary is not very punctual because on occasions, you have seen her walk into her cabin several minutes after the 9am deadline, whereas your boss’ secretary is incredibly punctual because whenever you go to see your boss at 9am, she is always at her desk.
While it is possible that your boss’ secretary may actually be more punctual, you should also be aware of the fact that the two samples are not really comparable. You have a 9am meeting with your boss only once or twice a month, whereas you get to observe the time your secretary arrives every single day of the week. It is quite possible that you get to sample your boss’s secretary’s arrival time only on those days when she isn’t late.
You might have heard people complain about the person they dated being so different from the person they are now married to. It feels as if the person they married is not the one they dated.
This can partially be explained by the sample size bias. Dating provides a very small sample of a person’s true self. In those limited dating encounters, it is possible for an individual to project a persona that is quite different from their true self.
Marriage, on the other hand, isn’t sampling any more. It is the full census of a person’s true self! Obviously, the smaller the dating sample, the greater the chance of post-marriage behavioural deviations.
The moral of the story is simple. Don’t blindly take information that is presented to you and make significant decisions without asking some basic questions about the sample from which that information is drawn.
Realizing that information based on a few data points could present a distorted view of the reality, ask yourself whether a larger sample size is likely to present a different picture of the situation.
Praveen Aggarwal is an associate professor of marketing at the Labovitz School of Business & Economics at the University of Minnesota Duluth and Rajiv Vaidyanathan is a professor of marketing and director of MBA programmes at the University of Minnesota Duluth.

Original article source: http://www.livemint.com/2007/12/17004401/Concensus-small-sample-may-h.html