Tuesday, May 4, 2010

Day 2: Truncated Range in Relation to Diabetes and Obesity

Scatterplot of the eruption interval for a geyser. Image via Wikipedia

In the previous blog post, I discussed what happens when the range of a distribution is truncated. This post presents a common example to illustrate the point. Let’s imagine that we wish to know the relationship between obesity and diabetes. Based on what we know to date, about eight out of every ten people with type 2 diabetes are overweight or obese. In other words, excess weight has been shown to correlate strongly with type 2 diabetes. Thus, we would expect the correlation between weight and diabetes to be high, as depicted by a scatter plot that covers the full range of cases and shows an upward slope. Our conclusion: in this hypothetical case, we can say that there is a strong, positive correlation between weight and the incidence of type 2 diabetes. However, as in most situations, there is a caveat: a strong relationship between two variables does not, by itself, show that one causes the other.

Now let’s suppose we select only those data points that lie at one extreme end of the scatter plot, and we wish to draw some conclusions based only on this smaller, more select group of cases (for whatever reason). In other words, we are going to try to draw conclusions based on a “truncated range” of instances. What are the consequences of doing this? The clarity of the overall picture disappears. Because we are viewing only a small number of cases at the extreme end of the range, the relationship between the variables can appear much weaker, or even seem to vanish altogether! If we calculated the correlation coefficient for this restricted range of cases, it would be artificially (spuriously) low. We’d be basing our conclusions on incomplete information. And we would be wrong. Read more in tomorrow's blog post.
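The effect of a truncated range can be sketched with simulated data. The short Python example below is only an illustration under stated assumptions: the numbers are made up, not real weight or diabetes figures. It assumes a standardized "weight" score and a continuous "risk" score that are linearly related with some random noise, and it treats the top 10% of weights as the truncated range. The helper names (`pearson_r`, `simulate`) are hypothetical.

```python
import random
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def simulate(n=5000, seed=0):
    """Made-up data: a standardized weight score and a noisy risk score."""
    rng = random.Random(seed)
    weights = [rng.gauss(0, 1) for _ in range(n)]
    # Assumption: risk rises linearly with weight, plus random noise.
    risks = [0.8 * w + rng.gauss(0, 0.6) for w in weights]
    return weights, risks


weights, risks = simulate()

# Correlation over the full range of cases.
r_full = pearson_r(weights, risks)

# Keep only the cases in the top 10% of weights (the truncated range).
cutoff = sorted(weights)[int(0.9 * len(weights))]
kept = [(w, r) for w, r in zip(weights, risks) if w >= cutoff]
r_truncated = pearson_r([w for w, _ in kept], [r for _, r in kept])

print(f"full range:      r = {r_full:.2f}")
print(f"truncated range: r = {r_truncated:.2f}")
```

With these settings, the truncated-range coefficient should come out noticeably lower than the full-range one, even though both sets of points come from the same underlying relationship.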


