I read part of Denise's post on the China Study, and I quickly came to her note on stats:
A quick note on stats
Not a math whiz? No problem. The stuff here should be pretty straightforward, but in case you’re rusty or simply allergic to numbers, here’s a refresher on some of the statistics terminology I’ll
A positive correlation means two variables increase together and decrease together. For example, the more food Garfield scarfs down, the more Garfield weighs;
the less food Garfield eats, the less Garfield weighs.
A negative correlation means one variable increases while the other decreases, and vice versa. For example, the more Garfield exercises, the less Garfield weighs; the
less Garfield exercises, the more Garfield weighs.
Well... From this, I can see that Denise may be a math whiz, but she doesn't seem to know how to use stats, or which stats techniques apply to which problem.
After reading more, I saw her using Excel to draw her data plots. I'm pretty sure she used it to calculate her "correlations" as well.
So where did she go wrong?
First, my understanding is that in epidemiology you don't look at the full range of the variables when you look for an association between a factor and an effect. You draw a 2x2 table with the factor in the rows and the effect in the columns. The first row is below the exposure threshold you define, the second row above it. The first column is no effect, the second column is with effect. Then you fill in the counts for the four cells and calculate the risks and so on.
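To make the 2x2 approach concrete, here is a minimal Python sketch. The counts are made up for illustration (they are not from the China Study), and the function name and the choice of risk ratio and odds ratio as the "risks and so on" are my own assumptions.

```python
def two_by_two_measures(unexposed_no_effect, unexposed_effect,
                        exposed_no_effect, exposed_effect):
    """Basic measures from a 2x2 table.

    Rows: below / above the exposure threshold.
    Columns: no effect / with effect.
    (Hypothetical helper, written for illustration only.)
    """
    # Risk = share of each row that shows the effect
    risk_unexposed = unexposed_effect / (unexposed_no_effect + unexposed_effect)
    risk_exposed = exposed_effect / (exposed_no_effect + exposed_effect)

    # Risk ratio: how many times more frequent the effect is above the threshold
    risk_ratio = risk_exposed / risk_unexposed

    # Odds ratio: cross-product of the four cells
    odds_ratio = (exposed_effect * unexposed_no_effect) / (
        exposed_no_effect * unexposed_effect
    )

    return {
        "risk_unexposed": risk_unexposed,
        "risk_exposed": risk_exposed,
        "risk_ratio": risk_ratio,
        "odds_ratio": odds_ratio,
    }


# Made-up counts: 100 people below the threshold, 100 above
measures = two_by_two_measures(
    unexposed_no_effect=90, unexposed_effect=10,
    exposed_no_effect=80, exposed_effect=20,
)
for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```

A risk ratio near 1 would suggest no association; the further it is from 1, the stronger the association the table suggests.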
From what I see, Denise didn't take this approach, but went straight for the CORREL function in Excel, plotting data in XY plots and basically drawing lines between the data points. The CORREL function returns the linear (Pearson) correlation coefficient of all the points in your plot: in effect it matches a straight line to them, and the coefficient measures how much the data points scatter around that line.
First, this is not the stats technique used in the field, from all I could gather in the literature.
Second, let's take a quick example of how this CORREL technique does not work for this purpose. Let's take a simple exponential function with two different exponential rises, one almost linear (purple), the other really steep at one point (red).
You will agree that in both cases Y increases when X increases (the more Garfield eats, the more Garfield weighs, as she said). But the CORREL function shows a strong correlation for the purple case (0.991669, or 99%) and a much weaker one for the red case (0.534017, or 53%)! That is the conclusion you are driven to if you use Denise's definition of correlation between factors and incidence.
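For anyone who wants to try this without Excel, here is a rough Python sketch of the same kind of comparison. It computes the Pearson correlation coefficient by hand, which is what CORREL returns. The X values and the two growth rates (0.1 and 2.0) are my own picks for illustration, not the ones behind the purple and red curves, so the numbers will not match exactly.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sums of squared deviations
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


xs = list(range(11))  # X = 0, 1, ..., 10 (assumed values)

# Two curves that both rise with X: one nearly linear, one very steep
gentle = [math.exp(0.1 * x) for x in xs]  # assumed rate, "purple"-like
steep = [math.exp(2.0 * x) for x in xs]   # assumed rate, "red"-like

r_gentle = pearson_r(xs, gentle)
r_steep = pearson_r(xs, steep)

print(f"gentle curve: r = {r_gentle:.4f}")
print(f"steep curve:  r = {r_steep:.4f}")
```

Both curves go up at every step, yet the coefficient drops sharply for the steep one, because it only measures how well a straight line fits the points.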
On this simple example, you can see that the technique she used doesn't work. It doesn't work with the simplest data set your eyes can analyze easily, so imagine what it can do on more complicated data sets.
In conclusion, I think there is not much to discuss in what she writes. All of it is based on a wrong data analysis, so none of her conclusions can be valid. That will save me a lot of time reading it, actually.
I bet if her work had been peer reviewed, it would have never made it out of her kitchen.
So, Denise, if you ever come across this, don't take it personally, but before jumping on your soap box and spreading wrong information, just open a couple of books and learn about what you want to do first.
Then, there are other aspects of her analyses I'm not comfortable with, like using mortality as a measure of disease incidence. I'm no epidemiologist, but I think for chronic disease or cancer that would not be the best indicator, since you can live quite some time with one, and you can die from other causes. But that may be what was in the original study, I don't know, I haven't seen it.
So I'll just stay on the math side, because she is so wrong in my opinion, that none of what she can write has any value.