There is no profit motive in doing studies on non-proprietary products.
As a result, a study is sometimes done out of academic interest, using whatever minor grant money happens to be lying around. But because of financial limitations, convenience limitations, and what might be called the habits of a given field, such studies usually are inherently incapable of resolving small effects.
In this case what I mean by habits of a given field is that exercise science studies typically take either an untrained population or a bunch of guys who train half-assedly and put them on a program for (for example) 8 weeks. With subjects like these, there will be a great deal of variation in results even among the placebo group.
Some guys will add 10 lb of muscle, some will lose 5, etc. Not from the placebo actually having any such effects, but from random variation.
It then is assumed that random variation is as severe among the treatment groups.
So with, for example, 10 or 11 subjects per treatment group, as was the case in this study, even if the data averaged say a 3 lb increase in LBM, and even if this were an actual effect caused by the treatment, it might well be found “insignificant” and reported as being no increase.
Alternatively, a compound could have a real effect of say 3 lb in 8 weeks, which would be excellent, yet the observed average could come out as zero by chance alone.
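To put rough numbers on this, here is a minimal sketch in Python using a normal approximation to a two-group comparison. The 5 lb standard deviation of individual 8-week LBM changes is purely my assumption for illustration (the study's actual variance is not given here), and the function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution

def power_two_group(effect_lb, sd_lb, n_per_group, alpha=0.05):
    """Approximate two-sided power for comparing two group means."""
    se = sd_lb * sqrt(2 / n_per_group)          # standard error of the difference in means
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)  # about 1.96 when alpha = 0.05
    shift = effect_lb / se                      # true effect measured in standard errors
    return STD_NORMAL.cdf(shift - z_crit) + STD_NORMAL.cdf(-shift - z_crit)

def chance_observed_gain_not_positive(effect_lb, sd_lb, n_per_group):
    """Probability that the observed treatment-minus-placebo difference is zero or negative."""
    se = sd_lb * sqrt(2 / n_per_group)
    return STD_NORMAL.cdf(-effect_lb / se)

# Assumed inputs: real 3 lb effect, 5 lb SD (an assumption), 10 subjects per group.
power = power_two_group(3, 5, 10)
p_no_gain = chance_observed_gain_not_positive(3, 5, 10)
print(f"chance of reaching p < 0.05: {power:.2f}")
print(f"chance the observed difference is zero or worse: {p_no_gain:.2f}")
```

Under these assumed inputs, a real 3 lb effect would reach significance only about a quarter of the time, and the observed difference would come out at zero or below in a bit under one study in ten. A different assumed standard deviation would shift both figures.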
Worse, it does not follow that if such a study does report a “statistically significant” effect, then wow, it must really be something to have overcome the above difficult situation.
Quite the contrary: among published results, treatments of zero efficacy will be found to have “statistically significant” efficacy at p < 0.05 more than 5% of the time. The reason for this is a little complex, but the simplest explanation is that there is publication bias towards positive results. Let’s say 100 studies are done of treatments which in fact have no effect, and that random variation alone would produce an apparently positive result very nearly 5% of the time. Then on average about 5 of these studies will be published and reported as the treatments having a “statistically significant, p < 0.05” real effect.
Probably most – or if proprietary, all – of the 95 other studies won’t be published at all.
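The filtering effect can be sketched with a small simulation. This is only an illustration under assumptions of mine: 10 subjects per group, a 5 lb standard deviation, normally distributed results, a pooled two-sample t-test counted as significant in either direction, and the extreme case where only significant studies get published. The names are hypothetical.

```python
import random
from math import sqrt
from statistics import mean, variance

N_PER_GROUP = 10
SD_LB = 5.0          # assumed spread of individual results, for illustration only
T_CRIT_DF18 = 2.101  # two-sided 5% cutoff for Student's t with 18 df (2 * 10 - 2)

def null_study_is_significant(rng):
    """Simulate one study of a treatment with zero true effect."""
    placebo = [rng.gauss(0.0, SD_LB) for _ in range(N_PER_GROUP)]
    treated = [rng.gauss(0.0, SD_LB) for _ in range(N_PER_GROUP)]
    pooled_var = (variance(placebo) + variance(treated)) / 2  # valid for equal group sizes
    se = sqrt(pooled_var * 2 / N_PER_GROUP)
    t_stat = (mean(treated) - mean(placebo)) / se
    return abs(t_stat) > T_CRIT_DF18

rng = random.Random(0)  # fixed seed so the run is repeatable
n_studies = 5000
significant = sum(null_study_is_significant(rng) for _ in range(n_studies))
false_positive_rate = significant / n_studies

# Extreme assumption: only the "significant" studies are published.
published = significant
share_of_published_that_are_false = significant / published

print(f"null studies reaching p < 0.05: {false_positive_rate:.3f}")
print(f"share of published results that are false positives: {share_of_published_that_are_false:.0%}")
```

Each individual study keeps its roughly 5% false-positive rate, but since every treatment simulated here has zero effect, every study that survives the publication filter is a false positive.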
So apparently-positive results in studies of the sort that this one was must be looked at very carefully as well.
Not a lot can be concluded from studies of this type. Actually I think they are of less value than simply trying something oneself, provided that one, on forming an initial opinion that something seemed to work, also tries discontinuing it and then restarting it after a time.
If one doesn’t do this but just sticks with something that really seemed to make a difference, sometimes it will be coincidence: what was a good period of time for the body or for training just happened to fall at the same time as introducing the supplement. So one does have to take a little care with personal experimentation as well.
How could these studies be done better? For example, if in the above study they had tried studying only one compound, they could have assigned 22-23 subjects to each group, treatment or placebo. This would have helped although the number still would have been quite marginal for detecting small effects.
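As a rough sketch of how much doubling the group size buys, the following estimates the smallest true effect a two-group study would detect 80% of the time at p < 0.05, using a normal approximation. The 5 lb standard deviation and the 80% target are assumptions of mine for illustration, and the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def minimum_detectable_effect(sd_lb, n_per_group, alpha=0.05, power=0.80):
    """Smallest true difference in means detected with the given power (normal approximation)."""
    z = NormalDist()
    se = sd_lb * sqrt(2 / n_per_group)  # standard error of the difference in means
    return (z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)) * se

# Assumed 5 lb SD; compare about 10 per group with about 22 per group.
mde_by_n = {n: minimum_detectable_effect(5.0, n) for n in (10, 22)}
for n, mde in mde_by_n.items():
    print(f"{n} per group: smallest reliably detectable effect is about {mde:.1f} lb")
```

Because the standard error shrinks only with the square root of the group size, roughly doubling the subjects cuts the detectable effect by about a third, not by half.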
Secondly and more importantly, they’d have needed a more stable set of subjects. Athletes who had been training the best they know how for quite some time and who had reached essentially a steady state are far superior subjects, because where there is no real effect, there won’t be a normal outcome of this guy adding 10 lb of muscle in 8 weeks and that one losing a few lb. Variation will be much smaller.
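To illustrate why subject stability matters so much, here is a minimal sketch of the usual normal-approximation sample size formula for comparing two means. The two standard deviations (5 lb for inconsistent trainers, 2 lb for steady-state athletes) are hypothetical values chosen by me, not measurements, and the function name is an assumption as well.

```python
from math import ceil
from statistics import NormalDist

def subjects_needed_per_group(effect_lb, sd_lb, alpha=0.05, power=0.80):
    """Approximate subjects per group needed to detect effect_lb with the given power."""
    z = NormalDist()
    z_sum = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    return ceil(2 * (z_sum * sd_lb / effect_lb) ** 2)

# Hypothetical spreads: 5 lb for inconsistent trainers, 2 lb for steady-state athletes.
needed = {sd: subjects_needed_per_group(3.0, sd) for sd in (5.0, 2.0)}
for sd, n in needed.items():
    print(f"SD of {sd:.0f} lb: about {n} subjects per group to detect a 3 lb effect")
```

The required number scales with the square of the standard deviation, so cutting the random variation is far more effective than adding subjects. The normal approximation runs a little optimistic for very small groups, so the low figure should be read as a floor.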
I’d rather informally (not for publication) have basic measurements on 5-10 guys who are at a steady state in their training, being very consistent and hard trainers, and see what happens with them, than have all kinds of measurements on 20 guys who have been totally inconsistent and half-assed in their training and who therefore can see 10 lb of muscle in 8 weeks simply from straightening up their act rather than from there being a real effect.
It would be great if the university studies could combine the best of the above approaches and have say thirty (as a minimum, not optimum) subjects per group who were these consistent advanced trainers at a steady state in their training, but this simply isn’t practical on an ongoing basis, if ever, in the university environment and would be incredibly difficult anywhere. Even a sports team, say a football team, which at first glance might seem such an environment, doesn’t solve the problem: aside from the fact that the coach might well not want to do it, the athletes are ordinarily not in a steady state because of the seasonal nature of most sports.