"He also notes that so far, all the subjects come from industrial nations, and thus eat similar foods. 'This is a shortcoming,' he said. 'We don’t have remote villages.'" So the investigator would still like to know a lot more about how differences in diet might influence microbiomes.
Also, the article noted, "The scientists then searched for patterns. 'We didn’t have any hypothesis,' Dr. Bork said. 'Anything that came out would be new.'" This, of course, could be referred to as data-mining,
(See Warning Sign D7: Lack of a Specific Hypothesis, or Overzealous Data Mining)
and what I think this scientist would himself acknowledge is that before this research can develop, scientists need to form testable hypotheses about microbiomes and then put those hypotheses to the test. Typological thinking is a frequent characteristic of early stages of developing theories in biology, and it may be that eventually a more nuanced form of population thinking will emerge in the study of human-hosted microbiomes.
I would say that 3 classes is few enough that the result should be fairly resistant to data mining. It isn't as obvious to me from Figures 2b and 2c that there are 3 classes, but 2a and 2d are pretty clear. "All science is either physics or stamp collecting", but sometimes you have to have enough data points to start looking for patterns, and this looks like a pretty reasonable way to go about it.
I don't think you can draw any sort of causal patterns yet, but the fact that there's a strong correlation between a 2-factor model based on the data and BMI leads to some testable hypotheses.
I think it's perfectly valid to say "I don't know anything about this area". Would you prefer him to say "I will arbitrarily assume (because I have no information whatsoever) a hypothesis that every person has a unique set of bacteria and see if that is right"? That might actually lead to an experiment that misses what he found: you would probably take similar people. If you hypothesize that people have similar sets of bacteria, you'd want to pick wildly different people. Sometimes we don't even know enough to ask a useful question, but it is still scientific to examine the area with repeatable experiments and see what you find.
On the other hand, they had patients with Crohn's disease in the sample. It's hard not to see this as a decision taken based on the hypothesis that people affected by Crohn's disease have a different bacterial flora.
Or you can data-mine for some ideas, and then use those ideas as hypotheses for testing. There's nothing wrong with judicious use of data-mining as a tool in research. In many ways, the patterns that emerge out of data mining can become the 'huh, that's weird' moments of the future, especially as we start dealing with observations beyond a single human sense's scale.
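For what it's worth, here is a rough Python sketch of that mine-then-test workflow: hunt for a pattern in one half of the data, then check it on the half you never looked at. Everything in it is made up for illustration (the function names, the toy `rows`, the feature keys `a` and `b`); it is not what the study's authors did, just one common way to keep data mining honest.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either series has no variance."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def split_sample(rows, seed=0):
    """Shuffle once, then cut into an exploration half and a held-out half."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    mid = len(shuffled) // 2
    return shuffled[:mid], shuffled[mid:]


def strongest_feature(rows, outcome, features):
    """Data-mining step: pick the feature most correlated with the outcome."""
    ys = [r[outcome] for r in rows]
    return max(features, key=lambda f: abs(pearson([r[f] for r in rows], ys)))


def confirm(rows, outcome, feature):
    """Testing step: measure the mined feature on rows it was not picked from."""
    return pearson([r[feature] for r in rows], [r[outcome] for r in rows])


# Toy data (hypothetical): "a" tracks the outcome "y", "b" does not.
rows = [{"y": i, "a": 2 * i + 1, "b": (7 * i) % 5} for i in range(20)]
explore, holdout = split_sample(rows)
candidate = strongest_feature(explore, "y", ["a", "b"])
print(candidate, confirm(holdout, "y", candidate))
```

The point of the split is that the held-out half plays no part in choosing the candidate, so a pattern that only showed up by chance during mining should mostly fall apart when you test it there.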
http://news.ycombinator.com/item?id=2465085
http://norvig.com/experiment-design.html