Cryptic multiple hypotheses testing in linear models: overestimated effect sizes and the winner's curse

Forstmeier W, Schielzeth H (2010)
Behavioral Ecology and Sociobiology 65(1): 47-55.

Journal Article | Published | English

Author
Forstmeier, Wolfgang; Schielzeth, Holger
Abstract
Fitting generalised linear models (GLMs) with more than one predictor has become the standard method of analysis in evolutionary and behavioural research. Often, GLMs are used for exploratory data analysis, where one starts with a complex full model including interaction terms and then simplifies by removing non-significant terms. While this approach can be useful, it is problematic if significant effects are interpreted as if they arose from a single a priori hypothesis test. This is because model selection involves cryptic multiple hypothesis testing, a fact that has only rarely been acknowledged or quantified. We show that the probability of finding at least one ‘significant’ effect is high, even if all null hypotheses are true (e.g. 40% when starting with four predictors and their two-way interactions). This probability is close to theoretical expectations when the sample size (N) is large relative to the number of predictors including interactions (k). In contrast, type I error rates strongly exceed even those expectations when model simplification is applied to models that are over-fitted before simplification (low N/k ratio). The increase in false-positive results arises primarily from an overestimation of effect sizes among significant predictors, leading to upward-biased effect sizes that often cannot be reproduced in follow-up studies (‘the winner's curse’). Despite having their own problems, full model tests and P value adjustments can be used as a guide to how frequently type I errors arise by sampling variation alone. We favour the presentation of full models, since they best reflect the range of predictors investigated and ensure a balanced representation also of non-significant results.
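The 40% figure quoted in the abstract can be checked with a short calculation, and the winner's curse can be illustrated with a toy simulation. The sketch below is not the authors' simulation code. It assumes that the "theoretical expectation" is the familywise error rate for k independent tests at alpha = 0.05, uses a Bonferroni threshold as one example of a P value adjustment, and replaces model simplification with a simple z-test on a sample mean (known SD of 1). All function names and parameter values are hypothetical and chosen only for illustration.

```python
import random
from math import comb, sqrt
from statistics import fmean


def n_terms(n_predictors: int) -> int:
    """Count main effects plus all two-way interactions."""
    return n_predictors + comb(n_predictors, 2)


def familywise_error_rate(k: int, alpha: float = 0.05) -> float:
    """P(at least one of k independent tests is significant), all nulls true."""
    return 1.0 - (1.0 - alpha) ** k


k = n_terms(4)                    # 4 main effects + 6 interactions = 10
fwer = familywise_error_rate(k)   # roughly 0.40 under the independence assumption
bonferroni_alpha = 0.05 / k       # per-test threshold keeping the family rate <= 0.05


def simulate_estimates(true_effect, n, reps, seed=1):
    """Return (all estimates, significant-only estimates) from repeated studies."""
    rng = random.Random(seed)
    se = 1.0 / sqrt(n)            # standard error of a mean when the SD is 1
    estimates, significant = [], []
    for _ in range(reps):
        est = fmean(rng.gauss(true_effect, 1.0) for _ in range(n))
        estimates.append(est)
        if abs(est) / se > 1.96:  # two-sided z-test at alpha = 0.05
            significant.append(est)
    return estimates, significant


# Hypothetical scenario: a small true effect studied with small samples.
all_est, sig_est = simulate_estimates(true_effect=0.2, n=25, reps=2000)
print(f"terms: {k}, P(at least one false positive): {fwer:.3f}")
print(f"mean of all estimates: {fmean(all_est):.3f}")
print(f"mean of significant estimates: {fmean(sig_est):.3f}")
```

In this toy setting the mean of all estimates stays close to the true effect, while the mean of the estimates that passed the significance filter lies well above it. This is the selection effect the abstract calls the winner's curse.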


Sources

PMID: 21297852