Cryptic multiple hypotheses testing in linear models: overestimated effect sizes and the winner's curse

Forstmeier W, Schielzeth H (2010)
Behavioral Ecology and Sociobiology 65(1): 47-55.

Journal Article | Original Article | Published | English

Abstract
Fitting generalised linear models (GLMs) with more than one predictor has become the standard method of analysis in evolutionary and behavioural research. Often, GLMs are used for exploratory data analysis, where one starts with a complex full model including interaction terms and then simplifies by removing non-significant terms. While this approach can be useful, it is problematic if significant effects are interpreted as if they arose from a single a priori hypothesis test. This is because model selection involves cryptic multiple hypothesis testing, a fact that has only rarely been acknowledged or quantified. We show that the probability of finding at least one ‘significant’ effect is high, even if all null hypotheses are true (e.g. 40% when starting with four predictors and their two-way interactions). This probability is close to theoretical expectations when the sample size (N) is large relative to the number of predictors including interactions (k). In contrast, type I error rates strongly exceed even those expectations when model simplification is applied to models that are over-fitted before simplification (low N/k ratio). The increase in false-positive results arises primarily from an overestimation of effect sizes among significant predictors, leading to upward-biased effect sizes that often cannot be reproduced in follow-up studies (‘the winner's curse’). Despite having their own problems, full model tests and P value adjustments can be used as a guide to how frequently type I errors arise by sampling variation alone. We favour the presentation of full models, since they best reflect the range of predictors investigated and ensure a balanced representation also of non-significant results.
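The 40% figure quoted in the abstract can be checked with a short calculation. The sketch below is not the authors' simulation code. It assumes the tests are independent and that each uses a significance threshold of 0.05, and the function names are made up for illustration. Four predictors plus their six two-way interactions give ten tested terms, and the chance that at least one of them falls below the threshold by chance alone is 1 - 0.95^10.

```python
import random
from math import comb


def n_terms(n_predictors: int) -> int:
    """Number of main effects plus all two-way interactions."""
    return n_predictors + comb(n_predictors, 2)


def familywise_error_rate(m: int, alpha: float = 0.05) -> float:
    """P(at least one p < alpha) across m tests when all nulls are true.

    Assumes the m tests are independent (an illustrative simplification).
    """
    return 1.0 - (1.0 - alpha) ** m


def simulated_rate(m: int, alpha: float = 0.05,
                   n_runs: int = 20_000, seed: int = 1) -> float:
    """Monte Carlo check: under a true null, each p value is uniform on [0, 1]."""
    rng = random.Random(seed)
    hits = sum(
        any(rng.random() < alpha for _ in range(m))
        for _ in range(n_runs)
    )
    return hits / n_runs


if __name__ == "__main__":
    m = n_terms(4)                              # 4 main effects + 6 interactions
    print(m)                                    # 10
    print(round(familywise_error_rate(m), 3))   # 0.401
    print(simulated_rate(m))                    # close to 0.40
```

This only reproduces the theoretical expectation for large N/k. The additional inflation that the abstract reports for over-fitted models (low N/k) is not captured by this independence-based sketch.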

Cite this

Forstmeier W, Schielzeth H. Cryptic multiple hypotheses testing in linear models: overestimated effect sizes and the winner's curse. Behavioral Ecology and Sociobiology. 2010;65(1):47-55.
Forstmeier, W., & Schielzeth, H. (2010). Cryptic multiple hypotheses testing in linear models: overestimated effect sizes and the winner's curse. Behavioral Ecology and Sociobiology, 65(1), 47-55. doi:10.1007/s00265-010-1038-5
Forstmeier, W., and Schielzeth, H. (2010). Cryptic multiple hypotheses testing in linear models: overestimated effect sizes and the winner's curse. Behavioral Ecology and Sociobiology 65, 47-55.
Forstmeier, W., & Schielzeth, H., 2010. Cryptic multiple hypotheses testing in linear models: overestimated effect sizes and the winner's curse. Behavioral Ecology and Sociobiology, 65(1), p 47-55.
W. Forstmeier and H. Schielzeth, “Cryptic multiple hypotheses testing in linear models: overestimated effect sizes and the winner's curse”, Behavioral Ecology and Sociobiology, vol. 65, 2010, pp. 47-55.
Forstmeier, W., Schielzeth, H.: Cryptic multiple hypotheses testing in linear models: overestimated effect sizes and the winner's curse. Behavioral Ecology and Sociobiology. 65, 47-55 (2010).
Forstmeier, Wolfgang, and Schielzeth, Holger. “Cryptic multiple hypotheses testing in linear models: overestimated effect sizes and the winner's curse”. Behavioral Ecology and Sociobiology 65.1 (2010): 47-55.

123 Citations in Europe PMC

Data provided by Europe PubMed Central.

Antioxidant allocation modulates sperm quality across changing social environments.
Rojas Mora A, Meniri M, Gning O, Glauser G, Vallat A, Helfenstein F., PLoS ONE 12(5), 2017
PMID: 28472052
Prelinguistic human infants and great apes show different communicative strategies in a triadic request situation.
Gretscher H, Tempelmann S, Haun DB, Liebal K, Kaminski J., PLoS ONE 12(4), 2017
PMID: 28384300
The socio-economic drivers of bushmeat consumption during the West African Ebola crisis.
Ordaz-Nemeth I, Arandjelovic M, Boesch L, Gatiso T, Grimes T, Kuehl HS, Lormie M, Stephens C, Tweh C, Junker J., PLoS Negl Trop Dis 11(3), 2017
PMID: 28282378
Testing the phenotype-linked fertility hypothesis in the presence and absence of inbreeding.
Forstmeier W, Ihle M, Opatova P, Martin K, Knief U, Albrechtova J, Albrecht T, Kempenaers B., J. Evol. Biol. 30(5), 2017
PMID: 28278362
Perception of emotional valence in horse whinnies.
Briefer EF, Mandel R, Maigrot AL, Briefer Freymond S, Bachmann I, Hillmann E., Front. Zool. 14, 2017
PMID: 28203263
Acoustic correlates of body size and individual identity in banded penguins.
Favaro L, Gamba M, Gili C, Pessani D., PLoS ONE 12(2), 2017
PMID: 28199318
Behavioural Type Affects Space Use in a Wild Population of Crows (Corvus corone).
Deventer SA, Uhl F, Bugnyar T, Miller R, Fitch WT, Schiestl M, Ringler M, Schwab C., Ethology 122(11), 2016
PMID: 27840464
Sex Differences in Age-Related Decline of Urinary Insulin-Like Growth Factor-Binding Protein-3 Levels in Adult Bonobos and Chimpanzees.
Behringer V, Wudy SA, Blum WF, Stevens JM, Remer T, Boesch C, Hohmann G., Front Endocrinol (Lausanne) 7, 2016
PMID: 27602019
Does the Structure of Female Rhesus Macaque Coo Calls Reflect Relatedness and/or Familiarity?
Pfefferle D, Hammerschmidt K, Mundry R, Ruiz-Lambides AV, Fischer J, Widdig A., PLoS ONE 11(8), 2016
PMID: 27579491
Dynamic egg color mimicry.
Hanley D, Sulc M, Brennan PL, Hauber ME, Grim T, Honza M., Ecol Evol 6(12), 2016
PMID: 27516874
Bonobo nest site selection and the importance of predictor scales in primate ecology.
Serckx A, Huynen MC, Beudels-Jamar RC, Vimond M, Bogaert J, Kuhl HS., Am. J. Primatol. 78(12), 2016
PMID: 27463835
Travel fosters tool use in wild chimpanzees.
Gruber T, Zuberbuhler K, Neumann C., Elife 5, 2016
PMID: 27431611
Within arm's reach: Measuring forearm length to assess growth patterns in captive bonobos and chimpanzees.
Behringer V, Stevens JM, Kivell TL, Neufuss J, Boesch C, Hohmann G., Am. J. Phys. Anthropol. 161(1), 2016
PMID: 27143225
Endocrine assessment of ovarian cycle activity in wild female mountain gorillas (Gorilla beringei beringei).
Habumuremyi S, Stephens C, Fawcett KA, Deschner T, Robbins MM., Physiol. Behav. 157, 2016
PMID: 26875514


Sources

PMID: 21297852