# Modelling the hierarchical structure in datasets with very small clusters: a simulation study to explore the effect of the proportion of clusters when the outcome is continuous

Sauzet O, Wright KC, Marston L, Brocklehurst P, Peacock JL (2013) *Statistics In Medicine* 32(8): 1429-1438.


*Journal Article* | *Original Article* | *Published* | *English*


Abstract

In cluster-randomised trials, the problem of non-independence within clusters is well known, and appropriate statistical analysis is documented. Clusters typically seen in cluster trials are large in size and few in number, whereas datasets of preterm infants incorporate clusters of size two (twins), size three (triplets) and so on, with the majority of infants being in clusters of size one. In such situations, it is unclear whether adjustment for clustering is needed or even possible. In this paper, we compared analyses allowing for clustering (linear mixed model) with analyses ignoring clustering (linear regression). Through simulations based on two real datasets, we explored estimation bias in predictors of a continuous outcome in datasets of different sizes typical of preterm samples, with varying percentages of twins. Overall, the biases for estimated coefficients were similar for linear regression and mixed models, but the standard errors were consistently much less well estimated when using a linear model. Non-convergence was rare but was observed in approximately 5% of mixed models for samples below 200 and a percentage of twins of 2% or less. We conclude that in datasets with small clusters, mixed models should be the method of choice irrespective of the percentage of twins. If the mixed model does not converge, a linear regression can be fitted, but the standard errors will be underestimated, and so type I error may be inflated. Copyright (c) 2012 John Wiley & Sons, Ltd.
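The abstract's point about underestimated standard errors can be illustrated with a small simulation. The following is a minimal sketch only, not the authors' simulation design: the sample size, the 50% share of twin infants, the intraclass correlation of 0.8, and the cluster-level predictor are all assumptions chosen to make the effect easy to see, and the function names are hypothetical. It fits ordinary linear regression to data with a shared random intercept within twin pairs and compares the average model-based standard error with the empirical spread of the slope estimates across replications.

```python
import random
import statistics


def simulate_once(rng, n_infants=200, prop_twins=0.5, icc=0.8, beta=1.0):
    """Simulate one dataset; return (OLS slope, naive OLS standard error)."""
    sd_between = icc ** 0.5          # SD of the intercept shared by a twin pair
    sd_within = (1.0 - icc) ** 0.5   # residual SD, so total variance is 1
    n_twin_pairs = int(n_infants * prop_twins / 2)
    n_singletons = n_infants - 2 * n_twin_pairs
    sizes = [2] * n_twin_pairs + [1] * n_singletons

    xs, ys = [], []
    for size in sizes:
        x = rng.gauss(0.0, 1.0)         # assumed cluster-level predictor
        u = rng.gauss(0.0, sd_between)  # random intercept for this cluster
        for _ in range(size):
            xs.append(x)
            ys.append(beta * x + u + rng.gauss(0.0, sd_within))

    # Ordinary least squares for a single predictor, ignoring clustering.
    n = len(xs)
    x_bar, y_bar = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    rss = sum((y - intercept - slope * x) ** 2 for x, y in zip(xs, ys))
    naive_se = (rss / (n - 2) / sxx) ** 0.5
    return slope, naive_se


def run_simulation(n_reps=500, seed=1, **kwargs):
    """Repeat the simulation and summarise slope estimates and naive SEs."""
    rng = random.Random(seed)
    results = [simulate_once(rng, **kwargs) for _ in range(n_reps)]
    slopes = [slope for slope, _ in results]
    naive_ses = [se for _, se in results]
    return {
        "mean_slope": statistics.fmean(slopes),       # checks for bias
        "empirical_sd": statistics.stdev(slopes),     # true sampling spread
        "mean_naive_se": statistics.fmean(naive_ses), # what OLS reports
    }


result = run_simulation()
print(result)
```

Under these assumptions the mean slope stays close to the true value while the average naive standard error falls below the empirical standard deviation of the estimates, which is the pattern the abstract describes for linear regression. A mixed model fit (for example with a random intercept per mother) is deliberately left out of the sketch to keep it dependency-free.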


### Cite this

Sauzet, O., Wright, K. C., Marston, L., Brocklehurst, P., & Peacock, J. L. (2013). Modelling the hierarchical structure in datasets with very small clusters: a simulation study to explore the effect of the proportion of clusters when the outcome is continuous. *Statistics In Medicine*, *32*(8), 1429-1438. doi:10.1002/sim.5638

### 10 Citations in Europe PMC

Data provided by Europe PubMed Central.

Bruyndonckx R, Hens N, Aerts M. Simulation-based evaluation of the linear-mixed model in the presence of an increasing proportion of singletons. *Biom J* 60(1), 2018. PMID: 29067702

Dell-Kuster S, Droeser RA, Schäfer J, Gloy V, Ewald H, Schandelmaier S, Hemkens LG, Bucher HC, Young J, Rosenthal R. Systematic review and simulation study of ignoring clustered data in surgical trials. *Br J Surg* 105(3), 2018. PMID: 29405280

Sauzet O, Peacock JL. Binomial outcomes in dataset with some clusters of size two: can the dependence of twins be accounted for? A simulation study comparing the reliability of statistical methods based on a dataset of preterm infants. *BMC Med Res Methodol* 17(1), 2017. PMID: 28728549

Plomgaard AM, van Oeveren W, Petersen TH, Alderliesten T, Austin T, van Bel F, Benders M, Claris O, Dempsey E, Franz A, Fumagalli M, Gluud C, Hagmann C, Hyttel-Sorensen S, Lemmers P, Pellicer A, Pichler G, Winkel P, Greisen G. The SafeBoosC II randomized trial: treatment guided by near-infrared spectroscopy reduces cerebral hypoxia without changing early biomarkers of brain injury. *Pediatr Res* 79(4), 2016. PMID: 26679155

Sauzet O, Breckenkamp J, Borde T, Brenne S, David M, Razum O, Peacock JL. A distributional approach to obtain adjusted comparisons of proportions of a population at risk. *Emerg Themes Epidemiol* 13, 2016. PMID: 27279891

Yelland LN, Sullivan TR, Makrides M. Accounting for multiple births in randomised trials: a systematic review. *Arch Dis Child Fetal Neonatal Ed* 100(2), 2015. PMID: 25389142

Turner MA. Clinical trials of medicines in neonates: the influence of ethical and practical issues on design and conduct. *Br J Clin Pharmacol* 79(3), 2015. PMID: 25041601

Yelland LN, Sullivan TR, Pavlou M, Seaman SR. Analysis of Randomised Trials Including Multiple Births When Birth Size Is Informative. *Paediatr Perinat Epidemiol* 29(6), 2015. PMID: 26332368

Zivanovic S, Peacock J, Alcazar-Paris M, Lo JW, Lunt A, Marlow N, Calvert S, Greenough A. Late outcomes of a randomized trial of high-frequency oscillation in neonates. *N Engl J Med* 370(12), 2014. PMID: 24645944

Hyttel-Sorensen S, Austin T, van Bel F, Benders M, Claris O, Dempsey E, Fumagalli M, Greisen G, Grevstad B, Hagmann C, Hellström-Westas L, Lemmers P, Lindschou J, Naulaers G, van Oeveren W, Pellicer A, Pichler G, Roll C, Skoog M, Winkel P, Wolf M, Gluud C. A phase II randomized clinical trial on cerebral near-infrared spectroscopy plus a treatment guideline versus treatment as usual for extremely preterm infants during the first three days of life (SafeBoosC): study protocol for a randomized controlled trial. *Trials* 14, 2013. PMID: 23782447


### Sources

PMID: 23027676
