Large scale health surveys offer an opportunity to study associations between risk factors and outcomes in a population-based setting.
Their complicated multistage sampling designs, with differential probabilities of sampling individuals, can make their analysis far from straightforward.
Classical "design-based" methods that yield approximately unbiased estimators of associations and standard errors can be highly inefficient.
Model-based methods require assumptions which, if wrong, can lead to biased estimators of associations and standard errors.
This paper examines the implications of utilizing the sample clustering and sample weights in the analysis of survey data.
The approach is to estimate the inefficiency incurred by using these aspects of the sampling design in a design-based analysis when it was in fact unnecessary to do so.
If the inefficiency is small, then that aspect of the design is used in a design-based fashion.
Otherwise, additional modelling assumptions are incorporated into the analysis.
Focusing attention on risk factor-outcome associations in large health surveys allows specific recommendations to be given for practitioners.
The issues are demonstrated with real survey data including two controversial analyses previously published in medical references.
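
The decision rule described above (use a design feature in a design-based fashion only when the estimated inefficiency is small) can be illustrated with a minimal sketch for the sample weights. The sketch does not reproduce the paper's own inefficiency estimator; as a stand-in it uses Kish's well-known approximate design effect for unequal weighting, n * sum(w^2) / (sum w)^2. The function names and the 0.2 threshold are hypothetical choices made for illustration only.

```python
def kish_design_effect(weights):
    """Kish's approximate design effect due to unequal weighting.

    deff = n * sum(w^2) / (sum w)^2; equals 1 when all weights are equal.
    """
    n = len(weights)
    total = sum(weights)
    total_sq = sum(w * w for w in weights)
    return n * total_sq / (total * total)


def weighting_inefficiency(weights):
    """Approximate fraction of effective sample size lost by weighting.

    Defined here as 1 - 1/deff (an assumption of this sketch, not the
    paper's estimator).
    """
    return 1.0 - 1.0 / kish_design_effect(weights)


def use_weights_design_based(weights, threshold=0.2):
    """Hypothetical decision rule: keep the weighted, design-based analysis
    when the estimated inefficiency is at most `threshold`; otherwise
    consider adding modelling assumptions instead."""
    return weighting_inefficiency(weights) <= threshold


if __name__ == "__main__":
    sample_weights = [1.0, 1.0, 1.0, 3.0]  # made-up weights, for illustration
    print(kish_design_effect(sample_weights))
    print(weighting_inefficiency(sample_weights))
    print(use_weights_design_based(sample_weights))
```

With equal weights the design effect is 1 and the inefficiency is 0, so the weighted analysis costs nothing; the more variable the weights, the larger the loss and the stronger the case for a model-based alternative.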
Pascal keywords (French): Plan échantillonnage, Pondération, Estimation statistique, Causalité, Méthodologie, Enquête, Analyse statistique, Agrégation
Pascal keywords (English): Sampling design, Weighting, Statistical estimation, Causality, Methodology, Inquiry, Statistical analysis, Clustering
Record produced by:
Inist-CNRS - Institut de l'Information Scientifique et Technique
Call number: 95-0274873
Inist code: 002B30A01A1. Created: 01/03/1996.