Four Step Data Analysis, Different Hypothesis Tests

Generally, sample statistics include:

quantitative data:

  • mean, variance, standard deviation, median, quartiles, range, ...
  • mean difference, ratio of medians

discrete data, failure-time data:

  • proportion, percentage (depending on time)
  • difference between proportions, numbers needed to treat, odds ratios, relative risks, hazard ratios, ...
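As a rough illustration of the discrete-data statistics listed above, the sketch below computes them from a two-group table of counts. The class `TwoByTwo`, the function `effect_measures`, and the counts are hypothetical and made up for illustration; the formulas are the standard textbook definitions, and the code assumes no proportion equals 0 or 1 and that the two proportions differ.

```python
from dataclasses import dataclass


@dataclass
class TwoByTwo:
    """Counts from a hypothetical two-group trial (illustrative only)."""
    events_a: int  # patients with the event under treatment A
    total_a: int   # all patients under treatment A
    events_b: int  # patients with the event under treatment B
    total_b: int   # all patients under treatment B


def effect_measures(t: TwoByTwo) -> dict:
    """Return common summary statistics for two proportions."""
    p_a = t.events_a / t.total_a
    p_b = t.events_b / t.total_b
    risk_difference = p_a - p_b
    return {
        "proportion_a": p_a,
        "proportion_b": p_b,
        "risk_difference": risk_difference,
        "relative_risk": p_a / p_b,
        "odds_ratio": (p_a / (1 - p_a)) / (p_b / (1 - p_b)),
        # Undefined when the two proportions are equal.
        "number_needed_to_treat": 1 / abs(risk_difference),
    }


# Made-up counts: 15 of 100 events under A versus 30 of 100 under B.
result = effect_measures(TwoByTwo(15, 100, 30, 100))
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

Hazard ratios are left out here because they require the follow-up time of each patient, not just the counts.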

The current chapter focuses on discrete data and failure-time data. Quantitative data analyses have been covered in Chap. 6. Discrete data can answer many questions in trials, like those given underneath.

How large is the response rate?

How many patients have side-effects?

How many patients were alive (after 5 years)?

Is the response rate under treatment A larger than under B?

Are there more “side-effects” after than before treatment?

What is the optimal dose?

Study design (among others):

  • trials, cohorts, case-control studies
  • cross-sectional vs follow-up measurements

Data type (among others):

  • quantities, binary, categorical, ordinal variables
  • censored variables

The required data analysis depends on (1) the study design and (2) the type of data. Four steps are often said to constitute a proper data analysis:

step 1 summarize the data

  • calculate statistics

step 2 provide the reliability of the statistics

  • standard error (se), confidence interval (ci)

step 3 hypothesis testing

  • p-values, significance level

step 4 regression analysis

  • (causal) association, confounder correction, prediction, explained variation, ...
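To make the first three steps concrete, here is a minimal sketch for a single proportion, such as a response rate. The function name `analyze_proportion`, the counts, and the null value of 0.5 are assumptions chosen for illustration. It uses the normal approximation (Wald standard error, Z-test), which is only one of several possible choices and is not necessarily the method used elsewhere in this chapter.

```python
import math


def analyze_proportion(events: int, n: int, p0: float) -> dict:
    """Steps 1 to 3 for one proportion, using normal approximations."""
    # Step 1: summarize the data with a sample statistic.
    p_hat = events / n

    # Step 2: reliability of the statistic (Wald standard error, 95% CI).
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    ci = (p_hat - 1.96 * se, p_hat + 1.96 * se)

    # Step 3: Z-test of H0 "true proportion equals p0",
    # with the standard error computed under H0.
    se_null = math.sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / se_null
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value

    return {"proportion": p_hat, "se": se, "ci": ci, "z": z, "p_value": p_value}


# Made-up data: 60 responders among 100 patients, tested against 50%.
summary = analyze_proportion(events=60, n=100, p0=0.5)
print(summary)
```

With small samples or proportions close to 0 or 1 the normal approximation becomes unreliable, and an exact method would be the safer choice.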

The fourth step, regression, will be the subject of Chap. 7 and will not be addressed here. The general situation with randomized controlled trials is that they have representative random samples from a target population. The ultimate conclusion of a trial is relevant to the sample, but even more so to the target population of the trial, as explained in the graph underneath.

This somewhat peculiar situation of trials explains many of the analysis steps taken.

                  | 1 sample, 1 measurement           | 2 samples, 1 measurement            | >2 samples, 1 measurement
quantitative data | one-sample t-test / Wilcoxon test | unpaired t-test / Mann-Whitney test | ANOVA / Kruskal-Wallis test
discrete data     | Z- or chi-squared test            | Z- or chi-squared test              | chi-squared test
failure-time data | -                                 | logrank test                        | logrank test

                  | 1 sample, 2 measurements      | 1 sample, >2 measurements | >1 samples, >1 measurement
quantitative data | paired t-test / Wilcoxon test | mm ANOVA / Friedman test  | -
discrete data     | McNemar test                  | Cochran’s Q test          | r.e. logistic regression
failure-time data | stratified logrank test       | stratified logrank test   | frailty models

Above, an overview is given of relevant tests for data analysis, including those for discrete data (ANOVA = analysis of variance, r.e. = random effects). Many hypothesis tests are possible, and each has its own place in statistical data analysis. In this chapter the most relevant procedures will be explained with examples from practice.
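As a hedged illustration of two entries from this overview, the sketch below implements the chi-squared test for two samples with one binary measurement, and the McNemar test for one sample with two binary measurements. The function names and the counts are hypothetical. Both tests are written without continuity correction, and the p-value uses the identity that a chi-squared statistic with 1 degree of freedom is the square of a standard normal variable.

```python
import math


def chi2_p_value_1df(chi2: float) -> float:
    """Upper-tail p-value of a chi-squared statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))


def chi_squared_2x2(a: int, b: int, c: int, d: int) -> tuple:
    """Pearson chi-squared test for the table [[a, b], [c, d]].

    Rows are the two groups, columns are event yes / event no.
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    return chi2, chi2_p_value_1df(chi2)


def mcnemar(b: int, c: int) -> tuple:
    """McNemar test from the two discordant pair counts b and c."""
    chi2 = (b - c) ** 2 / (b + c)
    return chi2, chi2_p_value_1df(chi2)


# Made-up unpaired data: 15/100 events under A versus 30/100 under B.
chi2_unpaired, p_unpaired = chi_squared_2x2(15, 85, 30, 70)
print(f"chi-squared: {chi2_unpaired:.3f}, p = {p_unpaired:.4f}")

# Made-up paired data: 5 patients changed yes->no, 15 changed no->yes.
chi2_paired, p_paired = mcnemar(5, 15)
print(f"McNemar: {chi2_paired:.3f}, p = {p_paired:.4f}")
```

The McNemar test uses only the discordant pairs, because patients with the same outcome on both occasions carry no information about a change between the two measurements.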
