Research Design and Data Analysis for Experimental Studies


Data analysis is the application of one or more statistical techniques to a set of collected data. In designed experiments, a treatment is applied to experimental units and the responses are observed. This course aims to develop participants into professional data analysts. It is designed for participants with little or no experience using statistical software, although some basic knowledge of statistics is required. During the course, the instructors will use Stata, Excel, SAS and SPSS interchangeably to demonstrate the relevant techniques in each topic.


Duration: 10 Days


Module 1

The Big Picture

  • The importance of careful experimental design
  • What you should learn here

Variable Classification

  • What makes a “good” variable?
  • Classification by role
  • Classification by statistical type
  • Tricky cases

Overview of Statistical Analysis

Inferential Statistics

  • Covariance and Correlation
  • From Descriptions to Inferences
  • The Role of Probability Theory
  • The Null and Alternative Hypothesis
  • The Sampling Distribution and Statistical Decision Making
  • Type I Errors, Type II Errors, and Statistical Power
  • Effect Size
  • Meta-analysis
  • Parametric Versus Nonparametric Analyses
  • Selecting the Appropriate Analysis: Using a Decision Tree

Module 2

Review of Probability

  • Definition(s) of probability
  • Probability mass functions and density functions
  • Probability calculations
  • Populations and samples
  • Parameters describing distributions
  • Central tendency: mean and median
  • Spread: variance and standard deviation
  • Skewness and kurtosis
  • Multivariate distributions: joint, conditional, and marginal
  • Covariance and correlation
  • The Importance of Variability
  • Tables and Graphs
  • Thinking Critically About Everyday Information
  • Central limit theorem

Common Distributions

  • Binomial distribution
  • Multinomial distribution
  • Poisson distribution
  • Gaussian distribution
  • t-distribution
  • Chi-square distribution
  • F-distribution

Module 3

Exploratory Data Analysis

  • Typical data format and the types of Exploratory Data Analysis
  • Univariate non-graphical Exploratory Data Analysis
    • Categorical data
    • Characteristics of quantitative data
    • Central tendency
    • Spread
    • Skewness and kurtosis

Univariate Graphical Exploratory Data Analysis

  • Histograms
  • Stem-and-leaf plots
  • Boxplots
  • Quantile-normal plots

Multivariate Non-graphical Exploratory Data Analysis

  • Cross-tabulation
  • Correlation for categorical data
  • Univariate statistics by category
  • Correlation and covariance
  • Covariance and correlation matrices

Multivariate Graphical Exploratory Data Analysis

  • Univariate graphs by category
  • Scatterplots

A Note on Degrees of Freedom

Modules 4 and 5

Learning Stata, Excel, SAS and SPSS: Data and Exploratory Data Analysis

  • Overview of software
  • Starting the programs
  • Typing in data
  • Loading data
  • Creating new variables
    • Recoding
    • Automatic recoding
    • Visual binning
  • Non-graphical Exploratory Data Analysis
  • Graphical Exploratory Data Analysis
    • Overview of graphs in each program
    • Histogram
    • Boxplot
    • Scatterplot
  • SPSS convenience item: Explore

Module 6

The t-test and Basic Inference Principles

  • How classical statistical inference works
    • The steps of statistical analysis
    • Model and parameter definition
    • Null and alternative hypotheses
    • Choosing a statistic
    • Computing the null sampling distribution
    • Finding the p-value
    • Confidence intervals
    • Assumption checking
    • Subject matter conclusions
    • Power
  • t-test in Stata, Excel, SAS and SPSS
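
The steps listed above (choose a statistic, compute its null sampling distribution, find the p-value) can be sketched for a two-sample t-test. This is only an illustration in Python, which is not one of the course packages; the data are made up, the equal-variance (pooled) form is an assumption, and the function names `pooled_t_test`, `t_density` and `two_sided_p` are hypothetical helpers. The p-value is obtained by numerically integrating the t density, so it is an approximation.

```python
import math
from statistics import mean, variance


def pooled_t_test(a, b):
    """Two-sample t statistic and degrees of freedom, assuming equal variances."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    # Pooled variance: weighted average of the two sample variances.
    sp2 = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    se = math.sqrt(sp2 * (1 / na + 1 / nb))
    return (mean(a) - mean(b)) / se, df


def t_density(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p(t, df, steps=2000):
    """Approximate two-sided p-value via Simpson's rule on [0, |t|] (steps even)."""
    upper = abs(t)
    h = upper / steps
    total = t_density(0, df) + t_density(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_density(i * h, df)
    area = total * h / 3  # P(0 < T < |t|) under the null hypothesis
    return 1 - 2 * area


# Hypothetical data for illustration only.
control = [1, 2, 3, 4, 5]
treated = [3, 4, 5, 6, 7]

t_stat, df = pooled_t_test(control, treated)
p_value = two_sided_p(t_stat, df)
print(f"t = {t_stat:.2f}, df = {df}, two-sided p = {p_value:.3f}")
```

In practice the statistical package reports the statistic, degrees of freedom and p-value directly; the sketch only makes the intermediate steps visible.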

One-way ANOVA

  • How one-way ANOVA works
    • The model and statistical hypotheses
    • The F statistic (ratio)
    • Null sampling distribution of the F statistic
    • Inference: hypothesis testing
    • Inference: confidence intervals
  • One-way ANOVA in Stata, Excel, SAS and SPSS
  • Reading the ANOVA table
  • Assumption checking
  • Results interpretation and reporting
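
As a rough illustration of the F statistic (ratio) covered above, the following Python sketch computes the between-group and within-group mean squares by hand. Python is not one of the course packages, the three groups are made-up data, and `one_way_f` is a hypothetical helper name.

```python
from statistics import mean


def one_way_f(groups):
    """Return (F, df_between, df_within) for a list of samples."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    k, n = len(groups), len(all_values)
    # Between-group sum of squares: variation of group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: variation of observations around their group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Hypothetical data: three treatment groups of three units each.
groups = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
f_ratio, df_b, df_w = one_way_f(groups)
print(f_ratio, df_b, df_w)  # F = 3.0 with 2 and 6 degrees of freedom
```

The same quantities appear in the ANOVA table produced by each package: the mean squares in the "between" and "within" rows, and their ratio in the F column.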

Threats to Your Experiment

  • Internal validity
  • Construct validity
  • External validity
  • Maintaining Type I error
  • Power
  • Missing explanatory variables
  • Practicality and cost
  • Threat summary

Module 7

Simple Linear Regression

  • The model behind linear regression
  • Statistical hypotheses
  • Simple linear regression example
  • Regression calculations
  • Interpreting regression coefficients
  • Residual checking
  • Robustness of simple linear regression
  • Additional interpretation of regression output
  • Using transformations
  • How to perform simple linear regression in Stata, Excel, SAS and SPSS
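
The regression calculations and residual checking listed above can be sketched as follows. This is a minimal least-squares illustration in Python (not one of the course packages) on made-up data; `fit_line` is a hypothetical helper name.

```python
from statistics import mean


def fit_line(x, y):
    """Least-squares intercept and slope for y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical data lying exactly on the line y = 1 + 2x.
x = [1, 2, 3, 4]
y = [3, 5, 7, 9]

intercept, slope = fit_line(x, y)
# Residuals are observed minus fitted values; they are all zero for this data.
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
print(intercept, slope, residuals)
```

With real data the residuals would not be zero, and plotting them against the fitted values is the usual first check of the model assumptions.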

Analysis of Covariance

  • Multiple regression
  • Interaction
  • Categorical variables in multiple regression
    • ANCOVA with no interaction
    • ANCOVA with interaction
  • Analysis of Covariance in Stata, Excel, SAS and SPSS

Two-Way ANOVA

  • Application areas of Two-Way ANOVA
  • Interpreting the two-way ANOVA results
  • Examples
  • More on profile plots, main effects and interactions
  • Two-Way ANOVA in Stata, Excel, SAS and SPSS

Module 8

Statistical Power

  • The concept
  • Improving power
  • Specific researchers’ lifetime experiences
  • Expected Mean Square
  • Power Calculations
  • Choosing effect sizes
  • Using the non-centrality parameter (n.c.p.) to calculate power
  • A power applet
    • Overview
    • One-way ANOVA
    • Two-way ANOVA without interaction
    • Two-way ANOVA with interaction
    • Linear regression

Module 9

Contrasts and Custom Hypotheses

  • Contrasts, in general
  • Planned comparisons
  • Unplanned or post-hoc contrasts
  • Contrasts and Custom Hypotheses in Stata, Excel, SAS and SPSS
    • Contrasts in one-way ANOVA
    • Contrasts in two-way ANOVA

Within-Subjects Designs

  • Overview of within-subjects designs
  • Multivariate distributions
  • Example and alternate approaches
  • Paired t-test
  • One-way Repeated Measures Analysis
  • Mixed between/within-subjects designs in Stata, Excel, SAS and SPSS

Mixed Models

  • Overview
  • Mixed model approach
  • Setting up a model in Stata, Excel, SAS and SPSS
  • Interpreting the results for Mixed models
  • Model selection for Mixed models
    • Penalized likelihood methods for model selection
    • Comparing models with individual p-values

Module 10

Categorical Outcomes

  • Contingency tables and chi-square analysis
    • Why ANOVA and regression don’t work
  • Testing independence in contingency tables
    • Contingency and independence
    • Contingency tables
    • Chi-square test of Independence
  • Logistic regression
    • Introduction
    • Example and EDA for logistic regression
    • Fitting a logistic regression model
    • Tests in a logistic regression model
    • Predictions in a logistic regression model
    • Logistic regression in Stata, Excel, SAS and SPSS
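
The chi-square test of independence listed above compares observed counts with the counts expected if the row and column variables were independent. The Python sketch below (Python is not one of the course packages) shows that calculation on a made-up 2x2 table; `chi_square_independence` is a hypothetical helper name.

```python
def chi_square_independence(table):
    """Return (chi-square statistic, degrees of freedom) for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence: row total * column total / n.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df


# Hypothetical counts: rows are treatment groups, columns are outcomes.
table = [[10, 20], [20, 10]]
chi2, df = chi_square_independence(table)
print(round(chi2, 3), df)
```

For this table every expected count is 15, so the statistic is 4 × 25/15 ≈ 6.667 on 1 degree of freedom, which exceeds the 5% critical value of 3.841.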


