While several software packages are available to perform statistical tests, applying the wrong test gives results that are useless and often misleading.

Let us take a detailed look at two parametric tests in statistics, the independent samples t-test and the paired samples t-test: when to apply them and how to carry out the calculations.

For ease of understanding, we use the example of comparing means (averages), although these tests can be extended to proportions.

Independent samples t-test (also called the two-sample t-test)

Context around this test

• The objective is to compare the averages of two groups.
• We have one dependent variable and one independent variable.
• The dependent variable should be continuous and the independent variable categorical.
• It is also called the two-sample t-test because two independent samples, one from each group, are used in the analysis.

Example

Compare the average time taken to produce a device by two different machines. Our objective is to select the machine that produces devices faster.

Here the observations are the times taken (in minutes) by each machine, measured independently.

Dependent variable – Time taken to produce the device

Independent variable – Machine type

Time taken is a continuous variable while machine type (Machine 1 or machine 2) is a categorical one.

When to use an independent samples t-test?

• When the problem is to compare the averages of two groups
• The distribution of each group is approximately Normal (required to apply a t-test) and the samples are drawn independently
• The sample size is small. When the sample size is large, a z-test can be applied instead, because by the Central Limit Theorem the sample means are approximately normally distributed

Impact of outliers on the independent samples t-test

• Outliers skew the average and give misleading results.
• Removing outliers is recommended for this test
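
To see how a single outlier can shift the average, here is a minimal Python sketch. The production times are made-up values for illustration only, not data from this article.

```python
from statistics import mean, median

# Hypothetical production times in minutes (illustrative values only)
times = [12.0, 11.5, 12.5, 12.0, 12.0]

# The same sample with one unusually slow run added
times_with_outlier = times + [30.0]

mean_clean = mean(times)
mean_outlier = mean(times_with_outlier)

# The median is shown only for contrast: it barely reacts to the outlier
median_clean = median(times)
median_outlier = median(times_with_outlier)

print(f"mean without outlier:   {mean_clean}")
print(f"mean with outlier:      {mean_outlier}")
print(f"median without outlier: {median_clean}")
print(f"median with outlier:    {median_outlier}")
```

One extra observation moves the mean from 12 to 15 minutes while the median stays at 12, which is why a t-test on means is sensitive to outliers.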

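One-tailed vs two-tailed t-test in the independent samples t-test

A one-tailed t-test is used when the hypothesis is directional, for example to check whether the average of one group is greater than that of the other. A two-tailed t-test is used when we only want to check whether the two averages differ, in either direction.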

Hypothesis and calculations

Null Hypothesis (Ho):

Avg. time taken by machine 1 <= Avg. time taken by machine 2

(Difference in time <=0)

Alternate Hypothesis (H1):

Avg. time taken by machine 1 > Avg. time taken by machine 2

(Difference in time > 0)

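Using the t value computed from the two samples, calculate the p-value and compare it with alpha (alpha = 1 - confidence level, for example 0.05 for a 95% confidence level). If the p-value is smaller than alpha, reject the null hypothesis.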
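
The calculation can be sketched in Python as follows. This is only an illustration under stated assumptions: the times are made-up values, the t value uses the pooled-variance form (which assumes both machines have equal variance; the article does not say which form it intends), and t_cdf is a hypothetical helper written here to avoid depending on a statistics library.

```python
import math
from statistics import mean, variance

def t_cdf(t, df, steps=2000):
    """Approximate P(T <= t) for Student's t with df degrees of freedom.

    Integrates the t density from 0 to |t| with Simpson's rule
    (steps must be even). Intended for small df only.
    """
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))

    def pdf(x):
        return coef * (1 + x * x / df) ** (-(df + 1) / 2)

    h = abs(t) / steps
    total = pdf(0.0) + pdf(abs(t))
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    area = total * h / 3
    return 0.5 + area if t >= 0 else 0.5 - area

# Hypothetical times in minutes (illustrative values only)
machine_1 = [15.0, 16.0, 14.0, 17.0, 18.0]
machine_2 = [12.0, 14.0, 13.0, 15.0, 11.0]

n1, n2 = len(machine_1), len(machine_2)
df = n1 + n2 - 2

# Pooled variance: assumes both machines have the same variance
pooled_var = ((n1 - 1) * variance(machine_1) + (n2 - 1) * variance(machine_2)) / df
std_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
t_stat = (mean(machine_1) - mean(machine_2)) / std_error

# One-tailed p-value for H1: mean of machine 1 > mean of machine 2
p_value = 1 - t_cdf(t_stat, df)

alpha = 0.05  # 1 - 0.95 confidence level
reject_null = p_value < alpha
print(f"t = {t_stat:.3f}, df = {df}, p = {p_value:.4f}, reject H0: {reject_null}")
```

For these made-up samples the t value works out to 3.0 with 8 degrees of freedom, and the p-value falls below alpha, so the null hypothesis would be rejected.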

Example of a two-tailed t-test

Null Hypothesis (Ho):

Avg. time taken by machine 1 = Avg. time taken by machine 2

(Difference in time = 0)

Alternate Hypothesis (H1):

Avg. time taken by machine 1 NOT EQUAL to Avg. time taken by machine 2

(Difference in time NOT EQUAL to 0)
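
The difference between the one-tailed and two-tailed decisions can be illustrated with a short Python sketch. The assumptions here: the times are made-up values, the t value uses the pooled-variance form, and the decision is made by comparing the t value with critical values from a standard t table instead of computing a p-value (the two approaches lead to the same decision). pooled_t is a hypothetical helper name.

```python
import math
from statistics import mean, variance

def pooled_t(sample_a, sample_b):
    """Return (t statistic, degrees of freedom) using the pooled-variance formula."""
    na, nb = len(sample_a), len(sample_b)
    df = na + nb - 2
    pooled_var = ((na - 1) * variance(sample_a) + (nb - 1) * variance(sample_b)) / df
    std_error = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (mean(sample_a) - mean(sample_b)) / std_error, df

# Hypothetical times in minutes (illustrative values only)
machine_1 = [15.0, 16.0, 14.0, 17.0, 18.0]
machine_2 = [13.0, 15.0, 14.0, 16.0, 12.0]

t_stat, df = pooled_t(machine_1, machine_2)  # df is 8 for these samples

# Critical values from a standard t table for df = 8 and alpha = 0.05
T_CRIT_ONE_TAILED = 1.860  # all 5% in the upper tail
T_CRIT_TWO_TAILED = 2.306  # 2.5% in each tail

reject_one_tailed = t_stat > T_CRIT_ONE_TAILED       # H1: machine 1 > machine 2
reject_two_tailed = abs(t_stat) > T_CRIT_TWO_TAILED  # H1: machine 1 != machine 2

print(f"t = {t_stat:.3f}, df = {df}")
print(f"reject H0 (one-tailed): {reject_one_tailed}")
print(f"reject H0 (two-tailed): {reject_two_tailed}")
```

With these samples the t value is 2.0, which exceeds the one-tailed critical value but not the two-tailed one. The same data can therefore lead to different conclusions depending on which hypothesis is tested, so the choice should be made before looking at the data.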

Paired samples t-test (also called the dependent samples t-test)

Context around this test

• The objective is to compare the means of one group under different conditions or at different time points.
• We have one dependent variable and one independent variable.
• The dependent variable should be continuous and the independent variable categorical.
• It is also called the dependent samples t-test because the same sample is tested or observed under different conditions or at different time points.

Example

The objective is to test the impact of training on the time taken by workers to complete a routine task.

A set of workers is timed on the task WITHOUT training.

The same set of workers is timed on the task POST training.

Check whether the difference is statistically significant.

Assumptions for the paired samples t-test

The assumptions apply to the 'differences' between each pair of values.

• The differences should approximately follow a normal distribution.
• There should be no outliers among the differences. Remove outliers before using the test.

Hypothesis and calculations

Null Hypothesis (Ho):

Avg. time to complete task before training <= Avg. time to complete task after training

(Difference in time <= 0)

Alternate Hypothesis (H1):

Avg. time to complete task before training > Avg. time to complete task after training

(Difference in time > 0)

The above case is an example of a one-tailed t-test.
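
A minimal Python sketch of the paired calculation is given below. The worker times are made-up values for illustration. The t value is the mean of the differences divided by its standard error, with n - 1 degrees of freedom. The decision here compares the t value with a critical value from a standard t table, which is equivalent to comparing a p-value with alpha.

```python
import math
from statistics import mean, stdev

# Hypothetical task times in minutes for the same five workers (illustrative only)
before_training = [30.0, 28.0, 35.0, 32.0, 30.0]
after_training = [27.0, 26.0, 30.0, 30.0, 27.0]

# The test works on the per-worker differences (before - after)
differences = [b - a for b, a in zip(before_training, after_training)]

n = len(differences)
df = n - 1

# Standard error of the mean difference, using the sample standard deviation
std_error = stdev(differences) / math.sqrt(n)
t_stat = mean(differences) / std_error

# Critical value from a standard t table for df = 4, alpha = 0.05, one-tailed
T_CRIT_ONE_TAILED = 2.132

# H1: time before training > time after training (difference > 0)
reject_null = t_stat > T_CRIT_ONE_TAILED

print(f"differences = {differences}")
print(f"t = {t_stat:.3f}, df = {df}, reject H0: {reject_null}")
```

Because each worker acts as their own control, only the spread of the differences matters, not the spread of the raw times across workers.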