
**Paired t-test** is the type of t-test that we apply when we want *to explore whether the means of two related samples are significantly different*. Usually, "related" refers to the fact that we use the same sample in a "test-retest" ("repeated measures") research design, thus forming pairs of repeated measurements for the same participant.

Examples:

- In educational research: suppose we have a group of students and we want to advance their understanding of a particularly demanding topic by implementing an innovative instructional method. Before the intervention we administer a knowledge test to assess their prior knowledge of the topic. After the intervention we administer an appropriate knowledge test again. We can apply a paired t-test to compare the performance of the students before and after the didactical intervention, to investigate whether the instructional method had an impact.
- In business: suppose we measure the clients' satisfaction with a business's services using an appropriate instrument. Based on their feedback we make improvements, and after some time we ask the same people to rate the services again. Afterwards we can use a paired t-test to investigate whether the mean value of the clients' satisfaction after the improvement intervention is significantly higher than before.

For a deeper analysis, and to see how the t-statistic is computed, read the "Dependent t-test for paired samples" section on Wikipedia.
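To make the computation concrete, here is a minimal sketch that computes the paired t-statistic by hand and compares it with stats.ttest_rel(). It uses the standard formula: the mean of the pairwise differences divided by their standard error, with n - 1 degrees of freedom. The scores are made-up illustrative values, not the contents of researchdata.xlsx, and the names `baseline`, `retest`, `t_manual` and `p_manual` are introduced only for this example.

```python
import numpy as np
import scipy.stats as stats

# Hypothetical scores for six students (illustrative values only)
baseline = np.array([52.0, 61.0, 48.0, 70.0, 65.0, 58.0])
retest = np.array([58.0, 66.0, 47.0, 78.0, 70.0, 63.0])

# The paired t-test works on the pairwise differences
d = baseline - retest
n = len(d)

# t = mean(d) / (sample standard deviation of d / sqrt(n))
t_manual = d.mean() / (d.std(ddof=1) / np.sqrt(n))

# Two-sided p-value from the t distribution with n - 1 degrees of freedom
p_manual = 2 * stats.t.sf(abs(t_manual), df=n - 1)

# The same test computed by scipy
t_scipy, p_scipy = stats.ttest_rel(baseline, retest)

print('manual:', t_manual, p_manual)
print('scipy: ', t_scipy, p_scipy)
```

Both lines should print the same statistic and p-value, which shows that the paired t-test is a one-sample t-test applied to the differences.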

- Import the data for the two tests ('Baseline' and 'Retest') of the didactical intervention from the 'ttest-paired' sheet in the researchdata.xlsx file. Some students missed the second test, so there are missing data in the 'Retest' column. Likewise, some students took the second test but not the first, so there are missing data in the 'Baseline' column as well. Remove the missing values and apply the paired t-test.

In [1]:

```
import pandas as pd
import scipy.stats as stats
# The keyword is sheet_name; the older spelling sheetname is rejected by current pandas
data = pd.read_excel('../../data/researchdata.xlsx', sheet_name="ttest-paired")
print('Data have NaN values')
print(data.head())
print('\n..........\n')
print(data.tail())
# Drop NaN data: dropna() removes every row that has a NaN in either column,
# so only complete Baseline/Retest pairs remain
dtdrop = data.dropna()
print('\n\nData without NaN values')
print(dtdrop.head())
print('\n..........\n')
print(dtdrop.tail())
```
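Since the spreadsheet itself is not reproduced here, the row-wise behaviour of dropna() can be checked on a small made-up table. This is only a sketch: the frame `toy` and its values are hypothetical, and `subset` is spelled out simply to make explicit which columns define a complete pair.

```python
import numpy as np
import pandas as pd

# Hypothetical table: row 1 lacks Baseline, row 2 lacks Retest
toy = pd.DataFrame({
    'Baseline': [55.0, np.nan, 62.0, 48.0],
    'Retest': [60.0, 70.0, np.nan, 53.0],
})

# Keep only the rows where both measurements are present
paired = toy.dropna(subset=['Baseline', 'Retest'])

print(paired)
print(len(toy) - len(paired), 'incomplete pairs removed')
```

Dropping whole rows matters for a paired test, because each remaining Baseline value must still line up with the Retest value of the same participant.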

- Check the normality (Shapiro-Wilk) and equal-variance (Levene) criteria.

In [2]:

```
stats.shapiro(dtdrop.Baseline), stats.shapiro(dtdrop.Retest),\
stats.levene(dtdrop.Baseline, dtdrop.Retest)
```

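The normality assumption of the paired t-test is usually stated for the pairwise differences rather than for each column separately, so it can also be useful to run the Shapiro-Wilk test on the differences. The sketch below uses simulated data (the arrays `baseline` and `retest` are generated here and are not the tutorial's dataset).

```python
import numpy as np
import scipy.stats as stats

# Simulated paired scores (assumption: illustrative data only)
rng = np.random.default_rng(0)
baseline = rng.normal(60, 10, size=30)
retest = baseline + rng.normal(5, 4, size=30)

# Test the pairwise differences for normality
diff = retest - baseline
w_stat, p_norm = stats.shapiro(diff)

# A small p-value would suggest that the differences are not normally distributed
print(w_stat, p_norm)
```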

- To apply the paired t-test, call the scipy stats.ttest_rel() function as demonstrated below.

In [3]:

```
t, p = stats.ttest_rel(dtdrop.Baseline, dtdrop.Retest)
t, p
```

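By default stats.ttest_rel() performs a two-sided test. When the question is directional, as in the business example ("significantly higher than before"), a one-sided test may be more appropriate. The sketch below assumes SciPy 1.6 or later, where ttest_rel() accepts the `alternative` argument. The scores are made-up values, and `alpha = 0.05` is just a conventional threshold chosen for illustration.

```python
import numpy as np
import scipy.stats as stats

# Hypothetical scores for eight participants (illustrative values only)
baseline = np.array([52.0, 61.0, 48.0, 70.0, 65.0, 58.0, 55.0, 63.0])
retest = np.array([58.0, 66.0, 47.0, 78.0, 70.0, 63.0, 61.0, 64.0])

# Two-sided test: are the means different?
t_two, p_two = stats.ttest_rel(retest, baseline)

# One-sided test: is the mean of retest greater than the mean of baseline?
t_one, p_one = stats.ttest_rel(retest, baseline, alternative='greater')

alpha = 0.05  # assumed significance level
print('two-sided:', t_two, p_two)
print('one-sided:', t_one, p_one)
print('retest significantly higher:', p_one < alpha)
```

Note the argument order: with `alternative='greater'` the test asks whether the first sample has the larger mean, so the retest scores are passed first.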

- When the samples do not come from a normal distribution, you may decide to turn to non-parametric statistics.
- In this case you need to apply the **Wilcoxon signed-rank test**, which is the non-parametric counterpart of the paired t-test (see the table at "Location test" on Wikipedia).
- You apply the Wilcoxon signed-rank test by calling the scipy **stats.wilcoxon()** function.

In [4]:

```
w, p = stats.wilcoxon(dtdrop.Baseline, dtdrop.Retest)  # w is the signed-rank statistic, not a t value
w, p
```
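The choice between the two tests can be sketched as a small helper. This is one simple rule of thumb and not a definitive procedure: `compare_paired` is a hypothetical function written for this example, it bases the decision on a Shapiro-Wilk test of the differences, and the data are simulated.

```python
import numpy as np
import scipy.stats as stats

def compare_paired(before, after, alpha=0.05):
    """Hypothetical helper: pick the paired test from a normality check on the differences."""
    before = np.asarray(before, dtype=float)
    after = np.asarray(after, dtype=float)
    _, p_norm = stats.shapiro(after - before)
    if p_norm > alpha:
        # No evidence against normality: use the paired t-test
        stat, p = stats.ttest_rel(after, before)
        return 'paired t-test', stat, p
    # Differences look non-normal: use the Wilcoxon signed-rank test
    stat, p = stats.wilcoxon(after, before)
    return 'Wilcoxon signed-rank', stat, p

# Simulated data with skewed improvements (assumption: illustrative only)
rng = np.random.default_rng(1)
before = rng.normal(60, 10, size=25)
after = before + rng.exponential(5, size=25)

name, stat, p = compare_paired(before, after)
print(name, stat, p)
```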
