The t-Test computes the following general statistic:
$$t = \frac{\text{signal}}{\text{noise}} = \frac{\text{difference of means}}{\text{standard error}} = \frac{\bar{x}_1 - \bar{x}_2}{s_p \sqrt{2/n}}$$
...where $s_p = \sqrt{(s_1^2 + s_2^2)/2}$ is the pooled (or composite) standard deviation of the two samples.
Be warned that the above formula is valid when:
- The sample sizes are the same: n1 = n2 = n
- The populations from which the samples are drawn follow the normal distribution (normality criterion), and
- The variances of the populations are also the same (variance criterion)
When these conditions do not hold, adjustments or even entirely different statistical tests apply, as we explain further below.
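To make the "signal over noise" idea concrete, here is a minimal sketch that computes the statistic by hand and compares it with scipy. It assumes the statistic takes the standard equal-size pooled form t = (mean1 - mean2) / (sp * sqrt(2/n)), with sp = sqrt((s1^2 + s2^2) / 2). The two small samples and the variable names are made up for illustration and are not the study data.

```python
import numpy as np
from scipy import stats

# Hypothetical equal-sized samples (n1 = n2 = n), made up for illustration only
group_1 = np.array([62.0, 70.0, 55.0, 68.0, 74.0, 66.0])
group_2 = np.array([71.0, 78.0, 69.0, 80.0, 75.0, 83.0])
n = len(group_1)

# Pooled standard deviation for equal sample sizes: sqrt((s1^2 + s2^2) / 2)
s_p = np.sqrt((group_1.var(ddof=1) + group_2.var(ddof=1)) / 2)

# Signal (difference of means) divided by noise (standard error)
t_manual = (group_1.mean() - group_2.mean()) / (s_p * np.sqrt(2 / n))

# Two-sided p-value from the t distribution with 2n - 2 degrees of freedom
p_manual = 2 * stats.t.sf(abs(t_manual), df=2 * n - 2)

# scipy's pooled-variance t-test should agree with the manual computation
t_scipy, p_scipy = stats.ttest_ind(group_1, group_2)
print(t_manual, t_scipy)
print(p_manual, p_scipy)
```

If the assumed form is the right one, the two printed pairs should match up to floating-point rounding.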
- IV: Background music with two conditions: (a) BM is supplied, (b) BM is not supplied
- DV: Level of student learning as measured by their performance in a reliable and validated test (continuous variable, scale 0-100)
- Population: students of specific age and prior knowledge studying in a multimedia e-learning environment
- Research Design: Two groups post-test only design
- Control (C-group): N1=38 students (randomly selected) studying without background music
- Treatment (T-group): N2=40 students (randomly selected) studying with background music
- Null hypothesis H0 = "Students studying with background music will perform the same as students studying without background music" (non-directional)
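import pandas as pd
import scipy.stats as stats

# Load the worksheet that holds the independent-samples data
# (recent pandas versions use the keyword sheet_name, not sheetname)
data = pd.read_excel('../../data/researchdata.xlsx', sheet_name="ttest-indep")
data.tail()  # see the last 5 rows: NaN is inserted for the missing values of the C-group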
print(data.Control.describe())
print('\n')
print(data.Treatment.describe())
count    38.000000
mean     67.236842
std      11.131728
min      45.000000
25%      60.000000
50%      65.000000
75%      75.000000
max      90.000000
Name: Control, dtype: float64

count     40.000000
mean      76.625000
std       11.231109
min       55.000000
25%       70.000000
50%       75.000000
75%       85.000000
max      100.000000
Name: Treatment, dtype: float64
- To test for normality, apply the Shapiro-Wilk test by calling the scipy stats.shapiro() function
- This test checks the null hypothesis that the data were drawn from a normal distribution and returns the test statistic ('W') and the p-value ('p').
- The normality criterion holds true when p > a (where a is the probability threshold usually set to 0.05)
# Shapiro-Wilk normality test for Control group
stats.shapiro(data.Control.dropna())

# Shapiro-Wilk normality test for Treatment group
stats.shapiro(data.Treatment.dropna())
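The decision rule for the normality criterion (it holds when p > a) can be sketched as a small helper. The function name `looks_normal` and both samples are hypothetical: the first sample is built from normal quantiles and the second is deliberately skewed, so neither is the study data.

```python
import numpy as np
from scipy import stats

ALPHA = 0.05  # probability threshold, as in the text

def looks_normal(sample, alpha=ALPHA):
    """Return (W, p, ok); ok is True when normality is NOT rejected (p > alpha)."""
    w, p = stats.shapiro(sample)
    return w, p, bool(p > alpha)

# Hypothetical samples, constructed only for illustration
positions = (np.arange(1, 41) - 0.5) / 40
normal_like = 70 + 10 * stats.norm.ppf(positions)  # evenly spaced normal quantiles
skewed = np.exp(np.linspace(0, 5, 40))             # strongly right-skewed values

for name, sample in [("normal_like", normal_like), ("skewed", skewed)]:
    w, p, ok = looks_normal(sample)
    print(f"{name}: W={w:.3f}, p={p:.4f}, normality criterion holds: {ok}")
```

Note that p > a only means the test found no evidence against normality; it does not prove that the population is normal.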
- To test for equality of variances, apply the Levene test by calling the scipy stats.levene() function
- This test checks the null hypothesis that all samples come from populations with equal variances. It returns the test statistic ('W') and the p-value ('p').
- The variance criterion holds true when p > a (where a is the probability threshold usually set to 0.05)
# Levene variance test for Control and Treatment groups
stats.levene(data.Control.dropna(), data.Treatment.dropna())
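One way to connect the variance check to the choice of t-test is sketched below. It assumes we simply switch the `equal_var` argument of stats.ttest_ind() according to the Levene result. The helper name `independent_ttest` and the samples are hypothetical, and this is one possible workflow rather than the only valid one.

```python
import numpy as np
from scipy import stats

ALPHA = 0.05  # probability threshold, as in the text

def independent_ttest(sample_a, sample_b, alpha=ALPHA):
    """Run Levene's test, then the pooled or the unequal-variance t-test accordingly."""
    _, p_levene = stats.levene(sample_a, sample_b)
    equal_var = bool(p_levene > alpha)  # variance criterion holds when p > alpha
    t, p = stats.ttest_ind(sample_a, sample_b, equal_var=equal_var)
    return {"p_levene": p_levene, "equal_var": equal_var, "t": t, "p": p}

# Hypothetical samples built from normal quantiles, for illustration only
q = stats.norm.ppf((np.arange(1, 41) - 0.5) / 40)
narrow_a = 65 + 10 * q  # spread of 10 around 65
narrow_b = 75 + 10 * q  # same spread around 75
wide_b = 75 + 40 * q    # four times the spread

same = independent_ttest(narrow_a, narrow_b)
different = independent_ttest(narrow_a, wide_b)
print(same)
print(different)
```

With these made-up samples the first call keeps the equal-variance test and the second falls back to the unequal-variance version.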
# Independent-samples t-test (equal variances assumed by default)
t, p = stats.ttest_ind(data.Control.dropna(), data.Treatment.dropna())
t, p
# Welch's t-test: use when the equal-variance criterion does not hold
t, p = stats.ttest_ind(data.Control.dropna(), data.Treatment.dropna(), equal_var=False)
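When the Excel file is not at hand, both versions of the test can be approximated from the summary statistics printed by describe() above, using scipy's stats.ttest_ind_from_stats(). This is only a cross-check sketch: the inputs are the rounded means, standard deviations and counts from that output, so the results may differ slightly in the last decimals from those computed on the raw data.

```python
from scipy import stats

# Summary statistics copied from the describe() output of the two groups
control = dict(mean1=67.236842, std1=11.131728, nobs1=38)
treatment = dict(mean2=76.625000, std2=11.231109, nobs2=40)

# Pooled-variance t-test, corresponding to ttest_ind(..., equal_var=True)
t_student, p_student = stats.ttest_ind_from_stats(**control, **treatment, equal_var=True)

# Welch's t-test, corresponding to ttest_ind(..., equal_var=False)
t_welch, p_welch = stats.ttest_ind_from_stats(**control, **treatment, equal_var=False)

print(f"pooled: t={t_student:.3f}, p={p_student:.5f}")
print(f"Welch:  t={t_welch:.3f}, p={p_welch:.5f}")
```

Because the two groups have almost the same standard deviation and similar sizes, the two variants should give nearly identical results here.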
These terms refer to whether we statistically explore the possibility that the null hypothesis is rejected because the independent variable has either a positive or a negative impact on the outcome. That is, we do not care about the direction of the impact (graph A in the fig. below).
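Assuming the terms in question are the usual "two-tailed" (non-directional) and "one-tailed" (directional) tests, the difference can be sketched with the `alternative` argument of stats.ttest_ind(), which is available in SciPy 1.6 and later. The scores below are made up for illustration and are not the study data.

```python
import numpy as np
from scipy import stats

# Hypothetical scores, made up for illustration only
control = np.array([62.0, 70.0, 55.0, 68.0, 74.0, 66.0, 59.0, 71.0])
treatment = np.array([71.0, 78.0, 69.0, 80.0, 75.0, 83.0, 72.0, 77.0])

# Non-directional: is there a difference in either direction?
t_two, p_two = stats.ttest_ind(control, treatment, alternative="two-sided")

# Directional: is the control mean lower than the treatment mean?
t_one, p_one = stats.ttest_ind(control, treatment, alternative="less")

print(f"two-sided: t={t_two:.3f}, p={p_two:.4f}")
print(f"less:      t={t_one:.3f}, p={p_one:.4f}")
```

The t statistic is the same in both calls; only the p-value changes. When the observed difference lies in the hypothesized direction, the directional p-value is half of the non-directional one.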