---
title: "t-Test"
output: html_notebook
---

The **t-test** is a statistical test **used to compare the mean of a single group against a known value, or to compare the means of two groups**. 
It assesses whether an observed difference in means is statistically significant, that is, larger than would be expected from sampling variability alone. 

There are three main types of t-tests:

1. **One-sample t-test**: Compares the mean of a single group to a known or hypothesized value.

2. **Two-sample t-test (Independent t-test)**: Compares the means of two independent groups.

3. **Paired t-test**: Compares means from the same group at different times (before and after an intervention, for example).

#### **Advantages**:
- Works well with small sample sizes.
- Widely used and easy to interpret.
- Handles both one-group and two-group comparisons.

#### **Disadvantages**:
- Assumes that the data (or, for a paired test, the paired differences) are approximately normally distributed.
- Sensitive to outliers.
- The classic (Student's) two-sample t-test assumes that the variances of the two groups are equal. Welch's t-test, which R's `t.test()` runs by default, does not assume equal variances.
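
To make the sensitivity to outliers concrete, here is a small sketch with made-up numbers. The vectors `x_clean` and `x_outlier` and the reference value 4.8 are hypothetical and chosen only for illustration. A single extreme value moves the mean further from the reference value, but it inflates the standard deviation so much that the p-value becomes larger rather than smaller.

```{r}
# Hypothetical measurements, tightly clustered around 5.1
x_clean <- c(5.1, 4.9, 5.3, 5.2, 4.8, 5.0, 5.4, 5.1)

# The same data with one extreme value appended
x_outlier <- c(x_clean, 15)

# One-sample t-tests against a hypothetical reference value of 4.8
p_clean <- t.test(x_clean, mu = 4.8)$p.value
p_outlier <- t.test(x_outlier, mu = 4.8)$p.value

# Compare the means, standard deviations, and p-values side by side
c(mean_clean = mean(x_clean), mean_outlier = mean(x_outlier))
c(sd_clean = sd(x_clean), sd_outlier = sd(x_outlier))
c(p_clean = p_clean, p_outlier = p_outlier)
```

In this toy data the test is significant at the 5% level without the outlier and not significant with it, even though the outlier pushes the sample mean away from 4.8.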

#### **Applications**:
- Comparing the effectiveness of two treatments in medical research.
- Testing differences in average performance between two groups in business or education.
- Assessing before-and-after changes in a group after an intervention.

#### **Pros**:
- Simple and quick to perform.
- Provides clear results for comparing group means.

#### **Cons**:
- Sensitive to non-normal data.
- Requires the assumption of equal variances for standard two-sample t-tests.
  
---

### **Types of t-tests**

#### 1. **One-Sample t-Test**
Used to test if the mean of a single sample differs from a known or hypothesized value.

#### Hypotheses:
- **Null Hypothesis (H₀)**: The population mean is equal to the hypothesized value (e.g., a known population mean).
- **Alternative Hypothesis (H₁)**: The population mean is not equal to the hypothesized value.

#### **Example in R: One-Sample t-Test**

Let’s say we have a sample of students’ scores, and we want to test if the average score is significantly different from a hypothesized value of 70.

```{r}
# Sample data: students' scores
set.seed(123)  # for reproducibility
scores <- rnorm(30, mean = 72, sd = 10)  # mean = 72, sd = 10

# Perform a one-sample t-test
t_test_one_sample <- t.test(scores, mu = 70)  # Testing if the mean score is 70
print(t_test_one_sample)
```


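#### **Output**:
```
	One Sample t-test

data:  scores
t = 0.85364, df = 29, p-value = 0.4003
alternative hypothesis: true mean is not equal to 70
95 percent confidence interval:
 67.86573 75.19219
sample estimates:
mean of x 
 71.52896 
```

#### **Interpretation**:
- **p-value**: 0.4003, which is greater than 0.05, meaning we **fail to reject the null hypothesis**. The data do not provide evidence that the mean score differs from 70.
- **Confidence Interval**: The 95% confidence interval (67.87 to 75.19) contains 70, which is consistent with that result.
- **Conclusion**: The sample mean (71.53) is not significantly different from the hypothesized value of 70. This does not show that the true mean is 70: the scores were simulated with a true mean of 72, and 30 observations with a standard deviation of 10 were too few to detect a difference that small.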
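
To show where these numbers come from, the one-sample t statistic can be computed by hand as t = (x̄ - μ₀) / (s / √n), with n - 1 degrees of freedom. The following is a minimal sketch that recreates the same simulated scores and compares the manual result with `t.test()`. The names `mu0`, `t_manual`, `p_manual`, and `ci_manual` are illustrative.

```{r}
# Recreate the simulated scores from the example above
set.seed(123)
scores <- rnorm(30, mean = 72, sd = 10)
mu0 <- 70  # hypothesized mean

n <- length(scores)
se <- sd(scores) / sqrt(n)               # standard error of the mean
t_manual <- (mean(scores) - mu0) / se    # t statistic
df_manual <- n - 1                       # degrees of freedom

# Two-sided p-value from the t distribution
p_manual <- 2 * pt(-abs(t_manual), df = df_manual)

# 95% confidence interval for the mean
ci_manual <- mean(scores) + c(-1, 1) * qt(0.975, df = df_manual) * se

# Compare with the built-in test
res <- t.test(scores, mu = mu0)
c(t_manual = t_manual, t_builtin = unname(res$statistic))
c(p_manual = p_manual, p_builtin = res$p.value)
```

The manual and built-in values should agree up to floating-point rounding.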

---

#### 2. **Two-Sample t-Test (Independent t-Test)**
Used to test if the means of two independent groups are significantly different.

#### Hypotheses:
- **Null Hypothesis (H₀)**: The population means of the two groups are equal.
- **Alternative Hypothesis (H₁)**: The population means of the two groups are different.

#### **Example in R: Two-Sample t-Test**

Suppose we want to compare the average heights of two groups, **Group A** and **Group B**, and test whether the two groups have significantly different heights.

```{r}
# Sample data: Heights of two groups
set.seed(123)
groupA <- rnorm(30, mean = 170, sd = 5)  # Group A: mean height = 170 cm
groupB <- rnorm(30, mean = 175, sd = 5)  # Group B: mean height = 175 cm

# Perform a two-sample t-test
t_test_two_sample <- t.test(groupA, groupB)  # Testing if the means of two groups are different
print(t_test_two_sample)
```


#### **Output**:
```
	Welch Two Sample t-test

data:  groupA and groupB
t = -5.2098, df = 56.559, p-value = 2.755e-06
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -8.482713 -3.771708
sample estimates:
mean of x mean of y 
 169.7645  175.8917 
```

#### **Interpretation**:
- **Test used**: The header reads "Welch Two Sample t-test" because `t.test()` does not assume equal variances by default.
- **p-value**: 2.755e-06, which is far below 0.05, meaning we **reject the null hypothesis**. The two groups have significantly different mean heights.
- **Confidence Interval**: The 95% confidence interval for the difference in means (Group A minus Group B) runs from -8.48 cm to -3.77 cm. It does not contain 0, indicating a significant difference.
- **Conclusion**: Group B is on average about 6.1 cm taller than Group A (175.89 cm vs. 169.76 cm), and the difference is statistically significant.
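
The fractional degrees of freedom in the output come from the Welch-Satterthwaite approximation. As a rough sketch, the Welch statistic and its degrees of freedom can be reproduced by hand and compared with `t.test()`. The pooled-variance (Student's) version is shown for contrast via `var.equal = TRUE`. The variable names `vA`, `vB`, `t_welch`, and `df_welch` are illustrative.

```{r}
# Recreate the simulated heights from the example above
set.seed(123)
groupA <- rnorm(30, mean = 170, sd = 5)
groupB <- rnorm(30, mean = 175, sd = 5)

# Per-group variance of the sample mean: s^2 / n
vA <- var(groupA) / length(groupA)
vB <- var(groupB) / length(groupB)

# Welch t statistic: difference in means over its standard error
t_welch <- (mean(groupA) - mean(groupB)) / sqrt(vA + vB)

# Welch-Satterthwaite approximation for the degrees of freedom
df_welch <- (vA + vB)^2 /
  (vA^2 / (length(groupA) - 1) + vB^2 / (length(groupB) - 1))

p_welch <- 2 * pt(-abs(t_welch), df = df_welch)

# Built-in Welch test (default) and Student's pooled-variance test
res_welch <- t.test(groupA, groupB)
res_student <- t.test(groupA, groupB, var.equal = TRUE)

c(t_manual = t_welch, t_builtin = unname(res_welch$statistic))
c(df_welch = df_welch, df_student = unname(res_student$parameter))
```

With equal group sizes the two versions give the same t statistic and differ only in the degrees of freedom: Student's test uses n₁ + n₂ - 2 = 58.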

---

#### 3. **Paired t-Test**
Used to compare the means of the same group at two different times or under two different conditions. It accounts for the fact that the two sets of data are related (e.g., before and after an intervention).

#### Hypotheses:
- **Null Hypothesis (H₀)**: The population mean of the paired differences is zero.
- **Alternative Hypothesis (H₁)**: The population mean of the paired differences is not zero.

#### **Example in R: Paired t-Test**

Suppose we want to test if there is a significant difference in the test scores of students before and after attending a special class.

```{r}
# Sample data: Scores before and after a special class
set.seed(123)
before_class <- rnorm(20, mean = 70, sd = 5)  # Scores before the class
after_class <- before_class + rnorm(20, mean = 3, sd = 2)  # Scores after the class (improved)

# Perform a paired t-test
t_test_paired <- t.test(before_class, after_class, paired = TRUE)
print(t_test_paired)
```


#### **Output**:
```
	Paired t-test

data:  before_class and after_class
t = -7.8066, df = 19, p-value = 2.406e-07
alternative hypothesis: true mean difference is not equal to 0
95 percent confidence interval:
 -3.674332 -2.120639
sample estimates:
mean difference 
      -2.897486 
```

#### **Interpretation**:
- **p-value**: 2.406e-07 (very small), meaning we **reject the null hypothesis**. There is a significant difference in scores before and after the class.
- **Mean Difference**: -2.897486, computed as before minus after, indicating the average score increased by about 2.9 points after the class.
- **Conclusion**: Scores after the special class were significantly higher than before it, by about 2.9 points on average.
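
A paired t-test is equivalent to a one-sample t-test on the within-pair differences, tested against 0. The sketch below checks this on the same simulated data and also shows a one-sided variant using the `alternative` argument of `t.test()`, for the case where the question is specifically whether scores went up. The names `diffs`, `res_diffs`, and `res_one_sided` are illustrative.

```{r}
# Recreate the simulated before/after scores from the example above
set.seed(123)
before_class <- rnorm(20, mean = 70, sd = 5)
after_class <- before_class + rnorm(20, mean = 3, sd = 2)

# Within-pair differences, in the same order t.test() uses (x minus y)
diffs <- before_class - after_class

res_paired <- t.test(before_class, after_class, paired = TRUE)
res_diffs <- t.test(diffs, mu = 0)  # one-sample test on the differences

c(paired = unname(res_paired$statistic), on_diffs = unname(res_diffs$statistic))

# One-sided test: is the mean of (after minus before) greater than 0?
res_one_sided <- t.test(after_class, before_class,
                        paired = TRUE, alternative = "greater")
c(two_sided_p = res_paired$p.value, one_sided_p = res_one_sided$p.value)
```

Because the observed change is in the hypothesized direction, the one-sided p-value is half the two-sided one. A one-sided alternative should be chosen before looking at the data.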

---

### **Assumptions of t-Tests**:
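- **Normality**: The data (for a paired test, the paired differences) should be approximately normally distributed. This matters most for small samples; with larger samples the test is fairly robust to moderate departures from normality.
- **Equal Variances**: The classic (Student's) two-sample t-test assumes that the variances of the two groups are equal (homoscedasticity). Welch's t-test does not, and it is what `t.test()` runs by default, as in the two-sample example above. Pass `var.equal = TRUE` to request the pooled-variance version.
- **Independence**: The observations should be independent of each other. In a two-sample test the two groups must also be independent of each other; in a paired test the pairs must be independent of one another, although the two measurements within a pair are related.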
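
One possible way to screen for the first two assumptions in R is sketched below on the simulated height data from the two-sample example. `shapiro.test()` (Shapiro-Wilk normality test) and `var.test()` (F test for equality of two variances) are base R functions. Using them as a screening step is an assumption of this sketch rather than a required part of the t-test workflow.

```{r}
# Recreate the simulated heights from the two-sample example
set.seed(123)
groupA <- rnorm(30, mean = 170, sd = 5)
groupB <- rnorm(30, mean = 175, sd = 5)

# Normality: Shapiro-Wilk test on each group
shapiro_A <- shapiro.test(groupA)
shapiro_B <- shapiro.test(groupB)

# Equal variances: F test comparing the two group variances
var_check <- var.test(groupA, groupB)

c(shapiro_A = shapiro_A$p.value,
  shapiro_B = shapiro_B$p.value,
  var_test = var_check$p.value)

# If equal variances look plausible, the pooled-variance test is an option
res_pooled <- t.test(groupA, groupB, var.equal = TRUE)
res_pooled$parameter  # degrees of freedom: n1 + n2 - 2
```

A large p-value from these checks means there is no evidence against the assumption, not proof that it holds. With small samples the checks have little power, so a visual check such as `qqnorm()` with `qqline()` is a common complement. Independence cannot be tested this way and has to follow from the study design.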

### **Summary**:
The t-test is a versatile tool for comparing group means and is widely used in many fields, including medicine, psychology, business, and education. Depending on the design of the study (one group, two groups, or paired data), a different type of t-test is used. In R, the `t.test()` function simplifies the computation of all three types of t-tests.