(a) Collision claims tend to be skewed right because there are a few very large collision claims relative to the majority of claims.
These large claims are outliers that form a long right tail and pull the mean above the median.
The answer cannot be "no large collision claims" or "many large collision claims" because in either case the data would not be skewed right.
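To make the effect of a few large claims concrete, here is a minimal sketch in Python. The claim amounts are made up for illustration and are not taken from the problem; the helper name `mean_and_median` is also hypothetical.

```python
import statistics

def mean_and_median(claims):
    """Return (mean, median) of a list of claim amounts."""
    return statistics.mean(claims), statistics.median(claims)

# Hypothetical claim amounts (NOT from the problem): mostly modest claims
# plus two very large ones that form the right tail.
claims = [1200, 1500, 1800, 2000, 2100, 2300, 2500, 2700, 3000, 25000, 40000]

mean_claim, median_claim = mean_and_median(claims)

# In a right-skewed sample the large claims pull the mean above the median.
print(f"mean = {mean_claim:.2f}, median = {median_claim:.2f}")
print("mean exceeds median:", mean_claim > median_claim)
```

Removing the two largest values from this list brings the mean much closer to the median, which is why the skew is attributed to the few large claims.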
(b) Since we are comparing the means of two random samples, we will use a two-sample t-test (2-SampTTest on the calculator). Specifically, we will use Welch's approximate t-test, which does not assume equal population variances, to test the hypothesis that the two populations have the same mean.
We want to determine if 20-24 year old drivers have the same mean collision cost as 30-59 year old drivers.
(c) The null hypothesis is that <u>the mean claim of collision costs for 20-24 year old drivers is the same as the mean claim of collision costs for 30-59 year old drivers.</u>
We can also rewrite this as:
- H₀: μ₁ = μ₂
- where μ₁ is the mean collision claim for 20-24 year old drivers
- and μ₂ is the mean collision claim for 30-59 year old drivers
The alternative hypothesis is that the <u>mean claim of collision costs for 20-24 year old drivers is higher than the mean claim of collision costs for 30-59 year old drivers</u>.
We can rewrite this as:
- H₁: μ₁ > μ₂
- where μ₁ and μ₂ are defined as above
(d) In order to calculate the p-value, we must find the test statistic first.
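We can do so by using this formula:
- t = (x̄₁ − x̄₂) / √(s₁²/n₁ + s₂²/n₂)
- where x̄₁ and x̄₂ are the sample means, s₁ and s₂ are the sample standard deviations, and n₁ and n₂ are the sample sizes, with subscript 1 for 20-24 year old drivers and subscript 2 for 30-59 year old drivers
Substitute the values given in the problem for these variables into the formula for the test statistic.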
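You can evaluate this expression directly on the calculator, or enter the numbers into the built-in test: STAT - TESTS - 4:2-SampTTest, with the alternative set to μ₁ > μ₂ and Pooled set to No (the unpooled option is Welch's test).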
Your test statistic should be: t = 1.8962.
The calculator should also tell you the p-value; p = .0308.
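For readers without the calculator, the same computation can be sketched in Python using only the standard library. This is a rough illustration under stated assumptions, not the problem's official method: the function names (`welch_t`, `t_density`, `upper_tail_p`) are hypothetical, the degrees of freedom use the Welch-Satterthwaite approximation, and the p-value is obtained by numerically integrating the t density with Simpson's rule. The summary statistics in the usage lines are invented, so they will not reproduce t = 1.8962 or p = .0308; substitute the values given in the problem.

```python
import math

def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Welch's t statistic and Welch-Satterthwaite df from summary statistics."""
    se1_sq = sd1 ** 2 / n1  # squared standard error of sample 1
    se2_sq = sd2 ** 2 / n2  # squared standard error of sample 2
    t = (mean1 - mean2) / math.sqrt(se1_sq + se2_sq)
    df = (se1_sq + se2_sq) ** 2 / (
        se1_sq ** 2 / (n1 - 1) + se2_sq ** 2 / (n2 - 1)
    )
    return t, df

def t_density(x, df):
    """Probability density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def upper_tail_p(t, df, steps=2000):
    """One-sided p-value P(T > t), via Simpson's rule on [0, |t|] (steps must be even)."""
    a = abs(t)
    h = a / steps
    total = t_density(0.0, df) + t_density(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_density(i * h, df)
    area = total * h / 3  # P(0 < T < |t|)
    return 0.5 - area if t >= 0 else 0.5 + area

# Hypothetical summary statistics (NOT the values from the problem).
t_stat, df = welch_t(mean1=4500, sd1=2200, n1=40, mean2=3700, sd2=1900, n2=60)
p_value = upper_tail_p(t_stat, df)
alpha = 0.10
print(f"t = {t_stat:.4f}, df = {df:.2f}, p = {p_value:.4f}")
print("reject H0" if p_value < alpha else "fail to reject H0")
```

The upper tail is used because the alternative hypothesis is μ₁ > μ₂; a two-sided test would double the smaller tail area instead.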
(e) Since p = .0308 < α = .10, we <u>reject</u> the null hypothesis. We have convincing statistical evidence that the mean claim of collision costs for 20-24 year old drivers is higher than the mean claim of collision costs for 30-59 year old drivers.