A) Interquartile range - For males: 11, for females: 6
B) Difference between the median values: 7.5
C) Males: median, females: mean
D) Outlier: one male attended the movie on the 30th
Step-by-step explanation:
A)
The interquartile range (IQR) of a set of data is the difference between the upper quartile (Q3) and the lower quartile (Q1):
- For the box plot representing males:
Lower quartile is
Upper quartile is
Therefore, the interquartile range is IQR = Q3 - Q1 = 11
- For the box plot representing females:
Lower quartile is
Upper quartile is
Therefore, the interquartile range is IQR = Q3 - Q1 = 6
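The definition of the interquartile range can be sketched in a few lines of Python. The data below is hypothetical (it is not read from the box plots in the question), and the helper names `quartiles` and `iqr` are made up for illustration. The quartiles are found with the "median of each half" method; other conventions exist and can give slightly different values.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, median, Q3) using the median-of-halves method."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]            # values below the median position
    upper = data[half + n % 2:]    # values above the median position
    return median(lower), median(data), median(upper)

def iqr(values):
    """Interquartile range: upper quartile minus lower quartile."""
    q1, _, q3 = quartiles(values)
    return q3 - q1

# Hypothetical sample, for illustration only.
sample = [2, 4, 5, 7, 9, 12, 13, 15, 30]
q1, q2, q3 = quartiles(sample)
print("Q1 =", q1, "Q3 =", q3, "IQR =", iqr(sample))
```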
B)
The median value of a dataset is the "central value" of the dataset, i.e. the value of the dataset for which half of the values in the dataset are lower than the median, and half of the values in the dataset are higher than the median.
The median value of a dataset is indicated as .
Here the median values for the two datasets are given:
- For males,
- For females,
Therefore, the difference between the median values is 7.5
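As a small illustration of the same calculation, the sketch below finds the median of two groups and subtracts them. Both lists are hypothetical and do not come from the box plots in the question.

```python
from statistics import median

# Hypothetical data, for illustration only.
group_a = [3, 8, 10, 14, 20]     # odd count: the median is the middle value
group_b = [1, 2, 3, 4, 6, 9]     # even count: the median is the mean of the two middle values

median_a = median(group_a)
median_b = median(group_b)
difference = median_a - median_b
print(median_a, median_b, difference)
```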
C)
The mean of a dataset is the average value of the dataset, calculated as the sum of all data divided by the number of values in the dataset.
On the other hand, the median of a dataset is the central value, i.e. the value for which half of the values in the dataset are lower than the median, and half of the values in the dataset are higher than the median.
For a perfectly symmetrical distribution, the mean and the median are equal; for a non-symmetrical distribution, this is generally not true.
In general, the mean is a good way to describe a distribution when the distribution itself is symmetrical; when the distribution is very asymmetrical, the median is a better indicator.
In this case, we can see that the male distribution is very asymmetrical: the lower part of the box (from the lower quartile to the median) is much larger than the upper part (from the median to the upper quartile), so the distribution is better described using the median. For the female case, the distribution is more symmetrical (the median is at the center between the lower and upper quartiles), so the distribution is better described using the mean.
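The effect of asymmetry on the mean can be seen in a short sketch. The two lists are hypothetical: one is symmetrical, the other has a single large value that pulls the mean upward while the median stays put.

```python
from statistics import mean, median

# Hypothetical data, for illustration only.
symmetric = [1, 2, 3, 4, 5]
skewed = [1, 2, 3, 4, 50]    # same as above, but with one very large value

sym_mean, sym_median = mean(symmetric), median(symmetric)
skew_mean, skew_median = mean(skewed), median(skewed)

# Symmetrical data: the mean and the median agree.
print(sym_mean, sym_median)
# Skewed data: the mean moves toward the large value, the median does not.
print(skew_mean, skew_median)
```

In the skewed list the mean is larger than four of the five values, which is why the median is usually the safer summary for an asymmetrical distribution.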
D)
An outlier of a dataset is a value of the dataset that falls very far from all other values.
In a box plot, outliers are represented as single points outside the whiskers.
In this case, we see that there is only one outlier, in the male plot, at a value of 30. This represents a single case in which a male attended the movie theater on the 30th. This is an isolated case, since the upper whisker of the male plot ends at 21, which means that no other males attended the movie after the 21st.
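The explanation above does not say how the plot decides which points are outliers. A common convention, assumed here, is to flag any value more than 1.5 times the IQR below the lower quartile or above the upper quartile. The sketch below applies that rule to hypothetical data; the function name `find_outliers` and the median-of-halves quartile method are illustrative choices.

```python
from statistics import median

def find_outliers(values, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR] (assumed 1.5*IQR convention)."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    q1 = median(data[:half])             # median of the lower half
    q3 = median(data[half + n % 2:])     # median of the upper half
    spread = q3 - q1                     # interquartile range
    low_fence = q1 - k * spread
    high_fence = q3 + k * spread
    return [v for v in data if v < low_fence or v > high_fence]

# Hypothetical sample, for illustration only.
sample = [2, 4, 5, 7, 9, 12, 13, 15, 30]
outliers = find_outliers(sample)
print(outliers)
```

Here the quartiles are 4.5 and 14, so the upper fence is 14 + 1.5 × 9.5 = 28.25, and only the value 30 lies beyond it.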