The Importance of Relationships
Introduction
A confidence interval is a range of values, computed from sample data, that is expected to contain an unknown population parameter (Newcombe, 2013). The confidence level is the long-run proportion of such intervals, computed from repeated random samples of the same population, that would contain the true parameter. Commonly used confidence levels are 90%, 95%, and 99%. Both the confidence level and the sample size affect the width of the confidence interval (Wilcox, 2012). This exercise examines the effect of confidence level and sample size on the confidence interval for the mean of the age variable. The variable is drawn from the Titanic data set.
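The meaning of a confidence level can be illustrated with a small simulation. The sketch below is not part of the original analysis: it draws repeated samples from a hypothetical normal population (mean 30, standard deviation 14, both assumed for illustration only), builds an interval from each sample using a normal critical value, and counts how often the interval contains the true mean.

```python
import random
import statistics

def coverage(confidence, n, trials=2000, seed=0):
    """Fraction of sample-based intervals that contain the true mean."""
    rng = random.Random(seed)
    true_mean, true_sd = 30.0, 14.0  # hypothetical population, not the Titanic data
    # Two-sided critical value from the standard normal distribution.
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, true_sd) for _ in range(n)]
        mean = statistics.fmean(sample)
        se = statistics.stdev(sample) / n ** 0.5  # standard error of the mean
        if mean - z * se <= true_mean <= mean + z * se:
            hits += 1
    return hits / trials

for level in (0.90, 0.95):
    print(level, coverage(level, n=100))
```

The printed proportions should fall close to the nominal confidence levels, which is the sense in which a 95% interval is "95% confident".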
95% and 90% confidence intervals for a sample of 100
In a sample of 100 individuals drawn from the data set, the mean age is 28.3050, as shown in the table below.
Statistics
Variable | N | Mean | Std. Deviation | Std. Error Mean
Age | 100 | 28.3050 | 13.25162 | 1.32516
The 95% confidence interval implies that we can be 95% confident that the mean age of the population from which the sample of 100 was drawn lies between 25.6756 and 30.9344. It describes the population mean, not the age of a randomly selected individual.
95% Confidence Interval of the Difference
Variable | Lower | Upper
Age | 25.6756 | 30.9344
For the 90% confidence interval, we can be 90% confident that the population mean age lies between 26.1047 and 30.5053.
90% Confidence Interval of the Difference
Variable | Lower | Upper
Age | 26.1047 | 30.5053
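The limits reported for the sample of 100 can be reproduced from the summary statistics alone. The following is a minimal sketch, assuming the intervals are the usual t-based intervals for a mean. The function name mean_ci and the critical values (taken from standard t tables for 99 degrees of freedom) are illustrative and do not come from the original output.

```python
import math

def mean_ci(mean, sd, n, t_crit):
    """Two-sided confidence interval for a mean from summary statistics."""
    se = sd / math.sqrt(n)   # standard error of the mean
    margin = t_crit * se     # margin of error
    return mean - margin, mean + margin

# Critical values of Student's t with 99 degrees of freedom (from standard tables).
T_95_DF99 = 1.9842
T_90_DF99 = 1.6604

ci95 = mean_ci(28.3050, 13.25162, 100, T_95_DF99)
ci90 = mean_ci(28.3050, 13.25162, 100, T_90_DF99)
print([round(x, 4) for x in ci95])
print([round(x, 4) for x in ci90])
```

Under these assumptions the computed limits agree with the tabulated values to about three decimal places.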
95% and 90% confidence intervals for a sample of 400
In a sample of 400 individuals drawn from the data set, the mean age is 30.2236, as shown in the table below.
Statistics
Variable | N | Mean | Std. Deviation | Std. Error Mean
Age | 400 | 30.2236 | 14.56515 | 0.72826
The 95% confidence interval implies that we can be 95% confident that the mean age of the population from which the sample of 400 was drawn lies between 28.7918 and 31.6553.
95% Confidence Interval of the Difference
Variable | Lower | Upper
Age | 28.7918 | 31.6553
For the 90% confidence interval, we can be 90% confident that the population mean age lies between 29.0229 and 31.4242.
90% Confidence Interval of the Difference
Variable | Lower | Upper
Age | 29.0229 | 31.4242
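As a check on the table values, the 95% interval for the sample of 400 can be reconstructed by hand. The working below assumes the usual t-based interval for a mean and a critical value of about 1.9659 for 399 degrees of freedom, taken from standard t tables rather than from the original output.

```latex
\[
SE = \frac{s}{\sqrt{n}} = \frac{14.56515}{\sqrt{400}} \approx 0.72826
\]
\[
\bar{x} \pm t_{0.975,\,399}\, SE \approx 30.2236 \pm 1.9659 \times 0.72826 \approx 30.2236 \pm 1.4317
\]
\[
\Rightarrow\ (28.79,\ 31.66)
\]
```

This matches the reported limits of 28.7918 and 31.6553 up to rounding.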
Comparison of the confidence level and the sample size
The 90% confidence interval for the sample of 100 individuals is (26.1047, 30.5053), and the 95% confidence interval is (25.6756, 30.9344). For the second sample of 400 individuals, the 90% confidence interval is (29.0229, 31.4242), and the 95% confidence interval is (28.7918, 31.6553). In both samples, the 95% confidence interval is wider than the 90% confidence interval. Increasing the confidence level increases the margin of error (the error bound), making the confidence interval wider. In contrast, decreasing the confidence level decreases the margin of error, making the confidence interval narrower.
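The effect of the confidence level on the margin of error can be sketched numerically. The example below holds the standard error fixed at the value reported for the sample of 100 and uses normal (z) critical values as an approximation to the t values, so the margins differ slightly from those implied by the tables. The function name margin_of_error is illustrative.

```python
from statistics import NormalDist

def margin_of_error(confidence, se):
    """Margin of error using a two-sided normal (z) critical value."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return z * se

se = 1.32516  # standard error of the mean for the sample of 100
for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%}: +/- {margin_of_error(level, se):.3f}")
```

The margin grows as the confidence level rises, which is why the 95% intervals above are wider than the 90% intervals.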
Regarding sample size, the confidence intervals for the two sample sizes differ noticeably. At the 95% level, the interval for the sample of 100 individuals is about 5.26 years wide, compared with about 2.86 years for the sample of 400 individuals. Increasing the sample size decreases the standard error and therefore the margin of error, making the confidence interval narrower. Conversely, reducing the sample size increases the margin of error, making the confidence interval wider (Sauro & Lewis, 2016).
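The dependence on sample size follows from the standard error, which shrinks in proportion to the square root of n. The sketch below holds the standard deviation fixed at a hypothetical value of 14 (an assumption for illustration, not a figure from the data) and uses a normal critical value. The function name margin_for_n is illustrative.

```python
import math
from statistics import NormalDist

def margin_for_n(sd, n, confidence=0.95):
    """Margin of error for a mean, given a standard deviation and sample size."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return z * sd / math.sqrt(n)

sd = 14.0  # hypothetical fixed standard deviation
for n in (100, 400, 1600):
    print(n, round(margin_for_n(sd, n), 3))
```

With the standard deviation held fixed, quadrupling the sample size halves the margin of error. In the two samples analysed here the 95% widths differ by a factor of about 1.84 rather than 2, because the sample standard deviations differ (13.25 versus 14.57).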
The statement “Confidence intervals are underutilized” may refer to the tendency to ignore the information that a confidence interval provides. When reporting an estimate of a population parameter, one may give only a single point value and omit the range of plausible values around it. This overstates the precision of the results. Reporting a confidence interval alongside the estimate accounts for sampling variability and indicates the range in which the true population value is likely to lie. In this analysis, the intervals provide a reliable estimate of where the population mean age lies.
References
Kaggle: Your Machine Learning and Data Science Community. Kaggle.com. (2020). Retrieved 24 June 2020, from https://www.kaggle.com/.
Newcombe, R. (2013). Confidence intervals for proportions and related measures of effect size (p. 101). CRC.
Sauro, J., & Lewis, J. (2016). Quantifying the User Experience (p. 21). Elsevier Science.
Wilcox, R. (2012). Introduction to robust estimation and hypothesis testing (p. 103).