Continuing my recent series on exploratory data analysis (EDA), today’s post focuses on 5-number summaries, which were previously mentioned in the post on descriptive statistics in this series. I will define and calculate the 5-number summary in two different ways that are commonly used in R. (The different methods arise because statisticians do not universally agree on how to calculate quantiles.) I will show that the fivenum() function uses a simpler and more interpretable method to calculate the 5-number summary than the summary() function does. This post expands on a recent comment that I made to correct an error in the post on box plots.
> y = seq(1, 11, by = 2)
> y
1 3 5 7 9 11
> fivenum(y)
1 3 6 9 11
> summary(y)
Min. 1st Qu. Median Mean 3rd Qu. Max. …
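As a rough sketch of why the two functions can disagree, the snippet below recomputes the lower and upper hinges by hand for the same vector y and compares them with the quartiles from quantile(). It assumes that summary() relies on quantile()'s default method (type 7); the names lower_half, upper_half, hinges and quartiles are illustrative only and do not come from the original post.

```r
y <- seq(1, 11, by = 2)

# Tukey's hinges: the medians of the lower and upper halves of the sorted data.
# For odd n, the overall median is included in both halves.
y_sorted <- sort(y)
n <- length(y_sorted)
lower_half <- y_sorted[1:floor((n + 1) / 2)]
upper_half <- y_sorted[ceiling((n + 1) / 2):n]
hinges <- c(median(lower_half), median(upper_half))

# Quartiles by linear interpolation between order statistics
# (type 7, the default of quantile(); assumed to be what summary() uses).
quartiles <- quantile(y, probs = c(0.25, 0.75), type = 7, names = FALSE)

print(hinges)     # 3 9, matching the 2nd and 4th values of fivenum(y)
print(quartiles)  # 3.5 8.5
```

The hinges are always either data points or midpoints between two adjacent data points, which is what makes them easy to interpret; the interpolated quartiles can fall at other positions between two data points.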