Continuing my recent series on exploratory data analysis (EDA), and following up on the last post on the conceptual foundations of empirical cumulative distribution functions (CDFs), this post shows how to plot them in R. (Previous posts in this series on EDA include descriptive statistics, box plots, kernel density estimation, and violin plots.)
I will plot empirical CDFs in two ways:
- using the built-in ecdf() and plot() functions in R
- calculating the cumulative probabilities and plotting them against the ordered data
Continuing from the previous posts in this series on EDA, I will use the “Ozone” data from the built-in “airquality” data set in R. Recall that this data set has missing values, and, just as before, this problem needs to be addressed when constructing plots of the empirical CDFs.
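As a rough sketch of what these two approaches could look like on the "Ozone" data, the code below first drops the missing values and then draws the empirical CDF both ways. The variable names (`ozone`, `ozone.ecdf`, `ozone.ordered`, `cum.prob`), the axis labels, and the choice of i/n for the cumulative probabilities are assumptions made for illustration; other conventions, such as (i - 0.5)/n, are also in use, and the full post may handle these details differently.

```r
# Extract the Ozone column from the built-in airquality data set
# and remove the missing values (NA) before any calculation
ozone <- airquality$Ozone
ozone <- ozone[!is.na(ozone)]
n <- length(ozone)

# Way 1: the built-in ecdf() and plot() functions
# ecdf() returns a step function; plot() has a method for it
ozone.ecdf <- ecdf(ozone)
plot(ozone.ecdf,
     xlab = "Ozone (ppb)",
     ylab = "Cumulative probability",
     main = "Empirical CDF of Ozone (ecdf)")

# Way 2: compute the cumulative probabilities by hand
# (assumed convention: the i-th ordered value gets probability i/n)
ozone.ordered <- sort(ozone)
cum.prob <- (1:n) / n

# Plot the cumulative probabilities against the ordered data as steps
plot(ozone.ordered, cum.prob,
     type = "s",
     xlab = "Ozone (ppb)",
     ylab = "Cumulative probability",
     main = "Empirical CDF of Ozone (manual)")
```

Under the i/n convention, the two plots should trace the same step function. Where the data contain ties, the manual version assigns several probabilities to the same ozone value, and only the largest of them matches what `ecdf()` returns at that value.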
Recall the plot of the empirical CDF of random standard normal numbers in…