This is a follow-up to my recent introduction to histograms. Previously, I presented the conceptual foundations of histograms and used one to approximate the distribution of the “Ozone” data from the built-in data set “airquality” in R. Today, I will examine this distribution in more detail by overlaying the histogram with parametric and non-parametric (kernel) density plots. I will finally answer the question that I have asked (and hinted that I would answer) several times: Are the “Ozone” data normally distributed, or is another distribution more suitable?
Read the rest of this post to learn how to combine histograms with density curves like the plot above!
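As a quick preview, here is a minimal sketch of the idea in base R. It is not necessarily the code used in the full post: the number of breaks, the line styles, the default kernel bandwidth, and the choice of the normal distribution as the only parametric candidate are all my assumptions for illustration.

```r
# Ozone readings (parts per billion) from the built-in airquality data set,
# with missing values removed
ozone <- airquality$Ozone[!is.na(airquality$Ozone)]

# Draw the histogram on the density scale (freq = FALSE) so that
# density curves can be overlaid on the same vertical axis
hist(ozone, freq = FALSE, breaks = 15,
     main = "Histogram of Ozone with Density Curves",
     xlab = "Ozone (ppb)")

# Non-parametric curve: kernel density estimate
# (Gaussian kernel and default bandwidth; both are assumptions)
kde <- density(ozone)
lines(kde, lwd = 2)

# Parametric curve: normal density using the sample mean and
# sample standard deviation as parameter estimates
x <- seq(0, max(ozone), length.out = 200)
lines(x, dnorm(x, mean = mean(ozone), sd = sd(ozone)), lty = 2, lwd = 2)

legend("topright",
       legend = c("Kernel density", "Normal density"),
       lty = c(1, 2), lwd = 2)
```

If the dashed normal curve tracks the histogram and the kernel density estimate poorly, that is a visual hint that another distribution may be more suitable.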
This is another post in my continuing series on exploratory data analysis (EDA). Previous posts in this series on EDA include