Iris Analysis

Analysis with Nextjournal and R using the built-in Iris dataset

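The Iris data comes from Ronald Fisher's 1936 paper, "The Use of Multiple Measurements in Taxonomic Problems". Four features (sepal length, sepal width, petal length, and petal width, all in centimeters) were measured and recorded for three Iris species: setosa, versicolor, and virginica. Fisher settled on a sample size of fifty for each species, giving 150 flowers in total.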

Iris virginica

The data

The raw data can be viewed in a table by returning it as the final value from a cell.
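
As a minimal sketch, assuming a plain R cell with nothing else loaded, the cell can simply end with the data frame itself. The data(iris) call is optional, since iris ships with base R, and head() is shown only as an alternative for a shorter preview.

```r
# iris is part of the built-in "datasets" package, so no import is needed.
data(iris)

# Optional: preview only the first six rows.
head(iris)

# The final value of the cell is the full data frame (150 rows, 5 columns).
iris
```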


The summary() function shows a few statistical measures of the data.
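
A cell along these lines would produce the summary. The variable name iris_summary is only an illustration; calling summary(iris) as the last expression of the cell works just as well.

```r
# Per-column summary: quartiles and mean for the four numeric columns,
# and a count per level for the Species factor.
iris_summary <- summary(iris)
iris_summary
```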



We'll use the plotting functionality built into R and attach the dataset to make things cleaner.
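
A minimal sketch of the attach step, assuming the plotting cells that follow refer to columns such as Petal.Length and Species by bare name:

```r
# attach() adds the columns of iris to the search path,
# so they can be used without the iris$ prefix.
attach(iris)

# Petal.Length now resolves to iris$Petal.Length.
head(Petal.Length)
```

If attaching feels too global, with(iris, ...) or explicit iris$ prefixes are common alternatives, at the cost of slightly longer plotting calls.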


Scatter plot with regression line.

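# Petal width against petal length, one color per species.
plot(Petal.Length, Petal.Width,
     col=Species, pch=19,
     main="Iris Data",
     xlab="Iris petal length",
     ylab="Iris petal width")
# Overlay the least-squares regression line in red.
abline(lm(Petal.Width~Petal.Length), col="red")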
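
The red line is the fit returned by lm(). As a small sketch, its intercept and slope can be inspected directly. The name fit is an assumption made for this example, and data = iris is passed so that the cell runs even without attach().

```r
# Ordinary least-squares fit of petal width on petal length.
# data = iris keeps the example independent of attach().
fit <- lm(Petal.Width ~ Petal.Length, data = iris)

# Named vector: "(Intercept)" and "Petal.Length" (the slope).
coef(fit)
```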

Scatter matrix, with color coding by species and regression lines.

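# Draw the points in each panel, then overlay a least-squares line.
# col and pch reach the panel function through "...".
pairs(~Sepal.Length+Sepal.Width+Petal.Length+Petal.Width, data=iris,
      panel=function(x,y,...) {
        points(x,y,...)
        abline(lm(y~x), col="red")
      },
      col=Species, pch=19,
      main="Iris Matrix",
      labels=c("Sepal Length", "Sepal Width", "Petal Length", "Petal Width"))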

legend("bottomright", fill = unique(Species), legend = c(levels(Species)), bg="white")