Module 6 - Practice - Correlation and Models


Exercise 1:

From the datasets folder, load the "tamiami.csv" file as a dataframe. Rename the columns (in order) to the following:

  • location
  • sales
  • employees
  • restaurants
  • foodcarts
  • price

Then compute a correlation table for that dataframe. Which features (columns) are correlated with each other? Which are not?
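As a starting point, the rename-then-correlate steps might look like the sketch below. It assumes pandas is the dataframe library in use. The inline CSV text, its header names, and every value in it are made up so the example runs on its own; they are not the contents of tamiami.csv. The path in the comment is also an assumption.

```python
import io
import pandas as pd

# Made-up stand-in for the real file. With the actual data you would
# load it with something like pd.read_csv("datasets/tamiami.csv")
# (the exact path is an assumption).
raw = io.StringIO(
    "c1,c2,c3,c4,c5,c6\n"
    "s1,100,5,3,2,1.5\n"
    "s2,150,8,1,4,1.2\n"
    "s3,200,10,4,1,1.8\n"
    "s4,260,13,2,3,1.1\n"
    "s5,310,15,5,5,1.6\n"
    "s6,350,18,2,2,1.4\n"
)
df = pd.read_csv(raw)

# Rename all columns at once; the list must match the column order.
df.columns = ["location", "sales", "employees",
              "restaurants", "foodcarts", "price"]

# Correlation is only defined for numeric columns, so keep those.
corr = df.select_dtypes("number").corr()
print(corr.round(2))
```

Values close to 1 or -1 in the table indicate a strong linear relationship between two columns; values near 0 indicate little or none.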

Exercise 2:

Using the dataframe from the previous exercise, choose features (columns) and build a linear regression formula to predict sales. Fit the model both with and without the y-intercept. What difference does the intercept make? Does adding or removing features in your model formula change the output?
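To see what the intercept and extra features do, here is a minimal sketch that fits the models with plain NumPy least squares rather than a formula interface. The data values and the helper name `fit` are hypothetical and for illustration only. In a formula-style library such as statsmodels, the corresponding formulas would be written along the lines of "sales ~ employees" (with intercept) and "sales ~ employees - 1" (without).

```python
import numpy as np
import pandas as pd

# Made-up data for illustration only; not the tamiami values.
df = pd.DataFrame({
    "sales": [100, 150, 200, 260, 310, 350],
    "employees": [5, 8, 10, 13, 15, 18],
    "restaurants": [3, 1, 4, 2, 5, 2],
})

def fit(data, features, intercept=True):
    """Least squares fit of sales on the given features (hypothetical helper).

    Returns the coefficients and the sum of squared residuals.
    """
    X = data[features].to_numpy(dtype=float)
    names = list(features)
    if intercept:
        # A column of ones lets the model learn a constant offset.
        X = np.column_stack([np.ones(len(data)), X])
        names = ["Intercept"] + names
    y = data["sales"].to_numpy(dtype=float)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    residuals = y - X @ coef
    return pd.Series(coef, index=names), float(residuals @ residuals)

with_b, sse_with = fit(df, ["employees"])
without_b, sse_without = fit(df, ["employees"], intercept=False)
more, sse_more = fit(df, ["employees", "restaurants"])

print("with intercept:\n", with_b, "\nSSE:", sse_with)
print("without intercept:\n", without_b, "\nSSE:", sse_without)
print("two features:\n", more, "\nSSE:", sse_more)
```

Dropping the intercept forces the fitted line through the origin, which changes the slope estimates. On the data used for fitting, the squared error never goes up when an intercept or another feature is added, so a lower in-sample error alone does not show that the extra term is useful.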
