Module 8 - Practice - Logistic Regression

Back to the course outline

Exercise 1:

Download the diabetes.csv file and load it as a dataframe. Narrow the dataset to the columns and/or rows that best predict whether a patient will get diabetes.
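One possible way to approach the narrowing step is to rank the columns by how strongly they correlate with the target. The sketch below is only an illustration: it uses a tiny in-memory stand-in for diabetes.csv, and the column names (Glucose, BMI, Age, SkinThickness, Outcome) and the values are assumptions, not taken from the actual file. Correlation ranking is one heuristic among several, so treat the choice of "top two" as arbitrary.

```python
import io
import pandas as pd

# Hypothetical stand-in for diabetes.csv; column names and values are assumptions.
csv_text = """Glucose,BMI,Age,SkinThickness,Outcome
148,33.6,50,35,1
85,26.6,31,29,0
183,23.3,32,0,1
89,28.1,21,23,0
137,43.1,33,35,1
116,25.6,30,0,0
"""

# With the real file this would be pd.read_csv("diabetes.csv").
df = pd.read_csv(io.StringIO(csv_text))

# Rank each feature by the absolute value of its correlation with the target.
corr = df.corr()["Outcome"].drop("Outcome").abs().sort_values(ascending=False)
print(corr)

# Keep the two most correlated features plus the target column.
keep = list(corr.index[:2]) + ["Outcome"]
narrowed = df[keep]
print(narrowed.head())
```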

Exercise 2:

Using the dataframe from the exercise above, split the dataset into training and testing sets. Use the default test size of 25%.
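A minimal sketch of the split, assuming scikit-learn's train_test_split is the intended tool. Synthetic data from make_classification stands in for the narrowed dataframe here, so the feature count and sample size are assumptions. When neither test_size nor train_size is given, train_test_split holds out 25% of the rows.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the features (X) and the outcome column (y).
X, y = make_classification(n_samples=200, n_features=4, random_state=0)

# No test_size argument: the default holds out 25% of the rows for testing.
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

print(X_train.shape, X_test.shape)  # (150, 4) (50, 4)
```

Passing random_state is optional; it only makes the split reproducible between runs.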

Exercise 3:

Using the LogisticRegression class from the scikit-learn library (sklearn), fit a model to the training dataset. Then score the model on the training data; how well did it do?
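As a rough sketch of fitting and scoring on the training set, again using synthetic data in place of the diabetes dataframe (an assumption made so the example runs on its own). For a classifier, score returns the mean accuracy, a value between 0 and 1.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the diabetes features and outcome.
X, y = make_classification(n_samples=200, n_features=4, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Fit the model on the training portion only.
model = LogisticRegression()
model.fit(X_train, y_train)

# Mean accuracy on the same data the model was trained on.
train_score = model.score(X_train, y_train)
print(f"Training accuracy: {train_score:.3f}")
```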

Exercise 4:

Now score the fitted logistic regression model on the test dataset.
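The sketch below shows one way to compare training and test accuracy side by side. It assumes synthetic data rather than the real diabetes file. The key point is that the model is fitted once on the training data and is not refitted on the test data; the test set is used only for scoring.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the diabetes features and outcome.
X, y = make_classification(n_samples=200, n_features=4, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = LogisticRegression().fit(X_train, y_train)

# Score on held-out data the model has not seen during fitting.
train_score = model.score(X_train, y_train)
test_score = model.score(X_test, y_test)
print(f"Training accuracy: {train_score:.3f}")
print(f"Test accuracy:     {test_score:.3f}")
```

A test score far below the training score would suggest overfitting; similar scores suggest the model generalizes about as well as it fits.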

Exercise 5:

Make a confusion matrix that compares the predicted outcomes against the "true" outcomes. How many predictions for each outcome did the model get wrong?
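A small sketch of building the confusion matrix with sklearn.metrics.confusion_matrix, using synthetic data as a stand-in for the diabetes set (an assumption). In scikit-learn's layout, rows are the true classes and columns are the predicted classes, so the off-diagonal cells hold the mistakes.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the diabetes features and outcome.
X, y = make_classification(n_samples=200, n_features=4, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = LogisticRegression().fit(X_train, y_train)
y_pred = model.predict(X_test)

# Rows: true class (0, 1). Columns: predicted class (0, 1).
cm = confusion_matrix(y_test, y_pred)
print(cm)

# Off-diagonal entries are the incorrect predictions.
wrong_for_true_0 = cm[0, 1]  # true 0 predicted as 1 (false positives)
wrong_for_true_1 = cm[1, 0]  # true 1 predicted as 0 (false negatives)
print("Misclassified true 0s:", wrong_for_true_0)
print("Misclassified true 1s:", wrong_for_true_1)
```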

Exercise 6:

Generate a classification report for the model's predictions. Which outcome is the model more accurate at predicting?
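One way to produce the report is sklearn.metrics.classification_report, sketched here on synthetic data (an assumption in place of the diabetes dataframe). The report lists precision, recall, and F1-score per class, which can be compared across the two outcomes to see which one the model handles better.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the diabetes features and outcome.
X, y = make_classification(n_samples=200, n_features=4, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = LogisticRegression().fit(X_train, y_train)
y_pred = model.predict(X_test)

# True labels go first, predictions second.
report = classification_report(y_test, y_pred)
print(report)
```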
