Module 3 - Practice - Data Manipulation
Exercise 1:
From the datasets folder, load in the "dupedata.csv" file as a dataframe. Drop the duplicates from the dataframe, keeping the first value (save the resulting dataframe to a new variable).
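One possible approach is sketched below. The real contents of "dupedata.csv" are not shown here, so the sample uses a small made-up table with hypothetical "name" and "grade" columns read from an in-memory string; with the actual file you would pass its path (for example "datasets/dupedata.csv", assuming that folder layout) to pd.read_csv instead.

```python
import io

import pandas as pd

# Hypothetical stand-in for datasets/dupedata.csv; the real columns may differ.
csv_text = """name,grade
Ana,95
Ben,58
Ana,95
Cy,100
Ben,58
"""

# In practice: df = pd.read_csv("datasets/dupedata.csv")
df = pd.read_csv(io.StringIO(csv_text))

# keep="first" retains the first occurrence of each fully duplicated row.
# drop_duplicates returns a new dataframe, so df itself is left unchanged.
deduped = df.drop_duplicates(keep="first")

print(deduped)
```

Because drop_duplicates returns a new dataframe by default, assigning the result to a new name satisfies the "save to a new variable" requirement without modifying the original.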
Exercise 2:
Using the dataframe in the previous exercise, select all the rows where students received a grade lower than 60 (they need a teacher conference on how to improve for the next test).
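A minimal sketch of row selection with a boolean mask follows. The column name "grade" and the sample rows are assumptions made for illustration; substitute the actual grade column from your dataframe.

```python
import pandas as pd

# Hypothetical de-duplicated data; the real column names may differ.
deduped = pd.DataFrame(
    {"name": ["Ana", "Ben", "Cy", "Dee"], "grade": [95, 58, 100, 42]}
)

# The comparison produces a True/False Series; indexing with it keeps
# only the rows where the condition is True.
needs_conference = deduped[deduped["grade"] < 60]

print(needs_conference)
```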
Exercise 3:
Using the dataframe from Exercise 1, select all the rows where a student received a grade of 100 and change their grade to 103 (extra credit!).
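One way to update only the matching rows is .loc with a boolean mask and a column label, sketched here on made-up data with a hypothetical "grade" column.

```python
import pandas as pd

# Hypothetical de-duplicated data; the real column names may differ.
deduped = pd.DataFrame(
    {"name": ["Ana", "Ben", "Cy", "Dee"], "grade": [95, 58, 100, 42]}
)

# .loc[row_mask, column] selects the matching cells and assigns in place.
deduped.loc[deduped["grade"] == 100, "grade"] = 103

print(deduped)
```

Using .loc in a single step avoids chained indexing such as deduped[mask]["grade"] = 103, which may assign to a temporary copy and leave the dataframe unchanged.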
Exercise 4:
Load the "travel_times.csv" file as a dataframe. Drop the "Comments" column. Then remove the rows that have missing values and assign the resulting dataframe to a new variable.
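The two steps could be chained as in the sketch below. The sample rows are invented; only the column names "Comments", "AvgSpeed", and "FuelEconomy" come from these exercises. With the real file you would start from pd.read_csv on "travel_times.csv".

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for travel_times.csv.
travel = pd.DataFrame(
    {
        "AvgSpeed": [70.2, 81.5, np.nan, 64.0],
        "FuelEconomy": [8.9, np.nan, 9.1, 8.5],
        "Comments": [np.nan, "rain", np.nan, np.nan],
    }
)

# Drop the column first, then drop any row that still has a missing value.
travel_clean = travel.drop(columns=["Comments"]).dropna()

print(travel_clean)
```

The order matters in this sample: "Comments" is empty in most rows, so calling dropna before dropping the column would discard nearly every row.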
Exercise 5:
Using the dataframe from the previous exercise (with no missing values), bin the AvgSpeed column into two categories, "slow" and "fast", and store the labels in a new column called "Speed". Values less than 75 are "slow"; values of 75 or more are "fast".
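A sketch using pd.cut is shown below, on invented AvgSpeed values. It assumes a speed of exactly 75 counts as "fast", since only values less than 75 are described as "slow"; right=False makes each bin include its left edge to match that reading.

```python
import numpy as np
import pandas as pd

# Hypothetical cleaned data; only the AvgSpeed column name comes from the exercise.
travel_clean = pd.DataFrame({"AvgSpeed": [70.2, 75.0, 81.5]})

# Bin edges: [-inf, 75) -> "slow", [75, inf) -> "fast".
bins = [-np.inf, 75, np.inf]
labels = ["slow", "fast"]

travel_clean["Speed"] = pd.cut(
    travel_clean["AvgSpeed"], bins=bins, labels=labels, right=False
)

print(travel_clean)
```

With only two categories, np.where(travel_clean["AvgSpeed"] < 75, "slow", "fast") would give the same labels; pd.cut is used here because the exercise asks for bins.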
Exercise 6:
Using the dataframe from the previous exercise, make a new column called "Police" in which every value is "no" (they were never stopped by police for speeding while traveling).
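Assigning a single value to a new column label fills every row with that value, as this short sketch on made-up data illustrates.

```python
import pandas as pd

# Hypothetical data carried over from the previous exercise.
travel_clean = pd.DataFrame(
    {"AvgSpeed": [70.2, 75.0, 81.5], "Speed": ["slow", "fast", "fast"]}
)

# A scalar on the right-hand side is broadcast to every row.
travel_clean["Police"] = "no"

print(travel_clean)
```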
Exercise 7:
Using the dataframe from the previous exercise, pick a method (Standard Deviation or Interquartile Range) and remove the outliers from the "FuelEconomy" column.
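The sketch below takes the Interquartile Range route on invented FuelEconomy values. The 1.5 multiplier is the conventional choice for the fences, not something the exercise specifies, and can be adjusted.

```python
import pandas as pd

# Hypothetical FuelEconomy values with one obvious outlier (20.0).
travel_clean = pd.DataFrame(
    {"FuelEconomy": [8.5, 8.7, 8.9, 9.0, 9.1, 9.3, 20.0]}
)

# Quartiles and interquartile range of the column.
q1 = travel_clean["FuelEconomy"].quantile(0.25)
q3 = travel_clean["FuelEconomy"].quantile(0.75)
iqr = q3 - q1

# Conventional fences: 1.5 * IQR below Q1 and above Q3.
lower = q1 - 1.5 * iqr
upper = q3 + 1.5 * iqr

# Keep only the rows whose value lies within the fences (inclusive).
no_outliers = travel_clean[travel_clean["FuelEconomy"].between(lower, upper)]

print(no_outliers)
```

The Standard Deviation alternative follows the same pattern: compute the column mean and standard deviation, then keep the rows within a chosen number of standard deviations (often 3) of the mean.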