Task I (5%)
Objective: Open your first csv file and begin exploring the dataset.
File: carAuction.csv
Import the packages needed
Pandas
The package for mounting Google Drive folders
Seaborn
Task II (20%)
Read in the car auction CSV file.
Describe the dataframe using the describe() function. In a text box below the code cell briefly describe data patterns you see in each column.
Show the column information using the info() function. In a text box below the code cell briefly describe what information you have learned from the info function. Are there any nulls? How many columns are there? What datatypes are in our data?
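The Task II steps can be sketched as follows, assuming pandas is installed. Because the contents of carAuction.csv are not reproduced here, the example reads a small made-up CSV from an in-memory string. The rows are invented for illustration only. In the notebook you would pass the path to the real file to pd.read_csv instead.

```python
import io

import pandas as pd

# Hypothetical stand-in for carAuction.csv; these rows are invented.
csv_text = """VehOdo,Size,WarrantyCost,IsBadBuy
89046,MEDIUM,1113,No
93593,LARGE TRUCK,1053,No
73807,MEDIUM,1389,Yes
65617,COMPACT,630,No
"""

# In the notebook, replace io.StringIO(csv_text) with the path to the real file.
df = pd.read_csv(io.StringIO(csv_text))

# describe() summarizes numeric columns by default (count, mean, std, quartiles).
summary = df.describe()
print(summary)

# info() prints column names, non-null counts, and dtypes.
df.info()

# An explicit per-column null count, to back up what info() reports.
null_counts = df.isna().sum()
print(null_counts)
```

The written description of each column still has to come from reading the real output. The null count is only a convenient cross-check against info().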
Task III (20%)
Show the shape of the dataframe.
Extract the number of rows from shape and print("number of rows is", x) # replace the x with the number of rows retrieved via code
Extract the number of columns from shape and print("number of columns is", x) # replace the x with the number of columns retrieved via code
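A minimal sketch of reading the row and column counts from shape, using a tiny invented dataframe in place of the real data. The names n_rows and n_cols are illustrative choices, not required by the assignment.

```python
import pandas as pd

# Invented example data; the real dataframe comes from carAuction.csv.
df = pd.DataFrame(
    {
        "VehOdo": [89046, 93593, 73807],
        "Size": ["MEDIUM", "LARGE TRUCK", "COMPACT"],
    }
)

# shape is a (rows, columns) tuple, so it can be unpacked directly.
print(df.shape)
n_rows, n_cols = df.shape

print("number of rows is", n_rows)
print("number of columns is", n_cols)
```

Indexing with df.shape[0] and df.shape[1] gives the same two values if you prefer not to unpack the tuple.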
Task IV (25%)
Show the head of the dataframe
Create a new True/False column called 'over_80k_miles' that is True when the vehicle odometer reading (VehOdo) is greater than 80,000 miles. Show the head of the updated dataframe.
Filter the dataframe to return only the rows where over_80k_miles is True. Do not overwrite the existing dataframe. Show the filtered dataframe.
Sort the dataframe in descending order by VehOdo. Show the sorted result.
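The Task IV steps can be sketched as follows on a small invented dataframe. The variable names high_mileage and sorted_df are illustrative. Both results are stored under new names so the original df is not overwritten.

```python
import pandas as pd

# Invented example data; the real dataframe comes from carAuction.csv.
df = pd.DataFrame(
    {
        "VehOdo": [89046, 65617, 93593, 80000],
        "WarrantyCost": [1113, 630, 1053, 975],
    }
)
print(df.head())

# The comparison yields a boolean Series; exactly 80,000 is not "greater than".
df["over_80k_miles"] = df["VehOdo"] > 80_000
print(df.head())

# Boolean indexing returns a new dataframe, leaving df unchanged.
high_mileage = df[df["over_80k_miles"]]
print(high_mileage)

# sort_values also returns a new dataframe unless inplace=True is passed.
sorted_df = df.sort_values("VehOdo", ascending=False)
print(sorted_df)
```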
Task V (25%)
Create a boxplot of VehOdo. In a new textbox describe the graph and what it tells us.
Create a bar plot showing the count of vehicles in each Size category. Describe the graph in a textbox below.
Create a histogram of WarrantyCost. Change the bins to visualize the distribution better. Describe the graph in a textbox below.
Create a seaborn pairplot. For each pair of features, describe the datatypes being compared, such as continuous-continuous, continuous-categorical, or categorical-categorical.
Create a seaborn pairplot with the hue set for IsBadBuy. Describe the plot. Do any patterns jump out that might indicate groupings of bad buys?
Task VI (5%)
Render A2_yourLastname_yourFirstname.ipynb to an HTML output file. Submit these files and A2_yourLastname_yourFirstName.xxx to Canvas. Please do not submit a zipped file.