Pre-Work: Download the file BestSmileDental.csv.
The file BestSmileDental.csv contains the number of patient/customer visits to a dental clinic, “Best Smile Dental”, over the past seven years. The data is aggregated
by month within each year. Using forecasting techniques, you need to predict the customer/patient count for each of the 12 months of 2008. As part of this exercise, you
will have to perform the following steps:
1. Clean and pre-process the data (remove invalid values).
   When you explore the data, you will notice that the “Customers” column contains invalid values (non-numeric, negative, or outlying). Remember to use plots/graphs to
   detect invalid or outlying values.
2. Impute the missing/invalid data using one of the following R packages: imputeTS, mice, or Amelia. Make sure to provide a table listing all the imputed values.
3. Create two time series models, ARIMA and Holt-Winters, using the cleaned and imputed dataset.
4. Compare the two models and select the better one. Explain the reasons for your selection.
5. Using the selected model, forecast the customer/patient count for the 12 months of 2008 and state the forecasted values clearly in your Word report.
You will need to explain the work performed in each of the above steps in your Word report. Make sure to use visualizations to support your explanations.
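
The steps above can be sketched in R roughly as follows. This is only an illustrative outline under several assumptions, not a reference solution: it runs on a synthetic series instead of the real file (whose layout is not shown here), it assumes the seven years are 2001 to 2007, it flags outliers with a 1.5 * IQR rule, it fixes the ARIMA order by hand, and it compares the models by RMSE on a 12-month holdout. None of these choices are prescribed by the brief. If imputeTS is not installed, the sketch falls back to base R linear interpolation.

```r
# Synthetic stand-in for the "Customers" column: 84 monthly counts (7 years).
# Replace with the real column, e.g. read.csv("BestSmileDental.csv")$Customers.
set.seed(1)
m <- 1:84
raw <- as.character(round(200 + 0.5 * m + 20 * sin(2 * pi * m / 12) + rnorm(84, sd = 5)))
raw[c(10, 40)] <- c("abc", "-50")  # a non-numeric and a negative entry
raw[60] <- "5000"                  # an extreme outlier

# Step 1: coerce to numeric; mark non-numeric, negative and outlying values as NA.
customers <- suppressWarnings(as.numeric(raw))
customers[!is.na(customers) & customers < 0] <- NA
q <- quantile(customers, c(0.25, 0.75), na.rm = TRUE)
fence <- 1.5 * diff(q)             # assumed outlier threshold
is_out <- !is.na(customers) & (customers < q[1] - fence | customers > q[2] + fence)
customers[is_out] <- NA
invalid_idx <- which(is.na(customers))

# Step 2: impute (start year 2001 is an assumption).
ts_raw <- ts(customers, start = c(2001, 1), frequency = 12)
if (requireNamespace("imputeTS", quietly = TRUE)) {
  ts_clean <- imputeTS::na_interpolation(ts_raw, option = "linear")
} else {
  # Base R fallback: linear interpolation over the non-missing points.
  filled <- approx(m, customers, xout = m, rule = 2)$y
  ts_clean <- ts(filled, start = c(2001, 1), frequency = 12)
}
# Table of imputed values for the report.
imputed <- data.frame(
  year  = 2001 + (invalid_idx - 1) %/% 12,
  month = (invalid_idx - 1) %% 12 + 1,
  imputed_value = as.numeric(ts_clean)[invalid_idx]
)

# Steps 3 and 4: fit both models on 2001-2006, score them on 2007.
train <- window(ts_clean, end = c(2006, 12))
test  <- window(ts_clean, start = c(2007, 1))
fit_arima <- function(y) {
  # Hand-picked seasonal order, used here only for illustration.
  arima(y, order = c(0, 1, 1), seasonal = list(order = c(0, 1, 1), period = 12))
}
hw_pred <- predict(HoltWinters(train), n.ahead = 12)
ar_pred <- predict(fit_arima(train), n.ahead = 12)$pred
rmse <- function(actual, pred) sqrt(mean((as.numeric(actual) - as.numeric(pred))^2))
scores <- c(HoltWinters = rmse(test, hw_pred), ARIMA = rmse(test, ar_pred))
best <- names(which.min(scores))

# Step 5: refit the selected model on all seven years and forecast 2008.
forecast_2008 <- if (best == "HoltWinters") {
  as.numeric(predict(HoltWinters(ts_clean), n.ahead = 12))
} else {
  as.numeric(predict(fit_arima(ts_clean), n.ahead = 12)$pred)
}
```

Printing `imputed`, `scores` and `forecast_2008` gives the imputed-value table, the comparison figures and the twelve forecasts for the report. Calling `plot(ts_raw)` before and `plot(ts_clean)` after imputation is one way to produce the diagnostic plots. In real work the ARIMA order would normally be chosen from the data, for example with `auto.arima()` from the forecast package, instead of being fixed as above.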