Assume you are a marketing executive at a retail company. In January 2020, the company emailed an online shopping voucher to 2,240 randomly chosen customers. One month later, your colleague checked whether each of these customers had used the voucher. The outcome is summarised in a variable called Response (1-Used the voucher, 0-Did not use the voucher) in the dataset “customer_campaigns.sav”. The dataset also contains other personal details of these customers (see Table 1).
Your manager gave you a second dataset, “transactions.sav” (see Table 2), which summarises the transaction records of these 2,240 customers in 2018 and 2019. Using these two datasets, your manager would like you to prepare a report that includes the following:
1. Pre-process the given datasets and conduct a basic descriptive analysis;
2. Using the “transactions.sav” dataset, conduct an RFM (recency, frequency, monetary) analysis and discuss how to use the results of this analysis in practice;
3. With the given data, build a classification model to understand which variables drive response and which types of customers have a higher propensity to respond to this type of online shopping voucher.
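The RFM step in task 2 can be sketched as follows. This is a minimal illustration, not a prescribed method: it assumes each transaction record provides a customer ID, a purchase date and an amount (the real columns are those listed in Table 2 and may differ), it uses made-up sample rows, and it takes 1 January 2020 as an assumed reference date for recency. Scores are assigned by rank position, which is one simple scoring choice among several.

```python
from collections import defaultdict
from datetime import date

# Hypothetical transaction records: (customer_id, purchase_date, amount).
# These rows are invented for illustration; the real fields come from Table 2.
transactions = [
    (1, date(2019, 12, 20), 120.0),
    (1, date(2019, 6, 1), 80.0),
    (2, date(2018, 3, 15), 40.0),
    (3, date(2019, 11, 2), 300.0),
    (3, date(2019, 10, 5), 150.0),
    (3, date(2018, 7, 9), 50.0),
    (4, date(2019, 1, 30), 60.0),
]

# Assumed reference date: the day after the 2018-2019 data window ends.
SNAPSHOT = date(2020, 1, 1)


def rfm_table(rows, snapshot):
    """Return {customer_id: (recency_days, frequency, monetary)}."""
    last_purchase = {}
    frequency = defaultdict(int)
    monetary = defaultdict(float)
    for cid, day, amount in rows:
        last_purchase[cid] = max(day, last_purchase.get(cid, day))
        frequency[cid] += 1
        monetary[cid] += amount
    return {
        cid: ((snapshot - last_purchase[cid]).days, frequency[cid], monetary[cid])
        for cid in last_purchase
    }


def rank_scores(values, bins, higher_is_better=True):
    """Score each customer from 1 to bins by rank position (ties broken by ID)."""
    ordered = sorted(
        values,
        key=lambda cid: (values[cid], cid),
        reverse=not higher_is_better,
    )
    n = len(ordered)
    return {cid: 1 + (pos * bins) // n for pos, cid in enumerate(ordered)}


def rfm_scores(rows, snapshot, bins=5):
    """Return {customer_id: 'RFM'} where each letter is a score from 1 to bins."""
    table = rfm_table(rows, snapshot)
    # A smaller recency (a more recent purchase) should earn a higher score.
    r = rank_scores({c: v[0] for c, v in table.items()}, bins, higher_is_better=False)
    f = rank_scores({c: v[1] for c, v in table.items()}, bins)
    m = rank_scores({c: v[2] for c, v in table.items()}, bins)
    return {c: f"{r[c]}{f[c]}{m[c]}" for c in table}


if __name__ == "__main__":
    for cid, score in sorted(rfm_scores(transactions, SNAPSHOT, bins=2).items()):
        print(cid, score)
```

In practice, customers with high scores on all three dimensions are natural candidates for loyalty offers, while customers with high frequency and monetary scores but low recency scores may be worth a win-back campaign.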
Name            Description
NID             Customer ID
Year_Birth      Year of birth
Education       Highest education level (1-Basic, 2-Undergraduate, 3-Master, 4-PhD)
Marital_Status  Marital status (1-Single, 2-Married/With a partner, 3-Divorced, 4-Widow/Widower, 5-Others)
Income          Annual personal income
Childhome       Number of children
SDate           Date of sign-up with the company
Response        Used the voucher or not (1-Used the voucher, 0-Did not use the voucher)
Table 1. Description of the “customer_campaigns.sav” dataset
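As part of pre-processing, the coded variables in Table 1 can be translated into readable labels before any descriptive analysis. The sketch below takes the code-to-label mappings directly from Table 1; the example record, the function name decode_customer and the derived Age field (campaign year minus Year_Birth) are illustrative assumptions. Loading the .sav files is not shown here; one option is pandas.read_spss, which requires the pyreadstat package.

```python
# Code-to-label mappings taken from Table 1.
EDUCATION = {1: "Basic", 2: "Undergraduate", 3: "Master", 4: "PhD"}
MARITAL_STATUS = {
    1: "Single",
    2: "Married/With a partner",
    3: "Divorced",
    4: "Widow/Widower",
    5: "Others",
}
RESPONSE = {1: "Used the voucher", 0: "Did not use the voucher"}

CAMPAIGN_YEAR = 2020  # the voucher was sent in January 2020


def decode_customer(record):
    """Return a copy of a customer record with labels in place of codes.

    Codes that are missing from the codebook become None, so they can be
    flagged for cleaning. Age is a derived field (an assumption of this
    sketch, not a column of the dataset).
    """
    decoded = dict(record)
    decoded["Education"] = EDUCATION.get(record["Education"])
    decoded["Marital_Status"] = MARITAL_STATUS.get(record["Marital_Status"])
    decoded["Response"] = RESPONSE.get(record["Response"])
    decoded["Age"] = CAMPAIGN_YEAR - record["Year_Birth"]
    return decoded


# Hypothetical record, invented for illustration only.
example = {
    "NID": 1,
    "Year_Birth": 1975,
    "Education": 3,
    "Marital_Status": 2,
    "Income": 52000,
    "Childhome": 1,
    "Response": 1,
}

if __name__ == "__main__":
    print(decode_customer(example))
```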