Data mining is the process of discovering patterns in large data sets, starting from a collection of existing data. In my view, it consists of four main steps:
Collecting data: This step is complex in its own right; here I will assume the datasets have already been obtained.
Pre-processing: Understand the data, remove or reduce noise, perform dimensionality reduction, select appropriate predictors, and so on.
Feeding the model: Use the prepared data to train the data mining model.
Post-processing: Interpret and evaluate the fitted model.
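The last three steps can be sketched in a few lines. This is a minimal illustration, not a prescribed recipe; I assume scikit-learn and use its bundled iris dataset as a stand-in for step 1.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

# Step 1: collecting data -- assumed done (bundled iris dataset)
X, y = load_iris(return_X_y=True)

# Step 2: pre-processing -- split, then scale to zero mean / unit variance
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
scaler = StandardScaler().fit(X_train)
X_train, X_test = scaler.transform(X_train), scaler.transform(X_test)

# Step 3: feed the prepared data to a model
model = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)

# Step 4: post-processing -- evaluate the fitted model on held-out data
print(accuracy_score(y_test, model.predict(X_test)))
```

The scaler is fitted on the training split only, so no information from the test set leaks into pre-processing.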
Classification algorithms predict one or more discrete variables based on the other attributes in the dataset.
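A small classification sketch using scikit-learn's k-nearest-neighbours classifier; the spam-detection features and labels below are invented for illustration.

```python
from sklearn.neighbors import KNeighborsClassifier

# Invented features: [number of links, number of all-caps words]
# Invented labels: 1 = spam, 0 = not spam
X = [[9, 12], [8, 10], [7, 11], [1, 0], [0, 1], [2, 2]]
y = [1, 1, 1, 0, 0, 0]

# Predict the discrete label from the other attributes
clf = KNeighborsClassifier(n_neighbors=3).fit(X, y)
print(clf.predict([[8, 9], [1, 1]]))  # → [1 0]
```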
Regression algorithms predict one or more continuous numeric variables, such as profit or loss, based on other attributes in the dataset.
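For regression, a linear model is the simplest sketch; the advertising-spend and profit figures here are invented, chosen to lie exactly on a line so the fit is easy to check.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Invented data: profit = 2 * spend + 1
X = np.array([[1.0], [2.0], [3.0], [4.0]])  # advertising spend
y = np.array([3.0, 5.0, 7.0, 9.0])          # profit (continuous target)

reg = LinearRegression().fit(X, y)
print(reg.predict([[5.0]]))  # → [11.]
```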
Segmentation/Clustering algorithms divide data into groups, or clusters, of items that have similar properties.
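A clustering sketch with k-means; the 2-D points are invented and fall into two visually obvious groups, which the algorithm recovers without being given any labels.

```python
import numpy as np
from sklearn.cluster import KMeans

# Invented points forming two well-separated groups
X = np.array([[0, 0], [0, 1], [1, 0],
              [10, 10], [10, 11], [11, 10]])

# Divide the data into 2 clusters of similar items
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
print(labels)  # the first three points share one label, the last three the other
```

Note that the numeric label assigned to each cluster is arbitrary; only the grouping itself is meaningful.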
Association algorithms find correlations between different attributes in a dataset. The most common application of this kind of algorithm is for creating association rules, which can be used in a market basket analysis.
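A bare-bones market basket sketch using only the standard library: count item pairs across transactions and derive rule confidence. The baskets are invented, and this simple pairwise count stands in for a full association-rule algorithm such as Apriori.

```python
from itertools import combinations
from collections import Counter

# Invented market-basket transactions
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk", "butter"},
]

item_counts = Counter()
pair_counts = Counter()
for basket in baskets:
    item_counts.update(basket)
    pair_counts.update(combinations(sorted(basket), 2))

# confidence(A -> B) = support(A and B) / support(A)
for (a, b), n in pair_counts.items():
    print(f"{a} -> {b}: confidence {n / item_counts[a]:.2f}")
```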
Sequence analysis algorithms summarize frequent sequences or episodes in data, such as a series of clicks on a website, or a series of log events preceding machine maintenance.
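As a toy version of sequence analysis, the sketch below counts frequent two-click sequences (bigrams) across invented clickstream sessions; real sequence miners handle longer and gapped patterns, but the idea is the same.

```python
from collections import Counter

# Invented clickstream sessions (ordered page visits)
sessions = [
    ["home", "search", "product", "cart"],
    ["home", "search", "product"],
    ["home", "product", "cart"],
]

# Count every consecutive pair of clicks within each session
bigrams = Counter()
for session in sessions:
    bigrams.update(zip(session, session[1:]))

print(bigrams.most_common(3))
```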