Data Mining Phases/process

by Mandeep Singh - Date: 2008-08-24 - Word Count: 401


Data mining is the process of finding meaningful information in large volumes of data. Its main objective is to find hidden trends in the data, such as customers' purchasing behaviour, sales trends, new cross-selling opportunities and many more. Data mining is a step-by-step process that starts with business understanding and ends with a possible solution to the problem at hand. CRISP-DM is a widely known data mining industry standard and is used by most organizations that provide data mining services. There are six stages in CRISP-DM.

Business Understanding
Data Understanding
Data Preparation
Modeling
Model Evaluation
Deployment

These steps are iterative, which means that at any stage you can go back to a previous stage. For example, if at the modeling stage you are still not sure about the trends in the data, you can go back to the data understanding phase to understand them properly.


Business Understanding: This is the first phase of the CRISP-DM model and includes

-       Understanding the necessary business processes

-       Understanding the problem

-       Planning how to solve the problem with the resources at hand

-       Defining the objectives and goals you are going to achieve


Data Understanding: This is the second phase of the CRISP-DM model and includes

-          Going through the historical data

-          Relating the data items to each other

-          Finding hidden trends in the data (without going too deep)
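As a rough illustration of relating two pieces of data to each other, the sketch below computes a Pearson correlation with Python's standard library. The `ad_spend` and `sales` figures are made-up numbers, and correlation is only one of many possible ways to take a first look at the data.

```python
from statistics import mean, pstdev

# Hypothetical historical data: monthly ad spend and sales (made-up numbers).
ad_spend = [10, 12, 15, 18, 20, 25]
sales = [100, 115, 150, 170, 205, 240]

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    # Population covariance, matched with population standard deviations below.
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)
    return cov / (pstdev(xs) * pstdev(ys))

r = pearson(ad_spend, sales)
print(f"mean sales = {mean(sales):.1f}, correlation = {r:.3f}")
```

A value of `r` close to 1 or -1 suggests a strong linear relationship that may be worth a closer look in the later phases; a value near 0 suggests there is no linear relationship.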


Data Preparation: This is the third phase of the CRISP-DM model and includes

-          Collecting the data we need from the historical data reviewed in the "Data Understanding" phase (data sampling)

-          Formatting the data into the desired form

-          Handling missing and noisy data (the next article discusses in detail how to handle noisy and missing data, so don't miss that one)
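One simple way to handle missing and noisy values is sketched below. It is an assumption chosen for illustration, not necessarily the approach covered in the follow-up article: readings outside a plausible range are treated as missing, and every gap is filled with the median of the valid values. The `raw_amounts` list and the `low`/`high` limits are hypothetical.

```python
from statistics import median

# Hypothetical raw records; None marks a missing value and 9999.0 is an
# obvious data-entry error (noise).
raw_amounts = [120.0, None, 95.5, 9999.0, 110.0, None, 101.5]

def prepare(values, low, high):
    """Treat out-of-range readings as missing, then fill every gap
    with the median of the valid values."""
    def is_valid(v):
        return v is not None and low <= v <= high

    fill = median(v for v in values if is_valid(v))
    return [v if is_valid(v) else fill for v in values]

clean_amounts = prepare(raw_amounts, low=0.0, high=1000.0)
print(clean_amounts)
```

The median is used here rather than the mean because it is less sensitive to any extreme values that remain within the allowed range.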


Modeling: This is the fourth phase of the CRISP-DM model and includes

-          Developing a model for future prediction

-          Trying different modeling techniques

-          Trying different parameters to improve the results

-          Picking the models that look appropriate at this phase, to be evaluated in the next phase
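Trying different parameters can be sketched with a deliberately simple model. The example below assumes a moving-average forecast, chosen only for illustration, and compares several window sizes on a made-up sales series; a real project would use whatever modeling technique suits the problem.

```python
# Hypothetical monthly sales series (made-up numbers).
series = [100, 104, 109, 115, 118, 124, 131, 135, 142, 150]

def moving_average_forecast(history, window):
    """Predict the next value as the mean of the last `window` observations."""
    recent = history[-window:]
    return sum(recent) / len(recent)

def mean_abs_error(data, window, start):
    """Average one-step-ahead error over data[start:], using only past values."""
    errors = [abs(data[t] - moving_average_forecast(data[:t], window))
              for t in range(start, len(data))]
    return sum(errors) / len(errors)

# Try several window sizes and keep the scores for comparison.
scores = {w: mean_abs_error(series, w, start=4) for w in (1, 2, 3, 4)}
best_window = min(scores, key=scores.get)
print(scores, "best window:", best_window)
```

The candidates with the lowest error are the ones that would be carried forward to the evaluation phase.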


Model Evaluation: This is the fifth phase of the CRISP-DM model

-          This is an important stage in the CRISP-DM model

-          The model needs to be evaluated in terms of response time, confidence level, cost, error rate and many other criteria

-          Determine how helpful the model is in achieving the objectives and goals defined in the first stage
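Two of the criteria above, error rate and response time, can be measured as in the following sketch. The `model` function is a stand-in threshold rule and `test_set` is made-up data; both are assumptions for illustration only.

```python
import time

# Hypothetical held-out test set: (monthly_spend, actually_bought_again).
test_set = [(20, False), (35, False), (50, True), (65, True), (40, True), (80, True)]

def model(monthly_spend):
    """Stand-in model: predicts a repeat purchase when spend is at least 45."""
    return monthly_spend >= 45

# Response time: how long the model takes to score the whole test set.
start = time.perf_counter()
predictions = [model(spend) for spend, _ in test_set]
elapsed = time.perf_counter() - start

# Error rate: share of test records where the prediction differs from reality.
wrong = sum(pred != actual for pred, (_, actual) in zip(predictions, test_set))
error_rate = wrong / len(test_set)
print(f"error rate = {error_rate:.2f}, response time = {elapsed:.6f} s")
```

Whether a given error rate or response time is acceptable depends on the objectives and goals set in the business understanding phase.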


Deployment: This is the final phase of the CRISP-DM model

-          Create reports so that end users can easily use the model to improve business performance


Special thanks to the team at the CRISP-DM org, who are working to build CRISP-DM 2.0 for the industry.

Don't forget to check the next article:

"How to Handle Missing Data"

Mandeep Singh


© The article above is copyrighted by its author. You're allowed to distribute this work according to the Creative Commons Attribution-NoDerivs license.
