
Data Mining And Its Data Functionalities

Neha Kumawat

2 years ago

Table of Contents
  • Introduction
  • What is Data Mining?
           1) Understanding your project goal
           2) Understanding the data and its sources
           3) Preparing the data
           4) Data Analysis
           5) Results Review
  • What are Data Mining functionalities?
  • Descriptive mining task
          1) Association
          2) Clustering
          3) Summarization
  •  Predictive mining task
          1) Classification
          2) Prediction
          3) Time series analysis


In this article, I will explain the main data mining functionalities involved in a data mining process.
Before going into detail about these functionalities, let's first understand what data mining is.

What is Data Mining?

Data mining is the process of finding hidden patterns, insights, or information in a large data set.
Nowadays many companies use data mining to turn their raw data into useful information, so that they can learn more about their customers and their behavior and develop more effective marketing strategies, which helps increase sales and decrease costs. Data mining depends on effective data collection, warehousing, and computer processing power.
The data mining process involves five main stages:
  • Understanding your project goal
  • Understanding the data and its sources
  • Preparing the data
  • Data Analysis
  • Results Review

1) Understanding your project goal

The first step in any data mining project is to get a clear understanding of the project goal. What are the requirements for your project?
For example, which areas of your business do you want to improve through data mining? Do you want to improve your product recommendation system, the way Netflix did? Do you want to understand your customers and their behavior better through personas and segmentation? This is the most important part of any project, because a wrongly set goal can derail the whole project and lead to a large loss. So take extra care when setting your project goal.

2) Understanding the data and its sources

By now, you have set your project goal based on your project requirements. The next step in data mining is understanding the data and its sources.
This is also an important step, in which you collect the relevant data based on the project goal. You also have to combine data from different relevant sources so that any model you build generalizes well and does not lose accuracy on new data points.

3) Preparing the data

The next step is to prepare your data. In this step you clean and organize your data so that it is as free from noise as possible. You also have to find the relevant features that can be used when building a model on this data.
There are many ways and tools available for cleaning data. This step plays an important role in a project: the less noise your data contains, the better the resulting model tends to perform.
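As a rough illustration of the cleaning described above, here is a minimal sketch using only the Python standard library. The records, the field names (`customer`, `age`, `city`), and the plausible age range are all assumptions made up for this example; real projects usually have their own rules for what counts as noise.

```python
# Hypothetical raw customer records; field names and values are invented for illustration.
raw_records = [
    {"customer": "A101", "age": "34", "city": " Pune "},
    {"customer": "A102", "age": "", "city": "Delhi"},      # missing age
    {"customer": "A101", "age": "34", "city": " Pune "},   # exact duplicate
    {"customer": "A103", "age": "-5", "city": "mumbai"},   # impossible age
    {"customer": "A104", "age": "41", "city": "Mumbai"},
]


def clean(records):
    """Drop duplicate and invalid rows, and normalise text fields."""
    seen = set()
    cleaned = []
    for row in records:
        key = tuple(sorted(row.items()))
        if key in seen:
            continue  # skip exact duplicates
        seen.add(key)
        try:
            age = int(row["age"])
        except ValueError:
            continue  # skip rows with a missing or non-numeric age
        if not 0 < age < 120:
            continue  # skip ages outside an assumed plausible range
        cleaned.append({
            "customer": row["customer"],
            "age": age,
            "city": row["city"].strip().title(),
        })
    return cleaned


cleaned_records = clean(raw_records)
for row in cleaned_records:
    print(row)
```

Only the rows for A101 and A104 survive here: the duplicate, the row with a missing age, and the row with a negative age are dropped, and city names are trimmed and capitalised consistently.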

4) Data Analysis

In this step we explore the data to find hidden and meaningful insights. These insights help us discover whether there is information we have been missing that is affecting our business.

5) Results Review

            The last stage of the data mining process is to review the results and answer key questions, such as:
• Are the findings accurate?
• Do they support your goals?
• How should you act on them?
• How should you share the findings with your team?

What are Data Mining functionalities?

Data mining functionalities are used to specify the kinds of patterns to be found in data mining tasks. Data mining tasks can be divided into two categories:
1. Descriptive mining task
2. Predictive mining task

Descriptive mining task

In descriptive mining tasks, we try to find the general properties of our data. That is, we look for patterns that describe the data and surface new, significant information present in the available dataset.
For example:
Suppose there is a mart near your home. One day you visit it and see the manager observing customers' purchasing behavior: who is buying what? Being a curious person, you go and ask him why he is doing this.
The manager replies that he is trying to identify products that are purchased together so that he can rearrange the mart accordingly. If you buy bread, he says, you may next want to buy eggs or butter, so if these items are kept close to the bread, the mart's sales may rise. This is known as association analysis and is considered a descriptive data mining task.
Some of the descriptive data mining tasks are association, clustering, summarization, etc.

1) Association

Association is used to find connections among a set of items. It mainly tries to identify relationships between objects. Association analysis is used for commodity management, advertising, catalog design, direct marketing, etc.
A retailer can identify the products that customers normally purchase together, as in the mart example above, or find the customers who respond to promotions of the same kind of products.
For example:
If a retailer finds that bread and eggs are mostly bought together, he can put eggs on sale to promote the sale of bread.
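One common way to quantify "bought together" is with support and confidence, two standard measures from association rule mining. The sketch below computes both for pairs of items. The baskets are invented for illustration, and the helper names `support` and `confidence` are just names chosen for this example.

```python
from collections import Counter
from itertools import combinations

# Hypothetical shopping baskets, invented for illustration.
baskets = [
    {"bread", "eggs", "butter"},
    {"bread", "eggs"},
    {"bread", "butter"},
    {"milk", "eggs"},
    {"bread", "eggs", "milk"},
]

item_counts = Counter()
pair_counts = Counter()
for basket in baskets:
    item_counts.update(basket)
    # sorted() gives each pair a canonical order, so (bread, eggs) is counted once
    pair_counts.update(combinations(sorted(basket), 2))


def support(a, b):
    """Fraction of all baskets that contain both a and b."""
    return pair_counts[tuple(sorted((a, b)))] / len(baskets)


def confidence(a, b):
    """Of the baskets that contain a, the fraction that also contain b."""
    return pair_counts[tuple(sorted((a, b)))] / item_counts[a]


print(support("bread", "eggs"))
print(confidence("bread", "eggs"))
print(confidence("butter", "bread"))
print(confidence("bread", "butter"))
```

In this toy data, bread and eggs appear together in 3 of 5 baskets (support 0.6), and 3 of the 4 bread baskets also contain eggs (confidence 0.75). Note that confidence is directional: every butter basket contains bread, but only half of the bread baskets contain butter.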

2) Clustering

Clustering is the process of identifying data objects that are similar to one another. Similarity can be decided based on a number of factors, such as purchase behavior, responsiveness to certain actions, geographical location, and so on.
For example:
A telecom company can cluster its customers based on age, residence, income, etc. This helps the company understand its customers better, and hence solve their issues and provide better-customized services.
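The article does not name a specific clustering algorithm, so the sketch below uses k-means, one widely used option, as an assumption. The customers, their two features (age and monthly bill), and the choice of starting centroids are all made up for illustration.

```python
import math

# Hypothetical telecom customers as (age, monthly_bill) pairs, invented for illustration.
customers = [(22, 15), (25, 18), (24, 20), (47, 60), (52, 65), (50, 58)]


def k_means(points, centroids, iterations=10):
    """Plain k-means: assign each point to its nearest centroid, then recompute centroids."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # New centroid = mean of the cluster's points; an empty cluster keeps its old centroid
        centroids = [
            tuple(sum(dim) / len(cluster) for dim in zip(*cluster)) if cluster else c
            for cluster, c in zip(clusters, centroids)
        ]
    return centroids, clusters


# Start from the first and last customers as initial centroids (an arbitrary choice).
centroids, clusters = k_means(customers, [customers[0], customers[-1]])
print(clusters)
print(centroids)
```

With this data the younger, low-bill customers end up in one cluster and the older, high-bill customers in the other. In practice features on very different scales (such as age and income) are usually rescaled first, and the result can depend on the starting centroids.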

3) Summarization

Summarization is a technique for the generalization of data. A set of relevant data is summarized, which results in a smaller set that gives aggregated information about the data.
For example:
The shopping done by a customer can be summarized into total products, total spending, offers used, etc. Such high-level summarized information can be useful to a sales or customer relationship team for detailed customer and purchase behavior analysis. Data can be summarized at different abstraction levels and from different angles.
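The per-customer summary described above can be sketched as a simple aggregation. The purchase log and its layout (customer, product, amount, whether an offer was used) are assumptions invented for this example.

```python
from collections import defaultdict

# Hypothetical purchase log: (customer, product, amount, offer_used). Invented values.
purchases = [
    ("alice", "bread", 2.50, False),
    ("alice", "eggs", 3.00, True),
    ("bob", "milk", 1.20, False),
    ("alice", "butter", 4.00, True),
    ("bob", "bread", 2.50, True),
]

# One summary row per customer, created on first use.
summary = defaultdict(lambda: {"total_products": 0, "total_spending": 0.0, "offers_used": 0})
for customer, product, amount, offer_used in purchases:
    row = summary[customer]
    row["total_products"] += 1
    row["total_spending"] += amount
    row["offers_used"] += int(offer_used)

for customer, row in summary.items():
    print(customer, row)
```

Five purchase rows are reduced to two summary rows. Grouping by product or by day instead of by customer would give a summary from a different angle, using the same pattern.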

Predictive mining task

In predictive mining tasks, we draw inferences from the current data in order to make predictions about the future.
Predictive data mining tasks build a model from the available data set that helps predict unknown or future values of another data set of interest.
For example:
Suppose your friend is a medical practitioner who is trying to diagnose a disease based on a patient's medical test results. This can be considered a predictive data mining task, in which we try to predict or classify new data based on historical data.
Some of the predictive data mining tasks are classification, prediction, time series analysis, etc.

1) Classification

Classification is a process in which we build a model that can determine the class of an object based on its attributes.
Here, a collection of records is available, and each record consists of a set of attributes. One of the attributes is the class attribute, also called the target attribute.
The main aim of a classification model is to assign a class attribute to new records as accurately as possible.
Let's take an example to understand it.
Classification can be used in direct marketing to reduce marketing costs by targeting the customers who are likely to buy a new product. Using the available data, it is possible to know which customers purchased similar products in the past and which did not. Hence, the {purchase, don't purchase} decision forms the class attribute in this case. Once the class attribute is assigned, demographic and lifestyle information about customers who purchased similar products can be collected, and promotional emails can be sent to them directly.
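To make the {purchase, don't purchase} example concrete, here is a minimal sketch of a k-nearest-neighbours classifier. The article does not say which classification algorithm to use, so k-nearest-neighbours is an assumption chosen for its simplicity. The customer attributes (age and yearly income in thousands) and the labelled history are invented.

```python
import math
from collections import Counter

# Hypothetical past customers: (age, yearly_income_in_thousands) -> class label.
history = [
    ((25, 30), "don't purchase"),
    ((28, 35), "don't purchase"),
    ((30, 32), "don't purchase"),
    ((45, 80), "purchase"),
    ((50, 90), "purchase"),
    ((48, 85), "purchase"),
]


def classify(record, training, k=3):
    """k-nearest-neighbours: majority label among the k closest training records."""
    nearest = sorted(training, key=lambda item: math.dist(record, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


print(classify((47, 82), history))
print(classify((27, 33), history))
```

A new customer aged 47 with an income of 82 sits closest to past buyers and is labelled "purchase", while one aged 27 with an income of 33 is labelled "don't purchase". Promotional emails would then go only to the first group.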

2) Prediction

In the prediction task, we try to predict the possible values of missing or future data. We build a model from the available data, and this model is then used to predict values for a new data set.
For example:
Suppose we want to predict the price of a new house based on available historical data such as the number of bedrooms, kitchens, and bathrooms, the carpet area, and past house prices. We then have to build a model that can predict the new house's price from the given inputs. Prediction analysis is also used in other areas, including fraud detection and medical diagnosis.
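As a simplified version of the house price example, the sketch below fits a straight line to a single input, carpet area, using ordinary least squares. Using only one feature and a linear model is an assumption made to keep the example short, and the areas and prices are invented.

```python
# Hypothetical historical sales: carpet area (sq ft) and sale price (in thousands).
areas = [600, 800, 1000, 1200, 1400]
prices = [150, 190, 240, 280, 330]


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept with one input feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(areas, prices)
# Predict the price of a new house with a carpet area of 1100 sq ft.
predicted = slope * 1100 + intercept
print(round(predicted, 1))
```

For this data the fitted line has a slope of 0.225 and an intercept of 13, so a 1100 sq ft house is predicted at about 260.5 (thousand). A fuller model would add the other inputs mentioned above, such as the number of bedrooms and bathrooms.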

3) Time series analysis

Time series analysis is also a type of predictive mining task. A time series is a sequence of events in which the next event is determined by one or more of the preceding events. A time series reflects the process being measured, and certain components affect the behavior of that process.
Time series analysis includes methods for analyzing time series data in order to extract useful patterns, trends, rules, and statistics.
For example:
Stock price prediction is an important application of time series analysis.
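One of the simplest ways to extract a trend from a time series is a moving average. The sketch below smooths a short price series and uses the latest average as a naive forecast. The prices are invented, and this is only a toy illustration of the idea, not a realistic stock prediction method.

```python
# Hypothetical daily closing prices, invented for illustration.
closing_prices = [100, 102, 101, 105, 107, 106, 110]


def moving_average(series, window):
    """Mean of each run of `window` consecutive values."""
    return [
        sum(series[i:i + window]) / window
        for i in range(len(series) - window + 1)
    ]


smoothed = moving_average(closing_prices, window=3)
# Naive forecast: assume the next value equals the latest moving average.
forecast = smoothed[-1]
print(smoothed)
print(forecast)
```

The smoothed series rises steadily even though the raw prices dip on some days, which is the kind of trend that time series analysis aims to expose.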
I hope that after reading this article you now know what data mining is, what steps are involved in it, and what some of the data mining functionalities are.
For more blogs/courses on data science, machine learning, artificial intelligence, and new technologies do visit us at InsideAIML.
Thanks for reading…
Keep Learning. Keep Growing.
