The phrase 'Data Mining' was first used in the 1990s, however this term refers to the development of a long-standing industry.
The Bayes theorem and regression development are two early methods for finding patterns in data (1800s). As data sets have gotten bigger and more complicated, computer science's development and increasing capability have increased data gathering, storage, and manipulation. With indirect, automatic data processing and other computer science advancements like neural networks, clustering, evolutionary algorithms (1950s), decision trees (1960s), and supporting vector machines, explicit hands-on data analysis has been steadily enhanced (1990s).
The roots of data mining can be found in three families: traditional statistics, artificial intelligence, and machine learning.
Traditional statistics: Statistics, such as regression analysis, standard deviation, standard distribution, standard variance, discriminating analysis, cluster analysis, and confidence intervals, are the foundation of the majority of data mining technology. To examine data and data connections, all of these are used.
Artificial Intelligence (AI): Heuristics, rather than statistics, are the foundation of AI. It aims to approach statistical issues with processing that is similar to human reasoning. Some premium commercial solutions, like query optimization modules for Relational Database Management Systems, have incorporated a specialized AI philosophy (RDBMS).
AI and statistics come together to create machine learning. Because it incorporates sophisticated statistical analysis with AI heuristics, it might be seen as a progression of AI. Machine learning is to provide computer programmes knowledge about the data they are analyzing so that they can decide differently based on those qualities. It incorporates additional AI heuristics and algorithms to achieve its goal and employs statistics for fundamental notions.
In the 18th and early 19th centuries, statistical analysis utilizing the Bayes' theorem, regression analysis by Carl Gauss and Adrien Marie Legendre, and other methods gave rise to data mining (Li, 2016). Bayes theorem states that it employs probability and data mining to comprehend estimated probabilities and resolve intricate permutations. One of the key components of contemporary data mining is regression analysis, which makes it possible to estimate the relationship between variables (Li, 2016). Following these developments in statistics, Alan Turing conceived of the 'Universal Machine,' which would be able to process and perform calculations that would subsequently be utilized in the creation of contemporary computers (Li, 2016).