“Data Mining: Desktop Survival Guide” by Graham Williams is a free online book. Data mining is about building models from data. We build models to gain insight into how the world works, so that we can predict how things will behave in the future.
Book Description
A data miner, in building models, deploys many different data analysis and model building techniques. The choice of technique depends on the business problem to be solved. Although data mining is not the only approach, it is becoming very widely used because it is well suited to the data environments found in today’s enterprises. These environments are characterised by the volume of data available, commonly in the gigabytes and fast approaching the terabytes, and by the complexity of that data, both in the relationships awaiting discovery and in the data types now available, including text, image, audio, and video. Business environments are also changing rapidly, so analyses need to be performed regularly and models updated regularly to keep up with today’s dynamic world.
Table of Contents
- The Business Problem
- Data
- Loading Data
- Exploring Data
- Interactive Graphics
- Statistical Tests
- Models
- Network Analysis
- Text Mining
- Decision Trees
- Random Forests
- Boosting
- Bootstrapping: Meta Algorithm
- Bagging: Meta Algorithm
- Support Vector Machine
- Linear Regression
- Neural Network
- Naive Bayes
- Survival Analysis
- Evaluation and Deployment
- Transforming Data
- Deployment
- Troubleshooting
- Issues
- Moving into R
- R: The Language
- Getting Help
- Data
- Graphics in R
- Understanding Data
- Preparing Data
- Classification: Decision Trees
- Classification: Boosting
- Classification: Random Forests
- Issues
- Evaluating Models
- Reporting
- Fraud Analysis
- Archetype Analysis