Free PDF Download - Mining of Massive Datasets : OnlineProgrammingBooks.com

Free eBook: Mining of Massive Datasets

You can download a complete PDF copy of “Mining of Massive Datasets” by Anand Rajaraman and Jeffrey David Ullman from their website. Written by two authorities in database and Web technologies, this book is essential reading for students and practitioners alike. It teaches algorithms that have been used in practice to solve key problems in data mining and includes exercises suitable for students from the advanced undergraduate level and beyond.

Book Description

At the highest level of description, this book is about data mining. However, it focuses on data mining of very large amounts of data, that is, data so large it does not fit in main memory. Because of the emphasis on size, many of our examples are about the Web or data derived from the Web. Further, the book takes an algorithmic point of view: data mining is about applying algorithms to data, rather than using data to “train” a machine-learning engine of some sort. The principal topics covered are:

1. Distributed file systems and map-reduce as a tool for creating parallel algorithms that succeed on very large amounts of data.
2. Similarity search, including the key techniques of minhashing and locality-sensitive hashing.
3. Data-stream processing and specialized algorithms for dealing with data that arrives so fast it must be processed immediately or lost.
4. The technology of search engines, including Google’s PageRank, link-spam detection, and the hubs-and-authorities approach.
5. Frequent-itemset mining, including association rules, market-baskets, the A-Priori Algorithm and its improvements.
6. Algorithms for clustering very large, high-dimensional datasets.
7. Two key problems for Web applications: managing advertising and recommendation systems.

Table of Contents

Data Mining
Large-Scale File Systems and Map-Reduce
Finding Similar Items
Mining Data Streams
Link Analysis
Frequent Itemsets
Clustering
Advertising on the Web
Recommendation Systems

Download Free PDF / Read Online

Author(s): Anand Rajaraman, Jeffrey D. Ullman
Format(s): PDF
File size: 2.63 MB
Number of pages: 457
Link: Download.
