
OnlineProgrammingBooks.com

Legally Free Computer Books


Mining of Massive Datasets

October 22, 2011

Free eBook: Mining of Massive Datasets

You can download a complete PDF copy of “Mining of Massive Datasets” by Anand Rajaraman and Jeffrey David Ullman from their website. Written by two authorities in database and Web technologies, this book is essential reading for students and practitioners alike. It teaches algorithms that have been used in practice to solve key problems in data mining, and it includes exercises suitable for students at the advanced undergraduate level and beyond.

Book Description

At the highest level of description, this book is about data mining. However, it focuses on data mining of very large amounts of data, that is, data so large it does not fit in main memory. Because of the emphasis on size, many of our examples are about the Web or data derived from the Web. Further, the book takes an algorithmic point of view: data mining is about applying algorithms to data, rather than using data to “train” a machine-learning engine of some sort. The principal topics covered are:

1. Distributed file systems and map-reduce as a tool for creating parallel algorithms that succeed on very large amounts of data.
2. Similarity search, including the key techniques of minhashing and locality-sensitive hashing.
3. Data-stream processing and specialized algorithms for dealing with data that arrives so fast it must be processed immediately or lost.
4. The technology of search engines, including Google’s PageRank, link-spam detection, and the hubs-and-authorities approach.
5. Frequent-itemset mining, including association rules, market-baskets, the A-Priori Algorithm and its improvements.
6. Algorithms for clustering very large, high-dimensional datasets.
7. Two key problems for Web applications: managing advertising and recommendation systems.

Table of Contents

  • Data Mining
  • Large-Scale File Systems and Map-Reduce
  • Finding Similar Items
  • Mining Data Streams
  • Link Analysis
  • Frequent Itemsets
  • Clustering
  • Advertising on the Web
  • Recommendation Systems

Download Free PDF / Read Online

Author(s): Anand Rajaraman, Jeffrey D. Ullman
Format(s): PDF
File size: 2.63 MB
Number of pages: 457
Link: Download.

Similar Books:

  1. An Introduction to Data Mining
  2. Data Mining and Knowledge Discovery in Real Life Applications
  3. Data Mining in Medical and Biological Research
  4. Data Mining: Desktop Survival Guide
  5. Sorting and Searching Algorithms: A Cookbook


Copyright © 2006–2025 OnlineProgrammingBooks.com