Free PDF Download - Hands-On Big Data Analytics with PySpark : OnlineProgrammingBooks.com

Hands-On Big Data Analytics with PySpark

In Hands-On Big Data Analytics with PySpark, you will learn how to use Spark and its Python API to build high-performance analytics on big data, and discover techniques for testing, immunizing, and parallelizing Spark jobs. You will also learn how to source data from popular storage systems and formats, including HDFS, Hive, JSON, and S3, and work with large datasets in PySpark to gain practical big data experience. (Limited-time offer)

Installing PySpark and Setting up Your Development Environment
Getting Your Big Data into the Spark Environment Using RDDs
Big Data Cleaning and Wrangling with Spark Notebooks
Aggregating and Summarizing Data into Useful Reports
Powerful Exploratory Data Analysis with MLlib
Putting Structure on Your Big Data with SparkSQL
Transformations and Actions
Immutable Design
Avoiding Shuffle and Reducing Operational Expenses
Saving Data in the Correct Format
Working with the Spark Key/Value API
Testing Apache Spark Jobs
Leveraging the Spark GraphX API

Download Free PDF / Read Online

Author(s): James Cross, Rudy Lai, Bartłomiej Potaczek
Publisher: Packt Publishing
Published: March 2019
Format(s): Online
Number of pages: 182
Download / View Link(s): This offer has ended.
Free as of 10/12/2023.

