Table of Contents
- Installing PySpark and Setting Up Your Development Environment
- Getting Your Big Data into the Spark Environment Using RDDs
- Big Data Cleaning and Wrangling with Spark Notebooks
- Aggregating and Summarizing Data into Useful Reports
- Powerful Exploratory Data Analysis with MLlib
- Putting Structure on Your Big Data with SparkSQL
- Transformations and Actions
- Immutable Design
- Avoiding Shuffle and Reducing Operational Expenses
- Saving Data in the Correct Format
- Working with the Spark Key/Value API
- Testing Apache Spark Jobs
- Leveraging the Spark GraphX API

