
99 — Spark Apps

A collection of full-fledged Apache Spark applications.

Analyzing actions on GitHub using the SparkApplicationTemplate and wconf.
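The SparkApplicationTemplate and wconf are helpers specific to this repository, so the sketch below uses a plain SparkSession instead; the input path and the assumption that events arrive as one JSON object per line (as in GH Archive dumps) are hypothetical.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.desc

object GitHubEventCounts {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("github-event-counts")
      .master("local[*]")
      .getOrCreate()

    // Hypothetical input: one JSON-encoded GitHub event per line.
    val events = spark.read.json("data/github-events.json")

    // Count occurrences of each event type, most frequent first.
    events.groupBy("type").count()
      .orderBy(desc("count"))
      .show()

    spark.stop()
  }
}
```

In the actual application, configuration values such as the input path would come from wconf rather than being hard-coded.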

Reading CSV files with Spark Core methods and writing Parquet datasets with different compression codecs to different targets (the local file system, S3, and Azure Blob Storage).
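A minimal sketch of that flow, assuming a hypothetical two-column CSV layout; the Spark Core `textFile` call does the reading, and the DataFrame writer handles Parquet and compression. The cloud paths in the comments require the matching Hadoop connectors and credentials on the classpath.

```scala
import org.apache.spark.sql.SparkSession

object CsvToParquet {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("csv-to-parquet")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Read the CSV with Spark Core (RDD) methods; assumed layout: id,amount
    val rows = spark.sparkContext.textFile("data/input.csv")
      .map(_.split(","))
      .map(fields => (fields(0), fields(1).toDouble))

    val df = rows.toDF("id", "amount")

    // Same dataset, different compression codecs, local file system targets.
    df.write.option("compression", "snappy").parquet("out/snappy")
    df.write.option("compression", "gzip").parquet("out/gzip")

    // Switching the URI scheme retargets the writer to cloud storage, e.g.:
    // df.write.parquet("s3a://my-bucket/dataset")
    // df.write.parquet("wasbs://container@account.blob.core.windows.net/dataset")

    spark.stop()
  }
}
```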

The application loads a structured text file and applies some business rules using the Spark Core module. The result of the processing is then written to the local file system as a text file with the same structure.
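The shape of such a pipeline can be sketched as below; the tab-separated three-field layout and the "drop non-positive amounts" rule are illustrative assumptions, not the application's actual schema or rules.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object PurchaseLogProcessor {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("purchase-log").setMaster("local[*]"))

    // Hypothetical layout: userId \t productId \t price
    val processed = sc.textFile("data/purchases.tsv")
      .map(_.split("\t"))
      .filter(_.length == 3)                      // skip malformed records
      .filter(fields => fields(2).toDouble > 0.0) // example business rule
      .map(fields => fields.mkString("\t"))       // keep the original structure

    // Written back as text with the same record layout.
    processed.saveAsTextFile("out/purchases")
    sc.stop()
  }
}
```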

Illustrates how to count the words in a file downloaded from the Internet using the Spark Core module. In contrast to 03 — Processing a structured Purchase Log with Spark Core, the sorting is performed on a materialized Map instead of on an RDD.
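The distinction can be sketched as follows: the counts are computed distributed, then collected into a driver-side Map, and only that materialized Map is sorted (rather than calling `sortBy` on the RDD). The input path is a placeholder.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object WordCountMapSort {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("word-count").setMaster("local[*]"))

    val counts = sc.textFile("data/book.txt")     // placeholder input path
      .flatMap(_.toLowerCase.split("\\W+"))
      .filter(_.nonEmpty)
      .map((_, 1))
      .reduceByKey(_ + _)
      .collectAsMap()                             // materialize on the driver

    // Sorting happens here, on the in-memory Map, not via RDD.sortBy.
    counts.toSeq.sortBy(-_._2).take(20).foreach {
      case (word, n) => println(s"$word\t$n")
    }

    sc.stop()
  }
}
```

Materializing the counts is only safe when the vocabulary fits in driver memory; the RDD-side sort in application 03 avoids that constraint.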