01 — Spark Core API

Illustrates basic usage of Spark RDDs and their transformations and actions.

Serves as a shakedown test for the local development environment. In the example, the Spark license file is loaded and the number of lines in the file is counted.
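
A minimal sketch of such a shakedown test, assuming a local master and a hypothetical class name `LineCountSketch`. The default path `LICENSE` is also an assumption; pass the real location of the license file as an argument.

```java
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

// Hypothetical class name, used for illustration only.
public class LineCountSketch {

    // Counts the lines of a text file using a local, in-process Spark context.
    public static long countLines(String path) {
        SparkConf conf = new SparkConf()
                .setAppName("line-count-sketch") // assumed application name
                .setMaster("local[*]");          // run locally on all available cores
        try (JavaSparkContext sc = new JavaSparkContext(conf)) {
            JavaRDD<String> lines = sc.textFile(path); // one RDD element per line
            return lines.count();                      // action: triggers the job
        }
    }

    public static void main(String[] args) {
        // Assumed default path; pass the actual license file location as an argument.
        String path = args.length > 0 ? args[0] : "LICENSE";
        System.out.println("Lines: " + countLines(path));
    }
}
```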

Illustrates how to use the JavaRDD.filter method to select records from an RDD. The example filters the lines of a JavaRDD<String> according to whether they match a particular string.
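
A short sketch of `JavaRDD.filter`, which keeps the elements for which the predicate returns true. The class name `FilterSketch` and the substring test are illustrative assumptions, not the repository's code.

```java
import java.util.Arrays;
import java.util.List;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

// Hypothetical class name, used for illustration only.
public class FilterSketch {

    // Returns the lines that contain the given substring.
    public static List<String> linesContaining(List<String> lines, String needle) {
        SparkConf conf = new SparkConf().setAppName("filter-sketch").setMaster("local[*]");
        try (JavaSparkContext sc = new JavaSparkContext(conf)) {
            JavaRDD<String> rdd = sc.parallelize(lines);
            // filter is a lazy transformation; collect is the action that runs it.
            return rdd.filter(line -> line.contains(needle)).collect();
        }
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("Apache License", "Version 2.0", "License terms");
        System.out.println(linesContaining(lines, "License"));
    }
}
```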

Illustrates how to use the JavaRDD.flatMap method to flatten an RDD whose elements are themselves collections.
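
As a rough sketch, flattening a `JavaRDD<List<Integer>>` could look like the following. It assumes Spark 2.0 or later, where the function passed to `flatMap` returns an `Iterator`; the class name `FlatMapSketch` is hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

// Hypothetical class name, used for illustration only.
public class FlatMapSketch {

    // Turns an RDD of lists into an RDD of the lists' elements.
    public static List<Integer> flatten(List<List<Integer>> nested) {
        SparkConf conf = new SparkConf().setAppName("flatmap-sketch").setMaster("local[*]");
        try (JavaSparkContext sc = new JavaSparkContext(conf)) {
            JavaRDD<List<Integer>> rdd = sc.parallelize(nested);
            // Assumes Spark 2.0+: the function returns an Iterator over the inner elements.
            JavaRDD<Integer> flat = rdd.flatMap(list -> list.iterator());
            return flat.collect();
        }
    }

    public static void main(String[] args) {
        List<List<Integer>> nested = Arrays.asList(
                Arrays.asList(1, 2), Arrays.asList(3), Arrays.asList(4, 5, 6));
        System.out.println(flatten(nested));
    }
}
```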

Illustrates how to use the JavaRDD.sample transformation and the JavaRDD.takeSample action.
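
The difference between the two can be sketched as follows: `sample` is a transformation that keeps each element with a given probability, so the result size varies, while `takeSample` is an action that returns a list of a fixed size. The class name and the seed values are illustrative assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

// Hypothetical class name, used for illustration only.
public class SamplingSketch {

    private static JavaSparkContext localContext() {
        return new JavaSparkContext(
                new SparkConf().setAppName("sampling-sketch").setMaster("local[*]"));
    }

    // Transformation: without replacement, each element is kept with probability `fraction`.
    public static List<Integer> sample(List<Integer> data, double fraction, long seed) {
        try (JavaSparkContext sc = localContext()) {
            return sc.parallelize(data).sample(false, fraction, seed).collect();
        }
    }

    // Action: returns `num` elements drawn without replacement (fewer if the RDD is smaller).
    public static List<Integer> takeSample(List<Integer> data, int num, long seed) {
        try (JavaSparkContext sc = localContext()) {
            return sc.parallelize(data).takeSample(false, num, seed);
        }
    }

    public static void main(String[] args) {
        List<Integer> data = new ArrayList<>();
        for (int i = 0; i < 100; i++) {
            data.add(i);
        }
        System.out.println("sample:     " + sample(data, 0.1, 42L));
        System.out.println("takeSample: " + takeSample(data, 5, 42L));
    }
}
```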

Illustrates how to obtain basic statistics (such as the mean and standard deviation) from a JavaDoubleRDD.
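
One way to do this, sketched under the assumption of a local master, is `JavaDoubleRDD.stats()`, which computes count, mean, standard deviation, minimum and maximum in a single pass and returns them as a `StatCounter`. The class name is hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaDoubleRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.util.StatCounter;

// Hypothetical class name, used for illustration only.
public class StatsSketch {

    // Computes summary statistics for the given values in one pass.
    public static StatCounter summarize(List<Double> values) {
        SparkConf conf = new SparkConf().setAppName("stats-sketch").setMaster("local[*]");
        try (JavaSparkContext sc = new JavaSparkContext(conf)) {
            JavaDoubleRDD rdd = sc.parallelizeDoubles(values);
            return rdd.stats();
        }
    }

    public static void main(String[] args) {
        StatCounter s = summarize(Arrays.asList(1.0, 2.0, 3.0, 4.0, 5.0));
        // stdev() is the population standard deviation; sampleStdev() divides by n - 1.
        System.out.println("count=" + s.count() + " mean=" + s.mean()
                + " stdev=" + s.stdev() + " min=" + s.min() + " max=" + s.max());
    }
}
```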

Illustrates how to obtain a histogram from the data inside a JavaDoubleRDD.
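
A sketch using the `histogram(double[] buckets)` overload, which takes explicit bucket boundaries and returns one count per bucket. Every bucket is closed on the left and open on the right, except the last, which is closed on both ends; values outside the boundaries are not counted. The class name and the sample data are illustrative assumptions.

```java
import java.util.Arrays;
import java.util.List;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaDoubleRDD;
import org.apache.spark.api.java.JavaSparkContext;

// Hypothetical class name, used for illustration only.
public class HistogramSketch {

    // Counts how many values fall into each bucket defined by sorted boundaries.
    public static long[] counts(List<Double> values, double[] boundaries) {
        SparkConf conf = new SparkConf().setAppName("histogram-sketch").setMaster("local[*]");
        try (JavaSparkContext sc = new JavaSparkContext(conf)) {
            JavaDoubleRDD rdd = sc.parallelizeDoubles(values);
            return rdd.histogram(boundaries);
        }
    }

    public static void main(String[] args) {
        List<Double> values = Arrays.asList(1.0, 5.0, 12.0, 18.0, 19.0, 30.0);
        // Buckets: [0, 10), [10, 20), [20, 30]
        long[] counts = counts(values, new double[] {0, 10, 20, 30});
        System.out.println(Arrays.toString(counts));
    }
}
```

An alternative overload, `histogram(int bucketCount)`, spaces the buckets evenly between the minimum and maximum and returns both the boundaries and the counts.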

Illustrates how to obtain an approximate sum and mean from a JavaDoubleRDD.
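
A possible sketch with `sumApprox` and `meanApprox`, which return a `PartialResult<BoundedDouble>` after at most the given timeout. The timeout of 1000 ms, the 0.95 confidence and the class name are assumptions chosen for illustration. For brevity the sketch waits for the final value; `initialValue()` would give the estimate available at the timeout.

```java
import java.util.Arrays;
import java.util.List;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaDoubleRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.partial.BoundedDouble;
import org.apache.spark.partial.PartialResult;

// Hypothetical class name, used for illustration only.
public class ApproxSketch {

    // Returns {sum, mean} taken from the final values of the approximate jobs.
    public static double[] sumAndMean(List<Double> values) {
        SparkConf conf = new SparkConf().setAppName("approx-sketch").setMaster("local[*]");
        try (JavaSparkContext sc = new JavaSparkContext(conf)) {
            JavaDoubleRDD rdd = sc.parallelizeDoubles(values);
            // Assumed settings: 1000 ms timeout, 95% confidence interval.
            PartialResult<BoundedDouble> sum = rdd.sumApprox(1000L, 0.95);
            PartialResult<BoundedDouble> mean = rdd.meanApprox(1000L, 0.95);
            // getFinalValue() blocks until the job has finished.
            BoundedDouble finalSum = sum.getFinalValue();
            BoundedDouble finalMean = mean.getFinalValue();
            return new double[] {finalSum.mean(), finalMean.mean()};
        }
    }

    public static void main(String[] args) {
        double[] r = sumAndMean(Arrays.asList(1.0, 2.0, 3.0, 4.0, 5.0));
        System.out.println("sum=" + r[0] + " mean=" + r[1]);
    }
}
```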

Illustrates how to create a JavaRDD.
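
A minimal sketch of one common way to do this, `JavaSparkContext.parallelize`, which distributes an in-memory list over a chosen number of partitions. The class name is hypothetical; `sc.textFile(path)` is the usual alternative when the data lives in a file.

```java
import java.util.Arrays;
import java.util.List;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

// Hypothetical class name, used for illustration only.
public class CreateRddSketch {

    // Returns {element count, partition count} of an RDD built from a local list.
    public static int[] countAndPartitions(List<String> elements, int numSlices) {
        SparkConf conf = new SparkConf().setAppName("create-rdd-sketch").setMaster("local[*]");
        try (JavaSparkContext sc = new JavaSparkContext(conf)) {
            JavaRDD<String> rdd = sc.parallelize(elements, numSlices);
            return new int[] {(int) rdd.count(), rdd.getNumPartitions()};
        }
    }

    public static void main(String[] args) {
        int[] r = countAndPartitions(Arrays.asList("a", "b", "c", "d"), 2);
        System.out.println(r[0] + " elements in " + r[1] + " partitions");
    }
}
```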

Illustrates how to create a JavaPairRDD.
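
Two common routes, sketched here with hypothetical names: deriving pairs from an existing `JavaRDD` with `mapToPair`, and building one directly from a list of `Tuple2` values with `parallelizePairs`.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaSparkContext;
import scala.Tuple2;

// Hypothetical class name, used for illustration only.
public class CreatePairRddSketch {

    private static JavaSparkContext localContext() {
        return new JavaSparkContext(
                new SparkConf().setAppName("create-pair-rdd-sketch").setMaster("local[*]"));
    }

    // Route 1: derive (word, length) pairs from a JavaRDD<String> with mapToPair.
    public static Map<String, Integer> wordLengths(List<String> words) {
        try (JavaSparkContext sc = localContext()) {
            JavaPairRDD<String, Integer> pairs =
                    sc.parallelize(words).mapToPair(w -> new Tuple2<>(w, w.length()));
            return new HashMap<>(pairs.collectAsMap()); // copy into a plain Java map
        }
    }

    // Route 2: build the pair RDD directly from a list of tuples.
    public static long countPairs(List<Tuple2<String, Integer>> tuples) {
        try (JavaSparkContext sc = localContext()) {
            JavaPairRDD<String, Integer> pairs = sc.parallelizePairs(tuples);
            return pairs.count();
        }
    }

    public static void main(String[] args) {
        System.out.println(wordLengths(Arrays.asList("spark", "rdd")));
        System.out.println(countPairs(Arrays.asList(new Tuple2<>("a", 1), new Tuple2<>("b", 2))));
    }
}
```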

Illustrates how to load a file and parse its contents manually.
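
The source does not state the file format, so the sketch below assumes a hypothetical comma-separated `name,age` layout and a hypothetical `Person` record type. It reads the file with `textFile` and parses each line with a plain Java function.

```java
import java.io.Serializable;
import java.util.List;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

// Hypothetical class name, used for illustration only.
public class ParseSketch {

    // Hypothetical record type; it must be Serializable so Spark can ship it between tasks.
    public static class Person implements Serializable {
        public final String name;
        public final int age;

        public Person(String name, int age) {
            this.name = name;
            this.age = age;
        }
    }

    // Parses one "name,age" line (assumed format).
    public static Person parseLine(String line) {
        String[] fields = line.split(",");
        return new Person(fields[0].trim(), Integer.parseInt(fields[1].trim()));
    }

    // Loads the file, skips blank lines, and parses the remaining lines.
    public static List<Person> load(String path) {
        SparkConf conf = new SparkConf().setAppName("parse-sketch").setMaster("local[*]");
        try (JavaSparkContext sc = new JavaSparkContext(conf)) {
            return sc.textFile(path)
                    .filter(line -> !line.trim().isEmpty())
                    .map(ParseSketch::parseLine)
                    .collect();
        }
    }

    public static void main(String[] args) {
        // The input path is supplied by the caller; no default is assumed.
        for (Person p : load(args[0])) {
            System.out.println(p.name + " is " + p.age);
        }
    }
}
```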

Illustrates how to use flatMapValues to change the number of elements associated with a given key when working with a JavaPairRDD.
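
A sketch that expands each comma-separated value into one pair per item while keeping the key. It assumes Spark 3.x, where the function passed to `flatMapValues` returns an `Iterator`; on Spark 2.x the function returns an `Iterable`, so the `.iterator()` call would have to be dropped. The class name and the data are illustrative.

```java
import java.util.Arrays;
import java.util.List;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaSparkContext;
import scala.Tuple2;

// Hypothetical class name, used for illustration only.
public class FlatMapValuesSketch {

    // Splits each value on commas and emits one (key, item) pair per item.
    public static List<Tuple2<String, String>> explode(List<Tuple2<String, String>> input) {
        SparkConf conf = new SparkConf().setAppName("flatmapvalues-sketch").setMaster("local[*]");
        try (JavaSparkContext sc = new JavaSparkContext(conf)) {
            JavaPairRDD<String, String> pairs = sc.parallelizePairs(input);
            // Assumes Spark 3.x: the function returns an Iterator over the new values.
            JavaPairRDD<String, String> exploded =
                    pairs.flatMapValues(csv -> Arrays.asList(csv.split(",")).iterator());
            return exploded.collect();
        }
    }

    public static void main(String[] args) {
        List<Tuple2<String, String>> input = Arrays.asList(
                new Tuple2<>("fruits", "apple,banana"), new Tuple2<>("veg", "kale"));
        System.out.println(explode(input));
    }
}
```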

Illustrates how to use union to add records to a JavaRDD or JavaPairRDD.
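
A brief sketch for the `JavaRDD` case, with a hypothetical class name. Note that `union` concatenates the two RDDs and keeps duplicates; a `distinct()` call would be needed to remove them. `JavaPairRDD.union` works the same way on pair RDDs.

```java
import java.util.Arrays;
import java.util.List;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

// Hypothetical class name, used for illustration only.
public class UnionSketch {

    // Appends the records of `extra` to those of `base`; duplicates are kept.
    public static List<Integer> union(List<Integer> base, List<Integer> extra) {
        SparkConf conf = new SparkConf().setAppName("union-sketch").setMaster("local[*]");
        try (JavaSparkContext sc = new JavaSparkContext(conf)) {
            JavaRDD<Integer> first = sc.parallelize(base);
            JavaRDD<Integer> second = sc.parallelize(extra);
            return first.union(second).collect();
        }
    }

    public static void main(String[] args) {
        System.out.println(union(Arrays.asList(1, 2), Arrays.asList(2, 3)));
    }
}
```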

Illustrates how to sort a JavaPairRDD by the value, or by a field of the value.
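
`JavaPairRDD` offers `sortByKey` but no direct sort on values, so one possible approach (not necessarily the one used in the repository) is to swap key and value, sort by the new key, and swap back. To sort by a field of the value, the first `mapToPair` would emit that field as the key instead. The class name is hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaSparkContext;
import scala.Tuple2;

// Hypothetical class name, used for illustration only.
public class SortByValueSketch {

    // Sorts (key, value) pairs by value in descending order.
    public static List<Tuple2<String, Integer>> sortByValueDesc(
            List<Tuple2<String, Integer>> input) {
        SparkConf conf = new SparkConf().setAppName("sort-by-value-sketch").setMaster("local[*]");
        try (JavaSparkContext sc = new JavaSparkContext(conf)) {
            JavaPairRDD<String, Integer> pairs = sc.parallelizePairs(input);
            // Swap so that the value becomes the key.
            JavaPairRDD<Integer, String> byValue = pairs.mapToPair(t -> t.swap());
            // false = descending order.
            JavaPairRDD<Integer, String> sorted = byValue.sortByKey(false);
            // Swap back to the original (key, value) layout.
            JavaPairRDD<String, Integer> result = sorted.mapToPair(t -> t.swap());
            return result.collect();
        }
    }

    public static void main(String[] args) {
        List<Tuple2<String, Integer>> input = Arrays.asList(
                new Tuple2<>("a", 3), new Tuple2<>("b", 1), new Tuple2<>("c", 2));
        System.out.println(sortByValueDesc(input));
    }
}
```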

Illustrates how to save an RDD to the file system.
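
A sketch with `saveAsTextFile`, which writes a directory containing one `part-*` file per partition and fails if the directory already exists. The class name and the choice of two partitions are assumptions; the sketch reads the directory back only to show that the output is an ordinary text data set.

```java
import java.util.List;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

// Hypothetical class name, used for illustration only.
public class SaveSketch {

    // Saves the records as text under outputDir, then reloads them.
    public static List<String> saveAndReload(List<String> records, String outputDir) {
        SparkConf conf = new SparkConf().setAppName("save-sketch").setMaster("local[*]");
        try (JavaSparkContext sc = new JavaSparkContext(conf)) {
            JavaRDD<String> rdd = sc.parallelize(records, 2); // two partitions, two part files
            rdd.saveAsTextFile(outputDir);                    // outputDir must not exist yet
            return sc.textFile(outputDir).collect();          // reads every part file back
        }
    }

    public static void main(String[] args) {
        // The output directory is supplied by the caller; no default is assumed.
        System.out.println(saveAndReload(java.util.Arrays.asList("a", "b", "c"), args[0]));
    }
}
```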