satyakisen/spark-on-k8s

Spark on Kubernetes

This repo demonstrates Spark on Kubernetes with a custom Kubernetes REST API built from scratch. The demo project uses Apache Spark 2.4.4.

Table of Contents

Pre-requisite

  1. Spark Docker
    • Docker

Spark Docker

The spark-docker directory contains the Apache Spark Scala and Python base images, which are used to submit Spark applications to Kubernetes. The demo uses the Spark Kubernetes operator.

The Scala base image can be built using the following command:

docker build -t <IMAGE_NAME>:<IMAGE_VERSION> ./spark-docker/scala/

Once the image is built, it can be tagged and pushed to the respective Docker repository.
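As a minimal sketch of the tag-and-push step, the helper below composes the registry-qualified image reference and then runs `docker tag` and `docker push`. The function names and the registry path `registry.example.com/myteam` are hypothetical and not part of this repo; setting `DRY_RUN=1` only prints the commands.

```shell
#!/bin/sh
# Compose a registry-qualified image reference.
# $1 = registry path (hypothetical), $2 = image name, $3 = image version
image_ref() {
  printf '%s/%s:%s\n' "$1" "$2" "$3"
}

# Tag the locally built image for the registry and push it.
# With DRY_RUN=1 the docker commands are printed instead of executed.
tag_and_push() {
  local_ref="$2:$3"
  remote_ref="$(image_ref "$1" "$2" "$3")"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "docker tag $local_ref $remote_ref"
    echo "docker push $remote_ref"
  else
    docker tag "$local_ref" "$remote_ref" && docker push "$remote_ref"
  fi
}
```

For example, `tag_and_push registry.example.com/myteam spark-scala 2.4.4` would tag the local `spark-scala:2.4.4` image and push it, assuming you are already logged in to that registry.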

The Python base image can be built using the following command:

docker build -t <IMAGE_NAME>:<IMAGE_VERSION> --build-arg base_image=<SCALA_SPARK_BASE_IMAGE> ./spark-docker/python/

The Scala Spark base image is required for building the Python Spark base image. Once the image is built, it can be tagged and pushed to the respective Docker repository.
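Because the Python image depends on the Scala image, the two builds have to run in order. The sketch below chains them, passing the Scala image reference as the `base_image` build argument. The helper names `run` and `build_spark_images` and the example image names are assumptions for illustration; `DRY_RUN=1` prints the commands without invoking Docker.

```shell
#!/bin/sh
# Print the command (DRY_RUN=1) or execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Build the Scala base image first, then the Python image on top of it.
# $1 = Scala image reference, $2 = Python image reference (hypothetical names)
build_spark_images() {
  scala_image="$1"
  python_image="$2"
  # The Python build only starts if the Scala build succeeded.
  run docker build -t "$scala_image" ./spark-docker/scala/ &&
    run docker build -t "$python_image" \
      --build-arg "base_image=$scala_image" ./spark-docker/python/
}
```

Running `build_spark_images spark-scala:2.4.4 spark-python:2.4.4` from the repository root would build both images, stopping early if the Scala build fails.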

Spark K8S Operator

WORK IN PROGRESS

Spark REST API

WORK IN PROGRESS