
Spark Initialization & Shutdown

SparkSession
  |
SparkContext
listenerBus
  |
LiveListenerBus
val queues = new CopyOnWriteArrayList[AsyncEventQueue]()
SHARED_QUEUE
  |
AsyncEventQueue extends SparkListenerBus (extends ListenerBus[SparkListenerInterface, SparkListenerEvent])
val dispatchThread = new Thread("spark-listener-group-shared") // daemon = true
calls instances of (sub)type SparkListenerInterface
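
A minimal runnable sketch of this chain from the user side (object and app names are made up): SparkContext.addSparkListener puts the listener into the SHARED_QUEUE of LiveListenerBus, and events are then dispatched to it on the spark-listener-group-shared daemon thread.

```scala
import org.apache.spark.scheduler.{SparkListener, SparkListenerApplicationEnd}
import org.apache.spark.sql.SparkSession

object SharedQueueSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[*]").appName("listener-demo").getOrCreate()

    // addSparkListener registers the listener in LiveListenerBus's SHARED_QUEUE;
    // its callbacks run on the "spark-listener-group-shared" daemon thread.
    spark.sparkContext.addSparkListener(new SparkListener {
      override def onApplicationEnd(end: SparkListenerApplicationEnd): Unit =
        println(s"application end at ${end.time}, dispatched on ${Thread.currentThread().getName}")
    })

    spark.stop() // posts SparkListenerApplicationEnd before the bus is stopped
  }
}
```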






SplineQueryExecutionListener extends QueryExecutionListener

SparkSession
listenerManager: ExecutionListenerManager
  |
ExecutionListenerManager
val listenerBus = new ExecutionListenerBus
session.sparkContext.listenerBus.addToSharedQueue(listenerBus) // this listenerBus is registered as a listener in the SHARED_QUEUE
  |
ExecutionListenerBus extends SparkListener with ListenerBus[QueryExecutionListener, SparkListenerSQLExecutionEnd]
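
A minimal sketch of the registration path described above, assuming Spark 3.x (the built-in noop data source used here to trigger a SQL execution is a 3.x feature): spark.listenerManager.register hands the listener to ExecutionListenerManager, which relays SparkListenerSQLExecutionEnd events to it through the ExecutionListenerBus sitting in the SHARED_QUEUE.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.execution.QueryExecution
import org.apache.spark.sql.util.QueryExecutionListener

object QueryListenerSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[*]").appName("qel-demo").getOrCreate()
    import spark.implicits._

    // Registered with ExecutionListenerManager; called back via the
    // ExecutionListenerBus once a SparkListenerSQLExecutionEnd event arrives.
    spark.listenerManager.register(new QueryExecutionListener {
      override def onSuccess(funcName: String, qe: QueryExecution, durationNs: Long): Unit =
        println(s"'$funcName' succeeded in ${durationNs / 1e6} ms")
      override def onFailure(funcName: String, qe: QueryExecution, exception: Exception): Unit =
        println(s"'$funcName' failed: ${exception.getMessage}")
    })

    // A write action triggers a SQL execution and thus the listener.
    Seq(1, 2, 3).toDF("n").write.format("noop").mode("overwrite").save()
    spark.stop()
  }
}
```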


// Both listeners are served from the same queue inside session.sparkContext.listenerBus - the SHARED_QUEUE.
// Both listeners are served by the same thread: spark-listener-group-shared.
// => The order of the events cannot change: the first one is the lineage-producing event, generated and inserted by a non-daemon thread.
// Then the non-daemon thread terminates, JVM shutdown begins, and the JVM shutdown hooks are invoked.
// This in turn runs the shutdown hooks in Spark and causes insertion of the application end event.
// This means the application end event is always delivered after the lineage event, so it is safe to release resources there (see the sketch below).
SplineQueryExecutionListener
SplineSparkApplicationEndListener
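
A hypothetical sketch, not Spline's actual code, of the two listener roles and why the shared-queue ordering makes releasing resources in onApplicationEnd safe:

```scala
import java.util.concurrent.atomic.AtomicBoolean

import org.apache.spark.scheduler.{SparkListener, SparkListenerApplicationEnd}
import org.apache.spark.sql.execution.QueryExecution
import org.apache.spark.sql.util.QueryExecutionListener

// Hypothetical stand-in for a shared resource (e.g. a dispatcher that ships
// lineage to a Spline server); not Spline's actual class.
class LineageDispatcher {
  private val closed = new AtomicBoolean(false)
  def send(what: String): Unit = {
    require(!closed.get(), "dispatcher already closed")
    println(s"lineage sent for $what")
  }
  def close(): Unit = closed.set(true)
}

// Role of SplineQueryExecutionListener: emit one lineage event per query.
class LineageListener(dispatcher: LineageDispatcher) extends QueryExecutionListener {
  override def onSuccess(funcName: String, qe: QueryExecution, durationNs: Long): Unit =
    dispatcher.send(funcName)
  override def onFailure(funcName: String, qe: QueryExecution, exception: Exception): Unit = ()
}

// Role of SplineSparkApplicationEndListener: release the resource at app end.
// Because both listeners are dispatched from the single SHARED_QUEUE thread,
// onApplicationEnd runs only after every earlier lineage event has been
// delivered, so close() cannot race with send().
class AppEndListener(dispatcher: LineageDispatcher) extends SparkListener {
  override def onApplicationEnd(end: SparkListenerApplicationEnd): Unit =
    dispatcher.close()
}
```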

// Shutdown chain:
java.lang.Runtime.getRuntime().addShutdownHook(...)
  |
org.apache.hadoop.util.ShutdownHookManager
  |
org.apache.spark.util.ShutdownHookManager
  |
hookTask (runs on the JVM shutdown-hook thread, here Thread-1)
  |
SparkContext.stop()
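
A minimal sketch of how a hook enters this chain, using the public Hadoop ShutdownHookManager that Spark's private one delegates to; the priority value below is illustrative, not Spark's actual constant.

```scala
import org.apache.hadoop.util.ShutdownHookManager

object ShutdownChainSketch {
  def main(args: Array[String]): Unit = {
    // Hadoop's ShutdownHookManager registers a single hook with
    // java.lang.Runtime and runs its own hooks in priority order (highest
    // first). Spark registers its hookTask here, which eventually invokes
    // SparkContext.stop().
    ShutdownHookManager.get().addShutdownHook(
      new Runnable {
        override def run(): Unit =
          println(s"hook running on ${Thread.currentThread().getName}")
      },
      50) // illustrative priority

    println("main exits; JVM shutdown starts and the hook chain fires")
  }
}
```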