Spark-submit: Examples and Reference
Last updated:
Unless otherwise noted, examples reflect Spark 2.x, running on a local setup, in client mode.
Simplest possible example
All commands can be written on a single line, but for presentation purposes I've used a backslash ("\"), which continues a command onto the next line in shells like bash.
Run a Scala JAR file locally (no cluster) using 2 threads:
$ spark-submit --class name.space.to.MyMainClass \
--master local[2] \
path/to/my-spark-fat-jar.jar \
argument1 \
argument2 \
argument3
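To illustrate the note on line breaks, here is a small sketch. It uses a hypothetical helper, show_args (not part of Spark), as a stand-in for spark-submit so that it runs without a Spark installation. It suggests that the one-line form and the backslash-continued form hand the program the same arguments. Quoting local[2] is my own precaution: the square brackets are glob characters, and some shells, such as zsh, reject an unquoted pattern that matches no file.

```shell
# Hypothetical stand-in for spark-submit: prints each argument on its own line.
show_args() { printf '%s\n' "$@"; }

# The command written on a single line.
one_line=$(show_args --class name.space.to.MyMainClass --master 'local[2]' path/to/my-spark-fat-jar.jar argument1)

# The same command split across lines with trailing backslashes.
multi_line=$(show_args --class name.space.to.MyMainClass \
  --master 'local[2]' \
  path/to/my-spark-fat-jar.jar \
  argument1)

# Both forms should produce an identical argument list.
if [ "$one_line" = "$multi_line" ]; then
  echo "same arguments"
else
  echo "different arguments"
fi
```

A backslash only continues the line when it is the very last character; a trailing space after it ends the command early.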
Setting memory options
Set total driver memory to 8 gigabytes:
$ spark-submit --class name.space.to.MyMainClass \
--driver-memory 8G \
--master local[2] \
path/to/my-spark-fat-jar.jar
Pass general config options
For instance, if you want Spark to use 70% of the JVM heap (after a reserved 300 MB) for execution and storage (default value is 0.6):
$ spark-submit --class name.space.to.MyMainClass \
--conf "spark.memory.fraction=0.7" \
--master local[2] \
path/to/my-spark-fat-jar.jar
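As a rough worked example of what that setting changes, the sketch below does the arithmetic in the shell. The 8 GiB heap is an assumed figure for illustration, and the 300 MiB is the amount Spark sets aside before applying the fraction. Real numbers may differ somewhat, since the JVM can report less usable heap than the configured maximum.

```shell
# Assumed figures for illustration: an 8 GiB heap and 300 MiB of reserved memory.
heap_mb=8192
reserved_mb=300
usable_mb=$((heap_mb - reserved_mb))

# Shell arithmetic is integer-only, so fractions are written in tenths (6 -> 0.6, 7 -> 0.7).
unified_default_mb=$((usable_mb * 6 / 10))
unified_tuned_mb=$((usable_mb * 7 / 10))

echo "spark.memory.fraction=0.6 -> ${unified_default_mb} MiB for execution and storage"
echo "spark.memory.fraction=0.7 -> ${unified_tuned_mb} MiB for execution and storage"
```

Under these assumptions, raising the fraction from 0.6 to 0.7 moves a bit under 800 MiB from user memory to Spark's execution and storage pool.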
Pass general JVM flags
JVM flags may be passed to the driver JVM and to the executor JVMs separately. Heap size settings (such as -Xmx) are not allowed here; use the dedicated command-line parameters instead, such as --driver-memory or --executor-memory.
To increase garbage collection verbosity on both the driver and the executors:
$ spark-submit --class name.space.to.MyMainClass \
--conf "spark.driver.extraJavaOptions=-XX:+PrintGCDetails" \
--conf "spark.executor.extraJavaOptions=-XX:+PrintGCDetails" \
--master local[2] \
path/to/my-spark-fat-jar.jar
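When several JVM flags go into one extraJavaOptions value, they are separated by spaces, so the quotes around the --conf value start to matter. The sketch below uses a hypothetical helper, count_args (not part of Spark), in place of spark-submit to show how the shell splits the arguments with and without quotes. The second flag, -XX:+PrintGCDateStamps, is a Java 8 option added here only for illustration.

```shell
# Hypothetical stand-in for spark-submit: prints how many arguments it received.
count_args() { echo "$#"; }

# Quoted: the whole key=value pair, including both JVM flags, stays one argument.
quoted=$(count_args --conf "spark.driver.extraJavaOptions=-XX:+PrintGCDetails -XX:+PrintGCDateStamps")

# Unquoted: the shell splits on the space, so the second flag becomes a separate argument.
unquoted=$(count_args --conf spark.driver.extraJavaOptions=-XX:+PrintGCDetails -XX:+PrintGCDateStamps)

echo "quoted: ${quoted} arguments, unquoted: ${unquoted} arguments"
```

In the unquoted form, the stray flag would no longer belong to the --conf value, so it is safer to always quote these options.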
References