While a SparkContext is being created, Spark allocates a number of random ports for its services. This is annoying when every open port has to be accounted for in security groups.
Note!
A more detailed post on this topic is here.
Here is an example of how random ports are allocated when Spark service is started:
The only sure bet is 4040 (or 404x, depending on how many Spark Web UIs have already been started).
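The 404x behaviour can be pictured as a simple "try the next port" loop. The following is only a rough sketch of that idea, not Spark's actual code: the function name `first_free_port` is hypothetical, and the retry limit of 16 assumes Spark's default for `spark.port.maxRetries`.

```python
import socket


def first_free_port(start=4040, max_retries=16, host="127.0.0.1"):
    """Return the first port >= start that can be bound on host.

    Hypothetical helper that mimics the 4040, 4041, 4042, ... pattern;
    max_retries=16 is an assumption based on Spark's default retry count.
    """
    for port in range(start, start + max_retries + 1):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            try:
                sock.bind((host, port))
            except OSError:
                continue  # port already in use, try the next one
            return port
    raise RuntimeError(f"no free port in {start}-{start + max_retries}")


if __name__ == "__main__":
    print(first_free_port())
```

If one Spark Web UI is already running on 4040, a loop like this would move on to 4041, which matches what is seen in practice.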
On the Apache Spark website, the Networking section of the Configuration page lists 6 port properties whose default value is random. These are the properties that have to be tamed.
(These are the only 6 properties with a default value of random among all Spark properties.)
Solution
Open $SPARK_HOME/conf/spark-defaults.conf:
sudo -u spark vi $SPARK_HOME/conf/spark-defaults.conf
The following properties should be added:
spark.blockManager.port 38000
spark.broadcast.port 38001
spark.driver.port 38002
spark.executor.port 38003
spark.fileserver.port 38004
spark.replClassServer.port 38005
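If you want a different range, the same six lines can be generated rather than typed by hand. This is a minimal sketch: the property names are copied from the list above, while the helper name `render_port_properties` and its `base_port` parameter are made up for illustration. It is worth checking the names against the Configuration page for your Spark version before using the output.

```python
# Property names as listed in this post; the order determines the port offsets.
PORT_PROPERTIES = [
    "spark.blockManager.port",
    "spark.broadcast.port",
    "spark.driver.port",
    "spark.executor.port",
    "spark.fileserver.port",
    "spark.replClassServer.port",
]


def render_port_properties(base_port=38000):
    """Return spark-defaults.conf lines that assign consecutive ports.

    Hypothetical helper: the first property gets base_port, the next
    base_port + 1, and so on.
    """
    return [
        f"{name} {base_port + offset}"
        for offset, name in enumerate(PORT_PROPERTIES)
    ]


if __name__ == "__main__":
    # Print the lines so they can be pasted into spark-defaults.conf.
    print("\n".join(render_port_properties()))
```

Run with the default base port, this prints the same six lines as above, so only one number needs to change to move the whole range.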
I have picked port range 38000-38005 for my Spark services.
If I start the Spark service now, the ports in use are the ones defined in the configuration file:
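A quick way to double-check this from a script is to probe the chosen range and see which ports accept TCP connections. The sketch below uses only the Python standard library. The helper names `port_is_open` and `listening_ports` are hypothetical, and the host and range are assumptions taken from my setup. Depending on where the driver and executors run, not every port in the range will necessarily be open on every host.

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an error code otherwise.
        return sock.connect_ex((host, port)) == 0


def listening_ports(host="127.0.0.1", ports=range(38000, 38006)):
    """Return the ports from `ports` that accept connections on host."""
    return [port for port in ports if port_is_open(host, port)]


if __name__ == "__main__":
    print(listening_ports())
```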