Interactive Spark shell
You can run Spark commands interactively in the Spark shell. The Spark shell is available in Scala, Python, and R.
- Launch a long-running interactive `bash` session using `dcos task exec`, as sketched below.
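
  A minimal sketch of this step, assuming the DC/OS CLI is installed and that `<task-id>` is a placeholder for the ID of any running task you can attach to (list task IDs with `dcos task`):

  ```bash
  # List running tasks and pick a task ID to attach to.
  dcos task

  # Open a long-running interactive bash session inside that task's container.
  dcos task exec --interactive --tty <task-id> /bin/bash
  ```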
- From your interactive `bash` session, pull and run a Spark Docker image:

  ```bash
  docker pull mesosphere/spark:2.6.0-2.3.2-hadoop-2.7
  docker run -it --net=host mesosphere/spark:2.6.0-2.3.2-hadoop-2.7 /bin/bash
  ```
- Run the Spark shell from within the Docker image, replacing `<internal-leader-ip>` with the IP address of the Mesos leader (see the lookup note after these commands).

  For the Scala Spark shell:

  ```bash
  ./bin/spark-shell --master mesos://<internal-leader-ip>:5050 --conf spark.mesos.executor.docker.image=mesosphere/spark:2.6.0-2.3.2-hadoop-2.7 --conf spark.mesos.executor.home=/opt/spark/dist
  ```

  For the Python Spark shell:

  ```bash
  ./bin/pyspark --master mesos://<internal-leader-ip>:5050 --conf spark.mesos.executor.docker.image=mesosphere/spark:2.6.0-2.3.2-hadoop-2.7 --conf spark.mesos.executor.home=/opt/spark/dist
  ```

  For the R Spark shell:

  ```bash
  ./bin/sparkR --master mesos://<internal-leader-ip>:5050 --conf spark.mesos.executor.docker.image=mesosphere/spark:2.6.0-2.3.2-hadoop-2.7 --conf spark.mesos.executor.home=/opt/spark/dist
  ```
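
  If you do not know the internal leader IP: inside a DC/OS cluster, Mesos-DNS typically resolves the leading master at `leader.mesos`, so you can look it up or use the name directly. This is a sketch, not part of the original steps, and the lookup assumes the image includes standard name-resolution tools such as `getent`:

  ```bash
  # Resolve the leading Mesos master via Mesos-DNS.
  getent hosts leader.mesos

  # Or point the shell at the DNS name instead of a literal IP:
  ./bin/spark-shell --master mesos://leader.mesos:5050 \
    --conf spark.mesos.executor.docker.image=mesosphere/spark:2.6.0-2.3.2-hadoop-2.7 \
    --conf spark.mesos.executor.home=/opt/spark/dist
  ```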
- Run Spark commands interactively. The Scala and Python examples count the lines of the bundled README; the R example converts R's built-in `faithful` dataset into a Spark DataFrame.

  In the Scala shell:

  ```scala
  val textFile = sc.textFile("/opt/spark/dist/README.md")
  textFile.count()
  ```

  In the Python shell:

  ```python
  textFile = sc.textFile("/opt/spark/dist/README.md")
  textFile.count()
  ```

  In the R shell:

  ```r
  df <- as.DataFrame(faithful)
  head(df)
  ```