by heroku


Spark Singularity

Use Spark on Heroku in a single dyno. Experiment inexpensively with Spark in the Common Runtime.

Production-quality Spark clusters may be deployed into Private Spaces using spark-in-space.


This buildpack runs three processes, all children of the main web process:

  1. Nginx proxy for
    • basic password authentication set via environment variable
      • format username:{PLAIN}password
  2. Spark master
    • web UI
    • REST API
  3. one Spark worker

🚨 This app should not be scaled beyond a single dyno. (There is no coordination mechanism between multiple instances; the single web dyno is implicitly used as the Spark master.)

Submitting & controlling jobs

Because Spark Singularity is contained in a single dyno with only port 80 exposed, there are two options for submitting jobs:

  1. Spark's REST API, proxied at
  2. Declare the Spark jobs to submit on start-up by adding each classname on an individual line in Jobfile:
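
As a minimal sketch of the second option, the commands below write a Jobfile with one class name per line. The class names are hypothetical placeholders, and the sketch assumes that Jobfile sits at the root of the app repository; adjust both to match your own project.

```shell
# Hypothetical class names: replace them with the main classes of your own Spark jobs.
# Each line of Jobfile holds exactly one fully qualified class name.
cat > Jobfile <<'EOF'
com.example.jobs.ImportJob
com.example.jobs.QueryJob
EOF

# Print the file to confirm which jobs would be declared for start-up.
cat Jobfile
```

Commit the file along with the rest of the app so that it is present in the dyno when the web process starts.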

Source deploy

heroku create
heroku addons:create bucketeer --as SPARK_S3
heroku buildpacks:add -i 1
heroku buildpacks:add -i 2 heroku/scala
heroku buildpacks:add -i 3
heroku buildpacks:add -i 4
heroku buildpacks:add -i 5

Sample import & query

These processes will run out of memory on anything smaller than the large Performance-L dynos (14 GB RAM).

heroku scale web=0
heroku run bin/spark-local-job -s Performance-L
heroku scale web=1:Performance-L
heroku logs -t
# Once complete, scale down to avoid ongoing Performance-L dyno charges.
heroku scale web=0:Standard-1x