Spark clusters are a great tool for data scientists who want to distribute their computation across multiple workers.
cnvrg makes it easy to build Spark clusters and use them to run experiments or Jupyter notebooks.

Build a Spark Cluster Using cnvrg

In your Organization Settings, go to the Clusters tab.
Click Create New Cluster and choose Spark Standalone:

  • Set a name for the cluster
  • Choose the machine type for the master
  • Choose the number of workers and the machine type for each worker
  • Set the details for each executor: the number of CPUs and the memory per executor
  • Create the cluster

After creating the Spark cluster, you're ready to use it for experiments and notebooks.

Notebooks:

To start a Jupyter notebook on a Spark cluster, go to the Notebooks tab and click Start Notebook:

  • Choose the Jupyter on Spark option
  • Under Machine, choose the name of your Spark cluster

Your cluster will be created and your notebook will be available in a few moments.
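Once the notebook is up, you can talk to the cluster with the standard PySpark API. The sketch below is a minimal sanity check, assuming pyspark is preinstalled on the cluster image (the app name and sample size are illustrative); it falls back to a local sum when pyspark is not available:

```python
# Sanity-check sketch: run a trivial distributed job from the notebook.
# Assumes pyspark is preinstalled on the Spark cluster image.
try:
    from pyspark.sql import SparkSession  # standard PySpark entry point
    HAVE_SPARK = True
except ImportError:
    HAVE_SPARK = False

def distributed_sum(n):
    """Sum 0..n-1 across the cluster's workers (local fallback otherwise)."""
    if HAVE_SPARK:
        # getOrCreate() reuses an existing session if the notebook
        # environment already provides one.
        spark = SparkSession.builder.appName("sanity-check").getOrCreate()
        return spark.sparkContext.parallelize(range(n)).sum()
    return sum(range(n))

print(distributed_sum(100))  # 4950 either way
```

If this prints the expected sum, the notebook is connected to the cluster and jobs are being distributed across the workers.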

Experiments:

Go to the Experiments tab and click New Experiment:

  • Set your input command, for example: spark-submit pi.py
  • Under Machine, choose the name of your Spark cluster

Your cluster will be created, and in a few moments your experiment will start running, distributed across the cluster.
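As a concrete example of what pi.py could contain, here is a sketch of the classic Monte Carlo pi estimator, a common first Spark job (assuming pyspark is available on the cluster; the file name, app name, and sample count are illustrative):

```python
# pi.py -- Monte Carlo pi estimate, sketched for `spark-submit pi.py`.
# Assumes pyspark is available on the Spark cluster.
import random

def hit(_):
    """1 if a random point in the unit square lands inside the quarter circle."""
    x, y = random.random(), random.random()
    return 1 if x * x + y * y <= 1.0 else 0

def estimate_pi(hits, samples):
    """pi ~= 4 * (points inside the quarter circle) / (total points)."""
    return 4.0 * hits / samples

if __name__ == "__main__":
    from pyspark.sql import SparkSession
    spark = SparkSession.builder.appName("pi").getOrCreate()
    n = 1_000_000  # illustrative sample count
    # Each worker evaluates a share of the random samples.
    hits = spark.sparkContext.parallelize(range(n)).map(hit).sum()
    print("pi is roughly", estimate_pi(hits, n))
    spark.stop()
```

Because the samples are independent, the map step parallelizes cleanly across the workers you configured for the cluster.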
