What is CNVRG?

cnvrg.io is a full-stack data science platform. It gives data science teams a collaborative place for their entire data science and machine learning workflow, from research, development and experimentation to deploying the model in production. cnvrg.io makes data science work reproducible, accessible and faster.

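In this guide we will:

  • Create a project and connect a git repository
  • Connect the project to cnvrg-cli
  • Create a dataset
  • Run an experiment
  • Run a grid search
  • Publish a model

Let's get started!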

Create a project & connect a git repository

To create a new project, go to your organization's home page and click on Start Project.
Set a name and, optionally, a description, then click on Start.

Now you have a new project in your organization!

You can connect a git repository from either the project's home page or the project's settings page.

From the project's home page:

And click on Save.

Now the cnvrg MNIST example repository is connected!

Connecting cnvrg projects to cnvrg-cli

If you haven't done so yet, download cnvrg-cli and log in.

Create a Dataset

  • Open the terminal and create a directory for the dataset: mkdir mnist_dataset
  • Enter the directory (cd mnist_dataset) and run: cnvrg data init
  • Download the dataset file by running: wget https://github.com/cnvrg/mnist_with_dataset/raw/master/mnist.npz
  • Sync the dataset to cnvrg by running: cnvrg data sync
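
Before syncing, it can be worth checking that the download is intact. An .npz file is a zip archive of .npy arrays, so the standard library can inspect it without NumPy. The following is a minimal sketch; the helper name list_npz_arrays is hypothetical and is not part of cnvrg-cli.

```python
import zipfile


def list_npz_arrays(path):
    """Return the names of the arrays stored in an .npz file.

    Hypothetical helper for illustration only; not part of cnvrg-cli.
    Raises ValueError if any member of the archive is corrupt.
    """
    with zipfile.ZipFile(path) as archive:
        # testzip() returns the name of the first corrupt member, or None.
        bad_member = archive.testzip()
        if bad_member is not None:
            raise ValueError(f"corrupt member in archive: {bad_member}")
        # Each array is stored as "<array_name>.npy" inside the archive.
        return [name[:-4] for name in archive.namelist() if name.endswith(".npy")]
```

Calling list_npz_arrays("mnist.npz") inside the dataset directory should list the arrays in the file. A truncated download would typically raise zipfile.BadZipFile instead.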

Running your first experiment 

From inside the project directory, we will run mnist.py using the dataset we just created.
The following command downloads the dataset and creates an experiment that runs mnist.py on a medium machine:

    cnvrg run --data=mnist_dataset python3 mnist.py --dataset_path=/data/mnist_dataset/mnist.npz

To run the experiment on a GPU machine, run:

    cnvrg run --gpuxl --data=mnist_dataset python3 mnist.py --dataset_path=/data/mnist_dataset/mnist.npz
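
The commands above pass --dataset_path to mnist.py, so the script has to accept that flag. This guide does not show the contents of mnist.py. The following is only a sketch of how a script could read the flag with argparse, and the function name parse_args is an assumption.

```python
import argparse


def parse_args(argv=None):
    """Parse the command-line flags used in the cnvrg run commands above.

    Sketch only: the real mnist.py in the example repository may differ.
    """
    parser = argparse.ArgumentParser(description="Train a model on MNIST")
    parser.add_argument(
        "--dataset_path",
        required=True,
        help="Path to mnist.npz, e.g. /data/mnist_dataset/mnist.npz",
    )
    # argv=None makes argparse read sys.argv[1:], which is what happens
    # when the script is started with: python3 mnist.py --dataset_path=...
    return parser.parse_args(argv)
```

With this in place, parse_args().dataset_path holds the path given on the command line, and the script can open the file from there.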

Running your first Grid Search

Grid Search lets you run multiple experiments with different parameters in a single command.
To run a grid search, you need to provide a YAML file that defines the parameters for the run.

Below is an example YAML file:

    # A float parameter is a range of possible values between a minimum (inclusive)
    # and a maximum (exclusive).
    - param_name: "learning_rate"
      type: "float" # precision is 9 digits after the decimal point
      min: 0.00001
      max: 0.1
      scale: "log2" # Could be log10 as well
      steps: 2

    # A discrete parameter is an array of numerical values.
    - param_name: "c"
      type: "discrete"
      values: [0, 0.1, 0.001]

    # A categorical parameter is an array of string values.
    - param_name: "kernel"
      type: "categorical"
      values: ["linear", "rbf"]

After saving the YAML file as grid.yaml inside the project directory, you can run:

    cnvrg run --grid=grid.yaml --data=mnist_dataset python3 mnist.py --dataset_path=/data/mnist_dataset/mnist.npz

cnvrg will create and run an experiment for every combination of the parameters defined in the YAML file.
You can follow the status of the experiments in the Experiments tab of the project UI.
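
To get a feel for how many experiments a grid produces, the combinations can be sketched in plain Python. This is an illustration and not cnvrg's implementation. It assumes a full Cartesian product over the parameters, and it assumes the float parameter contributes exactly `steps` values; how cnvrg picks those values inside the range is not covered here.

```python
from itertools import product

# Mirrors the discrete and categorical entries of the example YAML file.
params = {
    "c": [0, 0.1, 0.001],
    "kernel": ["linear", "rbf"],
}

# Assumption: the float parameter "learning_rate" yields `steps` values.
LEARNING_RATE_STEPS = 2


def expand(params):
    """Return one dict per combination of the given parameter values."""
    names = list(params)
    value_lists = [params[name] for name in names]
    return [dict(zip(names, combo)) for combo in product(*value_lists)]


combos = expand(params)
total = len(combos) * LEARNING_RATE_STEPS

print(len(combos))  # 6 combinations of c and kernel
print(total)        # 12 experiments under the assumptions above
```

Adding one more value to any parameter multiplies the number of experiments, so it helps to keep the lists short for a first run.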

Publishing a model

With cnvrg, it's easy to publish a predictive model to a secured endpoint.

  • On the cnvrg project's page, go to the Publish tab
  • Click on Publish new model
  • Fill in the following details: File: model.py, Function to execute: predict, and a machine type. Then click on Publish!
  • Once the model is published, you can start sending it requests, as shown in the example in the lower part of the page.
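
The publish form above points cnvrg at a file (model.py) and a function (predict). This guide does not show model.py or the input format the endpoint passes to the function, so the following is only a hypothetical sketch: it assumes predict receives a list of ten class scores and returns the predicted digit. A real model.py would normally load a trained model and run inference instead.

```python
# model.py (hypothetical sketch; the real file in the example repository may differ)


def predict(scores):
    """Return the index of the highest score as the predicted digit.

    Assumption: `scores` is a non-empty sequence of per-class scores.
    This placeholder stands in for real model inference.
    """
    if not scores:
        raise ValueError("scores must be a non-empty sequence")
    # Pick the position of the largest score.
    return max(range(len(scores)), key=lambda i: scores[i])
```

Keeping predict a plain function with a simple, serializable return value makes it easy to test locally before publishing.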

All done! 

Now you're ready to connect one of your git projects and fire experiments!
