When running experiments, users can connect their datasets, install the requirements they need, and prepare a custom, isolated environment for each experiment.
Sometimes an experiment fails because of an error in the code, a problem in the environment, or simply because the user forgot to attach the correct configuration.
cnvrg helps you resolve those cases quickly without needing to rerun the experiment and wait for it to run on a new environment.
How to debug experiments
When an experiment fails, cnvrg sends a notification to the user (via Slack or email)
stating that the experiment failed and that it will remain LIVE for 30 minutes.
The experiment, its compute, and its environment can be accessed through a terminal session on the experiment's page.
Once the experiment enters the DEBUG state, the user can fix the errors and rerun the experiment from its current state.
The user can extend the DEBUG state by clicking the 'Add 15 Minutes' button.
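The timing described above can be illustrated with a small sketch. This is not cnvrg code or a cnvrg API: the class DebugWindow and its methods are hypothetical names introduced only to show the arithmetic, and the sketch assumes that each press of the button adds 15 minutes to the current expiry time with no upper limit, which the text does not confirm.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

INITIAL_WINDOW = timedelta(minutes=30)  # from the text: LIVE for 30 minutes
EXTENSION = timedelta(minutes=15)       # from the text: 'Add 15 Minutes' button


@dataclass
class DebugWindow:
    """Hypothetical model of the DEBUG state timer (not a cnvrg API)."""

    failed_at: datetime   # moment the experiment failed
    extensions: int = 0   # number of times 'Add 15 Minutes' was pressed

    def add_15_minutes(self) -> None:
        # Assumption: every press extends the window by one fixed increment.
        self.extensions += 1

    @property
    def expires_at(self) -> datetime:
        # Initial 30-minute window plus 15 minutes per extension.
        return self.failed_at + INITIAL_WINDOW + self.extensions * EXTENSION

    def is_live(self, now: datetime) -> bool:
        # The session is considered available strictly before the expiry time.
        return now < self.expires_at


window = DebugWindow(failed_at=datetime(2024, 1, 1, 12, 0))
window.add_15_minutes()
window.add_15_minutes()
print(window.expires_at)  # 2024-01-01 13:00:00
```

Under these assumptions, an experiment that failed at 12:00 and was extended twice would stay available for debugging until 13:00.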