Starting a cluster job

Jobs are submitted to the cluster with the qsub command. The job scheduler manages submitted jobs: it keeps them in a queue, starts them once enough resources are available, and tracks their state.

Starting a job

$ qsub 8 example.sh
  • qsub is the command that submits the job to the scheduler
  • 8 is the number of cores you want to use – this value is inserted into your job file (variable $NSLOTS)
  • example.sh is the job file you uploaded to the cluster
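
A job file that uses the core count could look like the sketch below. It only assumes what is stated above, namely that $NSLOTS holds the requested number of cores. The fallback value of 1 and the program name my_program are hypothetical and serve only as illustration.

```shell
#!/bin/sh
# example.sh: a minimal, hypothetical job file (a sketch, not an official template).

# NSLOTS carries the number of cores requested at submission.
# Fall back to 1 so the script can also be tried outside the cluster.
slots="${NSLOTS:-1}"

echo "Job started on $(uname -n) with $slots slot(s)"

# Replace the next line with your own command; my_program is a placeholder.
# ./my_program --threads "$slots"
```

Reading the value into one variable at the top keeps the rest of the script independent of how the core count was supplied.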

The system responds with a message like:

Your job 403 ("example.sh.job") has been submitted
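
If a script needs the job ID later (for qstat or qdel), it can be extracted from this message. The following is a sketch that assumes the message always has the form shown above; the function name extract_job_id is made up for this example.

```shell
# Print the numeric job ID contained in a submission message.
# Assumes the message starts with "Your job <number> ".
extract_job_id() {
    printf '%s\n' "$1" | sed -n 's/^Your job \([0-9][0-9]*\) .*/\1/p'
}

msg='Your job 403 ("example.sh.job") has been submitted'
job_id=$(extract_job_id "$msg")
echo "$job_id"    # prints: 403
```

If the message does not match the expected form, the function prints nothing, so an empty result can be treated as a failed submission.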

Monitoring a job

The job status can be monitored with the following command:

$ qstat
job-ID  prior   name       user         state submit/start at     queue                          slots ja-task-ID
-----------------------------------------------------------------------------------------------------------------
    403 0.00000 example mustermann      qw    07/18/2014 13:51:09                                    8
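
To read the state of a single job in a script, the fifth column of the matching row can be picked out with awk. This is a sketch that assumes the column layout shown above and job names and user names without spaces; job_state is a hypothetical helper name.

```shell
# Read qstat output on stdin and print the state of the given job ID.
# Assumes column 1 is the job ID and column 5 is the state.
job_state() {
    awk -v id="$1" '$1 == id { print $5 }'
}

# Demonstration with the sample row from above.
sample='    403 0.00000 example mustermann      qw    07/18/2014 13:51:09                                    8'
printf '%s\n' "$sample" | job_state 403    # prints: qw
```

On the cluster the same helper would be used as: qstat | job_state 403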

The letters in the ‘state’ column mean:

  • r – running
  • w – waiting
  • t – transferring to the cluster nodes
  • E – error
  • h – hold
  • s/S – suspended
  • q – in queue (qw means the job is waiting in the queue until enough resources are available)
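
Because a state is a combination of these letters, it can be decoded one letter at a time. The sketch below maps each letter to the meaning listed above; describe_state is a hypothetical helper, and letters not in the list are reported as unknown.

```shell
# Translate a state string such as "qw" into words, letter by letter.
describe_state() {
    rest="$1"
    desc=""
    while [ -n "$rest" ]; do
        letter="${rest%"${rest#?}"}"   # first character of $rest
        rest="${rest#?}"               # remaining characters
        case "$letter" in
            r)   word="running" ;;
            w)   word="waiting" ;;
            t)   word="transferring" ;;
            E)   word="error" ;;
            h)   word="hold" ;;
            s|S) word="suspended" ;;
            q)   word="in queue" ;;
            *)   word="unknown ($letter)" ;;
        esac
        desc="${desc:+$desc, }$word"   # join the words with ", "
    done
    printf '%s\n' "$desc"
}

describe_state qw    # prints: in queue, waiting
```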

Deleting a job

To cancel or delete a job submitted with qsub, use qdel with the job ID shown by qstat:

$ qdel <job_id>
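
In scripts, it can help to check the argument before calling qdel, so that an empty or mistyped value is never passed on. The wrapper below is a sketch; cancel_job is a hypothetical name, and the restriction to purely numeric IDs is an assumption based on the job IDs shown by qstat.

```shell
# Call qdel only if the argument looks like a numeric job ID.
cancel_job() {
    case "$1" in
        ''|*[!0-9]*)
            echo "not a numeric job ID: '$1'" >&2
            return 1
            ;;
    esac
    qdel "$1"
}

# Usage on the cluster:
#   cancel_job 403
```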

If an error occurs

To obtain more information about a job, including the error message, use:

$ qstat -j <job_id>
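
A script can request these details only when the state contains the letter E. The following sketch assumes the state string has already been read from the qstat output; has_error_state and inspect_if_failed are hypothetical helper names.

```shell
# Succeed if the state string contains the error letter E.
has_error_state() {
    case "$1" in
        *E*) return 0 ;;
        *)   return 1 ;;
    esac
}

# Show the detailed job information only for jobs in an error state.
# Arguments: job ID, state string (for example "Eqw").
inspect_if_failed() {
    if has_error_state "$2"; then
        qstat -j "$1"
    else
        echo "job $1 is not in an error state ($2)"
    fi
}

# Usage on the cluster:
#   inspect_if_failed 403 Eqw
```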