I am running my jobs on a small cluster. I submitted them via qsub. Now my labmates need resources more urgently than me, so I need to either kill my jobs or pause them, if possible.

Is there a way of pausing my jobs and releasing the CPU, RAM, etc.?

I am a normal user (no root privileges).

Sibbs Gambling
  • You can hold back your not yet running jobs with `qhold`, but this does not affect already running jobs. It seems that `qmod` can suspend running jobs, but needs root or manager privileges. – jofel Sep 07 '15 at 12:01
  • Assuming you have a MOAB cluster: you can suspend your job (if configured accordingly), but only an administrator can resume it: http://docs.adaptivecomputing.com/mwm/Content/topics/jobAdministration/suspendresume.html – Ott Toomet Dec 27 '15 at 05:13
  • Use `qrls` to release a held job so that it can run again. – Charlie Parker Dec 24 '20 at 18:41

1 Answer

If the jobs haven't started yet, you can put them on hold with qhold. Use qrls to release the hold so they become eligible to run again.

qhold <job ID>
qrls <job ID>
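
When several queued jobs need to be held at once, the two commands above can be wrapped in a small loop. The following is only a sketch: `for_each_job` and the `DRY_RUN` variable are hypothetical names introduced here, and the job IDs are made up. By default it only prints the commands so they can be reviewed before anything is changed.

```sh
#!/bin/sh
# Sketch: apply qhold or qrls to several job IDs in one go.
# for_each_job and DRY_RUN are hypothetical names used only in this example.
DRY_RUN=${DRY_RUN:-1}    # 1 = only print the commands, 0 = actually run them

for_each_job() {
    cmd=$1               # first argument: the command to apply (qhold or qrls)
    shift                # the remaining arguments are job IDs
    for job_id in "$@"; do
        if [ "$DRY_RUN" = 1 ]; then
            echo "$cmd $job_id"
        else
            "$cmd" "$job_id"
        fi
    done
}

# Example with made-up job IDs: prints one qhold command per job.
for_each_job qhold 1234 1235
```

If your installation provides qselect, something like `qselect -u "$USER" -s Q` may be a convenient way to obtain the IDs of your own queued jobs, but check its output on your system before feeding it into a loop like this.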

If they are already running, you can use qsig to suspend and resume them (you may need extra permissions for that; ask your administrator if that's the case):

qsig -s suspend <job ID>
qsig -s resume <job ID>
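
The choice between holding and suspending depends only on whether the job has started, so it can be sketched as a small helper. This assumes the PBS-style state letters shown by qstat (Q for queued, R for running); `pause_cmd` is a hypothetical name, and the helper only prints the suggested command rather than running it.

```sh
#!/bin/sh
# Sketch: pick the command that pauses a job, based on its state letter.
# Assumes PBS-style states as shown by qstat: Q = queued, R = running.
# pause_cmd is a hypothetical helper, not part of any batch system.
pause_cmd() {
    state=$1
    job_id=$2
    case "$state" in
        Q) echo "qhold $job_id" ;;             # not started yet: place a hold
        R) echo "qsig -s suspend $job_id" ;;   # running: suspend (may need extra permissions)
        *) echo "# no action for job $job_id in state $state" ;;
    esac
}

# Example with made-up job IDs; the output is meant to be reviewed, not piped to a shell blindly.
pause_cmd Q 1234
pause_cmd R 1235
```

Printing the command instead of executing it keeps the sketch safe to try on a cluster where suspending may not be permitted for normal users.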

Once you have resumed your job, you may have to force it to run with qrun (this usually requires operator or manager privileges):

qrun <job ID>

Tested on a SLES 11 SP4 system with PBSPro 13.0.2.153173, but I am confident it should work with other POSIX-compliant batch job submission systems.

Calimo