I have a program that is parallelized using MPI. It will attempt to run across multiple nodes on our CentOS 6.6-based HPC grid, but in practice it only runs successfully when all of its cores are on the same compute node.
For example, if I qsub a job to the grid asking for 20 cores and Grid Engine splits it across two different nodes, the program fails. However, if a node with 20 free cores is available and Grid Engine places the whole job on that node, the program runs successfully. The qsub script contains the directive #$ -pe mpi 20 to select the number of cores.
So at the moment, I run qstat -f -u "*" to manually identify a compute node with 20 available cores, and then submit to that node with qsub -q general.q@node-X-X.
What I am looking for is a way to tell Grid Engine to hold the job until a single compute node has the required number of free cores, and then run the whole job on that node. This would allow me to automate my job submission.
I am considering writing a bash script to parse the output of qstat -f -u "*", but there must be a more elegant solution. I have looked through the qsub manual but am unable to find a suitable flag or command-line argument.
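To make the parsing workaround concrete, here is a minimal sketch of what such a script might look like. It assumes that each queue-instance line of qstat -f has the form "queue@host qtype resv/used/tot. load_avg arch [states]", which should be checked against the actual output on the cluster. The function name find_free_nodes and the job script name job.sh are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: reads `qstat -f` output on stdin and prints every
# queue instance that has at least $1 free slots.
# Assumption: column 3 is "resv/used/total" and a 6th column, when present,
# holds state flags (e.g. d, au, E) marking an unusable instance.
find_free_nodes() {
    local need="$1"
    awk -v need="$need" '
        # Only consider queue-instance lines such as "general.q@node-1-1 ..."
        $1 ~ /@/ && $3 ~ /^[0-9]+\/[0-9]+\/[0-9]+$/ {
            split($3, s, "/")
            free = s[3] - s[2] - s[1]   # total - used - reserved
            # Skip instances that report any state flag
            if (free >= need && NF < 6) print $1
        }'
}

# Example usage on the cluster (commented out so the sketch has no side effects):
# node=$(qstat -f -u "*" | find_free_nodes 20 | head -n 1)
# [ -n "$node" ] && qsub -q "$node" job.sh
```

This only automates the manual procedure: the free-slot count can change between the qstat call and the qsub call, so the job may still have to wait in the queue for the chosen node.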
I'm not able to modify the program itself at this time and I am not a system administrator.
Here is some information on the different software versions I have available:
MPI/gridengine info:
> ompi_info | grep gridengine
MCA ras: gridengine (MCA v2.0, API v2.0, Component v1.6.2)
Grid engine version is: OGS/GE 2011.11p1