High Performance Computational Cluster: STEP 3 Configuration of Grid Engine

Using qmon, configure all the queues for batch and parallel jobs only. If you run qmon on a remote host, you need to grant that host access to your X display with xhost on your local machine and set DISPLAY on the remote host. For example, if I am sitting at unisys02 and running qmon on node02, then on unisys02 I would type

    xhost +node02

and on node02 I would run

    export DISPLAY=unisys02:0.0
    qmon &

Most of the default parameters for the queues are acceptable. Each queue is tied to a particular host; for example, queue node01.q should be tied to node01. The shell entry should be /bin/bash.

Batch jobs in the form of a script can be submitted to the queue system with the qsub command. Below is a sample script for a serial job:

#!/bin/bash
#-----------------------------------------------------------------
# Template script for serial jobs to run on CODINE cluster.
# Modify it for your case and submit to CODINE with
# command "qsub batch_run.sh".
# You may want to modify the parameter for
# "-N" (job name).
# You can monitor your jobs with command
# "qstat -u your_username" or "qstat -f" to see all queues.
# To remove your job, run "qdel job_id"
# To kill a running job, use "qdel -f job_id"
#----------------------------------------------------------------
# Give a name to your job (for example, run-on-mphase):
#$ -N run-in-queue
#
# Specify the kind of shell script you use, for example, bash
#$ -S /bin/bash
#
# Standard output and error files:
#$ -o stdo.output
#$ -e stderr.output
#
# Start this script from the current working directory:
#$ -cwd
#
# Specify a local temporary directory on the running node:
TMPD=/tmp/$LOGNAME/$JOB_ID
echo "TMPD is $TMPD"
mkdir -p "$TMPD"
# Remember the current working directory:
CDIR=$(pwd)
# Specify the name of your executable, for example, "loop.x"
myjob=loop.x
# Copy your compiled executable into the $TMPD directory
# along with the input files and run it from there:
cp "$myjob" "$TMPD"
cd "$TMPD"
"$TMPD/$myjob"
# If there is some output, copy it back into your current working directory:
#cp output* "$CDIR"
# Leave the temporary directory before removing it:
cd /tmp
rm -rf "$TMPD"
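
The staging pattern used in the serial template (create a per-job scratch directory, run the executable there, copy results back, clean up) can also be wrapped in a function with a trap, so that the scratch directory is removed even when the executable fails. The following is only a sketch and not part of the original template: the function name run_in_scratch, the fallback values used outside a Grid Engine job, and the output* file pattern are assumptions made for illustration.

```shell
#!/bin/bash
# Hypothetical helper (not part of the original template): stage an executable
# into a per-job scratch directory, run it there, copy results back, clean up.
run_in_scratch() {
    local exe=$1                 # path to the compiled executable
    local start_dir=$PWD         # directory the job was started from
    # JOB_ID is set by Grid Engine; fall back to the shell PID outside a job.
    local scratch="/tmp/${LOGNAME:-$(id -un)}/${JOB_ID:-$$}"

    mkdir -p "$scratch" || return 1
    (
        # The subshell keeps the trap and the cd local to this block.
        trap 'rm -rf "$scratch"' EXIT
        cp "$exe" "$scratch/" && cd "$scratch" || exit 1
        "./$(basename "$exe")"
        status=$?
        # Copy any output* files back (assumed naming; adjust to your program).
        for f in output*; do
            [ -e "$f" ] && cp "$f" "$start_dir/"
        done
        exit "$status"
    )
}
```

Inside a job script this would be called as run_in_scratch ./loop.x; the function returns the exit status of the executable, so Grid Engine still sees a failed run as failed.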

Below is a sample script for a parallel MPI job:

#!/bin/bash
#
#---------------- SHORT COMMENT ----------------------------------------
# Template script for parallel MPI jobs to run on mphase Grid Engine cluster.
# Modify it for your case and submit to CODINE with
# command "qsub mpi_run.sh".
# You may want to modify the parameters for
# "-N" (job name), "-pe" (parallel environment and number of requested CPUs),
# "myjob" (your compiled executable).
# You can compile your code, for example myjob.c (*.f), with GNU mpicc or
# mpif77 compilers as follows:
# "mpicc -o myjob myjob.c" or "mpif77 -o myjob myjob.f"
# You can monitor your jobs with command
# "qstat -u your_username" or "qstat -f" to see all queues.
# To remove your job, run "qdel job_id"
# To kill a running job, use "qdel -f job_id"
# ------Attention: #$ is a special CODINE symbol, not a comment -----
#
# The name, which will identify your job in the queue system
#$ -N MPI_Job
#
# Parallel environment request, mpich. You can specify the number of
# requested CPUs, for example, from 2 to 3
#$ -pe mpich 2-3
#
# ---------------------------
#$ -cwd
#$ -o MPI-stdo.output
#$ -e MPI-stderr.output
#$ -v MPIR_HOME=/usr/local/mpich-1.2.5
# ---------------------------
echo "Got $NSLOTS slots."
# Put the name of your compiled MPI file, for example, "cpi"
myjob=cpi
# Don't modify the line below if you don't know what it is
"$MPIR_HOME/bin/mpirun" -np "$NSLOTS" -machinefile "$TMPDIR/machines" "$myjob"
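
The mpirun line in the parallel template depends on two values that Grid Engine supplies at run time: $NSLOTS and the $TMPDIR/machines file. A small check before the launch can make a misconfigured parallel environment easier to diagnose. The sketch below is not part of the original template; check_machinefile is a hypothetical name, and it assumes the machine file contains one host name per granted slot, which is how the mpich parallel environment start script usually writes it but should be verified on your cluster.

```shell
#!/bin/bash
# Hypothetical pre-launch check, not part of the original template.
# Assumption: the machine file holds one host name per granted slot.
check_machinefile() {
    local machinefile=$1         # e.g. $TMPDIR/machines
    local slots=$2               # e.g. $NSLOTS
    if [ ! -r "$machinefile" ]; then
        echo "machine file $machinefile is missing or unreadable" >&2
        return 1
    fi
    local lines
    # Count non-empty lines; "|| true" keeps a zero count from aborting
    # scripts that run with "set -e".
    lines=$(grep -c . "$machinefile" || true)
    if [ "$lines" -ne "$slots" ]; then
        echo "expected $slots hosts, found $lines in $machinefile" >&2
        return 1
    fi
    echo "machine file OK: $lines slots"
}
```

In the job script it could guard the launch, for example: check_machinefile "$TMPDIR/machines" "$NSLOTS" || exit 1, placed just before the mpirun line.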