SLURM/SGE command equivalents
User commands
Description |
Slurm command |
SGE command |
Interactive login |
# srun --pty bash |
# qlogin |
|
# srun -p "part_name" --pty bash |
|
|
# sdev |
|
Job submission |
# sbatch [script file] |
# qsub [script file] |
Job deletion |
# scancel [job_ID] |
# qdel [job_ID] |
Job status all |
# squeue --all |
# qstat -f |
Job status |
# squeue -j [job_ID] |
# qstat -u \* [-j job_ID] |
Job user status |
# squeue -u [user name] |
# qstat [-u user name] |
Job hold |
# scontrol hold [job_ID] |
# qhold [job_ID] |
Job release |
# scontrol release [job_ID] |
# qrls [job_ID] |
Queue list |
# scontrol show partition |
# qconf -sql |
Node list |
# sinfo -N |
# qhost |
|
# scontrol show nodes |
# qhost |
Cluster status |
# sinfo |
# qhost -q |
GUI |
# sview |
# qmon |
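The submission row of the table above can be wrapped in a small helper for sites that run both schedulers during a migration. This is a minimal sketch, not a site tool: the function names `detect_scheduler` and `submit_cmd` are hypothetical, and it assumes that finding `qsub` on the PATH means SGE (other schedulers such as PBS also ship a `qsub`).

```shell
#!/bin/bash
# Report which scheduler's submit command is available on PATH.
# Assumption: a "qsub" on PATH belongs to SGE.
detect_scheduler() {
    if command -v sbatch >/dev/null 2>&1; then
        echo slurm
    elif command -v qsub >/dev/null 2>&1; then
        echo sge
    else
        echo none
    fi
}

# Print (do not run) the submit command for a job script.
submit_cmd() {
    case "$(detect_scheduler)" in
        slurm) echo "sbatch $1" ;;
        sge)   echo "qsub $1" ;;
        *)     echo "no scheduler found on PATH" >&2; return 1 ;;
    esac
}
```

Printing the command rather than running it keeps the helper safe to try on a login node; piping the output to `sh` would perform the actual submission.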
Admin commands
Description |
Slurm command |
SGE command |
Version |
# sinfo --version |
# qstat -help |
Disable a node |
# scontrol update nodename=<node> state=drain reason="<reason>" |
# qmod -d <queue>@<node> |
in all queues |
# scontrol update nodename=<node> state=drain reason="<reason>" (a node's state applies to every partition) |
# qmod -d \*@<node> |
Enable a node |
# scontrol update nodename=<node> state=resume |
# qmod -e <queue>@<node> |
in all queues |
# scontrol update nodename=<node> state=resume (a node's state applies to every partition) |
# qmod -e \*@<node> |
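When taking several nodes in or out of service, it can help to preview the exact command for each scheduler before running it. The sketch below only prints commands. The function name `node_cmd` and the default reason `maintenance` are hypothetical; the Slurm branch uses `state=drain` with a `reason`, which is the form documented for `scontrol update`.

```shell
#!/bin/bash
# Print (do not run) the command that disables or enables a node.
# Usage: node_cmd <slurm|sge> <disable|enable> <node> [reason]
node_cmd() {
    local sched="$1" action="$2" node="$3" reason="${4:-maintenance}"
    case "$sched:$action" in
        slurm:disable)
            printf 'scontrol update nodename=%s state=drain reason="%s"\n' "$node" "$reason" ;;
        slurm:enable)
            printf 'scontrol update nodename=%s state=resume\n' "$node" ;;
        sge:disable)
            # '*@node' targets the node's instances in every queue
            printf "qmod -d '*@%s'\n" "$node" ;;
        sge:enable)
            printf "qmod -e '*@%s'\n" "$node" ;;
        *)
            echo "usage: node_cmd <slurm|sge> <disable|enable> <node> [reason]" >&2
            return 1 ;;
    esac
}
```

For example, `node_cmd slurm disable node01 "disk failure"` prints the drain command for `node01` without touching the cluster.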
Environment variables
Description |
Slurm variable |
SGE variable |
Job ID |
$SLURM_JOBID |
$JOB_ID |
Submit Directory |
$SLURM_SUBMIT_DIR |
$SGE_O_WORKDIR |
Submit Host |
$SLURM_SUBMIT_HOST |
$SGE_O_HOST |
Node List |
$SLURM_JOB_NODELIST |
$PE_HOSTFILE |
Job Array Index |
$SLURM_ARRAY_TASK_ID |
$SGE_TASK_ID |
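The variable pairs in the table above make it possible to write job bodies that run under either scheduler. The following is a minimal sketch with hypothetical function names (`job_id`, `array_index`, `node_list`). It relies on two points beyond the table: SGE sets `SGE_TASK_ID` to the literal string `undefined` outside array jobs, and `$PE_HOSTFILE` is the path of a file whose first column holds host names, whereas `$SLURM_JOB_NODELIST` is a compact list that `scontrol show hostnames` expands.

```shell
#!/bin/bash
# Job ID: Slurm variable first, then the SGE one.
job_id() {
    echo "${SLURM_JOBID:-${JOB_ID:-}}"
}

# Array index, empty when the job is not an array task.
array_index() {
    local idx="${SLURM_ARRAY_TASK_ID:-${SGE_TASK_ID:-}}"
    # SGE uses the literal string "undefined" for non-array jobs
    if [ "$idx" = "undefined" ]; then
        idx=""
    fi
    echo "$idx"
}

# One host name per line.
node_list() {
    if [ -n "${SLURM_JOB_NODELIST:-}" ]; then
        # Expand the compact Slurm list (e.g. node[01-02]) into host names
        scontrol show hostnames "$SLURM_JOB_NODELIST"
    elif [ -n "${PE_HOSTFILE:-}" ]; then
        # PE_HOSTFILE is a file path; the first column is the host name
        awk '{print $1}' "$PE_HOSTFILE"
    fi
}
```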
Job script parameters
Description |
Slurm parameter |
SGE parameter |
Script directive |
#SBATCH |
#$ |
Queue |
-p [queue] |
-q [queue] |
Node Count |
-N [min[-max]] |
X |
CPU count |
-n [count] |
-pe [PE] [count] |
Wall Clock Limit |
-t [min] |
-l h_rt=[seconds] |
|
-t [days-hh:mm:ss] |
-l h_rt=[seconds] |
Standard Output |
-o [file name] |
-o [file name] |
Standard Error |
-e [file name] |
-e [file name] |
Combined stdout/stderr |
-o [file name] (without -e) |
-j yes |
Copy Environment |
--export=[ALL/NONE/var] |
-V |
Event Notification |
--mail-type=[events] |
-m abe |
Email address |
--mail-user=[address] |
-M [address] |
Job Name |
--job-name=[name] |
-N [name] |
Job Restart |
--requeue |
-r [yes/no] |
|
--no-requeue |
-r [yes/no] |
Working Directory |
--chdir=[dir_name] (formerly --workdir) |
-wd [dir_name] |
Resource Sharing |
--exclusive |
-l exclusive |
|
--oversubscribe (formerly --share) |
-l exclusive |
Memory Size |
--mem=[mem][M/G/T] |
-l mem_free=[memory][K/M/G] |
|
--mem-per-cpu=[mem][M/G/T] |
-l mem_free=[memory][K/M/G] |
Account to charge |
--account=[account] |
-A [account] |
Tasks Per Node |
--ntasks-per-node=[count] |
(Fixed allocation_rule in PE) |
CPUs Per Task |
--cpus-per-task=[count] |
X |
Job Dependency |
--dependency=[state:job_ID] |
-hold_jid [job_ID/job_name] |
Job Project |
--wckey=[name] |
-P [name] |
Job host preference |
--nodelist=[nodes] |
-q [queue]@[node] |
|
--exclude=[nodes] |
-q [queue]@@[hostgroup] |
Quality of Service |
--qos=[name] |
X |
Job Arrays |
--array=[array_spec] |
-t [array_spec] |
Generic Resources |
--gres=[resource_spec] |
-l [resource]=[value] |
Licenses |
--licenses=[license_spec] |
-l [license]=[count] |
Begin Time |
--begin=YYYY-MM-DD[THH:MM[:SS]] |
-a [YYMMDDhhmm] |
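The Wall Clock Limit rows above pair a Slurm time specification with an SGE value in seconds, so converting a script means converting the time. Below is a minimal sketch of that arithmetic. The function name `slurm_time_to_seconds` is hypothetical, and it handles only the two forms listed in the table (`min` and `[days-]hh:mm:ss`), not the other formats Slurm accepts.

```shell
#!/bin/bash
# Convert a Slurm time limit to seconds for SGE's "-l h_rt=[seconds]".
# Supported forms: "min" and "[days-]hh:mm:ss".
slurm_time_to_seconds() {
    local spec="$1" days=0 h=0 m=0 s=0
    if [[ "$spec" =~ ^[0-9]+$ ]]; then
        # A bare number is a count of minutes
        m="$spec"
    elif [[ "$spec" =~ ^(([0-9]+)-)?([0-9]+):([0-9]+):([0-9]+)$ ]]; then
        days="${BASH_REMATCH[2]:-0}"
        h="${BASH_REMATCH[3]}"
        m="${BASH_REMATCH[4]}"
        s="${BASH_REMATCH[5]}"
    else
        echo "unsupported time format: $spec" >&2
        return 1
    fi
    # The 10# prefix forces base 10 so that "08" is not read as octal
    echo $(( 10#$days * 86400 + 10#$h * 3600 + 10#$m * 60 + 10#$s ))
}
```

For instance, the `5:0:0` limit used in the example scripts corresponds to `-l h_rt=18000`.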
Examples
Scripts
Single-core application
Note: under Slurm, a job should not use more than 3 GB of memory.
Slurm single-core script |
SGE single-core script |
#!/bin/bash -l
# NOTE the -l flag
#
#SBATCH -J test
#SBATCH -e test.output
#SBATCH -o test.output
# Running in the submit directory (SGE -cwd) is the default in Slurm
#SBATCH --mail-user my@mail.fr
#SBATCH --mail-type=ALL
# Request 5 hours run time
#SBATCH -t 5:0:0
#SBATCH -A your_project_id_here
#
#SBATCH -p core -n 1
# call your app here
|
|
#!/bin/bash
#
#$ -N test
#$ -j y
#$ -o test.output
#$ -cwd
#$ -M my@mail.fr
#$ -m bea
# Request 5 hours run time
#$ -l h_rt=5:0:0
#$ -P your_project_id_here
#
#$ -l mem=4G
# call your app here
|
|
MPI application
Slurm script |
SGE script |
#!/bin/bash -l
# NOTE the -l flag!
#
#SBATCH -J test
#SBATCH -o test.output
#SBATCH -e test.output
# Running in the submit directory (SGE -cwd) is the default in Slurm
#SBATCH --mail-user my@mail.fr
#SBATCH --mail-type=ALL
# Request 5 hours run time
#SBATCH -t 5:0:0
#SBATCH -A your_project_id_here
#SBATCH --mem=4000
#SBATCH -p normal
# call your app here
|
|
#!/bin/bash
#
#$ -N test
#$ -j y
#$ -o test.output
#$ -cwd
#$ -M my@mail.fr
#$ -m bea
# Request 5 hours run time
#$ -l h_rt=5:0:0
#$ -P your_project_id_here
#
#$ -l mem=4G
# call your app here
|
|
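Because `sbatch` reads only lines starting with `#SBATCH` and `qsub` reads only lines starting with `#$`, the two headers above can in principle live in one file, with each scheduler treating the other's directives as ordinary comments. The sketch below combines a subset of the options from the example scripts. The function `scheduler_name` is hypothetical and simply checks which job ID variable is set.

```shell
#!/bin/bash -l
# Dual-scheduler header: each scheduler ignores the other's directives.
#SBATCH -J test
#SBATCH -o test.output
#SBATCH -t 5:0:0
#$ -N test
#$ -j y
#$ -o test.output
#$ -cwd
#$ -l h_rt=5:0:0

# Identify the scheduler from the job ID variable it sets.
scheduler_name() {
    if [ -n "${SLURM_JOBID:-}" ]; then
        echo slurm
    elif [ -n "${JOB_ID:-}" ]; then
        echo sge
    else
        echo none
    fi
}

echo "scheduler: $(scheduler_name)"
# call your app here
```

Run directly with `bash`, outside any scheduler, the script reports `scheduler: none`, which makes it easy to check the body before submitting.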
Documentation
https://docs.hpc.shef.ac.uk/en/latest/referenceinfo/scheduler/SGE/sge_environment_variables.html
https://www.uppmax.uu.se/support/user-guides/sge-vs-slurm-comparison/
PDF: SGEtoSLURMconversion.pdf
PDF: scheduler_commands_cheatsheet-2020-ally.pdf