SLURM/SGE command equivalences
User commands
| Description | Slurm command | SGE command |
|---|---|---|
| Interactive login | # srun --pty bash | # qlogin |
| | # srun -p "part_name" --pty bash | |
| | # sdev | |
| Job submission | # sbatch [script file] | # qsub [script file] |
| Job deletion | # scancel [job_ID] | # qdel [job_ID] |
| Job status (all jobs) | # squeue --all | # qstat -f |
| Job status (one job) | # squeue -j [job_ID] | # qstat -u \* [-j job_ID] |
| Job status (one user) | # squeue -u [user name] | # qstat [-u user name] |
| Job hold | # scontrol hold [job_ID] | # qhold [job_ID] |
| Job release | # scontrol release [job_ID] | # qrls [job_ID] |
| Queue list | # squeue | # qconf -sql |
| Node list | # sinfo -N | # qhost |
| | # scontrol show nodes | # qhost |
| Cluster status | # sinfo | # qhost -q |
| GUI | # sview | # qmon |
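
As a rough illustration of the table above, the mapping can be wrapped in a small shell helper so that site scripts do not hard-code one scheduler. The function name `job_cmd` and its action keywords are hypothetical, chosen for this sketch; only the printed command names come from the table.

```shell
#!/bin/bash
# Sketch only: job_cmd and its action keywords are hypothetical helpers;
# the printed command names are the ones listed in the table.
# Usage: job_cmd <slurm|sge> <submit|delete|hold|release>
job_cmd() {
    local scheduler="$1" action="$2"
    case "${scheduler}:${action}" in
        slurm:submit)  echo "sbatch" ;;
        slurm:delete)  echo "scancel" ;;
        slurm:hold)    echo "scontrol hold" ;;
        slurm:release) echo "scontrol release" ;;
        sge:submit)    echo "qsub" ;;
        sge:delete)    echo "qdel" ;;
        sge:hold)      echo "qhold" ;;
        sge:release)   echo "qrls" ;;
        *) echo "unsupported: ${scheduler} ${action}" >&2; return 1 ;;
    esac
}

# Example: print the command that would cancel job 12345 under each scheduler.
echo "$(job_cmd slurm delete) 12345"
echo "$(job_cmd sge delete) 12345"
```

The helper only prints the command instead of running it, which keeps it safe to try on a machine where neither scheduler is installed.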
Admin commands
| Description | Slurm command | SGE command |
|---|---|---|
| Version | # sinfo --version | # qstat -help |
| Disable a node | # scontrol update nodename=<node> state=drain reason="<reason>" | # qmod -d <queue>@<node> |
| Disable a node in all queues | same command (in Slurm the node state applies to every partition that contains the node) | # qmod -d \*@<node> |
| Enable a node | # scontrol update nodename=<node> state=resume | # qmod -e <queue>@<node> |
| Enable a node in all queues | same command | # qmod -e \*@<node> |
Environment variables
| Description | Slurm variable | SGE variable |
|---|---|---|
| Job ID | $SLURM_JOBID | $JOB_ID |
| Submit directory | $SLURM_SUBMIT_DIR | $SGE_O_WORKDIR |
| Submit host | $SLURM_SUBMIT_HOST | $SGE_O_HOST |
| Node list | $SLURM_JOB_NODELIST | $PE_HOSTFILE |
| Job array index | $SLURM_ARRAY_TASK_ID | $SGE_TASK_ID |
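
A job script that must run under both schedulers can read these variables through shell parameter-expansion defaults. The sketch below is one possible approach, not a site standard: the `portable_*` function names and the fallback values are hypothetical. The node-list pair is left out on purpose, since `$PE_HOSTFILE` is the path of a file while `$SLURM_JOB_NODELIST` holds a host list expression, so the two cannot be swapped directly.

```shell
#!/bin/bash
# Sketch only: the portable_* function names and fallback values are hypothetical.
# Each helper tries the Slurm variable first, then the SGE one, then a fallback.
portable_job_id()      { echo "${SLURM_JOBID:-${JOB_ID:-none}}"; }
portable_submit_dir()  { echo "${SLURM_SUBMIT_DIR:-${SGE_O_WORKDIR:-$PWD}}"; }
portable_submit_host() { echo "${SLURM_SUBMIT_HOST:-${SGE_O_HOST:-unknown}}"; }
portable_task_id()     { echo "${SLURM_ARRAY_TASK_ID:-${SGE_TASK_ID:-none}}"; }

# Example: log where the job was submitted from.
echo "job=$(portable_job_id) task=$(portable_task_id)"
echo "host=$(portable_submit_host) dir=$(portable_submit_dir)"
```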
Job script parameters
| Description | Slurm parameter | SGE parameter |
|---|---|---|
| Script directive | #SBATCH | #$ |
| Queue | -p [queue] | -q [queue] |
| Node count | -N [min[-max]] | N/A |
| CPU count | -n [count] | -pe [PE] [count] |
| Wall clock limit | -t [min] | -l h_rt=[seconds] |
| | -t [days-hh:mm:ss] | -l h_rt=[seconds] |
| Standard output | -o [file name] | -o [file name] |
| Standard error | -e [file name] | -e [file name] |
| Combined stdout/stderr | -o [file name] (without -e) | -j yes |
| Copy environment | --export=[ALL/NONE/var] | -V |
| Event notification | --mail-type=[events] | -m abe |
| Email address | --mail-user=[address] | -M [address] |
| Job name | --job-name=[name] | -N [name] |
| Job restart | --requeue | -r yes |
| | --no-requeue | -r no |
| Working directory | --chdir=[dir_name] (--workdir on older Slurm versions) | -wd [dir_name] |
| Resource sharing | --exclusive or --oversubscribe (formerly --share) | -l exclusive |
| Memory size | --mem=[mem][M/G/T] | -l mem_free=[memory][K/M/G] |
| | --mem-per-cpu=[mem][M/G/T] | -l mem_free=[memory][K/M/G] |
| Account to charge | --account=[account] | -A [account] |
| Tasks per node | --ntasks-per-node=[count] | (fixed allocation_rule in PE) |
| CPUs per task | --cpus-per-task=[count] | N/A |
| Job dependency | --dependency=[state:job_ID] | -hold_jid [job_ID/job_name] |
| Job project | --wckey=[name] | -P [name] |
| Job host preference | --nodelist=[nodes] | -q [queue]@[node] |
| | optional: --exclude=[nodes] | -q [queue]@@[hostgroup] |
| Quality of service | --qos=[name] | N/A |
| Job arrays | --array=[array_spec] | -t [array_spec] |
| Generic resources | --gres=[resource_spec] | -l [resource]=[value] |
| Licenses | --licenses=[license_spec] | -l [license]=[count] |
| Begin time | --begin=YYYY-MM-DD[THH:MM[:SS]] | -a [YYMMDDhhmm] |
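
One practical difference in the table is the wall clock limit: SGE takes `h_rt` in seconds, while Slurm's `-t` accepts the `days-hh:mm:ss` form. When porting scripts, the conversion could be sketched as follows. The function name `seconds_to_slurm_time` is hypothetical, and the input is assumed to be a non-negative integer without leading zeros.

```shell
#!/bin/bash
# Sketch only: seconds_to_slurm_time is a hypothetical helper name.
# Convert an SGE h_rt value (seconds) into Slurm's days-hh:mm:ss form.
seconds_to_slurm_time() {
    local total="$1"
    local days=$(( total / 86400 ))
    local hours=$(( (total % 86400) / 3600 ))
    local minutes=$(( (total % 3600) / 60 ))
    local seconds=$(( total % 60 ))
    printf '%d-%02d:%02d:%02d\n' "$days" "$hours" "$minutes" "$seconds"
}

# Example: "-l h_rt=90061" becomes "-t 1-01:01:01".
echo "-t $(seconds_to_slurm_time 90061)"
```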
Examples
Scripts
Single-core application
Note: under Slurm, a job should not use more than 3G of memory.
| Slurm single-core script | SGE single-core script |
|---|---|
| | |
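
The original scripts for this example are not reproduced on this page. As a stand-in, here is a minimal sketch of what a single-core Slurm script could look like, built only from the parameters listed in the tables above, with the SGE equivalent of each directive shown as a comment. The job name, time limit, output file, and application path are hypothetical; the 3G memory request follows the note above.

```shell
#!/bin/bash
# Sketch only: job name, limits, and the application path are hypothetical.
# SGE: #$ -N single_core
#SBATCH --job-name=single_core
# One task on one core (SGE: a job without -pe is already single-core).
#SBATCH --ntasks=1
# SGE: #$ -l mem_free=3G
#SBATCH --mem=3G
# SGE: #$ -l h_rt=3600
#SBATCH --time=01:00:00
# SGE: #$ -o single_core.out
#SBATCH --output=single_core.out

APP="./my_program"   # hypothetical application path

echo "Job ${SLURM_JOBID:-none} started in ${SLURM_SUBMIT_DIR:-$PWD}"

# Run the application only if it exists, so the sketch is safe to dry-run.
if [ -x "$APP" ]; then
    "$APP"
else
    echo "application not found: $APP" >&2
fi
```

The script would be submitted with `sbatch [script file]`; run directly in a shell, the `#SBATCH` lines are plain comments and the variables fall back to their defaults.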
MPI application
| Slurm script | SGE script |
|---|---|
| | |
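
The original MPI scripts are not reproduced on this page either. The following is a minimal sketch of a Slurm MPI job under stated assumptions: the job name, task count, time limit, and executable are hypothetical, and the MPI library is assumed to be built with Slurm support so that `srun` can launch the processes (otherwise the library's own `mpirun` would be used instead). The SGE equivalents appear as comments.

```shell
#!/bin/bash
# Sketch only: job name, task count, limits, and the executable are hypothetical.
# SGE: #$ -N mpi_job
#SBATCH --job-name=mpi_job
# SGE: #$ -pe [PE] 4   (the parallel environment name is site-specific)
#SBATCH --ntasks=4
# SGE: #$ -l h_rt=7200
#SBATCH --time=02:00:00
# SGE: #$ -o mpi_job.out
#SBATCH --output=mpi_job.out

APP="./my_mpi_app"   # hypothetical MPI executable

# Slurm sets SLURM_NTASKS inside the job; default to 1 outside Slurm.
ntasks="${SLURM_NTASKS:-1}"
echo "Launching ${APP} on ${ntasks} task(s)"

# Launch only if the executable exists, so the sketch is safe to dry-run.
if [ -x "$APP" ]; then
    # srun starts one process per allocated task.
    srun "$APP"
else
    echo "application not found: $APP" >&2
fi
```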
Documentation
https://docs.hpc.shef.ac.uk/en/latest/referenceinfo/scheduler/SGE/sge_environment_variables.html
https://www.uppmax.uu.se/support/user-guides/sge-vs-slurm-comparison/
PDF: SGEtoSLURMconversion.pdf
PDF: scheduler_commands_cheatsheet-2020-ally.pdf