Dec 18, 2020 05:12 PM | Alfonso Nieto-Castanon - Boston University
RE: HPC - SLURM - Parallelisation with conn
Dear Sophie,
It seems there are two separate issues here:
1) You may have set your default HPC profile in CONN to "LSF computer cluster", while from your description your cluster environment is likely using SLURM instead. To fix this, go to CONN's GUI menu 'Tools. HPC options. Configuration', select the profile named 'Slurm computer cluster', check the 'default profile' checkbox, and click 'Save' (all) to save that configuration (you can do the same from the MATLAB command line with "conn_jobmanager setdefault Slurm saveall"). After doing that, the submitted jobs should run just fine. Alternatively, you may specify the desired profile explicitly within your Lesion*.m file by adding the line "BATCH.parallel.profile='Slurm computer cluster';" there.
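The command-line route above can be sketched as follows (a minimal sketch using the command given in this post; the function-call form is equivalent to MATLAB's command syntax):

```matlab
% Make the SLURM profile the default and save the configuration
% for all projects -- equivalent to the GUI steps under
% 'Tools. HPC options. Configuration':
conn_jobmanager('setdefault', 'Slurm', 'saveall');
```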
And 2) there is really no need to run your LesionNetworkMappingSoso_SCITAS_17122020v3.m script remotely: you may simply run it locally from any VNC MATLAB session, since the script just submits one or more jobs to your cluster and then, optionally, waits until they finish. The second error you are observing arises because CONN attempts to display the status of those submitted jobs but fails to do so, because the submitting MATLAB session was started with the -nojvm flag (so no windows are associated with it). In any case, adding the line "BATCH.parallel.immediatereturn=true;" to your Lesion*.m script, right below the other BATCH.parallel line, should fix this, since CONN will then not attempt to show that status display. Alternatively, running your Lesion*.m script directly from a VNC MATLAB session should also work just fine (without needing to add the 'immediatereturn' field to the script).
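Putting both fixes together, the relevant fragment of a Lesion*.m batch script might look like this (a sketch; the rest of the BATCH structure is whatever your script already defines, and the final conn_batch call is assumed to be how the script submits its jobs):

```matlab
% ... existing BATCH setup in Lesion*.m ...

% Next to your existing BATCH.parallel settings:
BATCH.parallel.profile = 'Slurm computer cluster';  % submit via the SLURM profile rather than the LSF default
BATCH.parallel.immediatereturn = true;              % return right after submission; skip the status display
                                                    % (needed when MATLAB runs with -nojvm, i.e. no windows)

conn_batch(BATCH);  % submit the jobs to the cluster
```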
Hope this helps
Alfonso
Originally posted by sophieb:
Dear Alfonso,
please find the requested zipped file enclosed,
Best,
s.