Dec 19, 2020  09:12 PM | Alfonso Nieto-Castanon - Boston University
RE: HPC - SLURM - Parralellisation with conn
Hi Sophie,

If the job stopped for no apparent reason, the most likely cause is that the job scheduler in your cluster environment killed it because it exceeded its allocated resources (typically either the allocated time or the allocated memory). To fix that you simply need to request more memory and a longer time limit for your jobs. You can do that in CONN's GUI, in the 'tools. HPC options. Configuration' menu: select the profile named 'Slurm computer cluster', and then enter the following text in the "in-line additional submit options" box:

-t 12:00:00 --mem=8Gb

(the above line requests 12 hours and 8Gb per job; that should typically suffice, but feel free to adjust those values if needed)
Then click 'save' (all) to save that configuration for future jobs as well. If you prefer not to use the GUI, you can do the same thing from Matlab's command line using the command:

conn_jobmanager options cmd_submitoptions '-t 12:00:00 --mem=8Gb' saveall

Let me know if that fixes the issue (and if it does not, please give me more details about the cluster environment you are using, since many of these limitations and policies vary from place to place).
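
If it helps with gathering those details, Slurm's accounting record for a finished job usually shows which limit was hit. The snippet below is only a rough sketch: the function name suggest_fix is made up for illustration, the job ID is a placeholder, and it assumes your cluster has job accounting enabled and reports the standard TIMEOUT and OUT_OF_MEMORY states (some sites or older Slurm versions may report a memory kill as FAILED or CANCELLED instead).

```shell
# Hypothetical helper (not part of CONN or Slurm): map a Slurm job state
# to the submit option that most likely needs raising.
suggest_fix() {
  case "$1" in
    TIMEOUT*)       echo "raise the time limit (-t)" ;;
    OUT_OF_MEMORY*) echo "raise the memory request (--mem)" ;;
    COMPLETED*)     echo "job finished normally" ;;
    *)              echo "check the job's stdout/stderr logs" ;;
  esac
}

# Example usage on the cluster (replace 123456 with a real job ID):
#   state=$(sacct -j 123456 -X -n -o State | head -n 1 | tr -d ' ')
#   suggest_fix "$state"
```

Running sacct with -o JobID,State,Elapsed,Timelimit,ReqMem,MaxRSS (without -X, so that job steps are listed) on the same job ID also shows how close the job came to each limit, which can help in choosing new values for -t and --mem.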

Hope this helps
Alfonso

Originally posted by sophieb:
Hello,

I amended my script as suggested.
The parallelisation seems to have worked (4 .mat files, one for each participant, were produced, as I had 24 cores and 4 participants). However, the analyses stopped before completing, and I am not sure why. Please find the zipped file enclosed.
Thanks a lot,
Sophie
