Oct 1, 2018  10:10 PM | Paul Cernasov
Cluster / conn_jobmanager Error
Hello,

I'm a grad student and a new user of the CONN toolbox. I'm having trouble submitting jobs to a Slurm cluster through CONN's distributed processing. I first noticed a problem when I tried to preprocess data and the jobs consistently terminated after 1 hour.

The submission command line showed that CONN defaults to Grid Engine Computer Cluster, so I tried switching by going to "Tools > Cluster / HPC > Settings". The error message below appears immediately. None of the changes I make in the settings, such as making Slurm computer cluster my default, are ever saved or applied.

Any support would be greatly appreciated!! I have been stuck on this for quite some time.

Cheers,
Paul


=========================================================================

ERROR DESCRIPTION:

Undefined variable "CFG" or class "CFG.osquotes".

Error in conn_jobmanager>conn_jobmanager_checkdeployedname (line 1369)
isdep_callback=sprintf(isdep_callback{idx},[CFG.osquotes msg(msg>=32) CFG.osquotes]); % full-path to executable

Error in conn_jobmanager>conn_jobmanager_settings/conn_jobmanager_settings_update (line 954)
if isempty(tstr), tstr=conn_jobmanager_checkdeployedname; end

Error in conn_jobmanager>conn_jobmanager_settings (line 938)
conn_jobmanager_settings_update('refresh');

Error in conn_jobmanager (line 397)
[PROFILES,DEFAULT]=conn_jobmanager_settings(PROFILES,DEFAULT);

Error in conn (line 1144)
conn_jobmanager('settings');

Error in conn_menumanager (line 120)
feval(CONN_MM.MENU{n0}.callback{n1}{1},CONN_MM.MENU{n0}.callback{n1}{2:end});
CONN v.18.a
SPM12 + DEM FieldMap MEEGtools
Matlab v.2017b
storage: 2558.6Gb available

spm @ /nas/longleaf/home/cernasov/bin/spm12
conn @ /nas/longleaf/home/cernasov/bin/conn

===========================================================================
Oct 2, 2018  10:10 AM | Alfonso Nieto-Castanon - Boston University
RE: Cluster / conn_jobmanager Error
Hi Paul,

Thanks for reporting this issue. Could you please try the attached patch and let me know if it fixes the problem? (This patch is for version 18a; to install it, simply copy the attached file to your conn distribution folder, overwriting the file with the same name there.)

Regarding the jobs being terminated after 1 hour, that could be the result of the default time limit assigned to jobs on your cluster. I would suggest adding a "--time=..." flag to your Slurm submissions to change that (see for example this post for additional info on how to do this).
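As a rough illustration of what such a flag looks like on a Slurm submission line, here is a minimal Python sketch. It is not taken from CONN: the helper name build_sbatch_command and the script name run_conn_job.sh are hypothetical, and CONN assembles its own submission command from the Cluster / HPC settings. The sketch only checks that the value uses one of the time formats sbatch accepts and places it in a "--time=" option.

```python
import re
import shlex

# Time formats accepted by sbatch --time, per the Slurm documentation:
# "minutes", "minutes:seconds", "hours:minutes:seconds",
# "days-hours", "days-hours:minutes", "days-hours:minutes:seconds".
_TIME_PATTERN = re.compile(
    r"^(\d+|\d+:\d+|\d+:\d+:\d+|\d+-\d+|\d+-\d+:\d+|\d+-\d+:\d+:\d+)$"
)


def build_sbatch_command(script_path, time_limit="24:00:00"):
    """Return an sbatch argument list with an explicit wall-time limit.

    Hypothetical helper for illustration only; script_path is whatever
    job script the scheduler should run.
    """
    if not _TIME_PATTERN.match(time_limit):
        raise ValueError(f"unrecognized Slurm time format: {time_limit!r}")
    return ["sbatch", f"--time={time_limit}", script_path]


if __name__ == "__main__":
    # Print the command instead of running it, so this works off-cluster.
    cmd = build_sbatch_command("run_conn_job.sh")
    print(shlex.join(cmd))
```

Whether a requested limit is honored depends on the maximum your cluster's partition allows, so a value such as 24:00:00 may still need to be checked against local policy.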

Best
Alfonso
Attachment: conn_jobmanager.m
Oct 2, 2018  12:10 PM | Paul Cernasov
RE: Cluster / conn_jobmanager Error
Alfonso,

Thank you so much for sharing the patch so quickly! It looks like my issue is resolved. I can now save the default setting to Slurm and include the 24:00:00 time limit in the command.

Thank you again!!
Paul