Nov 20, 2013  07:11 AM | Debra Dawson
No progress and no error
Hi, several months ago I posted a reply in the "NIAK running really slowly" topic but didn't get any response, so I am trying a new topic.

I have run my script file, and all of the output directories are created along with all of the log files.  There are 4 files which are identified as being submitted in the status log.

After the above progress, which occurred within the first few minutes, there has been no change at all in two days, and there are no error messages related to the running of NIAK processes.  The only error I get is from my MATLAB session closing after about a day:  "Aborted (core dumped)".  My impression is that MATLAB closing isn't an issue because once the job has been submitted, it is run elsewhere.

Is this correct?
Why is nothing happening?

Below is my matlab script file which I ran on william.bic.mni.mcgill.ca, username ddawson.


%% Running NIAK
% If you are not seated in front of a BIC terminal use:
% 1)Connect to a BIC cluster terminal by ssh (e.g. ssh username@computer_name)
% 2)Run matlab on this terminal (matlab -nojvm -nodisplay)
% 3)Run this example below in matlab.


% path_niak = '/data/aces/aces1/pbellec/public/niak-edge/';
path_niak = '/data/aces/aces1/pbellec/public/niak-0.6.3/';
P = genpath(path_niak);
addpath(P);

%files_in=niak_files_in_rest_2;

rawdatapath='/export02/data/debbie/niakstuff/raw/';
niakoutpath='/export02/data/debbie/niakstuff/niak_out_Nov2013/';
storagepath='/export02/data/debbie/niakstuff/niak_out_storage/';
%mkdir(storagepath);

Subject(1).info={'AMB','beaulne_anne_marie', 'beaulne_anne_marie_20100719_093834', 2, [3:11], 'TRUE'};
Subject(2).info={'RG','graham_ross', 'graham_ross_20100924_095129', 3, [7:15], 'TRUE'};
Subject(3).info={'HH','hardie_heather', 'hardie_heather_20100830_145905', 3, [4:11], 'TRUE'};
Subject(4).info={'LM','mccabe_lorraine', 'mccabe_lorraine_20100830_124751', 3, [4:11], 'TRUE'};
Subject(5).info={'HM','monroe_hollis', 'monroe_hollis_20100719_115142', 2, [3:11], 'TRUE'};
Subject(6).info={'AS','stanhope_alexis', 'stanhope_alexis_20100726_133710', 2, [4:11], 'TRUE'};
Subject(7).info={'ZY','yao_zeshan', 'yao_zeshan_20101015_145702', 2, [6:14], 'TRUE'};

infoList=cat(1,Subject(:).info);
%nb_subject = sum(strcmp('TRUE',cat(1,infoList(:,6))));
subjIdxList = find(strcmp('TRUE',cat(1,infoList(:,6))));

nSubject=length(subjIdxList);
for iS=[1 4 5 6 7]%:nSubject
    files_in=[];
   
    subjIdx=subjIdxList(iS);   
    subjFieldName=sprintf('subject%d',subjIdx);
    subjDataPath=[rawdatapath '/' Subject(subjIdx).info{3} '/'];
    subjOutPath= [niakoutpath '/subject' num2str(iS) '/'];
    subjStoragePath = [storagepath '/subject' num2str(iS) '/'];
   
    % anatomy file   
    anatFileIdx=Subject(subjIdx).info{4};
    files_in.(subjFieldName).anat = ...
        [subjDataPath '/' Subject(subjIdx).info{3} '_' num2str(anatFileIdx) '_mri.mnc.gz'];
   
    % functional files
    boldFileIdxList=Subject(subjIdx).info{5};
    nRuns=length(boldFileIdxList);
    for iR=1:nRuns
        boldFileIdx=boldFileIdxList(iR);
        files_in.(subjFieldName).fmri.session1{iR} = ...
            [subjDataPath '/' Subject(subjIdx).info{3} '_' num2str(boldFileIdx) '_mri.mnc.gz'];
    end

 
    %% Building the optional inputs
    opt.style = 'standard-native'; %% The fMRI data will be analyzed in native space ('standard-native' style)
    %opt.folder_out = [path_data 'fmri_preprocess' filesep];
    % opt.folder_out = '/export02/data/kuwook/resting_long/niak/';%'/data/shmuel/shmuel1/kuwook_new/niak_mendola_normal/';
    opt.folder_out = subjOutPath;
    opt.size_output = 'quality_control';
    opt.flag_corsica = 1;

    %%%%%%%%%%%%%%%%%%%%
    %% Bricks options %%
    %%%%%%%%%%%%%%%%%%%%

    %% 1. Linear and non-linear fit of the anatomical image in the stereotaxic
    %% space (niak_brick_civet)
    opt.bricks.civet.n3_distance = 25; % Parameter for non-uniformity correction.
    %200 is a suggested value for 1.5T images, 25 for 3T images.
    %If you find that this stage did not work well, this parameter is usually critical to improve the results.

    %% 2. Motion correction (niak_brick_motion_correction)
    opt.bricks.motion_correction.suppress_slice = 1;           % Set the first and last slice to zero after interpolation
    opt.bricks.motion_correction.suppress_vol = 3;             % Remove the first three dummy scans
    opt.bricks.motion_correction.vol_ref = 'median';           % Use the median of the run as a reference
    opt.bricks.motion_correction.flag_session = 0;             % Correct for both within and between sessions motion
    opt.bricks.motion_correction.flag_run = true;              % Consider each run as a separate session

    opt.bricks.mask_brain.thresh = 0.5;

    %% 3. Coregistration between T1 and T2 (niak_brick_coregister)

    %% 4. Concatenation of the T2-to-T1 and T1-to-stereotaxic-nl
    %%    transformations. (niak_brick_concat_transf)

    %% 5. Slice timing correction (niak_brick_slice_timing)
    TR = 2; % Repetition time in seconds
    nb_slices = 30; % Number of slices in a volume
    opt.bricks.slice_timing.slice_order = [2:2:nb_slices 1:2:nb_slices];
    opt.bricks.slice_timing.timing(1)=TR/nb_slices; % Time between slices
    opt.bricks.slice_timing.timing(2)=TR/nb_slices; % Time between the last slice of a volume and the first slice of next volume (here there is no delay in TR)

    %% 6. Temporal filtering (niak_brick_time_filter)
    opt.bricks.time_filter.hp = 0.01; % Apply a high-pass filter at cut-off frequency 0.01Hz (slow time drifts)
    %opt.bricks.time_filter.lp = 0.1; % Do not apply low-pass filter. Low-pass filtering induces a big loss in degrees of freedom without significantly improving the SNR.

    %% 7. Correction of physiological noise (niak_pipeline_corsica)
    opt.bricks.sica.nb_comp = 50;
    opt.bricks.component_supp.threshold = 0.15;

    %% 8. Resampling in the stereotaxic space (niak_brick_resample_vol)
    %opt.bricks.resample_vol.voxel_size = -1;

    %% 9. Spatial smoothing (niak_brick_smooth_vol)
    opt.bricks.smooth_vol.fwhm = 0; % FWHM of 0 mm: no spatial smoothing is applied.

    %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
    %% Generation of the pipeline %%
    %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
    %opt.flag_test = 1;
    % opt.path_folder = opt.folder_out;
    % opt.path_logs = [opt.folder_out,'logs/'];
%     opt.psom.mode = 'qsub'; %qsub
%     opt.psom.mode_pipeline_manager = 'batch';
%     opt.psom.max_queued = 4;

    pipeline = niak_pipeline_fmri_preprocess(files_in,opt);
   
    %%%
    % move file
    mkdir(subjStoragePath);
    fprintf(1,'Copying files (subj #%d)...\n',iS);
    bSuccess=copyfile([subjOutPath '/*'],subjStoragePath);
    if bSuccess
        fprintf(1,'\tDeleting original files...');
        bSuccess=rmdir([subjOutPath],'s');
        if ~bSuccess;error('Original files not removed');end
    end
    fprintf(1,'Done.\n');
   
end
Nov 20, 2013  08:11 AM | Mike Ferreira
RE: No progress and no error
Hi Debra,

Your output paths look like they are local to your machine (i.e. /export02/...). These local directories are not usually accessible to your job if it is running elsewhere on the batch queue. Try running your script with different input and output paths that are mounted on the BIC network (e.g. /data/...).
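
One way to check this is to ask the operating system what kind of filesystem each path sits on, both on the machine where MATLAB runs and on a node that executes the jobs. The following is a minimal shell sketch, assuming GNU coreutils `stat` is available. The helper names `is_network_type` and `report_paths` and the list of network filesystem types are illustrative assumptions, not part of NIAK or PSOM.

```shell
# Sketch: report whether a directory lives on a network or a local filesystem.
# Assumes GNU coreutils `stat`; the list of "network" types is illustrative only.

# Succeeds (exit 0) if the given filesystem type name looks like a network mount.
is_network_type() {
    case "$1" in
        nfs*|cifs|smb*|lustre|gpfs|afs|fuse.sshfs) return 0 ;;
        *) return 1 ;;
    esac
}

# Print "<path>: <type> (network|local)" for each path given as an argument.
report_paths() {
    local p t
    for p in "$@"; do
        # `stat -f -c %T` prints the filesystem type in human-readable form.
        t=$(stat -f -c %T "$p" 2>/dev/null) || {
            echo "$p: not accessible from ${HOSTNAME:-this host}"
            continue
        }
        if is_network_type "$t"; then
            echo "$p: $t (network)"
        else
            echo "$p: $t (local)"
        fi
    done
}
```

For example, `report_paths /export02/data /data/aces` could be run on the submit host and again on a compute node. A path that shows up as local on one machine, or as not accessible on the other, is a likely candidate for the problem described above.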

Regards,

mike
Nov 21, 2013  09:11 AM | Sebastian Urchs
RE: No progress and no error
Hi,

Although this may not be the cause of your problem, I experienced something similar recently. The reason in my case was that the queue of the computation cluster was full, causing a large backlog of submitted processes. It took almost 24 hours for my submitted jobs to actually start running. You can check the status of your jobs with
qstat -u yourusername
If the second last column ('S') says R then your job is running currently. Otherwise you can look for the name of the queue you submitted to in the third column ('Queue') and then enter the following command:
qstat -Q nameofthequeue
There you should look for the columns 'Que' and 'Run' to get an idea of how many jobs are waiting to be executed. If you don't want to keep running these commands over and over, just add 'watch ' in front to have a continuous report.
(watch qstat -u yourusername)
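
If the job list is long, the states can also be tallied instead of read line by line. This is a minimal sketch, assuming the `qstat -u` layout described above, where job lines start with a numeric job ID and the state is in the second-to-last column. The function name `count_job_states` is made up for illustration.

```shell
# Sketch: count jobs per state from `qstat -u <user>` output read on stdin.
# Assumes job lines start with a numeric job ID and that the state letter
# (e.g. R for running, Q for queued) is the second-to-last column.
count_job_states() {
    awk '$1 ~ /^[0-9]/ { n[$(NF-1)]++ }
         END { for (s in n) print s, n[s] }' | sort
}

# Usage, on a host where qstat is available:
#   qstat -u yourusername | count_job_states
```

Each output line is a state letter followed by the number of jobs in that state, so a large count of queued jobs next to few running ones points to a backlog rather than a failure.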
Nov 22, 2013  07:11 AM | Debra Dawson
RE: No progress and no error
Thank you, both of these responses were helpful.

My jobs began running properly once I changed the directories to bic directories, but now many of my jobs have failed, and when I check the status using qstat, I only see jobs waiting in an error state (eqw).  This began with the failing of my "anat" analysis of subject 1.
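
For jobs stuck in an error state, the scheduler usually records a reason that can be queried. The sketch below assumes a Grid Engine style `qstat`, where the state (e.g. Eqw) is the fifth column of the default listing and `qstat -j <id>` prints an "error reason" line. The function name `error_job_ids` is hypothetical.

```shell
# Sketch: print the IDs of jobs whose state contains "E" (for example Eqw).
# Assumes the default Grid Engine `qstat` layout, where job lines start with
# a numeric job ID and the state is the fifth column. Reads qstat output on stdin.
error_job_ids() {
    awk '$1 ~ /^[0-9]+$/ && $5 ~ /E/ { print $1 }'
}

# Usage, on a Grid Engine submit host:
#   qstat -u "$USER" | error_job_ids | while read -r id; do
#       qstat -j "$id" | grep -i 'error reason'
#   done
```

The reported reason often names a directory or file that the execution host could not reach. Once the cause is fixed, `qmod -cj <id>` can clear the error state on Grid Engine so the job is rescheduled.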


When I checked the errors in the log file (below), I confirmed that there is no "emma" directory; however, the surfstat toolbox is present (though I receive a message saying that the new location is "/export01/local/matlab7a/toolbox").  My raw files are in ".mnc.gz" format.  Is there some other format they should be in?
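
Independently of which formats the pipeline accepts, it may help to confirm that each compressed input is readable and intact from the machine that runs the job. This is a minimal sketch using `gzip -t`. The function name `check_gz_input` is an assumption for illustration, and the check says nothing about whether NIAK or CIVET accepts ".mnc.gz" files.

```shell
# Sketch: verify that a compressed input file is readable and not corrupted.
# This checks only file access and gzip integrity, not the MINC content itself.
check_gz_input() {
    local f=$1
    [ -r "$f" ] || { echo "not readable: $f" >&2; return 1; }
    gzip -t "$f" 2>/dev/null || { echo "not a valid gzip file: $f" >&2; return 2; }
    echo "ok: $f"
}

# Usage: check_gz_input /path/to/subject_2_mri.mnc.gz
```

Running it on the execution host rather than the submit host matters here, since a file can be readable on one machine and missing on the other.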


Here is the log from the anat analysis:


 
Warning: Name is nonexistent or not a directory: /usr/local/matlab7a/toolbox/emma.
> In path at 110
  In matlabrc at 279
Warning: Name is nonexistent or not a directory: /usr/local/matlab7a/toolbox/surfstat.
> In path at 110
  In matlabrc at 281
Warning: Name is nonexistent or not a directory: /usr/local/matlab7a/toolbox/surfstat.
> In path at 34

******************************
Log of the (matlab) job : anat_subject1
Started on 20-Nov-2013 11:09:55
User: ddawson
host : valhalla
system : unix
******************************

command =

niak_brick_civet(files_in,files_out,opt)


files_in =

     anat: [1x128 char]
    civet: 'gb_niak_omitted'


files_out =

        transformation_lin: [1x124 char]
         transformation_nl: [1x124 char]
    transformation_nl_grid: [1x129 char]
                  anat_nuc: [1x109 char]
       anat_nuc_stereo_lin: [1x110 char]
        anat_nuc_stereo_nl: [1x109 char]
                      mask: [1x114 char]
               mask_stereo: [1x115 char]
                  classify: [1x119 char]
                    pve_wm: [1x117 char]
                    pve_gm: [1x117 char]
                   pve_csf: [1x118 char]
                    verify: [1x107 char]


opt =

      n3_distance: 25
       folder_out: [1x82 char]
        flag_test: 0
    civet_command: ''
     flag_verbose: 1
            civet: 'gb_niak_omitted'


********************
The job starts now !
********************
Running CIVET on volume /data/shmuel/shmuel1/kuwook/HumanRest_2/raw//beaulne_anne_marie_20100719_093834//beaulne_anne_marie_20100719_093834_2_mri.mnc.gz.
This is going to take a while ! (roughly one hour)

succ =

     0


msg =




Copying /tmp/niak_tmp_anat_subject1_256933926_civet/coco/transforms/linear/anat_coco_t1_tal.xfm to /data/shmuel/shmuel1/debbie/HumanRest_2/niak_out_Nov2013//subject1//anat/subject1//transf_subject1_nativet1_to_stereolin.xfm
Warning: cp: cannot stat `/tmp/niak_tmp_anat_subject1_256933926_civet/coco/transforms/linear/anat_coco_t1_tal.xfm': No such file or directory

> In niak_brick_civet at 429
  In psom_run_job at 118
Copying /tmp/niak_tmp_anat_subject1_256933926_civet/coco/transforms/nonlinear/anat_coco_nlfit_It.xfm to /data/shmuel/shmuel1/debbie/HumanRest_2/niak_out_Nov2013//subject1//anat/subject1//transf_subject1_stereolin_to_stereonl.xfm


********************
Something went bad ... the job has FAILED !
The last error message occured was :
Error using ==> fread
Invalid file identifier.  Use fopen to generate a valid file identifier.
File /data/aces/aces1/pbellec/public/niak-0.6.3/bricks/fmri_preprocess/niak_brick_civet.m at line 437
File /data/aces/aces1/pbellec/public/niak-0.6.3/extensions/psom/psom_run_job.m at line 118

****************
Checking outputs
****************
The output file or directory
/data/shmuel/shmuel1/debbie/HumanRest_2/niak_out_Nov2013/subject1/anat/subject1/transf_subject1_nativet1_to_stereolin.xfm
has not been generated!
The output file or directory
/data/shmuel/shmuel1/debbie/HumanRest_2/niak_out_Nov2013/subject1/anat/subject1/transf_subject1_stereolin_to_stereonl.xfm
was successfully generated!
The output file or directory
/data/shmuel/shmuel1/debbie/HumanRest_2/niak_out_Nov2013/subject1/anat/subject1/transf_subject1_stereolin_to_stereonl_grid.mnc
has not been generated!
The output file or directory
/data/shmuel/shmuel1/debbie/HumanRest_2/niak_out_Nov2013/subject1/anat/subject1/anat_subject1_nativet1.mnc
has not been generated!
The output file or directory
/data/shmuel/shmuel1/debbie/HumanRest_2/niak_out_Nov2013/subject1/anat/subject1/anat_subject1_stereolin.mnc
has not been generated!
The output file or directory
/data/shmuel/shmuel1/debbie/HumanRest_2/niak_out_Nov2013/subject1/anat/subject1/anat_subject1_stereonl.mnc
has not been generated!
The output file or directory
/data/shmuel/shmuel1/debbie/HumanRest_2/niak_out_Nov2013/subject1/anat/subject1/anat_subject1_mask_nativet1.mnc
has not been generated!
The output file or directory
/data/shmuel/shmuel1/debbie/HumanRest_2/niak_out_Nov2013/subject1/anat/subject1/anat_subject1_mask_stereolin.mnc
has not been generated!
The output file or directory
/data/shmuel/shmuel1/debbie/HumanRest_2/niak_out_Nov2013/subject1/anat/subject1/anat_subject1_classify_stereolin.mnc
has not been generated!
The output file or directory
/data/shmuel/shmuel1/debbie/HumanRest_2/niak_out_Nov2013/subject1/anat/subject1/anat_subject1_pve_wm_stereolin.mnc
has not been generated!
The output file or directory
/data/shmuel/shmuel1/debbie/HumanRest_2/niak_out_Nov2013/subject1/anat/subject1/anat_subject1_pve_gm_stereolin.mnc
has not been generated!
The output file or directory
/data/shmuel/shmuel1/debbie/HumanRest_2/niak_out_Nov2013/subject1/anat/subject1/anat_subject1_pve_csf_stereolin.mnc
has not been generated!
The output file or directory
/data/shmuel/shmuel1/debbie/HumanRest_2/niak_out_Nov2013/subject1/anat/subject1/anat_subject1_verify.png
has not been generated!

************************************************
20-Nov-2013 11:18:57 : The job has FAILED
Total time used to process the job : 541.45 sec.
************************************************