RE: Potential bugs in preprocessing pipeline
Dec 6, 2018  08:12 AM | Pravesh Parekh - National Institute of Mental Health and Neurosciences
Dear Dr. Alfonso,

Apologies for not having replied sooner. I am sharing a document with the slice order for a Philips MRI machine via email.

I was wondering if you had had a moment to look at the indirect pipeline files that I had shared? I am curious to know what could be going wrong during processing.


Warm Regards
Pravesh

Originally posted by Alfonso Nieto-Castanon:
Dear Pravesh,

Thanks again for your great points regarding slice-timing details. You are right that there are often some unspecified details regarding the precise timing information for each slice. Generally SPM uses the equation TA*(n-1)/(N-1) as the estimated relative acquisition time of the n-th acquired slice (over a total of N acquired slices), and TA, if not specified, is set to TR*(1-1/N). These equations, of course, assume evenly distributed acquisition times across the entire TR, which may or may not be true. In sparse acquisition sequences, specifying the TA value (much smaller than TR) typically suffices to make the above equations work (again assuming evenly-distributed slice acquisition times over the entire acquisition period), but you are right that perhaps it would be a good idea to have CONN allow people to modify TA if needed for these cases (currently if you are using CONN's batch commands you can use the 'ta' field to modify the default TA value, but there is no such option when using CONN's GUI). Regarding the information about the actual slice times stored in the original DICOM files, currently CONN can read DICOM files and automatically import the acquisition times (into a .json sidecar file) only for Siemens scanners (since they typically incorporate either the DICOM MosaicRefAcqTimes field or the Private_0019_1029 field containing this info). I would love to be able to do the same for Philips scanners, so I would very much appreciate it if you could send me official documentation as well as any additional details that you believe might be helpful to do this. Also, the "interleaved (Siemens)" option in CONN implements the somewhat specific Siemens default interleaved-sequence settings, so I would love to create an "interleaved (Philips)" option since, from your description, that clearly differs from the standard interleaved-ascending/descending sequences. 
So again thank you very much for pointing this out and yes, please do send me any Philips-specific information that may be helpful in this regard. 
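As a quick numeric sketch of the default timing estimate described above (plain Python with a hypothetical helper name, not CONN/SPM code): SPM assumes t_n = TA*(n-1)/(N-1), with TA defaulting to TR*(1-1/N) when unspecified.

```python
# Sketch (not CONN/SPM code): SPM's default estimated slice acquisition times,
# assuming evenly distributed acquisitions over the acquisition period.
def estimated_slice_times(TR, N, TA=None):
    """Estimated relative acquisition time of the n-th acquired slice (n = 1..N)."""
    if TA is None:
        TA = TR * (1.0 - 1.0 / N)  # SPM's default TA when not specified
    return [TA * (n - 1) / (N - 1) for n in range(1, N + 1)]

# Example: TR = 2 s, 40 slices -> TA = 1.95 s, times run 0 .. 1.95 s in 0.05 s steps
times = estimated_slice_times(TR=2.0, N=40)
```

For a sparse sequence, passing an explicit TA much smaller than TR compresses the same evenly spaced grid into the actual acquisition window, which is exactly why exposing TA in the GUI would help.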

Regarding your question about microtime onset settings, those are also very good points. The default settings in SPM are to use stats.fmri.t0=8 and stats.fmri.t=16 (irrespective of the number of actual slices in your data), which means that the condition timeseries are typically resampled at approximately the time at the middle of each acquisition (the default used to be stats.fmri.t0=1 but that was changed not too long ago, if I recall correctly, precisely to make it a better fit to the standard practice of picking a mid-acquisition slice as reference during slice-timing correction). The documentation is not clear, but these two values (stats.fmri.t and stats.fmri.t0) need to be integer numbers, as far as I can tell. You are right that there remain a few inaccuracies here, particularly for cases where TA is considerably smaller than TR, such as sparse acquisition designs (where a smaller t0 value would be preferable). I also see your point that people may change stats.fmri.t to be the number of actual slices and change stats.fmri.t0 to be the reference slice (if the reference time matches the acquisition time of an actual slice), but they could not use the same trick if the reference slice does not match an actual slice acquisition time; in these cases, however, I think the default settings -t0=8 and t=16- should work just fine. If you want to be more precise you could always match stats.fmri.t to an even number close to the actual number of slices, and stats.fmri.t0 to stats.fmri.t/2, but that only changes the width of the center bin, not its location, so that might be overkill. 
For cases where TA is different from the default TR*(1-1/N), perhaps a reasonable choice (when using the proposed average acquisition time as reference during slice-timing correction) would be to set stats.fmri.t and stats.fmri.t0 to values where TR/fmri.t*(fmri.t0-1/2) approximates TA/2, which, of course, is a somewhat convoluted procedure (but, to be fair, the same difficulties exist irrespective of whether the reference time in the STC procedure matches an actual slice acquisition time or not in these cases).
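The matching rule above can be sketched as a brute-force search (a hypothetical helper for illustration only; `match_microtime` is not part of CONN or SPM): pick integer t and t0 minimizing the mismatch between TR/t*(t0-1/2) and TA/2.

```python
# Sketch: find integer microtime settings (stats.fmri.t, stats.fmri.t0) so that
# TR/t * (t0 - 1/2) approximates TA/2, per the matching rule described above.
def match_microtime(TR, TA, max_t=64):
    best = None
    for t in range(2, max_t + 1):        # candidate microtime resolution
        for t0 in range(1, t + 1):       # candidate microtime onset
            err = abs(TR / t * (t0 - 0.5) - TA / 2.0)
            if best is None or err < best[0]:
                best = (err, t, t0)
    return best  # (mismatch in seconds, t, t0)

# Example: sparse acquisition with TR = 10 s and TA = 2 s
err, t, t0 = match_microtime(10.0, 2.0)
```

For the evenly-distributed case (TA = TR*(1-1/N)) the default (t=16, t0=8) already lands very close to TA/2, which is why this extra step only matters for sparse designs.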

Regarding the indirect pipeline dataset, I have not had the chance to look at this just yet, but I will do that asap and let you know my thoughts in that regard.

Hope this helps
Alfonso


Originally posted by Pravesh Parekh:
Dear Dr. Alfonso,

Thank you for the detailed reply and for your encouraging words!

Slice-timing correction:
I agree with your suggestion that it might be better to shift to an actual timing-based correction, rather than specifying slice numbers, as it is more encompassing (including multiband data). It also seems like a good choice given that the BIDS convention requires that slice timing be mentioned in the associated .JSON files. It also makes sense to use the average timing. However, there are two (potential) problems that I can think of:

1. Given a vector of slice acquisition numbers, how do we work out the actual slice timing?
  • We do know the overall TR; however, that does not necessarily mean that the slices were evenly distributed within the TR. 
  • Without the actual DICOM data available to Conn, I am not sure if it is possible to work out precise timings for the acquisition of each slice. 
  • This would get more troubling if the data is not continuously acquired (sparse acquisition), i.e. if there are acquisition styles in which the timing between slices is kept minimal while the TR is longer than the time required for these acquisitions.
  • Finally, if the data is acquired on a Philips machine, the exact details of slicing are not saved in the DICOM header (as far as I know). Therefore, the MosaicRefAcqTimes field won't be available either.
  • It is possible to artificially create timings assuming equal temporal spacing of the slices, but that might not be a very accurate thing to do.
2. If the slice timing correction is done using an average timing (which may or may not exactly correspond to the timing of an actual slice), then it raises a problem at the first-level SPM analysis. If Conn users have preprocessed the data using Conn and want to do SPM analysis, what slice would the user enter for microtime onset? The microtime resolution would correspond to the number of slices, so that's straightforward. The documentation is not clear on whether the microtime onset can be specified in seconds (the help mentions entering a reference slice). This would limit the use of Conn-preprocessed data, as the users would have to approximate this information by picking the reference slice number which is closest to the average timing (creating the same even/odd problem that you mentioned, or having to work out in some convoluted way which slice the timing could refer to, if they do not know the actual timings).

Additional point about data from Philips scanner:
Philips slicing information is different from Siemens slicing information. For example, when specifying interleaved acquisition on a Philips machine, the scanner takes the square root of the total number of slices, rounds it off, and then spaces the slices so that there is minimal cross-talk between slices. Similarly, the "default" slicing in Philips is actually first odd then even slices, which is similar to the ascending interleaved of Siemens. Perhaps the GUI could explicitly state that the interleaved option it shows is specific to Siemens scanners? There could also be other options more specific to Philips scanners (if you would like, I can send you the official documentation as we have access to one; some details are also mentioned here: https://en.wikibooks.org/wiki/SPM/Slice_Timing#Philips_scanners). This is just for the cases where the users might not be that well informed about how the data was acquired (again, JSON files would not be very useful in such cases) and may make an incorrect choice assuming that the interleaved schemes of Siemens and Philips are interchangeable.
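The Philips interleaved scheme described above can be sketched as follows (an assumption based on this description and the SPM wikibook page, not vendor code; `philips_interleaved_order` is a hypothetical name):

```python
import math

# Sketch: Philips "interleaved" spaces consecutive acquisitions by
# round(sqrt(nslices)) to minimize cross-talk between adjacent slices.
def philips_interleaved_order(nslices):
    step = round(math.sqrt(nslices))
    order = []
    for start in range(1, step + 1):               # one pass per starting slice
        order.extend(range(start, nslices + 1, step))
    return order

# Example: 42 slices -> step = round(sqrt(42)) = 6,
# so the acquisition order begins 1, 7, 13, ..., 37, then 2, 8, 14, ...
order = philips_interleaved_order(42)
```

This clearly differs from the Siemens-style interleaved orders (step of 2, odd-then-even or even-then-odd), which is why a dedicated "interleaved (Philips)" option would avoid silently wrong corrections.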


Indirect pipeline incorrect normalization:
I am sharing a zip file which has both the source and the processed files using the direct and indirect pipelines (along with the project variable). I have not included the normalized and smoothed functional files; here is the Google Drive link. 


CSF segmentation mask:
Yes, indeed. I agree that it would not make any impact on Conn, as all analyses happen using functional data while structural data is only for visualization. I believe a threshold would additionally be needed in the expression (i2+i3).*i1, as the sum of probabilities of i2 and i3 wouldn't add up to 1 and therefore the intensities in the structural image (i1) would get down-scaled. Again, like you said, it really depends on what the user wants to do with these images outside of Conn, so I don't think it matters too much.


Thanks and Regards
Pravesh

Originally posted by Alfonso Nieto-Castanon:
Dear Pravesh,

My apologies for the late reply (and sorry that you needed some persistence to get my attention! I really appreciate your helpful contributions to the forum, not only with your own interesting questions but also helping others out, so if I haven't said so yet thanks a ton for that!)

Regarding the "slice-timing correction" question, first, you are exactly right, a few of those issues are related to SPM having a somewhat dual way of specifying slice timing information for this procedure. In SPM you can either: a) specify the slice-order vector (a sequence of indexes indicating which slice is acquired at each point in time, in the same order as they were acquired; of course this syntax does not work for multi-band sequences since there is no way to specify that two slices were acquired simultaneously) and then specify the reference slice (which slice you want to use as reference; implicitly this forces the reference to be an actual slice); or b) specify the slice-timing vector (a sequence of times indicating the time of acquisition of each slice; this syntax works perfectly fine for multi-band sequences) and then specify the reference time (which does not need to be the time of acquisition of an actual slice). This somewhat convoluted dual-syntax structure is due to the latter option being just a recent addition to SPM in order to support slice-timing correction for multi-band sequences (but they needed/wanted to keep the former option working as well, even if only for backwards-compatibility with older scripts / software packages). CONN tries to maintain compatibility with this dual-syntax and allows you to specify either slice-order or slice-timing vectors. When doing the former CONN will select the slice closest to the average acquisition time as reference-slice, and when doing the latter CONN will select the average of the slice-acquisition times as reference-time (not necessarily an actual slice acquisition time). You are right that this creates the somewhat unfortunate inconsistency that the same procedure on the same data will lead to slight differences (due to different reference times being used) depending on whether you use the "slice-order" or the "slice-timing" syntax to characterize your data. 
For example, when using a BIDS dataset which includes a "SliceTiming" field in the sidecar .json files (or when importing DICOMs that may have a "MosaicRefAcqTimes" field) describing the acquisition times, CONN will pass this information using SPM's "slice-timing" syntax, while when you use any of the standard "ascending/descending/interleaved/etc." descriptors to specify the slice acquisition order CONN will convert these to slice ordering sequences and use instead the "slice-order" syntax in SPM. Depending on which case applies, the reference will be selected based on either the average acquisition time within a TR or the timing of the slice acquired closest to mid-TR. 

In any case, thanks very much for pointing this issue out. I believe perhaps the best compromise solution would be to have CONN always use SPM's "slice-timing" syntax internally (e.g. if a user specifies the slice ordering vector CONN can translate that to slice acquisition times and feed that info to SPM), which would then allow us to explicitly choose a reference time that is consistent across all use cases. My choice would be to use the average acquisition time in all cases (irrespective of whether that time coincides with the acquisition of an actual slice), rather than trying to match the reference time to the actual acquisition time of one slice (mostly because that seems a reasonable quantitative choice, since it makes the distribution of shift/correction times that SPM needs to apply centered and with minimal variance, but also because it allows us to have a unique solution without having to round/choose between potentially similar-distance slices -e.g. when you have an even number of slices-), but please let me know your thoughts/comments/suggestions.

Also, on the 21 vs. 22 selected slice depending on the ascending vs. descending order: when using the "slice-order" syntax CONN will select as reference the "floor(nslice/2)"-th acquired slice. This would translate to slice number 21 if the slice order is 1,2,...,42 or slice number 22 if the slice order is 42,41,...,1. Again, with the proposed change, in both cases the reference time would be selected as (TR-TR/nslices)/2 (which would be right in between the 21st and 22nd slices).
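The behavior described above can be sketched numerically (hypothetical helper names for illustration; not CONN code):

```python
# Sketch: with the "slice-order" syntax the reference is the
# floor(nslices/2)-th *acquired* slice; the proposed change would instead
# always use the reference time (TR - TR/nslices)/2.
def reference_slice(slice_order):
    n = len(slice_order)
    return slice_order[n // 2 - 1]       # the floor(n/2)-th acquired slice

def proposed_reference_time(TR, nslices):
    return (TR - TR / nslices) / 2.0     # mid-point of the acquisition period

ascending = list(range(1, 43))           # 1, 2, ..., 42 -> reference slice 21
descending = list(range(42, 0, -1))      # 42, 41, ..., 1 -> reference slice 22
```

With TR = 2 s and 42 slices, the proposed reference time is (2 - 2/42)/2 ≈ 0.976 s, falling exactly between the acquisition times of the two middle slices regardless of acquisition direction.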

Regarding the "indirect normalization" question, that is rather unusual. I have tried to replicate that behavior but do not seem to be able to do so. Since in the "indirect normalization" pipeline the structural is normalized/segmented directly (without any further modification) the resulting normalized structural images should be very similar if not identical to those resulting from the "direct normalization" pipeline (disregarding perhaps minor differences due to different initial states related to the additional centering step in the "direct normalization" pipeline), so I am not really sure what could possibly result in this cropping/masking effect. If you don't mind perhaps you could send me your structural and mean-functional volumes (the ones reported by CONN in Matlab's command-line as being selected for the indirect normalization step) so I can try to replicate this behavior more directly?

And regarding the "CSF segmentation mask" question, the wc0* structural image for each subject is, at least in CONN, only used for display purposes. Its main purpose was just to show a skull-stripped version of the normalized anatomical image, so the sum of the gray/white/csf should basically leave out those areas that are being identified as bone, soft tissue, or air. Of course you are right that, depending on what you plan to use that structural image for, other masks may be better suited to the task (and even just for display purposes having the eyeballs removed by not including wc3 in the mask formation would be a good idea, although I would worry about the removal of pial matter and ventricles since those are closer to functionally-relevant areas). In any case, if you want to modify this behavior in CONN you could do so by modifying conn_setup_preproc.m in the lines that read "...expression='(i2+i3+i4).*i1'". That expression specifies an imcalc command to compute the wc0 mask. In there, i1 corresponds to the structural (wc0*), and i2 to i4 correspond to the gray/white/csf respectively (wc1*, wc2*, and wc3*), so, for example, changing the expression to "(i2+i3).*i1" would have the effect of masking out CSF from the resulting wc0 structural image.
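A one-voxel Python analogue of the imcalc expression above (purely illustrative; `skull_strip` is a hypothetical name, and real imcalc operates voxel-wise on whole volumes):

```python
# Sketch: mask the structural intensity (i1) by the sum of the gray/white/csf
# posterior probabilities (i2, i3, i4), mirroring '(i2+i3+i4).*i1' vs '(i2+i3).*i1'.
def skull_strip(i1, i2, i3, i4, include_csf=True):
    mask = i2 + i3 + (i4 if include_csf else 0.0)
    return mask * i1

# Toy voxel: intensity 100 with p(gray)=0.6, p(white)=0.3, p(csf)=0.1
# with CSF the probabilities sum to ~1 (intensity preserved, ~100);
# without CSF they sum to ~0.9 (intensity down-scaled, ~90).
v_with = skull_strip(100.0, 0.6, 0.3, 0.1)
v_without = skull_strip(100.0, 0.6, 0.3, 0.1, include_csf=False)
```

This also illustrates the down-scaling point raised earlier: dropping i4 without thresholding the mask attenuates intensities wherever CSF probability is non-zero.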

Hope this helps
Alfonso

Originally posted by Pravesh Parekh:
Dear Dr. Alfonso,

Thank you for your reply. It is always a learning experience when listening to your thoughts. Thank you for that (and of course, many thanks for Conn!).

Thank you for the link to work in progress. I have downloaded the experimental release and will try it out (look forward to Conn 18b!). Apart from the ongoing discussion about slice timing correction, I have also seen a strange behaviour when using the indirect normalization pipeline. I have appended details of that below the slice timing correction discussion.

Regarding slice timing correction:
Say that my data has 42 slices (data was acquired in an ascending manner). The JSON file associated with the data has the following slice timing information:

[2.225; 2.17; 2.1175; 2.0625; 2.0075; 1.9525; 1.9; 1.845; 1.79; 1.7375; 1.6825; 1.6275; 1.5725; 1.52; 1.465; 1.41; 1.3575; 1.3025; 1.2475; 1.1925; 1.14; 1.085; 1.03; 0.9775; 0.9225; 0.8675; 0.8125; 0.76; 0.705; 0.65; 0.5975; 0.5425; 0.4875; 0.4325; 0.38; 0.325; 0.27; 0.2175; 0.1625; 0.1075; 0.0525; 0]

Timings for a few important slices:
first slice = 0;
last slice = 2.225
middle slice (i.e. slice number 21) = 1.1400
mean timing = 1.1121

When I specify the order as ascending in the GUI and check the SPM batch, the reference slice number is 21. However, when I specify BIDS and check the batch, the reference slice number (actually a timing) is 1.1121. This corresponds to the mean timing but does not correspond to the timing of any specific slice (the value falls between the timings of slices 21 and 22). To get the same result as when specifying the slices (rather than timings), shouldn't the reference timing be the timing for slice number 21 (the middle slice), i.e. 1.1400? Of course, the difference in timing is quite minor.
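The two candidate reference values can be recomputed directly from the SliceTiming vector quoted above (plain Python, for illustration):

```python
# Sketch: reference values derived from the .json SliceTiming vector above.
slice_times = [2.225, 2.17, 2.1175, 2.0625, 2.0075, 1.9525, 1.9,
               1.845, 1.79, 1.7375, 1.6825, 1.6275, 1.5725, 1.52,
               1.465, 1.41, 1.3575, 1.3025, 1.2475, 1.1925, 1.14,
               1.085, 1.03, 0.9775, 0.9225, 0.8675, 0.8125, 0.76,
               0.705, 0.65, 0.5975, 0.5425, 0.4875, 0.4325, 0.38,
               0.325, 0.27, 0.2175, 0.1625, 0.1075, 0.0525, 0.0]

mean_time = sum(slice_times) / len(slice_times)  # ~1.1121 (BIDS reference time)
mid_slice_time = slice_times[21 - 1]             # 1.14 (timing of slice number 21)
```

The mean (~1.1121 s) is what the BIDS path feeds to SPM, while the slice-order path effectively uses 1.14 s, the timing of slice 21.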

On a related note, I noticed that for the same number of slices (=42), the batch shows the reference slice as 21 for ascending acquisition and 22 for descending acquisition. I guess this must be because Conn is actually calculating the mean of the number of slices (just like in the case of timing) and then rounding it up in the descending case and rounding it down in the ascending case (so that the resulting slice number is a valid slice). In the same vein, I was wondering if it would be better to specify the timing of a slice which actually exists.


Regarding indirect normalization:
I was trying out different preprocessing options and see a strange behaviour when running the indirect pipeline (I used the realignment and unwarping option without a phase map). For the test case, the functional data had a couple of cerebellar slices missing while the structural data had a larger field of view and consequently had the full acquisition. After normalization, the resulting structural image gets cropped from the bottom, resulting in an image similar to the functional image. This behaviour does not happen when using the direct normalization pipeline. I was unable to replicate the behaviour when manually preprocessing the same image. I assumed that this is perhaps because of the ART mask being calculated from the functional volume; however, this persists even if I skip the ART outlier detection step during preprocessing. I have seen a similar problem with other subjects too. What could be causing this?

I have attached a snapshot showing the native space structural and functional images, the normalized structural and functional images using the direct pipeline, and the normalized structural image using the indirect normalization pipeline.


Regarding inclusion of CSF segmentation file for calculating wc0 file:
On a different note, when calculating the wc0 (normalized, skull-stripped image), Conn includes the GM, WM, and CSF files. It adds them up (resulting in a value of 1 in all brain tissue areas) and then multiplies the sum with the structural image to get the wc0 file. However, this means that areas like the eyeballs will get included in the skull-stripped image. Would it be better to only add the c1 and c2 files, threshold (to get binary values), and then multiply with the structural image to get a skull-stripped image without the eyeballs and the "ring" of CSF around the brain? Of course, this does not matter too much, as all processing happens using either the c* images or the functional images.


Look forward to your thoughts

Best Regards
Pravesh

Originally posted by Alfonso Nieto-Castanon:
Dear Pravesh,

As always, thanks for your comments and feedback. The rationale for selecting the mid-time slice as reference when performing slice-timing correction in CONN is to attempt to minimize the average temporal displacement that needs to be corrected across all slices (each slice is corrected -i.e. time-shifted- by an amount equal to its actual acquisition time minus the acquisition time of the reference slice). In any case, perhaps I am missing something here, so please feel free to clarify why you believe that in this case selecting the mid-slice as reference could be more appropriate or preferable. 

Also, regarding your previous message, sorry the patch I sent you had too many dependencies with other version-18b changes. If you do not mind, please feel free to download the code in https://www.conn-toolbox.org/resources/s... to get the current development version -that already includes the patch that I sent you- (the final 18b version should be released in the next few weeks and that will become available as always here at nitrc.org)
 
Thanks
Alfonso
Originally posted by Pravesh Parekh:
Dear Dr. Alfonso,

I think there may be another bug in the preprocessing pipeline. When performing slice timing correction and selecting BIDS to pick up the slicing order (actually timing), the reference slice is specified as half the last slice time, i.e. if the last slice was acquired at 2250 ms, then the reference slice is specified as 1125 ms. I am assuming that, instead of this, the timing for the middle slice is what needs to be picked up. 


Regards
Pravesh
