
**Statistical analysis: implications of the paper of Eklund et al?**

Jul 10, 2019 08:07 AM | Mickael Tordjman -

*NYU* | Statistical analysis: implications of the paper of Eklund et al?

Hi everyone,

I have only very basic knowledge about statistics, I apologize in advance if the answer to my question is obvious.

Regarding the settings used for statistical analysis in CONN (voxel threshold uncorrected < 0.001, cluster threshold < 0.05 with cluster-size p-FDR correction), I don't really understand the implications of the articles of Eklund et al. on parametric cluster-size inference ("Cluster failure: Why fMRI inferences for spatial extent have inflated false-positive rates" (2016), its correction, and "Cluster failure revisited: Impact of first level design and physiological noise on cluster false positive rates" (2018)).

I have read the previous topic about the question (below) but I don't understand one point.

In the most recent article, the authors said: "Finally, we discuss the implications of our work on the fMRI literature as a whole, estimating that at least 10% of the fMRI studies have used the most problematic cluster inference method (p = .01 cluster defining threshold), and how individual studies can be interpreted in light of our findings".

However, the cluster threshold used in CONN is 0.05 (but with FDR correction).

For a common analysis of 2 independent groups with ICA, should we prefer non-parametric statistics in any case? Should we use a less liberal cluster-defining threshold of 0.001 if we use parametric statistics?

Thanks a lot for your help and sorry if I got something wrong,

Mickael

------------------------------------------------------------------------------

Previous topic

"Hi Jeff & Mike,

My reading of the Eklund et al. and Flandin & Friston papers is that if you are using a voxel-wise height threshold p<.001 (this is the default threshold both in SPM and in CONN) then using parametric statistics is perfectly fine, while nonparametric statistics are recommended when you want to use higher (i.e. more liberal) voxel-wise height thresholds (e.g. p<.01, in order to focus on perhaps weaker but large/distributed responses). So yes, either one of the following two approaches should be perfectly fine (the former being more sensitive to strong localized effects, the latter more sensitive to weak distributed effects):

height threshold p-unc<.001, cluster-size threshold p-FDR<.05, parametric statistics

height threshold p-unc<.01, cluster-mass threshold p-FDR<.05, non-parametric statistics

Regarding how to use non-parametric statistics in CONN, simply select 'non-parametric statistics' in the explorer window top-right corner menu. Everything works as in the 'parametric statistics' case, only now the choice of statistics being displayed and the associated thresholds that you can use are slightly different. In particular, the information displayed for each cluster when selecting parametric statistics is:

cluster position: MNI coordinates of largest peak within this cluster

cluster size: number of voxels in this cluster

cluster p-FWE: family-wise error corrected p-value (probability under the null hypothesis of observing one or more clusters of at least this size across the entire brain)

cluster p-FDR: false discovery rate corrected p-value (expected proportion under the null hypothesis of false discoveries among clusters of at least this size)

cluster p-unc: uncorrected p-value (probability under the null hypothesis of a randomly-selected cluster having at least this size)

peak p-FWE: family-wise error corrected p-value (probability under the null hypothesis of observing one or more peaks of at least this height across the entire brain)

peak p-unc: uncorrected p-value (probability under the null hypothesis of a randomly-selected peak having at least this height)

all of the above p-values are obtained using random-field-theory (RFT) assumptions.

When selecting instead non-parametric statistics (and after the corresponding permutation/randomization tests are run) you will then get the following information for each cluster:

cluster position: MNI coordinates of largest peak within this cluster

cluster size: number of voxels in this cluster

cluster p-FWE: family-wise error corrected p-value (probability under the null hypothesis of observing one or more clusters of at least this size across the entire brain)

cluster p-FDR: false discovery rate corrected p-value (expected proportion under the null hypothesis of false discoveries among clusters of at least this size)

cluster p-unc: uncorrected p-value (probability under the null hypothesis of a randomly-selected cluster having at least this size)

cluster mass: sum of statistics (F-values or T^2 values) across all voxels within this cluster

cluster p-FWE: family-wise error corrected p-value (probability under the null hypothesis of observing one or more clusters of at least this mass across the entire brain)

cluster p-FDR: false discovery rate corrected p-value (expected proportion under the null hypothesis of false discoveries among clusters of at least this mass)

cluster p-unc: uncorrected p-value (probability under the null hypothesis of a randomly-selected cluster having at least this mass)

and all of the above p-values are obtained using non-parametric assumptions (permutation/randomization analyses). Cluster mass statistics combine information about each cluster size as well as each cluster height/strength, so they are generally considered more sensitive than either cluster-size or peak-level statistics. Typically for non-parametric statistics I would recommend using a cluster-mass p-FDR<.05 threshold by default, since that should typically be one of the most sensitive tests, but your preferences might vary.

Hope this helps

Alfonso"
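As background to the cluster p-FDR values Alfonso describes above: FDR-corrected p-values are conventionally computed with the Benjamini-Hochberg step-up procedure. The following minimal numpy sketch illustrates that procedure on a made-up set of uncorrected cluster-level p-values (an illustration of the standard method, not necessarily CONN's exact implementation):

```python
import numpy as np

def bh_fdr(pvals):
    """Benjamini-Hochberg FDR-adjusted p-values (step-up procedure)."""
    p = np.asarray(pvals, dtype=float)
    n = p.size
    order = np.argsort(p)                          # ascending p-values
    ranked = p[order] * n / np.arange(1, n + 1)    # p_(i) * n / i
    # enforce monotonicity from the largest p-value downward
    adjusted = np.minimum.accumulate(ranked[::-1])[::-1]
    out = np.empty(n)
    out[order] = np.clip(adjusted, 0.0, 1.0)
    return out

# hypothetical uncorrected cluster-level p-values from one analysis
cluster_p_unc = [0.0002, 0.004, 0.03, 0.2, 0.6]
# after adjustment, the first three clusters survive a p-FDR < .05 threshold
print(bh_fdr(cluster_p_unc))
```

Note how FDR control is less stringent than FWE control: it bounds the expected proportion of false discoveries among the reported clusters rather than the probability of any false positive anywhere in the brain.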

Jul 10, 2019 05:07 PM | Stephen L. -

*Coma Science Group, GIGA-Consciousness, Hospital & University of Liege* | RE: Statistical analysis: implications of the paper of Eklund et al?

Dear Mickael,

I also had similar questions and I tried to follow the developments on this issue, which I describe here:

https://www.nitrc.org/forum/message.php?...

As Alfonso said, the paper mostly shows that using a voxel-wise threshold of p-uncorrected < 0.01 leads to a very inflated false positive rate (up to 67% if I recall correctly), but using p-uncorrected < 0.001 and cluster-size p-FWE < 0.05 has at worst a false positive rate of about 0.01, so it's still inflated but nowhere near as badly.

Of note, all of Eklund's papers on this issue investigate only cluster-size p-FWE, not p-FDR. But since the problems arise from a violation of the Random Field Theory assumptions underlying the parametric corrections (in particular the assumed shape of the spatial autocorrelation function), it's safe to assume that p-FDR is affected in a similar way.

However, as described in the above link, the latest paper "Cluster Failure Revisited" shows that using a non-neuronal component denoising step (ICA is used in their paper, but they mention that PCA such as aCompCor in CONN should work equally well) fixes the autocorrelation issue as I suspected and essentially mitigates the false positive rate inflation.

So in the end, as long as you are using CONN, and with a voxel-wise threshold p-uncorrected < 0.001, you should not be affected by the issue outlined by these papers. If you want to further ensure a very precisely controlled false positive rate, you can use non-parametric calculations as Alfonso advised, which will not use RFT assumptions and thus totally bypass this issue.
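To illustrate the non-parametric alternative mentioned here: the idea is to build the null distribution of the maximum cluster size empirically (e.g. by sign-flipping subject-level data under a symmetric null), rather than deriving it from RFT. The toy sketch below uses a 1-D "brain" of 100 voxels and a one-sample t-test; it is a didactic simplification, not CONN's actual implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

def max_cluster_size(stat_map, height_thr):
    """Largest run of contiguous supra-threshold voxels along a 1-D map
    (a toy stand-in for a 3-D connected-components search)."""
    best = run = 0
    for above in stat_map > height_thr:
        run = run + 1 if above else 0
        best = max(best, run)
    return best

# toy data: 20 subjects x 100 voxels of one-sample contrast values
data = rng.normal(size=(20, 100))
data[:, 40:46] += 1.0                     # a weak, spatially extended "effect"

n = len(data)
t_obs = data.mean(0) / (data.std(0, ddof=1) / np.sqrt(n))
obs_cluster = max_cluster_size(t_obs, 2.0)

# null distribution of the max cluster size via sign-flipping
null = []
for _ in range(500):
    flipped = data * rng.choice([-1.0, 1.0], size=(n, 1))
    t = flipped.mean(0) / (flipped.std(0, ddof=1) / np.sqrt(n))
    null.append(max_cluster_size(t, 2.0))

# FWE-corrected p-value: fraction of permutations with a max cluster
# at least as large as the observed one (with the +1 correction)
p_fwe = (1 + sum(m >= obs_cluster for m in null)) / (1 + len(null))
print(obs_cluster, p_fwe)
```

Because the null is estimated from the data themselves, no assumption about the spatial autocorrelation function is needed, which is exactly why this route sidesteps the Eklund et al. findings.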

Hope this helps,

Best regards,

Stephen

Jul 11, 2019 08:07 AM | Mickael Tordjman -

*NYU* | RE: Statistical analysis: implications of the paper of Eklund et al?

Hi Stephen,

Thank you very much for your very clear answer!

Best,

Mickael

Jul 11, 2019 09:07 AM | Jeff Browndyke

RE: Statistical analysis: implications of the paper of Eklund et al?

Excellent and helpful response, Stephen.

I've been considering going the non-parametric route just to avoid these issues in the first place, but part of my trepidation has been a lack of understanding of what CONN is doing "under the hood" with respect to the permutation approach. Do you know how this is handled in CONN? Or could you maybe point me towards some papers that used CONN's permutation approach and delve into the technique in their methods?

Thanks,

Jeff Browndyke

Jul 11, 2019 11:07 AM | Stephen L. -

*Coma Science Group, GIGA-Consciousness, Hospital & University of Liege* | RE: Statistical analysis: implications of the paper of Eklund et al?

Dear Jeff,

Glad my reply could be helpful; if it saves some time and helps produce more robust results, that's great!

About what CONN is doing internally to calculate non-parametric results: this has changed since v18a. Here are my personal notes comparing the various approaches:

------------

Permutation-based options can be implemented on any NIfTI file using SnPM in SPM, randomise in FSL, FSL PALM (affiliated with FSL but a standalone script), Eklund's BROCCOLI package, mri_glmfit-sim in FreeSurfer (or in the future in QDEC, see attached PPT), CONN for fMRI (a similar approach to PALM, state-of-the-art and generic), or VBM/CAT for T1 voxel-based morphometry. In practice, there are two main approaches to non-parametric permutation-based correction: permutation of residuals (FSL PALM, CONN), which is more generic, or "whole-data" permutation (SnPM, BROCCOLI), which requires different permutation schemes for different sorts of analyses and does not apply to certain scenarios (e.g. a one-sample t-test). However, even permutation of residuals can have shortcomings in practice, such as the inability of the "sign-flipping" procedure to build a proper null distribution for a group containing a single sample/subject (in the case of a single-case study: one patient vs a group). The "fix" that Alfonso introduced in CONN v18a, instead of flipping the signs of (or permuting) the residuals, applies a full multiplication by a random orthogonal matrix (since both permutation and sign-flip operations can be considered special cases of an orthogonal transformation, and orthogonal transformations are the most general class of transformations guaranteeing that the randomized data has exactly the same spatial covariance structure as the original data).

-------------

For more info on this implementation in CONN since v18a: https://www.nitrc.org/forum/forum.php?th...

So essentially, CONN had always been using permutation by sign-flipping of the residuals, which is considered the best approach, but for single-case studies it yielded some issues, so CONN devised a new approach applying a full multiplication by a random orthogonal matrix. This is essentially the same thing, but the latter seems to be more robust and flexible than the former (and I guess also faster, since processors, and MATLAB in particular, are optimized for matrix operations thanks to BLAS).
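The algebra behind this can be sketched in a few lines: a random orthogonal matrix Q can be drawn via QR decomposition of a Gaussian matrix, sign-flipping and permutation are both orthogonal maps, and applying any orthogonal Q to the residual matrix leaves the spatial covariance exactly unchanged. The matrix shapes below are illustrative; this is a sketch of the general principle, not CONN's actual code:

```python
import numpy as np

rng = np.random.default_rng(1)

def random_orthogonal(n, rng):
    """Random orthogonal matrix via QR decomposition of a Gaussian matrix."""
    q, r = np.linalg.qr(rng.normal(size=(n, n)))
    return q * np.sign(np.diag(r))   # fix column signs for a uniform (Haar) draw

n_subj, n_vox = 10, 50
resid = rng.normal(size=(n_subj, n_vox))   # residuals after removing the model fit

# sign-flipping and row permutation are both special cases of an orthogonal map
signs = np.diag(rng.choice([-1.0, 1.0], size=n_subj))
perm = np.eye(n_subj)[rng.permutation(n_subj)]
assert np.allclose(signs @ signs.T, np.eye(n_subj))
assert np.allclose(perm @ perm.T, np.eye(n_subj))

# a full random orthogonal transform preserves the spatial covariance exactly:
# (Q R)^T (Q R) = R^T Q^T Q R = R^T R
q = random_orthogonal(n_subj, rng)
randomized = q @ resid
assert np.allclose(randomized.T @ randomized, resid.T @ resid)
```

The last assertion is the key property: each randomized dataset has exactly the same spatial covariance structure as the original residuals, which is what makes the empirical null distribution valid.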

In practice, on a few studies I did, I can confirm the results were the same for group analyses, and better for single-case studies (whereas before they were nonsensical). This is no proof in the general case, and it would be awesome if Alfonso could publish a paper about it, but as of now there is no reference for this particular approach to my knowledge (maybe Alfonso can give more info?).

Hope this helps,

Best regards,

Stephen