Apr 27, 2022  04:04 AM | Haiying Lv
Tools-Calculator-questions about correlation analysis
Dear Alfonso, Dear Forum,
      I have a few questions about my CONN project, and I would sincerely appreciate any help from the forum.
      I will first describe what I did, so that you can see what I am concerned about.
      I have just finished a seed-to-voxel analysis comparing my patient group with healthy controls while controlling for potential age and sex effects, using the contrast [Group patients, Group HC, age, sex] = [-1 1 0 0]. I found several significant clusters.
      I then pressed import values and chose the "clusters of interest in current analysis" option, which imported the connectivity values between these clusters of interest and my seed region into Tools > calculator.
      Next I ran two correlation analyses between the connectivity values and clinical grading, one without and one with age and sex as covariates.
      P.S. Clinical grading is a discrete variable that takes only six values (0, 1, 2, 3, 4, 5), representing different severity grades of clinical symptoms.
      1) I selected "Group patients" and "clinical grading" in the Subject effects list and used the contrast [Group patients, clinical grading]=[0 1]. I then selected one of the imported values in the Measures list and got the scatter plot shown on the first page of the attached file.
      2) I also tried another combination, selecting "Group patients", "clinical grading", "Age_patient" and "Sex_patient" in the Subject effects list and using the contrast [Group patients, clinical grading, Age_patient, Sex_patient]=[0 1 0 0]. This time I got a different scatter plot, with more fitted values displayed as blue dots, shown on the second page of the attached file.
      I have reviewed the thread in https://www.nitrc.org/forum/message.php?msg_id=33064 , but I still have several questions.
      My Questions are:
      #1. One of my variables (clinical grading) is discrete/ordinal, so I believe I should use Spearman's correlation rather than Pearson's. Can I do that in CONN Tools > calculator? Or can I assume that the p-value from the correlation analysis in the calculator GUI is still valid despite the discrete nature of my clinical variable? Is it a concern for the correlation analysis if one of the variables is discrete or not normally distributed, and is there a method for handling this?
      #2. In the thread I listed above, you described the blue dots as the fitted data (how well you could predict connectivity knowing a subject's age, sex, etc.), which is really interesting. Is there a way to export those fitted values? The blue dots seem to be more tightly clustered than the red dots, and there is a known age and sex effect on my outcome measure of interest, so I suppose it would be better to have those fitted values for my analysis. Alternatively, could you tell me how to calculate those fitted values while controlling for covariates? I know how to extract the connectivity values and perform Spearman's correlation test in other software such as SPSS or R, but I cannot find a way to recalculate the blue dots' values outside of CONN. Any suggestions?
       Thank you again. I have learnt a lot from this forum, and I hope to learn more in the future.
May 4, 2022  04:05 PM | Alfonso Nieto-Castanon - Boston University
RE: Tools-Calculator-questions about correlation analysis
Dear Haiying Lv

Your analyses look perfectly fine, no issues at all. Regarding #1, these analyses do not require the covariates to be normally distributed: the normality assumption in GLM is about the residuals of the model (the difference between the observed and predicted connectivity values), not about the regressors/effects included in your model, which may be ordinal, categorical, etc. without any problem. In particular in your case, with clinical grading being ordinal, these analyses are perfectly valid. They are only limited in terms of sensitivity, as they can only test potential linear effects of this severity scale on functional connectivity, leaving other potential nonlinearities untested (but the plots do not seem to show any clear or consistent nonlinearities, so I do not see any reason for concern).

And regarding #2 that's a very good observation. My guess would be that that "clustering" is reflecting the effect of gender on connectivity in your sample. If you want to compute those predicted values directly the formula is:

Y_predicted = X * (X\Y);

where X is your design matrix (a 47 x 4 matrix containing the [Group patients, clinical grading, Age_patient, Sex_patient] values for each subject), and Y is the data matrix (a 47 x 1 vector containing the connectivity values for each subject). For interpretation you may also want to test the effect of gender in this same analysis to see its direction and significance, which you may do for example using a contrast [0, 0, 0, 1] for the above selection of model effects.
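
A minimal sketch of the same computation outside MATLAB, in plain Python. The least-squares solve that MATLAB's backslash performs is spelled out here through the normal equations; in MATLAB the one-line formula above is all that is needed. The data are made up for six hypothetical patients (not the 47 subjects in this thread), and the helper names solve_least_squares and fitted_values are illustrative, not part of CONN.

```python
def solve_least_squares(X, y):
    # Least-squares coefficients b for y ~ X b, via the normal equations
    # (X'X) b = X'y. Stands in for MATLAB's X\y in this sketch.
    # Assumption: X has full column rank (no redundant regressors).
    n, p = len(X), len(X[0])
    # Augmented system [X'X | X'y].
    A = [[sum(X[k][i] * X[k][j] for k in range(n)) for j in range(p)]
         + [sum(X[k][i] * y[k] for k in range(n))]
         for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(p):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * c for a, c in zip(A[r], A[col])]
    return [A[i][p] / A[i][i] for i in range(p)]


def fitted_values(X, y):
    # Equivalent of Y_predicted = X * (X\Y): one fitted value per subject.
    b = solve_least_squares(X, y)
    return [sum(x * bi for x, bi in zip(row, b)) for row in X]


# Hypothetical design matrix, one row per subject, with columns
# [Group patients (all ones), clinical grading, age, sex].
X = [
    [1, 0, 34, 0],
    [1, 1, 41, 1],
    [1, 2, 29, 0],
    [1, 3, 52, 1],
    [1, 4, 47, 0],
    [1, 5, 38, 1],
]
# Hypothetical connectivity values (one per subject).
y = [0.31, 0.12, 0.25, 0.02, 0.18, -0.05]

y_predicted = fitted_values(X, y)
# Residuals: observed minus predicted connectivity.
residuals = [obs - fit for obs, fit in zip(y, y_predicted)]

for obs, fit in zip(y, y_predicted):
    print(f"observed {obs:+.3f}  fitted {fit:+.3f}")
```

The residuals computed at the end are the quantities that the GLM normality assumption refers to.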

Hope this helps
Alfonso

May 8, 2022  10:05 AM | Haiying Lv
RE: Tools-Calculator-questions about correlation analysis
Dear Alfonso,

Thank you so much for your detailed explanation! It's super helpful!

For the second question, you were absolutely right. I tested the gender effect on this particular functional connectivity (FC), and there is indeed a gender effect, with smaller or more negative correlation coefficients in males. No obvious age effect was observed. So in the figure with the between-subjects contrast [Group patients, clinical grading, Age_patient, Sex_patient]=[0 1 0 0], the fitted values (blue dots) form more tightly clustered yet separated "upper" and "lower" string-like bands. With your help, I was able to calculate those fitted values using the formula and reproduce the correlation figure. (For details, I have listed these findings in the attached pdf file for your reference.)

But I am still confused about the interpretation and sincerely hope to get your guidance, even though my questions might sound a little naive :).

1# To my understanding, the fitted values are the new correlation coefficient values after controlling for the potential effects of covariates such as age and sex in my model, and it seems to me that the gender effect explains the increase in R^2 of the linear model from 0.18 to 0.28. I am just not sure whether the fitted model of the correlation between FC and clinical grading became more statistically significant because the gender covariate was included or because its effect was excluded. Looking at the increased R^2, I believe it should be the result of including the gender effect, which gives a better prediction of FC. But the contrast [Group patients, clinical grading, Age_patient, Sex_patient]=[0 1 0 0], with which I intended to disregard the effects of age and sex, should give me the correlation between FC and clinical grade with the potential age and sex effects removed.

2# I am also curious about how R^2 is calculated in the calculator GUI. I think it is not quite the same as in a simple correlation analysis, because when I simply calculated the Pearson's R^2 of the fitted values shown as blue dots in the figure, I got 0.6353, not 0.28 (see the last page of the attached pdf). On the 5th page of the attached file, I selected the 'show null-hypothesis' option as you suggested in the post (https://www.nitrc.org/forum/message.php?...) and observed that the null hypothesis (predicting connectivity from a subject's age and sex) accounts for an R^2 of 0.1. Since R^2 was 0.18 before including the age and gender covariates, this gave me the impression that 0.1+0.18=0.28. Is this the right way to calculate R^2 in a GLM? If not, what is the correct method for calculating R^2 with covariates? (The only method I know is the partial correlation: to correlate X and Y while controlling for all other variables Z, compute the correlation between the residuals Rx from the linear regression of X on Z and the residuals Ry from the linear regression of Y on Z.)
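
For reference, the residual-based partial correlation described above can be sketched in plain Python as follows. This only illustrates that textbook procedure on made-up data for six hypothetical patients. It is not a statement of how the CONN calculator computes the R^2 it displays, and the helper names (solve_least_squares, residualize, pearson) are illustrative.

```python
import math


def solve_least_squares(Z, v):
    # Least-squares coefficients b for v ~ Z b, via the normal equations.
    # Assumption: Z has full column rank.
    n, p = len(Z), len(Z[0])
    A = [[sum(Z[k][i] * Z[k][j] for k in range(n)) for j in range(p)]
         + [sum(Z[k][i] * v[k] for k in range(n))]
         for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(p):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * c for a, c in zip(A[r], A[col])]
    return [A[i][p] / A[i][i] for i in range(p)]


def residualize(Z, v):
    # Part of v that is not linearly explained by the columns of Z.
    b = solve_least_squares(Z, v)
    return [vi - sum(z * bi for z, bi in zip(row, b))
            for row, vi in zip(Z, v)]


def pearson(a, b):
    # Ordinary Pearson correlation coefficient between two sequences.
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    sab = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    saa = sum((x - ma) ** 2 for x in a)
    sbb = sum((y - mb) ** 2 for y in b)
    return sab / math.sqrt(saa * sbb)


# Hypothetical covariates Z, one row per subject: [constant, age, sex].
Z = [
    [1, 34, 0],
    [1, 41, 1],
    [1, 29, 0],
    [1, 52, 1],
    [1, 47, 0],
    [1, 38, 1],
]
grading = [0, 1, 2, 3, 4, 5]                  # hypothetical clinical grading
fc = [0.31, 0.12, 0.25, 0.02, 0.18, -0.05]    # hypothetical connectivity

r_grading = residualize(Z, grading)   # Rx: grading with age/sex removed
r_fc = residualize(Z, fc)             # Ry: connectivity with age/sex removed
partial_r = pearson(r_grading, r_fc)

print(f"partial r = {partial_r:.3f}, partial r^2 = {partial_r ** 2:.3f}")
```

One known property of this construction is that the slope obtained by regressing Ry on Rx equals the clinical-grading coefficient of the full model that contains the constant, clinical grading, age and sex together.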

Thank you so much again!

Very best,

Haiying Lv
Attachment: attached_pdf.pdf