**design matrix of ANOVA**

Nov 16, 2017 10:11 AM | Xiaonan Guo

design matrix of ANOVA

Hi Andrew,

When I perform a between-subject two-way ANOVA using NBSglm, I am somewhat confused about the design matrix. I tried two types of design matrix, each with a corresponding contrast matrix, but got different observed test statistics. Here is a short example.

The first one is the factor effects approach:

1 1 1 1

1 1 1 1

1 -1 -1 1

1 -1 -1 1

-1 1 -1 1

-1 1 -1 1

-1 -1 1 1

-1 -1 1 1

Here the first column is group, the second is gender, the third is the group-by-gender interaction, and the fourth is the intercept.

My F contrasts are [1 0 0 0], [0 1 0 0], [0 0 1 0] for the main effect of group, the main effect of gender, and the interaction effect, respectively.

The second one is the cell approach:

1 0 0 0

1 0 0 0

0 1 0 0

0 1 0 0

0 0 1 0

0 0 1 0

0 0 0 1

0 0 0 1

My F contrasts are [1 1 -1 -1], [1 -1 1 -1], [1 -1 -1 1].

Which one is correct for NBSglm?

Another question: with the cell approach, the test statistics for the two main effects and the interaction effect are all the same. It seems that there is something wrong with this approach.

Then, with the factor effects approach, there is another thing that confuses me. If the response variable is the same for all subjects, for example a column of ones, we find a significant main effect. That's really strange. I also tried glmfit to perform the same procedure, and all the results are NaN. Why does this happen?

Best wishes,

Xiaonan

Nov 16, 2017 04:11 PM | Andrew Zalesky

design matrix of ANOVA

Hi Xiaonan,

The first formulation (factor effects) is the one expected by NBSglm. This will work correctly. You can also use a t-test instead of an F-test to assess one-sided alternative hypotheses.

I don't think the cell formulation will work correctly.

The column of 1's simply models the global mean. It is not particularly surprising that this column is significant, especially if you do not de-mean your data. This is not a problem. Not sure what you mean by finding a NaN.

Andrew
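A small numerical check may help separate the statistics from the software here. The sketch below does not call NBSglm; it applies the textbook contrast t-statistic, t = c*beta / sqrt(mse * c*inv(X'*X)*c'), to both designs from the first post. The response values in `y` and the helper names `mse` and `tstat` are made up for illustration. Under these assumptions each cell-means contrast equals four times the matching factor-effects coefficient, so the two parameterizations should give the same t (and F = t^2) for each effect.

```matlab
% Factor-effects design: group, gender, interaction, intercept.
Xf = [ 1  1  1  1
       1  1  1  1
       1 -1 -1  1
       1 -1 -1  1
      -1  1 -1  1
      -1  1 -1  1
      -1 -1  1  1
      -1 -1  1  1];

% Cell-means design: one indicator column per group-by-gender cell.
Xc = kron(eye(4), [1; 1]);

% Made-up responses, two subjects per cell (illustration only).
y = [2.1; 1.9; 3.2; 2.8; 0.7; 1.3; 4.4; 3.6];

% Contrast rows: main effect of group, main effect of gender, interaction.
Cf = [1 0 0 0; 0 1 0 0; 0 0 1 0];
Cc = [1 1 -1 -1; 1 -1 1 -1; 1 -1 -1 1];

% Hypothetical helpers: residual mean square and contrast t-statistic.
mse   = @(X, y) sum((y - X * (X \ y)) .^ 2) / (size(X, 1) - size(X, 2));
tstat = @(X, y, c) (c * (X \ y)) / sqrt(mse(X, y) * (c / (X' * X)) * c');

tf = zeros(3, 1);
tc = zeros(3, 1);
for k = 1:3
    tf(k) = tstat(Xf, y, Cf(k, :));   % factor-effects parameterization
    tc(k) = tstat(Xc, y, Cc(k, :));   % cell-means parameterization
end

disp([tf tc]);   % the two columns should agree up to rounding
```

If the two columns agree, a discrepancy seen in practice would point to how a particular routine interprets the contrast vector rather than to the models themselves; how NBSglm does this is not established in this thread.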

Nov 17, 2017 04:11 AM | Xiaonan Guo

RE: design matrix of ANOVA

Hi Andrew,

Many thanks for your help.

Another question about NBSglm:

design matrix:

1 1 1 1

1 1 1 1

1 -1 -1 1

1 -1 -1 1

-1 1 -1 1

-1 1 -1 1

-1 -1 1 1

-1 -1 1 1

Here the first column is group, the second is gender, the third is the group-by-gender interaction, and the fourth is the intercept.

y=[1;1;1;1;1;1;1;1];

My F contrasts are [1 0 0 0], [0 1 0 0], [0 0 1 0] for the main effect of group, the main effect of gender, and the interaction effect, respectively.

After 5000 permutations, all 5001 Test_Stat values are the same, as expected.

However, if the number of subjects differs among groups, for example after adding one more row to the design matrix, Test_Stat differs among permutations.

e.g.

design matrix:

1 1 1 1

1 1 1 1

1 -1 -1 1

1 -1 -1 1

-1 1 -1 1

-1 1 -1 1

-1 -1 1 1

-1 -1 1 1

-1 -1 1 1

y=[1;1;1;1;1;1;1;1;1];

After 5000 permutations, we get different Test_Stat values.

I am quite confused by this result. The response variable is the same for all subjects, so we should obtain the same vector y after each permutation and, as a result, the same Test_Stat values. But the NBSglm result doesn't match this expectation. Why does this happen?

Thanks!

Best wishes,

Xiaonan

Nov 17, 2017 05:11 AM | Andrew Zalesky

RE: design matrix of ANOVA

Hi Xiaonan,

For the degenerate case of y=[1;1;1;1;1;1;1;1];, statistical inference becomes immaterial. We can immediately be sure that the null hypothesis cannot be rejected in this case and testing is not required.

The test statistic values probably vary because of rounding/truncation error. In particular, the fitted regression coefficients are probably really tiny values (when they should really be 0), and the standard error of the model is also a really tiny value. When dividing one tiny value by another (beta / se), the result can be large. So the behaviour that you have observed is most likely due to rounding error.

In any case, your example does not require any statistical testing. It is clear that the null hypothesis cannot be rejected!

We should probably include a catch statement in future versions to exclude this degenerate case from testing, although it is probably extremely rare in practice.

Andrew
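The rounding argument above can be examined directly with a plain least-squares fit. This is only a sketch under stated assumptions: it uses MATLAB's backslash operator and the textbook t-statistic rather than NBSglm's own code, takes the nine-subject design from the previous post, and tests the group contrast [1 0 0 0]. In exact arithmetic the fitted coefficients are [0 0 0 1]' and the residuals are all zero, so the statistic is 0/0; in floating point both the numerator and the standard error may come out as tiny nonzero numbers whose ratio is arbitrary.

```matlab
% Unbalanced factor-effects design from the post (9 subjects).
% Columns: group, gender, group-by-gender interaction, intercept.
X = [ 1  1  1  1
      1  1  1  1
      1 -1 -1  1
      1 -1 -1  1
     -1  1 -1  1
     -1  1 -1  1
     -1 -1  1  1
     -1 -1  1  1
     -1 -1  1  1];
y = ones(9, 1);              % identical response for every subject
c = [1 0 0 0];               % contrast for the main effect of group

[n, p] = size(X);
beta  = X \ y;               % least-squares fit; exact answer is [0 0 0 1]'
resid = y - X * beta;        % exact answer is all zeros
mse   = sum(resid .^ 2) / (n - p);
se    = sqrt(mse * (c / (X' * X)) * c');

num = c * beta;              % numerator of the t-statistic
t   = num / se;              % 0/0 in exact arithmetic

fprintf('numerator = %g, standard error = %g, t = %g\n', num, se, t);
```

Depending on the platform, `t` may print as NaN, Inf, or an ordinary-looking number. None of these is meaningful, which is consistent with the point that this degenerate case needs no testing.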

Nov 17, 2017 06:11 AM | Xiaonan Guo

RE: design matrix of ANOVA

Hi Andrew,

Many thanks for your helpful reply.

Best wishes,

Xiaonan
