
Friday, December 7, 2012

Conducting a Chi Square Test of Independence in SPSS with Only Frequencies



So let’s say you had some data on two nominal variables and wanted to test whether the two variables are independent. However, instead of having the data at the individual level, you had it in cross-tabular form, like what is shown below. You could compute the chi square test statistic by hand, but this can be cumbersome. Fortunately, there is a very simple way in SPSS to conduct the chi square test using only the numbers shown below.


                          Group
Fruit chosen    Group 1    Group 2    Group 3
Apple           10         7          22
Orange          8          19         9
Grape           17         13         12

So with this data, we have three groups (1, 2, and 3), and three fruits (apple, orange, and grape). The first thing to do to conduct the chi square test of independence in SPSS would be to set up the two grouping variables. So in SPSS, one variable will be “Apple, Apple, Apple, Orange, Orange, Orange, Grape, Grape, Grape” while the other variable will be “1, 2, 3, 1, 2, 3, 1, 2, 3” as shown below.


By doing so, we have taken care of each combination of fruit chosen and group. Now the frequencies for each group combination are needed. So the first combination is “Apple, Group 1,” which has a frequency of 10. So simply input “10” for the first item in a new variable. Continue to do this until each fruit chosen and group combination has the correct frequency, as shown below.
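For readers who like to see the layout spelled out, here is a minimal Python sketch (outside SPSS, purely for illustration) of how the cross-tab turns into the nine rows described above. The names `crosstab` and `long_rows` are made up for this example.

```python
# Cross-tab from the post: rows are fruits, columns are Groups 1-3.
crosstab = {
    "Apple":  [10, 7, 22],
    "Orange": [8, 19, 9],
    "Grape":  [17, 13, 12],
}

# One row per (fruit, group) combination plus its frequency,
# mirroring the three variables entered in SPSS.
long_rows = [
    (fruit, group, freq)
    for fruit, counts in crosstab.items()
    for group, freq in enumerate(counts, start=1)
]

for fruit, group, freq in long_rows:
    print(f"{fruit:<8}{group:<4}{freq}")
```

Each printed line corresponds to one row in the SPSS data file: the fruit, the group, and the frequency for that combination.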

 

Once this is done, simply go to Data -> Weight Cases… Select Weight cases by, and use the new “Frequency” variable. This tells SPSS to count each row as many times as its frequency. Next, to actually perform the chi square, go to Analyze -> Descriptive Statistics -> Crosstabs. Put Fruit in Row(s): and Group in Column(s):. Then in the Statistics… dialogue box, check the Chi-square box. Hit Continue and then OK. The results show the same cross-tabulation as we had above (with the fruits sorted alphabetically), along with the chi square statistic itself. By doing it in SPSS instead of by hand or in Excel, we get the test statistic information we’re looking for much faster.
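To see why weighting works, it may help to think of each row as standing in for that many individual cases. The Python sketch below is an illustration only, not what SPSS does internally. It expands the nine weighted rows into individual cases and tallies them again; `weighted_rows`, `cases`, and `table` are hypothetical names.

```python
from collections import Counter

# The nine (fruit, group, frequency) rows entered in SPSS.
weighted_rows = [
    ("Apple", 1, 10), ("Apple", 2, 7), ("Apple", 3, 22),
    ("Orange", 1, 8), ("Orange", 2, 19), ("Orange", 3, 9),
    ("Grape", 1, 17), ("Grape", 2, 13), ("Grape", 3, 12),
]

# Repeat each (fruit, group) pair as many times as its frequency,
# which is the individual-level data the weights stand in for.
cases = [
    (fruit, group)
    for fruit, group, freq in weighted_rows
    for _ in range(freq)
]

# Tallying the expanded cases recovers the original cross-tab counts.
table = Counter(cases)
n = len(cases)

print("N of cases:", n)
print("Apple, Group 3:", table[("Apple", 3)])
```

The total number of expanded cases matches the "N of Valid Cases" that SPSS reports once the weight is switched on.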

Fruit * Group Crosstabulation
Count
                    Group
Fruit     1.00    2.00    3.00    Total
Apple     10      7       22      39
Grape     17      13      12      42
Orange    8       19      9       36
Total     35      39      43      117


Chi-Square Tests
                      Value      df    Asymp. Sig. (2-sided)
Pearson Chi-Square    15.659a    4     .004
Likelihood Ratio      15.184     4     .004
N of Valid Cases      117


a. 0 cells (0.0%) have expected count less than 5. The minimum expected count is 10.77.
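If you want to cross-check the Pearson value outside SPSS, the hand calculation can be sketched in a few lines of Python. This is only a sketch using the standard formula (expected count = row total × column total / N, then summing (observed − expected)² / expected over all cells). The p-value line relies on a closed form that holds specifically for 4 degrees of freedom, so it would need replacing for other table sizes.

```python
import math

# Observed counts in the same order as the SPSS output.
observed = [
    [10, 7, 22],   # Apple
    [17, 13, 12],  # Grape
    [8, 19, 9],    # Orange
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count for each cell under independence.
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Pearson chi-square: sum of (O - E)^2 / E over every cell.
chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
df = (len(row_totals) - 1) * (len(col_totals) - 1)

# Upper-tail probability; this closed form is valid for df = 4 only.
half = chi_square / 2
p_value = math.exp(-half) * (1 + half)

min_expected = min(min(row) for row in expected)

print(f"Pearson chi-square = {chi_square:.3f}, df = {df}, p = {p_value:.4f}")
print(f"Minimum expected count = {min_expected:.2f}")
```

The statistic, degrees of freedom, and minimum expected count should line up with the SPSS tables above. The p-value works out to roughly .0035, which SPSS displays as .004.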



Thursday, November 8, 2012

Data Entry and Management in SPSS

This video describes how to import your data file from Excel into SPSS, and then how to manage your data to prepare for analysis in SPSS.

Friday, September 7, 2012

Two-Way ANOVA Interactions in SPSS


Typically, when conducting an ANOVA, we can get pairwise comparisons for the differences between the groups on the dependent variable. However, when we step it up to two grouping variables, the SPSS dialogue boxes do not give us this option for the combination of the two.
            For example, let’s say you wanted to test for differences in “Test Scores” by gender (male vs. female) and by ethnicity (white vs. black vs. Hispanic). In the Options… dialogue box in SPSS, you can move over Gender, Ethnicity, and Gender*Ethnicity. This will give the estimated marginal means and standard errors for each of the groups. However, if you select the box “Compare main effects”, you will only get comparisons by Gender and by Ethnicity, not by the combination. The secret to getting the comparisons for the combination is in the syntax. So first “Paste” the analysis into a Syntax file. It should look something like what is below:

UNIANOVA TestScores BY Gender Ethnicity
  /METHOD=SSTYPE(3)
  /INTERCEPT=INCLUDE
  /EMMEANS=TABLES(Gender) COMPARE
  /EMMEANS=TABLES(Ethnicity) COMPARE
  /EMMEANS=TABLES(Gender*Ethnicity)
  /CRITERIA=ALPHA(.05)
  /DESIGN=Gender Ethnicity Gender*Ethnicity.

            From the above you can see that SPSS did not add the “COMPARE” syntax to the Gender*Ethnicity means. In order to conduct the comparisons, we have to manually add it. However, simply adding “COMPARE” is not enough. Because it is an interaction, you have to specify what you want to compare. So it should be changed into what is below:

UNIANOVA TestScores BY Gender Ethnicity
  /METHOD=SSTYPE(3)
  /INTERCEPT=INCLUDE
  /EMMEANS=TABLES(Gender) COMPARE
  /EMMEANS=TABLES(Ethnicity) COMPARE
  /EMMEANS=TABLES(Gender*Ethnicity) COMPARE (Gender)
  /CRITERIA=ALPHA(.05)
  /DESIGN=Gender Ethnicity Gender*Ethnicity.

            What this will do is compare the Test Scores by gender for each ethnicity separately. But what about comparing the ethnicities for each gender? That simply requires another line in the syntax, which is below. However, conducting all these pairwise comparisons inflates the Type I error rate: some differences may come out significant by random chance alone. To adjust for this, we can add the Bonferroni adjustment to the comparisons. So the final syntax below examines both sets of interaction comparisons, with a Bonferroni adjustment applied to the p-values to control Type I error.

UNIANOVA TestScores BY Gender Ethnicity
  /METHOD=SSTYPE(3)
  /INTERCEPT=INCLUDE
  /EMMEANS=TABLES(Gender) COMPARE ADJ(BONFERRONI)
  /EMMEANS=TABLES(Ethnicity) COMPARE ADJ(BONFERRONI)
  /EMMEANS=TABLES(Gender*Ethnicity) COMPARE (Gender) ADJ(BONFERRONI)
  /EMMEANS=TABLES(Gender*Ethnicity) COMPARE (Ethnicity) ADJ(BONFERRONI)
  /CRITERIA=ALPHA(.05)
  /DESIGN=Gender Ethnicity Gender*Ethnicity.
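To get a feel for what the Bonferroni adjustment does to the p-values, here is a small Python sketch. It assumes the adjustment is applied across the pairwise comparisons within one family (for example, the three ethnicity pairs within one gender). The unadjusted p-values are invented purely for illustration and are not output from any real analysis.

```python
from itertools import combinations

ethnicities = ["White", "Black", "Hispanic"]

# Three ethnicity levels give three pairwise comparisons per gender.
ethnicity_pairs = list(combinations(ethnicities, 2))


def bonferroni(p_values):
    """Multiply each p-value by the number of comparisons, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical unadjusted p-values, one per pair (illustration only).
unadjusted = [0.012, 0.030, 0.400]
adjusted = bonferroni(unadjusted)

for (a, b), raw, adj in zip(ethnicity_pairs, unadjusted, adjusted):
    print(f"{a} vs. {b}: unadjusted p = {raw:.3f}, Bonferroni p = {adj:.3f}")
```

With three comparisons, each p-value is tripled (and capped at 1), so a difference that looked significant at .030 no longer clears the .05 threshold after adjustment.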

            To make things visual, you can make a bar chart using the estimated marginal means so you might have something like the chart below.