The binomial test is a non-parametric statistical test often performed for testing the median of a distribution. Non-parametric tests are useful for data that include one or more variables measured
on a nominal or an ordinal scale and for data from non-normal populations (we will discuss assumptions of normality and non-parametric procedures in more detail in the future). The binomial test
evaluates whether the population proportion of individuals who fall into one category of a two-category variable (e.g., gender) is equal to a hypothesized value (e.g., .50). The hypothesis is stated
for only one of the two categories (e.g., girls), whose proportion is compared with the hypothesized value.
An SPSS data file for the binomial test may be structured in two ways. With the standard method, the SPSS data file contains as many cases as individuals. For each case, there is a single variable
with two values that represent the two categories of the variable of interest. With the weighted cases method, the SPSS data file contains two cases, one for each category, and two variables: the
focal variable, with two values for the two categories, and the weight variable, containing the frequencies for the two categories.
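To make the two file layouts concrete, here is a minimal Python sketch (outside SPSS) that converts between them. The data values and the function names `to_weighted` and `to_standard` are hypothetical and used only for illustration; they are not part of SPSS.

```python
from collections import Counter

# Standard layout: one row per individual.
# Hypothetical data: 0 = first category, 1 = second category.
standard_rows = [0, 0, 1, 0, 1, 1, 1]

def to_weighted(rows):
    """Collapse one-row-per-case data into (category, frequency) pairs."""
    counts = Counter(rows)
    return sorted(counts.items())

def to_standard(weighted):
    """Expand (category, frequency) pairs back into one row per case."""
    return [category for category, freq in weighted for _ in range(freq)]

# Weighted layout: two rows, each holding a category and its frequency.
weighted_rows = to_weighted(standard_rows)
print(weighted_rows)  # [(0, 3), (1, 4)]
```

Both layouts carry the same information for the binomial test, since the test only uses the count in each category.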
ASSUMPTIONS UNDERLYING THE BINOMIAL TEST
1. The sample size is much smaller than the population size;
2. The sample is representative of the target population;
3. The observations are independent and identically distributed (part of which is “independent observations”).
These assumptions are beyond the scope of this assignment. We presume they’ve been met by the data at hand.
Conducting the Binomial Test with the weighted cases method
A researcher wants to assess whether patients who are seated in a waiting room with four doors (one in front of them, one to their left, one to their right, and one behind them) are equally likely to
choose any door when asked which door they would use to leave the room. The hypothesis is that patients with ADD are more likely to choose the door behind them. Eighty patients with ADD participate in
the study and are asked to pick one of the four doors when they leave the room. Because there are four doors, the test proportion is .25. In this study there are two possible outcomes: door behind and
other doors. The variables are described as follows:
Variables: Door (0 = Door behind; 1 = Other doors)
Number (number of patients with ADD who picked category 0 or category 1)
HYPOTHESES:
Ho: The proportion of patients with ADD who choose the door behind them is equal to 0.25
Ha: The proportion of patients with ADD who choose the door behind them is greater than 0.25
1) Select Data, then click Weight Cases. Click Weight cases by. Click Number, then click the arrow button to move the variable into the Frequency Variable box. Click OK.
2) Click Analyze / Nonparametric Tests / Legacy Dialogs / Binomial. Click the Choice of door variable and move it to the Test Variable List box. Change the test proportion to .25. This value is
the hypothesized proportion associated with the value of 0 (door behind), which is the first case in the data file. Click OK.
Binomial Test
                           Category      N    Observed Prop.   Test Prop.   Exact Sig. (1-tailed)
Choice of door   Group 1   Door Behind   25   .31              .25          .124
                 Group 2   Other Door    55   .69
                 Total                   80   1.00
In this example, 55 patients with ADD picked “other doors” and 25 chose the door behind them. Because the p-value (Exact Sig., 1-tailed) is greater than the traditional
alpha level of 0.05 (one-tailed test), we cannot reject the null hypothesis (we will study hypothesis testing in more detail in the future). More specifically, the observed proportion for door
behind is .31 (Observed Prop.), and the probability of obtaining a sample proportion of .31 or greater given a population proportion of 0.25 (our test proportion) is .124; therefore, we
cannot conclude that the population proportion is greater than .25.
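The one-tailed exact probability reported by SPSS can be cross-checked by summing binomial probabilities directly. The sketch below is not SPSS output; it is an illustration in Python using only the standard library, and the function names `binom_pmf` and `upper_tail` are hypothetical helpers. It computes the probability of 25 or more "door behind" choices out of 80 when the true proportion is .25.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def upper_tail(k, n, p):
    """Probability of k or more successes in n trials (one-tailed, upper)."""
    return sum(binom_pmf(i, n, p) for i in range(k, n + 1))

# 25 of 80 patients chose the door behind them; test proportion is .25.
p_value = upper_tail(25, 80, 0.25)
print(round(p_value, 3))  # compare with the Exact Sig. (1-tailed) column
```

The printed value should agree with the Exact Sig. (1-tailed) entry in the table, up to rounding.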
Conducting the Binomial Test with the standard method
A UWF professor claims that 80% of the student body consists of female students. He selects 15 students, 7 of whom are female. The level of significance is set at .05. Note that in SPSS, the test
applies to the category that is first encountered in the data, so by default the hypothesis depends on the order in which the cases appear in the data file. Since the researcher wants to test
whether the proportion of female students is different from 80%, we need to make sure that the female students are at the top of the data file. In the gender column, “0” represents female, and “1”
represents male. We do this by right-clicking on the variable and choosing “Sort Ascending”.
HYPOTHESES:
Ho: The proportion of female students is 0.8;
H1: The proportion of female students is not 0.8.
Alpha level = 0.05
1) Click ‘Analyze’, then choose ‘Nonparametric Tests’, ‘Legacy Dialogs’, and then ‘Binomial’.
2) In the ‘Binomial Test’ dialogue box, select ‘Gender of the students’ and move it to the Test Variable List box. Make sure to change the test proportion to .80.
From the result we can see that there are 7 female students out of 15 students, and the observed proportion is shown as 0.5 in the result table. This is not precise enough, so we need more
decimal places in the output. To change the number of decimal places, do the following:
In the output window, double-click on the proportion (e.g., .5) and then select ‘Cell Properties’. Then you will see a dialogue box similar to:
Then select ‘Format Value’ at the top, and change ‘Decimals:’ to the number of decimal places you desire; in this case, we set it to 3.
Do this for both observed proportions.
The results table should be:
Binomial Test
                                  Category         N    Observed Prop.   Test Prop.   Exact Sig. (1-tailed)
Gender of the student   Group 1   Female student   7    .467             .8           .004a
                        Group 2   Male student     8    .533
                        Total                      15   1.0
a. Alternative hypothesis states that the proportion of cases in the first group < .8.
The observed proportion for female students is 0.467, and our hypotheses are:
H0: The proportion of female students is 0.8;
H1: The proportion of female students is not 0.8.
The p-value, labeled ‘Exact Sig. (1-tailed)’ in the table, is 0.004. This means that if the proportion of female students in the total population is exactly 0.80, there is a 0.4%
chance of finding 7 or fewer female students in a sample of size N = 15. We normally reject the null hypothesis if the p-value is smaller than 5% (p < 0.05), so in this case we
conclude that the proportion of female students is not 0.8 and is probably much lower.
When reporting test results, we should state: “A binomial test indicated that the observed proportion of female students, .47, was lower than the expected .80, p = .004 (1-sided).”
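The interpretation of the p-value as the chance of finding 7 or fewer female students can be checked by hand. The following is a minimal Python sketch using only the standard library; the helper name `lower_tail` is hypothetical and the calculation is an illustration, not a reproduction of SPSS's internal procedure.

```python
from math import comb

def lower_tail(k, n, p):
    """Probability of k or fewer successes in n independent trials."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

# 7 female students in a sample of 15; hypothesized proportion is .80.
p_value = lower_tail(7, 15, 0.8)
print(round(p_value, 3))  # 0.004
```

Because the sum only covers outcomes at or below the observed count, it corresponds to the one-tailed alternative stated in footnote a of the table.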
Exercise 1: (2 points)
The probability that a UWF freshman will graduate is 0.6. Three freshmen are randomly selected. Using the Stat Calculator (http://stattrek.com/online-calculator/binomial.aspx) calculate the
probability that exactly two of the students will graduate and the probability that 2 or fewer of these students will graduate from the program.
Exercise 2: (2 points)
An MPH professor claims that the median weight of the students in his class is different from 140 lb (that is, that the proportion of students on either side of 140 lb differs from 50%). He collects
the weights of a random sample of 22 students. Enter the following data in SPSS and perform a binomial test using the standard method. Alpha level = 0.05
135 119 106 135 180 108 128 160 143 175 170
205 195 185 182 150 175 190 180 195 220 235
a) Write the null and alternative hypotheses
b) Is the p-value significant? Is the median value greater than 140?
c) Report the results and include the SPSS output including the binomial test table
NOTE: Since we are testing whether the median weight is different from 140 the Cut point should be specified as 140 (see picture below).
Exercise 3: (2 points) Note: Exercises 3, 4 and 5 must be done manually
The table below shows the number of individuals with optimal BMI, normal BMI, and overweight, by gender.
         Optimal   Normal   Overweight   Total
Male     22        73       55           150
Female   43        132      65           240
Total    65        205      120          390
a. What proportion of the participants have optimal weight?
b. What proportion of men have optimal weight?
c. What proportion of participants who are overweight are men?
d. Are overweight status and male gender independent?
Exercise 4: (2 points)
Suppose that IQ scores are normally distributed in a certain population with a mean of 85 and a standard deviation of 12.
a) What proportion of people have IQ scores less than 90?
b) What proportion of people have IQ scores between 80 and 90?
c) If someone has an IQ of 100, what percentile is he/she in?
Show your work and, if possible, include the curve diagram.
Exercise 5: (2 points)
Among coffee drinkers, men drink a mean of 3.2 cups per day with a standard deviation of 0.8 cups. Assume the number of cups drunk per day follows a normal distribution. Show your work and, if
possible, include the curve diagram.
a. What proportion drink 2 cups per day or more?
b. What proportion drink no more than 4 cups per day?
c. If the top 5% of coffee drinkers are considered heavy coffee drinkers, what is the minimum number of cups consumed by a heavy coffee drinker? (HINT: see slide 42)
Exercise 6: (0.5 points) OPTIONAL
Why do you think that, in the binomial tests we have conducted, SPSS gives a one-tailed p-value in one case and a two-tailed p-value in the other?