There are so many places where issues can arise for students when they are learning about Null Hypothesis testing. I believe that the best professors rely heavily upon the technique of scaffolding (see a prior post for more detail). Briefly, scaffolding is a Vygotskian concept where the professor constrains the situation for students so they can learn the component parts of a larger, more complex concept. Certainly, as Null Hypothesis testing is complex, scaffolding is in order.

Many of the statistics classes I teach have student learning outcomes that expect students to be able to calculate and interpret statistics like the *z*-test, *t*-test, *F*-test, and correlation coefficients (i.e., Null Hypothesis testing). Here are the component pieces that, in my opinion, often deserve a full class period (at least 50 minutes) and homework that requires students to master each piece before putting it all together. I find that in breaking apart the teaching of these concepts, students not only end up in the same place as when a professor doesn’t break the pieces down and just goes full steam ahead, but they gain a far greater understanding of the underlying concepts, minimizing the likelihood that they carry misconceptions with them. So, though it may seem like it takes more time to teach this way, my experience has been that it doesn’t, while resulting in greater student understanding.

(1) Though my focus is concepts, not mathematics/calculations, I find that students will never fully understand statistics without having to complete repeated hand calculations on small data sets using definitional formulas. Thus, it is critical that students learn how to calculate the Sum of the Squared Deviations (*SS*). They can then learn how to use the *SS* for calculating the variance and standard deviation. (See a prior post for details on how I use a kinesthetic activity for the teaching of *SS*, which maximizes student comprehension of the Sum of Squares.)
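
The definitional formulas above can be sketched in a few lines; the small data set here is hypothetical, and the point is simply that *SS* feeds directly into the variance and standard deviation:

```python
import math

scores = [2, 4, 4, 6, 9]  # a hypothetical small data set

mean = sum(scores) / len(scores)

# Definitional formula: square each deviation from the mean, then sum.
ss = sum((x - mean) ** 2 for x in scores)

variance = ss / (len(scores) - 1)  # sample variance uses n - 1
sd = math.sqrt(variance)           # standard deviation is its square root

print(ss, variance, sd)
```

Working the same numbers by hand (deviations of -3, -1, -1, 1, 4 from a mean of 5) gives *SS* = 28, which is exactly what students should verify against the program.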

(2) I actively teach concepts on the Normal Distribution and *z*-score, which typically take more than one class period.

(3) I feel it is critical that students fully understand sampling error, standard error, and how to estimate standard error. Again, please see a prior post for a tactile activity I use in teaching the concepts of sampling error and standard error.
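
Estimating the standard error of the mean builds directly on the *SS* calculation: compute the sample standard deviation, then divide by the square root of *n*. A minimal sketch using the same hypothetical data set:

```python
import math

def estimated_standard_error(scores):
    """Estimated standard error of the mean: s / sqrt(n)."""
    n = len(scores)
    mean = sum(scores) / n
    ss = sum((x - mean) ** 2 for x in scores)  # sum of squared deviations
    s = math.sqrt(ss / (n - 1))                # sample standard deviation
    return s / math.sqrt(n)

print(estimated_standard_error([2, 4, 4, 6, 9]))
```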

(4) Understanding that we begin by assuming the null hypothesis is true, then establish a point at which we reject that hypothesis (a line in the sand), and what the consequences are if we hold onto a hypothesis that isn’t true or reject one that is true. This is a critical component of the entire process, and it helps lay out students’ understanding of the assumptions underlying NHT and what Alpha and Beta are (along with their corresponding errors); it even helps lay the groundwork for understanding when to use non-parametric statistics in place of parametric statistics.
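
One way to make Alpha concrete is a small simulation: if the null hypothesis really is true, the line in the sand (here a two-tailed *z* critical value of 1.96) should produce false rejections at close to the .05 rate. This is a hypothetical demonstration, not a classroom requirement from the post:

```python
import random

random.seed(42)  # fixed seed so the demonstration is reproducible

def z_test_rejects(sample, mu0, sigma, z_crit=1.96):
    """Two-tailed z-test with known sigma: does the sample mean fall
    in the rejection region?"""
    n = len(sample)
    mean = sum(sample) / n
    se = sigma / n ** 0.5
    return abs((mean - mu0) / se) > z_crit

# Simulate many experiments in which the null hypothesis is TRUE
# (population mean really is 0): rejections here are Type I errors.
trials = 10_000
rejections = sum(
    z_test_rejects([random.gauss(0, 1) for _ in range(25)], mu0=0, sigma=1)
    for _ in range(trials)
)
false_alarm_rate = rejections / trials
print(false_alarm_rate)  # should land close to alpha = .05
```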

(5) Students need to understand the purposes, strengths, limitations, and assumptions required for each NHT statistic.

(6) By this point, if all is spelled out, especially if students can calculate the means, *SS*, and standard error, learning how to calculate and interpret the *z*-test, *t*-test, *F*-test, or correlation coefficient becomes easy. The calculation and interpretation become students’ favorite part of the class, as it all makes sense to them.
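
The point that the test statistic is just the earlier pieces assembled can be shown directly; a one-sample *t*-test, for example, is nothing more than the mean, *SS*, and estimated standard error put together (the comparison value of 3 is hypothetical):

```python
import math

def one_sample_t(scores, mu0):
    """One-sample t: (sample mean - hypothesized mean) / estimated SE."""
    n = len(scores)
    mean = sum(scores) / n
    ss = sum((x - mean) ** 2 for x in scores)  # piece 1: SS
    s = math.sqrt(ss / (n - 1))                # sample standard deviation
    se = s / math.sqrt(n)                      # piece 2: estimated standard error
    return (mean - mu0) / se                   # piece 3: the test statistic

print(one_sample_t([2, 4, 4, 6, 9], mu0=3))
```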

(7) However, even though we’ve discussed these previously, we cover yet again the detailed issues of Type I and Type II error, the requirement that NHT does not work absent a theory predicting a specific outcome, and the fact that although we have estimated sampling error, that estimate itself still contains sampling error, measurement error, and experimenter error.

(8) We calculate effect size statistics and confidence intervals. The former so students begin to get an idea of the size of an effect; the latter to aid in a general understanding of what the point estimate of the sample mean is really telling us. Confidence intervals are truly easier for students to “get.”
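
Both quantities again reuse the earlier pieces. A minimal sketch of Cohen’s *d* (one common effect size, named here as an illustration rather than drawn from the post) and a confidence interval for the mean; the critical value must come from a *t* table (2.776 is the two-tailed .05 value for df = 4):

```python
import math

def cohens_d(scores, mu0):
    """Effect size: mean difference in standard-deviation units."""
    n = len(scores)
    mean = sum(scores) / n
    ss = sum((x - mean) ** 2 for x in scores)
    s = math.sqrt(ss / (n - 1))
    return (mean - mu0) / s

def ci_mean(scores, t_crit):
    """Confidence interval: point estimate +/- t_crit * estimated SE."""
    n = len(scores)
    mean = sum(scores) / n
    ss = sum((x - mean) ** 2 for x in scores)
    se = math.sqrt(ss / (n - 1)) / math.sqrt(n)
    return (mean - t_crit * se, mean + t_crit * se)

print(cohens_d([2, 4, 4, 6, 9], mu0=3))
print(ci_mean([2, 4, 4, 6, 9], t_crit=2.776))
```

Notice that the effect size stays the same no matter how large the sample gets, while the confidence interval narrows; that contrast is exactly what separates “how big” from “how precise.”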

Students don’t leave my class, or at least I hope they don’t, thinking that if their observed *t* falls in the rejection region there is proof that their independent variable caused the change in the dependent variable; they don’t leave thinking the *p*-value and effect size are one and the same; they don’t leave believing that every research design (in general) or every experiment (in particular) is equally well served by a given statistic… but they do leave recognizing that this one test provides evidence, and that, to be sure, more needs to be done.

I liken the conceptual mistakes individuals make about how and when to use NHT, and what it can tell us, to when my children were young and thought that to get money, you just had to go to the cash machine. Yes, I get money from the cash machine, but obviously not without first putting it in. And yes, a significant statistical test can tell us something, but not outside the context of first understanding all that went into the study for the statistic to come out, and, just like the amount of money available to me from the cash machine, there are significant limitations of which we must always be aware, lest we look like fools.

I believe that, if taught well, students “get” this.