# Category Archives: Methods of Data Collection

## Most Americans favor…

While I was watching a well-known TV news channel the other day, a report came on about student reactions to the enhanced security procedures at U.S. airports. The correspondent indicated that students are split right down the middle, 50-50, on the use of full-body airport scanners. Fifty percent favor, fifty percent oppose. I was curious what data there were to support this contention. The correspondent stated that half the students he talked to were in favor, half opposed. He then proceeded to present the results of his sample, which appeared to be N = 2: one student in favor, one student opposed. But, he went on, according to another poll, most Americans favor the full-body scanners. Now “most” is a most ambiguous word in my mind. By definition, most simply means the majority, but if asked to attach a number to the word “most,” I tend to think about 80%. So I concluded that about 80% of Americans favor the new scanning procedure. But just to be sure, I looked up the poll that was being cited here (http://www.washingtonpost.com/wp-srv/politics/polls/postpoll_11222010.html?hpid=topnews).

The poll indicated it was based on a random sample of 514 adults who were called on either their landlines or cell phones. Now a random sample is one in which each member of a population has an equal chance of being included in the sample. But I, for one, have no chance of being included in this sample; I never respond to telephone polls, and sometimes decline quite ungraciously. So the question arises, how many calls were made to obtain 514 responses? And how would those who declined to participate in this poll have responded?
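To make the nonresponse worry concrete, here is a minimal sketch in Python. Every number in it except the 514 responses is hypothetical: the poll page does not report response rates, and the function names are my own. It shows how many calls a given response rate would require, and how the share of supporters among respondents drifts away from the true share when supporters and opponents answer the phone at different rates.

```python
import math


def calls_needed(completed_responses, response_rate):
    """Approximate number of calls required to reach a target number of responses."""
    return math.ceil(completed_responses / response_rate)


def observed_support(true_support, rate_supporters, rate_opponents):
    """Share of supporters among respondents when the two groups respond at different rates."""
    responding_supporters = true_support * rate_supporters
    responding_opponents = (1 - true_support) * rate_opponents
    return responding_supporters / (responding_supporters + responding_opponents)


# Hypothetical response rate of 15%; the real rate is not given in the poll.
print(calls_needed(514, 0.15))

# Hypothetical population in which exactly half support the scanners.
same_rates = observed_support(0.50, 0.15, 0.15)       # both groups equally willing to answer
different_rates = observed_support(0.50, 0.20, 0.10)  # supporters twice as willing to answer
print(f"{same_rates:.2f} vs {different_rates:.2f}")
```

With equal response rates the respondents mirror the population. When supporters are twice as likely to respond, a population that is split evenly looks like two-thirds support, and nothing in the 514 completed interviews would reveal the problem.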

The poll results indicate that if we consider only those people who “strongly support” the new body scanners, then 37% of the sample responded favorably. Twenty-seven percent were somewhat in favor, for a total of 64% responding that they support the scanning, either strongly or somewhat. But only 48% responded in a favorable way to the enhanced hand searches. And it also seems reasonable to expect that those who fly infrequently or not at all (53% of the respondents in the survey) may differ in their beliefs from those who fly frequently. We might also expect differences related to age of respondents, but we can’t tell from the results of the poll.
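For a sense of how much sampling error alone could move that 64%, here is a rough sketch of the usual margin-of-error calculation. It assumes a simple random sample and a 95% confidence level (z = 1.96), both of which are my assumptions rather than details taken from the poll, and it says nothing about the nonresponse problem raised above.

```python
import math


def margin_of_error(p_hat, n, z=1.96):
    """Normal-approximation margin of error for a sample proportion."""
    return z * math.sqrt(p_hat * (1 - p_hat) / n)


n = 514          # sample size reported by the poll
support = 0.64   # strongly or somewhat support the scanners

moe = margin_of_error(support, n)
print(f"{support:.0%} plus or minus {moe:.1%}")
print(f"interval: {support - moe:.3f} to {support + moe:.3f}")
```

Under those assumptions the interval runs from roughly 60% to 68%, which comfortably qualifies as a majority but falls well short of the 80% I first attached to the word “most.”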

In her recent blog post, Bonnie included data collection methods as one of the core concerns for an introductory statistics course. To quote:

“Though I have said to my students more times than I can count, ‘the quality of our statistics is limited by the quality of our sample,’ I must admit to being a bit surprised that this was considered critical by others, especially since when I look at many undergraduate statistics textbooks, data collection methods are barely mentioned.”

The two examples given above provide excellent support for Bonnie’s contention that students should be taught to carefully evaluate the quality of the data on which statistics are based. Who were the subjects and how were they selected? What questions were asked and what responses allowed? And what inferences can be made from the results? It might prove to be an interesting class exercise to have students find media reports of current polls and then actually access the poll to see who the respondents were, what questions were asked, and the results obtained.

## Core Statistical Concepts

I have been spending the week thinking about what I consider to be the “core concepts” that need to be covered in an applied statistics class, be it in psychology, health, business, or education. However, before I post my personal thoughts, I felt it necessary to see what other applied statisticians had to say. In my search, I found http://www.statlit.org/pdf/2004McKenzieASA.pdf . This paper, John McKenzie’s (2004) “Conveying the Core Concepts,” is from the Proceedings of the ASA Section on Statistical Education, pages 2755-2757.

In reading what McKenzie and several other professors of applied statistics identified as the core concepts in statistics, I must say … I concur. Listed below are the core concepts in applied statistics … the information that, in my opinion, simply has to be covered regardless of illness, snow days, or anything else that could interrupt a professor’s teaching schedule.

Variability: Students cannot understand the purpose of statistics unless they get the concept of variability. Within this, we can further talk about variability due to chance and variability due to effect. Included in the discussion of variability should be the difference between systematic and random variability. I would have to say that not a class period goes by without my spending at least a little time on helping students to focus on issues of variability (especially variability due to the individual differences of the subjects who just happen to be in our sample).

Randomness: Though I would see randomness and variability as being part of the same large concept, McKenzie’s work identified the concept of randomness as not only separate from variability but also critical for students to master.

Sampling Distribution: Along with Hypothesis Testing, sampling distribution is considered to be one of the most complicated concepts to teach. I would concur, which is why I spend an entire class period just on a single activity with M&M’s to demonstrate the concept of sampling distribution. (Please see a prior blog entry for details on this tactile activity.)
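For readers who want a quick computer version of the idea, here is a small simulation sketch. It is not the classroom M&M’s activity itself; the proportion of blue candies, the bag size, and the number of bags are all hypothetical values chosen for illustration. Each simulated bag is one sample, and the collection of sample proportions across many bags approximates the sampling distribution.

```python
import random
import statistics


def simulate_sample_proportions(p, bag_size, n_bags, seed=0):
    """Draw n_bags samples of bag_size candies and return the proportion of 'blue' in each."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    proportions = []
    for _ in range(n_bags):
        blue_count = sum(rng.random() < p for _ in range(bag_size))
        proportions.append(blue_count / bag_size)
    return proportions


p_blue = 0.20    # hypothetical population proportion of blue candies
bag_size = 50    # hypothetical number of candies per bag
proportions = simulate_sample_proportions(p_blue, bag_size, n_bags=2000)

simulated_mean = statistics.mean(proportions)
simulated_sd = statistics.stdev(proportions)
theoretical_se = (p_blue * (1 - p_blue) / bag_size) ** 0.5

print(f"mean of sample proportions: {simulated_mean:.3f}")
print(f"SD of sample proportions:   {simulated_sd:.3f}")
print(f"theoretical standard error: {theoretical_se:.3f}")
```

Individual bags vary quite a bit, yet the sample proportions center on the population value, and their spread sits close to the theoretical standard error. Rerunning with a larger `bag_size` should show that spread shrinking.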

Hypothesis Testing: The sages and I spent the month of October and much of November discussing whether Hypothesis Testing is critical and, if so, how best to tackle the teaching of this complex topic. Not surprisingly, McKenzie identified hypothesis testing as one of the two most difficult concepts to teach in applied statistics (the other being sampling distribution). Though there may be several published articles arguing that hypothesis testing is no longer a critical concept to teach, the individuals who were surveyed for McKenzie’s work certainly consider it to be a critical concept.

Data Collection Methods: Though I have said to my students more times than I can count, “the quality of our statistics is limited by the quality of our sample,” I must admit to being a bit surprised that this was considered critical by others, especially since when I look at many undergraduate statistics textbooks, data collection methods are barely mentioned. Kiess and Green’s (2010) Statistical Concepts for the Behavioral Sciences, 4/e, certainly tackles the issue of data collection methods.

Association vs. Causality: This core concept makes me smile, as often when I meet someone for the first time, and they ask me what I do … my response is often met with one of two comments … “Oh, I hated statistics” or “Correlation does not mean causation.” It’s kind of like my German: I took the class for three years, can still recall how to greet a person, and yet remember so little else. We, as applied statisticians, certainly engrave this concept into the minds of our students, but I’m sure most of you are like me, hoping students get more than a “pat phrase” out of our classes.

Significance (Statistical vs. Practical): This is a critical concept in applied statistics and one that is probably not mentioned in theoretical statistics classes. Sure, we delineate a mark at which we have to say … these results are too extreme for us to attribute them to “chance” … but just because we found a statistically significant difference doesn’t mean it’s a difference that truly matters. In applied statistics, it’s not enough to understand how statistical significance works; one must also be able to interpret the results to determine practical difference. I must admit to not covering this core concept to the same extent I cover the others.
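One way to show students the gap between the two kinds of significance is to hold a trivially small difference fixed and let the sample size grow. The sketch below uses a two-sample z-test with a known standard deviation; the half-point difference, the standard deviation of 15, and the sample sizes are all hypothetical values I picked for illustration.

```python
import math


def two_sample_z(mean_diff, sd, n_per_group):
    """Two-sided z-test for a difference in means, plus Cohen's d as an effect size."""
    standard_error = sd * math.sqrt(2 / n_per_group)
    z = mean_diff / standard_error
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail area under the normal curve
    cohens_d = mean_diff / sd                   # does not depend on sample size
    return z, p_value, cohens_d


# Hypothetical scenario: groups differ by half a point on a scale with SD = 15.
for n in (100, 100_000):
    z, p, d = two_sample_z(mean_diff=0.5, sd=15, n_per_group=n)
    print(f"n per group = {n:>7}: z = {z:.2f}, p = {p:.3g}, Cohen's d = {d:.3f}")
```

The effect size is identical in both rows and is tiny by any conventional standard. Only the sample size changes, and that alone moves the result from clearly non-significant to overwhelmingly significant, which is exactly why a small p-value cannot stand in for a judgment about whether the difference matters.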

As I think of other “critical concepts,” they tend to be a bit more specific and fall under the larger concepts listed above (e.g., understanding what a standard deviation can tell us clearly falls under the concept of variability). I invite all of you to comment on what concepts, if any, are missing from this list.