Are multiple-choice exams easier for economics students? A comparison of multiple-choice and 'equivalent' constructed-response exam questions.

Author: Chan, Nixon
Position: Targeting Teaching
  1. Introduction

    Multiple-choice (MC) exams are very popular in economics, especially at the principles level. Becker and Watts (2001) use survey data to estimate that MC questions account on average for about 45% of total assessment at this level, while Siegfried and Kennedy (1995) use the Test of Understanding of College Economics (TUCE) data to place this figure at about 67%. Constructed-response (CR) exams, in which students are asked to construct an answer rather than choose from among a set of possible answers, are the main alternative to MC exams. Considerable research has investigated whether MC exams essentially measure the same thing as CR exams; the methodology of this research can loosely be described as asking whether MC and CR scores are highly correlated. High correlation implies that an instructor could use one type of exam and be confident that, had the other type of exam been used, the ranking of the students would be largely unaffected. But it is possible for two exam results to be highly correlated, with scores on one exam much higher than scores on the other. Consequently, although the students would be ranked appropriately, the exam results may give a misleading impression concerning students' level of understanding of the material being examined.

    To our knowledge, no research has investigated the issue of whether the question format affects the level of student scores in economics exams. The purpose of this article is to ask whether, as many would speculate, students score higher on MC exams (after correcting for guessing) than on "equivalent" CR exams. For many instructors, it is performance on CR exams, not MC exams, that measures how well students understand economics and whether they can apply this understanding. Katz, Bennett, and Berger (2000, p. 55) articulate this well: "Constructed response items are preferred over multiple-choice by many in the education community because the former are believed to measure more important skills, be more relevant to applied decision making, better reflect changing social values, and have more positive social consequences." If this view is accepted, so that CR exam scores are interpreted as the true reflection of students' understanding of and ability to apply economics, then students who in fact score higher on "equivalent" MC exams would leave both instructors and students unjustifiably complacent about their teaching success and understanding of economics, respectively, an undesirable state of affairs. By reporting our findings, we hope to sensitize instructors to this phenomenon.

    Section 2 reviews the literature in this area and discusses the relevant educational theory. Section 3 describes the experiment we undertook to produce our data, and section 4 reports the empirical results. Section 5 reanalyzes the data to investigate whether our results are the same for males as for females, and section 6 summarizes results from a similar investigation of the performance of "good" versus "poor" students. Section 7 concludes.

  2. Theory and Literature Review

    Ever since Robert Yerkes tested a million World War I recruits with his multiple-choice Alpha Army Intelligence Test, there has been controversy concerning the relative merits of MC and CR tests. In economics, Walstad (1998) provides an excellent summary of the advantages of MC testing, arguing that these tests have low grading costs, provide timely feedback, are free from scoring bias, allow a wider sampling of course content, produce less measurement error, and are highly correlated with CR test scores. Welsh and Saunders (1998) defend the essay test for economics, noting that it can assess and develop higher level cognitive skills, encourage the development of writing skills, elicit students' opinions and attitudes, and lower test preparation costs. The Katz, Bennett, and Berger (2000, p. 55) quote given earlier summarizes the case for CR.

    In the education literature, research on MC versus CR appears under the rubric of format effects. There are three main streams of research. The most prominent stream addresses the issue of whether MC and CR test scores are essentially measuring the same thing. Wainer and Thissen (1993, p. 116) summarize this work, concluding that "A natural conclusion to reach from the weightings associated with constructed-response tests versus multiple-choice questions is that the former take more examinee time and resources to measure essentially the same thing more poorly than the latter." Walstad and Becker (1994) endorse this view using data on economics students, whereas Becker and Johnston (1999) offer a contrary view, also using data on economics students. Kennedy and Walstad (1997) go beyond traditional statistical analysis of data on economics students to examine implications for an explicit objective function (minimizing grading errors), an approach they describe as an economist's view. They conclude that the statistical correlations upon which the standard literature in this area is based mask significant grading errors for a small number of students.

    A second stream of research in this area is concerned with how to scale or link MC and CR scores so as to create a single total score on an exam consisting of both types of questions, or to compare scores from one type of exam with those of another. Sykes and Yen (2000) and Tate (2000) are recent examples; to our knowledge, there is no similar research in economic education.

    The third stream of research in this area, the branch to which the research reported in this article belongs, addresses the issue of whether questions in one format are more difficult than "equivalent" questions in the other format. Research in this branch is limited, mainly because, as Traub (1993, p. 30) suggests, a meaningful comparison of difficulty requires that the true-score scales of the MC and CR instruments must be equivalent, something he claims is difficult if not impossible to demonstrate. Katz, Bennett, and Berger (2000) is a recent example of work in this area and provides a good literature review. They begin (p. 39) by stating that "Researchers have frequently noted that some items are more difficult in the constructed response (CR) format than in the multiple-choice (MC) format, while format does not affect performance on other items. Yet little is known about the mechanism by which response format affects item difficulty."

    We address this question by confining our analysis to two specific contexts, typical of the literature in this area. First, we examine only one type of CR question. A CR question is any question requiring the examinee to generate an answer rather than select from a small set of options. Such answers can range from producing a word or phrase to writing a lengthy essay. Snow (1993, p. 48) provides a taxonomy of CR question types. Our CR questions all fall into Snow's most basic CR category, namely generation of a short sentence or phrase to answer a question. Despite this apparent restriction, it could be argued that our study is more representative of the flavor of constructed response than are existing studies in the literature. The two most prominent studies in the literature, those of Bridgeman (1992) and Katz, Bennett, and Berger (2000), both use mathematics questions for which the CR answer is a number.

    Second, following the existing literature, we match MC and CR questions with identical stems so that students are faced with "equivalent" questions but in different formats. In the MC version, they choose from four suggested answers; in the CR version, they must produce the correct answer on their own. In both variants, scoring is on a right/wrong basis--no part marks are available, arbitrarily ruling out one of the possible advantages of CR.

    These two features of our data are the basis for our claim that we have created a fair comparison between MC and CR questions, allowing us to examine in an unbiased fashion the influence of the format effect, so long as suitable adjustment is made for the possibility of guessing in the MC format.
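    To make the guessing adjustment concrete (the exact formula applied in section 4 is not reproduced in this excerpt, so what follows is the standard textbook correction rather than necessarily the article's own), a k-option MC score can be corrected as

    \[ S_{\text{corrected}} = R - \frac{W}{k-1}, \]

    where R is the number of questions answered correctly and W the number answered incorrectly. With the four-option questions used here (k = 4), each wrong answer costs one-third of a mark, so a student who guesses at random on every question has an expected corrected score of zero.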

    Snow (1993, p. 51) claims that, in general, students perceive MC tests to be easier than CR tests. Bridgeman (1992, p. 269) reports that 81% of the students he surveyed preferred MC, 11% preferred CR, and 8% were indifferent. Kennedy and Walstad (1997) report the results of a 1995 survey at the University of Nebraska, indicating that about 70% of economics principles students believe that MC tests are easier than CR tests. Why might this be the case? Several reasons why MC tests might be easier are possible, any of which could explain the empirical results we report later.

    One reason is that MC questions sometimes have options that are simply not credible, so that a student guessing will produce a reasonable score, even after standard penalties for guessing are imposed. (1) We have made every effort in this study to use MC questions for which the distracters are all credible, but this is undoubtedly a drawback of the MC format, given that not all instructors are diligent in this respect. Indeed, one of Bridgeman's (1992, p. 269) conclusions is that "Format effects appeared to be particularly large when the multiple-choice options were not an accurate reflection of the errors actually made by students." In such a case, students get useful feedback when their initial answer does not appear among the MC options, rendering the MC questions "easier."

    A second reason why MC questions may be thought easier is that, in some MC questions, particularly mathematics questions, it may be possible to work backward from the MC answers to figure out the correct answer. This problem-solving strategy is not available in the CR format.

    The theory of format effects in the education literature, as summarized by Traub (1993, pp. 39-42), provides further insight into this issue. Questions...
