The advantages and drawbacks of multiple choice test technique

Philological Sciences / Methods and techniques for assessing foreign language proficiency

O. V. Bondar

Khmelnitsky National University, Ukraine

Test techniques can be defined as means of eliciting behaviour from candidates which will tell us about their language abilities. What we need are techniques which:

1) will elicit behaviour which is a reliable and valid indicator of the ability in which we are interested;

2) will elicit behaviour which can be reliably scored;

3) are as economical of time and effort as possible;

4) will have a beneficial backwash effect.

One technique which has been recommended and used for the testing of many language abilities is multiple choice. Its items take many forms, but their basic structure is as follows. There is a stem:

Jack should _____ harder.

and a number of options, one of which is correct, the others being distractors.

A. studies

B. to study

C. study

D. studying

It is the candidate's task to identify the correct or most appropriate option (in this case C). Perhaps the most obvious advantage of multiple choice is that scoring can be perfectly reliable. Scoring should also be rapid and economical. A further considerable advantage is that, since the candidate has only to make a mark on the paper in order to respond, more items can be included in a given period of time than would otherwise be possible. This is likely to make for greater test reliability.
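The link between the number of items and reliability can be illustrated numerically. The text does not name a formula; the sketch below uses the Spearman-Brown prophecy formula, which is one standard way of estimating the effect, and it assumes that the added items are comparable in quality to the existing ones. The function name and the figures in the example are illustrative only.

```python
def predicted_reliability(current_reliability: float, length_factor: float) -> float:
    """Spearman-Brown estimate of reliability after changing test length.

    current_reliability: reliability coefficient of the existing test (0 to 1).
    length_factor: new number of items divided by the old number of items.
    Assumption: the added items behave like the existing ones.
    """
    r = current_reliability
    k = length_factor
    return (k * r) / (1 + (k - 1) * r)


# Illustrative figures: a test with reliability 0.6 is doubled in length.
doubled = predicted_reliability(0.6, 2)
print(round(doubled, 2))
```

On these assumed figures the estimate rises from 0.6 to 0.75, which is the sense in which fitting more items into the same testing time may improve reliability.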

The advantages of the multiple choice technique were so highly regarded at one time that it almost seemed that it was the only way to test. While many laymen have always been sceptical of what could be achieved through multiple choice testing, it is only fairly recently that the technique’s limitations have been more generally recognised by professional testers. The difficulties with multiple choice are as follows.

The technique tests only recognition knowledge

If there is a lack of fit between at least some candidates’ productive and receptive skills, then performance on a multiple choice test may give a quite inaccurate picture of those candidates’ ability. A multiple choice grammar test score, for example, may be a poor indicator of someone’s ability to use grammatical structures. The person who can identify the correct response in the item above may not be able to produce the correct form when speaking or writing. This is in part a question of construct validity: whether or not grammatical knowledge of the kind that can be demonstrated in a multiple choice test underlies the productive use of grammar. Even if it does, there is still a gap to be bridged between knowledge and use; if use is what we are interested in, that gap will mean that test scores are at best giving incomplete information.

Guessing may have a considerable but unknowable effect on test scores

The chance of guessing the correct answer in a three-option multiple choice item is one in three, or roughly thirty-three percent. On average we would expect someone to score 33 on a 100-item test purely by guesswork. We would expect some people to score fewer than that by guessing, others to score more. The trouble is that we can never know what part of any particular individual’s score has come about through guessing. Attempts are sometimes made to estimate the contribution of guessing by assuming that all incorrect responses are the result of guessing, and by further assuming that the individual has had average luck in guessing. Scores are then reduced by the number of points the individual is estimated to have obtained by guessing. While other testing methods may also involve guessing, we would normally expect the effect to be much less, since candidates will usually not have a restricted number of responses presented to them (with the information that one of them is correct).
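The arithmetic above can be sketched in a few lines. The correction shown is the usual form of the procedure the paragraph describes: if every wrong answer is assumed to be a guess and luck is assumed to be average, then for every (number of options minus one) wrong guesses there should have been one lucky right guess. The function names and the sample figures are assumptions made for illustration, and omitted items are simply left out of the count.

```python
from fractions import Fraction


def expected_chance_score(n_items: int, n_options: int) -> Fraction:
    """Average score expected from blind guessing on every item."""
    return Fraction(n_items, n_options)


def corrected_score(n_right: int, n_wrong: int, n_options: int) -> Fraction:
    """Score after subtracting the points estimated to come from guessing.

    Assumes every wrong answer was a guess and that guessing luck was
    average, so n_wrong / (n_options - 1) right answers are treated as lucky.
    """
    return n_right - Fraction(n_wrong, n_options - 1)


# 100 three-option items answered by pure guesswork: about 33 expected.
print(float(expected_chance_score(100, 3)))

# Illustrative candidate: 60 right and 40 wrong on three-option items.
print(corrected_score(60, 40, 3))
```

For the illustrative candidate the corrected score is 40, since 20 of the 60 correct answers are attributed to guessing. The correction rests entirely on the two assumptions named above, which is why the true contribution of guessing to any individual score remains unknown.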

The technique severely restricts what can be tested

The basic problem here is that multiple choice items require distractors, and distractors are not always available. In a grammar test, it may not be possible to find three or four plausible alternatives to the correct structure. The result is that command of what may be an important structure is simply not tested. An example would be the distinction between the past tense and the present perfect. Certainly for learners at a certain level of ability, in a given linguistic context, there are no other alternatives that are likely to distract. The argument that this must be a difficulty for any item that attempts to test for this distinction is difficult to sustain, since other items that do not overtly present a choice may elicit the candidate’s usual behaviour, without the candidate resorting to guessing.

It is difficult to write successful items

A further problem with multiple choice is that, even where items are possible, good ones are extremely difficult to write. Professional test writers have to write many more items than they actually need for a test, and it is only after pretesting and statistical analysis of performance on the items that they can recognise the ones that are usable. It is my experience that multiple choice tests that are produced for use within institutions are often shot through with faults. Common amongst these are:

1) more than one correct answer;

2) no correct answer;

3) clues in the options as to which is correct (for example, the correct option may be different in length to the others);

4) ineffective distractors.

The amount of work and expertise needed to prepare good multiple choice tests is so great that, even if one ignored other problems associated with the technique, one would not wish to recommend it for regular achievement testing (where the same test is not used repeatedly) within institutions.

Savings in time for administration and scoring will be outweighed by the time spent on successful test preparation.

Backwash may be harmful

Where a test that is important to students is multiple choice in nature, there is a danger that practice for the test will have a harmful effect on learning and teaching. Practice at multiple choice items (especially when, as happens, as much attention is paid to improving one’s educated guessing as to the content of the items) will not usually be the best way for students to improve their command of a language.

Cheating may be facilitated

The fact that the responses on a multiple choice test (a, b, c, d) are so simple makes them easy to communicate to other candidates nonverbally. Some defence against this is to have at least two versions of the test, the only difference between them being the order in which the options are presented.
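Producing such versions is easy to automate. The following is a minimal sketch, assuming that each item is stored as a stem, a list of options, and the position of the correct option; this data layout and the function name make_version are hypothetical and chosen only for illustration.

```python
import random


def make_version(items, seed):
    """Return a copy of the test with the options of each item reordered.

    items: list of (stem, options, correct_index) tuples (assumed layout).
    seed: any value; a different seed gives a different version.
    """
    rng = random.Random(seed)
    version = []
    for stem, options, correct_index in items:
        order = list(range(len(options)))
        rng.shuffle(order)
        shuffled = [options[i] for i in order]
        # Track where the correct option ended up, so the key stays valid.
        new_correct = order.index(correct_index)
        version.append((stem, shuffled, new_correct))
    return version


test_items = [
    ("Jack should _____ harder.", ["studies", "to study", "study", "studying"], 2),
]

version_a = make_version(test_items, seed="A")
version_b = make_version(test_items, seed="B")

for stem, options, correct in version_a:
    print(stem, options, "key:", "ABCD"[correct])
```

Each version carries its own answer key, so scoring remains as reliable as before. A random shuffle can occasionally reproduce the original order for an individual item, so the versions should be checked before use.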

All in all, the multiple choice technique is best suited to relatively infrequent testing of large numbers of candidates. This is not to say that there should be no multiple choice items in tests produced regularly within institutions. In setting a reading comprehension test, for example, there may be certain tasks that lend themselves very readily to the multiple choice format, with obvious distractors presenting themselves in the text. There are real-life tasks (say, a shop assistant identifying which one of four dresses a customer is describing) which are essentially multiple choice. The simulation in a test of such a situation would seem to be perfectly appropriate. What the reader is being urged to avoid is the excessive, indiscriminate, and potentially harmful use of the technique.
