The Dilemmas of Language Assessment

I would like to share my experiences with language assessment in my current instructional context: an intensive English program for international students at a large language school in Miami, Florida. We conduct formal assessments at two points. At the beginning of each student’s enrollment, for initial placement purposes, we administer a full-length simulated TOEIC (a norm-referenced standardized test consisting of two hundred selected-response items measuring listening and reading comprehension skills). At the end of each academic quarter, we administer a comprehensive assessment consisting of three parts: a criterion-referenced achievement test (containing selected-response and constructed-response items that measure attainment of the course’s learning objectives across all four language skills), another full-length simulated TOEIC, and an oral proficiency interview (OPI), a speaking performance assessment scored on the ILR proficiency scale. We spend so much time, energy, and resources conducting these three different types of assessments primarily because we are required to do so by the national accrediting agency to which we must answer. (I am personally ambivalent about the value of spending so much time on formal, end-of-quarter testing; by the end of the year, most students are more than “tested out.”) The accrediting agency’s standards require language schools to conduct regular language assessments that are valid and reliable, so we use the TOEIC because, at least according to ETS, the test maker, it is among the most valid and reliable norm-referenced assessment instruments available. In the past, we tried using the TOEFL iBT, but that test is too complicated to administer en masse (it requires computers, whereas the TOEIC is an entirely paper-based test), and its constructed-response sections (speaking and writing) cannot be scored reliably without professional, trained raters. The same goes for the IELTS.

Although we employ three distinct assessment instruments for our quarterly testing, only the criterion-referenced achievement test is used to determine whether students pass or fail the course. (We do not believe it is fair to use norm-referenced tests to award final course grades, since such tests are designed to produce a normal distribution of scores; we believe that all students should have the opportunity to pass the course provided that they can demonstrate minimally acceptable attainment of the course objectives, which are defined and explained to students at the beginning of the course.) All students are also assessed informally throughout the course based on their ongoing in-class performance and their completion of homework, assignments, and language lab activities and exercises, which include a wide range of item types (selected-response, constructed-response, and personal-response) and assessment tools (traditional and alternative), although none of these is formally graded.

Despite the apparent robustness of this assessment protocol, we have been disappointed with the results of these assessments, especially the TOEIC, which despite its supposed validity has proven almost useless for us. The TOEIC’s test maker claims that TOEIC scores provide a valid and reliable measure of an examinee’s ability to use English in the workplace. (Since our program is general rather than academic in nature, this test is better aligned with our course’s goals than the TOEFL or IELTS, which mainly assess the examinee’s ability to use English in academic settings.) However, it has been our experience that the TOEIC substantially overestimates the test taker’s language proficiency; we have had students achieve high TOEIC scores who could not communicate effectively in English at all. In my opinion, part of the problem is that the TOEIC measures only receptive skills, yet our students must be able to use the language to communicate in authentic contexts, which obviously requires production. Unfortunately, our school’s achievement tests (which are provided to us by our international headquarters in Switzerland) are only marginally better than the TOEIC. For one thing, the minimum passing score is only fifty percent according to the published scoring guide, although our school unilaterally raised it to sixty percent for several reasons, including methodological ones that are too complicated to go into here. At least the achievement tests have the virtue of being aligned with the course content and of assessing all four language skills. We (and many other language schools in the United States accredited by the same agency) have been forced into this situation by accreditation rules and the mandatory institutional accreditation required by a federal law that went into effect in 2010.
So, like Pete from this week’s article, I long to break free of this administrative assessment mess and instead use what I consider more meaningful and authentic forms of assessment (I personally favor performance assessments), but our hands are tied by the legally enforced accreditation rules, which require that we use “nationally accepted language assessment instruments.”