In 2018 most students in most schools will move to an online environment for NAPLAN. This means that students will complete all test sections on a computer or tablet. Because the test data is entirely digital, it can be processed more rapidly, and results will be available to schools, systems and families much faster.
The implication is that the results can be put to use to assist students with their learning, and teachers with their planning. While this appears to address one of the persistent criticisms of NAPLAN – the lag between testing and results – other questions still need to be asked about NAPLAN. Continuing concerns include high stakes contexts and perverse effects (Lingard, Thompson & Sellar, 2016), the marketization of schooling (Ragusa & Bousfield, 2017), the hijacking of curriculum (Polesel, Rice & Dulfer, 2014) and the questionable value of NAPLAN for deep learning (beyond test performance).
Almost ten years after its introduction, NAPLAN has been normalised in Australian schooling. Despite some tweaking around the edges, the original assessment architecture remains intact. However, the move to online delivery and automated marking represents a seismic shift that demands urgent attention.
Most NAPLAN questions are closed items. In the new online format these include multiple choice, checkboxes, drag and drop, reordering of lists, hot text, lines that can be drawn with a cursor and short answer text boxes. Responses of these types are easily machine-scored, and have been (via optical mark recognition on paper tests) since NAPLAN was introduced.
However the NAPLAN writing task, requiring students to produce an extended original essay in response to an unseen prompt, has always been marked by trained human markers. Markers apply a detailed rubric of ten criteria: audience, text structure, ideas, persuasive devices, vocabulary, cohesion, paragraphing, sentence structure, punctuation and spelling. In years when narrative writing is allocated, the first four criteria differ, but the remaining six remain the same. Scores are allocated for each criterion, using an analytic marking approach which assumes that writing can be effectively evaluated in terms of its separate components.
It is important to stress that online marking by trained and highly experienced teachers is already a feature of high stakes assessment in Australia. In NSW, for example, HSC exams are marked by teachers via an online secure portal according to HSC rubrics. The professional learning that teachers experience through their involvement in such processes is highly valued, with the capacity to enhance their teaching of HSC writing in their own schools.
Moving to automated marking (called AES or Automated Essay Scoring by ACARA, also called machine-marking, computer marking or robo-marking) as NAPLAN proposes is completely different from online marking by teachers. While the rubric will remain the same, judgement against all these criteria will be determined by algorithms, pre-programmed into software developed by Pearson, the vendor that was granted the contract. Algorithms cannot “read” for sense, style, context or overall effectiveness in the ways that human experts can. All they can do is count, match patterns, and apply proxy measures to estimate writing complexity.
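To make the point concrete, a toy sketch of proxy-measure scoring is shown below. This is purely illustrative and bears no relation to Pearson's actual (proprietary) software: the feature names and weights are invented for the example. It shows how a scorer can produce a number from surface counts alone, without any grasp of meaning.

```python
import re

def surface_features(essay: str) -> dict:
    """Extract the kinds of shallow proxy measures an automated
    scorer can count; none of these 'read' the essay for meaning."""
    words = re.findall(r"[A-Za-z']+", essay)
    sentences = [s for s in re.split(r"[.!?]+", essay) if s.strip()]
    n_words = len(words)
    return {
        "word_count": n_words,
        "avg_sentence_length": n_words / max(len(sentences), 1),
        # vocabulary diversity: unique words over total words
        "type_token_ratio": len({w.lower() for w in words}) / max(n_words, 1),
        # share of 'long' words, a crude proxy for sophisticated vocabulary
        "long_word_share": sum(len(w) >= 7 for w in words) / max(n_words, 1),
    }

# Hypothetical weights for illustration only. A real AES system fits
# such weights by regression against human-marked training scripts.
WEIGHTS = {
    "word_count": 0.01,
    "avg_sentence_length": 0.1,
    "type_token_ratio": 2.0,
    "long_word_share": 3.0,
}

def proxy_score(essay: str) -> float:
    """Combine surface features into a single 'score'."""
    feats = surface_features(essay)
    return sum(WEIGHTS[name] * feats[name] for name in WEIGHTS)
```

Note what follows from this design: a long string of varied, polysyllabic nonsense would score at least as well as short, coherent prose, since every feature rewards volume and vocabulary rather than sense. This is precisely the weakness behind the “verbose high scoring gibberish” that Perelman describes.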
ACARA’s in-house research (ACARA NASOP Research Team, 2015) insists on the validity and reliability of the software. However, a recent external evaluation of ACARA’s Report is scathing. The evaluation (Perelman, 2017), commissioned by the NSW Teachers’ Federation from a prominent US expert, argues that ACARA’s research is poorly designed and executed. ACARA would not supply the data or software to Perelman for independent examination. However it is clear that AES cannot assess key aspects of writing including audience, ideas and logic. It is least effective for analytic marking (the NAPLAN approach). It may be biased against some linguistic groups. It can easily be distorted by rewarding “verbose high scoring gibberish” (Perelman, 2017, 6). The quality of data available to teachers is unlikely to improve and may lead to perverse effects as students learn to write for robots. The risk of ‘gaming’ the test is likely to be higher than ever, and ‘teaching to the test’ will take on a whole new dimension.
Human input has been used in ACARA’s testing of AES in order to train and calibrate the software and in the future will be limited to reviewing scripts that are ‘red-flagged’ by the software. In 2018 ACARA plans to use both human and auto-marking, and to eliminate humans almost entirely from the marking process by 2019. In effect, this means that evaluation of writing quality will be hidden in a ‘black box’ which is poorly understood and kept at a distance from educational stakeholders.
The major commercial beneficiary, Pearson, is the largest edu-business in the world. Educational assessment in the UK, US and now Australia is central to its core business. Details of the contract and increased profits that will flow from the Australian government to Pearson from the automated marking of writing are not publicly available. Pearson has already been involved in NAPLAN, as several states contracted Pearson to recruit and train NAPLAN markers. Pearson have been described as a “vector of privatisation” (Hogan, 2016, 96) in Australian education, an example of the blurring of social good and private profit, and the shifting of expertise from educators and researchers to corporations.
Writing is one of the most complex areas of learning in schools. NAPLAN results show that it is the most difficult domain for schools to improve. Despite the data that schools already have, writing results have flatlined through the NAPLAN decade. Negative effects and equity gaps have worsened in the secondary years. The pattern of “negative accelerating change” (Wyatt-Smith & Jackson, 2016, 233) in NAPLAN writing requires a sharper focus on writing standards and greater support for teacher professional learning. What will not be beneficial is further narrowing the scope of what can be recognised as effective writing, artfully designed and shaped for real audiences and purposes in the real world.
NAPLAN writing criteria have been criticised as overly prescriptive, so that student narratives demonstrating creativity and originality (Caldwell & White, 2017) are penalised, and English classrooms are awash with formulaic repetitions (Spina, 2016) of persuasive writing NAPLAN-style. Automated marking may generate data faster, but the quality and usefulness of the data cannot be assumed. Sustained teacher professional learning and capacity building in the teaching of writing – beyond NAPLAN – will be a better investment in the long term. Until then, the major beneficiaries of online marking may be the commercial interests invested in its delivery.
ACARA NASOP Research Team (2015). An evaluation of automated scoring of NAPLAN Persuasive Writing. Available at: http://nap.edu.au/_resources/20151130_ACARA_research_paper_on_online_automated_scoring.pdf
Caldwell, D. & White, P. (2017). That’s not a narrative; this is a narrative: NAPLAN and pedagogies of storytelling. Australian Journal of Language and Literacy, 40(1), 16-27.
Hogan, A. (2016). NAPLAN and the role of edu-business: New governance, new privatisations and new partnerships in Australian education policy. Australian Educational Researcher, 43(1), 93-110.
Lingard, B., Thompson, G. & Sellar, S. (2016). National Testing in Schools: An Australian Assessment. London & New York: Routledge.
Polesel, J., Rice, S. & Dulfer, N. (2014). The impact of high-stakes testing on curriculum and pedagogy: a teacher perspective from Australia. Journal of Education Policy, 29(5), 640-657.
Ragusa, A. & Bousfield, K. (2017). ‘It’s not the test, it’s how it’s used!’ Critical analysis of public response to NAPLAN and MySchool Senate Inquiry. British Journal of Sociology of Education, 38(3), 265-286.
Wyatt-Smith, C. & Jackson, C. (2016). NAPLAN data on writing: A picture of accelerating negative change. The Australian Journal of Language and Literacy, 39(3), 233-244.