Category Archives: Directions in Education

Who benefits from online marking of NAPLAN writing?

By Susanne Gannon

In 2018 most students in most schools will move to an online environment for NAPLAN. This means that students will complete all test sections on a computer or tablet. Test data that is entirely digital can be turned around more rapidly so that results will be available for schools, systems and families much faster.

The implication is that the results can be put to use to assist students with their learning, and teachers with their planning. While this appears to address one of the persistent criticisms of NAPLAN – the lag between testing and results – other questions still need to be asked about NAPLAN. Continuing concerns include high stakes contexts and perverse effects (Lingard, Thompson & Sellar, 2016), the marketisation of schooling (Ragusa & Bousfield, 2017), the hijacking of curriculum (Polesel, Rice & Dulfer, 2014) and the questionable value of NAPLAN for deep learning (beyond test performance).

Almost ten years after its introduction, NAPLAN has been normalised in Australian schooling. Despite some tweaking around the edges, the original assessment architecture remains intact. However, the move to online delivery and automated marking represents a seismic shift that demands urgent attention.

Most NAPLAN test items are closed questions. In the new online format these include multiple choice, checkbox, drag and drop, reordering of lists, hot text, lines that can be drawn with a cursor and short answer text boxes. Answers of these types are easily scored by optical recognition software, and have been since NAPLAN was introduced.

However, the NAPLAN writing task, requiring students to produce an extended original essay in response to an unseen prompt, has always been marked by trained human markers. Markers apply a detailed 10-point rubric addressing: audience, text structure, ideas, persuasive devices, vocabulary, cohesion, paragraphing, sentence structure, punctuation and spelling. In years when narrative writing is allocated, the first four criteria differ; however, the remaining six stay the same. Scores are allocated for each criterion, using an analytic marking approach which assumes that writing can be effectively evaluated in terms of its separate components.

It is important to stress that online marking by trained and highly experienced teachers is already a feature of high stakes assessment in Australia. In NSW, for example, HSC exams are marked by teachers via an online secure portal according to HSC rubrics. The professional learning that teachers experience through their involvement in such processes is highly valued, with the capacity to enhance their teaching of HSC writing in their own schools.

Moving to automated marking (called AES or Automated Essay Scoring by ACARA, and also known as machine-marking, computer marking or robo-marking), as proposed for NAPLAN, is completely different from online marking by teachers. While the rubric will remain the same, judgement of all these criteria will be determined by algorithms, pre-programmed into software developed by Pearson, the vendor that was granted the contract. Algorithms cannot “read” for sense, style, context or overall effectiveness in the ways that human experts can. All they can do is count, match patterns, and apply proxy measures to estimate writing complexity.
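To make “count, match patterns, and apply proxy measures” concrete, here is a minimal Python sketch of a purely surface-level scorer. It is a toy written for illustration only: the function names, the chosen features and the weighting are all assumptions, and it is not the Pearson software or any method described by ACARA. It simply shows how a rule that rewards long words and long sentences can rate nonsense above a clear, simple statement.

```python
import re

def proxy_features(text):
    """Count surface features of a text; nothing here depends on meaning."""
    words = re.findall(r"[A-Za-z']+", text)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return {
        "word_count": len(words),
        "avg_word_length": sum(len(w) for w in words) / len(words) if words else 0.0,
        "avg_sentence_length": len(words) / len(sentences) if sentences else 0.0,
    }

def toy_score(text):
    """Hypothetical scoring rule (an assumption): longer words and
    longer sentences are treated as signs of 'complex' writing."""
    f = proxy_features(text)
    return f["avg_word_length"] + 0.1 * f["avg_sentence_length"]

clear = "Dogs need daily walks. Walks keep them healthy and calm."
gibberish = ("Notwithstanding multitudinous paradigmatic considerations, "
             "quintessential extrapolation necessitates circumlocutory ramifications.")

# The meaningless sentence outscores the clear one on these proxies.
print(toy_score(clear), toy_score(gibberish))
```

A scorer like this never asks whether the text makes sense to a reader, which is the gap between counting features and reading for audience, ideas and logic.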

ACARA’s in-house research (ACARA NASOP Research Team, 2015) insists on the validity and reliability of the software. However, a recent external evaluation of ACARA’s report is scathing. The evaluation (Perelman, 2017), commissioned by the NSW Teachers’ Federation from a prominent US expert, argues that ACARA’s research is poorly designed and executed. ACARA would not supply the data or software to Perelman for independent examination. However, it is clear that AES cannot assess key aspects of writing including audience, ideas and logic. It is least effective for analytic marking (the NAPLAN approach). It may be biased against some linguistic groups. It can easily be distorted by rewarding “verbose high scoring gibberish” (Perelman, 2017, 6). The quality of data available to teachers is unlikely to improve and may lead to perverse effects as students learn to write for robots. The risk of ‘gaming’ the test is likely to be higher than ever, and ‘teaching to the test’ will take on a whole new dimension.

Human input has been used in ACARA’s testing of AES in order to train and calibrate the software and in the future will be limited to reviewing scripts that are ‘red-flagged’ by the software. In 2018 ACARA plans to use both human and auto-marking, and to eliminate humans almost entirely from the marking process by 2019. In effect, this means that evaluation of writing quality will be hidden in a ‘black box’ which is poorly understood and kept at a distance from educational stakeholders.

The major commercial beneficiary, Pearson, is the largest edu-business in the world. Educational assessment in the UK, US and now Australia is central to its core business. Details of the contract and increased profits that will flow from the Australian government to Pearson from the automated marking of writing are not publicly available. Pearson has already been involved in NAPLAN, as several states contracted Pearson to recruit and train NAPLAN markers. Pearson has been described as a “vector of privatisation” (Hogan, 2016, 96) in Australian education, an example of the blurring of social good and private profit, and the shifting of expertise from educators and researchers to corporations.

Writing is one of the most complex areas of learning in schools. NAPLAN results show that it is the most difficult domain for schools to improve. Despite the data that schools already have, writing results have flatlined through the NAPLAN decade. Negative effects and equity gaps have worsened in the secondary years. The pattern of “negative accelerating change” (Wyatt-Smith & Jackson, 2016, 233) in NAPLAN writing requires a sharper focus on writer standards and greater support for teacher professional learning. What will not be beneficial is further narrowing the scope of what can be recognised as effective writing, artfully designed and shaped for real audiences and purposes in the real world.

NAPLAN writing criteria have been criticised as overly prescriptive, so that student narratives demonstrating creativity and originality are penalised (Caldwell & White, 2017), and English classrooms are awash with formulaic repetitions (Spina, 2016) of persuasive writing NAPLAN-style. Automated marking may generate data faster, but the quality and usefulness of the data cannot be assumed. Sustained teacher professional learning and capacity building in the teaching of writing – beyond NAPLAN – will be a better investment in the long term. Until then, the major beneficiaries of online marking may be the commercial interests invested in its delivery.

References

ACARA NASOP Research Team (2015). An evaluation of automated scoring of NAPLAN Persuasive Writing. Available at: http://nap.edu.au/_resources/20151130_ACARA_research_paper_on_online_automated_scoring.pdf

Caldwell, D. & White, P. (2017). That’s not a narrative; this is a narrative: NAPLAN and pedagogies of storytelling. Australian Journal of Language and Literacy, 40(1), 16-27.

Hogan, A. (2016). NAPLAN and the role of edu-business: New governance, new privatisations and new partnerships in Australian education policy. Australian Educational Researcher, 43(1), 93-110.

Lingard, B., Thompson, G. & Sellar, S. (2016). National Testing in schools: An Australian Assessment. London & New York: Routledge.

Polesel, J., Rice, S. & Dulfer, N. (2014). The impact of high-stakes testing on curriculum and pedagogy: a teacher perspective from Australia. Journal of Education Policy, 29(5), 640-657.

Ragusa, A. & Bousfield, K. (2017). ‘It’s not the test, it’s how it’s used!’ Critical analysis of public response to NAPLAN and MySchool Senate Inquiry. British Journal of Sociology of Education, 38(3), 265-286.

Wyatt-Smith, C. & Jackson, C. (2016). NAPLAN data on writing: A picture of accelerating negative change. The Australian Journal of Language and Literacy, 39(3), 233-244.

 

Associate Professor Susanne Gannon is a senior researcher in the School of Education and Centre for Educational Research at Western Sydney University, Australia.

Man enough to study Physics? What do New South Wales Physics students say?

by Jessy Abraham

The proposed Stage 6 Physics curriculum for New South Wales (NSW) has been lauded as a “return to science” and has been welcomed by science-education experts who regard the current curriculum as ‘soft’ and a ‘diluted’ version of physics (Robinson & Armitage, 2017).

In her 2017 Australia Day address, Professor Michelle Simmons criticised the “feminisation” of physics in NSW (Fitzpatrick, 2017). The term “feminisation” refers to efforts by curriculum developers to make the current physics curriculum more appealing to girls by minimising rigorous mathematical problem-solving and replacing it with a qualitative approach. The new syllabus that will commence in 2018 will move away from this qualitative emphasis and the current ‘social-context’ approach to teaching physics and bring in a greater focus on content and quantitative rigour, including mandatory equation derivations and problem solving (Crook, 2017). Stronger emphasis will be given to learning scientific principles, theories and laws.

Topics with a descriptive nature, such as historical linkages and societal implications of scientific inventions, will be largely eliminated (Physics Stage 6 Syllabus, 2017). While this has been applauded by critics of the current syllabus and university-based physics educators, concerns about equity of access have also been raised. The concern is that an increase in quantitative rigour may lead to even sharper declines in physics enrolment numbers (Crook, 2017).

How valid is the belief that the ‘dumbing down’ of physics content, by replacing mathematical focus with the life stories of scientists, historical development and societal impacts of their inventions, will appeal more to female students? Are male students naturally better at, and more inclined towards, problem solving, experiments and mathematical applications? Such perceptions exacerbate the ubiquitous gender stereotypes regarding the ‘masculinity’ of physics.

Results of my study conducted among 247 Year 11 physics students (157 males and 90 females) from the Sydney metropolitan area did not support these claims. Male and female students who were continuing physics to Year 12 held high levels of interest value, performance perceptions and instrumental value (usefulness for personal career/study plans) in relation to physics, and there were no statistically significant differences for these values between the genders. Both genders displayed similar levels of high engagement with physics, and held low levels of stereotypes on the perceived masculinity of physics.

These observations were equally valid for students who were discontinuing physics, who possessed low levels of interest, performance perceptions and engagement with physics: they also held low stereotypical gender role beliefs. No significant gender differences were found. For the four modules in the current Year 11 physics curriculum, in the majority of instances there were no consistent differences in how male and female students perceived the achievement motivational factors explored in the study.

When students were asked to rate various Year 11 physics topics based on their interest value, no significant gender difference was identified. Both genders indicated higher than average levels of interest in learning laws of physics, problem solving, experiments, relating to real life situations, contributions to humanity and the abstract nature of physics. However, regarding the much-criticised topics such as ‘Lives of Scientists’ and ‘Historical Contexts of Inventions’, both genders displayed an equally marked lack of interest.

Likewise, both genders described physics as “interesting, challenging, yet satisfying, and something that relates to everyday life” (male student, comprehensive school). Furthermore, participants’ qualitative responses tended to reinforce traditional views on the expected nature of physics. Students reported that they expected more mathematically oriented content, and ‘crazy calculations to experiments’ (female student, selective school) when they enrolled in senior secondary physics. Nevertheless, the enacted curriculum had ‘too much language orientation’ (male student, selective school). They wanted to see ‘less literacy, more scientific content’ (male student, comprehensive school). In relation to the historical and social contexts of inventions, and descriptive topics like The Cosmic Engine (a topic on Astrophysics), the majority found these ‘boring, dull and not useful’ (male student, Catholic school). Interestingly, students gave a strong emphasis to the instrumental value of physics and tended to view the subject as preparation for STEM courses at university.

The results of my study support the argument that senior secondary physics students may prefer the content and quantitative analytical rigour proposed in the new curriculum and the removal of certain sections in the current curriculum. This endorses the changes prescribed in the new Stage 6 Physics syllabus. However, the popular misconception that ‘dumbing-it-down-for-females’ might increase its attractiveness was not supported. Issues around whether the new syllabus may aggravate inequity of access to physics will need to be examined once the implementation of the new syllabus begins.

References

Fitzpatrick, S. (2017, January 24). Feminisation of science a disaster, leading quantum physicist Michelle Simmons says. Retrieved from http://www.theaustralian.com.au/higher-education/feminisation-of-science-a-disaster-leading-quantum-physicist-michelle-simmons-says/news-story/8a432da4bce81e4fb51d91da9bf7a98b

Crook, S. (2017, February 22). New physics syllabus raises the bar, but how will schools clear it? Retrieved from https://theconversation.com/new-physics-syllabus-raises-the-bar-but-how-will-schools-clear-it-73370

NSW Syllabus for the Australian Curriculum. Physics Stage 6 Syllabus (2017). NSW Education Standards Authority (NESA). Retrieved from http://syllabus.nesa.nsw.edu.au/assets/physics_stage_6/physics-stage-6-syllabus-2017.pdf

Robinson, N. & Armitage, R. (2017, February 21). New South Wales HSC syllabus gets overhaul with more complex topic. Retrieved from http://www.abc.net.au/news/2017-02-21/nsw-hsc-syllabus-gets-radical-overhaul-year-12-teaching-changes/8288000

 

Dr Jessy Abraham is a lecturer in Primary science and technology curriculum in the School of Education at Western Sydney University, Australia.

(Un)necessary teachers’ work? Lessons from England.

by Susanne Gannon

Disembarking at Heathrow a few weeks ago, my first purchase in pounds, as always, was a copy of The Times to read on the train into the city. The second page headline, “CR (Creative Original): Grades on schoolwork replaced by codes” (Bennett, 2017), caught my eye. Skimming the article in my dazed, jetlagged state was not ideal for a critical reading, but I snapped a photo with my phone of the final paragraph:

“In 2014 the government asked teachers to tell them what created unnecessary work. Three big areas were marking, planning and data management.”

I recognise that the data deluge in schooling is now overwhelming, may be driven by externally imposed system imperatives and is not always put to use to improve student learning. However, I’ve spent my professional life as a secondary English teacher, tertiary teacher educator and researcher, and I could not see how “marking” and “planning” could be seen as “unnecessary work” for teachers.

Planning is surely at the heart of teachers’ work. Otherwise how do we claim our status as professionals? Ideally we don’t just wing it in the classroom, nor do we follow prescriptive scripts. Systematic, responsive, syllabus-informed planning of purposeful sequences of learning and meaningful resources is what makes the difference for individuals and groups of students. Well-selected and fine-grained data about student progress (not necessarily only the numerical data that is favoured by educational systems) should of course inform such planning as skilled teachers identify gaps and opportunities for extension and tailor their planning to their students’ needs and their potential.

Having high expectations and creating the conditions – through careful and ideally collaborative planning – for students to succeed and to excel are hallmarks of quality teachers. These features are characteristic of exemplary teaching in disadvantaged contexts (Lampert & Burnett, 2015; Munns, Sawyer & Cole, 2013). Careful planning need not preclude flexibility, creativity and authenticity in learning and assessment practices, but conversely may enable these qualities (Hayes, Mills & Christie, 2005; Reid, 2013). As many of these authors stress, good planning is often underpinned by a disposition of teachers to become researchers of learning within their own classrooms. Where teachers are provided some agency and capacity to gather and use data then problems are less likely to be at the low level of time consuming and potentially meaningless “data management” that is perceived as “unnecessary work” by teachers in England.

Marking is of course close to my heart as a secondary English teacher and I have spent countless hours of my life providing written feedback on student work. Whilst I have become adept at designing and using outcomes based rubrics / criteria sheets since their introduction in the mid-90s with outcomes based assessment and curriculum, I have always endeavoured to provide tailored and specific feedback to students on their texts.

This for me is “marking” as a process, and I think of it – in ideal circumstances – as a sort of dialogue on the page between student, text and teacher, and an opening towards further dialogue. It features in formative as well as summative assessment contexts (apart from exams). Now it features in the texts in progress that are thesis chapters for my current doctoral students. In a perfect world it is diagnostic, supportive, explicit and critical in combination and students will take heed. Portfolios, peer and self-assessment processes and tools can be incorporated. As Munns et al. (2013) describe, sharing assessment responsibility is an important component of the insider school. The volume and pressure of marking have always been problematic, however, when short timelines for results and sheer numbers of students across multiple classes work against ideal scenarios. My research into creative writing in secondary schools (e.g. Gannon, 2014) shows how English faculties were able to work collegially to support senior students as they developed major works in English. Marking, at best, can be rewarding, encouraging and useful for students and for teachers.

Where, then, does the aversion to marking come from for teachers in England? The article in The Times does not provide any pointers towards the government survey of 2014, but is rather an announcement of a large randomised controlled trial to be funded by the UK-based Education Endowment Foundation, based on a report reviewing written feedback on student work that they commissioned and recently published (Elliot et al., 2016). The opening of the executive summary of the report provides further detail:

[T]he 2014 Workload Challenge [UK] survey identified the frequency and extent of marking requirements as a key driver of large teaching workloads. The reform of marking policies was the highest workload-related priority for 53% of respondents. More recently, the 2016 report of the Independent Teacher Workload Review Group [UK] noted that written marking had become unnecessarily burdensome for teachers and recommended that all marking should be driven by professional judgement and ‘be meaningful, manageable and motivating’. (2016, 4)

Well, of course! What has gone wrong in England that marking is not driven by these qualities? Are there lessons for us in Australia (yet again from England) of what not to do in educational reform? Although the report acknowledges that there is very little evidence or research into written marking, its authors nevertheless identify some inefficient and apparently widespread practices: triple-marking, awarding grades for every piece of student work (so that the grades distract students from the feedback), too many texts required from students, marking excessive numbers of student texts, provision of low level corrections rather than requiring students to take some responsibility for corrections/improvements, and moving on without giving students time to process and respond to feedback.

Despite the caveat in the opening section, the report is worth reading in full (though it has been criticised locally, e.g. Didau, 2016). Secondary teachers are much more inclined to put a grade on every piece of student work, the authors say (2016, 9). Unsurprisingly, offering clear advice on how a student may improve their work in a particular dimension seems to be more useful than broad comments (‘Good work!’) or excessively detailed and overwhelming commentary (2016, 13). Targets or personalised and specific “success criteria” may be effective, particularly where students are involved in establishing them (2016, 20; also see Munns et al., 2013).

It is in this part of the Report that the overall logic of the newspaper article becomes apparent. Buried well down into the subsection on “Targets” is the following comment:

Writing targets that are well-matched to each student’s needs could certainly make marking more time-consuming. One strategy that may reduce the time taken to use targets would be to use codes or printed targets on labels. Research suggests that there is no difference between the effectiveness of coded or uncoded feedback, providing that pupils understand what the codes mean. However the use of generic targets may make it harder to provide precise feedback. (2016, 20).

The Times headline is therefore not quite accurate. It seems that “Grades” will not be replaced by “codes” but rather that teachers’ written comments will be replaced by codes. In another article, “Schools wanted to take part in marking without grading trial” (Ward, 2017), this is called “FLASH Marking” and is an initiative developed in-house by a secondary school in northwestern England that will be rolled out to 12,500 pupils in 100 schools (EEF, 2017). The school claims that teachers will now be able to mark a class of Yr 11 exam papers in an hour. Students will receive an arrow (at, above or below expected target), and codes such as CR = “creative original ideas”, and V = “ambitious vocabulary needed.”

It seems from these news stories (and presumably EEF will put up the design protocols on their website eventually) that two different factors are being measured – one is holding back grades and the other is using codes instead of written comments. I’m curious but ambivalent; after all, at university it is now mandatory to use “Grademark” software for coursework students. This enables teachers to provide generic abbreviated feedback (“codes”) but also gives us the opportunity to personalise responses, and supplement these with an extended written comment, or even an audio-recorded comment. These are highly personalised and appreciated by students.

To turn back to the English example, I wonder whether the randomised controlled trial design (in this case an efficacy trial that will be evaluated by Durham University) means that participating schools will not be able to improvise around the conditions of the feedback. At least, if the reduction of feedback to codes proves not to improve student results, given the need for the control (or “business as usual”) group, the damage will be limited to only half the participating schools and students. The news articles are unclear about the purpose of the study – which is described as a way to reduce teacher workload more than to improve student learning. However, the EEF project description also mentions, reassuringly, that the rationale is focused on student outcomes, as “specific, actionable, skills-based feedback is more useful to students than grades” (2017). The project will follow year 10 students in senior English classes through to the end of secondary school with a report to be published in 2021. Already, I can’t wait.

References

Bennett, R. (June 17, 2017). CR (Creative original idea): grades on schoolwork replaced with codes. The Times.

Didau, D. (May 18, 2016). The Learning Spy Blog. http://www.learningspy.co.uk/assessment/marked-decline-eefs-review-evidence-written-marking/

Education Endowment Foundation (2017). Flash Marking. https://educationendowmentfoundation.org.uk/our-work/projects/flash-marking/

Elliot, V., Baird, J., Hopfenbeck, T., Ingram, J., Thompson, I., Usher, N., Zantout, M., Richardson, J., & Coleman, R. (2016). A Marked Improvement? A review of the evidence on written marking. Education Endowment Foundation. https://educationendowmentfoundation.org.uk/resources/-on-marking/

Gannon, S. (2014). ‘Something mysterious that we don’t understand…the beat of the human heart, the rhythm of language’: Creative writing and imaginative response in English. In B. Doecke, G. Parr & W. Sawyer (Eds), Language and creativity in contemporary English classrooms (pp. 131-140). Putney: Phoenix Education.

Hayes, D., Mills, M., & Christie, P. (2005). Teachers & schooling making a difference: productive pedagogies, assessment and performance. Allen and Unwin.

Lampert, J. & Burnett, B. (Eds). (2015). Teacher Education for High Poverty Schools. Springer.

Munns, G., Sawyer, W. & Cole, B. (Eds). (2013). Exemplary Teachers of students in poverty. Routledge.

Reid, J. (2013). Why Programming matters: Aporia and teacher learning in classroom practice. English in Australia. 48(3), 40-45.

Ward, H. (June 16, 2017). Schools wanted to take part in marking without grading trial. Times Educational Supplement. https://www.tes.com/news/school-news/breaking-news/schools-wanted-take-part-marking-without-grading-trial

 

Dr Susanne Gannon is an Associate Professor in the School of Education and a senior researcher in the Centre for Educational Research at Western Sydney University, Australia.