Ideogram: A synthwave-inspired illustration depicting an AI and a teacher collaboratively grading papers.
To the first question, yes, AI can grade student writing. There are already tools like Cograder built specifically to grade student work. Other tools like Writable have added AI scoring and feedback to their offerings. Individual teachers have found that, with the right prompt, chatbots like ChatGPT and Claude can grade student writing and provide feedback.
Are these scores and feedback accurate, though? In many cases, yes. They are at least as accurate as the range of scores a group of human teachers would give. A 2023 study using ChatGPT 3.5 (Steiss) found that, while experienced teachers gave better quality feedback on student writing, the AI was actually more accurate for criteria-based scoring than the teachers in the study. If you are working with a well-developed rubric and clear criteria, an AI grading assistant is likely to be more accurate and consistent than the teacher. It is also worth remembering that newer large language models are likely to improve on these results.
So, if AI can accurately grade written student work, the next question is, should teachers consider using AI for scoring and feedback? This will be an individual decision for each educator, and some will be guided by the rules of their institution, but it is worth at least considering the multitude of benefits for students and teachers that come from using some AI assisted grading and AI generated feedback. Even if the feedback is slightly lower in quality than an experienced teacher's, there is a net benefit to giving all students access to AI feedback while they wait their turn for a writing conference or more detailed commentary from their teacher. I previously wrote about some ways of giving AI feedback for Edutopia, and I wrote about the impact of AI feedback during writing time for my 9th graders.
Before we go further into the ethics of teachers using AI to grade student work, a bit about how all these tools work. AI assisted grading tools always require the supervision of a teacher. No score gets to a student without first being approved by the educator. When I use Writable to score student writing, I must review and approve every score. This gives me an opportunity to revise the score if I need to, revise the feedback if I need to, and then approve the score. Cograder and others are built with the same kind of process. The AI assigns a draft score with draft feedback. The teacher must review the student work and approve the score. The teacher is still the final arbiter of the score and remains responsible for the quality of the feedback.
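For readers who think in code, the review-and-approve process described above can be sketched roughly like this. This is only an illustration of the idea, not how Writable or Cograder is actually built. The names `DraftGrade`, `approve`, and `release_to_student` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DraftGrade:
    """A score and feedback proposed by an AI tool (hypothetical structure)."""
    submission_id: str
    ai_score: int
    ai_feedback: str
    approved: bool = False
    final_score: Optional[int] = None
    final_feedback: Optional[str] = None

    def approve(self, score: Optional[int] = None,
                feedback: Optional[str] = None) -> None:
        """Teacher approves the draft, optionally revising score or feedback."""
        self.final_score = self.ai_score if score is None else score
        self.final_feedback = self.ai_feedback if feedback is None else feedback
        self.approved = True


def release_to_student(grade: DraftGrade) -> dict:
    """Release a grade only after a teacher has approved it."""
    if not grade.approved:
        raise PermissionError("A teacher must review and approve this score first.")
    return {
        "submission_id": grade.submission_id,
        "score": grade.final_score,
        "feedback": grade.final_feedback,
    }


# The AI proposes a draft; the teacher reads the work and raises the score.
draft = DraftGrade("essay-017", ai_score=3,
                   ai_feedback="Claim is clear; evidence needs explanation.")
draft.approve(score=4)
released = release_to_student(draft)
```

The point of the sketch is the gate in `release_to_student`: a draft score that no teacher has approved cannot reach a student.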
Graphic Created with Napkin.ai
What does this really mean for teachers and their students? It means students get scores and feedback faster. Review and approve is still a time-consuming process, but it is usually much faster than read, score, and write feedback. It means I experience less decision fatigue. I get to read the student work with an eye toward instruction rather than evaluation. It means I sometimes have to examine my biases about my highest and lowest performing students. The AI tool does not know their names or past performance. It is purely scoring the work. And it means students get more detailed feedback, specifically tied to the rubric. The AI feedback is always more elaborate and specific than what I would have taken the time to write.
When I use AI assisted grading tools, my 140+ students get their work returned much faster, so they can revise and resubmit as needed at a point much closer in time to the original assignment. By accelerating this feedback loop, I have been able to engage my students in more scored writing activities.
One unexpected result of AI assisted grading has been the impact on my relationship with students. One might think that with AI in the mix my writing instruction would become less personal, but the opposite is happening. When my students know that their score is at least partly determined by AI, they are less likely to take it personally. Their low grade was based on the quality of their writing, not any personal bias they perceive from the teacher. I become the ally who can help them write better, find better evidence, explain their reasoning in more detail. I get to move from critic to coach and I have found that to be helpful.
Some have said that teachers should not use AI to grade if students can't use AI to write. This is a false equivalence. Students are learners. Teachers are responsible for helping them learn. If an AI tool for scoring and feedback can help my students improve their writing faster, then I should be using it. I'm seeing growth for my students as writers, and I have no plans to halt an effective practice because others want to make this a conversation about morality.
Of course there are concerns about AI bias. Large language models have well-documented cases of gender and racial bias. This is part of why all scores require a teacher review, but let's not pretend that teachers themselves are free of bias. A recent long-term study by the University of Michigan (Pei) looked at 30 million grading records and found that students with last names that came later in the alphabet were statistically more likely to receive lower grades and less helpful feedback. Grading comes with decision fatigue, and sometimes boredom if the assignments are similar. It can be frustrating to see similar mistakes repeated by different students. When it takes multiple days to grade an assignment, the teacher may approach it a little differently each time. The amount and quality of feedback students get is likely to decline as well. The AI grader doesn't get tired, though. It does not have favorite students and least favorite students. It just scores the work.
Student privacy is also a concern. Never provide any AI tool with identifying student information such as names or student ID numbers. I work confidently with the AI grading assistant in Writable because I know my district is paying for the service and therefore the legal department has been involved with that contract. Free tools do not have those guardrails, though, and we must be protective of our students, as well as compliant with legal requirements at federal and state levels.
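If you do paste student writing into a chatbot, one way to strip the obvious identifiers first is a small script like the one below. Treat it as a rough sketch under some assumptions: the function name `redact_student_info` is made up for illustration, the names come from a roster you supply yourself, and the default pattern assumes student IDs are 6 to 9 digit numbers, which may not match your district. It will not catch everything, such as nicknames or a student mentioning their own address, so it does not replace reading the text yourself.

```python
import re
from typing import Iterable


def redact_student_info(text: str, names: Iterable[str],
                        id_pattern: str = r"\b\d{6,9}\b") -> str:
    """Replace known student names and ID-like numbers with placeholders."""
    # Replace longer names first so "Maria Lopez" is handled before "Maria".
    for name in sorted(names, key=len, reverse=True):
        text = re.sub(rf"\b{re.escape(name)}\b", "[STUDENT]", text,
                      flags=re.IGNORECASE)
    # Assumption: any standalone 6 to 9 digit number is a student ID.
    return re.sub(id_pattern, "[ID]", text)


sample = "Maria Lopez (ID 20457731)\nIn my essay, I argue that school should start later."
cleaned = redact_student_info(sample, ["Maria Lopez", "Maria"])
print(cleaned)
```

Running this replaces the name with `[STUDENT]` and the number with `[ID]` while leaving the essay text itself alone.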
AI assisted grading has been a net good for my students because they get their scores faster and with more detailed feedback. I'm not going to lie, it has also been good for my work/life balance. I still get to read their writing, but I get to step back from the agony of having to decide on, and then justify, a score. My students are writing more, revising more, and growing in their confidence as writers.
Works Cited
Steiss, Jacob, et al. “Comparing the quality of human and ChatGPT feedback on students’ writing.” Learning and Instruction, vol. 91, 7 Sept. 2023, https://doi.org/10.35542/osf.io/ty3em.
Pei, Jiaxin. 30 Million Canvas Grading Records Reveal Widespread Sequential Bias and System-Induced Surname Initial Disparity. Apr. 2024, https://conference2023.eaamo.org/papers/EAAMO23_paper_118.pdf. Accessed June 2024.