Marking is calibration work. The first essay in the pile gets your full attention. The fifteenth gets a draft you re-read three times. The thirtieth, at 11pm on a Sunday, gets the band you decided on in the first 30 seconds and a comment you've already typed twice tonight in slightly different words.
This is a marker's worst-kept secret. It's not laziness. It's the natural drift of human attention through a stack of cognitively demanding work. The students at the top of the pile and the bottom of the pile do not, statistically, get the same quality of attention.
I built MarkMate because I wanted a tool that doesn't drift. This post is about what that means in practice, including the things MarkMate catches that I miss, and the things I catch that MarkMate doesn't.
The 11pm calibration drift
My own data on this isn't scientific, but it's strong enough to make me uncomfortable. I once re-marked a small set of essays a week after I'd marked them the first time. Same essays, same rubric, same rules. Five out of fifteen got a different band on at least one criterion the second time round.
The pattern: the essays I'd marked late in the original session had drifted toward the middle. Strong essays I'd marked at 11pm came back up the second time. Weak essays I'd marked at 11pm came back down. My tired self compresses the band distribution toward the average.
Other teachers I've spoken to recognise this. Some don't, because they've never tested it on themselves. If you've never re-marked your own pile a week later, I'd recommend trying it on a small set sometime. It's instructive.
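If you do try the re-mark, the comparison is simple enough to script. What follows is a minimal sketch, not anything MarkMate does internally. It assumes one overall band per essay (numbered 1 to 6) rather than a band per criterion, and the names `Remark`, `drift_report`, `late_from` and `midpoint` are invented for illustration. The sample marks are made up, not my real data.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Remark:
    position: int     # order in the original pile (1 = marked first)
    first_band: int   # band given in the original session
    second_band: int  # band given on the re-mark a week later


def compression(r: Remark, midpoint: float) -> float:
    # Positive when the re-mark sits further from the midpoint than the
    # first mark did, i.e. the first mark had been pulled toward the middle.
    return abs(r.second_band - midpoint) - abs(r.first_band - midpoint)


def drift_report(remarks: list[Remark], late_from: int, midpoint: float) -> dict:
    # Essays whose band changed between the two passes.
    changed = [r for r in remarks if r.first_band != r.second_band]
    # Essays marked late in the original session (position >= late_from).
    late = [r for r in remarks if r.position >= late_from]
    return {
        "changed": len(changed),
        "total": len(remarks),
        "late_compression": mean(compression(r, midpoint) for r in late) if late else 0.0,
    }


# Hypothetical sample: six essays, the last two marked late.
sample = [
    Remark(1, 5, 5),
    Remark(2, 3, 3),
    Remark(3, 6, 6),
    Remark(4, 2, 2),
    Remark(5, 4, 5),  # strong essay marked late, came back up
    Remark(6, 3, 2),  # weak essay marked late, came back down
]

report = drift_report(sample, late_from=5, midpoint=3.5)
print(report)
```

A positive `late_compression` means the late marks had drifted toward the middle the first time round. Where you set `late_from` is a judgment call; I'd pick the point in the pile where you remember starting to flag.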
What MarkMate caught that I'd missed
I tested MarkMate on a stack of essays I'd already marked by hand. Year 12 English Advanced, Module C extended response. Twenty-two essays, marked the previous Sunday, given back to students on the Tuesday.
I dropped the same 22 essays into MarkMate after the fact (kids' grades were already in; this was for my own calibration check) and compared.
Three things came up I hadn't caught.
Thing 1: A factual error I'd read past
One student wrote that Frost's poem "Acquainted with the Night" was published in 1923. It appeared in his 1928 collection West-Running Brook. I'd read past the error. By the time I got to that essay, I was on essay 14 and had stopped fact-checking dates. MarkMate flagged it as a content red flag for verification.
This wasn't a marking-changing error. The factual point wasn't load-bearing for the student's argument. But if it had been, I'd have missed it.
Thing 2: Vocabulary suggestions I'd have made if alert
A capable Band 5 student wrote "the poem is important because it shows feelings of isolation." I'd normally annotate that as a Band 5 sentence from a student capable of Band 6, and suggest "significant" or reworking the sentence around an analytical verb ("the poem dramatises isolation through..."). I hadn't. I'd written "good point" and moved on.
MarkMate flagged the sentence, suggested three stronger alternatives, and explained why each was a different shade. This is the kind of feedback that lifts students from Band 5 to Band 6. I'd missed three of these across the stack.
Thing 3: A weak conclusion where the student ran out of steam
Most teachers know this pattern. The first three paragraphs are tight, the fourth wanders, and the conclusion is rushed. I'd given full credit on structure to a student whose conclusion was effectively a single declarative sentence followed by a quote. MarkMate flagged the conclusion as undeveloped and suggested two specific moves.
I should have caught it. By essay 19 I wasn't reading conclusions with the same care as openings.
What I caught that MarkMate didn't
This is the part I want to be honest about. If MarkMate were better than me in every category, you wouldn't need me. You'd just need MarkMate.
Catch 1: A student writing below her usual standard for a reason
One Year 12 student had had a hard fortnight. I knew about it from another teacher. Her essay was technically a Band 4, but the moves she'd attempted in paragraphs 2 and 3 were Band 6 ambitious. She'd just run out of time to execute them. I marked her at the bottom of Band 5 with a feedback note that named the ambition and asked her to come and talk to me.
MarkMate marked her at Band 4 across the board. Correct on the rubric. Wrong on the student. AI doesn't know what's happening in a student's life, and it shouldn't.
Catch 2: An overdue conversation about voice
A different student has been writing increasingly polished essays this term, but the polish is starting to flatten her voice. She's writing what she thinks I want, not what she thinks. I'd flagged this in feedback and asked to see her at recess.
MarkMate gave the essay a strong Band 5 mark and rubric-aligned feedback. Both correct. Both missed the bigger issue.
Catch 3: A misread directive verb that the rubric didn't quite catch
The task had asked students to "evaluate." A student had written a strong "describe" response. The rubric criteria for argument and analysis caught some of this, but the structural issue (the whole essay was the wrong genre) wasn't quite captured by the rubric line by line. I wrote a longer feedback note explaining the directive verb mismatch and what the next attempt would look like.
MarkMate flagged the directive verb issue but didn't quite frame the consequence the way I would have for this student.
How I use MarkMate now
After a few rounds of testing, my own use pattern has settled into this:
- Run the batch mark on the full class set.
- Read the cohort summary first, before I look at individual essays.
- Scan the rubric scores. Override any I disagree with after re-reading the essay.
- Edit the feedback on the essays where I know the student needs a different note.
- Use MarkMate's annotations as the base layer. Add my own where the student needs context AI doesn't have.
This gives me MarkMate's calibration consistency on the rubric work, plus my own student-knowledge layer on top. The combined output is better than either of us alone.
The honest summary: MarkMate doesn't replace your marking. It catches the tired-marker errors you can't catch yourself, and it leaves the student-knowledge work to you. That trade is worth doing.
Try it on something you've already marked
The demo on the homepage lets you paste up to 500 words and see what MarkMate does with it. If you've got an essay you marked recently, it's a useful comparison. Compare what MarkMate flags to what you flagged. The gaps in either direction are interesting.
