A mix of artificial intelligence and natural language processing drives the development of these tools, and today some programs, like NoRedInk, claim that half of all districts use their product.
Assisted writing feedback software addresses an increasingly acknowledged problem: Students graduate without being very good writers. Fewer than a third of high school seniors are proficient writers, according to the National Assessment of Educational Progress (NAEP). The problem is particularly acute for low-income and minority students, fewer than fifteen percent of whom score proficient.
Part of the issue is that students need lots of writing practice to become good writers, but teachers often report being overwhelmed by the amount of feedback they already provide to students.
In principle, writing feedback software can lessen that burden. Having the teacher point out (for the hundredth time) that the student needs a strong topic sentence is probably not the best use of the teacher’s time. The same goes for reminders to include evidence, to make clear transitions from one idea to the next, and to choose more precise words.
But does assisted writing feedback work? How should it be used? Here’s what we know so far.
ASSISTED WRITING FEEDBACK IS HERE.
More than a dozen companies now provide assisted writing feedback software to teachers, students, and everyday professionals. Among them: Revision Assistant, Quill, Hemingway, NoRedInk, PEG, Criterion, and ESL Assistant.
The goal? To improve people’s writing.
Assisted feedback software is different from assisted grading software, although the two rely on the same technology and, in a couple of cases, are produced by the same companies. But grading software does exactly what the name suggests: it gives students a final grade. This kind of software is already in use by large standardized-test services, which have to grade an immense number of essays.
Feedback software is different. It is about improving student writing through iteration and structured feedback. The point is not to give one student an F and another an A. The point is that every student gets to see the weak points in their writing and gets the opportunity to improve them in real time.
Some forms of the software focus on common grammatical and writing errors. A missed comma or a misused semicolon. A verb that doesn’t agree with the subject. Mistaking “they’re” for “their”. A poor word choice. This software is kind of like an advanced grammar checker.
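To make that concrete, here’s a minimal sketch of the rule-based flavor of checking, written in Python. The patterns and messages are toy examples invented for illustration; none of the products named above work from this particular list, and real checkers rely on far deeper linguistic analysis.

```python
import re

# A handful of invented, hand-written rules in the spirit of a basic
# grammar checker. Illustrative only; not the rules of any real product.
RULES = [
    (r"\btheir (is|are|was|were)\b",
     'Possible "their"/"there" confusion.'),
    (r"\b(he|she|it) (don't|were)\b",
     "Possible subject-verb agreement error."),
    (r"\bvery unique\b",
     '"Unique" rarely needs an intensifier (word choice).'),
    (r"\b(\w+) \1\b",
     "Repeated word."),
]

def check(text):
    """Return (character offset, message) pairs for every rule that fires."""
    issues = []
    for pattern, message in RULES:
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            issues.append((match.start(), message))
    return sorted(issues)

if __name__ == "__main__":
    sample = ("Their is a problem here, and it don't help "
              "that the the wording is very unique.")
    for offset, message in check(sample):
        print(f"char {offset}: {message}")
```

The output is the same kind of targeted, sentence-level feedback described above, just produced by a fixed list of rules rather than a teacher.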
Other software goes further. Turnitin’s product, Revision Assistant, tries to identify weak reasoning. It does so by analyzing lots of essays (and evaluations of those essays) written in response to the same prompt. Other tools, like Quill, focus on giving students lots of writing practice.
In both cases, the aim is to get students to revise their own work (independently) before turning it in to their teacher.
ASSISTED WRITING FEEDBACK WORKS. AT LEAST SOMETIMES.
Does all of this technology amount to better learning outcomes? There are some positive signs that the answer is yes. At least in some contexts — and for some kids.
First, some of these programs clearly improve the writing of the students who use them. The feedback provided by Criterion and ESL Assistant, for example, can help English language learners produce better essays.
It’s important to note that such software isn’t always “right”: some of its suggestions are incorrect. But in that research, English language learners were able to distinguish correct suggestions from incorrect ones well enough to improve their essays.
Second, using a program really can save teachers time. Teachers report that the PEG program cuts the time they spend providing feedback to students by a third (or even in half) without reducing the total amount of feedback students receive or the ultimate quality of their writing.
Finally, a large-scale study that explored the effect of district-wide implementations of yet another assisted writing feedback program, Revision Assistant, found that districts that implemented the program outpaced statewide improvements in a number of areas, including writing and speaking. Isolated implementations of the software did not show the same effect, suggesting that there’s a lot of value in using the technology in the right way.
ASSISTED WRITING FEEDBACK HAS SOME LIMITATIONS.
But it’s not all AI peaches and computer science cream. There are still some significant limitations to the technology.
After all, identifying passive sentences is relatively straightforward. But for software to get at the guts of good writing, it has to “know” the material well. To infer that an essay lacks evidence, or that the evidence doesn’t support the conclusion very well, the software needs a large set of well-annotated essays on that specific topic. That means such software is limited to certain writing prompts (and depends on high-quality human evaluation as well).
These tools don’t work well on less formulaic writing, like fiction or poetry, or even on creative approaches to argumentative essays. The software might declare a solid essay “in need of significant improvement” simply because it deviates from the standard formula of “good” writing.
Of course, even teachers can disagree about an essay’s quality. In the essay-scoring context, many studies illustrate the relatively low consistency of human graders; in many cases, teachers agree only around 70 percent of the time. Writing rubrics (along with monitoring and re-training) improve consistency, but they don’t eliminate the disagreement. And rubrics themselves are an effort to standardize: they do not necessarily reward more creative, innovative writing.
Generally speaking, assisted writing feedback software isn’t good at recognizing exceptions to common practice. Sometimes passive voice should be used by writers. (See what I’m doing there?) Sometimes it’s good to have longer sentences that add lavish detail or illustrate the writer’s state of mind in order to change up the pace of an essay or an online column.
This generation of assisted writing software is in a better position to avoid overly rule-based recommendations because it infers patterns of good writing from the bottom up rather than the top down. Previous generations of software were generally rule-based: humans decided what the rules of good writing were and figured out how the software should spot violations of them. The new generation takes advantage of neural networks and large data repositories to find essay features that correlate with teachers’ evaluations of the writing.
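As a rough illustration of that bottom-up idea, the sketch below trains a simple model to predict human scores from essay text. The handful of example essays, the 1-6 scores, and the TF-IDF plus ridge-regression model are all stand-ins of my own; the commercial tools described here train far larger neural models on prompt-specific corpora of human-scored essays.

```python
# Toy illustration of the data-driven approach: learn which essay features
# correlate with human scores, then predict a score for an unseen essay.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline

# Hypothetical training data: essays paired with teacher-assigned scores (1-6).
essays = [
    "The evidence clearly shows that recycling reduces landfill waste because ...",
    "Recycling is good. It is good because it is good for the earth.",
    "Although some argue recycling is costly, the data suggest the benefits outweigh ...",
    "I like recycling and my friend does too.",
]
scores = [5.0, 2.0, 6.0, 1.0]

# Word and bigram frequencies stand in for "essay features"; the regression
# learns which of them track the human scores.
model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), Ridge(alpha=1.0))
model.fit(essays, scores)

new_essay = "Recycling reduces waste, and the evidence suggests the benefits outweigh the costs."
print(round(float(model.predict([new_essay])[0]), 2))
```

The point is the workflow, not the particular model: features extracted from student writing are fit against human judgments, and the learned correlations drive the score or feedback for a new essay.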
But, like all software that depends on such techniques, it is only as good as the data it learns from.
A handful of these platforms have produced rigorous, high-quality research on their effectiveness. Other platforms, however, make claims that await transparent verification. What’s more, the tools might work in some contexts but not others. In short, more work needs to be done.
STUDIES SUGGEST THAT THE SOFTWARE WORKS BEST WHEN IT COMPLEMENTS GOOD TEACHING AND LEARNING.
One of the broader concerns about assisted writing feedback software is that it will grow to replace traditional writing instruction rather than complement it. But recent work suggests that the software works best when it complements robust teaching and learning.
Part of the reason is that while natural language processing is good and getting better, it still does not “understand” text. This means that natural language processing can be fooled in all sorts of ways. Indeed, many argue that the holy grail of natural language processing right now is for machines to uncover the “meaning” of a text.
Even so, no educator should rely on natural language processing alone to evaluate a student’s writing. The technology is not there yet, and it might never be. Most organizations that use natural language processing, such as those running chatbots, know this and keep a human in the loop. Educators need to do the same.
Indeed, the PEG study suggests that assisted feedback software works best when effectively integrated into existing teacher practices. In other words, the tools work best when aligned with standards, rubrics, and the forms of great English instruction.
To be sure, it’s also possible that over-reliance on such software can lead to “stale,” standardized, uncreative writing habits. And some argue that this way of thinking is beginning to stifle the real work of writing, which is making meaning.
A final, more pressing concern involves professional development. As the Revision Assistant study demonstrates, such technology requires extensive training (and, perhaps, teacher coordination) to use well. Haphazard or inconsistent implementation is not likely to yield large benefits.
JOIN US.
At the Learning Agency, we are exploring the promise of this technology: how to build better, more useful assisted feedback tools and how to incorporate these tools effectively in the classroom.
If you’re curious about using such software, Hemingway offers a free, interesting demo.
If you’re a teacher who’s using, planning on using, or even completely skeptical about these tools, we’d love to hear from you. If you’re a student who’d like to share your experience with these tools, please get in touch. If you’re developing these tools, reach out to stay connected to a broader developer community.
Email aly@the-learning-agency.com to stay abreast of the latest developments.
-Ulrich Boser