PERSUADE Dataset
Overview
Why do students write the way they do? And are they any good at it?
Understanding the nuances of how students write remains a complex challenge, one that becomes easier with deeper insight into how various writing components come together to form effective essays and other texts.
Granular writing feedback can help create better writers, but teachers are often too overwhelmed to provide it as needed. So what can help? More knowledge about the different elements of student writing can support the development of customized AI and machine learning tools, as well as more effective, formative teacher feedback.
The Learning Agency Lab’s PERSUADE dataset now makes it possible to study specific components of student writing in greater detail. This dataset opens a window into how students think, label, and organize their thoughts as they write. The resulting snapshots of information provide greater clarity about specific writing elements.
Labeled datasets of this type, which break down and focus on particular elements of discourse in an essay, have traditionally been hard to come by. The PERSUADE (Persuasive Essays for Rating, Selecting, and Understanding Argumentative and Discourse Elements) dataset, available here on the Lab’s Learning Exchange, is a rare and exciting new, nationally representative resource that lets learning engineers glean in-depth insights on student writing in the United States.
The PERSUADE dataset provides labels for more than 14,000 essays, covering the argumentative and rhetorical elements contained within each essay response. It also includes effectiveness ratings for these discourse elements, holistic quality scores for the essay responses, and student demographic information such as grade level, race/ethnicity, and economic background.
Persuade dataset © 2024 by The Learning Agency Lab is licensed under CC BY 4.0. To view a copy of this license, visit https://creativecommons.org/licenses/by/4.0/
Potential Uses
Those who access the PERSUADE dataset can conceivably:
- Develop new AI algorithms that identify discourse elements in argumentative writing
- Perform linguistics research on writing characteristics of specific student populations
- Support pedagogical applications such as student-centered assessment and peer review, as well as other uses by classroom teachers and writing program administrators
Dashboard
The Tableau dashboards below provide data and analysis on essay length distribution, discourse length distribution, and more.
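For readers who prefer to work from the raw data rather than the dashboards, similar length summaries can be approximated in a few lines of Python. The sketch below is illustrative only: the field names (`essay_id`, `discourse_type`, `discourse_text`) and the three toy records are assumptions made for this example, not the dataset's documented schema, so adjust them to match the actual download. It also counts only the words inside labeled discourse elements, so any unlabeled text in an essay would not be included.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy records. The field names and label values are
# illustrative assumptions, not the actual PERSUADE schema.
records = [
    {"essay_id": "A1", "discourse_type": "Claim",
     "discourse_text": "Schools should start later in the day."},
    {"essay_id": "A1", "discourse_type": "Evidence",
     "discourse_text": "Teenagers who sleep longer earn better grades and miss fewer classes."},
    {"essay_id": "B2", "discourse_type": "Position",
     "discourse_text": "Phones do not belong in class."},
]


def word_count(text):
    """Count whitespace-separated tokens in a string."""
    return len(text.split())


def length_summary(rows):
    """Return (labeled words per essay, mean element length per discourse type)."""
    essay_words = defaultdict(int)
    type_lengths = defaultdict(list)
    for row in rows:
        n = word_count(row["discourse_text"])
        essay_words[row["essay_id"]] += n
        type_lengths[row["discourse_type"]].append(n)
    mean_by_type = {t: mean(lengths) for t, lengths in type_lengths.items()}
    return dict(essay_words), mean_by_type


essay_words, mean_by_type = length_summary(records)
print(essay_words)
print(mean_by_type)
```

Whitespace tokenization is a deliberately simple choice here; a real analysis might use a proper tokenizer and the full essay text instead.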