KLICKE Dataset
Overview
What if educators could better understand the indicators that predict good writing – before a student even approaches a keyboard?
Most writing assessments focus only on the final product, but data science may now be able to unlock key aspects of the writing process and bring new insights and efficiencies to light.
Ultimately, work derived from the KLICKE dataset could provide valuable information that aids writing instruction and writing research, and that helps train artificial intelligence models for automated writing evaluation techniques, intelligent tutoring systems, and writing support tools.
Keystroke logging has been studied before, but most studies included only a small number of writing process features and were limited by relatively small datasets. KLICKE was released in October 2023 via a Kaggle competition that concluded in early 2024. The competition drew 7,209 entrants and ultimately yielded 2,256 participants and 44,811 submissions.
Beyond training AI and supporting teachers, the KLICKE writing dataset could be used to direct learners’ attention to their own text production process, which can boost their autonomy, metacognitive awareness, and self-regulation in writing.
KLICKE dataset © 2024 by The Learning Agency Lab is licensed under CC BY 4.0. To view a copy of this license, visit https://creativecommons.org/licenses/by/4.0/
Potential Uses
- Analyze keystroke data to study cognitive processes in computer-based text production
- Discover important patterns in keystroke activities (e.g., insertion, deletion, text move) that predict writing quality
- Train artificial intelligence/natural language processing algorithms for automatic assessment of writing quality using keystroke process data
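
As a rough illustration of the second use above, the sketch below turns a keystroke log into per-essay activity counts that could later feed a quality model. The record layout (`essay_id`, `activity`, `timestamp_ms`), the sample values, and the 2-second pause threshold are all hypothetical assumptions made for this example; they are not taken from the KLICKE schema, so adapt them to the actual files.

```python
from collections import Counter, defaultdict

# Hypothetical keystroke events: (essay_id, activity, timestamp_ms).
# Field names and values are illustrative assumptions, not the KLICKE schema.
events = [
    ("essay_a", "insertion", 0),
    ("essay_a", "insertion", 180),
    ("essay_a", "deletion", 2500),
    ("essay_a", "insertion", 2700),
    ("essay_b", "insertion", 0),
    ("essay_b", "text move", 4000),
]


def activity_features(events, pause_threshold_ms=2000):
    """Count each activity type and long pauses per essay.

    Assumes events for each essay arrive in chronological order.
    """
    counts = defaultdict(Counter)
    last_time = {}
    for essay_id, activity, t in events:
        counts[essay_id][activity] += 1
        prev = last_time.get(essay_id)
        # A gap at or above the threshold is counted as one long pause.
        if prev is not None and t - prev >= pause_threshold_ms:
            counts[essay_id]["long_pause"] += 1
        last_time[essay_id] = t
    return {essay: dict(c) for essay, c in counts.items()}


features = activity_features(events)
print(features)
```

Counts like these are only a starting point. Whether any of them actually predicts writing quality would have to be checked against scored essays.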