Datasets
Overview
By providing researchers and technologists with access to high-quality datasets, the Lab’s Learning Exchange fosters innovation and progress in education – helping to improve student outcomes and prepare learners for success in the 21st century.
Datasets in other fields have already yielded transformational results. Among them are the ImageNet dataset, which is credited with major advances in computer vision, and datasets such as the Automated Student Assessment Prize (ASAP), which helped launch the practice of automated essay scoring.
The Lab’s Learning Exchange is a critical step toward creating high-quality educational datasets that can unleash the full potential of artificial intelligence and machine learning in education. Many of the datasets hosted in this clearinghouse are associated with open data science competitions to support the creation of new, innovative AI algorithms in education. These datasets can be used to:
- Train machine learning algorithms to evaluate student essays for writing and language proficiency.
- Analyze a dataset of persuasive student essays to examine writing and linguistic differences among student populations in the United States.
- Train generative AI algorithms to create reading comprehension questions for elementary and middle school students.
- Analyze students’ enjoyment, engagement, and learning progress on game-based learning platforms (e.g., Jo Wilder and the Capitol Case).
How can visitors best use the Lab’s Learning Exchange?
Our goal is to enable educators and engineers to create useful teaching and learning tools backed by large, relevant datasets. To get started:
Preview & Download Datasets. Each dataset’s page describes the data and gives examples of its potential applications. Once you have chosen a dataset, click the download button and select your preferred format (CSV or XLSX). [Note: All datasets provided on the Exchange are open source.]
Tableau integration. Learning Exchange visitors can use the built-in data visualizations by Tableau, a visual analytics platform, to further analyze available data. Users can also create custom data visualizations with the interactive dashboards.
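Once a dataset has been downloaded as CSV, a quick first check is to confirm its size, its column names, and how many cells are empty. The snippet below is a minimal sketch of that check using only Python's standard library. The function name `summarize_csv` and the file name used afterward are illustrative assumptions, not part of the Exchange; actual column names vary by dataset.

```python
import csv
from collections import Counter
from pathlib import Path


def summarize_csv(path):
    """Return row count, column names, and per-column empty-cell counts for a CSV file.

    Hypothetical helper for a first look at a downloaded dataset.
    """
    # "utf-8-sig" reads plain UTF-8 and also strips a byte-order mark if one is present.
    with Path(path).open(newline="", encoding="utf-8-sig") as handle:
        reader = csv.DictReader(handle)
        columns = list(reader.fieldnames or [])
        empty_cells = Counter()
        row_count = 0
        for row in reader:
            row_count += 1
            for name in columns:
                # A missing or whitespace-only value counts as empty.
                if not (row.get(name) or "").strip():
                    empty_cells[name] += 1
    return {
        "rows": row_count,
        "columns": columns,
        "empty_cells": dict(empty_cells),
    }
```

For example, calling `summarize_csv("quest_dataset.csv")` (a placeholder file name; substitute the file you downloaded) returns a dictionary with the number of rows, the list of columns, and the count of empty cells in each column that has any.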
Datasets

The Quest Dataset
The Learning Agency Lab’s data science competition, “The Quest for Quality Questions: Improving Reading Comprehension through Automated Question Generation,” was designed to build AI algorithms that can automatically generate questions for testing young learners’ reading comprehension.