

-Statistics by De-
DeTistics
Data Mining Publications

COMING SOON
The In-Task Assessment Framework for Behavioral Data
Deirdre Kerr, Jessica J. Andrews, Robert J. Mislevy
November 2016
In educational games and simulations, game play itself can provide a novel source of assessment data because it offers rich observations of student learning behaviors, which can support diagnostic claims about students’ learning processes. However, the nature of the data produced by these environments makes it difficult to use in-task behavioral data for assessment purposes. We introduce the in-task assessment framework, an innovative approach for identifying the measurable components of the domain of interest. The framework is a feature extraction process that operationalizes the concepts of interest at the same grain size as the log data and articulates chains of evidence that link the extracted features to applicable concepts in an ontology. This process transforms low-level log data into a set of action set labels that can be used in a number of different measurement models, so that proficiency can be assessed solely from in-task behavioral data.

COMING SOON
Visualizing Changes in Strategy Use across Attempts via State Diagrams: A Case Study
Deirdre Kerr
January 2016
This study uses a three-step solution to visualize how strategy use changes as players repeatedly attempt to solve a given level in an educational video game. First, cluster analysis is used to determine which strategies players use to solve each level in the game, leading to the identification of 15 different strategies. Then, sequence mining is used to identify common changes in strategy use across multiple attempts to solve each level in the game, resulting in approximately 40 different strategy sequences for each level in the game. Finally, state transition diagrams are used to visualize the large number of strategy sequences identified for each given level in a single, easily interpretable image.
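
As a rough illustration of the final step, the sketch below counts how often one strategy is followed by another across consecutive attempts; those counts could serve as the weighted edges of a state transition diagram. It is not the study's implementation. The function name `transition_counts`, the strategy labels, and the sample data are all hypothetical.

```python
from collections import Counter


def transition_counts(sequences):
    """Count strategy-to-strategy transitions across consecutive attempts.

    Each sequence lists the strategy labels one player used on successive
    attempts at a single level (labels here are hypothetical).
    """
    counts = Counter()
    for seq in sequences:
        # Pair each attempt's strategy with the strategy used on the next attempt.
        for prev, nxt in zip(seq, seq[1:]):
            counts[(prev, nxt)] += 1
    return counts


# Hypothetical strategy sequences for three players on one level.
attempts = [
    ["guess", "order-based", "correct"],
    ["guess", "correct"],
    ["order-based", "order-based", "correct"],
]

edges = transition_counts(attempts)
for (src, dst), n in sorted(edges.items()):
    print(f"{src} -> {dst}: {n}")
```

Each key in `edges` is a directed edge between two strategy states, and its count could set the edge weight or line thickness in a diagram.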

Using Data Mining Results to Improve Educational Video Game Design
Deirdre Kerr
October 2015
This study uses data mining results to make data-driven modifications to a game in order to reduce construct-irrelevant behavior. Data mining indicated that students were able to pass levels using incorrect mathematical strategies and that students commonly used order-based strategies rather than mathematical strategies to solve game levels. To address these issues, two minor changes were made to the game, and students were randomly assigned to the original or revised version. Students who played the revised version solved levels using incorrect mathematical strategies less often and used order-based strategies less often. Additionally, student perception of the revised version was more positive than student perception of the original version. This indicates that data mining results can be used to make targeted modifications to a game that increase interpretability without decreasing engagement.

Methodological Challenges in the Analysis of MOOC Data: Exploring the Relationship between Discussion Forum Views and Learning Outcomes
Yoav Bergner, Deirdre Kerr, and David E. Pritchard - June 2015
Determining how learners use MOOCs effectively is critical for providing useful feedback about ideal MOOC use to students, instructors, schools, and policy makers. However, drawing inferences about student outcomes in MOOCs has proven difficult due to the large amount of missing data they generate and the diverse population of their participants. Thus, significant methodological challenges must be addressed before substantive questions about MOOC use can be answered. This study models final exam scores based on early-stage ability estimates, discussion forum viewing frequency, and assessment-oriented engagement. The impact of various operationalizations of these variables is examined in this study, demonstrating that the effect size of discussion forum viewing on final exam outcomes is quite sensitive to these decisions.

Into the Black Box: Using Data Mining of In-Game Actions to Draw Inferences from Educational Technology about Student Knowledge
Deirdre Kerr - March 2014
Educational video games have the potential to be used as assessments of student understanding of complex concepts. However, interpreting the rich stream of complex data that results from tracking in-game actions is so difficult that it is one of the most serious obstacles to using educational video games or simulations to assess student understanding. There is currently no systematic approach to extracting relevant data from the log files of educational games or simulations. This dissertation examined whether data mining techniques can be used to extract information from log files that allows for the formation of testable hypotheses. The log files in this study come from an educational video game that teaches students about the identification of fractions. The data mining techniques used were cluster analysis, sequence mining, and classification.

Identifying Learning Trajectories in an Educational Video Game
Deirdre Kerr and Gregory K.W.K. Chung
July 2013
Educational video games and simulations hold great potential as measurement tools to assess student levels of understanding, identify effective instructional techniques, and pinpoint moments of learning because they record all actions taken in the course of solving each problem rather than just the final answers given. However, extracting meaningful information from the log data produced by educational video games and simulations is notoriously difficult. We extract meaningful information from the log data by first utilizing a logging technique that results in a far more easily analyzed dataset. We then identify different learning trajectories from the log data, determine the varying effects of the trajectories on learning, and outline an approach to automating the process.

Identifying Key Features of Student Performance in Educational Video Games and Simulations through Cluster Analysis
Deirdre Kerr and Gregory K.W.K. Chung - October 2012
The assessment cycle of evidence-centered design (ECD) provides a framework for treating an educational video game or simulation as an assessment. One of the main steps in the assessment cycle of ECD is the identification of the key features of student performance. While this process is relatively simple for multiple choice tests, when applied to log data from educational video games or simulations it becomes one of the most serious bottlenecks facing researchers interested in implementing ECD. In this paper we examine the utility of cluster analysis as a method of identifying key features of student performance in log data stemming from educational video games or simulations. In our study, cluster analysis was able to consistently identify key features of student performance in the form of solution strategies and error patterns across levels, which contained few extraneous actions and explained a sufficient amount of the data.

Using Cluster Analysis to Extend Usability Testing to Instructional Content
Deirdre Kerr and Gregory K.W.K. Chung - May 2012
Commercial video games undergo usability studies to determine the degree to which the player is able to learn, control, and understand the game. Usability studies allow game designers to improve their games before they are released to the public. If usability studies could be expanded to include information about the presentation of the instructional content, they could help improve educational video games. In this study, cluster analysis was used to identify usability information from the log files from an educational video game called Save Patch. Cluster analysis was able to pinpoint specific levels in the game that could be improved as well as identify specific components of the level design under which certain errors were likely to occur, culminating in specific recommendations to improve the game in ways likely to increase learning.

A Primer on Data Logging to Support Extraction of Meaningful Information from Educational Games: An Example from Save Patch
Gregory K.W.K. Chung and Deirdre Kerr - March 2012
In this primer we briefly describe our perspective and experience in using data logging to support measurement of student learning in an educational game. The goal of data logging is to support the derivation of cognitively and affectively meaningful measures from a combination of player behaviors, game events, and game states. The key best practices we have developed are to record data that reflects behavior rather than inferences about the behavior; specify the behavior to log ahead of time; log in-game behaviors that map directly to targeted knowledge, skills, and attitudes; encode sufficient information so that the data elements are unambiguous at the desired grain size; and capture context to allow linking of the data element to an individual’s specific game experience. These practices allow for the investigation of numerous research questions that connect game play to students’ background, strategy use, knowledge, and cognitive processes.
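
To make these practices concrete, the sketch below shows one possible shape for a single log record. It is not the logging format used in Save Patch. The `LogEvent` class, its field names, and the sample values are all assumptions chosen for illustration. The record stores the observed behavior itself rather than an inference about it, and it carries enough context (player, level, attempt, game state) to tie the event back to one player's game experience.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class LogEvent:
    """One logged behavior plus the context needed to interpret it.

    All field names are hypothetical, not the primer's actual schema.
    """
    player_id: str
    level: int
    attempt: int
    timestamp_ms: int
    action: str        # the observed behavior, not an inference about it
    detail: dict       # enough information to make the action unambiguous
    game_state: dict   # context linking the event to this player's experience


def to_json_line(event):
    """Serialize an event as one JSON line, suitable for an append-only log."""
    return json.dumps(asdict(event), sort_keys=True)


# Hypothetical event: the player places a game piece of a given size.
event = LogEvent(
    player_id="p001",
    level=3,
    attempt=2,
    timestamp_ms=48250,
    action="place_piece",
    detail={"piece_size": "1/4", "position": 2},
    game_state={"pieces_placed": 1, "target": "3/4"},
)

line = to_json_line(event)
print(line)
```

Writing one self-describing record per line keeps each event interpretable on its own, which is one way to keep data elements unambiguous at a chosen grain size.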