Papers and Talks:
Yue Gong presented a nice paper titled "Items, Skills and Transfer Models: Which Really Matters for Student Modelling". In this work the authors liken student modelling to a backpacker travelling up a mountain: the backpack is the student model, and inside it is a set of tools that can be used to improve the quality of that model. They measure how much item difficulty, student ability and the transfer model each contribute to the accuracy of a student model, and they also look at combinations of these contributing parameters.
One side note about this work is the use of Efron’s r^2 instead of R^2. I intend to investigate a bit to get a good understanding of the difference, and possibly write up my interpretation of the two. If anyone can provide their own explanation, it is welcome.
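As a starting point for that comparison, here is a minimal sketch of the commonly cited definition of Efron's pseudo R^2 for binary outcomes: one minus the sum of squared differences between the 0/1 outcomes and the predicted probabilities, divided by the sum of squared differences between the outcomes and their mean. This is not taken from the paper. The function name and the toy numbers are illustrative assumptions, and the paper may compute the measure differently.

```python
def efron_r2(outcomes, predicted_probs):
    """Efron's pseudo R^2 (commonly cited form, assumed here):
    1 - sum((y - p)^2) / sum((y - mean(y))^2),
    where y is a 0/1 outcome and p is the predicted probability of y = 1.
    """
    if len(outcomes) != len(predicted_probs):
        raise ValueError("outcomes and predictions must have the same length")
    mean_outcome = sum(outcomes) / len(outcomes)
    # Squared error of the model's predicted probabilities.
    ss_model = sum((y - p) ** 2 for y, p in zip(outcomes, predicted_probs))
    # Squared error of always predicting the base rate.
    ss_base = sum((y - mean_outcome) ** 2 for y in outcomes)
    if ss_base == 0:
        raise ValueError("outcomes are all identical; the measure is undefined")
    return 1.0 - ss_model / ss_base


# Made-up example: 1 = correct response, 0 = incorrect response.
responses = [1, 0, 1, 1, 0]
model_probs = [0.8, 0.3, 0.6, 0.9, 0.2]
print(efron_r2(responses, model_probs))
```

Under this definition, a model that always predicts the base rate scores 0, perfect predictions score 1, and a model that does worse than the base rate goes negative.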
Classifiers from Tutor Logs:
Cool talk by Jack Mostow, "Learning Classifiers from a Relational Database of Tutor Logs". The idea is that tutors log event streams, but the way we usually mine that data is by converting it into the form y=f(x), where y is the label, f() is the classifier, and x is the feature vector. The argument is that this conversion creates a gap between the type of data that is logged and the way the classifier treats it. Thus, by going directly from the tutor logs, in this case a database, to labels, skipping the step of creating a feature vector, we can achieve better results.
Ensembling and Post-Test Score Predictions:
Another cool paper by R. Baker, where he compares nine different student modelling strategies as well as some ensemble approaches. The list of models includes five variations of knowledge tracing, performance factors analysis and some others, as well as six ensemble approaches, but sadly CFAR, Correct First Attempt Rate, had the lowest root mean squared error of them all. The authors suggest the small size of the data set likely hurt the quality of the ensemble approaches. Another potential issue is that the methods used to create the ensembles are highly related, suggesting that combining the models doesn’t really add much to the quality of the result.
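To make the comparison concrete, here is a rough sketch of the three ingredients mentioned above: a per-student CFAR, root mean squared error against a post-test score, and a plain averaging ensemble. Everything here is an assumption for illustration. The function names, the per-student grouping, the uniform average and the toy data are mine; the paper's actual models, ensembling methods and data are not reproduced.

```python
import math
from collections import defaultdict


def rmse(predicted, actual):
    """Root mean squared error between two equal-length sequences."""
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))


def cfar_by_student(first_attempts):
    """Correct First Attempt Rate per student.

    first_attempts: iterable of (student_id, correct) pairs, one pair per
    first attempt at a problem, with correct given as 1 or 0.
    """
    counts = defaultdict(lambda: [0, 0])  # student -> [num correct, num attempts]
    for student, correct in first_attempts:
        counts[student][0] += int(correct)
        counts[student][1] += 1
    return {student: c / n for student, (c, n) in counts.items()}


def mean_ensemble(*prediction_lists):
    """Uniform average of several models' predictions, student by student."""
    return [sum(preds) / len(preds) for preds in zip(*prediction_lists)]


# Made-up tutor data and post-test scores (fractions correct).
attempts = [("s1", 1), ("s1", 0), ("s1", 1), ("s1", 1), ("s2", 0), ("s2", 1)]
post_test = {"s1": 0.8, "s2": 0.4}

cfar = cfar_by_student(attempts)
students = sorted(post_test)
print(rmse([cfar[s] for s in students], [post_test[s] for s in students]))
```

A uniform average like `mean_ensemble` only helps when the individual models make different mistakes, which fits the point about highly related models adding little when combined.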
I found this work really interesting because it is a nice analysis of a variety of approaches. In addition, the related work section points out that the literature more or less forms a cycle: model A is better than B, which is better than C, which in turn is better than A. That situation makes it very difficult to determine which model is the best approach.
2010 KDD Competition:
So last year, in 2010, the KDD Cup competition made use of an educational data set, and the idea was that a number of people from the EDM community would participate and bring some recognition to our field and work. Unfortunately, not many people from our community joined the KDD Cup competition, myself included, which is a real shame. One of the keynote talks I am really looking forward to is by John Stamper, on reflections on the KDD Cup. In addition, rumor is that another competition, this time through some other group, not KDD, will also be announced. This time around, however, I will participate, and not allow our participation in these types of challenges to be a letdown. I hope others will be equally motivated and interested in accepting the challenge. For anyone wanting to work together with me and hopefully other members of my lab at Charlotte, feel free to get in touch with us. Our lab site is www.game2learn.com

