There is a lot to talk about today, so this post may be a little long, but only because day two of the EDM 2011 conference was that amazing.
New Ideas from Stanford:
Marcelo Worsley from Stanford had a very refreshing paper and presentation. In his work, a group of students (3 novices, 6 intermediates, and 3 experts) sat down and designed an engineering project to separate trash. The researchers then tried to monitor more or less every aspect they could: the intonation of words, transcripts of how people described their designs, word choice, and many more features. From those they looked for markers of expertise. The most interesting thing about this work is the extraordinarily open environment in which the test subjects work, a very hands-on environment comparable to Charlotte’s Discovery Place, the children’s science museum. It is very tactile, like building with Legos or making a fort in the woods.
His talk was really interesting because it is nice to see a different area of work, one that addresses a very hands-on environment, and also a new set of ideas on how to approach EDM in general. The folks at Stanford brought a couple of ideas that I hadn’t seen in this community before, and it was really nice to see new and different approaches being presented.
Conference Power: a concrete example of why they are wonderful
One of the major challenges we have been facing with the EDM Vis tool is how to incorporate time stamps into the visualization. Our research group has bounced a number of ideas around, but we were never able to come up with a simple and informative method for visualizing that type of data in our tool. Luckily I am here at EDM 2011 and met Edgar Cambranes-Martinez, who is working on a similar problem in visualizing data. He pointed out a paper, Diagnosing Learners’ Problem-Solving Strategies Using Learning Environments with Algorithmic Problems in Secondary Education [link]. In that paper, the author places all the possible actions along the Y-axis in categories, and along the X-axis is time. When Edgar explained this idea to me, I didn’t fully grasp the explanation and thought of the image below.
Each student receives a different color, and each “node” is a state. Along the X-axis we have the time, based on time stamps, and along the Y-axis we have the number of actions. So we can see that the “purple” student performed one action more than the “blue” student. We can also see there is a large time gap between actions 3 and 4 for the purple student, while actions 2 and 3 for that student were quite quick. Next we just need to provide some functionality for drawing all of the students, as well as some filtering, zooming, and of course details on demand. Thanks, Edgar.
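To make the idea above a bit more concrete, here is a minimal Python sketch of how raw time stamps could be turned into the per-student points for that kind of plot. This is not the EDM Vis tool's code. The function names, the log format (student, time stamp in seconds), and the choice to measure time from each student's first action are all my own assumptions for illustration.

```python
from collections import defaultdict


def action_timelines(log):
    """Group (student, timestamp) events into per-student point series.

    Returns {student: [(seconds_since_first_action, action_number), ...]}.
    Measuring time from each student's first action is an assumption.
    """
    by_student = defaultdict(list)
    for student, timestamp in log:
        by_student[student].append(timestamp)

    timelines = {}
    for student, stamps in by_student.items():
        stamps.sort()
        start = stamps[0]
        # X value: elapsed time; Y value: running count of actions.
        timelines[student] = [
            (t - start, n) for n, t in enumerate(stamps, start=1)
        ]
    return timelines


def largest_gap(timeline):
    """Return (gap_seconds, from_action, to_action) for the longest pause."""
    gaps = [
        (t2 - t1, n1, n2)
        for (t1, n1), (t2, n2) in zip(timeline, timeline[1:])
    ]
    return max(gaps) if gaps else None


# Hypothetical log data, invented for this example.
log = [
    ("purple", 0), ("purple", 5), ("purple", 8), ("purple", 60),
    ("blue", 2), ("blue", 10), ("blue", 20),
]
timelines = action_timelines(log)
purple_gap = largest_gap(timelines["purple"])
```

Each list in `timelines` could be handed to whatever plotting library you like as one colored line per student, and `largest_gap` is the kind of thing a details-on-demand view might surface.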
MagnaView [link]
The invited speaker was Erik-Jan van der Linden, the CEO of MagnaView, a visual analytics company in the Netherlands, and I found the talk really interesting, likely more so than most people did. The reason is that A) he is interested in visualization, more so than the average EDM’er, and B) he has developed a profitable company in the area of visual analytics, which is good for Info Vis and, I would argue, for data mining as well. Long story short, his view of the Vis and DM relationship is visualization on the front end, giving feedback to the user and providing an attractive and intuitive interface, with data mining on the back end handling a load of calculations, clustering, and processing behind the scenes. Again, there are people pushing for the combination of Vis and DM, but I still think we have yet to see a truly elegant combination of the two, which is perhaps something for me to think about for my dissertation.
Future of EDM
The panel talk was titled the Future of EDM, and the title was vague as to whether it meant the conference, the community, the field, or the research. The panel members did an excellent job, each addressing their own issue on the topic with little overlap. The big calls were for replication, visualization, serious games, and randomization. Replication means that people should be running their own studies of other people’s work and determining whether they get similar results. One frustrating aspect of this is that it is not uncommon for a reviewer to shoot these papers down by saying there is nothing new here. That ignores the fact that the very point of a replication study is that the new study is the contribution, because it provides either supporting or contradicting evidence for previous studies. Hopefully next year, in 2012, these papers will be respected, accepted, and fairly represented in the conference. These types of works are desperately needed!
As far as visualization goes for EDM, I think this is a similar issue to the replication studies. Visualization is a different beast from data mining, so if the community wants to see those types of papers presented at EDM, it will be necessary to adopt a different means of evaluating a paper’s quality. With many visual analytics works, the tool itself is the contribution, just as developing an ITS is sufficient for submitting to the ITS conference. With an ITS the challenge is to show that it provides learning gains, longer retention, or some other benefit. For visualization, it is the insights the tool provides that are often used as the metric for quality.
In the case of serious games, it is again similar to an ITS: the game itself, often with a study, is the contribution. Personally I would like to see what a game really adds to the learning that an ITS doesn’t. The definition of a game is still way up in the air, so it is not clear where an ITS ends and a serious game begins. From what I’ve been able to see so far, the only major difference is the type of art assets. I will make a future post about serious games to go into more detail, but I’m still torn on whether an ITS and a game are all that different in terms of learning gains, mainly because it is difficult to tell where one ends and the other begins.
Lastly, randomization. The idea here is that we need to incorporate more randomization in the order of instructional information. Currently, someone develops an ITS and the instructional pieces are offered to the student in the order the author thought of them, for example addition, then subtraction, multiplication, then division, then negative numbers. But perhaps that isn’t the best way to teach the subject matter. Sure, we’ve been doing it that way for ages, but we also used to think the world was flat. The idea is that the tutor randomizes the order and gets thousands of students to use it, and then with millions of logs we can compare how each set of students learns with all the different orders of topics. A lot of people are really interested in more randomization in the data sets so we can better analyze how the order affects learning. That may also lead to new discoveries regarding knowledge components, who knows? No one, hence the point of doing the research.
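As a rough illustration of the randomization idea above, here is a small Python sketch of how a tutor might hand each student a random topic order and then compare learning gains per order. Nobody on the panel presented code. The function names, the per-student seeding, and the "learning gain" number attached to each record are all hypothetical choices of mine.

```python
import random
from collections import defaultdict
from statistics import mean

# Topic list taken from the example in the text above.
TOPICS = ["addition", "subtraction", "multiplication",
          "division", "negative numbers"]


def assign_order(student_id, topics=TOPICS, seed=0):
    """Give a student a random topic order that is reproducible per student.

    Seeding on the student id is an assumption, so that a returning
    student sees the same order again.
    """
    rng = random.Random(f"{seed}-{student_id}")
    order = list(topics)
    rng.shuffle(order)
    return tuple(order)


def mean_gain_by_order(records):
    """Average a (hypothetical) learning-gain score for each topic order.

    records: iterable of (topic_order, learning_gain) pairs.
    Returns {topic_order: mean learning gain}.
    """
    gains = defaultdict(list)
    for order, gain in records:
        gains[order].append(gain)
    return {order: mean(values) for order, values in gains.items()}
```

With five topics there are 120 possible orders, which is part of why this needs thousands of students before the per-order averages mean anything, and a real study would want a proper statistical comparison, not just a table of means.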
Conclusion:
Well, that’s that for day two. There is one more day to go, and I am already really excited about the types of works we will see at next year’s EDM conference. Lastly, Baker made an interesting point in the panel talk, which was that the papers of EDM08 should not be admitted to EDM12. His argument was that our works should consistently become better and better: stronger statistical analysis, better study designs, more rigor, better writing, and stronger influence from the results of the work. Taking that point into consideration, I am even more excited about the types of works we will see next year, many of which will hopefully be my own.


