It has come to the end of my second week with Dr. Moon. My task for the last two weeks have been to compile all the essays we have scored and format them into a document to run through our predictive model. The goal was very simple: Previous code has already separated the assignments into 3 clusters, reaching different levels of cognitive complexity. Our hope was that our code would create a vocabulary and through linear regression, predict what cluster the assignment would belong to.
It took hours to format and set up, it was late at night, and I finally uploaded the document and ran the codes.
It did not predict anything other than one cluster correctly. Too many questions ran through my mind of what the problem could have been. Was it because one cluster group dominated our train set? That there were too many of a single cluster that it saturated all results? Could it have been our model? Was a linear regression using vocabulary not capable of assigning cognitive complexity to an assignment?
There are many many questions I have asked and ran by my postdoc. For the mean time, I wait for her to get a response back from more technologically savvy colleagues. In the meanwhile, all I can do is assign scores and distribute tasks to other colleagues concerning another project we have going on.
Just wish us luck, I want our model and our code to work. The potential it has for education and automated analysis could be exponential if we lay down the groundwork.