Accepted papers

Here is the list of papers accepted for presentation at the L@S conference.
After the conference, the papers will be available in the ACM Digital Library as the Proceedings of the Third ACM Conference on Learning @ Scale.

Full papers

Paper 1: Mobile Devices for Early Literacy Intervention and Research with Global Reach by Cynthia Breazeal, Robin Morris, Stephanie Gottwald, Tinsley Galyean, Maryanne Wolf

Keywords: Open platform for education; early literacy; reading brain; virtual preschool; pre-k learning and technology; global literacy project
Extensive work focuses on the uses of technology at scale for post-literate populations (e.g., MOOCs, learning games, learning management systems). Little attention is afforded to non-literate populations, particularly in the developing world. This paper presents an approach using mobile devices with the ultimate goal of reaching 770 million people. We developed a novel platform with a cloud backend to deliver educational content to over a thousand marginalized children in different countries: specifically, in remote villages without schools, urban slums with overcrowded schools, and at-risk, rural schools. Here we describe the theoretical basis of our system and results from case studies in three educational contexts. This model will help researchers and designers understand how mobile devices can help children acquire basic skills and aid each other’s learning when the benefit of teachers is limited or non-existent.

Paper 2: Learning Transfer: Does It Take Place in MOOCs? An Investigation into the Uptake of Functional Programming in Practice by Guanliang Chen, Dan Davis, Claudia Hauff, Geert-Jan Houben

Keywords: transfer learning; MOOCs; GitHub; functional programming
The rising number of Massive Open Online Courses (MOOCs) enables people to advance their knowledge and competencies in a wide range of fields. Learning, though, is only the first step; the transfer of the taught concepts into practice is equally important and often neglected in the investigation of MOOCs. In this paper, we consider the specific case of FP101x (a functional programming MOOC on edX) and the extent to which learners alter their programming behaviour after having taken the course. We are able to link about one third of all FP101x learners to GitHub, the most popular social coding platform to date, and contribute a first exploratory analysis of learner behaviour beyond the MOOC platform. A detailed longitudinal analysis of GitHub log traces reveals that (i) more than 8% of engaged learners transfer, and that (ii) most existing transfer learning findings from the classroom setting are indeed applicable in the MOOC setting as well.

Paper 3: A Data-Driven Approach for Inferring Student Proficiency from Game Activity Logs by Mohammad H. Falakmasir, Jose P. Gonzalez-Brenes, Geoffrey J. Gordon, Kristen E. DiCerbo

Keywords: Educational Games; Student Modeling; Stealth Assessment; Hidden Markov Models
Student assessments are important because they allow the collection of evidence about learning. However, time spent on evaluating students might otherwise be used for instructional activities. Computer-based learning platforms provide the opportunity to unobtrusively gather students' digital learning footprints. This data can be used to track learning progress and make inferences about student competencies. We present a novel data analysis pipeline, Student Proficiency Inferrer from Game data (SPRING), that allows modeling game-playing behavior in educational games. Unlike prior work, SPRING is a fully data-driven method that does not require costly domain knowledge engineering. Moreover, it produces a simple, interpretable model that not only fits the data but also predicts learning outcomes. We validate our framework using data collected from students playing 11 educational mini-games. Our results suggest that SPRING can predict math assessments accurately on withheld test data (correlation = 0.55, Spearman rho = 0.51).

Paper 4: An Exploration of Automated Grading of Complex Assignments by Chase Geigle, ChengXiang Zhai, Duncan C. Ferguson

Keywords: Automatic grading; ordinal regression; supervised learning; learning to rank; active learning; text mining
Automated grading is essential for scaling up learning. In this paper, we conduct the first systematic study of how to automate the grading of a complex assignment, using a medical case assessment as a test case. We propose to solve this problem with a supervised learning approach and introduce three general, complementary types of feature representations of such complex assignments for use in supervised learning. We first show with empirical experiments that it is feasible to automate the grading of such assignments provided that the instructor can grade a number of examples. We further study how to integrate an automated grader with human grading, propose to frame the problem as learning to rank assignments so as to exploit pairwise preference judgments, and use NDPM to evaluate ranking accuracy. We then propose a sequential pairwise online active learning strategy to minimize the effort of human grading and optimize the collaboration between human graders and an automated grader. Experiment results show that this strategy is indeed effective and can substantially reduce human effort compared with randomly sampling assignments for manual grading.
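NDPM, the evaluation measure mentioned in the abstract, compares a system ranking against a reference ranking over every pair the reference strictly orders: contradicted pairs count 1, pairs the system leaves tied count 0.5. A minimal sketch of the standard definition (an illustration, not the authors' code) might look like:

```python
from itertools import combinations

def ndpm(reference, system):
    """Normalized Distance-based Performance Measure between two score lists.
    Considers every pair the reference strictly orders: pairs the system
    contradicts count 1, pairs the system leaves tied count 0.5.
    0.0 = perfect agreement, 1.0 = complete reversal."""
    contradicted = tied = total = 0
    for i, j in combinations(range(len(reference)), 2):
        ref_diff = reference[i] - reference[j]
        if ref_diff == 0:
            continue  # reference expresses no preference for this pair
        total += 1
        sys_diff = system[i] - system[j]
        if sys_diff == 0:
            tied += 1
        elif (ref_diff > 0) != (sys_diff > 0):
            contradicted += 1
    return (contradicted + 0.5 * tied) / total if total else 0.0
```

For example, a system that ranks three assignments exactly as the instructor does scores 0.0, a fully reversed ranking scores 1.0, and a system that ties everything scores 0.5.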

Paper 5: Online Urbanism: Interest-based Subcultures as Drivers of Informal Learning in an Online Community by Ben U Gelman, Chris Beckley, Aditya Johri, Carlotta Domeniconi, Seungwon Yang

Keywords: Informal Learning; Online Communities; Interest-based Subcultures; Scratch; Programming.
Online communities continue to be an important resource for informal learning. Although many facets of online learning communities have been studied, we have limited understanding of how such communities grow over time to productively engage a large number of learners. In this paper we present a study of a large online community called Scratch, which was created to help users learn software programming. We analyzed 5 years of data consisting of 1 million users and their 1.9 million projects. Examination of interactional patterns among highly active members of the community uncovered a markedly temporal dimension to participation. As membership of the Scratch online community grew over time, interest-based subcultures started to emerge. This pattern was uncovered even when clustering was based solely on the social network of members. This process, which closely resembles urbanism, or the growth of physically populated areas, allowed new members to combine their interests with programming.

Paper 6: Graders as Meta-Reviewers: Simultaneously Scaling and Improving Expert Evaluation for Large Online Classrooms by David A. Joyner, Wade Ashby, Liam Irish, Yeeling Lam, Jacob Langson, Isabel Lupiani, Mike Lustig, Paige Pettoruto, Dana Sheahen, Angela Smiley, Amy Bruckman, Ashok Goel

Keywords: Peer review; online education.
Large classes, both online and residential, typically demand many graders for evaluating students' written work. Some classes attempt to use autograding or peer grading, but both present challenges for assigning grades at for-credit institutions, such as the difficulty of using autograding to evaluate free-response answers and the lack of expert oversight in peer grading. In a large online class at Georgia Tech in Summer 2015, we experimented with a new approach to grading: framing graders as meta-reviewers, charged with evaluating the original work in the context of peer reviews. To evaluate this approach, we conducted a pair of controlled experiments and a handful of qualitative analyses. We found that having access to peer reviews improves the perceived quality of feedback provided by graders without decreasing the graders' efficiency and with only a small influence on the grades assigned.

Paper 7: Effects of In-Video Quizzes on MOOC Lecture Viewing by Geza Kovacs

Keywords: in-video quizzes; lecture viewing; lecture navigation; seeking behaviors; MOOCs
Online courses on sites such as Coursera use quizzes embedded inside lecture videos (in-video quizzes) to help learners test their understanding of the video. This paper analyzes how users interact with in-video quizzes, and how in-video quizzes influence users' lecture viewing behavior. We analyze the viewing logs of users who took the Machine Learning course on Coursera. Users engage heavily with in-video quizzes: 74% of viewers who start watching a video will attempt its corresponding in-video quiz. We observe spikes in seek activity surrounding in-video quizzes, particularly seeks from the in-video quiz to the preceding section. We show that this is likely due to users reviewing the preceding section to help them answer the quiz, as the majority of users who seek backwards from in-video quizzes have not yet submitted a correct answer, but will later attempt the quiz. Some users appear to use quiz-oriented navigation strategies, such as seeking directly from the start of the video to in-video quizzes, or skipping from one quiz to the next. We discuss implications of our findings on the design of lecture-viewing platforms.

Paper 8: Brain Points: A Deeper Look at a Growth Mindset Incentive Structure for an Educational Game by Eleanor O'Rourke, Erin Peach, Carol S. Dweck, Zoran Popovic

Keywords: Educational games; growth mindset; incentive structures.
Student retention is a central challenge in systems for learning at scale. It has been argued that educational video games could improve student retention by providing engaging experiences and informing the design of other online learning environments. However, educational games are not uniformly effective. Our recent research shows that player retention can be increased by using a brain points incentive structure that rewards behaviors associated with growth mindset, or the belief that intelligence can grow. In this paper, we expand on our prior work by providing new insights into how growth mindset behaviors can be effectively promoted in the educational game Refraction. We present results from an online study of 25,000 children who were exposed to five different versions of the brain points intervention. We find that growth mindset animations cause a large number of players to quit, while brain points encourage persistence. Most importantly, we find that awarding brain points randomly is ineffective; the incentive structure is successful specifically because it rewards desirable growth mindset behaviors. These findings have important implications that can support the future generalization of the brain points intervention to new educational contexts.

Paper 9: The Civic Mission of MOOCs: Measuring Engagement across Political Differences in Forums by Justin Reich, Brandon Stewart, Kimia Mavon, Dustin Tingley

Keywords: MOOCs; civic education; discourse; text analysis; political ideology; structural topic model
In this study, we develop methods for computationally measuring the degree to which students engage in MOOC forums with other students holding different political beliefs. We examine a case study of a single MOOC about education policy, Saving Schools, where we obtain measures of student education policy preferences that correlate with political ideology. Contrary to assertions that online spaces often become echo chambers or ideological silos, we find that students in this case hold diverse political beliefs, participate equitably in forum discussions, directly engage (through replies and upvotes) with students holding opposing beliefs, and converge on a shared language rather than talking past one another. Research that focuses on the civic mission of MOOCs helps ensure that open online learning engages the same breadth of purposes that higher education aspires to serve.

Paper 10: How Mastery Learning Works at Scale by Steve Ritter, Michael Yudelson, Stephen E Fancsali, Susan R Berman

Keywords: Adaptive educational systems; intelligent tutors; mastery learning; big data; longitudinal data; educational outcomes
Nearly every adaptive learning system aims to present students with materials personalized to their level of understanding (Enyedy, 2014). Typically, such adaptation follows some form of mastery learning (Bloom, 1968), in which students are asked to master one topic before proceeding to the next topic. Mastery learning programs have a long history of success (Guskey and Gates, 1986; Kulik, Kulik & Bangert-Drowns, 1990) and have been shown to be superior to alternative instructional approaches.

Although there is evidence for the effectiveness of mastery learning when it is well supported by teachers, mastery learning’s effectiveness is crucially dependent on the ability and willingness of teachers to implement it properly. In particular, school environments impose time constraints and set goals for curriculum coverage that may encourage teachers to deviate from mastery-based instruction.

In this paper we examine mastery learning as implemented in Carnegie Learning’s Cognitive Tutor. As in all real-world systems, teachers and students have the ability to violate mastery learning guidance. We investigate patterns associated with violating and following mastery learning over the course of the full school year at the class and student level. We find that violations of mastery learning are associated with poorer student performance, especially among struggling students, and that this result is likely attributable to the violations themselves.

Paper 11: Using Multiple Accounts for Harvesting Solutions in MOOCs by Jose A. Ruiperez-Valiente, Giora Alexandron, Zhongzhou Chen, David E. Pritchard

Keywords: Academic dishonesty; educational data mining; learning analytics; MOOCs
The study presented in this paper deals with copying answers in MOOCs. Our findings show that a significant fraction of the certificate earners in the course that we studied used what we call harvesting accounts to find correct answers that they later submitted in their main account, the account for which they earned a certificate. In total, around 2.5% of the users who earned a certificate in the course obtained the majority of their points by using this method, and around 10% of them used it to some extent. This paper has two main goals. The first is to define the phenomenon and demonstrate its severity. The second is to characterize the key factors within the course that affect it and to suggest possible remedies that are likely to decrease the amount of cheating. The immediate implications of this study apply to MOOCs. However, we believe that the results generalize beyond MOOCs, since this strategy can be used in any learning environment that does not identify all registrants.

Paper 12: Peer Grading in a Course on Algorithms and Data Structures: Machine Learning Algorithms do not Improve over Simple Baselines by Mehdi S. M. Sajjadi, Morteza Alamgir, Ulrike von Luxburg

Keywords: machine learning; peer grading; peer assessment; peer review; L@S; ordinal analysis; rank aggregation
Peer grading is the process of students reviewing each other's work, such as homework submissions, and has lately become a popular mechanism used in massive open online courses (MOOCs). Intrigued by this idea, we used it in a course on algorithms and data structures at the University of Hamburg. Throughout the whole semester, students repeatedly handed in submissions to exercises, which were then evaluated both by teaching assistants and by a peer grading mechanism, yielding a large dataset of teacher and peer grades. We applied different statistical and machine learning methods to aggregate the peer grades in order to come up with accurate final grades for the submissions (supervised and unsupervised; methods based on numeric scores and on ordinal rankings). Surprisingly, none of them improves over the baseline of using the mean peer grade as the final grade. We discuss a number of possible explanations for these results and present a thorough analysis of the generated dataset.

Paper 13: Fuzz Testing Projects in Massive Courses by Sumukh Sridhara, Brian Hou, Jeffrey Lu, John DeNero

Keywords: automated assessment; behavioral analytics; online learning
Scaffolded projects with automated feedback are core instructional components of many massive courses. In subjects that include programming, feedback is typically provided by test cases constructed manually by the instructor. This paper explores the effectiveness of fuzz testing, a randomized technique for verifying the behavior of programs. In particular, we apply fuzz testing to identify when a student's solution differs in behavior from a reference implementation by randomly exploring the space of legal inputs to a program. Fuzz testing serves as a useful complement to manually constructed tests. Instructors can concentrate on designing targeted tests that focus attention on specific issues while using fuzz testing for comprehensive error checking. In the first project of a 1,400-student introductory computer science course, fuzz testing caught errors that were missed by a suite of targeted test cases for more than 48% of students. As a result, the students dedicated substantially more effort to mastering the nuances of the assignment.
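The differential-testing idea the abstract describes, comparing a student's solution against a reference implementation on randomly generated legal inputs, can be sketched in a few lines. This is a hypothetical illustration, not the authors' autograder; the function names and the input domain are assumptions:

```python
import random

def reference_impl(xs):
    # Instructor's known-correct solution: sum of squares.
    return sum(x * x for x in xs)

def student_impl(xs):
    # A buggy student solution that silently drops negative numbers.
    return sum(x * x for x in xs if x >= 0)

def fuzz_test(student, reference, trials=1000, seed=0):
    """Run both implementations on random legal inputs; return the first
    input on which their behavior differs, or None if none is found."""
    rng = random.Random(seed)
    for _ in range(trials):
        xs = [rng.randint(-100, 100) for _ in range(rng.randint(0, 20))]
        if student(xs) != reference(xs):
            return xs  # counterexample found
    return None  # no divergence observed in `trials` random inputs

counterexample = fuzz_test(student_impl, reference_impl)
```

Random exploration like this complements targeted tests: the instructor's hand-written cases probe known pitfalls, while the fuzzer sweeps the input space for divergences nobody anticipated.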

Paper 14: $1 Conversational Turn Detector: Measuring How Video Conversations Affect Student Learning in Online Classes by Adam Stankiewicz, Chinmay Kulkarni

Keywords: video discussions; turn taking; peer learning
Massive online classes can benefit from peer interactions such as discussion, critique, or tutoring. However, to scaffold productive peer interactions, systems must be able to detect student behavior in interactions at scale, which is challenging when interactions occur over rich media like video. This paper introduces an imprecise yet simple browser-based conversational turn detector for video conversations. Turns are detected without accessing video or audio data. We show how this turn detector can find dominance in video-based conversations. In a case study with 1,027 students using Talkabout, a video-based discussion system for online classes, we show how detected conversational turn behavior correlates with participants’ subjective experience in discussions and their final course grade.

Paper 15: Improving the Peer Assessment Experience on MOOC Platforms by Thomas Staubitz, Dominic Petrick, Matthias Bauer, Jan Renz, Christoph Meinel

Keywords: MOOC; Online Learning; Peer Assessment; Assessment.
Massive Open Online Courses (MOOCs) have revolutionized higher education by offering university-like courses to a large number of learners via the Internet.
This paper takes a closer look at peer assessment as a tool for delivering individualized feedback and engaging assignments to MOOC participants. Benefits, such as scalability for MOOCs and higher-order learning, and challenges, such as grading accuracy and rogue reviewers, are described. Common practices and the state of the art for counteracting these challenges are highlighted. Based on this research, we describe a peer assessment workflow and its implementation on the openHPI and openSAP MOOC platforms. This workflow combines the best practices of existing peer assessment tools and introduces some small but crucial improvements.

Paper 16: Explaining Student Behavior at Scale: The Influence of Video Complexity on Student Dwelling Time by Frans Van der Sluis, Jasper Ginn, Tim Van der Zee

Keywords: MOOCs; video; information complexity; dwelling time; learning analytics; student behavior.
Understanding why and how students interact with educational videos is essential to further improve the quality of MOOCs. In this paper, we look at the complexity of videos to explain two related aspects of student behavior: the dwelling time (how much time students spend watching a video) and the dwelling rate (how much of the video they actually see). Building on a strong tradition of psycholinguistics, we formalize a definition of information complexity in videos. Furthermore, building on recent advancements in time-on-task measures, we formalize dwelling time and dwelling rate based on click-stream trace data. The resulting computational model of video complexity explains 22.44% of the variance in the dwelling rate for students who finish watching a paragraph of a video. Video complexity and student dwelling show a polynomial relationship, where both low and high complexity increase dwelling. These results indicate why students spend more time watching (and possibly contemplating) a video. Furthermore, they show that even fairly straightforward proxies of student behavior, such as dwelling, can have multiple interpretations, illustrating the challenge of sense-making from learning analytics.

Paper 17: AXIS: Generating Explanations at Scale with Learnersourcing and Machine Learning by Joseph Jay Williams, Juho Kim, Anna Rafferty, Samuel Maldonado, Krzysztof Z. Gajos, Walter S. Lasecki, Neil Heffernan

Keywords: Explanation; learning at scale; crowdsourcing; learnersourcing; machine learning; adaptive learning
While explanations may help people learn by providing information about why an answer is correct, many problems on online platforms lack high-quality explanations. This paper presents AXIS (Adaptive eXplanation Improvement System), a system for obtaining explanations. AXIS asks learners to generate, revise, and evaluate explanations as they solve a problem, and then uses machine learning to dynamically determine which explanation to present to a future learner, based on previous learners' collective input. Results from a case study deployment and a randomized experiment demonstrate that AXIS elicits and identifies explanations that learners find helpful. Providing explanations from AXIS also objectively enhanced learning, when compared to the default practice where learners solved problems and received answers without explanations. The rated quality and learning benefit of AXIS explanations did not differ from explanations generated by an experienced instructor.
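The abstract does not name the algorithm AXIS uses to "dynamically determine which explanation to present." One common approach to this kind of adaptive selection is a multi-armed bandit over the candidate explanations. The sketch below is an assumption-laden illustration (the class name, the binary helpfulness-rating model, and the choice of Thompson sampling are all hypothetical, not the authors' implementation):

```python
import random

class ExplanationSelector:
    """Pick which learner-written explanation to show next, treating each
    explanation as a bandit arm with a Beta prior over its helpfulness."""

    def __init__(self, explanation_ids):
        # Beta(successes + 1, failures + 1) parameters per explanation.
        self.stats = {eid: [1, 1] for eid in explanation_ids}

    def choose(self, rng=random):
        # Thompson sampling: draw from each posterior, show the argmax,
        # so promising explanations are shown more while still exploring.
        return max(self.stats,
                   key=lambda eid: rng.betavariate(*self.stats[eid]))

    def record(self, eid, helpful):
        # Update the posterior from a learner's helpfulness rating.
        self.stats[eid][0 if helpful else 1] += 1

# Simulated deployment: expl_b is rated helpful far more often than expl_a.
random.seed(42)
selector = ExplanationSelector(["expl_a", "expl_b"])
for _ in range(200):
    shown = selector.choose()
    rating = random.random() < (0.9 if shown == "expl_b" else 0.2)
    selector.record(shown, helpful=rating)
```

Under this simulation the selector concentrates on the explanation learners rate as more helpful, which mirrors the paper's described behavior of routing future learners toward the collectively best explanations.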

Paper 18: The Role of Social Media in MOOCs: How to Use Social Media to Enhance Student Retention by Saijing Zheng, Kyungsik Han, Mary Beth Rosson, John M. Carroll

Keywords: Massive Open Online Course; MOOCs; Social Media; Facebook; Coursera; Mixed Method
Massive Open Online Courses (MOOCs) have experienced rapid development. However, the high dropout rate has become a salient issue. Many studies have attempted to understand this phenomenon; others have explored mechanisms for enhancing retention. For instance, social media has been used to improve student engagement and retention. However, there is a lack of (1) empirical studies of social media use and engagement compared to embedded MOOC forums; and (2) rationales for social media use from both instructors’ and students’ perspectives. We addressed these open issues by collecting and analyzing real usage data from three MOOC forums and their associated social media (i.e., Facebook) groups, as well as by conducting interviews with instructors and students. We found that students show higher engagement and retention in social media than in MOOC forums, and we identified the instructor and student perspectives that lead to these results. We discuss design implications for future MOOC platforms.

Work in progress

Paper 1: Bringing Non-programmer Authoring of Intelligent Tutors to MOOCs by Vincent Aleven, Ryan Baker, Yuan Wang, Jonathan Sewall, Octav Popescu

Keywords: Intelligent tutoring systems, ITSs, MOOCs, Interoperability, Feasibility study, Log data analysis
Learning-by-doing in MOOCs may be enhanced by embedding intelligent tutoring systems (ITSs). ITSs support learning-by-doing by guiding learners through complex practice problems while adapting to differences among learners. We extended the Cognitive Tutor Authoring Tools (CTAT), a widely-used non-programmer toolkit for building intelligent tutors, so that CTAT-built tutors can be embedded in MOOCs and e-learning platforms. We demonstrated the technical feasibility of this integration by adding simple CTAT-built tutors to an edX MOOC, “Big Data in Education.” To the best of our knowledge, this is the first occasion on which material created through an open-access non-programmer authoring tool for full-fledged ITSs has been integrated into a MOOC. The work offers examples of key steps that may be useful in other ITS-MOOC integration efforts, together with reflections on strengths, weaknesses, and future possibilities.

Paper 2: Automatically Learning to Teach to the Learning Objectives by Rika Antonova, Joe Runde, Min Hyung Lee, Emma Brunskill

Keywords: Automated instructional design; machine learning; data-driven improvement
We seek to automatically identify which items to include in a curriculum, and how to adaptively select among these items, in order to maximize student performance on a specified set of learning objectives. Our experimental results with a histogram tutoring system suggest that Bayesian optimization can quickly (with only a small amount of student data) find good parameters, and may help instructors identify misalignment between their course and their desired learning objectives.

Paper 3: PEER Support In MOOCs: The Role Of Social Presence by Kwamena Appiah-Kubi, Duncan Rowland

Keywords: MOOC; Social Interaction; Group Cohesion; Social Presence; Community of Inquiry
MOOCs by their design are able to reach many thousands of participants with very few instructors creating, delivering, and facilitating the content. Participants interact with each other, usually through text-based asynchronous discussion forums built into the MOOC platform. The purpose of this research is to explore the role of social presence in facilitating peer support among a large community of learners.

Paper 4: A Framework for Topic Generation and Labeling from MOOC Discussions by Thushari Atapattu, Katrina Falkner

Keywords: MOOC; discussion forum; topic modeling; Naive Bayes; Learning Analytics; NLP, Latent Dirichlet Allocation
This study proposes a standardised open framework to automatically generate and label discussion topics from Massive Open Online Courses (MOOCs). The proposed framework is expected to overcome the issues experienced by MOOC participants and teaching staff in effectively locating and navigating to the information they need. We analysed two MOOCs, Machine Learning and Statistics: Making Sense of Data, both offered during 2013, and obtained statistically significant results for automated topic labeling. However, more experiments with additional MOOCs from different MOOC platforms are necessary to generalise our findings.

Paper 5: Promoting Student Engagement in MOOCs by Jiye Baek, Jesse Shore

Keywords: MOOCs; discussion forum; cohort size; student engagement; student retention; performance
MOOCs offer valuable learning experiences to students from all around the world. In addition to providing filmed lectures, readings, and problem sets, many MOOCs allow students to ask and answer questions about course materials with each other through interactive user forums. However, in current MOOCs, only 3 to 5 percent of students interact in the user forum (Breslow 2013, Rosé et al. 2014) and more than 90 percent of students stop attending the course altogether (Jordan 2014). According to prior studies, this low level of social engagement in MOOCs may lead to student attrition and low performance (Ren et al. 2007). Hence, a natural question arises: how can we promote interaction among students in MOOC discussion forums in order to reduce student attrition and raise performance? In this paper, we conduct a field experiment on the edX platform to identify factors that promote student engagement in MOOC discussion forums. Researchers have found that the number of people interacting in one online location (e.g., group, community, or virtual classroom size) is a key characteristic mediating user engagement (Butler et al. 2014), and most prior work has shown that users in smaller groups participate more per person. However, contrary to prior research, our results show that students in larger cohorts interact more per person and that this greater interaction in turn increases student retention and performance.

Paper 6: Towards Cross-domain MOOC Forum Post Classification by Aneesha Bakharia

Keywords: Classification; Forum; MOOC; Domain Adaptation
Preliminary research is presented on the generalisability of confusion, urgency, and sentiment classifiers for MOOC forum posts. The Stanford MOOCPosts data set is used to train classifiers with forum posts from individual courses and to validate these classifiers on MOOC forum posts from other domain areas. While low cross-domain classification accuracy is achieved, the experiment highlights the need for transfer learning and domain adaptation algorithms and provides insight into the types of algorithms required within an educational context.

Paper 7: Exploring the Effects of Lightweight Social Incentives on Learner Performance in MOOCs by Katherine Brady, Douglas Fisher, Gayathri Narasimham

Keywords: Incentives; Completion; Community TAs
We are exploring the effects of social incentives and motivation on learner performance in a massive open online course. In the preliminary study that we report here, we asked learners if they wanted to be considered for a community TAship in a subsequent offering of the course if they finished in the top 20% of those who completed the current course instance. We prompted students near the beginning of the course and in the middle of the course. This prompt appears to have had a significant, albeit small, effect on learner completion when given early in the course. The prompt had no significant effect when given later in the course. We also discuss our plans to follow up on this study.

Paper 8: Personalized Adaptive Learning using Neural Networks by Devendra Singh Chaplot, Eunhee Rhim, Jihie Kim

Keywords: Adaptive Learning; Neural Networks; Learner Model; Instructional Model; Student Model; Personalized Item Selection;
Adaptive learning is the core technology behind intelligent tutoring systems, which are responsible for estimating student knowledge and providing personalized instruction to students based on their skill level. In this paper, we present a new adaptive learning system architecture that uses an artificial neural network to construct the learner model, automatically modeling the relationships between different concepts in the curriculum and beating Knowledge Tracing in predicting student performance. We also propose a novel method for selecting items of optimal difficulty, personalized to the student’s skill level and learning rate, which decreases learning time by 26.5% compared to a standard pre-defined curriculum-sequence item selection policy.

Paper 9: Open-DLAs: An Open Dashboard for Learning Analytics by Ruth Cobos, Silvia Gil, Angel Lareo, Francisco A. Vargas

Keywords: MOOCs, Learning Analytics, Dashboard
In this paper a learning analytics dashboard for MOOCs is proposed. It visualises the progress of learners’ activity, taking into account navigation, social interactions, and interaction with educational resources. This approach was tested with the MOOCs created by the Universidad Autónoma de Madrid (Spain) on the edX platform. The dashboard is currently being improved based on feedback received from MOOC instructors and assistants. Finally, a new version is presented that works with both edX and Open edX.

Paper 10: Supporting Scalable Data Sharing in Online Education by Stephen Cummins, Alastair R Beresford, Ian Davies, Andrew Rice

Keywords: Privacy; Authorisation; Review; Data Sharing
Online educational tools often generate learning data, and sharing such data between tutors and students can often improve learning outcomes. Unfortunately, the process of sharing learning data today is not always transparent to students. Our aim is to improve the transparency and user-control aspects of data sharing whilst maintaining the educational utility of sharing data between tutors and students. To do so, we start by surveying the possible methods of sharing data, and we use this to design a token-based scheme for facilitating data sharing. We implemented our scheme and observed it in use by 7,798 students over the course of one year. We find that our proposed scheme provides a good balance between transparency, user control, educational utility, and scalability.

Paper 11: Investigating the Use of Hints in Online Problem Solving by Stephen Cummins, Alistair Stead, Lisa Jardine-Wright, Ian Davies, Alastair R Beresford, Andrew Rice

Keywords: Problem Solving; Scaffolding; Hints; Physics
We investigate the use of hints as a form of scaffolding for 4,652 eligible users on a large-scale online learning environment called Isaac, which allows users to answer physics questions with up to five hints.

We investigate user behaviour when using hints, users' engagement with fading (the process of gradually becoming less reliant on the hints provided), and hint strategies including Decomposition, Correction, Verification, and Comparison.

Finally, we present recommendations for the design and development of online teaching tools that provide open access to hints, including a mechanism that may improve the speed at which users begin fading.

Paper 12: Work in Progress: Student Behaviors Using Feedback in a Blended Physics Undergraduate Classroom by Jennifer DeBoer, Lori Breslow

Keywords: MOOC platforms; blended learning; immediate feedback
Two major benefits of Massive Open Online Course platforms are their collection of fine-grained data on student interactions with the course website and their capacity to give students immediate feedback on their work. We study the patterns of students' usage of immediate feedback in an undergraduate physics course that uses blended learning, and we present informative aggregate descriptives from this 474-student class. We find that overall student study strategies mirror those in "traditional" courses, that students strategically use the auto-checking feature of the platform, and that they extensively use the other content resources available to them on the platform. Several of these findings support educational research that has not had the benefit of the data MOOC platforms give us access to. Better understanding of how students engage with blended learning will aid residential instructors in tailoring in-class time and providing their students with recommendations for approaches to studying.

Paper 13: Challenge and Potential of Fine Grain, Cross-Institutional Learning Data by Alan Dix

Keywords: learning analytics; education technology; reading lists; MOOCs; OER; linked data
While MOOCs and other forms of large-scale learning are of growing importance, the vast majority of tertiary students still study in traditional face-to-face settings. This paper examines some of the challenges in attempting to apply the benefits of large-scale learning to these settings, building on a growing repository of cross-institutional data.

Paper 14: Beetle-Grow: An Effective Intelligent Tutoring System for Data Collection by Elaine Farrow, Myroslava O Dzikovska, Johanna D Moore

Keywords: intelligent tutoring; natural language; interaction data; conceptual learning; physics; electronics
We present the Beetle-Grow intelligent tutoring system, which combines active experimentation, self-explanation, and formative feedback using natural language interaction. It runs in a standard web browser and has a fresh, engaging design. The underlying back-end system has previously been shown to be highly effective in teaching basic electricity and electronics concepts.

Beetle-Grow has been designed to capture student interaction and indicators of learning in a form suitable for data mining, and to support future work on building tools for interactive tutoring that improve after experiencing interaction with students, as human tutors do.

Paper 15: Predicting Students’ Standardized Test Scores Using Online Homework by Mingyu Feng, Jeremy Roschelle

Keywords: Online math homework; log analysis, prediction
How students do homework has been under-researched relative to classroom learning because it is more difficult to collect data on students' homework behaviors. Presumably, such data would have implications for students' achievement. To understand how students do homework, and how homework performance and behaviors relate to end-of-year standardized test scores, we analyzed the system logs from an online homework support platform used by more than 1,500 seventh-grade students in Maine.

Paper 16: Learning at Scale: Using an Evidence Hub To Make Sense of What We Know by Rebecca Ferguson

Keywords: Ethics; evidence; Evidence Hub; learning; learning analytics; teaching
The large datasets produced by learning at scale, and the need for ways of dealing with high learner/educator ratios, mean that MOOCs and related environments are frequently used for the deployment and development of learning analytics. Despite the current proliferation of analytics, there is as yet relatively little hard evidence of their effectiveness. The Evidence Hub developed by the Learning Analytics Community Exchange (LACE) provides a way of collating and filtering the available evidence in order to support the use of analytics and to target future studies to fill the gaps in our knowledge.

Paper 17: Designing for Open Learning: Design Principles and Scalability Affordances in Practice by Olga Firssova, Francis Brouns, Marco Kalz

Keywords: Open online learning; MOOCs; design principles; active learning; scalability
This work-in-progress paper elaborates on a gradually evolving approach to the design of open learning, and on the design principles used by the Open University of the Netherlands in short open courses (online masterclasses) and in Massive Open Online Courses delivered both in the university's own learning environment and in EMMA, an experimental multilingual MOOC aggregator developed as part of a European project. As the paper demonstrates, these principles can be seen as building blocks of an open, scalable design for active and engaging learning.

Paper 18: LINK-REPORT: Outcome Analysis of Informal Learning at Scale by Xiang Fu, Tyler Befferman, M. D. Burghardt

Keywords: outcome assessment; automated grading
We present LINK-REPORT, a distributed learning outcome analysis module integrated with the WISEngineering platform for supporting informal learning in engineering. LINK-REPORT provides a coherent workflow for outcome analysis: from the development of learning outcome goals, to learner behavior collection, to automated grading of open-ended short-answer questions, to report generation and aggregation. It generates learning data that open up research opportunities in modeling learner traits.

Paper 19: Observing URL Sharing Behaviour in Massive Online Open Courses by Silvia Gallagher, Timothy Savage

Keywords: Massive Online Open Courses; Uniform Resource Locators; Online learner behaviour.
Information sharing is a key activity in Massive Online Open Course (MOOC) user behavior. Sharing Uniform Resource Locators (URLs) has been identified as a means for individuals in online spaces to build social relationships, construct knowledge, and disseminate information; however, this activity has not been investigated within the MOOC space. This paper presents an observational study of URL sharing within MOOCs, and explores how a MOOC learning community responded to this micro-behaviour. The research examined 1,471 comments from 416 learners who displayed URL sharing behavior across two iterations of the 'Irish Lives' FutureLearn MOOC from Trinity College Dublin, the University of Dublin. The analysis identified patterns of behavior among 'URL Sharers', and suggests that this activity could support greater learner interaction. Although causality is not implied, the results of this analysis contribute a tentative new understanding of URL sharing in MOOCs and identify a new MOOC micro-behaviour, which can help MOOC practitioners make design choices.

Paper 20: Scaling up Online Question Answering via Similar Question Retrieval by Chase Geigle, ChengXiang Zhai

Keywords: Community question answering; information retrieval
Faced with growing class sizes and the dawn of the MOOC, educators need tools to help them cope with the growing number of questions asked in large classes, since manually answering all the questions in a timely manner is infeasible. In this paper, we propose to exploit historical question/answer data accumulated for the same or similar classes as a basis for automatically answering previously asked questions using information retrieval techniques. We further propose to leverage resolved questions to create test collections for quantitative evaluation of a question retrieval algorithm without requiring additional human effort. Using this evaluation methodology, we study the effectiveness of state-of-the-art retrieval techniques on this special retrieval task, and perform error analysis to inform future directions.
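The abstract does not name the retrieval model used, but the core idea of matching a new question against resolved historical ones can be sketched with plain TF-IDF weighting and cosine similarity. All function names below are hypothetical, and a real system would use a mature library rather than this minimal implementation:

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Build simple TF-IDF vectors (dicts) for tokenized documents."""
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(doc))                      # document frequency per term
    idf = {t: math.log(n / df[t]) for t in df}
    vecs = []
    for doc in docs:
        tf = Counter(doc)
        vecs.append({t: tf[t] * idf[t] for t in tf})
    return vecs, idf

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def retrieve(query, history):
    """Rank historical questions by TF-IDF cosine similarity to a query."""
    docs = [q.lower().split() for q in history]
    vecs, idf = tfidf_vectors(docs)
    qtf = Counter(query.lower().split())
    qvec = {t: qtf[t] * idf.get(t, 0.0) for t in qtf}
    scores = [(cosine(qvec, v), q) for v, q in zip(vecs, history)]
    return sorted(scores, reverse=True)
```

A query such as "pointer in c" would then rank a historical question about C pointers above unrelated ones, and its stored answer could be surfaced automatically.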

Paper 21: Assessing Problem-Solving Process At Scale by Shuchi Grover, Marie Bienkowski, John Niekrasz, Matthias Hauswirth

Keywords: Problem solving; evidence-centered design; assessment; programming process; K-12 computing
Authentic problem solving tasks in digital environments are often open-ended with ill-defined pathways to a goal state. Scaffolds and formative feedback during this process help learners develop the requisite skills and understanding, but require assessing the problem-solving process. This paper describes a hybrid approach to assessing process at scale in the context of the use of computational thinking practices during programming. Our approach combines hypothesis-driven analysis, using an evidence-centered design framework, with discovery-driven data analytics. We report on work-in-progress involving novices and expert programmers working on Blockly games.

Paper 22: The Unexpected Pedagogical Benefits of Making Higher Education Accessible by David A. Joyner, Ashok K. Goel, Charles Isbell

Keywords: Online education; accessibility; higher education.
Many ongoing efforts in online education aim to increase accessibility through affordability and flexibility, but some critics have noted that pedagogy often suffers during these efforts. In contrast, in the low-cost for-credit Georgia Tech Online Masters of Science in Computer Science (OMSCS) program, we have observed that the features that make the program accessible also lead to pedagogical benefits. In this paper, we discuss the pedagogical benefits, and draw a causal link between those benefits and the factors that increase the program's accessibility.

Paper 23: Expert Evaluation of 300 Projects per Day by David A. Joyner

Keywords: Online education; evaluation; feedback.
In October 2014, one-time MOOC developer Udacity completed its transition from primarily producing massive, open online courses to producing job-focused, project-based microcredentials called "Nanodegree" programs. With this transition came a challenge: whereas MOOCs focus on automated assessment and peer-to-peer grading, project-based microcredentials would only be feasible with expert evaluation. With dreams of enrolling tens of thousands of students at a time, the major obstacle became project evaluation. To address this, Udacity developed a system for hiring external experts as project reviewers. A year later, this system has supported project evaluation on a massive scale: 61,000 projects have been evaluated in 12 months, with 50% evaluated within 2.5 hours (and 88% within 24 hours) of submission. More importantly, students rate the feedback they receive very highly at 4.8/5.0. In this paper, we discuss the structure of the project review system, including the nature of the projects, the structure of the feedback, and the data described above.

Paper 24: A Preliminary Look at MOOC-associated Facebook Groups Prevalence, Geographic Representation, and Homophily by Anna Kasunic, Jessica Hammer, Robert Kraut, Michael Massimi, Amy Ogan

Keywords: Massive Open Online Courses; MOOCs; Facebook; Social Media; Homophily
Although xMOOCs are not designed to directly engage students via social media platforms, some students in these courses join MOOC-associated Facebook groups. This study explores the prevalence of Facebook groups associated with courses from MITx and HarvardX, the geographic distribution of students in such groups as compared to the courses at large, and the extent to which such groups are location- and/or language-homophilous. Results suggest that a non-trivial number of MOOC students engage in Facebook groups, that learners from a number of non-U.S. locations are disproportionately likely to participate in such groups, and that the groups display both location and language homophily. These findings have implications for how MOOCs and social media platforms can support learners from non-English-speaking contexts.

Paper 25: The Distributed Esteemed Endorser Review: A Novel Approach to Participant Assessment in MOOCs by Jennifer S. Kay, Tyler J. Nolan, Thomas M. Grello

Keywords: Assessment; MOOC; DEER; Esteemed Endorser; Robot Programming; Educational Robotics
One of the most challenging aspects of developing a Massive Open Online Course (MOOC) is designing an accurate method to effectively assess participant knowledge and skills. The Distributed Esteemed Endorser Review (DEER) approach has been developed as an alternative for those MOOCs where traditional approaches to assessment are not appropriate. In DEER, course projects are certified in-person by an “Esteemed Endorser”, an individual who is typically senior in rank to the student, but is not necessarily an expert in the course content. Not only does DEER provide a means to certify that course goals have been met, it also provides MOOC participants with the opportunity to share information about what they have learned with others at the local level.

Paper 26: Optimizing the Amount of Practice in an On-Line Platform by Kim M Kelly, Neil T Heffernan

Keywords: On-Line learning platform; mastery learning; MOOCs; personalized instruction
Intelligent tutoring systems are known for providing customized learning opportunities for thousands of users. One feature of many systems is differentiating the amount of practice users receive. To do this, some systems rely on a threshold of consecutive correct responses. For instance, Khan Academy used to use ten correct in a row and now uses five correct in a row as the mastery threshold. The present research uses a series of randomized controlled trials, conducted in an online learning platform, to explore the effects of different thresholds of consecutive correct responses on learning. Results indicate that, despite students spending significantly more time practicing, there is no significant difference in learning between thresholds of two, three, four, or five consecutive correct responses. This suggests that such systems, and MOOCs, can employ the simple rule of two or three consecutive correct responses when determining the amount of practice provided to users.
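The mastery rule being compared is simple enough to state in a few lines. This sketch (the function name is illustrative, not taken from the paper's platform) counts how many items a student attempts before a given consecutive-correct threshold is reached:

```python
def practice_needed(responses, threshold=3):
    """Return the number of items attempted before reaching `threshold`
    consecutive correct responses -- the mastery rule whose values
    (2..5, or Khan Academy's former 10) the paper compares.
    `responses` is an iterable of booleans; returns None if mastery
    is never reached."""
    streak = 0
    for i, correct in enumerate(responses, start=1):
        streak = streak + 1 if correct else 0  # a miss resets the streak
        if streak >= threshold:
            return i
    return None
```

For the response sequence correct, wrong, correct, correct, correct, a threshold of 2 stops after four items while a threshold of 3 requires all five, which is the kind of extra practice time the trials measured.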

Paper 27: A Constructionist Toolkit for Learning Elementary Web Development at Scale by Meen Chul Kim, Thomas H. Park, Brian Lee, Sukrit Chhabra, Andrea Forte

Keywords: Computing education at scale; constructionism; informal learning; web development
We describe the design rationale and principles of a web authoring tool for beginner web developers called Snowball. We explain how this constructionist toolkit exposes content creators to computational features of web pages, enabling them to create meaningful artifacts while they move from content creation to basic coding. Finally, we discuss how we are instrumenting learning analytics in Snowball.

Paper 28: Elice: An online CS Education Platform to Understand How Students Learn Programming by Suin Kim, Jae Won Kim, Jungkook Park, Alice Oh

Keywords: Online education;online programming;social learning;collaborative learning;computer science education
We present Elice, an online CS (computer science) education platform, and Elivate, a system that takes student learning data from Elice and infers students' progress through an educational taxonomy tailored for programming education. Elice captures detailed student learning activities, such as the intermediate revisions of code as students make progress toward completing their programming exercises. With these data, Elivate recognizes each student's progression through an educational taxonomy that organizes intermediate stages of learning, so that the taxonomy can be used to evaluate student progress as well as to design and improve course materials and structure. With more than 240,000 intermediate versions of source code generated by 1,000 students, we demonstrate the practicality of Elice and Elivate. We present case studies confirming that categorizing student actions into the different steps of the taxonomy leads to a better understanding of the effect of TA assistance and of students' performance.

Paper 29: Recommending Self-Regulated Learning Strategies Does Not Improve Performance in a MOOC by René F. Kizilcec, Mar Pérez-Sanagustín, Jorge J. Maldonado

Keywords: Massive Open Online Course; Self-Regulated Learning
Many committed learners struggle to achieve their goal of completing a Massive Open Online Course (MOOC). This work investigates self-regulated learning (SRL) in MOOCs and tests if encouraging the use of SRL strategies can improve course performance. We asked a group of 17 highly successful learners about their own strategies for how to succeed in a MOOC. Their responses were coded based on a SRL framework and synthesized into seven recommendations. In a randomized experiment, we evaluated the effect of providing those recommendations to learners in the same course (N = 653). Although most learners rated the study tips as very helpful, the intervention did not improve course persistence or achievement. Results suggest that a single SRL prompt at the beginning of the course provides insufficient support. Instead, embedding technological aids that adaptively support SRL throughout the course could better support learners in MOOCs.

Paper 30: Peer Reviewing Short Answers using Comparative Judgement by Pushkar Kolhe, Michael L Littman, Charles L Isbell

Keywords: peer grading; crowd sourcing
We propose a comparative judgement scheme for grading short-answer questions in an online class. The scheme works by asking students to answer short-answer questions; a multiple-choice question is then created whose choices are the answers given by students. We show that we can formulate a probabilistic graphical model for this scheme which lets us infer each student's proficiency at answering and grading questions.

Paper 31: Profiling MOOC Course Returners: How Does Student Behavior Change Between Two Course Enrollments? by Vitomir Kovanovic, Srecko Joksimovic, Dragan Gaševic, James Owers, Anne-Marie Scott, Amy Woodgate

Keywords: MOOCs; Clustering; Learning Analytics; Student Behavior; Self-regulated learning; Educational technology use
Massive Open Online Courses represent fertile ground for examining student behavior. However, due to their openness, MOOCs attract a diverse body of students who are, for the most part, unknown to the course instructors. A certain number of students nevertheless enroll in the same course multiple times, and the records of their previous learning activities might provide useful information to course organizers before the start of the course. In this study, we examined how student behavior changes between subsequent course offerings. We identified profiles of returning students, as well as interesting changes in their behavior between two enrollments in the same course. Results and their implications are discussed.

Paper 32: Optimally Discriminative Choice Sets in Discrete Choice Models: Application to Data-Driven Test Design by Igor Labutov, Kelvin Luu, Hod Lipson, Christoph Studer

Keywords: optimal experiment design; active learning; discrete choice model; optimal test; test design
Difficult test questions can be made easy by providing a set of possible answer options of which most are obviously wrong. In the education literature, a plethora of instructional guides exist for crafting a suitable set of wrong choices (distractors) in order to probe the students' understanding of the tested concept. The art of multiple-choice question design thus hinges on the question-maker's experience and knowledge of the potential misconceptions. In contrast, we advocate a data-driven approach, where correct and incorrect options are assembled directly from the students' own past submissions. Large-scale online classroom settings, such as massively open online courses (MOOCs), provide an opportunity to design optimal and adaptive multiple-choice questions that are maximally informative about the students' level of understanding of the material. We deploy a multinomial-logit discrete choice model for the setting of multiple choice testing, derive an optimization objective for selecting optimally discriminative option sets, and demonstrate the effectiveness of our approach via a user study.

Paper 33: Illusion of Progress is Moar Addictive than Cat Pictures by Leo Leppänen, Lassi Vapaakallio, Arto Vihavainen

Keywords: multiple choice questions; visual rewards; visual feedback; motivation; student engagement; gamification; pictures of cats; progress bar
We conducted two studies on the effect of visual reward mechanisms on engagement with an online learning material. In the first study, we examined the effect of showing cat pictures as a reward for correct and incorrect answers to multiple-choice questions, and in the second study, we created an illusion of progress using a progress bar that showed step-wise increments as students answered the questions. Our results show that the use of cat pictures as a visual reward mechanism does not significantly increase students' engagement with learning materials. At the same time, students who were shown progress bars had a statistically significant increase in the quantity of answers -- on average 88% more answers per day. However, our results also indicate that this effect declines over time, suggesting that students catch on to the illusion.

Paper 34: A Scalable Learning Analytics Platform for Automated Writing Feedback by Nicholas Lewkow, Jacqueline Feild, Neil Zimmerman, Mark Riedesel, Alfred Essa, David Boulanger, Jeremie Seanosky, Vive Kumar, Kinshuk, Sandhya Kode

Keywords: Analytic Tools for Learners; Automatic Essay Feedback; Scalable Analytics; Performance Feedback; Natural Language Processing
In this paper, we describe a scalable learning analytics platform which runs generalized analytics models on educational data in parallel. As a proof of concept, we use this platform as a base for an end-to-end automated writing feedback system. The system allows students to view feedback on their writing in near real-time, edit their writing based on the feedback provided, and observe the progression of their performance over time. Providing students with detailed feedback is an important part of improving writing skills and an essential component towards solving Bloom's "two sigma" problem in education. We evaluate the effectiveness of the feedback for students with an ongoing pilot study with 800 students who are using the learning analytics platform in a college English course.

Paper 35: Macro Data for Micro Learning: Developing the FUN! Tool for Automated Assessment of Learning by Taylor Martin, Sarah Brasiel, Soojeong Jeong, Kevin Close, Kevin Lawanto

Keywords: Micro learning; Digital learning environments; Educational data mining; Assessment
Digital learning environments are becoming more common for students to engage in, both during and outside of school. With the immense amount of data now available from these environments, researchers need tools to process, manage, and analyze the data. Current methods used by many education researchers are inefficient, yet without data science experience, the tools used in other professions are not accessible. In this paper, we present a tool we created called the Functional Understanding Navigator! (FUN! Tool). We have used this tool in several research projects, which has allowed us to (1) organize our workflow process from start to finish, (2) record log data of all of our analyses, and (3) provide a platform to share our analyses with others through GitHub. This paper extends and improves existing work in educational data mining and learning analytics.

Paper 36: Predicting Student Learning using Log Data from Interactive Simulations on Climate Change by Elizabeth McBride, Jonathan M Vitale, Hannah Gogel, Mario M Martinez, Zachary Pardos, Marcia C Linn

Keywords: Inquiry learning; Data mining; Action log data; Science education
Interactive simulations are commonly used tools in technology enhanced education. Simulations can be a powerful tool for allowing students to engage in inquiry, especially in science disciplines. They can help students develop an understanding of complex science phenomena in which multiple variables are at play. Developing models for complex domains, like climate science, is important for learning. Equally important, though, is understanding how students use these simulations. Finding use patterns that lead to learning will allow us to develop better guidance for students who struggle to extract the useful information from the simulation. In this study, we generate features from action log data collected while students interacted with simulations on climate change. We seek to understand what types of features are important for student learning by using regression models to map features onto learning outcomes.

Paper 37: Beyond Traditional Metrics: Using Automated Log Coding to Understand 21st Century Learning Online by Denise Nacu, Caitlin K. Martin, Michael Schutzenhofer, Nicole Pinkard

Keywords: Log analysis; educational data mining; online learning; teaching roles; 21st century learning
While log analysis in massively open online courses and other online learning environments has mainly focused on traditional measures, such as completion rates and views of course content, research is responding to calls for analytic frameworks that are more reflective of social learning models. We introduce a generalizable approach to automatically code log data that highlights educator support roles and student actions that are consistent with recent conceptualizations of 21st century learning, such as creative production, self-directed learning, and social learning. Here, we describe details of a log-coding framework that builds from prior mixed method studies of the use of iRemix, an online social learning network, by middle school youth and adult educators in blended learning contexts.

Paper 38: Supporting Peer Instruction with Evidence-Based Online Instructional Templates by Tricia Ngoon, Alexander Gamero-Garrido, Scott Klemmer

Keywords: online education; peer learning; peer instruction; templates; learning
This work examines whether templates designed from principles of multimedia learning design and learning sciences research can support peer instruction in creating more effective educational content on the web. Initial results show that the structure and guidelines within these templates can help novices produce meaningful learning content while improving the overall learning experience. This experiment provides insights into how to design and implement structured outlines online for web users to share learning content, and may shift researchers' focus toward more learner-centered online education.

Paper 39: Designing Videos with Pedagogical Strategies: Online Students' Perceptions of Their Effectiveness by Chaohua Ou, Ashok K Goel, David A Joyner, Daniel F Haynes

Keywords: artificial intelligence; educational videos; learning at scale; MOOCs; online learning
Despite the ubiquitous use of videos in online learning and an enormous literature on designing online learning, there has been relatively little research on which pedagogical strategies make the most of video lessons and what constitutes an effective video for student learning. We experimented with a model incorporating four pedagogical strategies, four instructional phases, and four production guidelines in designing and developing video lessons for an online graduate course. In this paper, we share our experience as well as students' perceptions of the lessons' effectiveness. We also discuss what remains to be done in future research.

Paper 40: Making the Production of Learning at Scale More Open and Flexible by Tina Papathoma, Rebecca Ferguson, Allison Littlejohn, Angela Coe

Keywords: Online Courses; Professional Learning; Activity Theory; Innovation; Case Study
Professional learning is a critical component of the ongoing improvement, innovation and adoption of new practices that support learning at scale. In this context, educators must learn how to apply digital technologies and work effectively in digital networks. This study examines how higher education professionals adapted their practice to enable more open and flexible work processes. A case study carried out using Activity Theory showed that teams involved in the development of a module all need access to a range of expertise both practical and academic. At each stage, they need to be clear about the learning outcomes of the module, the responsibilities of each team and its constraints. Teams need to be willing to agree ways to shift those constraints in order to develop a module effectively.

Paper 41: Learning Student and Content Embeddings for Personalized Lesson Sequence Recommendation by Siddharth Reddy, Igor Labutov, Thorsten Joachims

Keywords: Probabilistic Embedding; Sequence Recommendation; Adaptive Learning
Students in online courses generate large amounts of data that can be used to personalize the learning process and improve the quality of education. In this paper, we present the Latent Skill Embedding (LSE), a probabilistic model of students and educational content that can be used to recommend personalized sequences of lessons with the goal of helping students prepare for specific assessments. Akin to collaborative filtering for recommender systems, the algorithm does not require students or content to be described by features; instead, it learns a representation from access traces. We formulate this problem as a regularized maximum-likelihood embedding of students, lessons, and assessments from historical student-content interactions. Empirical findings on large-scale data from Knewton, an adaptive learning technology company, show that this approach predicts assessment results competitively with benchmark models and is able to discriminate between lesson sequences that lead to mastery and failure.

Paper 42: A Queueing Network Model for Spaced Repetition by Siddharth Reddy, Igor Labutov, Siddhartha Banerjee

Keywords: Spaced Repetition; Flashcard Scheduling
Flashcards are a popular study tool for exploiting the spacing effect -- the phenomenon in which periodic, spaced review of educational content improves long-term retention. The Leitner system is a simple heuristic algorithm for scheduling reviews such that forgotten items are reviewed more frequently than recalled items. We propose a formalization of the Leitner system as a queueing network model, and formulate optimal review scheduling as a throughput-maximization problem. Through simulations and theoretical analysis, we find that the Leitner Queue Network (LQN) model has desirable properties and gives insight into general principles for spaced repetition.
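For readers unfamiliar with the heuristic being formalized, here is a toy simulation of the classic Leitner schedule. The forgetting model and all parameters below are assumptions for illustration only; the paper's contribution is the queueing-network formalization, not this sketch:

```python
import random

def leitner_review(deck, n_boxes=5, sessions=60, recall_prob=None):
    """Simulate the Leitner system: each card lives in a numbered box;
    a correct recall promotes it one box, a miss demotes it to box 1.
    Box k is reviewed every 2**(k-1) sessions, so forgotten items are
    seen more often than recalled ones -- the heuristic the paper
    formalizes as a queueing network."""
    if recall_prob is None:
        # assumed toy forgetting model: cards in higher boxes are
        # better learned, so recall is more likely (capped at 0.95)
        recall_prob = lambda box: min(0.95, 0.4 + 0.12 * box)
    boxes = {card: 1 for card in deck}           # all cards start in box 1
    for session in range(1, sessions + 1):
        for card, box in list(boxes.items()):
            if session % (2 ** (box - 1)) == 0:  # box k due every 2^(k-1) sessions
                if random.random() < recall_prob(box):
                    boxes[card] = min(n_boxes, box + 1)  # recalled: promote
                else:
                    boxes[card] = 1                      # forgotten: demote
    return boxes
```

Running the simulation shows cards drifting toward higher boxes (longer review intervals) as they are learned, which is the throughput behaviour the LQN model analyzes.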

Paper 43: Enabling Schema Agnostic Learning Analytics in a Service-Oriented MOOC Platform by Jan Renz, Gerado Navarro-Suarez, Rowshan Sathi, Thomas Staubitz, Christoph Meinel

Keywords: Service oriented architecture; MOOC; Learning Analytics
This paper describes the design and implementation of an analytics service that retrieves live usage data from students enrolled in a service-oriented MOOC platform for the purpose of learning analytics (LA) research. A real-time, extensible architecture for consolidating and processing data in versatile analytics stores is introduced.

Paper 44: Using Android Wear for Avoiding Procrastination Behaviours in MOOCs by Cristóbal Romero, Rebeca Cerezo, Jose Antonio Espino, Manuel Bermudez

Keywords: MOOC; Android Wear; SmartWatches; Notifications; Procrastination.
This paper introduces a new feature that lets instructors communicate with their MOOC learners via smartwatches, as an alternative to traditional e-mails, in order to help curb procrastination. We have developed an Android Wear smartwatch application for receiving notifications from MOOCs, and a dedicated section in the Google Course Builder interface that allows instructors to configure and send messages to each user registered in the course. We evaluated our proposal in an Introduction to Philosophy MOOC. The number and percentage of students who completed assessments on time, together with their comments in a satisfaction questionnaire, present very promising results.

Paper 45: Course Builder Skill Maps by Boris Roussev, Pavel Simakov, John Orr, Amit Deutsch, John Cox, Michael Lenaghan, Mike Gainer

Keywords: Skill Maps; Adaptive Learning; Learning Analytics; MOOCs
In this paper, we present a new set of features introduced in Course Builder that allow instructors to add skill maps to their courses. We show how skill maps can be used to provide up-to-date and actionable information on students' learning behavior and performance.

Paper 46: Predicting Students’ Performance: Incremental Interaction Classifiers by Miguel Sanchez-Santillan, M. Puerto Paule-Ruiz, Rebeca Cerezo, J. Carlos Nuñez

Keywords: Educational Data Mining; Classifiers; eLearning
One of the main aims of Educational Data Mining (EDM) is to predict a student's final performance by analyzing their behavior in Learning Management Systems (LMSs). Many studies pursue this goal with different classifiers, using students' total interaction data. In this work, we study whether more accurate classification models can be built by analyzing the interaction incrementally. We examine data gathered over two years with three kinds of classification algorithms and compare the total-interaction models with the incremental-interaction models.

Paper 47: ASSISTments Dataset from Multiple Randomized Controlled Experiments by Douglas Selent, Thanaporn Patikorn, Neil Heffernan

Keywords: ASSISTments; Randomized Controlled Experiments; Dataset
In this paper, we present a dataset consisting of data generated from 22 previously and currently running randomized controlled experiments inside the ASSISTments online learning platform. This dataset provides data mining opportunities for researchers to analyze ASSISTments data in a convenient format across multiple experiments at the same time. The data preprocessing steps are explained in detail to inform researchers about how this dataset was generated. A list of column descriptions defines the columns in the dataset, and a set of summary statistics briefly describes the dataset.

Paper 48: TAPS: A MOSS Extension for Detecting Software Plagiarism at Scale by Dana Sheahen, David Joyner

Keywords: software plagiarism; MOSS; cheating; academic integrity
Cheating in computer science classes can damage the reputation of institutions and their students. It is therefore essential to routinely authenticate student submissions with available software plagiarism detection algorithms such as Measure of Software Similarity (MOSS). Scaling this task for large classes where assignments are repeated each semester adds complexity and increases the instructor workload. The MOSS Tool for Addressing Plagiarism at Scale (MOSS-TAPS) organizes the MOSS submission task in courses that repeat coding assignments. In a recent use-case in the Online Master of Science in Computer Science (OMSCS) program at the Georgia Institute of Technology, instructor time spent was reduced from 50 hours to only 10 minutes using the managed submission tool design presented here. MOSS-TAPS provides persistent configuration, supports a mixture of software languages and file organizations, and is implemented in pure Java for cross-platform compatibility.

Paper 49: Identifying Student Misunderstandings using Constructed Responses by Kristin Stephens-Martinez, An Ju, Colin Schoen, John DeNero, Armando Fox

Keywords: constructed response questions; semi-automatic misunderstanding detection; introductory computer science; education; massive courses
In contrast to multiple-choice or selected response questions, constructed response questions can result in a wide variety of incorrect responses. However, constructed responses are richer in information. We propose a technique for using each student's constructed responses in order to identify a subset of their stable conceptual misunderstandings. Our approach is designed for courses with so many students that it is infeasible to interpret every distinct wrong answer manually. Instead, we label only the most frequent wrong answers with the misunderstandings that they indicate, then predict the misunderstandings associated with other wrong answers using statistical co-occurrence patterns. This tiered approach leverages a small amount of human labeling effort to seed an automated procedure that identifies misunderstandings in students. Our approach involves much less effort than inspecting all answers, substantially outperforms a baseline that does not take advantage of co-occurrence statistics, proves robust to different course sizes, and generalizes effectively across student cohorts.
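The tiered labeling-plus-propagation idea can be sketched as follows. The data, labels, and answer identifiers are all hypothetical, and this simple co-occurrence count is only a stand-in for the paper's statistical procedure:

```python
from collections import Counter

# Hypothetical data: each student's set of wrong answers across a course.
students = [
    {"a", "x"}, {"a", "x"}, {"a", "y"}, {"b", "y"}, {"b", "y"},
]
# Only the most frequent wrong answers are hand-labeled with the
# misunderstanding they indicate.
labels = {"a": "off-by-one", "b": "sign-error"}

def predict_label(wrong_answer):
    """Assign the misunderstanding whose labeled wrong answers co-occur
    most often with this unlabeled answer across students."""
    co = Counter()
    for answers in students:
        if wrong_answer in answers:
            for other in answers & labels.keys():
                co[labels[other]] += 1
    return co.most_common(1)[0][0] if co else None

assert predict_label("x") == "off-by-one"   # "x" co-occurs with "a"
assert predict_label("y") == "sign-error"   # "y" co-occurs mostly with "b"
```

The point of the design is leverage: a handful of hand labels on frequent answers seeds predictions for the long tail of rare wrong answers.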

Paper 50: Metaphors for Learning and MOOC Pedagogies by Karen Swan, Scott Day, Leonard Bogle

Keywords: MOOCs; Pedagogy; Metaphors for Learning
The research reported in this paper used a researcher-developed tool to categorize the pedagogical approaches used in MOOCs. The Assessing MOOC Pedagogies (AMP) tool characterizes MOOC pedagogical approaches on ten dimensions. Preliminary testing on 20 different MOOCs demonstrated inter-rater reliability of at least 80% and the facility of the measure to distinguish differing pedagogical patterns. The distinguished patterns crossed content areas and seemed to be related to what Sfard (1998) identified as metaphors for learning: acquisition and participation approaches seemed to distinguish the pedagogies of differing MOOCs. A third, arguably important, pattern related to self-direction was also distinguished.

Paper 51: Deep Neural Networks and How They Apply to Sequential Education Data by Steven Tang, Joshua C Peterson, Zachary A Pardos

Keywords: Educational Data Mining; Deep Learning; Long Short-Term Memory
Modern deep neural networks have achieved impressive results in a variety of automated tasks, such as text generation, grammar learning, and speech recognition. This paper discusses how education research might leverage recurrent neural network architectures in two small case studies. Specifically, we train a two-layer Long Short-Term Memory (LSTM) network on two distinct forms of education data: (1) essays written by students in a summative environment, and (2) MOOC clickstream data. Without any features specified beforehand, the network attempts to learn the underlying structure of the input sequences. After training, the model can be used generatively to produce new sequences with the same underlying patterns exhibited by the input distribution. These early explorations demonstrate the potential for applying deep learning techniques to large education data sets.

Paper 52: Understanding ESL Students’ Motivations to Increase MOOC Accessibility by Judith Uchidiuno, Amy Ogan, Evelyn Yarzebinski, Jessica Hammer

Keywords: MOOCs; English as a Second Language; MOOC Accessibility; Learning Motivations.
Massive Open Online Courses (MOOCs) have the potential to bridge education and literacy gaps by offering high quality, free courses to anyone with an Internet connection. MOOCs in their present state, however, may be relatively inaccessible to non-native English speakers, as a majority of MOOC content is in the English language. While a potential solution is to translate all MOOC content into all languages, it is not known whether this solution will satisfy the learning goals of all English as a Second Language (ESL) speakers. Through a series of interviews, we investigate ESL speakers’ motivations for taking MOOCs and other online courses. Our findings show that ESL speakers have a variety of motivations for taking online courses that are not captured in current surveys, which implies that current one-size-fits-all approaches to increasing MOOC accessibility for learners with a first language other than English may not be effective. Rather, offering learners individualized tools based on their motivation and needs may be more effective.

Paper 53: Browser Language Preferences as a Metric for Identifying ESL Speakers in MOOCs by Judith Uchidiuno, Amy Ogan, Kenneth Koedinger, Evelyn Yarzebinski, Jessica Hammer

Keywords: Foreign Language Students; MOOC Accessibility
Open access and low cost make Massively Open Online Courses (MOOCs) an attractive learning platform for students all over the world. However, the majority of MOOCs are deployed in English, which can pose an accessibility problem for students with English as a Second Language (ESL). In order to design appropriate interventions for ESL speakers, it is important to correctly identify these students using a method that is scalable to the high number of MOOC enrollees. Our findings suggest that a new metric, browser language preference, may be better than the commonly-used IP address for inferring whether or not a student is ESL.
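The browser-language metric the paper proposes is the HTTP Accept-Language header. A minimal sketch of how such a signal might be read is below; the parsing and the simple "preferred language is not English" rule are illustrative assumptions, not the paper's exact procedure:

```python
def primary_language(accept_language):
    """Return the most-preferred language tag from an HTTP
    Accept-Language header, honoring q-weights (default q=1.0)."""
    prefs = []
    for i, part in enumerate(accept_language.split(",")):
        fields = part.strip().split(";q=")
        q = float(fields[1]) if len(fields) > 1 else 1.0
        prefs.append((-q, i, fields[0]))   # highest q wins; ties keep order
    return min(prefs)[2]

def likely_esl(accept_language):
    """Flag a learner as likely ESL when the browser's preferred
    language is not English (an illustrative heuristic)."""
    return not primary_language(accept_language).lower().startswith("en")

assert likely_esl("es-MX,es;q=0.9,en;q=0.8")
assert not likely_esl("en-US,en;q=0.9")
```

Unlike IP geolocation, this preference travels with the learner's browser, which is why it can better reflect the language the learner actually reads in.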

Paper 54: Evaluating the 'Student' Experience in MOOCs by Lorenzo Vigentini, Catherine Zhao

Keywords: Student experience; MOOCs; surveys; evaluation.
Whilst most research on MOOCs makes inferences about the experience of learners from their interaction with the platform, few studies have considered the rich feedback provided by learners. This paper presents the application of a conceptual model of student experience borrowed from higher education. Its relevance in the context of MOOCs was tested using a range of questions and presentation methods in four MOOCs selected for their specific features. With varying response rates, results from over 8,900 participants show how universities might view and evaluate the experience in MOOCs compared with that in traditional courses.

Paper 55: Learning about Teaching in Low-Resource Indian Contexts by Aditya Vishwanath, Arkadeep Kumar, Neha Kumar

Keywords: ICTD; HCI; learning; India
Online learning environments are being deployed globally to offer learning opportunities to diverse student communities. We propose the deployment of such an environment in low-resource after-school settings across India. We draw on preliminary research conducted in summer 2015 that leveraged existing ties with an NGO working across 35 after-school classrooms. Our larger goal is to (1) support tutors in curating and distributing learning content to students, (2) engage students in a mobile, networked learning environment where they can share and collaborate, and (3) evaluate the feasibility of online learning environments for low-resource contexts. In this submission, our focus is on the first component.

Paper 56: The Opportunity Count Model: A Flexible Approach to Modeling Student Performance by Yan Wang, Korinn Ostrow, Seth Adjei, Neil Heffernan

Keywords: Opportunity Count; Random Forest; Student Modeling; Next Problem Correctness; Intelligent Tutoring System
Detailed performance data can be exploited to achieve stronger student models when predicting next problem correctness (NPC) within intelligent tutoring systems. However, the availability and importance of these details may differ significantly when considering opportunity count (OC), or the compounded sequence of problems a student experiences within a skill. Inspired by this intuition, the present study introduces the Opportunity Count Model (OCM), a unique approach to student modeling in which separate models are built for differing OCs rather than creating a blanket model that encompasses all OCs. We use Random Forest (RF), which can be used to indicate feature importance, to construct the OCM by considering detailed performance data within tutor log files. Results suggest that OC is significant when modeling student performance and that detailed performance data varies across OCs.

Paper 57: Structured Knowledge Tracing Models for Student Assessment on Coursera by Zhuo Wang, Jile Zhu, Xiang Li, Zhiting Hu, Ming Zhang

Keywords: Knowledge Tracing; MOOCs; Student Assessment; Student Modeling; Hierarchical and Temporal
Massive Open Online Courses (MOOCs) provide an effective learning platform with various high-quality educational materials accessible to learners from all over the world. However, current MOOCs lack personalized learning guidance and intelligent assessment for individuals. Though a few recent attempts have been made to trace students' knowledge states by adapting the popular Bayesian Knowledge Tracing (BKT) model, they have largely ignored the rich structures and correlations among knowledge components (KCs) within a course. This paper proposes to model both the hierarchical and the temporal properties of the knowledge states in order to improve the modeling accuracy. Based on the content organization characteristics on the Coursera MOOC platform, we provide a well-defined KC model, and develop Multi-Grained-BKT and Historical-BKT to capture the above features effectively. Experiments on a Coursera course dataset show our approach significantly improves over previous vanilla BKT models on predicting students' quiz performance.
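The vanilla BKT update that the paper's Multi-Grained and Historical variants build on can be sketched as follows; the slip, guess, and learn parameters here are illustrative defaults, not fitted values from the paper:

```python
def bkt_update(p_know, correct, p_slip=0.1, p_guess=0.2, p_learn=0.15):
    """One step of vanilla Bayesian Knowledge Tracing: Bayes-update the
    mastery probability given the observed response, then apply the
    chance of learning from the opportunity."""
    if correct:
        likelihood = p_know * (1 - p_slip)
        evidence = likelihood + (1 - p_know) * p_guess
    else:
        likelihood = p_know * p_slip
        evidence = likelihood + (1 - p_know) * (1 - p_guess)
    posterior = likelihood / evidence
    return posterior + (1 - posterior) * p_learn

# Trace one student's mastery estimate over a short response sequence.
p = 0.3
for obs in (True, True, False, True):
    p = bkt_update(p, obs)
assert 0.0 < p < 1.0
```

The paper's extensions replace the single knowledge component assumed here with a hierarchy of KCs derived from Coursera's content organization, and condition on response history.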

Paper 58: Modeling Student Scheduling Preferences in a Computer-Based Testing Facility by Matthew West, Craig Zilles

Keywords: student modeling; asynchronous exams; discrete choice theory; capacity planning; computerized testing
When undergraduate students are allowed to choose a time slot in which to take an exam from a large number of options (e.g., 40), the students exhibit strong preferences among the times. We found that students can be effectively modelled using constrained discrete choice theory to quantify these preferences from their observed behavior. The resulting models are suitable for load balancing when scheduling multiple concurrent exams and for capacity planning given a set schedule.
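The discrete-choice machinery behind such preference models is typically a multinomial logit, which can be sketched as below; the slot utilities are hypothetical, and the paper's constrained formulation (slots fill up) is omitted:

```python
import math

def choice_probabilities(utilities):
    """Multinomial logit: P(slot i) = exp(u_i) / sum_j exp(u_j),
    the standard discrete-choice form for quantifying slot preference."""
    m = max(utilities)                        # subtract max for stability
    exps = [math.exp(u - m) for u in utilities]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical utilities for three exam slots (e.g., morning, noon, evening).
probs = choice_probabilities([1.0, 2.0, 0.5])
assert abs(sum(probs) - 1.0) < 1e-9
assert probs[1] == max(probs)   # the highest-utility slot is most popular
```

Fitting the utilities to observed sign-ups is what lets the model forecast demand per slot for load balancing and capacity planning.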

Paper 59: An Investigation of the Effects of Online Test Strategy on Students' Learning Behaviors by Tzu Chi Yang, Dai Ling Shih, Meng Chang Chen

Online tests have been identified as a core learning activity. Unlike conventional online tests, which cannot completely reflect students’ learning status, two-tier tests consider not only students’ answers but also the reasons for those answers. Research into two-tier tests has thus mushroomed, but few studies have examined why the approach is effective. To this end, we conducted an empirical study in which lag sequential analysis was used to analyze behavior patterns. The results indicated that students taking the two-tier test demonstrated different behaviors, developing “breadth to depth” and “depth to breadth” strategies.


Demo 1: Beetle-Grow: An Effective Intelligent Tutoring System to Support Conceptual Change by Elaine Farrow, Johanna D Moore

Keywords: intelligent tutoring; natural language; interaction data; conceptual learning; physics; electronics
We will demonstrate the Beetle-Grow intelligent tutoring system, which combines active experimentation, self-explanation, and formative feedback using natural language interaction. It runs in a standard web browser and has a fresh, engaging design. The underlying back-end system has previously been shown to be highly effective in teaching basic electricity and electronics concepts.

Beetle-Grow has been designed to capture student interaction and indicators of learning in a form suitable for data mining, and to support future work on building tools for interactive tutoring that improve after experiencing interaction with students, as human tutors do.

We are interested in partnering with teachers and other education researchers to carry out large-scale user trials with Beetle-Grow in the classroom and remotely.

Demo 2: Instructor Dashboards In EdX by Colin Fredericks, Glenn Lopez, Victor Shnayder, Saif Rayyan, Daniel Seaton

Keywords: mooc; dashboard; edx; analytics; instructor
Staff from edX, MIT, and Harvard will present two instructor dashboards for edX MOOCs. Current workflows will be described, from parsing and displaying data to using dashboards for course revision. A major focus will be lessons learned in the first two years of deployment.

Demo 3: Elivate: A Real-Time Assistant for Students and Lecturers as Part of an Online CS Education Platform by Suin Kim, Jae Won Kim, Jungkook Park, Alice Oh

Keywords: Online education;online programming;social learning;collaborative learning;computer science education
We present Elice, an online CS (computer science) education platform, and Elivate, a system for (i) taking student learning data from Elice, (ii) inferring their progress through an educational taxonomy tailored for programming education, and (iii) generating real-time assistance for students and lecturers. Online courses suffer from high attrition rates, and early prediction can enable early personalized feedback to motivate and assist students who may be having difficulties. Elice captures detailed student learning activities, including intermediate revisions of code as students make progress toward completing their programming exercises, and timestamps of student logins and submissions. Elivate then uses those data to analyze each student's progress and estimate the time to completion. In doing so, Elivate uses a learning taxonomy and automatic clustering of source code revisions. Using more than 240,000 code revisions generated by 1,000 students, we demonstrate how Elivate processes large-scale student data and generates appropriate real-time feedback for students.

Demo 4: Studying Learning at Scale with the ASSISTments TestBed by Korinn S. Ostrow, Neil T. Heffernan

Keywords: ASSISTments TestBed; Randomized Controlled Experimentation at Scale; Authentic Learning Environments; Assessment of Learning Infrastructure
An interactive demonstration on how to design and implement randomized controlled experiments at scale within the ASSISTments TestBed, a new collaborative for educational research funded by the National Science Foundation (NSF). The Assessment of Learning infrastructure (ALI), a unique data retrieval and analysis tool, is also demonstrated.

Demo 5: A Demonstration of ANALYSE: A Learning Analytics Tool for Open edX by Héctor J. Pijeira Díaz, Javier Santofimia Ruiz, José A. Ruipérez-Valiente, Pedro J. Muñoz-Merino, Carlos Delgado Kloos

Keywords: Learning Analytics; Open edX; Visualizations; MOOCs
Education is being powered by technology in many ways. One of the main advantages is making use of data to improve the learning process. The massive open online course (MOOC) phenomenon went viral some years ago, and with it many different platforms emerged. However, most of them are proprietary solutions (e.g., Coursera, Udacity) and cannot be used by interested stakeholders. At the moment, Open edX stands as the primary open-source application to support MOOCs. The community using Open edX is growing at a fast pace, with many interested institutions. Nevertheless, the learning analytics support of Open edX is still in its first steps. In this paper, we present an overview and demonstration of ANALYSE, an open-source learning analytics tool for Open edX. ANALYSE currently includes 12 new visualizations that can be used by both instructors and students.

Demo 6: Thesis Writer (TW) – Tapping Scale Effects in Academic Writing Instruction by Christian Rapp, Otto Kruse

Keywords: Academic writing; Technology enhanced learning; research-based learning; learning to write
Writing a thesis is a challenging task for students, and no less so for the institutions that instruct and tutor thesis writing in higher education. Annually, within just our departments, 1,000 undergraduates face the task of writing a thesis. Increasing student numbers and stagnating resources pose management problems, as well as constant threats to the quality of instruction. In reaction to this, we started exploring how the instruction and supervision of thesis writers, and the related administrative tasks, could be electronically supported, allowing for scale effects.

A learning environment named Thesis Writer (TW) was developed, and piloted during the fall of 2015. TW supports individual writing and collaboration between writers, peers, tutors, and supervisors. This web-based software runs in common web browsers, independently of the operating system.

In this paper we highlight the core functions of TW and address the uses in which scale effects can be realized. Conference attendees can use and test the system, including real-time collaboration, in either English or German, and discuss the experiences gained and the data collected during the pilot with 300 BA students.

Demo 7: e-Tutor – Scaling Staff Development in the Area of e-Learning Competences by Christian Rapp, Yasemin Gülbahar

Keywords: e-learning; e-competences; faculty development; e-tutoring; train-the-trainer
Faculty development in the area of emerging technologies is demanding and resource intensive. The demands increase when aiming to qualify instructors to support their teaching virtually, e.g. in blended and distance learning environments. Most elements of instructional design, delivery, and assessment require rethinking for technology integration. It is also a challenge to develop a sound instructional design model and corresponding teaching materials for courses aimed at developing the necessary skills and competences among staff. With “e-Tutor”, a corresponding certificate course was developed at Ankara University in Turkey, a country where, due to its geographical size and population, e-learning is now highly popular. Under a project funded by the Swiss National Science Foundation, the course was translated into English, Russian, and Ukrainian, and then made accessible as an Open Educational Resource under a Creative Commons licence. Delivered in Turkish since 2011 with 350 participants, the course has also been successfully conducted with 300 participants from 11 countries in English, and with 320 participants in Ukrainian. This paper will briefly introduce the course design and its resources, before addressing to what extent it allows for scaling effects in staff development.