L@S '18: Proceedings of the Fifth Annual ACM Conference on Learning at Scale


Replicating MOOC predictive models at scale

We present a case study in predictive model replication for student dropout in Massive Open Online Courses (MOOCs) using a large and diverse dataset (133 sessions of 28 unique courses offered by two institutions). This experiment was run on the MOOC Replication Framework (MORF), which makes it feasible to fully replicate complex machine-learned models, from raw data to model evaluation. We provide an overview of the MORF platform architecture and functionality, and demonstrate its use through a case study. In this replication of [41], we contextualize and evaluate the results of the previous work using statistical tests and a more effective model evaluation scheme. We find that only some of the original findings replicate across this larger and more diverse sample of MOOCs, with others replicating significantly in the opposite direction. Our analysis also reveals results highly relevant to the prediction task that were not reported in the original experiment. This work demonstrates the importance of replicating predictive modeling research in MOOCs using large and diverse datasets, illuminates the challenges of doing so, and describes our freely available, open-source software framework for overcoming barriers to replication.

Students, systems, and interactions: synthesizing the first four years of L@S and charting the future

We survey all four years of papers published so far at the Learning at Scale conference in order to reflect on the major research areas that have been investigated and to chart possible directions for future study. We classified all 69 full papers so far into three categories: Systems for Learning at Scale, Interactions with Sociotechnical Systems, and Understanding Online Students. Systems papers presented technologies that varied by how much they amplify human effort (e.g., one-to-one, one-to-many, many-to-many). Interaction papers studied both individual and group interactions with learning technologies. Finally, student-centric study papers focused on modeling knowledge and on promoting global access and equity. We conclude by charting future research directions related to topics such as going beyond the MOOC hype cycle, axes of scale for systems, more immersive course experiences, learning on mobile devices, diversity in student personas, students as co-creators, and fostering better social connections amongst students.

The potential for scientific outreach and learning in mechanical turk experiments

The global reach of online experiments and their wide adoption in fields ranging from political science to computer science poses an underexplored opportunity for learning at scale: the possibility of participants learning about the research to which they contribute data. We conducted three experiments on Amazon's Mechanical Turk to evaluate whether participants of paid online experiments are interested in learning about research, what information they find most interesting, and whether providing them with such information actually leads to learning gains. Our findings show that 40% of our participants on Mechanical Turk actively sought out post-experiment learning opportunities despite having already received their financial compensation. Participants expressed high interest in a range of research topics, including previous research and experimental design. Finally, we find that participants comprehend and accurately recall facts from post-experiment learning opportunities. Our findings suggest that Mechanical Turk can be a valuable platform for learning at scale and scientific outreach.

Toward large-scale learning design: categorizing course designs in service of supporting learning outcomes

This paper applies theory and methodology from the learning design literature to large-scale learning environments through quantitative modeling of the structure and design of Massive Open Online Courses. For two institutions of higher education, we automate the task of encoding pedagogy and learning design principles for 177 courses (which accounted for nearly 4 million enrollments). Course materials from these MOOCs are parsed and abstracted into sequences of components, such as videos and problems. Our key contributions are (i) describing the parsing and abstraction of courses for quantitative analyses, (ii) the automated categorization of similar course designs, and (iii) the identification of key structural components that show relationships between categories and learning design principles. We employ two methods to categorize similar course designs---one aimed at clustering courses using transition probabilities and another using trajectory mining. We then proceed with an exploratory analysis of relationships between our categorization and learning outcomes.
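
As a rough illustration of the transition-probability step the abstract mentions, the sketch below turns one course's component sequence into a transition matrix that could feed a clustering of course designs. This is not the authors' pipeline; the component labels are hypothetical.

```python
# Minimal sketch: estimate transition probabilities between course
# component types from one course's ordered sequence of components.
from collections import Counter, defaultdict

sequence = ["video", "problem", "video", "video", "problem", "discussion"]

counts = defaultdict(Counter)
for a, b in zip(sequence, sequence[1:]):
    counts[a][b] += 1  # count each observed transition a -> b

transitions = {a: {b: n / sum(c.values()) for b, n in c.items()}
               for a, c in counts.items()}
print(transitions)  # e.g., P(problem -> video) = 0.5 for this toy sequence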

Addressing two problems in deep knowledge tracing via prediction-consistent regularization

Knowledge tracing is one of the key research areas for empowering personalized education. It is the task of modeling students' mastery level of a knowledge component (KC) based on their historical learning trajectories. In recent years, a recurrent neural network model called deep knowledge tracing (DKT) has been proposed to handle the knowledge tracing task, and the literature has shown that DKT generally outperforms traditional methods. However, through extensive experimentation, we have noticed two major problems in the DKT model. The first problem is that the model fails to reconstruct the observed input. As a result, even when a student performs well on a KC, the prediction of that KC's mastery level decreases instead, and vice versa. Second, the predicted performance for KCs across time-steps is not consistent. This is undesirable and unreasonable because a student's performance is expected to transition gradually over time. To address these problems, we introduce regularization terms that correspond to reconstruction and waviness into the loss function of the original DKT model to enhance the consistency of its predictions. Experiments show that the regularized loss function effectively alleviates the two problems without degrading the original task of DKT.
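
To make the two regularizers concrete, here is a minimal sketch (not the authors' released code) of how reconstruction and waviness terms can be added to a DKT-style loss in PyTorch. The tensor shapes and the weights lambda_r, lambda_w1, and lambda_w2 are illustrative assumptions.

```python
# Sketch under assumptions: pred, target, mask have shape (batch, time, n_kc);
# pred holds predicted mastery probabilities, target the observed correctness
# (as floats 0/1), and mask is a boolean marking observed interactions.
import torch.nn.functional as F

def dkt_plus_loss(pred, target, mask, lambda_r=0.1, lambda_w1=0.003, lambda_w2=3.0):
    # Original DKT objective: cross-entropy on the *next* observed interaction.
    next_loss = F.binary_cross_entropy(pred[:, :-1][mask[:, 1:]],
                                       target[:, 1:][mask[:, 1:]])
    # Reconstruction term: predictions should also agree with the response
    # observed at the current time-step.
    recon_loss = F.binary_cross_entropy(pred[mask], target[mask])
    # Waviness terms: penalize abrupt step-to-step changes in predictions
    # (L1 and L2 norms of the differences across time-steps).
    diff = pred[:, 1:] - pred[:, :-1]
    return (next_loss + lambda_r * recon_loss
            + lambda_w1 * diff.abs().mean() + lambda_w2 * (diff ** 2).mean())
```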

The effects of adaptive learning in a massive open online course on learners' skill development

We report an experimental implementation of adaptive learning functionality in a self-paced Microsoft MOOC (massive open online course) on edX. In a personalized adaptive system, the learner's progress toward clearly defined goals is continually assessed, the assessment occurs when a student is ready to demonstrate competency, and supporting materials are tailored to the needs of each learner. Despite the promise of adaptive personalized learning, there is a lack of evidence-based instructional design, of transparency in many of the models and algorithms used to provide adaptive technology, and of frameworks for rapid experimentation with different models. ALOSI (Adaptive Learning Open Source Initiative) provides open-source adaptive learning technology and a common framework to measure learning gains and learner behavior. This study explored the effects of two different strategies for adaptive learning and assessment: learners were randomly assigned to three groups. In the first adaptive group, ALOSI prioritized a strategy of remediation, serving learners items on the topics with the least evidence of mastery; in the second adaptive group, ALOSI prioritized a strategy of continuity, in which learners were more likely to be served items on similar topics in sequence until mastery was demonstrated. The control group followed the pathways of the course as set out by the instructional designer, with no adaptive algorithms. We found that the implemented adaptivity in assessment, with its emphasis on remediation, was associated with a substantial increase in learning gains, while producing no large effect on dropout. Further research is needed to confirm these findings and explore additional possible effects and implications for course design.

QG-net: a data-driven question generation model for educational content

The ever-growing amount of educational content renders it increasingly difficult to manually generate sufficient practice or quiz questions to accompany it. This paper introduces QG-Net, a recurrent neural network-based model specifically designed for automatically generating quiz questions from educational content such as textbooks. QG-Net, when trained on a publicly available, general-purpose question/answer dataset and without further fine-tuning, is capable of generating high-quality questions from textbooks, where the content is significantly different from the training data. Indeed, QG-Net outperforms state-of-the-art neural network-based and rule-based systems for question generation, both when evaluated using standard benchmark datasets and when using human evaluators. QG-Net also scales favorably to applications with large amounts of educational content, since its performance improves with the amount of training data.

Towards domain general detection of transactive knowledge building behavior

Support for discussion-based learning at scale benefits from automated analysis of discussion: for enabling effective assignment of students to project teams, for triggering dynamic support of group learning processes, and for assessment of those learning processes. A major limitation of much past work applying machine learning to automated analysis of discussion is the failure of the models to generalize to data outside the parameters of the context in which the training data were collected. This limitation means that a separate training effort must be undertaken for each domain in which the models will be used. This paper focuses on a specific construct of discussion-based learning referred to as Transactivity and provides a novel machine learning approach whose performance exceeds the state of the art both within the domain in which it was trained and in a new domain, suffering no reduction in performance when transferring to the new domain. These results stand as an advance over past work on automated detection of Transactivity and increase the value of trained models for supporting group learning at scale. Implications for practice in at-scale learning environments are discussed.

Docent: transforming personal intuitions to scientific hypotheses through content learning and process training

People's lived experiences provide intuitions about health. Can they transform these personal intuitions into testable hypotheses that could inform both science and their lives? This paper introduces an online learning architecture and provides system principles for people to brainstorm causal scientific theories. We describe the Learn-Train-Ask workflow that guides participants through learning domain-specific content, process training to frame their intuitions as hypotheses, and collaborating with anonymous peers to brainstorm related questions. 344 voluntary online participants from 27 countries created 399 personally-relevant questions about the human microbiome over 4 months, 75 (19%) of which microbiome experts found potentially scientifically novel. Participants with access to process training generated hypotheses of better quality. Access to learning materials improved the questions' microbiome-specific knowledge. These results highlight the promise of performing personally-meaningful scientific work using massive online learning systems.

Supporting answerers with feedback in social Q&A

Prior research has examined the use of Social Question and Answer (Q&A) websites for answer and help seeking. However, the potential for these websites to support domain learning has not yet been realized. Helping users write effective answers can be beneficial for subject area learning for both answerers and the recipients of answers. In this study, we examine the utility of crowdsourced, criteria-based feedback for answerers on a student-centered Q&A website, Brainly.com. In an experiment with 55 users, we compared perceptions of the current rating system against two feedback designs with explicit criteria (Appropriate, Understandable, and Generalizable). Contrary to our hypotheses, answerers disagreed with and rejected the criteria-based feedback. Although the criteria aligned with answerers' goals, and crowdsourced ratings were found to be objectively accurate, the norms and expectations for answers on Brainly conflicted with our design. We conclude with implications for the design of feedback in social Q&A.

Refocusing the lens on engagement in MOOCs

Massive open online courses (MOOCs) continue to see increasing enrollment and adoption by universities, although they are still not fully understood and could perhaps be significantly improved. For example, little is known about the relationships between the ways in which students choose to use MOOCs (e.g., sampling lecture videos, discussing topics with fellow students) and their overall level of engagement with the course, although these relationships are likely key to effective course implementation. In this paper we propose a multilevel definition of student engagement with MOOCs and explore the connections between engagement and students' behaviors across five unique courses. We modeled engagement using ordinal penalized logistic regression with the least absolute shrinkage and selection operator (LASSO), and found several predictors of engagement that were consistent across courses. In particular, we found that discussion activities (e.g., viewing forum posts) were positively related to engagement, whereas other types of student behaviors (e.g., attempting quizzes) were consistently related to less engagement with the course. Finally, we discuss implications of unexpected findings that replicated across courses, future work to explore these implications, and relevance of our findings for MOOC course design.
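
As a rough illustration of the modeling approach, the following sketch fits an L1-penalized ("LASSO") logistic regression to toy behavior counts with scikit-learn. It simplifies the paper's ordinal engagement outcome to a binary one, and the feature names and data are hypothetical.

```python
# Minimal sketch: L1-penalized logistic regression relating behavior counts
# to an engagement label; LASSO shrinks uninformative coefficients to zero.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.poisson(3, size=(500, 3))  # toy counts: forum views, quiz attempts, video plays
y = (X[:, 0] + rng.normal(0, 1, 500) > 3).astype(int)  # toy engagement label

model = LogisticRegression(penalty="l1", solver="liblinear", C=0.5)
model.fit(X, y)
print(dict(zip(["forum_views", "quiz_attempts", "video_plays"], model.coef_[0])))
```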

The relationship between scientific explanations and the proficiencies of content, inquiry, and writing

Examining the interaction between content knowledge, inquiry proficiency, and writing proficiency is central to understanding the relative contribution of each proficiency to students' written communication about their science inquiry. Previous studies, however, have only analyzed one of these primary types of knowledge/proficiencies (i.e., content knowledge, inquiry proficiency, or writing proficiency) at a time. This study investigated the extent to which these proficiencies predicted students' written claims, evidence for their claims, and reasoning linking their claims to the evidence. Results showed that all three types of proficiencies significantly predicted students' claims, but only writing proficiency significantly predicted performance on evidence and reasoning statements. These findings indicate the challenges students face when constructing claim, evidence, and reasoning statements, and can inform scaffolding to address these challenges.

Towards making block-based programming activities adaptive

Block-based environments are today commonly used for introductory programming activities like those that are part of the Hour of Code campaign, which reaches millions of students. These activities typically consist of a static series of problems. Our aim is to make this type of activity more efficient by incorporating adaptive behavior. In this work, we discuss steps towards this goal, specifically the proposal and implementation of a programming game that supports both elementary problems and interesting programming challenges and thus provides an environment for meaningful adaptation. We also discuss methods of adaptivity and the issue of evaluating student performance during problem solving.

Towards adapting to learners at scale: integrating MOOC and intelligent tutoring frameworks

Instruction that adapts to individual learner characteristics is often more effective than instruction that treats all learners as the same. A practical approach to making MOOCs adapt to learners may be to integrate frameworks for intelligent tutoring systems (ITSs). Using the Learning Tools Interoperability (LTI) standard, we integrated two intelligent tutoring frameworks (GIFT and CTAT) into edX. We describe our initial explorations of four adaptive instructional patterns in the PennX MOOC "Big Data and Education." The work illustrates one route to adaptivity at scale.

Combining adaptivity with progression ordering for intelligent tutoring systems

Learning at scale (LAS) systems like Massive Open Online Courses (MOOCs) have hugely expanded access to high-quality educational materials; however, such materials are frequently time- and resource-intensive to create. In this work we propose a new approach for automatically and adaptively sequencing practice activities for a particular learner and explore its application to foreign language learning. We evaluate our system through simulation and are in the process of running an experiment. Our simulation results suggest that such an approach may be significantly better than an expert system when there is high variability in the rate of learning among students and when mastering prerequisites before advancing is important, and is likely to be no worse than an expert system if our generated curriculum approximately describes the necessary structure of learning in students.

Toward a large-scale open learning system for data management

This paper describes ClassDB, a free and open-source system to enable large-scale learning of data management. ClassDB differs from existing solutions in that the same system supports a wide range of data-management topics, from introductory SQL to advanced "native analytics" where code in SQL and non-SQL languages (Python and R) runs inside a database management system. Each student/team maintains their own sandbox, which instructors can read and provide feedback on. Both students and instructors can review activity logs to analyze progress and determine future courses of action. ClassDB is currently in its second pilot and is scheduled for a larger trial later this year. After the trials, ClassDB will be made available to about 4,000 students in the university system, which comprises four universities and 12 community colleges. ClassDB is built in collaboration with students employing modern DevOps processes. Its source code and documentation are available in a public GitHub repository. ClassDB is work in progress.

From online learning to offline action: using MOOCs for job-embedded teacher professional development

Over two iterations of a Massive Open Online Course (MOOC) for school leaders, Launching Innovation in Schools, we developed and tested design elements to support the transfer of online learning into offline action. Effective professional learning is job-embedded: learners should employ new skills and knowledge at work. We aimed to get participants to both plan and actually launch new change efforts, and a subset of our most engaged participants were willing to do so during the course. Assessments, instructor calls to action, and exemplars supported student actions. We found that participants led change initiatives, held stakeholder meetings, collected new data about their contexts, and shared and used course materials collaboratively. Collecting data about participant learning and behavior outside the MOOC environment is essential for researchers and designers looking to create effective online environments for professional learning.

Exploring the utility of response times and wrong answers for adaptive learning

Personalized educational systems adapt their behavior based on student performance. Most student modeling techniques, which are used for guiding the adaptation, utilize only the correctness of students' answers. However, other data about performance are typically available. In this work we focus on response times and wrong answers, as these aspects of performance are available in most systems. We analyze data from several types of exercises and domains (mathematics, spelling, grammar). The results suggest that wrong answers are more informative than response times. Based on our results we propose a classification of student performance into several categories.
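
The abstract does not specify the categories, but a classification of the kind proposed might combine correctness with response time relative to an item's typical solving time, as in this hypothetical sketch (thresholds and category names are assumptions, not the paper's):

```python
# Minimal sketch: categorize one answer using correctness plus response time
# compared against the item's median solving time.
def categorize(correct: bool, response_time: float, median_time: float) -> str:
    fast = response_time < 0.5 * median_time
    slow = response_time > 2.0 * median_time
    if correct:
        return "fast correct" if fast else ("slow correct" if slow else "correct")
    return "quick guess" if fast else "wrong"

print(categorize(True, 4.0, 10.0))   # -> "fast correct"
print(categorize(False, 2.0, 10.0))  # -> "quick guess"
```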

Measuring item similarity in introductory programming

A personalized learning system needs a large pool of items for learners to solve. When working with a large pool of items, it is useful to measure the similarity of items. We outline a general approach to measuring the similarity of items and discuss specific measures for items used in introductory programming. Evaluation of quality of similarity measures is difficult. To this end, we propose an evaluation approach utilizing three levels of abstraction. We illustrate our approach to measuring similarity and provide evaluation using items from three diverse programming environments.
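
As one plausible similarity measure for programming items (an illustration, not one of the specific measures the paper evaluates), normalized edit similarity between canonical solutions can be computed in a few lines:

```python
# Minimal sketch: similarity of two items via their reference solutions,
# using difflib's sequence-matching ratio (1.0 identical, ~0.0 unrelated).
from difflib import SequenceMatcher

def solution_similarity(sol_a: str, sol_b: str) -> float:
    return SequenceMatcher(None, sol_a, sol_b).ratio()

print(solution_similarity("for i in range(10): print(i)",
                          "for j in range(20): print(j)"))
```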

Pilot study on optimal task scheduling in learning

Living in an information era where a wide variety of online learning content is readily available, students often learn with a combination of multiple learning tasks. In this work we explore the possibility of using optimization theory to find the optimal trade-off between the time invested in two different competing learning tasks for each individual student. We show that the problem can be formulated as a linear programming problem, which can be efficiently solved to determine the optimal amount of time for each task. We also report our ongoing attempts to apply this theory to our Facebook Messenger chatbot software, which can optimize the trade-off between learning and self-assessment in the form of multiple-choice questions (MCQs) on the chatbot platform.
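
A minimal sketch of such a linear-programming formulation with SciPy, using made-up utility rates and time limits: maximize total learning utility across the two tasks subject to a session time budget.

```python
# Minimal sketch: allocate session time between two learning tasks.
# linprog minimizes, so utilities are negated. All numbers are illustrative.
from scipy.optimize import linprog

utility = [-3.0, -2.0]        # task 1 yields 3 utility/min, task 2 yields 2/min
A_ub = [[1.0, 1.0]]           # total minutes across both tasks
b_ub = [60.0]                 # 60-minute session budget
bounds = [(0, 40), (10, 60)]  # at most 40 min learning; at least 10 min of MCQs

res = linprog(c=utility, A_ub=A_ub, b_ub=b_ub, bounds=bounds)
print(res.x)  # optimal minutes per task, here [40., 20.]
```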

Longitudinal trends in sentiment polarity and readability of an online masters of computer science course

In four years, the Georgia Tech Online MS in CS (OMSCS) program has grown from 200 students to over 6,000. Despite early evidence of success, there is a need to evaluate the program's effectiveness. In this paper, we focus on trends from Fall 2014 to Fall 2017 in the on-campus and online sections of one OMSCS course, Knowledge-Based Artificial Intelligence (KBAI). We leverage sentiment analysis and readability assessments to quantify the evolving quality of discourse in the online forum discussions of the various sections. The research was conducted as a longitudinal study and aims to evaluate the success of the KBAI course by comparing trends between residential and online sections. Despite slight downward trends in online discourse quality and sentiment polarity, our results suggest that the growing OMSCS program has been successful in replicating the quality of learning experienced by on-campus students in the KBAI course.
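
The abstract does not name its tooling; as an assumed illustration, forum posts could be scored for sentiment polarity with VADER (via nltk) and for readability with the Flesch reading-ease formula (via textstat):

```python
# Minimal sketch: sentiment polarity and readability of one forum post.
# VADER and textstat are assumptions, not necessarily the paper's tools.
import nltk
from nltk.sentiment import SentimentIntensityAnalyzer
import textstat

nltk.download("vader_lexicon", quiet=True)  # one-time lexicon download

post = "The KBAI project was challenging, but the feedback was genuinely helpful."
sia = SentimentIntensityAnalyzer()
print(sia.polarity_scores(post)["compound"])  # polarity in [-1, 1]
print(textstat.flesch_reading_ease(post))     # higher = easier to read
```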

Glanceable code history: visualizing student code for better instructor feedback

Immediate, individualized feedback on their code helps students learning to program. However, even in short, focused exercises in active learning, teachers do not have much time to write feedback. In addition, looking only at a student's final code hides much of the student's learning and discovery process. We created a glanceable code history visualization that enables teachers to view a student's entire coding history quickly and efficiently. A preliminary user study shows that this visualization captures previously unseen information that allows teachers to grade students more accurately and to give longer, better feedback that focuses not just on their final code, but on all their code in between.

Toward large-scale automated scoring of scientific visual models

Visual models of scientific concepts drawn by students afford expanded opportunities for showing their understanding beyond textual descriptions, but also introduce other elements characterized by artistic creativity and complexity. In this paper, we describe a standardized framework for the evaluation of scientific visual models by human raters. This framework attempts to disentangle students' scientific modeling skills from their artistic skills in representing real objects, and potentially provides a fair and valid way to assess understanding of scientific concepts, e.g., the structure and properties of matter. Additionally, we report ongoing efforts to build automated assessment models based on the evaluation framework. Preliminary findings suggest the promise of such an automated approach.

An active viewing framework for video-based learning

Video-based learning is most effective when students are engaged with video content; however, the literature has yet to identify students' viewing behaviors and ground them in theory. This paper addresses this need by introducing a framework of active viewing, which is situated in an established model of active learning to describe students' behaviors while learning from video. We conducted a field study with 460 undergraduates in an Applied Science course using a video player designed for active viewing to evaluate how students engage in passive and active video-based learning. The concept of active viewing, and the role of interactive, constructive, active, and passive behaviors in video-based learning, can be implemented in the design and evaluation of video players.

A content engagement score for online learning platforms

Engagement on online learning platforms is essential for user retention, learning, and performance. However, there is a paucity of research addressing latent engagement measurement using user activities. In this work-in-progress paper, we present a novel engagement score consisting of three sub-dimensions (cognitive, emotional, and behavioral engagement) computed from a comprehensive set of user activities. We plan to evaluate our score on a large-scale online learning platform and compare it with measurements from a survey-based engagement scale from the literature.

Adaptive natural-language targeting for student feedback

In tutoring software, targeting feedback to students' natural-language inputs is a promising avenue for making the software more effective. As a case study, we built such a system using Natural Language Processing (NLP) to provide adaptive feedback to students in an online learning task. We found that the NLP targeting mechanism, relative to more traditional multiple-choice targeting, was able to provide optimal feedback from fewer student interactions and generalize to previously unseen prompts.

Rain classroom: a tool for blended learning with MOOCs

We present our implementation of a software system that helps teachers create preview and review teaching materials before and after class, and that enhances interactions between teachers and students during in-class activities. The system has been widely used in China's colleges and universities since 2016, covering more than 3 million teacher and student users. We plan to demonstrate the tool by presenting how it works in a teaching scenario and offering visitors the opportunity to interact with each other.

Gamifying higher education: enhancing learning with mobile game app

We present a mobile game app (EUR Game) that has been designed to complement teaching and learning in higher education. The mobile game app can be used by teachers to gauge how well students are meeting the learning objectives. Teachers can use the information to provide 'just-in-time' support and adapt their lessons accordingly. For the students, the game app is a study tool that can be used to test their own understanding and monitor their study progress. This, in turn, supports students' self-regulated learning. Gamification elements are also included in the game app to enhance the learning experience. During the demonstration, participants will experience the features of the game app and be engaged in an interactive session to explore the possible ways to use the mobile game app to support teaching and learning.

WPSS: dropout prediction for MOOCs using course progress normalization and subset selection

Existing research on dropout prediction at the multi-MOOC level involves data from many MOOCs. This has generated good results, but there are two potential problems. On one hand, it is inappropriate to select training data by which week students are in, because courses have different durations. On the other hand, using all other existing data can be computationally expensive and inapplicable in practice.

To solve these problems, we propose a model called WPSS (WPercent and Subset Selection), which combines the course-progress normalization parameter wpercent with subset selection. Ten MOOCs offered by The University of Hong Kong are involved, and the experiments are conducted at the multi-MOOC level. The best performance of WPSS is obtained with a neural network when 50% of the training data is selected (average AUC of 0.9334). The average AUC is 0.8833 for the traditional model without wpercent and subset selection on the same dataset.
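
The wpercent idea itself is simple, as this sketch shows: normalizing a student's week by the course duration lets sessions from courses of different lengths be pooled at the same prediction point.

```python
# Minimal sketch of course-progress normalization in the spirit of wpercent.
def wpercent(current_week: int, course_duration_weeks: int) -> float:
    return current_week / course_duration_weeks

# Week 3 of a 6-week MOOC and week 6 of a 12-week MOOC both align at 0.5,
# so both can serve as training data for the same prediction point.
print(wpercent(3, 6), wpercent(6, 12))
```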

Contemporary online course design recommendations to support women's cognitive development

Originally, online higher education was imagined to be a utopia for women because gender would be less salient when learners were not physically co-present and thus gender power issues would be lessened. Unfortunately, gender power is present in online classes, hindering women from experiencing the benefits of an equitable learning environment. One approach to eliminating gender power in online classes is to design courses with gender equity in mind. The existing best practices for designing online courses, however, were not developed for the specific purpose of upending gender power. In this poster, I synthesize the best practices for online course design and pose recommendations informed by feminist pedagogy. As more faculty are encouraged to teach online, an updated set of design recommendations that best supports women online learners is valuable both for practitioners and researchers.

A deep learning model for automatic evaluation of academic engagement

This paper proposes a deep learning model for the automatic evaluation of academic engagement based on video data analysis. A coding system based on the BROMP standard for behavioral, emotional, and cognitive states was defined to code typical videos in an autonomous learning environment. After the key points of human skeletons were extracted from these videos using pose estimation technology, deep learning methods were used to recognize motion and emotions. On this basis, an analysis and evaluation of learners' learning states was accomplished, and a prototype academic engagement evaluation system was established.

Use expert knowledge instead of data: generating hints for hour of code exercises

Within the field of online tutoring systems for learning programming, such as Code.org's Hour of Code, there is a trend toward using previous student data to give hints. This paper shows that it is better to use expert knowledge to provide hints in environments such as Code.org's Hour of Code. We present a heuristic-based approach to generating next-step hints. We use pattern-matching algorithms to identify heuristics and apply each identified heuristic to an input program. We generate a next-step hint by selecting the highest-scoring heuristic using a scoring function. By comparing our results with the results of a previous experiment on Hour of Code, we show that a heuristic-based approach to providing hints gives results that are impossible to improve further. These basic heuristics are sufficient to efficiently mimic experts' next-step hints.

Electronixtutor integrates multiple learning resources to teach electronics on the web

ElectronixTutor is a new Intelligent Tutoring System for electronics that integrates multiple intelligent learning resources, including AutoTutor, Dragoon, LearnForm, ASSISTments, and BEETLE-II, as well as Point & Query hotspots on diagrams and numerous text documents on the subject of electronics. ElectronixTutor's student model contains a set of electronics knowledge components (e.g., "transistor behavior"), each of which is taught by multiple learning resources. ElectronixTutor also features a recommender system, which suggests topics and resources for the student to try based on the student model. ElectronixTutor uses a Moodle interface and is accessible to anyone via a web browser. Currently, ElectronixTutor is being tested by undergraduate electronics students before supplementing Naval Apprentice Technician Training coursework in the fall of 2018.

Exploring the impact of the default option on student engagement and performance in a statistics MOOC

Engagement and motivation are particularly important in optional learning environments, like educational games and massive open online courses. Providing students with some degree of autonomy and choice can yield significant benefits for learner motivation and persistence; yet there is also evidence that unsupported learners may not always choose to allocate their learning time to the pedagogical activities known to be associated with better learning outcomes. We investigated the impact of choice on student engagement and learning in a Massive Open Online Course (MOOC) on introductory statistics and probability. We compared conditions in which students were given free choice over the practice problems completed to conditions in which students received a full set of practice activities or no practice activities before completing a post-test. In all cases students were free to navigate to other sections of the course at any time. In one of the two topic sections that included personalized practice activities, we found that students performed better in the condition in which they were prompted to complete all practice activities. Though more students in this condition dropped out before reaching the post-test, many more students completed the full set of practice activities in this section than in the free-choice condition. These results are still preliminary, but they suggest that an encouraged opt-in default can lead students to do more problems than they would otherwise, and that doing such additional problems can yield learning gains.

The impact of the peer review process evolution on learner performance in e-learning environments

Student performance over the course of an academic program can be significantly and positively influenced by a series of feedback processes from peers and tutors. Ideally, this feedback is structured and incremental, and as a consequence, data accumulate at large scale even in relatively small classes. In this paper, we investigate the effect of such processes as we analyze assessment data collected from online courses. We plan to fully analyze the massive dataset of over three and a half million granular data points generated, to make the case for the scalability of these kinds of learning analytics. This could shed crucial light on assessment mechanisms in MOOCs as we continue to refine our processes in an effort to balance emphasis on formative in addition to summative assessment.

Multimedia learning principles at scale predict quiz performance

Empirically supported multimedia learning (MML) principles [1] suggest effective ways to design instruction, generally for elements on the order of a graphic or an activity. We examined whether the positive impact of MML could be detected in larger instructional units from a MOOC. We coded instructional design (ID) features corresponding to MML principles, mapped quiz items to these features and their use by MOOC participants, and attempted to predict quiz performance. We found that instructional features related to MML, namely practice problems with high-quality examples and text that is concisely written, were positively predictive. We argue it is possible to predict quiz item performance from features of the instructional materials and suggest ways to extend this method to additional aspects of the ID.

Virtualizing face-2-face trainings for training senior professionals: a comparative case study on financial auditors

Traditionally, professional learning for senior professionals is organized around face-2-face trainings. Virtual trainings seem to offer an opportunity to reduce costs related to travel and travel time. In this paper we present a comparative case study that investigates the differences between traditional face-2-face trainings in physical reality and virtual trainings via WebEx. Our goal is to identify how the mode of communication affects interaction between trainees and between trainees and trainers, and how it affects interruptions. We present qualitative results from observations and interviews across three cases in different setups (traditional classroom, web-based with all participants co-located, and web-based with all participants at different locations), with 25 training participants and three trainers in total. The study is set within one of the Big Four global auditing companies, with advanced senior auditors as the learning cohort.

XIPIt: updating the XIP dashboard to support educators in essay marking in higher education

Effective written communication is an essential skill that promotes educational success for undergraduates. However, undergraduate students, especially those in their first year at university, are unused to this form of writing. After their long experience with the schoolroom essay, academic writing development is painstakingly slow for most undergraduates. Thus students, especially those with poor writing abilities, should write more in order to become better writers. Yet the biggest impediment to more writing is that overburdened tutors can ask for only a limited number of drafts from their students. Today, there exist powerful computational language technologies that can evaluate student writing, saving time and providing timely, reliable feedback that supports educators' marking process. This paper motivates an updated visual analytics dashboard, XIPIt, which introduces a set of visual and writing analytics features embedded in a marking environment built on XIP output.

Accelerated apprenticeship: teaching data science problem solving skills at scale

It often takes years of hands-on practice for a data scientist to build the operational problem-solving skills needed to competently tackle real-world problems. In this research, we explore a new scalable technology-enhanced learning (TEL) platform that accelerates the apprenticeship process via a repository of caselets: small but focused case studies with scaffolding questions and feedback. In this paper, we report the rationale behind the design, the caselet authoring process, and the planned experiment with cohorts of students who will use caselets while taking graduate-level data science courses.

An automatic knowledge graph construction system for K-12 education

Motivated by the pressing need for knowledge graphs in educational applications, we develop a system, called K12EduKG, to automatically construct knowledge graphs for K-12 educational subjects. Leveraging heterogeneous domain-specific educational data, K12EduKG extracts educational concepts and identifies implicit relations of high educational significance. More specifically, it adopts named entity recognition (NER) techniques on educational data such as curriculum standards to extract educational concepts, and employs data mining techniques to identify cognitive prerequisite relations between these concepts. In this paper, we present the details of K12EduKG and demonstrate it with a knowledge graph constructed for the subject of mathematics.
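
A heavily simplified sketch of the two stages (concept extraction, then candidate prerequisite edges) follows; the spaCy model and the naive ordering heuristic are illustrative assumptions, not K12EduKG's actual methods.

```python
# Minimal sketch: extract concept mentions from curriculum text, then propose
# prerequisite edges by mention order. Requires: pip install spacy &&
# python -m spacy download en_core_web_sm
import spacy
from itertools import combinations

nlp = spacy.load("en_core_web_sm")
doc = nlp("Students master fractions before they study ratios and proportions.")

concepts = [chunk.text for chunk in doc.noun_chunks]
# Naive heuristic: a concept mentioned earlier in the sentence is a candidate
# prerequisite of one mentioned later (pairs come out in document order).
edges = [(a, b) for a, b in combinations(concepts, 2)]
print(concepts, edges)
```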

Who downloads online content and why?

Online learners sometimes prefer to download course content rather than view it on a course website. These students often miss out on interactive content. Knowing who downloads course materials, and why, can help course creators design courses that fit the needs of their students. In this paper we explore downloading behavior by looking at lecture videos in three online classes. We found that the number of days since a video was posted had the strongest relationship with downloading, and that non-technical considerations, such as typical classroom size in a student's home country, matter more than technical issues, such as internet speed. Our findings suggest that more materials will be downloaded when a course is available for a limited time, when students are less familiar with the language of instruction, when students are used to classrooms with a high student-teacher ratio, or when a student's internet speed is slow. Possible reasons for these relationships are discussed.

Student trust in e-authentication

Trust is a fundamental prerequisite for the success of any technological development, especially in education. Without the trust of stakeholders, educational technologies (however effective they might be) can fail to be taken up at scale. This work-in-progress paper reports on a study that investigated the trust students place in tools developed to support e-authentication for online assessments in Higher Education. The study, part of the EU-funded TeSLA project (http://tesla-project.eu), involved almost 500 students from the Open University, UK. Students were asked their views on trust and other issues before and after they used a tool developed to authenticate student identity in online assessments. A key finding is that, after using the tools, participants marginally increased their trust in online assessments, with the majority also reporting that they trusted how the institution would use the outcomes.

Information overload and online collaborative learning: insights from agent-based modeling

This paper investigates information overload (IO) in large online courses by developing an agent-based model (ABM) of student interaction in a computer-supported collaborative learning (CSCL) environment. Student surveys provided the ABM parameters, and experimental results suggest that unique visitor count is a superior metric to user activity level for IO detection. ABM of synchronous/asynchronous platforms demonstrates how additional channels can be introduced to effectively combat IO. As work in progress, we look forward to validating the model's recommendations with activity data from online classrooms.

Representing and predicting student navigational pathways in online college courses

Representation and prediction of student navigational pathways, typically based on neural network (NN) methods, have shown potential for improving instruction and learning where human knowledge about learner behavior is insufficient. However, they are prominently studied in MOOCs and less often probed within more institutionalized higher education settings. This work extends such research to the context of online college courses. Likening student navigational sequences through course pages to documents in natural language processing, we apply a skip-gram model to learn vector embeddings of course pages, and visualize the learned vectors to understand the extent to which students' learning pathways align with the pre-designed course structure. We find that students who earn different letter grades exhibit different levels of adherence to the designed sequence. Next, we feed the embedded sequences into a long short-term memory (LSTM) architecture and test its ability to predict the next page that a student visits given her prior sequence. The highest accuracy reaches 50.8%, largely outperforming the frequency-based baseline of 41.3%. These results show that neural network methods have the potential to help instructors understand students' learning behaviors and facilitate automated instructional support.
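
The skip-gram step can be sketched with gensim's Word2Vec, treating each student's page sequence as a sentence; the page IDs below are hypothetical, and the LSTM prediction step is omitted.

```python
# Minimal sketch: skip-gram (sg=1) embeddings of course pages learned from
# student navigation sequences.
from gensim.models import Word2Vec

# Each "sentence" is one student's ordered sequence of visited course pages.
sequences = [
    ["syllabus", "week1_video", "week1_quiz", "forum"],
    ["syllabus", "week1_video", "forum", "week1_quiz"],
    ["week1_video", "week1_quiz", "week2_video"],
]

model = Word2Vec(sentences=sequences, vector_size=16, window=2, sg=1, min_count=1)
print(model.wv.most_similar("week1_video", topn=2))
```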

WebLinux: a scalable in-browser and client-side Linux and IDE

"WebLinux" is a web app tool providing a standard Linux OS and an IDE in the browser, including a terminal, a code editor and a file browser. It provides a client-side and offline Linux OS environment based on a Javascript emulated processor. By avoiding the use of a Virtual Machine or any Linux server, Weblinux enables learners to directly start experimenting with the Linux OS without installing any software. The tool is entirely client-side which makes it extremely scalable and easy to deploy within a large community of online learners.

Designing a learning analytics dashboard for twitter-facilitated teaching

Social media sites are increasingly being adopted to support teaching practice in higher education. Learning Analytics (LA) dashboards can be used to reveal how students engage with course material and with others in the class. However, research on best practices for designing, developing, and evaluating such dashboards to support teaching and learning with social media has been limited. Considering the increasing use of Twitter for both formal and informal learning, this paper presents our design process and an LA dashboard prototype developed based on a comprehensive literature review and an online survey of 54 higher education instructors who have used Twitter in their teaching.

Team based assignments in MOOCs: results and observations

Teamwork and collaborative learning are considered superior to learning individually by many instructors and didactical theories. Particularly in the context of e-learning and Massive Open Online Courses (MOOCs), we see great benefits but also great challenges for both learners and instructors. We discuss our experience with six team-based assignments on the openHPI and openSAP MOOC platforms.

Managing and analyzing student learning data: a python-based solution for edX

Online learning platforms, such as edX, generate usage statistics data that can be valuable to educators. However, handling this raw data can prove challenging and time consuming for instructors and course designers. The raw data for the MIT courses running on the edX platform (MITx courses) are pre-processed and stored in a Google BigQuery database. We designed a tool based on Python and additional open-source Python packages such as Jupyter Notebook, to enable instructors to analyze their student data easily and securely. We expect that instructors would be encouraged to adopt more evidence-based teaching practices based on their interaction with the data.

The phoenix corps: a graphic novel for scalable online learning research

This paper describes the demonstration of The Phoenix Corps, the first graphic novel designed specifically for online learning research. While online learning environments regularly use textbooks and videos, graphic novels have not been as popular for research and instruction. This is mainly due to the extremely cumbersome and complicated methods required to edit traditionally made graphic novels in order to update instructional content or create alternative versions for A/B testing. In this demonstration, attendees will be able to read through, edit, and analyze data from a live online version of The Phoenix Corps.

Grading at scale in earsketch

This paper explores some of the challenges posed by automated grading of programming assignments in a STEAM (Science, Technology, Engineering, Art, and Math) based curriculum, as well as how these challenges are addressed in the automatic grading processes used in EarSketch, a music-based educational programming environment developed at Georgia Tech. This work-in-progress paper reviews common strategies for grading programming assignments at scale and discusses how they are combined in EarSketch to evaluate open ended STEAM-focused assignments.

Transformative approaches in distance online education: aligning evidence to influence the design of teaching at scale

In this paper we consider the role of sharing evidence online in ongoing work to develop a new teaching framework for distance and part-time students at The Open University. The work reported here looks at the motivation for applying evidence and how evidence can support the development of the framework, rather than at the framework itself. The approach described is adapted from previous research projects and focuses on how evidence from internal and external scholarship is gathered and refined through an Evidence Hub that is shared online and open to all in the University. One aspect of the framework (offering greater continuity of study) is selected to show how the methodology applies in practice. In conclusion, we highlight the value of adopting evidence-based approaches to support change processes and of sharing collective knowledge to influence decision-making.

Classifying and visualizing students' cognitive engagement in course readings

Reading material has been part of course teaching for centuries, but until recently students' engagement with that reading, and its effect on their learning, has been difficult for teachers to assess. In this article, we explore the idea of examining cognitive engagement---a measure of how deeply a student is thinking about course material, which has been shown to correlate with learning gains---as it varies over different sections of the course reading material. We show that a combination of automatic classification and visualization of cognitive engagement anchored in the text can give teachers---and not only researchers---valuable insight into their students' thinking, suggesting ways to modify their lectures and their course readings to improve learning. We demonstrate this approach by analyzing students' comments in two different courses (Physics and Biology) using the Nota Bene annotation platform.

Squeezing the limeade: policies and workflows for scalable online degrees

In recent years, non-credit options for learning at scale have outpaced for-credit options. To scale for-credit options, workflows and policies must be devised to preserve the characteristics of accredited higher education---such as the presumption of human evaluation and an assertion of academic integrity---despite increased scale. These efforts must follow as well with shifting from offering isolated courses (or informal collections thereof) to offering full degree programs with additional administrative elements. We see this shift as one from Massive Open Online Courses (MOOCs) to Large, Internet-Mediated Asynchronous Degrees (Limeades). In this work, we perform a qualitative research study on one such program that has scaled to 6,500 students while retaining full accreditation. We report a typology of policies and workflows employed by the individual classes to deliver this experience.

How do professors format exams?: an analysis of question variety at scale

This study analyzes the use of paper exams in college-level STEM courses. It leverages a unique dataset of nearly 1,800 exams, which were scanned into a web application, then processed by a team of annotators to yield a detailed snapshot of the way instructors currently structure exams. The focus of the investigation is on the variety of question formats, and how they are applied across different course topics.

The analysis divides questions according to seven top-level categories, finding significant differences among these in terms of positioning, use across subjects, and student performance. The analysis also reveals a strong tendency within the collection for instructors to order questions from easier to harder.

A linear mixed effects model is used to estimate the reliability of different question types. Long writing questions stand out for their high reliability, while binary and multiple-choice questions have low reliability. The model suggests that more than three multiple-choice questions, or more than five binary questions, are required to attain the same reliability as a single long writing question.
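
Item-count comparisons of this kind follow from the Spearman-Brown prophecy formula; the sketch below shows the calculation with placeholder reliability values rather than the paper's estimates.

```python
# Minimal sketch: how many items of one type match the reliability of a
# single long-writing question, via Spearman-Brown.
def spearman_brown(r_single: float, n: float) -> float:
    # Reliability of a test lengthened to n parallel items.
    return n * r_single / (1 + (n - 1) * r_single)

def items_needed(r_item: float, r_target: float) -> float:
    # Solve spearman_brown(r_item, n) = r_target for n.
    return r_target * (1 - r_item) / (r_item * (1 - r_target))

# Placeholder values: items with reliability 0.30 matching a target of 0.60.
print(items_needed(r_item=0.30, r_target=0.60))  # -> 3.5 items
```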

A correlation analysis across the seven response types finds that correlations between student abilities on different question types exceed 70 percent for all pairs, although binary and multiple-choice questions stand out for having unusually low correlations with all other question types.

OARS: exploring instructor analytics for online learning

Learning analytics systems have the potential to bring enormous value to online education. Unfortunately, many instructors and platforms do not adequately leverage learning analytics in their courses today. In this paper, we report on the value of these systems from the perspective of course instructors. We study these ideas through OARS, a modular and real-time learning analytics system that we deployed across more than ten online courses with tens of thousands of learners. We leverage this system as a starting point for semi-structured interviews with a diverse set of instructors. Our study suggests new design goals for learning analytics systems, the importance of real-time analytics to many instructors, and the value of flexibility in data selection and aggregation for an instructor when working with an analytics system.

The potential of interdisciplinarity in MOOC research: how do education and computer science intersect?

Given that both computer scientists and educational researchers publish on the topic of massive open online courses (MOOCs), the research community should analyze how these disciplines approach the same topic. In order to promote productive dialogue within the community, we report on a bibliometric study of the growing MOOC literature and examine the potential interdisciplinarity of this research space. Drawing from 3,380 bibliographic items retrieved from Scopus, we conducted descriptive analyses of publication years, publication sources, disciplinary categories of publication sources, frequent keywords, leading authors, and cited references. We applied bibliographic coupling and network analysis to further investigate clusters of research topics in the MOOC literature. We found balanced representation of education and computer science within most topic clusters. However, integration could be further improved, for example, by enhancing communication between the disciplines and broadening the scope of methods in specific studies.

Codemotion: expanding the design space of learner interactions with computer programming tutorial videos

Love them or hate them, videos are a pervasive format for delivering online education at scale. They are especially popular for computer programming tutorials since videos convey expert narration alongside the dynamic effects of editing and running code. However, these screencast videos simply consist of raw pixels, so there is no way to interact with the code embedded inside of them. To expand the design space of learner interactions with programming videos, we developed Codemotion, a computer vision algorithm that automatically extracts source code and dynamic edits from existing videos. Codemotion segments a video into regions that likely contain code, performs OCR on those segments, recognizes source code, and merges together related code edits into contiguous intervals. We used Codemotion to build a novel video player and then elicited interaction design ideas from potential users by running an elicitation study with 10 students followed by four participatory design workshops with 12 additional students. Participants collectively generated ideas for 28 kinds of interactions such as inline code editing, code-based skimming, pop-up video search, and in-video coding exercises.
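
The OCR stage of such a pipeline can be sketched as follows; this is far simpler than Codemotion itself, and the frame file and crop coordinates are hypothetical stand-ins for its learned code-region segmentation.

```python
# Minimal sketch: extract text from a cropped code region of one video frame.
# Assumes opencv-python and pytesseract (with the Tesseract binary) installed.
import cv2
import pytesseract

frame = cv2.imread("frame_0042.png")   # one frame from the screencast (hypothetical file)
code_region = frame[100:600, 50:900]   # placeholder for a detected code area
gray = cv2.cvtColor(code_region, cv2.COLOR_BGR2GRAY)  # grayscale helps OCR
text = pytesseract.image_to_string(gray)
print(text)
```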

Elicast: embedding interactive exercises in instructional programming screencasts

In programming education, instructors often supplement lectures with active learning experiences by offering programming lab sessions where learners themselves practice writing code. However, widely accessed instructional programming screencasts are not equipped with an assessment format that encourages such hands-on programming activities. We introduce Elicast, a screencast tool for recording and viewing programming lectures with embedded programming exercises, to provide hands-on programming experiences within the screencast. In Elicast, instructors embed multiple programming exercises while creating a screencast, and learners engage in the exercises by writing code within the screencast, receiving auto-graded results immediately. We conducted an exploratory study of Elicast with five experienced instructors and 63 undergraduate students. We found that instructors structured the lectures into small learning units using embedded exercises as checkpoints. Also, learners engaged more actively in the screencast lectures, checked their understanding of the content through the embedded exercises, and more frequently modified and executed the code during the lectures.

Toward CS1 at scale: building and testing a MOOC-for-credit candidate

If a MOOC is to qualify for equal credit as an existing on-campus offering, students must achieve comparable outcomes, both educational and attitudinal. We have built a MOOC for teaching CS1 with the intent of offering it for degree credit. To test its eligibility for credit, we delivered it as an online for-credit course for two semesters to 197 on-campus students who selected the online version rather than a traditional version. We compared the demographics, outcomes, and experiences of these students to the 715 students in the traditional version. We found the online students more likely to be older; to be underrepresented minorities; and to have previously failed a CS class. We then found that our online students attained comparable learning outcomes to students in the traditional section. Finally, we found that our online students perceived the online course quality more positively and required less time to achieve those comparable learning outcomes.

Effects of automated interventions in programming assignments: evidence from a field experiment

A typical problem in MOOCs is that course conductors have no opportunity to individually support students in overcoming their problems and misconceptions. This paper presents the results of automatically intervening when students struggle during programming exercises, offering peer feedback and tailored bonus exercises. To improve learning success, we do not want to abolish instructionally desirable trial and error, but rather reduce extensive struggle and demotivation. Therefore, we developed adaptive, automatic, just-in-time interventions that encourage students to ask for help if they require considerably more than the average working time to solve an exercise. Additionally, we offered students bonus exercises tailored to their individual weaknesses. The approach was evaluated within a live course with over 5,000 active students via a survey and metrics gathered alongside it. Results show that we can increase calls for help by up to 66% and lower the dwell time before action is taken. Lessons from the experiments can further be used to pinpoint course material for improvement and to tailor content to specific audiences.
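
A just-in-time trigger of the kind described can be as simple as comparing a student's working time against the exercise average; the threshold multiplier here is a hypothetical parameter, not the course's actual setting.

```python
# Minimal sketch: trigger an intervention once working time far exceeds
# the average working time for the exercise.
def should_intervene(working_seconds: float, avg_seconds: float, factor: float = 2.0) -> bool:
    return working_seconds > factor * avg_seconds

if should_intervene(working_seconds=900, avg_seconds=300):
    print("Offer a peer help request and a tailored bonus exercise.")
```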

The unbearable lightness of consent: mapping MOOC providers' response to consent

While many strategies for protecting personal privacy have relied on regulatory frameworks, consent and anonymizing data, such approaches are not always effective. Frameworks and Terms and Conditions often lag user behaviour and advances in technology and software; consent can be provisional and fragile; and the anonymization of data may impede personalized learning. This paper reports on a dialogical multi-case study methodology of four Massive Open Online Course (MOOC) providers from different geopolitical and regulatory contexts. It explores how the providers (1) define 'personal data' and whether they acknowledge a category of 'special' or 'sensitive' data; (2) address the issue and scope of student consent (and define that scope); and (3) use student data in order to inform pedagogy and/or adapt the learning experience to personalise the context or to increase student retention and success rates.

This study found that large amounts of personal data continue to be collected for purposes seemingly unrelated to the delivery and support of courses. The capacity for users to withdraw or withhold consent for the collection of certain categories of data such as sensitive personal data remains severely constrained. This paper proposes that user consent at the time of registration should be reconsidered, and that there is a particular need for consent when sensitive personal data are used to personalize learning, or for purposes outside the original intention of obtaining consent.

How much randomization is needed to deter collaborative cheating on asynchronous exams?

This paper investigates randomization on asynchronous exams as a defense against collaborative cheating. Asynchronous exams are those for which students take the exam at different times, potentially across a multi-day exam period. Collaborative cheating occurs when one student (the information producer) takes the exam early and passes information about the exam to other students (the information consumers) that are taking the exam later. Using a dataset of computerized exam and homework problems in a single course with 425 students, we identified 5.5% of students (on average) as information consumers by their disproportionate studying of problems that were on the exam. These information consumers ("cheaters") had a significant advantage (13 percentage points on average) when every student was given the same exam problem (even when the parameters are randomized for each student), but that advantage dropped to almost negligible levels (2--3 percentage points) when students were given a random problem from a pool of two or four problems. We conclude that randomization with pools of four (or even three) problems, which also contain randomized parameters, is an effective mitigation for collaborative cheating. Our analysis suggests that this mitigation is in part explained by cheating students having less complete information about larger pools.
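
The mitigation's core intuition, that a leaked problem only helps a consumer who happens to draw the same problem from the pool, can be checked with a small simulation. The 13-point boost comes from the advantage reported above; everything else is illustrative.

```python
# Minimal sketch: expected score advantage of an information consumer as a
# function of pool size, assuming the leak helps only on an exact match.
import random

def expected_advantage(pool_size: int, leak_boost: float = 0.13,
                       trials: int = 100_000) -> float:
    gain = 0.0
    for _ in range(trials):
        producer_problem = random.randrange(pool_size)  # problem that leaked
        consumer_problem = random.randrange(pool_size)  # problem served later
        if consumer_problem == producer_problem:
            gain += leak_boost
    return gain / trials

for k in (1, 2, 4):
    print(k, round(expected_advantage(k), 3))  # advantage shrinks roughly as 1/k
```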

How a data-driven course planning tool affects college students' GPA: evidence from two field experiments

College students rely on increasingly data-rich environments when making learning-relevant decisions about the courses they take and their expected time commitments. However, we know little about how their exposure to such data may influence student course choice, effort regulation, and performance. We conducted a large-scale field experiment in which all the undergraduates at a large, selective university were randomized to an encouragement to use a course-planning web application that integrates information from official transcripts from the past fifteen years with detailed end-of-course evaluation surveys. We found that use of the platform lowered students' GPA by 0.28 standard deviations on average. In a subsequent field experiment, we varied access to information about course grades and time commitment on the platform and found that access to grade information in particular lowered students' overall GPA. Our exploratory analysis suggests these effects are not due to changes in the portfolio of courses that students choose, but rather by changes to their behavior within courses.