L@S '19 - Proceedings of the Sixth (2019) ACM Conference on Learning @ Scale


SESSION: Broader Issues

Start of a Science: An Epistemological Analysis of Learning at Scale

The Learning at Scale (L@S) conference has brought together researchers from diverse scholarly communities to design and study technologies that are explicitly meant to scale to a large number and variety of learners. Over the last three years, the L@S community has published thematic, methodological, and bibliometric analyses to reflect on its own interests, challenges, and foundations. This paper continues that wider reflection effort and complements these prior analyses with an epistemological analysis of the way L@S papers employ learning theory, evaluate evidence, and deploy statistical models. The epistemological analysis uses two methodologies: coding the full papers from the first four years for epistemological markers of interest, and analyzing the network of citations from all of the full papers for dominant institutional and epistemological traditions. By combining these two methods, the present analysis reveals that most papers explicitly state their theoretical commitments, target a narrow slice of available learning theory, draw on varied academic fields in different proportions, and showcase epistemological practices in line with what philosophers of computational science observe in communities using similar model-based methods. The paper then situates these claims in wider conversations occurring in the learning sciences and philosophy of science to provide theoretical insights as well as practical recommendations for how the community can more consciously conduct and communicate its scientific endeavor.

Can a diversity statement increase diversity in MOOCs?

Although anyone can sign up for open online courses, their enrollment patterns reflect the historical underrepresentation of certain sociodemographic groups (e.g., women in STEM disciplines). We theorize that enrollment choices online are shaped by contextual cues that activate stereotypes about numeric representation and climate in brick-and-mortar institutions. A longitudinal matched-pairs experiment with 14 MOOCs (N=29,000) tested this theory by manipulating the presence of a diversity statement on course pages and measuring effects on who enrolls. We found a 3% increase in the proportion of students with lower socioeconomic status, with the effect size varying across courses between -0.5 and 7 percentage points. No significant changes in enrollment patterns by gender, age, or national development level occurred. Implications for the use and content of diversity statements and their alternatives are discussed.

Multiplatform MOOC Analytics: Comparing Global and Regional Patterns in edX and Edraak

While global massive open online course (MOOC) providers such as edX, Coursera, and FutureLearn have garnered the bulk of attention from researchers and the popular press, MOOCs are also provisioned by a number of regional providers, many of which use the Open edX platform. We leverage the data infrastructure shared by the main edX instance and one regional Open edX provider, Edraak in Jordan, to compare the experience of learners from Arab countries on both platforms. Compared with learners from Arab countries on edX, the Edraak population has a more even gender balance, more learners with lower education levels, greater participation from less-developed countries, higher levels of persistence and completion, and a larger total population of learners. This "apples to apples" comparison of MOOC learners is facilitated by an approach to multiplatform MOOC analytics that employs parallel research processes to create joint aggregate datasets without sharing identifiable data across institutions. Our findings suggest that greater research attention should be paid to regional MOOC providers, and that regional providers may have an important role to play in expanding access to higher education.

On the Acceptance and Usefulness of Personalized Learning Objectives in MOOCs

Massive Open Online Courses (MOOCs) have rapidly increased the number of people with access to higher education. The intentions behind enrolling in a specific course vary significantly and depend on one's professional or personal learning needs and interests. What all learners have in common is that they pursue their individual learning objectives. However, predominant MOOC platforms follow a one-size-fits-all approach and primarily aim for completion with certification. Technical support for goal-oriented and self-regulated learning is to date very limited in this context, although both learning strategies are proven key factors for students' achievement in large-scale online learning environments. In this first investigation, a concept for the application and technical integration of personalized learning objectives in a MOOC platform is realized and assessed with a mixed-methods approach. First, learners' acceptance is examined with a multivariate A/B test in two courses. Second, a survey was conducted to gather further feedback about perceived usefulness in addition to acceptance. The results show a positive perception by the learners, which paves the way for future research.

SESSION: Engagement

Graded Team Assignments in MOOCs: Effects of Team Composition and Further Factors on Team Dropout Rates and Performance

The ability to work in teams is an important skill in today's work environments. In MOOCs, however, teamwork, team tasks, and graded team-based assignments play only a marginal role. To close this gap, we have been exploring ways to integrate graded team-based assignments into MOOCs. Among the goals of our work are determining simple criteria to match teams in a volatile environment and enabling frictionless online collaboration for participants within our MOOC platform. The high dropout rates in MOOCs pose particular challenges for teamwork in this context. To date, we have conducted 15 MOOCs containing graded team-based assignments on a variety of topics. This paper presents a study that aims to establish a solid understanding of the participants in the team tasks. Furthermore, we attempt to determine which team compositions are particularly successful. Finally, we examine how several modifications to our platform's collaborative toolset have affected the dropout rates and performance of the teams.

How Learners Engage with In-Context Retrieval Exercises in Online Informational Videos

Learners increasingly refer to online videos for learning new technical concepts, but often overlook or forget key details. We investigated how retrieval practice, a learning strategy commonly used in education, could be designed to reinforce key concepts in online videos. We began with a formative study to understand users' perceptions of cued and free-recall retrieval techniques. We then designed a new in-context flashcard-based technique that provides expert-curated retrieval exercises in the context of a video's playback. We evaluated this technique with 14 learners and investigated how learners engage with flashcards that are prompted automatically at predefined intervals or that appear on demand. Overall, our results showed that learners perceived automatically prompted flashcards to be less effortful and to make them feel more confident about grasping key concepts in the video. However, learners found that on-demand flashcards gave them more control over their learning and allowed them to personalize their review of content. We discuss the implications of these findings for designing hybrid automatic and on-demand in-context retrieval exercises for online videos.

Instructors Desire Student Activity, Literacy, and Video Quality Analytics to Improve Video-based Blended Courses

While video is becoming increasingly prevalent in educational settings, current research has yet to investigate what feedback instructors need regarding their students' engagement and learning, despite video technologies being equipped to provide viewing analytics and collect student feedback. In this paper we investigate instructors' requirements for video analytics. Using a grounded theory approach, we interviewed 16 instructors who teach using video to determine the advantages of using video in their teaching and their requirements for analytics and feedback in their existing practice. Based on our analysis of the interviews, we identified three categories of information that instructors want to inform their teaching: whether their students have watched their videos, how much they understood in those videos, and how useful the videos are to the students. These categories provide the foundations and design implications for instructor-centric educational video analytics interfaces.

Growth Mindset Predicts Student Achievement and Behavior in Mobile Learning

Students' personal qualities other than cognitive ability are known to influence persistence and achievement in formal learning environments, but the extent of their influence in digital learning environments is unclear. This research investigates non-cognitive factors in mobile learning in a resource-poor context. We surveyed 1,000 Kenyan high school students who use a popular SMS-based learning platform that provides formative assessments aligned with the national curriculum. Combining survey responses with platform interaction logs, we find growth mindset to be one of the strongest predictors of assessment scores. We investigate theory-based behavioral mechanisms to explain this relationship. Although students who hold a growth mindset are not more likely to persist after facing adversity, they spend more time on each assessment, increasing their likelihood of answering correctly. Results suggest that cultivating a growth mindset can motivate students in a resource-poor context to excel in a mobile learning environment.

SESSION: Domain Specific

Improv: Teaching Programming at Scale via Live Coding

Computer programming instructors frequently perform live coding in settings ranging from MOOC lecture videos to online livestreams. However, there is little tool support for this mode of teaching, so presenters currently must either screen-share or use generic slideshow software. To overcome the limitations of these formats, we propose that programming environments should directly facilitate live coding for education. We prototyped this idea by creating Improv, an IDE extension for preparing and delivering code-based presentations, informed by Mayer's principles of multimedia learning. Improv lets instructors synchronize blocks of code and output with slides and create preset waypoints to guide their presentations. A case study of 30 educational videos containing 28 hours of live coding showed that Improv was versatile enough to replicate approximately 96% of the content within those videos. In addition, a preliminary user study with four teaching assistants showed that Improv was expressive enough to allow them to make their own custom presentations in a variety of styles and to improvise by live coding in response to simulated audience questions. Users mentioned that Improv lowered cognitive load by minimizing context switching and made it easier to fix errors on the fly than slide-based presentations.

Scaffolding during Science Inquiry

Prior studies on scaffolding for investigative inquiry practices (i.e., forming a question/hypothesis, collecting data, and analyzing and interpreting data [21]) revealed that students who received scaffolding were better able both to learn these practices and to transfer the competencies to new topics than students who did not receive scaffolding. Prior studies have also shown that after scaffolding was removed, students continued to demonstrate improved inquiry performance on a variety of practices across new driving questions over time. However, studies have not examined the relationship between the amount of scaffolding received and transfer of inquiry performance; this is the focus of the present study. 107 middle school students completed four virtual lab activities (i.e., driving questions) in Inq-ITS. Students received scaffolding when needed from an animated pedagogical computer agent for the first three driving questions in the Animal Cell virtual lab. They then completed the fourth driving question, on a different topic (Plant Cell), without access to scaffolding. Results showed that students' performance improved even as they received fewer scaffolds for the inquiry practices of hypothesizing, collecting data, interpreting data, and warranting claims; furthermore, these results were robust, as evidenced by the finding that students required less scaffolding as they completed subsequent inquiry activities. These data provide evidence of near and far transfer as a result of adaptive scaffolding of science inquiry practices.

Teaching UI Design at Global Scales: A Case Study of the Design of Collaborative Capstone Projects for MOOCs

Group projects are an essential component of teaching user interface (UI) design. We identified six challenges in transferring traditional group projects into the context of Massive Open Online Courses: managing dropout, avoiding free-riding, providing appropriate scaffolding, bridging cultural differences, coordinating across time zones, and establishing common ground. We present a case study of the design of a group project for a UI Design MOOC, in which we implemented technical tools and social structures to cope with these challenges. Based on survey analysis, interviews, and team chat data from the students over a six-month period, we found that our socio-technical design addressed many of the obstacles that MOOC learners encountered during remote collaboration. We translate our findings into design implications for better group learning experiences at scale.

Inequality: multi-modal equation entry on the web

Online learning in STEM subjects requires an easy way to enter and automatically mark mathematical equations. Existing solutions did not meet our requirements, so we developed Inequality, a new open-source system that works across all major browsers, supports both mouse- and touch-based entry, and is usable by high school students and teachers. Inequality has been in use for over two years by about 20,000 students and nearly 900 teachers as part of the Isaac online learning platform. In this paper we evaluate Inequality as an entry method, assess the flexibility of our approach, and examine the effect the system has on student behaviour. We prepared 343 questions which could be answered using either Inequality or a traditional method. Looking across over 472,000 question attempts, we found that students were equally proficient at answering questions correctly with both entry methods. Moreover, students using Inequality required fewer attempts to arrive at the correct answer 73% of the time. In a detailed analysis of equation construction, we found that Inequality provides significant flexibility in the construction of mathematical expressions, accommodating different working styles. We expected that students who first worked on paper before entering their answers would require fewer attempts than those who did not; however, this was not the case (p = 0.0109). While our system is clearly usable, a user survey highlighted a number of issues which we have addressed in a subsequent update.

SESSION: Assessment

Automatic Assessment of Complex Assignments using Topic Models

Automated assessment of complex assignments is crucial for scaling up the learning of complex skills such as critical thinking. To address this challenge, previous work has applied supervised machine learning to automate assessment by learning from examples of assignments graded by humans. However, that work used only simple lexical features, such as words or n-grams. In this paper, we propose using topics as features for this task; topics are more interpretable than simple lexical features and can also address the polysemy and synonymy of lexical semantics. The topics can be learned automatically from student assignment data using a probabilistic topic model. We propose and study multiple approaches to constructing topical features and to combining topical features with simple lexical features. We evaluate the proposed methods using clinical case assignments performed by veterinary medicine students. The experimental results show that topical features are generally very effective and can substantially improve performance when added on top of the lexical features. However, their effectiveness is highly sensitive to how the topics are constructed, and a combination of topics constructed using multiple views of the text data works best. Our results also show that combining the predictions of models using different types of topical features, and of topical and lexical features, is more effective than pooling all features together into one larger feature space.
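
As a rough illustration of the general technique rather than the authors' pipeline, the sketch below derives per-document topic proportions with scikit-learn's LDA and concatenates them with bag-of-words counts before fitting a scoring model; the toy assignment texts and grades are hypothetical placeholders.

    import numpy as np
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.decomposition import LatentDirichletAllocation
    from sklearn.linear_model import Ridge

    # Hypothetical graded assignments (in practice: clinical case write-ups and rubric scores).
    texts = [
        "patient presents with lethargy and weight loss; differential includes renal disease",
        "radiograph shows fracture of the left femur; plan cast and analgesia",
        "bloodwork indicates elevated liver enzymes; recommend ultrasound and biopsy",
        "owner reports vomiting; suspect dietary indiscretion and advise bland diet",
    ]
    grades = np.array([0.9, 0.7, 0.8, 0.5])

    # Simple lexical features: word counts.
    vectorizer = CountVectorizer()
    lexical = vectorizer.fit_transform(texts)

    # Topical features: per-document topic proportions learned by a probabilistic topic model.
    lda = LatentDirichletAllocation(n_components=2, random_state=0)
    topical = lda.fit_transform(lexical)

    # Combine both views and fit a simple regression-based scorer.
    features = np.hstack([lexical.toarray(), topical])
    model = Ridge(alpha=1.0).fit(features, grades)
    print(model.predict(features))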

Designing Digital Peer Assessment for Second Language Learning in Low Resource Learning Settings

In low-resource, over-burdened schools and learning centres, peer assessment systems promise significant practical and pedagogical benefits. Many of these benefits have been realised in contexts such as massive open online courses (MOOCs) and university classrooms, which share a specific trait with low-resource schools: high learner-teacher ratios. However, the constraints and considerations for designing and deploying peer assessment systems in low-resource classrooms have not been well researched or understood, especially at the high school level. In this paper, we present the design of a peer assessment system for second language learning (English as a Second Language) for high school learners in South Africa. We report findings from multiple studies investigating qualitative and quantitative aspects of peer review, as well as the contextual factors that influence the viability of peer assessment systems in these contexts.

Mining Students Pre-instruction Beliefs for Improved Learning

In principle, learning can be increased by assessing the detailed state of student knowledge and mistaken knowledge with a pre-test and then optimizing instruction as measured by the post-test score. As a first step in this direction, we applied Multidimensional Item Response Theory (MIRT) to 17,000 pre-instruction administrations of the Force Concept Inventory (FCI) to study students' initial knowledge in detail. Examination of Item Response Curves (IRCs) showed that even students scoring below chance are not randomly guessing, but instead preferentially select only one or two distractors. Two-dimensional IRT applied to the entire set of 150 possible responses, rather than applied dichotomously to the thirty questions, revealed two skill dimensions of comparable variance. Perpendicular directions were identified within this space corresponding to Newtonian ability and to the propensity to select responses whose IRCs have a maximum at intermediate Newtonian ability rather than at the top or bottom of this dimension. These intermediate responses corresponded to known pre-Newtonian ideas, particularly the Medieval concept of impetus. The ability to measure the detailed misconceptions of individual students or classes will allow the development and application of instructional interventions for such specific misunderstandings, which are typically unchanged by traditional instruction.
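
For readers less familiar with item response theory, the compensatory two-dimensional logistic form commonly used in MIRT analyses is given below; the abstract does not state the authors' exact parameterization, so this is only the standard reference formula, applied here per response option rather than per dichotomously scored question.

    P(X_{ij} = 1 \mid \theta_{j1}, \theta_{j2}) = \frac{1}{1 + \exp\big(-(a_{i1}\theta_{j1} + a_{i2}\theta_{j2} + d_i)\big)}

where \theta_{j1} and \theta_{j2} are student j's positions on the two latent skill dimensions, a_{i1} and a_{i2} are the discrimination parameters of response option i, and d_i is its intercept; an item response curve plots this probability against ability.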

Scaling Up Writing in the Curriculum: Batch Mode Active Learning for Automated Essay Scoring

Automated essay scoring (AES) allows writing to be assigned in large courses and can provide instant formative feedback to students. However, creating models for AES can be costly, requiring the collection and human scoring of hundreds of essays. We have developed and are piloting a web-based tool that allows instructors to incrementally score responses to enable AES scoring while minimizing the number of essays the instructors must score. Previous work has shown that techniques from the machine learning subfield of active learning can reduce the amount of training data required to create effective AES models. We extend those results to a less idealized scenario: one driven by the instructor's need to score sets of essays, in which the model is trained iteratively using batch mode active learning. We propose a novel approach inspired by a class of topological methods, but with reduced computational requirements, which we refer to as topological maxima. Using actual student data, we show that batch mode active learning is a practical approach to training AES models. Finally, we discuss implications of using this technology for automated customized scoring of writing across the curriculum.
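
The following sketch shows generic batch-mode active learning for essay scoring, with uncertainty estimated by a bootstrap committee; it is not the paper's "topological maxima" method, and the feature vectors, scores, and batch size are hypothetical.

    import numpy as np
    from sklearn.linear_model import Ridge

    rng = np.random.default_rng(0)

    # Hypothetical essay feature vectors and the (initially hidden) human scores.
    X = rng.normal(size=(200, 10))
    true_scores = X @ rng.normal(size=10) + rng.normal(scale=0.5, size=200)

    labeled = list(range(10))        # seed essays already scored by the instructor
    unlabeled = list(range(10, 200))
    batch_size = 5

    for round_ in range(3):
        # Train a small committee on bootstrap resamples of the labeled essays.
        committee = []
        for _ in range(5):
            idx = rng.choice(labeled, size=len(labeled), replace=True)
            committee.append(Ridge(alpha=1.0).fit(X[idx], true_scores[idx]))

        # Disagreement (prediction variance) serves as the uncertainty signal.
        preds = np.stack([m.predict(X[unlabeled]) for m in committee])
        variance = preds.var(axis=0)

        # Ask the instructor to score the most uncertain batch next.
        batch = [unlabeled[i] for i in np.argsort(variance)[-batch_size:]]
        labeled.extend(batch)
        unlabeled = [i for i in unlabeled if i not in batch]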

SESSION: Learning Support

UpGrade: Sourcing Student Open-Ended Solutions to Create Scalable Learning Opportunities

In schools and colleges around the world, open-ended homework assignments are commonly used. However, such assignments require substantial instructor effort to grade and tend not to support opportunities for repeated practice. We propose UpGrade, a novel learnersourcing approach that generates scalable learning opportunities using prior student solutions to open-ended problems. UpGrade creates interactive questions that offer automated, real-time feedback while enabling repeated practice. In a two-week experiment in a college-level HCI course, students answering UpGrade-created questions instead of traditional open-ended assignments achieved indistinguishable learning outcomes in ~30% less time, with no manual grading effort required. To enhance quality control, UpGrade incorporates a psychometric approach that uses crowd workers' answers to automatically prune low-quality questions, resulting in a question bank that exceeds reliability standards for classroom use.

Design Perspectives of Learning at Scale: Scaling Efficiency and Empowerment

How do we design technology for learning at scale? Based on an examination of a large number of influential systems for learning at scale, I argue that designing for scale is not an amorphous design undertaking. Instead, it builds on two distinct perspectives on scale. Systems built with a scaling through efficiency perspective make learning more efficient and allow the same number of instructors to help a much larger set of learners. Systems with a scaling through empowerment perspective empower a larger number of people to assist learners effectively. I outline how these simple differences in design perspective lead to large differences in design concerns, techniques, and evaluation criteria. Articulating prevalent design perspectives should make overlooked design opportunities more salient, help systems designers design for scale more deliberately and understand their tradeoffs, and open up new opportunities to designers who shift their perspectives.

Apples to Apples: Differences in Viewer Retention When Longer Content is Chopped into Smaller Bites

Numerous studies have concluded that viewer retention decreases as video length increases. However, we are not aware of any prior work in which a set of longer MOOC (Massive Open Online Course) videos is compared with the same content split into multiple shorter videos. We are fortunate to be in the unique position of having two separate MOOCs that teach essentially the same content using two different platforms (the LEGO Mindstorms NXT and EV3 robots). In our NXT MOOC, videos are quite long, with over 20% of them running more than ten minutes. The EV3 MOOC has very similar content; EV3 MOOC scripts were written by modifying NXT scripts as appropriate. However, many of the EV3 lessons were split into two or three shorter videos in place of a single longer one. NXT videos that are very close in both content and duration to EV3 videos have similar average percentages viewed. This suggests that the two populations watching the videos are similar and that we have a promising setup for comparing NXT lessons with EV3 counterparts that consist of multiple shorter videos. We present an analysis of our data, along with various interpretations, some, but not all, of which support the "shorter videos are better" hypothesis.

Key Phrase Extraction for Generating Educational Question-Answer Pairs

Automatic question generation is a promising tool for developing the learning systems of the future. Research in this area has mostly relied on having answers (key phrases) identified beforehand and given as a feature, which is not practical for real-world, scalable applications of question generation. We describe and implement an end-to-end neural question generation system that generates question and answer pairs given a context paragraph only. We accomplish this by first generating answer candidates (key phrases) from the paragraph context, and then generating questions using the key phrases. We evaluate our method of key phrase extraction by comparing our output over the same paragraphs with question-answer pairs generated by crowdworkers and by educational experts. Results demonstrate that our system is able to generate educationally meaningful question and answer pairs with only context paragraphs as input, significantly increasing the potential scalability of automatic question generation.

SESSION: L@S Perspectives

Master's at Scale: Five Years in a Scalable Online Graduate Degree

In 2014, Georgia Tech launched the first for-credit MOOC-based graduate degree program. In the five years since, the program has proven generally successful, enrolling over 14,000 unique students, and several other similar programs have followed in its footsteps. Existing research on the program has focused largely on details of individual classes; program-level research, however, has been scarce. In this paper, we delve into the program-level details of an at-scale Master's degree, from the story of its creation through the data generated by the program, including the numbers of applications, admissions, matriculations, and graduations; enrollment details including demographic information and retention patterns; trends in student grades and experience as compared to the on-campus student body; and alumni perceptions. Among our findings, we note that the program has stabilized at a retention rate of around 70%; that the program's growth has not slowed; that the program has not cannibalized its on-campus counterpart; and that the program has seen an upward trend in the number of women enrolled as well as a persistently higher number of underrepresented minorities than the on-campus program. Throughout this analysis, we abstract out distinct lessons that should inform the development and growth of similar programs.

Data-Assistive Course-to-Course Articulation Using Machine Translation

Higher education at scale, such as in the California public post-secondary system, has promoted upward socioeconomic mobility by supporting student transfer from 2-year community colleges to 4-year degree-granting universities. Among the barriers to transfer is earning enough credit at 2-year institutions that qualifies for the transfer credit required by 4-year degree programs. Defining which course at one institution will count as credit for an equivalent course at another institution is called course articulation, and manually articulating every set of courses at every institution with one another is an intractable task. In this paper, we present a methodology for making this process of defining and maintaining articulations tractable by leveraging the information contained within historic enrollment patterns and course catalog descriptions. We provide a proof-of-concept analysis using data from a 4-year and a 2-year institution to predict articulation pairs between them, produced from machine translation models and validated against a set of 65 institutionally pre-established course-to-course articulations. Finally, we create a report of proposed articulations for consumption by the institutions and close with a discussion of limitations and challenges to adoption.
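
The abstract does not specify the machine translation model used, but one common way to apply translation techniques to enrollment data is to embed each institution's courses from enrollment sequences and learn a linear mapping between the two embedding spaces from a handful of known articulations. The sketch below illustrates that idea with gensim and a least-squares translation matrix; all course IDs, sequences, and seed pairs are hypothetical.

    import numpy as np
    from gensim.models import Word2Vec

    # Hypothetical term-by-term enrollment sequences (course IDs) at each institution.
    univ_seqs = [["MATH1A", "MATH1B", "CS61A"], ["MATH1A", "PHYS7A", "CS61A"]] * 50
    cc_seqs = [["MATH110", "MATH120", "CIS22A"], ["MATH110", "PHYS4A", "CIS22A"]] * 50

    univ = Word2Vec(univ_seqs, vector_size=16, window=2, min_count=1, seed=0)
    cc = Word2Vec(cc_seqs, vector_size=16, window=2, min_count=1, seed=0)

    # A few pre-established articulations act as the "bilingual dictionary".
    seed_pairs = [("MATH110", "MATH1A"), ("PHYS4A", "PHYS7A")]
    S = np.stack([cc.wv[a] for a, _ in seed_pairs])
    T = np.stack([univ.wv[b] for _, b in seed_pairs])

    # Least-squares translation matrix mapping the 2-year space into the 4-year space.
    W, *_ = np.linalg.lstsq(S, T, rcond=None)

    # Propose an articulation for an unmapped course via nearest neighbour in the target space.
    candidate = cc.wv["CIS22A"] @ W
    print(univ.wv.similar_by_vector(candidate, topn=1))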

Via: Illuminating Academic Pathways at Scale

The processes through which course selections accumulate into college pathways in US higher education are poorly instrumented for observation at scale. We offer an analytic toolkit, called Via, which transforms commonly available enrollment data into formal graphs that are amenable to interactive visualization and computational exploration. We explain the procedures required to project enrollment records onto graphs, and then demonstrate the toolkit using eighteen years of enrollment data from a large private research university. Findings complement prior research on academic search and offer powerful new means for making pathway navigation more efficient.
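
As a minimal sketch of projecting enrollment records onto a graph (the toolkit's actual data model is not described in the abstract, and the records below are hypothetical), one can count term-to-term course transitions per student and store them as weighted directed edges:

    import networkx as nx

    # Hypothetical enrollment records: (student, term, course).
    records = [
        ("s1", 1, "CALC1"), ("s1", 2, "CALC2"), ("s1", 3, "LINALG"),
        ("s2", 1, "CALC1"), ("s2", 2, "STATS"),
        ("s3", 1, "CALC1"), ("s3", 2, "CALC2"),
    ]

    # Group each student's courses by term, then add an edge for each consecutive pair.
    by_student = {}
    for student, term, course in records:
        by_student.setdefault(student, []).append((term, course))

    G = nx.DiGraph()
    for courses in by_student.values():
        courses.sort()
        for (_, a), (_, b) in zip(courses, courses[1:]):
            weight = G.get_edge_data(a, b, default={"weight": 0})["weight"]
            G.add_edge(a, b, weight=weight + 1)

    print(sorted(G.edges(data=True)))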

Predict and Intervene: Addressing the Dropout Problem in a MOOC-based Program

Massive Open Online Courses (MOOCs) are an efficient way of delivering knowledge to thousands of learners. However, even among learners who show a clear intention to complete a MOOC, the dropout rate is substantial. This is particularly relevant in the context of MOOC-based educational programs, where a funnel of participation can be observed and high dropout rates at early stages of the program significantly reduce the number of learners who successfully complete it. In this paper, we propose an approach to identify learners at risk of dropping out of a course, and we design and test an intervention intended to mitigate that risk. We collect course clickstream data from MOOCs of the MITx MicroMasters® in Supply Chain Management program and apply machine learning algorithms to predict potential dropouts. Our final model is able to predict 80% of actual dropouts. Based on these results, we design an intervention aimed at increasing learners' motivation and engagement with a MOOC. The intervention consists of sending tailored encouragement emails to at-risk learners, but despite a high email-opening rate, it showed no effect on dropout reduction.
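
A minimal sketch of the prediction step, assuming weekly activity counts as features; the paper's actual clickstream features, model choice, and thresholds are not given in the abstract, and the data below are simulated.

    import numpy as np
    from sklearn.ensemble import GradientBoostingClassifier
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import recall_score

    rng = np.random.default_rng(0)

    # Simulated weekly features per learner: video plays, problem attempts, forum views, active days.
    X = rng.poisson(lam=[5, 3, 1, 4], size=(1000, 4)).astype(float)
    dropout = (X.sum(axis=1) + rng.normal(scale=2, size=1000) < 11).astype(int)

    X_train, X_test, y_train, y_test = train_test_split(X, dropout, random_state=0)
    clf = GradientBoostingClassifier(random_state=0).fit(X_train, y_train)

    # Recall on the dropout class corresponds to the share of actual dropouts predicted.
    print("dropout recall:", recall_score(y_test, clf.predict(X_test)))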

WORKSHOP SESSION: Work in Progress Papers

Chimeria: Grayscale MOOC: Towards Critical Self-Reflection at Scale

Contemporary online learning systems are increasingly common elements of post-secondary, workplace, and lifelong education. These systems typically employ the transmission model of education to teach students, an approach ill-suited for fostering deeper learning. This paper presents our latest findings related to ongoing research developing a generalizable framework for supporting deeper learning in online learning systems. In this work, we focus on the self-debriefing component of our framework and its impact on deeper learning in online learning systems. To pursue this line of inquiry, we conducted an exploratory study evaluating the Chimeria:Grayscale MOOC, an online learning system that implements our framework. Our results suggest that self-debriefing is crucial for effectively supporting students' reflections.

Measuring Students' Performance on Programming Tasks

Large-scale learning systems for introductory programming need to automatically assess the quality of students' performance on programming tasks. This assessment is done using a performance measure, which provides feedback to students and teachers and serves as an input to the domain, student, and tutor models. The choice of a good performance measure is nontrivial, since student performance can be measured in many ways, and the design of the measure can interact with the adaptive features of a learning system or with imperfections in the underlying domain model. We discuss the important design decisions and illustrate the process of iteratively designing and evaluating a performance measure in a case study.

Informing the Design of Collaborative Activities in MOOCs using Actionable Predictions

With the aim of supporting instructional designers in setting up collaborative learning activities in MOOCs, this paper derives prediction models for student participation in group discussions. The salient feature of these models is that they are built using only data prior to the learning activity, and can thus provide actionable predictions, as opposed to post-hoc approaches common in the MOOC literature. Some learning design scenarios that make use of this actionable information are illustrated.

Measuring Difficulty of Introductory Programming Tasks

Quantification of the difficulty of problem solving tasks has many applications in the development of adaptive learning systems, e.g., task sequencing, student modeling, and insight for content authors. There are, however, many potential conceptualizations and measures of problem difficulty and the computation of difficulty measures is influenced by biases in data collection. In this work, we explore difficulty measures for introductory programming tasks. The results provide insight into non-trivial behavior of even simple difficulty measures.
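
As a small illustration of how even simple difficulty measures can be computed from submission logs, the sketch below derives two per-task measures from a hypothetical log; these are only examples of the kinds of measures such an analysis might compare.

    import pandas as pd

    # Hypothetical submission log: one row per attempt on an introductory programming task.
    log = pd.DataFrame({
        "task":    ["t1", "t1", "t1", "t2", "t2", "t2", "t2"],
        "student": ["a",  "b",  "b",  "a",  "a",  "c",  "c"],
        "solved":  [1,    0,    1,    0,    1,    0,    0],
        "time_s":  [120,  300,  240,  400,  500,  350,  380],
    })

    # Aggregate attempts per student, then compute two per-task measures:
    # failure rate among attempting students and median total solving time.
    per_student = log.groupby(["task", "student"]).agg(solved=("solved", "max"),
                                                       time_s=("time_s", "sum"))
    difficulty = per_student.groupby("task").agg(
        failure_rate=("solved", lambda s: 1 - s.mean()),
        median_time_s=("time_s", "median"))
    print(difficulty)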

Inquiry learning at scale: pedagogy-informed design of a platform for citizen inquiry

This paper addresses the related issues of which pedagogies improve with scale and how to develop a novel platform for inquiry-led learning at scale. The paper begins by introducing pedagogy-informed design of platforms for learning at scale. It then summarizes previous work to develop a platform for open science investigations. Then, it introduces a new platform for inquiry-led learning at scale. The paper concludes with an evaluation of the effectiveness of the platform to: meet its design requirements; enable individuals, groups and institutions to design inquiry-led investigations; engage members of the public to participate; and enable learning activities on the platform to sustain and grow.

BookBuddy: Turning Digital Materials Into Interactive Foreign Language Lessons Through a Voice Chatbot

Digitization of education has brought a tremendous amount of online materials that are potentially useful for language learners to practice their reading skills. However, these digital materials rarely help with conversational practice, a key component of foreign language learning. Leveraging recent advances in chatbot technologies, we developed BookBuddy, a scalable virtual reading companion that can turn any reading material into an interactive conversation-based English lesson. We piloted our virtual tutor with five 6-year-old native Chinese-speaking children currently learning English. Preliminary results suggest that children enjoyed speaking English with our virtual tutoring chatbot and were highly engaged during the interaction.

Creating a Framework for User-Centered Development and Improvement of Digital Education

We investigate how the technology acceptance and learning experience of the digital education platform HPI Schul-Cloud (HPI School Cloud) for German secondary school teachers can be improved by proposing a user-centered research and development framework. We highlight the importance of developing digital learning technologies in a user-centered way to take into account differences in the requirements of educators and students. We suggest applying qualitative and quantitative methods to build a solid understanding of a learning platform's users, their needs, requirements, and context of use. After concept development and idea generation of features and areas of opportunity based on the user research, we emphasize the application of a multi-attribute utility analysis as a decision-making framework to prioritize ideas rationally, taking the results of user research into account. Afterward, we recommend applying the build-learn-iterate principle to build prototypes in different resolutions while learning from user tests and improving the selected opportunities. Finally, we propose an approach for continuous short- and long-term user experience controlling and monitoring, extending existing web and learning analytics metrics.
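
To make the multi-attribute utility analysis step concrete, the sketch below ranks candidate features by a weighted sum of criterion scores; the criteria, weights, ideas, and scores are all hypothetical and would in practice come from the user research described above.

    # Multi-attribute utility analysis: rank feature ideas by weighted criterion scores.
    criteria_weights = {"user_value": 0.4, "feasibility": 0.25, "reach": 0.2, "strategic_fit": 0.15}

    ideas = {
        "offline mode":      {"user_value": 8, "feasibility": 4, "reach": 6, "strategic_fit": 5},
        "teacher dashboard": {"user_value": 7, "feasibility": 6, "reach": 8, "strategic_fit": 7},
        "single sign-on":    {"user_value": 6, "feasibility": 7, "reach": 9, "strategic_fit": 8},
    }

    def utility(scores):
        return sum(criteria_weights[criterion] * score for criterion, score in scores.items())

    for name, scores in sorted(ideas.items(), key=lambda kv: utility(kv[1]), reverse=True):
        print(f"{name}: {utility(scores):.2f}")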

Leveraging Skill Hierarchy for Multi-Level Modeling with Elo Rating System

In this paper, we discuss the case of offering retired assessment items as practice problems for the purposes of learning in a system called ACT Academy. In contrast to computer-assisted learning platforms, where students consistently focus on small sets of skills that they practice until mastery, in our case students are free to explore the whole subject domain. As a result, they have significantly lower attempt counts per individual skill.

We have developed and evaluated a student modeling approach that differs from traditional approaches to modeling skill acquisition by leveraging the hierarchical relations in the skill taxonomy used for indexing practice problems. Results show that when applied in systems like ACT Academy, this approach offers significant improvements in terms of predicting student performance.
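
A minimal sketch of an Elo-style update that also credits the parent skill in the taxonomy; the blending weight and update constants are illustrative assumptions, not the paper's actual model.

    import math

    def p_correct(theta, difficulty):
        """Logistic expectation used in Elo-style student modeling."""
        return 1.0 / (1.0 + math.exp(-(theta - difficulty)))

    ratings = {"algebra": 0.0, "algebra/linear-equations": 0.0}  # parent and leaf skill
    item_difficulty = 0.2
    K, K_item = 0.4, 0.2   # learner and item update steps (assumed)
    w_parent = 0.5         # assumed blend between leaf and parent estimates

    def update(leaf, parent, correct):
        global item_difficulty
        theta = (1 - w_parent) * ratings[leaf] + w_parent * ratings[parent]
        expected = p_correct(theta, item_difficulty)
        ratings[leaf] += K * (correct - expected)
        ratings[parent] += K * w_parent * (correct - expected)
        item_difficulty += K_item * (expected - correct)

    update("algebra/linear-equations", "algebra", correct=1)
    print(ratings, item_difficulty)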

Student Code Trajectories in an Introductory Programming MOOC

In classrooms, instructors teaching students how to code have the ability to monitor progress and provide feedback through regular interaction. There is generally no analogous tracing of learning progression in programming MOOCs, hindering the ability of MOOC platforms to provide automated feedback at scale. We explore features for every certified student's history of code submissions to specific problems in a programming MOOC and measure similarity to sample solutions. We seek to understand whether students who succeed in the course reach solutions similar to these instructor-intended sample solutions, in terms of the concepts and mechanisms they contain. Furthermore, do students learn to conform to instructor expectations as the course progresses, and does prior experience have correlations with student behavior? We also explore what feature representations are sufficient for code submission history, since they are directly applicable to the development of automated tutors for progress tracking.

Predicting the difficulty of automatic item generators on exams from their difficulty on homeworks

To design good assessments, it is useful to have an estimate of the difficulty of a novel exam question before running an exam. In this paper, we study a collection of a few hundred automatic item generators (short computer programs that generate a variety of unique item instances) and show that their exam difficulty can be roughly predicted from student performance on the same generator during pre-exam practice. Specifically, we show that the rate that students correctly respond to a generator on an exam is on average within 5% of the correct rate for those students on their last practice attempt. This study is conducted with data from introductory undergraduate Computer Science and Mechanical Engineering courses.

Pilot Study to Estimate "Difficult" Area in e-Learning Material by Physiological Measurements

To improve the design of e-learning materials, it is necessary to know which words or figures a learner found "difficult" in the materials. In this pilot study, we measured learners' electroencephalography (EEG) and eye gaze data and analyzed them to estimate which areas they had difficulty learning. The developed system realized simultaneous measurement of physiological data and subjective evaluations during learning. Using this system, we observed specific EEG activity on difficult pages. Integrating eye gaze and EEG measurements raised the possibility of determining where a learner felt "difficult" within a page of learning materials. From these results, we suggest that multimodal measurement of EEG and eye gaze could lead to effective improvement of learning materials. For future work, more data collection using various materials and learners with different backgrounds is necessary. This study could lead to establishing a method to improve e-learning materials based on learners' mental states.

Semantic Matching Evaluation of User Responses to Electronics Questions in AutoTutor

Relatedness between user input and an ideal response is a salient feature required for the proper functioning of an Intelligent Tutoring System (ITS) using natural language processing. Improper assessment of text input causes maladaptation in ITSs, and meta-assessment of user responses can improve instruction efficacy and user satisfaction. Therefore, this paper evaluates the quality of semantic matching between user input and the expected response in AutoTutor, an ITS which holds a conversation with the user in natural language. AutoTutor's dialogue is driven by the AutoTutor Conversation Engine (ACE), which uses a combination of Latent Semantic Analysis (LSA) and Regular Expressions (RegEx) to assess user input. We assessed ACE via responses from 219 Amazon Mechanical Turk users, who answered 118 electronics questions broken into 5202 response pairings (n = 5202). These analyses explore the relationship between RegEx and LSA, the agreement between the two human judges, and the agreement between the human judges and ACE. Additionally, we calculated precision and recall. As expected, regular expressions and LSA had a moderate, positive relationship, and the agreement between ACE and the human judges was fair, but slightly lower than the agreement between the two human judges.
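
The sketch below illustrates the general idea of combining a regular-expression check with an LSA-style similarity score; the expectations, patterns, and fusion rule are hypothetical, since ACE's internals are not described in the abstract.

    import re
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.decomposition import TruncatedSVD
    from sklearn.metrics.pairwise import cosine_similarity

    # Hypothetical expected answers for electronics questions.
    expectations = [
        "current through a resistor is proportional to the voltage across it",
        "a capacitor stores energy in an electric field",
        "resistance in a series circuit is the sum of the individual resistances",
    ]
    user_response = "the current goes up when the voltage across the resistor goes up"

    # Regular-expression check for a hand-authored required pattern.
    regex_hit = bool(re.search(r"current.*voltage|voltage.*current", user_response))

    # LSA similarity: TF-IDF followed by truncated SVD, then cosine similarity
    # between the response and its closest expectation.
    tfidf = TfidfVectorizer().fit(expectations + [user_response])
    vecs = TruncatedSVD(n_components=2, random_state=0).fit_transform(
        tfidf.transform(expectations + [user_response]))
    lsa_score = cosine_similarity(vecs[-1:], vecs[:-1]).max()

    # A simple fusion rule standing in for ACE's actual combination logic.
    print(regex_hit, round(float(lsa_score), 2), regex_hit or lsa_score > 0.6)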

Proposal and Implementation of an Elderly-oriented User Interface for Learning Support Systems

Extended learning support systems for all-age education require inclusive user interface design, especially for elderly users. We propose a dual-tablet user interface with simplified visual layers and more intuitive operations, aiming to reduce the physical and mental load on elderly learners. An initial prototype with basic functions for viewing learning material was developed on a cross-platform framework. Two preliminary user experiments with elderly volunteers were carried out as formative evaluations, in order to iteratively improve the usability of the interface design. The prototype was modified based on the participants' comments and observation of their operations during the experiments. Additional findings on elderly users' preferences and tendencies are discussed for further development.

Implementing Learning Analytics to Foster a STEM Learning Ecosystem at the City-Level: Emerging Research and Design Challenges

To address the goal of increasing and broadening participation of youth in STEM fields, a learning ecosystem approach is a promising strategy. Learning analytics can play an important role in such efforts, which aim to build learning supports across the diverse spaces in which learning and development occur, including informal, formal, and online contexts. This paper introduces a city-level learning analytics implementation effort in a developing STEM ecosystem in one mid-sized city. We describe aspects of our design and research approach and the challenges that emerge from taking a learning ecosystem perspective on learning and development.

On the Influence of Grades on Learning Behavior of Students in MOOCs

MOOCs (Massive Open Online Courses) frequently use grades to determine whether a student passes the course. To better understand how student behavior is influenced by grade feedback, we conduct a study of changes in certified students' behavior before and after they receive their grade. We use observational student data from two MITx MOOCs to examine student behavior before and after a grade is released and calculate the difference (the delta-activity). We then analyze the changes in the delta-activity distributions across all graded assignments and observe that the variation in delta-activity decreases as grade decreases, with students who have the lowest grade exhibiting little or no change in weekly activity. This trend persists throughout each course, in all course offerings, suggesting that a change in grade does not correlate with a change in the behavior of certified MOOC students.

Synchronous at Scale: Investigation and Implementation of a Semi-Synchronous Online Lecture Platform

Online classes and degree programs continue to grow in popularity, in part due to the increased convenience and accessibility of education that technology has provided in recent years. As online education scales upwards and outwards, there is an increased need to provide students with an engaging and collaborative learning experience. In some online learning environments, student collaboration is perceived to be more difficult than it is in a physical classroom setting due to cultural or geographic distance between students. In particular, online class lectures often lack the collaborative spirit seen in most in-person classroom lectures. To improve upon the online classroom experience, this project first examines the benefits and drawbacks of several in-person and online lecture delivery techniques, then proposes an online lecture platform that allows students to facilitate their own collaborative classrooms on-demand through a semi-synchronous viewing area and chatroom.

Impact of Free-Certificate Coupons on Learner Behavior in Online Courses: Results from Two Case Studies

The relationship between pricing and learning behavior is an increasingly important topic in MOOC (massive open online course) research. We report on two case studies in which cohorts of learners were offered free-certificate coupons to explore how price reductions might influence user behavior in MOOC-based online learning settings. In Case Study #1, we compare participation and certification rates between courses with and without free-certificate coupons. In the courses with a free-certificate track, participants signed up for the verified-certificate track at higher rates, and completion rates among verified students were higher than in the paid-certificate-track courses. In Case Study #2, we compare the behaviors of learners within the same courses based on whether they received access to a free-certificate track. Access to free certificates was associated with somewhat lower certification rates, but overall certification rates remained high, particularly among those who viewed the courses. These findings suggest that incentives other than the sunk cost of paying for a verified-certificate track may motivate learners to complete MOOC courses.

Guided-KNOWLA: The Use of Guided Unscrambling to Enhance Active Online Learning

Test-yourself questions are effective examples of formative assessment and have been shown to promote learners' active interaction with materials and knowledge mastery through frequent practice. However, the cost of developing and implementing engaging test-yourself activities can be problematic in large-scale web-based learning environments; a lack of built-in scaffolding to guide learners is also a challenge. We introduce Guided-KNOWLA, an improvement of KNOWLA, a learning tool that has learners assemble a given set of mixed-size scrambled fragments into a logical order using a web-based interface, enhanced with motivational step-by-step hints and guidance. We conducted an exploratory study with graduate learners to examine their attitudes toward Guided-KNOWLA activities, measured by perceived usefulness and comparison with other formats of formative assessment. Preliminary results suggest that Guided-KNOWLA activities were useful in helping learners master online materials and were preferred over multiple-choice questions as a format of "test-yourself" practice.

An Automated Feedback System to Support Student Learning in Writing-to-Learn Activities

Formative feedback has long been recognized as a crucial scaffold for student learning. Given the demands on instructors' time, it is impossible for them to provide every student with on-demand formative feedback based on that student's performance. There is therefore growing interest in developing better approaches to providing students with automated formative feedback to assist their learning. In this research, we design and develop an automated formative feedback system to support student learning of conceptual knowledge in writing assignments. In the proposed system, formative feedback is generated automatically with the help of concept maps constructed from instructors' lecture slides and students' writing assignments. In this paper, we present the automatic approach to generating formative feedback, discuss the system architecture, and illustrate a prototype of the proposed system.

automaTA: Human-Machine Interaction for Answering Context-Specific Questions

When online learners have questions related to a specific task, they often use Q&A boards instead of web search because they are looking for context-specific answers. While lecturers, teaching assistants, and other learners can provide context-specific answers on Q&A boards, there is often a high response latency, which can impede learning. We present automaTA, a prototype that suggests context-specific answers to online learners' questions by capturing the context of those questions. Our solution automates response generation with a mixed human-machine approach: humans generate high-quality answers, and the human-generated responses are used to train an automated algorithm to provide context-specific answers. As a prototype, automaTA generates automated answers for function-related questions in an online programming course. We conducted two user studies with undergraduate and graduate students with little or no experience with Python and found that automaTA has the potential to automatically provide answers to context-specific questions at scale, without a human instructor.

What do students at distance universities think about AI?

Algorithms, drawn from Artificial Intelligence (AI) technologies, are increasingly being used in distance education. However, currently little is known about the attitudes of distance education students to the benefits and risks associated with AI. For example, is AI broadly welcomed by distance education students, thought to be irrelevant, or disliked? Here, we present the initial findings of a survey of students from the UK's largest distance university as a first step towards addressing the question "What do students at distance universities think about AI?" Responses from the 222 contributors suggest that these students do expect AI to be beneficial for their future learning, with more respondents selecting potential benefits than selecting risks. Nonetheless, it is important to extend this exploratory study to students in other universities worldwide, and to other stakeholders.

Peer Advising at Scale: Content and Context of a Learner-Owned Course Evaluation System

Peer advising in education, which involves students providing fellow students with course advice, can be important in online student communities and can provide insights into potential course improvements. We examine reviews from a course review web site for online graduate programs. We develop a coding scheme to analyze the free text portion of the reviews and integrate those findings with students' quantitative ratings of each course's overall score, difficulty, and workload. While reviews focus on subjective evaluation of courses, students also provide feedback for instructors, personal context, advice for other students, and objective course descriptions. Additionally, the average review varies by course overall score, difficulty, and workload. Our research examines the importance of student communities in online education and peer advising at scale.

Towards Improving Students' Forum Posts Categorization in MOOCs and Impact on Performance Prediction

Going beyond mere forum post categorization is key to understanding why some students struggle and eventually fail in MOOCs. We propose an extension of a coding scheme and present the design of the associated automatic annotation tools to tag the questions students ask in their forum posts. Working with four sessions of the same MOOC, we cluster students' questions and show that the obtained clusters are consistent across all sessions and can sometimes be correlated with students' success in the MOOC. Moreover, this helps us better understand the nature of questions asked by successful versus unsuccessful students.

PARQR: Augmenting the Piazza Online Forum to Better Support Degree Seeking Online Masters Students

We introduce PARQR, a tool for online education forums that reduces duplicate posts by 40% in a degree seeking online masters program at a top university. Instead of performing a standard keyword search, PARQR monitors questions as students compose them and continuously suggests relevant posts. In testing, PARQR correctly recommends a relevant post, if one exists, 73.5% of the time. We discuss PARQR's design, initial experimental results comparing different semesters with and without PARQR, and interviews we conducted with teaching instructors regarding their experience with PARQR.

Causal Inference in Higher Education: Building Better Curriculums

Higher educational institutions constantly look for ways to meet students' needs and support them through graduation. However, even though institutions provide degree program curriculums and prerequisite courses to guide students, these often fail to capture some of the underlying skills and knowledge imparted by courses that may be necessary for a student.

In our approach, we use methods of causal inference to study the relationships between courses using historical student performance data. Specifically, two methods were employed to obtain the Average Treatment Effect (ATE): matching and regression. The results from this study so far show that we can make causal inferences from our data and that the methodology may be used to identify courses with a strong causal relationship, which can then be used to modify course curriculums and degree programs.
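
A minimal sketch of the regression route to the ATE, assuming a single confounder and simulated grades; the actual covariates and matching procedure used in the study are not detailed here.

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(0)

    # Simulated records: whether the student took a prerequisite (treatment),
    # prior GPA (confounder), and grade in the downstream course (outcome).
    n = 500
    prior_gpa = rng.normal(3.0, 0.4, n)
    took_prereq = (prior_gpa + rng.normal(0, 0.3, n) > 3.0).astype(int)
    grade = 2.0 + 0.5 * took_prereq + 0.4 * prior_gpa + rng.normal(0, 0.3, n)

    # Regression adjustment: the coefficient on the treatment indicator estimates the ATE,
    # assuming prior GPA captures the relevant confounding.
    X = sm.add_constant(np.column_stack([took_prereq, prior_gpa]))
    model = sm.OLS(grade, X).fit()
    print("estimated ATE:", round(model.params[1], 3))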

Investigating Learning Design Categorization and Learning Behaviour in Computational MOOCs

We investigate learner efficiency by categorizing a computational MOOC and analyzing user behavior data from a learning design point of view. Learning design is important both when designing courses and when studying them, and learning behavior can be observed from MOOC platform data. For this study we asked two learning design experts to categorize the MITx course "6.00.1x Introduction to Computer Science and Programming Using Python". We use these categorizations to investigate relationships with learning behavior by analyzing the MOOC platform data. Our study verifies that learning design can be correlated with learning behavior; for example, students exhibit patterns of behavior associated with a component's difficulty and category.

LiveDataLab: A Cloud-Based Platform to Facilitate Hands-on Data Science Education at Scale

We present LiveDataLab, a novel general cloud-based platform that facilitates data science education at scale by enabling instructors to offer hands-on data science assignments using large real-world datasets. Using real course assignments as examples, our demonstration will walk attendees through the process of an instructor deploying an assignment, students working on and submitting assignments, and leaderboard-based competition and automated grading to demonstrate the major functions and benefits of LiveDataLab.

Jack Watson: Addressing Contract Cheating at Scale in Online Computer Science Education

Cheating has always been a problem for academic institutions, but the internet has increased access to a form of academic dishonesty known as contract cheating, or "homework for hire." When students purchase work online and submit it as their own, it cannot be detected by commonly-used plagiarism detection tools, and this troubling form of cheating seems to be increasing.

We present an approach to addressing contract cheating: an AI agent that poses as a contractor to identify students attempting to purchase homework solutions. Our agent, Jack Watson, monitors auction sites, identifies posted homework assignments, and provides students with watermarked solutions that can be automatically identified upon submission of the assignment.

Our work is ongoing, but the approach has already proven viable, identifying nine cases of contract cheating through our techniques. We are continuing to improve Jack Watson and to further automate the monitoring and identification of contract cheating on online marketplaces.

Developing an Intervention to Advance Learning At Scale

With the technological advances we witness every day, in contemporary life in general and in education in particular, new ways of learning are emerging, such as Massive Open Online Courses (MOOCs). MOOCs have grown rapidly over the past few years, yet meeting the needs of massive and diverse groups of learners and keeping them motivated to learn remains a challenge. To address this concern, we have developed an intervention to meet students' learning needs and keep them motivated to learn according to their capabilities. In this paper, we discuss the intervention and report preliminary results, drawing on quantitative and qualitative data from the course survey to interpret learners' experiences with this approach.

Web of Slides: Automatic Linking of Lecture Slides to Facilitate Navigation

Lecture slides covering many topics are becoming increasingly available online, but they are scattered, making it a challenge to instantly access all slides relevant to a learning context. To address this challenge, we propose creating links between these scattered slides to form a Web of Slides (WOS). Using the sequential nature of slides, we present preliminary results on automatically creating a basic type of link based on slide similarity, as an initial step toward the vision of WOS. We also outline future research directions involving different link types and the unique features of slides.

WOSView Demo: A Tool to Explore the Web of Slides

We will demonstrate WOSView, a prototype system built on the vision of the Web of Slides (WOS), which aims to link lecture slides in order to facilitate navigation across them. Links can be created at the slide level or at the level of phrases inside a slide, and many types of links are possible. The prototype implements the most basic type of link, connecting slides with similar content, and integrates lectures from four different MOOCs. WOSView also supports keyword search, which generates virtual links dynamically. We will demonstrate how the graphical interface of WOSView enables students to flexibly navigate into slides from different courses and to explore related slides using both static and dynamic links, and we will solicit feedback from the community about the vision of WOS.

Teaching C Programming Interactively at Scale Using Taskgrader: an Open-source Autograder Tool

This demo paper introduces a tool and a method to provide a barrier-free, rich, interactive learning experience for students of all levels of preparation in programming courses. Taskgrader is an open-source autograding tool that provides instant feedback in large-scale online programming classes. This in-browser tool offers extensive feedback on student code submissions right within any LMS and passes data back to the gradebook.

Supporting Instruction of Formulaic Sequences Using Videos at Scale

To help language learners achieve fluency, instructors often focus on teaching formulaic sequences (FS), phrases such as idioms or phrasal verbs that are processed, stored, and retrieved holistically. Teaching FS effectively is challenging, as it heavily relies on instructors' intuition, prior knowledge, and manual effort to identify a set of FSs with high utility. In this paper, we present FSIST, a tool that supports instructors in video-based instruction of FS. The core idea of FSIST is to utilize videos at scale to build a list of FSs along with videos that include example usages. To evaluate how effectively FSIST can support instructors, we conducted a user study with three English instructors. Results show that the browsing interactions provided in FSIST support instructors in efficiently finding parts of videos that show example usages of FSs.

Achievements for building a learning community

Twice a year the National University of Singapore hosts computer programming events open to the nation's secondary, junior college, polytechnic and technical education students. To qualify for the live events, participants complete online programming activities during a month-long qualification phase open to all non-university students over the age of 12. The activities include game-based learning and traditional coding problems.

During the past year, more than 1700 students participated in the two qualification phases and more than 200 students participated in the live events. At these events, students pair-program to test their programming abilities and showcase their coded creations in a tournament format.

In the accompanying poster, we describe our work to build a community of intrinsically motivated learners and develop the technical infrastructure to support them both at scale during the qualification phase and live events. We conclude by detailing our plans for leveraging the community as a site for research on learning going forward.