1 Introduction

1.1 Metacognitive Deficits

Progressing through a federated learning environment via adaptive recommendations may constitute experiential learning of metacognitive strategies. Intelligent, personalized recommendations attempt a delicate task otherwise incumbent upon learners. That task depends upon metacognitive processes and awareness that are deficient in (or potentially foreign to) many learners [1]. To compensate for the paucity of those skills, conventional independent learners have, at best, rigid lesson plans. Such plans fail to provide many aspects of an experiential process identified by many as critical to knowledge construction [2, 3, 4].

Appropriate recommendations—human or otherwise—may complement explicit content instruction by scaffolding how to learn complex topics beyond the immediate application or solution. However, this meta-instruction works best when learners understand the essential logic of artificially intelligent recommendations—that their past performance and personal characteristics inform what should come next. The challenge then becomes how to introduce and reinforce the complexities and relationships among these variables through interactive experience, and how to evaluate whether such instruction has taken place.

Experiential instruction relies largely on emotional engagement and dynamism. Graesser [5] argues that emotions are the “experiential glue” of learning environments. To accomplish this engagement, adaptive instructional systems sometimes must operate in open defiance of the user experience design principle that cognitive load should be minimized. A cognitively demanding state of confusion may serve as an effective emotional and experiential tool to establish or correct mental models of the domain, as well as the learner’s understanding of his mastery status within it [6]. Naturally, this state of confusion may also lead to frustration and in turn disengagement. This risk necessitates a balance, maintained by accurate assessment of the learner’s status on multiple dimensions.

Further complicating the high-wire act, learners need some degree of understanding of the intelligent recommendations that guide their interaction with the system. Jackson, Graesser, and McNamara [7] argued that the accuracy of learners' expectations of the learning technology constitutes a stronger predictor of actual learning than prior knowledge, initial motivation, and technological proficiency combined. As learners are unlikely to differentiate between the production quality of individual learning resources and the federated learning system generally (with special emphasis on the intelligent recommender), a holistic measure of expected efficacy, compared against actual efficacy, may prove strongly predictive of the latter.

1.2 Hybrid Tutor

Effective instruction of complex domains naturally progresses and varies along multiple dimensions to instill complex understanding. For example, Ohm's Law in electrical engineering logically precedes calculating voltage of series and parallel circuits, because those calculations frequently rely on Ohm's Law. Complementing that linear progression, alternation between applied, conceptual, and formalized denotation can provide symbiotic advancements. Applied mathematical problems may contextualize the need for Ohm's Law to find a solution. A subsequent conceptual exercise may demonstrate why current increases with voltage and decreases with resistance, and in what proportions. Understanding those in succession can concretize formulaic problems and foster a learning environment tied to practice. This constitutes an authentic learning experience, critical in developing meaningful skill acquisition [8]. For these reasons, comprehensive content instruction of complex domains all but requires the availability of learning resources targeted at various levels and perspectives.
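To make the dependency concrete, the series-circuit calculation reduces to repeated applications of Ohm's Law:

```latex
% Ohm's Law relates voltage, current, and resistance:
V = I R
% The same current I flows through every resistor in a series circuit,
% so each voltage drop, and their sum, follows from Ohm's Law:
V_k = I R_k \qquad V_{\mathrm{total}} = \sum_k V_k = I \sum_k R_k
```

A learner without V = IR in hand thus has no route into the series-circuit problem, which is why the sequential ordering matters.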

In this manuscript, we refer to a class of learning environment called a hybrid tutor [9]. These systems combine static learning resources, like text or videos, with adaptive resources such as intelligent tutoring systems or interactive assessments. Hybrid tutors incorporate these disparate resources within a single interface and provide intelligent recommendations based on progress across the various components.

The hybrid tutor we use for illustration is called ElectronixTutor [10, 11]. This system subsumes several independently developed learning resources pertaining to electrical engineering principles and application. These include static conventional texts, dialogue-based conceptual interactions, stepwise formulaic progressions, tiered multiple-choice questions, and comprehensive diagrammatic problems, among others.

These resources serve as the raw material from which to construct holistic content instruction and meta-instruction through two distinct intelligent recommendation pathways (one unconstrained, the other instructor-driven). Both pathways leverage the same comprehensive model of progression and perspective. They both produce individualized assessment of performance that recognizes (via distinct methods) deficiency and excellence. We leverage the construction and evaluation of ElectronixTutor to examine in context the relevant issues of sequential progression, diversity of perspective, experiential instruction, intelligent recommendation, and the relationship between learner expectation and learning outcomes.

2 Designing for Experiential Learning

2.1 Adaptive Recommendations

Adaptive instructional systems are generally advantageous, relative to non-adaptive learning tools [12]. However, methods of implementation for adaptivity vary widely, as does the relative advantage bestowed on the learner. The relatively novel architecture of hybrid tutors requires similarly novel construction of adaptive recommendation engines. Recommendations need to function simultaneously within-resource and between-resource. The system must translate progress in an independently developed learning resource into a holistic understanding of the learner [13, 14]. From there, the system must make determinations based on learning theory that account for the learner and domain considerations noted above. Based on those determinations, the system then identifies the learning resource and individual item that best matches the needs of the moment.
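A minimal sketch of that pipeline appears below, assuming each resource reports normalized scores in [0, 1]. The names (LearnerModel, record, recommend) are hypothetical illustrations, not ElectronixTutor's actual API.

```python
from dataclasses import dataclass, field

@dataclass
class LearnerModel:
    """Holistic learner model aggregating progress across resources."""
    # Maps (topic, resource) to a list of normalized scores in [0, 1].
    history: dict = field(default_factory=dict)

    def record(self, topic: str, resource: str, score: float) -> None:
        """Translate within-resource progress into the holistic model."""
        self.history.setdefault((topic, resource), []).append(score)

    def topic_mean(self, topic: str) -> float:
        """Average performance on a topic across all resources."""
        scores = [s for (t, _), ss in self.history.items()
                  if t == topic for s in ss]
        return sum(scores) / len(scores) if scores else 0.5  # neutral prior

def recommend(model: LearnerModel, candidates: list) -> tuple:
    """Select the candidate item whose difficulty best matches mastery.

    candidates: tuples of (topic, resource, item_id, difficulty in [0, 1]).
    """
    return min(candidates, key=lambda c: abs(c[3] - model.topic_mean(c[0])))
```

Here, a learner averaging 0.8 within a topic would draw harder items than one averaging 0.4; the engines described below layer learning-theoretic strategies on top of this basic matching step.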

The complexity of this decision-making process leaves considerable room for variation. ElectronixTutor includes two distinct methods of constructing adaptive recommendations. These follow from two basic use cases. In the first case, independent learners wish to increase their knowledge of the domain. This likely does not involve any structure outside of that suggested by the domain itself, and recommendations follow accordingly with adaptivity derived from the individual’s historical performance data. In the second case, the system functions as a classroom companion learning tool. There, instructors may wish to have a degree of control over the content order and availability, to more closely follow existing syllabi and lesson plans.

Historically Derived.

In the case of an independent learner, ElectronixTutor leverages all data collected about the learner throughout all historical interactions [15]. These data reside in the learning record store, from which the system calculates performance on several dimensions, including performance within that topic, within that resource, recent versus historical average, etc. From there, the system generates potential recommended items using considerations influenced by learner characteristics and experiential factors. For example, the system may use historical performance information to infer that the learner is particularly motivated based on extended time spent on problems without giving up. In that situation, pushing the envelope may be the ideal approach.
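A sketch of that feature-extraction step follows. The record format, the 300-second persistence cue, and the three-event motivation cutoff are illustrative assumptions rather than ElectronixTutor's documented values.

```python
from statistics import mean

def performance_features(records: list, topic: str, resource: str,
                         recent_n: int = 5) -> dict:
    """Derive sample performance dimensions from learning record store events.

    Each record is assumed to look like:
    {"topic": str, "resource": str, "score": float in [0, 1],
     "seconds": float, "gave_up": bool}
    """
    topic_scores = [r["score"] for r in records if r["topic"] == topic]
    resource_scores = [r["score"] for r in records if r["resource"] == resource]
    all_scores = [r["score"] for r in records]

    recent = mean(all_scores[-recent_n:]) if all_scores else 0.0
    historical = mean(all_scores) if all_scores else 0.0

    # Coarse motivation proxy: extended time on task without giving up.
    persistent = [r for r in records if r["seconds"] > 300 and not r["gave_up"]]

    return {
        "topic_mean": mean(topic_scores) if topic_scores else None,
        "resource_mean": mean(resource_scores) if resource_scores else None,
        "recent_vs_historical": recent - historical,
        "motivated": len(persistent) >= 3,  # assumed cutoff
    }
```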

This recommendation engine provides three options to the learner, from which they can select freely. These options may all come from the same learning resource or from different ones. If the engine identifies more than three appropriate items, it performs conflict resolution based on historical performance within that learning resource and topic. Low performance triggers a zone of proximal development strategy [16], wherein the candidate problem’s difficulty suggests a learner should be able to complete it successfully with some structural support. High performance triggers a “pushing the envelope” strategy, with advanced difficulty. This strategy allows the system to “catch up” to advanced students quickly. If performance is average, the recommendation engine defaults to random selection among the candidate problems. Figure 1 illustrates the process.
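A sketch of that conflict resolution logic appears below. The performance bands (below 0.4, above 0.75) and the difficulty offset are illustrative assumptions; the manuscript does not specify the actual cut points.

```python
import random

def resolve_conflicts(candidates: list, performance: float, k: int = 3,
                      low: float = 0.4, high: float = 0.75) -> list:
    """Narrow more than k candidate items down to k.

    candidates: tuples of (item_id, difficulty), difficulty in [0, 1].
    performance: historical performance in the resource/topic, in [0, 1].
    """
    if len(candidates) <= k:
        return candidates
    if performance < low:
        # Zone of proximal development: items slightly above current
        # performance, completable with some structural support.
        target = performance + 0.1
        return sorted(candidates, key=lambda c: abs(c[1] - target))[:k]
    if performance > high:
        # "Pushing the envelope": the most difficult items available.
        return sorted(candidates, key=lambda c: -c[1])[:k]
    # Average performance: random selection among the candidates.
    return random.sample(candidates, k)
```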

Fig. 1. A simplified depiction of historically derived intelligent recommendations, with sample considerations illustrating how possible items are generated.

Instructor Circumscribed.

In the case of a hybrid tutor being deployed as a classroom companion learning tool, instructors will likely want some degree of control over content ordering and availability. To accommodate that, while still leveraging the intelligent adaptivity afforded by the medium, ElectronixTutor incorporates a second recommendation engine. In this engine, instructors designate which of the topics (e.g., Ohm’s Law, parallel circuits, transformers) learners should focus on during a given time frame. This singular recommendation (as opposed to the three generated by the historically derived engine) appears under the label “Today’s Topic”.

Because of the restriction to a single topic, and further because that topic is likely new to the learner, the system has substantially fewer data points from which to calculate recommendations accurate to the individual’s needs and idiosyncrasies. The instructor may also prefer a relatively explainable procedure for determining whether learners have reached a set level of mastery on that topic. For these reasons, instructor circumscribed recommendations rely on a decision chart rather than complex calculations.

This process may vary slightly among topics (depending on the breadth of resources available), but typically proceeds as follows. First, the recommendation engine directs learners to a topic summary, with text providing an overview of the new topic, including hyperlinks to external resources (e.g., Wikipedia). Learners then progress to conversational reasoning questions. These questions provide substantial diagnosticity as they contain multi-part answers and account for the level of fluency with which the learner was able to produce each part [17]. Based on this nuanced performance evaluation, the system leads learners to advanced, remedial, or roughly equivalent problems across all learning resources. Basic adaptivity (i.e., correct responses lead to more difficult problems and vice versa) complements a bias toward variability in learning resources presented.
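Under stated assumptions about the tracked state (the key names, score bands, and resource labels below are hypothetical), the decision chart might be sketched as:

```python
def next_step(state: dict) -> str:
    """Decision-chart sketch for one instructor-designated topic.

    state keys (assumed): "summary_done" (bool), "reasoning_done" (bool),
    "last_score" (float in [0, 1] or None), "recent_resources" (list).
    """
    # Consistent topic opening: summary first, then conversational reasoning.
    if not state["summary_done"]:
        return "topic_summary"
    if not state["reasoning_done"]:
        return "conversational_reasoning"

    # Basic adaptivity: correct responses lead to harder problems,
    # and vice versa.
    score = state["last_score"]
    if score is None or 0.4 <= score <= 0.7:
        band = "equivalent"
    elif score > 0.7:
        band = "advanced"
    else:
        band = "remedial"

    # Bias toward variability: prefer a resource not seen recently.
    all_resources = {"conceptual", "mathematical", "practical"}
    unseen = all_resources - set(state["recent_resources"])
    resource = sorted(unseen)[0] if unseen else sorted(all_resources)[0]
    return f"{band}:{resource}"
```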

This process continues until two conditions have been met. First, the learner’s overall performance within the topic reaches a “mastery threshold”. This score updates with every completed item and includes weightings relative to difficulty and scope. The instructor can set the numeric value of the threshold (represented as a value between zero and one), allowing added control over learner requirements. Second, the learner must have completed items in at least three learning resources. This ensures breadth of understanding, as different learning resources have distinct focus areas and approaches (e.g., conceptual versus mathematical versus practical).
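A sketch of the two-condition completion check follows; the multiplicative difficulty-and-scope weighting is an assumption, as the manuscript does not specify the formula.

```python
def topic_complete(items: list, threshold: float = 0.8) -> bool:
    """Evaluate the two completion conditions for an instructor-set topic.

    items: completed items, each a dict with keys "score" (in [0, 1]),
    "difficulty" and "scope" (positive weights), and "resource" (name).
    threshold: instructor-chosen mastery threshold in [0, 1].
    """
    if not items:
        return False

    # Condition 1: weighted mastery score reaches the threshold.
    weights = [i["difficulty"] * i["scope"] for i in items]
    mastery = sum(i["score"] * w for i, w in zip(items, weights)) / sum(weights)

    # Condition 2: items completed in at least three learning resources.
    breadth = len({i["resource"] for i in items}) >= 3

    return mastery >= threshold and breadth
```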

Using this approach, instructors can assign homework that rewards fluency with the content rather than mere interaction with a set amount of content. Both historically derived and instructor circumscribed recommendation engines appear in the top-left portion of the screen, emphasizing their importance to the learning process (see Fig. 2). By default, both are available. Instructors may disable the historically derived option (as well as the self-directed learning option, shown below the other two options) for added control over the content. In the case of an independent learner unaffiliated with a class, the instructor circumscribed recommendations proceed through the 15 topics in order, with each successive topic unlocked by completion of the previous one.

Fig. 2. The ElectronixTutor interface with a conversational reasoning question.

2.2 Scaffolding

In addition to providing flexibility in learner use cases, the two recommendation generation methods described provide distinct opportunities to scaffold metacognitive strategies. The experience of receiving personalized recommendations acts in the traditional role of a tutor, and “serves the learner as a vicarious form of consciousness until such time as the learner is able to master his own action through his own consciousness and control” [18] (p. 24).

In historically derived recommendations, a complex combination and processing of information yields three appropriate options for next steps. As detailed above, this goes significantly beyond questions becoming harder after a successful completion. The system demonstrates how to properly balance approaches rather than bingeing on a single learning resource. Detection of frustration within problems should lead to relatively easier problems that build confidence. Ideally, this also avoids disengagement by virtue of variety. Repeated, but not exclusive, exposure to problem areas reinforces the need for persistence balanced against diversity.

The ability to select from three options emphasizes these principles by tripling the number of exposures to metacognitively aware decisions. A list of three conversational reasoning recommendations could highlight that the learner has been avoiding that resource, or that he lacks conceptual understanding. And the act of choosing creates a closer link between the artificial intelligence and the acts it is scaffolding.

In instructor circumscribed recommendations, a relatively restricted state space reduces the number of possibilities to a level that the learner may find more manageable. Consistency at the beginning of a topic (Topic Summary followed by Conversational Reasoning) demonstrates important principles in addressing content—first refresh yourself on the big picture, then check for conceptual understanding. Subsequent recommendations reinforce the importance of diversity of perspective and of holistic understanding, while progression to harder or easier content provides implicit, high-level feedback on performance. Finally, completing a topic upon reaching the mastery threshold correlates successful content fluency with a specific metacognitive status.

3 Proposed Evaluation

In theory, interacting with the prescribed recommendation engines constitutes a means for scaffolding metacognitive awareness. However, their effectiveness in this regard relies on learners understanding that intent and evaluating the proffered information accordingly. The degree to which learners fulfill this requirement, or are even aware of its existence, remains largely in doubt. Scaffolding typically assumes explicit instruction, as opposed to the experiential and implicit method described in the hybrid tutor. We conclude our discussion with a proposal of methods to evaluate the degree to which, and mechanisms by which, experiential learning provides instruction related to metacognitive awareness and strategies.

3.1 Expected Efficacy

Following the findings of Jackson, Graesser, and McNamara [7], we anticipate that learner expectations will strongly predict learning outcomes. However, these findings have not yet been replicated for metacognitive awareness and strategies—they applied to learning outcomes in the target domain. Adapting the approach appears relatively straightforward. A survey administered before participants interact with the system could capture their expectations of how adaptive instructional technologies generate recommendations.

Some difficulty arises in avoiding biasing subsequent interactions with the system. For example, asking “What factors do you believe will influence how the system generates recommendations for you?” may cause the participant to search for intention more diligently than he would have otherwise. Further, it may bias him toward the belief that the recommendations are genuinely artificially intelligent, which may not have been his default position. To avoid this, acquiring a general sense of the participant’s views on personalized adaptivity may prove less problematic. People commonly interact with applications that claim to learn their preferences (e.g., Nest smart thermostats), daily routine (e.g., Google Maps), language use (e.g., auto-complete text generation), etc. A broad trust in (or frustration with) these technologies likely correlates highly with the expected efficacy of intelligent recommendations in learning technology. Complementing this subtlety is the structure of the hybrid tutor itself, where the focus lies primarily on the learning content, not on the elements of progression through it.

This aspect of the evaluation has two benefits. First, it expands the research on the link between learner expectations and learning outcomes to include metacognitive learning through experiential instruction. Second, it may provide valuable context for the mechanisms by which intelligent recommendations lead to (or fail to lead to) metacognitive benefits with respect to learning. If the effect does extend to this application but metacognitive learning outcomes are hindered by inaccurate expectations, the intervention to remedy the situation becomes clearer. Effort spent in optimizing the recommender engines may be more efficiently deployed in conveying their capabilities to the learners.

3.2 Perceived Efficacy

A measure of perceived efficacy would complement the expected efficacy survey. Following completion of the testing, a survey would measure perception of the hybrid tutor’s intelligence and the appropriateness of its adaptivity. This would provide a direct measure of the match between the system’s actual capabilities and its perceived capabilities, which directly impacts our understanding of experiential instruction of metacognitive strategies. If learners did not notice any reason for the recommendations, then they are unlikely to have absorbed the lessons implicit within them.

Comparison to the previous survey could also demonstrate the effect of bias on perceptions of intelligent adaptivity. This could potentially have far-reaching implications for any kind of artificially intelligent adaptivity. Trust in automation constitutes a large research field with immediate economic concerns such as the public’s willingness to cede control to self-driving cars. Improvements to the automation systems themselves may be tempered by bias that negates or ignores tangible advances.

3.3 Actual Efficacy

Finally, we come to a direct examination of the potential for experiential instruction of metacognitive strategies through intelligent recommendations. This will require a degree of deception. As stated, instructing these strategies requires learners to understand that recommendations are adaptive to the learners, based on past performance and individual characteristics. A proper control group requires the absence of that understanding.

The participant population should be divided randomly in half. One half should proceed normally through the hybrid tutor for enough time to encounter as many iterations of the recommender(s) as is feasible. The other half should follow the same procedure, except under the impression that the order of learning resources and items is predetermined. The experimenter should show them a checklist of items through which they will proceed one at a time. Almost certainly, the participant will glance at this (preferably inscrutable) list and then disregard the specifics. From there, the experimenter simply pretends to read from the checklist while in fact instructing the participant to continue using intelligently generated recommendations. Further, the label “Recommended for you” should be altered to “Random practice” to avoid implications of intelligence.

During or after the task, one or more established methods of metacognitive assessment [19] can provide empirical validation of the approach. Think-aloud protocols and prompted reflection are both common, though both raise some concern that they interrupt the learning process [20]. Alternatively, following completion of the task, all participants could take a survey. This should include self-assessment of their mastery of the topics encountered (which can be compared to calculated values), impressions of the relative value of the learning resources (e.g., conceptually oriented versus mathematically rigorous), and self-assessment of their mastery of each of the learning resources (again, compared to calculated values). This between-participants experimental approach could rigorously test the impact of experiential instruction on metacognitive strategy learning.

3.4 Conclusion

Deficiencies in metacognitive strategies and awareness make it unlikely that learners will select content appropriately without the benefit of expert supervision. Because of this, intelligent recommendations provide an invaluable service to learners in adaptive instructional systems. Beyond substituting for experts, those recommendations may provide meta-instruction by scaffolding understanding of the mechanisms at play in deciding the best way forward.

Principles of experiential learning and scaffolding of instruction suggest that this may be the case. Hybrid tutors provide a viable testing environment, with differential methods of intelligent recommendation helping to ensure sufficient breadth to generalize any findings. Evaluating the extent to which and mechanisms by which this proves effective could have far-reaching impacts. Comparisons among the expected, perceived, and actual efficacy of intelligent recommendations can inform learning science, trust in automation, and adaptive instructional system design principles.