Abstract
Advances in computer technology have made it possible to reliably identify and classify human emotions. While this technology has matured, only a few applications have appeared that use it in advanced learning environments. Imagine an Adaptive Instructional System (AIS) application, such as an Intelligent Tutoring System (ITS), that could recognize when a learner is sad, disengaged, or in any other emotional state worth tracking. Such a capability has great potential for improving an AIS's feedback and interventions. Our team has developed a fast, easy-to-use emotion-tracking system for an internet-based ITS. An ITS can incorporate this feature by being aware of content, context, and the user's emotional state. Many mobile apps already track these three forms of awareness by utilizing available sensors such as audio, video, and inertial measurement unit (IMU) sensors. Learning applications would be more effective if integrated with these three awarenesses (3A) [3].
Keywords
- Adaptive instructional system
- Intelligent tutoring system
- Content, context, and user awareness
- Learning environment
1 Introduction
Off-the-shelf technologies [4, 5, 11] already exist to recognize generic emotions from face images. These generic emotions would be an excellent source of valuable information for learning scientists if an efficient translation technique existed (e.g., to smooth out the multidimensional emotion vector and transform it into learning-related emotions). Rather than focus on the technology of emotion recognition, in this paper we believe it is more fruitful to focus on the methodological issues that demand consideration when deciding how to maximize the utility of the available technologies in making AIS applications more efficient and effective. We believe 3A enabled AIS applications, such as ITSs, would greatly advance research, development, and evaluation in the learning sciences. The ITS is a typical AIS application.
Enabling Adaptive Instructional System Applications (AISA), including ITSs, with 3A capability makes these applications more similar to human tutors than applications without it. The existing framework [14, 16, 17] assumes that we consider dynamic learning resources, like ITSs, to have a "psychology" similar to that of humans. We believe the 3A capability is essential for making such applications resemble human tutors. In this paper, we address several research questions involved in making AISA 3A enabled. Questions such as how to obtain and aggregate the history of a user's emotions, or how to collect content and context correlated with those emotions, are exceedingly important to address. It is important to note that because ITSs provide individualized content, the awareness aspect must also be individualized. Individualized 3As are possible through distributed applications, which are best built with modern browser-based technologies. This research explores possible solutions to the following research questions:
1. How to smooth out the generic emotional learning history
2. How to aggregate the content, context, and learner emotions
3. How to determine whether a function exists to map generic emotions to learning-related emotions
Our pilot implementation uses AWS Rekognition, which provides an eight-dimensional emotion vector with confidence scores derived from a facial expression. At a certain interval we obtain those vectors and store them into a learning record store (LRS) using xAPI. The intelligent tutor keeps track of the content delivered and detects context from the sequence of delivered content. The main task is to apply mathematical techniques to aggregate this information and find a transformation function.
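As a sketch of the first step, the following shows how the eight-dimensional vector might be extracted from a Rekognition `detect_faces` response. The response fragment below is illustrative, following the shape documented in the Rekognition developer guide; an actual deployment would obtain it from the AWS API.

```python
# Sketch: extract the eight-dimensional emotion vector from an AWS
# Rekognition detect_faces response. The sample response is illustrative.

EMOTIONS = ["CALM", "HAPPY", "FEAR", "SURPRISED",
            "DISGUSTED", "CONFUSED", "SAD", "ANGRY"]

def emotion_vector(response):
    """Return the fixed-order confidence vector for the first detected face."""
    faces = response.get("FaceDetails", [])
    if not faces:
        return None
    scores = {e["Type"]: e["Confidence"] for e in faces[0]["Emotions"]}
    # A fixed ordering keeps vectors comparable across samples.
    return [scores.get(name, 0.0) for name in EMOTIONS]

sample = {"FaceDetails": [{"Emotions": [
    {"Type": "HAPPY", "Confidence": 83.2},
    {"Type": "CALM", "Confidence": 9.1},
    {"Type": "SAD", "Confidence": 1.4},
]}]}

print(emotion_vector(sample))  # [9.1, 83.2, 0.0, 0.0, 0.0, 0.0, 1.4, 0.0]
```

Each such vector, together with a timestamp and the content identifier, is what the tutor would record in the LRS.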
One way to smooth the emotion history is to dynamically adjust the means and standard deviations, along with the confidence scores, in the LRS. The content and context information should be encoded for proper aggregation. In addition, it is important to retrieve the information rapidly so that an ITS can make effective use of it (e.g., provide feedback to the learner in real time).
While the potential for, and effectiveness of, emotionally sensitive feedback in ITSs has been explored in previous research, AISAs that can aggregate emotion, context, and content data beyond the span of a single learning session have been less explored, and many questions remain unanswered. The enabling technologies already exist; our goal has therefore been to find a 3A enabled engineering solution for ITSs that is portable, cheap, and scalable [3].
Our paper for this conference focuses primarily on technological aspects of the 3A enabled AIS; however, other questions will be explored to help make readers fully aware of the scope of issues in need of consideration:
- Theoretical & Methodological: How would a 3A enabled AIS impact learners psychologically? Given that a 3A enabled AIS will model human tutors more realistically, how much will this add to the learning experience? What are the potential benefits, as well as the potential negative impacts?
- Technological: Although we have achieved fast storage and retrieval of emotion history from an LRS, we found it necessary to define our own xAPI statements for the desired behaviors. How can we standardize these statements so that all 3A enabled ITSs can contribute to an individual emotional history that is context-sensitive, domain-specific, and individualized?
- Implementational: Our pilot system is conversational [20]; can this approach be implemented in AISs that are not conversational?
The technology has made 3A possible; our effort makes 3A in AIS easy. We expect to make our work open-source so that all AISs may benefit.
2 Related Literature
Adaptive instruction refers to educational activities that accommodate individual students and their unique behaviors so that each student can develop the necessary knowledge and skills [6, 22]. Currently, many remarkable ITSs are available for teaching STEM subjects as well as computer technologies [12, 23]. ITSs are moving towards adaptive instruction to accommodate exceptional learners [13]. Additionally, there are ongoing efforts and research toward defining AIS standards [29]. 3A has huge potential in AISA [3].
3 Smoothing Function for Confidence of Emotion
The AWS Rekognition API provides a mechanism to analyze face image details, including emotion. It provides eight emotions with percent confidence scores obtained from a face image: CALM, HAPPY, FEAR, SURPRISED, DISGUSTED, CONFUSED, SAD, and ANGRY. The confidence scores for these emotions are not independent; for example, with a higher confidence for HAPPY, the confidence value for SAD is lower. This is logical because, visually, a person is rarely HAPPY and SAD at the same time. There are exceptional mixed-emotion states in which a person may look SAD though he or she is HAPPY. This exception is identifiable by a human, given a cultural context, but not by a machine. The AWS Rekognition documentation notes that the API makes "a determination of the physical appearance of a person's face", not of the person's "internal emotional state" [4].
We need to retain the emotion vector that emerges during a learner's interaction with an AIS, and that requires preprocessing. Each emotion vector is available at a certain interval (e.g., 3 s). We cannot simply consider only the highest confidence score of the emotion vector; in that case we lose essential emotion information that contributes to individualization. Specifically, it is important to establish an individualized baseline. For example, a person's face may always look happy, and an individualized emotion profile can be constructed by looking at the other emotion scores as well. In fact, in some cases the machine will incorrectly assign the highest score to an emotion which is not evident. To solve this problem we maintain the sample mean and sample variance, given by Eq. 1 and Eq. 2 respectively:

\(\bar{x}_n = \frac{1}{n}\sum_{i=1}^{n} x_i\)  (1)

\(s_n^2 = \frac{1}{n-1}\sum_{i=1}^{n}\left(x_i - \bar{x}_n\right)^2\)  (2)
The easiest way to compute the sample variance is to accumulate the sum of the x's and the sum of the squares of the x's. However, direct evaluation of this formula suffers from loss of precision. The best algorithm proposed thus far is by [31], which is discussed in [15]. Because we take the confidence values at a certain interval, the data must also be smoothed. Initially, we accomplished this smoothing by averaging three consecutive data points. With this method, however, the resulting signal is piecewise differentiable but not fully differentiable. One solution is to apply a moving average several times in order to improve the behavior of the function, but a single-step calculation is preferred here. For that reason the triangular weighted moving average in Eq. 3, or the perfectly smoothed Gaussian filter in Eq. 4, is most often used:

\(y_t = \frac{\sum_{i=-k}^{k}\left(k+1-|i|\right) x_{t+i}}{\sum_{i=-k}^{k}\left(k+1-|i|\right)}\)  (3)

\(y(t) = \sum_{i} g(t - t_i)\, x_i, \qquad g(t) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-t^2/(2\sigma^2)}\)  (4)

where g is the Gaussian probability density function, t is the time, x is a data point, and \(\sigma \) is the standard deviation.
The effect of smoothing is shown in Fig. 1.
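The computations described above can be sketched as follows, with Welford's update [31] maintaining the running mean and sample variance (Eq. 2) without the precision loss of the naive sum-of-squares formula, and a simple Gaussian filter (Eq. 4) for smoothing. The window radius and \(\sigma\) are illustrative choices.

```python
import math

# Sketch of the streaming statistics and smoothing described above.

class RunningStats:
    """Welford's online algorithm for mean and sample variance."""
    def __init__(self):
        self.n, self.mean, self.m2 = 0, 0.0, 0.0

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    @property
    def variance(self):  # sample variance (Eq. 2)
        return self.m2 / (self.n - 1) if self.n > 1 else 0.0

def gaussian_smooth(xs, sigma=1.0, radius=3):
    """Gaussian-filter a confidence series (Eq. 4), clamping at the edges."""
    g = [math.exp(-(i * i) / (2 * sigma * sigma))
         for i in range(-radius, radius + 1)]
    norm = sum(g)
    out = []
    for t in range(len(xs)):
        acc = 0.0
        for i, w in zip(range(-radius, radius + 1), g):
            acc += w * xs[min(max(t + i, 0), len(xs) - 1)]
        out.append(acc / norm)
    return out

stats = RunningStats()
for v in [10.0, 12.0, 11.0, 13.0]:
    stats.update(v)
print(stats.mean, stats.variance)  # 11.5 and ~1.667
```

Each new confidence score updates the running statistics in constant time, which is what allows the LRS record to stay small.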
It is critical to know whether we can treat all of the emotions in a similar manner. In other words, can we use the same smoothing function for the confidence values of all the emotions? According to Paul Ekman, there are six basic emotions: happiness, sadness, disgust, fear, surprise, and anger [8]. A 2017 study identified 27 distinct categories of emotion, far more complex than the previously identified basic emotions [7]. According to that study, people experience these emotions along continuous gradients rather than as clearly separated instances. In our study, on a few occasions HAPPY and SAD were observed at the same time, although that seems impossible. According to Robert Plutchik's "wheel of emotions", the basic emotions can be combined to form complex emotions. These properties and complexities led us to analyze the correlations among the eight emotions. If any correlation is found, then we have to find either an orthogonal function or separate functions to treat the emotions individually. A correlation matrix, along with statistical significance (indicated by '*'), is presented in Table 1.
Table 1 shows that some of the emotions are positively correlated and others negatively correlated. For example, HAPPY, FEAR, and SURPRISED are negatively correlated with CALM. On the other hand, CONFUSED is positively correlated with FEAR, SURPRISED, and DISGUSTED. Moreover, most of these correlations are statistically significant (i.e., p-values less than 0.05).
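The correlation analysis behind Table 1 can be sketched as follows. The two series below are hypothetical confidence scores sampled every 3 s; a real analysis would use `scipy.stats.pearsonr`, which also supplies the significance values marked '*' in the table.

```python
# Sketch: pairwise Pearson correlation between two emotion confidence
# series, as computed for each cell of Table 1.

def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Hypothetical confidence scores: HAPPY falling while CALM rises
happy = [80.0, 75.0, 60.0, 50.0, 40.0]
calm  = [10.0, 15.0, 25.0, 35.0, 45.0]
print(round(pearson(happy, calm), 3))  # close to -1
```

A negative value here would correspond to the HAPPY-CALM pattern described above.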
4 xAPI Learning Record Store for Emotion History
At this stage we have a smoothed function (e.g., Gf(x)) which provides data points along a time axis. The sample mean and variance give a history of a user's emotions while interacting with AISs. Whenever the z-value at a particular instant exceeds a threshold (e.g., the 95% confidence level), the emotion is considered present. For example, at a given instant a user may have CALM, SAD, and SURPRISED z-scores above the threshold, which means those emotions are present. A sample instance of data from the LRS is shown in Fig. 2. Two instances of emotions are presented in Table 2, where the strongly emerging emotions can be inferred from the z-scores (e.g., DISGUSTED and ANGRY at \(n=62\); ANGRY at \(n=63\)).
xAPI [1] provides a fast and secure platform for storing, retrieving, and distributing learning data. In this work, the media capabilities of HTML5 [2] are adopted through JavaScript. This allows the technology to be distributed for individualization and to run quickly and efficiently. It can also be simply plugged into any AISA (e.g., AutoTutor, GIFT) [28, 30] with JavaScript or HTML5 media capability. A simplified working principle is shown in Fig. 3. When a user logs into the system and starts emotion recognition, the system pulls the previously saved quantities. Six quantities are saved for each emotion in the xAPI data store: type, sum of squares, sum of the confidence scores, cumulative sum, total number of instances, and the z-score. At a certain interval, these six quantities provide a baseline for an individualized emotion profile, given that the data are normally distributed. We performed a normality test, and all the emotion scores were in fact found to be normally distributed. Most importantly, with this approach the learner's privacy is not violated because no facial images are saved in the data store, only the raw numerical output.
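A sketch of how the six stored quantities might be maintained and the z-score computed from them. The field names here are illustrative, not a standardized xAPI vocabulary, and the scores are hypothetical.

```python
# Sketch of the per-emotion record kept in the LRS. These six quantities
# let a client rebuild the individualized baseline from only the most
# recent xAPI statement, without storing raw history or face images.

def update_record(record, confidence):
    """Fold one new confidence score into a stored emotion record."""
    rec = dict(record)
    rec["n"] += 1                        # total number of instances
    rec["sum"] += confidence             # sum of the confidence scores
    rec["sum_sq"] += confidence ** 2     # sum of squares
    rec["cum_sum"] = rec["sum"]          # cumulative (running) sum
    n, s, ss = rec["n"], rec["sum"], rec["sum_sq"]
    mean = s / n
    var = (ss - n * mean ** 2) / (n - 1) if n > 1 else 0.0
    sd = var ** 0.5
    rec["z"] = (confidence - mean) / sd if sd > 0 else 0.0
    return rec

record = {"type": "HAPPY", "n": 0, "sum": 0.0, "sum_sq": 0.0,
          "cum_sum": 0.0, "z": 0.0}
for score in [20.0, 22.0, 21.0, 60.0]:   # a sudden spike in HAPPY
    record = update_record(record, score)
print(record["z"] > 1.0)  # True: the spike stands out against the baseline
```

In a deployment, this record would travel inside an xAPI statement's result or extension fields, so only one statement need be fetched at login.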
A human's emotional expressions while explaining, recalling, reasoning, or applying follow certain patterns that another human can easily comprehend. A typical example: when a person says "let me think" with an accompanying expression, we understand from that expression that the person is thinking. We can also easily tell when someone is pretending to know something without actually knowing it. With a large enough sample size, the more ambiguous emotional cues can be captured and usefully applied in a system. The subtleties of human emotion states and their associated expressions can eventually make their way into an ITS and be modeled by animated agents.
5 Analysis of Emotions
Some of the emotion states are significantly negatively correlated. Thus, it is important to determine whether learners show similar behavior given the same content but in different learning instances. In other words, if a learner studied "algebra" yesterday and is studying the same "algebra" topic today, should the distribution of emotions be the same? Here the null hypothesis is \(H_0\) = "the learner shows similar emotions", with the alternative hypothesis \(H_1\) = "the learner shows different emotions". To simplify, we assume there is no external effect (e.g., no covariate) and that the confidence values are continuous. In this case we can apply a one-way MANOVA [21], with the different instances as the independent variable and all eight emotions as dependent variables. The Wilks' lambda p-value then indicates whether the learner is in a similar emotional mode. The observed p-value was less than 0.001, and the null hypothesis was rejected. A drawback is that we did not consider external variables, because at this stage we do not have enough data. In addition, we treated the emotion values as continuous, which is strictly speaking not true.
Another way of looking at each emotion individually is provided by the Wilcoxon signed-rank test [32]. The key assumptions are that the data come from the same population, each pair is chosen randomly and independently, and the data are measured on at least an interval scale. The emotions are measured every three seconds as percentage scores, which we can treat as ranks. One advantage is that we do not need to know the distribution of the emotions. Running Wilcoxon tests on two instances with 150 data points each shows (Table 3) that the learner was similarly confused in both instances but differed in happiness.
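A minimal sketch of the Wilcoxon signed-rank statistic for paired confidence scores from two instances. The scores shown are hypothetical, and production code would use `scipy.stats.wilcoxon`, which also returns the p-value reported in Table 3.

```python
# Sketch of the Wilcoxon signed-rank statistic [32]. Zero differences are
# dropped and ties receive average ranks, following the usual procedure.

def wilcoxon_w(xs, ys):
    diffs = [x - y for x, y in zip(xs, ys) if x != y]
    ranked = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(ranked):
        j = i
        while (j + 1 < len(ranked) and
               abs(diffs[ranked[j + 1]]) == abs(diffs[ranked[i]])):
            j += 1
        avg = (i + j) / 2 + 1          # average rank for the tied run
        for k in range(i, j + 1):
            ranks[ranked[k]] = avg
        i = j + 1
    w_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
    w_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)
    return min(w_plus, w_minus)        # test statistic W

day1 = [42.0, 55.0, 48.0, 60.0, 39.0]  # hypothetical CONFUSED scores
day2 = [40.0, 57.0, 44.0, 52.0, 39.0]
print(wilcoxon_w(day1, day2))  # 1.5
```

A small W relative to its null distribution indicates a systematic shift between the two instances.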
6 Future Direction
We believe that AISA enhanced with 3A would dramatically improve the efficacy of such applications. One potential future application is to build 3A enabled AISA for STEM education. We believe a 3A enabled STEM AISA would maximize the utility of existing frameworks, such as the TIMSS (Trends in International Mathematics and Science Study) assessment frameworks. TIMSS describes the content and cognitive domains to be tested in mathematics and science at the fourth and eighth grades [19]. The cognitive domains are knowing, applying, and reasoning. According to TIMSS [19], the cognitive domains cover "the competencies of problem solving", "providing a mathematical argument", "representing mathematically", "creating models of a problem", and "utilizing tools (e.g., ruler or calculator)". The framework defines the procedures in each cognitive domain, listed in Table 4.
In the previous section we looked at the emotion data, where we showed that it is possible to aggregate the data and predict whether a learner is showing an unusual emotion for a given content and context. Our vision is to look beyond this emotion data at a more granular level; specifically, at the processes of the cognitive domains. Here we assume that a learner cannot hide his or her emotion during the learning process. TIMSS contains well-defined mathematics and science questions that require a learner to practice certain procedures. If we repeatedly present different content (e.g., questions related to cognitive domains and procedures) for the same domain and capture emotions, there is a higher possibility of finding a mapping between cognitive procedures and emotions. For example, some learners may not like computation and show DISGUSTED whenever content requiring computation is presented, while other learners may like computation and show a HAPPY emotion. It is possible to analyze this interesting cognitive-procedural emotion mapping from large amounts of data. The 3A framework in AIS can be a cheap, distributed, and efficient platform for collecting what is needed.
A number of theories pertain to psychological models of emotion. These theories try to understand and explain emotions along different dimensions of psychological states or processes. Although controversies and open research issues exist with respect to the different models of emotion, our goal is to find a productive use of data-driven emotional measurement in the learning industry. For example, psychologists have explained the semantic structure of affect and emotion in a two-dimensional plane [9, 25] and presented mean individual and reference circumplex ratings of emotions [10, 24, 26] (Fig. 4). One may view these as a coding, transformation, or reduction of an original master emotion model which is quite complex and not adequately understood [27].
By mapping the eight-dimensional vector (e.g., obtained from the AWS Rekognition API) into the valence-arousal semantic space, we can obtain a quantitative aggregated measurement. By empirically evaluating these measurements we can try to identify the underlying actual coding of the emotion. This will help to identify whether a learner is "really" learning, or exactly where a learner is having difficulty. For example, if a learner dislikes computation, then 3A data analysis will help pinpoint that dislike and improve not only the content, but its design as well.
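A sketch of such a mapping as a confidence-weighted average over circumplex positions. The (valence, arousal) coordinates assigned to each emotion below are illustrative placements in the unit square, not values taken from the cited studies, and the input scores are hypothetical.

```python
# Sketch: project the eight Rekognition emotion scores onto the
# valence-arousal plane of the circumplex model [9, 25, 26].

CIRCUMPLEX = {  # emotion: (valence, arousal), assumed coordinates
    "HAPPY":     (0.9,  0.4),
    "CALM":      (0.6, -0.7),
    "SURPRISED": (0.2,  0.9),
    "FEAR":      (-0.6, 0.7),
    "ANGRY":     (-0.7, 0.6),
    "DISGUSTED": (-0.8, 0.3),
    "SAD":       (-0.7, -0.5),
    "CONFUSED":  (-0.3, 0.2),
}

def valence_arousal(scores):
    """Confidence-weighted average position in the circumplex plane."""
    total = sum(scores.values())
    if total == 0:
        return (0.0, 0.0)
    v = sum(c * CIRCUMPLEX[e][0] for e, c in scores.items()) / total
    a = sum(c * CIRCUMPLEX[e][1] for e, c in scores.items()) / total
    return (v, a)

scores = {"HAPPY": 80.0, "CALM": 15.0, "SAD": 5.0}
v, a = valence_arousal(scores)
print(round(v, 2), round(a, 2))  # positive valence, mild positive arousal
```

Tracking this two-dimensional point over a session gives the aggregated measurement described above, independent of the particular eight-way coding of the recognition server.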
7 Conclusion
In this paper, we presented an application of emotion recognition technology to enable any AISA with 3A. We focused on solving methodological issues such as building individualized distributions for an emotion profile. By doing this, we are able to store a minimal emotion vector in each behavior statement (xAPI) and rebuild the emotion distributions from only the last record in the LRS. We also explored ways one might "smooth" the distributions by utilizing different types of averaging algorithms. Additionally, we presented statistical analyses to aggregate the functions and to detect differences across learning instances. This inexpensive, easy-to-integrate, distributed, and individualized 3A framework has enormous potential for developing the next generation of AISA. In particular, monitoring the problem-solving process according to the TIMSS framework using 3A will open up new research directions for learning scientists. The contributions of this paper remain in the methodological domain, and the approaches are therefore not limited by the choice of emotion coding of the processing servers (in this case the AWS Rekognition API). A prototype of the AISA was made for this paper and is available at http://tiny.cc/HCII2020Demo.
References
DoDI 1322.26 xAPI reference. https://adlnet.gov/policy/dodi-xapi/. Accessed 14 Jun 2020
HTMLMediaElement. https://developer.mozilla.org/en-US/docs/Web/API/HTMLMediaElement. Accessed 14 Jun 2020
Ahmed, F., Shubek, K., Hu, X.: Towards a GIFT enabled 3A learning environment. In: Goldberg, B.S. (ed.) Proceedings of the Eighth Annual GIFT Users Symposium (GIFTSym8), pp. 87–92, May 2020
Amazon Web Services: Amazon rekognition developer guide. https://docs.aws.amazon.com/rekognition/latest/dg/rekognition-dg.pdf. Accessed 30 May 2020
API, M.A.F.: Facial recognition software|microsoft azure. https://azure.microsoft.com/en-us/services/cognitive-services/face/. Accessed 30 May 2020
Atkinson, R.C.: Adaptive instructional systems: Some attempts to optimize the learning process. Stanford University, Institute for Mathematical Studies in the Social Sciences (1974)
Cowen, A.S., Keltner, D.: Self-report captures 27 distinct categories of emotion bridged by continuous gradients. Proc. Natl. Acad. Sci. 114(38), E7900–E7909 (2017)
Ekman, P.: Basic emotions. In: Handbook of Cognition and Emotion, vol. 98, no. 45–60, p. 16 (1999)
Feldman Barrett, L., Russell, J.A.: Independence and bipolarity in the structure of current affect. J. Pers. Soc. Psychol. 74(4), 967 (1998)
Gerber, A.J., et al.: An affective circumplex model of neural systems subserving valence, arousal, and cognitive overlay during the appraisal of emotional faces. Neuropsychologia 46(8), 2129–2139 (2008)
Google Cloud Vision AI: Vision AI|derive image insights via ML|cloud vision API. https://cloud.google.com/vision. Accessed 30 May 2020
Graesser, A.C., et al.: Electronixtutor: an intelligent tutoring system with multiple learning resources for electronics. Int. J. STEM Educ. 5(1), 15 (2018)
Hallahan, D.P., Pullen, P.C., Kauffman, J.M., Badar, J.: Exceptional learners. In: Oxford Research Encyclopedia of Education (2020)
Hu, X., Tong, R., Cai, Z., Cockroft, J.L., Kim, J.W.: Self-Improvable Adaptive Instructional Systems (SIAISS): A Proposed Model. Design Recommendations for Intelligent Tutoring Systems, p. 11 (2019)
Knuth, D.E.: The Art of Computer Programming, 3rd edn. Addison Wesley, Boston (1998)
Kuo, B.C., Hu, X.: Intelligent learning environments (2019)
Long, Z., Andrasik, F., Liu, K., Hu, X.: Self-improvable, self-improving, and self-improvability adaptive instructional system. In: Pinkwart, N., Liu, S. (eds.) Artificial Intelligence Supported Educational Technologies. AALT, pp. 77–91. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-41099-5_5
Mullis, I.V., Martin, M.O.: TIMSS 2019 Assessment Frameworks. ERIC (2017)
Mullis, I., Martin, M., Goh, S., Cotter, K.: TIMSS 2015 Encyclopedia: Education Policy and Curriculum in Mathematics and Science. Boston College, TIMSS & PIRLS International Study Center website (2016)
Nye, B.D., Graesser, A.C., Hu, X.: AutoTutor and family: a review of 17 years of natural language tutoring. Int. J. Artif. Intell. Educ. 24(4), 427–469 (2014)
O’Brien, R.G., Kaiser, M.K.: Manova method for analyzing repeated measures designs: an extensive primer. Psychol. Bull. 97(2), 316 (1985)
Park, O.C., Lee, J.: Adaptive instructional systems (2004)
Perez, R.S., Skinner, A., Sottilare, R.A.: A review of intelligent tutoring systems for science, technology, engineering and mathematics (STEM). Assessment of Intelligent Tutoring Systems Technologies and Opportunities, p. 1 (2018)
Posner, J., Russell, J.A., Peterson, B.S.: The circumplex model of affect: an integrative approach to affective neuroscience, cognitive development, and psychopathology. Dev. Psychopathol. 17(3), 715–734 (2005)
Russell, J.A.: A circumplex model of affect. J. Pers. Soc. Psychol. 39(6), 1161 (1980)
Russell, J.A., Bullock, M.: Multidimensional scaling of emotional facial expressions: similarity from preschoolers to adults. J. Pers. Soc. Psychol. 48(5), 1290 (1985)
Scherer, K.R., et al.: Psychological models of emotion. Neuropsychol. Emot. 137(3), 137–162 (2000)
Sottilare, R.A., Graesser, A., Hu, X., Holden, H.: Design recommendations for intelligent tutoring systems: Volume 1-learner modeling (2013)
Sottilare, R., Brawner, K.: Component interaction within the generalized intelligent framework for tutoring (GIFT) as a model for adaptive instructional system standards. In: The Adaptive Instructional System (AIS) Standards Workshop of the 14th International Conference of the Intelligent Tutoring Systems (ITS) Conference, Montreal, Quebec, Canada (2018)
Sottilare, R.A., Brawner, K.W., Sinatra, A.M., Johnston, J.H.: An updated concept for a generalized intelligent framework for tutoring (GIFT). GIFTtutoring. org (2017)
Welford, B.: Note on a method for calculating corrected sums of squares and products. Technometrics 4(3), 419–420 (1962)
Wilcoxon, F.: Individual comparisons by ranking methods. In: Breakthroughs in Statistics, pp. 196–202. Springer (1992). https://doi.org/10.1007/978-1-4612-4380-9_16
© 2020 Springer Nature Switzerland AG
Ahmed, F., Shubeck, K., Andrasik, F., Hu, X. (2020). Enable 3A in AIS. In: Stephanidis, C., et al. HCI International 2020 – Late Breaking Papers: Cognition, Learning and Games. HCII 2020. Lecture Notes in Computer Science(), vol 12425. Springer, Cham. https://doi.org/10.1007/978-3-030-60128-7_38
Print ISBN: 978-3-030-60127-0
Online ISBN: 978-3-030-60128-7