Software Usability Testing Using EEG-Based Emotion Detection and Deep Learning
Figure 1. Proposed framework for usability testing with emotion recognition [12].
Figure 2. Architecture of a recurrent neural network.
Figure 3. Classification of emotions based on the VAD model using an RNN classifier.
Figure 4. Initial electrodes used in this research, as in the DEAP dataset.
Figure 5. ROC curves for all dimensions in the classification process using RNN. (a) Valence: negative level; (b) Valence: positive level; (c) Arousal: passive level; (d) Arousal: active level; (e) Dominance: low control level; (f) Dominance: high control level.
Figure 6. Example of testing framework. (a) Main screen of the prototype framework. (b) Instruction screen. (c) Module testing screen. (d) Continuous mode in usability testing experiment.
Figure 7. EEG recording segments in discrete mode.
Abstract
1. Introduction
2. Methods
2.1. Preparation
2.2. Training Phase
2.2.1. Emotion Recognition Training
2.2.2. Continuous Performance Test
2.3. Usability Testing Phase
- First-impression testing: Because the CPT was measured for each subject beforehand, the EEG recorded over that period is analyzed later to determine the user’s first impression, both of the website at the beginning of the session and of each task.
- The remainder of the EEG recording is used for emotion recognition on the particular task.
2.3.1. EEG Signal Processing Chain
Pre-Processing EEG Signals
Feature Extraction
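The features used by the classifiers under Classification are ERD/ERS values. As a minimal sketch, assuming the classical band-power definition associated with Pfurtscheller [22] (power in an activity interval relative to a reference interval, in percent), the value for one electrode could be computed as follows. The function names, the sign convention (negative for ERD, positive for ERS), and the sample values are illustrative assumptions rather than the authors' implementation, and the segments are assumed to be band-pass filtered already.

```python
def band_power(samples):
    """Mean squared amplitude of an already band-pass-filtered segment."""
    if not samples:
        raise ValueError("segment is empty")
    return sum(x * x for x in samples) / len(samples)


def erd_ers_percent(reference, activity):
    """Relative power change of `activity` against `reference`, in percent.

    Negative values indicate desynchronization (ERD, power decrease);
    positive values indicate synchronization (ERS, power increase).
    """
    ref_power = band_power(reference)
    if ref_power == 0:
        raise ValueError("reference power is zero")
    return (band_power(activity) - ref_power) / ref_power * 100.0


# Toy segments (hypothetical values): the activity segment has half the amplitude,
# so its power drops from 4.0 to 1.0.
reference = [2.0, -2.0, 2.0, -2.0]
activity = [1.0, -1.0, 1.0, -1.0]
feature = erd_ers_percent(reference, activity)
```

One such value per significant electrode would then form the feature vector passed to a classifier.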
Classification
- Classification Scheme Using RNN: Emotion classification on EEG data from the DEAP dataset [17] was performed using an RNN classifier. The classification process proceeded through the following two levels:
- ‘One-vs.-all’ level: A trial t is input into two classifiers of the ‘one-vs.-all’ type. The first is a high-level classifier (positive in valence, active in arousal, and low in dominance). The second is a low-level classifier (negative in valence, passive in arousal, and high in dominance). The feature vector for each classifier comprises the event-related desynchronization (ERD) and event-related synchronization (ERS) from the significant electrodes of the corresponding level. Figure 3 illustrates the structure of the classification process.
- Neural Network Structure: The classifier is an RNN trained with the Levenberg-Marquardt (LM) algorithm, using Jacobian matrix calculation and real-time recurrent learning. It uses gradient descent to find the weight and bias values that minimize the cost (error) function. The general structure of an RNN is illustrated in Figure 2. The specific configuration was determined through a series of experiments and a review of research on what constitutes an optimal structure. The highest accuracy was obtained with the following configuration:
- The input layer has one node per selected significant electrode. The size of the feature vector therefore differs from subject to subject, depending on the subject’s brain activity.
- There are 80 hidden layers that receive the feature vector components. Each is an MLP layer with a recurrent loop and delay factors, contains 80 neurons, and applies the tanh activation function.
- The proposed classification system is nearly static. The delay between inputs is set to a naive value so that the network does not focus on dependencies in emotion between trials, because the DEAP dataset used independent videos in the experiment. Setting a naive delay value still allows the RNN to be used for training on the data.
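To make the recurrent structure described above more concrete, here is a minimal sketch of a forward pass through a single tanh recurrent layer with 80 units and one output, with weights and biases drawn from [−0.5, +0.5]. It is deliberately simplified: it uses one recurrent layer rather than the full stack, omits LM training and the delay factor entirely, and the class name, seed, and input values are hypothetical.

```python
import math
import random


def init_matrix(rows, cols, rng):
    """Weights drawn uniformly from [-0.5, +0.5]."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


class TinyRecurrentLayer:
    """One tanh recurrent layer feeding one linear output (illustrative only)."""

    def __init__(self, n_inputs, n_hidden=80, seed=0):
        rng = random.Random(seed)
        self.w_in = init_matrix(n_hidden, n_inputs, rng)    # input -> hidden
        self.w_rec = init_matrix(n_hidden, n_hidden, rng)   # hidden(t-1) -> hidden(t)
        self.b_hidden = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b_out = rng.uniform(-0.5, 0.5)
        self.state = [0.0] * n_hidden  # hidden activations from the previous step

    def step(self, features):
        """Consume one feature vector, update the state, return a scalar output."""
        new_state = []
        for i in range(len(self.state)):
            total = self.b_hidden[i]
            total += sum(w * x for w, x in zip(self.w_in[i], features))
            total += sum(w * h for w, h in zip(self.w_rec[i], self.state))
            new_state.append(math.tanh(total))
        self.state = new_state
        return sum(w * h for w, h in zip(self.w_out, new_state)) + self.b_out


# Hypothetical trial: ERD/ERS values from four significant electrodes.
net = TinyRecurrentLayer(n_inputs=4)
score = net.step([-12.5, 3.0, 8.1, -0.4])
```

The number of inputs is a constructor argument because the text states that the number of significant electrodes varies between subjects.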
2.4. Linking and Reporting
3. Results
3.1. Overview of the DEAP Dataset
3.2. RNN Configuration Settings
3.3. Training and Testing Data
3.4. Performance Evaluation
4. Discussion
4.1. Discrete Method
4.2. Continuous Method
- First impression factor: The classification result of the EEG period in CPT allows the user’s first impression to be determined. This helps to improve design aspects such as the use of color, alignment, typography, and so on.
- Task-based testing factor: This is mapped with the classification results of each module. The feature vector is constructed for each module recording. The emotion label of the classifier is the subject’s emotional state during this task. This factor helps in improving the task design, and it has a direct impact on usability.
- Overall emotions factor: This factor comes from the continuous method of the framework. It supports a measurement of usability when the recording is divided by module, and of overall satisfaction when the entire continuous period is taken as a single feature vector.
- Effectiveness: In the ‘task-based test’, the level of each dimension in VAD is an indicator of the quality of the design and the function of the interface [13]. The correlation between the detected emotion and the task or function of the system during the user experience reflects effectiveness [32].
- Satisfaction: The appearance of pleasant or unpleasant emotions during the testing session can be used to infer user satisfaction. It is an indicator for each task, each function, and the overall system. Scenarios such as the first impression test and free interaction test [33] are chosen in this step. Suggested measurements for deciding on the satisfaction level include the percentage change relative to a baseline state (i.e., neutral emotion or calm), correlation, and the means of the changes [2,33].
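Two of the satisfaction measurements named above, the percentage change relative to a baseline state and the mean of the changes, can be sketched as follows. This is only an illustration under stated assumptions: the numeric valence scale, the task names, and the scores are hypothetical and do not come from the study.

```python
from statistics import mean


def percent_change(value, baseline):
    """Change of `value` relative to `baseline`, in percent."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (value - baseline) / abs(baseline) * 100.0


# Hypothetical valence scores: one for the calm baseline, one per task.
baseline_valence = 5.0
task_valence = {"login": 6.0, "search": 4.0, "checkout": 5.5}

changes = {
    task: percent_change(score, baseline_valence)
    for task, score in task_valence.items()
}
mean_change = mean(changes.values())
```

A positive change suggests a more pleasant state than baseline for that task, and the mean summarizes the session as a whole.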
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Dumas, J.S.; Redish, J. A Practical Guide to Usability Testing; Intellect Books: Bristol, UK, 1999.
- Jokela, T.; Iivari, N.; Matero, J.; Karukka, M. The standard of user-centered design and the standard definition of usability: Analyzing ISO 13407 against ISO 9241-11. In Proceedings of the Latin American Conference on Human-Computer Interaction, Rio de Janeiro, Brazil, 17–20 November 2003; pp. 53–60.
- Thuering, M.; Mahlke, S. Usability, aesthetics and emotions in human–technology interaction. Int. J. Psychol. 2007, 42, 253–264.
- Garrett, J.J. The Elements of User Experience: User-Centered Design for the Web and Beyond; Pearson Education: London, UK, 2010.
- Schmidt, T.; Schlindwein, M.; Lichtner, K.; Wolff, C. Investigating the Relationship Between Emotion Recognition Software and Usability Metrics. i-com 2020, 19, 139–151.
- Landowska, A.; Miler, J. Limitations of emotion recognition in software user experience evaluation context. In Proceedings of the 2016 Federated Conference on Computer Science and Information Systems (FedCSIS), Gdańsk, Poland, 11–14 September 2016; pp. 1631–1640.
- Saariluoma, P.; Jokinen, J.P. Emotional dimensions of user experience: A user psychological analysis. Int. J. Hum.-Comput. Interact. 2014, 30, 303–320.
- Puwakpitiyage, C.; Rao, V.; Azizi, M.; Tee, W.; Murugesan, R.; Hamzah, M. A Proposed Web Based Real Time Brain Computer Interface (BCI) System for Usability Testing. Int. J. Online Eng. 2019, 15, 111.
- do Amaral, V.; Ferreira, L.A.; Aquino, P.T.; de Castro, M.C.F. EEG signal classification in usability experiments. In Proceedings of the 2013 ISSNIP Biosignals and Biorobotics Conference: Biosignals and Robotics for Better and Safer Living (BRC), Rio de Janeiro, Brazil, 18–20 February 2013; pp. 1–5.
- Stefancova, E.; Moro, R.; Bielikova, M. Towards Detection of Usability Issues by Measuring Emotions. In Proceedings of the European Conference on Advances in Databases and Information Systems, Budapest, Hungary, 2–5 September 2018; pp. 63–70.
- Mangion, R.S.; Garg, L.; Garg, G.; Falzon, O. Emotional Testing on Facebook’s User Experience. IEEE Access 2020, 8, 58250–58259.
- Aledaily, A.; Gannoun, S.; Aboalsamh, H.; Belwafi, K. A Framework for Usability Testing using EEG Signals with Emotion Recognition. In Proceedings of the 5th International Conference on Intelligent Human Systems Integration (IHSI 2022) Integrating People and Intelligent Systems, Venice, Italy, 22–24 February 2022.
- Ward, R.D.; Marsden, P.H. Physiological responses to different WEB page designs. Int. J. Hum.-Comput. Stud. 2003, 59, 199–212.
- GreenTek Company. GT Cap Gelfree-S3, 32/16 EEG Channels Cap, Wuhan, China. Available online: https://www.greenteksensor.com/products/eeg-caps/gt-cap-gelfree-s3/ (accessed on 29 April 2023).
- Belwafi, K.; Djemal, R.; Ghaffari, F.; Romain, O.; Ouni, B.; Gannouni, S. Online Adaptive Filters to Classify Left and Right Hand Motor Imagery. In Proceedings of the 9th International Joint Conference on Biomedical Engineering Systems and Technologies, Rome, Italy, 21–23 February 2016.
- Gannouni, S.; Alangari, N.; Mathkour, H.; Aboalsamh, H.; Belwafi, K. BCWB. Int. J. Semant. Web Inf. Syst. 2017, 13, 55–73.
- Koelstra, S.; Muhl, C.; Soleymani, M.; Lee, J.S.; Yazdani, A.; Ebrahimi, T.; Pun, T.; Nijholt, A.; Patras, I. DEAP: A database for emotion analysis using physiological signals. IEEE Trans. Affect. Comput. 2011, 3, 18–31.
- Ramirez, R.; Vamvakousis, Z. Detecting emotion from EEG signals using the emotive epoc device. In Proceedings of the International Conference on Brain Informatics, Macau, China, 4–7 December 2012; pp. 175–184.
- Gunawan, F.E.; Wanandi, K.; Soewito, B.; Candra, S.; Sekishita, N. Detecting the early drop of attention using EEG signal. In Proceedings of the 2017 4th International Conference on Electrical Engineering, Computer Science and Informatics (EECSI), Yogyakarta, Indonesia, 19–21 September 2017; pp. 1–6.
- Arnaud Delorme, S.M. EEGLAB Wiki. Swartz Center for Computational Neuroscience. Available online: https://eeglab.org/ (accessed on 15 April 2023).
- Gannouni, S.; Aledaily, A.; Belwafi, K.; Aboalsamh, H. Electroencephalography based emotion detection using ensemble classification and asymmetric brain activity. J. Affect. Disord. 2022, 319, 416–427.
- Pfurtscheller, G. Functional brain imaging based on ERD/ERS. Vis. Res. 2001, 41, 1257–1260.
- Gannouni, S.; Aledaily, A.; Belwafi, K.; Aboalsamh, H. Emotion detection using electroencephalography signals and a zero-time windowing-based epoch estimation and relevant electrode identification. Sci. Rep. 2021, 11, 7071.
- Gavin, H. The Levenberg-Marquardt Method for Nonlinear Least Squares Curve-Fitting Problems; Department of Civil and Environmental Engineering, Duke University: Durham, NC, USA, 2011; pp. 1–15.
- Atabay, D. pyrenn: A Recurrent Neural Network Toolbox for Python and Matlab. Available online: https://pyrenn.readthedocs.io/en/latest/ (accessed on 15 April 2023).
- Belwafi, K.; Gannouni, S.; Aboalsamh, H. Embedded Brain Computer Interface: State-of-the-Art in Research. Sensors 2021, 21, 4293.
- Mert, A.; Akan, A. Emotion recognition from EEG signals by using multivariate empirical mode decomposition. Pattern Anal. Appl. 2018, 21, 81–89.
- Nakisa, B.; Rastgoo, M.N.; Tjondronegoro, D.; Chandran, V. Evolutionary computation algorithms for feature selection of EEG-based emotion recognition using mobile sensors. Expert Syst. Appl. 2018, 93, 143–155.
- Alhagry, S.; Fahmy, A.A.; El-Khoribi, R.A. Emotion recognition based on EEG using LSTM recurrent neural network. Emotion 2017, 8, 355–358.
- Pandey, P.; Seeja, K. Subject independent emotion recognition from EEG using VMD and deep learning. J. King Saud Univ.-Comput. Inf. Sci. 2022, 34, 1730–1738.
- Sharma, R.; Pachori, R.B.; Sircar, P. Automated emotion recognition based on higher order statistics and deep learning algorithm. Biomed. Signal Process. Control 2020, 58, 101867.
- Stickel, C.; Ebner, M.; Steinbach-Nordmann, S.; Searle, G.; Holzinger, A. Emotion detection: Application of the valence arousal space for rapid biological usability testing to enhance universal access. In Proceedings of the International Conference on Universal Access in Human-Computer Interaction, San Diego, CA, USA, 19–24 July 2009; pp. 615–624.
- Kołakowska, A.; Landowska, A.; Szwoch, M.; Szwoch, W.; Wrobel, M.R. Emotion recognition and its applications. In Human-Computer Systems Interaction: Backgrounds and Applications 3; Springer: Berlin/Heidelberg, Germany, 2014; pp. 51–62.
- Partala, T.; Kangaskorte, R. The combined walkthrough: Measuring behavioral, affective, and cognitive information in usability testing. J. Usability Stud. 2009, 5, 21–33.
- Sonderegger, A.; Sauer, J. The influence of socio-cultural background and product value in usability testing. Appl. Ergon. 2013, 44, 341–349.
- Kulviwat, S.; Bruner II, G.C.; Kumar, A.; Nasco, S.A.; Clark, T. Toward a unified theory of consumer acceptance technology. Psychol. Mark. 2007, 24, 1059–1084.
Final Decision Criteria | | Classifier of Upper Level: (Target) Label | Classifier of Upper Level: (Outlier) Label |
---|---|---|---|
Classifier of Lower Level | (Target) label | Error | Lower level of the dimension |
Classifier of Lower Level | (Outlier) label | Upper level of the dimension | Neutral level of the dimension |
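The decision criteria in the table above amount to a four-way lookup on the two one-vs.-all outputs. A minimal sketch follows, assuming each classifier's output has already been reduced to a boolean (target or outlier); the function name and the returned label strings are illustrative.

```python
def fuse_decisions(upper_says_target, lower_says_target):
    """Combine the upper- and lower-level classifier outputs for one dimension.

    Each argument is True when that classifier labels the trial as its target
    class and False when it labels the trial as an outlier.
    """
    if upper_says_target and lower_says_target:
        return "error"    # both classifiers claim the trial: contradictory
    if upper_says_target:
        return "upper"    # upper level of the dimension
    if lower_says_target:
        return "lower"    # lower level of the dimension
    return "neutral"      # neither classifier claims the trial


# All four combinations, as (upper, lower) pairs.
cases = [(True, False), (False, True), (False, False), (True, True)]
labels = [fuse_decisions(upper, lower) for upper, lower in cases]
```

The same function would be applied separately to valence, arousal, and dominance, since each dimension has its own pair of classifiers.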
Factor | Value |
---|---|
Adapt damping factor of LM | 10 |
Damping factor of LM | 3 |
Delay factor | 0.01 |
Input nodes | Same as the number of adaptive significant electrodes |
Hidden layers | 80 |
Nodes in each hidden layer | 80 |
Output layers | 1 |
Weight | Random values [−0.5, +0.5] |
Bias | Random values [−0.5, +0.5] |
Epochs | 8 |
Error E (Cost function) |
Dimension | Level | Accuracy (%) | Precision (%) | Recall (%) | Specificity (%) | TPR (%) | F1-Score (%) | Decision (%) |
---|---|---|---|---|---|---|---|---|
Valence | Positive | 90.19 | 84.12 | 94.12 | 86.24 | 94.12 | 88.84 | 85.88 |
Valence | Negative | 92.13 | 84.13 | 94.66 | 88.72 | 94.75 | 89.08 | 85.88 |
Arousal | Active | 92.13 | 80.12 | 95.45 | 88.33 | 95.95 | 87.11 | 87.32 |
Arousal | Passive | 92.67 | 91.18 | 90.52 | 92.80 | 92.52 | 90.85 | 87.32 |
Dominance | Low | 92.24 | 83.63 | 96.06 | 88.72 | 96.06 | 89.41 | 87.56 |
Dominance | High | 91.78 | 88.76 | 93.31 | 90.56 | 93.31 | 90.98 | 87.56 |
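For readers who want to recompute the kind of metrics reported in the table above, the sketch below derives accuracy, precision, recall, specificity, and F1 from the four confusion counts of a single one-vs.-all classifier, using the standard definitions. The counts are hypothetical and are not taken from the study.

```python
def binary_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": precision,
        "recall": recall,
        "specificity": tn / (tn + fp),
        "f1": 2 * precision * recall / (precision + recall),
    }


# Hypothetical counts for one classifier on 100 test trials.
metrics = binary_metrics(tp=40, fp=10, tn=45, fn=5)
percentages = {name: round(value * 100, 2) for name, value in metrics.items()}
```

The sketch assumes every denominator is non-zero, which holds whenever both classes appear in the test set and the classifier predicts each class at least once.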
(a) Valence (rows: actual class; columns: predicted class)
Actual Class | Positive | Neutral | Negative | Error |
---|---|---|---|---|
Positive | 81.26 (TP) | 14.77 | 1.10 | 2.85 |
Neutral | 4.25 | 91.47 (TN) | 2.57 | 1.69 |
Negative | 2.22 | 13.64 | 82.71 (TN) | 1.41 |
(b) Arousal (rows: actual class; columns: predicted class)
Actual Class | Active | Neutral | Passive | Error |
---|---|---|---|---|
Active | 80.35 (TP) | 10.11 | 5.14 | 1.26 |
Neutral | 3.28 | 88.66 (TN) | 7.12 | 0.92 |
Passive | 0 | 7.53 | 87.50 (TN) | 1.83 |
(c) Dominance (rows: actual class; columns: predicted class)
Actual Class | Low Dominance | Neutral | High Dominance | Error |
---|---|---|---|---|
Low dominance | 72.63 (TP) | 11.39 | 3.43 | 3.16 |
Neutral | 0.62 | 94.70 (TN) | 4.14 | 0.52 |
High dominance | 1.59 | 8.24 | 76.21 (TP) | 1.44 |
Method | Number of Classes | Feature Extraction Algorithm | Number of Electrodes | Classifier | Valence (%) | Arousal (%) | Dominance (%) |
---|---|---|---|---|---|---|---|
[27] | 2 levels of valence and arousal | EMD and MEMD | 18 | ANN | 75 | 72.87 | N/A |
[28] | 2 levels of valence and arousal | Time- and frequency-domain features, with DE as selection method | 5 | PNN | 67.47 | 67.47 | N/A |
[29] | 2 levels of valence and arousal | EEG raw signals | 32 | RNN | 85.65 | 85.45 | N/A |
[30] | 2 levels of valence and arousal | PSD and VMD | 4 | DNN | 62.50 | 61.25 | N/A |
[31] | 2 levels of valence and arousal | HOS | 32 | Proposed NN | 85.21 | 84.16 | N/A |
Proposed method | 2 levels of valence, arousal, and dominance | ERD/ERS of the significant electrodes | Adaptive | RNN | 85.88 | 87.71 | 86.63 |
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Gannouni, S.; Belwafi, K.; Aledaily, A.; Aboalsamh, H.; Belghith, A. Software Usability Testing Using EEG-Based Emotion Detection and Deep Learning. Sensors 2023, 23, 5147. https://doi.org/10.3390/s23115147