Louis Armstrong famously sings, “When you’re smilin’, the whole world smiles with you.” Perhaps he means that smiling causes other people to smile, and in so doing makes them happy. Or, perhaps he means that smiling makes it easier to recognize the smiles of others in the world around you. In any case, it is clear that our folk theories of emotion posit a tight relationship between the expression of emotion, the experience of emotion, and the recognition of emotion in other people. Similarly, sensorimotor theories of emotion recognition suggest that understanding smiles, and other sorts of emotional faces, involves simulating the facial expressions of others (Niedenthal, Mermillod, Maringer, & Hess, 2010). The elicited sensorimotor signals can help the perceiver understand other people’s emotional state, either by inducing a similar experience (Goldman & Sripada, 2005), or by bootstrapping the recognition process (Korb et al., 2015; Pitcher, Garrido, Walsh, & Duchaine, 2008).

Sensorimotor models of emotion recognition are closely allied with a parallel movement in cognitive science toward embodied or grounded theories of conceptual knowledge (Niedenthal, Barsalou, Winkielman, Krauth-Gruber, & Ric, 2005). In contrast to classical accounts of concepts as formal symbolic constructs, embodied concepts recruit sensorimotor resources for inferential processes (Barsalou, 2008). Accordingly, conceptual knowledge of emotion includes embodied simulation of emotional experience (Niedenthal, Winkielman, Mondillon, & Vermeulen, 2009). Given the prominent role of emotional concepts in emotion recognition, embodied models strongly suggest a functional role for sensorimotor simulation in the conceptual aspects of this process. Whereas much prior research on this topic has focused on the visual processing of emotional faces (e.g., Achaibou, Pourtois, Schwartz, & Vuilleumier, 2008), here we investigate the contribution of sensorimotor simulation to conceptual aspects of emotion recognition.

Sensorimotor models of emotion recognition

Although not without challenges (see, e.g., Saxe, 2005), sensorimotor models of emotion recognition have received considerable support (Atkinson & Adolphs, 2011; Wood, Rychlowska, Korb, & Niedenthal, 2016). For example, one premise of these models—that people engage in motor simulation of each other’s emotional expressions—is supported by facial electromyography (EMG). EMG recordings indicate that, when exposed to emotional faces, people activate emotion-relevant facial muscles and spontaneously mimic the emotions they see (Dimberg, Thunberg, & Elmehed, 2000). This sort of facial mimicry has been shown to impact emotional experience and various other aspects of emotion processing (J. I. Davis, Senghas, Brandt, & Ochsner, 2010; Strack, Martin, & Stepper, 1988; but see Wagenmakers et al., 2016). While sensorimotor simulation can occur without overt facial mimicry, the latter is typically interpreted as an integral component of the larger system for emotion processing (see Wood et al., 2016, for a review). Sensorimotor simulation has, for example, been found to influence the accuracy and efficiency of decoding emotional expressions (Ipser & Cook, 2015; Künecke, Hildebrandt, Recio, Sommer, & Wilhelm, 2014), as well as judgments of valence (Hyniewska & Sato, 2015), intensity (Lobmaier & Fischer, 2015), and intentionality (Korb, With, Niedenthal, Kaiser, & Grandjean, 2014; Rychlowska et al., 2014).

A critical prediction of embodied accounts is that the introduction of irrelevant noise into the sensorimotor system will interfere with simulation, and this in turn will interfere with emotion recognition (e.g., Niedenthal, 2007; Niedenthal et al., 2005). Consistent with this, repetitive transcranial magnetic stimulation (rTMS) of the face region of somatosensory cortex has been found to interfere with the processing of emotional faces, including emotion detection (Korb et al., 2015; Pitcher et al., 2008) and the ability to judge whether a smile represents genuine amusement (Paracampo, Tidoni, Borgomaneri, di Pellegrino, & Avenanti, 2016).

Sensorimotor interference and conceptual aspects of emotion recognition

The introduction of irrelevant noise can also be accomplished by directly manipulating motor activity at the face. Accordingly, facial action manipulations have been found to impair the recognition of facial expressions in perceptual (Wood, Lupyan, Sherrin, & Niedenthal, 2015) and categorical (Ponari, Conson, D’Amico, Grossi, & Trojano, 2012) tasks. Previous work in our laboratory addressed how interfering with motor activity at the face impacted the categorization of emotional expressions (Oberman, Winkielman, & Ramachandran, 2007). First, EMG was used to show that interference, that is, biting down on a pen held horizontally between the teeth and lips, led to greater facial muscle activity relative to a control posture in which the pen was held loosely between the lips. Next, Oberman and colleagues showed that interference led to an increase in categorization errors for emotional faces whose expression relied on the affected muscles (happiness and disgust), thus supporting a causal link between sensorimotor simulation and the categorization of emotional faces (Oberman et al., 2007).

However, one shortcoming of such prior research is a lack of clarity regarding the precise way that facial action manipulations impact emotion recognition. Emotional faces are complex, multidimensional objects that engender an elaborate series of processes (Bruce & Young, 1986; Burton, Bruce, & Hancock, 1999). Sensorimotor interference might influence early stages of perceptual processing (Price, Dieckman, & Harmon-Jones, 2012; Wood et al., 2015), conceptual stages of processing (Niedenthal et al., 2009), or both. Skeptics of embodied accounts suggest interference effects in the literature reflect neither the perception nor the interpretation of emotions, but are rather an artifact of response bias (see Ipser & Cook, 2015). Others have suggested that interference manipulations influence emotion recognition only indirectly, by imposing a greater cognitive load than do their control conditions (see Neal & Chartrand, 2011).

Some of these concerns were addressed in a study that examined how facial interference impacts the processing of emotional language (J. D. Davis et al., 2015). Both EMG and EEG were recorded as participants read sentences such as, “She reached inside the pocket of her coat from last winter and found some CASH/BUGS inside it,” and judged their valence. EMG recordings at the zygomaticus confirmed greater tonic levels of activation in the interference condition, and suggested transient activation (smiling) in the control condition as participants read pleasant sentences (e.g., the version in which she finds cash, but not in the version in which she finds bugs). Event-related potentials (ERPs) time locked to sentence final words in the pleasant sentences revealed larger amplitude N400 in the facial interference condition than the control. ERPs to sentences about unpleasant events were unaffected by the interference manipulation, arguing against the possibility that the unnatural facial posture distracted participants from the language task. Because the N400 ERP component is larger when semantic retrieval demands are more pronounced (Kutas & Federmeier, 2011), these data suggest sensorimotor simulation impacts the retrieval of conceptual knowledge in emotional language processing.

The present study

The present study examines whether interfering with facial muscle activity influences semantic processing of emotional faces by measuring neural responses associated with an event-related potential (ERP) measure of semantic processing: the face N400. ERPs are epoched and averaged EEG signals that are time locked to stimulus onset. The ERP provides a continuous measure of face processing with known sensitivity to its attentional, perceptual, and conceptual aspects (Luck, 2005). The temporal resolution of this technique affords precise inferences about when the experimental manipulation of facial action impacts the neural response to emotional faces. Moreover, the extant literature on ERP indices of face processing can help link observed effects to particular neurocognitive processes. In particular, ERP measures allow us to address whether sensorimotor interference selectively impacts semantic processing of emotional faces, as opposed to promoting a general reduction (i.e., main effect) of attentional resources or engendering response bias.

The face N400 is a negative-going waveform that peaks approximately 400 ms after the visual presentation of a face. Its neural generator is presumed to lie in the anterior fusiform gyrus and nearby ventral temporal structures, and it is thought to index semantic aspects of face processing (Schweinberger & Burton, 2003). The N400 is characterized as a negativity because it is more negative than the positive peak (P2) that often precedes it. However, its amplitude need not be negative in the absolute sense. N400 is larger (i.e., more negative, or less positive) for familiar than unfamiliar faces, presumably because familiar faces engender the retrieval of semantic information about the relevant person (Eimer, 2000). Its amplitude is reduced by repetition, presumably because the semantic information has recently been activated on a previous trial (Schweinberger, 1996; Schweinberger, Pickering, Burton, & Kaufmann, 2002). N400 amplitude is also reduced by associative priming, as a picture of Bill Clinton elicits less negative N400 when preceded by either the name or the image of Hillary Clinton (Schweinberger, 1996; Schweinberger et al., 2002).

Finally, the amplitude of the face N400 has been shown to be sensitive to the demands of emotion recognition. Prototypical emotional faces elicit less negative N400 than nonprototypical ones on an emotion recognition task (Paulmann & Pell, 2009). Similarly, emotional faces elicit less negative N400 when preceded by congruent rather than incongruent emotional speech (Paulmann & Pell, 2010). In sum, the face N400 is thought to index semantic memory activation induced by a face, and, other things being equal, its amplitude is larger (more negative) for more demanding emotion recognition tasks and is reduced by facilitative contextual cues. Therefore, if sensorimotor simulation plays a functional role in the retrieval of semantic information about facial expressions, then interfering with simulation ought to lead to larger amplitude N400.

Regarding the direction of N400 effects, sensorimotor accounts predict motor interference will enhance the amplitude of N400 components (i.e., make the N400 more negative) for relevant expressions. This would imply that additional semantic processing was engaged. Alternatively, a skeptical cognitive load or distraction account would predict that facial interference would lead to a reduction in N400 amplitude, as visual stimuli have previously been shown to elicit reduced amplitude ERPs under conditions of divided relative to focused attention, and this could have semantic effects (e.g., Mangels, Picton, & Craik, 2001). Finally, whereas sensorimotor accounts predict selective N400 differences, cognitive load and distraction accounts predict a global effect, impacting all emotional categories in the same way.

As in previous studies, we also recorded EMG from the zygomaticus (smiling), levator (wrinkling one’s nose in disgust), and the corrugator (frowning). The primary purpose of these recordings was to verify that our interference manipulation—holding a conjoined pair of chopsticks horizontally between the teeth and lips—led to increased muscle noise at the zygomaticus and the levator muscles in the lower part of the face, but not for the corrugator muscle in the upper part of the face. We also used EMG to explore whether mimicry, as a downstream indicator of sensorimotor simulation, was present in either the control or interference conditions.

Method

Participants

Informed consent was obtained from 19 UCSD undergraduates (mean age: 20.8 years; range: 18–26 years; 12 females) for participation in the study in return for course credit or financial compensation ($8 an hour). Five participants were rejected due to excessive EEG artifacts (see section on EEG recording and analysis). Consequently, analyses below included data from 14 participants. All participants were right-handed, native English speakers with normal or corrected-to-normal vision who reported no history of head injury, drug use, psychiatric illness, or psychoactive medication.

Materials

Materials were taken from the NimStim Database (Tottenham et al., 2009), and consisted of 120 photographs expressing four different emotional facial expressions: happiness, exuberant surprise, disgust, and anger. These included depictions of 30 different models, and each model expressed each emotion once. See Fig. 1 for sample stimuli. Materials were normed using The Computer Expression Recognition Toolkit (CERT; Bartlett et al., 2005). This software provides an objective assessment of the amount of evidence for different emotions in facial expressions. (For use of this software in other research, see Bartlett, Littlewort, Frank, & Lee, 2014; Gordon, Pierce, Bartlett, & Tanaka, 2014; Peterson et al., 2016; Sikka et al., 2015; Zanette, Gao, Brunet, Bartlett, & Lee, 2016).

Fig. 1

Sample stimuli depicting the different emotional expressions: happiness, exuberant surprise, disgust, and anger. (The model depicting these expressions is NimStim Model 01.)

The evidence for the emotions of joy, exuberant surprise, disgust, and anger was analyzed using a 4 (expression categories; between items) × 4 (emotion evidence) repeated-measures ANOVA. This revealed a main effect of expression category, F(3, 114) = 76.77, p < .001, qualified by an interaction between expression category and emotion evidence, F(9, 348) = 94.32, p < .001. The most evidence for joy was found for expressions of happiness. The most evidence for surprise was found for expressions of exuberant surprise. The evidence for disgust was highest for disgust expressions, and the evidence for anger was highest for the anger expressions.

Procedure

After providing informed consent, participants were affixed with EEG and facial EMG electrodes (see EEG and EMG Recording and Analysis sections). After participating in an unrelated experiment on language processing, participants were informed that they would be rating emotionally expressive photographs (i.e., rating faces as expressing feelings that were very good, good, somewhat good, somewhat bad, bad, or very bad). Prior to the experiment, participants received instructions and a demonstration of the chopstick conditions. They also received visual feedback from their EEG and EMG so that they could get a feel for the correct amount of pressure to apply in the interference condition to prevent contaminating the EEG signal.

During the experiment, participants were seated in a dimly lit, sound-attenuated chamber. To force participants to choose between positive and negative valence, no neutral option was included. Participants were encouraged to use the full extent of the rating scale; responses were made with both hands on a numerical keypad. The side of the keypad corresponding to good and bad (viz., left or right) was counterbalanced across participants. Note that we avoided using specific emotion words (anger, disgust) because emotion words can activate simulation (e.g., Foroni & Semin, 2009; Niedenthal et al., 2009), and because we were interested in the dynamics of the spontaneous retrieval of semantic content related to specific emotions.

Participants were given an explanation of ERP protocol (such as when they were and were not permitted to blink or move their eyes) and task instructions. The experimenter then demonstrated how to hold the chopsticks (see Fig. 2) in the interference and control conditions, and participants were given feedback and ample time to practice the different facial actions so that the bite condition did not interfere with EEG recording. Previous research has manipulated lower face motor noise in different ways that are worth clarifying. In each case, participants hold a utensil (a pen or chopstick) horizontally between their teeth. As measured by EMG, holding a chopstick (or pen) between the teeth generates tonic muscle noise on the lower half of the face, particularly at the zygomaticus (J. D. Davis et al., 2015; Oberman et al., 2007). The specific instructions telling participants to hold the chopsticks between their teeth in the interference condition were based on previous research on the role of sensorimotor signals in emotion concepts (J. D. Davis et al., 2015), which showed that this “bite” condition is sufficient to enhance relevant EMG signal from the zygomaticus. Note that in one version of the lower face manipulation, the lips are kept closed (closed-lip bite). This version has been used in studies of emotional language processing (J. D. Davis et al., 2015; Niedenthal et al., 2009). In another version, the lips are held open and participants are asked to bite down hard on the utensil (open-lip bite). This version has previously been used to investigate categorization of facial expressions (Oberman et al., 2007; Ponari et al., 2012). Both versions of the bite manipulation have been found to selectively impact the recognition of emotion (in words or faces) related to happiness and disgust, while not influencing the concept of anger (Niedenthal et al., 2009; Ponari et al., 2012). Because the open-lip bite manipulation requires maintaining muscle activity to hold open the lips and also involves biting down hard on the utensil, it is likely that this version generates more irrelevant muscle noise than the closed-lip version. However, since we were recording EEG, which is particularly prone to contamination from muscle noise (even from simple eye movements), we opted for the closed-lip version.

Fig. 2

The photo on the left illustrates the interference condition in which participants held a conjoined pair of chopsticks between their teeth and lips. The photo on the right depicts the control condition in which participants held the chopsticks at the front of their mouth with their lips only

After the instruction period, participants also performed a short practice block in each condition before the experiment began.

The experiment consisted of four blocks. Facial action was manipulated within subjects and alternated across blocks. The order of the interference and control conditions was counterbalanced across participants. At the onset of each block, participants read from the monitor which posture they should take, that is, “TEETH and LIPS” (interference condition) or “LIPS ONLY” (control condition), and pressed a button to proceed with the experiment after assuming that posture. There was no mention of facial expressions or emotions. The stimuli were presented in a pseudorandom order such that the number of photographs expressing each emotion was counterbalanced across facial action conditions within subjects.

At the onset of each trial, “(BLINK)” appeared for 500 ms. This served as a warning that the trial would begin and that it was appropriate to blink at this point if needed. Participants were asked not to blink or move their eyes from this point until the rating screen, as doing so could generate EEG artifacts. “(BLINK)” was followed by a blank screen (2,000 ms); a central fixation cross (300 ms); a centrally presented photograph of a face expressing either happiness, surprise, disgust, or anger (200 ms); followed by another blank screen (2,000 ms); and finally the rating task. See Fig. 3 for a schematic of the experimental paradigm. Note that the timing parameters in this paradigm were optimized for the collection of ERP signals (the central interest here), and thus included fast stimulus presentation and short trial duration.

Fig. 3

A single trial of the experiment. The text and images are not to scale

EMG recording and analysis

EMG was recorded at three sites on the left side of the face—the zygomaticus major (smiling/happiness), levator labii superioris (wrinkling of the nose/disgust), and the corrugator supercilii (frowning/anger)—using bipolar derivations of tin electrodes. Electrodes were placed according to the guidelines for human EMG research set forth by Fridlund and Cacioppo (1986). At all sites, electrical impedance was reduced to less than 5 kΩ via gentle abrasion. EMG was sampled at 1024 Hz, recorded and amplified using the same bioelectric amplifier as the EEG, and band-pass filtered between 0.01 Hz and 200 Hz. The signals were then screened for artifacts, rectified, and integrated off-line. An average proportion of 0.21 of trials was rejected (SD = 0.24).
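As an illustration of this preprocessing step, the sketch below shows a generic full-wave rectification and moving-average integration of one EMG epoch in Python (numpy). The sampling rate matches the one reported above; the 20-ms smoothing window and the function name are illustrative assumptions, since the integration constant is not specified here.

```python
import numpy as np

def rectify_and_integrate(emg, fs=1024, win_ms=20):
    """Full-wave rectify an EMG epoch and smooth it with a moving-average
    ("integration") window. win_ms is an illustrative choice; the integration
    constant used in the study is not reported in this section."""
    rectified = np.abs(emg)                 # full-wave rectification
    win = int(fs * win_ms / 1000)           # samples per smoothing window
    kernel = np.ones(win) / win
    return np.convolve(rectified, kernel, mode="same")

# usage (hypothetical variable): one band-pass-filtered zygomaticus epoch
# smoothed = rectify_and_integrate(band_passed_emg_epoch)
```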

EMG served two functional roles. Its primary role was as a manipulation check, which examined whether the interference condition selectively generated irrelevant muscle noise on the lower half of the face relative to the control. To examine the level of baseline noise, 500 ms of prestimulus EMG activity was analyzed. To account for individual differences in muscle activity, EMG values were z scored within muscle sites for each participant (Oberman et al., 2007). These data were subjected to a 2 (interference, control) × 3 (muscle site: zygomaticus, levator, corrugator) repeated-measures MANOVA, with muscle sites being the different dependent variables.
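A minimal sketch of the within-site z scoring, assuming per-trial integrated EMG values for one participant and one muscle site; the commented paired t test illustrates only a single-site follow-up comparison, not the 2 × 3 repeated-measures MANOVA itself, and the variable names are hypothetical.

```python
import numpy as np
from scipy import stats

def zscore_within_site(values):
    """z score per-trial EMG values within one muscle site for one participant
    (e.g., mean rectified activity in the 500-ms prestimulus window)."""
    values = np.asarray(values, dtype=float)
    return (values - values.mean()) / values.std(ddof=1)

# Illustrative single-site follow-up: compare participants' mean prestimulus
# z scores between interference and control blocks with a paired t test.
# interference_means, control_means: one value per participant (hypothetical).
# t, p = stats.ttest_rel(interference_means, control_means)
```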

For mimicry, we also used z-scored EMG activity, using activity recorded during the 500 ms before stimulus onset as a baseline. Since perceiving facial expressions first initiates non-emotion-specific motor responses, followed by emotion-specific mimicry that begins to become evident around 500 ms after stimulus onset (Dimberg & Öhman, 1996; Dimberg et al., 2000; Dimberg, Thunberg, & Grunedal, 2002), we did not analyze the first 500 ms after stimulus onset, focusing instead on the three subsequent 500-ms intervals (i.e., 500–1,000 ms, 1,001–1,500 ms, and 1,501–2,000 ms). We then looked at the zygomaticus for expressions of happiness and exuberant surprise, at the levator for disgust, and at the corrugator for anger, using four separate 2 (facial manipulation) × 3 (time) repeated-measures ANOVAs. These were followed up by simple effects t tests, which were based on visual inspection of the EMG data.
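The sketch below illustrates how mean activity in the three post-stimulus windows might be extracted from a z-scored EMG trace and corrected against the prestimulus baseline; the array layout (trace beginning 500 ms before stimulus onset) is an assumption for illustration.

```python
import numpy as np

FS = 1024          # EMG sampling rate (Hz)
PRE_S = 0.5        # assumed: each trace starts 500 ms before stimulus onset

def mimicry_window_means(trace_z):
    """Mean z-scored EMG in the 500-1,000, 1,001-1,500, and 1,501-2,000 ms
    windows, each corrected against the 500-ms prestimulus baseline."""
    def mean_between(t0, t1):
        return trace_z[int((PRE_S + t0) * FS): int((PRE_S + t1) * FS)].mean()

    baseline = mean_between(-0.5, 0.0)
    windows = [(0.5, 1.0), (1.0, 1.5), (1.5, 2.0)]
    return np.array([mean_between(t0, t1) for t0, t1 in windows]) - baseline
```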

EEG recording and analysis

EEG was collected at 27 scalp sites using a cap mounted with tin electrodes. The electrodes were referenced online to the left mastoid. Blinks were monitored from an electrode below the right eye. Horizontal eye movements were monitored via a bipolar derivation of electrodes at the outer canthi. At all sites, electrical impedance was reduced to less than 5 kΩ via gentle abrasion of the skin. EEG was recorded and amplified using an SA Instruments bioelectric amplifier with a high-pass filter of 0.01 Hz, a low-pass filter of 100 Hz, and was digitized online at 1024 Hz.

EEG epochs were analyzed offline and averaged within conditions. ERPs were time locked to the onset of the stimuli and included a 200 ms prestimulus baseline period and 800 ms afterward. Epochs were visually examined and manually rejected when contaminated by noise from muscle artifacts, excessive drift, or channel blocking. Five participants were rejected from analysis due to excessive blinking (more than half of their trials had been contaminated by artifacts). Of the remaining 14 participants, the mean trial rejection rate was 0.21, with a range of .01 to .29 and a standard deviation of 0.1. To check that the conditions did not differ significantly in their rejection rate, a repeated-measures ANOVA contrasting emotion by facial action manipulation was run on the number of rejected trials. There were no significant differences in the removal of trials across the experimental conditions, all Fs < 1.4.
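For concreteness, here is a generic numpy sketch of stimulus-locked epoching, baseline correction, and averaging for one channel and condition; the boolean trial selection stands in for the manual artifact rejection described above, and all variable names are illustrative rather than taken from the study's actual pipeline.

```python
import numpy as np

FS = 1024                    # EEG sampling rate (Hz)
PRE_S, POST_S = 0.2, 0.8     # 200-ms baseline, 800-ms poststimulus window

def erp_average(eeg, onset_samples, keep):
    """Average artifact-free, stimulus-locked epochs into an ERP.

    eeg: continuous signal for one channel (1-D array)
    onset_samples: stimulus onset indices for one condition
    keep: booleans marking trials that survived (manual) artifact rejection
    """
    epochs = []
    for onset, ok in zip(onset_samples, keep):
        if not ok:
            continue
        epoch = eeg[onset - int(PRE_S * FS): onset + int(POST_S * FS)]
        epoch = epoch - epoch[: int(PRE_S * FS)].mean()   # baseline correction
        epochs.append(epoch)
    return np.mean(epochs, axis=0)
```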

Additionally, as with the EMG data, the trials in which the participants rated a positive expression (happiness or surprise) as bad or a negative expression (disgust or anger) as good were excluded from analysis. This resulted in the removal of less than 1% of the data.

Results

We describe three sorts of data: behavioral ratings, EMG, and ERP. The ratings provide information about offline valence judgments. The EMG was used both as a manipulation check, to ensure that the interference condition led to larger amplitude EMG than the control, and to examine whether the experimental materials elicited mimicry. Finally, the ERPs provide information about how the facial action manipulation impacted the brain’s real-time response to emotional faces.

Behavioral ratings

Recall that the ratings task asked only about valence and did not involve any emotional category labels (angry, disgust, happy, surprise). This was done to test whether the retrieval of specific semantic emotion content (its label) is influenced by sensorimotor interference; obviously, providing the specific emotion labels would have made the semantic content highly available. Note, however, that this valence categorization made participants’ rating task easier and insensitive to differences between specific emotions (unlike other studies that have asked participants to discriminate between specific emotions). Additionally, as is typical for physiology-focused studies, but unlike behavior-focused studies, the rating was delayed (2,000 ms after stimulus offset) in order to reduce EEG contamination from movement artifacts.

Participants’ mean rating scores were subjected to a repeated-measures ANOVA with factors emotion (4) and facial action (2). This revealed a main effect of emotion, F(3, 39) = 467.4, p < .001, ηp² = 0.97, but no effect of facial action or interaction with the facial action manipulation (all Fs < 1). Ratings were then collapsed across the facial action conditions, and post hoc two-tailed paired t tests were used to compare the mean ratings of the different emotions (see Fig. 4 and Table 1). Because emotion effects are neither surprising nor critical for our hypotheses, post hoc t tests were not corrected for multiple comparisons. Overall, participants provided similar ratings for faces in the facial interference condition and in the control condition, consistently rating faces displaying happiness and exuberant surprise as more positive than faces displaying anger and disgust. Taken together, these results suggest that participants attended to the valence expressed on the faces, but the facial action manipulation did not influence their delayed offline valence judgments about them.
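As a sketch of the analysis structure (not the authors' actual code), the 4 × 2 repeated-measures ANOVA on mean ratings could be run in Python with statsmodels on a long-format table; the column and file names here are hypothetical.

```python
import pandas as pd
from statsmodels.stats.anova import AnovaRM

# Long format: one mean rating per participant x emotion x facial action cell.
# Columns (hypothetical): subject, emotion, facial_action, rating
ratings = pd.read_csv("mean_ratings_long.csv")   # placeholder file name

res = AnovaRM(ratings, depvar="rating", subject="subject",
              within=["emotion", "facial_action"]).fit()
print(res.anova_table)   # F tests for emotion, facial action, and interaction
```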

Fig. 4

Mean ratings of the emotional expressions of happiness, exuberant surprise, disgust, and anger. (“Does the expression convey a good or a bad feeling?”). Ratings were made using a six-point scale—very good, good, somewhat good, somewhat bad, bad, very bad—2000 ms after stimulus offset. Data are collapsed across the facial action manipulation, which had no significant effect on the offline ratings. Error bars reflect SEM. **p < .01. ***p < .001

Table 1 Post hoc comparison statistics for mean emotion ratings

EMG results

EMG activity was recorded at the zygomaticus major (mouth), levator labii (nose wrinkling), and corrugator supercilii (brow), and provided information about the effectiveness and selectivity of the facial action manipulations on mouth, nose, and brow muscles. Z-scored baseline activity (500 ms prior to stimulus onset) was analyzed using a facial action (2) × muscle site (3) repeated-measures MANOVA (muscle sites being the different dependent variables). This analysis revealed a significant main effect of facial action, with more activity in the interference than the control condition, F(1, 13) = 10.28, p = .007, ηp² = 0.44, qualified by a significant interaction of facial action by muscle site, F(2, 12) = 4.78, p = .03, ηp² = 0.42. To follow up on the interaction, post hoc, two-tailed, paired t tests were performed comparing the effect of the facial action manipulation at the different muscle sites. These analyses suggest the facial action manipulation significantly influenced activity at the zygomaticus site, t(13) = 5.97, p < .001, d = 1.60, but not at the levator, t(13) = 0.88, p = .39, d = 0.24, or the corrugator site, t(13) = 0.67, p = .51, d = 0.18 (see Fig. 5).

Fig. 5

Mean z scores of prestimulus (-500–0 ms) baseline EMG activity at the different muscle sites as modulated by the facial action manipulation. Error bars reflect standard error of the mean. ***p < .001, n.s. = nonsignificant p values

For mimicry, we looked at the zygomaticus for happiness and surprise, at the levator for disgust, and at the corrugator for anger. Four separate facial action (2) × time (3) repeated-measures ANOVAs were used. In each case, the data consisted of three continuous 500-ms epochs of baseline-corrected, z-scored EMG activity. The first epoch began 500 ms after stimulus onset. For happiness, there was a main effect of time, F(2, 26) = 4.08, p = .029, ηp² = 0.085. Based on visual inspection, the largest increase in activity occurred for both the interference and control conditions between the first and second epochs (501–1,000 ms and 1,001–1,500 ms post stimulus onset). Comparison of the activity in these epochs during the control condition, using a paired-samples two-tailed t test, revealed a significant difference, t(13) = -2.232, p = .044, d = 0.60. The same comparison for the interference condition did not reveal a significant difference, t(13) = -1.954, p = .073, d = 0.52. For surprise at the zygomaticus, the ANOVA revealed no significant differences. For disgust at the levator, the ANOVA also revealed no significant differences. For anger at the corrugator, there was a main effect of time, F(2, 26) = 7.762, p = .002, ηp² = 0.154. Again, the greatest increase in activity was between the 500–1,000 ms and the 1,001–1,500 ms post stimulus onset intervals. Comparing the difference in activity between these two time intervals using separate, paired-samples, two-tailed t tests revealed a significant increase in activity for both the control condition, t(13) = -3.300, p = .006, d = 0.88, and the interference condition, t(13) = -2.252, p = .042, d = 0.60 (see Fig. 6 for happiness at the zygomaticus and anger at the corrugator).

Fig. 6

Zygomaticus responses to faces expressing happiness and corrugator responses to faces expressing anger. EMG is baseline corrected (-500–0 ms from stimulus onset). Mean EMG activity is based on EMG z scored within participants and muscle sites. Error bars represent SEM. Analysis revealed a significant increase in zygomaticus activity to happy faces presented during the control, but not the interference condition. Corrugator responses to angry faces increased significantly during both facial action conditions

Although zygomaticus activity elicited by happy faces during the control condition follows a pattern consistent with mimicry, as did corrugator activity elicited by angry faces during the control and interference conditions alike, these data should be interpreted with caution. This experiment was not optimized for revealing mimicry—participants had a pair of chopsticks in their mouths, they were wearing an EEG cap, and the experimental trials were designed to measure ERP rather than mimicry. Nonetheless, the EMG data tentatively suggest participants engaged in mimicry following the presentation of the happy faces during the control condition, and following the angry faces during both facial action conditions.

ERP results

To examine the online processing effects of our facial action manipulation, we computed ERPs time locked to the onset of faces presented during the interference and control conditions. N400 amplitude was computed by measuring the mean amplitude of ERPs 300–600 ms post stimulus onset. These values were subjected to a repeated-measures ANOVA with factors emotion (4: happiness, surprise, disgust, anger), facial action (2: interference, control), lateral scalp region of interest (ROI) (3: left, center, right), and anterior–posterior ROI (2: anterior, posterior). (See Fig. 7 for electrodes and regions of interest.)
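A minimal sketch of the dependent measure, assuming baseline-corrected ERPs that begin 200 ms before stimulus onset and the sampling rate reported above; the function and constant names are illustrative.

```python
import numpy as np

FS = 1024               # sampling rate (Hz)
PRE_S = 0.2             # ERP epoch starts 200 ms before stimulus onset
N400_WIN = (0.3, 0.6)   # 300-600 ms post stimulus onset

def n400_mean_amplitude(erp):
    """Mean amplitude (same units as the ERP, here microvolts) in the
    300-600 ms window for one condition, emotion, and region of interest."""
    i0 = int((PRE_S + N400_WIN[0]) * FS)
    i1 = int((PRE_S + N400_WIN[1]) * FS)
    return erp[i0:i1].mean()

# These per-cell values feed the emotion x facial action x ROI
# repeated-measures ANOVA described above.
```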

Fig. 7

Regions of interest and electrode sites used in the analysis of the ERP data (Color figure online)

Analysis revealed an interaction of emotion by facial action, F(3, 39) = 3.3, p = .030, ηp² = 0.20, along with a main effect of anterior–posterior ROI, F(1, 13) = 14.93, p = .002, ηp² = 0.54 (anterior = 1.98 μV, posterior = 6.52 μV).

To follow up on the N400 interaction, we first examined the effects of the facial action manipulation on lower face (happiness, disgust) and upper face (surprise, anger) expressions in a 2 (facial action) × 2 (expression location) repeated-measures ANOVA. This revealed a significant interaction, F(1, 13) = 13.78, p = .003, ηp² = 0.52, in which there was a larger (more negative) N400 for “lower face” expressions in the interference condition (3.01 μV, SEM 1.07 μV) relative to the control (5.23 μV, SEM 1.28 μV); but there was no difference for “upper face” expressions as a function of the facial action manipulation (interference: 4.48 μV, SEM 1.08 μV; control: 4.29 μV, SEM 1.27 μV). This interaction indicates the interference manipulation impacted the extraction of semantic content from both a negative (disgust) and a positive (happiness) expression—both expressed on the lower part of the face. Second, we analyzed each emotion separately using multiple two-tailed, paired-samples t tests contrasting the interference and the control condition. This revealed significant differences for the emotions of happiness, t(13) = 2.21, p = .045, d = 0.57, with a more negative (viz., less positive) mean amplitude across the scalp in the interference (2.04 μV) than the control (4.98 μV) condition; and for disgust, t(13) = 2.24, p = .043, d = 0.34, also with a more negative (viz., less positive) mean amplitude across the scalp for the interference (3.97 μV) condition relative to the control (5.49 μV) condition.

There were no significant differences for either surprise or anger, both t(13) < 1. See Fig. 8 for a depiction of the mean amplitudes of the interference and control conditions at electrode Cz for each of the four emotions. See Fig. 9 for isovoltage maps of the difference between the interference and control conditions for happiness and disgust. These results indicate that faces expressing happiness and disgust elicited larger N400s during the interference condition than during the control. However, the facial action manipulation had no significant effects on ERPs elicited by faces expressing either surprise or anger.

Fig. 8

Mean amplitude waveforms for the interference and control conditions at electrode Cz for each of the four emotions. * indicates a significant difference of p < 0.05 for the mean amplitude between facial action conditions across the scalp from 300–600 ms post stimulus onset

Fig. 9

Topographic representations of the difference in mean amplitude across the scalp (interference–control) in 100-ms intervals for the emotions (happiness and disgust) in which the interference manipulation led to a significantly larger N400 (300–600 ms) relative to the control condition

Discussion

Consistent with sensorimotor theories of emotion recognition, interfering with motor signals generated at the face impacted the N400, an ERP component that has previously been implicated in the retrieval of semantic information from faces (Paller, Gonsalves, Grabowecky, Bozic, & Yamada, 2000; Paulmann & Pell, 2010). Importantly, interference effects were restricted to faces expressing happiness and disgust, expressions whose diagnostic features are primarily located on the lower half of the face. For these lower face expressions, the N400 was more negative in the interference condition than in the control, suggesting the sensorimotor noise produced by the experimental manipulation made it more difficult to understand their emotional content.

The EMG data revealed that the interference manipulation significantly increased motor noise on the lower part of the face at the zygomaticus muscle site. This noise led to selective N400 differences for the lower face expression of happiness. This is consistent with previous behavioral research that used the closed-lip bite manipulation while examining language (J. D. Davis et al., 2015; Niedenthal et al., 2009, Experiment 3), and the open-lip bite manipulation while examining emotional facial expressions (Oberman et al., 2007; Ponari et al., 2012). However, the current research extends those findings by relating interference effects specifically to conceptual processes by measuring the real-time brain response associated with semantic retrieval (N400).

We also found a significant N400 difference for expressions of disgust, suggesting the interference manipulation also disrupted semantic processing of these expressions. Observed N400 effects on disgust faces are consistent with previous research using the same closed-lip bite manipulation that found it impaired recognition of words associated with disgust (Niedenthal et al., 2009, Experiment 3). These data are also in keeping with studies that employed the arguably stronger open-lip bite manipulation and found that it impaired the categorization of disgust expressions (Oberman et al., 2007; Ponari et al., 2012). However, our EMG recordings from the levator did not suggest mimicry of disgust faces during the control condition; nor did they reveal an effect of the interference manipulation.

Null findings in the EMG thus call into question the origin of the observed effects of sensorimotor interference on the N400 elicited by disgust faces. We speculate that EMG recordings from the levator were especially noisy and consequently were subject to Type II errors. Given our modest sample size and lack of any EMG effects at the levator, we interpret the relatively small ERP interference effects to disgust faces with caution, and suggest the need for replication.

Interestingly, ERP effects of our facial action manipulation were evident considerably earlier than the onset of overt mimicry in the control condition. Interference led to ERP effects between 300 and 600 ms after the onset of the happy faces, whereas EMG effects at the zygomaticus were most evident between 500 and 1,500 ms after onset. As noted above, the absence of earlier effects might well be a power limitation of the present study. However, the relative timing of semantic and somatic effects is consistent with an account of facial mimicry as an optional and downstream manifestation of an earlier sensorimotor simulation process. The chopstick manipulation in the present study leads to increased activation of facial muscles, especially the zygomaticus, and disrupts both motor output and somatosensory feedback used in sensorimotor simulation of emotional faces. Because these sensorimotor cues play a functional role in emotion concepts, noisy sensorimotor inputs in the interference condition make it more difficult to activate the concept of happiness that is normally recruited to understand happy faces.

Upper face expressions

Although we found N400 differences for lower face expressions as a function of our manipulation, we did not find any differences for upper face expressions. Anger is primarily associated with activity at the brow (Dimberg, 1982; Larsen, Norris, & Cacioppo, 2003), and relevant research suggests the recognition of anger relies on information from the upper half of the face. For example, anger recognition is impaired by replacing the upper half of an angry face with that from a neutral expression, but not by replacing the lower part of the face (Ponari et al., 2012). Moreover, interference paradigms, such as that in this study, that impact the lower face, have consistently failed to affect the recognition of anger (Oberman et al., 2007; Ponari et al., 2012). Likewise, in this study, motor noise from the interference manipulation did not modulate the N400 to angry faces.

The decoding of facial expressions occurs within 200 ms of stimulus onset, beginning with an analysis of the eyes, followed by a zooming out to the entire face, and then by a reanalysis of the eyes (Schyns, Petro, & Smith, 2009). In general, surprise is primarily associated with eye widening (Schyns et al., 2009), but also involves activity on the lower half of the face, such as opening of the mouth and, in our case, smiling. Our failure to observe an interference effect on the N400 to the exuberant surprise faces is consistent with previous behavioral research that failed to find a difference in emotional categorization of surprise expressions using the arguably stronger open-lip bite manipulation (Ponari et al., 2012). However, this study differed somewhat from prior research in that our surprise expressions were positive in nature, raising the possibility that the crucial information in these faces might not lie in the mouth region targeted by our experimental manipulation.

Consequently, we used the CERT facial expression analysis software to do a post hoc exploration of where the critical information was on our surprise faces. To do so, we compared the evidence for action units expressed on the upper (AU5, involved in raising the upper eyelids) versus lower (AU12, pulling back the corners of the mouth) halves of the face for the happy and exuberant surprise stimuli. These values were subjected to a 2 (evidence: repeated measures) × 2 (expression: between groups) ANOVA. Analysis revealed a significant effect of evidence, F(1, 58) = 37.42, p < .001 (more lower face than upper face activity), qualified by a significant interaction between evidence and expression, F(1, 58) = 9.52, p = .003. There was more evidence for upper face expression (AU5) in the exuberant surprise condition relative to the happy condition, t(58) = 4.04, p < .001, but no significant difference in lower face evidence (AU12), t(58) = 1.56, p > .1. Thus the exuberant surprise expressions had more information around the eyes, and this may have influenced the way they were processed.

Another reason for the lack of an N400 effect for exuberant surprise faces could be that the valence task was so simple for these stimuli that participants had little reason to engage in simulation, opting instead for a purely visual strategy. According to sensorimotor accounts, recognizing emotional expressions involves visual perception processes and sensorimotor simulation (Adolphs, 2002; Pitcher et al., 2008). As evidenced by rTMS, visual processes play a functional role in recognition prior to simulation processes (Pitcher et al., 2008). Much of the categorical work can be done by visual perception alone (Adolphs, 2002; Calder, Keane, Cole, Campbell, & Young, 2010; Smith, Cottrell, Gosselin, & Schyns, 2005). Additionally, valence is considered to be an easier and more basic attribution to make than emotional categorization (Russell, Bachorowski, & Fernández-Dols, 2003; Lindquist, Gendron, Barrett, & Dickerson, 2014), and recall that our exuberant surprise faces received slightly more positive valence ratings than our happy faces.

While the ERPs revealed significant N400 effects for happiness and disgust as a function of our manipulation, the behavioral ratings of valence did not differ. The lack of an effect on ratings supports models of emotion recognition that posit a moderating rather than a mediating role for the sensorimotor cues elicited from facial muscles. These data are thus consistent with emotion recognition models that suggest processing complex social stimuli requires the integration of disparate sorts of cues (Barrett, Wilson-Mendenhall, & Barsalou, 2015; Zaki, 2013), and that valence categorization is a more psychologically basic process than that of emotion categorization (Lindquist et al., 2014).

Conceptual aspects of emotion recognition

According to psychological construction models of emotion, emotions emerge from the integration of external perceptual information, interoceptive information, and conceptualization (Barrett, 2006; 2009; Lindquist & Gendron, 2013). The present findings are consistent with the idea that interfering with interoceptive information (i.e., via sensorimotor noise generated at the zygomaticus) interferes with conceptualization (the retrieval of affective semantic information, i.e., an increased N400) of facial expressions that rely on the lower part of the face for their expression.

The N400 is associated with semantic retrieval, with a larger N400 (more negative) occurring in response to stimuli that engage relatively more semantic processing during comprehension (for a review of the N400 in responses to language, images, and gestures, see Kutas & Federmeier, 2011). The N400 differences in the current experiment can be explained in different ways. According to theories of embodied semantics, emotional concepts are grounded in sensorimotor systems (Niedenthal et al., 2005; Winkielman, Niedenthal, Wielgosz, Eelen, & Kavanagh, 2015). These theories hypothesize that one’s bodily state impacts the online retrieval of emotional semantics; consistent with this, interfering with smiling via the closed-lip bite manipulation led to an enhanced language-based N400 for sentences that made people smile, but not for those that did not (J. D. Davis et al., 2015).

The larger N400 to happy faces in the interference blocks in this study could reflect a reduced understanding of these facial expressions, consistent with the impaired recognition effects found in other research that has manipulated motor activity. The larger N400 could also reflect the recruitment of additional semantic information, such as that involved in mentalizing processes, in order to compensate for the disrupted sensorimotor information. This is consistent with the lack of any interference effects on the valence ratings. In either case, the larger N400 suggests that interfering with the sensorimotor cues influenced the semantic stage of emotion recognition.

These findings provide support for embodied theories, which hypothesize that sensorimotor systems are involved in the representation of emotion concepts (Niedenthal, 2007). In addition, the selectivity of the interference effects argues against skeptical accounts such as the cognitive load and attentional resources hypotheses, as both predict interference effects should be similar for all emotional expressions. Our finding that interference impacted the N400 to one category of positive expression (happiness), but not the other (exuberant surprise), and one category of negative expression (disgust), but not the other (anger), undermines another alternative explanation, namely that the observed N400 effects were driven by the valence of the faces. These data are thus in keeping with accounts such as the conceptual act theory (Barrett, 2013), which propose that semantic and conceptual processes are critical for the recognition of emotion, but not of valence.

Conclusion

The current results further refine theories of the role of sensorimotor processes in emotion recognition by directly measuring neural correlates associated with semantic processing. The N400 effects demonstrate that sensorimotor activity has a functional role in accessing affective semantic information about emotional faces, as disrupting it interfered with semantic retrieval. However, the lack of an effect on exuberant surprise and the lack of an effect on behavioral ratings suggest that the functional role of sensorimotor processes in semantic processing may be moderating rather than mediating in nature, and that their importance may be more compensatory than compulsory. Sensorimotor simulations may become increasingly important as stimuli become more ambiguous (e.g., see Halberstadt, Winkielman, Niedenthal, & Dalle, 2009; Niedenthal, Brauer, Halberstadt, & Innes-Ker, 2001, on mimicry and the identification of ambiguous expressions). If so, this points to an especially prominent role for sensorimotor simulation in real-world face processing, as the emotional faces we encounter in everyday life are often more subtle and complex than those used in the current research.