1 Introduction
Flying robots, or unmanned aerial vehicles (UAVs), are commonly known as drones. In this article, we use the definition of “Robot” in ISO 8373:2021(en) [32], which indicates that only drones with a certain degree of (semi- to full-scale) autonomous capability count as flying robots, not fully piloted ones. We use the term “flying robot” whenever we wish to explicitly emphasize this autonomous characteristic. However, for convenience’s sake, the term “drone” will be used interchangeably in this article, especially when discussing the existing literature. Drones are already in frequent use for different purposes. There is a growing interest in making drone applications more ubiquitous [1, 48], and domestic drone applications are likewise gaining increased interest [49].
As the mechanical functioning of robots tends to generate consequential sounds, robot noise is generally inevitable in real life. The noise of a flying drone is particularly salient, as it requires a continuous lifting force from high-frequency turbulent airflows generated by propellers, thus creating loud consequential sounds. Such noise is even more intense in close-range human-drone interaction. With domestic drone applications becoming increasingly popular [49, 61], noise has become a critical issue for user acceptance of drones that interact with humans in close proximity [11, 12, 34].
Many strategies to solve the drone noise issue have been tried, but due to limitations of size, cost, and weight, no single strategy has yet achieved satisfying results for domestic drones (see Section 2.1). In this article, we investigate adding natural sounds to mask or mitigate noises. Literature has shown that nature exposure and listening to natural sounds may not only lower stress and annoyance but also improve health and create positive affect [8, 10, 68]. Noise masking by adding natural sounds has been proposed and studied in various areas of research and commercial applications (see Section 2.3). Following this line of research, we propose adding natural sounds to a noisy domestic flying robot and conjecture that this could have positive effects on people’s perceptions. We designed a mixed-methods empirical study with human participants (N = 56) to examine this strategy. Through both quantitative and qualitative approaches, we acquired interesting insights that may be helpful for the design of enjoyable human-robot interactions (see Sections 6 and 7).
The contributions of this article are: (i) Investigating the idea of adding natural sounds to flying robots to alter human perception and make close-range interaction with domestic drones more acceptable. (ii) Examining the relationship between sound conditions and proxemics of flying robots, namely, investigating the changes in people’s reported perceptions when adding different natural sounds at different proxemic distances. (iii) Presenting an original empirical study exposing participants to a real flying robot in a realistic and controlled environment, offering a full sensory experience with high realism. (iv) Offering empirical findings, especially qualitative data, that support earlier models of how perceptions and experiences are formed. We present and discuss a visual summary with the potential to explain why an identical stimulus might lead to diverse and even contradictory individual interpretations. (v) Deriving design recommendations for domestic drones and future work.
3 Methodology and Experiments
3.1 Ideation and Hypothesis
Adding bird or rain sounds to a drone while it flies may positively affect people’s perception of its presence, especially for close-range interactions. Proxemic distances may impact how people perceive the flying robot within certain sound conditions.
We designed and prototyped a small flying robot with an on-board loudspeaker to play chosen natural sounds (either birdsong or a rain sound; see details in Section
3.3) while flying at different proxemic distances (see Section
3.4). We hypothesized that adding natural sound would improve people’s perceptions of the drone. We also hypothesized that closer distance has a negative effect on perception due to the louder noise and higher risk of collisions at a closer distance.
3.2 Experimental Design
Our experimental design has two factors with three levels each, namely: 3 (sound conditions: bird, rain, none) \(\times\) 3 (distance conditions: near, middle, far). The setup was a randomized within-subjects approach, where each participant experienced all nine conditions, presented in different orders. For each factor with three levels A, B, C, there are six possible sequences (3! = 6), namely: ABC, ACB, BAC, BCA, CAB, CBA. We listed all six possible permutations of conditions for each factor (namely, either sound or distance) accordingly. For practical reasons, we first stipulated the order of the three distance conditions via complete counterbalancing, so each participant received a prearranged sequence of distance conditions; then, at each distance, the order of the three sound conditions was randomly determined by letting each participant roll a six-sided die.
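The ordering scheme described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure actually used in the study: the rule of cycling through the six distance sequences by participant index, the mapping from die faces to sound permutations, and all names (`schedule_for`, `roll_die`) are hypothetical.

```python
import itertools
import random

DISTANCES = ("near", "middle", "far")
SOUNDS = ("bird", "rain", "none")

# All 3! = 6 orderings of each factor's levels.
DISTANCE_ORDERS = list(itertools.permutations(DISTANCES))
SOUND_ORDERS = list(itertools.permutations(SOUNDS))


def roll_die(rng):
    """Simulate one roll of a six-sided die (faces 1 to 6)."""
    return rng.randint(1, 6)


def schedule_for(participant_index, rng):
    """Return the nine (distance, sound) conditions for one participant.

    Distance order: complete counterbalancing, here by cycling through the
    six permutations by participant index (an assumed assignment rule).
    Sound order: one die roll per distance picks one of six permutations
    (the face-to-permutation mapping is also an assumption).
    """
    distance_order = DISTANCE_ORDERS[participant_index % len(DISTANCE_ORDERS)]
    schedule = []
    for distance in distance_order:
        sound_order = SOUND_ORDERS[roll_die(rng) - 1]
        schedule.extend((distance, sound) for sound in sound_order)
    return schedule


rng = random.Random(0)  # fixed seed only to make the sketch reproducible
schedules = [schedule_for(i, rng) for i in range(12)]
```

With a number of participants that is a multiple of six, every distance sequence is used equally often, while the sound order within each distance is left to chance.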
3.3 Natural Sound Design and Choices
Based on past studies and the particular rationales articulated in Section
2.3, we set up three different sound conditions for our experiment, namely: the original drone noise condition with no additional sound (control condition), the added
birdsong condition, and added
rain sound condition. The three sound conditions will be subsequently referred to as the
none condition, the
bird condition and the
rain condition, respectively, for short.
3.3.1 Original Drone Noise Condition (Control Condition).
No additional sound was added in the original drone consequential sound condition. The humming noise generated by drones is primarily due to their high-speed running motors and rotating propellers [
44], which is a significant pain point for interactive drones [
34].
3.3.2 Birdsong Condition.
We chose the sound of the great tit
(Parus major) as the birdsong sample. This bird is widespread throughout Europe, commonly resident in any sort of woodland [
20]. We chose this common bird because we expected participants to recognize the sound as that of a local bird. Moreover, the song of the great tit was perceived as clear, lively, and cheerful during pilot tests. Typically, the song consists of roughly 3-second strophes separated by 2-second breaks. The strophes consist of a series of phrases composed of one to four different notes (a note being defined as a continuous sound trace on a spectrogram) [
25]. The bird sound recording we used is from the open-source bird sounds website xeno-canto [
74].
3.3.3 Rain Sound Condition (White Noise).
For a rain sample, we chose an ambient sound called
Weather Ambience Heavy Rain Downpour Splatty 01.wav from the Adobe Audition open-source library [
2]. This heavy rain sound is characteristic and loud, which we chose so that participants could recognize it even when the flying robot’s noisy motors are running.
3.4 Proxemic Distance Choices
Three takeoff locations were chosen according to the theory of proxemics [
26,
73] and the size constraints of the experimental setting. The three takeoff distances were designated to be approximately 45 cm, 115 cm, and 185 cm away from participants, i.e., in the range of intimate space, personal space, and social space [
18], respectively. We tried to understand these distances in the context of human-robot interaction. These three different locations will be subsequently referred to as the
near,
middle, and
far locations. The three sound conditions were randomly played at each takeoff location.
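As a rough illustration of how the three takeoff distances map onto proxemic zones, the sketch below classifies a distance in centimeters. The zone boundaries (45 cm, 120 cm, 360 cm) are the values commonly cited for Hall's proxemic zones and are an assumption here; the article itself only states which zone each takeoff location falls in. The function name `proxemic_zone` is hypothetical.

```python
# Upper bounds in cm for each zone. These are commonly cited values for
# Hall's proxemic zones, assumed for illustration (not taken from the article).
ZONE_UPPER_BOUNDS_CM = (
    ("intimate", 45),
    ("personal", 120),
    ("social", 360),
)


def proxemic_zone(distance_cm):
    """Return the proxemic zone name for a distance given in centimeters."""
    if distance_cm < 0:
        raise ValueError("distance must be non-negative")
    for zone, upper_bound in ZONE_UPPER_BOUNDS_CM:
        if distance_cm <= upper_bound:
            return zone
    return "public"


# Approximate takeoff distances reported in this section.
TAKEOFF_LOCATIONS_CM = {"near": 45, "middle": 115, "far": 185}
zones = {name: proxemic_zone(d) for name, d in TAKEOFF_LOCATIONS_CM.items()}
```

Under these assumed boundaries, the near location sits at the outer edge of intimate space, which is why small changes in seating position could shift it into personal space.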
3.5 Choice of Drone and Engineering the Flying Robot
Crazyflie is a small programmable quadcopter that is designed for indoor flying [
15]. Crazyflie provides a wide range of open source Python programs, decks, and components to meet different research demands. In previous work, a mini flying robot with a smooth, stable, and high-precision flight trajectory was considered more acceptable by participants [
34]. To achieve stable and precise flying trajectories, we chose the lighthouse positioning system to assist our experiment. With two lighthouse base stations and a lighthouse positioning deck on top of the quadcopter, the flying robots were able to fly with precision under our program’s control. To play natural sounds, we mounted a 28 mm round metal loudspeaker to the bottom side of the drone. The loudspeaker was powered through wires by a 5 W Bluetooth amplifier with an extra Li-ion battery. The wires allowed the Bluetooth board and the battery to be installed in a small box and hidden under the desk during the test, a temporary solution that saves on-board battery by lowering the takeoff weight. The small drone and accessories used in the experiment are shown in Figure
1. Following a
Research through Design (RtD) approach [
80], we decided on the following drone trajectory: The robot would first take off and fly vertically to a height of 40 cm above the table, then hover for 10 seconds, and finally land vertically on the table. The selected natural sounds were adapted to the drone flight durations and were played through the on-board loudspeaker.
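The trajectory above (vertical takeoff to 40 cm, 10-second hover, vertical landing) can be written down as a simple height-over-time profile. The sketch below is hardware-free and does not use the Crazyflie libraries; the vertical speed of 0.2 m/s, the 0.1 s sampling step, and the function name `height_profile` are assumptions made for illustration, as the article does not report them.

```python
def height_profile(target_height_m=0.40, hover_s=10.0,
                   vertical_speed_mps=0.2, dt=0.1):
    """Return (time_s, height_m) samples for takeoff, hover, and landing.

    target_height_m and hover_s follow the article; vertical_speed_mps and
    dt are assumed values chosen only for this sketch.
    """
    climb_s = target_height_m / vertical_speed_mps
    total_s = 2 * climb_s + hover_s
    steps = int(round(total_s / dt))
    samples = []
    for i in range(steps + 1):
        t = i * dt
        if t < climb_s:
            # Vertical climb at constant speed.
            z = vertical_speed_mps * t
        elif t <= climb_s + hover_s:
            # Hover at the target height.
            z = target_height_m
        else:
            # Vertical descent at constant speed, clamped at table level.
            z = max(0.0, target_height_m
                    - vertical_speed_mps * (t - climb_s - hover_s))
        samples.append((round(t, 3), round(z, 3)))
    return samples


profile = height_profile()
```

Such a profile also gives the total flight duration (here 14 s under the assumed speed), which is the length an added sound clip would need to cover.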
3.6 Experimental Setup
The experiment was conducted in a soundproof chamber to avoid interference from outside noise. To make the experimental environment closely approximate the household setting where domestic flying robots would be expected, we placed several pieces of furniture as shown in Figure
2. Participants sat in front of the long desk with two lighthouse positioning base stations set behind them out of sight. The desk and chair were pre-located and marked to keep all participants at a similar distance from the flying robot. A long blanket with three position marks was placed in the middle of the desk. This blanket was used as an absorber to decrease the reflected sound wave from the desk, and the marks on the blanket were used to show the three different takeoff settings.
3.7 Preliminary Study and Engineering Evaluation
During a preliminary study, we recorded the three chosen sound conditions and measured their sound pressure levels. The total A-weighted SPL of the
bird condition and the
rain condition were calibrated to a similar level during the test (i.e., 71 dBA, 66 dBA, 61 dBA from the
near to the
far locations, respectively). The frequency spectra of the three sound conditions are shown in Figure
3, which matched our expectations well. The flying robot noise is a wide-bandwidth noise with its energy mainly concentrated below 1 kHz. The rain sound is more constant and close to white noise, which is known to achieve a decent sound-masking effect when added to a wide-bandwidth noise, as some noise features may be hidden. However, bird sound is usually high-frequency and narrow-bandwidth. For this reason, purely from a spectrum engineering perspective, adding bird sound to a wide-bandwidth noise would have only a very limited sound-masking effect. Nonetheless, previous literature shows that adding bird sound to a similar noise can work very well [
13,
27], even better than water sound [
13]. This suggests that how humans perceive sounds is not purely dependent on the features of mechanical waves, in accordance with the biological principles mentioned later on in the discussion (see Section
6.2).
3.8 Participants and Study Procedures
We recruited participants through multiple channels, including social media, flyers on campus and at student residences, and invitations sent to friends and colleagues (snowball sampling). Each session involved one individual participant, and each participant received a cinema ticket after the test as compensation. As all experiments were carried out entirely in Sweden, we carefully followed the Swedish Ethics Review Authority’s guidelines [
21] and ensured that the national Ethics Review Act [
60] and relevant regulatory requirements were complied with.
3.8.1 Safety Precautions.
To avoid physical harm to participants in the case of the flying robot losing control, we did hundreds of tests before the formal experiment and implemented a set of safety precautions: (i) all participants were given a blanket and instructed to protect their bodies by throwing the unfolded blanket over the flying robot to pull it down if the robot happened to deviate from the planned trajectories; (ii) participants who were not already wearing their own glasses were required to wear goggles to protect their eyes; (iii) the robot’s battery was exchanged with a fully charged one after every three takeoffs, during the manual changes from one takeoff location to another, to avoid battery voltage drop and ensure stable operation.
3.8.2 Signing Consent Form.
Prior to the study, every participant was given the Research Consent Form and enough time to read it. They were then invited to ask any questions before giving their consent to the mentioned procedures, including being observed and audio-recorded, by signing the form agreeing to participate.
3.8.3 Study Phases.
There were three phases during each study: (i) In the briefing phase (around 10 min), the researchers introduced the study in detail, including the above-mentioned safety precautions. Participants were told that “We (the researchers) hope that, through your participation, we will learn more about the challenges and opportunities for designing flying robots, especially in terms of sound features and the close-range interactions.” To reduce the effect of demand characteristics, we deliberately did not inform participants about our hypotheses, and we told every participant: “There are no right or wrong answers. We want you to honestly note down your evaluations and later tell us about your feelings and thoughts.” (ii) In the experimental phase (around 20 min), participants were exposed to nine performances by a small sonified flying robot, with the order of performances randomized to mitigate sequence effects. After each performance, participants were asked to evaluate six features in a questionnaire. Participants were also asked to rate their preferences among the three sound conditions at each distance. We filmed each experimental condition from the participants’ first-person perspective, and the video clips can be accessed via a link.\(^1\) (iii) In the debriefing phase (around 15 min), participants were interviewed regarding their experience, thoughts, and comments on the aforementioned performances.
We ended up with 56 participants (31 self-identified males, 24 self-identified females, and 1 person who self-identified as other), leading to a total of 56 experiment sessions. The participants’ ages ranged from 20 to 59 (M = 28.5, SD = 8.63). Each session took around 40 to 60 minutes, with most differences occurring in the briefing and debriefing phases, as some participants had more to talk about than others. Eight participants’ ratings were removed due to self-reported hearing impairment (2), technical failures during the test (3), and written notes on their questionnaires indicating that they could not correctly identify all the sound conditions (3). In the end, we included 48 participants’ quantitative data from answered questionnaires in the statistical analysis, with the nine experimental conditions ordered via complete counterbalancing of the distance factor (each of the six possible sequences used eight times, 8 \(\times\) 6 = 48) and simple randomization of the sound factor (by rolling a die) at each distance. Nevertheless, we still considered the interview data from all participants to be very valuable, as it adequately represented the participants’ experiences, so we included all 56 participants’ qualitative data in the thematic analysis. Please see the following sections.
3.9 Measurements of Reported Perception
Participants were asked to evaluate each performance after its end with respect to six measurements describing the perceived characteristics of the flying robot: “loud,” “sharp,” “pleasant,” “safe,” “relaxing,” and “attractive,” on a scale of 0–10, with 0 representing “not at all” and 10 “extremely.” These characteristics were selected based on both existing literature and the focus of this study, as indicated below. After all the performances finished, every participant was also asked to rank their preference for the three sound conditions played at each distance by giving 0 points to the least liked, 1 point to the medium favorite, and 2 points to the most favored.
The measurements of perceived loudness and perceived sharpness were intended to examine how participants would feel about adding natural sounds to the drone noise soundscape. For the remaining four measurements, participants were explicitly asked to consider their full sensory experience with the demonstrated flying robot performances. Pleasantness and attractiveness are the most commonly used perceptual assessment criteria in previous studies in both user experience [
38] and soundscape quality [
3,
22,
24]. Safety and causes of stress are further critical parameters for user acceptance of drones used in close proximity; thus, we wanted to examine both perceived safety and perceived absence of stress.
3.10 Post-experiment Interview Questions
The first and second authors conducted all interviews together, with detailed interview notes taken by each author separately. The interviews were primarily conducted in English. However, a number of participants were international students from China newly arrived in Sweden and felt more comfortable communicating in their native language. As both the first and second authors were native Mandarin Chinese speakers, these interviews were conducted in Chinese. Participants from other countries did not indicate a need to switch to another language. The full interviews were audio-recorded.
We used a semi-structured interview guide to elicit information about participants’ experiences and perceptions of the different noise conditions. Our questions addressed the participants’ preferences among the performances (and the reasons for these preferences), their impressions of the tested sounds, their personal background, and their impressions of the study setup. Finally, the interviewers asked follow-up questions when appropriate, and the participants had the opportunity to add information they considered important. The specific questions used are listed in Table
1.
4 Quantitative Data (Measurements of Reported Perception) Analysis and Results
In this section, we first describe the overall statistical methods we used for analyzing the quantitative data, with a summary of the results regarding the six measurements’ effects on reported perceptions in Table
2. Then, we list the detailed results for each of the six measurements in each of the following subsections, namely:
loudness,
sharpness,
pleasantness,
safety,
relaxedness, and
attractiveness; followed by the last subsection, which discusses the ordinal preference measurement. We provide visualizations for each measurement to support understanding of the data.
4.1 Overall Description of Statistical Methods
Statistical analysis was done using IBM SPSS Statistics (version 28.0.0.0) [
29]. For our within-subjects factorial design, we conducted two-way repeated measures ANOVAs on reported perception for each of the six measurements. For each measurement, we checked the significance of the main effects of each of the two factors (sound and distance) and the interaction effect between them. In cases where one factor’s main effect was significant, we carried out a post hoc analysis through multiple comparisons with Fisher’s
Least Significant Difference (LSD) test to examine the relationships between the corresponding individual levels. Regardless of whether there was a significant interaction effect, we conducted simple effects tests to compare all pairs of three levels of one factor for each of the three levels of the other factor. The simple effects tests were done with one-way repeated measures ANOVAs followed by multiple comparisons with the LSD tests. We checked the normality of residuals via the Shapiro-Wilk test and the homogeneity of variance via Levene’s test. We report partial eta squared as the estimate of effect size, denoted as
\(\eta ^2_p\), as it offers a more comparable estimate for factorial designs with multiple independent variables [
39]. Table
2 presents a summary of the statistical analysis results.
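Partial eta squared is defined as \(\eta ^2_p = SS_{effect} / (SS_{effect} + SS_{error})\), which can equivalently be recovered from a reported F statistic and its degrees of freedom as \(F \cdot df_{effect} / (F \cdot df_{effect} + df_{error})\). The short sketch below applies this identity to the loudness statistics reported in Section 4.2 as a consistency check; the function name is hypothetical, and the code is not the SPSS procedure used for the analysis.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Recover partial eta squared from an F statistic and its two dfs."""
    if f_value < 0 or df_effect <= 0 or df_error <= 0:
        raise ValueError("F must be non-negative and dfs must be positive")
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Perceived loudness, as reported in Section 4.2.
sound_effect = round(partial_eta_squared(18.44, 2, 94), 3)     # 0.282
distance_effect = round(partial_eta_squared(24.87, 2, 94), 3)  # 0.346
interaction = round(partial_eta_squared(3.58, 4, 188), 3)      # 0.071
```

The recomputed values match the reported effect sizes to three decimals, so the same function can be used to cross-check the other F statistics in this section.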
The preference rating is different from the six parameters mentioned above. Each participant ranked their preferences for the three performances played at each distance by giving 2 points for their most preferred, 1 point for the next preferred, and 0 points for their least favorite. We conducted a one-way repeated measures ANOVA at each distance to compare the effects of the three sound conditions on the participants’ preferences. Where the ANOVA revealed a significant difference, we used the LSD test to see the relationships among the three sound conditions at the specific distances.
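The point scheme for the preference ranking can be illustrated as follows. The example rankings are made up purely for demonstration and are not study data; the function names are hypothetical.

```python
SOUNDS = ("none", "bird", "rain")


def ranking_to_points(ranking):
    """Convert one ranking (most to least preferred) into 2/1/0 points."""
    if sorted(ranking) != sorted(SOUNDS):
        raise ValueError("ranking must contain each sound condition once")
    last = len(ranking) - 1
    return {sound: last - position for position, sound in enumerate(ranking)}


def mean_points(rankings):
    """Average the points per sound condition over several participants."""
    totals = {sound: 0 for sound in SOUNDS}
    for ranking in rankings:
        for sound, points in ranking_to_points(ranking).items():
            totals[sound] += points
    return {sound: totals[sound] / len(rankings) for sound in SOUNDS}


# Made-up rankings at one distance, for illustration only (not study data).
example_rankings = [
    ("bird", "rain", "none"),
    ("rain", "bird", "none"),
    ("bird", "none", "rain"),
]
example_means = mean_points(example_rankings)
```

Because every participant distributes exactly 0 + 1 + 2 = 3 points at each distance, the three mean scores always sum to 3, so the scores of the three sound conditions are not independent of each other.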
For the six measurements of reported perceptions, the data were plotted as box-whisker plots with asterisks highlighting the significance level, where * indicates p < .05, ** indicates p < .01, and *** indicates p < .001. The preference data were plotted as a stacked bar chart. See related figures in the following sections.
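The asterisk convention used in the plots amounts to a simple threshold mapping, sketched below with a hypothetical helper name.

```python
def significance_stars(p_value):
    """Map a p-value to the asterisk notation used in the plots."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p-value must lie between 0 and 1")
    if p_value < 0.001:
        return "***"
    if p_value < 0.01:
        return "**"
    if p_value < 0.05:
        return "*"
    return ""  # not significant at the .05 level
```

For instance, a p-value of .008 maps to two asterisks, whereas .012 maps to one.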
4.2 Perceived Loudness
Figure
4 shows the ratings of perceived loudness at the three locations with three sound conditions. The main effects of the sounds (F(2,94) = 18.44, p < .001,
\(\eta ^2_p\) = 0.282) and the distances (F(2,94) = 24.87, p < .001,
\(\eta ^2_p\) = 0.346) on perceived loudness were both significant. The interaction between sounds and distances was also significant (F(4,188) = 3.58, p = .008,
\(\eta ^2_p\) = 0.071).
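The degrees of freedom in these F statistics follow from the 3 \(\times\) 3 within-subjects design and the 48 participants included in the quantitative analysis. The sketch below reproduces them under the assumption that no sphericity correction was applied, which is consistent with the integer values reported; the function name is hypothetical.

```python
def rm_anova_dfs(n_participants, levels_a, levels_b):
    """Degrees of freedom for a two-way repeated measures ANOVA.

    Returns (df_effect, df_error) for factor A, factor B, and their
    interaction, assuming uncorrected (sphericity-assumed) tests.
    """
    df_a = levels_a - 1
    df_b = levels_b - 1
    df_ab = df_a * df_b
    df_subjects = n_participants - 1
    return {
        "A": (df_a, df_a * df_subjects),
        "B": (df_b, df_b * df_subjects),
        "AxB": (df_ab, df_ab * df_subjects),
    }


dfs = rm_anova_dfs(n_participants=48, levels_a=3, levels_b=3)
```

With 48 participants and three levels per factor, each main effect is tested on (2, 94) degrees of freedom and the interaction on (4, 188).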
The simple effects analyses indicated for the three distances: (i) At the near location, the mean perceived loudness rating for the bird condition (M = 7.38, SD = 1.63) was significantly higher than both the none (M = 6.06, SD = 1.92) and the rain conditions (M = 6.46, SD = 1.66), and the rain condition was also rated significantly louder than the none condition. The effect size was \(\eta ^2_p\) = 0.388. (ii) At the middle location, the mean perceived loudness rating for the bird condition (M = 6.52, SD = 1.82) was significantly higher than both the none (M = 5.65, SD = 1.39) and the rain conditions (M = 6.02, SD = 1.58). The effect size was \(\eta ^2_p\) = 0.273. (iii) At the far location, only the mean perceived loudness rating for the bird condition (M = 5.73, SD = 1.90) was significantly higher than the none condition (M = 4.96, SD = 1.96), with an effect size of \(\eta ^2_p\) = 0.140.
For the three sound conditions: (i) For the none condition, the mean perceived loudness ratings at the near (M = 6.06, SD = 1.92) and middle locations (M = 5.65, SD = 1.39) were both significantly higher than the far location (M = 4.96, SD = 1.96). The effect size was \(\eta ^2_p\) = 0.213. (ii) For the bird condition, the mean perceived loudness rating at the near location (M = 7.38, SD = 1.63) was significantly higher than both the middle (M = 6.52, SD = 1.82) and the far locations (M = 5.73, SD = 1.90). The difference between the middle and far locations was also significant, with an effect size of \(\eta ^2_p\) = 0.487. (iii) For the rain condition, the mean perceived loudness ratings at the near (M = 6.46, SD = 1.66) and middle locations (M = 6.02, SD = 1.58) were both significantly higher than the far location (M = 5.48, SD = 2.06). The effect size was \(\eta ^2_p\) = 0.209.
4.3 Perceived Sharpness
Figure
5 shows the ratings of perceived sharpness at the three locations with three sound conditions. The main effects of the sounds (F(2,94) = 27.47, p < .001,
\(\eta ^2_p\) = 0.369) and the distances (F(2,94) = 13.60, p < .001,
\(\eta ^2_p\) = 0.224) on perceived sharpness were both significant. The interaction between sounds and distances was not significant (F(4,188) = 1.82, p = .13,
\(\eta ^2_p\) = 0.037), but the simple effects analyses nevertheless indicated a possible interaction.
The simple effects analyses indicated for the three distances: (i) The mean perceived sharpness rating of the bird condition (near: M = 7.65, SD = 1.72; middle: M = 6.73, SD = 2.14; far: M = 6.29, SD = 2.14) was significantly higher than the other two sound conditions. (ii) The mean perceived sharpness ratings for the none (near: M = 5.73, SD = 2.10; middle: M = 5.33, SD = 1.83; far: M = 4.90, SD = 2.03) and the rain conditions (near: M = 5.52, SD = 1.88; middle: M = 4.81, SD = 1.79; far: M = 4.85, SD = 1.99) were less decisive. The effect sizes for the three locations were \(\eta ^2_p\) = 0.404 (near), \(\eta ^2_p\) = 0.267 (middle), \(\eta ^2_p\) = 0.202 (far).
For the three sound conditions: (i) For the none condition, the mean perceived sharpness rating of the near location (M = 5.73, SD = 2.10) was significantly higher than the far location (M = 4.90, SD = 2.03). The effect size was \(\eta ^2_p\) = 0.093. (ii) For the bird condition, the mean perceived sharpness rating at the near location (M = 7.65, SD = 1.72) was significantly higher than both the middle (M = 6.73, SD = 2.14) and the far locations (M = 6.30, SD = 2.14). The difference between the middle and the far locations was also significant. The effect size was \(\eta ^2_p\) = 0.294. (iii) For the rain condition, the perceived sharpness rating at the near location (M = 5.52, SD = 1.88) was significantly higher than both the middle (M = 4.81, SD = 1.79) and the far locations (M = 4.85, SD = 1.99). There was no significant difference between the middle and the far locations. The effect size was \(\eta ^2_p\) = 0.294.
4.4 Perceived Pleasantness
Figure
6 shows the ratings of perceived pleasantness at the three locations with three sound conditions. The main effects of sounds (F(2,94) = 6.01, p = .004,
\(\eta ^2_p\) = 0.113) and distances on perceived pleasantness (F(2,94) = 13.08, p < .001) were both significant, as was the interaction between sounds and distances (F(4,188) = 3.40, p = .010,
\(\eta ^2_p\) = 0.068).
The simple effects analyses indicated for the three locations: (i) The mean perceived pleasantness ratings for the sound conditions did not significantly differ at the near location. (ii) At the middle location, the mean perceived pleasantness ratings for both the bird (M = 5.48, SD = 2.52) and the rain conditions (M = 5.37, SD = 1.96) were significantly higher than the none condition (M = 4.65, SD = 1.72). There was no significant difference between the bird and the rain conditions. The effect size was \(\eta ^2_p\) = 0.066. (iii) At the far location, the mean perceived pleasantness rating for the bird condition (M = 6.21, SD = 2.31) was significantly higher than both the none (M = 4.71, SD = 2.09) and the rain conditions (M = 5.08, SD = 2.14), with no significant difference between the none and the rain conditions. The effect size was \(\eta ^2_p\) = 0.186. It seems that the sound conditions played an important role in the perception of pleasantness at the far and middle distances, but not at the near location.
For the three sound conditions: (i) For the none condition, the mean perceived pleasantness ratings of both the middle (M = 4.65, SD = 1.72) and the far locations (M = 4.71, SD = 2.09) were significantly higher than the near location (M = 3.90, SD = 2.16), with no significant difference between the middle and the far. The effect size was \(\eta ^2_p\) = 0.130. (ii) For the bird condition, the mean perceived pleasantness rating at the far location (M = 6.21, SD = 2.31) was significantly higher than both the near (M = 4.56, SD = 2.53) and the middle locations (M = 5.48, SD = 2.52). The difference between the near and the middle locations was also significant. The effect size was \(\eta ^2_p\) = 0.246. (iii) For the rain condition, only the mean perceived pleasantness rating of the middle location (M = 5.38, SD = 1.96) was significantly higher than the near location (M = 4.54, SD = 2.27). The effect size was \(\eta ^2_p\) = 0.068. The distances played an important role in the perception of pleasantness for all three sound conditions.
4.5 Perceived Safety
Figure
7 shows the
safety ratings at the three locations with three sound conditions. The main effect of sounds on perceived safety (F(2,94) = 1.49, p = .23,
\(\eta ^2_p\) = 0.031) was not significant. However, the main effect of distances on perceived safety (F(2,94) = 29.68, p < .001) was significant. We thus found no evidence that the sound conditions affected the perception of safety, whereas the distances clearly did.
The simple effects analyses indicated for the three sound conditions: (i) For both the none and the bird conditions, the mean perceived safety ratings at the middle (none: M = 6.79, SD = 2.23; bird: M = 7.02, SD = 2.48) and the far locations (none: M = 7.29, SD = 2.46; bird: M = 7.73, SD = 2.39) were significantly higher than at the near location (none: M = 5.13, SD = 2.74; bird: M = 5.35, SD = 2.74), and the mean perceived safety rating at the far location was also significantly higher than at the middle location. The effect size was \(\eta ^2_p\) = 0.348 for the none condition and \(\eta ^2_p\) = 0.315 for the bird condition. (ii) For the rain condition, the mean perceived safety ratings at both the middle (M = 6.96, SD = 2.12) and the far locations (M = 7.30, SD = 2.49) were significantly higher than at the near location (M = 5.35, SD = 2.62). The effect size was \(\eta ^2_p\) = 0.313. The distances played an important role in the perception of safety for all three sound conditions.
4.6 Perceived Relaxedness
Figure
8 shows the ratings of relaxedness at the three distances with three sound conditions. The main effects of sounds (F(2,94) = 5.33, p = .006,
\(\eta ^2_p\) = 0.102) and distances on perceived relaxedness (F(2,94) = 27.77, p < .001,
\(\eta ^2_p\) = 0.371) were both significant. The interaction between sounds and distances was not significant (F(4,188) = 2.05, p = .09,
\(\eta ^2_p\) = 0.042), but simple effects analyses indicated the possibility of an interaction.
The simple effects analyses indicated for the three distances: (i) The mean perceived relaxedness ratings between sound conditions did not significantly differ at the near location. (ii) At the middle location, the mean perceived relaxedness ratings in both the bird (M = 5.56, SD = 2.68) and the rain (M = 5.52, SD = 2.26) conditions were significantly higher than the none condition (M = 4.63, SD = 2.10), with no significant difference between the bird and the rain conditions. The effect size was \(\eta ^2_p\) = 0.075. (iii) At the far location, the mean perceived relaxedness rating for the bird condition (M = 6.10, SD = 2.35) was significantly higher than both the none (M = 4.71, SD = 2.19) and the rain conditions (M = 5.27, SD = 2.52), with no significant difference between the none and the rain conditions. The effect size was \(\eta ^2_p\) = 0.251.
For the three sound conditions, the none, the bird, and the rain conditions: The mean perceived relaxedness ratings at the middle (none: M = 4.62, SD = 2.01; bird: M = 5.56, SD = 2.68; rain: M = 5.52, SD = 2.26) and the far (none: M = 4.70, SD = 2.19; bird: M = 6.10, SD = 2.35; rain: M = 5.27, SD = 2.52) locations were significantly higher than at the near location (none: M = 3.75, SD = 2.12; bird: M = 4.33, SD = 2.31; rain: M = 4.19, SD = 2.27), with no significant difference between the far and the middle locations. The effect size was \(\eta ^2_p\) = 0.145 for the none condition, \(\eta ^2_p\) = 0.270 for the bird condition, and \(\eta ^2_p\) = 0.226 for the rain condition.
4.7 Perceived Attractiveness
Figure
9 shows the ratings of perceived attractiveness at the three distances with three sound conditions. The main effects of sounds (F(2,94) = 10.30, p < .001,
\(\eta ^2_p\) = 0.180) and distances on perceived attractiveness (F(2,94) = 11.38, p < .001,
\(\eta ^2_p\) = 0.195) were both significant. The interaction between sounds and distances was also significant (F(4,188) = 3.32, p = .012,
\(\eta ^2_p\) = 0.066).
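As a rough consistency check on the effect sizes reported above, partial eta squared can be recovered from an F statistic and its degrees of freedom through the standard identity \(\eta ^2_p = F \cdot df_{effect} / (F \cdot df_{effect} + df_{error})\). The sketch below applies this identity to the three F tests reported for perceived attractiveness. The function name `partial_eta_squared` and the dictionary layout are illustrative assumptions, not part of the original analysis pipeline.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Recover partial eta squared from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# F statistics and degrees of freedom as reported for perceived attractiveness.
reported = {
    "sound": (10.30, 2, 94),
    "distance": (11.38, 2, 94),
    "sound x distance": (3.32, 4, 188),
}

for effect, (f_value, df_effect, df_error) in reported.items():
    # Round to three decimals, matching the precision used in the text.
    print(f"{effect}: {partial_eta_squared(f_value, df_effect, df_error):.3f}")
```

Rounded to three decimals, the recovered values agree with the reported effect sizes of 0.180, 0.195, and 0.066.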
The simple effects analyses indicated for the three distances: (i) At the near location, only the mean perceived attractiveness rating of the bird condition (M = 4.79, SD = 2.77) was significantly higher than the none condition (M = 3.98, SD = 2.61), with no significant difference between the other conditions. The effect size was \(\eta ^2_p\) = 0.054. (ii) At the middle location, the mean ratings of perceived attractiveness in both the bird (M = 5.79, SD = 2.43) and the rain conditions (M = 5.35, SD = 2.23) were significantly higher than the none condition (M = 4.43, SD = 2.11), with no significant difference between the bird and rain conditions. The effect size was \(\eta ^2_p\) = 0.167. (iii) At the far location, the mean perceived attractiveness rating for the bird condition (M = 6.21, SD = 2.44) was significantly higher than both the none (M = 4.48, SD = 2.60) and the rain conditions (M = 5.04, SD = 2.34), with no significant difference between the none and the rain conditions. The effect size was \(\eta ^2_p\) = 0.351.
For the three sound conditions: (i) For the none condition, only the mean perceived attractiveness rating at the far location (M = 4.48, SD = 2.60) was significantly higher than at the near location (M = 3.98, SD = 2.61), with no significant difference between other locations. The effect size was \(\eta ^2_p\) = 0.047. (ii) For the bird condition, the mean perceived attractiveness rating at the far location (M = 6.21, SD = 2.44) was significantly higher than at the middle location (M = 5.79, SD = 2.43), and the middle condition was significantly higher than the near condition (M = 4.79, SD = 2.77). The effect size was \(\eta ^2_p\) = 0.237. (iii) For the rain condition, the mean perceived attractiveness ratings at both the middle (M = 5.35, SD = 2.23) and the far locations (M = 5.04, SD = 2.34) were significantly higher than at the near location (M = 4.48, SD = 2.63), with no significant difference between the far and the middle locations. The effect size was \(\eta ^2_p\) = 0.106.
4.8 Preference
Figure
10 shows the participants’ mean preference ratings among the three sound conditions at each distance. Each participant ranked the three performances at each distance by giving 2 points to their favorite, 1 point to their second favorite, and 0 points to their least favorite.
At both the near and the middle locations, the rain sound was most preferred, followed by the bird sound, and the none condition was the least preferred. However, the gap between the rain and the bird sound at the middle location was smaller than at the near location. The mean preference rating did not significantly differ between sound conditions at either the near or the middle location.
By contrast, at the far location, the bird condition was ranked the highest, the rain condition second, and the none condition remained last. Remarkably, the bird scored almost double that of the rain and triple that of the none condition. The main effect of sound on preference was significant at the far location (F(2,94) = 19.39, p < .001, \(\eta ^2_p\) = 0.292), but not significant at the near (F(2,94) = 1.02, p = .364, \(\eta ^2_p\) = 0.021) and middle (F(2,94) = 1.02, p = .364, \(\eta ^2_p\) = 0.043) locations.
For all three locations, adding natural sounds (bird or rain) seemed to have a positive effect on participants’ overall preferences; the distance also had an obvious influence on the preference ratings. For instance, ratings of the bird condition improved markedly from the near to the far location, indicating that participants particularly preferred the bird sound at the far, but not so much at the middle or near location.
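The 2/1/0 point scheme used for the preference ratings in this section can be sketched as follows. This is a minimal illustration only: the rankings in `example` are hypothetical and not study data, and the helper names `preference_points` and `mean_preference` are assumptions introduced here.

```python
from statistics import mean

POINTS = (2, 1, 0)  # points for first, second, and third place


def preference_points(ranking):
    """Map one participant's ranking (most to least preferred) to points."""
    return {condition: POINTS[place] for place, condition in enumerate(ranking)}


def mean_preference(rankings):
    """Average the points per sound condition over all participants."""
    scored = [preference_points(r) for r in rankings]
    conditions = scored[0].keys()
    return {c: mean(s[c] for s in scored) for c in conditions}


# Hypothetical rankings from three participants at one distance (not study data).
example = [
    ("bird", "rain", "none"),
    ("bird", "none", "rain"),
    ("rain", "bird", "none"),
]
print(mean_preference(example))
```

Because every participant distributes exactly three points, the mean ratings of the three conditions at a given distance always sum to 3, so a gain for one condition necessarily comes at the expense of the others.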
5 Qualitative Data (Interviews) Analysis and Results
The interview responses were analyzed using a thematic analysis, which is a useful and flexible analytic method for identifying themes or patterns from qualitative data in research in and beyond psychology [
9]. Data analysis was done in six phases as outlined below, following suggestions by Braun and Clarke [
9].
In Phase 1, the qualitative data were analyzed based on the interview notes. We performed three quality checks before taking this decision. First, the first and second authors conducted all interviews together, with detailed interview notes taken by each author separately. Second, the two authors compared their notes before entering the qualitative data into an Excel spreadsheet corresponding to the asked questions to ensure complete and objective data extraction. In the case of disagreement between the two authors’ notes, both authors would listen together to the respective audio recording to reach an agreement. Third, the Excel spreadsheet summarizing the notes and audio recordings of the interviews was shared with the third author. The third author transcribed two randomly selected interviews and compared these transcripts to the interview notes, finding that the interview notes adequately captured the interview content (consistency check). In Phase 2, after familiarizing himself with all shared materials, the third author used MaxQDA 2020 [
42] to code the qualitative data, while the first author conducted coding via paper-and-pencil. Specifically, we used an inductive coding approach in which we developed coding themes based on the collected data rather than prior theories [
9]. During coding, we took care to focus on the explicit, semantic meaning first before moving on to inferred meaning in a later phase [
9]. For example, the notes “very artificial, mechanic” (P39), “not real” (P41), and “artificial, metallic” (P50) referring to the rain sounds were coded as “rain sound: artificial.” Both authors repeatedly went through their codes and the interview notes as a quality check. However, researchers have an active role in this coding, as coding is never fully independent of interpretation [
9]. As a consequence, in Phase 3, the first and third authors collated and discussed their codes to identify initial candidates for overarching themes. Based on these discussions, they revised their analysis of some of the codes in Phase 4, and the second author reviewed and agreed upon this revision as an additional quality check. For example, we identified a potential theme that the
sharpness of the
bird sound might have created a negative impression but did not sufficiently distinguish between whether participants referred to this sound as “sharp” or “too sharp.” As a consequence, we went through the notes again and adapted our codes as appropriate. Afterwards, in Phase 5, the research team created four themes that consolidated important aspects of participants’ experiences and thoughts about the experiment. Specifically, we identified the themes of (1) familiarity with the sounds, (2) personal experiences and preferences regarding the sounds, (3) the social dimension of proxemics, and (4) safety associations. These themes were rigorously discussed within the author team, including how they relate to each other and to the sound conditions. Figure
11 provides a visualization of the themes with definitions and sub-themes in the form of an affinity diagram. In Phase 6, we incorporated these identified themes into the present article. We will report on the themes in terms of their impact on the different sound conditions (
bird and
rain) in the following section. Finally, we will report findings regarding these themes that were independent of particular sound conditions. All provided quotes were transcribed and some had to be translated into English first (as some interviews were conducted in Mandarin Chinese).
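The bookkeeping behind the semantic coding step in Phase 2 can be illustrated with a small sketch. The authors coded with MaxQDA 2020 and paper-and-pencil; the tuple layout and the helper `participants_per_code` below are hypothetical and serve only to show how notes, codes, and participants relate. The three notes are the ones quoted in the example above.

```python
from collections import defaultdict

# (participant, interview note, assigned code); notes quoted from the example above.
coded_notes = [
    ("P39", "very artificial, mechanic", "rain sound: artificial"),
    ("P41", "not real", "rain sound: artificial"),
    ("P50", "artificial, metallic", "rain sound: artificial"),
]


def participants_per_code(notes):
    """Group participant IDs under each code, preserving first-seen order."""
    grouped = defaultdict(list)
    for participant, _note, code in notes:
        if participant not in grouped[code]:
            grouped[code].append(participant)
    return dict(grouped)


print(participants_per_code(coded_notes))
```

A grouping of this kind makes it easy to see how many participants support a code before codes are collated into candidate themes in Phase 3.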
5.1 Themes Identified for the Bird Sounds
Regarding the theme of familiarity, the majority of participants mentioned that the selected birdsong was sharp and implied they had a common understanding of the objective features of the birdsong as a high-frequency sound. However, the theme of familiarity had a strong impact on the participants’ attitudes toward the added bird sound, which was closely related to the theme of personal experiences. Since the bird species (great tit) we chose is widespread in Europe, the attitudes of participants who had lived in Europe for at least some years generally tended to be more positive, for they mostly associated the sound with a known bird and nature. P25 stated “I feel relaxed when I hear it. It is a nice melody.” P33 said “It sounds very pleasant. The bird reminds me of going to the zoo, like the jungle area in Universeum (a public science center and museum in Gothenburg).” P31 could even identify the exact bird species from the birdsong. However, among participants newly arrived in Europe (mainly international students), most reported that the sound did not sound like a bird, but rather like an alarm, and for this reason their response tended to be negative. P12 mentioned the sound was sharp and annoying and reminded her of a fire alarm beeping. P18 emphasized “It is a threatening alarm. Feels like it’s coming to attack me!” The attitudes of participants also varied depending on whether they were more outdoors or indoors people, as outdoorsy people were more likely to prefer this birdsong. P32, who identified as an outdoorsy person, stated, “I find it very familiar. I heard this specific bird in spring at our summer house so I recognized it very well! It is really good and comforting.” P34, who mentioned that he was an indoors person, said, “The bird gave me a headache. It was way too loud and high-pitched for me.”
The proxemic distances had a strong impact on the participants’ perceptions of the bird sound, mainly associated with the social dimension of proxemics. Among the nine performances, most participants liked the
bird condition the most when the drone was at the
far location, while they disliked the same sound the most at the
near location—this matches the quantitative data on preference (see Figure
10). Participants reported that at the
far location, they perceived the
bird as pleasant, attractive, and comforting, but at the
near location, they felt it was annoying, stressful, dangerous, and uncomfortable. P04 emphasized “the distances obviously made the
bird sound very different.” P52 mentioned that the bird’s song had a different effect according to the three distances, “when it was
far, the
bird made me feel comfortable and it didn’t sound so sharp, I liked it the most. However, when the same sound was played at
middle switched from
far, I felt it became sharper and had an opposite effect which made me feel uncomfortable. I didn’t expect that...When it was
near, it made me even more uncomfortable.” Participants said it was weird to have a bird so close in a real situation. This finding was closely related to the theme of safety. Some participants mentioned feeling that they were being watched by the drone (see details in Section
5.3.2), and many felt they might even be attacked by the “bird”: P18: “feels like it is coming to attack me.” In their opinion, it is more common to experience a distant bird rather than a near one, and that might be the reason why they preferred the drone to play the birdsong far away, but not when it is
near.
Finally, the theme of personal experiences and preferences explained the presence of somewhat extreme opposing attitudes towards the same
bird condition, especially when the drone was at the
near location. P23, P34, and P50 seemed to detest and dread the
near-
bird condition. P34 stated: “The
bird gave me a headache, especially when it was so close to me, I just wanted it to go away! Oh my god! Please! I wanted to kill the bird, almost like ‘get me a rifle.”’ P50 said: “when it’s coming so close, I would definitely want to grab it, break the wings and hide it...” In contrast, P31, P32, and P45 were true bird lovers. P31 preferred the
near-
bird condition the most, commenting “I like birds a lot. I’ve seen this bird in real life. It’s a common bird in Europe. I know its name in Swedish, it’s ‘Talgoxe.’” Besides correctly identifying the birdsong as belonging to the talgoxe (great tit), P31 talked about the sound features of the great tit and the change from three syllables to two due to increasing urban noise pollution, which perfectly matches the literature we found (see Section
3.3.2). P32: “I want it (the drone with the birdsong) to come onto my hand,” and P45 stated: “I love it. It made me recall I gave food to wild birds and their babies—they came to the balcony singing.”
5.2 Themes Identified for the Rain Sounds
Regarding the theme of familiarity, participants’ interpretations of the added rain sound fell into two groups. The majority of participants belonged to the first group and claimed the sound was like rain. The second group, around one-third of the participants, claimed that it was not like rain and could be further split into two subgroups. The first subgroup, consisting of P01, P12, P20, P23, P27, P29, P42, P44, and P53, associated it with water other than rain, e.g., water leakage, a water splash, or a waterfall. The other subgroup mentioned that it felt artificial because it was mixed with the drone noise. Participants seemed to have more negative feelings if they associated the sound with something artificial or something wrong (e.g., water leaking). P38 stated “It irritates me. It sounds very artificial, mechanic, and annoying. It is not natural at all.” However, participants tended to feel more positively if they correctly associated the sound with rain. In the case of rain sounds, the theme of familiarity was also closely associated with the theme of personal experiences, as participants from South Asian cultures tended to identify the sound correctly as heavy rain.
The proxemic distances of the rain sounds impacted participants’ perceptions, but in sharp contrast to the bird sound, those perceptions were unaffected by the theme of the social dimension of proxemics. In the case of rain, the impact was related to the blending of the rain sound with the drone noise. Many participants reported that under the far-rain condition, they could not distinguish the added rain sound, as it blended into the drone noise, but they could distinguish it at the near or middle location. P01 stated, “I couldn’t distinguish the rain sound when at far, but at near it was ok.” P52 mentioned “when far, it didn’t sound like rain when it was together with the machine noise—it was not clear. But the closer, the clearer, the more it sounded like rain.” The social dimension of proxemics entirely disappeared in the rain setting, as no sounds from living or artificial social beings were involved.
In the case of rain sounds, the theme of personal experience was mainly associated with cultural factors. We noticed that the passionate rain lovers P45, P47, and P48, who all came from the monsoon region in South Asia, highly rated the rain condition. They were able to correctly identify the added sound as a heavy rain even though it was together with drone noise, as P47 said: “The rain sound was suppressed by the machine sound, but I still heard the sound of water, the rain was quite heavy.” And P47 continued: “Especially when at far, this rain sound felt special—it reminds me of my home where it has a lot of rain! My school usually reopened during the rainy season, this is exactly the same sound I used to hear when I was a kid sitting in a classroom with heavy rain outside. I could relate to it. It’s a nostalgic feeling.” P45: “This sound is like heavy rain, it reminds me of the rainy season in my home country. I love rain. I have a name that means ‘rain’ in my native language...I recorded the sounds of rain myself...I took shower in the rain—in my country, the raindrops are very big so you can take a shower with them.” P45, P48 associated the rain sound with other sensory experiences as well. P48: “I like rain in my country, it is warm rain [tactile] with the smell of soil [olfactory].” P45: “After raining, it became green [vision], fresh [olfactory], and cool [tactile].”
We did not identify any associations relating to the theme of safety in the case of the rain condition, indicating that the sound might be interpreted as a more neutral alternative.
5.3 Themes Independent of Sound Conditions
Several comments by the participants covered the identified themes but did not apply to any particular sound condition. Several participants commented on personal experiences and perceptions independent of particular sound conditions, such as the purpose of the drone, individual suggestions for alternative sounds, and the feeling of the wind. Furthermore, the drone was sometimes experienced as an invasion of privacy, associated with the social dimension of proxemics. The most pronounced theme, however, was safety. Several comments about safety ranged across the different sound conditions, indicating that this is a general concern when interacting with drones.
5.3.1 The Theme of Personal Experiences and Perceptions.
Regarding the theme of personal experiences and perceptions, we identified three sub-themes across all sound conditions: (1) the need to discuss the purposes of the drones, (2) individual preferences for alternative sounds, and (3) the experiences of the airflow generated by the drones.
Purposes of drones. Although it was not the intention of this study, some participants mentioned that the purpose of the drone plays a vital role in defining their experiences. The intended function of the domestic drone was neither specified nor discussed. During the interviews, participants P05, P08, P22, P31, P32, P37, and P50 mentioned their considerations or doubts about the intended functionalities of the domestic flying robot. P05 and P22 both asked, “What is it used for?” P08 and P37 both said the choice of sounds to add should depend on the use case. As P37 put it: “if it’s delivering me a drink at a party, it will be very different than if I’m reading a book.” P31 pointed out: “It is more annoying if you don’t know the purpose,” and further explained this with reference to her previous experience encountering a commodity drone: “My neighbor was flying a drone...I first felt annoyed, but later felt better when knowing it’s for advertising (the neighbor wanted to sell his house and used the drone to take photos of the property to showcase).” P50 also said he would have a better feeling and give higher ratings to the drone if he knew the drone was coming to help and accompany him. P32 suggested “this small drone could be a little helper for fun and companionship.” The first and second authors both recalled many other participants casually asking about the intended function/purpose/usage of the small flying robot after the interview during chit-chat (not audio-recorded).
Desired sound depends on personal taste. The participants suggested other sounds they thought might be suitable for adding to the domestic drone besides the bird and rain sounds we used. The most common choices were either music or some other type of natural sound. However, these common choices still varied. For instance, the choice of music ranged from classical Beethoven, country music, or festival music (e.g., Christmas music) to rock and roll, with the suggestions directly related to personal taste—as P44 said: “I like rock music, I want to add rock.” The choices of other natural sounds included ocean waves, a campfire, thunderstorms, and so on. In particular, P22 and P44 asked for customized sounds—P44: “Users should be able to choose which sound and which mode.” In addition, some participants’ choices were special or more personal. P09, a lover of tea culture, wanted to add sounds from the tea ceremony like tea cups being set on the table and the sound of pouring tea into cups. P10 suggested “broadcast, verbal sounds; those containing meaningful messages to bring values.” P30 associated the humming drone to a mosquito and wanted the sound of a croaking frog to prey on the insect. P25 had grown up in Gothenburg and wanted to add Gothenburg-related sounds—he suggested the sounds of strong wind or traffic in the city (Gothenburg is a coastal city that has the largest port in the Nordic countries, with busy traffic and strong winds).
The feeling of the wind. Participants commented on how they experienced the airflow generated by the drone. Nearly all participants agreed that the propellers generated airflow, which felt strongest and most obvious at the near location, weaker at the middle location, and was barely felt at the far location. The only exceptions were three participants who were fully covered with thick clothes and face masks and claimed they did not feel any airflow. However, how participants perceived the wind was strongly associated with their personal experiences and associations. Half of the participants felt the airflow was a cool and refreshing breeze that made them comfortable and relaxed and thus had a generally positive feeling towards it. P28 said, “It reminds me of a summer breeze.” P38 mentioned that the positive feeling from the airflow was even better with the rain sound, “It’s nice and soft, especially with the rain sound, reminds me of soft rain outside on warmer days, gives a cozy feeling.” P31 commented: “The airflow was nice. It made me feel a bit more connected to the drone.”
However, these impressions might depend on contextual factors. P42 mentioned that “it feels pleasant now when it is warm, but might be annoying when it’s cold.” One-fifth of participants were negative about the airflow, as they felt it was cold, uncomfortable, and dangerous, particularly at the near position—both P18 and P21 mentioned the wind amplified the presence of the robot and increased fear. P23 explained the negative feeling as arising because “the airflow was surprisingly strong for such a small robot, much more than I had expected. Very annoying.” However, P23 was more positive towards this surprise: “the airflow was surprisingly strong, and it triggered curiosity.” The rest of the participants felt neutral about the wind. P33, P48, and P54 mentioned that they noticed the airflow also had some visual impact. P33 “visually can see the blanket is moving,” and P54 “even saw the paper was shaking.”
5.3.2 Theme of the Social Dimension of Proxemics.
Regarding the social dimensions of proxemics, even though all participants were clearly informed in advance that there was no camera in the experimental environment or on the drone, P08, P30, P37, P38, P50, P55 still mentioned that they felt they were being observed by the drone, especially at the near location. This finding was strongly related to the theme of safety and was associated with experiencing the drone as an invasion of privacy. The feeling of being watched by the drone gave participants negative impressions. P30 stated: “I know it didn’t have a camera. But when it’s very close to me, very stable (hovering), it felt like it was staring at me intensively, I didn’t really feel very safe then.” P37 said: “It felt like I was being observed at near, just its posture, the way it looks, makes it feel like it’s watching me. I felt it was very invasive.” However, P38 commented that this feeling only arose with the bird condition: “with rain or none, the drone was not so much like a living thing, but with the bird sound, together with the ‘silver thing’ (electronics on the drone), it’s more like an animal—the ‘silver thing’ is like a face, I felt something was looking at me. I didn’t feel safe anymore, I felt in more danger—it was like a mechanical bird.”
5.3.3 Theme of Safety.
Regarding the theme of safety, important sub-themes include (1) the small size of the drone, (2) the addition of propeller guards, and (3) the unexpected finding that the electrical wire was perceived as a safety measure.
Small size made the drone feel safer. Many participants acknowledged that the flying robot was small, with some of them pointing out the small size as an advantage, especially in relation to the theme of safety. P03, P08, P18, P31, P32, and P49 mentioned the small size of the drone as a good size that made them feel safer. P31 and P32 thought it was cute at such a small size; as P31 said, “It’s small, it’s cute, it’s like a small animal.”
Adding propeller guards could increase safety. Some participants emphasized that additional propeller guards might be needed to raise feelings of safety. P03 and P49 commented that even though the size is already small, adding a protection frame to each propeller would feel safer. P49: “The size (small) is good, it makes me feel safe. It looks good while flying. However, adding safety protection parts (propeller guards) will be even safer for both human and robot.” Besides P03 and P49, P10, P13, and P48 also suggested adding propeller guards.
Electrical wire was interpreted as a safety precaution. The function of the one-meter-long electrical wires was to connect the Bluetooth board to the loudspeaker, transmitting power and signals. This setting aimed to reduce the drone’s takeoff weight by leaving the Bluetooth board and battery under the desk. It was a temporary solution for prototyping. However, unexpectedly, many participants thought these wires were a safety precaution to restrict the flying area in case the drone got out of control, and it made them feel safer during the experiment. However, some participants worried that the wires would hit the propellers during landing—we noticed that these participants usually had engineering and technical backgrounds.
6 Discussion
Based on the previous two sections, namely, Sections
4 and
5, we discuss here the quantitative and qualitative findings, respectively. It is noteworthy that our quantitative and qualitative analysis results are in line with each other.
6.1 Quantitative Analysis Findings
The statistical analysis of the quantitative data demonstrated a collective pattern among participants regarding their experience encountering a noisy domestic drone with the three sound conditions (bird, rain, none) at three proxemic distances (far, middle, near). The two measurements of the perceived sound characteristics, namely, loudness and sharpness, met our expectations: natural sound conditions were perceived as louder, and the bird in particular was recognized as sharper; the closer the distance, the louder and sharper the sound was perceived to be. We had hypothesized that it would be beneficial to add natural sounds in terms of people’s perceptions of the drone, and we did find support for this. The natural sounds we added, namely, the bird and the rain, both significantly increased the participants’ ratings for the measurements of pleasantness, relaxedness, and attractiveness, but had no effect on perceived safety. Meanwhile, the proxemic distances had significant effects on all measurements, which was in line with our hypothesis that the closer distance would have a more negative effect on the perceptions. Although we had not formulated an explicit hypothesis about the interaction between the two factors, the interaction effects between sound and distance were significant (p < .05) for loudness, pleasantness, and attractiveness. The participants significantly preferred the bird at the far location, but not at the middle or near location, implying that they started to dislike the bird as the drone came closer.
6.2 Qualitative Analysis Findings
The qualitative data enabled us to look further into each individual participant’s reasons for their feelings and thoughts about the experiment. The findings generalized into four themes: (1) familiarity with the sounds, (2) personal experiences and preferences with the sounds, (3) the social dimension of proxemics, and (4) safety associations. Regarding the
bird and
rain sound conditions, we found that participants’ sensations were similar with regard to the objective features of the sounds—e.g., no matter whether they liked or disliked the
bird, they shared the same view that the sound was relatively sharp; regardless of whether they considered the
rain to be natural or not, they recognized it as somewhat white noise with a water sound, which blended into the drone noise at the
far location but was more distinct at the
near location. However, this common understanding of objective features funneled into various subjective associations and interpretations—e.g., some participants claimed the
bird sounded not like a bird but like an alarm, while others experienced the opposite. This finding emphasized the role that personal experiences and preferences play in the interpretation of sound. Even participants who associated the sound with a bird’s song still ranged from interpreting it as a dangerous “bird” that might attack them (especially in the
near condition, associated with the social dimension of proxemics) to seeing it as a cute and friendly animal. The participants’ associations and interpretations of the sound affected their attitude toward and experience of the flying robot and were dependent on their previous personal experiences, perceptions, and cultural backgrounds. This empirical finding is supported by neurobiology: the neurons in the primary auditory cortex, namely, those dedicated to figuring out what sound information means, greatly outnumber those that transform sound into electrical neural signals, so what humans expect to hear plays a great role in what they actually hear [
41]. In other words, how humans perceive sounds is not purely dependent on actual sonifications, but also and more importantly on their previous experience.
We found further support for this interpretation in the findings for themes across all sound conditions. For instance, participants acknowledged the presence of the drone airflow and reached a consensus that the closer the drone, the more obvious the airflow felt. Some then associated the airflow with natural weather patterns, while others interpreted it as an amplification of the perceived presence of the robot. When associated with natural weather, it could be further interpreted as either a comfortable cool breeze or an uncomfortable cold wind. Similarly, the amplification of the robot’s presence could be interpreted as negative feelings of danger or positive feelings of being more connected to the drone. These personal interpretations shaped participants’ judgments of the airflow and, by extension, of the flying robot and the whole experiment. Across all sound conditions, we found support for the vital role that safety perceptions play in HRI, as numerous statements were categorized into the theme of safety.
The thematic analysis of the interview data provided additional information on the participants’ qualitative experience beyond the quantitative measurements, which allowed us to extract patterns within participants’ perceptions of a noisy flying robot with natural sounds. From the aforementioned themes, we found a generalized pattern within participants’ various qualitative experiences of their encounters with the flying robot with added natural sounds in the experiment. This pattern has been consolidated into a visual summary that sets multiple sequential steps in relation to one another (see Figure
12). This visual summary illustrates why an identical stimulus might lead to diverse and even opposite individual judgments. Our visual summary resonates with models from the existing literature in the fields of user experience [
28,
33] and neurobiology [
41]. Figure
12 shows a sketch of how sounds and other types of stimuli in our experiment are perceived by the participants through a process of sensation and perception, illustrating that perception is a phenomenon of sensemaking. A given stimulus may be first perceived as a physical sensation with certain physical characteristics [
54]. This step depends on the functioning of participants’ sensory apparatus and usually converges into some common understanding of the objective features of the stimulus (except among people with sensory impairments). This sensation may trigger participants to associate the stimulus with something that they are already familiar with. Then, the association develops into an emotional interpretation. Both of these steps, association and interpretation, highly depend on individuals’ previous experience [
59], which further determines personal preferences—participants diverge into viewing the same stimulus either favorably or less favorably.
8 Limitations and Future Work
Even though the birdsong had a positive influence on participants’ perception of the drone and the sound selected was from a common bird, most participants (regardless of whether they recognized it as a bird or not) reported that the sound was too sharp. Some participants suggested choosing another bird species with a lower pitch to achieve better results. Some also mentioned that it would be more pleasant to hear a group of birds singing in the treetops rather than a single bird singing right in front of the listener, as a single bird singing such a short distance away from a human does not seem very natural. We conjecture that, with this change, the positive effect of the bird condition may become more pronounced. The fidelity of the loudspeaker was also questioned. Some participants complained about the poor sound quality of our small, lightweight loudspeaker. A loudspeaker with higher fidelity could improve the positive effect of our added sounds. Another issue was the drone’s own noise when flying. The same drone sounded different after several takeoffs and landings due to the reduction of the joint gap between the propellers and motors. We tuned the gap to ensure that the noise of the drone would not change substantially during the course of the experiment. However, in actual applications, variations in the noise of the drone may affect people’s evaluations.
Regarding the studied locations, the near location in particular needs stricter control. Individual differences in height and body shape introduced additional errors in the distance between participants and the drone. This error was small relative to the distances from the participants’ seats to the middle and far locations, but it became non-negligible relative to the distance from the participant to the near location. Even though the experiments were held in a noise-controlled lab, some variables were not controllable, e.g., weather and temperature. Participants mentioned that they would be more affected by the rain condition on a day with heavy rain. The temperature would also influence the clothing worn and thus affect the perception of the wind generated by the drone. The design space of domestic drone airflow should be further explored.
Although the functionality of domestic flying robots was not within the scope of this study, many participants raised concerns about it. This suggests that the ambiguity of the intended functions might have caused some confusion for the participants and thus, to some degree, might have biased their judgments. Based on the variety of empirical data we gathered and a thorough consideration of research on close-range human-drone interaction, future work should explore and identify the potential functionalities and usage scenarios of small domestic flying robots. In daily practical usage, the trajectory of a flying robot will be more complex than what we showed in the experiment. How will a complicated flying trajectory that conveys gestural information and produces fluctuating noise influence people’s perceptions of domestic drones? Will the added natural sounds still work in this scenario? Does the added sound need to be updated in real time in response to the robot’s movements? A more comprehensive strategy for adding sound must still be explored.