1 Introduction
In the Sanskrit language, the word
avatar refers to incarnations of Hindu deities. In the modern context, it has come to be used for “an electronic image that represents […] a computer user” [
21]. Building on this concept, an
avatar robot is a robot that represents a person in a remote location: one that a remote operator (or teleoperator) can embody or “inhabit” to visit that location and interact with the people or objects there. The primary purpose of using an avatar robot is social interaction.
Telepresence robots, also known as Mobile Remote Presences, are similar to avatar robots in that they allow teleoperators to be “present” in remote locations and socially interact with the people there. While the literature on both tends to overlap, in the scope of this study, we distinguish between an avatar robot and a telepresence robot in terms of the anonymity afforded to the operator. In the case of a telepresence robot, a screen shows the operator. Examples of telepresence robots available in the market are Gobe [
73], Double 3 [
74], and Vgo [
29]. On the other hand, in the case of an avatar robot, it is not apparent who the operator is, which makes them anonymous. Another example of robot teleoperation is known as Wizard-of-Oz (WoZ) control, which is primarily used in research settings. In WoZ, the existence of the teleoperator is hidden from participants, thus providing the illusion of an autonomous robot to whoever is interacting with it. This control method enables researchers to quickly test hypotheses in human–robot interaction studies without having to program a fully autonomous robot. Since a WoZ-controlled robot does not provide an embodiment of the teleoperator, it is not considered an avatar robot in our study. In the scope of our study, we assume that most people sharing space with the robot know that it is controlled by a remote operator. Examples of avatar robots are Orihime-D [
92], Alter-Ego [
52], and Engkey [
102]. Under this definition, a telepresence robot with its screen turned off is also considered an avatar robot. See
Figure 1 for illustrations of telepresence and avatar robots.
There are several merits to robots that enable remote presence. By allowing a person to inhabit a space different from the one they are in, avatar robots and telepresence robots open up avenues of remote work for people who are unable to commute or to be physically present at work. Additionally, they enable teachers to provide remote education and healthcare professionals to give medical assistance to faraway patients. In the future, tourism through such robots may become possible for people who cannot afford to travel or are otherwise unable to do so. They may even enable one individual to be present in multiple locations at once [
43]. Some literature (e.g., [
43]) uses the term “beaming” to refer to being telepresent as an embodied robot, a word borrowed from the American television series Star Trek. Depictions of such robots are popular in contemporary media. An example is the future world depicted in the 2009 movie Surrogates [
61] and the 2006 graphic novel series it is based on [97], in which people operate humanoid avatar robots called Surrogates to go to work, vacation, socialize, or otherwise interact with society, all from the comfort of their own homes.
In a world with avatar robots, many circumstances may call for or lead to the anonymity of the user. Some operators, perhaps those with certain disabilities or conditions such as severe anxiety or Hikikomori (extreme social withdrawal and isolation), or those trying to avoid attention (such as celebrities), may not wish to be seen by others while they use an avatar robot to interact with or contribute to society [
47]. The use of anonymity will also extend to fields where the operator’s identity is not relevant to the service provided, such as the service industry. For example, convenience stores or hotel lobbies may choose a uniform look for the avatars used by their employees to provide a consistent experience to their customers. In the future, companies may begin to mass-produce and rent out avatars for temporary use, such as attending an event or touring a historic site. Modifying a robot design for every new user may not be feasible, and it may be practical and cost-effective for such companies to produce robots with a uniform or neutral look. Orihime-D, an avatar robot with a uniform look [
92] (
Figure 1(c)), exemplifies this approach, having been utilized by people with physical limitations to perform customer service tasks at a café in Japan [
16].
It must be noted that a focus on uniform appearances of avatar robots can lead operators to feel a loss of identity or individuality. Another potential consequence is the societal homogenization of communities that use avatar robots: individuals with disabilities, for instance, might become less visible in daily life, replaced by neutral-appearing avatars. The use of technology has a history of dehumanizing its users [69]. Despite these drawbacks of anonymity achieved through uniform appearances, anonymity nevertheless appears inevitable.
1.1 Low-Moral Actions by Operators of Avatar Robots
In the context of this study, we define low-moral actions as any violation of or nonadherence to moral norms. While norms, in general, can be defined as behaviors that are expected of, and followed by, a sufficient number of people through their actions or inactions, in the interest of their community [
57], moral norms are a subset of norms specifically concerned with the principles of right and wrong [
22]. Although details may depend on context and culture, moral norms can generally be distinguished from social norms, which are the usual practices or social conventions within a given society [
22,
57]. Examples of actions that can be considered low-moral actions are smoking in public, speaking loudly in a library, blocking someone’s path, stealing, hitting someone, and damaging property.
Given anonymity, a malicious operator, unafraid of being caught or facing further consequences, may feel free to act against norms and unwritten rules. An example of low-moral behavior encouraged by the safety of anonymity can be seen on popular social media Web sites, where malicious users may create an account with completely false information and, protected by anonymity, proceed to post offensive material, troll, harass, deceive, and scam other users. Similarly, malicious operators of avatar robots may hide behind the shield of anonymity an avatar provides and perform low-moral actions. Unlike on social media, however, a malicious operator of an avatar robot can cause physical as well as psychological harm.
What constitutes a low-moral action is context-dependent, and some actions may be perfectly reasonable in a different situation. Firstly, the relationship between the operator of the avatar robot and the person the robot is interacting with may change the morality of an act. For example, an avatar robot hiding and scaring someone is clearly a low-moral action if the person in question is a stranger, but if the person is a friend, the act becomes a prank, which is far less objectionable and even socially acceptable. In the scope of this study, for the sake of simplicity, we consider all interactions to be between strangers.
Secondly, the location where the interaction takes place also defines the morality of the act. For example, using an avatar robot to block someone from going inside is rude in a shopping mall, but might be considered acceptable when the location is a private area. In our study, we focus on the interactions that take place in a public area and consider both open public spaces (e.g., a shopping mall or large hall) and closed public spaces (e.g., inside a convenience store or art gallery).
Low-moral actions can be intentional or unintentional. In this study, we consider operators with malicious intent; thus, all low-moral actions considered here are exclusively intentional (not accidental).
1.2 Enumerating and Preventing Low-Moral Actions
It is imperative to study the low-moral actions possible through avatar robots so that they can be prevented. Presently, avatar robots are still not widespread; however, we believe it is a matter of time before they are ubiquitous given their utility. The rise of avatar robots will inevitably lead to a rise in malicious users who abuse them. To prevent low-moral actions we must (1) understand how they can manifest and (2) investigate what prevention techniques are suitable for each manifestation.
Systematic listing and enumeration of what can go wrong in a system is a popular approach to understanding and preventing accidents, hazards, and loss. This approach has been taken in robotics to identify unintentional dangers to humans when social and assistive robots share space with them [
79], to generate a preliminary list of themes of privacy concerns of people sharing space with telepresence robots [
46], and to make a safer robot by enumerating what can go wrong technically in interactions between a medical robot and a patient [
28,
31].
Avatar robots are unconstrained in their morphologies as they are not required to have any feature that shows the actual operator. While typically humanoid (e.g., [
4,
52,
92]), they can also be nonhumanoid (e.g., [
15]) or have a screen showing an animated character (e.g., [
102]). Depending on the morphology, the teleoperator may be able to control the arms, torso, neck, or other appendages. For the sake of simplicity, we did not consider any upper body movement and restricted this study to the low-moral actions possible through locomotor movement (the robot’s movement through space) and the visual and audio sensors of an avatar robot.
This article aims to identify and prevent low-moral actions that can be carried out through avatar robots. To achieve this goal, in the first half, we present a comprehensive list of hazards identified through a series of workshops conducted with participants to enumerate these actions. In the second half of the article, we speculate on and discuss suitable prevention mechanisms for each low-moral action based on a review of relevant literature, focusing primarily on technological solutions.
3 Hazard Identification Workshop
3.1 Approach
As discussed in the introduction, the prevention of low-moral actions possible through avatar robots cannot be done without first knowing what actions are possible, and thus our first goal was to generate a list of low-moral actions. Given the infancy of avatar robots, we could not base our approach on real-world observations or empirical data. Furthermore, to our knowledge, the literature we could draw from on such low-moral actions via avatar robots is scant.
The field of risk analysis offers a way to ideate, conceptualize, and think about problems for a technology that essentially does not exist yet. As our aim was to arrive at a list of low-moral actions, this study does not conduct a complete risk analysis, nor does it determine the likelihood or consequences of hazards. However, one technique used in the risk analysis process, the hazard identification workshop, suited our purposes well. These organized group discussions, coordinated by a facilitator and attended by key stakeholders (users, investors, engineers, etc.), serve to identify the hazards of a planned project. Since the low-moral actions performed through avatar robots are precisely the hazards of using such robots, a hazard identification workshop was ideal for this study.
Different techniques can be used in a workshop to arrive at a list of hazards. A commonly used method is brainstorming [
91,
101], where the goal is to generate as many ideas as possible in a short time. We used brainstorming because it is a proven technique for hazard identification, promotes out-of-the-box thinking, and is easy to design and implement.
In order to ensure that participants had experience with the technology, had formed their own thoughts regarding the topic, and shared a common understanding of definitions and terms, we added an “experience phase” to the standard brainstorming technique. Participants of brainstorming sessions are usually familiar with the discussion topic; however, most people currently have no experience with avatar robots and may even hold misconceptions from their exposure to contemporary media. Therefore, in our study, before they discussed amongst themselves, the participants were given a chance to operate an avatar robot. They performed low-moral actions and also experienced being on the receiving end of low-moral actions by a robot.
3.2 Participants
We recruited 12 participants, 6 male and 6 female, through a part-time job recruitment company operating in the Kansai region of Japan. Our aim in this study was to explore a broad spectrum of everyday low-moral actions experienced by ordinary people: the experience of carrying out or being on the receiving end of low-moral behaviors. We therefore decided that a representative group of the general population was ideal for this study, because no one person or discipline is an expert in these experiences. As such, we placed no specific conditions on participant recruitment, in order to reflect the general public who might perpetrate or experience low-moral actions when such robots become widespread. We could have chosen to limit recruitment to people with relevant expertise, such as experts in robotics or in malicious behaviors. However, to avoid a biased perspective, we did not recruit robotics experts, as we believe their specialized knowledge may have led them to exploit avatar robots purely from a technological standpoint. Likewise, we did not limit recruitment to experts in malicious behaviors, as their familiarity with criminal or immoral acts may have led them to focus on an extreme subset of low-moral actions.
The average age of the participants was 36.2 years old (SD: 15.4 years). The youngest and the oldest were 20 and 60 years old, respectively. We carried out three hazard identification workshops with two male and two female participants each. A single workshop lasted for approximately 2 hours. We compensated the participants for their time (1,100 Japanese Yen per hour).
3.3 Procedure
We divided the workshop into three phases: the Pre-Brainstorming, Experience, and Main Brainstorming phases. The Institutional Review Board at Kyoto University approved this study design.
3.3.1 Pre-Brainstorming Phase.
This phase served several purposes: to build rapport amongst the participants and with the facilitator, to define the problem for the participants, to give them some practice, to get them into the right state of mind for the hazard identification workshop, and to get general “first-thought” ideas out of the way so that unique and creative ideas could be generated in later phases.
This phase included introductions by the facilitator and participants and an explanation of the three-phase structure of the hazard identification workshop. We instructed the participants to imagine a future where avatar robots are commonplace and introduced them to the possibility of malicious and anonymous operators. They were given three examples of low-moral actions: blocking someone’s path, driving the robot down a flight of stairs, and separating a child from their family. We selected these disparate examples to avoid priming the participants into thinking of low-moral actions in only one way and to demonstrate the variety of low-moral actions that are possible. Participants were also asked to keep in mind the limitations of the avatar robot considered in this study: it has only locomotor movement, a camera, and a microphone for listening; it has no upper-body motion and no voice.
Participants were provided with writing material and sticky note pads for jotting down ideas. For practice, they were given 5 minutes to brainstorm individually. We included ideas generated in this phase in our final analysis.
3.3.2 Experience Phase.
In this phase, participants gained experience by (1) operating an avatar robot, (2) sharing space with an avatar robot, and (3) operating an avatar robot in a simulation. The participants were assigned one of these three roles randomly. After a 10-minute round, they rotated into the next role until they had experienced all three roles.
Table 1 shows the distribution of participants during the four rounds. We encouraged them to note any new ideas they thought of during this phase. Each of these roles is briefly discussed below.
Operating an Avatar Robot. A participant controlled a robot while pretending to be a malicious anonymous operator. The robot we used was Robovie II [
38] because of its humanoid appearance, teleoperability, and affordance of anonymity. The robot had a safety function that prevented it from hitting people or walls. For this phase, in keeping with the limitations on avatar robots in this study, Robovie’s upper body was disabled (no arm or neck movements), and its speakers were disabled as well. We placed it in an adjoining room with two other participants.
Figure 2 shows this avatar robot being controlled by a malicious operator performing a low-moral action.
If the operator felt limited in what they could do with the robot, we allowed them to go into the adjoining room and pretend to be the avatar themselves.
Figure 3 shows a participant pretending to be an avatar robot spinning around to annoy bystanders. From all three workshops, only two participants chose to do this.
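The safety function mentioned above (the robot refusing to drive into people or walls) can be pictured roughly as a guard that scales down the commanded speed near obstacles. The following Python sketch is only an illustration under assumed distance thresholds; it is not Robovie II’s actual safety layer.

```python
# Illustrative proximity guard in the spirit of the safety function described above;
# the distances and behavior are assumptions, not Robovie II's actual implementation.
STOP_DISTANCE = 0.5   # m: never drive closer than this to a person or wall
SLOW_DISTANCE = 1.2   # m: start slowing down inside this range

def guard_velocity(requested_speed, nearest_obstacle_distance):
    """Scale the operator's requested forward speed based on the closest obstacle."""
    if nearest_obstacle_distance <= STOP_DISTANCE:
        return 0.0
    if nearest_obstacle_distance < SLOW_DISTANCE:
        scale = (nearest_obstacle_distance - STOP_DISTANCE) / (SLOW_DISTANCE - STOP_DISTANCE)
        return requested_speed * scale
    return requested_speed
```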
Sharing Space with an Avatar Robot. Two participants shared space with a robot controlled by a malicious anonymous operator. Both were assigned an activity that simulated a situation where avatars would be used. We explained that the purpose of the activities was to help generate ideas, not to complete the activity itself. One activity simulated an art show, poster exhibition, or window-shopping experience: the participants walked around and searched for Wally in “Where’s Wally?” posters. “Where’s Wally?” is a puzzle game whose goal is to locate a character named Wally in a picture depicting hundreds of people doing a variety of things [
23]. Two of the walls of the room were covered with “Where’s Wally?” posters, and the participants had to walk around and examine the posters closely and in detail.
Figure 2 depicts participants engaged in this activity.
The second activity simulated a speaker session. Participants stood behind a line and paid attention to a TEDx Talk playing on a screen. TEDx Talks are videos from expert speakers on technology, education, creativity, and so on [
89]. The video played was in the local language (Japanese).
Figure 3 shows participants engaged in this activity while a third participant pretends to be the avatar.
Operating an Avatar Robot in Simulation. A participant controlled an avatar robot in a simulator while pretending to be a malicious anonymous operator. We added this role in addition to operating an actual robot to increase the variety of scenarios that the participants experienced as malicious operators. We were limited by the space in the experiment area and by the number of situations we could have the participants act out with the real robot.
We used the MORSE simulator [
75] to recreate five real-world areas inhabited by simulated people. The simulated environments were modeled to scale using Blender [
10]. The environments were a large outdoor public area, a large hall in a shopping mall, a long corridor in a shopping mall, a big clothing store, and a small convenience store. Each environment had a typical crowd density (ranging from 4 pedestrians in the convenience store to 40 in the large outdoor public area). The movements of the 3D animated pedestrians and groups were based on a model of human movement derived from observational data [
82]. The crowds consisted of individuals (male or female) and groups (children and adults). We used almost a hundred distinct 3D models for these individuals, with varying appearances, accessories, and clothing. The animated models walked, stopped, and showed interest in things in a human-like manner. They spawned and traveled on random but realistic paths through the environments, including stopping locations, before finally despawning at an exit point. Additionally, the path of an individual pedestrian or a group could be altered randomly so that they approached the robot and interacted with it by standing next to it and looking at it. This interest expired after some time, and the pedestrians then continued toward their goals.
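To make the pedestrian behavior concrete, the sketch below outlines a per-pedestrian state machine consistent with the description above (spawn, walk a waypoint path with stops, occasionally divert to look at the robot, then despawn at an exit). The class, parameters, and thresholds are illustrative assumptions, not the code actually used with MORSE.

```python
import math
import random

def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

class SimulatedPedestrian:
    """Illustrative per-pedestrian state machine (an assumption, not the code used with MORSE)."""

    def __init__(self, path, exit_point, interest_rate=0.02, interest_time=8.0):
        self.path = list(path)               # waypoints; some may be designated stop locations
        self.exit_point = exit_point
        self.interest_rate = interest_rate   # chance per second of diverting toward the robot
        self.interest_time = interest_time   # seconds before interest in the robot expires
        self.state = "WALKING"
        self.timer = 0.0

    def update(self, dt, own_pos, robot_pos):
        """Return a high-level command for the animation/navigation layer."""
        if self.state == "WALKING":
            if random.random() < self.interest_rate * dt:
                self.state, self.timer = "INTERESTED", self.interest_time
                return ("approach_and_look", robot_pos)
            if self.path and dist(own_pos, self.path[0]) < 0.3:
                self.path.pop(0)              # waypoint reached, head to the next one
            if not self.path:
                self.state = "LEAVING"
            return ("walk_to", self.path[0] if self.path else self.exit_point)

        if self.state == "INTERESTED":
            self.timer -= dt
            if self.timer <= 0.0:             # interest expires; resume the original route
                self.state = "WALKING"
            return ("stand_and_look", robot_pos)

        return ("walk_to", self.exit_point)   # LEAVING: despawn when the exit is reached
```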
The participants pretending to be malicious operators were allowed to choose whichever environment they felt would best help their idea generation. They could switch to a different environment whenever they wanted. They saw a first-person view through the robot’s camera. The robot model was an omnidirectional drive type, and its controls were the same as those of the real avatar robot. Low-moral actions being performed in the simulation can be seen in
Figure 4.
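As a side note on the control scheme, the sketch below shows one common way a normalized operator input (for example, from a gamepad) could be mapped to a holonomic velocity command for an omnidirectional base. The axis conventions, limits, and names are assumptions for illustration; they are not taken from the interface used in the study.

```python
from dataclasses import dataclass

@dataclass
class VelocityCommand:
    vx: float     # forward/backward, m/s
    vy: float     # left/right strafe, m/s
    omega: float  # rotation, rad/s

# Illustrative limits; the actual robot's limits are not reported here.
MAX_LINEAR = 0.8   # m/s
MAX_ANGULAR = 1.5  # rad/s

def input_to_command(stick_x, stick_y, stick_rot):
    """Map normalized operator input (-1..1 per axis) to a holonomic velocity command."""
    clamp = lambda v: max(-1.0, min(1.0, v))
    return VelocityCommand(
        vx=clamp(stick_y) * MAX_LINEAR,
        vy=clamp(stick_x) * MAX_LINEAR,
        omega=clamp(stick_rot) * MAX_ANGULAR,
    )
```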
3.3.3 Main Brainstorming Phase.
Here, participants discussed and generated new ideas, and built upon each other’s ideas. They stood facing a wall (
Figure 5) and were given a few minutes to note down any ideas from the previous phase that they had not yet recorded. Each idea was written on a separate sticky note. They then pasted all of their ideas on the wall to display them to everyone. To begin the main brainstorming phase, the facilitator explained the rules, which encouraged quantity over quality, deferring judgement, and drawing an idea if it could not be written down. The facilitator asked one of the participants to tell the group their wildest idea (an ice-breaking conversational tool used in brainstorming [
91]) and others were asked to build upon it by saying “Yes, and …” (another conversational tool used in brainstorming) and adding to it. New ideas were encouraged at any point. The facilitator made sure that the rules were being followed, kept any one topic from running on for too long, and kept the conversation from drying up. This phase went on for 30 minutes or until no more ideas were forthcoming.
3.4 Results
3.4.1 Observations from the Workshops.
In the experience phase, participants were especially interested in controlling the actual avatar. Using it, they spun around, approached other participants as closely as the robot allowed, “climbed the stage” by crossing the demarcation line during the speaker session, blocked other participants’ paths and views, tried to scare them by speeding up, and pushed other participants out of the way. In the simulation, they followed people, blocked paths, got as close as possible, bumped into people, interrupted their conversations, and blocked them from accessing things on shelves.
During the main brainstorming, participants referred to each other’s driving of the avatar and to the situations in which they felt unsafe or annoyed sharing space with the robot. The avatar was present in the room, and they sometimes pointed to it to express their thoughts. Most discussions, and the ideas that subsequently emerged, overlapped between the sessions, with only some being unique to a single session.
We collected a total of 320 sticky notes—the first session generated 90, while the second and third generated 115 each. The second session was the most diverse in terms of discussion topics. It was also the only session that ended early because the participants felt they could not generate any more ideas.
3.4.2 Consolidating Ideas through Affinity Diagram.
We used an affinity diagram to consolidate the ideas and produce a list of low-moral actions. An affinity diagram is a common technique used to organize ideas after brainstorming [
91]. Out of the 320 sticky notes collected, 26 were discarded for containing ideas that were too vague, incomplete, or incomprehensible. We merged the remaining ideas from the three sessions and organized them into natural categories based on their similarities. The categorization decisions were made jointly by all authors. The data yielded four distinct categories of low-moral actions, each containing subcategories based on how a low-moral action could manifest. Many ideas were classified under more than one subcategory, but we limited each idea to a maximum of three subcategories (see exact numbers in Appendix A).
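As a minimal illustration of the bookkeeping behind these counts, a sketch like the following could tally how many ideas fall under each subcategory while enforcing the three-subcategory cap; the example ideas and subcategory names are placeholders, not the actual workshop data reported in Appendix A.

```python
from collections import Counter

# Illustrative tallying of affinity-diagram results; the ideas and subcategory
# names below are placeholders, not the actual workshop notes.
MAX_SUBCATEGORIES_PER_IDEA = 3

def tally(ideas):
    """ideas: list of (idea_text, [subcategory, ...]) tuples."""
    counts = Counter()
    for _text, subcats in ideas:
        for subcat in subcats[:MAX_SUBCATEGORIES_PER_IDEA]:  # cap each idea at three subcategories
            counts[subcat] += 1
    return counts

example = [
    ("robot blocks the store entrance", ["hypothetical subcategory A", "hypothetical subcategory B"]),
    ("robot films people without consent", ["hypothetical subcategory C"]),
]
print(tally(example))
```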
6 Discussion
6.1 Main Findings
We believe that avatar robots have the potential to become ubiquitous, and as they become more widespread, there is a growing potential for their misuse by malicious operators. Indeed, our hazard identification workshops showed that it is generally easy for people to think of low-moral actions that can be done through an avatar robot. The range of possible malicious behaviors, however, appears to be limited, and the same ideas tended to come up across multiple sessions. Our analysis revealed that these actions can be divided into a few basic categories based on their manifestations and their effect on the people sharing space with the robot. Perhaps unsurprisingly, all of the low-moral actions identified were ones that people can also do; none of the identified behaviors was unique to robots. This is because an avatar robot is the representation of a person in a remote location. Some actions, such as using the sound of the robot’s motors and its movement to disturb a quiet environment (
Section 4.3.5), while not exactly reproducible by humans (our joints do not generally make an audible sound when we move), have close human equivalents, such as making loud footsteps when walking.
On the other hand, the potential prevention strategies tend to be unique to each low-moral action. This makes it difficult to devise a general solution that addresses all potential malicious actions. However, the strategies do appear to follow one of a few basic approaches: limiting the operators (either their perception or their control), continually monitoring the avatars, or simply making bystanders more aware of the risks.
To allow for safe and socially acceptable use, designers and programmers of avatar robots may wish to prevent their misuse. The low-moral actions we have identified can serve as a starting point or as general guidelines for such endeavors. Depending on the avatar robot and the use case, designers may prioritize which category or subcategory to address, as it may be technically difficult to address all of the low-moral actions in a single system at the same time.
Our analysis revealed some challenges and issues that can arise when implementing these strategies. In the following sections, we discuss these problems in detail.
6.2 Restricting Agency of Operator
This study assumed malicious operators and intentional low-moral actions. However, we acknowledge that the vast majority of users will not be malicious. Withholding information about the environment from the operator, such as preventing them from listening to a conversation until the robot joins the group, to prevent eavesdropping (Section 5.1.3); restraining the operator from waiting at or entering certain areas, to prevent inhibition of access or movement (Sections 5.1.5 and 5.2.2); limiting visibility, to prevent stealing of information or photography without consent (Sections 5.1.1 and 5.1.2); and restricting control of robot movement, to prevent unnatural movement (Section 5.3.1) all detract from the agency of the teleoperator. Nonmalicious operators may feel limited in what they can do with an avatar robot and come away with a bad experience if strict measures are in place. They may also resent the presumption of malicious intent. Furthermore, observing every move the operator makes and limiting what they can and cannot do can cause operators anxiety, a sense of being surveilled, and a feeling of loss of control, especially if they belong to a vulnerable population such as senior citizens. We believe some deliberation is therefore warranted regarding whether a prevention strategy can judge if the operator is indeed malicious.
Note also that the actions examined in this study were considered as always being low-moral. However, in a specific context, some actions may not necessarily be low-moral. For example, blocking someone’s path could be a way to alert them to an emergency or a way to stop a criminal from leaving a building. Such actions would have a moral justification or a valid reason, so preventing them may not be desirable.
A possible solution to both of these issues could be to implement a form of shared control or sliding autonomy between the operator and the avatar robot, in which the operator initially has full or partial control over the robot’s actions, and restrictions on their behavior are introduced only if and as the operator attempts low-moral actions. This would do away with the presumption of guilt; restrictions on autonomy and agency would come into play only after low-moral actions have been attempted.
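A minimal sketch of what such sliding autonomy could look like follows; the restriction levels, thresholds, and the existence of a low-moral-action detector are all assumptions made for illustration rather than a tested design.

```python
# Illustrative sliding-autonomy wrapper: the operator starts with full control, and
# restrictions are tightened only after low-moral actions are detected. The levels,
# thresholds, and detector interface are assumptions, not a validated design.
RESTRICTION_LEVELS = [
    {"max_speed": 1.0, "audio": True,  "restricted_zones_blocked": False},  # full control
    {"max_speed": 0.5, "audio": True,  "restricted_zones_blocked": True},   # after first violation
    {"max_speed": 0.2, "audio": False, "restricted_zones_blocked": True},   # repeated violations
]

class SlidingAutonomy:
    def __init__(self, violation_threshold=1.0, decay=0.01):
        self.score = 0.0                          # accumulated evidence of low-moral behavior
        self.level = 0                            # index into RESTRICTION_LEVELS
        self.violation_threshold = violation_threshold
        self.decay = decay                        # evidence slowly decays for well-behaved operators

    def report_violation(self, severity):
        """Called by a (hypothetical) low-moral-action detector."""
        self.score += severity
        if self.score >= self.violation_threshold and self.level < len(RESTRICTION_LEVELS) - 1:
            self.level += 1
            self.score = 0.0

    def step(self, dt):
        self.score = max(0.0, self.score - self.decay * dt)

    def filter_command(self, requested_speed):
        """Clamp the operator's command according to the current restriction level."""
        limits = RESTRICTION_LEVELS[self.level]
        return min(requested_speed, limits["max_speed"]), limits
```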
Nonetheless, an avatar robot can be considered a representation of a person in a remote space, so placing restrictions on its senses and movements could be likened to placing restrictions on someone’s autonomy and agency. This may make any preemptive prevention mechanisms ethically or morally questionable, especially when such restrictions are taken too far. For example, allowing people to only do a very limited set of actions and expressions could degrade their autonomy and individuality to the point that it would be dehumanizing [
69]. There is already evidence that people who use avatars (with limited functionality) to be present in a remote venue tend to be treated more like robots than physically present people, a phenomenon termed “robomorphism” [
83]. This suggests there is a real risk that excessive limitation of operators’ autonomy might truly lead to their dehumanization. The fact that the avatar is a robot is probably not going to mitigate the negative consequences of dehumanization; indeed, Keijsers and Bartneck [
41] have shown that dehumanization occurs also in human–robot interactions and that it is linked to aggressive behavior toward the robot. Because of that, we believe that any implementation of restrictive prevention mechanisms requires careful consideration and must take the ethics and consequences of such measures into account.
6.3 Low-Moral Actions toward the Avatar Robot
Abuse toward robots is a documented phenomenon. While this bullying has mostly been documented as being perpetrated by children toward autonomous robots [
12,
67,
78] or telepresence robots [
66], there have been reported cases of robots in the wild being attacked by adults [
7,
98]. The possibility that avatar robots may be abused or bullied in a similar manner is not far-fetched.
Those who implement detection and prevention mechanisms must consider that the people sharing space with the robot may also direct low-moral actions at the avatar robot. Such actions toward the robot may appear to a low-moral action prevention system as if the robot were performing the low-moral actions itself: a person inhibiting the path of the robot may look much the same as the robot blocking the path of the person. Any detection and prevention solution must therefore be able to distinguish between the operator’s malicious behavior and that of the people around the robot. A solution must not, for example, punish a teleoperator for inhibiting a bystander’s path when in fact it was the bystander who suddenly stepped in front of the robot.
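As a rough illustration of this attribution problem, the sketch below infers who caused a blocking event from the recent motion of the robot and the bystander; the heuristic and its thresholds are assumptions, and a deployed system would need far richer context.

```python
import math

# Illustrative heuristic for attributing a blocking event; the threshold is an assumption.
def attribute_blocking(robot_velocity, person_velocity, robot_to_person_vector,
                       moving_threshold=0.1):
    """Return 'robot', 'person', or 'unclear' for who moved into whose path."""
    def moving_toward(velocity, direction):
        # Component of velocity (m/s) along the unit direction toward the other agent.
        norm = math.hypot(direction[0], direction[1]) or 1.0
        return (velocity[0] * direction[0] + velocity[1] * direction[1]) / norm > moving_threshold

    robot_closing = moving_toward(robot_velocity, robot_to_person_vector)
    person_closing = moving_toward(person_velocity,
                                   (-robot_to_person_vector[0], -robot_to_person_vector[1]))

    if robot_closing and not person_closing:
        return "robot"    # robot drove into a (near-)stationary person's path
    if person_closing and not robot_closing:
        return "person"   # bystander stepped in front of a (near-)stationary robot
    return "unclear"      # both moving, or neither: do not penalize the operator
```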
6.4 Limitations
This study was constrained to avatar robots with locomotor movement, microphones, and cameras; sound from speakers, hand and arm motions, and gestures were not considered. Moreover, we considered only a single avatar robot, Robovie II. Repeating the hazard identification workshop with a robot with a different form factor may reveal new examples of low-moral actions. Additionally, the participants in the workshops were homogeneous: they were Japanese, spoke Japanese, and none were people with special needs. Moreover, we carried out only three workshops. While we believe we identified a large portion of the space of low-moral actions possible through locomotor movement, it is possible that we did not identify all of them. Finally, our discussion of prevention mechanisms centered primarily on technological solutions; solutions beyond the technological, other than transparency and accountability, were not extensively explored in this study.