1 Introduction
In the Sanskrit language, the word
avatar refers to incarnations of Hindu deities. In the modern context, it has come to be used for “an electronic image that represents […] a computer user” [
21]. Building on this concept, an
avatar robot is a robot that represents a person in a remote location: one that a remote operator (or teleoperator) can embody or “inhabit” to visit that location and interact with the people or objects there. The primary purpose of using an avatar robot is social interaction.
Telepresence robots, also known as Mobile Remote Presences, are similar to avatar robots in that they allow teleoperators to be “present” in remote locations and socially interact with the people there. While the literature on both tends to overlap, in the scope of this study, we distinguish between an avatar robot and a telepresence robot in terms of the anonymity afforded to the operator. In the case of a telepresence robot, a screen shows the operator. Examples of telepresence robots available in the market are Gobe [
73], Double 3 [
74], and Vgo [
29]. On the other hand, in the case of an avatar robot, it is not apparent who the operator is, which makes them anonymous. Another example of robot teleoperation is known as Wizard-of-Oz (WoZ) control, which is primarily used in research settings. In WoZ, the existence of the teleoperator is hidden from participants, thus providing the illusion of an autonomous robot to whoever is interacting with it. This control method enables researchers to quickly test hypotheses in human–robot interaction studies without having to program a fully autonomous robot. Since a WoZ-controlled robot does not provide an embodiment of the teleoperator, it is not considered an avatar robot in our study. In the scope of our study, we assume that most people sharing space with the robot know that it is controlled by a remote operator. Examples of avatar robots are Orihime-D [
92], Alter-Ego [
52], and Engkey [
102]. Under this definition, a telepresence robot with its screen turned off is also considered an avatar robot. See
Figure 1 for illustrations of telepresence and avatar robots.
There are several merits to robots that enable remote presence. By allowing a person to inhabit a space different from the one they are in, avatar robots and telepresence robots open up avenues of remote work for people who are unable to commute or to be physically present at work. Additionally, they enable teachers to provide remote education and healthcare professionals to give medical assistance to faraway patients. In the future, tourism through such robots may become possible for people who cannot afford to travel or are otherwise unable to do so. They may even enable one individual to be present in multiple locations at once [
43]. Some literature (e.g., [
43]) uses the term “beaming” to refer to being telepresent as an embodied robot, a word borrowed from the American television series Star Trek. Depictions of such robots are popular in contemporary media. An example is the future world depicted in the 2009 movie Surrogates [
61] and the 2006 graphic novel series it is based on [97], in which people operate humanoid avatar robots called Surrogates to go to work, vacation, socialize, or otherwise interact with society, all from the comfort of their own homes.
In a world with avatar robots, many circumstances may call for or lead to the anonymity of the user. Some operators, perhaps those with certain disabilities or conditions such as severe anxiety or Hikikomori (extreme social withdrawal and isolation), or those trying to avoid attention (such as celebrities), may not wish to be seen by others while they use an avatar robot to interact with or contribute to society [
47]. The use of anonymity will also extend to fields where the operator’s identity is not relevant to the service provided, such as the service industry. For example, convenience stores or hotel lobbies may choose a uniform look for the avatars used by their employees to provide a consistent experience to their customers. In the future, companies may begin to mass-produce and rent out avatars for temporary use, such as attending an event or touring a historic site. Modifying a robot design for every new user may not be feasible, and it may be practical and cost-effective for such companies to produce robots with a uniform or neutral look. Orihime-D, an avatar robot with a uniform look [
92] (
Figure 1(c)), exemplifies this approach, having been utilized by people with physical limitations to perform customer service tasks at a café in Japan [
16].
It must be noted that a focus on uniform appearances of avatar robots can lead operators to feel a loss of identity or individuality. Another potential consequence is the societal homogenization of communities that use avatar robots: individuals with disabilities, for instance, might become less visible in daily life, replaced by neutral-appearing avatars. The use of technology has a history of dehumanizing its users [69]. Despite these drawbacks of anonymity achieved through uniform appearances, anonymity nevertheless appears inevitable.
1.1 Low-Moral Actions by Operators of Avatar Robots
In the context of this study, we define low-moral actions as any violation of or nonadherence to moral norms. While norms, in general, can be defined as behaviors that are expected of, and followed by, a sufficient number of people through their actions or inactions, in the interest of their community [
57], moral norms are a subset of norms specifically concerned with the principles of right and wrong [
22]. Although details may depend on context and culture, moral norms can generally be distinguished from social norms, which are the usual practices or social conventions within a given society [
22,
57]. Examples of actions that can be considered low-moral actions are smoking in public, speaking loudly in a library, blocking someone’s path, stealing, hitting someone, and damaging property.
Given anonymity, a malicious operator, unafraid of being caught or facing further consequences, may feel free to act against norms and unwritten rules. An example of low-moral behavior encouraged by the safety of anonymity can be seen on popular social media Web sites, where malicious users may create an account with completely false information and, protected by anonymity, proceed to post offensive material, troll, harass, deceive, and scam other users. Similarly, malicious operators of avatar robots may hide behind the shield of anonymity an avatar provides and perform low-moral actions. Unlike on social media, however, a malicious operator of an avatar robot can cause physical as well as psychological harm.
What constitutes a low-moral action is context-dependent, and some actions may be perfectly reasonable in a different situation. Firstly, the relationship between the operator of the avatar robot and the person the robot is interacting with may change the morality of an act. For example, an avatar robot hiding and scaring someone is clearly a low-moral action if the person in question is a stranger, but if the person is a friend, the act becomes a prank, which is far less objectionable and even socially acceptable. In the scope of this study, for the sake of simplicity, we consider all interactions to be between strangers.
Secondly, the location where the interaction takes place also defines the morality of the act. For example, using an avatar robot to block someone from going inside is rude in a shopping mall, but might be considered acceptable when the location is a private area. In our study, we focus on the interactions that take place in a public area and consider both open public spaces (e.g., a shopping mall or large hall) and closed public spaces (e.g., inside a convenience store or art gallery).
Low-moral actions can be intentional or unintentional. In this study, we consider operators with malicious intent; thus, all low-moral actions considered here are exclusively intentional (not accidental).
1.2 Enumerating and Preventing Low-Moral Actions
It is imperative to study the low-moral actions possible through avatar robots so that they can be prevented. Presently, avatar robots are still not widespread; however, we believe it is a matter of time before they are ubiquitous given their utility. The rise of avatar robots will inevitably lead to a rise in malicious users who abuse them. To prevent low-moral actions we must (1) understand how they can manifest and (2) investigate what prevention techniques are suitable for each manifestation.
Systematic listing and enumeration of what can go wrong in a system is a popular approach to understanding and preventing accidents, hazards, and loss. This approach has been taken in robotics to identify unintentional dangers to humans when social and assistive robots share space with them [
79], to generate a preliminary list of themes of privacy concerns of people sharing space with telepresence robots [
46], and to make a safer robot by enumerating what can go wrong technically in interactions between a medical robot and a patient [
28,
31].
Avatar robots are unconstrained in their morphologies as they are not required to have any feature that shows the actual operator. While typically humanoid (e.g., [
4,
52,
92]), they can also be nonhumanoid (e.g., [
15]) or have a screen showing an animated character (e.g., [
102]). Depending on the morphology, the teleoperator may be able to control the arms, torso, neck, or other appendages. For the sake of simplicity, we did not consider any upper body movement and restricted this study to the low-moral actions possible through locomotor movement (the robot’s movement through space) and the visual and audio sensors of an avatar robot.
This article aims to identify and prevent low-moral actions that can be carried out through avatar robots. To achieve this goal, in the first half, we present a comprehensive list of hazards identified through a series of workshops conducted with participants to enumerate these actions. In the second half of the article, we speculate on and discuss suitable prevention mechanisms for each low-moral action based on a review of relevant literature, focusing primarily on technological solutions.
3 Hazard Identification Workshop
3.1 Approach
As discussed in the introduction, the prevention of low-moral actions possible through avatar robots cannot be done without first knowing what actions are possible, and thus our first goal was to generate a list of low-moral actions. Given the infancy of avatar robots, we could not base our approach on real-world observations or empirical data. Furthermore, to our knowledge, the literature we could draw from on such low-moral actions via avatar robots is scant.
The field of risk analysis offers a way to ideate, conceptualize, and think about problems for a technology that essentially does not exist yet. As our aim was to arrive at a list of low-moral actions, this study does not conduct a complete risk analysis, nor does it determine the likelihood or consequences of hazards. However, one technique used in the risk analysis process, the hazard identification workshop, suited our purposes well. These organized group discussions, coordinated by a facilitator and attended by key stakeholders (users, investors, engineers, etc.), serve to identify the hazards of a planned project. Since the low-moral actions performed through avatar robots are precisely the hazards of using such robots, a hazard identification workshop was ideal for this study.
Different techniques can be used in a workshop to arrive at a list of hazards. A commonly used method is brainstorming [
91,
101], where the goal is to generate as many ideas as possible in a short time. We used brainstorming because it is a proven technique for hazard identification, promotes out-of-the-box thinking, and is easy to design and implement.
In order to ensure that participants had experience with the technology, had formed their own thoughts regarding the topic, and shared a common understanding of definitions and terms, we added an “experience phase” to the standard brainstorming technique. Participants of brainstorming sessions are usually familiar with the discussion topic; however, most people currently have no experience with avatar robots and may even hold misconceptions from their exposure to contemporary media. Therefore, in our study, before they discussed amongst themselves, the participants were given a chance to operate an avatar robot. They performed low-moral actions and also experienced being on the receiving end of low-moral actions by a robot.
3.2 Participants
We recruited 12 participants, 6 male and 6 female, through a part-time job recruitment company operating in the Kansai region of Japan. Our aim in this study was to explore a broad spectrum of everyday low-moral actions experienced by ordinary people: the experience of carrying out or being on the receiving end of low-moral behaviors. We therefore decided that a representative group of the general population was ideal for this study, because no one person or discipline is an expert in these experiences. As such, we placed no specific conditions on participant recruitment, in order to reflect the general public who might perpetrate or experience low-moral actions when such robots become widespread. We could have chosen to limit recruitment to people with relevant expertise, such as experts in robotics or in malicious behaviors. However, to avoid a biased perspective, we did not recruit robotics experts, as we believe their specialized knowledge may have led them to exploit avatar robots purely from a technological standpoint. Likewise, we did not limit recruitment to experts in malicious behaviors, as their familiarity with criminal or immoral acts may have led them to focus on an extreme subset of low-moral actions.
The average age of the participants was 36.2 years old (SD: 15.4 years). The youngest and the oldest were 20 and 60 years old, respectively. We carried out three hazard identification workshops with two male and two female participants each. A single workshop lasted for approximately 2 hours. We compensated the participants for their time (1,100 Japanese Yen per hour).
3.3 Procedure
We divided the workshop into three phases: the Pre-Brainstorming, Experience, and Main Brainstorming phases. The Institutional Review Board at Kyoto University approved this study design.
3.3.1 Pre-Brainstorming Phase.
This phase served several purposes: to build rapport amongst the participants and with the facilitator, to define the problem for the participants, to give them some practice, to get them into the right state of mind for the hazard identification workshop, and to get general “first-thought” ideas out of the way so that unique and creative ideas could be generated in later phases.
This phase included introductions by the facilitator and participants and an explanation of the three-phase structure of the hazard identification workshop. We instructed the participants to imagine a future where avatar robots are commonplace and introduced them to the possibility of malicious and anonymous operators. They were given three examples of low-moral actions: blocking someone’s path, driving the robot down a flight of stairs, and separating a child from their family. We selected these disparate examples to avoid priming the participants into thinking of low-moral actions in only one way and to demonstrate the variety of low-moral actions that are possible. Participants were also asked to keep in mind the limitations of the avatar robot considered in this study: it has only locomotor movement, a camera, and a microphone for listening; it has no upper-body motion and no voice.
Participants were provided with writing material and sticky note pads for jotting down ideas. For practice, they were given 5 minutes to brainstorm individually. We included ideas generated in this phase in our final analysis.
3.3.2 Experience Phase.
In this phase, participants gained experience by (1) operating an avatar robot, (2) sharing space with an avatar robot, and (3) operating an avatar robot in a simulation. The participants were assigned one of these three roles randomly. After a 10-minute round, they rotated into the next role until they had experienced all three roles.
Table 1 shows the distribution of participants during the four rounds. We encouraged them to note any new ideas they thought of during this phase. Each of these roles is briefly discussed below.
Operating an Avatar Robot. A participant controlled a robot while pretending to be a malicious anonymous operator. The robot we used was Robovie II [
38] because of its humanoid appearance, teleoperability, and affordance of anonymity. The robot had a safety function that prevented it from hitting people or walls. For this phase, in keeping with the limitations on avatar robots in this study, Robovie’s upper body was disabled (no arm or neck movements), and its speakers were disabled as well. We placed it in an adjoining room with two other participants.
Figure 2 shows this avatar robot being controlled by a malicious operator performing a low-moral action.
If the operator felt limited in what they could do with the robot, we allowed them to go into the adjoining room and pretend to be the avatar themselves.
Figure 3 shows a participant pretending to be an avatar robot spinning around to annoy bystanders. From all three workshops, only two participants chose to do this.
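The safety function mentioned above (the robot refusing to drive into people or walls) can be pictured roughly as a guard that scales down the commanded speed near obstacles. The following Python sketch is only an illustration under assumed distance thresholds; it is not Robovie II’s actual safety layer.

```python
# Illustrative proximity guard in the spirit of the safety function described above;
# the distances and behavior are assumptions, not Robovie II's actual implementation.
STOP_DISTANCE = 0.5   # m: never drive closer than this to a person or wall
SLOW_DISTANCE = 1.2   # m: start slowing down inside this range

def guard_velocity(requested_speed, nearest_obstacle_distance):
    """Scale the operator's requested forward speed based on the closest obstacle."""
    if nearest_obstacle_distance <= STOP_DISTANCE:
        return 0.0
    if nearest_obstacle_distance < SLOW_DISTANCE:
        scale = (nearest_obstacle_distance - STOP_DISTANCE) / (SLOW_DISTANCE - STOP_DISTANCE)
        return requested_speed * scale
    return requested_speed
```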
Sharing Space with an Avatar Robot. Two participants shared space with a robot controlled by a malicious anonymous operator. Both were assigned an activity that simulated a situation where avatars would be used. We explained that the purpose of the activities was to help generate ideas, not to complete the activity itself. One activity simulated an art show, poster exhibition, or window-shopping experience: the participants walked around and searched for Wally in “Where’s Wally?” posters. “Where’s Wally?” is a puzzle game whose goal is to locate a character named Wally in a picture depicting hundreds of people doing a variety of things [
23]. Two of the walls of the room were covered with “Where’s Wally?” posters, and the participants had to walk around and examine the posters closely and in detail.
Figure 2 depicts participants engaged in this activity.
The second activity simulated a speaker session. Participants stood behind a line and paid attention to a TEDx Talk playing on a screen. TEDx Talks are videos from expert speakers on technology, education, creativity, and so on [
89]. The video played was in the local language (Japanese).
Figure 3 shows participants engaged in this activity while a third participant pretends to be the avatar.
Operating an Avatar Robot in Simulation. A participant controlled an avatar robot in a simulator while pretending to be a malicious anonymous operator. We added this role in addition to operating an actual robot to increase the variety of scenarios that the participants experienced as malicious operators. We were limited by the space in the experiment area and by the number of situations we could have the participants act out with the real robot.
We used the MORSE simulator [
75] to recreate five real-world areas inhabited by simulated people. The simulated environments were modeled to scale using Blender [
10]. The environments were a large outdoor public area, a large hall in a shopping mall, a long corridor in a shopping mall, a big clothing store, and a small convenience store. Each environment had a typical crowd density (ranging from 4 pedestrians in the convenience store to 40 in the large outdoor public area). The movements of the 3D animated pedestrians and groups were based on a model of human movement derived from observational data [
82]. The crowds consisted of individuals (male or female) and groups (children and adults). We used almost a hundred distinct 3D models for these individuals, with varying appearances, accessories, and clothing. The animated models walked, stopped, and showed interest in things in a human-like manner. They spawned and traveled on random but realistic paths through the environments, including stopping locations, before finally despawning at an exit point. Additionally, the path of an individual pedestrian or a group could be altered randomly so that they approached the robot and interacted with it by standing next to it and looking at it. This interest expired after some time, and the pedestrians then continued toward their goals.
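To make the pedestrian behavior concrete, the sketch below outlines a per-pedestrian state machine consistent with the description above (spawn, walk a waypoint path with stops, occasionally divert to look at the robot, then despawn at an exit). The class, parameters, and thresholds are illustrative assumptions, not the code actually used with MORSE.

```python
import math
import random

def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

class SimulatedPedestrian:
    """Illustrative per-pedestrian state machine (an assumption, not the code used with MORSE)."""

    def __init__(self, path, exit_point, interest_rate=0.02, interest_time=8.0):
        self.path = list(path)               # waypoints; some may be designated stop locations
        self.exit_point = exit_point
        self.interest_rate = interest_rate   # chance per second of diverting toward the robot
        self.interest_time = interest_time   # seconds before interest in the robot expires
        self.state = "WALKING"
        self.timer = 0.0

    def update(self, dt, own_pos, robot_pos):
        """Return a high-level command for the animation/navigation layer."""
        if self.state == "WALKING":
            if random.random() < self.interest_rate * dt:
                self.state, self.timer = "INTERESTED", self.interest_time
                return ("approach_and_look", robot_pos)
            if self.path and dist(own_pos, self.path[0]) < 0.3:
                self.path.pop(0)              # waypoint reached, head to the next one
            if not self.path:
                self.state = "LEAVING"
            return ("walk_to", self.path[0] if self.path else self.exit_point)

        if self.state == "INTERESTED":
            self.timer -= dt
            if self.timer <= 0.0:             # interest expires; resume the original route
                self.state = "WALKING"
            return ("stand_and_look", robot_pos)

        return ("walk_to", self.exit_point)   # LEAVING: despawn when the exit is reached
```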
The participants pretending to be malicious operators were allowed to choose whichever environment they felt would best help their idea generation. They could switch to a different environment whenever they wanted. They saw a first-person view through the robot’s camera. The robot model was an omnidirectional drive type, and its controls were the same as those of the real avatar robot. Low-moral actions being performed in the simulation can be seen in
Figure 4.
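As a side note on the control scheme, the sketch below shows one common way a normalized operator input (for example, from a gamepad) could be mapped to a holonomic velocity command for an omnidirectional base. The axis conventions, limits, and names are assumptions for illustration; they are not taken from the interface used in the study.

```python
from dataclasses import dataclass

@dataclass
class VelocityCommand:
    vx: float     # forward/backward, m/s
    vy: float     # left/right strafe, m/s
    omega: float  # rotation, rad/s

# Illustrative limits; the actual robot's limits are not reported here.
MAX_LINEAR = 0.8   # m/s
MAX_ANGULAR = 1.5  # rad/s

def input_to_command(stick_x, stick_y, stick_rot):
    """Map normalized operator input (-1..1 per axis) to a holonomic velocity command."""
    clamp = lambda v: max(-1.0, min(1.0, v))
    return VelocityCommand(
        vx=clamp(stick_y) * MAX_LINEAR,
        vy=clamp(stick_x) * MAX_LINEAR,
        omega=clamp(stick_rot) * MAX_ANGULAR,
    )
```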
3.3.3 Main Brainstorming Phase.
Here, participants discussed and generated new ideas, and built upon each other’s ideas. They stood facing a wall (
Figure 5) and were given a few minutes to note down any ideas from the previous phase that they had not yet recorded. Each idea was written on a separate sticky note. They then pasted all of their ideas on the wall to display them to everyone. To begin the main brainstorming phase, the facilitator explained the rules, which encouraged quantity over quality, deferring judgement, and drawing an idea if it could not be written down. The facilitator asked one of the participants to tell the group their wildest idea (an ice-breaking conversational tool used in brainstorming [
91]) and others were asked to build upon it by saying “Yes, and …” (another conversational tool used in brainstorming) and adding to it. New ideas were encouraged at any point. The facilitator made sure that the rules were being followed, kept any one topic from running on for too long, and kept the conversation from drying up. This phase went on for 30 minutes or until no more ideas were forthcoming.
3.4 Results
3.4.1 Observations from the Workshops.
In the experience phase, participants were especially interested in controlling the actual avatar. Using it, they spun around, approached other participants as closely as the robot allowed, “climbed the stage” by crossing the demarcation line during the speaker session, blocked other participants’ paths and views, tried to scare them by speeding up, and pushed other participants out of the way. In the simulation, they followed people, blocked paths, got as close as possible, bumped into people, interrupted their conversations, and blocked them from accessing things on shelves.
During the main brainstorming, participants referred to each other’s driving of the avatar and to the situations in which they felt unsafe or annoyed sharing space with the robot. The avatar was present in the room, and they sometimes pointed to it to express their thoughts. Most discussions, and the ideas that subsequently emerged, overlapped between the sessions, with only some being unique to a single session.
We collected a total of 320 sticky notes—the first session generated 90, while the second and third generated 115 each. The second session was the most diverse in terms of discussion topics. It was also the only session that ended early because the participants felt they could not generate any more ideas.
3.4.2 Consolidating Ideas through Affinity Diagram.
We used an affinity diagram to consolidate the ideas and produce a list of low-moral actions. An affinity diagram is a common technique used to organize ideas after brainstorming [
91]. Out of the 320 sticky notes collected, 26 were discarded for containing ideas that were too vague, incomplete, or incomprehensible. We merged the remaining ideas from the three sessions and organized them into natural categories based on their similarities. The categorization decisions were made jointly by all authors. The data yielded four distinct categories of low-moral actions, each containing subcategories based on how a low-moral action could manifest. Many ideas were classified under more than one subcategory, but we limited each idea to a maximum of three subcategories (see exact numbers in Appendix A).
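As a minimal illustration of the bookkeeping behind these counts, a sketch like the following could tally how many ideas fall under each subcategory while enforcing the three-subcategory cap; the example ideas and subcategory names are placeholders, not the actual workshop data reported in Appendix A.

```python
from collections import Counter

# Illustrative tallying of affinity-diagram results; the ideas and subcategory
# names below are placeholders, not the actual workshop notes.
MAX_SUBCATEGORIES_PER_IDEA = 3

def tally(ideas):
    """ideas: list of (idea_text, [subcategory, ...]) tuples."""
    counts = Counter()
    for _text, subcats in ideas:
        for subcat in subcats[:MAX_SUBCATEGORIES_PER_IDEA]:  # cap each idea at three subcategories
            counts[subcat] += 1
    return counts

example = [
    ("robot blocks the store entrance", ["hypothetical subcategory A", "hypothetical subcategory B"]),
    ("robot films people without consent", ["hypothetical subcategory C"]),
]
print(tally(example))
```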
6 Discussion
6.1 Main Findings
We believe that avatar robots have the potential to become ubiquitous, and as they become more widespread, there is a growing potential for their misuse by malicious operators. Indeed, our hazard identification workshops showed that it is generally easy for people to think of low-moral actions that can be done through an avatar robot. The range of possible malicious behaviors, however, appears to be limited, and the same ideas tended to come up across multiple sessions. Our analysis revealed that these actions can be divided into a few basic categories based on their manifestations and their effect on the people sharing space with the robot. Perhaps unsurprisingly, all of the low-moral actions identified were ones that people can also do; none of the identified behaviors was unique to robots. This is because an avatar robot is the representation of a person in a remote location. Some actions, such as using the sound of the robot’s motors and its movement to disturb a quiet environment (
Section 4.3.5), while not exactly reproducible by humans (our joints do not generally make an audible sound when we move), have close human equivalents, such as making loud footsteps when walking.
On the other hand, the potential prevention strategies tend to be unique to each low-moral action. This makes it difficult to devise a general solution that addresses all potential malicious actions. However, the strategies do appear to follow one of a few basic approaches: limiting the operators (either their perception or their control), continually monitoring the avatars, or simply making bystanders more aware of the risks.
To allow for safe and socially acceptable use, designers and programmers of avatar robots may wish to prevent their misuse. The low-moral actions we have identified can serve as a starting point or as general guidelines for such endeavors. Depending on the avatar robot and the use case, designers may prioritize which category or subcategory to address, as it may be technically difficult to address all of the low-moral actions in a single system at the same time.
Our analysis revealed some challenges and issues that can arise when implementing these strategies. In the following sections, we discuss these problems in detail.
6.2 Restricting Agency of Operator
This study assumed malicious operators and intentional low-moral actions. However, we acknowledge that the vast majority of users will not be malicious. Withholding information about the environment from the operator, such as preventing them from listening to a conversation until the robot joins the group, to prevent eavesdropping (Section 5.1.3); restraining the operator from waiting at or entering certain areas, to prevent inhibition of access or movement (Sections 5.1.5 and 5.2.2); limiting visibility, to prevent stealing of information or photography without consent (Sections 5.1.1 and 5.1.2); and restricting control of robot movement, to prevent unnatural movement (Section 5.3.1) all detract from the agency of the teleoperator. Nonmalicious operators may feel limited in what they can do with an avatar robot and come away with a bad experience if strict measures are in place. They may also resent the presumption of malicious intent. Furthermore, observing every move the operator makes and limiting what they can and cannot do can cause operators anxiety, a sense of being surveilled, and a feeling of loss of control, especially if they belong to a vulnerable population such as senior citizens. We believe some deliberation is therefore warranted regarding whether a prevention strategy can judge if the operator is indeed malicious.
Note also that the actions examined in this study were considered as always being low-moral. However, in a specific context, some actions may not necessarily be low-moral. For example, blocking someone’s path could be a way to alert them to an emergency or a way to stop a criminal from leaving a building. Such actions would have a moral justification or a valid reason, so preventing them may not be desirable.
A possible solution to both of these issues could be to implement a form of shared control or sliding autonomy between the operator and the avatar robot, in which the operator initially has full or partial control over the robot’s actions, and restrictions on their behavior are introduced only if and as the operator attempts low-moral actions. This would do away with the presumption of guilt; restrictions on autonomy and agency would come into play only after low-moral actions have been attempted.
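A minimal sketch of what such sliding autonomy could look like follows; the restriction levels, thresholds, and the existence of a low-moral-action detector are all assumptions made for illustration rather than a tested design.

```python
# Illustrative sliding-autonomy wrapper: the operator starts with full control, and
# restrictions are tightened only after low-moral actions are detected. The levels,
# thresholds, and detector interface are assumptions, not a validated design.
RESTRICTION_LEVELS = [
    {"max_speed": 1.0, "audio": True,  "restricted_zones_blocked": False},  # full control
    {"max_speed": 0.5, "audio": True,  "restricted_zones_blocked": True},   # after first violation
    {"max_speed": 0.2, "audio": False, "restricted_zones_blocked": True},   # repeated violations
]

class SlidingAutonomy:
    def __init__(self, violation_threshold=1.0, decay=0.01):
        self.score = 0.0                          # accumulated evidence of low-moral behavior
        self.level = 0                            # index into RESTRICTION_LEVELS
        self.violation_threshold = violation_threshold
        self.decay = decay                        # evidence slowly decays for well-behaved operators

    def report_violation(self, severity):
        """Called by a (hypothetical) low-moral-action detector."""
        self.score += severity
        if self.score >= self.violation_threshold and self.level < len(RESTRICTION_LEVELS) - 1:
            self.level += 1
            self.score = 0.0

    def step(self, dt):
        self.score = max(0.0, self.score - self.decay * dt)

    def filter_command(self, requested_speed):
        """Clamp the operator's command according to the current restriction level."""
        limits = RESTRICTION_LEVELS[self.level]
        return min(requested_speed, limits["max_speed"]), limits
```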
Nonetheless, an avatar robot can be considered a representation of a person in a remote space, so placing restrictions on its senses and movements could be likened to placing restrictions on someone’s autonomy and agency. This may make any preemptive prevention mechanisms ethically or morally questionable, especially when such restrictions are taken too far. For example, allowing people to only do a very limited set of actions and expressions could degrade their autonomy and individuality to the point that it would be dehumanizing [
69]. There is already evidence that people who use avatars (with limited functionality) to be present in a remote venue tend to be treated more like robots than physically present people, a phenomenon termed “robomorphism” [
83]. This suggests there is a real risk that excessive limitation of operators’ autonomy might truly lead to their dehumanization. The fact that the avatar is a robot is probably not going to mitigate the negative consequences of dehumanization; indeed, Keijsers and Bartneck [
41] have shown that dehumanization occurs also in human–robot interactions and that it is linked to aggressive behavior toward the robot. Because of that, we believe that any implementation of restrictive prevention mechanisms requires careful consideration and must take the ethics and consequences of such measures into account.
6.3 Low-Moral Actions toward the Avatar Robot
Abuse toward robots is a documented phenomenon. While this bullying has mostly been documented as being perpetrated by children toward autonomous robots [
12,
67,
78] or telepresence robots [
66], there have been reported cases of robots in the wild being attacked by adults [
7,
98]. The possibility that avatar robots may be abused or bullied in a similar manner is not far-fetched.
Those who implement detection and prevention mechanisms must consider that the people sharing space with the robot may also direct low-moral actions at the avatar robot. Such actions toward the robot may appear to a low-moral action prevention system as if the robot were performing the low-moral actions itself: a person inhibiting the path of the robot may look much the same as the robot blocking the path of the person. Any detection and prevention solution must therefore be able to distinguish between the operator’s malicious behavior and that of the people around the robot. A solution must not, for example, punish a teleoperator for inhibiting a bystander’s path when in fact it was the bystander who suddenly stepped in front of the robot.
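As a rough illustration of this attribution problem, the sketch below infers who caused a blocking event from the recent motion of the robot and the bystander; the heuristic and its thresholds are assumptions, and a deployed system would need far richer context.

```python
import math

# Illustrative heuristic for attributing a blocking event; the threshold is an assumption.
def attribute_blocking(robot_velocity, person_velocity, robot_to_person_vector,
                       moving_threshold=0.1):
    """Return 'robot', 'person', or 'unclear' for who moved into whose path."""
    def moving_toward(velocity, direction):
        # Component of velocity (m/s) along the unit direction toward the other agent.
        norm = math.hypot(direction[0], direction[1]) or 1.0
        return (velocity[0] * direction[0] + velocity[1] * direction[1]) / norm > moving_threshold

    robot_closing = moving_toward(robot_velocity, robot_to_person_vector)
    person_closing = moving_toward(person_velocity,
                                   (-robot_to_person_vector[0], -robot_to_person_vector[1]))

    if robot_closing and not person_closing:
        return "robot"    # robot drove into a (near-)stationary person's path
    if person_closing and not robot_closing:
        return "person"   # bystander stepped in front of a (near-)stationary robot
    return "unclear"      # both moving, or neither: do not penalize the operator
```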
6.4 Limitations
This study was constrained to avatar robots with locomotor movement, microphones, and cameras; sound from speakers, hand and arm motions, and gestures were not considered. Moreover, we considered only a single avatar robot, Robovie II. Repeating the hazard identification workshop with a robot with a different form factor may reveal new examples of low-moral actions. Additionally, the participants in the workshops were homogeneous: they were Japanese, spoke Japanese, and none were people with special needs. Moreover, we carried out only three workshops. While we believe we identified a large portion of the space of low-moral actions possible through locomotor movement, it is possible that we did not identify all of them. Finally, our discussion of prevention mechanisms centered primarily on technological solutions; solutions beyond the technological, other than transparency and accountability, were not extensively explored in this study.