
1 Introduction

In the past decade, we have witnessed the birth of a new kind of technology: virtual assistants. They provide a new way of interacting with increasingly complex machines and are meant to support users in their daily activities. In the context of automated driving, the virtual assistant is the spokesperson of the automated car: it acts as a mediator between the car and the driver and embodies the car's intelligence [1]. Among the concepts presented by car manufacturers, we find invisible characters present only through their voice (Toyota's YUI) [2], cartoon characters displayed on a screen (Honda's Hana concept), and abstract representations (Nuance's Dragon Drive concept). Scientific publications supporting these design choices are lacking, especially with regard to trust, since several studies [9, 19,20,21,22] have shown that a large majority of people do not trust highly automated cars.

In our study, we assess the trustworthiness of various visual embodiment models for a virtual assistant. We first needed to select a range of representative images for our models; for this purpose, we conducted a picture sorting procedure. Once our sample of visual representations was defined, we ran an online survey to assess trustworthiness.

2 Related Work

2.1 Visual Embodiment and Trust

Researchers have investigated the visual representation of virtual assistants from various perspectives. Some focus on the importance of having a visual embodiment at all [14, 15, 17]. Their results showed that a visual embodiment is important for a pleasant interaction, especially when the user's visual attention is not required, although the realism of the embodiment might be of little importance. Others focus on the advantages of designing a humanlike face for a virtual assistant [13, 16, 18] and on the most important features to implement on such a face. For example, DiSalvo et al. found in [11] that, to project a high level of humanness, a robot face should have a mouth, a nose, and eyelids. In [10], the authors found that the faces ranked as least friendly (without pupils or mouth, with eyelids) were also the ones ranked as least trustworthy. Similarly, Li et al. showed in their study [3] that a robot's visual embodiment has an impact on its likeability, and they found a significant correlation between likeability and trust in the robots.

2.2 Design Space for Virtual Assistants

It is worth noting that all the research presented above focuses on the "Humanlike" visual embodiment. However, the design space is much larger and there are many other models of visual embodiment to choose from. In [5], Haake and Gulz suggested a three-dimensional design space for visual embodiment comprising basic model, graphical style and physical properties. The basic model refers to the constitution of the visual embodiment, which can take the form of a human, an animal, a fantasy concept, an inanimate object or a combination of these. The graphical style, which can be naturalistic or stylized, refers to the degree of detail used in the visual design. Considering this design space, the possibilities are countless, and the question of which one is the most trustworthy, depending on the role of the virtual assistant, remains open. In our study, we focus on the basic model dimension for a virtual assistant in a highly automated car. We hypothesized that the user's level of trust in the automated system would differ depending on the assistant's visual embodiment model; hence, some visual embodiment models might instill higher levels of trust than others.

3 Research Methodology

Our study articulates two procedures. The first is a picture sorting procedure meant to help us select the images that best represented each of the predefined virtual assistant models. We then incorporated these selected visuals into our survey.

3.1 Picture Sorting Procedure

Picture sorting [6], one of the many card sorting techniques, is used to study users' mental models and how they categorize different types of images.

Nineteen participants [4] (11 male, 8 female, mean age = 28.10), recruited from IRT SystemX and from a student house (Paris, France), took part in the card sorting procedure.

We collected 82 pictures from the website Pinterest based on ten (10) predefined models: "Human Naturalistic", "Human Stylized", "Animal Naturalistic", "Animal Stylized", "Human Mechanical", "Animal Mechanical", "Mini Mechanical", "Abstract", "Inanimate" and "Fantasy". These models were formed based on the first two dimensions of the design space proposed in [5].

All model labels were in French during the experiment (translated here). Many other models could be identified, but for this experiment we chose to focus on these 10 only.

The pictures were printed and placed on a large table. The model labels were placed on a table next to the pictures (see the setting in Figs. 1 and 2).

Fig. 1. Picture sorting setting (before)

Fig. 2. Picture sorting setting (after)

First, we read the definition of every label to the participants and answered their questions to make sure they understood what each label meant. Then we asked them to sort the pictures by label, following two rules:

  • A picture can be placed in more than one group. In that case, the participant places the picture in one group and uses additional post-its to specify the other group(s) in which the image could also be classified.

  • If a picture cannot be placed in any of the predefined groups, the participant can put it in a separate "Non categorized" group.

At the end of the process, we asked the participants to explain their sorting, especially for the pictures that were not sorted. We then asked for their age and professional background before they left.

3.2 Picture Sorting Results

Each participant's sorting results were saved in an Excel file and analyzed using the spread rate of each image across the predefined groups [7]. Two criteria were used for selection: a model was selected if it had at least 4 representative pictures, and a picture was considered representative if it had been placed in the same group by at least 70% of the participants.

Using these criteria, we selected 5 of the predefined models and, for each of them, 4 pictures (Fig. 3).
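As an illustration, this selection logic can be sketched in Python, assuming the sorting results are exported to a long-format table with one row per (participant, picture, assigned model) placement; the file and column names below are hypothetical:

import pandas as pd

# Hypothetical export of the sorting results: one row per (participant, picture, model) placement.
sorts = pd.read_csv("picture_sorting_results.csv")  # columns: participant, picture, model

n_participants = sorts["participant"].nunique()  # 19 in our study

# A picture is representative of a model if at least 70% of participants placed it in that group.
agreement = (
    sorts.groupby(["picture", "model"])["participant"].nunique() / n_participants
).reset_index(name="agreement")
representative = agreement[agreement["agreement"] >= 0.70]

# A model is selected if it has at least 4 representative pictures.
picture_counts = representative.groupby("model")["picture"].nunique()
selected_models = picture_counts[picture_counts >= 4].index.tolist()
print(selected_models)  # 5 models were retained in our study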

Fig. 3. Virtual assistant models and representative pictures selected by picture sorting.

3.3 Survey Procedure

Participants were invited to this online study via a link shared on social media (Facebook, Whaller) and various mailing lists (universities, student associations and professional associations). 146 participants (88 female; mean age = 36.92, SD = 14.04, range 19 to 72) completed the online survey. 124 participants (82.87%) had a driver's license.

When they clicked on the link, they were forwarded to a version of LimeSurvey (hosted at the Université de Poitiers), the survey tool we used for this study.

On the first page, participants read general information about the study procedure and gave their informed consent and agreement to data processing. The survey consisted of five parts:

  1. An introduction with questions on driving habits and virtual assistant usage;

  2. A short text instructing participants to imagine sitting in a highly automated car with a virtual assistant handling the driving task when the automated mode is activated. We then asked a few questions on participant behavior in manual mode and once the automated mode is activated;

  3. Participants were then instructed to imagine a critical driving situation in automated mode, with the virtual assistant in charge of the driving task (a shifting manoeuvre in busy traffic to give way to an ambulance approaching from behind). They were presented with each of the 5 visual embodiment models selected in the picture sorting procedure (one model at a time, each model represented by 4 different images; the Abstract model is shown as an example in Fig. 4). For each model, they were asked to fill in a 16-item questionnaire (on a scale of 0 to 10) assessing perceived anthropomorphism, liking and self-reported trust towards the model.

    Fig. 4. Virtual assistant Abstract model as presented in the survey

    We used a translated version of the questionnaire developed and used in a simulator study by Waytz et al. (2014) [8]. The order in which the models were presented was automatically randomized by LimeSurvey.

  4. Participants chose the best and the worst assistant among the ones presented in part (3);

  5. Participants were asked to fill in their personal information: age, gender, education level, country of residence and professional activity.

3.4 Preliminary Survey Results

Nine items in the questionnaire assessed self-reported trust towards each model. These items were averaged to form a single composite score (α = 0.97), the trust score.
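For illustration, such a composite and its internal consistency (Cronbach's α) can be computed as sketched below; the file and column names are hypothetical:

import pandas as pd

def cronbach_alpha(items: pd.DataFrame) -> float:
    """Cronbach's alpha for a set of questionnaire items (one column per item)."""
    k = items.shape[1]
    sum_of_item_variances = items.var(axis=0, ddof=1).sum()
    variance_of_total = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - sum_of_item_variances / variance_of_total)

# Hypothetical file and column names: the nine trust items (0-10 scale) for one model.
responses = pd.read_csv("survey_responses.csv")
trust_items = responses[[f"trust_item_{i}" for i in range(1, 10)]]

alpha = cronbach_alpha(trust_items)        # reported as 0.97 in our data
trust_score = trust_items.mean(axis=1)     # composite trust score per participant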

Our results (Table 1) based on this trust score show that the Mechanical Human model, followed by the Human and Abstract ones, received the highest trust scores. Conversely, the Animal and Mechanical Animal models appear to be the least appropriate for eliciting trust in a virtual assistant for an autonomous car (Table 2).

Table 1. Median ranks and other non-parametric statistics
Table 2. Paired-samples Wilcoxon signed-rank tests. The hypothesis is that measurement one is greater than measurement two.
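As an illustration of how such comparisons can be run, a sketch using pandas and scipy, assuming the composite trust scores are available in a long-format table (file, column and model names are hypothetical):

import pandas as pd
from scipy.stats import wilcoxon

# Hypothetical long-format table of composite trust scores.
scores = pd.read_csv("trust_scores.csv")   # columns: participant, model, trust_score
wide = scores.pivot(index="participant", columns="model", values="trust_score")

# Table 1: non-parametric descriptives (median and quartiles) per model.
print(wide.median())
print(wide.quantile([0.25, 0.75]))

# Table 2: paired one-sided Wilcoxon signed-rank test,
# testing whether the first model's scores are greater than the second's.
stat, p = wilcoxon(wide["Mechanical Human"], wide["Animal"], alternative="greater")
print(f"W = {stat:.1f}, p = {p:.4f}")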

Figure 5 shows, on the one hand, the distribution of trust scores in our study and, on the other hand, how many times each model was ranked at each particular score. First, the trust score distribution shows frequent scores between five (the middle of the scale) and eight. This seems to indicate that participants had a relatively positive attitude towards trusting the proposed models (the median is generally above five, except for the Animal model). Very high trust scores (nine or ten) were rather rare, and a non-negligible part of our results shows that a lack of trust may also occur (scores between zero and four). Indeed, looking at model categories, our findings point out that the Animal and Mechanical Animal categories are more represented among low trust scores. Conversely, the Human, Mechanical Human and Abstract categories are the most represented among higher trust scores (third quartile above 7).
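The per-score counts underlying Fig. 5 can be tabulated with a simple cross-tabulation, again assuming the hypothetical long-format table of trust scores used above:

import pandas as pd

# Same hypothetical long-format table of composite trust scores as above.
scores = pd.read_csv("trust_scores.csv")   # columns: participant, model, trust_score

# How many times each model received each (rounded) trust score, as plotted in Fig. 5.
score_counts = pd.crosstab(scores["trust_score"].round(), scores["model"])
print(score_counts)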

Fig. 5. Trust score across conditions

4 General Discussion

The objective of this study was to investigate the impact of a virtual assistant's visual embodiment model on user trust in a highly automated car. We measured trust through an online questionnaire assessing a range of visual embodiment models selected in a prior picture sorting procedure.

Our preliminary results point to the Mechanical Human model, followed by the Human and Abstract models, as the most suitable embodiments for representing a virtual assistant in an autonomous driving context, while the Animal and Mechanical Animal models should be avoided.

Of course, given its methodology, this study cannot answer all the questions raised by virtual assistant assessment. In particular, online studies may introduce many uncontrolled factors (screen size, participants' reading and understanding, completing the survey alone or with someone's help). Furthermore, participants had to rely on static pictures, which might have influenced their answers; seeing a virtual assistant in motion might be very different. For better ecological validity, future studies might replicate this experiment in a controlled environment such as a driving simulator, or even in a real car with a real-life automated driving experience.

Despite these limitations, this study allowed us to demonstrate differences between visual embodiment models with regard to trust. Further analyses will be performed on our dataset, first to identify the impact of the different models on the anthropomorphism and liking measures, and then to examine correlations between these three measures. We will also investigate the hypothesis of potential user profiles related to preferences for specific visual embodiments.