
Impacts of Personal Characteristics on User Trust in Conversational Recommender Systems

Wanling Cai, Department of Computer Science, Hong Kong Baptist University, China, cswlcai@comp.hkbu.edu.hk
Yucheng Jin, Department of Computer Science, Hong Kong Baptist University, China, yuchengjin@hkbu.edu.hk
Li Chen, Department of Computer Science, Hong Kong Baptist University, China, lichen@comp.hkbu.edu.hk

Conversational recommender systems (CRSs) imitate human advisors to assist users in finding items through conversations and have recently gained increasing attention in domains such as media and e-commerce. Like in human communication, building trust in human-agent communication is essential given its significant influence on user behavior. However, inspiring user trust in CRSs with a “one-size-fits-all” design is difficult, as individual users may have their own expectations for conversational interactions (e.g., who, user or system, takes the initiative), which are potentially related to their personal characteristics. In this study, we investigated the impacts of three personal characteristics, namely personality traits, trust propensity, and domain knowledge, on user trust in two types of text-based CRSs, i.e., user-initiative and mixed-initiative. Our between-subjects user study (N=148) revealed that users’ trust propensity and domain knowledge positively influenced their trust in CRSs, and that users with high conscientiousness tended to trust the mixed-initiative system.

CCS Concepts: • Human-centered computing → User interface design; • Human-centered computing → Empirical studies in interaction design; User models; User studies; • Information systems → Recommender systems; Empirical studies in HCI;

Keywords: Conversational recommender systems; trust; personal characteristics; mixed-initiative interaction

ACM Reference Format:
Wanling Cai, Yucheng Jin, and Li Chen. 2022. Impacts of Personal Characteristics on User Trust in Conversational Recommender Systems. In CHI Conference on Human Factors in Computing Systems (CHI '22), April 29-May 5, 2022, New Orleans, LA, USA. ACM, New York, NY, USA, 14 pages. https://doi.org/10.1145/3491102.3517471

1 INTRODUCTION

Conversational recommender systems (CRSs) imitate human advisors to assist users in finding desired items through multi-turn conversations and have been attracting increasing attention in recent years for developing task-oriented chatbots [10, 17, 29]. Some commercial chatbots have been built on Facebook Messenger or Amazon Alexa for recommending items (e.g., songs, movies, and products) [69]. Unlike traditional recommender systems that mainly present one-shot recommendations (e.g., a ranked list of items) to users [63], CRSs can support mixed-initiative (combining both user-initiative and system-initiative [3]) interactions between users and the system [10]. In such a system, users can not only actively inform the system of their preferences (e.g., “I want relaxing music.”), but also accept proactive suggestions from the system (e.g., “Do you want to try some piano music?”) [29]. Recent works have shown that mixed-initiative CRSs can help users better control the recommendation [30] and facilitate their exploration [11].

However, few studies have investigated the influence of the conversational interaction – particularly the initiative strategy (i.e., who, user or system, takes the initiative in the conversation) – on user trust in CRSs [29]. Given that user trust plays a vital role in users’ willingness to accept recommendations [6] and adopt a given system [5, 12], which can be inherently affected by users’ personal characteristics (such as personality traits) [38, 52], this work aims to identify whether and how user characteristics and system conversation design factors affect user trust in text-based CRSs. Our findings will be useful for optimizing the design of CRSs to be more trustworthy for individual users, which may potentially maximize the benefit of CRSs.

Our work is theoretically driven by the three-layered trust model proposed by Hoff and Bashir [26], which suggests that user trust in a computer system can be influenced by three types of factors: user-related factors, system-related factors, and context-related factors. Among user-related factors, inspired by previous works [16, 38, 76], we considered three personal characteristics: (1) personality traits, which refer to enduring characteristics related to people's thinking, feeling, and behaving, and have been shown to influence user trust in both human-human and human-machine relationships [16, 76]; (2) trust propensity, which can be defined as the user's general tendency to trust others, and has been demonstrated to impact user trust in traditional recommender systems [12, 38]; and (3) domain knowledge, which represents the user's expert knowledge in the choice domain, and has been shown to influence user reliance and trust in intelligent systems [26, 66].

Among system-related factors, we considered initiative strategies, among which the mixed-initiative strategy is a special characteristic of CRSs. However, it might be difficult to inspire user trust in CRSs with a “one-size-fits-all” design of the initiative strategy, because different users may prefer different conversation interactions given their personal characteristics [21, 38]. Thus, we investigated whether and how users’ personal characteristics influence their trust in CRSs with different initiative strategies (user-initiative vs. mixed-initiative).

Finally, regarding context-related factors (also known as situational factors such as the user's performed task [26, 39]), we examined whether and how users’ personal characteristics interact with task complexity to affect user trust in CRSs. User tasks with different levels of complexity may influence how users interact with systems [9, 54], and the influences vary among different types of users [7, 43, 62]; for example, domain novices tend to spend much more time completing a more complex knowledge-seeking task than do domain experts [43].

To summarize, we aim to answer the following three research questions in this work (see Figure 1):

Figure 1: Our research questions.

RQ1: How do personal characteristics (personality, trust propensity, domain knowledge) affect user trust in CRSs?

RQ2: How do personal characteristics and initiative strategy interact to affect user trust in CRSs?

RQ3: How do personal characteristics and task complexity interact to affect user trust in CRSs?

To answer our research questions, we conducted a between-subjects user study (N=148). Two variants of a text-based conversational music recommender were implemented for the experiment: a User-Initiative system that mainly responds to users’ requests or feedback, and a Mixed-Initiative system that not only allows users to freely give feedback on the recommendation, but also proactively offers suggestions. Additionally, to vary the task complexity, we designed two user tasks in the context of seeking recommendations: a Simple Task that asks users to find five songs based on their current preferences, and a Complex Task that asks users to first explore diverse types of songs beyond their current interests and then select five songs.

Our analyses revealed three main findings: (1) User experience with conversational interaction in a CRS can influence user trust in the system; (2) Among the three types of personal characteristics considered, users’ trust propensity and domain knowledge significantly affected user trust toward the CRS; (3) The personality trait conscientiousness separately interacted with the initiative strategy and the task complexity to inspire user trust in the CRS. Based on these findings, we present practical implications for designing trustworthy CRSs that can be tailored to individual users’ needs based on their personal characteristics (e.g., conscientiousness, trust propensity, and domain knowledge). We believe this work contributes to research on conversational Artificial Intelligence (AI) systems and will facilitate improved CRS design by integrating personalization.

2 RELATED WORK

2.1 Conversational Recommender Systems

Conversational recommender systems (CRSs) aim to mimic a human advisor to assist users in looking for recommendations in a multi-turn dialogue via text or voice [17, 29, 74, 75], and have been applied in several domains, such as movies [10], music [11, 30], and e-commerce [75]. Unlike single-shot traditional recommender systems [63], CRSs allow users to interact with the system in a multi-turn conversation, enabling the system to incrementally refine the user preference model to generate more satisfying recommendations [10, 34]. Such systems can support mixed-initiative interaction by combining both user-initiative (i.e., users actively tell the system what they want) and system-initiative (i.e., the system proactively offers suggestions to users during the recommendation process) interactions, which is regarded as a more flexible interaction strategy in human-computer interaction (HCI) [3]. Several recent studies on CRSs have demonstrated that such systems enable more natural interactions between the user and the system, which can better enhance user experience with recommender systems [11, 29, 30, 55].

With increased interest in CRSs, recommender system researchers have been focusing on improving CRS efficiency (i.e., reducing the number of dialogue turns) and effectiveness (i.e., improving the recommendation quality) [22, 29, 75]. Although conversational system design is a trending topic within the HCI community, few studies have investigated conversational interaction designs for recommender systems [11, 29, 30, 60]. For instance, one study compared two critiquing-based conversational music recommenders that employed different initiative strategies [30], and found that the user-initiative CRS gives users more control to tune recommendations on their own, whereas the mixed-initiative CRS guides users to discover more diverse recommendations. Another recent study demonstrated the ability of a CRS to promote user exploration activities [11], and suggested that the mixed-initiative CRS enhances user exploration by allowing users to control the exploration direction on their own as well as guiding them to explore something different. While existing studies have demonstrated several advantages of CRSs, work on the critical factor of user trust, which strongly determines users’ intention to adopt CRSs in real-world situations, is limited. Thus, this study aims to investigate the factors that may affect user trust in CRSs.

2.2 Trust in Human-Computer Interaction and Recommender Systems

Trust is a long-studied and important factor in both human-human and human-computer relationships [21, 44, 46]. Trust is defined in various ways in the existing HCI literature [45, 50, 73], but a common theme is that trust can be regarded as a behavioral intention (e.g., intention to use) or “trusting intention” [50]. Studies have suggested three types of factors that can influence trust: user-related, system-related, and context-related factors. These three types respectively correspond to the three layers of the trust model proposed by Hoff and Bashir [26]: dispositional trust refers to the user's general tendency to trust systems, which may arise from individual characteristics such as personality (user-related); learned trust represents the user's evaluations of a system's trustworthiness drawn from past interactions (system-related); and situational trust is based on the context of the user-system interaction, such as the complexity of the performed task and user workload (context-related). Motivated by this three-layered trust model [26], we examine these three types of factors (user-related, system-related, and context-related) that may influence user trust in CRSs.

Trust-related issues have also gained a lot of attention in recommender systems (RSs), because user trust highly influences users’ willingness to use a system and follow its recommendations in their decision-making process [38, 40, 57, 72]. User trust in a technological artifact (e.g., recommender system) is often based on competence (i.e., the system's ability to assist users in a specific task), benevolence (i.e., the system's qualities such as security and reputation), and integrity (i.e., the system's reliability and honesty) [50]. Studies on RSs have demonstrated that users’ perceived competence of the system positively influences their trust in the system [12, 41]. For example, the accuracy and diversity of recommendation lists tend to improve user trust and increase customer purchases in the e-commerce domain [58]. Moreover, the organization-based recommendation interface was demonstrated to reduce user effort in the decision-making process, sustain user trust, and increase users’ intention to use the system [12]. Recommendations accompanied by explanations that provide information to assist users in making judgments on the recommended item have also been shown to increase user trust and decision confidence [12, 70].

Literature on user trust in RSs has mostly focused on the aspect of recommendations [41, 58], whereas, to the best of our knowledge, user trust in the context of conversational recommendations has rarely been investigated. In CRSs, the conversational interaction between users and the system usually mimics human communication, suggesting that user trust toward the system is similar to trust in interpersonal relationships. Thus, to improve trust, the system should be both reliable in performing the requested tasks and predictable in interactions (i.e., behaving as expected by the user) [62]. However, individual users may have their own expectations of interaction strategies (e.g., preference for user-initiative or mixed-initiative) depending on their individual characteristics, which may influence their trust in the system. To facilitate the design of trustworthy CRSs that can serve individual users’ needs, our work focuses on investigating the impact of personal characteristics on user trust in CRSs that employ different initiative strategies.

2.3 Personal Characteristics

Because previous HCI and RS studies have indicated that user trust in the human-system relationship depends on individual characteristics [16, 26, 76], we believe that user trust in CRSs may also be influenced by users’ personal characteristics. The literature suggests that three personal characteristics, namely personality traits, trust propensity and domain knowledge, are likely to affect user trust in conversational recommenders.

Personality Traits. Personality is defined as individual differences in one's enduring way of thinking, feeling, and behaving [37, 48]. The Big-Five personality model, which comprises five traits – openness to experience (openness), conscientiousness, extroversion, agreeableness, and neuroticism – is widely used to assess user personality [48]. Studies have reported the impacts of personality traits on trust in interpersonal relationships [21], demonstrating that openness and conscientiousness affect trust in both friends and strangers, and agreeableness affects trust in strangers. Personality traits also influence user trust in human-machine collaboration [16, 76]; for example, people who are more agreeable and conscientious are more likely to trust automation in decision-making [16]. Thus, we speculate that personality traits (such as agreeableness and conscientiousness) can also influence user trust toward system guidance in CRSs.

Trust Propensity. Trust propensity is defined as the general tendency to trust others [18, 64] and is viewed as a dynamic individual difference that may be affected by personality type as well as situational factors (e.g., cultural background) [47]. Trust literature has shown that a user's trust propensity influences the formation of trust toward specific technological systems [49, 50]. When deciding whether to trust a system, users tend to look for cues that signify the system's trustworthiness; however, the perception of the signals is affected by their trust propensity [45]. Thus, we seek to determine whether this characteristic will impact user trust in CRSs.

Domain Knowledge. Domain knowledge refers to a person's expert knowledge in a specific field. HCI research has demonstrated that users’ domain knowledge can influence their interaction behaviors and preferred interaction strategies [56]. In recommender systems, domain experts prefer more control during the decision-making process [38], whereas domain novices tend to find recommendations that demand less control to be more helpful. Moreover, users’ reliance on decision support systems is related to their domain knowledge; for example, users with little or no specialized domain knowledge are likely to rely on the system's suggestions [8]. Thus, we believe that domain knowledge may influence the way users prefer to interact with CRSs (e.g., preference for user-initiative or mixed-initiative), hence affecting user trust.

3 USER EXPERIMENT

3.1 Experiment Design

Based on Hoff and Bashir's three-layered trust model [26], we investigated how user-related factors (personal characteristics) interact with both a system-related factor (initiative strategy) and a context-related factor (task complexity) to influence user trust in CRSs. We deployed two text-based prototype conversational music recommenders that employ different initiative strategies (user-initiative and mixed-initiative) [11], and designed two user tasks of varying complexity in the recommendation domain. Thus, we designed a 2 (User-Initiative vs. Mixed-Initiative) × 2 (Simple Task vs. Complex Task) online between-subjects user study, in which participants were randomly assigned to one of the four experimental conditions (see Figure 2). Below we present two experimental manipulations.

Figure 2: Interfaces of two text-based conversational music recommenders employing different initiative strategies (User-Initiative [left] and Mixed-Initiative [right]), and user tasks with low and high complexity (Simple Task and Complex Task [middle]) in our 2 × 2 between-subjects study.

3.1.1 Conversational Recommenders. We used two variants of text-based conversational music recommenders that employ different initiative strategies to support users in looking for music recommendations [11]:

  • User-Initiative System: This system, which performs reactive system behavior, only responds when users initiate requests during the conversation. In this system, users can post feedback to refine the current recommended item or ask for songs based on music-related attributes (e.g., genres, tempo, and danceability). For example, a user can tune a recommendation by typing “I want higher tempo.”
  • Mixed-Initiative System: This system supports both user-initiative and system-initiative interactions. Specifically, in addition to reactively responding to users’ requests, the system can proactively provide suggestions (e.g., “Compared with the last played song, do you like the song of lower tempo?”) to facilitate users’ music discovery during the recommendation process. As suggested by a study of chatbot proactivity [60], our system offers suggestions to users when they make an explicit request (i.e., by clicking the “Let bot suggest” button; Figure 2) or when the system identifies a good time to offer suggestions.

Although conversational systems can employ three types of initiative strategies, namely user-initiative, system-initiative, and mixed-initiative strategies, we did not employ a purely system-initiative strategy in our study because this design relies on a “system asks, user responds” conversation paradigm [75], which can restrict user interaction, reduce flexibility, and make users feel passive [29, 36].

Figure 2 shows the user interfaces of the two conversational music recommenders, and the dialogue windows show the conversation between the user and the system. Each recommended song is displayed on a card that lets the user control music playback, with a set of buttons under the card for the user to give feedback. Specifically, the user can click the “Like” button to add the current song into their playlist where they can rate the song, and the “Next” button to skip the current song. In the Mixed-Initiative system, the user can click the “Let bot suggest” button to trigger the system's suggestion based on the currently recommended song. Additionally, the user can send a message in natural language about the music genre, audio feature, or artist to provide feedback on the currently recommended song and accordingly refine the recommendation. We used a popular natural language understanding platform, DialogFlow, and a widely used online music service, the Spotify API, to develop our conversational music recommenders. To generate the system-initiative suggestions, we employed the progressive system-suggested critiquing technique designed by Cai et al. [11], which considers the user's song preferences as well as incremental feedback captured from past interactions.
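To illustrate how critique feedback of this kind can be translated into recommendation parameters, the sketch below maps a parsed user request (e.g., “I want higher tempo”) onto target audio-feature values of the sort accepted by Spotify's recommendations endpoint. The step sizes and function names are hypothetical illustrations; this is not the authors' implementation.

```python
# Illustrative sketch (not the authors' code): map a critique parsed by the
# NLU layer (e.g., "I want higher tempo") onto target audio-feature
# parameters in the style of Spotify's recommendations endpoint.

# Step sizes are hypothetical; the paper does not specify them.
FEATURE_STEPS = {"tempo": 20.0, "danceability": 0.15, "energy": 0.15}

def adjust_targets(current_features: dict, feature: str, direction: str) -> dict:
    """Return target_* parameters nudging `feature` up or down
    relative to the currently recommended song."""
    if feature not in FEATURE_STEPS:
        raise ValueError(f"unsupported feature: {feature}")
    step = FEATURE_STEPS[feature]
    delta = step if direction == "higher" else -step
    return {f"target_{feature}": current_features[feature] + delta}

# E.g., the user types "I want higher tempo" while a 110-BPM song is playing:
params = adjust_targets({"tempo": 110.0, "danceability": 0.6}, "tempo", "higher")
```

The returned dictionary (here, a raised `target_tempo`) would then be passed along with the user's seed preferences when requesting the next recommendation.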

3.1.2 User Tasks. To determine whether and how users’ personal characteristics interact with the context-related factor (task complexity) to influence user trust in CRSs, we considered two typical user tasks in the recommendation domain:

  • Simple Task. Users are asked to interact with our conversational music recommender (called “music chatbot” in our study) to find five songs that suit their preferences, and rate each song in terms of its pleasant surprise.
  • Complex Task. Users are asked to complete two steps: (1) use our music chatbot to discover songs from as many different music genres as possible, create a playlist containing 20 songs that fit their tastes, and then rate each song in terms of its pleasant surprise; and (2) select their top-5 most preferred songs from the playlist they created. Compared with the simple task, this task requires users to discover more types of music and make comparisons for selecting their most preferred songs, which is more cognitively demanding.

3.2 Participants

We recruited participants from Prolific, a popular platform for academic surveys [59]. To ensure experiment quality, we pre-screened users in Prolific using the following criteria: (1) participants should be fluent in English; (2) they must have more than 100 previous submissions; (3) their approval rate should be greater than 95%. The experiment took 25 minutes to complete on average. We compensated each participant £2.40 on successfully completing the experiment. The Research Ethics Committee (REC) of the authors’ university approved this study.

In total, 194 users participated in our study. We removed the responses of 23 participants because of their excessively long experiment completion time (outliers). We excluded the responses of another 23 participants who failed the attention check questions. Thus, the remaining responses of 148 participants were included in the analyses [User-Initiative: Simple Task (32), Complex Task (35); Mixed-Initiative: Simple Task (45), Complex Task (36); Gender: female (70), male (75), other (3); Age: 19-25 (69), 26-30 (27), 31-35 (25), 36-40 (10), 41-50 (11), > 50 (6)]. Participants were mainly from the United Kingdom (32), the United States (32), Portugal (18), Poland (12), and Italy (9).

3.3 Experimental Procedure

Participants had to accept a general data protection regulation consent form before they signed into our system using their Spotify accounts. After reading the user study instructions, participants were asked to fill out a pre-study questionnaire, which included demographic questions and questions for measuring their personal characteristics (see Section 3.4). To ensure that participants understood the study task and how to use the conversational recommender, they were given a tutorial on interacting with the assigned conversational music recommender, followed by two minutes to try the system. After completing the tutorial, participants were asked to complete a randomly assigned task (Simple Task or Complex Task as described in Section 3.1.2). After finishing the task, participants were asked to fill out a post-study questionnaire regarding their trust-related perception of the conversational music recommender (see Section 3.5).

3.4 Pre-Study Questionnaire

In the pre-study questionnaire, we used a short personality test, the Ten Item Personality Inventory (TIPI) [23], to assess participants’ Big-Five personality traits: openness to experience, conscientiousness, extroversion, agreeableness, and neuroticism. Each personality trait is assessed by two questions in the TIPI, and the personality value for each trait is the average of the scores on the two questions. To measure participants’ trust propensity, we adopted two statements developed by Lee and Turban [45]: “I tend to trust the recommender, even though having little knowledge of it.” and “Trusting someone or something is difficult.” Because our system was built for the music domain, we used the nine statements from the “Active Musical Engagement” facet of Goldsmiths Musical Sophistication Index [53] to assess participants’ musical sophistication as their domain knowledge. All statements were rated on a 7-point Likert scale from 1 (strongly disagree) to 7 (strongly agree). In Table 1, we briefly introduce each measured personal characteristic.
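The TIPI scoring described above can be sketched as follows, assuming the standard TIPI item key and reverse-scoring scheme from Gosling et al. [23] (each trait is the mean of one direct item and one reverse-scored item on the 7-point scale); the item numbering and example responses are illustrative:

```python
# Sketch of TIPI scoring, assuming the standard item key [23]:
# each trait = mean of one direct and one reverse-scored item (reverse: 8 - x).
TIPI_KEY = {  # trait: (direct item, reverse-scored item), 1-indexed
    "extroversion": (1, 6),
    "agreeableness": (7, 2),
    "conscientiousness": (3, 8),
    "emotional_stability": (9, 4),
    "openness": (5, 10),
}

def score_tipi(responses: list[float]) -> dict[str, float]:
    """responses: the ten TIPI answers (1-7), in questionnaire order."""
    scores = {}
    for trait, (direct, reverse) in TIPI_KEY.items():
        scores[trait] = (responses[direct - 1] + (8 - responses[reverse - 1])) / 2
    return scores

# Neuroticism, as reported in this paper, is the reverse of emotional stability:
traits = score_tipi([5, 3, 6, 2, 6, 2, 6, 3, 5, 2])
neuroticism = 8 - traits["emotional_stability"]
```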

Table 2 shows the descriptive statistics of our participants’ personal characteristics (PCs). The scored values are centered between 3 and 5 for almost all PCs, and the standard deviations are comparable across all PCs. Table 3 shows Pearson's correlations between these PCs; these correlations (e.g., trust propensity is positively related to extroversion and agreeableness) are generally consistent with the results of previous literature [21, 24, 33].

Table 1: Description of Big-Five personality traits, trust propensity, and domain knowledge (musical sophistication).
Personal Characteristic (PC) Description
Big-Five Personality Traits [23, 24]
Openness to Experience (O) This trait, also called Openness, is related to one's cognitive style, distinguishing creative, imaginative people (high O) from down-to-earth, conventional people (low O).
Conscientiousness (C) This trait is associated with one's way of controlling, regulating, and directing impulses, distinguishing prudent people (high C) from impulsive people (low C).
Extroversion (E) This trait concerns the active level of engagement with the external world, distinguishing sociable, outgoing people (high E) from reserved, quiet people (low E).
Agreeableness (A) This trait reflects one's attitude toward cooperation and social harmony, distinguishing cooperative, sympathetic people (high A) from critical, tough people (low A).
Neuroticism (N) This trait describes one's tendency to experience negative feelings, distinguishing sensitive, easily upset people (high N) from calm, unflappable people (low N).
Trust Propensity (TP) [45] TP reflects one's general willingness to trust other people or technologies. People with high TP are naturally inclined to trust others, while people with low TP are hesitant.
Musical Sophistication (MS) [53] MS is related to one's ability to successfully engage with music. People with high MS are more flexible in responding to a great range of musical situations than are people with low MS.
Table 2: Descriptive statistics of participants’ personal characteristics (PCs).
PC Min Median Mean Max S.D.
O 2.00 5.00 5.01 7.00 1.15
C 2.00 5.25 5.19 7.00 1.19
E 1.00 3.25 3.29 7.00 1.54
A 2.00 5.00 4.94 7.00 1.10
N 1.00 3.50 3.51 6.50 1.53
TP 1.00 4.00 4.05 6.50 0.99
MS 1.44 4.22 4.25 6.89 1.03
Table 3: Pearson's correlations between the Big-Five personality traits, trust propensity, and musical sophistication.
PC O C E A N TP MS
O -
C 0.2858*** -
E 0.2189** 0.0894 -
A 0.2920*** 0.3321*** 0.1518 -
N -0.3112*** -0.3277*** -0.2375** -0.3954*** -
TP 0.1419 0.1854* 0.2668** 0.2729*** -0.1509 -
MS 0.2875*** 0.0702 0.2086* 0.0213 0.0326 0.0916 -
Significance: *** p <.001, ** p <.01, * p <.05.
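Pearson's r, as reported in Table 3, can be computed directly from its definition. The sketch below (with illustrative scores, not the study's data) shows the computation without external libraries; in practice such correlations are typically produced with standard statistical software:

```python
# Sketch: Pearson's r between two personal-characteristic score vectors,
# computed from its definition (covariance over the product of standard
# deviations). Scores below are illustrative only.
from math import sqrt

def pearson_r(x: list[float], y: list[float]) -> float:
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Four hypothetical participants; the second vector is a constant offset
# of the first, so r should be 1:
r = pearson_r([5.0, 4.5, 6.0, 3.0], [5.5, 5.0, 6.5, 3.5])
```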

3.5 Trust Measurement

In the post-study questionnaire, we measured users’ trust-related perception of the conversational music recommender in two main dimensions: Competence Perception and User Trust. Competence Perception refers to how users perceive the system's competence in assisting them in performing tasks, which contains the following three constructs derived from prior works [13, 39, 71]:

  • Perceived Recommendation Quality: This construct measures the system's ability to provide good recommendations to help users make decisions or support their exploration. Users may judge the quality of recommendations in terms of several aspects, e.g., accuracy, novelty, and serendipity [13, 39]. A previous study showed that users’ perceived recommendation quality influences their perceived usefulness of the system in helping them accomplish tasks, which consequently impacts user trust toward the system [13]. Thus, we considered this construct and measured it using questions from ResQue [13], a widely used user-centric evaluation framework for recommender systems.
  • Perceived Conversational Interaction: This construct measures the system's ability to effectively communicate with users to perform tasks during the interaction. Several aspects of conversational interaction are deemed crucial to CRSs [31], which include understandability, perceived control, interaction adequacy (i.e., ability to elicit and refine preferences [13]), and naturalness of the dialogue interaction. Because communication is the primary way people develop trust within interpersonal relationships [19], we hypothesize that users’ experience with conversational interaction will also influence the formation of user trust in the system. We measured this construct by adopting questions mainly from an evaluation framework for conversational agents [71].
  • Perceived Effort: This construct measures users’ perceived difficulty or ease in using the system for completing their tasks, which can reflect the effectiveness of the system in supporting users to accomplish tasks. When users perceive high effort in using the system to complete tasks, they may feel frustrated and show less trust [12, 13]. We used questions in ResQue [13] to measure this construct.

The User Trust dimension directly measures user trust in the CRS based on two constructs, each measured using one question item: Perceived Trust assesses users’ overall feelings of trust toward the conversational recommender, and Intention to Use measures users’ willingness to use the system in the future.

We assessed the validity of our constructs as measured by the question items (19 items in the initial questionnaire) by conducting confirmatory factor analysis (CFA) with the R library lavaan. In CFA, the items within the same scale are represented by a latent factor, where the loading of each item denotes how strongly that item is associated with the corresponding factor. We iteratively removed 5 items with low loadings (<0.50) or high cross-loadings, leaving 14 items in total (Table 4). All items were assessed on a 7-point Likert scale from 1 (strongly disagree) to 7 (strongly agree). Each factor had good internal consistency (Cronbach's α > 0.80), composite reliability (CR > 0.80), and convergent validity [Average Variance Extracted (AVE) > 0.50] [1], and the loading of each item exceeded the acceptable level of 0.50, with an overall good model fit [27]: χ2(51) = 86.283, p <.001; Root Mean Square Error of Approximation (RMSEA) = 0.068, Comparative Fit Index (CFI) = 0.967, Tucker-Lewis Index (TLI) = 0.957.
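For reference, CR and AVE follow standard formulas over the standardized loadings. The sketch below applies them to the two Perceived Effort loadings in Table 4; it is the textbook computation, not the authors' code, and the results differ slightly from the reported values because the loadings are rounded:

```python
# Sketch of the standard composite reliability (CR) and average variance
# extracted (AVE) formulas over standardized CFA loadings (not the authors'
# code). Cronbach's alpha, by contrast, requires the raw item responses.

def composite_reliability(loadings: list[float]) -> float:
    lam_sum = sum(loadings)
    error = sum(1 - l ** 2 for l in loadings)  # standardized error variances
    return lam_sum ** 2 / (lam_sum ** 2 + error)

def average_variance_extracted(loadings: list[float]) -> float:
    return sum(l ** 2 for l in loadings) / len(loadings)

# The two Perceived Effort loadings reported in Table 4:
effort = [0.8675, 0.8927]
cr = composite_reliability(effort)     # ~0.873
ave = average_variance_extracted(effort)  # ~0.775
```

Both values clear the conventional cutoffs cited above (CR > 0.80, AVE > 0.50).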

Table 4: Post-study questionnaire for measuring users’ trust-related perception of the conversational recommender.
Construct Item (each statement is rated on a 7-point Likert scale) Loadings
Competence Perception
Perceived Recommendation Quality (Cronbach alpha: 0.9001; CR: 0.8951; AVE: 0.6647)
The music chatbot helped me discover new songs. 0.7940
The songs recommended to me were novel. 0.5378
The music chatbot provided me with recommendations that I had not considered in the first place but turned out to be a positive and surprising discovery. 0.8457
The music chatbot provided me with surprising recommendations that helped me discover new songs that I wouldn't have found elsewhere. 0.9226
The music chatbot provided me with recommendations that were a pleasant surprise to me because I would not have discovered them somewhere else. 0.8728
Perceived Conversational Interaction (Cronbach alpha: 0.8668; CR: 0.8692; AVE: 0.5756)
I found the music chatbot easy to understand in this conversation. 0.7590
The music chatbot worked the way I expected it to in this conversation. 0.7950
I found it easy to inform the music chatbot if I dislike/like the recommended song. 0.6967
I felt in control of modifying my taste using this music chatbot. 0.7995
In this conversation, I knew what I could say or do at each point of the dialog. 0.7236
Perceived Effort (Cronbach alpha: 0.8712; CR: 0.8730; AVE: 0.7729)
Looking for a song using this interface required too much effort. 0.8675
I easily found the songs I was looking for. (reversed) 0.8927
User Trust
Perceived Trust This music chatbot can be trusted.
Intention to Use I will use this music chatbot again.

4 Analyses & Results

The three-layered trust model [26] indicates three types of factors that may influence user trust: user-related, system-related, and context-related factors. We conducted a series of analyses to investigate the influences of these factors on users’ trust-related perception of CRSs. First, we examined the relationship between Competence Perception and User Trust, and the impacts of user-related factors (i.e., the three personal characteristics) on these two dimensions (RQ1). For this purpose, we used structural equation modeling (SEM) to build a path model to test and evaluate multivariate causal relationships among the constructs in Table 4 and the effects of personal characteristics in an integrative structure.

Next, we investigated the impacts of personal characteristics in more depth, to determine whether and how user-related factors interact with the system-related factor (initiative strategy) and the context-related factor (task complexity) to influence Competence Perception and User Trust (RQ2 & RQ3). As it is relatively complicated to perform interaction effect analyses with multiple factors using SEM [25], we conducted an additional set of linear regression analyses to investigate the interaction effects.

4.1 User Trust in Conversational Recommender Systems

Figure 3 illustrates the results of the structural equation modeling (SEM) analysis, showing all significant paths in our model. The SEM model had overall good fit indices: χ2(123) = 182.312, p <.001; RMSEA = 0.057, CFI = 0.956, TLI = 0.947, which meet the recommended SEM fit standards.

Figure 3
Figure 3: Structural equation modeling (SEM) results. Two personality traits (conscientiousness and extroversion) influenced User Trust via Competence Perception, and trust propensity and musical sophistication directly affected User Trust. The numbers on the arrows represent the β coefficient and standard error (in parentheses) of the effect. Significance: *** p <.001, ** p <.01, * p <.05. R2 is the proportion of variance explained by the model. Factors are scaled to have a standard deviation of 1.

In the resulting model, the paths between the perception constructs (inside black rectangles) show how users’ perceptions of the system's competence influenced their trust in the CRS. Specifically, the significant paths (Perceived Recommendation Quality → Perceived Trust and Intention to Use; Perceived Conversational Interaction → Perceived Trust and Intention to Use) confirm the positive effects of users’ competence perception of the CRS on their trust in the CRS. Furthermore, the path coefficients indicate that Perceived Trust was affected more strongly by Perceived Conversational Interaction (coefficient = 0.695) than by Perceived Recommendation Quality (coefficient = 0.235). Our model also verifies the positive effect of Perceived Trust on Intention to Use [61]. Additionally, we observed an interesting path (Perceived Conversational Interaction → Perceived Effort → Perceived Recommendation Quality), showing that users’ perceptions of conversational interaction positively influenced their perceptions of recommendation quality, mediated by their perceived effort. These effects highlight the importance of considering Perceived Conversational Interaction for inspiring user trust in CRSs.
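The indirect path above (Perceived Conversational Interaction → Perceived Effort → Perceived Recommendation Quality) is, as in standard mediation analysis, the product of the two component path coefficients. The sketch below illustrates this on simulated data (not the study's data; the path values -0.5 and -0.6 are arbitrary assumptions), estimating the a- and b-paths with two OLS regressions; note that two negative paths yield a positive indirect effect, consistent with the direction reported here:

```python
import numpy as np

# Simulated mediation: pci lowers effort (a-path), and effort lowers
# recommendation quality (b-path), so the indirect effect a*b is positive.
rng = np.random.default_rng(1)
n = 148  # same sample size as the study; the data here are simulated
pci = rng.standard_normal(n)                  # perceived conversational interaction
effort = -0.5 * pci + rng.standard_normal(n)  # a-path (assumed -0.5)
prq = -0.6 * effort + rng.standard_normal(n)  # b-path (assumed -0.6)

def ols_slopes(X, y):
    """OLS slope coefficients of y on X (intercept included, then dropped)."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return beta[1:]

a = ols_slopes(pci, effort)[0]                           # pci -> effort
b = ols_slopes(np.column_stack([effort, pci]), prq)[0]   # effort -> prq, controlling pci
indirect = a * b
print(f"a = {a:.2f}, b = {b:.2f}, indirect effect = {indirect:.2f}")
```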

Moreover, our SEM model shows how personal characteristics influence the constructs of Competence Perception and User Trust. The results indicate that two personality traits (conscientiousness and extroversion) influenced User Trust via Competence Perception, whereas trust propensity and domain knowledge (musical sophistication) directly affected User Trust in the CRS.

  • Conscientiousness. The trait conscientiousness positively influenced users’ perceptions of conversational interaction: users with higher conscientiousness tended to have a better perception of their interaction with the conversational recommender.
  • Extroversion. The trait extroversion was positively related to users’ perceived recommendation quality. Users with higher extroversion tended to perceive higher system competence in recommending satisfying songs. One possible explanation is that compared with introverted users, extroverted users (who are more outgoing and vigorous [33]) are more willing to take risks and try listening to different music during the interaction, hence improving their perceptions of recommendations.
  • Trust Propensity. Trust propensity positively affected users’ perceptions of the conversational interaction and their intention to use. Namely, users who are more willing to trust others tended to enjoy the conversational interaction with the CRS and have a higher intention to use it again. People with a higher trust propensity (who tend to believe others are sincere and have good intentions [18]) may be more cooperative [28] with the system during the conversation, resulting in a more positive conversational experience.
  • Musical Sophistication. Regarding the influence of domain knowledge, we found that musical sophistication positively influenced users’ intention to use the CRS, suggesting that users with higher musical sophistication are more likely to use the conversational recommender in the future.

In addition to the user-related factors (personal characteristics), we investigated whether the system-related factor (initiative strategy) and the context-related factor (task complexity) directly influenced user trust in the model. Among these factors, task complexity negatively affected users’ perceived conversational interaction (p <.05), which may be attributed to the increased user effort required to perform a complex task.

4.2 Interaction Effects on User Trust

Inspired by previous studies [38, 54], we expected that individual users may perceive the two conversational recommenders (User-Initiative and Mixed-Initiative systems) differently, and may show different attitudes when performing the two user tasks (Simple Task and Complex Task), which may influence their formation of trust in the CRS. Therefore, we investigated how the user-related factors (personal characteristics) interact with the system-related factor (initiative strategy) and the context-related factor (task complexity) to influence user trust in the CRS. Specifically, we used linear regression models to handle the mix of numerical and categorical variables, with personal characteristics, initiative strategy, and task complexity as the independent variables, and the five trust-related perception constructs (Table 4) as the dependent variables. Table 5 presents the results of the regression models, showing how users’ trust-related perception is influenced by personal characteristics, initiative strategy, and task complexity, including their interaction effects (represented by interaction terms in the models). We report coefficients, standard errors, p-values, R2, and adjusted R2 values.
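Each interaction term in such a model is simply the product of a dummy-coded condition variable and a personal-characteristic score. A minimal sketch on simulated data (not the study's data; the coefficient values are arbitrary assumptions) showing how a model of this form is specified and fit by ordinary least squares:

```python
import numpy as np

# Simulate a 0/1 condition dummy (0 = User-Initiative, 1 = Mixed-Initiative)
# and a standardized trait score; the interaction term is their product.
rng = np.random.default_rng(0)
n = 148  # same sample size as the study; the data here are simulated
mixed = rng.integers(0, 2, n)    # initiative-strategy dummy
consc = rng.standard_normal(n)   # standardized conscientiousness
# Assumed ground truth: conscientiousness matters only under
# mixed-initiative (interaction coefficient 0.65, main effect 0).
y = 5.0 + 0.4 * mixed + 0.0 * consc + 0.65 * mixed * consc \
    + rng.standard_normal(n)

# Design matrix: intercept, two main effects, and the interaction term.
X = np.column_stack([np.ones(n), mixed, consc, mixed * consc])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print(dict(zip(["const", "mixed", "consc", "mixed_x_consc"],
               np.round(beta, 2))))
```

The fitted `mixed_x_consc` coefficient recovers the simulated interaction, which is the quantity the significance stars on the "Mixed Initiative x Conscientiousness" row of Table 5 test.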

Table 5: Regression models for estimating the interaction effects of personal characteristics with initiative strategy and task complexity on users’ trust-related perception constructs (as shown in Table 4) in the conversational recommender.
Perceived Recommendation Quality Perceived Conversational Interaction Perceived Effort Perceived Trust Intention to Use
Coef. (S.E.) Coef. (S.E.) Coef. (S.E.) Coef. (S.E.) Coef. (S.E.)
Mixed Initiative vs. User Initiative 0.408 (0.229). -0.064 (0.147) -0.183 (0.235) 0.054 (0.190) -0.003 (0.259)
Complex Task vs. Simple Task 0.222 (0.228) -0.290 (0.146) * 0.391 (0.234). -0.055 (0.189) -0.408 (0.258)
Openness 0.086 (0.199) -0.053 (0.128) -0.073 (0.205) 0.076 (0.165) -0.143 (0.225)
Conscientiousness -0.038 (0.182) 0.133 (0.117) 0.051 (0.187) -0.055 (0.151) 0.054 (0.207)
Extroversion -0.099 (0.149) -0.092 (0.095) 0.050 (0.153) 0.165 (0.123) -0.086 (0.168)
Agreeableness 0.131 (0.199) -0.151 (0.128) 0.078 (0.205) -0.088 (0.165) 0.162 (0.226)
Neuroticism 0.141 (0.153) -0.069 (0.098) -0.220 (0.157) 0.136 (0.127) -0.004 (0.173)
Trust Propensity 0.189 (0.248) 0.076 (0.159) -0.122 (0.255) 0.029 (0.206) 0.212 (0.281)
Musical Sophistication 0.782 (0.221) *** 0.189 (0.142) -0.399 (0.227). 0.280 (0.183) 0.620 (0.250) *
Mixed Initiative x Openness -0.013 (0.230) 0.241 (0.148) -0.218 (0.237) 0.066 (0.191) 0.128 (0.261)
Mixed Initiative x Conscientiousness 0.652 (0.208) ** 0.284 (0.134) * -0.388 (0.214). 0.372 (0.173) * 0.405 (0.236).
Mixed Initiative x Extroversion 0.067 (0.173) -0.064 (0.111) -0.061 (0.178) -0.239 (0.143). 0.130 (0.196)
Mixed Initiative x Agreeableness -0.327 (0.239) 0.167 (0.154) -0.196 (0.246) -0.016 (0.198) 0.047 (0.271)
Mixed Initiative x Neuroticism -0.049 (0.175) 0.175 (0.113) 0.002 (0.180) -0.087 (0.145) 0.179 (0.199)
Mixed Initiative x Trust Propensity -0.224 (0.261) -0.074 (0.167) 0.199 (0.268) -0.040 (0.216) -0.034 (0.296)
Mixed Initiative x Musical Sophistication -0.496 (0.236) * -0.143 (0.151) 0.377 (0.243) 0.069 (0.196) 0.010 (0.267)
Complex Task x Openness -0.169 (0.221) -0.104 (0.142) 0.324 (0.227) -0.055 (0.183) -0.002 (0.250)
Complex Task x Conscientiousness -0.526 (0.211) * -0.210 (0.136) 0.184 (0.217) 0.005 (0.175) -0.364 (0.239)
Complex Task x Extroversion 0.205 (0.165) 0.041 (0.106) 0.050 (0.170) -0.076 (0.137) 0.047 (0.187)
Complex Task x Agreeableness 0.354 (0.240) 0.151 (0.154) -0.206 (0.247) 0.304 (0.199) 0.009 (0.272)
Complex Task x Neuroticism -0.048 (0.167) -0.008 (0.107) 0.140 (0.172) 0.042 (0.138) -0.020 (0.189)
Complex Task x Trust Propensity 0.337 (0.260) 0.326 (0.167). -0.545 (0.267) * 0.220 (0.215) 0.596 (0.294) *
Complex Task x Musical Sophistication -0.524 (0.242) * 0.052 (0.155) -0.064 (0.249) -0.190 (0.200) -0.417 (0.274)
Constant 3.948 (0.207) *** 5.920 (0.133) *** 2.840 (0.213) *** 5.421 (0.172) *** 5.054 (0.235) ***
R2 0.314 0.314 0.243 0.249 0.301
Adjusted R2 0.186 0.187 0.102 0.110 0.171
Given that interaction effects are present in our regression models, we only interpret the interaction effects (highlighted in bold) because the interpretation of the main effects (i.e., the effect of one independent variable on the dependent variable) is incomplete or misleading [42].
Significance: *** p <.001, ** p <.01, * p <.05, . p <.1; Coef. stands for coefficient; S.E. stands for standard error.

4.2.1 Interaction Effects between Personal Characteristics and Initiative Strategy, Task Complexity. We detected a significant three-way interaction effect between the trait agreeableness, initiative strategy, and task complexity on users’ perceived conversational interaction. Specifically, when using the Mixed-Initiative system to accomplish the Complex Task, users’ agreeableness positively affected their perceptions of the conversational interaction (r = 0.40, p <.05, 95% confidence interval [CI]: [0.08, 0.64]). This suggests that system-initiated suggestions help users explore music, and that users with higher agreeableness are likely to have a better experience with such conversational interaction. However, no significant correlations were detected in the other three experimental conditions.
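The correlation confidence intervals reported in this section are standard Fisher z-transform intervals. A small sketch; the per-cell sample size of 37 used below is an assumption for illustration (148 participants spread over four conditions), not a figure reported in the paper:

```python
import math

def pearson_ci(r, n, z_crit=1.96):
    """95% CI for a Pearson correlation r via the Fisher z-transform."""
    z = math.atanh(r)               # Fisher z of r
    se = 1.0 / math.sqrt(n - 3)     # standard error of z
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)

# r = 0.40, as reported for agreeableness in the Mixed-Initiative /
# Complex Task cell; n = 37 is an assumed per-cell size (148 / 4).
lo, hi = pearson_ci(0.40, 37)
print(f"95% CI: [{lo:.2f}, {hi:.2f}]")  # close to the reported [0.08, 0.64]
```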

4.2.2 Interaction Effects between Personal Characteristics and Initiative Strategy. Table 5 shows significant interaction effects between initiative strategy and the two personal characteristics, conscientiousness and musical sophistication:

  • Conscientiousness. The models in Table 5 show significant interaction effects between the trait conscientiousness and initiative strategy on several trust-related perception constructs, including perceived recommendation quality, perceived conversational interaction, and perceived trust. Figures 4(a),  4(b) and 4(c) visualize these interaction effects. In the Mixed-Initiative system, users’ conscientiousness levels positively influenced their perceived recommendation quality (r = 0.36, p <.001, 95% CI: [0.15, 0.53]), perceived conversational interaction (r = 0.41, p <.001, 95% CI: [0.21, 0.57]), and perceived trust (r = 0.39, p <.001, 95% CI: [0.18, 0.56]). In contrast, in the User-Initiative system, the trait conscientiousness was not correlated with users’ trust-related perception. Conscientious users may be more cautious and consider more choices when making a decision [33], so they may be more inclined to appreciate the suggestions offered by the Mixed-Initiative system that can guide them to discover more music when finding songs of interest.
  • Musical Sophistication. As for domain knowledge, we detected an interaction effect between musical sophistication and initiative strategy on users’ perceived recommendation quality. As illustrated in Figure 4(d), users with higher musical sophistication tended to have a better perception of recommendations in the User-Initiative system (r = 0.21, p <.1, 95% CI: [-0.03, 0.43]), whereas in the Mixed-Initiative system, the level of musical sophistication did not have a significant influence. We also observed that users with lower musical sophistication tended to perceive higher recommendation quality in the Mixed-Initiative system than in the User-Initiative system, implying that the system's suggestions are more helpful for domain novices.
Figure 4
Figure 4: Interaction effects between personal characteristics and initiative strategy on users’ trust-related perception constructs. (a-c) Conscientiousness (C): Users with higher C tended to have a better perception and showed more trust in the Mixed-Initiative system. (d) Musical Sophistication (MS): Users with higher MS tended to perceive higher recommendation quality from the User-Initiative system.

4.2.3 Interaction Effects between Personal Characteristics and Task Complexity. From Table 5, significant interaction effects were detected between task complexity and three personal characteristics, conscientiousness, trust propensity, and musical sophistication:

  • Conscientiousness. We found a significant interaction effect between the trait conscientiousness and task complexity on users’ perceived recommendation quality. As visualized in Figure 5(a), a positive effect of conscientiousness on perceived recommendation quality was observed when users performed the Simple Task (r = 0.32, p <.01, 95% CI: [0.10, 0.51]), but no relationship was found for the Complex Task. Considered together with the results in Figure 4(a), this points to a crossover interaction: when users perform the Complex Task, their conscientiousness levels may influence their perceived recommendation quality differently depending on the system's initiative strategy (user-initiative or mixed-initiative).
  • Trust Propensity. Task complexity moderated the effects of trust propensity on users’ perceived effort and intention to use the conversational recommender. Specifically, users with higher trust propensity tended to perceive less effort when using the conversational recommender to perform the Complex Task (r = -0.34, p <.01, 95% CI: [-0.53, -0.12]), whereas no clear correlation was found for the Simple Task [see Figure 5(b)], probably due to the intrinsically lower user effort required for the Simple Task. Moreover, trust propensity positively influenced users’ intention to use the conversational recommender (also shown in Figure 3), and Figure 5(c) shows that this positive effect was stronger when users performed the Complex Task (r = 0.32, p <.01, 95% CI: [0.09, 0.52]) than the Simple Task (r = 0.27, p <.05, 95% CI: [0.04, 0.46]).
  • Musical Sophistication. A significant interaction effect was detected between musical sophistication and task complexity on users’ perceived recommendation quality. As shown in Figure 5(d), when performing the Simple Task, users with higher musical sophistication tended to have a more positive perception of recommendations than users with lower musical sophistication (r = 0.34, p <.01, 95% CI: [0.12, 0.52]), which could be due to the greater skill of musically sophisticated users in tuning recommendations to find songs that suit their tastes.
Figure 5
Figure 5: Interaction effects between personal characteristics and task complexity on users’ trust-related perception constructs. (a) Conscientiousness (C): C showed a positive effect on the users’ perceived recommendation quality for the Simple Task. (b-c) Trust Propensity (TP): The effects of TP on users’ trust-related perception were stronger for the Complex Task. (d) Musical Sophistication (MS): Users with higher MS tended to have a better perception of recommendations for the Simple Task.

Table 6 summarizes the effects of the three personal characteristics on user trust toward the conversational music recommenders and their interaction effects with the initiative strategy (User-Initiative and Mixed-Initiative) and with the task complexity (Simple Task and Complex Task). Overall, trust propensity and musical sophistication directly influenced users’ intention to use, and conscientiousness interacted with the initiative strategy to influence users’ perceived trust in the CRS.

Table 6: Summary of the major findings. The positive sign (+) and the negative sign (-) indicate significant positive effects and negative effects, respectively.
Big-Five Personality Traits
  • Conscientiousness. Direct effect (+): Perceived Conversational Interaction. Interaction with initiative strategy: (+) in Mixed-Initiative on Perceived Recommendation Quality, Perceived Conversational Interaction, and Perceived Trust. Interaction with task complexity: (+) in Simple Task on Perceived Recommendation Quality.
  • Extroversion. Direct effect (+): Perceived Recommendation Quality.
  • Agreeableness. Three-way interaction: (+) in Mixed-Initiative & Complex Task on Perceived Conversational Interaction.
Trust Propensity. Direct effect (+): Perceived Conversational Interaction; Intention to Use. Interaction with task complexity: (-) in Complex Task on Perceived Effort; (+) in Complex Task > Simple Task on Intention to Use.
Musical Sophistication. Direct effect (+): Intention to Use. Interaction with initiative strategy: (+) in User-Initiative on Perceived Recommendation Quality. Interaction with task complexity: (+) in Simple Task on Perceived Recommendation Quality.

5 DISCUSSION AND DESIGN IMPLICATIONS

In this research, we have sought to better understand user trust in conversational recommender systems (CRSs). By examining the relationships between users’ perceptions of system competence (especially recommendation quality and conversational interaction) and their trust, we found that users’ experience with conversational interaction was particularly important for inspiring user trust toward the conversational recommender (high β coefficients for the significant paths, as shown in Figure 3). Guided by the three-layered trust model [26], we investigated the influences of three types of factors (user-related, system-related, and context-related) on user trust in CRSs, in which we highlight the impacts of user-related factors (users’ Big-Five personality traits, trust propensity, and domain knowledge). This section will discuss the key findings of our study and their implications for designing trustworthy CRSs.

5.1 Key Findings

Key Finding #1: Users with higher conscientiousness have a better perception of system competence and show more trust toward the Mixed-Initiative system. Our results demonstrate that users with a higher level of conscientiousness have more positive perceptions of both the recommendations and the conversational interaction with the Mixed-Initiative system, engendering higher trust in the CRS [see Figures 4(a), 4(b) and 4(c)]. This finding is in line with previous studies showing that more conscientious people have higher trust in automation when conducting decision-making tasks [15, 16]. Highly conscientious users tend to be cautious and responsible [33], and may have maximizing tendencies (i.e., the tendency to explore and compare alternatives, and look for the best option) [51], which may make them appreciate system suggestions that help them become better informed and make a confident decision. This finding also suggests that individual differences in users’ decision-making style, i.e., maximizing (examining more alternatives to select the best option) and satisficing (settling for a good-enough option) [35, 67], may influence user trust in CRSs, which can be investigated in future research.

Design Implications: Trustworthy CRS design should consider users’ personality traits, especially conscientiousness. For users with higher conscientiousness, who like to carefully consider all facets before making a choice, the Mixed-Initiative system that supports both user-initiative and system-initiative interactions is more desirable. System-initiated guidance may support conscientious users in seeking alternatives and finding the “perfect” items among recommendations, hence fostering user trust toward the system. However, for users with lower conscientiousness, the level of system initiative can be lower, because these users tend to be casual and impulsive and might not appreciate extensive guidance from the system.

Key Finding #2: Users’ trust propensity positively influences user trust in conversational recommenders, but the degree of influence is affected by the task complexity. Our results confirm the positive effects of trust propensity on users’ perceptions of the conversational interaction and their intention to use, consistent with previous reports of the positive effect of one's general tendency to trust others or technology on trust in recommender systems [12, 72]. Moreover, the complexity of the performed task tends to strengthen this effect [see Figures 5(b) and 5(c)], suggesting a stronger influence of trust propensity when users perform the Complex Task. We found that users with higher trust propensity perceived much less effort and reported a higher intention to use the system than users with lower trust propensity, and this trend was more pronounced for the Complex Task than the Simple Task. We argue that, when performing a complex task, users with higher trust propensity are more likely to take advantage of an effective conversational interaction to indicate what they like or dislike and to obtain system guidance when they get stuck on a task. In contrast, as shown in our model (Figure 3), users with lower trust propensity benefit less from conversational interaction, which has a strong influence on user trust (in terms of both perceived conversational interaction and intention to use).

Design Implications: CRS researchers have attempted to improve recommendation quality and conversation interaction to build user trust in the system. However, previous studies have not adapted the design of trustworthy CRSs to users’ trust propensity. The “one size fits all” approach can be flawed because it assumes all users have the same trust propensity level. Thus, future design of CRSs could also consider users’ general tendency to trust technology. For example, the system may help users with lower trust propensity understand more about the system's ability and guide them to accomplish simple tasks in the initial period, which would improve their initial trust in the system's competence.

Key Finding #3: Users with stronger domain knowledge have a higher intention to use conversational recommenders and prefer to explore recommendations by themselves. Our results indicate that users with more domain knowledge (i.e., higher musical sophistication in our case) have a higher intention to use the CRS. Furthermore, users with a higher level of domain knowledge benefit more from the conversational interaction with recommendations, because they possess a greater ability to articulate their preferences than domain novices do [32]. In addition, system-initiated suggestions are more helpful for users with less domain knowledge when looking for recommendations. In contrast, domain-knowledgeable users tended to have a better perception when finding recommendations by themselves, probably because this type of user desires more control over their decisions [38].

Design Implications: This finding indicates that users’ domain knowledge level should be taken into account in the design of CRSs, because it influences users’ intention to use the system as well as their preferred initiative strategies. For example, the Mixed-Initiative system is more beneficial for novice users, as they may need more suggestions from the system to find recommendations that fit their interests. In contrast, the User-Initiative system might be sufficient for domain experts, because they often expect greater control over the interaction with the system and prefer less interruption from system-initiated guidance.

5.2 Limitations

Before concluding this paper, we highlight some limitations of our research. First, the factors that influence user trust in conversational systems are not limited to Competence Perception, which was the only dimension investigated in our study. Anthropomorphism [68] and security and privacy [20] are additional relevant dimensions of user trust. However, these dimensions are frequently discussed in the context of user trust in customer service chatbots and are influenced by additional personal characteristics, such as affective states [2] and privacy concerns [65]. To avoid added complexity, our trust model mainly considers the Competence Perception dimension of CRSs, namely perceived recommendation quality, perceived effort, and perceived conversational interaction. Second, recommender systems are applied in various domains, including media, e-commerce, and healthcare. However, we conducted our study with a CRS designed only for music recommendations, which may limit the generalizability of our findings to other domains. In light of differences in user involvement levels [14], user trust is more crucial in certain domains, such as e-commerce and healthcare. Future work will validate our findings in different CRS application domains. Third, we only considered a text-based CRS for this investigation, and the results may differ when users interact with a voice-based CRS. In future work, we plan to investigate whether our results are applicable to voice-based CRSs.

6 CONCLUSIONS

This study investigated the effects of the three types of factors (user-related, system-related and context-related) on user trust, grounded on the framework of Hoff and Bashir's three-layered trust model [26]. Our study demonstrated the main effects of user-related factors (personal characteristics) and their interaction effects with the system-related factor (initiative strategy) and the context-related factor (task complexity) on user trust in conversational recommender systems (CRSs). Our findings indicate that trust propensity and domain knowledge directly influence user trust. Moreover, personal characteristics, like conscientiousness and domain knowledge, can exert influences on user trust in CRSs with different initiative strategies (user-initiative and mixed-initiative).

Prior work on user trust toward traditional recommender systems [6, 12] has highlighted the significance of measuring competence perception based on recommendation quality, whereas we emphasize the importance of gauging perceived conversational interaction because it has a stronger influence on user trust in CRSs. As the initiative strategy influences the way users interact with the CRS, we also highlight the interaction effects of personal characteristics and initiative strategy on user trust. Our findings contribute to the research community of Human-AI interactions [4] and will be of interest to researchers who investigate the role of personalization in building user trust in conversational AI systems and the impacts of personal characteristics when developing trustworthy AI systems such as CRSs.

ACKNOWLEDGMENTS

The work was supported by Hong Kong Research Grants Council (RGC/HKBU12201620) and Hong Kong Baptist University IRCMS Project (IRCMS/19-20/D05). We also thank all participants for their time in taking part in our experiment and reviewers for their constructive comments on our paper.

REFERENCES

  • MR Ab Hamid, Waqas Sami, and MH Mohmad Sidek. 2017. Discriminant validity assessment: Use of Fornell & Larcker criterion versus HTMT criterion. In Journal of Physics: Conference Series, Vol. 890. 012163.
  • Gabriella Airenti. 2018. The development of anthropomorphism in interaction: Intersubjectivity, imagination, and theory of mind. Frontiers in psychology 9 (2018), 2136.
  • J.E. Allen, C.I. Guinn, and E. Horvtz. 1999. Mixed-initiative interaction. IEEE Intelligent Systems and their Applications 14, 5(1999), 14–23.
  • Saleema Amershi, Dan Weld, Mihaela Vorvoreanu, Adam Fourney, Besmira Nushi, Penny Collisson, Jina Suh, Shamsi Iqbal, Paul N. Bennett, Kori Inkpen, Jaime Teevan, Ruth Kikin-Gil, and Eric Horvitz. 2019. Guidelines for human-AI interaction. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems(CHI ’19). 1–13.
  • Izak Benbasat and Weiquan Wang. 2005. Trust in and adoption of online recommendation agents. Journal of the association for information systems 6, 3 (2005), 4.
  • Shlomo Berkovsky, Ronnie Taib, and Dan Conway. 2017. How to recommend? User trust factors in movie recommender systems. In Proceedings of the 22nd International Conference on Intelligent User Interfaces(IUI ’17). ACM, 287–300.
  • Michael K Buckland and Doris Florian. 1991. Expertise, task complexity, and the role of intelligent information systems. Journal of the American Society for Information Science 42, 9(1991), 635–643.
  • Adrian Bussone, Simone Stumpf, and Dympna O'Sullivan. 2015. The role of explanations on trust and reliance in clinical decision support systems. In 2015 International Conference on Healthcare Informatics. IEEE, 160–169.
  • Katriina Byström and Kalervo Järvelin. 1995. Task complexity affects information seeking and use. Information Processing & Management 31, 2 (1995), 191–213.
  • Wanling Cai and Li Chen. 2020. Predicting user intents and satisfaction with dialogue-based conversational recommendations. In Proceedings of the 28th ACM Conference on User Modeling, Adaptation and Personalization(UMAP ’20). 33–42.
  • Wanling Cai, Yucheng Jin, and Li Chen. 2021. Critiquing for music exploration in conversational recommender systems. In Proceedings of the 26th ACM Conference on Intelligent User Interfaces(IUI ’21). 480–490.
  • Li Chen and Pearl Pu. 2005. Trust building in recommender agents. In Proceedings of the Workshop on Web Personalization, Recommender Systems and Intelligent User Interfaces at the 2nd International Conference on E-Business and Telecommunication Networks. Citeseer, 135–145.
  • Li Chen and Pearl Pu. 2006. Evaluating critiquing-based recommender agents. In Proceedings of the 21st National Conference on Artificial Intelligence - Volume 1 (AAAI ’06). 157–162.
  • Li Chen and Pearl Pu. 2012. Critiquing-based recommenders: Survey and emerging trends. User Modeling and User-Adapted Interaction 22, 1-2 (2012), 125–150.
  • Shih-Yi Chien, Katia Sycara, Jyi-Shane Liu, and Asiye Kumru. 2016. Relation between trust attitudes toward automation, Hofstede's cultural dimensions, and big five personality traits. In Proceedings of the Human Factors and Ergonomics Society Annual Meeting, Vol. 60. SAGE Publications, 841–845.
  • Jin-Hee Cho, Hasan Cam, and Alessandro Oltramari. 2016. Effect of personality traits on trust and risk to phishing vulnerability: Modeling and analysis. In 2016 IEEE International Multi-Disciplinary Conference on Cognitive Methods in Situation Awareness and Decision Support (CogSIMA). IEEE, 7–13.
  • Konstantina Christakopoulou, Filip Radlinski, and Katja Hofmann. 2016. Towards conversational recommender systems. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD ’16). 815–824.
  • Jason A Colquitt, Brent A Scott, and Jeffery A LePine. 2007. Trust, trustworthiness, and trust propensity: A meta-analytic test of their unique relationships with risk taking and job performance. Journal of Applied Psychology 92, 4 (2007), 909.
  • Reinout E de Vries, Angelique Bakker-Pieper, Femke E Konings, and Barbara Schouten. 2013. The Communication Styles Inventory (CSI): A six-dimensional behavioral model of communication styles and its relation with personality. Communication Research 40, 4 (2013), 506–532.
  • Asbjørn Følstad, Cecilie Bertinussen Nordheim, and Cato Alexander Bjørkli. 2018. What makes users trust a chatbot for customer service? An exploratory interview study. In International Conference on Internet Science. 194–208.
  • Markus Freitag and Paul C Bauer. 2016. Personality traits and the propensity to trust friends and strangers. The Social Science Journal 53, 4 (2016), 467–476.
  • Chongming Gao, Wenqiang Lei, Xiangnan He, Maarten de Rijke, and Tat-Seng Chua. 2021. Advances and challenges in conversational recommender systems: A survey. AI Open 2 (2021), 100–126.
  • Samuel D Gosling, Peter J Rentfrow, and William B Swann Jr. 2003. A very brief measure of the Big-Five personality domains. Journal of Research in Personality 37, 6 (2003), 504–528.
  • David M Greenberg, Daniel Müllensiefen, Michael E Lamb, and Peter J Rentfrow. 2015. Personality predicts musical sophistication. Journal of Research in Personality 58 (2015), 154–158.
  • Jörg Henseler and Wynne W Chin. 2010. A comparison of approaches for the analysis of interaction effects between latent variables using partial least squares path modeling. Structural Equation Modeling 17, 1 (2010), 82–109.
  • Kevin Anthony Hoff and Masooda Bashir. 2015. Trust in automation: Integrating empirical evidence on factors that influence trust. Human factors 57, 3 (2015), 407–434.
  • Li-tze Hu and Peter M Bentler. 1999. Cutoff criteria for fit indexes in covariance structure analysis: Conventional criteria versus new alternatives. Structural Equation Modeling: A Multidisciplinary Journal 6, 1 (1999), 1–55.
  • Baptiste Jacquet, Alexandre Hullin, Jean Baratgin, and Frank Jamet. 2019. The impact of the Gricean maxims of quality, quantity and manner in chatbots. In 2019 International Conference on Information and Digital Technologies (IDT). 180–189.
  • Dietmar Jannach, Ahtsham Manzoor, Wanling Cai, and Li Chen. 2021. A survey on conversational recommender systems. ACM Computing Surveys (CSUR) 54, 5 (2021), 1–36.
  • Yucheng Jin, Wanling Cai, Li Chen, Nyi Nyi Htun, and Katrien Verbert. 2019. MusicBot: Evaluating critiquing-based music recommenders with conversational interaction. In Proceedings of the 28th ACM International Conference on Information and Knowledge Management (CIKM ’19). 951–960.
  • Yucheng Jin, Li Chen, Wanling Cai, and Pearl Pu. 2021. Key qualities of conversational recommender systems: From users’ perspective. In Proceedings of the 9th International Conference on Human-Agent Interaction (HAI ’21). 93–102.
  • Yucheng Jin, Nava Tintarev, and Katrien Verbert. 2018. Effects of individual traits on diversity-aware music recommender user interfaces. In Proceedings of the 26th Conference on User Modeling, Adaptation and Personalization. 291–299.
  • Oliver P John, Sanjay Srivastava, et al. 1999. The Big Five trait taxonomy: History, measurement, and theoretical perspectives. Handbook of Personality: Theory and Research 2 (1999), 102–138.
  • Michael Jugovac and Dietmar Jannach. 2017. Interacting with recommenders—overview and research directions. ACM Transactions on Interactive Intelligent Systems (TiiS) 7, 3 (2017), 1–46.
  • Michael Jugovac, Ingrid Nunes, and Dietmar Jannach. 2018. Investigating the decision-making behavior of maximizers and satisficers in the presence of recommendations. In Proceedings of the 26th Conference on User Modeling, Adaptation and Personalization (UMAP ’18). 279–283.
  • Dan Jurafsky and James H. Martin. 2000. Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition. Prentice Hall, Upper Saddle River, NJ.
  • Alan E Kazdin. 2000. Encyclopedia of Psychology. Vol. 8. American Psychological Association, Washington, DC.
  • Bart P Knijnenburg, Niels JM Reijmer, and Martijn C Willemsen. 2011. Each to his own: How different users call for different interaction methods in recommender systems. In Proceedings of the Fifth ACM Conference on Recommender Systems (RecSys ’11). 141–148.
  • Bart P Knijnenburg, Martijn C Willemsen, Zeno Gantner, Hakan Soncu, and Chris Newell. 2012. Explaining the user experience of recommender systems. User Modeling and User-Adapted Interaction 22, 4-5 (2012), 441–504.
  • Sherrie YX Komiak and Izak Benbasat. 2006. The effects of personalization and familiarity on trust and adoption of recommendation agents. MIS Quarterly 30, 4 (2006), 941–960.
  • Johannes Kunkel, Tim Donkers, Lisa Michael, Catalin-Mihai Barbu, and Jürgen Ziegler. 2019. Let me explain: Impact of personal and impersonal explanations on trust in recommender systems. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems (CHI ’19). ACM, 1–12.
  • Michael H Kutner, Christopher J Nachtsheim, John Neter, William Li, et al. 2005. Applied Linear Statistical Models. McGraw-Hill, New York.
  • Hui-Min Lai and Shin-Yuan Hung. 2012. Influence of user expertise, task complexity and knowledge management support on knowledge seeking strategy and task performance. In Pacific Asia Conference on Information Systems (PACIS). Citeseer, 37.
  • John D Lee and Katrina A See. 2004. Trust in automation: Designing for appropriate reliance. Human factors 46, 1 (2004), 50–80.
  • Matthew KO Lee and Efraim Turban. 2001. A trust model for consumer internet shopping. International Journal of Electronic Commerce 6, 1 (2001), 75–91.
  • Stephen Marsh and Mark R Dibben. 2003. The role of trust in information science and technology. Annual Review of Information Science and Technology (ARIST) 37 (2003), 465–498.
  • Roger C Mayer, James H Davis, and F David Schoorman. 1995. An integrative model of organizational trust. Academy of Management Review 20, 3 (1995), 709–734.
  • Robert R McCrae and Oliver P John. 1992. An introduction to the five-factor model and its applications. Journal of Personality 60, 2 (1992), 175–215.
  • D Harrison Mcknight, Michelle Carter, Jason Bennett Thatcher, and Paul F Clay. 2011. Trust in a specific technology: An investigation of its components and measures. ACM Transactions on Management Information Systems (TMIS) 2, 2 (2011), 1–25.
  • D Harrison McKnight, Larry L Cummings, and Norman L Chervany. 1998. Initial trust formation in new organizational relationships. Academy of Management Review 23, 3 (1998), 473–490.
  • Silvana Miceli, Valeria de Palo, Lucia Monacis, Santo Di Nuovo, and Maria Sinatra. 2018. Do personality traits and self-regulatory processes affect decision-making tendencies? Australian Journal of Psychology 70, 3 (2018), 284–293.
  • Martijn Millecamp, Nyi Nyi Htun, Yucheng Jin, and Katrien Verbert. 2018. Controlling Spotify recommendations: Effects of personal characteristics on music recommender user interfaces. In Proceedings of the 26th Conference on User Modeling, Adaptation and Personalization (UMAP ’18). 101–109.
  • Daniel Müllensiefen, Bruno Gingras, Jason Musil, and Lauren Stewart. 2014. The musicality of non-musicians: An index for assessing musical sophistication in the general population. PLoS ONE 9, 2 (2014), e89642.
  • Chelsea M. Myers, Anushay Furqan, and Jichen Zhu. 2019. The impact of user characteristics and preferences on performance with an unfamiliar voice user interface. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems (CHI ’19). 47:1–47:9.
  • Fedelucio Narducci, Pierpaolo Basile, Marco de Gemmis, Pasquale Lops, and Giovanni Semeraro. 2020. An investigation on the user interaction modes of conversational recommender systems for the music domain. User Modeling and User-Adapted Interaction 30, 2 (2020), 251–284.
  • Mahsan Nourani, Joanie King, and Eric Ragan. 2020. The role of domain expertise in user trust and the impact of first impressions with intelligent systems. In Proceedings of the AAAI Conference on Human Computation and Crowdsourcing, Vol. 8. 112–121.
  • John O'Donovan and Barry Smyth. 2005. Trust in recommender systems. In Proceedings of the 10th International Conference on Intelligent User Interfaces (IUI ’05). 167–174.
  • Umberto Panniello, Michele Gorgoglione, and Alexander Tuzhilin. 2016. Research note—In CARSs we trust: How context-aware recommendations affect customers’ trust and other business performance measures of recommender systems. Information Systems Research 27, 1 (2016), 182–196.
  • Eyal Peer, Laura Brandimarte, Sonam Samat, and Alessandro Acquisti. 2017. Beyond the Turk: Alternative platforms for crowdsourcing behavioral research. Journal of Experimental Social Psychology 70 (2017), 153–163.
  • Zhenhui Peng, Yunhwan Kwon, Jiaan Lu, Ziming Wu, and Xiaojuan Ma. 2019. Design and evaluation of service robot's proactivity in decision-making support process. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems (CHI ’19). 1–13.
  • Pearl Pu, Li Chen, and Rong Hu. 2011. A user-centric evaluation framework for recommender systems. In Proceedings of the Fifth ACM Conference on Recommender Systems (RecSys ’11). 157–164.
  • Minjin Rheu, Ji Youn Shin, Wei Peng, and Jina Huh-Yoo. 2021. Systematic review: Trust-building factors and implications for conversational agent design. International Journal of Human–Computer Interaction 37, 1 (2021), 81–96.
  • Francesco Ricci, Lior Rokach, Bracha Shapira, and Paul B. Kantor. 2015. Recommender Systems Handbook (2nd ed.). Springer-Verlag.
  • Julian B Rotter. 1971. Generalized expectancies for interpersonal trust. American Psychologist 26, 5 (1971), 443.
  • Rahime Belen Saglam, Jason RC Nurse, and Duncan Hodges. 2021. Privacy concerns in chatbot interactions: When to trust and when to worry. In International Conference on Human-Computer Interaction. Springer, 391–399.
  • Julian Sanchez, Wendy A Rogers, Arthur D Fisk, and Ericka Rovira. 2014. Understanding reliance on automation: effects of error type, error distribution, age and experience. Theoretical Issues in Ergonomics Science 15, 2 (2014), 134–160.
  • Barry Schwartz, Andrew Ward, John Monterosso, Sonja Lyubomirsky, Katherine White, and Darrin R Lehman. 2002. Maximizing versus satisficing: Happiness is a matter of choice. Journal of Personality and Social Psychology 83, 5 (2002), 1178.
  • Anna-Maria Seeger and Armin Heinzl. 2018. Human versus machine: Contingency factors of anthropomorphism as a trust-inducing design strategy for conversational agents. In Information Systems and Neuroscience. 129–139.
  • Yueming Sun and Yi Zhang. 2018. Conversational recommender system. In The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval (SIGIR ’18). 235–244.
  • Nava Tintarev and Judith Masthoff. 2015. Explaining recommendations: Design and evaluation. In Recommender Systems Handbook. Springer, 353–382.
  • Marilyn A. Walker, Diane J. Litman, Candace A. Kamm, and Alicia Abella. 1997. PARADISE: A framework for evaluating spoken dialogue agents. In Proceedings of the 35th Annual Meeting of the Association for Computational Linguistics and Eighth Conference of the European Chapter of the Association for Computational Linguistics (ACL ’98/EACL ’98). 271–280.
  • Weiquan Wang and Izak Benbasat. 2007. Recommendation agents for electronic commerce: Effects of explanation facilities on trusting beliefs. Journal of Management Information Systems 23, 4 (2007), 217–246.
  • Ye Diana Wang and Henry H Emurian. 2005. An overview of online trust: Concepts, elements, and implications. Computers in Human Behavior 21, 1 (2005), 105–125.
  • Longqi Yang, Michael Sobolev, Christina Tsangouri, and Deborah Estrin. 2018. Understanding user interactions with podcast recommendations delivered via voice. In Proceedings of the 12th ACM Conference on Recommender Systems (RecSys ’18). 190–194.
  • Yongfeng Zhang, Xu Chen, Qingyao Ai, Liu Yang, and W. Bruce Croft. 2018. Towards conversational search and recommendation: System ask, user respond. In Proceedings of the 27th ACM International Conference on Information and Knowledge Management (CIKM ’18). 177–186.
  • Jianlong Zhou, Simon Luo, and Fang Chen. 2020. Effects of personality traits on user trust in human–machine collaborations. Journal on Multimodal User Interfaces 14 (2020), 387–400.

FOOTNOTE

1Here, “interaction” is used as a statistical term. An interaction between A and B in their effect on Z means that A's influence on Z depends on the level of B, and vice versa [42].

2Based on our pilot test observations, it is reasonable for the system to offer suggestions after the user has consecutively skipped three songs or listened to five songs.

3 https://cloud.google.com/dialogflow/es/docs

4 https://developer.spotify.com/documentation/web-api

5 https://www.prolific.co/

6To ensure the quality of user responses, we included three attention-check questions (e.g., “Please indicate which of the following items is not a fruit.”).

7 http://lavaan.ugent.be/

8Hu and Bentler [27] suggest the following cutoffs as indicating good fit: CFI > .96, TLI > .95, and RMSEA < .05.

9Here we conducted Spearman's correlation analyses after detecting interaction effects to clearly show the relationship between a personal characteristic and a user perception construct in a particular condition. We followed this procedure to analyze all the detected interaction effects.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

CHI '22, April 29–May 05, 2022, New Orleans, LA, USA

© 2022 Association for Computing Machinery.
ACM ISBN 978-1-4503-9157-3/22/04…$15.00.
DOI: https://doi.org/10.1145/3491102.3517471