1 Introduction
The classroom is a constantly changing environment. Both course content and standards for best practices in teaching constantly evolve. Teachers must adapt to the needs of individual students while also adapting to the group dynamics that make every class unique. Teachers need to plan lessons, reflect on the effectiveness of their practices, and develop strategies to engage students. Feedback is critical for understanding what is working and finding opportunities for positive change [30, 76, 89]. Feedback can help teachers develop reflective practices toward improved student learning and more equitable participation. Unfortunately, continuous feedback for teachers can be hard to come by [1, 55].
Class observations in which a professional with expertise in teacher training observes one or two class periods can provide personalized, in-depth feedback for teachers about their teaching practices [1]. A review of studies found that such immediate feedback was most effective for specific and corrective changes to teaching behaviors [83]. However, observations often focus on performance evaluations of the teacher as an employee rather than personal growth [91]. In addition, more subtle teaching moves, such as conversational dynamics during class discussions, are difficult for a single observer to make sense of in real time. While expert teaching observations are more frequent for pre-service teachers, they are less common for in-service teachers, who would also benefit from continuous feedback. Video recordings offer an alternative, providing a more accurate recollection of events and rich opportunities for reflection [5, 7, 25, 92], but they require great effort. This includes setting up and testing recording equipment and curating actions worthy of reflection from recordings [5, 27, 40, 54, 88]. More generic forms of professional development (PD) such as workshops and seminars can scale and provide background on changing content and teaching standards, but they cannot provide the personalized and persistent guidance necessary for effective change [4, 55, 80, 88]. In addition, most teacher PD sessions are singular instances that do not provide opportunities for progress follow-ups on new techniques and approaches. Finally, PD largely focuses on standards or content rather than individual goals, strategies, and strengths [52, 55].
Learning scientists have recently investigated technology for observing teaching practices as a novel type of personalized PD, particularly for discourse pedagogy. Class discussion can give voice to students’ reasoning and provide teachers with a greater understanding of student cognition and learning processes [2, 16, 39, 82]. Prior work has found that even low-cost recording equipment could accurately capture and model conversational dynamics in the classroom [31, 56]. Several current systems, both research and commercial, use class audio recordings to provide teachers with automated feedback about their discussion practices [14, 26, 31, 36, 41, 56, 57]¹. Open questions remain about the design of such tools and their impact in practice, which are relevant to the HCI and educational technology communities for bridging research and practice. In our work, we build upon ClassInSight, a tool that visualizes classroom discussion data for personalized teacher PD and conversation support [41]. Our approach to designing ClassInSight incorporates data visualizations of discourse and collaborative reflection with a PD researcher to answer the following research questions:
• How did teachers use features of ClassInSight in reflections over time?
• What are the factors and barriers of adoption of discourse visualization tools?
As part of a collaborative design-based research (DBR) [20] project among three research institutions, we situate ClassInSight within middle and high school science teaching and guided reflection sessions in which teachers discuss their class discussion data with a PD researcher. This tool expands on the work of Gomoll et al. [41] and visualizes classroom discussion data at three different levels: the Talk Ratio, Turn-Taking, and Transcript (Figure 1). In addition to these visualizations, teachers follow a schema to structure their noticings and reflections. As part of a longitudinal deployment over 3 academic years, 5 middle and high school science teachers from a large city in the United States participated in 22 reflection sessions during which they engaged with their discussion data in the tool. At the end of the deployment, we conducted interviews with teachers to understand their experiences in using the prototype. From their interactions with the tool in reflection sessions and interviews, we found themes related to quantification, contextualization, shifting professional vision, and adoptability. We extend previous results [41, 94] by showing how interactions with data within ClassInSight impact reflections over time.
We make the following contributions. First, we provide an analysis of how the design of a data-driven discussion analysis tool, ClassInSight, impacts professional learning and reflection in teaching. Second, we contribute design implications that influence the adoption of conversational support tools in other professions that often lack frequent personalized feedback on professional interactions (e.g., clinicians, vets, mentors, therapists, trainers, police, advisors, lawyers, consultants). Our findings from the deployment of ClassInSight provide lessons learned both within and beyond teaching.
3 Research Context
This paper presents the latest cycle of a design-based research (DBR) project with collaborations between three research institutions. As part of a research-practice partnership, researchers and teachers collaborated in a long-term partnership to address problems of practice rather than problems of theory [19]. Our interdisciplinary research team consists of faculty, postdoctoral researchers, graduate students, and designers with expertise in learning science, human-computer interaction, design, natural language processing, and software development. We met weekly to discuss design directions and decisions. DBR is complementary to human-centered design as it involves iterative cycles of designing interventions and testing these interventions in educational contexts [20]. DBR addresses learning in authentic educational contexts beyond narrow measures of learning [21]. Our goal with this DBR project was to design a tool that helps teachers facilitate discussions in science classrooms. As developing discourse practice takes effort and time [71], we chose a longitudinal approach to examine teachers’ reflections and learning in depth over 3 academic years. This approach is rooted in HCI field trials and longitudinal studies for understanding users’ experience [37, 59]. This study was approved by the Institutional Review Board (IRB).
3.1 Teacher Participants
Across the 2019-2022 academic years, 5 in-service middle and high school science teachers (3 identified as female, 2 identified as male) from 5 schools in the same public school district in a large city in the southwestern United States participated in interviews, co-design sessions, reflection sessions, and prototype testing with our team. All teachers had at least 10 years of teaching experience (average 15.8 years) and taught various science subjects at the middle and high school levels. Table 1 shows demographic information for our participants and the pseudonyms we will use throughout the paper for each teacher. All teachers and the parents of students in their classes consented to participate in our research. From these interactions, we developed an early version of ClassInSight and deployed it with teachers in collaborative reflection sessions [41]. This paper discusses the latest design of our ClassInSight prototype and findings from reflection sessions and final interviews with teachers. Because part of this longitudinal project took place during the COVID-19 pandemic, a large part of our data collection in classrooms and with teachers took place over Zoom videoconferencing².
6 Findings
We connect the main themes of quantification, context, shifting professional vision, and adoptability to our research questions of how teachers used features of the tool in their reflections over time (Sections 6.1, 6.2, and 6.3) and the factors and barriers to adoption of discourse analysis tools (Section 6.4). Each quote from a reflection session is labeled with (RS) and the number of the reflection session (e.g., (RS2) denotes a quote from a teacher’s second reflection session).
6.1 Quantification of Discussion Data in the Talk Ratio Visualization
The Talk Ratio gave a quantified summary of teacher and student talk and the types of talk that occurred. Teachers noted this in their post-interviews. In particular, 2 teachers mentioned how quantification of discussion gave them insights into their facilitation practices. As Kate stated, “I do think it’s good to see the categorization and the amount of time or the relative amount of time...to see how much time is spent on the different types of communication, and I do like the bar graph indicating when there’s a lot more or a lot less.” Sheila thought the Talk Ratio visualization helped her see her class from an external perspective, “The whole point of being able to see and being able to have like that fishbowl of the dynamics in my classroom is really helpful.” As the first visualization teachers saw, the Talk Ratio gave a coarse overview of the discussion from which they could form initial impressions.
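To make this quantification concrete, the following is a minimal sketch of how a talk ratio of this kind could be computed. It is an illustration under stated assumptions, not the ClassInSight implementation: the `Turn` record, its field names, and the choice of turn duration (rather than, for example, word count) as the measure of talk are all hypothetical. The talk code labels are taken from teachers' quotes; the sample durations are invented for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Turn:
    """One coded stretch of talk (hypothetical record format)."""
    speaker: str    # assumed roles: "teacher" or "student"
    talk_code: str  # e.g. "Invite", "Build on Ideas"
    seconds: float  # assumed measure of talk: duration of the turn


def talk_ratio(turns):
    """Return each speaker role's share of total talk time, in percent."""
    totals = defaultdict(float)
    for turn in turns:
        totals[turn.speaker] += turn.seconds
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}  # no talk recorded, so no ratio to report
    return {role: 100.0 * secs / grand_total for role, secs in totals.items()}


def code_breakdown(turns):
    """Return each (speaker, talk code) pair's share of total talk time, in percent."""
    totals = defaultdict(float)
    for turn in turns:
        totals[(turn.speaker, turn.talk_code)] += turn.seconds
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {key: 100.0 * secs / grand_total for key, secs in totals.items()}


# Invented sample data, for illustration only.
sample = [
    Turn("teacher", "Invite", 30.0),
    Turn("student", "Build on Ideas", 18.0),
    Turn("teacher", "Other Teacher Talk", 52.0),
]

for role, share in talk_ratio(sample).items():
    print(f"{role}: {share:.0f}%")
```

A summary like this reports only how much each role talked; the breakdown by talk code is what distinguishes the types of talk within that total.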
6.1.1 Dissonance from the Talk Ratio Quantification.
Seeing the Talk Ratio often led to initial surprise or dissonance throughout all reflection sessions for teachers. For instance, Jeff mentioned, “It’s odd that this one was 82% [teacher talk] to 18% [student talk], because...I thought this one might be more [like] 60% to 30%. So I’m a little surprised that it went more the other way” (RS4). This dissonance continued into later reflections as well, “I was still surprised at how much of the talking came [from] my side because I thought that it was still much more the students, but it appears that it was still majority me according to this” (RS6). Similarly, Bonnie experienced dissonance from the Talk Ratio, “So, this is a little surprising to me to have less student talk this time, and I, in my head, I’m like, ‘Why?’” (RS4). This dissonance likely occurred because the data differed from teachers’ expectations about their talk practices. As Kate noted, “It’s always shocking to me how much time I spend talking” (RS2).
In post-interviews, teachers mentioned how the dissonance experienced from seeing the Talk Ratio provided a different perspective on their discussion. Jeff noted, “The thing that I found most striking is I’m always talking more than I think I am so to see the graph of it is really helpful to let me know how much I need to get the students talking more.” He further elaborated, “What seems like a normal amount seems like a lot...But it turns out that because a lot was only 15% before that, they were only talking 19% now, and it seems like they’re talking a lot more than usual. Then [I’ll see] the graph and say, ‘oh wow they were only talking 19%!’” Kate also mentioned this dissonance in terms of talk codes in the Talk Ratio, “There’s a personal perception and then there’s actually seeing the data, and so for me to see how much time I really did spend on Building and Connecting versus Evaluating or other talks helped a lot.” The Talk Ratio was a way for teachers to see how closely their expectations matched the discussion data.
6.1.2 Learning from Dissonance.
According to the theory of cognitive dissonance, dissonance leads to people’s need to resolve it [10]. In our study, teachers resolved their dissonance by adding to their beliefs about their talk and setting new discussion goals. Jeff noted, “I think in class, hopefully [the Talk Ratio] will get closer to 50/50...That might still be doable in the near future...But I’m still working on pausing more, giving students time to think and respond” (RS3). Two teachers talked about their discussion goals in terms of “moving the line” in the Talk Ratio visualization. Jeff said, “I still wanna move this line left and get more students talking” (RS4). Sheila also mentioned, “Toward the end of the year, that line shifts where it’s mostly, that’s where you want it to be, is that it’s mostly student” (RS2). Dissonance led teachers to think about whether and why their data confirmed or contradicted their expectations. Bonnie discussed, “I like that [the Talk Ratio] breaks it down by percentages so I can see,...is this where I want it to be? And it helps me start thinking about are these the results that I expected or are they different from the results I expected?” (RS2). Tom stated his expectations in terms of concrete percentages, “I would think that Building on Ideas should be an important part of a lesson...Build on Ideas, 4% doesn’t look very substantial” (RS3). In post-interviews, 2 teachers mentioned using the Talk Ratio to ask questions about their discussion behaviors. Tom stated, “I think it’s a positive thing when students are talking...What are the things we can do to get students to participate...in the class?” Sheila also said, “Am I giving them an appropriate amount of time and opportunity in order to answer questions? Am I actually asking questions that require them to answer using more than one word?” For these teachers, dissonance led to a desire for greater understanding about their discussion data.
6.2 Contextualization: Recalling and Understanding Dialogue Data in the Turn-Taking and Transcript Visualizations
From the quantification of talk in the Talk Ratio, teachers then used the Turn-Taking and Transcript visualizations to further understand and contextualize what occurred through recalling moments and gaining an understanding of discussion dynamics.
6.2.1 Recalling Classroom Events in the Transcript.
To understand what happened in a class, teachers used the Transcript visualization to recall specific moments. Jeff referred to specific lines in the Transcript, “And then...line 290, [I am] kind of just reassuring them again like, ‘Hey, like nothing to be ashamed about.’” (RS2). Kate also used the Transcript to recall and explain events, “So yeah, here’s where we’re getting into the article that we read, and yeah, so here’s where they’re actually starting to classify” (RS4). Tom added that he was sometimes surprised by the Transcript, “It’s always weird when I see what I actually say in class. I find it so strange because it’s totally different than my impression of what I’m saying” (RS3). Recollection of discussion provided teachers with context about what was said and the strategies used. Bonnie noted the value of this recollection, “The value that I see in...having a transcript...[is] to be able to look at the actual conversations...You still have to look at the actual content.” (RS4).
6.2.2 Understanding Classroom Dynamics in the Turn-Taking and Transcript.
Where the Transcript helped teachers recall moments in class, the Turn-Taking guided teachers towards where to look. As Kate noted, “I will say this turn-taking is kind of also eye-opening in terms of making a more visual representation of the percentages from the Talk Ratio” (RS2). She noticed, “There’s long stretches where I’m the only one talking a lot” (RS2). Tom identified “chunks” of teacher dialogue, “To start off, I seem to be saying larger chunks, and then there’s a period of time where [the chunks are] quite short and then they get longer again, towards the end of the class” (RS4). He noted how these chunks led him to examine the Transcript, “So [I’m] looking at the colors and...at what I’m saying, and then what the students are saying, and trying to relate those two, which I think is exactly what you would wanna do with this kind of information” (RS4). Jeff also noticed chunks in his discussion data, “So I’m seeing like big blocks of me [talking] still. But it seems like there’s some good back and forth” (RS2). Teachers used the Turn-Taking and Transcript to answer questions about their data. For example, Sheila looked at, “What part of the speaking is instruction? What part of the speaking would be giving directions?” (RS1). When Bonnie used the Turn-Taking visualization, she noticed, “I was talking for this long and there’s no student responses happening yet. And I’m scrolling down...and I go ‘Okay, something occurred here. Oh, what kind of questions was it or was something sticking out?’” (RS1). The Turn-Taking and Transcript visualizations together contextualized the quantification seen in the Talk Ratio to give further understanding about class discussions.
In post-interviews, teachers had generally positive feedback about the Turn-Taking and Transcript visualizations. They confirmed the value of the Transcript for recalling class events. As Tom said, “That’s what I find shocking is when I would read [the Transcript], I would immediately go ‘Oh! Now I know exactly where I was,’ I would just come back very quickly.” Kate felt the Transcript visualization was, “the most effective or most significant for me because it really helps me understand the way I was presenting the lesson and how much I was allowing for the students to participate...instead of how I thought I was doing, which aren’t always the same thing.” Teachers mentioned that the Turn-Taking visualization gave quick insights into the rhythm of discussion. Tom mentioned, “It was on [the Turn-taking] that you could click on [the graph] and that will take you to what was being said at that period of time. And I thought that was useful. It made sense to me [and] is easy for me to understand my lesson and access what I wanted to look at.” Sheila stated how the Turn-Taking gave visibility to her discussion facilitation, “Just by scanning through [the Turn-Taking], I think you can tell [if] you’re giving your students time to actually talk, if you’re preparing them well enough in order to discuss on their own, or giving them the opportunity to talk to each other to respond individually.” Kate also specifically mentioned the connection between the Turn-Taking and Transcript visualizations in the miniview, “I think this [miniview] on the left is very helpful and then I do think that seeing the dialogue as well is really helpful to see who’s actually engaged...so I think this page together is very helpful.” However, Jeff was critical of the two visualizations, stating, “I probably wouldn’t look at what [students] said...What they say really isn’t as important as how much they say.” He noted instead, “To me [the Turn-Taking] needs to be broken up by students so just one accumulated bar for student one, [another] the bar for student two because I want to see which student is talking how much.”
6.3 Shifting Professional Vision
Where contextualization refers to how teachers viewed and reflected on past data for understanding, we also observed shifts in teachers’ professional vision, in which they examined their data with an eye toward future actions.
6.3.1 Early Noticings: Quantification of Discussion.
Related to our findings about quantification, teachers often focused on the amount of student talk in early reflection sessions, which may reflect their level of noticing [95]. This led teachers to evaluate their data. Kate evaluated her use of talk codes, “The student breakout group shows...Making Reasoning Explicit and Building on Ideas. Those parts are definitely good” (RS1). Jeff evaluated the discussion cadence, “There was a good portion where it seemed like it was kind of back and forth. But maybe not fully reaching that goal, something that I still need to continue to be aware of” (RS2). Bonnie noted specific wording, “I think it was the wording in the question, and I know how I would do it differently next year for sure...I was hoping to get at least a few questions kind of like, you know, touched upon.” However, evaluation of data could also lead to negative feelings. In her second reflection session, Sheila said, “[I] still...feel so inadequate. It’s really sad because...it’s really hard to get them engaged” (RS2). We observed this in only one teacher, and because Sheila had only two reflection sessions, we were unable to see whether or how these feelings were resolved.
With an emphasis on noticing quantitative aspects of talk, teachers set quantitative goals to increase student talk in general in early reflection sessions. Bonnie stated her goal to “have less teacher talk and more student interaction because that’s where more learning takes place” (RS1). Jeff set goals around how many students spoke, “I’d say that the types of goals we want to set...would maybe be the number of people that interact, trying to reach a certain percentage of people interacting in conversations” (RS1). These goals reflected how teachers interpreted their data at their stage of professional learning.
6.3.2 Shifting Expectations of Data.
Over multiple reflection sessions in engaging with the tool, teachers’ professional vision shifted in their expectations of data. Early on, Bonnie did not know what to expect in her Turn-Taking visualization, “I didn’t have an expectation actually...No expectation, I didn’t know what I was gonna see” (RS2). By her last reflection, she knew what to look for in her data, “This is the kind of stuff I wait for, is what do they understand about the new concepts we’re getting into?” (RS5). Kate began to notice how her actions impacted discussion, “I was happy to see the Build on Ideas because...I had been purposeful about wanting to get them to remember what we’d done the week before. So it was nice to see that that was actually captured in the data” (RS4). Notably, Kate spoke about her expectations and strategies in terms of the talk codes. Jeff also showed this shift. In an early reflection, he stated, “So they’re making connections. That’s what I want” (RS2). During his last reflection session, he was more specific, “A lot of times before the students would say...very short answers, but now we’re getting some multiple sentence ideas...it seems like we’re getting more complete thoughts out of each student this way...When I had put them in small groups before, it didn’t look like this” (RS6). Over time, we saw how teachers had more directed noticings and expectations in their data.
We also observed two teachers set goals beyond quantitative talk goals. Bonnie set goals related to types of talk, “I wanted to put that as a goal for discussion, that they’re building on each other’s ideas or they’re using some of it to reformulate their own” (RS3). Kate deliberately shifted away from looking at the quantity of talk, “We had mentioned that we weren’t really gonna focus on the Talk Ratio” (RS3). She set a goal for more student evaluation, “I would like to get to the point where I could have students interacting with each other and have them evaluate each other” (RS3). However, Jeff preferred the Talk Ratio visualization through all his reflection sessions, “Mostly what tells me [that I’m moving forward] is where this line is in the middle...The Talk Ratio is probably still...the most useful thing for me.” All teachers changed in their expectations of data, but quantification of talk sometimes superseded a focus on other aspects of discussion.
6.4 Adoptability: Factors and Barriers for Adoption of Conversation Support Technology
In post-interviews, teachers addressed multiple aspects of conversation support tools and their implementation that would lead to or hinder adoption and continued usage of this sort of technology.
6.4.1 Personalization, Persistence, and Regularity of Data.
Teachers expressed frustration that prior PD experiences did not seem applicable to their own classrooms and contexts. Sheila noted, “[We] will get curriculum from people who have written the curriculum for our grade level...and we’re like who are the students that these people are writing this for? Definitely not mine.” Tom also said, “Teachers would be forced to change, but [the PD professionals] don’t know why a certain teacher is effective or why they’re successful with their students.” In contrast, teachers appreciated the personalized PD provided in our tool. Kate stated, “This is the only PD that’s been specific to me and my behavior in the classroom and my presentation of lessons...this was like getting into the nitty gritty of how I actually am in the classroom and no PD has ever even come close to that even when I get evaluated every two years.” Sheila thought, “Any data is helpful to see what I’m doing great and what I’m doing that I should be better at…This would be something that we have clear, concrete data as to…what did we do, are we meeting reaching our own personal goals that we’re actually putting forth?” Teachers felt the data within the tool provided personalized feedback where they could see their growth whereas generalized seminars did not.
Teachers also expressed frustration at the lack of follow-up and accountability from prior PD seminars and workshops. Sheila said, “PD that we have are all ‘here you go, implement’... Typically it’s a one and done...and there is not any real accountability as to whether or not you’re even doing anything.” Kate concurred, “Usually PD stops at the idea phase like ’oh here’s a great idea to engage your students, now go do it.’” Jeff also mentioned the lack of follow-through, “Our big complaint as teachers has been we never see [the PD] again, and we never analyze how we do things are changed.” These statements from teachers align with findings that PD is often neither personalized nor persistent [4, 55, 80, 88]. In contrast, teachers appreciated the persistence of multiple reflection sessions. As Tom said, “You guys have been more persistent…Because you keep on coming back and you keep on reminding me, ‘Okay, this is what we did last time.’” Jeff said this form of PD was “more about the progress in the journey...This process has been I think much more useful being long-term rather than these one and done things that most districts do.” Tom valued the accountability of the reflections, “What I’m getting out of this is it was requiring me to reflect, which teachers should do anyways…I know some percentage do, but I don’t think it’s the whole group.” Sheila added that separation from the district was an important factor, “This is not intrusive. [It’s] low risk, given that…it’s helping me stay accountable to me because I’m also accountable to you [as researchers], but you’re not accountable to my district.” However, challenges in data collection and coding led to large gaps between classes and reflection sessions. After long periods of time, the Transcript may be less effective in recalling class events. As Tom said, “There would be this delay in terms of...[the reflection session] and when the lesson was. So...I was asked to look at [the data] ahead of time, which I would do, but that would be an extra step...I’d be asked these questions and at the same time I’d be trying...to remember what was said.” We found that persistent reflection is valuable when sessions occur with reasonably consistent regularity. These findings suggest a need to restructure teacher PD to enable greater personalization and shift away from the “one-and-done” nature of current PD practices.
6.4.2 Learning Curve of Technology and Talk Codes.
Teachers described a learning curve in both using our prototype and understanding the talk codes. Some of the barriers came from a general resistance to technology. As Tom stated, “Teachers need to take advantage of the technology that’s available and they’re not. A lot of teachers are technology phobic and you know they’re being forced to learn how to use programs and stuff.” Bonnie described a learning curve specifically in using our prototype, “First of all, just navigating [the app]...that in itself is a learning curve. There’s two things happening here. One is how was the lesson? And the other is how is the app?...So there’s several moving parts here.” Tom also talked about challenges in learning to understand and interpret the data visualizations and talk codes, “So brown is Invite...I wasn’t quite sure how it was being decided that the snippets that I was reading, um, how that matched with invite. Because I would’ve thought that all of mine would’ve been brown, because I’m always asking the students, ’okay, what do you think?’” (RS5). Jeff also expressed confusion about the talk categories, “I’m sitting here looking back at the [legend] and saying, ’okay, that one’s Connect. And then over here, Evaluation.’ I’m thinking, how’s Evaluation different from Reasoning?” (RS6). This points to the challenge of developing a shared understanding of discussion theory and practice between researchers and teachers.
6.4.3 Granularity and Types of Data Presented.
Related to the learning curve of interpreting data within the tool was the granularity of data presented. Jeff mentioned the data in the Turn-Taking and Transcript visualizations was too fine-grained, “I thought that the very basic Talk Ratio is helpful, but a lot of the other things...were too much information and not really useful...The granular level of detail...[for] a teacher using a daily or weekly tool, it’s just too much information [and] too time consuming.” Bonnie found it challenging to navigate through specific talk codes, “This Other Teacher Talk is really interesting...I’d have to go through each [excerpt] to know” (RS2). She also felt that talk data did not capture the spectrum of student learning, “Just because they’re not saying [anything] doesn’t mean they’re not writing an elaborate report or talking about it with each other” (RS2). Jeff suggested measures related to individual student talk, “I’d like to see what’s the total [talk] for student one? What’s the total for student two?” (RS3). Tom wanted to see silence or wait time reflected, “If I didn’t say anything for 15 minutes, there would just be no timestamps. There’s no space indicating that there was 15 minutes of quiet time” (RS3). These different forms of data could help teachers reflect more deeply on their discussion.
6.4.4 Generalization of Discourse Visualization.
Beyond our reflection sessions, teachers mentioned other situations where personalized discourse analysis would be useful. Kate suggested collaboration with other teachers, “I would love to have...a colleague cohort working with [data], but I would use it, even if I was just on my own. I think it’s that helpful and valuable in terms of improving the quality of teaching.” Sheila thought discussion data could be useful for new teachers, “I would think for any new teacher...unless someone’s watching you and giving you feedback, you don’t know what you’re doing.” Teachers also mentioned discourse analysis for other settings. Tom stated how discussion data would be useful in supervision meetings, “I was supervising maybe four people and...you spend a lot of time thinking about...what they’re doing, what you want them to do, and how to...achieve that goal. So that’s an opportunity where...you would be interested in recalling specific conversations and you know what was said.” Kate also thought this data could be useful for staff meetings, “We have staff development dates coming up, and I would love to have this kind of data for staff development because I don’t think the facilitator, who is also the principal, really understands he talks 95% of the time…I would love to see it in any kind of a group meeting.” Jeff similarly thought, “I think it might be interesting if they looked at...this sort of thing but looking at professional development, and how teachers interact with instructors and administrators...It’s kind of funny how a lot of the [PD facilitators do] exactly the opposite of what they tell us to do.” Teachers found discourse analysis useful both for their own insights as well as other situations where interactions are ephemeral and subject to personal perceptions.
7 Discussion & Future Work
We posed two research questions: 1) How did teachers use features of the app in their reflections over time? and 2) What are the factors and barriers of adoption of discourse visualization tools? In answering them, we found that teachers used the data visualizations in our tool to understand their discussion practices and to plan future actions. The themes of quantification, contextualization, shifting professional vision, and adoptability cut across these questions. In this section, we discuss lessons learned from our longitudinal deployment and design implications that might generalize to other professions.
7.1 Design Implications
7.1.1 Resolving Dissonance.
The Talk Ratio visualization was designed to answer the question of how much teachers talked versus students. Several systems like the M-Powering Teachers tool [
26] and TeachFX [
36] include a similar visualization that quantifies discussion. In answering our first research question, we found that a visualization of talk ratio or talk percentage can be effective at sparking dissonance when the data does not match expectations. As dissonance can cause varying degrees of emotional reaction, it can lead to a desire for resolution [
10]. We found that over time, teachers in our study sought to resolve dissonance by seeking understanding in the Turn-Taking and Transcript visualizations. This is in line with prior findings that dissonance can motivate self-reflection and understanding [
6,
41]. However, dissonance may cause negative emotions that may not be resolved in a single reflection session. Sheila, who had only two reflection sessions, noted inadequacies after seeing her Talk Ratio data. In addition, dissonance in quantification of data could emphasize how much talk occurred rather than what types of talk occurred. The percentage in the Talk Ratio may invite an inherent evaluation of performance, whether positive or negative. Prior work in personal informatics systems also finds that users may be demotivated if their data shows they did not reach their goals [
44]. One possible direction to mitigate these reactions is to include positive feedback alongside the quantitative data, similar to the positive prompts the M-Powering Teachers tool provides [
26]. Another direction is margin-based design, which includes a range in which users could achieve their goals. Jung et al. [
58] found that setting goals within a margin allowed users to evaluate their behaviors as “good enough” rather than a failure. These directions could help to reduce negative emotions associated with dissonance and lead to productive reflection.
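As a rough illustration of the margin idea applied to talk ratio, the sketch below checks whether a session's teacher-talk share falls within a tolerance band around a target. The target, the margin, and the function names are hypothetical choices made for this example; they are not taken from Jung et al. [58] or from ClassInSight.

```python
def talk_ratio(teacher_sec, student_sec):
    """Share of total talk time taken by the teacher, between 0 and 1."""
    total = teacher_sec + student_sec
    if total <= 0:
        raise ValueError("no talk time recorded")
    return teacher_sec / total

def within_margin(ratio, target, margin):
    """True when the ratio is no further than `margin` from the target."""
    return abs(ratio - target) <= margin

# Hypothetical session: 22 minutes of teacher talk, 18 minutes of student talk.
ratio = talk_ratio(teacher_sec=1320, student_sec=1080)
good_enough = within_margin(ratio, target=0.50, margin=0.10)
```

Reporting "within range" instead of a raw distance from a single number is one way such a design could soften the evaluative reading of the percentage.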
7.1.2 Scaffolding Attention to Relevant Data.
Teachers generally found the Turn-Taking and Transcript visualizations useful for adding context to the Talk Ratio data and resolving potential dissonance. We noticed that teachers’ expectations of this data changed over time as they viewed these visualizations, representing a shift in their professional vision. However, some teachers found the data too granular. As a result, their early reflections and goals centered around the Talk Ratio visualization and quantification of talk. One potential reason is that the Talk Ratio was the first visualization teachers saw and was separated from the more closely integrated Turn-Taking and Transcript visualizations, making it more challenging to connect patterns between the three visualizations. A potential implication is creating layered visualizations that tell a narrative about teachers’ discussion data through storytelling elements that allow for both coarse-grained exploration and fine-grained explanation in data [
33,
67]. Data annotations that extract relevant points in the data can also scaffold sense-making in complex visualizations [
33,
51]. In our own work, we are currently building annotations into the data visualizations for teachers to better connect their reflection notes to specific points in the data.
A larger discussion is what “relevant” data means. As part of this research-practice partnership, our goal was to increase teachers’ usage of academically productive talk, and we designed our tool around this goal. Some teachers found the talk codes difficult to actually apply in their classroom discussions and instead mentioned aspects of discussion they found more relevant, such as how much individual students spoke or wait time. While researchers explained talk codes to teachers during reflection sessions, these codes and definitions were not co-constructed or designed with teachers; they were instead informed by prior research. A mismatch between the measures researchers value and the views of teachers (and students) may hinder adoption and acceptance of technology in practice [
75,
77]. Developing a shared understanding of discourse terminology and its meanings is an ongoing challenge in research-practice translation [
94]. Co-designing definitions or terminology that fit within teachers’ understanding of talk and discourse may improve how teachers interpret their discourse data. Another implication is to progressively reveal parts of the data that foreground and background elements of the visualizations according to teachers’ own professional vision. We found that teachers shifted their expectations of data and the goals they set from quantitative measures to characteristics of the discussion. Scaffolding professional vision through hints, notifications, or other guidance for what and when to analyze classroom data could provide adaptive support to teachers in examining their discussion data [
72]. This is an important area that requires close conversation and collaboration between teachers and research and design teams.
7.1.3 Discourse Analysis Tools for Professions Beyond Teaching.
Beyond teaching, many other professions rely on quality interactions between professionals and those they serve. These professions include those in healthcare, mental health, coaching, and customer service. Prior work on automated systems in the professions often focuses on productivity, such as systems in algorithmic management [
3,
61] and personal informatics systems to track workers’ time usage [
34]. Conversation analysis using instrumented sensors (such as microphones) to improve professional interactions is a growing area [
38,
48,
64,
65]. For instance, psychotherapists valued the automated feedback that the CORE-MI system provided about how they converse with patients [
48]. Liu et al. [
64,
65] found that a system that visualizes non-verbal behaviors improved medical students’ awareness of these behaviors in doctor-patient rapport. These professions, like teaching, have a set of established best practices and also lack ongoing, personalized feedback and PD. Teachers themselves mentioned other situations where conversation support could help generate dissonance towards behavioral change and help in recalling specifics about interactions. Our themes from this work could apply to professions for which discourse and interaction are key components of professional success, but are not easily quantified or evaluated. Future work could explore what these types of interfaces could look like in other professions for broader professional learning.
7.2 Challenges with Classroom Studies and Implications for Scaling Conversation Support
7.2.1 Challenges of Authentic Classroom Studies and Alternative Implementation Models.
Our model of reflection consisted of collaborative reflection sessions where teachers discussed their discussion data in depth with a PD researcher. This model is in line with instructional coaching and personalized teaching consultations where PD experts observe teachers’ progress over time and provide feedback [
28,
55]. Our findings are closely tied to the model in which they were situated: collaborative reflection sessions with a PD researcher that covered one class session in depth. While teachers appreciated the personalized and persistent PD provided, constraints in scheduling, the COVID-19 pandemic, and data processing and coding time meant that reflection sessions did not happen at consistent intervals. This irregularity may have impacted teachers’ reflection on the data, though we observed changes in teachers’ noticings and expectations of their data even with these challenges. This shows promise for different implementation models of reflection using discourse visualization tools. For example, because some schools may not have the resources to provide teachers with regular one-on-one PD, teachers mentioned possibly creating communities of practice between peers or mentorship communities between experienced and novice peers [
79]. A structural model might be setting time aside specifically for teaching reflection. School administrations could create these structures for teachers to record their own classes and take the time to reflect on their data themselves or with peers. This could provide teachers with agency in their reflection and incorporation of classroom technologies [
96].
7.2.2 Scaling Conversation Support with AI.
For self- or peer-regulated reflection structures to occur, automation of transcribing discussions and categorizing talk is necessary. In this current iteration, discourse was human-coded for ground truth accuracy, which is a labor-intensive and time-consuming process. Several automated models can classify discourse with accuracy on par with that of humans and can scale teacher feedback on discussion [
31,
56,
57,
84]. We are currently working on automated models that can reliably classify the discourse categorized in our tool. However, even with human coding, there are challenges with inter-rater reliability and agreement in talk categories [
18], and people in general may not trust AI judgments due to lack of transparency [
74,
102]. We found that teachers expressed confusion in how the talk codes were categorized even with human coding, which may impact trust and how teachers might perceive any feedback provided from an AI system. A mixed-initiative approach in which users can evaluate and refine automated outputs could create a collaborative feedback loop [
29]. Increasing transparency in how AI judgments are made, for example through explanations alongside confidence scores that indicate where models are potentially less accurate, could improve trust. Our own future work is exploring designs with code correction and confidence scores. Other work could expand on how teachers perceive the accuracy and usefulness of AI feedback.
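Two of the ideas in this subsection lend themselves to a small sketch: summarizing agreement between two coders (human or automated) with Cohen's kappa, and routing low-confidence automated codes to a person for correction. The code below is illustrative only; the talk-code labels, the `(utterance_id, code, confidence)` record format, and the 0.7 threshold are assumptions and do not reflect the coding scheme or any model used in this work.

```python
from collections import Counter

def cohens_kappa(codes_a, codes_b):
    """Cohen's kappa: chance-corrected agreement between two coders on the same items."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("coders must label the same non-empty set of items")
    n = len(codes_a)
    observed = sum(a == b for a, b in zip(codes_a, codes_b)) / n
    freq_a, freq_b = Counter(codes_a), Counter(codes_b)
    expected = sum((freq_a[c] / n) * (freq_b[c] / n) for c in freq_a)
    if expected == 1.0:
        return 1.0  # both coders used a single identical label throughout
    return (observed - expected) / (1 - expected)

def flag_for_review(predictions, threshold=0.7):
    """Return automated codes whose confidence is below the threshold."""
    return [p for p in predictions if p[2] < threshold]

# Hypothetical labels from a human coder and a model on four utterances.
human = ["teacher", "teacher", "student", "student"]
model = ["teacher", "student", "student", "student"]
kappa = cohens_kappa(human, model)

# Hypothetical model output: (utterance_id, code, confidence).
predictions = [(1, "teacher", 0.95), (2, "student", 0.55)]
to_review = flag_for_review(predictions)
```

In a mixed-initiative design, the flagged items would be the ones shown to a teacher or coder for confirmation or correction.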
7.2.3 Privacy Implications.
We focused on audio recordings of discussion for this work, which has limitations in capturing the full spectrum of learning behaviors. Multimodal data beyond audio (such as video or wearables) could capture both verbal and nonverbal behaviors [
31,
67,
68]. However, these modalities of data (including audio data) are part of a significant conversation around privacy concerns, particularly with the involvement of K-12 students and parental consent. State laws and district-level policies dictated how we collected classroom data and what types of data we could collect. We had multiple discussions with school leadership and our institution’s IRB to ensure informed consent from participants as well as compliance with laws and policies. As laws vary depending on location, navigating these restrictions could be a challenge in automated discourse support. In addition, sharing of data is an important consideration. In our study, teachers mentioned that the separation between the research team and administration was the reason why they were comfortable sharing their data with us. They may have felt differently if administration were more involved in the use and analysis of their data. Since schools, districts, or PD organizations are likely to be the stakeholders who purchase and implement these types of conversation support tools, future research on guidelines around the collection, use, and sharing of data is necessary to move this field forward.
7.3 Limitations
There are several limitations to this work. The COVID-19 pandemic caused major disruptions to our data collection workflow. As classes were shifted online, data collection paused for a portion of the 2020-2021 school year and resumed during online teaching. While online teaching did make audio recording easier due to built-in recording functions (e.g., the Record function in Zoom), it also led to significant differences in classroom discussion behaviors. Teachers reported that their students engaged far less in discussion than in in-person classes. As a result, there were large discrepancies in teacher versus student talk for courses recorded online, which may have impacted how teachers reflected on their data towards their discussion goals during this period. In addition, the regularity with which reflection sessions could be scheduled was impacted by the teachers’ schedules, time constraints of talk coding, and the pandemic. This meant that reflection sessions sometimes occurred months after data was collected, which could affect teachers’ memory of the specific class and how they might take action towards their discussion goals. Student behavioral and learning outcomes in class over time from reflections were not in the scope of this paper, but these would likely influence how teachers engaged with their data. Lastly, our sample of teachers was small, and one of the five teachers in our study was unable to participate in interviews. However, the reflection sessions provide a rich longitudinal data set for understanding teachers’ reflections on their discussion data in ClassInSight.