WO2014069444A1 - Complaint conversation determination device and complaint conversation determination method - Google Patents

Complaint conversation determination device and complaint conversation determination method

Publication number
WO2014069444A1
WO2014069444A1 PCT/JP2013/079235 JP2013079235W WO2014069444A1 WO 2014069444 A1 WO2014069444 A1 WO 2014069444A1 JP 2013079235 W JP2013079235 W JP 2013079235W WO 2014069444 A1 WO2014069444 A1 WO 2014069444A1
Authority
WO
WIPO (PCT)
Prior art keywords
utterance
conversation
section
dissatisfaction
hold
Prior art date
Application number
PCT/JP2013/079235
Other languages
French (fr)
Japanese (ja)
Inventor
真宏 谷
祥史 大西
真 寺尾
岡部 浩司
Original Assignee
日本電気株式会社
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 日本電気株式会社
Priority to JP2014544515A
Publication of WO2014069444A1

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M3/00 Automatic or semi-automatic exchanges
    • H04M3/42 Systems providing special services or facilities to subscribers
    • H04M3/50 Centralised arrangements for answering calls; Centralised arrangements for recording messages for absent or busy subscribers; Centralised arrangements for recording messages
    • H04M3/51 Centralised call answering arrangements requiring operator intervention, e.g. call or contact centers for telemarketing
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L25/51 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
    • G10L25/63 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination for estimating an emotional state
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/78 Detection of presence or absence of voice signals
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M2201/00 Electronic components, circuits, software, systems or apparatus used in telephone systems
    • H04M2201/40 Electronic components, circuits, software, systems or apparatus used in telephone systems using speech recognition
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M2203/00 Aspects of automatic or semi-automatic exchanges
    • H04M2203/20 Aspects of automatic or semi-automatic exchanges related to features of supplementary services
    • H04M2203/2038 Call context notifications

Definitions

  • The present invention relates to a conversation analysis technique.
  • An example of a technology for analyzing conversation is a technology for analyzing call data.
  • For example, data of calls handled in a department called a call center or a contact center is analyzed.
  • Hereinafter, such a department, which specializes in responding to customer calls such as inquiries, complaints, and orders regarding products and services, is referred to as a contact center.
  • Patent Document 1 proposes a technique in which emotion is recognized from the voice of a customer's phone call, whether the content of the call is a complaint is determined by whether the emotion represents at least one of “anger” and “excitement”, and an appropriate person in charge is notified according to the determination result.
  • Patent Document 2 proposes a method that, in order to detect the speaker's anger and irritation without being affected by individual, language, and regional differences, detects periodic fluctuations in the amplitude envelope of the input speech signal and discriminates, according to the detection result, whether the input voice is a strong voice.
  • Patent Document 3 proposes a method that, in order to provide information that meets customer needs, determines whether a keyword set in advance is spoken during a call between a call center operator and a customer, thereby grasping the customer's potential needs and providing the customer with guidance information associated with the keyword in advance.
  • A method has also been proposed in which a time limit for each call partner is calculated from information about the caller or receiver whose call is put on hold, and the operator is notified of progress relative to the time limit and of the call partner's psychological state according to the result of a comparison with the time limit.
  • Patent Document 4 proposes determining whether the call partner has lost composure while the call is on hold.
  • However, these methods determine only dissatisfaction caused by the waiting time during a call hold; detecting dissatisfaction with the call itself is not contemplated.
  • The present invention has been made in view of such circumstances, and provides a technique for appropriately detecting the degree of dissatisfaction with a conversation.
  • Here, the degree of dissatisfaction of a conversation means the degree of dissatisfaction felt by a person participating in the conversation (hereinafter referred to as a conversation participant). It may indicate only whether or not dissatisfaction is felt.
  • The first aspect relates to a dissatisfied conversation determination device.
  • The dissatisfied conversation determination device includes a data acquisition unit that acquires voice data of a conversation and hold section data indicating the start time and end time of a hold section of the conversation, and a dissatisfaction level determination unit that determines the dissatisfaction level of the conversation by performing a predetermined voice analysis process on the voice data of the held-side conversation participant in the hold section indicated by the hold section data, out of the voice data acquired by the data acquisition unit.
  • The second aspect relates to a dissatisfied conversation determination method executed by at least one computer.
  • The dissatisfied conversation determination method includes acquiring voice data of a conversation and hold section data indicating the start time and end time of a hold section of the conversation, and determining the degree of dissatisfaction of the conversation by performing a predetermined voice analysis process on the voice data of the held-side conversation participant in the hold section indicated by the hold section data, out of the acquired voice data.
  • Another aspect of the present invention may be a program that causes at least one computer to implement each configuration in the first aspect, or a computer-readable recording medium on which such a program is recorded.
  • This recording medium includes a non-transitory tangible medium.
  • The dissatisfied conversation determination device includes a data acquisition unit that acquires voice data of a conversation and hold section data indicating the start time and end time of a hold section of the conversation, and a dissatisfaction level determination unit that determines the dissatisfaction level of the conversation by performing a predetermined voice analysis process on the voice data of the held-side conversation participant in the hold section indicated by the hold section data, out of the voice data acquired by the data acquisition unit.
  • The dissatisfied conversation determination method is executed by at least one computer, and includes acquiring voice data of a conversation and hold section data indicating the start time and end time of a hold section of the conversation, and determining the degree of dissatisfaction of the conversation by performing a predetermined voice analysis process on the voice data of the held-side conversation participant in the hold section indicated by the hold section data, out of the acquired voice data.
  • Here, a conversation means that two or more speakers talk with each other, expressing their intentions by uttering language.
  • Conversation participants may talk face to face, as at a bank counter or a store cash register, or remotely, as in a telephone call or a video conference.
  • The hold section means a section in which the conversation is on hold.
  • Holding a conversation means that at least one of a plurality of conversation participants having a conversation does not participate in the conversation.
  • The remaining at least one conversation participant is referred to as a held-side conversation participant.
  • The held-side conversation participant is the conversation participant who, the conversation having been put on hold, waits for the hold state to be released.
  • A state in which a conversation participant does not participate in the conversation is a state in which that participant has deliberately stepped away from the conversation, for example by leaving the seat where the conversation takes place or by putting a call on hold. An example of a hold section is therefore the period during which, while two customers and a bank representative are talking at a bank counter, the bank representative is away from the seat.
  • In this case, the held-side conversation participants are the customers waiting for the bank representative.
  • The present inventors have found that a held-side conversation participant voices true feelings only while the conversation is on hold. From this, they found that an utterance made by the held-side conversation participant during a hold is highly likely to express that participant's dissatisfaction. For example, even a held-side conversation participant who otherwise responds calmly may voice dissatisfaction during the hold. A situation in which dissatisfaction is likely to surface during a hold can arise in any kind of conversation, including calls. For example, customers may talk about their dissatisfaction while the person in charge is away from the seat.
  • Therefore, in the present embodiment, the hold section data and the voice data of the conversation are acquired, and a predetermined voice analysis process is performed on the voice data of the held-side conversation participant in the hold section specified by the hold section data. The degree of dissatisfaction with the conversation is thereby determined.
  • The voice data of the conversation acquired in the present embodiment includes the voice data of the held-side conversation participant while the conversation is on hold.
  • the predetermined speech analysis processing is, for example, emotion recognition processing as proposed in Patent Documents 1 and 2, speech recognition processing for extracting character data indicating the contents from speech data, and the like.
  • the present embodiment does not limit the predetermined voice analysis process itself for determining the degree of dissatisfaction. Further, the degree of dissatisfaction of the call determined in the present embodiment may indicate only whether or not the held-side conversation participant expressed dissatisfaction, or the degree of dissatisfaction may be indicated by a numerical value.
  • Thus, in the present embodiment, the voice data of the held-side conversation participant during the hold is the target of the analysis processing for dissatisfaction determination.
  • Because the analysis is limited to this section, the dissatisfaction determination of each conversation can be executed at high speed.
  • In addition, because voice data in a section where dissatisfaction is likely to be expressed is the target of the analysis, the dissatisfaction is prevented from being averaged out over the entire conversation. As a result, the degree of dissatisfaction with the conversation can be determined appropriately.
  • Examples of conversation data include data representing a conversation between a person in charge and a customer at a bank counter or a store cash register.
  • Here, a call means the call between two parties from the time it is connected until it is disconnected.
  • The hold section means a section during a call in which the call is put on hold.
  • FIG. 1 is a conceptual diagram showing a configuration example of a contact center system 1 in the first embodiment.
  • the contact center system 1 in the first embodiment includes an exchange (PBX) 5, a plurality of operator telephones 6, a plurality of operator terminals 7, a file server 9, a call analysis server 10, and the like.
  • The call analysis server 10 corresponds to the dissatisfied conversation determination device in the above-described embodiment.
  • The customer corresponds to the above-mentioned held-side conversation participant, and the operator corresponds to the conversation participant on the holding side.
  • the exchange 5 is communicably connected to a call terminal 3 such as a PC, a fixed phone, a mobile phone, a tablet terminal, or a smartphone that is used by a customer via the communication network 2.
  • the communication network 2 is a public network such as the Internet or a PSTN (Public Switched Telephone Network), a wireless communication network, or the like.
  • the exchange 5 is connected to each operator telephone 6 used by each operator of the contact center. The exchange 5 receives the call from the customer and connects the call to the operator telephone 6 of the operator corresponding to the call.
  • Each operator uses an operator terminal 7.
  • Each operator terminal 7 is a general-purpose computer such as a PC connected to a communication network 8 (LAN (Local Area Network) or the like) in the contact center system 1.
  • each operator terminal 7 records customer voice data and operator voice data in a call between each operator and the customer.
  • Each operator terminal 7 also records voice data of customers who are on hold.
  • the customer voice data and the operator voice data may be generated by being separated from the mixed state by predetermined voice processing. Note that this embodiment does not limit the recording method and the recording subject of such audio data.
  • Each voice data may be generated by a device (not shown) other than the operator terminal 7.
  • Each operator terminal 7 detects an operator's hold operation and hold release operation on the operator telephone 6. Thereby, each operator terminal 7 generates hold section data indicating the start time and end time of the hold section in each call.
  • However, this embodiment does not limit how the hold section data is generated.
  • For example, the operator terminal 7 may generate the hold section data by detecting the hold sound included in the operator's voice data, without detecting the operator's hold operation and hold release operation.
  • The hold section data may also be generated by a device (not shown) other than the operator terminal 7.
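The pairing of hold and hold-release operations into hold section data can be sketched as follows. This is only an illustration under stated assumptions: the event representation (a timestamp in seconds plus the strings "hold" and "release") and the function name `build_hold_sections` are hypothetical and do not come from the patent text.

```python
from typing import List, Tuple

# Hypothetical event format: (time in seconds from call start, "hold" or "release").
Event = Tuple[float, str]
Section = Tuple[float, float]  # (start time, end time)


def build_hold_sections(events: List[Event]) -> List[Section]:
    """Pair each hold operation with the next release operation."""
    sections: List[Section] = []
    hold_start = None
    for time, kind in sorted(events):
        if kind == "hold" and hold_start is None:
            hold_start = time  # a hold section begins
        elif kind == "release" and hold_start is not None:
            sections.append((hold_start, time))  # the hold section ends
            hold_start = None
    # A hold that is never released is ignored in this sketch.
    return sections


events = [(10.0, "hold"), (25.0, "release"), (40.0, "hold"), (41.5, "release")]
print(build_hold_sections(events))
```

A call may contain several hold sections, which is why the sketch returns a list rather than a single pair.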
  • the file server 9 is realized by a general server computer.
  • the file server 9 stores the call data of each call between the customer and the operator together with the identification information of each call.
  • Each call data includes a pair of customer voice data and operator voice data, and the above-described hold section data.
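One possible in-memory representation of a call data record is sketched below. The class name `CallData`, its field names, and the choice of sample lists with a shared sample rate are assumptions made for illustration; the text only states that call data holds a pair of voice data, hold section data, and identification information.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class CallData:
    call_id: str                              # identification information of the call
    customer_audio: List[float]               # customer voice samples (assumed format)
    operator_audio: List[float]               # operator voice samples (assumed format)
    sample_rate: int                          # samples per second (assumed shared)
    hold_sections: List[Tuple[float, float]]  # (start, end) times in seconds

    def customer_audio_in_hold(self, index: int) -> List[float]:
        """Return the customer samples that fall inside one hold section."""
        start, end = self.hold_sections[index]
        first = int(start * self.sample_rate)
        last = int(end * self.sample_rate)
        return self.customer_audio[first:last]


call = CallData("call-001", [float(i) for i in range(10)], [0.0] * 10, 2, [(1.0, 3.0)])
print(call.customer_audio_in_hold(0))
```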
  • the file server 9 acquires customer voice data and operator voice data from another device (each operator terminal 7 or the like) that records each voice of the customer and the operator. Further, the file server 9 acquires the hold section data from each operator telephone 6, each operator terminal 7, the exchange 5 and the like.
  • the call analysis server 10 analyzes the customer dissatisfaction state with respect to each call data stored in the file server 9.
  • the call analysis server 10 includes a CPU (Central Processing Unit) 11, a memory 12, an input / output interface (I / F) 13, a communication device 14 and the like as a hardware configuration.
  • the memory 12 is a RAM (Random Access Memory), a ROM (Read Only Memory), a hard disk, a portable storage medium, or the like.
  • the input / output I / F 13 is connected to a device that accepts an input of a user operation such as a keyboard and a mouse, and a device that provides information to the user such as a display device and a printer.
  • the communication device 14 communicates with the file server 9 and the like via the communication network 8. Note that the hardware configuration of the call analysis server 10 is not limited.
  • FIG. 2 is a diagram conceptually illustrating a processing configuration example of the call analysis server 10 in the first embodiment.
  • The call analysis server 10 includes a data acquisition unit 20, an utterance detection unit 21, a hold section analysis unit 23, a target determination unit 24, a dissatisfaction level determination unit 25, and the like.
  • Each of these processing units is realized, for example, by the CPU 11 executing a program stored in the memory 12. The program may be installed from a portable recording medium such as a CD (Compact Disc) or a memory card, or from another computer on the network, via the input / output I / F 13, and stored in the memory 12.
  • the data acquisition unit 20 acquires the call data of the call to be analyzed from the file server 9 together with the identification information of the call.
  • the acquired call data includes a pair of customer voice data and operator voice data, and hold section data.
  • the call data may be acquired by communication between the call analysis server 10 and the file server 9, or may be acquired via a portable recording medium.
  • The utterance detection unit 21 detects the customer's utterance sections within the hold section indicated by the hold section data, from the customer's voice data included in the call data acquired by the data acquisition unit 20. For example, the utterance detection unit 21 detects, as an utterance section, a section in which an amplitude equal to or greater than a predetermined value continues in the voice waveform represented by the customer's voice data in the hold section.
  • Detecting an utterance section means detecting a section of the voice data that corresponds to one utterance of the customer, whereby the start time and end time of that section are acquired.
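The amplitude-based detection described above can be sketched as follows. This is a simplified illustration, not the patent's implementation: the frame-wise peak check, the parameter names (`amp_threshold`, `frame_sec`, `min_frames`) and their default values are assumptions introduced here.

```python
from typing import List, Sequence, Tuple


def detect_utterance_sections(
    samples: Sequence[float],
    sample_rate: int,
    hold_start: float,
    hold_end: float,
    amp_threshold: float = 0.1,  # assumed "predetermined value"
    frame_sec: float = 0.01,     # assumed frame length for the peak check
    min_frames: int = 3,         # assumed minimum run length for "continues"
) -> List[Tuple[float, float]]:
    """Return (start, end) times of loud runs inside the hold section."""
    frame_len = max(1, int(round(sample_rate * frame_sec)))
    first = int(hold_start * sample_rate)
    last = min(len(samples), int(hold_end * sample_rate))
    sections: List[Tuple[float, float]] = []
    run_start = None
    pos = first
    while pos < last:
        frame = samples[pos:min(pos + frame_len, last)]
        loud = max(abs(x) for x in frame) >= amp_threshold
        if loud and run_start is None:
            run_start = pos  # a loud run begins
        elif not loud and run_start is not None:
            if pos - run_start >= min_frames * frame_len:
                sections.append((run_start / sample_rate, pos / sample_rate))
            run_start = None
        pos += frame_len
    # Close a run that is still open when the hold section ends.
    if run_start is not None and last - run_start >= min_frames * frame_len:
        sections.append((run_start / sample_rate, last / sample_rate))
    return sections
```

Samples outside the hold section are never inspected, which reflects the idea that only the hold section is analyzed.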
  • FIG. 3 is a diagram conceptually showing a hold section and utterance sections in the hold section.
  • The hold section shown in FIG. 3 is indicated by the hold section data.
  • The first voice uttered by the customer while the call is on hold, that is, during the hold section, is detected as the first utterance section as shown in FIG. 3.
  • The inventors have found that an utterance by a customer whose call is on hold is likely to stem from the customer's dissatisfaction, and further that the dissatisfaction expressed during a call hold includes dissatisfaction caused by waiting on hold (hereinafter referred to as hold waiting dissatisfaction) and dissatisfaction from other causes (hereinafter referred to as dissatisfaction other than hold waiting). Dissatisfaction other than hold waiting may be dissatisfaction with how the other party handled the call, or dissatisfaction with the product or service that prompted the call in the first place.
  • The inventors also found that utterances expressing hold waiting dissatisfaction tend to be made after a certain amount of time has passed since the start of the hold section or the end of the previous utterance in the hold section, whereas utterances expressing dissatisfaction other than hold waiting tend to be made relatively soon after the call is put on hold. Therefore, in the present embodiment, the hold section analysis unit 23 and the target determination unit 24 classify the customer's utterances in the hold section according to the two types of dissatisfaction that cause them (hold waiting dissatisfaction and dissatisfaction other than hold waiting), so that only one of the two classified utterance groups can be set as the dissatisfaction determination target. Thereby, according to this embodiment, the degree of dissatisfaction attributable only to the specified cause can be determined.
  • The hold section analysis unit 23 calculates the time difference between the start time of the first utterance section detected by the utterance detection unit 21 and the start time of the hold section.
  • Hereinafter, this time difference calculated by the hold section analysis unit 23 is referred to as the start time difference.
  • This start time difference is also shown in FIG. 3.
  • The hold section analysis unit 23 compares the calculated start time difference with a first predetermined time threshold.
  • This first predetermined time threshold is a statistical value of the time from the start of the hold section until an utterance expressing hold waiting dissatisfaction is made, and is held in advance by the hold section analysis unit 23 in an adjustable manner.
  • Thereby, according to the result of comparing the start time difference with the first predetermined time threshold, it can be identified whether the customer's utterances in the hold section of the call represent only hold waiting dissatisfaction or not.
  • The latter case means that the utterances represent only dissatisfaction other than hold waiting, or both hold waiting dissatisfaction and dissatisfaction other than hold waiting.
  • The first predetermined time threshold is set using about 1000 calls labeled with the start and end times of the operator's hold section and of the customer's utterance sections, and with whether the customer's utterance in the hold section expresses hold waiting dissatisfaction.
  • A histogram is created with the calculated start time difference on the horizontal axis and, on the vertical axis, the frequency with which the customer's first utterance expresses hold waiting dissatisfaction and the frequency with which it does not.
  • The time difference at which these two frequency distributions are most separated is set as the first predetermined time threshold.
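The text does not define how "most separated" is measured. The sketch below assumes one simple reading: pick the candidate time that puts the largest number of labeled samples on the expected side (hold waiting dissatisfaction at or above the threshold, everything else below it). The function name `choose_time_threshold` and the sample values are hypothetical.

```python
from typing import List


def choose_time_threshold(hold_wait_times: List[float], other_times: List[float]) -> float:
    """Pick the time that best separates the two labeled groups (assumed criterion)."""
    candidates = sorted(set(hold_wait_times) | set(other_times))
    best_time = candidates[0]
    best_score = -1
    for t in candidates:
        # Samples on the expected side of t for each label.
        score = sum(1 for x in hold_wait_times if x >= t)
        score += sum(1 for x in other_times if x < t)
        if score > best_score:  # ties keep the smaller time
            best_time, best_score = t, score
    return best_time


# Start time differences in seconds, with made-up labels for illustration only.
hold_wait = [20, 25, 30, 40]
other = [2, 3, 5, 8, 22]
print(choose_time_threshold(hold_wait, other))
```

The same procedure could be applied to the gaps between consecutive utterances when a second threshold is set from labeled calls.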
  • Further, from the plurality of utterance sections detected by the utterance detection unit 21, the hold section analysis unit 23 identifies two adjacent utterance sections for which the time width between them is larger than a second predetermined time threshold.
  • This second predetermined time threshold is a statistical value of the time from the end of the previous utterance in the hold section until an utterance expressing hold waiting dissatisfaction is made, and is held in advance by the hold section analysis unit 23 in an adjustable manner.
  • The second predetermined time threshold is set using about 1000 calls labeled with the start and end times of the operator's hold section and of the customer's utterance sections, and with whether the customer's utterance in the hold section expresses hold waiting dissatisfaction.
  • A histogram is created with, on the horizontal axis, the time difference from the end of one customer utterance in the hold section to the start of the next utterance and, on the vertical axis, the frequency with which the first utterance does not express hold waiting dissatisfaction while the next one does, and the frequency with which neither utterance expresses hold waiting dissatisfaction. The time difference at which these two frequency distributions are most separated is set as the second predetermined time threshold.
  • Thereby, it can be identified that the customer's utterances in the hold section may include both utterances expressing hold waiting dissatisfaction and utterances expressing dissatisfaction other than hold waiting.
  • Specifically, of the two adjacent utterance sections identified by the hold section analysis unit 23, the later utterance section and the utterance sections after it represent hold waiting dissatisfaction, and the earlier utterance section and the utterance sections before it represent dissatisfaction other than hold waiting.
  • When the hold section analysis unit 23 determines that no such two adjacent utterance sections exist, the customer's utterances in the hold section are identified as representing only dissatisfaction other than hold waiting.
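The gap-based split can be sketched as follows. The function name `split_by_gap` is hypothetical, and splitting at the first gap that exceeds the threshold is an assumption; the text does not say what happens when several gaps exceed it.

```python
from typing import List, Tuple

Section = Tuple[float, float]  # (start time, end time) of one utterance section


def split_by_gap(utterances: List[Section], second_threshold: float):
    """Split utterances at the first gap longer than the second threshold.

    Returns (sections up to and including the earlier one,
             sections from the later one onward).
    The second list is empty when no such adjacent pair exists.
    """
    for i in range(len(utterances) - 1):
        gap = utterances[i + 1][0] - utterances[i][1]
        if gap > second_threshold:
            return utterances[: i + 1], utterances[i + 1 :]
    return utterances, []


utterances = [(1.0, 2.0), (2.5, 3.0), (20.0, 22.0), (23.0, 24.0)]
before, after = split_by_gap(utterances, 10.0)
print(before)  # sections read as dissatisfaction other than hold waiting
print(after)   # sections read as hold waiting dissatisfaction
```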
  • The target determination unit 24 determines whether the customer's voice data in the hold section is to be a dissatisfaction determination target, based on the result of the comparison by the hold section analysis unit 23 between the start time difference and the first predetermined time threshold. For example, the target determination unit 24 adjustably holds information indicating whether the dissatisfaction that the user of the call analysis server 10 wants to determine is hold waiting dissatisfaction or dissatisfaction other than hold waiting, and makes this determination based on that information. Specifically, if the start time difference is equal to or greater than the first predetermined time threshold, the customer's utterances in the hold section of the call are likely to have arisen only from hold waiting dissatisfaction. In this case, if determination of hold waiting dissatisfaction is desired, the target determination unit 24 determines the customer's voice data in the hold section as a dissatisfaction determination target; if determination of dissatisfaction other than hold waiting is desired, it determines that the customer's voice data in the hold section is not a dissatisfaction determination target.
  • The target determination unit 24 also determines whether the customer's voice data in the hold section is to be a dissatisfaction determination target based on both the result of comparing the start time difference with the first predetermined time threshold and the result of the hold section analysis unit 23 identifying the two adjacent utterance sections. Specifically, when the start time difference is smaller than the first predetermined time threshold and the hold section analysis unit 23 identifies no such two adjacent utterance sections (none exist), the customer's utterances in the hold section of the call are likely to have arisen only from dissatisfaction other than hold waiting.
  • In this case, if determination of dissatisfaction other than hold waiting is desired, the target determination unit 24 determines the customer's voice data in the hold section as a dissatisfaction determination target; if determination of hold waiting dissatisfaction is desired, it determines that the customer's voice data in the hold section is not a dissatisfaction determination target.
  • When the hold section analysis unit 23 identifies the two adjacent utterance sections, the target determination unit 24 determines, as the dissatisfaction determination target, either the voice data of the earlier of the two identified utterance sections and the utterance sections before it, or the voice data of the later of the two identified utterance sections and the utterance sections after it.
  • When the start time difference is smaller than the first predetermined time threshold and the two adjacent utterance sections are identified by the hold section analysis unit 23, the customer's utterances in the hold section of the call are likely to include both utterances expressing dissatisfaction other than hold waiting and utterances expressing hold waiting dissatisfaction.
  • In this case, if determination of dissatisfaction other than hold waiting is desired, the target determination unit 24 determines only the voice data of the earlier of the two identified utterance sections and the utterance sections before it as the dissatisfaction determination target. On the other hand, if determination of hold waiting dissatisfaction is desired, the target determination unit 24 determines only the voice data of the later of the two identified utterance sections and the utterance sections after it as the dissatisfaction determination target.
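The three cases handled by the target determination unit can be put together in one short sketch. The function name `select_target_sections`, the constants `HOLD_WAIT` and `OTHER`, and the use of the first qualifying gap are assumptions for illustration; the threshold values in the usage lines are made up.

```python
from typing import List, Tuple

Section = Tuple[float, float]  # (start time, end time) of one utterance section

HOLD_WAIT = "hold_wait"  # user wants hold waiting dissatisfaction
OTHER = "other"          # user wants dissatisfaction other than hold waiting


def select_target_sections(
    utterances: List[Section],
    hold_start: float,
    first_threshold: float,
    second_threshold: float,
    desired: str,
) -> List[Section]:
    """Return the utterance sections to analyze; empty means 'not a target'."""
    if not utterances:
        return []
    start_diff = utterances[0][0] - hold_start
    if start_diff >= first_threshold:
        # Likely only hold waiting dissatisfaction.
        return list(utterances) if desired == HOLD_WAIT else []
    for i in range(len(utterances) - 1):
        gap = utterances[i + 1][0] - utterances[i][1]
        if gap > second_threshold:
            # Earlier part: other than hold waiting; later part: hold waiting.
            if desired == OTHER:
                return list(utterances[: i + 1])
            return list(utterances[i + 1 :])
    # No long gap: likely only dissatisfaction other than hold waiting.
    return list(utterances) if desired == OTHER else []


utterances = [(3.0, 4.0), (5.0, 6.0), (30.0, 31.0)]
print(select_target_sections(utterances, 0.0, 15.0, 10.0, OTHER))
print(select_target_sections(utterances, 0.0, 15.0, 10.0, HOLD_WAIT))
```

The selected sections would then be passed to whatever voice analysis process is used to determine the degree of dissatisfaction.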
  • The dissatisfaction level determination unit 25 determines the dissatisfaction level of the call by performing a predetermined voice analysis process on the customer's voice data in the hold section that the target determination unit 24 has determined as the dissatisfaction determination target.
  • The dissatisfaction level determination unit 25 generates, as a determination result, output data including identification information of the analyzed call and information indicating the dissatisfaction level, and outputs the determination result to the display unit or another output device via the input / output I / F 13.
  • The present embodiment does not limit the specific form in which the determination result is output.
  • FIG. 4 is a flowchart showing an operation example of the call analysis server 10 in the first embodiment.
  • FIG. 4 shows an example in which information indicating that the dissatisfaction the user of the call analysis server 10 wants to determine is dissatisfaction other than hold waiting is set in the call analysis server 10 (target determination unit 24).
  • In this case, the call analysis server 10 determines the presence or absence, or the degree, of dissatisfaction other than hold waiting dissatisfaction for the target call.
  • the call analysis server 10 acquires call data (S41).
  • the call analysis server 10 acquires call data to be analyzed from a plurality of call data stored in the file server 9.
  • the acquired call data includes a pair of customer voice data and operator voice data, and hold section data.
  • the voice data of the customer includes voice uttered by the customer while the call is on hold.
  • The call analysis server 10 takes the customer voice data and the hold section data from the call data acquired in step (S41), and detects the customer's utterance sections in the hold section from the customer voice data in the hold section indicated by the hold section data (S42). The call analysis server 10 acquires the start time and end time of each utterance section.
  • The call analysis server 10 calculates the time difference (start time difference) between the start time of the first of the utterance sections detected in step (S42) and the start time of the hold section (S43).
  • the call analysis server 10 compares the start time difference calculated in the step (S43) with the first predetermined time threshold value (S44).
  • When the start time difference is equal to or greater than the first predetermined time threshold (S44; NO), the call analysis server 10 determines that the customer's voice data in the hold section of the call is not a dissatisfaction determination target (S45) and ends the process.
  • The determination (S44; NO) is equivalent to determining that the customer's utterances in the hold section of the call represent only hold-wait dissatisfaction.
  • The call analysis server 10 may output information indicating that the customer did not express dissatisfaction with the call.
  • When the start time difference is smaller than the first predetermined time threshold (S44; YES), the call analysis server 10 identifies, from the utterance sections detected in step (S42), two adjacent utterance sections for which the time width between them is larger than the second predetermined time threshold (S46). When no such two adjacent utterance sections are identified in step (S46) (S47; NO), the call analysis server 10 determines the customer's voice data in the hold section of the call as the dissatisfaction determination target (S48).
  • The determination (S47; NO) is equivalent to determining that the customer's utterances in the hold section of the call represent only dissatisfaction other than hold-wait dissatisfaction.
  • When two such adjacent utterance sections are identified (S47; YES), the call analysis server 10 determines only the voice data of the earlier of the two identified adjacent utterance sections and of any utterance sections preceding it as the dissatisfaction determination target (S49).
  • The determination (S47; YES) is equivalent to determining that the customer's utterances in the hold section of the call include both an utterance representing dissatisfaction other than hold-wait dissatisfaction and an utterance representing hold-wait dissatisfaction.
  • The call analysis server 10 determines the degree of dissatisfaction of the call to be analyzed using the voice data determined as the dissatisfaction determination target (S50). The call analysis server 10 may then output information indicating the determined degree of dissatisfaction.
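The selection logic of steps (S43) to (S49) can be sketched as follows for the case where the target is dissatisfaction other than hold-wait dissatisfaction. This is a minimal illustration under stated assumptions, not the implementation of the call analysis server 10: the names `Utterance` and `select_target_utterances`, the use of seconds, the handling of an empty hold section, and the choice of the first qualifying pair are all hypothetical. The time width between adjacent utterance sections is assumed here to be the gap from the end of the earlier section to the start of the later one.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Utterance:
    """One customer utterance section inside the hold section (seconds)."""
    start: float
    end: float


def select_target_utterances(
    hold_start: float,
    utterances: List[Utterance],
    first_threshold: float,
    second_threshold: float,
) -> List[Utterance]:
    """Return the utterance sections to analyze for dissatisfaction
    other than hold-wait dissatisfaction."""
    if not utterances:
        # Assumption: with no utterance in the hold section, nothing is targeted.
        return []
    ordered = sorted(utterances, key=lambda u: u.start)

    # S43/S44: time from the start of the hold section to the first utterance.
    start_time_difference = ordered[0].start - hold_start
    if start_time_difference >= first_threshold:
        # S45: treated as hold-wait dissatisfaction only, so nothing is targeted.
        return []

    # S46: look for adjacent sections separated by a long gap.
    # Assumption: the first qualifying pair is used if there are several.
    for i in range(len(ordered) - 1):
        gap = ordered[i + 1].start - ordered[i].end
        if gap > second_threshold:
            # S49: keep the earlier section of the pair and everything before it.
            return ordered[: i + 1]

    # S48: no long gap, so every utterance in the hold section is targeted.
    return ordered
```

With a hold section starting at 0 s, utterances at 2 to 4 s, 5 to 6 s, and 40 to 42 s, a first threshold of 10 s and a second threshold of 20 s, the sketch selects the first two utterances and leaves the third to be treated as hold-wait dissatisfaction.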
  • When the dissatisfaction determination target is instead set to hold-wait dissatisfaction, in step (S45) the customer's voice data in the hold section of the call is determined as the dissatisfaction determination target.
  • In that case, in step (S48) it is determined that the customer's voice data in the hold section is not a dissatisfaction determination target in the call, and in step (S49) only the voice data of the later of the two identified adjacent utterance sections and of any utterance sections following it is determined as the dissatisfaction determination target.
  • The dissatisfied call determination method described above may further include a step of determining whether the dissatisfaction determination target desired by the user is dissatisfaction other than hold-wait dissatisfaction or hold-wait dissatisfaction, and the processing of steps (S45), (S48), and (S49) may be switched according to the result.
  • The degree of dissatisfaction other than hold-wait dissatisfaction and the degree of hold-wait dissatisfaction may also be determined separately.
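When both kinds of dissatisfaction are to be determined separately, the same comparison and identification can split the utterance sections into two groups. The sketch below is one possible arrangement, assuming sections are given as `(start, end)` tuples in seconds; the function name `split_by_cause` and the use of the first qualifying gap are hypothetical choices, not taken from the embodiment.

```python
from typing import List, Tuple

Section = Tuple[float, float]  # (start, end) in seconds


def split_by_cause(
    hold_start: float,
    sections: List[Section],
    first_threshold: float,
    second_threshold: float,
) -> Tuple[List[Section], List[Section]]:
    """Return (other_than_hold_wait, hold_wait) utterance sections."""
    ordered = sorted(sections)
    if not ordered:
        return [], []

    # Late first utterance: everything is treated as hold-wait dissatisfaction.
    if ordered[0][0] - hold_start >= first_threshold:
        return [], ordered

    # A long gap separates the two groups (first qualifying gap is assumed).
    for i in range(len(ordered) - 1):
        if ordered[i + 1][0] - ordered[i][1] > second_threshold:
            return ordered[: i + 1], ordered[i + 1:]

    # No long gap: everything is dissatisfaction other than hold-wait.
    return ordered, []
```

Each returned group could then be passed to its own dissatisfaction determination, so that the two degrees are reported independently.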
  • The time difference (start time difference) between the start time of the first customer utterance section in the hold section and the start time of the hold section is calculated and compared with the first predetermined time threshold. This comparison is based on the tendency of a person (customer) to utter hold-wait dissatisfaction only after a certain amount of time has elapsed since the start of the hold section. If the start time difference is equal to or greater than the first predetermined time threshold, it can be concluded that the customer's utterances in the hold section represent only hold-wait dissatisfaction.
  • Two adjacent utterance sections for which the time width between them is larger than the second predetermined time threshold are identified. This identification is based on the tendency of a person (customer) to utter hold-wait dissatisfaction only after a certain amount of time has elapsed since the end of the previous utterance in the hold section. If no such pair can be identified, it can be concluded that the customer's utterances in the hold section represent only dissatisfaction other than hold-wait dissatisfaction.
  • If such a pair can be identified, it can be concluded that the customer's utterances in the hold section include both an utterance representing dissatisfaction other than hold-wait dissatisfaction and an utterance representing hold-wait dissatisfaction. Furthermore, the two identified adjacent utterance sections make it possible to divide the utterances in the hold section into those representing dissatisfaction other than hold-wait dissatisfaction and those representing hold-wait dissatisfaction.
  • In this way, information processing (comparison and identification) that takes into account how customers speak while a call is on hold makes it possible to classify the customer's utterances in the hold section by the two types of cause of dissatisfaction (hold-wait dissatisfaction and dissatisfaction other than hold-wait dissatisfaction).
  • the dissatisfaction degree of the call can be determined only using the audio
  • Because the predetermined voice analysis processing is performed only on the voice data corresponding to the desired cause of dissatisfaction, processing is more efficient and faster than when the entire call data or the entire hold section is the determination target.
  • In addition to determining the degree of dissatisfaction of the call as in the first embodiment, the second embodiment generates information indicating the cause of the dissatisfaction.
  • The contact center system 1 in the second embodiment is described below, focusing on what differs from the first embodiment. Content that is the same as in the first embodiment is omitted as appropriate.
  • FIG. 5 is a diagram conceptually illustrating a processing configuration example of the call analysis server 10 in the second embodiment.
  • The call analysis server 10 in the second embodiment includes a cause estimation unit 27 in addition to the configuration of the first embodiment.
  • Like the other processing units, the cause estimation unit 27 is realized, for example, by the CPU 11 executing a program stored in the memory 12.
  • The cause estimation unit 27 generates cause information indicating the cause of dissatisfaction corresponding to the customer's utterances in the hold section, based on the result of the comparison between the start time difference and the first predetermined time threshold by the hold section analysis unit 23 and on the result of the identification of the two adjacent utterance sections by the hold section analysis unit 23.
  • The cause estimation unit 27 may generate output data including the cause information and output it to the display unit or another output device via the input/output I/F 13.
  • The specific form in which the cause information is output is not limited.
  • The cause of dissatisfaction corresponding to the customer's utterances in the hold section, as indicated by the cause information, falls into one of the following three cases.
  • In the first case, only the length of the hold waiting time is the cause of dissatisfaction.
  • In the second case, only something other than the length of the hold waiting time is the cause of dissatisfaction.
  • In the third case, both the length of the hold waiting time and something other than it are causes of dissatisfaction.
  • The cause estimation unit 27 generates cause information for the first case when the start time difference is equal to or greater than the first predetermined time threshold.
  • The cause estimation unit 27 generates cause information for the second case when the start time difference is smaller than the first predetermined time threshold and there are no two adjacent utterance sections for which the time width between them is larger than the second predetermined time threshold.
  • The cause estimation unit 27 generates cause information for the third case when the start time difference is smaller than the first predetermined time threshold and there are two adjacent utterance sections for which the time width between them is larger than the second predetermined time threshold.
  • The call analysis server 10 further generates cause information as described above.
  • The call analysis server 10 may generate the cause information after determining the degree of dissatisfaction as in the first embodiment, or before determining it.
  • In the case of (S44; NO), the call analysis server 10 generates cause information indicating that only the length of the hold waiting time is the cause of dissatisfaction. In the case of (S44; YES) and (S47; NO), the call analysis server 10 generates cause information indicating that only something other than the length of the hold waiting time is the cause of dissatisfaction. In the case of (S44; YES) and (S47; YES), the call analysis server 10 generates cause information indicating that both the length of the hold waiting time and something other than it are causes of dissatisfaction.
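The mapping from the two determinations (S44) and (S47) to the three cases can be written compactly. The following is a sketch only: the `Cause` enumeration, the function name `estimate_cause`, and the representation of the time widths as a list of gaps in seconds are assumptions made for illustration.

```python
from enum import Enum
from typing import Sequence


class Cause(Enum):
    HOLD_WAIT_ONLY = "hold waiting time only"            # first case
    OTHER_ONLY = "other than hold waiting time only"     # second case
    BOTH = "hold waiting time and other"                 # third case


def estimate_cause(
    start_time_difference: float,
    gaps: Sequence[float],
    first_threshold: float,
    second_threshold: float,
) -> Cause:
    """Map the S44 and S47 outcomes to one of the three cause cases.

    `gaps` holds the time widths between adjacent utterance sections.
    """
    if start_time_difference >= first_threshold:       # S44; NO
        return Cause.HOLD_WAIT_ONLY
    if any(gap > second_threshold for gap in gaps):    # S44; YES and S47; YES
        return Cause.BOTH
    return Cause.OTHER_ONLY                            # S44; YES and S47; NO
```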
  • In the second embodiment, cause information indicating the cause of dissatisfaction corresponding to the customer's utterances in the hold section is generated together with the determination of the degree of dissatisfaction.
  • Generating the cause information uses the processing results of the hold section analysis unit 23 and the target determination unit 24 of the first embodiment described above. Therefore, according to the second embodiment, the cause of dissatisfaction can be estimated with high accuracy by information processing (comparison and identification) that takes into account how customers speak while a call is on hold.
  • In the embodiments described above, the call data includes a pair of customer voice data and operator voice data, and hold section data.
  • The call data may further include voice text data of the customer and the operator, and time information for each voice text.
  • The voice text data is data in which the voice uttered by the customer or the operator has been converted into text.
  • The time information of each voice text is information on the time at which the utterance indicated by that voice text was uttered. The voice text data is generated, for example, by applying speech recognition processing to the voices of the customer and the operator at each operator terminal 7 or at the file server 9.
  • In this case, the utterance detection unit 21 of the call analysis server 10 is not necessary.
  • The hold section analysis unit 23 can perform its processing using the time information of each voice text and the hold section data.
  • The dissatisfaction level determination unit 25 may determine the degree of dissatisfaction using the voice text data.
  • The customer utterance sections in the hold section that the utterance detection unit 21 detects may instead be generated by a device other than the call analysis server 10.
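When time information is available for each voice text, the customer's utterance sections in the hold section can be derived without an utterance detection step. The sketch below assumes a hypothetical record type `VoiceText` with a speaker label and start and end times in seconds, and assumes that only texts lying entirely within the hold section are kept; neither detail is specified in the description.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class VoiceText:
    """Recognized text with its time information (hypothetical layout)."""
    speaker: str   # "customer" or "operator"
    start: float   # seconds from the start of the call
    end: float
    text: str


def customer_sections_in_hold(
    texts: List[VoiceText], hold_start: float, hold_end: float
) -> List[Tuple[float, float]]:
    """Return (start, end) of customer voice texts inside the hold section."""
    sections = [
        (t.start, t.end)
        for t in texts
        if t.speaker == "customer"
        and t.start >= hold_start
        and t.end <= hold_end
    ]
    return sorted(sections)
```

The resulting sections can be passed to the same threshold comparisons that the hold section analysis unit 23 applies to detected utterance sections.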
  • Each embodiment described above shows an example in which the hold section analysis unit 23 compares the start time difference with the first predetermined time threshold.
  • The hold section analysis unit 23 may instead compare another value obtained from the start time difference (also referred to as a representative value) with a threshold, which may also be referred to as the first predetermined time threshold.
  • The representative value may be a value output from a predetermined function that takes the start time difference as input.
  • The predetermined function may be, for example, a function that takes the start time difference as input and outputs a value indicating the probability that the dissatisfaction corresponding to the utterances in the hold section is only hold-wait dissatisfaction.
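The description does not specify the predetermined function. As one hypothetical choice, a logistic curve maps the start time difference to a value between 0 and 1 that grows as the first utterance comes later; the function name and the `midpoint` and `steepness` parameters below are assumptions for illustration only.

```python
import math


def hold_wait_only_likelihood(
    start_time_difference: float,
    midpoint: float = 10.0,   # hypothetical: difference (s) giving 0.5
    steepness: float = 0.5,   # hypothetical: how sharply the value rises
) -> float:
    """Representative value derived from the start time difference.

    Returns a value in (0, 1); larger values suggest that the utterances
    in the hold section reflect only hold-wait dissatisfaction.
    """
    exponent = -steepness * (start_time_difference - midpoint)
    return 1.0 / (1.0 + math.exp(exponent))
```

With such a function, the comparison in step (S44) would be made between this representative value and a threshold on the same scale (for example 0.5) rather than between raw times.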
  • Each embodiment described above shows an example in which the hold section analysis unit 23 identifies two adjacent utterance sections for which the time width between them is larger than the second predetermined time threshold.
  • The hold section analysis unit 23 may instead identify two adjacent utterance sections for which another value obtained from the time width between them (also referred to as a representative value) is larger than the second predetermined time threshold.
  • The representative value may be a value output from a predetermined function that takes the time width as input.
  • The predetermined function may be, for example, a function that takes the time width as input and outputs a value indicating the probability that the dissatisfaction corresponding to the utterances in the hold section is only due to the length of the hold waiting time.
  • The cause estimation unit 27 described above may include the representative value obtained from the start time difference in the cause information as a value indicating the probability that the cause of dissatisfaction corresponding to the customer's utterances in the hold section is only the length of the hold waiting time. Similarly, the cause estimation unit 27 may include the representative value obtained from the time width in the cause information as a value indicating the probability that the cause of dissatisfaction corresponding to the customer's utterances in the hold section is only due to the length of the hold waiting time.
  • In each embodiment described above, the hold section analysis unit 23 identifies two adjacent utterance sections for which the time width between them is larger than the second predetermined time threshold.
  • The hold section analysis unit 23 may instead perform only the comparison between the start time difference and the first predetermined time threshold.
  • In that case, the target determination unit 24 may determine, based on the comparison result, whether or not the customer's voice data in the hold section is to be a dissatisfaction determination target.
  • The cause estimation unit 27 may generate cause information indicating whether or not the customer's utterances during the hold are caused only by dissatisfaction with the length of the hold waiting time.
  • The embodiments described above handle call data.
  • However, the dissatisfied conversation determination device and the dissatisfied conversation determination method described above may also be applied to an apparatus or a system that handles conversation data other than calls.
  • A recording device for recording the conversation to be analyzed is installed at the place where the conversation takes place (a conference room, bank counter, store cash register, and so on).
  • When the conversation data is recorded with the voices of multiple conversation participants mixed, it is separated from the mixed state into voice data for each conversation participant by predetermined voice processing.
  • The hold section data is generated, for example, by detecting the hold operation and the hold release operation by the operator, or by detecting the hold tone included in the operator's voice data.
  • The hold section data may also be input by a user operation.
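Generating hold section data from detected hold and hold release operations amounts to pairing the two kinds of event in time order. The sketch below assumes the events arrive as `(time_in_seconds, kind)` tuples with `kind` equal to `"hold"` or `"release"`; this event format, the function name, and the decision to ignore unmatched events are illustrative assumptions.

```python
from typing import List, Optional, Tuple


def hold_sections_from_events(
    events: List[Tuple[float, str]]
) -> List[Tuple[float, float]]:
    """Pair hold and release operations into (start, end) hold sections."""
    sections: List[Tuple[float, float]] = []
    hold_start: Optional[float] = None

    for time, kind in sorted(events):
        if kind == "hold" and hold_start is None:
            hold_start = time
        elif kind == "release" and hold_start is not None:
            sections.append((hold_start, time))
            hold_start = None
        # Repeated holds or stray releases are ignored in this sketch.

    return sections
```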
  • a holding section analysis unit Based on the comparison result by the holding section analysis unit, an object determining unit that determines whether or not the voice data of the held-side conversation participant in the holding section is a determination target of the dissatisfaction degree determination unit;
  • the unsatisfactory conversation determination device according to supplementary note 1, further comprising:
  • Appendix 4 Based on the time difference calculated by the holding section analysis unit, the probability that the cause of dissatisfaction corresponding to the utterance of the held-side conversation participant in the holding section is only due to the length of the holding waiting time.
  • a cause estimator that generates cause information to indicate The unsatisfactory conversation determination device according to Appendix 2, further comprising:
  • the utterance detection unit detects a plurality of utterance sections of the held-side conversation participant in the holding section from the voice data of the held-side conversation participant of the holding section,
  • the holding section analysis unit includes two adjacent utterance sections in which a time width between two adjacent utterance sections or a representative value corresponding to the time width is larger than a second predetermined time threshold from the plurality of utterance sections. Identify
  • the target determination unit is configured to determine whether the reserved section is based on the comparison result between the time difference or the representative value and the first predetermined time threshold, and the specific result. Determining whether or not the voice data of the conversation participant is to be determined by the dissatisfaction level determination unit; The unsatisfactory conversation determination device according to any one of supplementary notes 2 to 4.
  • the target determination unit includes voice data of an utterance section before the previous utterance section of the two adjacent utterance sections specified by the holding section analysis section, and the two adjacent utterance sections specified Any one of the speech data of the utterance section after the utterance section behind is set as the determination target of the dissatisfaction level determination unit.
  • the unsatisfactory conversation determination device according to appendix 5.
  • the cause estimation unit is based on the identification result of the two adjacent utterance intervals by the hold interval analysis unit, and the cause of dissatisfaction corresponding to the utterance of the held conversation participant in the hold interval is a hold waiting time. Generating cause information indicating whether it is only due to the length of the waiting time, or both due to the length of the holding waiting time and other than the length of the holding waiting time, The unsatisfactory conversation determination device according to appendix 5 or 6.
  • a cause estimator that generates cause information indicating the accuracy of being only due to length other than The unsatisfactory conversation determination device according to appendix 5 or 6, further comprising:
  • the dissatisfied conversation determination method further including:
  • the dissatisfied conversation determination method further including:
  • the determination regarding the object of determination of the dissatisfaction level includes voice data of an utterance section before the previous utterance section of the two adjacent utterance sections specified, and of the two adjacent utterance sections specified. Either one of the speech data of the utterance section after the subsequent utterance section is the object of the dissatisfaction determination, The method for determining a dissatisfied conversation according to appendix 13.
  • the generation of the cause information is based on the result of specifying the two adjacent utterance intervals, and the cause of dissatisfaction corresponding to the utterance of the held-side conversation participant in the hold interval is other than the length of the hold waiting time. Generating cause information that indicates whether it is only due to the length of the hold waiting time or not due to the length of the hold waiting time, 15.
  • the method for determining a dissatisfied conversation according to appendix 13 or 14.
  • Appendix 17 A program for causing at least one computer to execute the unsatisfactory conversation determination method according to any one of appendices 9 to 16.

Abstract

Provided is a complaint conversation determination device comprising: a data acquisition unit that acquires audio data for a conversation and hold segment data indicating the start time and end time of a hold segment within the conversation; and a dissatisfaction degree determination unit that determines the degree of dissatisfaction in the conversation by performing predetermined audio analysis processing on the audio data, included in the audio data acquired by the data acquisition unit, of the held-side conversation participant in the hold segment indicated by the hold segment data.

Description

Dissatisfied conversation determination device and dissatisfied conversation determination method
The present invention relates to a conversation analysis technique.
One example of a technique for analyzing conversations is the analysis of call data. For example, data of calls handled in a department called a call center or contact center is analyzed. Hereinafter, a department that specializes in responding to customer calls, such as inquiries, complaints, and orders regarding products and services, is referred to as a contact center.
The voices of customers that reach a contact center often reflect customer needs and satisfaction, and handling such calls appropriately is important for improving customer satisfaction and increasing repeat customers. Various methods have therefore been proposed for extracting customer emotions (anger, irritation, discomfort, and so on) and customer needs by analyzing the voice of customer calls.
Patent Document 1 below proposes a method that recognizes emotion from the voice of a customer's call, determines whether the content is a complaint according to whether the emotion represents at least one of "anger" and "excitement", and notifies an appropriate person in charge according to the result. Patent Document 2 below proposes a method that, in order to detect a speaker's anger or irritation without being affected by individual, language, or regional differences, detects periodic fluctuation of the amplitude envelope in the input voice signal and determines from the detection result whether the input voice is a strained voice. Patent Document 3 below proposes a method that, in order to provide information matching customer needs, determines whether a preset keyword was uttered during a call between a call center operator and a customer, grasps the customer's potential needs from that keyword, and provides the customer with guidance information associated with the keyword in advance. Patent Document 4 below proposes a method that calculates a time limit for each call partner from whether the call partner placed on hold is the caller or the receiver and from information about the caller and the receiver, and, according to the result of comparing the call hold time with this time limit, notifies the operator of progress information relative to the time limit and of the psychological state of the call partner.
JP 2011-009902 A; JP 2009-003162 A; JP 2009-182432 A; JP 2009-111829 A
The methods proposed in Patent Documents 1 to 3 analyze the entire conversation (call), so it takes time to determine whether each call is a dissatisfied call. In addition, when the customer's dissatisfaction is expressed in only part of the call, these methods average it out over the whole call and may fail to detect it properly. For example, a customer who had been responding calmly may express dissatisfaction as soon as the call is put on hold. In such a case, the proposed methods may not detect the dissatisfaction.
Patent Document 4 proposes determining whether the call partner has lost composure while the call is on hold. However, this method only determines dissatisfaction caused by the waiting time of the hold and does not contemplate detecting dissatisfaction with the call itself.
The present invention has been made in view of these circumstances and provides a technique for appropriately detecting the degree of dissatisfaction of a conversation. Here, the degree of dissatisfaction of a conversation means the degree of dissatisfaction that a person participating in the conversation (hereinafter referred to as a conversation participant) is likely to have felt in that conversation, and it may indicate only whether or not the speaker felt dissatisfied.
Each aspect of the present invention adopts the following configurations to solve the problems described above.
The first aspect relates to a dissatisfied conversation determination device. The dissatisfied conversation determination device according to the first aspect includes: a data acquisition unit that acquires hold section data indicating the start time and end time of a hold section of a conversation, together with the voice data of that conversation; and a dissatisfaction level determination unit that determines the degree of dissatisfaction of the conversation by performing predetermined voice analysis processing on the voice data, included in the voice data acquired by the data acquisition unit, of the held-side conversation participant in the hold section indicated by the hold section data.
The second aspect relates to a dissatisfied conversation determination method executed by at least one computer. The method according to the second aspect includes: acquiring hold section data indicating the start time and end time of a hold section of a conversation, together with the voice data of that conversation; and determining the degree of dissatisfaction of the conversation by performing predetermined voice analysis processing on the voice data, included in the acquired voice data, of the held-side conversation participant in the hold section indicated by the hold section data.
Another aspect of the present invention may be a program that causes at least one computer to realize each configuration of the first aspect, or a computer-readable recording medium on which such a program is recorded. This recording medium includes a non-transitory tangible medium.
According to each of the aspects above, a technique for appropriately detecting the degree of dissatisfaction of a conversation can be provided.
The object described above, and other objects, features, and advantages, will become clearer from the preferred embodiments described below and the accompanying drawings.
FIG. 1 is a conceptual diagram showing a configuration example of the contact center system in the first embodiment.
FIG. 2 is a diagram conceptually showing a processing configuration example of the call analysis server in the first embodiment.
FIG. 3 is a diagram conceptually showing a hold section and the utterance sections within the hold section.
FIG. 4 is a flowchart showing an operation example of the call analysis server in the first embodiment.
FIG. 5 is a diagram conceptually showing a processing configuration example of the call analysis server in the second embodiment.
Embodiments of the present invention are described below. Each embodiment given below is an example, and the present invention is not limited to the configurations of these embodiments.
The dissatisfied conversation determination device according to the present embodiment includes: a data acquisition unit that acquires hold section data indicating the start time and end time of a hold section of a conversation, together with the voice data of that conversation; and a dissatisfaction level determination unit that determines the degree of dissatisfaction of the conversation by performing predetermined voice analysis processing on the voice data, included in the voice data acquired by the data acquisition unit, of the held-side conversation participant in the hold section indicated by the hold section data.
The dissatisfied conversation determination method according to the present embodiment is executed by at least one computer and includes: acquiring hold section data indicating the start time and end time of a hold section of a conversation, together with the voice data of that conversation; and determining the degree of dissatisfaction of the conversation by performing predetermined voice analysis processing on the voice data, included in the acquired voice data, of the held-side conversation participant in the hold section indicated by the hold section data.
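The two operations of the method, acquiring the data and analyzing only the held-side participant's voice within the hold section, can be sketched as follows. This is an illustration under assumptions: the voice data is taken to be a sequence of samples at a known sample rate, and the predetermined voice analysis processing, which the description leaves open, is passed in as a caller-supplied function `analyze`. All names here are hypothetical.

```python
from typing import Callable, Sequence


def hold_section_samples(
    samples: Sequence[float],
    sample_rate: int,
    hold_start: float,
    hold_end: float,
) -> Sequence[float]:
    """Cut out the samples that fall inside the hold section (times in seconds)."""
    first = max(0, int(round(hold_start * sample_rate)))
    last = min(len(samples), int(round(hold_end * sample_rate)))
    return samples[first:last]


def judge_dissatisfaction(
    held_side_samples: Sequence[float],
    sample_rate: int,
    hold_start: float,
    hold_end: float,
    analyze: Callable[[Sequence[float]], float],
) -> float:
    """Apply the supplied analysis only to the hold section of the
    held-side participant's voice data and return its score."""
    segment = hold_section_samples(
        held_side_samples, sample_rate, hold_start, hold_end
    )
    return analyze(segment)
```

Because only the hold section is passed to `analyze`, the cost of the analysis depends on the length of the hold rather than on the length of the whole conversation.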
 ここで、会話とは、2以上の話者が、言語の発声などによる意思表示によって、話をすることを意味する。会話には、銀行の窓口や店舗のレジ等のように、会話参加者が直接、話をする形態もあれば、通話機を用いた通話やテレビ会議等のように、離れた位置にいる会話参加者同士が話をする形態もあり得る。 Here, “conversation” means that two or more speakers speak by expressing their intentions by uttering a language. In some conversations, conversation participants can speak directly, such as at bank counters and cash registers at stores, and in remote conversations such as telephone conversations and video conferencing. There may be a form in which the participants talk.
 A hold section means the period during which a conversation is on hold. Putting a conversation on hold means that at least one of the conversation participants enters a state of not participating in the conversation. The at least one remaining conversation participant is referred to as a held-side conversation participant. In other words, a held-side conversation participant is a participant whose conversation has been put on hold and who waits for the hold to be released. Here, the state in which a conversation participant does not participate in the conversation means a state in which that participant has deliberately left the conversation, for example by leaving the seat where the conversation takes place or by putting a call on hold. Thus, when two customers and a bank representative are talking at a bank counter, the period during which the representative is away from the seat is an example of a hold section. In this example, the held-side conversation participants are the customers waiting for the bank representative.
 The present inventors found that a held-side conversation participant tends to voice true feelings precisely while the conversation is on hold, and from this finding concluded that utterances made by the held-side conversation participant during a hold are highly likely to stem from that participant's dissatisfaction. For example, even a conversation participant (held-side conversation participant) who responds calmly during the conversation may express dissatisfaction while on hold. This tendency for dissatisfaction to surface during a hold can arise in conversations in general, including telephone calls. For example, customers may discuss their complaints with each other while the person in charge is away from the seat. In the present embodiment, therefore, the hold section data of a conversation and the voice data of that conversation are acquired, and predetermined voice analysis processing is performed on the voice data of the held-side conversation participant in the hold section specified by the hold section data, whereby the dissatisfaction level of the conversation is determined.
 The conversation voice data acquired in the present embodiment includes the voice data of the held-side conversation participant while the conversation is on hold. The predetermined voice analysis processing is, for example, emotion recognition processing as proposed in Patent Documents 1 and 2 above, or speech recognition processing that extracts, from voice data, character data indicating its content. The present embodiment does not limit the predetermined voice analysis processing used to determine the dissatisfaction level. The dissatisfaction level of the call determined in the present embodiment may indicate only whether the held-side conversation participant expressed dissatisfaction, or may indicate the degree of dissatisfaction as a numerical value.
 As described above, the present embodiment focuses on the hold period, during which dissatisfaction is likely to be expressed, and analyzes only the voice data of the held-side conversation participant during the hold for dissatisfaction determination. Dissatisfaction determination for each conversation can therefore be executed faster than with a technique that analyzes the entire conversation. Furthermore, because the analysis targets exactly the section in which dissatisfaction is likely to be expressed, the present embodiment prevents the dissatisfaction from being averaged out over the entire conversation, and as a result the dissatisfaction level of the conversation can be determined appropriately.
 The above-described embodiment is explained in more detail below, with a first embodiment and a second embodiment given as detailed embodiments. Each of the following embodiments is an example in which the above-described dissatisfied conversation determination device and dissatisfied conversation determination method are applied to a contact center system. The device and method are not limited to a contact center system that handles call data, and are applicable to various settings that handle conversation data. For example, they can also be applied to an in-house call management system other than a contact center, or to personally owned call terminals such as a PC (Personal Computer), a fixed telephone, a mobile phone, a tablet terminal, or a smartphone. Examples of conversation data further include data representing a conversation between a person in charge and a customer at a bank counter or a store cash register. Hereinafter, a call means the call from the time a call connection is established between the telephones of two callers until the call is disconnected. A hold section means a section within a call during which the call is on hold.
 [First Embodiment]
 〔System Configuration〕
 FIG. 1 is a conceptual diagram showing a configuration example of the contact center system 1 in the first embodiment. The contact center system 1 in the first embodiment includes an exchange (PBX) 5, a plurality of operator telephones 6, a plurality of operator terminals 7, a file server 9, a call analysis server 10, and the like. The call analysis server 10 corresponds to the dissatisfied conversation determination device in the above-described embodiment. In the first embodiment, the customer corresponds to the held-side conversation participant described above, and the operator corresponds to the conversation participant who places the hold.
 The exchange 5 is communicably connected, via a communication network 2, to a call terminal 3 used by a customer, such as a PC, a fixed telephone, a mobile phone, a tablet terminal, or a smartphone. The communication network 2 is a public network such as the Internet or a PSTN (Public Switched Telephone Network), a wireless communication network, or the like. The exchange 5 is also connected to each operator telephone 6 used by each operator of the contact center. The exchange 5 receives a call from a customer and connects the call to the operator telephone 6 of the operator assigned to that call.
 Each operator uses an operator terminal 7. Each operator terminal 7 is a general-purpose computer, such as a PC, connected to a communication network 8 (a LAN (Local Area Network) or the like) within the contact center system 1. For example, each operator terminal 7 records the customer's voice data and the operator's voice data for each call between the operator and a customer. Each operator terminal 7 also records the customer's voice data while the call is on hold. The customer's voice data and the operator's voice data may be generated by separating them from a mixed signal through predetermined voice processing. The present embodiment does not limit the recording technique or the entity that performs the recording. Each piece of voice data may be generated by a device (not shown) other than the operator terminal 7.
 Furthermore, each operator terminal 7 detects the operator's hold operation and hold release operation on the operator telephone 6. From these, each operator terminal 7 generates, for each call, hold section data indicating the start time and end time of each hold section. The present embodiment does not limit the technique for generating the hold section data or the entity that generates it. For example, instead of detecting the operator's hold and hold release operations, the operator terminal 7 may generate the hold section data by detecting the hold tone contained in the operator's voice data. The hold section data may also be generated by a device (not shown) other than the operator terminal 7.
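 As an illustration of how hold section data might be derived from detected hold and hold release operations, the following is a minimal sketch in Python. The event representation (a timestamp in seconds paired with the string "hold" or "release") and the function name `hold_sections_from_events` are assumptions made for this example and do not appear in the source.

```python
from typing import List, Tuple

def hold_sections_from_events(
    events: List[Tuple[float, str]]
) -> List[Tuple[float, float]]:
    """Pair each "hold" event with the next "release" event.

    `events` is a hypothetical list of (time_in_seconds, kind) tuples,
    where kind is "hold" or "release". Returns (start, end) pairs.
    """
    sections = []
    hold_start = None
    for time, kind in sorted(events):
        if kind == "hold" and hold_start is None:
            hold_start = time                    # a hold section begins
        elif kind == "release" and hold_start is not None:
            sections.append((hold_start, time))  # the hold section ends
            hold_start = None
    # A hold that is never released is ignored in this sketch.
    return sections
```

 Each returned pair corresponds to the start time and end time that the hold section data is said to indicate.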
 The file server 9 is realized by a general server computer. The file server 9 stores the call data of each call between a customer and an operator together with identification information of that call. Each piece of call data includes a pair consisting of the customer's voice data and the operator's voice data, and the hold section data described above. The file server 9 acquires the customer's voice data and the operator's voice data from another device that records the voices of the customer and the operator (such as each operator terminal 7). The file server 9 acquires the hold section data from each operator telephone 6, each operator terminal 7, the exchange 5, or the like.
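 The structure of one piece of call data described above can be sketched as follows. This is only an illustrative data model: the class names, field names, and the representation of voice data as a list of samples with a sample rate are assumptions for this example, not something the source specifies.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class HoldSection:
    start: float  # start time of the hold, in seconds from call start
    end: float    # end time of the hold, in seconds from call start

@dataclass
class CallData:
    call_id: str                 # identification information of the call
    customer_audio: List[float]  # customer's voice data (samples)
    operator_audio: List[float]  # operator's voice data (samples)
    sample_rate: int             # samples per second (assumed representation)
    hold_sections: List[HoldSection] = field(default_factory=list)

    def customer_audio_in(self, hold: HoldSection) -> List[float]:
        """Return the customer's samples that fall inside one hold section."""
        first = int(hold.start * self.sample_rate)
        last = int(hold.end * self.sample_rate)
        return self.customer_audio[first:last]
```

 Keeping the customer and operator channels separate makes it straightforward to extract only the held-side participant's audio for a given hold section.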
 The call analysis server 10 analyzes the customer's state of dissatisfaction for each piece of call data stored in the file server 9.
 As shown in FIG. 1, the call analysis server 10 has, as its hardware configuration, a CPU (Central Processing Unit) 11, a memory 12, an input/output interface (I/F) 13, a communication device 14, and the like. The memory 12 is a RAM (Random Access Memory), a ROM (Read Only Memory), a hard disk, a portable storage medium, or the like. The input/output I/F 13 is connected to devices that accept user input, such as a keyboard and a mouse, and to devices that provide information to the user, such as a display device and a printer. The communication device 14 communicates with the file server 9 and other devices via the communication network 8. The hardware configuration of the call analysis server 10 is not limited.
 〔Processing Configuration〕
 FIG. 2 is a diagram conceptually showing a processing configuration example of the call analysis server 10 in the first embodiment. The call analysis server 10 in the first embodiment includes a data acquisition unit 20, an utterance detection unit 21, a hold section analysis unit 23, a target determination unit 24, a dissatisfaction level determination unit 25, and the like. Each of these processing units is realized, for example, by the CPU 11 executing a program stored in the memory 12. The program may be installed from a portable recording medium such as a CD (Compact Disc) or a memory card, or from another computer on the network, via the input/output I/F 13, and stored in the memory 12.
 The data acquisition unit 20 acquires, from the file server 9, the call data of the call to be analyzed together with the identification information of that call. As described above, the acquired call data includes a pair consisting of the customer's voice data and the operator's voice data, and the hold section data. The call data may be acquired through communication between the call analysis server 10 and the file server 9, or via a portable recording medium.
 The utterance detection unit 21 detects the customer's utterance sections within a hold section from the customer's voice data in the hold section indicated by the hold section data, this voice data being included in the voice data acquired by the data acquisition unit 20. For example, the utterance detection unit 21 detects, as an utterance section, a section of the voice waveform represented by the customer's voice data in the hold section in which an amplitude of at least a predetermined value continues. Detecting an utterance section means detecting a section of the voice data that represents one utterance by the customer, which yields the start time and end time of that section.
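 The amplitude-based detection mentioned above can be sketched as follows. This is a simplified illustration, not the detection method of the utterance detection unit 21 itself: it assumes the input is a sequence of amplitude envelope values (for instance, per-frame peak amplitudes) rather than raw waveform samples, and the function name, the `offset` parameter, and the threshold value are hypothetical.

```python
from typing import List, Sequence, Tuple

def detect_utterances(
    envelope: Sequence[float],
    rate: int,
    amplitude_threshold: float,
    offset: float = 0.0,
) -> List[Tuple[float, float]]:
    """Return (start_time, end_time) for each run of values whose
    magnitude stays at or above `amplitude_threshold`.

    `rate` is the number of envelope values per second, and `offset`
    is the time of the first value (e.g. the hold section start time).
    """
    runs = []
    run_start = None
    for i, value in enumerate(envelope):
        loud = abs(value) >= amplitude_threshold
        if loud and run_start is None:
            run_start = i                  # an utterance section begins
        elif not loud and run_start is not None:
            runs.append((run_start, i))    # the utterance section ends
            run_start = None
    if run_start is not None:              # utterance lasts to the end
        runs.append((run_start, len(envelope)))
    return [(offset + a / rate, offset + b / rate) for a, b in runs]
```

 A practical detector would typically also smooth the envelope and merge runs separated by very short pauses; those refinements are omitted here.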
 FIG. 3 is a diagram conceptually showing a hold section and the utterance sections within it. The hold section data indicates a hold section as shown in FIG. 3. Of the speech uttered by the customer while the call is on hold, that is, during the hold section, the first speech is detected as the first utterance section, as shown in FIG. 3.
 The present inventors found that utterances by a customer while a call is on hold are highly likely to stem from that customer's dissatisfaction. They further found that the dissatisfaction expressed during a hold includes both dissatisfaction caused by the length of the wait on hold (hereinafter, hold-wait dissatisfaction) and dissatisfaction due to other causes (hereinafter, dissatisfaction other than hold-wait). Dissatisfaction other than hold-wait may include dissatisfaction with the way the other party handled the call, or dissatisfaction with the product or service that prompted the call in the first place.
 The present inventors further found that utterances expressing hold-wait dissatisfaction tend to be made after a certain amount of time has elapsed since the start of the hold section or since the end of the previous utterance in the hold section, whereas utterances expressing dissatisfaction other than hold-wait tend to be made relatively soon after the call is put on hold. The present embodiment therefore provides the hold section analysis unit 23 and the target determination unit 24, which make it possible to classify the customer's utterances in a hold section by the two causes of dissatisfaction (hold-wait dissatisfaction and dissatisfaction other than hold-wait) and to use only one of the two resulting groups of utterances as the dissatisfaction determination target. According to the present embodiment, the dissatisfaction level can thus be determined for a specified cause of dissatisfaction only.
 The hold section analysis unit 23 calculates the time difference between the start time of the first utterance section detected by the utterance detection unit 21 and the start time of the hold section. Hereinafter, this time difference calculated by the hold section analysis unit 23 is referred to as the start time difference. The start time difference is also shown in FIG. 3. The hold section analysis unit 23 then compares the calculated start time difference with a first predetermined time threshold. The first predetermined time threshold is a statistical value of the time from the start of a hold section until an utterance expressing hold-wait dissatisfaction is made, and is held in advance by the hold section analysis unit 23 in an adjustable manner.
 The result of comparing the start time difference with the first predetermined time threshold thus makes it possible to identify whether the customer's utterances in the hold section of the call express hold-wait dissatisfaction only, or something else. In the latter case, the utterances express either dissatisfaction other than hold-wait only, or both hold-wait dissatisfaction and dissatisfaction other than hold-wait. The first predetermined time threshold is set using about 1000 calls labeled with the start and end times of the operator's hold sections and of the customer's utterance sections, and with whether each customer utterance in a hold section expresses hold-wait dissatisfaction. For example, a histogram is created with the calculated start time difference on the horizontal axis and, on the vertical axis, the frequency with which the customer's first utterance is a hold-wait dissatisfaction utterance and the frequency with which it is not. The time difference at which these two frequency distributions are best separated is used as the first predetermined time threshold.
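 The source does not define precisely what it means for the two frequency distributions to be best separated. One possible reading, shown below purely as an assumption, is to choose the time difference that misclassifies the fewest labeled calls when differences at or above the threshold are treated as hold-wait dissatisfaction. The function name and the sample values are hypothetical.

```python
from typing import Optional, Sequence

def choose_time_threshold(
    hold_wait_diffs: Sequence[float],
    other_diffs: Sequence[float],
) -> Optional[float]:
    """Pick the time difference that best separates two labeled groups.

    hold_wait_diffs: start time differences of calls whose first
        utterance was labeled as hold-wait dissatisfaction.
    other_diffs: start time differences of calls whose first utterance
        was labeled otherwise.
    Assumed rule: difference >= threshold is treated as hold-wait.
    Returns None when no labeled data is given.
    """
    candidates = sorted(set(hold_wait_diffs) | set(other_diffs))
    best_threshold, best_errors = None, None
    for t in candidates:
        errors = sum(d < t for d in hold_wait_diffs)  # missed hold-wait
        errors += sum(d >= t for d in other_diffs)    # false hold-wait
        if best_errors is None or errors < best_errors:
            best_threshold, best_errors = t, errors
    return best_threshold

# Hypothetical labeled start time differences, in seconds.
threshold = choose_time_threshold([30, 45, 60, 50], [2, 5, 8, 35])
```

 Other separation criteria, such as the crossing point of the two fitted distributions, would also be consistent with the description.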
 Furthermore, to identify whether the customer's utterances in the hold section express dissatisfaction other than hold-wait only, or both hold-wait dissatisfaction and dissatisfaction other than hold-wait, the hold section analysis unit 23 identifies, among the plurality of utterance sections detected by the utterance detection unit 21, two adjacent utterance sections for which the time width between them is larger than a second predetermined time threshold. The second predetermined time threshold is a statistical value of the time from the end of the previous utterance in a hold section until an utterance expressing hold-wait dissatisfaction is made, and is held in advance by the hold section analysis unit 23 in an adjustable manner. Like the first predetermined time threshold, the second predetermined time threshold is set using about 1000 calls labeled with the start and end times of the operator's hold sections and of the customer's utterance sections, and with whether each customer utterance in a hold section expresses hold-wait dissatisfaction. For example, a histogram is created with, on the horizontal axis, the time difference from the end of one customer utterance in a hold section to the start of the next utterance, and, on the vertical axis, the frequency with which the former utterance is not a hold-wait dissatisfaction utterance while the next utterance is, and the frequency with which neither utterance is a hold-wait dissatisfaction utterance. The time difference at which these two frequency distributions are best separated is used as the second predetermined time threshold.
 When the hold section analysis unit 23 identifies such a pair of adjacent utterance sections, the customer's utterances in the hold section are identified as including both utterances expressing hold-wait dissatisfaction and utterances expressing dissatisfaction other than hold-wait. In this case, the later of the two identified adjacent utterance sections and all utterance sections after it express hold-wait dissatisfaction, while the earlier of the two and all utterance sections before it express dissatisfaction other than hold-wait. On the other hand, when the hold section analysis unit 23 determines that no such pair of adjacent utterance sections exists, the customer's utterances in the hold section are identified as expressing dissatisfaction other than hold-wait only.
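 The split described above can be sketched as follows, assuming each utterance section is a (start_time, end_time) pair in seconds, sorted by time. The source does not say which pair is used when several gaps exceed the second threshold; this sketch splits at the first such gap, which is an assumption. The function name is hypothetical.

```python
from typing import List, Tuple

Utterance = Tuple[float, float]  # (start_time, end_time) in seconds

def split_at_long_gap(
    utterances: List[Utterance],
    second_threshold: float,
) -> Tuple[List[Utterance], List[Utterance]]:
    """Split utterances at the first gap longer than `second_threshold`.

    Returns (other_than_hold_wait, hold_wait). When no gap exceeds the
    threshold, every utterance is placed in the first group and the
    second group is empty.
    """
    for i in range(len(utterances) - 1):
        # Time width between the end of one utterance and the next start.
        gap = utterances[i + 1][0] - utterances[i][1]
        if gap > second_threshold:
            return utterances[: i + 1], utterances[i + 1 :]
    return utterances, []
```

 This grouping only applies once the start time difference has already been found to be smaller than the first predetermined time threshold.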
 The target determination unit 24 determines whether the customer's voice data in the hold section is to be a dissatisfaction determination target, based on the result of the comparison, made by the hold section analysis unit 23, between the start time difference and the first predetermined time threshold. For example, the target determination unit 24 adjustably holds information indicating whether the dissatisfaction that the user of the call analysis server 10 wants evaluated is hold-wait dissatisfaction or dissatisfaction other than hold-wait, and determines the dissatisfaction determination target based on this information. Specifically, when the start time difference is greater than or equal to the first predetermined time threshold, the customer's utterances in the hold section of the call are highly likely to stem from hold-wait dissatisfaction only. In that case, if determination of dissatisfaction other than hold-wait is desired, the target determination unit 24 determines that the customer's voice data in that hold section is not a dissatisfaction determination target.
 The target determination unit 24 also determines whether the customer's voice data in the hold section is to be a dissatisfaction determination target based on both the result of comparing the start time difference with the first predetermined time threshold and the result of the hold section analysis unit 23 identifying the two adjacent utterance sections. Specifically, when the start time difference is smaller than the first predetermined time threshold and the hold section analysis unit 23 identifies no such pair of adjacent utterance sections (none exists), the customer's utterances in the hold section of the call are highly likely to stem from dissatisfaction other than hold-wait only. In this case, if determination of dissatisfaction other than hold-wait is desired, the target determination unit 24 determines the customer's voice data in that hold section to be a dissatisfaction determination target; if determination of hold-wait dissatisfaction is desired, it determines that the voice data is not a dissatisfaction determination target.
 Furthermore, when the start time difference is smaller than the first predetermined time threshold and the hold section analysis unit 23 has identified the two adjacent utterance sections, the target determination unit 24 determines, as the dissatisfaction determination target, either the voice data of the earlier of the two identified utterance sections and the utterance sections before it, or the voice data of the later of the two identified utterance sections and the utterance sections after it. In this situation, the customer's utterances in the hold section of the call are highly likely to include both utterances expressing dissatisfaction other than hold-wait and utterances expressing hold-wait dissatisfaction. Accordingly, if determination of dissatisfaction other than hold-wait is desired, the target determination unit 24 determines only the voice data of the earlier utterance section and the utterance sections before it to be the dissatisfaction determination target. If determination of hold-wait dissatisfaction is desired, the target determination unit 24 determines only the voice data of the later utterance section and the utterance sections after it to be the dissatisfaction determination target.
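 The decision rules of the target determination unit 24 can be summarized in a short sketch. The function and parameter names are hypothetical, utterance sections are assumed to be (start_time, end_time) pairs sorted by time, and the split uses the first gap that exceeds the second threshold. The source states the result for a start time difference at or above the first threshold only when dissatisfaction other than hold-wait is requested; treating the whole hold section as the target when hold-wait dissatisfaction is requested is an assumption made by symmetry.

```python
from typing import List, Tuple

Utterance = Tuple[float, float]  # (start_time, end_time) in seconds

def select_target_utterances(
    hold_start: float,
    utterances: List[Utterance],
    first_threshold: float,
    second_threshold: float,
    want_hold_wait: bool,
) -> List[Utterance]:
    """Return the utterance sections to use as the determination target."""
    if not utterances:
        return []
    start_diff = utterances[0][0] - hold_start  # start time difference
    if start_diff >= first_threshold:
        # Likely hold-wait dissatisfaction only (assumed symmetric case).
        return list(utterances) if want_hold_wait else []
    # Look for adjacent sections separated by more than the second threshold.
    split = None
    for i in range(len(utterances) - 1):
        if utterances[i + 1][0] - utterances[i][1] > second_threshold:
            split = i + 1
            break
    if split is None:
        # Likely dissatisfaction other than hold-wait only.
        return [] if want_hold_wait else list(utterances)
    # Both kinds present: later group is hold-wait, earlier group is other.
    return list(utterances[split:]) if want_hold_wait else list(utterances[:split])
```

 An empty result corresponds to the case in which the hold section is determined not to be a dissatisfaction determination target.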
 The dissatisfaction level determination unit 25 determines the dissatisfaction level of the call by performing predetermined voice analysis processing on the customer's voice data in the hold section that the target determination unit 24 has determined to be the dissatisfaction determination target. As the determination result, the dissatisfaction level determination unit 25 generates output data including identification information of the analyzed call and information indicating the dissatisfaction level, and outputs the determination result to a display unit or another output device via the input/output I/F 13. The present embodiment does not limit the specific form in which the determination result is output.
 〔Operation Example〕
 The dissatisfied call determination method in the first embodiment is described below with reference to FIG. 4. FIG. 4 is a flowchart showing an operation example of the call analysis server 10 in the first embodiment. FIG. 4 shows an example in which information indicating that the dissatisfaction the user of the call analysis server 10 wants evaluated is dissatisfaction other than hold-wait has been set in the call analysis server 10 (target determination unit 24). In this example, the call analysis server 10 determines, for the target call, the presence or absence, or the degree, of dissatisfaction other than hold-wait dissatisfaction.
 The call analysis server 10 acquires call data (S41). In the first embodiment, the call analysis server 10 acquires the call data to be analyzed from among the plurality of pieces of call data stored in the file server 9. The acquired call data includes a pair consisting of the customer's voice data and the operator's voice data, and the hold section data. The customer's voice data also includes speech uttered by the customer while the call was on hold.
 The call analysis server 10 obtains the customer's voice data and the hold section data from the call data acquired in step (S41), and detects the customer's utterance sections in the hold section from the customer's voice data in the hold section indicated by the hold section data (S42). The call analysis server 10 obtains the start time and end time of each utterance section.
The call analysis server 10 calculates the time difference (start time difference) between the start time of the first of the utterance sections detected in step (S42) and the start time of the hold section (S43).
Subsequently, the call analysis server 10 compares the start time difference calculated in step (S43) with the first predetermined time threshold (S44). When the start time difference is equal to or greater than the first predetermined time threshold (S44; NO), the call analysis server 10 determines that the customer voice data of the hold section in the call is not a target of dissatisfaction level determination (S45), and ends the processing. The determination (S44; NO) is equivalent to determining that the customer's utterances in the hold section of the call express only hold-wait dissatisfaction. In this case, the call analysis server 10 may output information indicating that the customer did not express dissatisfaction in the call.
On the other hand, when the start time difference is smaller than the first predetermined time threshold (S44; YES), the call analysis server 10 identifies, from the utterance sections detected in step (S42), two adjacent utterance sections separated by a time width greater than the second predetermined time threshold (S46). When no such pair of adjacent utterance sections is identified in step (S46) (S47; NO), the call analysis server 10 determines the customer voice data of the hold section in the call as the target of dissatisfaction level determination (S48). The determination (S47; NO) is equivalent to determining that the customer's utterances in the hold section of the call express only dissatisfaction other than hold-wait dissatisfaction.
When such a pair of adjacent utterance sections is identified in step (S46) (S47; YES), the call analysis server 10 determines only the voice data of the utterance sections up to and including the earlier of the two identified adjacent utterance sections as the target of dissatisfaction level determination (S49). Here, the determination (S47; YES) is equivalent to determining that the customer's utterances in the hold section of the call include both utterances expressing dissatisfaction other than hold-wait dissatisfaction and utterances expressing hold-wait dissatisfaction.
The call analysis server 10 determines the dissatisfaction level of the call under analysis using the voice data determined as the target of dissatisfaction level determination (S50). In this case, the call analysis server 10 may output information indicating the determined dissatisfaction level.
FIG. 4 illustrates the case where the dissatisfaction the user wants determined is dissatisfaction other than hold-wait dissatisfaction, but the dissatisfied-call determination method of the first embodiment is naturally also applicable when the dissatisfaction the user wants determined is hold-wait dissatisfaction. In that case, in step (S45), the customer voice data of the hold section in the call is determined as the target of dissatisfaction level determination. In step (S48), the customer voice data of the hold section in the call is determined not to be a target of dissatisfaction level determination, and in step (S49), only the voice data of the utterance sections from the later of the two identified adjacent utterance sections onward is determined as the target of dissatisfaction level determination.
Furthermore, the dissatisfied-call determination method described above may further include a step of determining whether the dissatisfaction the user wants determined is dissatisfaction other than hold-wait dissatisfaction or is hold-wait dissatisfaction, and the processing of steps (S45), (S48), and (S49) may be switched according to the result. In addition, the dissatisfaction level for dissatisfaction other than hold-wait dissatisfaction and the dissatisfaction level for hold-wait dissatisfaction may each be determined separately.
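The target selection of steps (S43) to (S49), including the switch between the two kinds of dissatisfaction, can be sketched as follows. This is only an illustrative sketch, not the implementation of the call analysis server 10: the function name, the `target` parameter values, the representation of an utterance section as a `(start, end)` pair in seconds, the choice of the first qualifying gap when several exist, and the handling of a hold section with no utterances are all assumptions that do not appear in the specification.

```python
from typing import List, Tuple

# An utterance section as (start_time, end_time) in seconds (assumed representation).
Section = Tuple[float, float]


def select_target_sections(
    hold_start: float,
    utterances: List[Section],
    first_threshold: float,
    second_threshold: float,
    target: str = "other",
) -> List[Section]:
    """Return the utterance sections to use for dissatisfaction level determination.

    target="other":     dissatisfaction other than hold-wait (the FIG. 4 case).
    target="hold_wait": hold-wait dissatisfaction.
    """
    if target not in ("other", "hold_wait"):
        raise ValueError("target must be 'other' or 'hold_wait'")
    if not utterances:
        # Assumption: with no customer utterance in the hold section, nothing is selected.
        return []
    utterances = sorted(utterances)

    # S43: difference between the first utterance start and the hold start.
    start_diff = utterances[0][0] - hold_start

    # S44 NO -> S45: the utterances express hold-wait dissatisfaction only.
    if start_diff >= first_threshold:
        return utterances if target == "hold_wait" else []

    # S46: look for adjacent sections separated by more than the second threshold.
    # Assumption: the first such pair is used when there are several.
    for i in range(len(utterances) - 1):
        gap = utterances[i + 1][0] - utterances[i][1]
        if gap > second_threshold:
            # S47 YES -> S49: split the utterances at the gap.
            if target == "hold_wait":
                return utterances[i + 1:]
            return utterances[: i + 1]

    # S47 NO -> S48: the utterances express only other dissatisfaction.
    return utterances if target == "other" else []
```

For example, with a hold section starting at 0 s, thresholds of 10 s and 15 s, and utterances at 2 to 4 s, 5 to 6 s, and 30 to 33 s, the sketch selects the first two sections for `target="other"` and the last section for `target="hold_wait"`.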
[Operation and Effect of First Embodiment]
As described above, in the first embodiment, the time difference (start time difference) between the start time of the first of the customer's utterance sections in the hold section and the start time of the hold section is calculated, and this start time difference is compared with the first predetermined time threshold. This comparison is based on the human (customer) tendency to utter hold-wait dissatisfaction only after a certain amount of time has elapsed since the start of the hold section. When the comparison shows that the start time difference is equal to or greater than the first predetermined time threshold, the customer's utterances in the hold section can be identified as expressing only hold-wait dissatisfaction.
Furthermore, in the first embodiment, when the start time difference is smaller than the first predetermined time threshold, two adjacent utterance sections separated by a time width greater than the second predetermined time threshold are identified. This identification is based on the human (customer) tendency to utter hold-wait dissatisfaction only after a certain amount of time has elapsed since the end of the preceding utterance in the hold section. When no such pair can be identified, the customer's utterances in the hold section can be identified as expressing only dissatisfaction other than hold-wait dissatisfaction. When such a pair can be identified, the customer's utterances in the hold section can be identified as including both utterances expressing dissatisfaction other than hold-wait dissatisfaction and utterances expressing hold-wait dissatisfaction. Moreover, the two identified adjacent utterance sections make it possible to separate the utterances in the hold section into those expressing dissatisfaction other than hold-wait dissatisfaction and those expressing hold-wait dissatisfaction.
Thus, according to the first embodiment, information processing (comparison processing and identification processing) that takes into account the characteristics of customer utterances during a call hold makes it possible to classify the customer's utterances in the hold section by two kinds of dissatisfaction cause (hold-wait dissatisfaction and dissatisfaction other than hold-wait dissatisfaction). Consequently, according to the first embodiment, the dissatisfaction level of the call can be determined using only the voice data of the utterances corresponding to the desired one of the two classified causes.
Therefore, according to the first embodiment, the dissatisfaction level of a call can be determined with high accuracy for the desired dissatisfaction cause alone. Furthermore, according to the first embodiment, the predetermined voice analysis processing is applied only to the voice data of the utterances corresponding to the desired dissatisfaction cause, so processing efficiency and speed can be improved compared with a form in which the entire call data or the entire hold section is the determination target.
[Second Embodiment]
The second embodiment generates information indicating the cause of the dissatisfaction whose level is determined as in the first embodiment. Hereinafter, the contact center system 1 in the second embodiment is described, focusing on what differs from the first embodiment. In the following description, content that is the same as in the first embodiment is omitted as appropriate.
[Processing configuration]
FIG. 5 is a diagram conceptually illustrating a processing configuration example of the call analysis server 10 in the second embodiment. The call analysis server 10 in the second embodiment further includes a cause estimation unit 27 in addition to the configuration of the first embodiment. Like the other processing units, the cause estimation unit 27 is realized, for example, by the CPU 11 executing a program stored in the memory 12.
The cause estimation unit 27 generates cause information indicating the dissatisfaction cause corresponding to the customer's utterances in the hold section, based on the result of the comparison between the start time difference and the first predetermined time threshold by the hold section analysis unit 23 and on the result of the identification of the two adjacent utterance sections by the hold section analysis unit 23. The cause estimation unit 27 may generate output data including the cause information and output it to a display unit or another output device via the input/output I/F 13. However, the specific form in which the cause information is output is not limited.
The dissatisfaction cause corresponding to the customer's utterances in the hold section, as indicated by the cause information, may fall under, for example, the following three cases. In the first case, only the length of the hold waiting time is the dissatisfaction cause. In the second case, only something other than the length of the hold waiting time is the dissatisfaction cause. In the third case, both the cause of the first case and the cause of the second case are dissatisfaction causes.
The cause estimation unit 27 generates the cause information of the first case when the start time difference is equal to or greater than the first predetermined time threshold. The cause estimation unit 27 generates the cause information of the second case when the start time difference is smaller than the first predetermined time threshold and there are no two adjacent utterance sections separated by a time width greater than the second predetermined time threshold. The cause estimation unit 27 generates the cause information of the third case when the start time difference is smaller than the first predetermined time threshold and there are two adjacent utterance sections separated by a time width greater than the second predetermined time threshold.
[Operation example]
Hereinafter, the dissatisfied-call determination method in the second embodiment is described. In addition to the operation of the first embodiment (see FIG. 4), the call analysis server 10 further generates the cause information described above. The call analysis server 10 may generate the cause information after determining the dissatisfaction level as in the first embodiment, or may generate it before the dissatisfaction level is determined.
In the example of FIG. 4, in the case of (S44; NO), the call analysis server 10 generates cause information indicating that only the length of the hold waiting time is the dissatisfaction cause. In the case of (S44; YES) and (S47; NO), the call analysis server 10 generates cause information indicating that only something other than the length of the hold waiting time is the dissatisfaction cause. In the case of (S44; YES) and (S47; YES), the call analysis server 10 generates cause information indicating that both the length of the hold waiting time and something other than it are dissatisfaction causes.
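The three-way mapping from the two comparison results to the cause information can be sketched as below. This is a minimal sketch under stated assumptions: the `Cause` enumeration, the function name, and the idea of passing the gaps between adjacent utterance sections as a list of seconds are hypothetical and are not taken from the specification.

```python
from enum import Enum
from typing import Sequence


class Cause(Enum):
    """Hypothetical labels for the three cases of cause information."""
    HOLD_WAIT_ONLY = "hold_wait_only"   # first case
    OTHER_ONLY = "other_only"           # second case
    BOTH = "both"                       # third case


def estimate_cause(
    start_diff: float,
    gaps: Sequence[float],
    first_threshold: float,
    second_threshold: float,
) -> Cause:
    """Map the S44 and S47 outcomes to one of the three cause cases.

    start_diff: first utterance start minus hold start, in seconds.
    gaps: time widths between adjacent utterance sections, in seconds.
    """
    # (S44; NO): the first utterance came late, so hold wait is the only cause.
    if start_diff >= first_threshold:
        return Cause.HOLD_WAIT_ONLY
    # (S44; YES) and (S47; YES): a long silence separates two groups of utterances.
    if any(gap > second_threshold for gap in gaps):
        return Cause.BOTH
    # (S44; YES) and (S47; NO): no long silence between utterances.
    return Cause.OTHER_ONLY
```

Because the function depends only on the two comparison results, it can run either before or after the dissatisfaction level is determined, which matches the ordering freedom described above.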
[Operation and Effect of Second Embodiment]
As described above, in the second embodiment, cause information indicating the dissatisfaction cause corresponding to the customer's utterances in the hold section is generated together with the determination of the dissatisfaction level. This estimation of the dissatisfaction cause uses the processing results of the hold section analysis unit 23 and the target determination unit 24 of the first embodiment. Therefore, according to the second embodiment, the dissatisfaction cause can be estimated with high accuracy by information processing (comparison processing and identification processing) that takes into account the characteristics of customer utterances during a call hold.
[Modification]
In the first and second embodiments described above, the call data includes a pair of customer voice data and operator voice data, as well as hold section data. The call data may additionally include speech text data of the customer and the operator and time information for each speech text. Here, speech text data is data in which speech uttered by the customer or the operator has been converted to text, and the time information of each speech text is information on the time at which the utterance represented by that speech text was uttered. The speech text data is generated, for example, by applying speech recognition processing to the customer's and the operator's speech at each operator terminal 7 or at the file server 9.
In this case, the utterance detection unit 21 of the call analysis server 10 is unnecessary. The hold section analysis unit 23 can perform its processing using the time information of each speech text and the hold section data. The dissatisfaction level determination unit 25 may also determine the dissatisfaction level using the speech text data. In this way, the customer's utterance sections in the hold section, which are detected by the utterance detection unit 21 in the embodiments, may instead be generated by a device other than the call analysis server 10.
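How the hold section analysis unit 23 might obtain the customer's utterance sections from timed speech text, without an utterance detection unit, can be sketched as follows. The `TimedText` record, its field names, the `"customer"` speaker label, and the decision to clip a speech text that straddles a hold boundary are all assumptions made for illustration; the specification does not define the format of the time information.

```python
from typing import List, NamedTuple, Tuple


class TimedText(NamedTuple):
    """Hypothetical record for one speech text with its time information."""
    speaker: str   # "customer" or "operator" (assumed labels)
    start: float   # utterance start time in seconds
    end: float     # utterance end time in seconds
    text: str      # recognized text


def customer_sections_in_hold(
    segments: List[TimedText],
    hold_start: float,
    hold_end: float,
) -> List[Tuple[float, float]]:
    """Return the customer's utterance sections that fall within the hold section.

    Assumption: a speech text that only partly overlaps the hold section
    is clipped to the hold section boundaries.
    """
    sections = []
    for seg in segments:
        if seg.speaker != "customer":
            continue
        start = max(seg.start, hold_start)
        end = min(seg.end, hold_end)
        if start < end:  # keep only segments that overlap the hold section
            sections.append((start, end))
    return sorted(sections)
```

The resulting list of `(start, end)` pairs carries the same information as the utterance sections detected in step (S42), so the comparison and identification processing can be applied to it unchanged.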
In each of the embodiments described above, the hold section analysis unit 23 compares the start time difference with the first predetermined time threshold. However, the hold section analysis unit 23 may instead compare another value obtained from the start time difference (which may be called a representative value) with the first predetermined time threshold. For example, the representative value may be a value output by a predetermined function that takes the start time difference as input. In this case, the predetermined function may be, for example, a function that takes the start time difference as input and outputs a value indicating the confidence that the dissatisfaction corresponding to the utterances in the hold section is hold-wait dissatisfaction only.
Similarly, in each of the embodiments described above, the hold section analysis unit 23 identifies two adjacent utterance sections separated by a time width greater than the second predetermined time threshold. However, the hold section analysis unit 23 may instead identify two adjacent utterance sections for which another value obtained from the time width between them (which may be called a representative value) is greater than the second predetermined time threshold. In this case, the representative value may be a value output by a predetermined function that takes the time width as input. The predetermined function may be, for example, a function that takes the time width as input and outputs a value indicating the confidence that the dissatisfaction corresponding to the utterances in the hold section is due only to something other than the length of the hold waiting time.
In such a modification, the cause estimation unit 27 described above may include in the cause information the representative value obtained from the start time difference, as a value indicating the confidence that the dissatisfaction cause corresponding to the customer's utterances in the hold section is the length of the hold waiting time only. Similarly, the cause estimation unit 27 may include in the cause information the representative value obtained from the time width, as a value indicating the confidence that the dissatisfaction cause corresponding to the customer's utterances in the hold section is only something other than the length of the hold waiting time.
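One possible shape for the predetermined function that turns the start time difference into a representative value is sketched below. The specification does not name a concrete function; the logistic form, the function name, and the `midpoint` and `scale` parameters with their default values are purely illustrative assumptions.

```python
import math


def hold_wait_only_confidence(
    start_diff: float,
    midpoint: float = 10.0,
    scale: float = 2.0,
) -> float:
    """Map a start time difference (seconds) to a confidence in (0, 1).

    Assumed logistic curve: the confidence that the utterances express
    hold-wait dissatisfaction only rises as the first utterance comes later.
    midpoint is the start time difference that yields a confidence of 0.5,
    and scale controls how quickly the confidence changes around it.
    """
    return 1.0 / (1.0 + math.exp(-(start_diff - midpoint) / scale))
```

With a function of this kind, the value compared against the threshold is a confidence rather than a raw time, and the same value can be placed in the cause information as described above.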
Furthermore, in the first and second embodiments described above, the hold section analysis unit 23 identifies two adjacent utterance sections separated by a time width greater than the second predetermined time threshold, but the hold section analysis unit 23 may omit this identification processing. That is, the hold section analysis unit 23 may perform only the comparison between the start time difference and the first predetermined time threshold. In this case, the target determination unit 24 may decide, based on the comparison result, whether the customer voice data of the hold section is to be a target of dissatisfaction level determination. In this form, when utterances expressing hold-wait dissatisfaction and utterances expressing dissatisfaction other than hold-wait dissatisfaction are mixed in the hold section, the dissatisfaction level is determined from the utterances expressing both causes. Such a form is effective, for example, when there is no need to restrict the dissatisfaction cause. Also in this case, the cause estimation unit 27 may generate cause information indicating whether the customer's utterances during the hold are attributable solely to the length of the hold waiting time.
[Other Embodiments]
Each of the embodiments described above handles call data, but the dissatisfied conversation determination device and the dissatisfied conversation determination method described above may also be applied to devices and systems that handle conversation data other than calls. In this case, for example, a recording device that records the conversation to be analyzed is installed at the place where the conversation takes place (a conference room, a bank counter, a store checkout, and so on). When the conversation data is recorded with the voices of multiple conversation participants mixed together, the mixed recording is separated into voice data for each conversation participant by predetermined voice processing.
In the first and second embodiments described above, the hold section data is generated by, for example, detecting the operator's hold operation and hold release operation or detecting the hold tone included in the operator's voice data. In a form that handles conversation data, the hold section data may be generated automatically by, for example, detecting the movement of the conversation participant who puts the conversation on hold, or detecting the fixed expressions uttered when putting the conversation on hold and when releasing the hold (for example, "Excuse me for a moment"). Of course, the hold section data may also be entered by a user operation.
Although the flowchart used in the above description lists a plurality of steps (processes) in order, the order in which the steps are executed in the embodiments is not limited to the order described. In the embodiments, the order of the illustrated steps can be changed to the extent that the change does not interfere with the content. In addition, the embodiments and modifications described above can be combined to the extent that their contents do not conflict.
Part or all of the embodiments and modifications described above may also be specified as in the following supplementary notes. However, the embodiments and modifications are not limited to the following descriptions.
(Supplementary Note 1)
A dissatisfied conversation determination device comprising:
a data acquisition unit that acquires hold section data indicating a start time and an end time of a hold section of a conversation, and voice data of the conversation; and
a dissatisfaction level determination unit that determines a dissatisfaction level of the conversation by performing predetermined voice analysis processing on voice data of a held-side conversation participant in the hold section indicated by the hold section data, the voice data of the held-side conversation participant being included in the voice data acquired by the data acquisition unit.
(Supplementary Note 2)
The dissatisfied conversation determination device according to Supplementary Note 1, further comprising:
an utterance detection unit that detects a first utterance section of the held-side conversation participant in the hold section from the voice data of the held-side conversation participant in the hold section indicated by the hold section data, the voice data of the held-side conversation participant being included in the voice data acquired by the data acquisition unit;
a hold section analysis unit that calculates a time difference between a start time of the first utterance section detected by the utterance detection unit and the start time of the hold section, and compares the time difference or a representative value corresponding to the time difference with a first predetermined time threshold; and
a target determination unit that determines, based on a result of the comparison by the hold section analysis unit, whether the voice data of the held-side conversation participant in the hold section is to be a determination target of the dissatisfaction level determination unit.
(Supplementary Note 3)
The dissatisfied conversation determination device according to Supplementary Note 2, further comprising:
a cause estimation unit that generates, based on the result of the comparison by the hold section analysis unit, cause information indicating whether a dissatisfaction cause corresponding to the utterances of the held-side conversation participant in the hold section is due only to the length of the hold waiting time.
(Supplementary Note 4)
The dissatisfied conversation determination device according to Supplementary Note 2, further comprising:
a cause estimation unit that generates, based on the time difference calculated by the hold section analysis unit, cause information indicating a confidence that a dissatisfaction cause corresponding to the utterances of the held-side conversation participant in the hold section is due only to the length of the hold waiting time.
(Supplementary Note 5)
The dissatisfied conversation determination device according to any one of Supplementary Notes 2 to 4, wherein
the utterance detection unit detects a plurality of utterance sections of the held-side conversation participant in the hold section from the voice data of the held-side conversation participant in the hold section,
the hold section analysis unit identifies, from the plurality of utterance sections, two adjacent utterance sections for which the time width between them, or a representative value corresponding to the time width, is greater than a second predetermined time threshold, and
the target determination unit determines whether the voice data of the held-side conversation participant in the hold section is to be a determination target of the dissatisfaction level determination unit, based on the result of the comparison, by the hold section analysis unit, of the time difference or the representative value with the first predetermined time threshold and on a result of the identification.
(Supplementary Note 6)
The dissatisfied conversation determination device according to Supplementary Note 5, wherein
the target determination unit sets, as the determination target of the dissatisfaction level determination unit, either the voice data of the utterance sections up to and including the earlier of the two adjacent utterance sections identified by the hold section analysis unit, or the voice data of the utterance sections from the later of the two identified adjacent utterance sections onward.
(Supplementary Note 7)
The dissatisfied conversation determination device according to Supplementary Note 5 or 6, wherein
the cause estimation unit generates, based on the result of the identification of the two adjacent utterance sections by the hold section analysis unit, cause information indicating whether a dissatisfaction cause corresponding to the utterances of the held-side conversation participant in the hold section is due only to something other than the length of the hold waiting time, or is due both to the length of the hold waiting time and to something other than the length of the hold waiting time.
(Supplementary Note 8)
The dissatisfied conversation determination device according to Supplementary Note 5 or 6, further comprising:
a cause estimation unit that generates, based on the time width between the two adjacent utterance sections calculated by the hold section analysis unit, cause information indicating a confidence that a dissatisfaction cause corresponding to the utterances of the held-side conversation participant in the hold section is due only to something other than the length of the hold waiting time.
(Appendix 9)
A dissatisfied conversation determination method executed by at least one computer, the method comprising:
acquiring hold section data indicating a start time and an end time of a hold section of a conversation, and voice data of the conversation; and
determining a dissatisfaction level of the conversation by performing predetermined voice analysis processing on the voice data, included in the acquired voice data, of a held-side conversation participant in the hold section indicated by the hold section data.
(Appendix 10)
The dissatisfied conversation determination method according to Appendix 9, further comprising:
detecting, from the voice data, included in the acquired voice data, of the held-side conversation participant in the hold section indicated by the hold section data, the first utterance section of the held-side conversation participant in the hold section;
calculating a time difference between the start time of the detected first utterance section and the start time of the hold section;
comparing the time difference, or a representative value corresponding to the time difference, with a first predetermined time threshold; and
determining, based on the result of the comparison, whether or not the voice data of the held-side conversation participant in the hold section is to be a target of the dissatisfaction level determination.
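The steps of Appendix 10 can be sketched in Python as follows. This is a minimal illustration, not the applicant's implementation: the `Interval` type, the function names, and the example threshold are hypothetical, and utterance sections are assumed to have been detected already (for example by a voice activity detector). The note does not state which comparison outcome makes the voice data a target, so the policy used here (analyze only when the delay exceeds the threshold) is an assumption.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Interval:
    """A time span in seconds from the beginning of the call (hypothetical type)."""
    start: float
    end: float


def first_utterance_delay(hold: Interval, utterances: Sequence[Interval]) -> Optional[float]:
    """Return the time from the hold start to the first held-side utterance
    that begins inside the hold section, or None if there is no such utterance."""
    starts = [u.start for u in utterances if hold.start <= u.start < hold.end]
    if not starts:
        return None
    return min(starts) - hold.start


def is_determination_target(delay: Optional[float], first_threshold: float) -> bool:
    """Compare the delay with the first time threshold.

    Assumed policy (not specified by the note): the hold-section voice data
    is analyzed only when an utterance exists and its delay exceeds the threshold.
    """
    return delay is not None and delay > first_threshold


hold = Interval(start=100.0, end=160.0)
utterances = [Interval(95.0, 99.0), Interval(130.0, 133.0), Interval(150.0, 152.0)]
delay = first_utterance_delay(hold, utterances)
print(delay, is_determination_target(delay, first_threshold=20.0))
```

Keeping the delay calculation separate from the decision makes it easy to swap in the opposite policy, or a representative value such as a rounded delay, without touching the detection step.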
(Appendix 11)
The dissatisfied conversation determination method according to Appendix 10, further comprising:
generating, based on the result of the comparison, cause information indicating whether or not the cause of dissatisfaction corresponding to the utterances of the held-side conversation participant in the hold section is solely the length of the hold waiting time.
(Appendix 12)
The dissatisfied conversation determination method according to Appendix 10, further comprising:
generating, based on the calculated time difference, cause information indicating the degree of certainty that the cause of dissatisfaction corresponding to the utterances of the held-side conversation participant in the hold section is solely the length of the hold waiting time.
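Appendix 12 only requires that the degree of certainty be derived from the time difference; it does not give a formula. The sketch below uses a clamped linear ramp purely as an illustration. The function name, the `saturation_s` parameter, and the assumption that a longer delay before the first utterance makes a wait-time cause more likely are all hypothetical.

```python
def wait_time_cause_confidence(delay_s: float, saturation_s: float = 60.0) -> float:
    """Map the hold-start-to-first-utterance delay to a value in [0, 1].

    Assumption (not from the source): the certainty grows linearly with the
    delay and saturates at 1.0 once the delay reaches `saturation_s` seconds.
    """
    if saturation_s <= 0:
        raise ValueError("saturation_s must be positive")
    return max(0.0, min(1.0, delay_s / saturation_s))


for delay in (0.0, 30.0, 120.0):
    print(delay, wait_time_cause_confidence(delay))
```

Any monotonic mapping (for example a logistic curve fitted to labeled calls) could replace the ramp; the claim language leaves that choice open.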
(Appendix 13)
The dissatisfied conversation determination method according to any one of Appendices 10 to 12, further comprising:
detecting, from the voice data of the held-side conversation participant in the hold section, a plurality of utterance sections of the held-side conversation participant in the hold section; and
identifying, from among the plurality of utterance sections, two adjacent utterance sections for which the time width between them, or a representative value corresponding to the time width, is greater than a second predetermined time threshold,
wherein the determining of the target of the dissatisfaction level determination determines, based on the result of the comparison between the time difference or the representative value and the first predetermined time threshold, and on the result of the identification, whether or not the voice data of the held-side conversation participant in the hold section is to be a target of the dissatisfaction level determination.
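The identification step of Appendix 13 amounts to scanning consecutive utterance sections for a silence longer than the second threshold. A minimal sketch follows; the tuple representation, the function name, and the example values are assumptions made for illustration, and the gap is measured here from the end of one utterance to the start of the next.

```python
from typing import List, Sequence, Tuple

Utterance = Tuple[float, float]  # (start, end) in seconds; hypothetical representation


def find_long_gaps(
    utterances: Sequence[Utterance], second_threshold: float
) -> List[Tuple[Utterance, Utterance]]:
    """Return every pair of adjacent utterances whose separating gap
    is greater than `second_threshold` seconds."""
    ordered = sorted(utterances)  # order by start time
    pairs = []
    for earlier, later in zip(ordered, ordered[1:]):
        gap = later[0] - earlier[1]  # silence between the two utterances
        if gap > second_threshold:
            pairs.append((earlier, later))
    return pairs


held_side = [(5.0, 7.0), (8.0, 10.0), (40.0, 43.0)]
print(find_long_gaps(held_side, second_threshold=15.0))
```

In this example only the 30-second silence between the second and third utterances exceeds the 15-second threshold, so a single pair is returned.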
(Appendix 14)
The dissatisfied conversation determination method according to Appendix 13, wherein the determining of the target of the dissatisfaction level determination sets, as the target of the dissatisfaction level determination, either the voice data of the utterance sections up to and including the earlier of the two identified adjacent utterance sections, or the voice data of the utterance sections from the later of the two identified adjacent utterance sections onward.
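Once a long gap has been identified, Appendix 14 restricts the analysis to the utterances on one side of it. The sketch below shows that selection under stated assumptions: the tuple representation, the function name, and the `side` parameter are hypothetical, and the note itself does not say which side should be chosen in a given situation.

```python
from typing import List, Sequence, Tuple

Utterance = Tuple[float, float]  # (start, end) in seconds; hypothetical representation


def select_target(
    utterances: Sequence[Utterance],
    earlier: Utterance,
    later: Utterance,
    side: str,
) -> List[Utterance]:
    """Return the utterances up to and including `earlier` (side="before")
    or from `later` onward (side="after")."""
    ordered = sorted(utterances)
    if side == "before":
        return [u for u in ordered if u[0] <= earlier[0]]
    if side == "after":
        return [u for u in ordered if u[0] >= later[0]]
    raise ValueError("side must be 'before' or 'after'")


held_side = [(5.0, 7.0), (8.0, 10.0), (40.0, 43.0), (45.0, 46.0)]
gap_pair = ((8.0, 10.0), (40.0, 43.0))
print(select_target(held_side, *gap_pair, side="before"))
print(select_target(held_side, *gap_pair, side="after"))
```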
(Appendix 15)
The dissatisfied conversation determination method according to Appendix 13 or 14, wherein the generating of the cause information generates, based on the result of the identification of the two adjacent utterance sections, cause information indicating whether the cause of dissatisfaction corresponding to the utterances of the held-side conversation participant in the hold section is solely something other than the length of the hold waiting time, or is both the length of the hold waiting time and something other than the length of the hold waiting time.
(Appendix 16)
The dissatisfied conversation determination method according to Appendix 13 or 14, further comprising:
generating, based on the time width between the two adjacent utterance sections, cause information indicating the degree of certainty that the cause of dissatisfaction corresponding to the utterances of the held-side conversation participant in the hold section is solely something other than the length of the hold waiting time.
(Appendix 17)
A program that causes at least one computer to execute the dissatisfied conversation determination method according to any one of Appendices 9 to 16.
(Appendix 18)
A computer-readable recording medium on which the program according to Appendix 17 is recorded.
This application claims priority based on Japanese Patent Application No. 2012-240759, filed on October 31, 2012, the entire disclosure of which is incorporated herein.

Claims (15)

  1.  A dissatisfied conversation determination device comprising:
     a data acquisition unit that acquires hold section data indicating a start time and an end time of a hold section of a conversation, and voice data of the conversation; and
     a dissatisfaction level determination unit that determines a dissatisfaction level of the conversation by performing predetermined voice analysis processing on the voice data, included in the voice data acquired by the data acquisition unit, of a held-side conversation participant in the hold section indicated by the hold section data.
  2.  The dissatisfied conversation determination device according to claim 1, further comprising:
     an utterance detection unit that detects, from the voice data, included in the voice data acquired by the data acquisition unit, of the held-side conversation participant in the hold section indicated by the hold section data, the first utterance section of the held-side conversation participant in the hold section;
     a hold section analysis unit that calculates a time difference between the start time of the first utterance section detected by the utterance detection unit and the start time of the hold section, and compares the time difference, or a representative value corresponding to the time difference, with a first predetermined time threshold; and
     a target determination unit that determines, based on the result of the comparison by the hold section analysis unit, whether or not the voice data of the held-side conversation participant in the hold section is to be a determination target of the dissatisfaction level determination unit.
  3.  The dissatisfied conversation determination device according to claim 2, further comprising:
     a cause estimation unit that generates, based on the result of the comparison by the hold section analysis unit, cause information indicating whether or not the cause of dissatisfaction corresponding to the utterances of the held-side conversation participant in the hold section is solely the length of the hold waiting time.
  4.  The dissatisfied conversation determination device according to claim 2, further comprising:
     a cause estimation unit that generates, based on the time difference calculated by the hold section analysis unit, cause information indicating the degree of certainty that the cause of dissatisfaction corresponding to the utterances of the held-side conversation participant in the hold section is solely the length of the hold waiting time.
  5.  The dissatisfied conversation determination device according to any one of claims 2 to 4, wherein
     the utterance detection unit detects, from the voice data of the held-side conversation participant in the hold section, a plurality of utterance sections of the held-side conversation participant in the hold section,
     the hold section analysis unit identifies, from among the plurality of utterance sections, two adjacent utterance sections for which the time width between them, or a representative value corresponding to the time width, is greater than a second predetermined time threshold, and
     the target determination unit determines, based on the result of the comparison by the hold section analysis unit between the time difference or the representative value and the first predetermined time threshold, and on the result of the identification, whether or not the voice data of the held-side conversation participant in the hold section is to be a determination target of the dissatisfaction level determination unit.
  6.  The dissatisfied conversation determination device according to claim 5, wherein the target determination unit sets, as the determination target of the dissatisfaction level determination unit, either the voice data of the utterance sections up to and including the earlier of the two adjacent utterance sections identified by the hold section analysis unit, or the voice data of the utterance sections from the later of the two identified adjacent utterance sections onward.
  7.  The dissatisfied conversation determination device according to claim 5 or 6, wherein the cause estimation unit generates, based on the result of the identification of the two adjacent utterance sections by the hold section analysis unit, cause information indicating whether the cause of dissatisfaction corresponding to the utterances of the held-side conversation participant in the hold section is solely something other than the length of the hold waiting time, or is both the length of the hold waiting time and something other than the length of the hold waiting time.
  8.  The dissatisfied conversation determination device according to claim 5 or 6, further comprising:
     a cause estimation unit that generates, based on the time width between the two adjacent utterance sections calculated by the hold section analysis unit, cause information indicating the degree of certainty that the cause of dissatisfaction corresponding to the utterances of the held-side conversation participant in the hold section is solely something other than the length of the hold waiting time.
  9.  A dissatisfied conversation determination method executed by at least one computer, the method comprising:
     acquiring hold section data indicating a start time and an end time of a hold section of a conversation, and voice data of the conversation; and
     determining a dissatisfaction level of the conversation by performing predetermined voice analysis processing on the voice data, included in the acquired voice data, of a held-side conversation participant in the hold section indicated by the hold section data.
  10.  The dissatisfied conversation determination method according to claim 9, further comprising:
     detecting, from the voice data, included in the acquired voice data, of the held-side conversation participant in the hold section indicated by the hold section data, the first utterance section of the held-side conversation participant in the hold section;
     calculating a time difference between the start time of the detected first utterance section and the start time of the hold section;
     comparing the time difference, or a representative value corresponding to the time difference, with a first predetermined time threshold; and
     determining, based on the result of the comparison, whether or not the voice data of the held-side conversation participant in the hold section is to be a target of the dissatisfaction level determination.
  11.  The dissatisfied conversation determination method according to claim 10, further comprising:
     generating, based on the result of the comparison, cause information indicating whether or not the cause of dissatisfaction corresponding to the utterances of the held-side conversation participant in the hold section is solely the length of the hold waiting time.
  12.  The dissatisfied conversation determination method according to claim 10, further comprising:
     generating, based on the calculated time difference, cause information indicating the degree of certainty that the cause of dissatisfaction corresponding to the utterances of the held-side conversation participant in the hold section is solely the length of the hold waiting time.
  13.  The dissatisfied conversation determination method according to any one of claims 10 to 12, further comprising:
     detecting, from the voice data of the held-side conversation participant in the hold section, a plurality of utterance sections of the held-side conversation participant in the hold section; and
     identifying, from among the plurality of utterance sections, two adjacent utterance sections for which the time width between them, or a representative value corresponding to the time width, is greater than a second predetermined time threshold,
     wherein the determining of the target of the dissatisfaction level determination determines, based on the result of the comparison between the time difference or the representative value and the first predetermined time threshold, and on the result of the identification, whether or not the voice data of the held-side conversation participant in the hold section is to be a target of the dissatisfaction level determination.
  14.  The dissatisfied conversation determination method according to claim 13, wherein the determining of the target of the dissatisfaction level determination sets, as the target of the dissatisfaction level determination, either the voice data of the utterance sections up to and including the earlier of the two identified adjacent utterance sections, or the voice data of the utterance sections from the later of the two identified adjacent utterance sections onward.
  15.  A program that causes at least one computer to execute the dissatisfied conversation determination method according to any one of claims 9 to 14.
PCT/JP2013/079235 2012-10-31 2013-10-29 Complaint conversation determination device and complaint conversation determination method WO2014069444A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP2014544515A JPWO2014069444A1 (en) 2012-10-31 2013-10-29 Dissatisfied conversation determination device and dissatisfied conversation determination method

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2012-240759 2012-10-31
JP2012240759 2012-10-31

Publications (1)

Publication Number Publication Date
WO2014069444A1 (en) 2014-05-08

Family

ID=50627348

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2013/079235 WO2014069444A1 (en) 2012-10-31 2013-10-29 Complaint conversation determination device and complaint conversation determination method

Country Status (2)

Country Link
JP (1) JPWO2014069444A1 (en)
WO (1) WO2014069444A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110062117A (en) * 2019-04-08 2019-07-26 商客通尚景科技(上海)股份有限公司 A kind of sonic detection and method for early warning

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2008219741A (en) * 2007-03-07 2008-09-18 Fujitsu Ltd Exchange and waiting call control method of exchange
WO2012120656A1 (en) * 2011-03-08 2012-09-13 富士通株式会社 Telephone call assistance device, telephone call assistance method

Also Published As

Publication number Publication date
JPWO2014069444A1 (en) 2016-09-08

Legal Events

Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application. Ref document number: 13850998; Country of ref document: EP; Kind code of ref document: A1
ENP Entry into the national phase. Ref document number: 2014544515; Country of ref document: JP; Kind code of ref document: A
NENP Non-entry into the national phase. Ref country code: DE
122 EP: PCT application non-entry in European phase. Ref document number: 13850998; Country of ref document: EP; Kind code of ref document: A1