WO2017145929A1

WO2017145929A1 - Pose control device, robot, and pose control method

Info

Publication number: WO2017145929A1
Application number: PCT/JP2017/005857
Authority: WO
Inventors: 誠悟伊藤; 秀俊篠原
Original assignee: シャープ株式会社
Priority date: 2016-02-25
Filing date: 2017-02-17
Publication date: 2017-08-31
Also published as: JPWO2017145929A1; CN108698231A

Abstract

The present invention addresses the problem of enabling a robot to be able to indicate to a user that the robot intends to speak when initiating dialog with a user. The present invention is a pose control device (3) for controlling the pose of a robot (101) that is capable of dialog with a user, wherein when the pose of the robot (101) identified at the time dialog with the user is initiated is not a pose for indicating an intent to speak, a drive system (1) is driven to cause the robot (101) to assume a pose for indicating an intent to speak.

Description

Attitude control device, robot, and attitude control method

The present invention relates to a posture control device that controls the posture of a robot that can interact with a user, a robot including the posture control device, and a posture control method.

In recent years, robots that perform actions according to their utterances have been developed. And it is requested | required of these robots to perform the operation | movement according to own utterance more naturally. For example, Patent Document 1 discloses a robot apparatus that naturally performs an operation corresponding to an utterance by synthesizing a voice synchronized with an actual operation. Further, Patent Document 2 discloses a humanoid robot that naturally performs an operation corresponding to an utterance by generating a gesture of the robot while the robot outputs sound.

Japanese Patent Publication “Patent No. 5402648” (registered on November 8, 2013) Japanese Patent Gazette “Special Table 2014-504959” (published February 27, 2014)

By the way, in order to facilitate the dialogue between the user and the robot, it is necessary to clearly notify the user whether or not the robot itself has an intention to speak when the dialogue with the user is started. However, although the robots disclosed in each of the above-mentioned patent documents are devised so that the behavior when the robot itself speaks is natural, when the dialogue with the user starts, the robot itself speaks. It is not particularly considered to indicate to the user whether or not there is an intention.

The present invention has been made in view of the above-described problems, and its purpose is to clearly indicate to the user whether or not the robot itself intends to speak at the start of dialogue with the user. An attitude control device and an attitude control method are realized.

In order to solve the above problems, an attitude control device according to one embodiment of the present invention is a robot that can interact with a user and can drive a plurality of driving units to take various attitudes. A posture control device that is provided and controls the posture of the robot, the posture specifying unit that specifies the posture of the robot from the drive state of each drive unit, and the drive control unit that performs drive control of each drive unit And the drive control unit is configured such that the posture of the robot specified by the posture specifying unit at the start of dialogue with the user is not an utterance intention presentation posture indicating that the robot has an intention to speak. The drive units are driven to cause the robot to take the utterance intention presentation posture.

An attitude control method according to an aspect of the present invention is an attitude control method for controlling the attitude of a robot capable of interacting with a user and driving a plurality of driving units to take various attitudes. Thus, the posture specifying step for specifying the posture of the robot at the start of the dialogue with the user, and the posture of the robot specified by the posture specifying step are not the utterance intention presenting posture indicating that the robot is intended to speak. A drive control step of driving each of the drive units to cause the robot to take the utterance intention presentation posture.

According to one aspect of the present invention, it is possible to clearly indicate to the user whether or not the robot itself intends to speak at the start of dialogue with the user.

1 is a schematic configuration block diagram of a robot according to a first embodiment of the present invention. It is a sequence diagram which shows the flow of the process of the attitude | position control of the robot by the attitude | position control apparatus with which the robot shown in FIG. 1 is provided. It is a schematic block diagram of a robot according to Embodiment 2 of the present invention. FIG. 4 is a sequence diagram showing a flow of processing of posture control of the robot by the posture control device provided in the robot shown in FIG. 3. It is a schematic block diagram of the robot which concerns on Embodiment 3 of this invention. It is a schematic block diagram of the robot which concerns on the modification of Embodiment 3 of this invention.

Embodiment 1
Hereinafter, embodiments of the present invention will be described in detail. In the present embodiment, a robot having a driving system composed of at least a human or animal or similar outer shell and a plurality of driving units that operate the outer shell and capable of interacting with a user will be described.

(Robot overview)
FIG. 1 is a schematic configuration diagram of a robot 101 according to the present embodiment. The robot 101 includes at least a human or animal or similar outer shell (not shown). The robot 101 further drives the driving system 1 composed of a plurality of driving units (manipulators) that operate the outer shell, the audio system 2 for realizing dialogue with the user, and the driving system 1 to take various postures. The attitude control device 3 is included.

The voice system 2 includes a microphone 21, an input device 22, a voice recognition device 23, a dialogue device 24, a voice synthesis device 25, a playback device 26, a speaker 27, and a playback status acquisition device 28. The microphone 21 is a device that collects a voice uttered by a user and converts the collected voice into electronic wave data (waveform data). The microphone 21 sends the converted electronic waveform data to the input device 22 at the subsequent stage.

The input device 22 is a device for recording the electronic waveform data. When recording the waveform data, if the state of the waveform data indicating that the waveform data indicates no sound continues for a predetermined time or longer, the input device 22 ends the recording, and indicates that the recording has ended, that is, a signal indicating the end of input. Send to control device 3. The input device 22 sends the recorded waveform data to the subsequent speech recognition device 23 at the timing of sending a signal indicating the end of input to the attitude control device 3. The speech recognition device 23 is a device that converts electronic waveform data sent from the input device 22 into text data (ASR: Automatic Speech Recognition). The voice recognition device 23 sends the converted text data to the subsequent dialogue device 24.

The dialogue device 24 analyzes the text data sent from the speech recognition device 23 to identify the user's utterance content (analysis result), and obtains dialogue data indicating the response content in which conversation is established for the identified utterance content. It is a device that performs. Further, the dialogue device 24 extracts text data corresponding to the response content from the obtained dialogue data. Then, the dialogue device 24 sends the extracted text data to the subsequent speech synthesizer 25.

The speech synthesizer 25 is a TTS (Text-to-Speech) device that converts text data sent from the dialogue device 24 into PCM data. The speech synthesizer 25 sends the converted PCM data to the playback device 26 at the subsequent stage. The playback device 26 is a device that outputs the PCM data sent from the speech synthesizer 25 to the speaker 27 as sound waves. The sound wave output here means a sound that can be recognized by a person. In addition, the sound wave output from the playback device 26 becomes the response content for the user's utterance content. As a result, a conversation is established between the user and the robot 101. Further, the playback device 26 outputs the PCM data to the speaker 27 and simultaneously outputs it to the playback status acquisition device 28.

When the PCM data is sent from the playback device 26, the playback status acquisition device 28 is a signal indicating that the output of the voice from the speaker 27 has started, that is, when the playback of the voice to the user by the robot 101 is started (at the start of speech). Is sent to the attitude control device 3.

The posture control device 3 is a device that controls the posture of the robot 101, and includes a drive control device 31, a housing state acquisition device 32, a posture recording device 33, and a behavior pattern recording device 34. The drive control device 31 includes a posture specifying unit 31 a that specifies the posture of the robot 101 from the drive state of the drive system (drive unit) 1 and a drive control unit 31 b that performs drive control of the drive system 1.

The housing state acquisition device 32 is a device that acquires information indicating the drive state of the drive system 1. Here, the information indicating the driving state of the driving system 1 is information indicating what driving state the driving system 1 for specifying the posture of the robot 101 is in. For example, joint angle information obtained from a rotary encoder attached to the joint of the robot, torque on / off state, and the like correspond to information indicating the drive state. This information is sent from the housing state acquisition device 32 to the attitude specifying unit 31a of the drive control device 31.

The posture recording device 33 is a device that records a speech intention presentation posture taken by the robot 101. Specifically, information indicating the driving state of the drive system 1 is recorded in the posture recording device 33 so that the robot 101 assumes the utterance intention presentation posture. The utterance intention presentation posture is, for example, a posture in which the robot touches the mouth, a careful posture, a posture in which the user faces the user's face, and the posture in which the robot indicates the intention to speak to the user. It is.

The behavior pattern recording device 34 is a device that records a behavior pattern associated with the utterance content of the robot 101. Specifically, in the behavior pattern recording device 34, information indicating the driving state of the driving system 1 associated with each utterance content is recorded as a behavior pattern. As behavior patterns, not only the information of the posture recording device 33 but also the housing state acquisition device 32, for example, various sensors such as fall detection and gravitational acceleration, or the internal state of the robot 101, for example, past behaviors of voice recognition results. You may add a pattern. In addition, the user's utterance content is divided into categories. It may be adapted to the pitch at the time of utterance. Also, the utterance intention presentation posture is not limited to one type, and may be a plurality of types, depending on the situation of the robot 101 such as the robot 101 holding an object.

The posture specifying unit 31a acquires information indicating the drive state of the drive system 1 of the robot 201, thereby specifying what posture the robot 201 is currently in. Information indicating the identified posture is sent from the posture identifying unit 31a to the drive control unit 31b.

The drive control unit 31b determines whether or not the posture of the robot 101 specified by the posture specifying unit 31a at the start of the dialogue with the user is an utterance intention presentation posture. Here, the dialogue with the user is started when the robot 101 starts reproducing the voice to the user. That is, the drive control unit 31b determines the posture of the robot 101 specified by the posture specifying unit 31a at the timing when the signal indicating that the reproduction of the voice to the user by the robot 101 is started from the reproduction status acquisition device 28 is received. judge.

If the posture of the robot 101 is not the utterance intention presentation posture as a result of determining the posture of the robot 101, the drive control unit 31b drives the drive system 1 to cause the robot 101 to assume the utterance intention presentation posture. That is, whether or not the robot 101 is identified at the start of the dialogue with the user (posture identifying step), and whether the identified robot posture is an utterance intention presentation posture indicating that the robot 101 is intended to speak. Determine whether or not. If it is not the utterance intention presentation posture, the drive system 1 is driven to cause the robot 101 to take the utterance intention presentation posture (drive control step). As described above, when the robot 101 is not in the utterance intention presentation posture at the start of the dialogue with the user, the robot 101 performs an operation of returning to the utterance intention presentation posture. Therefore, the user can easily confirm that the robot 101 has the intention of utterance. I can understand.

In addition, when the posture of the robot 101 is determined to be the utterance intention presentation posture as a result of determining the posture of the robot 101, the drive control unit 31b notifies the user that the utterance will be performed before the utterance of the robot 101 starts. Let the action take place. For example, if the head is facing the front in the utterance intention presentation posture of the robot 101, the operation of returning the head to the front with the head once swung is performed before starting the utterance. After performing this operation, the robot 101 speaks. Thereby, the user can easily understand that the robot 101 has the intention of speaking.

(Attitude control processing)
FIG. 2 is a sequence diagram showing the flow of the posture control process of the robot 101 shown in FIG. In the following sequence diagram, the process until the voice is reproduced by the robot 101 (1), the process when the behavior of the robot 101 is finished during the voice reproduction (2), and the voice reproduction is finished during the behavior of the robot 101. Processing (3) is included.

Outline of Process (1): In the robot 101, in the voice system 2, the user's utterance is basically acquired from the microphone 21 and recorded by the input device 22. The recorded utterance is then recognized by the speech recognition device 23, the dialogue character string is acquired from the speech recognition result from the dialogue device 24, the dialogue character string is synthesized by the speech synthesizer 25, and the playback device 26 The speech synthesis content is sounded by the speaker 27. Note that a series of operations from the acquisition of the user's utterance to the ringing of the speech synthesis content is made.

In the present embodiment, at the start of dialogue with the user in the voice system 2, the user's utterance is acquired from the microphone 21, and playback of the response utterance corresponding to the acquired utterance is started by the playback device 26. Timing.

At the attitude control device 3, information (drive information) of the housing (the driving system 1 of the robot 101) is obtained by the housing state obtaining device 32 at the start of the dialogue. Then, if necessary, the drive control device 31 activates the drive system 1 and then changes to the utterance intention presentation posture according to the information of the posture recording device 33, and either one of the behavior patterns is selected from the behavior pattern recording device 34 according to the utterance content. Select.

When the drive system 1 starts driving according to the behavior pattern, the utterance of the robot 101 is started. Specifically, the processing (1) corresponds to the processing from (1. voice data input) to (13. sound data ringing) in the sequence shown in FIG. That is, the microphone 21 converts voice input by the user's speech into waveform data, and outputs the waveform data to the input device 22 (1. voice data input). The input device 22 inputs the input sound data, and outputs the input sound data to the speech recognition device 23 (2. Speech recognition start command).

The voice recognition device 23 receives a voice recognition start command from a control unit (not shown), converts the input sound data into text data, and outputs the text data to the dialogue device 24 (3. dialogue start command). The dialogue device 24 receives a dialogue start command from a control unit (not shown), analyzes the user's utterance content from the input text data, and obtains text data of a dialogue sentence corresponding to the utterance content from a database (not shown). get. Then, the acquired text data is output to the speech synthesizer 25 (4. dialogue wording synthesis command).

The voice synthesizer 25 receives an interactive wording synthesis command from a control unit (not shown), converts the input text data into output sound wave data (PCM data), and outputs it to the playback device 26 (5. Voice data playback). order). When the reproduction device 26 receives an audio data reproduction command from a control unit (not shown) and reproduces the output sound wave data, the reproduction device 26 outputs utterance start state change information to the reproduction state acquisition device 28 (6. Speech start). State change). This utterance start state change information is information indicating whether or not the utterance by the robot 101 has been started. In this case, the utterance start state change information is information indicating that the utterance by the robot 101 has been disclosed.

The reproduction status acquisition device 28 notifies the drive control device 31 that the robot 101 has spoken from the input utterance start state change information (7. Talk start status notification). The signal to be notified here is a signal indicating that the robot 101 has started to reproduce the voice to the user.

The drive control device 31 receives the signal indicating that the reproduction of the voice to the user by the robot 101 from the reproduction status acquisition device 28 is received, and the state of the robot 101 (the case state) from the case state acquisition device 32. ) (8. Acquire housing information). Further, the drive control device 31 also acquires the speech intention presentation posture recorded in the posture recording device 33 (9. Speech intention presentation posture acquisition). Accordingly, the drive control device 31 specifies the posture of the robot 101 by the posture specifying unit 31a from the acquired housing state, and determines whether or not the specified posture of the robot 101 is the acquired speech intention presentation posture. can do. Then, the drive control device 31 drives the drive system 1 according to the determination result (10. Utterance intention presentation posture transition).

Here, when the specified posture of the robot 101 is not the utterance intention presentation posture, the drive control device 31 drives the drive system 1 so as to be the utterance intention presentation posture. On the other hand, when the identified posture of the robot 101 is the utterance intention presentation posture, the drive control device 31 temporarily hangs the head if the head of the utterance intention presentation posture is facing the front. To return to the front.

When the robot 101 starts an utterance, the drive control device 31 acquires a behavior pattern corresponding to the utterance content from the behavior pattern recording device 34 (11. behavior pattern acquisition), and drives the drive system so that the acquired behavior pattern is obtained. 1 is started (12. Behavior start command). When driving of the drive system 1 is started, the playback device 26 receives an audio data playback command from a control unit (not shown) and causes the speaker 27 to ring the input output sound wave data as a sound wave (13. audio data). Ringing).

Outline of process (2): When the information acquired from the reproduction status acquisition device 28 is information indicating continuation of the utterance, that is, when the utterance (reproduction) has not ended, the drive control device 31 again performs the behavior pattern recording device Activating any of the behavior patterns in 34. The behavior pattern may be selected at the timing when the utterance ends or may be selected in advance.

Specifically, the processing (2) corresponds to the processing from (14. Behavior end) to (18. Behavior start command) in the sequence shown in FIG. That is, the end of the behavior pattern by the drive system 1 during the speech of the robot 101 (during sound reproduction) is determined by the drive state (case state) of the drive system 1 acquired by the case state acquisition device 32 (14. Behavior). End). The housing state acquisition device 32 outputs information indicating that the behavior has ended to the drive control device 31 as a behavior notification (15. behavior notification command).

When the drive control device 31 is notified that the behavior has ended from the housing state acquired from the housing state acquisition device 32, the drive control device 31 acquires the playback state from the playback state acquisition device 28 (16. playback state acquisition). If the drive control device 31 determines that the reproduction is in progress from the obtained reproduction status, the drive control device 31 again obtains a behavior pattern according to the utterance content from the behavior pattern recording device 34 (17. behavior pattern acquisition). Then, the drive of the drive system 1 is started so as to obtain the acquired behavior pattern (18. Behavior start command).

Outline of Process (3): When the information acquired from the reproduction status acquisition device 28 is information indicating the end of the utterance, the drive control device 31 determines that the utterance has ended and sets the drive system 1 in the idle state or inactive. On the other hand, when the playback status acquisition device 28 acquires the timing when the utterance is finished, the drive control device 31 checks the housing state acquisition device 32, and if the operation is being performed, issues a stop command to the drive system 1. The drive system 1 is driven so as to return to the utterance intention presentation posture that is issued and is the initial posture. If the operation is within a predetermined time (for example, 400 ms), no stop command is issued as an allowable range.

Specifically, the processing (3) corresponds to the processing from (19. Playback end) to (22. Playback end command) in the sequence shown in FIG. That is, the playback device 26 outputs playback end state change information to the playback status acquisition device 28 (20. playback end state change) when playback ends (19. playback end).

The playback status acquisition device 28 notifies the drive control device 31 that the robot 101 has finished speaking from the input playback end status change information (21. Playback end notification). The signal to be notified here is a signal indicating that the reproduction of the voice to the user by the robot 101 has been completed.

The drive control device 31 issues a playback end command (stop command) to the drive system 1 from the playback end notification acquired from the playback status acquisition device 28 (22. playback end command). As a result, the operation of the drive system 1 is stopped.

(effect)
As described above, the robot 101 becomes an utterance intention presentation posture for informing the user that the posture is intended to be uttered when the conversation with the user is started. That is, the robot 101 can indicate to the user whether or not the robot itself intends to speak at the start of dialogue with the user. Thereby, since the dialogue between the user and the robot 101 can be performed smoothly, natural non-verbal communication can be realized between the user and the robot.

In this embodiment, the start of the dialog with the user is the start of the voice reproduction to the user by the robot. However, the present invention is not limited to this, and is the end of the user's voice input by the robot. Also good. In this case, in the sequence shown in FIG. 2, a dialog start status notification is sent to the drive control device 31 when input by the input device 22 is completed. At this timing, the drive control device 31 acquires the housing state from the housing state acquisition device 32. Subsequent processing is the same as the processing described above.

Further, since the end of input by the input device 22 is also the start of speech recognition, the start of dialogue with the user may be the start of speech recognition by the robot. Furthermore, in the case of a device in which a switch is provided in a housing constituting the robot 101, the microphone 21 is turned on when the switch is pressed, and the microphone 21 is turned off when the switch is released, the start of the dialog with the user may be the start of the switch release. In addition, when a camera is used for the housing constituting the robot 101, a person is detected by the camera, and further, the timing when the conversation is assumed to be started by detecting that the movement of the lips of the person is finished is determined with the user. It is also possible to start the dialogue.

Note that the robot 101 starts utterance when the utterance content for response can be formed from the dialogue device 24. In the case of the utterance content that the user's intention is insufficient or the user's utterance such as sneeze. If there is no meaning, utterance content cannot be formed. In such a case, the robot 101 may take an utterance intention release behavior indicating that the utterance intention is lost from the utterance intention presentation posture instead of uttering. The utterance intention release behavior may be any behavior as long as the utterance intention presentation posture is different from the utterance intention presentation posture, and is a behavior in which the user can easily recognize that the utterance intention has been lost. Preferably there is.

In the posture control device 3 configured as described above, an example is shown in which the posture recording device 33 and the behavior pattern recording device 34 are provided independently. However, these two devices may be a single recording device. .

[Embodiment 2]
Another embodiment of the present invention will be described as follows. For convenience of explanation, members having the same functions as those described in the first embodiment are given the same reference numerals, and descriptions thereof are omitted.

(Robot overview)
FIG. 3 is a schematic configuration diagram of the robot 201 according to the present embodiment. The robot 201 has the configuration in which the voice recognition device 23 is provided in a server (not shown) on the network and a communication device 29 for performing communication with the voice recognition device 23 is added. Different from the first robot 101. That is, in the robot 201, after the voice input through the microphone 21 is input by the input device 22, the input voice data is sent to the server on the network by the communication device 29, and the voice recognition device 23 in the server recognizes the voice. Can be performed. Then, the recognition result by the voice recognition device 23 in the server is sent to the dialogue device 24 via the communication device 29. In these respects, the robot 201 is different from the robot 101 of the first embodiment. The communication device 29 may be any type of communication device as long as it can communicate with the voice recognition device 23 provided on an external network such as the Internet.

(Attitude control processing)
FIG. 4 is a sequence diagram showing the flow of the posture control process of the robot 201 shown in FIG. In the following sequence diagram, the process until the voice is reproduced by the robot 201 (11), the process when the behavior of the robot 201 is finished during the voice reproduction (12), and the voice reproduction is finished during the behavior of the robot 201. Processing (13) is included.

Outline of process (11): This process (11) is substantially the same process as the process (1) described in the first embodiment, but is different at the time of starting the dialogue with the user. That is, the process of the first embodiment is that the user's utterance is acquired as sound data from the microphone 21, the acquired sound data is input by the input device 22, and the input end time is set as the start of the dialog with the user ( Different from 1).

Specifically, the processing (11) corresponds to the processing from (1. voice data input) to (15. sound data ringing) in the sequence shown in FIG. That is, the microphone 21 converts voice input by the user's speech into waveform data, and outputs the waveform data to the input device 22 (1. voice data input). The input device 22 inputs the input sound data, and when the input is completed, notifies the drive control device 31 of a dialog start status (2. Dialog start status notification). By this dialog start status notification, the drive control device 31 is notified that the input of the user's voice has been completed.

The drive control device 31 receives the signal indicating that the reproduction of the voice to the user by the robot 101 from the reproduction status acquisition device 28 is received, and the state of the robot 201 (the case state) from the case state acquisition device 32. ) (3. Acquire housing information). The drive control device 31 also acquires the utterance intention presentation posture recorded in the posture recording device 33 (4. utterance intention presentation posture acquisition). Accordingly, the drive control device 31 specifies the posture of the robot 201 by the posture specifying unit 31a from the acquired housing state, and determines whether or not the specified posture of the robot 201 is the acquired utterance intention presentation posture. can do. And the drive control apparatus 31 drives the drive system 1 according to a determination result (5. Speech intention presentation attitude | position transition).

When the identified posture of the robot 201 is not the utterance intention presentation posture, the drive control device 31 drives the drive system 1 so as to be the utterance intention presentation posture. On the other hand, when the identified posture of the robot 201 is the utterance intention presentation posture, the drive control device 31 temporarily hangs the head if the head of the utterance intention presentation posture is facing the front. To return to the front.

Thereafter, the input device 22 receives a voice recognition start command (1) from a control unit (not shown), and transmits the voice data input to the voice recognition device 23 provided in the server on the network via the communication device 29. (6. Speech recognition start command (1)). The speech recognition device 23 receives the speech recognition start command (2) from the control unit in the server, converts the input sound data into text data (7. speech recognition start command (2)), and the dialogue device 24. (8. Dialogue start command).

The dialogue device 24 receives a dialogue start command from a control unit (not shown), analyzes the user's utterance content from the input text data, and obtains text data of a dialogue sentence corresponding to the utterance content from a database (not shown). get. Furthermore, the acquired text data is output to the speech synthesizer 25 (9. Dialogue word composition command).

The speech synthesizer 25 receives an interactive wording synthesis command from a control unit (not shown), converts the input text data into output sound wave data (PCM data), and outputs it to the playback device 26 (10. Voice data playback). order). When the playback device 26 receives an audio data playback command from a control unit (not shown) and plays back the output sound wave data, the playback device 26 outputs utterance start state change information to the playback status acquisition device 28 (11. Start of speech). State change). The utterance start state change information is information indicating whether or not the utterance by the robot 201 has been started. In this case, the utterance start state change information is information indicating that the utterance by the robot 201 has been started.

The reproduction status acquisition device 28 notifies the drive control device 31 that the robot 201 has spoken from the input utterance start state change information (12. Talk start status notification). The signal to be notified here is a signal indicating that the robot 201 has started playing the voice to the user. When the robot 201 starts utterance, the drive control device 31 acquires a behavior pattern corresponding to the utterance content from the behavior pattern recording device 34 (13. behavior pattern acquisition), and drives the drive system so that the acquired behavior pattern is obtained. 1 is started (14. Behavior start command). When driving of the drive system 1 is started, the playback device 26 receives a sound data playback command from a control unit (not shown) and causes the speaker 27 to ring the input output sound wave data as a sound wave (15. sound data). Ringing).

The process (11) is as described above, but the process (12) is the same as the process (2) in the first embodiment, and the process (13) is the same as the process (3) in the first embodiment. Therefore, description of these processes is omitted.

(effect)
As described above, the robot 201 becomes an utterance intention presentation posture for informing the user that the posture is intended to be uttered when the conversation with the user is started. That is, the robot 201 can indicate to the user whether or not the robot itself intends to speak at the start of dialogue with the user. Thereby, the dialogue between the user and the robot 201 can be performed smoothly. In addition, in the case of the present embodiment, since the voice recognition device 23 is provided in a server on the network, it is not necessary to perform voice recognition processing inside the robot 201, so the processing load on the robot 201 can be reduced. .

In the present embodiment, the start of the dialog with the user is the end of the user's voice input by the robot. However, the present invention is not limited to this, and may be the start of the playback of the voice to the user by the robot. .

In the case of the present embodiment, as in the first embodiment, the end of input by the input device 22 is also the start of speech recognition, so the start of dialogue with the user is the start of speech recognition by the robot. There may be. Furthermore, in the case of a device in which a switch is provided in a casing constituting the robot 201 and the microphone 21 is turned on when the switch is pressed and turned off when the switch is released, the start of the dialog with the user may be set as the start of the switch release. In addition, when a camera is used for the housing constituting the robot 201, a person is detected by the camera, and further, the timing when the conversation is assumed to be started by detecting that the movement of the lips of the person is finished is determined with the user. It is also possible to start the dialogue.

Also, the robot 201 starts the utterance when the utterance content for response can be formed from the dialogue device 24. In the case of the utterance content that the user's intention is insufficient or the user's utterance such as sneeze. If there is no meaning, utterance content cannot be formed. In such a case, the robot 201 may take an utterance intention release behavior indicating that the utterance intention has disappeared from the utterance intention presentation posture instead of speaking.

In the first and second embodiments, the posture control of the

robots

101 and 201 is performed by using an output signal from the voice system 2 (a signal indicating the end of voice input from the input device 22 and voice playback start from the playback status acquisition device 28). Signal). On the other hand, in the following third embodiment, an example performed based on a user's face image captured by a camera will be described.

[Embodiment 3]
The following will describe still another embodiment of the present invention. For convenience of explanation, members having the same functions as those described in the first embodiment are given the same reference numerals, and descriptions thereof are omitted.

(Robot overview)
FIG. 5 is a schematic configuration diagram of the robot 301 according to the present embodiment. The robot 301 has substantially the same configuration as the robot 101 of the first embodiment, and is different in that the image system 4 is newly provided. The image system 4 includes a camera 41 that captures the face of the user, and an image acquisition device (image acquisition unit) 42 that acquires a face image captured by the camera 41. The image system 4 further includes an image determination device (image determination unit) 43 that determines whether or not the face image acquired by the image acquisition device 42 is an image indicating the end of speech by the user.

The camera 41 is a digital camera that captures an image of a user who is a conversation partner of the robot 301, and may be any type and system as long as it can be mounted inside the robot 301. The image acquisition device 42 is a device that acquires the face image of the user from the user image captured by the camera 41. The image acquisition device 42 sends the acquired user face image to the image determination device 43.

The image determination device 43 is a device that performs face authentication from the user's face image sent from the image acquisition device 42 and determines whether the image indicates the end of speech by the user from the authentication result. Here, it is determined whether or not the face image is when the user's mouth is closed. Then, the result of determining the face image when the user's mouth is closed is sent to the posture control device 3. That is, the posture control device 3 performs posture control of the robot 301 at the timing when the user's mouth is closed. That is, in this embodiment, the timing of the posture determination of the robot 301 by the posture control device 3, that is, the time when the dialogue with the user is started is determined by the image determination device 43 as an image indicating the end of the user's utterance. To do.

(effect)
As described above, the robot 301 informs the user that the posture is intended to be uttered when the dialogue with the user is started (when the image determining device 43 determines that the image indicates the end of the user's utterance). It becomes the utterance intention presentation posture to squeeze. That is, the robot 301 can indicate to the user whether or not the robot itself intends to speak at the start of the dialogue with the user. Thereby, the dialogue between the user and the robot 301 can be performed smoothly.

(Modification)
FIG. 6 is a schematic configuration block diagram of a robot 401 which is a modification of the robot 301 shown in FIG. The robot 401 has substantially the same configuration as the robot 201 of the second embodiment, and is different in that the image system 4 is newly provided. The posture control using the image system 4 is the same as that of the robot 301 shown in FIG.

(effect)
According to the robot 401, substantially the same effect as the robot 301 is obtained, and furthermore, since the voice recognition device 23 is provided in a server on the network, it is not necessary to perform voice recognition processing inside the robot 401. As a result, the processing load on the robot 401 can be reduced.

Note that in the robot 301 of the present embodiment and the robot 401 of the modification example, the dialogue start time with the user is determined when an image indicating the end of the user's speech is determined. However, the present invention is not limited to this. Good. For example, similar to the robot 101 of the first embodiment and the robot 201 of the second embodiment, an output signal from the voice system 2 (a signal indicating the end of voice input from the input device 22, a voice reproduction from the playback status acquisition device 28). It may be at the time of reception of a signal indicating the start.

[Example of software implementation]
The control block of the drive control device 31 may be realized by a logic circuit (hardware) formed in an integrated circuit (IC chip) or the like, or may be realized by software using a CPU (Central Processing Unit). .

In the latter case, the drive control device 31 includes a CPU that executes instructions of a program that is software that implements each function, and a ROM (Read Only Memory) in which the program and various data are recorded so as to be readable by the computer (or CPU). Alternatively, a storage device (these are referred to as “recording media”), a RAM (Random Access Memory) for expanding the program, and the like are provided. And the objective of this invention is achieved when a computer (or CPU) reads the said program from the said recording medium and runs it. As the recording medium, a “non-temporary tangible medium” such as a tape, a disk, a card, a semiconductor memory, a programmable logic circuit, or the like can be used. The program may be supplied to the computer via an arbitrary transmission medium (such as a communication network or a broadcast wave) that can transmit the program. The present invention can also be realized in the form of a data signal embedded in a carrier wave in which the program is embodied by electronic transmission.

[Summary]
The posture control apparatus according to the first aspect of the present invention is a robot (101, 201,) capable of interacting with a user and driving a plurality of drive units (drive system 1) to take various postures. 301, 401) and a posture control device (31) for controlling the posture of the robot (101, 201, 301, 401), wherein the posture of the robot (101, 201, 301, 401) is An attitude specifying unit (31a) that specifies the drive state of each drive unit (drive system 1), and a drive control unit (31b) that controls the drive of each drive unit (drive system 1). The unit (31b) indicates that the posture of the robot (101, 201, 301, 401) specified by the posture specifying unit (31a) at the start of the dialogue with the user is the robot (101, 201, 301, 40). ) Is not an utterance intention presentation posture indicating that there is an intention to utter, and each of the driving units (drive system 1) is driven to set the utterance intention presentation posture to the robot (101, 201, 301, 401). It is characterized by letting it take.

According to the above configuration, since the posture of the robot (101, 201, 301, 401) can always be set to the utterance intention presentation posture at the start of the dialogue with the user, the user can confirm that the robot has the utterance intention. It can be easily recognized visually by the posture of the robot.

As a result, it is possible to clearly indicate to the user whether or not the robot itself intends to speak at the start of the dialog with the user. Natural non-verbal communication can be realized between the user and the robot.

In the posture control apparatus according to aspect 2 of the present invention, in the aspect 1, the robot (101, 201, 301, 401) inputs a user's voice, and makes a voice toward the user according to the input voice. When a dialog with the user is performed by playing, the start of the dialog with the user may be a time when the robot (101, 201, 301, 401) starts to play a voice to the user.

According to the above configuration, since the robot (101, 201, 301, 401) starts the voice reproduction to the user at the start of the dialogue with the user, the robot prompts the user at the timing when the robot is about to speak. The utterance intention presentation posture can be taken. Thereby, in addition to the posture of the robot, the user can clearly recognize that the robot has an intention to speak by voice.

The posture control device according to aspect 3 of the present invention is the posture control apparatus according to aspect 1, in which the robot (101, 201, 301, 401) inputs the user's voice and speaks to the user according to the input voice. When the dialogue with the user is performed by reproducing the above, the start of the dialogue with the user may be the end of the user's voice input by the robot (101, 201, 301, 401).

According to the above configuration, since the dialogue with the user starts when the user's voice input by the robot (101, 201, 301, 401) ends, the utterance intention presentation posture to the user at the timing of the user's utterance end. Can be taken. Thereby, the robot can inform the user that there is an intention to speak quickly.

In the aspect control device according to aspect 4 of the present invention, in the above aspect 1, the image acquisition unit (image acquisition device 42) that acquires a face image obtained by imaging the user's face and the image acquisition unit (image acquisition device 42) include: An image determination unit (image determination device 43) for determining whether or not the acquired face image is an image indicating the end of the utterance by the user, and at the start of the dialogue with the user, the image determination unit ( It may be when the image determination device 43) determines that the image indicates the end of the user's utterance.

According to the above configuration, the start of the dialogue with the user is a time when the image determination unit (image determination device 43) determines that the image indicates the end of the user's utterance. The user can take the utterance intention presentation posture. Thereby, the robot can inform the user that there is an intention to speak quickly.

A robot according to aspect 5 of the present invention is characterized by including the attitude control device (31) according to any one of the above aspects 1 to 4. According to the above configuration, it is possible to notify the user that there is a clear intention to speak.

The posture control method according to aspect 6 of the present invention is a robot (101, 201, 101) capable of interacting with a user and driving a plurality of drive units (drive system 1) to take various postures. 301, 401) is a posture control method for controlling the posture of the robot (101, 201, 301, 401) at the start of dialogue with the user, and is specified by the posture specifying step and the posture specifying step. When the posture of the robot (101, 201, 301, 401) is not an utterance intention presentation posture indicating that the robot (101, 201, 301, 401) has an intention to utter, the above drive units are driven. And a drive control step for causing the robot (101, 201, 301, 401) to take the utterance intention presentation posture. According to said structure, there exists the same effect as the said aspect 1. FIG.

The posture control device according to each aspect of the present invention may be realized by a computer. In this case, the posture control device is operated on each computer by causing the computer to operate as each unit (software element) included in the posture control device. The attitude control program of the attitude control device to be realized and a computer-readable recording medium on which the attitude control program is recorded also fall within the scope of the present invention.

The present invention is not limited to the above-described embodiments, and various modifications are possible within the scope shown in the claims, and embodiments obtained by appropriately combining technical means disclosed in different embodiments. Is also included in the technical scope of the present invention. Furthermore, a new technical feature can be formed by combining the technical means disclosed in each embodiment.

1 drive system (drive unit), 2 voice system, 3 attitude control device, 4 image system, 21 microphone, 22 input device, 23 speech recognition device, 24 dialog device, 25 speech synthesizer, 26 playback device, 27 speaker, 28 Playback status acquisition device, 29 communication device, 31 drive control device 31a attitude specifying unit, 31b drive control unit, 32 housing state acquisition device, 33 attitude recording device, 34 behavior pattern recording device, 41 camera, 42 image acquisition device (image Acquisition unit), 43 image determination device (image determination unit) 101, 201, 301, 401 robot

Claims

A posture control device for controlling a posture of a robot that is capable of interacting with a user and that is provided in a robot that can take various postures by driving a plurality of drive units,
A posture identifying unit that identifies the posture of the robot from the driving state of each driving unit;
A drive control unit that performs drive control of each of the drive units;
With
The drive control unit
When the posture of the robot specified by the posture specifying unit at the start of dialogue with the user is not an utterance intention presentation posture indicating that the robot has an intention to speak, the robot is driven by driving the driving units. A posture control device that causes the above-mentioned utterance intention presentation posture to be taken.
When the robot inputs a user's voice and interacts with the user by reproducing the voice toward the user according to the input voice,
The posture control apparatus according to claim 1, wherein the dialogue with the user is started when the robot starts reproducing the voice to the user.
When the robot interacts with the user by inputting the user's voice and playing the voice toward the user according to the input voice,
The posture control apparatus according to claim 1, wherein the dialogue with the user is started when the user's voice input by the robot is finished.
An image acquisition unit that acquires a face image obtained by imaging a user's face;
An image determination unit for determining whether the face image acquired by the image acquisition unit is an image indicating the end of speech by the user;
With
The posture control apparatus according to claim 1, wherein the start of the dialogue with the user is a time when the image determination unit determines that the image indicates the end of the user's utterance.
A robot capable of interacting with a user and driving a plurality of driving units to take various postures;
A robot comprising the attitude control device according to any one of claims 1 to 4.
An attitude control method for controlling the attitude of a robot capable of interacting with a user and driving a plurality of driving units to take various attitudes,
A posture identifying step for identifying the posture of the robot at the start of dialogue with the user;
When the posture of the robot identified by the posture identifying step is not an utterance intention presentation posture indicating that the robot has an intention to utter, the drive unit is driven to set the utterance intention presentation posture to the robot. Drive control step to be taken,
A posture control method comprising: