
KR20140061009A - Hybrid augmented reality using voice recognition and method the same - Google Patents


Info

Publication number
KR20140061009A
Authority
KR
South Korea
Prior art keywords
augmented reality
information
image
voice
reality information
Prior art date
Application number
KR1020120128083A
Other languages
Korean (ko)
Inventor
윤은경
Original Assignee
주식회사 한울네오텍
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 주식회사 한울네오텍
Priority to KR1020120128083A
Publication of KR20140061009A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/03 Arrangements for converting the position or the displacement of a member into a coded form
    • G06F3/0304 Detection arrangements using opto-electronic means
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/03 Arrangements for converting the position or the displacement of a member into a coded form
    • G06F3/033 Pointing devices displaced or positioned by the user, e.g. mice, trackballs, pens or joysticks; Accessories therefor
    • G06F3/0346 Pointing devices displaced or positioned by the user, e.g. mice, trackballs, pens or joysticks; Accessories therefor with detection of the device orientation or free movement in a 3D space, e.g. 3D mice, 6-DOF [six degrees of freedom] pointers using gyroscopes, accelerometers or tilt-sensors
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/03 Arrangements for converting the position or the displacement of a member into a coded form
    • G06F3/033 Pointing devices displaced or positioned by the user, e.g. mice, trackballs, pens or joysticks; Accessories therefor
    • G06F3/038 Control and interface arrangements therefor, e.g. drivers or device-embedded control circuitry
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T19/00 Manipulating 3D models or images for computer graphics
    • G06T19/006 Mixed reality
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/02 Feature extraction for speech recognition; Selection of recognition unit

Landscapes

  • Engineering & Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Computer Hardware Design (AREA)
  • Computer Graphics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

A method of operating a hybrid augmented reality system according to an embodiment of the present invention includes: acquiring an image including an object; recognizing at least one object included in the acquired image; detecting augmented reality information related to the recognized object; receiving voice information and comparing it with the augmented reality information; and generating a user interface that separately displays the augmented reality information matched with the voice information.

Description

BACKGROUND OF THE INVENTION 1. Field of the Invention [0001] The present invention relates to a hybrid augmented reality system and method using speech recognition.

More particularly, the present invention relates to an augmented reality service providing system and method in which an image of a marker is received from a camera and the corresponding 3D content is output on a screen.

Augmented reality (AR) is a form of virtual reality in which the real world seen by the user and a virtual world carrying additional information are viewed as one image. AR complements the real world with a virtual layer: it uses a virtual environment created by computer graphics, but the protagonist is the real environment, and the computer graphics serve to provide the additional information the real world needs. By superimposing a 3D virtual image on the real image the user is viewing, the distinction between the real environment and the virtual screen is blurred. Because it gives users an improved sense of reality and awareness, augmented reality is applied in fields as diverse as medicine, industry, entertainment, and the military.

Accurately estimating the motion of the camera or the tracked object is essential to realizing augmented reality. Conventional methods fall into two groups: marker-based methods, which realize augmented reality using AR (Augmented Reality) markers, and markerless methods, which realize augmented reality using feature points or 3D models collected from objects existing in the real world. A marker here is an element, actually present on a two-dimensional plane, that provides the size, direction, and position information needed to draw the three-dimensional graphic model connected to it.

The present invention provides 3D content corresponding to the current camera view using a smartphone equipped with an application that implements this augmented reality technology. However, the conventional approach of executing applications by hand is inconvenient when the number of applications is large; it can be troublesome for people with limited use of their hands or eyes; and it is difficult to apply to users, such as children or students, who are less familiar with smartphone functions.

To solve the above problems, the present invention uses a smartphone equipped with an operating system capable of executing an augmented reality application and adds a technique for matching the smartphone user's voice against the augmented reality information.

A method of operating a hybrid augmented reality system according to the present invention includes: acquiring an image including an object; recognizing at least one object included in the acquired image; detecting augmented reality information related to the recognized object; receiving voice information and comparing it with the augmented reality information; and generating a user interface that separately displays the augmented reality information matched with the voice information.
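
The claimed steps can be sketched end to end as a toy pipeline. Everything below — the marker names, the AR database contents, and the string matching used for voice comparison — is invented for illustration; the patent does not specify data formats or a matching algorithm.

```python
# Hypothetical AR database: recognized objects mapped to displayable content.
AR_DATABASE = {
    "dinosaur_marker": {"id": "obj-1", "label": "dinosaur", "content": "3D dinosaur model"},
    "globe_marker":    {"id": "obj-2", "label": "globe",    "content": "3D globe model"},
}

def recognize_objects(image):
    """Step 420: stand-in recognition returns marker names found in the image."""
    return [m for m in AR_DATABASE if m in image["markers"]]

def detect_ar_info(objects):
    """Step 430: fetch the AR information mapped to each recognized object."""
    return [AR_DATABASE[o] for o in objects]

def match_voice(ar_info, voice_text):
    """Step 440: keep only the AR information whose label appears in the spoken words."""
    words = voice_text.lower().split()
    return [info for info in ar_info if info["label"] in words]

def build_ui(matched):
    """Steps 450-480: the 'user interface' here is just the content list to display."""
    return [info["content"] for info in matched]

image = {"markers": ["dinosaur_marker", "globe_marker"]}
objs = recognize_objects(image)
info = detect_ar_info(objs)
matched = match_voice(info, "show me the dinosaur")
ui = build_ui(matched)
print(ui)  # ['3D dinosaur model']
```

Only the dinosaur content survives the voice filter, mirroring the claim's "separately displaying the augmented reality information matched with the voice information."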

According to another aspect of the present invention, there is provided a hybrid AR system including: an image acquisition unit that acquires an image including an object; a display unit that outputs the image or augmented reality information output from the image acquisition unit; a voice recognition unit that recognizes a voice associated with the image; and a control unit that recognizes at least one object included in the image acquired by the image acquisition unit, detects augmented reality information related to the recognized object, matches the detected augmented reality information against the voice information recognized by the voice recognition unit, and generates a user interface that separately displays the matched augmented reality information.

According to the embodiment of the present invention, combining speech recognition with markerless augmented reality makes it possible to provide personalized content together with three-dimensional content on a smartphone.

In addition, by recognizing a photographed marker or a spoken command, the matched three-dimensional content is displayed and its narration is expressed, realizing an enhanced sense of reality that helps children and students learn on their own.

Furthermore, using voice recognition technology, a user can list the contents to be searched and find desired information with voice commands alone, without manual operation, which makes information easy to provide to children, students, or users who cannot comfortably use their hands.

FIG. 1 is a block diagram of an augmented reality user interface device according to an embodiment of the present invention.
FIG. 2 is a diagram showing a speech recognition flow according to an embodiment of the present invention.
FIG. 3 is a view schematically showing a configuration of an augmented reality user interface according to an embodiment of the present invention.
FIG. 4 is a flowchart for explaining a method of providing an augmented reality user interface according to an embodiment of the present invention.

Generally, augmented reality (AR) information can be obtained by a location-based (GPS-based) method, a marker-recognition-based method, or the like.

In the location-based method, the mobile terminal uses GPS information and geomagnetic sensor information (direction and tilt) to obtain, for example from a server, the augmented reality information about the object the terminal is pointed at (for example, a building), and displays the acquired augmented reality information on the photographed image.
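
A minimal sketch of the location-based selection just described: given the terminal's GPS position and compass heading, compute the bearing to a point of interest and test whether it falls inside the camera's field of view. The coordinates and the 60° field of view are illustrative assumptions, not values from the patent.

```python
import math

def bearing_deg(lat1, lon1, lat2, lon2):
    """Initial great-circle bearing from point 1 to point 2, in degrees [0, 360)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    y = math.sin(dlon) * math.cos(phi2)
    x = math.cos(phi1) * math.sin(phi2) - math.sin(phi1) * math.cos(phi2) * math.cos(dlon)
    return (math.degrees(math.atan2(y, x)) + 360.0) % 360.0

def in_view(camera_heading, target_bearing, fov_deg=60.0):
    """True if the target direction falls inside the camera's field of view."""
    diff = abs((target_bearing - camera_heading + 180.0) % 360.0 - 180.0)
    return diff <= fov_deg / 2.0

# Camera pointing due north (heading 0); the point of interest lies slightly
# north-northeast, so its overlay would be drawn on screen.
b = bearing_deg(37.5663, 126.9779, 37.5700, 126.9790)
print(in_view(0.0, b))  # True
```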

In the marker-recognition-based method, the mobile terminal finds a marker displayed in the image and determines the marker's three-dimensional position and its distance from the terminal by recognizing the marker's size. The mobile terminal can obtain the augmented reality information directly from the augmented reality marker, or obtain the augmented reality information associated with the marker from a server, and display the acquired information at the image or marker position.
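
The distance estimate from marker size mentioned above follows from the pinhole camera model. The patent does not give a formula, so this is the standard relation with illustrative numbers, not a quotation:

```python
def marker_distance(focal_px, marker_size_m, marker_px):
    """Pinhole model: distance = focal_length * real_size / apparent_size."""
    return focal_px * marker_size_m / marker_px

# With an 800 px focal length, a 10 cm marker that appears 80 px wide
# in the image is about 1 metre from the camera.
print(marker_distance(800, 0.10, 80))  # 1.0
```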

For example, augmented reality markers may be implemented as two-dimensional codes, in which case various data such as letters, numbers, symbols, and control codes can be included in the markers themselves. The mobile terminal acquires the augmented reality information by reading the marker in which the information is encoded and decoding the read image or two-dimensional code. Concrete methods for constructing augmented reality markers as two-dimensional codes follow known two-dimensional codes (e.g., QR Code, PDF417, DataMatrix, MaxiCode), so a detailed description is omitted.

Alternatively, identification information (e.g., a combination of numbers or characters, etc.) capable of identifying each augmented reality information may be encoded in the augmented reality marker. In this case, the mobile terminal can read the augmented reality marker encoded with the identification information, and decode the read image or the two-dimensional code to obtain the identification information. The mobile terminal can query the server for the identification information and acquire corresponding augmented reality information.
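
The two encoding modes just described — AR information carried in the marker payload itself, versus an identifier that the terminal resolves against a server — can be contrasted in a small sketch. The "AR:"/"ID:" payload prefixes and the server table are hypothetical; real markers would carry QR/DataMatrix payloads in whatever format the application defines.

```python
# Hypothetical server table: identifier -> augmented reality information.
SERVER = {"1234": {"label": "museum exhibit", "content": "3D exhibit model"}}

def ar_info_from_marker(payload):
    """Resolve a decoded marker payload to AR information (both modes)."""
    if payload.startswith("AR:"):
        # Mode 1: the AR information itself is encoded in the marker.
        label, content = payload[3:].split("|", 1)
        return {"label": label, "content": content}
    if payload.startswith("ID:"):
        # Mode 2: only an identifier is encoded; the terminal queries the server.
        return SERVER[payload[3:]]
    raise ValueError("unrecognized marker payload")

print(ar_info_from_marker("AR:statue|3D statue model"))
print(ar_info_from_marker("ID:1234"))
```

Mode 2 keeps markers small and lets the server update content without reprinting markers, which matches the patent's description of querying the server with identification information.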

The augmented reality information may be obtained from a single augmented reality information server or from a plurality of such servers, and the objects on which augmented reality information is displayed include all objects for which guide information can be provided, such as buildings and goods.

According to an embodiment of the present invention, when a plurality of augmented reality information items about objects photographed through the camera of a mobile terminal would be displayed on the screen, augmented reality information the user does not need is suppressed so that it does not cover the screen: augmented reality information is displayed only on the object, screen area, and layer the user designates by voice input, so the user can easily perceive only the augmented reality information he or she wants.
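
One way to read the display rule above: every detected overlay stays associated with the scene, but only those matched by the user's voice input are marked visible. The tag vocabulary and word-overlap matching rule below are assumptions for illustration.

```python
def filter_overlays(overlays, voice_query):
    """Mark only voice-matched overlays visible so the rest don't clutter the screen."""
    words = set(voice_query.lower().split())
    return [dict(o, visible=bool(words & set(o["tags"]))) for o in overlays]

overlays = [
    {"name": "cafe",   "tags": ["cafe", "coffee"]},
    {"name": "bank",   "tags": ["bank", "atm"]},
    {"name": "museum", "tags": ["museum", "art"]},
]
shown = [o["name"] for o in filter_overlays(overlays, "find a coffee shop") if o["visible"]]
print(shown)  # ['cafe']
```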

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like reference numerals refer to like elements throughout. In the following description, detailed descriptions of known functions and configurations incorporated herein are omitted where they would obscure the subject matter of the present invention. Before describing the present invention, the terms used throughout the specification are defined. These terms are chosen in consideration of their functions in the embodiments of the present invention and may vary according to the intention or customs of the user or operator; therefore, their definitions should be made based on the contents throughout this specification.

FIG. 1 is a configuration diagram of an augmented reality system apparatus according to an embodiment of the present invention.

Referring to FIG. 1, an augmented reality system apparatus according to the present invention includes an image acquisition unit 110, a display unit 120, a database 130, and a control unit 140, and may further include a sensor unit 150, a user operation unit 160, and a voice recognition unit 170.

The image acquisition unit 110 may be a camera or an image sensor that acquires an image including at least one augmented reality object. It may be a camera that can enlarge, reduce, or rotate the image automatically or manually under the control of the controller 140 during capture. Since the present invention proposes a device including a mobile terminal, or a method of using such a terminal, the image acquisition unit 110 can be the digital camera provided in a smartphone running Android or iOS as its operating system. The object used in the present invention means either a marker existing in the real world or a markerless-based object.

The display unit 120 outputs the image obtained by the image acquisition unit 110 overlaid with the augmented reality information related to at least one object included in that image. When outputting the augmented reality information to the display unit 120, the controller 140 may present it through a user interface in which the augmented reality information is classified into groups according to predetermined attributes.

The controller 140 controls each component to perform the method of providing an augmented reality system according to the present invention, and may be a hardware processor or a software module executed on a hardware processor. That is, an augmented reality application running on a smartphone can serve this role.

The sensor unit 150 provides additional sensing information (current time, position, and photographing direction) to assist the control unit 140 in detecting objects in the image or in detecting the corresponding augmented reality data. The sensor unit may include a GPS receiver for receiving the camera position information signal transmitted by GPS satellites, and a gyro sensor for sensing and outputting the azimuth and tilt angle of the camera.

The user operation unit 160 may include a key input unit, a touch sensor, a mouse, and the like as means for receiving information from the user; in the present invention it is typically the touch screen provided in a smartphone.

Like the operation unit 160, the voice recognition unit 170 is a means for receiving information from the user. In the present invention, it is preferable to use a voice recognition application provided by the platform, such as Google's on Android or Apple's on iOS.

FIG. 2 is a diagram illustrating a speech recognition flow according to an embodiment of the present invention.

Referring to FIG. 2, voice features are extracted from the voice data of a recognized command, and a reference pattern is generated. This process may be repeated, for example ten times, or as many times as actual speech recognition is executed. When the user sends an input signal to enter an actual voice into the terminal, the reference patterns accumulated over those executions are available; when the end-point signal marking the end of the voice arrives, the features of the voice are extracted. An input pattern is generated from the extracted features and compared with the reference pattern to obtain a speech recognition result with improved accuracy.
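
The reference-pattern flow of FIG. 2 can be approximated by naive template matching: average repeated enrollment utterances into one reference vector per command, then classify an input pattern by its nearest reference. Real recognizers use MFCC features with DTW or statistical models; the two-dimensional "feature vectors" here are purely illustrative.

```python
def distance(a, b):
    """Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def enroll(samples):
    """Average repeated utterances of a command into one reference pattern."""
    n = len(samples)
    return [sum(col) / n for col in zip(*samples)]

def recognize(references, input_pattern):
    """Return the command whose reference pattern is nearest the input pattern."""
    return min(references, key=lambda cmd: distance(references[cmd], input_pattern))

# Two commands enrolled from two utterances each (toy feature vectors).
refs = {
    "show": enroll([[1.0, 0.1], [0.9, 0.2]]),
    "hide": enroll([[0.1, 1.0], [0.2, 0.9]]),
}
print(recognize(refs, [0.95, 0.12]))  # show
```

Repeating enrollment and averaging is what gives the "improved accuracy" the text mentions: noise in any single utterance is smoothed out of the reference pattern.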

FIG. 3 is a schematic view showing an augmented reality system according to the present invention.

Referring to FIG. 3, a marker may be photographed with the camera of a smartphone terminal, and the terminal recognizes it as 3D-image-based content. The augmented reality information (3D content) is then output on the terminal's screen through the application, executed on the terminal, that implements the augmented reality system. Voice input can also be performed by executing a voice recognition application supported by the terminal, and the input voice is compared with the current image to display the augmented reality information (3D content) that matches the condition.

The database 130 stores the information for implementing the embodiment of the present invention and may be a server for the augmented reality application running on the smartphone terminal. That is, the data may be kept externally and received over the network, but the present invention is not limited to this; the database may also be built into the terminal.

The database 130 may include object recognition information 161, augmented reality information 162, and interface determination information 163 according to an embodiment of the present invention.

The object recognition information 161 is mapping information for recognizing an object and stores predetermined object feature information. Using it, the controller 140 compares object recognition information obtained through filtering with the object recognition information stored in advance, and judges what the object is. The object recognition information may include the object's shape, color, texture, and pattern, and may also include object position information such as GPS data.

The augmented reality information 162 stores information related to an object and may represent a specific characteristic of the object through a predetermined tag image. Such augmented reality information is managed under the same identifier as the object it is mapped to.

The user interface determination information 163 is interface generation information for presenting the detected augmented reality information; it determines how the detected augmented reality information is classified according to its attribute information and how the resulting groups are displayed.
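
The three stores (161-163) might be modeled as records like the following. The patent defines their roles but not their schemas, so every field name here is an assumption.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ObjectRecognitionInfo:
    """161: feature information used to recognize an object (fields assumed)."""
    object_id: str
    shape: str
    color: str
    gps: Optional[tuple] = None  # optional position information

@dataclass
class AugmentedRealityInfo:
    """162: content managed under the same identifier as the mapped object."""
    object_id: str
    tag_image: str
    content: str

@dataclass
class InterfaceDeterminationInfo:
    """163: how detected AR information is grouped and displayed (assumed fields)."""
    group_by: str = "category"
    max_items: int = 5

rec = ObjectRecognitionInfo("obj-1", shape="cube", color="red")
ar = AugmentedRealityInfo("obj-1", tag_image="tag.png", content="3D model")
print(rec.object_id == ar.object_id)  # True: linked by the shared identifier
```

The shared `object_id` field reflects the text's statement that augmented reality information is managed by being given the same identifier as the mapped object.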

FIG. 4 is a flowchart illustrating a method for providing an augmented reality system according to a preferred embodiment of the present invention.

Referring to FIG. 4, the user executes an augmented reality application for implementing the augmented reality system. Then, when the object recognition mode is set by the user's key input, the controller 140 drives the image acquisition unit 110 in step 410 to acquire an image including at least one object. In step 420, the controller 140 recognizes the objects included in the acquired image. In doing so, the control unit 140 may refer not only to the image information but also to the terminal position information obtained by the sensor unit 150.

In step 430, the controller 140 detects the augmented reality information related to the recognized object. That is, it detects the augmented reality information carrying the same identifier as the one assigned to the recognized object.

In step 440, the user speaks the desired information into the terminal using the voice recognition application supported by the smartphone. The controller 140 compares the augmented reality information detected in step 430 with the voice input and detects the matching information.

In step 450, the controller 140 determines whether the augmented reality information associated with the recognized one or more objects exists.

If it is determined in step 450 that augmented reality data related to the recognized object exists, the controller 140 classifies the augmented reality information according to previously stored criteria in step 460, and in step 470 generates a UI in which the augmented reality information is displayed.

That is, the control unit 140 can assign priorities to the augmented data to be displayed, enabling efficient information retrieval. In step 480, the controller 140 displays the augmented reality information on the smartphone through the generated interface.
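
Steps 460-470 — classifying the detected information and generating a UI with priorities applied — could look like this sketch, where priority is simply the number of voice words matching an item's keywords (an assumed scoring rule; the patent does not define one):

```python
def prioritize(ar_items, voice_words):
    """Steps 460-470: order detected AR information so voice-matched items come first."""
    def score(item):
        # Negative match count so that sorted() puts the best matches first.
        return -sum(w in item["keywords"] for w in voice_words)
    return sorted(ar_items, key=score)

items = [
    {"title": "Building history", "keywords": ["history"]},
    {"title": "Cafe menu",        "keywords": ["cafe", "menu", "coffee"]},
]
ordered = prioritize(items, ["coffee", "menu"])
print(ordered[0]["title"])  # Cafe menu
```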

As described above, the augmented reality system according to the present embodiment is a hybrid augmented reality system that matches object information with voice: it recognizes a photographed marker or spoken command and displays the matched three-dimensional content, which has the advantage of helping children and students learn and immerse themselves.

Further, according to the present invention, a user can list the contents to be searched and find desired information with voice commands alone, without manual operation, by using speech recognition technology, and the system can provide user-driven, immersive content.

Claims (8)

Obtaining an image including an object;
Recognizing at least one object included in the acquired image;
Detecting augmented reality information related to the recognized object;
Inputting voice information and comparing the voice information with the augmented reality information; And
And generating a user interface for separately displaying the augmented reality information matched with the voice information.
The method according to claim 1,
Wherein the object is recognized by a marker-based or markerless method.
The method according to claim 1,
Wherein the voice input uses a voice recognition application provided in a smart phone.
The method according to claim 1,
Wherein the image is acquired by a digital camera provided in a smartphone.
An image acquiring unit acquiring an image including an object;
A display unit for outputting the image or augmented reality information output from the image acquisition unit;
A voice recognition unit for recognizing a voice associated with the image;
Recognizing at least one object included in the image acquired from the image acquisition unit, detecting augmented reality information related to the recognized object, and matching the detected augmented reality information and the voice information recognized by the voice recognition unit And a controller for generating a user interface for separating and displaying the augmented reality information.
6. The system of claim 5,
Wherein the image acquiring unit uses a digital camera provided in a smart phone.
7. The system of claim 5,
Wherein the object uses a marker or a markerless method.
8. The system of claim 5,
Wherein the speech recognition unit is linked to a control unit using a speech recognition application embedded in a smart phone.





KR1020120128083A 2012-11-13 2012-11-13 Hybrid augmented reality using voice recognition and method the same KR20140061009A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
KR1020120128083A KR20140061009A (en) 2012-11-13 2012-11-13 Hybrid augmented reality using voice recognition and method the same

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
KR1020120128083A KR20140061009A (en) 2012-11-13 2012-11-13 Hybrid augmented reality using voice recognition and method the same

Publications (1)

Publication Number Publication Date
KR20140061009A true KR20140061009A (en) 2014-05-21

Family

ID=50890169

Family Applications (1)

Application Number Title Priority Date Filing Date
KR1020120128083A KR20140061009A (en) 2012-11-13 2012-11-13 Hybrid augmented reality using voice recognition and method the same

Country Status (1)

Country Link
KR (1) KR20140061009A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101990284B1 (en) * 2018-12-13 2019-06-18 주식회사 버넥트 Intelligent cognitive technology based augmented reality system using speech recognition
WO2019117494A1 (en) * 2017-12-15 2019-06-20 코디소프트 주식회사 Content-providing device and method
KR20220118672A (en) * 2021-02-19 2022-08-26 대우조선해양 주식회사 System and method for supporting augmented reality with text display function and computer-readable recording medium including the same


Legal Events

Date Code Title Description
WITN Withdrawal due to no request for examination