KR20230030502A

KR20230030502A - Method for generating conversation information using exemplar-based generation model and apparatus for the same

Info

Publication number: KR20230030502A
Application number: KR1020220010973A
Authority: KR
Inventors: 엥흐바야르 에르데네; 김범수; 서석준; 안상일; 장부루; 한승주
Original assignee: 하이퍼커넥트 유한책임회사
Priority date: 2021-08-25
Filing date: 2022-01-25
Publication date: 2023-03-06

Abstract

A method for normalizing an embedding of a neural network model according to various embodiments of the present disclosure may comprise: acquiring a plurality of feature vectors; acquiring first embedding information by individually embedding the plurality of feature vectors; acquiring normalized values of the first embedding information through a first parameter and a second parameter; and outputting a first result value by applying the normalized values for each individual feature vector to a training model. Accordingly, the present invention can provide the effect of generating diverse and interesting answers.

Description

Method and apparatus for generating conversation information using an exemplar-based generation model

Various embodiments of the present disclosure relate to a method and apparatus for generating conversation information using an exemplar-based generation model.

Artificial intelligence (AI) is used in a wide range of industries. Artificial intelligence, which operates in a manner similar to human thinking, can be used to extract features of an object that a sampled subject attempts to approach.

Open-domain conversation is the field concerned with building chatbots that can carry on a natural conversation without a predetermined topic. The artificial intelligence models used in this field fall broadly into two groups: generation-based conversation models and retrieval-based (search-based) conversation models.

A generation-based conversation model, built on a sequence-to-sequence architecture, takes the extracted and input conversation context information (context) and generates a response appropriate to it. A retrieval-based conversation model works from a response set defined in advance: it searches that set for the response that best fits the input context information and returns it.
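The retrieval step described above can be sketched roughly as follows. This is a minimal illustration, not the patent's implementation: the toy two-dimensional embeddings, the `RESPONSE_SET` contents, and the use of cosine similarity as the matching criterion are all assumptions made for the example.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def retrieve(context_vec, response_set):
    """Return the response text whose embedding best matches the context."""
    return max(response_set, key=lambda item: cosine(context_vec, item[1]))[0]

# Hypothetical pre-defined response set: (text, toy 2-D embedding).
RESPONSE_SET = [
    ("I love hiking on weekends.", [0.9, 0.1]),
    ("Pasta is my favorite dinner.", [0.1, 0.9]),
]

# A context embedding that points mostly along the first axis.
best = retrieve([0.8, 0.2], RESPONSE_SET)
```

In practice the embeddings would come from a trained encoder and the search would run over a much larger set, which is where a dedicated similarity search library becomes useful.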

When paired with a large-scale language model, a generation-based conversation model has the advantage of producing fluent responses that fit the given context by drawing on the language model's rich knowledge. However, in aiming for the most natural response, it tends to produce bland, uninteresting responses that merely avoid disrupting the flow of the conversation.

A retrieval-based conversation model, by contrast, sometimes returns responses that do not fit the identified or input context information, but it offers more diverse and interesting responses than a generation-based model. In addition, when used with a high-performance search library (for example, FAISS), it can retrieve a response appropriate to the context information faster than a generation-based model can produce one.

Accordingly, exemplar-based generation models have been proposed to combine the strengths of both: the natural response generation of the generation-based model and the diverse, interesting responses of the retrieval-based model. In these models, the response that the retrieval-based model retrieves for a given context is supplied to the generation-based model as an exemplar, and the generation-based model turns that exemplar into a more natural response.

Related work has proposed using the gold (correct) response of the generation-based conversation model when retrieving exemplar answers with the retrieval-based conversation model (Angela Fan et al., 2021. Augmenting transformers with KNN-based composite memory for dialog).

Existing exemplar-based generation models may ignore the answers retrieved and provided by the retrieval-based model, or may return the given exemplar essentially unchanged as the answer. As training progresses, these models also tend to become biased toward one of these two behaviors.

Accordingly, the method for generating conversation information using an exemplar-based generation model according to embodiments overcomes these problems and proposes an exemplar-based generation model that makes appropriate use of the given exemplars to generate more diverse yet fluent answers.

A method for generating conversation information using an exemplar-based generation model according to embodiments may include at least one of: identifying first context information; identifying, based on a first model, a first response set corresponding to the first context information; identifying a response subset selected from the first response set based on gold response information corresponding to the first context information; and training a second model based on the first context information and the response subset.

In addition, the response subset according to embodiments may be selected from among candidate responses identified based on the gold response information and a clustering algorithm.

In addition, the response subset according to embodiments may be selected by excluding, from the candidate responses, at least one answer that falls within a specific range of the value corresponding to the gold response information in the embedding space.
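One way to picture this selection is the sketch below. It is only an illustration under stated assumptions: a plain nearest-neighbor lookup around the gold response stands in for the clustering algorithm, Euclidean distance stands in for the unspecified distance measure, and `k` and `min_dist` are hypothetical parameters.

```python
import math

def select_exemplars(gold_vec, candidates, k=3, min_dist=0.2):
    """Pick up to k candidates nearest to the gold response, skipping near-duplicates.

    candidates: list of (text, embedding) pairs.
    min_dist:   hypothetical radius; anything closer to the gold response is excluded.
    """
    # Step 1: candidate responses are the k nearest neighbors of the gold response.
    scored = sorted((math.dist(gold_vec, vec), text) for text, vec in candidates)
    nearest = scored[:k]
    # Step 2: drop candidates that sit too close to the gold response.
    return [text for dist, text in nearest if dist >= min_dist]

gold = [1.0, 0.0]
candidates = [
    ("A", [1.0, 0.05]),   # almost identical to the gold response
    ("B", [0.7, 0.3]),
    ("C", [0.5, 0.5]),
    ("D", [-1.0, 0.0]),   # far away, not among the 3 nearest
]
subset = select_exemplars(gold, candidates)
```

Here "A" is removed because it lies within the exclusion radius, and "D" never becomes a candidate, so the subset keeps exemplars that are related to the gold response without nearly duplicating it.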

Furthermore, the conversation model training method according to embodiments may further include setting weight information for each response in the response subset based on the first context information, and the step of training the second model may train the second model based on the set weight information.

Furthermore, the weight information according to embodiments may be set based on a relevance score for each answer in the response subset, and the relevance score for an answer may be calculated from the value corresponding to the first context information and the value corresponding to that answer in the embedding space.
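As a small illustration, a relevance score of this kind could be computed as an inner product of the two embeddings. The inner product is an assumed choice (the text only says the score is calculated from the two embedding-space values), and the vectors and answer texts below are made up for the example.

```python
def relevance_score(context_vec, answer_vec):
    """Inner product of the context and answer embeddings (an assumed scoring choice)."""
    return sum(c * a for c, a in zip(context_vec, answer_vec))

# Toy embeddings for one context and two answers in the response subset.
context = [0.5, 0.5]
subset = {"Sounds fun!": [1.0, 1.0], "I had pasta.": [1.0, -1.0]}

scores = {text: relevance_score(context, vec) for text, vec in subset.items()}
```

The first answer points in the same direction as the context and receives the higher score; the second is orthogonal to it and scores zero.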

In addition, the second model according to embodiments may identify second context information for conversation information obtained from a user and, based on the second context information, provide gold response information for the second context information.

Furthermore, the first context information according to embodiments may include at least one piece of conversation information obtained from a user.

Furthermore, the second model according to embodiments may be trained by performing backpropagation using a loss function calculated based on the weight information, and the weight information may be calculated by normalizing the relevance score of each answer.
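The weighting described here might look like the following sketch. The softmax is an assumption (the text only says the scores are normalized), and `per_exemplar_nll` is a hypothetical list of per-exemplar negative log-likelihoods that a real generation model would supply; a deep learning framework would then backpropagate through the resulting scalar.

```python
import math

def softmax(scores):
    """Normalize raw relevance scores into weights that sum to 1."""
    shift = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def weighted_loss(relevance_scores, per_exemplar_nll):
    """Weight each exemplar's loss term by its normalized relevance score."""
    weights = softmax(relevance_scores)
    return sum(w * nll for w, nll in zip(weights, per_exemplar_nll))

# Two relevant exemplars and one weakly related exemplar with a large loss.
weights = softmax([2.0, 2.0, 0.0])
loss = weighted_loss([2.0, 2.0, 0.0], [1.0, 1.0, 3.0])
```

Because the weakly related exemplar receives a small weight, its large loss term contributes little, so the total stays close to the loss of the relevant exemplars.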

The method for generating conversation information using an exemplar-based generation model according to embodiments learns to extract an answer suited to the context information of a conversation based on a combination of the retrieval model 200 and the generation model 201. It can thereby generate fluent answers that fit the given conversation context by drawing on rich knowledge, while also generating diverse and interesting answers.

The electronic device according to embodiments excludes, from the candidate exemplar answers, those lying too close to (or too highly correlated with) the value corresponding to the gold response information in the embedding space. This helps the generation model 201 learn to produce answers that are both appropriate to the context and diverse.

The electronic device according to embodiments learns to generate an optimal answer by additionally taking into account the exemplar answers and a weight for each of them. By reflecting the exemplars appropriately, it can provide answers that fit the context and do not read awkwardly while remaining fluent and creative, so that users of the conversation model can hold conversations they do not tire of.

FIG. 1 is a schematic block diagram illustrating the configuration of an electronic device according to various embodiments of the present disclosure.
FIG. 2 shows part of a configuration diagram of an electronic device according to embodiments.
FIG. 3 illustrates an example of the overall result of an electronic device according to embodiments generating answer information from context information.
FIG. 4 illustrates examples of operations by which an electronic device according to embodiments trains the generation model unit to generate answer information from context information.
FIG. 5 illustrates an example of the operation of the retrieval model unit according to embodiments.
FIG. 6 illustrates an example of the result of an operation by which the retrieval model unit according to embodiments learns conversation information and answer information.
FIG. 7 illustrates examples of operations of an electronic device according to embodiments.
FIGS. 8 to 11 compare the performance of the open-domain conversation model of an electronic device according to embodiments with that of other open-domain conversation models.

The terms used in the embodiments are, as far as possible, general terms in wide current use, selected with the functions in the present disclosure in mind; they may nevertheless vary with the intent of those skilled in the art, with precedent, or with the emergence of new technology. In certain cases the applicant has chosen a term arbitrarily, and its meaning is then described in detail in the corresponding part of the description. The terms used in the present disclosure should therefore be defined by their meaning and by the overall content of the present disclosure, not simply by their names.

Throughout the specification, when a part is said to "include" a component, this means that it may further include other components rather than excluding them, unless specifically stated otherwise. Terms such as "...unit" and "...module" used in the specification refer to a unit that processes at least one function or operation, which may be implemented in hardware, in software, or in a combination of hardware and software.

Throughout the specification, the expression "at least one of a, b, and c" may cover 'a alone', 'b alone', 'c alone', 'a and b', 'a and c', 'b and c', or 'all of a, b, and c'.

A "terminal" as referred to below may be implemented as a computer or a portable terminal capable of connecting to a server or another terminal over a network. Here, the computer includes, for example, a notebook, desktop, or laptop equipped with a web browser. The portable terminal is, for example, a wireless communication device that guarantees portability and mobility, and may include any kind of handheld wireless communication device, such as a terminal based on IMT (International Mobile Telecommunication), CDMA (Code Division Multiple Access), W-CDMA (W-Code Division Multiple Access), or LTE (Long Term Evolution), a smartphone, or a tablet PC.

Embodiments of the present disclosure are described in detail below with reference to the accompanying drawings so that those of ordinary skill in the art to which the present disclosure pertains can readily practice them. The present disclosure may, however, be implemented in many different forms and is not limited to the embodiments described herein.

Hereinafter, embodiments of the present disclosure will be described in detail with reference to the accompanying drawings.

In describing the embodiments, technical content that is well known in the technical field to which the present disclosure belongs and is not directly related to the present disclosure is omitted. Omitting unnecessary description conveys the gist of the present disclosure more clearly without obscuring it.

For the same reason, some components in the accompanying drawings are exaggerated, omitted, or shown schematically. The size of each component does not fully reflect its actual size. In each drawing, identical or corresponding components are given the same reference number.

The advantages and features of the present disclosure, and the methods of achieving them, will become clear from the embodiments described in detail below together with the accompanying drawings. The present disclosure is not, however, limited to the embodiments disclosed below and may be implemented in various different forms. These embodiments are provided only to make the present disclosure complete and to fully convey the scope of the invention to those of ordinary skill in the art to which the present disclosure pertains; the present disclosure is defined only by the scope of the claims. Like reference numerals designate like elements throughout the specification.

It will be understood that each block of the process flowcharts, and combinations of blocks in the flowcharts, can be carried out by computer program instructions. These instructions may be loaded onto a processor of a general-purpose computer, a special-purpose computer, or other programmable data processing equipment, so that the instructions executed by that processor create means for performing the functions described in the flowchart block(s). The instructions may also be stored in a computer-usable or computer-readable memory that can direct a computer or other programmable data processing equipment to implement a function in a particular manner, so that the instructions stored in that memory produce an article of manufacture containing instruction means that perform the functions described in the flowchart block(s). The instructions may also be loaded onto a computer or other programmable data processing equipment so that a series of operational steps is performed on it to produce a computer-executed process; the instructions that run on the equipment thereby provide steps for executing the functions described in the flowchart block(s).

In addition, each block may represent a module, segment, or portion of code that includes one or more executable instructions for carrying out the specified logical function(s). It should also be noted that in some alternative implementations the functions mentioned in the blocks may occur out of order. For example, two blocks shown in succession may in fact be executed substantially concurrently, or may sometimes be executed in reverse order, depending on the functions involved.

Artificial intelligence (AI) may be a type of computer program that imitates human intelligence through a series of logical algorithms that think, learn, and make judgments as humans do. So-called artificial intelligence can process complex operations on a processor, which corresponds to the human brain, through a neural network that resembles the human nervous system. This specification describes a process of normalizing and modeling features through machine learning and other learning, which may be included in deep learning. In this specification, the loanword and native Korean terms for machine learning (머신 러닝, 기계 학습) are used interchangeably.

A neural network may refer to a network that models the operating principle of neurons, the basic units of the human nervous system, and the connections between them. A neural network may be a data processing system in which individual nodes or processing elements are connected in layers. A neural network may include a plurality of layers, and each layer may include a plurality of neurons. A neural network may also include synapses, which correspond to the neural stimulators that carry data between neurons. In this specification, the loanword and native Korean terms for layer (레이어, 계층) are used interchangeably.

Specifically, a neural network may refer in general to a data processing model in which artificial neurons change the strength of their synaptic connections through repeated learning and thereby acquire the ability to solve a given problem or a problem in which variables arise. In this specification, the terms neural network (뉴럴 네트워크) and artificial neural network (인공 신경망) are used interchangeably.

A neural network may be trained using training data. Specifically, training may include determining the parameters of the neural network from feature data in order to achieve a goal such as classification, regression, or clustering of the input data. More specifically, the elements that make up these parameters may include weights and biases.
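To make the notion of learning a weight and a bias concrete, the sketch below fits a single linear unit to a few points by gradient descent on the mean squared error. It is a generic illustration rather than anything specific to the disclosure; the data, learning rate, and epoch count are arbitrary assumptions.

```python
def train_linear(samples, lr=0.05, epochs=2000):
    """Fit y = w * x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(samples)
    for _ in range(epochs):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in samples) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in samples) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Toy training data generated from y = 2x + 1.
w, b = train_linear([(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)])
```

After training, `w` and `b` approach the slope and intercept that generated the data, which is the sense in which training "determines" the parameters.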

A neural network may be trained on input data to classify or cluster it according to patterns, and the trained neural network may be referred to as a trained model. Specifically, training methods may be divided into supervised learning, unsupervised learning, semi-supervised learning, and reinforcement learning. More specifically, supervised learning may be a form of machine learning that infers a function from training data. Among the functions inferred through machine learning, one that outputs continuous values may be regression, and one that predicts the class of the input data and outputs it may be classification.

In supervised learning, labels may be given for the training data, and a label may contain the meaningful result value that the neural network is expected to infer. Specifically, the result value that the neural network is expected to infer may be labeling data. More specifically, training data and the labeling data corresponding to it may be organized into a single training set, and the neural network may obtain input values and result values in the form of that training set.

The training data may include a plurality of feature vectors, and the neural network may run inference on the training data, attach a label to each feature vector, and output the labeling data as the result. From the training data and the labeling data, the neural network may infer a function describing the relationship between them. The parameters for individual vectors may also be optimized through feedback on the function inferred by the neural network.

The following describes an exemplar-based generative model for open-domain conversation and a method for improving it.

An exemplar-based generative model generates responses based on the exemplar answers (exemplars) extracted by the retrieval model unit, and uses both a generative model and a retrieval model. The electronic device according to the embodiments described in this specification includes an exemplar-based generative model in which a generative model and a retrieval model are connected or combined. Existing exemplar-based generative models, however, often either ignore the retrieved exemplars while generating an answer or generate answers that are overfitted to them. The electronic device according to embodiments therefore includes CORGE (Connecting Retriever and Generator), which connects the retrieval model and the generative model, and may perform some or all of the operations for training the exemplar-based generative model. In the step of training the exemplar-based generative model, the electronic device according to embodiments may use not only the conversation context information but also the gold response information as the query for selecting exemplar answers. It may then exclude exemplar answers that are too similar to the gold response information in order to mitigate the overfitting problem described above. Because some of the remaining exemplar answers may be unrelated to the given context, the electronic device according to embodiments may additionally use a relevance score between the context information and each exemplar answer when training the generative model.

Gold response information according to embodiments may refer, for example, to the optimal response information, appropriate response information, or preset response information for a specific context; it may also be, for example, the response information set as the supervised learning target within the training data.

The correspondence between the terms used in this specification and the English terms used in the related paper may be as follows.

Related paper terminology | Terminology in this specification
Exemplar-based generative model | 예시-기반 생성 모델 (exemplar-based generative model)
Human evaluation | 인적 평가 (human evaluation)
Retriever, Retrieval model | 검색부, 검색 모델부, 검색 모델 (retrieval unit, retrieval model unit, retrieval model)
Generator, Generation model | 생성부, 생성 모델부, 생성 모델 (generation unit, generation model unit, generation model)
One-to-many problem | 원-투-매니 문제 (one-to-many problem)
Exemplar | 예시 답변, 예시 (exemplar answer, exemplar)
Response | 응답 (response)
CORGE | 실시예들에 따른 전자 장치, 실시예들에 따른 학습 방법 등 (electronic device according to embodiments, training method according to embodiments, etc.)
Gold response | 골드 응답, 골드 응답 정보 (gold response, gold response information)
Appropriateness | 적합성 (appropriateness)
Informativeness | 정보성 (informativeness)
Knowledge-grounded generation model | 지식-기반 생성 모델 (knowledge-based generation model)
Context | 컨텍스트, 컨텍스트 정보 (context, context information)
Relevance score | 연관성 점수 (relevance score)
Normalized relevance score | 정규화된 연관성 점수 (normalized relevance score)
Jaccard filter | 재커드 필터 (Jaccard filter)
Jaccard similarity | 재커드 유사도 (Jaccard similarity)
Over-fitting | 과적합 (overfitting)

FIG. 1 is a schematic block diagram illustrating the configuration of an electronic device according to various embodiments of the present disclosure.

The electronic device may include a device that contains a neural network. It is a device capable of performing machine learning using training data, and may include a device capable of learning with a model composed of a neural network. For example, the electronic device may be configured to receive, classify, store, and output data to be used for data mining, data analysis, intelligent decision making, and machine learning algorithms.

The electronic device may include various devices for training a neural network. For example, the electronic device may be implemented as a set of multiple servers, a cloud server, or a combination thereof. Specifically, the electronic device may obtain result values from data analysis or training through distributed processing.

Referring to FIG. 1, the electronic device may include a processor 110, an input/output unit (input/output module) 120, and a memory 130 as components. The components of the electronic device shown in FIG. 1 are not limiting; components may be added or replaced.

The processor 110 may control or predict the operation of the electronic device through data analysis and machine learning algorithms. The processor 110 may request, retrieve, receive, or utilize the data to be used for training, and may control the electronic device to execute a desired operation learned through training. The processor 110 may include, for example, a learning processor according to embodiments that requests, retrieves, receives, or utilizes training data, preprocesses it, or performs training with it.

The processor 110 may be configured to derive and detect a result value for an input value based on a user input or a natural language input. The processor 110 may include, for example, a learning processor according to embodiments that is configured to collect data for processing and storage. Collecting data may include sensing data through a sensor, extracting data stored in the memory 130, or receiving data from an external device through the input/output unit 120.

The processor 110 may convert the operation history of the electronic device into data and store it in the memory 130. The processor 110 may obtain the best result value for performing a specific operation based on the stored operation history data and the trained model.

When a specific operation is performed, the processor 110 may analyze the history of that operation's execution through data analysis and machine learning algorithms. Specifically, the processor 110 may update previously trained data based on the analyzed history. In other words, the processor 110 may improve the accuracy of the data analysis and machine learning algorithms based on the updated data.

For example, the processor 110 according to embodiments, or a learning processor included in the processor 110, may train a neural network using training data or a training set. For example, it may train the neural network with data obtained by preprocessing acquired input values. As another example, it may train the neural network with preprocessed data stored in the memory 130. Specifically, the processor 110 or the learning processor may determine an optimized model of the neural network, and the parameters used for the optimization, by repeatedly training the neural network with various training methods.

The input/output unit 120 may transmit data stored in the memory 130 of the electronic device, or data processed by the processor 110, to another device, or may receive data from another device into the electronic device. The input/output unit 120 may include, for example, a receiver, a transmitter, or a transceiver that receives data from or transmits data to another electronic device. Furthermore, the input/output unit 120 according to embodiments may include one or more input/output modules for receiving or outputting electronic signals or data from or to components or devices that are physically or logically connected to the electronic device.

The memory 130 may store a model trained by the processor 110 or the neural network. For example, the memory 130 may store trained models and models under training separately. Specifically, the memory 130 may store the models produced while the neural network is being trained, so that trained models are kept according to the training history.

For example, the memory 130 according to embodiments may include a model storage unit and/or a database.

For example, the model storage unit may store a model (for example, a neural network model) that is being trained or has already been trained through the processor 110. The model storage unit may also store an updated version of a trained model.

For example, the database may store input data (input values), training data for model training, model training history data, and the like. The input data stored in the database may include both data processed to suit model training and unprocessed raw data.

The processor 110 according to embodiments may perform the steps of: identifying first context information; identifying, based on a first model, a first response set corresponding to the first context information; identifying a response subset selected from the first response set based on gold response information corresponding to the first context information; and training a second model based on the first context information and the response subset.

This specification proposes a simple training method that addresses a common shortcoming of the example-based generative models used in open-domain conversation (for example, Wu et al., 2019; Cai et al., 2019; Gupta et al., 2020). Example-based generative models combine a search model (retrieval model; for example, Humeau et al., 2019; Mazaré et al., 2018) and a generative model (Adiwardana et al., 2020; Roller et al., 2020; Zhang et al., 2020; Brown et al., 2020) into a single framework and generate responses in two steps. First, the search unit (retriever, search model) retrieves an exemplar using the given context as a query; then the generation unit (generator, generative model) generates a response based on the given context and the exemplar. Example-based generative models generate more specific responses than generative models and more fluent responses than search models (Weston et al., 2018).

Despite their success, conventional example-based generative models have two drawbacks. First, as shown in FIG. 3 below, primitive example-based generative models (for example, Weston et al., 2018; Cai et al., 2018) tend to ignore the exemplars entirely and behave much like a vanilla generative model. This is caused by the one-to-many problem (Li et al., 2016): during training, the exemplar retrieved for the given context often differs markedly from the gold response (gold response information). To alleviate this problem, example-based generative models may use the gold response (Roller et al., 2020) or a perturbed gold response as the exemplar during training. Second, however, these training methods make the generation unit depend excessively on the retrieved exemplars, and the generative model may use the provided tokens inappropriately (for example, as in case (b) of FIG. 3 below). Both drawbacks can degrade the quality of the generated responses.

This specification proposes a model that mitigates these drawbacks by providing, while training example-based generative models, exemplars that are semantically related to the gold response (gold response information) but reasonably distant from it. The proposed model, CORGE (COnnecting Retriever and GEnerator), is a training method that selects appropriate exemplars. First, CORGE according to embodiments may use the gold response to select exemplars that are similar to the gold response but not lexically identical to it. However, the selected exemplars may be irrelevant to the given context, because their selection depends entirely on the gold response. CORGE may therefore calculate a relevance score between the context and each exemplar. CORGE then uses this score to give more weight to relevant exemplars and to penalize exemplars that are dissimilar to the given context information. CORGE is applicable to example-based generative models in general, and it improves the quality of the generated responses in terms of appropriateness and informativeness.
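The idea of keeping exemplars that are similar to the gold response but not lexically identical to it can be illustrated with a token-level Jaccard filter, a term that appears in the terminology list above. The following is only a minimal sketch under stated assumptions: whitespace tokenization, the function names `jaccard_similarity` and `filter_exemplars`, and the threshold `max_overlap=0.6` are all hypothetical choices made for illustration and are not taken from this specification.

```python
from __future__ import annotations


def jaccard_similarity(a: str, b: str) -> float:
    """Token-level Jaccard similarity of two whitespace-tokenized strings."""
    set_a, set_b = set(a.lower().split()), set(b.lower().split())
    if not set_a and not set_b:
        return 1.0  # two empty strings are treated as identical
    return len(set_a & set_b) / len(set_a | set_b)


def filter_exemplars(gold: str, candidates: list[str],
                     max_overlap: float = 0.6) -> list[str]:
    """Keep candidates whose lexical overlap with the gold response is
    at most max_overlap (hypothetical threshold), so that near-copies of
    the gold response are dropped."""
    return [z for z in candidates if jaccard_similarity(gold, z) <= max_overlap]


gold = "i love playing tennis on weekends"
candidates = [
    "i love playing tennis on weekends",    # identical to gold: dropped
    "tennis is my favorite weekend sport",  # related but distinct: kept
]
kept = filter_exemplars(gold, candidates)
```

In this sketch, a lower threshold removes more near-duplicates of the gold response, at the cost of also discarding some closely related exemplars.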

In other words, existing example-based generative models tend either to rely excessively on the exemplars extracted and selected by the search model or to ignore them. That is, they fail to reduce how often they generate responses that are overfitted to the extracted and selected exemplars, or responses that are unrelated to them.

In contrast, the model proposed in this specification (for example, the model formed by combining 200 and 201 of FIG. 2) may have a configuration that improves on the existing search model (for example, 200 of FIG. 2). It further re-verifies the exemplars, either by removing, from the extracted and selected exemplars, those that risk producing overfitted response information, or by weighting the exemplars based on their relevance scores. With this configuration, the proposed model prevents the generation of responses that are overfitted to the exemplars extracted and selected by the search model, which maximizes the training efficiency of the generation model unit, and it reduces the frequency of responses unrelated to the extracted and selected exemplars, which helps a natural conversation continue.

Accordingly, this specification describes the following contributions and proposes the configuration of improved embodiments of an example-based generative model for achieving them (for example, the combination of 200 and 201 of FIG. 2).

1) This specification shows that existing example-based generative models either ignore the exemplars or generate responses that are overfitted to them.

2) CORGE, the training method proposed in this specification, alleviates these drawbacks by selecting appropriate exemplars and weighting them by the relevance scores evaluated by the search model.

3) Human evaluation results show that CORGE significantly improves the performance of example-based generative models in terms of appropriateness and informativeness.

Example-based generative models are described below in connection with the CORGE training method proposed in this specification.

While generative models have performed successfully in open-domain conversation, they are well known for producing bland, uninformative responses (Li et al., 2016; Liu et al., 2016; Serban et al., 2017; Li et al., 2019; Holtzman et al., 2019; Welleck et al., 2019). Example-based generative models overcome this problem. Wu et al. (2019) introduced an example-based generative model for open-domain conversation. Their model retrieves a context-exemplar pair conditioned on the input context and encodes the lexical difference between the input context and the retrieved context into an edit vector. The response is generated by feeding the exemplar and the edit vector to the generation unit. Weston et al. (2018) and Roller et al. (2020) likewise retrieve an exemplar using the given context as a query, concatenate the exemplar with the context, and feed the concatenated input to the generation unit to produce the final response for the open-domain conversation. Cai et al. (2018, 2019) remove irrelevant information from the exemplar and use the masked exemplar to inform the generation unit when generating a response. Gupta et al. (2020) condition the generation unit on the retrieved exemplars together with the semantic frames extracted from those exemplars. This specification does not consider the latter model as a baseline, because it requires an additional semantic frame extractor and is complementary to the proposed training method.

In summary, the electronic device according to embodiments may perform the training method of CORGE described above. Unlike existing example-based generative models, the electronic device selects exemplars that are similar to the gold response but at some distance from it, and weights the selected exemplars by calculating a relevance score for each. As a result, the generated response neither ignores the selected exemplars nor overfits to them. The specific configuration of the electronic device is described with reference to FIG. 2, the process by which the electronic device generates a correct response is described with reference to FIGS. 3 to 7, and FIGS. 8 to 11 show the improved performance of the electronic device.
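The weighting step summarized above can be sketched as follows. This passage does not state how the relevance scores are normalized or how the weights enter the training objective, so the sketch makes two explicit assumptions: the scores are normalized with a softmax, and the weights form a weighted average of the generator's likelihoods of the gold response, one per exemplar. The names `normalized_weights` and `weighted_nll` and the `temperature` parameter are hypothetical.

```python
import math
from typing import List, Sequence


def normalized_weights(scores: Sequence[float], temperature: float = 1.0) -> List[float]:
    """Softmax over relevance scores (assumed normalization).
    Subtracting the maximum keeps the exponentials numerically stable."""
    m = max(scores)
    exps = [math.exp((s - m) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_nll(weights: Sequence[float], log_probs: Sequence[float]) -> float:
    """Negative log of the weight-averaged likelihood of the gold response.
    log_probs[j] is assumed to be the generator's log-likelihood of the
    gold response given the context and the j-th exemplar."""
    return -math.log(sum(w * math.exp(lp) for w, lp in zip(weights, log_probs)))


# Three selected exemplars with hypothetical relevance scores.
weights = normalized_weights([2.0, 0.5, -1.0])
loss = weighted_nll(weights, [math.log(0.30), math.log(0.10), math.log(0.05)])
```

Under these assumptions, an exemplar with a low relevance score contributes little to the loss, so the generator is not pushed to rely on exemplars that are irrelevant to the context.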

FIG. 2 shows part of a configuration diagram of an electronic device according to embodiments.

Specifically, FIG. 2 shows a configuration, or a set of instructions, with which the electronic device according to embodiments trains an artificial intelligence model for open-domain conversation. The electronic device may represent part or all of an example-based generative model (exemplar-based generation model) for open-domain conversation. The search model unit 200 of the electronic device receives context information of a conversation and the gold response information corresponding to that context information, and learns from them. By learning this information, the electronic device may include an artificial intelligence model that receives conversation information (or context information) and predicts or generates response information for it.

This specification uses the following notation to describe the structure and operation of example-based generative models and to explain the problems of existing models.

$D = \{(c_i, r_i) \mid 1 \le i \le n\}$ denotes the dialogue dataset, which consists of $n$ pairs of context information $c$ and response information $r$. Example-based generative models may include two components: a search unit (retriever) $\mathcal{R}$ and a generation unit (generator) $\mathcal{G}$. For given context information $c_i$, the search unit identifies the top-scoring exemplar according to the relevance score $S_{\mathcal{R}}(z, c_i)$ of each exemplar $z \in R$ in a predefined response set $R$. The generation unit uses the exemplar $z$ to compute the probability $P_{\mathcal{G}}(r \mid c_i, z)$ of a response to the context information $c_i$.

Here, the search unit may also be called, for example, a search model (retrieval model), and the generation unit may also be called, for example, a generative model.
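The retrieval step described above, in which the search unit returns the top-scoring exemplar from a predefined response set, can be sketched as below. This is an illustration only: the `score_fn` argument stands in for whatever relevance score the search model computes, and `word_overlap_score` is a toy scorer invented for this example; it is not the scoring method of this specification.

```python
from typing import Callable, Sequence, Tuple


def retrieve_top_exemplar(
    context: str,
    response_set: Sequence[str],
    score_fn: Callable[[str, str], float],
) -> Tuple[str, float]:
    """Return the exemplar in response_set with the highest relevance
    score for the given context, together with that score."""
    scored = [(score_fn(z, context), z) for z in response_set]
    best_score, best_exemplar = max(scored, key=lambda pair: pair[0])
    return best_exemplar, best_score


def word_overlap_score(exemplar: str, context: str) -> float:
    """Toy relevance score (assumption): number of shared lowercase tokens."""
    return float(len(set(exemplar.lower().split()) & set(context.lower().split())))


exemplar, score = retrieve_top_exemplar(
    "do you like tennis",
    ["i like tennis a lot", "the weather is nice"],
    word_overlap_score,
)
```

In a real system the scorer would typically compare learned embeddings rather than raw tokens; the generation unit would then receive both the context and the returned exemplar.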

For example, one exemplar (example answer) may be represented (embedded) as a list or vector of data having a plurality of dimensions, so that one exemplar may be expressed as one or more positions in an embedding space. Likewise, context information according to embodiments may be represented (embedded) as a list or vector of data having a plurality of dimensions, and may be expressed as one or more positions in the embedding space.

A relevance score $S_{\mathcal{R}}(z, c_i)$ according to embodiments may be an index of the relevance between at least two exemplars, or between context information and a specific exemplar. For example, the relevance score may represent the relevance between two exemplars (or between one exemplar and one piece of context information) located in the embedding space.

Meanwhile, the concept of "relevance" described in this specification may include notions such as the distance, density, or similarity between two vectors, based on the values of the embedded vectors of two exemplars located in the embedding space.
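Two common ways of turning "distance" or "similarity" between embedded vectors into a number are cosine similarity and Euclidean distance. The sketch below shows both for plain Python lists; it is a generic illustration, and this passage does not say which measure, if either, the embodiments use. The function names are hypothetical.

```python
import math
from typing import Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between u and v; returns 0.0 if either is a zero vector."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def euclidean_distance(u: Sequence[float], v: Sequence[float]) -> float:
    """Straight-line distance between u and v in the embedding space."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


context_vec = [1.0, 0.0]
exemplar_vec = [0.0, 1.0]
similarity = cosine_similarity(context_vec, exemplar_vec)
distance = euclidean_distance([0.0, 0.0], [3.0, 4.0])
```

A higher cosine similarity or a smaller Euclidean distance would both indicate that two embedded items are more closely related under the respective measure.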

Here, $\mathcal{R}$ may denote the search model unit, which can identify and store a plurality of exemplars represented in the embedding space. $c_i$ may denote context information according to embodiments, and $z$ may denote one of the plurality of exemplars according to embodiments.

The method of calculating the relevance score according to embodiments may be, for example, as shown in FIG. 13.

Meanwhile, referring to FIG. 2, for example, the electronic device according to embodiments may include a search model unit 200 and a generation model unit 201.

The generation model unit 201 according to embodiments may include an artificial intelligence model that, on receiving the context information of a conversation, outputs response information suited to that context. This artificial intelligence model may include an artificial neural network model trained on training set data, and the training set data may include, for example, context information of a conversation and the gold response information corresponding to that context information.

Here, the training set data according to embodiments may further include information on one or more exemplars, so that the gold response information corresponding to the context information of a conversation can be output appropriately from that context information. That is, the generation model unit 201 according to embodiments may train the artificial intelligence model using not only the context information and gold response information of the conversation but also information on the exemplar(s).

Accordingly, the search model unit 200 according to embodiments may provide the exemplar(s). The search model unit 200 may receive context information of a conversation and the gold response information corresponding to that context information, generate or select exemplar(s) according to embodiments, and pass the generated or selected exemplars to the generation model unit 201 as training set data.

By learning to produce a response suited to the context information of a conversation with an example-based generative model (exemplar-based generation model) that combines the search model unit 200 and the generation model unit 201, the electronic device according to embodiments can generate fluent responses that fit the given conversation context and draw on rich knowledge, while also generating diverse and interesting responses.

Meanwhile, an example-based generative model composed of a conventional search model and a conventional generative model has the following drawbacks.

For example, as shown in Roller et al. (2020), which discloses a conventional example-based generative model composed of a search model and a generative model, the primitive example-based generative model (Weston et al., 2018) tends to ignore the retrieved exemplar when generating a response, because of the one-to-many problem in open-domain conversation (Li et al., 2016). Because the search model extracts the exemplar based on the context, the retrieved exemplar may differ markedly from the gold response information (gold response), as in case (a) of FIG. 3, even though both the retrieved exemplar and the gold response are relevant to the given context information. Since such a retrieved exemplar does not help in generating the gold response, the conventional generative model is trained to ignore the retrieved exemplar and generates responses using only the context information.

To make the generative model use the retrieved exemplars more actively, some approaches do not use retrieved exemplars during training: Roller et al. (2020) use the gold response, and Cai et al. (2019) use a perturbed gold response, as the exemplar. However, if the exemplar $z$ and the gold response $r$ are excessively similar (for example, case (b) of FIG. 3), the conventional example-based generative model may learn to rely excessively on the exemplar. As a result, the generative model may produce a response that is over-fitted to the exemplar by directly copying the exemplar's tokens.

In summary, when training is based on an example-based generative model (exemplar-based generation model) that combines the conventional search model and the conventional generative model described above, the generative model may either ignore the given exemplars or derive its response by using the given exemplar(s) as they are. For example, if the exemplar selected or identified by the search model has very little relevance to the actual context information, the generative model ignores the given exemplar and cannot be trained properly. Conversely, if the exemplar selected or identified by the search model is highly relevant to the actual context information, the generative model merely uses and outputs the given exemplar as it is, and again cannot be trained properly.

The specific problems of the example-based generation model comprising the conventional search model and the conventional generative model described above are explained in detail with reference to FIG. 3.

To overcome the problems of the example-based generation model comprising the conventional search model and the conventional generative model described above, the search model unit 200 according to embodiments may further include at least one of a response storage unit 200a, a candidate example answer identification unit 200b, an example answer selection unit 200c, and a weight calculation unit 200d.

The response storage unit 200a may include information on a plurality of responses. The response storage unit 200a may store the contents of the plurality of responses, and may also store the position or value of each response in an embedding space.

The candidate example answer identification unit 200b may identify one or more candidate example answers among the plurality of responses stored in the response storage unit 200a. The candidate example answer identification unit 200b may identify the candidate example answers using the gold response information received as part of the training set.

Specifically, the candidate example answer(s) may be the responses that lie within a first range of the value (or position) of the gold response in the embedding space. The first range used to identify the candidate example answer(s) may be determined, for example, based on a clustering algorithm. For example, the candidate example answer(s) may be the response(s) selected from the plurality of responses based on the value (or position) of the gold response in the embedding space and a clustering algorithm (for example, a k-means algorithm or a k-Nearest Neighbor (kNN) algorithm).
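The candidate identification step above can be illustrated with a minimal sketch. It assumes that responses are stored as plain embedding vectors, that closeness is measured by cosine similarity, and that the first range is realized as the k nearest stored responses; the function names, the similarity measure, and the toy vectors are all hypothetical and are not taken from the disclosure.

```python
import math

def cosine_similarity(u, v):
    # Cosine of the angle between two non-zero vectors.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def nearest_candidates(gold_embedding, response_embeddings, k):
    """Return the ids of the k stored responses closest to the gold response."""
    ranked = sorted(
        response_embeddings.items(),
        key=lambda item: cosine_similarity(gold_embedding, item[1]),
        reverse=True,
    )
    return [response_id for response_id, _ in ranked[:k]]

# Toy 2-d embeddings (illustrative values only).
stored = {
    "r1": [1.0, 0.0],
    "r2": [0.9, 0.1],
    "r3": [0.0, 1.0],
    "r4": [-1.0, 0.0],
}
candidates = nearest_candidates([1.0, 0.05], stored, k=2)
```

In this toy setting the two responses pointing in nearly the same direction as the gold response are kept as candidates; a real system would use learned sentence embeddings and an approximate nearest-neighbor index instead of a full sort.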

The example answer selection unit 200c according to embodiments may select one or more example answer(s) from the candidate example answer(s) identified and selected by the candidate example answer identification unit 200b. For the generative model 201 to learn efficiently from the example answers selected by the search model 200, the search model 200 may remove at least one of: example answer(s) whose relevance to the context information is excessively high or low, and response(s) whose relevance to the gold response information is excessively high or low. Accordingly, the example answer selection unit 200c may select, from among the candidate example answer(s) identified and selected by the candidate example answer identification unit 200b, the example answers that are suitable for training.

The example answer(s) selected by the example answer selection unit 200c may be obtained, for example, by excluding from the candidate example answer(s) those values (responses) that fall within a second range, that is, those that are excessively close (or highly related) to the value corresponding to the gold response information in the embedding space. For example, the second range may be the range in which the Jaccard similarity is equal to or greater than a certain threshold.
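The exclusion step above can be sketched as a simple filter. The sketch assumes whitespace tokenization, lowercase comparison, and a threshold of 0.6; these choices and the function names are hypothetical, since the description only states that the second range corresponds to a Jaccard similarity at or above some threshold.

```python
def jaccard_similarity(a, b):
    # Jaccard similarity between the token sets of two sentences.
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    if not tokens_a and not tokens_b:
        return 1.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)

def filter_too_similar(gold, candidates, threshold=0.6):
    """Drop candidates whose Jaccard similarity to the gold response
    is at or above the threshold (the assumed 'second range')."""
    return [c for c in candidates if jaccard_similarity(gold, c) < threshold]

gold = "a week seems like one day"
candidates = [
    "a week seems like one day lately",  # near-copy of the gold response
    "time flies when you are busy",      # different wording
]
kept = filter_too_similar(gold, candidates)
```

Here the near-copy shares six of seven distinct tokens with the gold response and is excluded, while the differently worded candidate is kept.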

By excluding, from among the candidate example answers, the responses that are excessively close (or highly related) to the value corresponding to the gold response information in the embedding space, the electronic device according to embodiments helps the generative model 201 learn to produce responses that are both appropriate to the context and diverse.

The search model 200 according to embodiments may provide the example answers selected by the example answer selection unit 200c (or information on those example answers) to the generative model 201 according to embodiments.

Meanwhile, the generative model 201 according to embodiments may be trained using not only the selected example answers but also weight value(s) associated with them. The weight value(s) may indicate, for example, the degree to which each example answer is related to the context information and/or the gold response information. For example, a weight value may be a relevance score or a normalized relevance score for each example answer.

Accordingly, the search model unit 200 according to embodiments may further include a weight calculation unit 200d that calculates or derives a weight value for each of the example answers selected by the example answer selection unit 200c. The weight calculation unit 200d may calculate the weight for each response by computing a relevance score or a normalized relevance score for each example answer, using the selected example answers and/or the context information according to embodiments.
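As a minimal sketch of the weight calculation, the following assumes that the raw relevance score is a dot product between a context embedding and an example answer embedding, and that normalization is done with a softmax (the description names the softmax function when it discusses FIG. 5). The dot-product score, the function names, and the toy vectors are assumptions made for illustration.

```python
import math

def relevance_score(context_vec, exemplar_vec):
    # Assumed scoring function: dot product of the two embeddings.
    return sum(a * b for a, b in zip(context_vec, exemplar_vec))

def normalized_weights(context_vec, exemplar_vecs):
    """Softmax over the relevance scores of all selected example answers."""
    scores = [relevance_score(context_vec, e) for e in exemplar_vecs]
    max_score = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - max_score) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Two example answers: the first is aligned with the context, the second is not.
weights = normalized_weights([1.0, 0.0], [[2.0, 0.0], [0.0, 2.0]])
```

The weights sum to one, and the example answer that is more relevant to the context receives the larger weight.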

By learning to generate an optimal answer while additionally considering the example answers and the weight of each example answer, the electronic device according to embodiments can reflect the example answers appropriately, providing responses that fit the context and are not awkward while also being fluent and creative, so that a user of the conversation model can carry on conversations without growing tired of them.

The generative model 201 according to embodiments may train the artificial neural network model included in the generative model 201 using the selected example answers described above and the weight information for each example answer.

Specifically, the generative model 201 according to embodiments may perform forward propagation of the artificial neural network model based on the selected example answers (and/or the weight information for each example answer) and thereby generate a response. The generative model 201 according to embodiments may provide the generated response information to the user.

In addition, the generative model 201 according to embodiments may construct a loss function using the gold response information described above and the selected example answers (and/or the weight information for each example answer), and may update or train the artificial neural network model included in the generative model 201 by performing backpropagation with that loss function.
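A toy numerical sketch may help show how weights and example answers can enter such a loss. It assumes that the loss is the negative log of a weighted sum of per-example likelihoods of the gold response, in line with the weighted likelihood the description introduces for FIG. 5; the function name and all numbers are hypothetical, and the likelihood values stand in for what a real generative model would output.

```python
import math

def weighted_nll(weights, likelihoods):
    """Negative log of the weighted likelihood of the gold response.

    weights[i]     : normalized relevance score of example answer i
    likelihoods[i] : assumed model probability of the gold response
                     given the context and example answer i
    """
    marginal = sum(w * p for w, p in zip(weights, likelihoods))
    return -math.log(marginal)

# The helpful example answer (likelihood 0.5) is weighted highly ...
loss_relevant = weighted_nll([0.9, 0.1], [0.5, 0.01])
# ... versus the unhelpful one (likelihood 0.01) being weighted highly.
loss_irrelevant = weighted_nll([0.1, 0.9], [0.5, 0.01])
```

With these toy numbers the loss is lower when the weight is concentrated on the example answer that actually helps produce the gold response.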

A loss function according to embodiments may be, for example, a function or value needed for the backpropagation performed when training a neural network. Accordingly, the loss function may be determined, for example, based on the similarity scores (or normalized similarity scores) computed for each of the example answers selected by the search model unit 200 from that example answer and the context information.

The method of generating conversation information using an example-based generation model according to embodiments learns to extract responses suited to the context information of a conversation based on the combination of the search model 200 and the generative model 201, and can thereby generate fluent responses that fit the given conversation context on the basis of rich knowledge while also generating diverse and interesting responses.

Meanwhile, the search model unit 200 according to embodiments may be referred to by various names, such as a search unit or a retriever. Examples of the operations of the example answer selection unit 200c and of the weight calculation unit 200d according to embodiments are described in detail below.

FIG. 3 is a diagram showing an example of the overall result of an electronic device according to embodiments generating response information from context information.

The operations shown in FIG. 3 are an example of generating response information from conversation information obtained from a user, using the search model unit 200 and the generative model unit 201 according to embodiments (for example, the trained artificial neural network model) that have been trained based on the operations shown in FIG. 2.

Specifically, FIG. 3 shows an example in which the electronic device according to embodiments identifies context information from conversation information (300), selects example answer(s) based on the context information (301), and generates response information corresponding to the conversation information using the selected example answer(s) (302).

Referring to FIG. 3, the electronic device according to embodiments may generate or identify context information (300) using conversation information input by the user (or given conversation information). The context information may be generated based on one or more utterances input by the user (for example, "A: Do you ever feel like time is just going by way too fast?") and the responses already provided to each utterance (for example, "B: OMG! Especially recently. A week seems like one day.").

Referring to FIG. 3, the electronic device according to embodiments may search for and select example answer(s) (301) using the identified context information as a query. The selection of example answer(s) may be performed, for example, based on some or all of the operations of the search model 200 of FIG. 2. Referring to FIG. 3, the electronic device according to embodiments generates a response to the conversation (302) using the selected example answer(s).

Meanwhile, in the course of learning from the training set data, the generative model unit according to embodiments may be trained in different ways depending on the characteristics and contents of the example answers provided by the search model unit and on other data provided. Looking at 302a to 302c of FIG. 3, the generative model unit may generate a response unrelated to the context information 300 and the selected example answer 301, as in the first example 302a, or a response that is excessively similar to the selected example answer 301, as in the second example 302b.

If the example answer(s) selected and provided by the search model unit are far from the gold response information (that is, the difference between their values in the embedding space is large), the electronic device may mistrain the model to produce answers unrelated to the conversation information, as in the first example 302a described above. If the example answer(s) selected and provided by the search model unit are very similar to the gold response information (that is, the difference between their values in the embedding space is extremely small), the electronic device may mistrain the model to reproduce the gold response information as it is, as in the second example 302b described above. In other words, because of the so-called one-to-many problem, in which many different example answers can be selected for a single piece of context information and similarity can vary widely even among the selected example answers, the effectiveness of training the artificial neural network is reduced and the likelihood of overfitting increases.

Accordingly, the electronic device according to embodiments may select the example answer(s) according to a specific method so as to generate response information that is fluent and preserves the continuity of the conversation, as in the third example 302c.

The generation of conversation information using an example-based generation model according to embodiments can produce fluent responses that fit the given conversation context on the basis of rich knowledge, while also producing diverse and interesting responses.

FIG. 4 shows an example of operations by which the electronic device according to embodiments trains the generative model unit to generate response information from context information.

Specifically, FIG. 4 shows some or all of the operations by which the electronic device according to embodiments learns from example answer(s) in order to generate response information from conversation information. The operations shown in FIG. 4 may be performed by the search model unit 200 and the generative model unit 201 in the electronic device of FIG. 2.

The context information according to embodiments may be embedded as, and correspond to, a specific value in the embedding space 400. The plurality of responses according to embodiments (for example, the responses stored in the response storage unit 200a of FIG. 2) may likewise be embedded as specific values in the embedding space 400. For example, the response storage unit 200a of FIG. 2 may store the value at which each response is embedded in the embedding space 400.

The electronic device according to embodiments may search, among the plurality of responses, for the response(s) within a specific range 400a of the value corresponding to the context information in the embedding space. The electronic device may select, from among the response(s) within the specific range 400a, the example answer(s) 403a, 403b to be provided to the generative model unit. Meanwhile, the electronic device may identify the gold response information 402 included in the training set data.

Meanwhile, looking at 403a and 403b of FIG. 4, in the course of learning from the training set data the generative model unit according to embodiments may select an example answer unrelated to the gold response information 402, as in the first example 403a. Likewise, a response that is excessively similar to the gold response information 402 may be selected, as in the second example 403b.

To train the artificial neural network model in the generative model unit 404, the generative model unit 404 of the electronic device according to embodiments may calculate a loss function based on the gold response information and the selected example answers (and/or the weight information of each answer), and may train the artificial neural network in the generative model unit 404 by performing backpropagation based on that loss function.

However, if the first example 403a is selected as an example answer, the generative model unit 404 may ignore it because it differs greatly from the gold response information. Conversely, if the second example 403b is selected as an example answer, the generative model unit 404 may give it too much weight because it differs very little from the gold response information, and overfitting may occur. Accordingly, the electronic device according to embodiments may select the example answer(s) according to a specific method so as to generate response information that is fluent and preserves the continuity of the conversation.

FIG. 5 shows an example of the operation of the search model unit according to embodiments.

FIG. 5 shows an embodiment of a method by which the search model unit according to embodiments, as described with reference to FIGS. 3 and 4, selects example answer(s) from the plurality of responses.

Referring to FIG. 5, the search model unit according to embodiments may perform at least one of: an operation 5A of identifying candidate example answer(s) from the gold response information; an operation 5B of selecting example answer(s) from the candidate example answer(s); and an operation 5C of calculating a weight for each selected example answer. In FIG. 5, 5A may be performed, for example, by the candidate example answer identification unit 200b of FIG. 2, 5B by the example answer selection unit 200c of FIG. 2, and 5C by the weight calculation unit 200d of FIG. 2.

Referring to 5A of FIG. 5, the search model unit according to embodiments may identify the context information 501, and may identify the candidate example answer(s) 503 that lie within an association range 501a of the context information 501 in the embedding space 500, or within a first range 502a of the gold response information 502 in the embedding space 500. The first range 502a according to embodiments may be a range determined based on a clustering algorithm (for example, a k-means algorithm or a k-Nearest Neighbor (kNN) algorithm).

By selecting the candidate example answers based on a clustering algorithm or the like, the electronic device according to embodiments can ultimately prevent the selected example answers from being ignored and prevent underfitting. Moreover, because the selected example answers are not ignored, unnecessary delay in the training process can be reduced.

Referring to 5B of FIG. 5, the search model unit according to embodiments may exclude, from the selected candidate example answer(s) 503, the answer(s) that fall within the second range 504. The second range 504 may be a range determined by the user or by the system. For example, the second range 504 may be a Jaccard filter boundary in the embedding space.

By excluding such answer(s), the electronic device according to embodiments can avoid learning from answers that are excessively similar to the gold response information, and can thus ultimately prevent overfitting.

Meanwhile, the answers selected through 5A and 5B are highly relevant to the gold response information, but their relevance to the context information according to embodiments may vary. For example, depending on the characteristics of the clustering algorithm (for example, kNE) used in 5A, an answer selected through 5A and 5B may be highly relevant to the context information but less relevant to the gold response information, or less relevant to the context information but highly relevant to the gold response information. The electronic device according to embodiments therefore needs to train the generative model unit while also taking into account the relevance between each example answer and the context information. In addition, if the search model unit performed only operations 5A and 5B described above, the selected example answer(s) could consist of answers that depend entirely on the gold response information, which could reduce the efficiency of training.

Accordingly, as shown in 5C of FIG. 5, the search model unit according to embodiments may calculate the degree of similarity (for example, a weight) between each answer and the context information. A weight according to embodiments may represent the similarity between one answer and the context information, and the similarity may be calculated, for example, based on the difference between their values in the embedding space. The search model unit may use the context information to calculate as many weight values as there are example answers, map each calculated weight to its example answer, and provide them to the generative model unit according to embodiments.

Specifically, the search model unit according to an embodiment may calculate a relevance score for each example answer based on the selected example answers (exemplars) and the context information according to embodiments. Furthermore, the search model unit may apply a softmax function to the relevance scores of the example answers to obtain normalized relevance scores. The electronic device according to embodiments may then use the normalized relevance score of each answer to convert the traditional likelihood into a weighted likelihood, and may train the generative model unit to minimize a loss function derived from the weighted likelihood. To train the generative model unit, the loss function (L) derived from the weighted likelihood may be calculated, for example, as follows.

[Equation 1]
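The body of Equation 1 is not legible in this text. As a hedged sketch only, a weighted-likelihood loss consistent with the surrounding description could take the following form. All symbols are notation assumed here rather than taken from the source: $c$ is the context information, $r$ the gold response, $Z$ the set of selected example answers, $S(z, c)$ the relevance score, $P_\phi(z \mid c)$ the softmax-normalized relevance score, and $P_\theta(r \mid c, z)$ the likelihood assigned by the generative model.

```latex
P_\phi(z \mid c) = \frac{\exp\big(S(z, c)\big)}{\sum_{z' \in Z} \exp\big(S(z', c)\big)},
\qquad
L = -\log \sum_{z \in Z} P_\phi(z \mid c)\, P_\theta(r \mid c, z)
```

Summed over all training pairs of context and gold response, this yields a loss in which each example answer contributes in proportion to its normalized relevance score.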

The generative model unit according to embodiments may perform backpropagation using the calculated loss function (L). The gradient computed during the backpropagation according to embodiments may be calculated, for example, as follows.

[Equation 2]
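The body of Equation 2 is likewise not legible in this text. Assuming a loss of the form $L = -\log \sum_{z \in Z} P_\phi(z \mid c)\, P_\theta(r \mid c, z)$, where $P_\phi(z \mid c)$ denotes the normalized relevance score of example answer $z$ for context $c$ and $P_\theta(r \mid c, z)$ the generative model's likelihood of the gold response $r$ (all of this notation is assumed, not taken from the source), differentiating with respect to the generative model parameters $\theta$ gives:

```latex
\nabla_\theta L
= - \frac{\sum_{z \in Z} P_\phi(z \mid c)\, \nabla_\theta P_\theta(r \mid c, z)}
         {\sum_{z' \in Z} P_\phi(z' \mid c)\, P_\theta(r \mid c, z')}
```

Each example answer's contribution to the gradient is multiplied by $P_\phi(z \mid c)$, which is the scaling behavior the description attributes to Equation 2.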

Equation 2 shows that the gradient of the generative model unit is scaled by the normalized relevance score, so that when a selected example answer z has little relevance to the context information, the generative model unit is updated less (that is, it learns to change less).

Through the operations of the search model unit and the generative model unit described above, the electronic device according to embodiments can lead the model to ignore, or give less consideration to, inappropriate or irrelevant example answer(s) during training. With this configuration, the electronic device can also train the generative model to readily fetch tokens from example answers that are related to the gold response information.

FIG. 6 shows an example of the result of an operation by which the search model unit according to embodiments learns from conversation information and response information.

Specifically, FIG. 6 shows an example of the operation by which the search model unit according to the embodiments shown in FIG. 5 learns from conversation information and response information. Referring to FIG. 6, the search model unit according to embodiments may identify the context information 600 and the gold response information 601 using the conversation information.

Referring to FIG. 6, 602 may denote the candidate example answers retrieved by the search model unit from the plurality of responses using the context information as a query, and 603 may denote the candidate example answer(s) extracted based on the clustering algorithm (for example, kNE) according to embodiments. In 602 and 603, 'Sim' may indicate the lexical similarity between the gold response information and each candidate example answer, and the score shown alongside it may indicate the normalized relevance score calculated for each response by the search model unit. 'Use?' indicates whether the electronic device according to embodiments provides the example answer to the generative model unit for training.

FIG. 7 illustrates an example of operations of an electronic device according to embodiments.

Some or all of the operations shown in FIG. 7 may be performed by, for example, the processor 110 of FIG. 1 or a learning processor included in the processor 110, and may be performed by the search model 200 and/or the generation model 201 of FIG. 2.

Referring to FIG. 7, the electronic device according to embodiments may identify first context information (700) and may identify, based on a first model, a first response set corresponding to the first context information (701). The first response set shown in FIG. 7 may denote the example answers included in the association range 501a shown in FIG. 5. The first context information according to embodiments may include at least one piece of conversation information obtained from a user.

Referring to FIG. 7, the electronic device according to embodiments may identify a response subset selected from the first response set based on appropriate response information corresponding to the first context information (702). The response subset shown in FIG. 7 may denote the candidate response set according to embodiments. For example, the response subset shown in FIG. 7 may represent the example answers included in 502 of FIG. 5, excluding those included in the second range 504.

Meanwhile, the response subset according to embodiments may be selected from among candidate responses identified based on the gold response information and a clustering algorithm according to embodiments. The response subset may also be selected by excluding, from the candidate responses, at least one answer that falls within a specific range of the value corresponding to the gold response information in the embedding space.
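The two-stage selection described above can be illustrated with a minimal sketch. It assumes Euclidean distance in the embedding space and a plain nearest-neighbor ranking as a stand-in for the clustering step; the function names, `k`, and `exclude_radius` are hypothetical and do not come from this specification.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def select_response_subset(gold_vec, candidates, k=3, exclude_radius=0.5):
    """Return texts of candidates near the gold response, minus those too close.

    candidates: list of (text, embedding) pairs.
    k and exclude_radius are hypothetical tuning values (assumptions).
    """
    # Step 1: keep the k candidates nearest to the gold response embedding
    # (a simple stand-in for the clustering step).
    ranked = sorted(candidates, key=lambda c: euclidean(gold_vec, c[1]))
    nearest = ranked[:k]
    # Step 2: drop candidates that fall within the excluded range of the gold response.
    return [text for text, vec in nearest if euclidean(gold_vec, vec) > exclude_radius]


gold = [0.0, 0.0]
pool = [("a", [0.1, 0.0]), ("b", [1.0, 0.0]), ("c", [0.0, 2.0]), ("d", [5.0, 5.0])]
subset = select_response_subset(gold, pool)
```

In this toy pool, "a" is dropped because it lies almost on top of the gold response, and "d" is dropped because it is not among the nearest candidates.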

Referring to FIG. 7, the electronic device according to embodiments may train the second model based on the first context information and the response subset (703).

The electronic device according to embodiments may further set weight information for each response included in the response subset based on the first context information. In this case, the electronic device according to embodiments may train the second model based on the set weight information.

The weight information according to embodiments may be set based on a relevance score for each answer in the response subset, and the relevance score for an answer may be calculated based on the value corresponding to the first context information and the value corresponding to that answer in the embedding space.

Meanwhile, the second model according to embodiments may identify second context information for conversation information obtained from a user and, based on the second context information, provide gold response information for the second context information.

The second model according to embodiments may be trained by performing backpropagation using a loss function calculated based on the weight information, and the weight information may be calculated by normalizing the relevance score for each answer.
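As a rough illustration of how normalized relevance scores could enter a loss, the sketch below makes several assumptions that the text does not state: the relevance score is a dot product, normalization is a softmax, and the loss is the negative log of the weight-averaged likelihood of the gold response. All function names are hypothetical.

```python
import math


def relevance_score(context_vec, answer_vec):
    """Dot product of the context and answer embeddings (assumed scoring function)."""
    return sum(c * a for c, a in zip(context_vec, answer_vec))


def normalize(scores):
    """Softmax normalization (assumed); the returned weights sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_loss(weights, log_likelihoods):
    """Negative log of the weighted sum of per-example likelihoods (assumed loss form).

    log_likelihoods[i] is the generator's log-likelihood of the gold response
    when conditioned on the i-th example answer.
    """
    return -math.log(sum(w * math.exp(ll) for w, ll in zip(weights, log_likelihoods)))


context = [1.0, 0.0]
answers = [[2.0, 0.0], [0.0, 2.0]]
weights = normalize([relevance_score(context, a) for a in answers])
loss = weighted_loss(weights, [math.log(0.5), math.log(0.1)])
```

Under this form, an example answer with a low weight contributes little to the loss, so the gradient from backpropagation is dominated by examples the search model considers relevant.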

FIG. 8 illustrates an example of an operation of an electronic device according to embodiments.

Specifically, FIG. 8 illustrates an example of an operation of an electronic device that conducts an open-domain conversation according to embodiments. For example, the operation of FIG. 8 may represent an operation for extracting answer information from conversation information actually input by a user, rather than a training process for extracting responses from training data.

Referring to FIG. 8, the electronic device according to embodiments may receive conversation information from a user and may extract context information 8a from the received conversation information. The context information 8a according to embodiments may be extracted by the processor 110 in the electronic device, a learning processor included in the processor 110, or the like, and may denote the context information according to the embodiments described with reference to FIGS. 2 to 11.

The electronic device according to embodiments may further include at least one of a search model unit 800 and a generation model unit 801, each of which may denote the search model unit and the generation model unit described with reference to FIGS. 2 to 11.

The electronic device according to embodiments may receive the context information 8a extracted from the conversation information input by the user and generate an answer 8b appropriate to that conversation information. To this end, the electronic device may pass the context information 8a to the search model unit 800. The KNN unit 800b in the search model unit 800 may extract candidate example answers based on the context information 8a and the plurality of answers stored in the example answer storage unit 800a of the search model unit 800. The example answer selection unit 800c in the search model unit 800 may select one or more of the extracted candidate example answers. The weight calculation unit 800d in the search model unit 800 may calculate weights (or relevance scores) from the one or more selected candidate example answers. The search model unit 800 may pass the selected example answers to the generation model unit 801. The generation model unit 801 may receive the selected example answers and the context information 8a, input them to the input layer of the artificial neural network model 801a included in the generation model unit 801, and generate and identify response information 8b from the output layer. The generation model unit 801 may output the answer information 8b identified from the output layer and provide it to the user.
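The flow through the KNN unit, the selection unit, the weight calculation unit, and the generation model unit can be sketched as follows. This is only an illustration under stated assumptions: similarity is a dot product, weights are a softmax over the selected answers, and the generator is a stub callable. None of the names below come from this specification.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def respond(context_text, context_vec, stored_answers, generator, k=3, n_select=2):
    """stored_answers: list of (text, embedding); generator: callable(context, examples)."""
    # KNN unit: the k stored answers most similar to the context (assumed dot-product similarity).
    ranked = sorted(stored_answers, key=lambda item: dot(context_vec, item[1]), reverse=True)
    candidates = ranked[:k]
    # Example answer selection unit: here simply the top n_select candidates (assumption).
    selected = candidates[:n_select]
    # Weight calculation unit: normalized relevance scores of the selected answers.
    weights = softmax([dot(context_vec, vec) for _, vec in selected])
    examples = [text for text, _ in selected]
    # Generation model unit: forward pass only, no loss or backpropagation at inference.
    return generator(context_text, examples), weights


def echo_generator(context_text, examples):
    """Stub standing in for the neural generator; it just echoes the top example."""
    return f"{context_text} -> {examples[0]}"


store = [("x", [1.0, 0.0]), ("y", [0.0, 1.0]), ("z", [0.5, 0.5])]
reply, reply_weights = respond("hi", [1.0, 0.0], store, echo_generator)
```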

The artificial neural network model 801a included in the generation model unit 801 may perform a forward propagation operation 801b using the selected example answers and the context information 8a received at the input layer.

Meanwhile, to support immediate, real-time conversation, the generation model 801 according to embodiments may omit the backpropagation process (and/or the operation of calculating a loss function according to embodiments, and the like) after extracting the response information 8b from the conversation input by the user. When the backpropagation process (and/or the loss function calculation) is omitted, the artificial neural network model 801a of the generation model 801 is not trained, but an immediate response can be extracted from the conversation input by the user; in this respect, such a configuration may be one embodiment of the electronic device described in this specification. Accordingly, the mere description of an embodiment in which the backpropagation process (and/or the loss function calculation) can be omitted should not be construed to mean that the backpropagation process and/or the loss function calculation disclosed in this specification is a conventional or routine configuration.

FIGS. 9 to 12 compare the performance of the open-domain conversation model of the electronic device according to embodiments with that of other open-domain conversation models.

FIG. 9 is a diagram illustrating effects of the electronic device according to embodiments.

Referring to FIG. 9, 'Bi-encoder 256M' and 'Blender 90M' may be examples of a baseline retrieval model and a baseline generative model, respectively. 'RetNRef', 'RetNRef_α', and 'MatToGen' are examples of the generation model unit according to embodiments, and CORGE denotes the search model unit according to embodiments. Specifically, RetNRef is an example of a generation model unit that, to generate an answer, connects the context information given as input with the search model unit. RetNRef_α is the dialogue retrieval version of RetNRef, which adopts a blending scheme (α = 0.5) to avoid simply ignoring the retrieved examples. MatToGen may be a model that extracts meaningful tokens from the example answer(s) and provides them to the generator. In FIG. 9, RAG and KIF denote open-domain conversation models based on knowledge-grounded generative models. FIG. 9 shows an experiment in which 'RetNRef', 'RetNRef_α', and 'MatToGen' are combined with the electronic device (or the search model unit of the electronic device) according to embodiments and performance is compared between pairs of models.

In FIG. 9, 'Appropriateness' may denote a metric that measures how fluent, logical, and fitting to the context an answer is, and 'Informativeness' may denote a metric that measures how much meaningful information the generated answer contains relative to the context information; each metric follows Weston et al. (2018).

FIG. 9 summarizes the results of the pairwise model comparison. When RetNRef and MatToGen adopt the search model unit (CORGE) according to embodiments, they outperform all baselines, except in the case of RetNRef + CORGE versus KIF on Informativeness. Specifically, RetNRef + CORGE and MatToGen + CORGE outperform RetNRef_α and MatToGen, respectively, on both metrics. In particular, MatToGen + CORGE outperforms Bi-encoder 256M and greatly outperforms Blender 90M, whereas MatToGen performs worse than both Bi-encoder 256M and Blender 90M. In addition, with CORGE, the search model unit according to embodiments, the win rate of RetNRef_α against Blender 90M is found to be high. These evaluation results show that CORGE guides existing example-based generative models to produce more fluent and more informative responses.

FIG. 10 is a diagram illustrating effects of the electronic device according to embodiments.

FIG. 10 shows the automatic evaluation metrics PPL (perplexity), Dist-n, and BLEU (Papineni et al., 2002) for the electronic device according to embodiments, used to analyze the answers generated by each model.

PPL is a metric that measures how well a model predicts an answer given the input context; a low PPL indicates that the model predicts the answer well. To analyze how much an example-based generative model makes use of the retrieved example answers, two variants of PPL based on the conditional probability given example answers may be used. First, PPL_gold uses the conditional probability of the gold response under the assumption that the gold response information itself is given as the example answer. Second, PPL_retrieve uses the conditional probability of the gold response given an example answer z obtained by retrieval. A small PPL_gold means that the example-based generative model predicts the gold response information well when the gold response information is provided as the example answer. A small PPL_retrieve means that the example-based generative model makes good use of the provided example answers to predict the gold response information.
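For reference, perplexity can be computed from per-token log-probabilities as in the minimal sketch below. It uses the standard definition (the exponential of the negative mean token log-probability); the function name and the idea of feeding it log-probabilities obtained under different example answers are illustrative assumptions.

```python
import math


def perplexity(token_log_probs):
    """exp of the negative mean natural-log probability over the response tokens."""
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


# Toy value: a response whose four tokens each receive probability 0.5.
ppl = perplexity([math.log(0.5)] * 4)
```

The two variants differ only in which example answer the generator is conditioned on when the token log-probabilities of the gold response are collected: the gold response itself for PPL_gold, a retrieved example answer for PPL_retrieve.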

Dist-n follows Li et al. (2016) and is the ratio of distinct n-grams to the total number of n-grams across all generated answers; it may serve as an indicator of the diversity of the generated answers.
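The ratio described above can be sketched in a few lines. Whitespace tokenization is an assumption made for brevity, and the function name is hypothetical.

```python
def dist_n(responses, n):
    """Ratio of distinct n-grams to total n-grams over all generated responses."""
    total = 0
    distinct = set()
    for response in responses:
        tokens = response.split()  # whitespace tokenization (assumption)
        grams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
        total += len(grams)
        distinct.update(grams)
    # Guard against responses too short to contain any n-gram.
    return len(distinct) / total if total else 0.0


dist1 = dist_n(["a b a b"], 1)      # 2 distinct unigrams out of 4
dist2 = dist_n(["a b", "a b"], 2)   # 1 distinct bigram out of 2
```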

BLEU may be a metric for measuring the degree of token overlap between a provided example answer and the generated answer, that is, the pair (z, r). A high BLEU score indicates that the generation model unit copied or referenced much of the provided example answers when generating its answer.
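The kind of overlap BLEU captures can be illustrated with clipped n-gram precision, which is one ingredient of BLEU. The sketch below is not full BLEU: it omits the brevity penalty and the geometric mean over n-gram orders, and it assumes whitespace tokenization. The function name is hypothetical.

```python
from collections import Counter


def clipped_ngram_precision(example, generated, n=1):
    """Fraction of n-grams in the generated answer that also occur in the example answer.

    Counts are clipped so that a repeated n-gram is credited at most as often
    as it appears in the example answer.
    """
    ex = example.split()
    gen = generated.split()
    ex_counts = Counter(tuple(ex[i:i + n]) for i in range(len(ex) - n + 1))
    gen_counts = Counter(tuple(gen[i:i + n]) for i in range(len(gen) - n + 1))
    total = sum(gen_counts.values())
    if total == 0:
        return 0.0
    overlap = sum(min(count, ex_counts[gram]) for gram, count in gen_counts.items())
    return overlap / total


p_copy = clipped_ngram_precision("i love hiking", "i love hiking too")
p_clip = clipped_ngram_precision("the cat", "the the")
```

A generator that copies heavily from the example answer scores close to 1 on this measure, while one that ignores the example scores close to 0.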

According to FIG. 10, the configuration combining RetNRef with the search model unit of the electronic device according to embodiments (RetNRef + CORGE) and the configuration combining MatToGen with that search model unit (MatToGen + CORGE) show a lower PPL_retrieve than Blender 90M. This means that example-based generative models trained with CORGE predict the gold response better than Blender 90M by making use of the provided examples. RetNRef + CORGE shows a smaller PPL_gold and PPL_retrieve than RetNRef, from which it can be inferred that RetNRef + CORGE makes better use of the provided example answers than RetNRef.

RetNRef has a lower PPL_gold than the configuration combining RetNRef with the search model unit of the electronic device according to embodiments (RetNRef + CORGE), but a higher PPL_retrieve. This result shows that RetNRef does not make good use of retrieved examples except when the gold response information is provided as the example answer. From this observation, it can be inferred that RetNRef generates responses that are highly overfitted to the selected example, which results from using the gold response information as an example during training. The model of the electronic device according to embodiments mitigates this overfitting problem, which is also observed when it is combined with MatToGen. Compared with Blender 90M, the higher Dist-n of RetNRef + CORGE and of MatToGen combined with the search model unit (MatToGen + CORGE) shows that the model of the electronic device according to embodiments generates more diverse responses than a vanilla generative model. In addition, RetNRef + CORGE has a higher Dist-n than RetNRef, which suggests that making use of example answers helps the generator diversify its responses. RetNRef is the only model that achieves a Dist-n comparable to that of the vanilla retrieval model Bi-encoder 256M, but the gap between its PPL_gold and PPL_retrieve indicates that it is overfitted to the examples, resulting in poor appropriateness and informativeness.

The average BLEU score implicitly measures the overlap between the retrieved example answers and the generated answers, so a higher BLEU indicates that the generator depends more heavily on the retrieved examples. RetNRef shows a negligible BLEU score, which reconfirms that the model makes almost no use of the retrieved examples. In addition, RetNRef_α and MatToGen have higher BLEU scores than RetNRef + CORGE and MatToGen + CORGE (the configurations combining each model with the search model unit of the electronic device according to embodiments), respectively, which confirms that the configuration of the electronic device according to embodiments depends less heavily on the example answers retrieved by the search model unit.

FIG. 11 shows examples that demonstrate the effects described above.

The answers generated for an input context by the Bi-encoder 256M, Blender 90M, RetNRef, RetNRef_α, RetNRef combined with the search model unit of the electronic device according to embodiments, KIF, and RAG models are shown.

FIG. 12 illustrates the effects, according to the results described above, of the model of the electronic device according to embodiments in comparison with knowledge-grounded generative models.

Specifically, the standard deviation of the normalized retriever scores becomes smaller when the retriever is trained jointly with the example-based generative model. Here, five examples (k = 5) are used for each training instance. In FIG. 11, 'Ours' denotes RetNRef + CORGE, and 'joint' denotes training the search model unit together with the generation model unit.
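The quantity discussed above, the standard deviation of the normalized scores over the k = 5 examples of one training instance, can be computed as in this minimal sketch. Softmax normalization is an assumption, and the raw scores are toy values, not measurements.

```python
import math
import statistics


def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def normalized_score_std(raw_scores):
    """Population standard deviation of the softmax-normalized retriever scores."""
    return statistics.pstdev(softmax(raw_scores))


flat = normalized_score_std([1.0, 1.0, 1.0, 1.0, 1.0])    # uninformative retriever
peaked = normalized_score_std([5.0, 0.0, 0.0, 0.0, 0.0])  # retriever with a clear preference
```

A value near zero means that every example receives nearly the same weight, so the scores no longer tell the generator which example is relevant.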

Meanwhile, knowledge-grounded generation is also described below.

Knowledge-grounded generative models, which use retrieved results (for example, relevant documents from Wikipedia) to generate answers, have been proposed for knowledge-intensive NLP tasks (for example, open-domain question answering). Knowledge-grounded generation is similar to example-based generation; the main difference is that knowledge-grounded generative models extract knowledge from external resources to generate answers. Guu et al. (2020) show the effectiveness of pre-training a knowledge retriever together with a large-scale language model for open-domain question answering. Lewis et al. (2020) demonstrate that knowledge-grounded generative models can generate more informative and diverse sentences than pure generative models across a range of knowledge-intensive NLP tasks. Fan et al. (2021) similarly propose a knowledge-grounded generative model for response generation, but it does not focus on open-domain conversation. This specification shows that existing knowledge-grounded generative models cannot be applied directly to open-domain conversation.

The automatic evaluation results in FIG. 12 confirm that knowledge-grounded generative models ignore the example answers. The PPL_gold, PPL_retrieve, and Dist-n of RAG and KIF are at levels similar to those of Blender 90M, which indicates that the example answers provide no useful information while the responses are generated. The average BLEU score is also low, indicating almost no overlap between the retrieved examples and the generated responses. These results stem from the difference between open-domain conversation and knowledge-grounded generation tasks. While a knowledge-grounded generative model is being trained, retrieval is used to bring in external knowledge, but owing to the one-to-many nature of open-domain conversation, the generator may again ignore the retrieved example answers.

In addition, training the search model unit (retriever) according to embodiments jointly with the generator causes the retriever to become trapped in a local minimum. As shown in Figure 4, when the retriever of RAG is trained jointly, the standard deviation of the normalized relevance scores calculated by the retriever approaches zero. A smaller standard deviation means that the relevance scores are becoming flatter. Knowledge-grounded generative models have empirically shown that jointly training the retriever and the generator improves performance on knowledge-intensive NLP tasks (Lewis et al., 2020), but in open-domain conversation the retrieved example answers are ignored. The retriever therefore learns to produce uninformative relevance scores. As a result, the retriever collapses and may return inappropriate examples to the generator (as also seen in the KIF and RAG examples in FIG. 10). When the search model unit is trained jointly with the electronic device according to embodiments, the retriever scores flatten as shown in FIG. 11, and a collapse of the retriever like that experienced with RAG can be empirically observed. Accordingly, the electronic device according to embodiments may refrain from training the search model unit jointly.

Hereinafter, an ablation study of CORGE (the electronic device) disclosed in this specification is described.

This specification has shown that the relevance score (RS) and the kNE clustering algorithm lead the generator to actively use the examples, and that the Jaccard Filter (JF) mitigates the overfitting problem. Referring to FIG. 9B, the PPL_retrieve of RetNRef + CORGE is lower than that of the other ablated variants, which shows that each component contributes to predicting the answer. RetNRef + CORGE - RS and RetNRef + CORGE - kNE show high PPL_retrieve and PPL_gold, which shows that RS and kNE help the generator make use of the examples when generating answers. RetNRef + CORGE - JF shows a strong sign of overfitting: its PPL_gold is markedly low, whereas its PPL_retrieve is high. Dist-n indicates that the proposed model generates the most diverse answers among the models other than RetNRef + CORGE - JF, and that RetNRef + CORGE - JF copies tokens excessively from the retrieved examples. The average BLEU score shows the same trend, reconfirming the effect of each CORGE component.
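A filter of the kind named above can be sketched with token-level Jaccard similarity. The passage only names the Jaccard Filter, so the whitespace tokenization, the threshold value, and the function names here are assumptions for illustration.

```python
def jaccard(a, b):
    """Jaccard similarity between the token sets of two strings."""
    set_a, set_b = set(a.split()), set(b.split())
    if not set_a and not set_b:
        return 1.0
    return len(set_a & set_b) / len(set_a | set_b)


def jaccard_filter(gold, candidates, threshold=0.6):
    """Drop candidates that are lexically too close to the gold response.

    threshold is a hypothetical value; candidates above it are treated as
    near-copies of the gold response that would encourage overfitting.
    """
    return [c for c in candidates if jaccard(gold, c) <= threshold]


kept = jaccard_filter("i like tea", ["i like tea", "i like coffee", "hello there"])
```

Removing near-copies of the gold response keeps the generator from learning that the example answer can simply be copied, which is the overfitting pattern the ablation without JF exhibits.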

In conclusion, this specification proposes a training method applicable to example-based generative models that can maximize performance in terms of appropriateness and informativeness. The proposed training method mitigates the shortcomings of existing example-based generative models by selecting examples that are semantically similar to, yet reasonably distant from, the gold response and by weighting the examples based on the relevance scores from the search unit. Extensive analysis, including pairwise human evaluation, confirms that the proposed training method improves the performance of existing example-based generative models.

FIG. 13 shows an example of a method for calculating a relevance score according to embodiments.

First, the encoded vectors of an input context (for example, context information according to embodiments) and of a candidate label (for example, an example answer or a selected example answer according to embodiments) are identified, for example the vectors encoded by the operation of the encoders 1200a and 1200b included in the electronic device according to embodiments, as shown in FIG. 13. For example, the encoded vector of the input context, $y_{ctxt} = \mathrm{red}(T_1(ctxt))$, and the encoded vector of the candidate label, $y_{cand} = \mathrm{red}(T_2(cand))$, may be represented as shown in 1200 of FIG. 12.

Here, $T_1$ and $T_2$ may denote two transformers pre-trained for the operation(s) shown in FIG. 13; they initially start with the same weights but may be updated during fine-tuning. $T(x) = h_1, \ldots, h_N$ may denote the output of each transformer $T$, and $\mathrm{red}(\cdot)$ may denote a function that reduces a sequence of vectors to a single vector. Since the input context and the candidate label are encoded separately, the segment tokens are 0 for both. As in pre-training, both the input and the label are surrounded by the special token [S], so $h_1$ may correspond to [S]. The function $\mathrm{red}(\cdot)$, which reduces a sequence of vectors to a single vector, may include, for example, one or more of the following methods: 1) selecting the first output of the transformer (corresponding to the special token [S]); 2) taking the average of the first m < N output vectors or the average of all output vectors. The two corresponding elements according to FIG. 13 may include some or all of the components shown in 1200c and 1200d of FIG. 13.
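The reduction of a sequence of transformer output vectors to a single vector, as described above, can be sketched as follows. This is a minimal sketch, assuming the transformer output is given as a list of equal-length lists of floats; the function names `reduce_first` and `reduce_mean` are hypothetical and do not come from the source.

```python
def reduce_first(outputs):
    """Method 1: take the first output vector (the one aligned with [S])."""
    return list(outputs[0])


def reduce_mean(outputs, m=None):
    """Method 2: average the first m output vectors, or all of them if m is None."""
    rows = outputs if m is None else outputs[:m]
    dim = len(rows[0])
    # Average each coordinate across the selected output vectors.
    return [sum(row[j] for row in rows) / len(rows) for j in range(dim)]
```

With `outputs = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]`, `reduce_first(outputs)` keeps only the first vector, `reduce_mean(outputs, m=2)` averages the first two, and `reduce_mean(outputs)` averages all three.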

The score of a candidate label $cand_i$ may be calculated by a dot product, $s(ctxt, cand_i) = y_{ctxt} \cdot y_{cand_i}$, as shown in Equation 1201 of FIG. 12 and in 1201a of FIG. 13. Here, the network (for example, the network described with reference to FIG. 13) may be a network trained to minimize a cross-entropy loss in which the logits are $y_{ctxt} \cdot y_{cand_1}, \ldots, y_{ctxt} \cdot y_{cand_n}$, where $cand_1$ is the correct label and the rest may be selected from the training set.
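The dot-product scoring and the cross-entropy objective described above can be sketched as follows. This is an illustrative sketch, assuming the encoded context and candidate vectors are already available as lists of floats and that the correct label sits at index 0; the function names are hypothetical, and the actual network of FIG. 13 is not reproduced here.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def candidate_scores(y_ctxt, y_cands):
    """Score every candidate vector against the context vector (the logits)."""
    return [dot(y_ctxt, y_cand) for y_cand in y_cands]


def cross_entropy(logits, correct_index=0):
    """Cross-entropy loss for one context, with the correct label at correct_index."""
    peak = max(logits)
    # log-sum-exp computed stably by subtracting the largest logit.
    log_z = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_z - logits[correct_index]
```

The loss is small when the correct candidate receives a much larger score than the other candidates, which is what training to minimize it encourages.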

Accordingly, the relevance score according to embodiments may mean the score of a candidate label calculated with the context information according to embodiments as the input context and an example answer according to embodiments as the candidate label. The relevance score according to embodiments may also mean, for example, a score generated by the encoder(s) (for example, a Bi-encoder or a Poly-encoder) introduced in Humeau, Samuel, et al. (2019).

A recording medium according to various embodiments of the present disclosure may include a non-transitory computer-readable recording medium on which a program for executing, on a computer, the method of generating conversation information using an example-based generation model is recorded.

Meanwhile, this specification and the drawings disclose preferred embodiments of the present disclosure. Although specific terms are used, they are used only in a general sense to explain the technical content of the present disclosure clearly and to aid understanding of the invention, and are not intended to limit the scope of the present disclosure. It is obvious to those of ordinary skill in the art to which the present disclosure pertains that, in addition to the embodiments disclosed herein, other modifications based on the technical spirit of the present disclosure can be implemented.

An electronic device or terminal according to the above-described embodiments may include a processor, a memory that stores and executes program data, permanent storage such as a disk drive, a communication port for communicating with an external device, and user interface devices such as a touch panel, keys, and buttons. Methods implemented as software modules or algorithms may be stored on a computer-readable recording medium as computer-readable code or program instructions executable on the processor. Here, computer-readable recording media include magnetic storage media (for example, read-only memory (ROM), random-access memory (RAM), floppy disks, and hard disks) and optically readable media (for example, CD-ROM and DVD (Digital Versatile Disc)). The computer-readable recording medium may be distributed over network-connected computer systems so that computer-readable code is stored and executed in a distributed manner. The medium may be read by a computer, stored in a memory, and executed by a processor.

This embodiment may be represented by functional block configurations and various processing steps. These functional blocks may be implemented by any number of hardware and/or software configurations that perform specific functions. For example, the embodiment may employ integrated circuit configurations, such as memory, processing, logic, and look-up tables, that can perform various functions under the control of one or more microprocessors or other control devices. Just as the components may be implemented by software programming or software elements, this embodiment may be implemented in a programming or scripting language such as C, C++, Java, or assembler, including various algorithms implemented as combinations of data structures, processes, routines, or other programming constructs. Functional aspects may be implemented as algorithms executed on one or more processors. In addition, this embodiment may employ conventional techniques for electronic configuration, signal processing, and/or data processing. Terms such as “mechanism”, “element”, “means”, and “configuration” may be used broadly and are not limited to mechanical and physical configurations. These terms may include the meaning of a series of software routines in association with a processor or the like.

The foregoing embodiments are merely examples, and other embodiments may be implemented within the scope of the claims set forth below.

Claims

A conversation model training method in an electronic device, comprising:
identifying first context information;
identifying a first response set corresponding to the first context information based on a first model;
identifying a response subset selected from the first response set based on gold response information corresponding to the first context information; and
training a second model based on the first context information and the response subset.

The method of claim 1,
wherein the response subset is selected from among candidate responses identified based on the gold response information and a clustering algorithm.

The method of claim 2,
wherein the response subset is selected by excluding, from among the candidate responses, at least one answer that falls within a specific range of a value corresponding to the appropriate response information in an embedding space.

The method of claim 1,
further comprising setting weight information for each response included in the response subset based on the first context information,
wherein the training of the second model comprises training the second model based on the set weight information.

The method of claim 4, wherein the weight information is set based on a relevance score for each answer in the response subset, and
the relevance score for an answer is calculated based on a value corresponding to the first context information and a value corresponding to the answer in an embedding space.

The method of claim 4, wherein the second model identifies second context information for conversation information obtained from a user, and provides suitable response information for the second context information based on the second context information.

The method of claim 1,
wherein the first context information includes at least one piece of conversation information obtained from a user.

The method of claim 5, wherein the second model is trained by performing a backpropagation operation using a loss function calculated based on the weight information, and
the weight information is calculated by normalizing the relevance score for each answer.

An electronic device that performs a method for generating conversation information, comprising:
a memory in which at least one program is stored; and a processor,
wherein the processor identifies first context information, identifies a first response set corresponding to the first context information based on a first model, identifies appropriate response information corresponding to the first context information based on a second model, identifies a response subset selected from the first response set based on the appropriate response information, and trains the second model based on the first context information and the response subset.

A non-transitory computer-readable storage medium, comprising:
a medium configured to store computer-readable instructions,
wherein the computer-readable instructions, when executed by a processor, cause the processor to perform:
identifying first context information;
identifying a first response set corresponding to the first context information based on a first model;
identifying appropriate response information corresponding to the first context information based on a second model;
identifying a response subset selected from the first response set based on the appropriate response information; and
training the second model based on the first context information and the response subset.