KR20230166853A

KR20230166853A - Algebraic word problem solving method and apparatus applying human strategy to artificial neural network

Info

Publication number: KR20230166853A
Application number: KR1020220172570A
Authority: KR
Inventors: 권가진; 김부근; 기경서
Original assignee: 서울대학교산학협력단
Priority date: 2022-05-30
Filing date: 2022-12-12
Publication date: 2023-12-07

Abstract

A problem-solving device and method are presented. The problem-solving device includes an input/output unit for receiving data and outputting the result of processing it; a storage unit storing a program for performing a problem-solving method; and a control unit that includes at least one processor and analyzes the data received through the input/output unit by executing the program. The control unit receives a math word problem and, through an encoder-decoder-based explanation generation model, generates and outputs an explanation of the numbers and variables contained in the word problem. It then receives the word problem, the generated explanation, or a combination thereof and, through an encoder-decoder-based equation generation model, generates and outputs an equation.

Description

Method and apparatus for solving algebraic word problems by applying human strategy to an artificial neural network {ALGEBRAIC WORD PROBLEM SOLVING METHOD AND APPARATUS APPLYING HUMAN STRATEGY TO ARTIFICIAL NEURAL NETWORK}

The embodiments disclosed in this specification relate to a method and device for solving algebraic word problems. More specifically, they relate to a method and device that apply human strategies to an artificial neural network model with an encoder-decoder structure in order to interpret a given math problem, generate natural-language explanations of the numbers and variables in the problem based on that interpretation, and automatically generate the correct equation for the problem.

Human problem-solving strategy is broadly divided into three steps: understanding the problem, writing the equation, and calculating the answer. Contextual information, such as the relationships between numbers, is used in the understanding and equation-writing steps.

In the understanding step, humans extract information about numbers and variables by explaining the given problem. In the equation-writing step, humans write an equation by combining several templates that preserve information about the relationships between operators and operands.

Machine learning for word problem solving builds a model designed to encode a given word problem and decode it into an equation.

Conventionally, the common approach to solving math word problems was to build expert-defined features into a data set, have a machine learning model learn those features to classify problems into types, and then apply the solution matching each type. This approach has the drawback that building the model requires expert work.

Using an artificial neural network removes the need for features defined in advance by experts, but it is difficult to interpret how the network understood the problem and through what process it inferred the equation.

Therefore, there is a need for an automated model that performs inference with an artificial neural network, together with a model that generates natural prose descriptions of the key parts of the problem as evidence for interpreting the neural network.

For reference, Patent Document 1 relates to an artificial intelligence-based math problem-solving device, and Patent Document 2 relates to a method of providing a math problem concept type prediction service using neural machine translation and a math corpus. Patent Documents 1 and 2 disclose only general math problem solving; neither provides an automated math problem-solving technique that explains how well the artificial neural network understood the problem and provides confidence in that explanation.

Korean Patent No. 10-2110784 (2020.05.08.)
Korean Patent No. 10-1986721 (2019.05.31.)

An object of the embodiments disclosed in this specification is to provide a problem-solving method and device that, in the problem-understanding process for a math word problem, generate in natural language an interpretation of the numbers and variables in the problem; in the equation-writing process, generate a recombined problem by restructuring that interpretation; and, based on the recombined problem and the original problem, generate an equation using expressions, which are units that group an operator with its associated operands.

Other objects and advantages of the present invention can be understood from the following description and will become more apparent from the embodiments. It will also be readily apparent that the objects and advantages of the present invention can be realized by the means indicated in the claims and combinations thereof.

As a technical means for achieving the above technical object, a problem-solving device includes: an input/output unit for receiving data and outputting the result of processing it; a storage unit storing a program for performing a problem-solving method; and a control unit that includes at least one processor and analyzes the data received through the input/output unit by executing the program. The control unit receives a math word problem and, through an encoder-decoder-based explanation generation model, generates and outputs an explanation of the numbers and variables contained in the word problem. It then receives the word problem, the generated explanation, or a combination thereof and, through an encoder-decoder-based equation generation model, generates and outputs an equation.

According to another embodiment, a problem-solving method performed by a problem-solving device includes: receiving a math word problem and generating and outputting, through an encoder-decoder-based explanation generation model, an explanation of the numbers and variables contained in the word problem; and receiving the word problem, the generated explanation, or a combination thereof and generating and outputting an equation through an encoder-decoder-based equation generation model.

According to another embodiment, a recording medium is a computer-readable recording medium on which a program for performing the problem-solving method is recorded.

According to another embodiment, a computer program is executed by a problem-solving device and stored in a recording medium to perform the problem-solving method.

According to any one of the above means, a problem-solving method and device can be provided in which the explanation generation model interprets a math word problem and, based on that interpretation, generates a natural-language interpretation of the numbers and variables in the problem.

In addition, according to any one of the above means, a problem-solving method and device can be provided that generate an equation by inputting the problem interpretation together with the original problem into the equation generation model, so that the model itself can compare the two and check whether the interpretation is accurate.

In addition, according to any one of the above means, a problem-solving method and device can be provided in which the equation generation model preserves contextual information by using expressions, which are units that group an operator with its associated operands.

In addition, according to any one of the above means, a problem-solving method and device can be provided that raise the correct-answer rate by improving the encoding and decoding processes so that the artificial neural network model preserves contextual information.

The effects obtainable from the disclosed embodiments are not limited to those mentioned above, and other effects not mentioned will be clearly understood from the description below by those of ordinary skill in the art to which the disclosed embodiments belong.

The accompanying drawings illustrate preferred embodiments disclosed in this specification and, together with the detailed description, serve to aid understanding of the technical idea disclosed herein. The disclosure should therefore not be construed as limited to the matters shown in the drawings.
FIG. 1 is a block diagram of a problem-solving device according to an embodiment.
FIG. 2 is a diagram illustrating an artificial neural network model applied to a problem-solving device according to an embodiment.
FIGS. 3 and 4 are diagrams illustrating the operating principle of an equation generation model applicable to a problem-solving device according to an embodiment.
FIGS. 5 and 6 are diagrams illustrating an ablation analysis of an equation generation model applicable to a problem-solving device according to an embodiment.
FIG. 7 is a diagram illustrating the operating principle of an artificial neural network model applied to a problem-solving device according to an embodiment.
FIGS. 8 and 9 are diagrams illustrating an ablation analysis of an artificial neural network model applied to a problem-solving device according to an embodiment.
FIGS. 10 and 11 are flowcharts of a problem-solving method according to another embodiment.

Various embodiments are described in detail below with reference to the accompanying drawings. The embodiments described below may be modified and implemented in various different forms. To describe the features of the embodiments more clearly, detailed descriptions of matters widely known to those of ordinary skill in the art to which the embodiments belong are omitted. In the drawings, parts unrelated to the description of the embodiments are omitted, and similar parts are given similar reference numerals throughout the specification.

Throughout the specification, when a component is said to be "connected" to another component, this includes not only the case where it is directly connected but also the case where it is connected with another component in between. In addition, when a component is said to "include" another component, this means that, unless specifically stated otherwise, it may further include other components rather than excluding them.

Hereinafter, embodiments are described in detail with reference to the accompanying drawings.

Table 1 below shows an example of a math word problem.

[Table 1]

In the problem in Table 1, a quarter is worth $0.25 (25 cents) and a nickel is worth $0.05 (5 cents). The artificial neural network model receives a math word problem and infers a suitable equation from it. A math word problem states numbers but not variables, so the model must infer appropriate variables. For example, it must set the number of quarters as the variable x, set the number of nickels as the variable y, and determine that two variables are needed for the solution.

The artificial neural network model goes through an encoding process and a decoding process when solving a math word problem. In the encoding process, the model must understand the problem correctly and convert it into numerical information that describes the role of each number. For example, the model must recognize that 12 is the total number of coins and that 2.20 is the total value of the coins. In the decoding process, the model tracks the computational structure of the equation and the parts already written, and builds the equation by combining small computational units. For example, the model generates the expression x+y and then reuses it in x+y=12.
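The coin example above can be worked through in code. The sketch below assumes that the Table 1 problem reduces to the two equations x + y = 12 and 0.25x + 0.05y = 2.20; the table itself is not reproduced in this text, so that system is an inference from the quantities named above, and `solve_2x2` is a hypothetical helper, not part of the disclosed device.

```python
from fractions import Fraction


def solve_2x2(a1, b1, c1, a2, b2, c2):
    """Solve a1*x + b1*y = c1 and a2*x + b2*y = c2 by Cramer's rule."""
    det = a1 * b2 - a2 * b1
    if det == 0:
        raise ValueError("system has no unique solution")
    x = (c1 * b2 - c2 * b1) / det
    y = (a1 * c2 - a2 * c1) / det
    return x, y


# Assumed equations for the coin problem (inferred, not quoted from Table 1):
#   x + y = 12              total number of coins
#   0.25x + 0.05y = 2.20    total value in dollars
quarters, nickels = solve_2x2(
    Fraction(1), Fraction(1), Fraction(12),
    Fraction("0.25"), Fraction("0.05"), Fraction("2.20"),
)
print(quarters, nickels)
```

Exact fractions are used instead of floats so that the dollar amounts do not pick up rounding error.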

Depending on how the artificial neural network model is designed, contextual information can be lost during the encoding and decoding processes. For the model to better preserve contextual information when solving math word problems, the relationships shown in Table 2 below need to be considered.

[Table 2]

To capture the contextual information of an equation's computational structure during the decoding process, in which the equation is written, the problem-solving device according to this embodiment uses an equation generation model built on expressions that group an operator with its related operands. Because an expression represents the relationship between an operator and its operands, the equation generation model can identify the contextual information of previously generated expressions and can easily tell which expressions it has already generated and which remain to be generated. The equation generation model uses expressions to improve the correct-answer rate.

To capture the contextual information of the elements that make up the problem, such as numbers, during the encoding process, in which the problem is understood, the problem-solving device according to this embodiment uses an explanation generation model that describes the numbers and variables. By comparing the generated explanation with the original problem, one can check how well the artificial neural network model understood the problem and identify whether it misunderstood the problem. The more accurately the explanation generation model predicts the explanation, the higher the solving accuracy of the artificial neural network model.

FIG. 1 is a functional block diagram of a problem-solving device according to an embodiment.

Referring to FIG. 1, the problem-solving device 100 according to an embodiment includes an input/output unit 110, a storage unit 120, and a control unit 130.

The input/output unit 110 may include an input unit for receiving input from a user and an output unit for displaying information such as the result of a task or the status of the problem-solving device 100. In other words, the input/output unit 110 is a component for receiving data and outputting the result of processing it. The problem-solving device 100 according to the embodiment may receive a math word problem and the like through the input/output unit 110.

The storage unit 120 is a component in which files and programs can be stored, and it may be implemented with various types of memory. In particular, the storage unit 120 may store data and programs that enable the control unit 130, described below, to perform the computations for problem solving according to the algorithm presented below.

The control unit 130 is a component that includes at least one processor, such as a CPU, GPU, or Arduino, and can control the overall operation of the problem-solving device 100. That is, the control unit 130 can control the other components of the problem-solving device 100 so that they perform the operations for problem solving. By executing the program stored in the storage unit 120, the control unit 130 can perform the computations for problem solving according to the algorithm presented below.

The control unit 130 generates an explanation from a math word problem through the artificial neural network model. The control unit 130 receives a math word problem and, through the encoder-decoder-based explanation generation model, generates and outputs an explanation of the numbers and variables contained in the word problem. The control unit 130 generates the explanation using contextual information related to the numbers in the problem, and uses the generated explanation to generate the equation. The generated explanation can serve as a criterion for judging how correctly the artificial neural network model understood the problem.

The control unit 130 generates an equation, through the artificial neural network model, from a problem formed by combining the problem converted from the explanation with the original math word problem. The control unit 130 receives the math word problem, the generated explanation, or a combination thereof and, through the encoder-decoder-based equation generation model, generates and outputs an equation. When the problem converted from the explanation is input together with the original word problem, it supplements the contextual information in the original word problem and can improve the performance of the artificial neural network model.

FIG. 2 is a diagram illustrating an artificial neural network model applied to a problem-solving device according to an embodiment.

The artificial neural network model includes an explanation generation model 200 and an equation generation model 300. The explanation generation model 200 and the equation generation model 300 are each designed as multiple layers in an encoder-decoder structure. The encoder performs embedding, which converts high-dimensional information into low-dimensional information that represents its features. The decoder converts the low-dimensional information back into high-dimensional information to reconstruct the features.

The explanation generation model 200 includes an encoder 210, a decoder 220, a variable predictor 230, and a pointer generator 240.

The encoder 210 of the explanation generation model 200 receives a math word problem, segments it into tokens, and outputs for each token a problem context vector representing the contextual information of the word problem.

The decoder 220 of the explanation generation model 200 receives the problem context vectors, the variables determined by the predicted number of variables, and the numbers in the math word problem, and computes the hidden state of the decoder 220 based on the previously generated explanation tokens.

The variable predictor 230 is connected between the encoder 210 and the decoder 220 of the explanation generation model 200. The variable predictor 230 receives the problem context vectors and transforms their representative value to predict the number of variables needed to solve the math word problem.

The pointer generator 240 is connected to the decoder 220 of the explanation generation model 200. The pointer generator 240 receives the hidden state of the decoder 220 and predicts the next explanation token.

Through the explanation generation model 200, the control unit 130 selects the context in which each number appears in the math word problem and restructures the generated explanation of the number into a first sentence. Through the explanation generation model 200, the control unit 130 also uses the index of each variable to restructure the generated explanation of the variable into a second sentence. The control unit 130 then combines the restructured first and second sentences to generate a recombined problem.
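The restructuring step above can be illustrated with a minimal sketch. The specification does not give the sentence templates, so the "<number> is <explanation>." and "x<index> is <explanation>." patterns and the function name `build_recombined_problem` are assumptions chosen purely for illustration.

```python
from typing import List, Tuple


def build_recombined_problem(
    number_explanations: List[Tuple[str, str]],
    variable_explanations: List[str],
) -> str:
    """Join number and variable explanations into one recombined problem text.

    number_explanations: (number as written in the problem, its explanation).
    variable_explanations: one explanation per variable, ordered by index.
    The sentence templates below are hypothetical.
    """
    # First sentences: one per number found in the problem text.
    first = [f"{number} is {text}." for number, text in number_explanations]
    # Second sentences: one per variable, named by its index.
    second = [
        f"x{index} is {text}."
        for index, text in enumerate(variable_explanations)
    ]
    return " ".join(first + second)


recombined = build_recombined_problem(
    [("12", "the total number of coins"), ("2.20", "the total value of the coins")],
    ["the number of quarters", "the number of nickels"],
)
print(recombined)
```

In the described device, the explanation texts would come from the explanation generation model 200 rather than being written by hand as they are here.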

The equation generation model 300 includes an encoder 310 and a decoder 320.

The control unit 130 inputs the recombined problem and the math word problem to the encoder 310 of the equation generation model 300.

The encoder 310 of the equation generation model 300 receives the recombined problem and the original math word problem and outputs recombined context vectors.

The decoder 320 of the equation generation model 300 receives the recombined context vectors and generates an equation using expression tokens, each of which groups an operator with the operands it requires.

FIGS. 3 and 4 are diagrams illustrating the operating principle of an equation generation model applicable to a problem-solving device according to an embodiment.

The equation generation model applicable to the problem-solving device (Expression Pointer Transformer, EPT) uses expression tokens to solve both the expression fragmentation problem and the operand-context separation problem.

Referring to FIG. 3(a), the expression fragmentation problem is that the expression tree representing the computational structure of an equation gets split apart. For example, with conventional operator/operand tokens, an expression is decomposed into the operator '×' and the operands 'x₁' and 'N₂'. These three tokens carry no meaning until they are recombined into the single expression N₂ × x₁, and they confuse the model.

The operand-context separation problem is that the link between an operand and the number it refers to is broken. For example, with conventional operator/operand tokens, the number '8' is not used directly; instead, the abstract symbol 'N₂', which does not appear in the problem, is used. It is not easy to find in the original problem which number the abstract symbol 'N₂' corresponds to.

In other words, the expression fragmentation problem and the operand-context separation problem cause the contextual information of the computational structure to be lost while the equation is generated.

수학식 생성 모델은 연산자 및 연산자와 관련된 피연산자를 그룹화한 표현식 토큰 단위를 적용하여, 기존의 연산자/피연산자 토큰보다 더 많은 계산 구조의 맥락 정보를 보존할 수 있다. 그룹화한 표현식 토큰 단위는 트리 구조를 유지하고 피연산자의 맥락 정보를 직접 사용하므로, 표현식 단편화 문제 및 피연산자-맥락 정보 분리 문제를 해결할 수 있다. The mathematical expression generation model can preserve more context information of the calculation structure than traditional operator/operand tokens by applying expression token units that group operators and operands related to the operators. Since the grouped expression token unit maintains the tree structure and directly uses the context information of the operand, it can solve the expression fragmentation problem and the operand-context information separation problem.

도 3의 (b)에 도시된 바와 같이, 연산자 및 관련 피연산자를 그룹화한 표현식 토큰을 적용하면, 3 개의 토큰 각각은 표현식 트리의 하위 트리 중 하나를 나타낸다. 수학식 생성 모델은 연산자 및 관련 피연산자를 그룹화한 표현식 토큰을 이용하여 연산자 및 관련 피연산자에 대한 맥락 정보를 명시적으로 고려할 수 있다. As shown in Figure 3(b), applying expression tokens that group operators and related operands, each of the three tokens represents one of the subtrees of the expression tree. A mathematical expression generation model can explicitly consider context information about operators and related operands by using expression tokens that group operators and related operands.

When expression tokens that group an operator with its related operands are applied, a number written in the problem is not converted into an abstract symbol; instead, a pointer to the position where the number occurs is generated. Each operand points directly to a written number or to a previous output. Using this operand-context pointer, the equation generation model can directly access contextual information about an operand.

Referring to FIG. 4, the encoder of the equation generation model (Expression Pointer Transformer, EPT) applicable to the problem-solving device reads the problem and outputs an encoder hidden state vector for each token. The decoder then uses the hidden state vectors as memory and generates expressions step by step.

The encoder reads the problem text and generates the encoder hidden state vectors and the context vectors of the numbers.

The encoder tokenizes the input problem text into a sequence of subword tokens and then converts each token into an embedding vector. Using the embedding vectors, the encoder computes an encoder hidden state vector for each token.

The encoder obtains a context vector for each number written in the problem. Because the encoder tokenizes the problem text into subword tokens, a written number may have more than one hidden state vector. To obtain a single vector, the encoder may therefore take the average of these hidden state vectors.
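The averaging step can be sketched as follows. This is a minimal illustration, assuming the hidden states are available as a NumPy array and that each number is identified by a hypothetical (start, end) span of subword token positions; the function name `number_context_vectors` is introduced here for illustration only.

```python
import numpy as np

def number_context_vectors(hidden_states, number_spans):
    """Average the encoder hidden states of the subword tokens of each number.

    hidden_states: array of shape (num_tokens, hidden_dim)
    number_spans:  list of (start, end) token positions, end exclusive (assumed format)
    """
    return np.stack(
        [hidden_states[start:end].mean(axis=0) for start, end in number_spans]
    )

# Toy example: 6 subword tokens with 2-dimensional hidden states.
hidden = np.arange(12, dtype=float).reshape(6, 2)
spans = [(1, 3), (4, 5)]   # first number spans two subwords, second spans one
context = number_context_vectors(hidden, spans)
```

A number that maps to a single subword token simply keeps its own hidden state, since the mean over one vector is that vector.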

The initial weights of the encoder may be taken from ALBERT (A Lite BERT for Self-supervised Learning of Language Representations, Lan et al., 2019), a pre-trained language model. Because ALBERT is trained to predict words or sentence order in a large corpus, it can supply knowledge useful for solving word problems.

The decoder uses the context vectors to address the operand-context separation problem. The decoder generates expression tokens using the encoder hidden state vectors as memory. The entire process of generating expression tokens is autoregressive: the equation generation model predicts the i-th token using the tokens generated before the i-th step.

Table 3 below shows an example of a sequence of eight expression tokens that generates x₀ − 2x₁ = 8 and x₀ + x₁ = 20.

[Table 3]

After receiving three tokens (BEGIN, VAR, and VAR) as input, the decoder must predict the operator × and the operands 2 and R₁ simultaneously. The decoder includes three special commands in its list: 'BEGIN' starts a sequence, 'VAR' creates a new variable, and 'END' collects all equations generated so far.
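To make the token sequence concrete, the sketch below expands a list of expression tokens into the two example equations. The token layout, the use of `("R", k)` tuples as pointers to the k-th prior result, and the convention that R₀ and R₁ are the outputs of the two VAR tokens are assumptions made for illustration; written numbers appear as literals here rather than as pointers into the problem text.

```python
def render_equations(tokens):
    """Expand expression tokens into equation strings (illustrative only)."""
    results = []      # results[k] holds the text of the k-th output R_k
    equations = []
    var_count = 0
    for operator, operands in tokens:
        if operator == "BEGIN":
            continue                      # starts the sequence, yields no result
        if operator == "END":
            break                         # stop and return the collected equations
        if operator == "VAR":
            results.append(f"x{var_count}")   # a new variable becomes a result
            var_count += 1
            continue
        # Resolve each operand: a pointer to a prior result or a literal number.
        args = [results[a[1]] if isinstance(a, tuple) else str(a) for a in operands]
        text = f"{args[0]} {operator} {args[1]}"
        if operator == "=":
            equations.append(text)
        else:
            text = f"({text})"
        results.append(text)
    return equations

tokens = [
    ("BEGIN", ()),
    ("VAR", ()),                       # R0 = x0
    ("VAR", ()),                       # R1 = x1
    ("*", (2, ("R", 1))),              # R2 = 2 * x1
    ("-", (("R", 0), ("R", 2))),       # R3 = x0 - R2
    ("=", (("R", 3), 8)),              # R4: x0 - 2*x1 = 8
    ("+", (("R", 0), ("R", 1))),       # R5 = x0 + x1
    ("=", (("R", 5), 20)),             # R6: x0 + x1 = 20
]
equations = render_equations(tokens)
```

Each non-special token carries its operator and both operands together, so every step corresponds to one subtree of the expression tree.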

The computational structure of the decoder applicable to the equation generation model may follow that of the decoder of the Transformer (Attention Is All You Need, Vaswani et al., 2017).

Assuming that the decoder is predicting the i-th expression token, the decoder receives the expression tokens generated so far and converts them into embedding vectors v_j (j = 0, ..., i−1). Based on the embedding vectors v_j and the encoder hidden state vectors e_t, the decoder builds the decoder hidden state vector d_i for the i-th expression token. The decoder then predicts the next expression token from the hidden state vector d_i.

Unlike the decoder of the Transformer, the decoder applicable to the equation generation model has a modified input embedding part and a modified output prediction part so that it can receive and generate expression tokens.

Regarding the input embedding part, the input vector v_i of the i-th expression token is obtained by combining the operator embedding f_i and the operand embeddings a_ij, as in Equation 1.

[Equation 1]

FF_* denotes a feed-forward linear layer, and Concat() denotes the concatenation of all vectors inside the parentheses. v_i, f_i, and a_ij all have the same dimension D.

For the operator token f_i of the i-th expression, the operator embedding vector f_i is computed as in Equation 2.

[Equation 2]

E_*() denotes a lookup table of embedding vectors, c_* a scalar parameter, LN_*() layer normalization, and PE() positional encoding.

The embedding vector a_ij representing the j-th operand of the i-th expression is computed differently depending on the source of the operand a_ij. To reflect the contextual information of operands, three sources are possible: problem-dependent numbers, problem-independent constants, and the results of previous expression tokens.

A problem-dependent number is a number given in the algebra problem. To compute a_ij for a number, the encoder hidden state vector corresponding to the number token is reused, as in Equation 3.

[Equation 3]

u_* is a vector representing the source, and the remaining term is the context vector corresponding to the number a_ij.

A problem-independent constant is a predefined number that is not stated in the problem. For example, 0.25 is used as the value of a quarter. To compute a_ij for a constant, the lookup table E_c is used, as in Equation 4.

[Equation 4]

The result of a previous expression token is an expression token generated before the current i-th step; for example, referring to Table 3, R₀ in the fourth step is such a result. Although the decoder hidden state vector of a step before the i-th step could be accessed, the equation generation model uses positional encoding, as in Equation 5, in order to maintain simultaneous decoding during training.

[Equation 5]

k denotes the index of the step at which the previous expression a_ij was generated. LN_a and c_a are shared across the different sources.
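Read together, the definitions accompanying Equations 1 to 5 suggest forms along the following lines. This is a hedged reconstruction from those definitions rather than a verbatim copy: the subscript names (in, f, a, num, const, expr), the operand count p, the symbol e_{a_ij} for the context vector of a number, and the additive combination inside each layer normalization are assumptions introduced for illustration.

```latex
\begin{align}
v_i    &= \mathrm{FF}_{\mathrm{in}}\bigl(\mathrm{Concat}(f_i, a_{i1}, \ldots, a_{ip})\bigr)
          && \text{(input vector, cf. Equation 1)} \\
f_i    &= \mathrm{LN}_{f}\bigl(c_{f}\,E_{f}(f_i) + \mathrm{PE}(i)\bigr)
          && \text{(operator embedding, cf. Equation 2)} \\
a_{ij} &= \mathrm{LN}_{a}\bigl(c_{a}\,u_{\mathrm{num}} + e_{a_{ij}}\bigr)
          && \text{(problem-dependent number, cf. Equation 3)} \\
a_{ij} &= \mathrm{LN}_{a}\bigl(c_{a}\,u_{\mathrm{const}} + E_{c}(a_{ij})\bigr)
          && \text{(problem-independent constant, cf. Equation 4)} \\
a_{ij} &= \mathrm{LN}_{a}\bigl(c_{a}\,u_{\mathrm{expr}} + \mathrm{PE}(k)\bigr)
          && \text{(result of a previous expression, cf. Equation 5)}
\end{align}
```

Under this reading, the three operand cases differ only in the source vector u_* and in the term carrying the contextual information, which is consistent with LN_a and c_a being shared across sources.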

The input embedding part is suited to handling the expression fragmentation problem and the operand-context separation problem. To address expression fragmentation, the input of the decoder of the equation generation model preserves the structure of the expression: the operator and its related operands are used together to build the input embedding of an expression. To address the operand-context separation problem, the decoder uses the contextual information of the operands; in particular, it directly uses the context vector of a written number instead of an abstract symbol.

Regarding the output prediction part, when the i-th expression token is provided, the decoder of the equation generation model simultaneously predicts the next operator f_{i+1} and the operands a_{i+1,j}. The next operator f_{i+1} is predicted as in Equation 6.

[Equation 6]

The probability term in Equation 6 is the probability of selecting item k from the distribution given by the output of the softmax function.

To use the contextual information of operands when predicting an operand, the output layer applies an 'operand-context pointer' modeled on pointer networks (Pointer Networks, Vinyals et al., 2015). In a pointer network, the output layer predicts the next token using attention over candidate vectors.

The equation generation model collects candidate vectors for the next (i+1)-th expression in three different ways depending on the source of the operand. A candidate vector is defined separately for the k-th number in the problem, for the k-th expression output, and for a constant x.

The equation generation model then predicts the next j-th operand a_{i+1,j}. Letting A_ij be the matrix whose row vectors are the candidates, the model predicts a_{i+1,j} by computing the attention of the query vector Q_ij over the key matrix K_ij.

[Equation 7]

The output prediction part is likewise suited to handling the expression fragmentation problem and the operand-context separation problem. To address expression fragmentation, the decoder of the equation generation model predicts all components of an expression simultaneously: the operator and the operands are generated from the same decoder hidden state vector d_i. To address the operand-context separation problem, the decoder points directly to the appropriate contextual information instead of generating an abstract symbol. The encoder hidden state vectors and the decoder hidden state vectors can provide the contextual information of written numbers and of previous expression tokens, respectively.
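The operand-context pointer can be sketched as attention of one query over a matrix of candidate vectors. This is a minimal sketch, assuming scaled dot-product scoring and linear projections `W_q` and `W_k` (both hypothetical names); the candidate matrix is assumed to stack number context vectors, prior-expression vectors, and constant embeddings row by row.

```python
import numpy as np

def softmax(scores):
    """Numerically stable softmax over a 1-D array."""
    shifted = np.exp(scores - np.max(scores))
    return shifted / shifted.sum()

def point_to_operand(d_i, candidates, W_q, W_k):
    """Pick the candidate that the decoder state d_i attends to most.

    d_i:        decoder hidden state, shape (D,)
    candidates: matrix whose rows are candidate vectors, shape (num_candidates, D)
    W_q, W_k:   assumed query/key projection matrices, shape (D, D)
    """
    query = d_i @ W_q                                  # query vector
    keys = candidates @ W_k                            # one key per candidate
    scores = keys @ query / np.sqrt(query.shape[0])    # scaled dot-product scores
    probs = softmax(scores)
    return int(np.argmax(probs)), probs

# Toy example with identity projections.
d = np.array([1.0, 0.0])
A = np.array([[0.0, 1.0], [2.0, 0.0], [1.0, 1.0]])
index, probs = point_to_operand(d, A, np.eye(2), np.eye(2))
```

Because the output is an index into the candidate rows, the predicted operand is always tied to a concrete number, constant, or earlier expression rather than to an abstract symbol.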

The equation generation model is trained to minimize the sum of several loss functions (an operator loss and operand losses). Smoothed cross-entropy may be used as the loss function. To account for noncommutative operators, the model may compute the operand loss separately for each operand position. For example, if the maximum arity of the operators used in the model is 2, three loss functions are applied: one operator loss and two operand losses.
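The summed objective can be sketched as follows for a single expression token. This is an illustration under assumptions: label smoothing spreads a fixed mass uniformly over the non-target classes, and the smoothing value 0.1 is a placeholder rather than a value given in the text.

```python
import numpy as np

def smoothed_cross_entropy(logits, target, smoothing=0.1):
    """Cross-entropy against a label-smoothed target distribution."""
    logits = np.asarray(logits, dtype=float)
    shifted = logits - np.max(logits)
    log_probs = shifted - np.log(np.exp(shifted).sum())     # log-softmax
    num_classes = logits.shape[0]
    target_dist = np.full(num_classes, smoothing / (num_classes - 1))
    target_dist[target] = 1.0 - smoothing
    return float(-(target_dist * log_probs).sum())

def expression_loss(op_logits, op_target, operand_logits, operand_targets):
    """One operator loss plus one loss per operand position."""
    loss = smoothed_cross_entropy(op_logits, op_target)
    for logits, target in zip(operand_logits, operand_targets):
        loss += smoothed_cross_entropy(logits, target)       # position-specific loss
    return loss

# Maximum arity 2: one operator loss and two operand losses.
total = expression_loss([0.0] * 4, 0, [[0.0] * 4, [0.0] * 4], [1, 2])
```

Keeping a separate loss per operand position means that swapping the operands of a noncommutative operator such as subtraction is penalized.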

FIGS. 5 and 6 are diagrams illustrating an ablation analysis of the equation generation model applicable to a problem-solving device according to an embodiment.

The model shown in FIG. 4 applies both expression tokens and operand-context pointers. The model shown in FIG. 5 applies operator/operand tokens (op tokens) instead of expression tokens and abstract symbols instead of operand-context pointers. The model shown in FIG. 6 applies expression tokens but uses abstract symbols instead of operand-context pointers.

Table 4 below shows the size and complexity of the three data sets applied to each of the models shown in FIGS. 4, 5, and 6.

[Table 4]

ALG514 and DRAW-1K are high-complexity data sets, and MAWPS is a low-complexity data set. The numbers of unknowns and tokens in ALG514 and DRAW-1K are almost twice those in MAWPS.

Table 5 below shows the answer accuracy of each of the models shown in FIGS. 4, 5, and 6.

[Table 5]

Compared with the Transformer model (FIG. 5), the model with expression tokens (FIG. 6) improved accuracy by about 15% on ALG514 and DRAW-1K and by about 1% on MAWPS. The model that adds operand-context pointers to the expression-token model (FIG. 4) improved accuracy by about 30% on ALG514 and DRAW-1K and by about 3% on MAWPS.

The larger gains on the high-complexity data sets ALG514 and DRAW-1K can be explained from two perspectives. First, as the numbers of unknowns and tokens grow in a high-complexity data set, the probability that the expression fragmentation problem occurs for each token increases exponentially. Second, because numbers and expression tokens are the candidates from which operands are selected, the probability that the operand-context separation problem occurs increases linearly with the number of numbers and expressions in the equation. Applying expression tokens and operand-context pointers to the equation generation model resolves both problems for each token, which is why accuracy improves more on the high-complexity data sets.

Even if the accuracy of an artificial neural network model that solves math word problems improves, it is still necessary to verify whether the model has correctly captured the numerical information in the problem.

FIG. 7 is a diagram illustrating the operating principle of an artificial neural network model applied to a problem-solving device according to an embodiment.

The artificial neural network model according to this embodiment (Expression Pointer Transformer with eXplanations, EPT-X) must take into account explanations that satisfy two criteria when solving a problem. First, an explanation must satisfy plausibility by thoroughly reflecting the given problem in the process of understanding it. In particular, because humans recognize each number and variable individually, an explanation must state what each number or variable means in the context of the given problem. Second, the explanation must actually be used in writing the equation so as to satisfy faithfulness. If an explanation contains information useful for writing the equation, it should serve as a basis for selecting operators or operands.

To write equations that are faithful to plausible explanations, the problem-solving device according to this embodiment modifies the equation generation model described with reference to FIG. 4 and additionally designs an explanation generation model. The device performs an explanation operation on numbers and variables based on the explanation generation model, and an equation construction operation based on the equation generation model.

In the explanation operation, the original problem is input and passed through the encoder 210, a pre-trained language model, to obtain problem context vectors, which are vector values representing the contextual meaning of the problem. ELECTRA (Pre-training Text Encoders as Discriminators Rather Than Generators, Clark et al., 2020) may be used as the pre-trained language model.

The explanation operation inputs the generated vector values into the variable predictor 230 to predict the number of variables needed to solve the problem.

The explanation operation inputs each number in the problem and each predicted variable into the decoder 220 to obtain an explanation of that number or variable. The object of each explanation is a part of the problem, such as a number or a variable, rather than the problem as a whole.

The equation construction operation converts the generated explanations into sentences and concatenates them to produce a recombined problem. Because the recombined problem is a paraphrase of the original problem, the model can compare it with the original problem to check for itself whether the explanation operation missed any information.

The equation construction operation takes as input either the concatenation of the original problem and the recombined problem or only one of the two, and passes it through the encoder 310, a pre-trained language model, to obtain new vector values, the recombined context vectors. ELECTRA may be used as the pre-trained language model. If necessary, the encoder 210 of the explanation generation model 200 and the encoder 310 of the equation generation model 300 may be shared.

The equation construction operation inputs the new vector values into the decoder 320 to predict the equations for solving the given problem.

Specifically, the explanation operation is divided into an operation of computing the problem context vectors, an operation of predicting the number of variables, and an operation of generating explanations.

In the operation of computing the problem context vectors, the original problem is segmented into tokens, and each token is converted into its index in a dictionary. The resulting list of indices is input to the pre-trained language model. A vector value is generated for each token according to the computation of the pre-trained language model, and the problem context vectors are thereby obtained.
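The segmentation and index lookup can be illustrated with a toy dictionary. This sketch assumes whitespace splitting as a stand-in for the subword tokenizer of the pre-trained language model, and the names `to_indices` and `[UNK]` are hypothetical.

```python
def to_indices(text, vocab, unk_token="[UNK]"):
    """Segment text into tokens and map each token to its dictionary index."""
    tokens = text.lower().split()          # stand-in for a subword tokenizer
    return [vocab.get(token, vocab[unk_token]) for token in tokens]

vocab = {"[UNK]": 0, "tom": 1, "has": 2, "8": 3, "apples": 4}
ids = to_indices("Tom has 8 pears", vocab)   # "pears" is not in the toy dictionary
```

The resulting index list is what the language model consumes; it then produces one vector per index.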

In the operation of computing the problem context vectors, the encoder 210 of the explanation generation model 200 receives the natural-language problem text as input and computes the problem context vectors. Given the problem text, the encoder 210 tokenizes it into a sequence of subword tokens and then converts each token into an embedding vector. Using the embedding vectors, the encoder 210 computes a problem context vector w_s for each token w_s of the given problem.

The weights of the encoder 210 may be initialized using ELECTRA, a pre-trained language model. Because ELECTRA is trained to predict replaced words in a large corpus, it can supply the knowledge needed to understand the given problem.

Because the given problem text does not state the number N of required variables, N must be recovered from the problem context vectors before explanations are generated.

In the operation of predicting the number of variables, a representative value that can represent the problem is selected from among the vector values generated by the language model. There are several ways to select the representative value; for example, the first value may be selected. The representative value is transformed by passing it through several linear feed-forward layers and activation functions. Based on the transformed value, the probability of each possible number of variables from 1 to 9 is predicted.

In the operation of predicting the number of variables, the variable predictor 230 of the explanation generation model 200 may use, among the various ways of pooling the problem context vectors, the problem context vector w₀ of the first token, which carries the overall meaning of the problem text. The probability distribution over the number of variables N is computed as in Equation 8.

[Equation 8]

FF() denotes a feed-forward layer. The maximum number of variables may be set to 9.
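A variable predictor of this kind can be sketched as a small feed-forward classifier over w₀. The two-layer shape, the tanh activation, and the parameter names below are assumptions made for illustration; only the use of w₀, feed-forward layers, and the limit of 9 variables come from the description above.

```python
import numpy as np

MAX_VARIABLES = 9   # classes correspond to N = 1, ..., 9

def predict_variable_count(w0, W1, b1, W2, b2):
    """Return the most likely number of variables and the full distribution."""
    hidden = np.tanh(w0 @ W1 + b1)            # assumed hidden layer and activation
    logits = hidden @ W2 + b2                 # one logit per candidate count
    shifted = np.exp(logits - logits.max())
    probs = shifted / shifted.sum()           # softmax over the 9 classes
    return int(np.argmax(probs)) + 1, probs   # class index 0 means N = 1

# Toy parameters: the output bias favors N = 3.
rng = np.random.default_rng(0)
w0 = rng.normal(size=4)
W1, b1 = rng.normal(size=(4, 8)), np.zeros(8)
W2, b2 = np.zeros((8, MAX_VARIABLES)), np.zeros(MAX_VARIABLES)
b2[2] = 5.0
n_vars, probs = predict_variable_count(w0, W1, b1, W2, b2)
```

The predicted count determines how many variable explanations the decoder is later asked to generate.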

In the operation of generating explanations, the decoder 220 receives the vectors generated by the pre-trained language model together with information on the number or variable to be explained, one at a time, and completes an explanation. The decoder 220 of the explanation generation model 200 may be based on the Transformer. For a number, the context of the problem in which the number appears is selected and input so that the decoder explains it. For a variable, the index of the variable is input so that the decoder explains it. The operation ends once explanations have been generated for all numbers and all variables.

In the operation of generating explanations, the decoder 220 of the explanation generation model 200 generates an explanation using the problem context vectors as memory. The explanation generation model 200 may be based on the decoder of the Transformer (Attention Is All You Need, Vaswani et al., 2017) and on the pointer-generator network (Get To The Point: Summarization with Pointer-Generator Networks, See et al., 2017).

Before predicting the next explanation token x_{t+1}, the decoder 220 computes the hidden state h_t based on the problem context vectors w_s and the previously generated explanation tokens x₁, ..., x_t.

To make use of pre-trained knowledge in generating explanations, BERTGeneration (Leveraging Pre-trained Checkpoints for Sequence Generation Tasks, Rothe et al., 2020) may be applied, with the ELECTRA model (Clark et al., 2020) used for the initial weights. In the operation of generating explanations, the embeddings of the decoder may be frozen after initialization.

The pointer generator 240 receives the computed hidden state h_t and predicts the next token. Let p_g, P_v, and P_c denote the probability of using a generated word, the probability of generating a token from the vocabulary, and the probability of copying a token from the problem, respectively. The next token x_{t+1} is then predicted as in Equation 9.

[Equation 9]

σ(), E(), and attn() denote the sigmoid function, the embedding, and a single-head attention scoring function, respectively. The remaining symbol denotes the concatenation of vectors.
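How the three probabilities combine can be sketched with the usual pointer-generator mixture, in which p_g weights the vocabulary distribution and (1 − p_g) weights the copy distribution. That this is the exact form of Equation 9 is an assumption; the function and argument names are hypothetical.

```python
import numpy as np

def mix_distributions(p_g, P_v, copy_probs, source_token_ids):
    """Combine vocabulary generation and copying into one distribution.

    p_g:              probability of generating from the vocabulary (scalar)
    P_v:              distribution over the vocabulary, shape (vocab_size,)
    copy_probs:       attention distribution over source positions
    source_token_ids: vocabulary id of the token at each source position
    """
    final = p_g * np.asarray(P_v, dtype=float)
    for prob, token_id in zip(copy_probs, source_token_ids):
        final[token_id] += (1.0 - p_g) * prob   # add copy mass to the copied token
    return final

P_v = [0.5, 0.3, 0.2]
final = mix_distributions(0.6, P_v, [0.25, 0.75], [2, 2])
next_token = int(np.argmax(final))
```

In this toy case the copy distribution puts all its mass on token 2, which is enough to overturn the vocabulary distribution's preference for token 0.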

In the operation of generating explanations, a separate explanation is generated for each number or variable in order to secure the plausibility of the explanations. The explanation generation model 200 uses a distinct initial input for every number and variable. Instead of '[CLS]', the initial input of the Transformer decoder, the explanation generation model 200 inputs “[CLS] explain: context [SEP]”, where the 'context' part depends on the number or variable. For a number, a window of tokens around the given number token is used. For example, if the window size is 3, the 3 tokens preceding the given token and the 3 tokens following it are used. For a variable, the variable index is used because the variable does not appear in the problem. For example, the initial input for the n-th variable is “[CLS] explain: variable n [SEP]”.
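The construction of the initial decoder inputs can be sketched as follows. The sketch assumes a whitespace-tokenized problem and that the number token itself is kept inside the window; both are illustrative choices, and the function names are hypothetical.

```python
def number_prompt(tokens, number_index, window=3):
    """Initial decoder input for a number: tokens within the window around it."""
    start = max(0, number_index - window)
    context = tokens[start:number_index + window + 1]   # includes the number itself
    return "[CLS] explain: " + " ".join(context) + " [SEP]"

def variable_prompt(n):
    """Initial decoder input for the n-th variable."""
    return f"[CLS] explain: variable {n} [SEP]"

tokens = "Tom has 8 apples and 3 pears".split()
prompt_for_8 = number_prompt(tokens, number_index=2)
prompt_for_var = variable_prompt(0)
```

Near the start or end of the problem the window is simply truncated, so the prompt for '8' above has only two preceding tokens.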

Specifically, the equation construction operation is divided into an operation of recombining the explanations, an operation of computing the recombined context vectors, and an operation of generating the equations.

The operation of recombining the explanations paraphrases them. An explanation of a number is restructured into a declarative sentence of the form “(explanation) is (number)”, and an explanation of a variable is restructured into a question of the form “What is (explanation)?”. The first sentences, paraphrased from the explanations of numbers, and the second sentences, paraphrased from the explanations of variables, are concatenated in order to obtain the recombined problem. These sentence forms were chosen for convenience and may be replaced with other forms. To augment the data set, one of the reference explanations of each number or variable may be selected at random before recombination during training.
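The recombination step can be sketched with two sentence templates. The English templates, the input format, and the function name are assumptions chosen for illustration; other sentence forms would work equally well.

```python
def recombine(number_explanations, variable_explanations):
    """Build a recombined problem from per-number and per-variable explanations.

    number_explanations:   list of (number, explanation) pairs
    variable_explanations: list of explanations, one per variable
    """
    sentences = [f"{text} is {number}." for number, text in number_explanations]
    sentences += [f"What is {text}?" for text in variable_explanations]
    return " ".join(sentences)    # number sentences first, then variable questions

recombined = recombine(
    [("8", "the difference of the two numbers"), ("20", "the sum of the two numbers")],
    ["the first number", "the second number"],
)
```

The output reads as a compact restatement of the problem: facts about the given numbers followed by one question per unknown.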

Recombining the explanations helps the decoder 320 of the equation generation model 300 in two respects. First, it refines the original problem by excluding information that is irrelevant to solving it, since the recombined problem text contains less irrelevant information than the original problem text. Second, the recombined problem supplements the contextual information needed to solve the problem. The original problem sometimes omits information about the required variables, whereas the recombined problem states this information explicitly.

In the operation of computing the recombined context vector, the encoder 310 of the equation generation model 300, which is a pre-trained language model, obtains new vector values called recombined context vectors. When both the original problem and the recombined problem are used, they are concatenated in that order and used as the input. When only one kind of problem is provided, that problem is used as the input as is.

In this operation, when a problem or a combination of problems is fed to the pre-trained language model, the input is segmented into tokens, and each token is converted into its index in a vocabulary (dictionary) by lookup. The resulting list of indices is input to the pre-trained language model, which computes a vector for each token. These per-token vectors are the recombined context vectors.
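The segmentation and index conversion can be illustrated as follows. This is only a sketch under stated assumptions: a whitespace tokenizer and a tiny hand-written vocabulary stand in for the tokenizer and vocabulary of the pre-trained language model, which the text does not specify, and the names `tokenize`, `to_indices`, and `[UNK]` are hypothetical.

```python
def tokenize(text):
    """Segment text into tokens (assumption: lowercase whitespace splitting)."""
    return text.lower().split()


def to_indices(tokens, vocab, unk="[UNK]"):
    """Convert each token to its vocabulary index, falling back to the unknown token."""
    return [vocab.get(token, vocab[unk]) for token in tokens]


# Toy vocabulary; a real model ships its own, typically subword-based.
vocab = {"[UNK]": 0, "what": 1, "is": 2, "the": 3, "total": 4}
indices = to_indices(tokenize("What is the total cost"), vocab)
```

The list `indices` is what would be passed to the language model, which then returns one vector per position.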

After the recombined problem text is generated, the original problem and the recombined problem are combined as “[CLS] original problem [SEP] recombined problem [SEP]” and provided as input to the encoder 310. The artificial neural network model may be designed to use both types of problem because including the original problem prevents information loss. If an explanation is generated incorrectly, the recombined problem alone may not contain enough information to solve the problem. Even when a generated explanation is wrong, the original problem can help steer the training of the model in the right direction.
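The input layout above can be assembled as in the following sketch. The function name `build_encoder_input` is hypothetical, and wrapping a single problem in “[CLS] ... [SEP]” is an assumption; the text only specifies the layout for the combined case.

```python
def build_encoder_input(original=None, recombined=None):
    """Join the available problem texts as "[CLS] a [SEP] b [SEP]" (hypothetical helper)."""
    segments = [text for text in (original, recombined) if text]
    if not segments:
        raise ValueError("at least one problem text is required")
    pieces = ["[CLS]"]
    for text in segments:
        # Each segment is closed by a separator token; the original problem comes first.
        pieces += [text, "[SEP]"]
    return " ".join(pieces)
```

Passing only `original` or only `recombined` yields the single-input variants.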

In the operation of computing the recombined context vector, the encoder 310 of the equation generation model 300 computes, for the i-th input token, the corresponding recombined context vector r_i.

Considering the similarity between the encoder 210 of the explanation generation model 200 and the encoder 310 of the equation generation model 300, the artificial neural network model may use the same encoder for both.

The input text of the encoder 210 of the explanation generation model 200 is the original problem, which is a subsequence of the input to the encoder 310 of the equation generation model 300, and the outputs of the two encoders have the same form. That is, the output of each of the encoders 210 and 310 is a set of context vectors that encapsulates the given input text. Sharing training knowledge between the encoder 210 and the encoder 310 can stabilize the training process.

In the operation of generating the equation, the decoder 320 of the equation generation model 300 receives the vectors produced by the pre-trained language model and generates the equation step by step.

At each step, the decoder 320 of the equation generation model 300 generates one operator and copies the operands it requires from one or more of the recombined problem, the original problem, predefined constants, and the results of previous computation steps. The finally generated expression is converted into an equation for solving the problem. The answer is obtained by solving the converted equation with an equation-solving library.
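The conversion from step-by-step decoder output to an equation can be sketched as follows. The encoding used here is an assumption made for illustration: each step is a tuple `(operator, left, right)`, and each operand is a pair such as `("num", i)` for the i-th number in the problem, `("const", c)` for a predefined constant, `("var", k)` for the k-th unknown, or `("res", j)` for the result of step j. Solving the resulting equation is left to an external library and is not shown.

```python
def render_operand(ref, numbers, rendered):
    """Turn an operand reference into text (hypothetical encoding)."""
    kind, value = ref
    if kind == "num":      # copied from the problem text
        return str(numbers[value])
    if kind == "const":    # predefined constant
        return str(value)
    if kind == "var":      # unknown variable
        return f"x{value}"
    if kind == "res":      # result of an earlier step
        return f"({rendered[value]})"
    raise ValueError(f"unknown operand kind: {kind}")


def to_equations(expression_tokens, numbers):
    """Render each step in order and keep the steps whose operator is '='."""
    rendered = []
    for operator, left, right in expression_tokens:
        lhs = render_operand(left, numbers, rendered)
        rhs = render_operand(right, numbers, rendered)
        rendered.append(f"{lhs} {operator} {rhs}")
    return [text for text, step in zip(rendered, expression_tokens) if step[0] == "="]


steps = [("+", ("num", 0), ("num", 1)), ("=", ("var", 0), ("res", 0))]
equations = to_equations(steps, numbers=[3, 5])
```

Here the second step refers back to the first through `("res", 0)`, mirroring the reuse of earlier results described above.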

The decoder 320 of the equation generation model 300 generates the equation using the recombined context vectors as memory. Following the operating principle of the decoder described with reference to FIG. 4, the equation generation model 300 generates the equation in units of expression tokens, each of which groups an operator with its associated operands.

The decoder 320 of the equation generation model 300 predicts the next, j-th, token as follows. First, the decoder 320 receives the expression tokens generated so far and converts them into embedding vectors v_k (k = 0, ..., j-1). Then, using these embedding vectors v_k and the recombined context vectors r_i, the decoder 320 builds the equation context vector q_j for the j-th token. Finally, the decoder 320 uses the equation context vector q_j to predict the next operator and its operands.

In the operation of generating the equation, the explanations are used as an input data source to ensure the fidelity of the explanations. The input format of numbers and variables is changed so that the decoder of the equation generation model described with reference to FIG. 4 can use the explanations. The decoder described with reference to FIG. 4 takes two different types of vectors as input: the encoder's hidden state vector for each known number and the decoder's hidden state vector for each unknown variable.

The decoder 320 of the equation generation model 300 of the artificial neural network model according to this embodiment is trained to generate the equation using the information in the explanations. Because every number and variable appears in the recombined problem, the decoder 320 uses the recombined context vector r_i corresponding to each number/variable.

To train and validate the artificial neural network model according to this embodiment, a data set called PEN (Problem with Explanation for Numbers) is designed. PEN is an algebraic word problem data set that contains, for each problem, the problem text, the equations, and explanations of the numbers/variables.

Construction of the PEN data set proceeds through a data set selection stage, an error correction and preparation stage, and an annotation stage for collecting explanations.

In the data set selection stage, source data sets are collected. A source data set must contain word problems written in English, most of its problems must require algebraic equations to solve, and it must include a gold standard equation for each problem.

In the error correction and preparation stage, errors are corrected and the data are cleaned. First, typos, grammatical errors, and logical errors are corrected. Next, numbers of various types are extracted from the corrected text. A word problem may contain numeric data that can be used to set up equations, and such data may be written in various forms, such as Arabic numerals, fractions, ordinals, or other expressions of numbers (e.g., dozen). Finally, the solution equations are normalized: the equations are rewritten based on predefined formulas as shown in Table 6 below.

[Table 6]
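The number extraction performed in the error correction and preparation stage can be illustrated with a small sketch. The regular expression, the word list `NUMBER_WORDS`, and the function name `extract_numbers` are assumptions for illustration; the text does not describe how the extraction was actually implemented, and a real word list would be far larger.

```python
import re
from fractions import Fraction

# Hypothetical mapping from number words to values.
NUMBER_WORDS = {"dozen": 12, "first": 1}
# Fractions are tried before plain or decimal numerals, then alphabetic words.
TOKEN = re.compile(r"\d+/\d+|\d+(?:\.\d+)?|[A-Za-z]+")


def extract_numbers(text):
    """Return (surface form, value) pairs for numeric expressions found in text."""
    found = []
    for match in TOKEN.finditer(text):
        token = match.group()
        lowered = token.lower()
        if lowered in NUMBER_WORDS:
            found.append((token, Fraction(NUMBER_WORDS[lowered])))
        elif token[0].isdigit():
            # Fraction parses "2", "2.5", and "3/4" exactly.
            found.append((token, Fraction(token)))
    return found
```

Exact rational values are used so that fractions such as 3/4 are not rounded before they are placed into equations.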

In the annotation stage for collecting explanations, an explanation is collected for each number/variable in a problem. Using a web-based system, annotators enter a natural-language explanation for each number/variable, completing a sentence based on the given information. To keep the explanations consistent with the problem, two strategies are used: the rules defined in Table 7 below and validity checking.

[Table 7]

Rule 1 requires that the explanation describe the situation represented by the number/variable using words that appear in the text. Rule 2 requires that each explanation be concise, namely a simple noun phrase of 3 to 25 words. Rule 3 requires that the explanation use at least one word that appears in the problem text. Rule 4 prohibits using the same explanation for different entities. Rule 5 requires that the equations for solving the problem can be formulated from the explanations alone. Rule 6 requires that a difference A-B be written as “the value of A minus B.” Rule 7 requires that a ratio A/B be written as “the ratio of A to B.” Rule 8 requires that the numerator [denominator] of A/B be written as “the numerator [denominator] of the ratio of A to B.”

The web-based system continuously checks whether the annotator is following the rules. If any of the first four rules is violated, the system warns the annotator to comply with the violated rule before proceeding to the next problem. For the other four rules, the system displays hints so that the annotator can check the rules manually.
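An automatic check of this kind might look like the following sketch, which covers Rules 2, 3, and 4 only. How the web-based system actually implemented its checks is not disclosed; the word-splitting pattern and the function names `words` and `check_rules` are assumptions.

```python
import re


def words(text):
    """Lowercased alphanumeric words of a text (assumed notion of a word)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def check_rules(problem_text, explanations):
    """Return (explanation index, violated rule number) pairs for Rules 2 to 4."""
    problem_words = set(words(problem_text))
    seen = set()
    violations = []
    for index, explanation in enumerate(explanations):
        tokens = words(explanation)
        if not 3 <= len(tokens) <= 25:          # Rule 2: 3 to 25 words
            violations.append((index, 2))
        if not problem_words & set(tokens):     # Rule 3: share a word with the problem
            violations.append((index, 3))
        key = " ".join(tokens)
        if key in seen:                         # Rule 4: no repeated explanation
            violations.append((index, 4))
        seen.add(key)
    return violations
```

A non-empty result would correspond to the warning shown to the annotator before moving on to the next problem.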

The size of the completed PEN data set is shown in Table 8 below.

[Table 8]

For the artificial neural network model according to this embodiment (EPT-X), the answer accuracy, that is, the proportion of problems answered correctly, is measured on the PEN data set. A generated equation is counted as correct if it matches the gold standard equation, taking the order of variables into account.
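As a rough illustration of this metric, the sketch below counts a prediction as correct when it equals the gold standard equation after whitespace is removed. This string comparison is a simplifying assumption: it does not model the handling of variable order or any other equivalence that the actual evaluation may have used, and the function name `accuracy` is hypothetical.

```python
def accuracy(predicted, gold):
    """Fraction of problems whose predicted equation matches the gold equation."""
    if not gold or len(predicted) != len(gold):
        raise ValueError("predicted and gold must be non-empty and of equal length")

    def normalize(equation):
        # Ignore spacing differences only; no algebraic equivalence is checked.
        return "".join(equation.split())

    correct = sum(normalize(p) == normalize(g) for p, g in zip(predicted, gold))
    return correct / len(gold)
```

Under this simplification, algebraically equivalent but differently ordered equations such as "x0 = 2 * 4" and "x0 = 4 * 2" count as a mismatch.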

The EPT of FIG. 4 achieves an answer accuracy of 74.52%, and the EPT-X of FIG. 7 achieves 69.59% while additionally outputting explanations.

FIGS. 8 and 9 illustrate an ablation analysis of the artificial neural network model applied to the problem-solving device according to an embodiment.

To examine the trade-off between accuracy and fidelity, tests are performed after changing the input data and the data path of the equation generation model of the artificial neural network model.

The model shown in FIG. 8 (EPTX+F) inputs only the explanations into the equation generation model. Because the decoder of the equation generation model in EPTX+F depends solely on the output of the decoder of the explanation generation model, EPTX+F can be vulnerable to errors in the generated explanations.

The model shown in FIG. 9 (EPTX+U) inputs only the original problem into the equation generation model. Because EPTX+U does not use the explanations, it cannot ensure fidelity.

Table 9 below compares EPT-X, EPTX+F, and EPTX+U.

[Table 9]

The plausibility of the explanations can be checked by measuring their similarity. BLEU (a Method for Automatic Evaluation of Machine Translation, Papineni et al., 2002), ROUGE (A Package for Automatic Evaluation of Summaries, Lin, 2004), CIDEr (Consensus-based Image Description Evaluation, Vedantam et al., 2015), and BLEURT (Learning Robust Metrics for Text Generation, Sellam et al., 2020) are metrics for measuring the similarity of explanations.

Accuracy is higher for the model that receives both the original problem and the generated explanations (EPT-X) than for the model that receives only the generated explanations (EPTX+F), because the explanations supplement the contextual information of the original problem. The model that receives only the original problem (EPTX+U) may achieve slightly higher accuracy than EPT-X, but it has the limitation that it cannot ensure fidelity. EPT-X, EPTX+F, and EPTX+U all achieve high accuracy because they preserve contextual information by using expression token units that group operators with operands.

FIGS. 10 and 11 are flowcharts of a problem-solving method according to another embodiment.

The problem-solving method according to the embodiment shown in FIGS. 10 and 11 includes steps that are processed in time series by the problem-solving device shown in FIG. 1. Therefore, even where omitted below, the description given above regarding the problem-solving device shown in FIG. 1 also applies to the problem-solving method according to the embodiment shown in FIGS. 10 and 11.

Referring to FIG. 10, in step S1010, the problem-solving device 100 receives a sentence-type math problem and, through an encoder-decoder-based explanation generation model, generates and outputs explanations of the numbers and variables contained in the sentence-type math problem.

In the step of generating and outputting the explanations (S1010), the encoder of the explanation generation model receives the sentence-type math problem, segments it into tokens, and outputs, for each token, a problem context vector representing the contextual information of the sentence-type math problem.

In the step of generating and outputting the explanations (S1010), the variable predictor connected between the encoder and the decoder of the explanation generation model receives the problem context vectors and transforms a representative value to predict the number of variables needed to solve the sentence-type math problem.

In the step of generating and outputting the explanations (S1010), the decoder of the explanation generation model receives the problem context vectors, the variables determined according to the predicted number of variables, and the numbers in the sentence-type math problem, and computes its hidden state based on the previously generated explanation tokens.

In the step of generating and outputting the explanations (S1010), the pointer generator connected to the decoder of the explanation generation model receives the hidden state of the decoder and predicts the next explanation token.

In step S1020, the problem-solving device 100 receives the sentence-type math problem, the generated explanations, or a combination thereof, and generates and outputs an equation through an encoder-decoder-based equation generation model.

Referring to FIG. 11, the step of generating and outputting the equation (S1020) converts the generated explanations into sentences and generates a recombined problem. Step S1020 includes: restructuring into a first sentence the explanation of a number, which the explanation generation model generates by selecting the context in which the number appears in the sentence-type math problem (S1110); restructuring into a second sentence the explanation of a variable, which the explanation generation model generates using the index of the variable (S1120); and combining the restructured first and second sentences to generate the recombined problem (S1130).

In the step of generating and outputting the equation (S1020), the recombined problem and the sentence-type math problem are input to the encoder of the equation generation model.

In the step of generating and outputting the equation (S1020), the encoder of the equation generation model receives the recombined problem and the sentence-type math problem and outputs recombined context vectors.

In the step of generating and outputting the equation (S1020), the decoder of the equation generation model receives the recombined context vectors and generates the equation in units of expression tokens, each of which groups an operator with the operands it requires.

The answer can be obtained from the generated equation through an equation-solving library.

The term 'unit' used in the above embodiments refers to software or a hardware component such as a field programmable gate array (FPGA) or an ASIC, and a 'unit' performs certain roles. However, a 'unit' is not limited to software or hardware. A 'unit' may be configured to reside in an addressable storage medium or configured to run on one or more processors. Thus, as an example, a 'unit' includes components such as software components, object-oriented software components, class components, and task components, as well as processes, functions, attributes, procedures, subroutines, segments of program code, drivers, firmware, microcode, circuits, data, databases, data structures, tables, arrays, and variables.

The functionality provided within the components and 'units' may be combined into a smaller number of components and 'units' or further separated into additional components and 'units'.

In addition, the components and 'units' may be implemented to run on one or more CPUs/GPUs in a device or a secure multimedia card.

Meanwhile, the problem-solving method according to an embodiment described in this specification may also be implemented in the form of a computer-readable medium that stores instructions and data executable by a computer. The instructions and data may be stored in the form of program code and, when executed by a processor, may generate a predetermined program module to perform a predetermined operation. The computer-readable medium may be any available medium that can be accessed by a computer and includes both volatile and non-volatile media and both removable and non-removable media. The computer-readable medium may also be a computer recording medium, which may include volatile and non-volatile, removable and non-removable media implemented by any method or technology for storing information such as computer-readable instructions, data structures, program modules, or other data. For example, the computer recording medium may be a magnetic storage medium such as an HDD or SSD, an optical recording medium such as a CD, DVD, or Blu-ray disc, or a memory included in a server accessible through a network.

The problem-solving method according to an embodiment described in this specification may also be implemented as a computer program (or computer program product) including instructions executable by a computer. The computer program includes programmable machine instructions processed by a processor and may be implemented in a high-level programming language, an object-oriented programming language, an assembly language, a machine language, or the like. The computer program may also be recorded on a tangible computer-readable recording medium (for example, a memory, a hard disk, a magnetic/optical medium, or a solid-state drive (SSD)).

Accordingly, the problem-solving method according to an embodiment described in this specification may be implemented by executing the computer program described above on a computing device. The computing device may include at least some of a processor, a memory, a storage device, a high-speed interface connected to the memory and a high-speed expansion port, and a low-speed interface connected to a low-speed bus and the storage device. These components are connected to one another using various buses and may be mounted on a common motherboard or in another suitable manner.

Here, the processor can process instructions within the computing device, for example instructions stored in the memory or the storage device for displaying graphical information that provides a graphical user interface (GUI) on an external input/output device, such as a display connected to the high-speed interface. In another embodiment, multiple processors and/or multiple buses may be used, as appropriate, along with multiple memories and types of memory. The processor may also be implemented as a chipset of chips that include multiple independent analog and/or digital processors.

The memory stores information within the computing device. In one example, the memory may consist of a volatile memory unit or a set of such units. In another example, the memory may consist of a non-volatile memory unit or a set of such units. The memory may also be another form of computer-readable medium, such as a magnetic or optical disk.

The storage device can provide a large amount of storage space to the computing device. The storage device may be a computer-readable medium or a configuration including such a medium; for example, it may include devices within a storage area network (SAN) or other configurations, and may be a floppy disk device, a hard disk device, an optical disk device, a tape device, a flash memory, or another similar semiconductor memory device or device array.

The embodiments described above are illustrative, and a person of ordinary skill in the art to which they pertain will understand that they can easily be modified into other specific forms without changing their technical idea or essential features. The embodiments described above should therefore be understood as illustrative in all respects and not restrictive. For example, each component described as a single unit may be implemented in a distributed manner, and likewise components described as distributed may be implemented in a combined form.

The scope to be protected by this specification is defined by the claims set out below rather than by the detailed description above, and all changes or modifications derived from the meaning and scope of the claims and their equivalents should be construed as falling within the scope of the present invention.

100: problem-solving device
110: input/output unit
120: storage unit
130: control unit

Claims

A problem-solving device comprising:
an input/output unit for receiving data and outputting a result of processing the data;
a storage unit storing a program for performing a problem-solving method; and
a control unit that includes at least one processor and analyzes the data received through the input/output unit by executing the program,
wherein the control unit
receives a sentence-type math problem and, through an encoder-decoder-based explanation generation model, generates and outputs explanations of the numbers and variables contained in the sentence-type math problem, and
receives the sentence-type math problem, the generated explanations, or a combination thereof, and generates and outputs an equation through an encoder-decoder-based equation generation model.

The problem-solving device of claim 1, wherein
the encoder of the explanation generation model receives the sentence-type math problem, segments it into tokens, and outputs, for each token, a problem context vector representing contextual information of the sentence-type math problem,
the explanation generation model includes a variable predictor connected between the encoder and the decoder of the explanation generation model, the variable predictor receiving the problem context vectors and transforming a representative value to predict the number of variables needed to solve the sentence-type math problem,
the decoder of the explanation generation model receives the problem context vectors, variables determined according to the number of variables, and the numbers in the sentence-type math problem, and computes a hidden state of the decoder based on previously generated explanation tokens, and
the explanation generation model includes a pointer generator connected to the decoder, the pointer generator receiving the hidden state of the decoder and predicting the next explanation token.

The problem-solving device of claim 1, wherein the control unit:
restructures, into a first sentence, the explanation of a number that the explanation generation model generated by selecting the context in which the number appears in the sentence-type math problem;
restructures, into a second sentence, the explanation of a variable that the explanation generation model generated using the index of the variable; and
combines the restructured first sentence and second sentence to generate a recombination problem.
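
As a rough illustration of the restructuring and combining described above, the sketch below builds a recombination problem from already-generated explanations. The sentence templates, the `x{index}` variable naming, and the function name `build_recombination_problem` are assumptions made for this example; the text above does not specify the wording of the restructured sentences.

```python
from typing import List, Tuple


def build_recombination_problem(
    number_explanations: List[Tuple[str, str, str]],
    variable_explanations: List[str],
) -> str:
    """Combine restructured explanations into one recombination problem.

    number_explanations: (number, context in the problem, explanation) triples.
    variable_explanations: explanations ordered by variable index.
    Both sentence templates below are hypothetical.
    """
    # First sentences: one per number, tied to the context it appears in.
    first_sentences = [
        f"In '{context}', {number} is {explanation}."
        for number, context, explanation in number_explanations
    ]
    # Second sentences: one per variable, keyed by the variable's index.
    second_sentences = [
        f"x{index} is {explanation}."
        for index, explanation in enumerate(variable_explanations)
    ]
    return " ".join(first_sentences + second_sentences)


recombined = build_recombination_problem(
    [
        ("2", "costs 2 dollars", "the price of one apple"),
        ("3", "3 apples", "the number of apples"),
    ],
    ["the total price"],
)
```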

The problem-solving device of claim 3, wherein:
the control unit inputs the recombination problem and the sentence-type math problem into the encoder of the equation generation model;
the encoder of the equation generation model receives the recombination problem and the sentence-type math problem and outputs a recombination context vector; and
the decoder of the equation generation model receives the recombination context vector and generates the mathematical equation in units of expression tokens, each of which groups an operator with the operands it requires.
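
One way to picture an expression token that groups an operator with its operands is sketched below. The operand reference scheme (`N<i>` for the i-th number in the problem, `X<i>` for the i-th variable, `R<i>` for the result of the i-th earlier expression token) and the function `render_equation` are assumptions introduced for illustration; they are not recited in the text above.

```python
from typing import List, Tuple

# One expression token: an operator together with the two operands it needs.
ExpressionToken = Tuple[str, str, str]


def render_equation(tokens: List[ExpressionToken], numbers: List[str]) -> str:
    """Turn a sequence of expression tokens into an equation string."""
    results: List[str] = []  # text produced by each token so far

    def resolve(operand: str) -> str:
        kind, index = operand[0], int(operand[1:])
        if kind == "N":   # a number copied from the problem text
            return numbers[index]
        if kind == "X":   # a variable identified by its index
            return f"x{index}"
        if kind == "R":   # the result of an earlier expression token
            return results[index]
        raise ValueError(f"unknown operand reference: {operand}")

    for operator, left, right in tokens:
        lhs, rhs = resolve(left), resolve(right)
        if operator == "=":
            results.append(f"{lhs} = {rhs}")
        else:
            results.append(f"({lhs} {operator} {rhs})")
    return results[-1]


# "An apple costs 2 dollars. How much do 3 apples cost?"
tokens = [("*", "N0", "N1"), ("=", "X0", "R0")]
equation = render_equation(tokens, numbers=["2", "3"])
```

Emitting an operator together with its operands in one step means the decoder never produces an operator without its arguments, which is the property the grouping is meant to capture.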

A problem-solving method performed by a problem-solving device, the method comprising:
receiving a sentence-type math problem and, through an encoder-decoder-based explanation generation model, generating and outputting an explanation of the numbers and variables contained in the sentence-type math problem; and
receiving the sentence-type math problem, the generated explanation, or a combination thereof and, through an encoder-decoder-based equation generation model, generating and outputting a mathematical equation.

The problem-solving method of claim 5, wherein:
the encoder of the explanation generation model receives the sentence-type math problem, segments it into tokens, and outputs, for each token, a problem context vector representing context information of the sentence-type math problem;
the explanation generation model includes a variable predictor connected between the encoder and the decoder of the explanation generation model, the variable predictor receiving the problem context vector and transforming a representative value to predict the number of variables needed to solve the sentence-type math problem;
the decoder of the explanation generation model receives the problem context vector, variables corresponding to the predicted number of variables, and the numbers in the sentence-type math problem, and calculates a hidden state of the decoder based on a previously generated explanation token; and
the explanation generation model includes a pointer generator connected to the decoder, the pointer generator receiving the hidden state of the decoder and predicting the next explanation token.

The problem-solving method of claim 5, wherein the step of generating and outputting the mathematical equation comprises:
restructuring, into a first sentence, the explanation of a number that the explanation generation model generated by selecting the context in which the number appears in the sentence-type math problem;
restructuring, into a second sentence, the explanation of a variable that the explanation generation model generated using the index of the variable; and
combining the restructured first sentence and second sentence to generate a recombination problem.

The problem-solving method of claim 7, wherein the step of generating and outputting the mathematical equation comprises:
inputting the recombination problem and the sentence-type math problem into the encoder of the equation generation model;
receiving, by the encoder of the equation generation model, the recombination problem and the sentence-type math problem and outputting a recombination context vector; and
receiving, by the decoder of the equation generation model, the recombination context vector and generating the mathematical equation in units of expression tokens, each of which groups an operator with the operands it requires.

A computer-readable recording medium on which a program for performing the method according to claim 5 is recorded.

A computer program stored in a recording medium and executed by a problem-solving device to perform the method according to claim 5.