JP6451414B2

JP6451414B2 - Information processing apparatus, summary sentence editing method, and program

Info

Publication number: JP6451414B2
Application number: JP2015044280A
Authority: JP
Inventors: 片江 伸之
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 2015-03-06
Filing date: 2015-03-06
Publication date: 2019-01-16
Anticipated expiration: 2035-03-06
Also published as: JP2016164700A

Description

The present invention relates to an information processing apparatus, a summary sentence editing method, and a program.

Today, when large amounts of information are provided as documents, summaries are useful for grasping the content of a document quickly and efficiently and for finding and using the document one needs. However, summarizing a large number of documents takes time and effort. Document summarization techniques that use a computer to support the creation of summaries are therefore being studied. Such techniques are expected to be applied to, for example, summarizing clinical summaries written in the medical field, analyst reports written in the stock and securities field, and call reports written in call center operations.

For example, a technique has been proposed that performs morphological analysis and syntactic analysis (dependency structure analysis) on the document to be summarized (hereinafter, the original text) and creates a summary by deleting from the original text the portions the computer judges unnecessary. Another proposed technique sets in advance several ratios at which character strings are deleted from the original text (summarization rates), so that the user can select a desired summary from the summaries created at those rates. Other proposals include a technique for highlighting words designated by the user and a technique for inserting or removing previously prepared phrases that correspond to a designated location.

International Publication No. 2010/052764; Japanese Laid-open Patent Publication No. 11-25091; Japanese Laid-open Patent Publication No. 11-219361; Japanese Laid-open Patent Publication No. 2014-56499

Applying the techniques described above supports the work of creating a summary, but additional editing is needed when the phrases deleted from the original text are not appropriate. For example, when a longer phrase than the user wants has been deleted, the deleted phrase has to be inserted back into the summary. Conversely, when a phrase the user wants deleted remains, it has to be deleted from the summary. Among the techniques above, the one that lets phrases be inserted or removed by a designation operation alone can help reduce the burden of such after-the-fact editing.

However, inserting or removing previously prepared phrases may still not yield the expression the user wants, in which case the user has to edit the summary directly. If the phrases that can be inserted or removed by a designation operation can be chosen more freely, editing is more likely to be completed by designation operations alone, and the workload can be expected to decrease.

Accordingly, in one aspect, an object of the present invention is to provide an information processing apparatus, a summary sentence editing method, and a program that make a summary sentence easier to edit.

According to one aspect of the present disclosure, there is provided an information processing apparatus including: a storage unit that stores an original text and a syntax tree that expresses the dependency structure of phrases, obtained by syntactic analysis of the original text, as connection relationships between nodes corresponding to the phrases; a display unit that displays the original text and a summary sentence that summarizes the original text by omitting phrases; and a computing unit that, when a designation operation on the original text is received, identifies a second node that is connected to a first node corresponding to the phrase at the designated location and lies in the direction toward the root of the syntax tree, and adds the phrases corresponding to the first and second nodes to the summary sentence, and that, when a designation operation on the summary sentence is received, identifies a fourth node that is connected to a third node corresponding to the phrase at the designated location and lies in the direction toward the ends of the syntax tree, and deletes the phrases corresponding to the third and fourth nodes from the summary sentence.

According to the present invention, a summary sentence becomes easier to edit.

A diagram illustrating an example of an information processing apparatus according to the first embodiment.
A diagram illustrating an example of hardware capable of realizing the functions of an information processing apparatus according to the second embodiment.
A block diagram illustrating an example of the functions of the information processing apparatus according to the second embodiment.
A diagram illustrating an example of a morphological analysis result according to the second embodiment.
A diagram illustrating an example of a syntactic analysis (dependency analysis) result according to the second embodiment.
A diagram illustrating an example of a syntax tree and summary text according to the second embodiment.
A diagram illustrating an example of correspondence data between the original text and the syntactic analysis result according to the second embodiment.
A diagram illustrating an example of correspondence data between the summary sentence and the syntactic analysis result according to the second embodiment.
A diagram illustrating an example of a designation operation and processing for adding phrases according to the second embodiment.
A diagram illustrating an example of a designation operation and processing for deleting phrases according to the second embodiment.
A first flowchart illustrating the flow of processing in the operation of the information processing apparatus according to the second embodiment.
A second flowchart illustrating the flow of processing in the operation of the information processing apparatus according to the second embodiment.
A diagram illustrating an example of a co-occurrence probability table according to the third embodiment.
A diagram illustrating an example of a designation operation and processing for adding phrases according to the third embodiment.
A diagram illustrating an example of a designation operation and processing for deleting phrases according to the third embodiment.
A first flowchart illustrating the flow of processing in the operation of the information processing apparatus according to the third embodiment.
A second flowchart illustrating the flow of processing in the operation of the information processing apparatus according to the third embodiment.
A third flowchart illustrating the flow of processing in the operation of the information processing apparatus according to the third embodiment.
A fourth flowchart illustrating the flow of processing in the operation of the information processing apparatus according to the third embodiment.
A diagram illustrating an example of a syntactic analysis (dependency analysis) result according to the fourth embodiment.
A diagram illustrating an example of a designation operation and processing for adding phrases according to the fourth embodiment.
A first flowchart illustrating the flow of processing in the operation of the information processing apparatus according to the fourth embodiment.
A second flowchart illustrating the flow of processing in the operation of the information processing apparatus according to the fourth embodiment.
A third flowchart illustrating the flow of processing in the operation of the information processing apparatus according to the fourth embodiment.

Embodiments of the present invention are described below with reference to the accompanying drawings. In this specification and the drawings, elements having substantially the same function may be given the same reference sign so that redundant description can be omitted.

<1. First Embodiment>
The first embodiment is described with reference to FIG. 1. FIG. 1 is a diagram illustrating an example of an information processing apparatus according to the first embodiment. The first embodiment relates to a document summarization technique that automatically creates a summary sentence by summarizing a document, and to an editing support technique that supports editing of the summary sentence. Hereinafter, the document to be summarized is referred to as the original text. For convenience, the description takes as an example the case where the original text 31 illustrated in FIG. 1 is summarized to create a summary sentence 32 and the summary sentence 32 is then edited.

As illustrated in FIG. 1, the information processing apparatus 10 includes a storage unit 11, a computing unit 12, and a display unit 13.
The storage unit 11 is a volatile storage device such as a RAM (Random Access Memory), or a nonvolatile storage device such as an HDD (Hard Disk Drive) or flash memory. The computing unit 12 is a processor such as a CPU (Central Processing Unit) or a DSP (Digital Signal Processor). The computing unit 12 may instead be an electronic circuit such as an ASIC (Application Specific Integrated Circuit) or an FPGA (Field Programmable Gate Array).

The computing unit 12 executes, for example, a program stored in the storage unit 11 or in another memory. The display unit 13 is a display device such as a CRT (Cathode Ray Tube), an LCD (Liquid Crystal Display), a PDP (Plasma Display Panel), or an ELD (Electro-Luminescence Display).

The display unit 13 need not be formed integrally with the information processing apparatus 10. For example, the content to be displayed on the display unit 13 may be displayed on the display device of an information terminal connected to the information processing apparatus 10 via a network. The functions of the information processing apparatus 10 can also be realized by a system that combines multiple computers sharing the functions of the computing unit 12 with a storage apparatus providing the functions of the storage unit 11.

The storage unit 11 stores the original text 31 and a syntax tree 20.
The syntax tree 20 is information that expresses the dependency structure of phrases, obtained by syntactic analysis of the original text 31, as connection relationships between nodes corresponding to the phrases. A phrase here is, for example, a bunsetsu (文節) or a syntactic phrase (句).

Syntactic analysis is a method of analyzing the structure of a sentence in units of syntactic phrases or bunsetsu according to grammatical rules. A syntactic phrase is a group of two or more words that together function like a single part of speech. A bunsetsu is a segment obtained by dividing Japanese text into units whose meaning can be understood, and is the smallest unit at which a sentence is naturally divided when read aloud. Any bunsetsu in a Japanese sentence has a dependency relationship with at least one bunsetsu that follows it. The structure formed by bunsetsu having such dependency relationships is called a dependency structure.

Morphological analysis is performed as a precondition for syntactic analysis. The smallest unit of a character string that carries meaning is usually called a morpheme, and the task of dividing a sentence into words and attaching information such as part of speech to each word is called morphological analysis. A system that performs morphological analysis divides a sentence into morphemes using the grammar rules of the natural language and dictionary information, and mechanically assigns part-of-speech and other information to each word. For example, analyzing the original text 31 illustrated in FIG. 1(B1) in this way yields the syntax tree 20 illustrated in FIG. 1(A).

In the example of FIG. 1(A), the phrases that are elements of the syntax tree 20 are 「昨年」 (last year), 「八月末の」 (of the end of August), 「暑い」 (hot), 「日」 (day), 「経済論壇で」 (in economic debate circles), 「重い」 (weighty), 「存在だった」 (was a presence), 「一人の」 (one), 「論客が」 (commentator, as subject), 「志」 (ambition), 「半ばで」 (midway), 「世を」 (the world, as object), and 「去った」 (left). Hereinafter, an element of the syntax tree 20 is called a node. The syntax tree 20 expresses the dependency relationships between phrases as connection relationships between nodes (the lines connecting nodes in FIG. 1(A)). A connection relationship between nodes may hereinafter be called a branch.

In the syntax tree 20, which has a tree structure, the node located at the root may be called the root node, and a node located at an end may be called a terminal node. In the example of FIG. 1(A), the node corresponding to 「去った」 is the root node, and the nodes corresponding to 「昨年」, 「暑い」, 「経済論壇で」, 「重い」, 「一人の」, 「世を」, and 「志」 are terminal nodes. That is, the phrase of the root node is not followed by any phrase with which it has a dependency relationship, and the phrase of a terminal node is not preceded by any phrase with which it has a dependency relationship. The storage unit 11 stores information on such a syntax tree 20.
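The node and branch structure described above can be sketched as a simple head map. This is a minimal illustration, not the stored format used by the apparatus: the names `PHRASES` and `HEAD` are hypothetical, and the edges are inferred from the root node, terminal nodes, and paths named in the text, since FIG. 1(A) itself is not reproduced here.

```python
# Phrases of the original text 31, in original-text order.
PHRASES = ["昨年", "八月末の", "暑い", "日", "経済論壇で", "重い",
           "存在だった", "一人の", "論客が", "志", "半ばで", "世を", "去った"]

# HEAD[i] is the index of the phrase that phrase i depends on
# (the next node toward the root); None marks the root node.
# Assumed edges, inferred from the description of FIG. 1(A).
HEAD = {0: 1, 1: 3, 2: 3, 3: 12, 4: 6, 5: 6, 6: 8, 7: 8, 8: 12,
        9: 10, 10: 12, 11: 12, 12: None}

def root_node(head):
    """Return the node that depends on no other node."""
    return next(n for n, h in head.items() if h is None)

def terminal_nodes(head):
    """Return the nodes that no other node depends on."""
    heads = set(head.values())
    return sorted(n for n in head if n not in heads)

print(PHRASES[root_node(HEAD)])
print([PHRASES[n] for n in terminal_nodes(HEAD)])
```

Under these assumed edges, the root is 「去った」 and the terminal nodes are the seven phrases listed in the text.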

The display unit 13 displays the original text 31 and a summary sentence 32 that summarizes the original text 31. The summary sentence 32 is the original text 31 with some of its phrases omitted on the basis of the syntax tree 20. For example, the summary sentence 32 is obtained by arbitrarily selecting paths, each a series of nodes and branches connecting the root node to a terminal node, and arranging the phrases of the nodes on the selected paths in the same order as in the original text 31. FIG. 1(B1) illustrates the summary sentence 32 obtained when the paths to the terminal nodes corresponding to 「一人の」, 「志」, and 「世を」 are selected.
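The path-based construction of the summary sentence 32 can be sketched as follows. This is an illustrative sketch under assumptions: `PHRASES`, `HEAD`, and `summarize` are hypothetical names, the edges are inferred from the description of FIG. 1(A), and punctuation of the original text is ignored.

```python
PHRASES = ["昨年", "八月末の", "暑い", "日", "経済論壇で", "重い",
           "存在だった", "一人の", "論客が", "志", "半ばで", "世を", "去った"]
# Assumed dependency edges (phrase index -> head index, None for the root).
HEAD = {0: 1, 1: 3, 2: 3, 3: 12, 4: 6, 5: 6, 6: 8, 7: 8, 8: 12,
        9: 10, 10: 12, 11: 12, 12: None}

def summarize(phrases, head, selected_terminals):
    """Keep every node on the path from each selected terminal node
    to the root, then emit the kept phrases in original-text order."""
    keep = set()
    for node in selected_terminals:
        while node is not None:
            keep.add(node)
            node = head[node]
    return "".join(phrases[i] for i in sorted(keep))

# Select the paths ending at 「一人の」 (7), 「志」 (9), and 「世を」 (11).
summary = summarize(PHRASES, HEAD, [7, 9, 11])
print(summary)  # 一人の論客が志半ばで世を去った
```

Sorting the kept indices is what preserves the phrase order of the original text 31.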

The display unit 13 displays the original text 31 and the summary sentence 32 together. The computing unit 12 accepts the user's designation operations on the original text 31 and on the summary sentence 32. When it receives a designation operation on the original text 31, the computing unit 12 identifies a second node that is connected to the first node, which corresponds to the phrase at the designated location, and that lies in the direction toward the root of the syntax tree 20, and adds the phrases corresponding to the first and second nodes to the summary sentence 32.

In the example of FIG. 1(B1), 「重い」 in the original text 31 is designated. In this case, the computing unit 12 identifies the node corresponding to 「重い」 as the first node and identifies a node lying in the direction from the first node toward the root node as the second node. The designation operation may select and designate a phrase, or may designate a character. When a character is designated, the computing unit 12 identifies the phrase containing that character and treats the identified phrase as having been designated.

In this example, the computing unit 12 detects 「存在だった」, 「論客が」, and 「去った」 as candidates for the second node and, excluding 「論客が」 and 「去った」, which are already included in the summary sentence 32, identifies 「存在だった」 as the second node. The computing unit 12 then sets the first and second nodes as the addition range 21 and adds 「重い」 and 「存在だった」, which correspond to the addition range 21, to the summary sentence 32 (see the underlined portion in FIG. 1(B2)).
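The addition step walked through above can be sketched as follows. This is a minimal sketch under assumptions, not the patented implementation: `PHRASES`, `HEAD`, `add_phrases`, and `render` are hypothetical names, and the edges are inferred from the description of FIG. 1(A).

```python
PHRASES = ["昨年", "八月末の", "暑い", "日", "経済論壇で", "重い",
           "存在だった", "一人の", "論客が", "志", "半ばで", "世を", "去った"]
# Assumed dependency edges (phrase index -> head index, None for the root).
HEAD = {0: 1, 1: 3, 2: 3, 3: 12, 4: 6, 5: 6, 6: 8, 7: 8, 8: 12,
        9: 10, 10: 12, 11: 12, 12: None}

def add_phrases(head, summary_nodes, designated):
    """Return the summary node set after designating a phrase of the original text."""
    # Candidates for the second node: every node from the designated
    # (first) node toward the root.
    candidates = []
    node = head[designated]
    while node is not None:
        candidates.append(node)
        node = head[node]
    # Exclude candidates that are already in the summary.
    second = [n for n in candidates if n not in summary_nodes]
    return set(summary_nodes) | {designated} | set(second)

def render(phrases, nodes):
    """Emit the phrases of the given nodes in original-text order."""
    return "".join(phrases[i] for i in sorted(nodes))

summary = {7, 8, 9, 10, 11, 12}            # 一人の論客が志半ばで世を去った
summary = add_phrases(HEAD, summary, 5)    # designate 「重い」
print(render(PHRASES, summary))  # 重い存在だった一人の論客が志半ばで世を去った
```

Designating 「経済論壇で」 (index 4) instead would add 「経済論壇で」 and 「存在だった」, which shows how the designated location controls the addition range.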

On the other hand, when it receives a designation operation on the summary sentence 32, the computing unit 12 identifies a fourth node that is connected to the third node, which corresponds to the phrase at the designated location, and that lies in the direction toward the ends of the syntax tree 20, and deletes the phrases corresponding to the third and fourth nodes from the summary sentence 32.

In the example of FIG. 1(B2), 「半ばで」 in the summary sentence 32 is designated. In this case, the computing unit 12 identifies the node corresponding to 「半ばで」 as the third node and identifies a node lying in the direction from the third node toward a terminal node as the fourth node.

In this example, the computing unit 12 detects 「志」 as a candidate for the fourth node, confirms that 「志」 is included in the summary sentence 32, and identifies 「志」 as the fourth node. The computing unit 12 then sets the third and fourth nodes as the deletion range 22 and deletes 「志」 and 「半ばで」, which correspond to the deletion range 22, from the summary sentence 32 (see FIG. 1(B3)).
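The deletion step can be sketched in the same style. This is an illustrative sketch under assumptions: `PHRASES`, `HEAD`, `is_under`, and `delete_phrases` are hypothetical names, the edges are inferred from the description of FIG. 1(A), and the sketch assumes that every node below the designated node that is present in the summary belongs to the deletion range.

```python
PHRASES = ["昨年", "八月末の", "暑い", "日", "経済論壇で", "重い",
           "存在だった", "一人の", "論客が", "志", "半ばで", "世を", "去った"]
# Assumed dependency edges (phrase index -> head index, None for the root).
HEAD = {0: 1, 1: 3, 2: 3, 3: 12, 4: 6, 5: 6, 6: 8, 7: 8, 8: 12,
        9: 10, 10: 12, 11: 12, 12: None}

def is_under(head, node, ancestor):
    """True if `ancestor` is `node` itself or lies on its path to the root."""
    while node is not None:
        if node == ancestor:
            return True
        node = head[node]
    return False

def delete_phrases(head, summary_nodes, designated):
    """Drop the designated (third) node and the summary nodes that lie
    below it, toward the terminal nodes (fourth nodes)."""
    return {n for n in summary_nodes if not is_under(head, n, designated)}

summary = {5, 6, 7, 8, 9, 10, 11, 12}  # 重い存在だった一人の論客が志半ばで世を去った
summary = delete_phrases(HEAD, summary, 10)  # designate 「半ばで」
print("".join(PHRASES[i] for i in sorted(summary)))  # 重い存在だった一人の論客が世を去った
```

Designating 「志」 (index 9) instead would delete only 「志」, because it is a terminal node with nothing below it.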

According to the first embodiment, phrases are inserted or removed at the designated location on the basis of dependency relationships whose units are phrases such as bunsetsu and syntactic phrases. When part of the original text 31 is designated, the phrase at the designated location and the phrases in a dependency relationship with it are inserted into the summary sentence 32; when part of the summary sentence 32 is designated, the phrase at the designated location and the phrases in a dependency relationship with it are deleted from the summary sentence 32.

Even among nodes on the same path, selecting the phrase of a different node changes which phrases are inserted or removed, because the node of the selected phrase serves as the starting point. By changing the designated location to adjust which phrases are added or deleted, the user can therefore arrive at the desired summary sentence 32 through repeated designation operations. Compared with the case where the phrases to be inserted or removed are fixed, editing by designation operations is more flexible, and the summary can be brought closer to the desired expression with simpler operations. As a result, the summary sentence becomes easier to edit.

The first embodiment has been described above.
<2. Second Embodiment>
Next, the second embodiment is described. The second embodiment proposes, as a method for supporting the editing of a summary sentence, a method that allows phrases to be added to or deleted from the summary with simple operations. An information processing apparatus 100 capable of realizing this method is described below. The information processing apparatus 100 is an example of an information processing apparatus according to the second embodiment.

[2-1. Hardware]
The hardware of the information processing apparatus 100 is described here with reference to FIG. 2. FIG. 2 is a diagram illustrating an example of hardware capable of realizing the functions of the information processing apparatus according to the second embodiment. That is, the functions of the information processing apparatus 100 described later can be realized using the hardware resources illustrated in FIG. 2. The functions of the information processing apparatus 100 are realized by controlling the hardware illustrated in FIG. 2 with a computer program.

The technique according to the second embodiment can be realized with a single information processing apparatus having the hardware illustrated in FIG. 2, but it can also be realized by a system in which multiple information processing apparatuses, storage apparatuses, and the like are connected by a network. Such variations naturally fall within the technical scope of the second embodiment.

As illustrated in FIG. 2, this hardware mainly includes a CPU 902, a ROM (Read Only Memory) 904, a RAM 906, a host bus 908, and a bridge 910. The hardware further includes an external bus 912, an interface 914, an input unit 916, an output unit 918, a storage unit 920, a drive 922, a connection port 924, and a communication unit 926.

The CPU 902 functions, for example, as an arithmetic processing unit or a control unit, and controls all or part of the operation of each component on the basis of various programs recorded in the ROM 904, the RAM 906, the storage unit 920, or a removable recording medium 928. The ROM 904 is an example of a storage device that stores programs read by the CPU 902, data used in computation, and the like. The RAM 906 temporarily or permanently stores, for example, programs read by the CPU 902 and various parameters that change while those programs run.

These elements are connected to one another via, for example, the host bus 908, which is capable of high-speed data transmission. The host bus 908 is in turn connected, for example via the bridge 910, to the external bus 912, whose data transmission speed is comparatively low. The input unit 916 is, for example, a mouse, a keyboard, a touch panel, a touch pad, a button, a switch, or a lever. A remote controller capable of transmitting control signals using infrared rays or other radio waves may also be used as the input unit 916.

The output unit 918 is, for example, a display device such as a CRT, an LCD, a PDP, or an ELD. An audio output device such as a speaker or headphones, or a printer, may also be used as the output unit 918. In other words, the output unit 918 is a device capable of outputting information visually or audibly.

The storage unit 920 is a device for storing various kinds of data. The storage unit 920 is, for example, a magnetic storage device such as an HDD. A semiconductor storage device such as an SSD (Solid State Drive) or a RAM disk, an optical storage device, or a magneto-optical storage device may also be used as the storage unit 920.

The drive 922 is a device that reads information recorded on the removable recording medium 928, which is a detachable recording medium, or writes information to the removable recording medium 928. The removable recording medium 928 is, for example, a magnetic disk, an optical disc, a magneto-optical disk, or a semiconductor memory.

The connection port 924 is a port for connecting an externally connected device 930, such as a USB (Universal Serial Bus) port, an IEEE 1394 port, a SCSI (Small Computer System Interface) port, an RS-232C port, or an optical audio terminal. The externally connected device 930 is, for example, a printer.

The communication unit 926 is a communication device for connecting to a network 932. The communication unit 926 is, for example, a communication circuit for a wired or wireless LAN (Local Area Network), a communication circuit for WUSB (Wireless USB), a communication circuit or router for optical communication, a communication circuit or router for ADSL (Asymmetric Digital Subscriber Line), or a communication circuit for a mobile phone network. The network 932 connected to the communication unit 926 is a network connected by wire or wirelessly, and includes, for example, the Internet, a LAN, a broadcast network, and a satellite communication line.

The hardware of the information processing apparatus 100 has been described above.
[2-2. Functions]
Next, the functions of the information processing apparatus 100 are described with reference to FIG. 3. FIG. 3 is a block diagram illustrating an example of the functions of the information processing apparatus according to the second embodiment.

As illustrated in FIG. 3, the information processing apparatus 100 includes a storage unit 101, an original text input unit 102, a morphological analysis unit 103, a syntactic analysis unit 104, a summary sentence generation unit 105, a sentence output unit 106, a designation reception unit 107, and a range control unit 108.

Note that the function of the storage unit 101 can be realized using the above-described RAM 906, storage unit 920, and the like. The functions of the original text input unit 102 and the designation receiving unit 107 can be realized using the above-described input unit 916 and the like. The functions of the morphological analysis unit 103, the syntax analysis unit 104, the summary sentence generation unit 105, and the range control unit 108 can be realized using the above-described CPU 902 and the like. The function of the sentence output unit 106 can be realized using the above-described output unit 918 and the like.

(2-2-1. Generation of the syntax tree, summary sentence, and correspondence data)
The storage unit 101 stores information such as the original text 101a and the analysis result 101b. The original text 101a is the text data of the original from which the summary sentence is produced.

For example, the original text input unit 102 stores, in the storage unit 101, the original text 101a that the user has entered via the input unit 916. When the original text 101a resides in a storage device externally connected to the information processing apparatus 100 or in a storage area on a network, the original text input unit 102 acquires the original text 101a from there and stores it in the storage unit 101.

The morphological analysis unit 103 performs morphological analysis on the original text 101a and stores information in which a part of speech and the like are attached to each morpheme extracted from the original text 101a (see FIG. 4) in the storage unit 101 as part of the analysis result 101b. FIG. 4 is a diagram illustrating an example of a morphological analysis result according to the second embodiment. When the original text 101a is the sentence 「昨年八月末の暑い日、経済論壇で重い存在だった一人の論客が志半ばで世を去った。」 (roughly, "On a hot day at the end of August last year, a commentator who had been a weighty presence in economic debate passed away with his ambitions unfulfilled."), the morphological analysis unit 103 outputs the morphological analysis result illustrated in FIG. 4.

The syntax analysis unit 104 performs syntax analysis (dependency analysis) of the original text 101a based on the morphological analysis result output by the morphological analysis unit 103. Syntax analysis is a method of analyzing the structure of a sentence in units of phrases or clauses (bunsetsu) in accordance with grammatical rules. In this document, a phrase or clause serving as the unit of syntax analysis is simply called a "phrase". For each phrase obtained by the syntax analysis, the syntax analysis unit 104 associates information such as the notation of the phrase, its dependency destination, and its dependency type (see FIG. 5) and stores the result in the storage unit 101 as part of the analysis result 101b. FIG. 5 is a diagram illustrating an example of a syntax analysis (dependency analysis) result according to the second embodiment.

As shown in FIG. 5, a node number is assigned to each phrase, and each phrase can be identified by its node number. The dependency relationship between phrases is expressed by the node number entered in the dependency destination column. For example, the phrase 「昨年」 ("last year") of node number 1 depends on the phrase 「八月末の」 ("of the end of August") of node number 2. A tree-structured representation of the dependency relationships among the phrases is called a syntax tree; the syntax tree corresponding to FIG. 5 is shown in FIG. 6.

FIG. 6 is a diagram illustrating an example of a syntax tree and summary text according to the second embodiment. Each block of the syntax tree illustrated in FIG. 6 represents a node. The lines connecting the blocks are branches, and each branch represents a connection relationship between nodes (that is, a dependency relationship). In the example of FIG. 6, the root node of the syntax tree is the node corresponding to the phrase 「去った」 ("passed away"). The terminal nodes are the nodes corresponding to the phrases 「昨年」 ("last year"), 「暑い」 ("hot"), 「経済論壇で」 ("in economic debate"), 「重い」 ("weighty"), 「一人の」 ("one"), 「世を」 ("the world"), and 「志」 ("ambition").
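The tree described above can be sketched as a simple mapping from each node to its dependency destination. This is only an illustrative reconstruction: the text states the 1 → 2 link, the root, and the terminal nodes, while the remaining node numbers and the phrase boundaries (`PHRASES`, `HEAD`) are assumed to follow the order of the example sentence, since FIG. 5 and FIG. 6 are not reproduced here.

```python
# Phrases of the example sentence, numbered in order of appearance (assumed).
PHRASES = dict(enumerate(
    ["昨年", "八月末の", "暑い", "日、", "経済論壇で", "重い", "存在だった",
     "一人の", "論客が", "志", "半ばで", "世を", "去った。"], start=1))

# Dependency destination of each node; None marks the root (assumed layout).
HEAD = {1: 2, 2: 4, 3: 4, 4: 13, 5: 7, 6: 7, 7: 9, 8: 9, 9: 13,
        10: 11, 11: 13, 12: 13, 13: None}


def root_nodes(head):
    """Return the nodes that depend on no other node."""
    return [n for n, h in head.items() if h is None]


def terminal_nodes(head):
    """Return the nodes that no other node depends on."""
    destinations = set(head.values())
    return [n for n in head if n not in destinations]


print([PHRASES[n] for n in root_nodes(HEAD)])
print([PHRASES[n] for n in terminal_nodes(HEAD)])
```

With this layout the root is node 13 and the terminal nodes are exactly the seven phrases listed in the description.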

Further, the syntax analysis unit 104 generates correspondence data (see FIG. 7) that associates each character in the original text 101a with the node corresponding to the phrase containing that character. FIG. 7 is a diagram illustrating an example of the correspondence data between the original text and the syntax analysis result according to the second embodiment. For example, as shown in FIG. 7, the syntax analysis unit 104 generates, as part of the analysis result 101b, correspondence data that associates the number assigned to each character in the original text 101a (hereinafter, the original character number), the notation of the character, and the node number of the node corresponding to that character, and stores the correspondence data in the storage unit 101.
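A minimal sketch of such correspondence data follows. It assumes that characters and nodes are both numbered from 1 in order of appearance and that the phrase boundaries are as listed in `PHRASE_LIST`; the actual layout of FIG. 7 may differ.

```python
# Phrases of the example sentence in order of appearance (assumed segmentation).
PHRASE_LIST = ["昨年", "八月末の", "暑い", "日、", "経済論壇で", "重い", "存在だった",
               "一人の", "論客が", "志", "半ばで", "世を", "去った。"]


def build_correspondence(phrases):
    """Return rows of (character number, character, node number)."""
    rows = []
    char_no = 1
    for node_no, phrase in enumerate(phrases, start=1):
        for ch in phrase:
            rows.append((char_no, ch, node_no))
            char_no += 1
    return rows


def node_of_character(rows, char_no):
    """Look up the node number for a designated character number."""
    for number, _, node_no in rows:
        if number == char_no:
            return node_no
    raise KeyError(char_no)


table = build_correspondence(PHRASE_LIST)
print(table[:3])
```

The same routine applies to the summary text if it is given the phrases kept in the summary together with their original node numbers.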

The summary sentence generation unit 105 generates summary text based on the series of nodes connecting the root node to each set node. The example of FIG. 6 shows the summary sentence 「一人の論客が志半ばで世を去った。」 (roughly, "A commentator passed away with his ambitions unfulfilled."), generated from the series of nodes connecting the root node to the nodes corresponding to the phrases 「一人の」, 「世を」, and 「志」. In this way, the summary sentence generation unit 105 identifies the phrases corresponding to the nodes on the paths to the set nodes and arranges the identified phrases in the order in which they appear in the original text 101a to generate the summary text. For automatically generating summary text from a syntax analysis result, various known methods use word importance, word N-grams, dependency types, and the like to set the nodes to be included in, or deleted from, the summary; the present invention is not limited to any particular method. Alternatively, all nodes may be included in the summary so that the summary text is identical to the original text, or all nodes may be deleted so that the summary text is a character string of zero characters.

Among the phrases in the syntax analysis result, the summary sentence generation unit 105 sets the deletion flag to ON for each phrase that was not included in the summary text (see FIG. 5). A deletion flag is managed in association with each phrase in the syntax analysis result and is used in the editing of the summary sentence. The summary sentence generation unit 105 inputs the summary text to the sentence output unit 106.
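The generation step and the deletion flags can be sketched as follows. The node numbering and phrase boundaries are assumptions (FIG. 5 is not reproduced here), and `generate_summary` is a hypothetical helper name, not a name from the patent.

```python
# Assumed node numbering and dependency destinations for the example sentence.
PHRASES = dict(enumerate(
    ["昨年", "八月末の", "暑い", "日、", "経済論壇で", "重い", "存在だった",
     "一人の", "論客が", "志", "半ばで", "世を", "去った。"], start=1))
HEAD = {1: 2, 2: 4, 3: 4, 4: 13, 5: 7, 6: 7, 7: 9, 8: 9, 9: 13,
        10: 11, 11: 13, 12: 13, 13: None}


def generate_summary(phrases, head, set_nodes):
    """Keep every node on a path between the root and a set node."""
    keep = set()
    for node in set_nodes:
        while node is not None:      # climb from the set node to the root
            keep.add(node)
            node = head[node]
    # Deletion flag: True (ON) for phrases left out of the summary.
    deleted = {n: n not in keep for n in phrases}
    # Concatenate the kept phrases in their order in the original text.
    text = "".join(phrases[n] for n in sorted(phrases) if not deleted[n])
    return text, deleted


summary, deleted = generate_summary(PHRASES, HEAD, [8, 12, 10])
print(summary)  # 一人の論客が志半ばで世を去った。
```

Passing the nodes for 「一人の」, 「世を」, and 「志」 as set nodes reproduces the summary sentence given in the description.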

Further, the summary sentence generation unit 105 generates correspondence data (see FIG. 8) that associates each character in the summary text with the node corresponding to the phrase containing that character. FIG. 8 is a diagram illustrating an example of the correspondence data between the summary sentence and the syntax analysis result according to the second embodiment. For example, as shown in FIG. 8, the summary sentence generation unit 105 generates, as part of the analysis result 101b, correspondence data that associates the number assigned to each character in the summary text (hereinafter, the summary character number), the notation of the character, and the node number of the node corresponding to that character, and stores the correspondence data in the storage unit 101.

The sentence output unit 106 acquires the original text 101a from the storage unit 101 and displays it together with the summary text input from the summary sentence generation unit 105. At this time, the sentence output unit 106 displays the original text 101a and the summary text in a form that allows the user to designate characters in either text.

(2-2-2. Editing the summary sentence)
The designation receiving unit 107 accepts a designation operation on the original text 101a or the summary text displayed by the sentence output unit 106. A designation operation designates a character or phrase contained in the original text 101a or the summary text (see FIG. 9(A) and FIG. 10(A)). FIG. 9 is a diagram illustrating an example of the designation operation and processing for adding phrases according to the second embodiment. FIG. 10 is a diagram illustrating an example of the designation operation and processing for deleting phrases according to the second embodiment. The case where a designation operation designating a character is accepted is described here.

On accepting a designation operation on a character of the original text 101a, the designation receiving unit 107 refers to the correspondence data between the original text and the syntax analysis result (see FIG. 7) and identifies the node corresponding to the designated character. On accepting a designation operation on a character of the summary text, the designation receiving unit 107 refers to the correspondence data between the summary sentence and the syntax analysis result (see FIG. 8) and identifies the node corresponding to the designated character. The designation receiving unit 107 inputs information on the identified node to the range control unit 108. At this time, the designation receiving unit 107 also notifies the range control unit 108 whether the target of the designation operation is the original text 101a or the summary text.

When the target of the designation operation is the original text 101a, the range control unit 108 determines, based on the syntax analysis result, the range of phrases to be added to the summary text (hereinafter, the addition range), as shown in FIG. 9(B). Specifically, the range control unit 108 extracts the nodes on the path from the node corresponding to the designated character to the root node and includes in the addition range those extracted nodes whose phrases are not already contained in the summary text.

In the example of FIG. 9(B), among the nodes on the path from the node corresponding to the phrase 「重い」 ("weighty") to the root node corresponding to the phrase 「去った」 ("passed away"), the phrase 「論客が」 ("the commentator") is already contained in the summary text, so the addition range consists of the nodes corresponding to the phrases 「重い」 and 「存在だった」 ("had been a presence"). In this case, as shown in FIG. 9(C), the phrase 「重い存在だった」 (underlined) is added to the summary text.
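The addition range of this example can be sketched as a walk from the designated node toward the root. The node numbers in `HEAD` and the initial flags in `deleted` are assumptions chosen to match the example (nodes 1 to 7 are absent from the summary), and `addition_range` is a hypothetical helper name.

```python
# Assumed dependency destinations (None marks the root, 「去った。」).
HEAD = {1: 2, 2: 4, 3: 4, 4: 13, 5: 7, 6: 7, 7: 9, 8: 9, 9: 13,
        10: 11, 11: 13, 12: 13, 13: None}

# Deletion flags for the summary 「一人の論客が志半ばで世を去った。」.
deleted = {n: n <= 7 for n in HEAD}


def addition_range(head, deleted, node):
    """Nodes on the path to the root that are not yet in the summary."""
    added = []
    while node is not None:
        if deleted[node]:
            added.append(node)
        node = head[node]
    return added


# Node 6 is 「重い」; the path to the root is 6 -> 7 -> 9 -> 13.
print(addition_range(HEAD, deleted, 6))  # [6, 7]
```

Nodes 9 (「論客が」) and 13 (the root) lie on the path but are skipped because they are already part of the summary.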

On the other hand, when the target of the designation operation is the summary text, the range control unit 108 determines, based on the syntax analysis result, the range of phrases to be deleted from the summary text (hereinafter, the deletion range), as shown in FIG. 10(B). Specifically, the range control unit 108 extracts the nodes on the paths from the node corresponding to the designated character to the terminal nodes and includes in the deletion range those extracted nodes whose phrases are contained in the summary text.

In the example of FIG. 10(B), among the nodes on the path from the node corresponding to the phrase 「半ばで」 ("midway") to the terminal node corresponding to the phrase 「志」 ("ambition"), the phrases 「半ばで」 and 「志」 are both contained in the summary text, so the deletion range consists of the nodes corresponding to 「半ばで」 and 「志」. In this case, as shown in FIG. 10(C), the phrase 「志半ばで」 ("with his ambitions unfulfilled") is deleted from the summary text.
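The deletion range of this example can be sketched as a walk from the designated node toward the terminal nodes, that is, over the node and everything that depends on it. The node numbers in `HEAD` and the flags in `deleted` are assumptions matching the example, and `deletion_range` is a hypothetical helper name.

```python
# Assumed dependency destinations (None marks the root, 「去った。」).
HEAD = {1: 2, 2: 4, 3: 4, 4: 13, 5: 7, 6: 7, 7: 9, 8: 9, 9: 13,
        10: 11, 11: 13, 12: 13, 13: None}

# Deletion flags for the summary 「一人の論客が志半ばで世を去った。」.
deleted = {n: n <= 7 for n in HEAD}


def deletion_range(head, deleted, node):
    """Nodes at or below `node` that are currently in the summary."""
    dependents = {}
    for child, destination in head.items():
        dependents.setdefault(destination, []).append(child)
    removed = []
    stack = [node]
    while stack:
        current = stack.pop()
        if not deleted[current]:
            removed.append(current)
        stack.extend(dependents.get(current, []))
    return sorted(removed)


# Node 11 is 「半ばで」; its only dependent is node 10, 「志」.
print(deletion_range(HEAD, deleted, 11))  # [10, 11]
```

Designating a character in 「論客が」 (node 9) would instead yield nodes 8 and 9, because the dependents 「重い」 and 「存在だった」 are not in the summary.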

The summary sentence generation unit 105 performs the addition of the addition range and the deletion of the deletion range described above. Information on the addition range or deletion range determined by the range control unit 108 is input to the summary sentence generation unit 105, which edits the summary text based on this information, and the sentence output unit 106 displays the edited summary text. In this way, the edited summary text shown in FIG. 9(C) or FIG. 10(C) is displayed.

As described above, the information processing apparatus 100 makes it easy to edit the summary text through a designation operation on the original text 101a or the summary text. On addition, the phrases corresponding to the series of nodes connected toward the root node of the syntax tree form the addition range; on deletion, the phrases corresponding to the series of nodes connected toward the terminal nodes form the deletion range. Because a set of phrases that are likely to be added or deleted together is handled by a single designation operation, editing the summary text becomes even easier, which helps reduce the burden of the editing work.

The functions of the information processing apparatus 100 have been described above.
[2-3. Processing flow]
Next, the flow of processing executed by the information processing apparatus 100 will be described with reference to FIG. 11 and FIG. 12. FIG. 11 is a first flowchart illustrating the flow of processing for the operation of the information processing apparatus according to the second embodiment. FIG. 12 is a second flowchart illustrating the flow of processing for the operation of the information processing apparatus according to the second embodiment.

(S101) The original text input unit 102 acquires the original text 101a and stores it in the storage unit 101. For example, the original text input unit 102 stores, in the storage unit 101, the original text 101a that the user has entered via the input unit 916. When the original text 101a resides in a storage device externally connected to the information processing apparatus 100 or in a storage area on a network, the original text input unit 102 acquires the original text 101a from there and stores it in the storage unit 101.

(S102) The morphological analysis unit 103 performs morphological analysis on the original text 101a and stores information in which a part of speech and the like are attached to each morpheme extracted from the original text 101a (see FIG. 4) in the storage unit 101 as part of the analysis result 101b. For example, when the original text 101a is the sentence 「昨年八月末の暑い日、経済論壇で重い存在だった一人の論客が志半ばで世を去った。」, a morphological analysis result such as that shown in FIG. 4 is obtained.

(S103) The syntax analysis unit 104 performs syntax analysis (dependency analysis) of the original text 101a based on the morphological analysis result output by the morphological analysis unit 103. Syntax analysis is a method of analyzing the structure of a sentence in units of phrases or clauses in accordance with grammatical rules. For each phrase obtained by the syntax analysis, the syntax analysis unit 104 associates information such as the notation of the phrase, its dependency destination, and its dependency type (see FIG. 5) and stores the result in the storage unit 101 as part of the analysis result 101b.

(S104) The syntax analysis unit 104 generates correspondence data (see FIG. 7) that associates each character in the original text 101a with the node corresponding to the phrase containing that character. For example, as shown in FIG. 7, the syntax analysis unit 104 generates, as part of the analysis result 101b, correspondence data that associates the original character number of each character in the original text 101a, the notation of the character, and the node number of the node corresponding to that character.

(S105) The syntax analysis unit 104 sets all the deletion flags associated with the phrases in the syntax analysis result (see FIG. 5) to OFF (initialization). A deletion flag is OFF for a phrase contained in the summary text and ON for a phrase not contained in the summary text.

(S106) The summary sentence generation unit 105 sets to ON the deletion flags corresponding to the nodes to be deleted when the summary sentence is generated. For example, the summary sentence generation unit 105 identifies the nodes on the paths from the root node to the nodes set for summary generation and sets to ON the deletion flags corresponding to all other nodes.

(S107) The summary sentence generation unit 105 generates the summary text by concatenating the phrases corresponding to the nodes whose deletion flags are OFF, in their order of appearance in the original text 101a.
(S108) The summary sentence generation unit 105 generates correspondence data (see FIG. 8) that associates each character in the summary text with the node corresponding to the phrase containing that character. For example, as shown in FIG. 8, the summary sentence generation unit 105 generates, as part of the analysis result 101b, correspondence data that associates the summary character number of each character in the summary text, the notation of the character, and the node number of the node corresponding to that character.

(S109) The sentence output unit 106 acquires the original text 101a from the storage unit 101 and outputs it together with the summary text generated by the summary sentence generation unit 105. At this time, the sentence output unit 106 displays the original text 101a and the summary text in a form that allows the user to designate characters in either text.

(S110) The designation receiving unit 107 determines whether the summary text output by the sentence output unit 106 has been finalized. For example, the designation receiving unit 107 determines whether an operation to end editing of the summary text has been performed. When the summary text has been finalized, the series of processes shown in FIG. 11 and FIG. 12 ends. When the summary text has not been finalized, the process proceeds to S111.

(S111) The designation receiving unit 107 determines whether a character of the original text 101a has been designated. When a character of the original text 101a has been designated, the process proceeds to S112. When no character of the original text 101a has been designated, the process proceeds to S114.

(S112) On accepting the designation operation on a character of the original text 101a, the designation receiving unit 107 refers to the correspondence data between the original text and the syntax analysis result (see FIG. 7) and identifies the node corresponding to the designated character.

(S113) Based on the syntax analysis result, the range control unit 108 sets to OFF the deletion flag of each node from the node identified by the designation receiving unit 107 to the root node. That is, as shown in FIG. 9(B), the range control unit 108 extracts the nodes on the path from the node corresponding to the designated character to the root node and includes in the addition range those nodes whose phrases are not already contained in the summary text. When the processing of S113 is completed, the process proceeds to S107.

(S114) The designation receiving unit 107 determines whether a character of the summary text has been designated. When a character of the summary text has been designated, the process proceeds to S115. When no character of the summary text has been designated, the process proceeds to S110.

(S115) On accepting the designation operation on a character of the summary text, the designation receiving unit 107 refers to the correspondence data between the summary sentence and the syntax analysis result (see FIG. 8) and identifies the node corresponding to the designated character.

(S116) Based on the syntax analysis result, the range control unit 108 sets to ON the deletion flag of each node from the node identified by the designation receiving unit 107 to the terminal nodes. That is, as shown in FIG. 10(B), the range control unit 108 extracts the nodes on the paths from the node corresponding to the designated character to the terminal nodes and includes in the deletion range those nodes whose phrases are contained in the summary text. When the processing of S116 is completed, the process proceeds to S107.
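The editing branch of the flow (S111 to S116, followed by the regeneration in S107) can be sketched as a single function that updates the deletion flags and rebuilds the summary text. The node numbering, the phrase boundaries, and the `"original"` / `"summary"` labels are assumptions for illustration; `apply_designation` is a hypothetical name.

```python
# Assumed node numbering and dependency destinations for the example sentence.
PHRASES = dict(enumerate(
    ["昨年", "八月末の", "暑い", "日、", "経済論壇で", "重い", "存在だった",
     "一人の", "論客が", "志", "半ばで", "世を", "去った。"], start=1))
HEAD = {1: 2, 2: 4, 3: 4, 4: 13, 5: 7, 6: 7, 7: 9, 8: 9, 9: 13,
        10: 11, 11: 13, 12: 13, 13: None}


def apply_designation(phrases, head, deleted, target, node):
    """Update the deletion flags for one designation and rebuild the summary."""
    if target == "original":          # S112-S113: add toward the root
        while node is not None:
            deleted[node] = False
            node = head[node]
    elif target == "summary":         # S115-S116: delete toward the terminals
        stack = [node]
        while stack:
            current = stack.pop()
            deleted[current] = True
            stack.extend(n for n, h in head.items() if h == current)
    # S107: concatenate phrases whose flag is OFF, in original order.
    return "".join(phrases[n] for n in sorted(phrases) if not deleted[n])


flags = {n: n <= 7 for n in PHRASES}  # initial summary keeps nodes 8-13
print(apply_designation(PHRASES, HEAD, flags, "original", 6))
print(apply_designation(PHRASES, HEAD, flags, "summary", 11))
```

The first call corresponds to designating 「重い」 in the original text, and the second to designating 「半ばで」 in the resulting summary.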

The flow of processing executed by the information processing apparatus 100 has been described above.
The processing method described above makes it easy to edit the summary text through a designation operation on the original text 101a or the summary text. On addition, the phrases corresponding to the series of nodes connected toward the root node of the syntax tree form the addition range; on deletion, the phrases corresponding to the series of nodes connected toward the terminal nodes form the deletion range. Because a set of phrases that are likely to be added or deleted together is handled by a single designation operation, editing the summary text becomes even easier, which helps reduce the burden of the editing work.

The second embodiment has been described above.
<3. Third Embodiment>
Next, a third embodiment will be described. Detailed description of parts that overlap with the second embodiment described above is omitted.

[3-1. Functions]
The information processing apparatus 100 according to the third embodiment determines the addition range and the deletion range in consideration of a co-occurrence probability, which indicates how likely a plurality of phrases are to co-occur in the same text. This information processing apparatus 100 differs from the second embodiment in that a co-occurrence probability table holding information on the co-occurrence probabilities of nodes is stored in the storage unit 101 and in that the range control unit 108 uses the co-occurrence probabilities. The following description focuses on these differences.

(3-1-1. Co-occurrence probability)
The co-occurrence probability table will be described with reference to FIG. 13. FIG. 13 is a diagram illustrating an example of the co-occurrence probability table according to the third embodiment. As illustrated in FIG. 13, the co-occurrence probability table associates a co-occurrence probability with each combination of a content word contained in a dependency source node and a content word contained in a dependency destination node. The co-occurrence probability table is stored in the storage unit 101 in advance. A content word is a word, such as a noun, verb, or adjective, that has almost no grammatical function and mainly conveys lexical meaning.

The co-occurrence probability is calculated from a text corpus based on formula (1). In formula (1), A→B means that node A depends on node B; that is, node A is the dependency source node and node B is the dependency destination node. P(A→B) represents the co-occurrence probability of nodes A and B that have the dependency relationship A→B. M_A and M_B represent the content words contained in nodes A and B, respectively. The symbol * represents an arbitrary content word. N(M_A→M_B), N(M_A→*), and N(*→M_B) represent the numbers of occurrences of M_A→M_B, M_A→*, and *→M_B in the text corpus, respectively.
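Formula (1) itself is not reproduced in this text, so the following sketch only shows how the three counts it relies on could be gathered from a corpus of dependency pairs. The pair list is hypothetical toy data, and `dependency_counts` is an illustrative name; how the counts are combined into P(A→B) is left to formula (1).

```python
from collections import Counter


def dependency_counts(pairs):
    """Count N(M_A→M_B), N(M_A→*), and N(*→M_B) over (source, destination) pairs."""
    pair_count = Counter(pairs)                    # N(M_A→M_B)
    source_count = Counter(a for a, _ in pairs)    # N(M_A→*)
    dest_count = Counter(b for _, b in pairs)      # N(*→M_B)
    return pair_count, source_count, dest_count


# Hypothetical content-word dependency pairs extracted from a parsed corpus.
corpus_pairs = [("重い", "存在"), ("重い", "存在"), ("重い", "荷物"), ("大きい", "存在")]

n_ab, n_a_any, n_any_b = dependency_counts(corpus_pairs)
print(n_ab[("重い", "存在")], n_a_any["重い"], n_any_b["存在"])
```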

(3-1-2. Editing the summary sentence)
Next, the summary sentence editing process using the co-occurrence probability table will be described with reference to FIG. 14 and FIG. 15. FIG. 14 is a diagram illustrating an example of the designation operation and processing for adding phrases according to the third embodiment. FIG. 15 is a diagram illustrating an example of the designation operation and processing for deleting phrases according to the third embodiment.

When the target of the designation operation is the original text 101a (see FIG. 14(A)), the range control unit 108 determines the addition range based on the syntax analysis result, as shown in FIG. 14(B). Specifically, the range control unit 108 takes the node corresponding to the designated character as the starting point, extracts the nodes on the path from the starting point to the root node, and includes in the addition range those nodes whose phrases are not already contained in the summary text.

The range control unit 108 also refers to the co-occurrence probability table and, among the nodes connected to the starting point in the direction of the terminal nodes, extracts those whose co-occurrence probability is at or above a preset threshold. It then treats each extracted node as a new starting point and again extracts the nodes connected toward the terminal nodes whose co-occurrence probability is at or above the threshold. The range control unit 108 repeats this extraction and adds the extracted nodes to the addition range. The threshold may be a statistically reasonable value derived from, for example, the distribution of co-occurrence probabilities obtained from the text corpus, or it may be set freely by the user.

In the example of FIG. 14(B), the path from the node for the designated phrase "存在だった" ("was a presence") to the root node for the phrase "去った" ("departed") contains the phrase "論客が" ("commentator"), which is already in the summary text. The node for "存在だった" is therefore the one included in the addition range.

Two nodes are connected to the node for "存在だった" ("was a presence") in the direction of the terminal nodes: those for "経済論壇で" ("in economic debate circles") and "重い" ("weighty"). In this example, the co-occurrence probability between the nodes for "経済論壇で" and "存在だった" is below the threshold, so the node for "経済論壇で" is not included in the addition range. The co-occurrence probability between the nodes for "重い" and "存在だった" is at or above the threshold, so the node for "重い" is included. As a result, as shown in FIG. 14(C), the phrase "重い存在だった" ("who had been a weighty presence", underlined in the figure) is added to the summary text.
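The addition-range rule and the FIG. 14 example can be traced with a short Python sketch. This is an illustration under stated assumptions, not the patented implementation: the node numbering and dependency tree in `HEAD`, the probability values in `COOC`, the value of `THRESHOLD`, the initial summary, and the function name `addition_range` are all hypothetical.

```python
# Hypothetical dependency tree for the example sentence: node -> head (0 = none).
HEAD = {1: 2, 2: 4, 3: 4, 4: 13, 5: 7, 6: 7, 7: 9, 8: 9,
        9: 13, 10: 11, 11: 13, 12: 13, 13: 0}
PHRASE = {1: "昨年", 2: "八月末の", 3: "暑い", 4: "日、", 5: "経済論壇で",
          6: "重い", 7: "存在だった", 8: "一人の", 9: "論客が", 10: "志",
          11: "半ばで", 12: "世を", 13: "去った。"}
# Assumed co-occurrence probabilities for (dependent, head) pairs; others are 0.
COOC = {(5, 7): 0.01, (6, 7): 0.30}
THRESHOLD = 0.10  # assumed value

def addition_range(start, in_summary):
    """Nodes to add when `start` is designated in the original text."""
    added = set()
    # Toward the root: every node on the path that is not yet in the summary.
    node = start
    while node != 0:
        if node not in in_summary:
            added.add(node)
        node = HEAD[node]
    # Toward the terminal nodes: follow dependents at or above the threshold,
    # treating each extracted node as a new starting point.
    stack = [start]
    while stack:
        head = stack.pop()
        for dep, h in HEAD.items():
            if h == head and COOC.get((dep, head), 0.0) >= THRESHOLD:
                added.add(dep)
                stack.append(dep)
    return added

summary = {9, 10, 11, 12, 13}            # assumed summary: 論客が志半ばで世を去った。
new_nodes = addition_range(7, summary)   # the user designates 存在だった
print("".join(PHRASE[n] for n in sorted(new_nodes)))  # 重い存在だった
```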

When the designation operation targets the summary text, on the other hand, the range control unit 108 determines the deletion range from the syntax analysis result, as shown in FIG. 15(B). The range control unit 108 takes the node corresponding to the designated character as the starting point, extracts the nodes on the paths from the starting point to the terminal nodes, and includes in the deletion range the extracted nodes whose phrases are already in the summary text.
The range control unit 108 also extracts the nodes on the path from the node corresponding to the designated character to the root node and treats as candidates those extracted nodes whose phrases are already in the summary text. If the co-occurrence probability between the node corresponding to the designated character and a candidate is at or above the threshold, the range control unit 108 includes that candidate in the deletion range.

In the example of FIG. 15(B), the node for the designated phrase "志" ("ambition") is itself a terminal node, so no deletion range is extracted further toward the terminal nodes. Of the nodes on the path from the node for "志" to the root node for "去った" ("departed"), the phrases "半ばで" ("halfway") and "去った" are in the summary text. In this example, the co-occurrence probability of the nodes for "志" and "半ばで" is at or above the threshold, so the deletion range consists of the nodes for "志" and "半ばで". As a result, as shown in FIG. 15(C), the phrase "志半ばで" ("with his ambitions half fulfilled") is deleted from the summary text. The threshold used for addition and the threshold used for deletion are set to the same value, for example.

As described above, in the third embodiment, as in the second embodiment, the summary text can be edited easily by performing a designation operation on the original text 101a or on the summary text.

On addition, the phrases for the series of nodes connected toward the root node of the syntax tree form the addition range, and the phrases for the series of nodes with high co-occurrence probability toward the terminal nodes are added to it. On deletion, the phrases for the series of nodes connected toward the terminal nodes form the deletion range, and the phrases for the series of nodes with high co-occurrence probability toward the root node are added to it. A set of phrases that is likely to be added or deleted together is thus handled by a single designation operation, which makes the summary text easier to edit and reduces the editing workload. Because the co-occurrence probability is taken into account, the set handled by one operation consists of the phrases that are especially likely to be added or deleted together, which reduces the workload further.

[3-2. Processing flow]
Next, the flow of processing executed by the information processing apparatus 100 is described with reference to FIGS. 16 to 18. FIGS. 16, 17, and 18 are the first, second, and third flowcharts, respectively, showing the flow of processing in the operation of the information processing apparatus according to the third embodiment.

(S201) The original text input unit 102 acquires the original text 101a and stores it in the storage unit 101. For example, the original text input unit 102 stores in the storage unit 101 the original text 101a that the user entered through the input unit 916. If the original text 101a resides on a storage device externally connected to the information processing apparatus 100 or in a storage area on a network, the original text input unit 102 acquires the original text from there and stores it in the storage unit 101.

(S202) The morpheme analysis unit 103 performs morphological analysis on the original text 101a and stores in the storage unit 101, as part of the analysis result 101b, information in which a part of speech and other attributes are attached to each morpheme extracted from the original text 101a (see FIG. 4). For example, if the original text 101a is the sentence "昨年八月末の暑い日、経済論壇で重い存在だった一人の論客が志半ばで世を去った。" ("On a hot day at the end of August last year, a commentator who had been a weighty presence in economic debate circles passed away with his ambitions half fulfilled."), a morphological analysis result like the one in FIG. 4 is obtained.

(S203) The syntax analysis unit 104 performs syntax analysis (dependency analysis) of the original text 101a based on the morphological analysis result output by the morpheme analysis unit 103. Syntax analysis is a method of analyzing the structure of a sentence in units of phrases or clauses according to grammatical rules. For each phrase obtained by the syntax analysis, the syntax analysis unit 104 associates information such as the phrase's surface form, dependency destination, and dependency type (see FIG. 5) and stores it in the storage unit 101 as part of the analysis result 101b.

(S204) The syntax analysis unit 104 generates correspondence data (see FIG. 7) that associates each character in the original text 101a with the node for the phrase containing that character. For example, as shown in FIG. 7, the syntax analysis unit 104 generates, as part of the analysis result 101b, correspondence data that associates the original-text character number of each character in the original text 101a, the character itself, and the node number of the node corresponding to that character.
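The correspondence data of S204 amounts to a table with one row per character. A minimal Python sketch follows. The function name `build_char_map`, the 1-based character numbering, and the phrase segmentation are assumptions for illustration and are not taken from FIG. 7.

```python
def build_char_map(phrases):
    """Build (character number, character, node number) rows.

    phrases: list of (node_number, phrase) in order of appearance.
    """
    rows = []
    char_no = 1  # assumed 1-based numbering
    for node_no, phrase in phrases:
        for ch in phrase:
            rows.append((char_no, ch, node_no))
            char_no += 1
    return rows

# First three phrases of the example sentence (assumed segmentation).
phrases = [(1, "昨年"), (2, "八月末の"), (3, "暑い")]
char_map = build_char_map(phrases)

# Lookup used when the user designates a character: character number -> node.
node_of_char = {char_no: node_no for char_no, _, node_no in char_map}
print(char_map[:3])     # [(1, '昨', 1), (2, '年', 1), (3, '八', 2)]
print(node_of_char[5])  # 2 (the character 末 belongs to 八月末の)
```

The same function can produce the summary-side table of S208 if it is given only the phrases that remain in the summary text.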

(S205) The syntax analysis unit 104 initializes to OFF all the deletion flags associated with the phrases in the syntax analysis result (see FIG. 5). A deletion flag is OFF for a phrase that is included in the summary text and ON for a phrase that is not.

(S206) The summary sentence generation unit 105 sets to ON the deletion flags of the nodes to be deleted when the summary is generated. For example, the summary sentence generation unit 105 identifies the phrases for the nodes on the paths from the root node to the nodes set for summary generation, and sets to ON the deletion flags of all nodes other than the identified ones.

(S207) The summary sentence generation unit 105 generates the summary text by concatenating the phrases for the nodes whose deletion flag is OFF, in their order of appearance in the original text 101a.
(S208) The summary sentence generation unit 105 generates correspondence data (see FIG. 8) that associates each character in the summary text with the node for the phrase containing that character. For example, as shown in FIG. 8, the summary sentence generation unit 105 generates, as part of the analysis result 101b, correspondence data that associates the summary-text character number of each character in the summary text, the character itself, and the node number of the node corresponding to that character.
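Steps S205 to S207 can be sketched in Python as follows. The tree in `HEAD`, the choice of `keep_nodes` (standing in for "the nodes set for summary generation"), and the function names are assumptions for illustration. The sketch also assumes that node numbers follow the order of appearance in the original text.

```python
# Hypothetical dependency tree for the example sentence: node -> head (0 = none).
HEAD = {1: 2, 2: 4, 3: 4, 4: 13, 5: 7, 6: 7, 7: 9, 8: 9,
        9: 13, 10: 11, 11: 13, 12: 13, 13: 0}
PHRASE = {1: "昨年", 2: "八月末の", 3: "暑い", 4: "日、", 5: "経済論壇で",
          6: "重い", 7: "存在だった", 8: "一人の", 9: "論客が", 10: "志",
          11: "半ばで", 12: "世を", 13: "去った。"}

def path_to_root(node):
    """Nodes from `node` up to and including the root."""
    path = []
    while node != 0:
        path.append(node)
        node = HEAD[node]
    return path

def initial_flags(keep_nodes):
    """S205-S206: True means the deletion flag is ON (phrase left out)."""
    flags = {n: False for n in HEAD}                 # S205: all OFF
    kept = {p for n in keep_nodes for p in path_to_root(n)}
    for n in HEAD:
        if n not in kept:
            flags[n] = True                          # S206: ON outside kept paths
    return flags

def build_summary(flags):
    """S207: concatenate the flag-OFF phrases in order of appearance."""
    return "".join(PHRASE[n] for n in sorted(HEAD) if not flags[n])

flags = initial_flags(keep_nodes=[9, 10, 12])        # assumed selection
summary_text = build_summary(flags)
print(summary_text)  # 論客が志半ばで世を去った。
```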

(S209) The sentence output unit 106 acquires the original text 101a from the storage unit 101 and outputs it together with the summary text generated by the summary sentence generation unit 105. The sentence output unit 106 displays the original text 101a and the summary text in a form that lets the user designate characters in either text.

(S210) The designation receiving unit 107 determines whether the summary text output by the sentence output unit 106 has been finalized. For example, it determines whether an operation to end editing of the summary text has been performed. If the summary text has been finalized, the series of processes shown in FIGS. 16 and 17 ends. Otherwise, the process proceeds to S211.

(S211) The designation receiving unit 107 determines whether a character in the original text 101a has been designated. If so, the process proceeds to S212. If not, the process proceeds to S217.

(S212) On receiving a designation operation on a character in the original text 101a, the designation receiving unit 107 refers to the correspondence data between the original text and the syntax analysis result (see FIG. 7) and identifies the node corresponding to the designated character. It then sets the identified node as the node under examination.

(S213) The range control unit 108 executes subroutine S (see FIG. 19). Subroutine S[...] denotes a processing unit that takes "..." as its argument. In S213, the range control unit 108 executes subroutine S with the node under examination set in S212 as the argument. Subroutine S, including the addition start nodes that it sets, is described later.

(S214, S215, S216) S215 is executed for each of the addition start nodes set in S213. In S215, the range control unit 108 uses the syntax analysis result to set to OFF the deletion flag of every node from the addition start node to the root node. When S215 has been completed for all addition start nodes, the process proceeds to S207.

(S217) The designation receiving unit 107 determines whether a character in the summary text has been designated. If so, the process proceeds to S218. If not, the process proceeds to S210.

(S218) On receiving a designation operation on a character in the summary text, the designation receiving unit 107 refers to the correspondence data between the summary text and the syntax analysis result (see FIG. 8) and identifies the node corresponding to the designated character. It then sets the identified node as the deletion start node.

(S219) The range control unit 108 determines whether the deletion start node is the root node. If it is, the process proceeds to S224. If it is not, the process proceeds to S220.

(S220) The range control unit 108 acquires from the co-occurrence probability table the co-occurrence probability of the node connected to the deletion start node in the direction of the root node (the connected node).
(S221) The range control unit 108 determines whether the co-occurrence probability acquired in S220 is at or above a preset threshold. If it is, the process proceeds to S222. If it is below the threshold, the process proceeds to S224.

(S222) The range control unit 108 determines whether the connected node has, in the direction of the terminal nodes, another connected node whose deletion flag is OFF. If it has such a node, the process proceeds to S224. If it does not, the process proceeds to S223.

(S223) The range control unit 108 sets the connected node as the deletion start node. When S223 is complete, the process proceeds to S219.
(S224) The range control unit 108 sets to ON the deletion flag of every node from the deletion start node to the terminal nodes. When S224 is complete, the process proceeds to S207.
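The loop formed by S219 to S224 can be sketched in Python as below, using the FIG. 15 example. The tree in `HEAD`, the probability values in `COOC`, the value of `THRESHOLD`, and the function names are hypothetical. The sketch also assumes that "another node" in S222 means a dependent of the connected node other than the current deletion start node.

```python
# Hypothetical dependency tree for the example sentence: node -> head (0 = none).
HEAD = {1: 2, 2: 4, 3: 4, 4: 13, 5: 7, 6: 7, 7: 9, 8: 9,
        9: 13, 10: 11, 11: 13, 12: 13, 13: 0}
PHRASE = {1: "昨年", 2: "八月末の", 3: "暑い", 4: "日、", 5: "経済論壇で",
          6: "重い", 7: "存在だった", 8: "一人の", 9: "論客が", 10: "志",
          11: "半ばで", 12: "世を", 13: "去った。"}
COOC = {(10, 11): 0.40, (11, 13): 0.02}  # assumed (dependent, head) values
THRESHOLD = 0.10                         # assumed value

def children(node):
    return [n for n, h in HEAD.items() if h == node]

def subtree(node):
    """The node and every node below it, toward the terminal nodes."""
    nodes = [node]
    for child in children(node):
        nodes.extend(subtree(child))
    return nodes

def delete_from(start, flags):
    """flags[n] is True when the deletion flag of node n is ON."""
    origin = start
    while HEAD[origin] != 0:                                  # S219: not the root
        parent = HEAD[origin]
        if COOC.get((origin, parent), 0.0) < THRESHOLD:       # S220, S221
            break
        others = [c for c in children(parent)
                  if c != origin and not flags[c]]
        if others:                                            # S222
            break
        origin = parent                                       # S223
    for n in subtree(origin):                                 # S224
        flags[n] = True
    return origin

# Assumed summary before the edit: 論客が志半ばで世を去った。
flags = {n: n not in {9, 10, 11, 12, 13} for n in HEAD}
origin = delete_from(10, flags)                               # user designates 志
summary_text = "".join(PHRASE[n] for n in sorted(HEAD) if not flags[n])
print(origin, summary_text)  # 11 論客が世を去った。
```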

(Subroutine S)
The processing of subroutine S is now described with reference to FIG. 19. FIG. 19 is the fourth flowchart showing the flow of processing in the operation of the information processing apparatus according to the third embodiment. The node specified as the argument of subroutine S is called the "input node".

(S251) The range control unit 108 determines whether any node is connected to the input node in the direction of the terminal nodes (a connected node). If there is a connected node, the process proceeds to S252. If there is none, the series of processes shown in FIG. 19 ends.

(S252, S256) The range control unit 108 executes the processing from S252 to S256 for each of the nodes connected to the input node in the direction of the terminal nodes. That is, it repeats S253 to S255 while changing the connected node being processed. When all connected nodes have been processed, the process proceeds to S257.

(S253, S254) The range control unit 108 acquires from the co-occurrence probability table the co-occurrence probability between the input node and the connected node and determines whether it is at or above a preset threshold. If it is, the process proceeds to S255. If it is below the threshold, the process proceeds to S256.

(S255) The range control unit 108 executes subroutine S with the connected node as the argument. That is, it executes the series of processes shown in FIG. 19 with the connected node currently being processed as the input node.

(S257) The range control unit 108 determines whether the co-occurrence probability was below the threshold for all connected nodes, that is, whether S255 was executed for none of the connected nodes. If the co-occurrence probability was below the threshold for all connected nodes, the process proceeds to S258. If there was a connected node whose co-occurrence probability was not below the threshold, the series of processes shown in FIG. 19 ends.

(S258) The range control unit 108 sets the input node as an addition start node. When S258 is complete, the series of processes shown in FIG. 19 ends.
This concludes the description of the flow of processing executed by the information processing apparatus 100.
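The addition path through S213 to S216 and subroutine S (S251 to S258) can be sketched in Python as follows. The tree, probability values, threshold, and function names are hypothetical. One point is an assumption rather than something the text states: at S251 the text says only that the subroutine ends when the input node has no connected node, whereas this sketch also registers such a terminal node as an addition start node, so that the example reproduces the result described for FIG. 14.

```python
# Hypothetical dependency tree for the example sentence: node -> head (0 = none).
HEAD = {1: 2, 2: 4, 3: 4, 4: 13, 5: 7, 6: 7, 7: 9, 8: 9,
        9: 13, 10: 11, 11: 13, 12: 13, 13: 0}
PHRASE = {1: "昨年", 2: "八月末の", 3: "暑い", 4: "日、", 5: "経済論壇で",
          6: "重い", 7: "存在だった", 8: "一人の", 9: "論客が", 10: "志",
          11: "半ばで", 12: "世を", 13: "去った。"}
COOC = {(5, 7): 0.01, (6, 7): 0.30}  # assumed (dependent, head) values
THRESHOLD = 0.10                     # assumed value

def children(node):
    return [n for n, h in HEAD.items() if h == node]

def subroutine_s(node, origins):
    """Collect addition start nodes at or below `node` (S251-S258)."""
    deps = children(node)
    if not deps:                                     # S251: no connected node
        origins.append(node)                         # assumption, see lead-in
        return
    recursed = False
    for dep in deps:                                 # S252-S256
        if COOC.get((dep, node), 0.0) >= THRESHOLD:  # S253, S254
            subroutine_s(dep, origins)               # S255
            recursed = True
    if not recursed:                                 # S257
        origins.append(node)                         # S258

def add_from(start, flags):
    """flags[n] is True when the deletion flag of node n is ON."""
    origins = []
    subroutine_s(start, origins)                     # S213
    for origin in origins:                           # S214-S216
        node = origin
        while node != 0:
            flags[node] = False                      # S215
            node = HEAD[node]
    return origins

# Assumed summary before the edit: 論客が志半ばで世を去った。
flags = {n: n not in {9, 10, 11, 12, 13} for n in HEAD}
origins = add_from(7, flags)                         # user designates 存在だった
summary_text = "".join(PHRASE[n] for n in sorted(HEAD) if not flags[n])
print(origins, summary_text)  # [6] 重い存在だった論客が志半ばで世を去った。
```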

With the processing method above, the summary text can be edited easily by performing a designation operation on the original text 101a or on the summary text.
On addition, the phrases for the series of nodes connected toward the root node of the syntax tree form the addition range, and the phrases for the series of nodes with high co-occurrence probability toward the terminal nodes are added to it. On deletion, the phrases for the series of nodes connected toward the terminal nodes form the deletion range, and the phrases for the series of nodes with high co-occurrence probability toward the root node are added to it. A set of phrases that is likely to be added or deleted together is thus handled by a single designation operation, which makes the summary text easier to edit and reduces the editing workload. Because the co-occurrence probability is taken into account, the set handled by one operation consists of the phrases that are especially likely to be added or deleted together, which reduces the workload further.

This concludes the description of the third embodiment.
<4. Fourth Embodiment>
Next, the fourth embodiment is described. Detailed description of the parts that overlap with the second or third embodiment is omitted. The technique of the fourth embodiment can be used in combination with the second or third embodiment.

[4-1. Functions]
When adding phrases to the summary text, the information processing apparatus 100 according to the fourth embodiment selects and adds, from the nodes on the path from the node corresponding to the designated character to the root node, only those nodes that satisfy a specific condition. In other words, the information processing apparatus 100 shortcuts the connection between nodes that are far apart on the syntax tree.

The shortcut is described below with reference to FIGS. 20 and 21. FIG. 20 shows an example of a syntax analysis (dependency analysis) result according to the fourth embodiment. FIG. 21 shows an example of the designation operation and processing for adding phrases according to the fourth embodiment.

The shortcut uses shortcut information added to the syntax analysis result, as shown in FIG. 20. In the example of FIG. 20, the node with node number "1" is shortcut to the node with node number "13". For example, when the phrase "昨年" ("last year") corresponding to node number "1" is designated (see FIG. 21(A)), the nodes on the path to the root node that lie before the node with node number "13" (the phrase "去った", "departed") are skipped and excluded from the addition. As a result, only "昨年" is added to the summary text, as shown in FIG. 21(C).

The shortcut destination is determined by equation (2) below. D(i, j) is the distance between nodes i and j, and P(i, j) is their co-occurrence probability. The score Sc_j of node j with respect to node i is defined as D(i, j) × P(i, j). The distance between nodes can be evaluated as, for example, the number of branches between the nodes, or the number of other nodes between them plus one. The co-occurrence probability is given by the co-occurrence probability table described above. Node_s(i) is the node number of the node with the highest score among the nodes on the path from node i to the root node. L is the number of nodes on the path from node i to the root node.
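Read from the definitions above, Node_s(i) is the node j on the path from node i to the root node that maximizes Sc_j = D(i, j) × P(i, j). A minimal Python sketch of that selection follows. The function name `shortcut_destination`, the path, and the probability values are assumptions for illustration. The distance is counted in branches, and ties go to the node nearer to node i, which is a choice made for this sketch and not stated in the text.

```python
def shortcut_destination(i, head, cooc):
    """Return the node on the path from i to the root with the highest
    score Sc_j = D(i, j) * P(i, j), where D counts branches."""
    best_node, best_score = None, float("-inf")
    distance, j = 0, head[i]
    while j != 0:                       # walk toward the root
        distance += 1
        score = distance * cooc.get((i, j), 0.0)
        if score > best_score:          # strict '>' keeps the nearer node on ties
            best_node, best_score = j, score
        j = head[j]
    return best_node

# Assumed path for the example: 昨年(1) → 八月末の(2) → 日(4) → 去った(13).
HEAD = {1: 2, 2: 4, 4: 13, 13: 0}
# Assumed co-occurrence probabilities P(i, j) for i = 1.
COOC = {(1, 2): 0.05, (1, 4): 0.02, (1, 13): 0.20}

dest = shortcut_destination(1, HEAD, COOC)
print(dest)  # 13, since the scores are 0.05, 0.04 and 0.60
```

With these values the intermediate nodes 2 and 4 are skipped, so only the designated phrase itself is added.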

In the example of FIG. 21, the phrase "昨年" ("last year") is designated and the scores Sc1, Sc2, and Sc3 are referred to for the nodes on the path to the root node. In this example, when Sc1 > Sc2 and Sc1 > Sc3, the node for the phrase "去った" ("departed"), which corresponds to Sc1, becomes the shortcut destination. The range control unit 108 therefore excludes the phrases "八月末の" ("at the end of August") and "日" ("day") from the addition and includes in the addition range the phrase "昨年", which is not yet in the summary text. As a result, the summary text shown in FIG. 21(C) is obtained.

In this way, the functions of the information processing apparatus 100 according to the fourth embodiment narrow down the phrases to be added based on the distance between nodes and the co-occurrence probability. If the node corresponding to the designated character is far from the root node, more phrases are added to the summary text and the risk of adding unnecessary phrases rises; applying the function above reduces this risk. Because the co-occurrence probability is taken into account, necessary phrases are more likely to be kept and unnecessary ones excluded, so an appropriate set of phrases is more likely to be added. As a result, the summary text becomes easier to edit, which further reduces the editing workload.

[4-2. Processing flow]
Next, the flow of processing executed by the information processing apparatus 100 is described with reference to FIGS. 22 to 24. FIGS. 22, 23, and 24 are the first, second, and third flowcharts, respectively, showing the flow of processing in the operation of the information processing apparatus according to the fourth embodiment.

(S301) The original text input unit 102 acquires the original text 101a and stores it in the storage unit 101. For example, the original text input unit 102 stores in the storage unit 101 the original text 101a that the user entered through the input unit 916. If the original text 101a resides on a storage device externally connected to the information processing apparatus 100 or in a storage area on a network, the original text input unit 102 acquires the original text from there and stores it in the storage unit 101.

(S302) The morpheme analysis unit 103 performs morphological analysis on the original text 101a and stores in the storage unit 101, as part of the analysis result 101b, information in which a part of speech and other attributes are attached to each morpheme extracted from the original text 101a (see FIG. 4). For example, if the original text 101a is the sentence "昨年八月末の暑い日、経済論壇で重い存在だった一人の論客が志半ばで世を去った。" ("On a hot day at the end of August last year, a commentator who had been a weighty presence in economic debate circles passed away with his ambitions half fulfilled."), a morphological analysis result like the one in FIG. 4 is obtained.

(S303) The syntax analysis unit 104 performs syntax analysis (dependency analysis) of the original text 101a based on the morphological analysis result output by the morphological analysis unit 103. Syntax analysis is a method of analyzing the structure of a sentence in units of phrases or clauses according to grammatical rules. For each phrase obtained by the syntax analysis, the syntax analysis unit 104 associates information such as the notation of the phrase, its dependency destination, and its dependency type (see FIG. 20), and stores this information in the storage unit 101 as part of the analysis result 101b.

(S304) The syntax analysis unit 104 generates correspondence data (see FIG. 7) that associates each character in the original text 101a with the node corresponding to the phrase that contains the character. For example, as shown in FIG. 7, the syntax analysis unit 104 generates, as part of the analysis result 101b, correspondence data that associates the original-text character number of each character in the original text 101a, the notation of the character, and the node number of the node corresponding to the character.

(S305) The syntax analysis unit 104 sets all deletion flags associated with the phrases in the syntax analysis result (see FIG. 20) to OFF (initialization). The deletion flag is OFF for a phrase that is included in the summary text and ON for a phrase that is not included in the summary text.

(S306) The summary sentence generation unit 105 sets to ON the deletion flags of the nodes to be deleted when the summary sentence is generated. For example, the summary sentence generation unit 105 identifies the phrases corresponding to the nodes on the path from the root node to the node set for summary generation, and sets to ON the deletion flags of all nodes other than the identified nodes.

(S307) The summary sentence generation unit 105 generates the summary text by concatenating the phrases corresponding to the nodes whose deletion flag is OFF, in their order of appearance in the original text 101a.

(S308) The summary sentence generation unit 105 generates correspondence data (see FIG. 8) that associates each character in the summary text with the node corresponding to the phrase that contains the character. For example, as shown in FIG. 8, the summary sentence generation unit 105 generates, as part of the analysis result 101b, correspondence data that associates the summary-text character number of each character in the summary text, the notation of the character, and the node number of the node corresponding to the character.

(S309) The sentence output unit 106 acquires the original text 101a from the storage unit 101 and outputs it together with the summary text generated by the summary sentence generation unit 105. At this time, the sentence output unit 106 displays the original text 101a and the summary text in a form that lets the user designate characters in the original text 101a and characters in the summary text.

(S310) The designation receiving unit 107 determines whether the summary text output by the sentence output unit 106 has been finalized. For example, the designation receiving unit 107 determines whether an operation to end editing of the summary text has been performed. If the summary text has been finalized, the series of processes shown in FIGS. 22 to 24 ends. If the summary text has not been finalized, the process proceeds to S311.

(S311) The designation receiving unit 107 determines whether a character in the original text 101a has been designated. If a character in the original text 101a has been designated, the process proceeds to S312. If no character in the original text 101a has been designated, the process proceeds to S320.

(S312) On receiving a designation operation on a character in the original text 101a, the designation receiving unit 107 refers to the correspondence data between the original text and the syntax analysis result (see FIG. 7) and identifies the node corresponding to the designated character. The designation receiving unit 107 then sets the identified node as the additional origin node.

(S313) The range control unit 108 sets the node adjacent to the additional origin node on the root node side as the shortcut determination node. When the process of S313 is completed, the process proceeds to S314.

(S314) The range control unit 108 acquires, from the co-occurrence probability table, the co-occurrence probability between the additional origin node and the shortcut determination node.

(S315) The range control unit 108 calculates a score based on the distance between the additional origin node and the shortcut determination node and on the co-occurrence probability acquired in S314. For example, let the additional origin node be node i and the shortcut determination node be node j, let the distance between nodes i and j be D(i, j), and let their co-occurrence probability be P(i, j). The score Sc_j is then given by Sc_j = D(i, j) × P(i, j). The distance here is the distance on the syntax tree.

(S316) The range control unit 108 determines whether the deletion flag of the shortcut determination node is ON. If the deletion flag of the shortcut determination node is ON, the process proceeds to S317. If the deletion flag of the shortcut determination node is OFF, the process proceeds to S318.

(S317) The range control unit 108 sets the node adjacent to the current shortcut determination node on the root node side as the new shortcut determination node. When the process of S317 is completed, the process proceeds to S314.

(S318) The range control unit 108 sets a shortcut from the additional origin node to the node with the highest score.

(S319) The range control unit 108 sets to OFF the deletion flag of each node from the additional origin node to the root node, while leaving unchanged the deletion flags of the nodes in the section skipped by the shortcut. When the process of S319 is completed, the process proceeds to S307.

(S320) The designation receiving unit 107 determines whether a character in the summary text has been designated. If a character in the summary text has been designated, the same processing as in the second or third embodiment is executed. If no character in the summary text has been designated, the process proceeds to S310.

When the same processing as in the second embodiment is executed, the processing from S115 onward in FIG. 12 is executed, and when the process of S116 is completed, the process proceeds to S307. When the same processing as in the third embodiment is executed, the processing from S218 onward in FIG. 18 is executed, and when the process of S224 is completed, the process proceeds to S307. In this way, the technique of the fourth embodiment can be combined with the technique of the second or third embodiment.
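Steps S304 to S308 and the lookup in S312 can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the implementation in the patent: the `Node` class, the function names, the English toy phrases, and the convention that node numbers follow the order of appearance in the original text are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Node:
    number: int            # node number (assumed to follow order of appearance)
    phrase: str            # notation of the phrase
    parent: Optional[int]  # dependency destination; None for the root node
    deleted: bool = False  # deletion flag: False = OFF (kept), True = ON (omitted)


def path_to_root(nodes: Dict[int, Node], start: int) -> List[int]:
    """Return the node numbers from `start` up to the root node, inclusive."""
    path: List[int] = []
    current: Optional[int] = start
    while current is not None:
        path.append(current)
        current = nodes[current].parent
    return path


def init_flags(nodes: Dict[int, Node], keep_target: int) -> None:
    """S305-S306: set every flag to OFF, then ON for nodes off the kept path."""
    for node in nodes.values():
        node.deleted = False
    keep = set(path_to_root(nodes, keep_target))
    for node in nodes.values():
        if node.number not in keep:
            node.deleted = True


def build_summary(nodes: Dict[int, Node]) -> Tuple[str, List[int]]:
    """S307-S308: join kept phrases in order and map each character to its node."""
    parts: List[str] = []
    char_to_node: List[int] = []
    for number in sorted(nodes):
        node = nodes[number]
        if not node.deleted:
            parts.append(node.phrase)
            char_to_node.extend([number] * len(node.phrase))
    return "".join(parts), char_to_node


# Toy dependency tree: nodes 0 and 2 depend on node 3 (root); node 1 depends on node 2.
nodes = {
    0: Node(0, "On a hot day, ", 3),
    1: Node(1, "a respected ", 2),
    2: Node(2, "commentator ", 3),
    3: Node(3, "passed away.", None),
}
init_flags(nodes, keep_target=2)
summary, summary_map = build_summary(nodes)
print(summary)
# S312-style lookup: which node does the 13th summary character belong to?
print(summary_map[12])
```

The same character-to-node list structure would serve for the original text (FIG. 7) if it were built over all nodes instead of only the kept ones.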

The flow of processing executed by the information processing apparatus 100 has been described above.

According to the above processing method, the phrases to be added are narrowed down based on the distance between nodes and the co-occurrence probability. If the node corresponding to the designated character is far from the root node, more phrases are added to the summary text and the risk of adding unnecessary phrases increases; applying the above processing method reduces this risk. In addition, because the co-occurrence probability is taken into account, necessary phrases are more likely to remain and unnecessary phrases are more likely to be excluded, which raises the likelihood that an appropriate set of phrases is added. As a result, editing the summary text becomes easier, which further reduces the burden of editing work.
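The shortcut selection of S313 to S319 can be sketched as follows. This is a minimal illustration under stated assumptions: the dictionaries `parent`, `deleted`, and `cooc`, the function names, and the sample probabilities are hypothetical, and a pair missing from the co-occurrence table is treated as probability 0, which the source does not specify.

```python
from typing import Dict, Optional, Tuple


def choose_shortcut(parent: Dict[int, Optional[int]],
                    deleted: Dict[int, bool],
                    cooc: Dict[Tuple[int, int], float],
                    origin: int) -> Optional[int]:
    """S313-S318: walk toward the root, score each node, return the best one."""
    scores: Dict[int, float] = {}
    j = parent[origin]   # S313: first shortcut determination node
    distance = 1         # distance D(i, j) on the syntax tree
    while j is not None:
        # S314-S315: score = D(i, j) x P(i, j); missing pairs count as 0 (assumption)
        scores[j] = distance * cooc.get((origin, j), 0.0)
        if not deleted[j]:   # S316: stop at the first node already in the summary
            break
        j = parent[j]        # S317: move one node toward the root
        distance += 1
    if not scores:           # the origin is the root node itself
        return None
    return max(scores, key=lambda n: scores[n])  # S318


def restore_with_shortcut(parent: Dict[int, Optional[int]],
                          deleted: Dict[int, bool],
                          origin: int, target: Optional[int]) -> None:
    """S319: set flags to OFF from origin to root, leaving the skipped section as is."""
    deleted[origin] = False
    node = parent[origin]
    while node is not None and node != target:
        node = parent[node]  # skipped section: deletion flags are left unchanged
    while node is not None:
        deleted[node] = False
        node = parent[node]


# Chain 4 -> 3 -> 2 -> 1 -> 0 (root); nodes 2, 3, 4 are currently omitted.
parent = {0: None, 1: 0, 2: 1, 3: 2, 4: 3}
deleted = {0: False, 1: False, 2: True, 3: True, 4: True}
cooc = {(4, 3): 0.1, (4, 2): 0.4, (4, 1): 0.2}

target = choose_shortcut(parent, deleted, cooc, origin=4)
restore_with_shortcut(parent, deleted, origin=4, target=target)
print(target, deleted)
```

In this toy example node 2 receives the highest score, so node 3 stays omitted while nodes 4 and 2 are restored to the summary.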

The fourth embodiment has been described above.

DESCRIPTION OF SYMBOLS
10 Information processing apparatus
11 Storage unit
12 Arithmetic unit
13 Display unit
20 Syntax tree
21 Addition range
22 Deletion range
31 Original sentence
32 Summary sentence

Claims

An information processing apparatus comprising:
a storage unit that stores an original sentence and a syntax tree in which the dependency structure of phrases, obtained by syntactic analysis of the original sentence, is expressed by connections between nodes corresponding to the phrases;
a display unit that displays the original sentence and a summary sentence that summarizes the original sentence by omitting phrases; and
an arithmetic unit that, when a designation operation on the original sentence is received, identifies a second node that is connected to a first node corresponding to the phrase at the designated location and lies in the direction toward the root of the syntax tree, and adds the phrases corresponding to the first and second nodes to the summary sentence, and that,
when a designation operation on the summary sentence is received, identifies a fourth node that is connected to a third node corresponding to the phrase at the designated location and lies in the direction toward the end of the syntax tree, and deletes the phrases corresponding to the third and fourth nodes from the summary sentence.

The information processing apparatus according to claim 1, wherein the storage unit further stores a co-occurrence probability between nodes connected on the syntax tree, and
when a designation operation on the original sentence is received, the arithmetic unit identifies, as the second node, a node whose co-occurrence probability with the first node is greater than a set threshold among the nodes that are connected to the first node and lie in the direction toward the end of the syntax tree.

The information processing apparatus according to claim 2, wherein, when a designation operation on the summary sentence is received, the arithmetic unit identifies, as the fourth node, a node whose co-occurrence probability with the third node is greater than the threshold among the nodes that are connected to the third node and lie in the direction toward the root of the syntax tree.

The information processing apparatus according to claim 1, wherein the storage unit further stores a co-occurrence probability between nodes connected on the syntax tree, and
when a designation operation on the original sentence is received, the arithmetic unit uses an evaluation value that becomes higher as the distance from the first node on the syntax tree becomes longer and as the co-occurrence probability becomes higher, and identifies, as the second node, the node with the highest evaluation value among the nodes that are connected to the first node and lie in the direction toward the root of the syntax tree.

A summary sentence editing method in which a computer:
acquires, from a storage unit, an original sentence and a syntax tree in which the dependency structure of phrases, obtained by syntactic analysis of the original sentence, is expressed by connections between nodes corresponding to the phrases;
displays, on a display unit, the original sentence and a summary sentence that summarizes the original sentence;
when a designation operation on the original sentence is received, identifies a second node that is connected to a first node corresponding to the phrase at the designated location and lies in the direction toward the root of the syntax tree, and adds the phrases corresponding to the first and second nodes to the summary sentence; and
when a designation operation on the summary sentence is received, identifies a fourth node that is connected to a third node corresponding to the phrase at the designated location and lies in the direction toward the end of the syntax tree, and deletes the phrases corresponding to the third and fourth nodes from the summary sentence.

A program that causes a computer to execute a process comprising:
acquiring, from a storage unit, an original sentence and a syntax tree in which the dependency structure of phrases, obtained by syntactic analysis of the original sentence, is expressed by connections between nodes corresponding to the phrases;
displaying, on a display unit, the original sentence and a summary sentence that summarizes the original sentence;
when a designation operation on the original sentence is received, identifying a second node that is connected to a first node corresponding to the phrase at the designated location and lies in the direction toward the root of the syntax tree, and adding the phrases corresponding to the first and second nodes to the summary sentence; and
when a designation operation on the summary sentence is received, identifying a fourth node that is connected to a third node corresponding to the phrase at the designated location and lies in the direction toward the end of the syntax tree, and deleting the phrases corresponding to the third and fourth nodes from the summary sentence.