JP7578779B2 - Method and system for adapting neural networks to dynamic environments

Info

Publication number: JP7578779B2
Application number: JP2023176429A
Authority: JP
Inventors: アールマニカンダン; 雄一野中; バネルジーキングシュク
Original assignee: Hitachi Ltd
Current assignee: Hitachi Ltd
Priority date: 2022-11-02
Filing date: 2023-10-12
Publication date: 2024-11-06
Anticipated expiration: 2043-10-12
Also published as: JP2024066994A

Description

The present disclosure relates generally to the fields of artificial intelligence and data analytics. More specifically, it relates to methods and systems for adapting neural networks to dynamic environments.

A neural network is a computational model that attempts to mimic the way the human brain makes decisions. Neural networks are widely used in diverse applications across many industries, and in such applications they must cope with different types of environments. For example, a neural network implemented in a video analytics system, such as a surveillance application for workplace safety, may need to identify workers wearing safety gear on an assembly line and also on a production line. Because the two environments differ, the background varies considerably, which affects the identification of workers wearing safety gear. The neural network may therefore misidentify workers or miss them altogether, which degrades its performance. In another example, in an object recognition application, a neural network trained on images of objects captured in a sunny region may perform poorly in a cloudy region, because it may misrecognize objects under different weather conditions. Neural networks therefore need to be developed for each environment to improve performance, but such development is expensive and tedious. In addition, the scarcity of data and annotations in many applications makes it difficult to adapt neural networks to different environments.

Conventional techniques such as semi-supervised learning, zero-shot learning, and few-shot learning address the need for large amounts of data and annotations. However, these techniques do not provide a way to extend neural networks to different environments without at least minimal annotation. Moreover, their accuracy is low even though annotations are created. Other conventional techniques extract image views to improve the performance of neural networks. However, because these techniques consider only the foreground features in the data that are relevant to the application of the neural network, they cannot adapt to different backgrounds and environments.

The information disclosed in the Background section of this disclosure is intended only to enhance understanding of the general background of the invention. It should not be taken as an acknowledgement, or any form of suggestion, that this information forms prior art already known to a person skilled in the art.

In an embodiment, the present disclosure discloses a method for adapting a neural network to a dynamic environment. The method comprises receiving a first set of features of an input data item, the first set being associated with a first portion of the input data item. The method further comprises determining a second set of features of the input data item, associated with a second portion of the input data item, by associating the first set of features with one or more correlation pairs generated using the neural network. The one or more correlation pairs are generated as follows. A feature set comprising a first set of training features and a second set of training features for each dataset of a plurality of training data items is received from one or more neural networks. One or more first labels and one or more second labels for each dataset of the plurality of training data items are generated by projecting the first set of training features and the second set of training features onto a feature space. The first set of training features and the second set of training features are rearranged on the feature space based on the correlation between the two sets. Based on this rearrangement, the one or more correlation pairs are generated, each indicating one or more first labels and the corresponding second labels for each dataset of the plurality of training data items.

In one embodiment, the present disclosure discloses a system for adapting a neural network to a dynamic environment. The system comprises one or more processors and a memory. The one or more processors are configured to receive a first set of features of an input data item, the first set being associated with a first portion of the input data item. The one or more processors are further configured to determine a second set of features of the input data item, associated with a second portion of the input data item, by associating the first set of features with one or more correlation pairs generated using the neural network. The one or more correlation pairs are generated as follows. A feature set comprising a first set of training features and a second set of training features for each dataset of a plurality of training data items is received from one or more neural networks. One or more first labels and one or more second labels for each dataset of the plurality of training data items are generated by projecting the first set of training features and the second set of training features onto a feature space. The first set of training features and the second set of training features are rearranged on the feature space based on the correlation between the two sets. Based on this rearrangement, the one or more correlation pairs are generated, each indicating one or more first labels and the corresponding second labels for each dataset of the plurality of training data items.

The foregoing summary is illustrative only and is not intended to be limiting in any way. In addition to the illustrative aspects, embodiments, and features described above, further aspects, embodiments, and features will become apparent by reference to the drawings and the following detailed description.

The novel features and characteristics of the present disclosure are set forth in the appended claims. The disclosure itself, however, as well as its preferred mode of use, further objects, and advantages, will be best understood by reference to the following detailed description of illustrative embodiments when read in conjunction with the accompanying drawings. One or more embodiments are now described, by way of example only, with reference to the accompanying drawings, in which like reference numerals represent like elements.

FIG. 1 illustrates an exemplary environment for adapting a neural network to a dynamic environment, according to some embodiments of the present disclosure.
FIG. 2 is a detailed diagram of a system for adapting a neural network to a dynamic environment, according to some embodiments of the present disclosure.
FIG. 3A, FIG. 3B, and FIG. 3C are exemplary diagrams for adapting a neural network to a dynamic environment, according to some embodiments of the present disclosure.
FIG. 4 is an exemplary flowchart illustrating method steps for training a neural network to adapt to a dynamic environment, according to some embodiments of the present disclosure.
FIG. 5 is an exemplary flowchart illustrating method steps for adapting a neural network to a dynamic environment, according to some embodiments of the present disclosure.
FIG. 6 is a block diagram of a general-purpose computing system for adapting a neural network to a dynamic environment, according to an embodiment of the present disclosure.

Those skilled in the art will appreciate that any block diagrams herein represent conceptual views of illustrative systems embodying the principles of the present subject matter. Similarly, any flowcharts, flow diagrams, state transition diagrams, pseudocode, and the like represent various processes that may be substantially represented in a computer-readable medium and executed by a computer or processor, whether or not such a computer or processor is explicitly shown.

The word "exemplary" is used herein to mean "serving as an example, instance, or illustration." Any embodiment or implementation of the present subject matter described herein as "exemplary" is not necessarily to be construed as preferred or advantageous over other embodiments.

While the present disclosure is susceptible to various modifications and alternative forms, specific embodiments thereof are shown by way of example in the drawings and are described in detail below. It should be understood, however, that there is no intent to limit the disclosure to the particular forms disclosed. On the contrary, the disclosure is intended to cover all modifications, equivalents, and alternatives falling within its scope.

The terms "includes", "including", and any other variations thereof are intended to cover a non-exclusive inclusion. A setup, device, or method that includes a list of components or steps does not include only those components or steps; it may also include other components or steps that are not expressly listed or that are inherent to such a setup, device, or method. In other words, a statement that a system or apparatus "includes" one or more elements does not, absent further constraints, preclude the existence of other or additional elements in the system or apparatus.

Neural networks are used in a variety of applications across many industries. In such applications they encounter different types of environments, and they need to adapt to those environments to perform well. The present disclosure provides a method and system for adapting a neural network to different dynamic environments. The neural network is trained by providing a first set of training features (e.g., background features) and a second set of training features (e.g., foreground features) of a plurality of training data items. The neural network correlates the first set of features with the second set of features to form correlation pairs, each consisting of a first label (e.g., a background label) and a second label (e.g., a foreground label). When the neural network receives the first set of features (e.g., background features) of an input data item, it determines the second set of features (e.g., foreground features) of the input data item based on the correlation pairs. The neural network can therefore adapt to new and different environments. The present disclosure can be used to develop neural networks under data scarcity constraints for new environments (e.g., new Internet of Things (IoT) environments), thereby broadening the applications of neural networks.

FIG. 1 illustrates an exemplary environment 100 for adapting a neural network to a dynamic environment, according to some embodiments of the present disclosure. The environment 100 includes a neural network 102 and a system 104. An input data item 101 may be provided to the neural network 102. The input data item 101 includes one of an image, a video, an audio input, a text input, and the like. The input data item 101 may be received from a source such as a capture unit, a voice recorder, or a database. The system 104 receives a first set of features 103 of the input data item 101, associated with a first portion of the input data item 101. For example, if the input data item 101 is an image, the first portion may be the background of the image. In one embodiment, the first set of features 103 may be received from the neural network 102. The neural network 102, also known as an artificial neural network, is a computing system having interconnected nodes. The neural network 102 recognizes hidden patterns and correlations in data, domains, and classifications over time, learning and improving continuously. The neural network 102 comprises at least one of a convolutional neural network (CNN), a recurrent neural network (RNN), a long short-term memory (LSTM) network, a recursive neural network, a graph convolutional network, a sequential neural network, an encoder-decoder network, and combinations thereof. Those skilled in the art will understand that the present disclosure is also applicable to neural networks other than those listed above.

The system 104 may determine a second set of features 108 of the input data item 101, associated with a second portion of the input data item 101, by associating the first set of features 103 with one or more correlation pairs. For example, if the input data item 101 is an image, the second portion may be the foreground of the image. As used herein, the foreground of the input data item 101 refers to the portion of the input data item 101 that contains the objects of interest (or the audio features, if the input data item 101 includes an audio input) relevant to a given application of the neural network 102. The background of the input data item 101 refers to the portion of the input data item 101 (or the audio features) that does not contain objects of interest or is not relevant to the given application of the neural network 102.
The correlation pairs are generated, using the neural network 102, by receiving a feature set consisting of a first set of training features and a second set of training features for each dataset of a plurality of training data items. For example, the first set of training features may be background training features and the second set of training features may be foreground training features. Each dataset of the plurality of training data items may consist of an original data item and one or more transformations of the original data item. The first set of training features and the second set of training features are projected onto a feature space to generate first labels and second labels. The two sets are then rearranged on the feature space by correlating the first set of training features with the second set of training features. Based on the rearrangement, one or more correlation pairs are generated, each indicating a first label (e.g., a background label) and the corresponding second label (e.g., a foreground label). When a new input data item 101 having new features (e.g., background features) is received, the neural network 102 can determine the foreground features of the input data item 101 and thereby adapt to a different environment. The neural network 102 can be implemented in applications such as object detection, object tracking, and face recognition.
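The inference step described above can be sketched as a lookup: a new background feature is matched to the closest known background label, and the correlation pair for that label yields the corresponding foreground label. This is only a minimal illustration under stated assumptions. The label names, the centroid vectors, the use of cosine similarity, and the function names `nearest_label` and `determine_foreground_label` are all hypothetical; the disclosure does not specify how the association is computed.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def nearest_label(feature, centroids):
    """Return the label whose centroid is most similar to the feature."""
    return max(centroids, key=lambda label: cosine_similarity(feature, centroids[label]))

# Hypothetical background-label centroids learned during training.
background_centroids = {
    "bg_assembly_line": [1.0, 0.1],
    "bg_station": [0.1, 1.0],
}

# Hypothetical correlation pairs: background label -> foreground label.
correlation_pairs = {
    "bg_assembly_line": "fg_worker",
    "bg_station": "fg_train",
}

def determine_foreground_label(background_feature):
    """Map a background feature to its (background, foreground) label pair."""
    bg_label = nearest_label(background_feature, background_centroids)
    return bg_label, correlation_pairs[bg_label]

print(determine_foreground_label([0.9, 0.2]))
```

A dictionary keeps the pairing explicit; a trained model could replace both the centroid search and the lookup.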

The system 104 may include an input/output (I/O) interface 106, a memory 107, and a central processing unit 105 (also referred to as the "CPU" or the "one or more processors 105"). The one or more processors 105 may include at least one data processor for executing program components that carry out user-generated or system-generated requests. In some embodiments, the memory 107 may be communicatively coupled to the one or more processors 105. The memory 107 stores instructions executable by the one or more processors 105 which, when executed, may cause the one or more processors 105 to adapt the neural network 102 to a dynamic environment. The I/O interface 106 is coupled to the one or more processors 105, and input signals and/or output signals are communicated through it. For example, the input data item 101 may be received from a user via the I/O interface 106. In one embodiment, the system 104 may be implemented in a variety of computing systems, such as a laptop computer, a desktop computer, a personal computer (PC), a notebook, a smartphone, a tablet, a server, a network server, or a cloud-based server.

FIG. 2 shows a detailed diagram 200 of the system 104 for adapting the neural network 102 to different environments, according to some embodiments of the present disclosure. In one embodiment, the memory 107 may include one or more modules 202 and data 201. The one or more modules 202 may be configured to perform the steps of the present disclosure using the data 201 in order to adapt the neural network 102 to a dynamic environment. In one embodiment, each of the one or more modules 202 may be a hardware unit that is external to the memory 107 and coupled with the system 104. As used herein, the term module 202 refers to an application-specific integrated circuit (ASIC), an electronic circuit, a field-programmable gate array (FPGA), a programmable system-on-chip (PSoC), a combinational logic circuit, and/or other suitable components that provide the described functionality. The one or more modules 202, when configured with the functionality defined in this disclosure, result in novel hardware.

In one embodiment, the modules 202 may include, for example, an input module 209, a label generation module 210, a rearrangement module 211, a correlation module 212, a decision module 213, and other modules 214. It will be appreciated that the modules 202 may be represented as a single module or as a combination of different modules. In one implementation, the data 201 may include, for example, input data 203, label data 204, rearrangement data 205, correlation data 206, decision data 207, and other data 208.

In one embodiment, the input module 209 may be configured to receive a feature set for each dataset of a plurality of training data items. The plurality of training data items may include one of an image, a video, an audio input, a text input, and the like. Each dataset of the plurality of training data items consists of an original data item and one or more transformations of the original data item. The one or more transformations may include at least one of a rotation, a translation, a scaling, and the like. For example, a dataset may consist of an original image, a rotated version of the original image, a scaled version of the original image, and so on. The feature set may consist of a first set of training features and a second set of training features for each dataset of the plurality of training data items. The first set of training features may be associated with a first portion of each of the plurality of training data items, and the second set of training features may be associated with a second portion of each of the plurality of training data items. For example, consider a training data item that is an image. In that case, the first portion of the training data item may be the background of the image, while the second portion may be the foreground of the image. In one example, the image shows a train in the foreground and people in a railway station in the background. In another example, the image shows workers in the foreground and an assembly line environment in the background.
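To make the notion of a dataset concrete, the following sketch builds one dataset from an original image and three transformations of it (rotation, translation, scaling). It is an illustration only: images are modelled as small 2-D lists of pixel values, and the helper names `rotate90`, `translate`, `scale2x`, and `build_dataset`, as well as the specific transformation parameters, are assumptions that do not appear in the disclosure.

```python
def rotate90(image):
    """Rotate a 2-D grid of pixels 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]

def translate(image, dx, fill=0):
    """Shift every row right by dx pixels, padding the left edge with fill."""
    width = len(image[0])
    return [([fill] * dx + list(row))[:width] for row in image]

def scale2x(image):
    """Upscale by a factor of two using nearest-neighbour replication."""
    scaled = []
    for row in image:
        wide = [pixel for pixel in row for _ in range(2)]
        scaled.append(wide)
        scaled.append(list(wide))
    return scaled

def build_dataset(original):
    """Return the original data item followed by its transformations."""
    return [original, rotate90(original), translate(original, 1), scale2x(original)]

original_image = [[1, 2],
                  [3, 4]]
for item in build_dataset(original_image):
    print(item)
```

In practice an image library would supply arbitrary-angle rotation and interpolation; the fixed 90-degree rotation and 2x scaling here simply keep the example dependency-free.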

In one embodiment, during training of the neural network 102, the input module 209 may receive the first set of training features and the second set of training features from one or more neural networks 102. For example, a first neural network may be used to extract the first set of training features, associated with the first portion of each dataset of the plurality of training data items. A second neural network may be used to extract the second set of training features, associated with the second portion of each dataset of the plurality of training data items. The one or more neural networks 102 may be trained or untrained neural networks. In one embodiment, the plurality of training data items may be provided to the input module 209, which may generate the first portion and the second portion of each of the plurality of training data items. For example, the input module 209 may use a background subtraction technique to obtain the foreground and background of each of the plurality of training data items. Those skilled in the art will appreciate that techniques other than the one described above may also be used to obtain the first and second portions. In another embodiment, the input module 209 may receive the first portion and the second portion of each of the plurality of training data items, for example from a user or a database.
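As a rough illustration of the background subtraction mentioned above, the sketch below compares a frame against a reference image of the empty scene and splits the pixels into a foreground portion and a background portion. The disclosure does not say which background subtraction technique is used; the fixed reference image, the threshold value, and the function name `split_foreground_background` are assumptions made for this example.

```python
def split_foreground_background(frame, reference, threshold=30):
    """Split a grayscale frame into foreground and background portions.

    A pixel is treated as foreground when it differs from the reference
    (empty-scene) image by more than the threshold. Pixels that do not
    belong to a portion are set to 0 in that portion.
    """
    foreground, background = [], []
    for frame_row, ref_row in zip(frame, reference):
        fg_row, bg_row = [], []
        for pixel, ref in zip(frame_row, ref_row):
            is_foreground = abs(pixel - ref) > threshold
            fg_row.append(pixel if is_foreground else 0)
            bg_row.append(0 if is_foreground else pixel)
        foreground.append(fg_row)
        background.append(bg_row)
    return foreground, background

reference_image = [[10, 10],
                   [10, 10]]
frame = [[12, 200],
         [9, 180]]
fg, bg = split_foreground_background(frame, reference_image)
print(fg)
print(bg)
```

The two returned portions could then be passed to separate feature extractors, one for the background and one for the foreground.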

Referring to the example 300 of FIG. 3A, a first portion 302 and a second portion 301 of each of the plurality of training data items are illustrated. In this example, each of the plurality of training data items is an image. The first portion 302 is the background of the image, and the second portion 301 is the foreground of the image. In addition, one or more transformations 303 of each of the plurality of training data items are generated. Next, the one or more neural networks 102 (neural network 102_1 and neural network 102_2) can extract the first set of features and the second set of features of each dataset of the plurality of training data items. For example, the neural network 102_1 generates the second set of features 304 (e.g., foreground features), and the neural network 102_2 generates the first set of features 305 (e.g., background features). In one example, the neural network 102 may be trained for multiple classes of each of the plurality of training data items, based on the tasks performed by the neural network 102 in different applications. The first set of features 305 and the second set of features 304 corresponding to each class may be extracted. The feature set may be stored in the memory 107 as the input data 203.

In one embodiment, the label generation module 210 may be configured to receive the input data 203 from the input module 209. Furthermore, the label generation module 210 may be configured to generate one or more first labels and one or more second labels for the data set of each of the plurality of training data items. The label generation module 210 generates the one or more first labels (e.g., background labels) and the one or more second labels (e.g., foreground labels) by projecting the first set of training features and the second set of training features onto a feature space. First, the label generation module 210 projects, onto the feature space, the first set of training features and the second set of training features for each of one or more classes of the second portion of a training data item from the plurality of training data items. For example, the training data item may be an image. The one or more classes of the second portion of the image (e.g., the foreground) may include object identification, object recognition, and the like. In one example, the neural network 102 may be implemented in pedestrian classification, in which case the one or more classes may include pedestrian identification, pedestrian recognition, and the like. In another example, the neural network 102 may be implemented in a monitoring application for workplace safety, and the one or more classes may include worker identification, safety equipment compliance detection, and the like. In one embodiment, the feature space may be a hypersphere. Those skilled in the art will appreciate that the first and second sets of training features may be projected onto any feature space other than the one mentioned above. Referring to 306 in FIG. 3B, triangles indicate training features from the second set of training features projected onto the feature space 308. Similarly, stars indicate training features from the first set of training features projected onto the feature space 308. Next, the label generation module 210 creates a plurality of regions based on the similarity between the first set of training features and the second set of training features. Referring to the example of FIG. 3B, a plurality of regions 307₁₁, 307₁₂, 307₂₁, 307₂₂, 307₃₁, and 307₃₂ are created. If the feature space is a hypersphere, the plurality of regions are spheres. In other examples, the plurality of regions may include squares, ellipses, and the like. Reference numeral 309 denotes a curved hyperplane that separates the first set of training features from the second set of training features.
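The projection onto a hypersphere and the similarity-based creation of regions can be illustrated roughly as follows. This is only a sketch under assumptions: L2 normalization is used as the projection, cosine similarity as the similarity measure, and a greedy grouping with a hypothetical `min_similarity` threshold stands in for whatever region-creation procedure an actual embodiment uses.

```python
import math
from typing import Dict, List, Sequence

Vector = List[float]


def project_to_hypersphere(feature: Sequence[float]) -> Vector:
    """L2-normalize a feature so that it lies on the unit hypersphere."""
    norm = math.sqrt(sum(x * x for x in feature))
    if norm == 0.0:
        raise ValueError("cannot project a zero vector")
    return [x / norm for x in feature]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two unit vectors, i.e. their dot product."""
    return sum(x * y for x, y in zip(a, b))


def group_into_regions(
    features: List[Vector], min_similarity: float = 0.9
) -> Dict[int, List[int]]:
    """Greedy grouping (an assumption): a feature joins the first region whose
    seed it resembles closely enough; otherwise it seeds a new region."""
    seeds: List[Vector] = []
    regions: Dict[int, List[int]] = {}
    for index, feature in enumerate(features):
        point = project_to_hypersphere(feature)
        for region_id, seed in enumerate(seeds):
            if cosine_similarity(point, seed) >= min_similarity:
                regions[region_id].append(index)
                break
        else:  # no existing region was similar enough
            seeds.append(point)
            regions[len(seeds) - 1] = [index]
    return regions


# Hypothetical 2-D features: two point roughly along x, two roughly along y.
features = [[1.0, 0.0], [2.0, 0.1], [0.0, 3.0], [0.1, 1.0]]
regions = group_into_regions(features)
```

Each resulting group maps a region identifier to the indices of the training features that fall inside it; a label could then be attached per region.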

The label generation module 210 may identify a center and a radius of each of the plurality of regions. Next, the label generation module 210 adjusts the radius of each of the plurality of regions by projecting, onto the feature space, the first set of training features and the second set of training features for each of the one or more classes of the second portion of the other training data items from the plurality of training data items. If the feature space is a hypersphere, the label generation module 210 adjusts the radius based on the following equation (1):

where "W_pxk" is a center matrix, "R_pxk" is a radius matrix, "l_mxn" denotes the first and second sets of training features, and "Dist" is the distance between points (the first set of training features and the second set of training features) in the feature space. The distance may be calculated using the Euclidean distance, the cosine distance, and the like. The label generation module 210 generates a label for each of the plurality of regions based on the similarity of the first set of training features and the second set of training features associated with the corresponding region. For example, the label "pedestrian wearing a yellow shirt" may be generated for region 307₁₁. The number of regions is based on the number of the one or more transformations 303 of each of the plurality of training data items and the number of the one or more classes. In one embodiment, the number of regions is the product of the number of the one or more classes and the number of the one or more transformations 303. The one or more first labels and the one or more second labels may be stored in the memory 107 as label data 204.
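The distance measures and the region count described above can be sketched as follows. This sketch does not reproduce equation (1); as a stand-in assumption, `fit_region` takes the center of a region as the mean of its member features and the radius as the largest distance from that center to any member. All function names are hypothetical.

```python
import math
from typing import Callable, List, Sequence, Tuple


def euclidean_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Straight-line distance between two points in the feature space."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """One minus the cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def fit_region(
    points: List[Sequence[float]],
    dist: Callable[[Sequence[float], Sequence[float]], float] = euclidean_distance,
) -> Tuple[List[float], float]:
    """Return (center, radius): the mean point and the largest member distance.

    This is an illustrative rule only, not equation (1) of the disclosure.
    """
    dim = len(points[0])
    center = [sum(p[i] for p in points) / len(points) for i in range(dim)]
    radius = max(dist(center, p) for p in points)
    return center, radius


def region_count(num_classes: int, num_transformations: int) -> int:
    """Number of regions as the product of classes and transformations."""
    return num_classes * num_transformations


center, radius = fit_region([[0.0, 0.0], [2.0, 0.0], [1.0, 3.0]])
```

Passing `cosine_distance` as `dist` would switch the radius computation to the cosine measure without changing the rest of the sketch.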

In one embodiment, the rearrangement module 211 may be configured to receive the input data 203 from the input module 209 and the label data 204 from the label generation module 210. Furthermore, the rearrangement module 211 may be configured to rearrange the first set of training features and the second set of training features on the feature space based on a correlation between the first set of training features and the second set of training features. In particular, the rearrangement module 211 correlates the one or more second labels with the one or more first labels by providing the one or more first labels to a first neural network, from among the one or more neural networks 102, that is associated with the generation of the second set of training features. Furthermore, the one or more second labels are provided to a second neural network, from among the one or more neural networks 102, that is associated with the generation of the first set of training features. The correlation may be performed using a swapped labeling technique. Those skilled in the art will understand that any technique other than the one mentioned above may be used to correlate the one or more second labels with the one or more first labels. The rearrangement module 211 may rearrange the first set of training features and the second set of training features on the feature space based on the correlation. In one example, the rearrangement may include moving a region containing the first set of training features relative to a region containing the second set of training features. Referring to the example of FIG. 3B, region 307₁₂ may be moved closer to region 307₁₁ based on the correlation. Data related to the rearranged first set of training features and second set of training features may be stored in the memory 107 as rearrangement data 205.
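The label swap and the movement of one region toward its correlated partner can be pictured with the following sketch. It is illustrative only: the dictionary keys, the function names, the example labels, and the linear interpolation with a `step` fraction are assumptions rather than details taken from the disclosure.

```python
from typing import Dict, List, Sequence


def swap_labels(
    first_labels: Sequence[str], second_labels: Sequence[str]
) -> Dict[str, List[str]]:
    """Assign each network the labels derived from the other feature set."""
    return {
        # The network that produced the second (foreground) features
        # receives the first (background) labels.
        "foreground_network_targets": list(first_labels),
        # The network that produced the first (background) features
        # receives the second (foreground) labels.
        "background_network_targets": list(second_labels),
    }


def move_region_toward(
    center: Sequence[float], partner: Sequence[float], step: float = 0.5
) -> List[float]:
    """Shift a region center a fraction `step` of the way to its partner."""
    return [c + step * (p - c) for c, p in zip(center, partner)]


# Hypothetical labels and region centers.
targets = swap_labels(["assembly line"], ["worker wearing helmet"])
moved = move_region_toward([0.0, 0.0], [1.0, 2.0], step=0.5)
```

With `step=1.0` the region would coincide with its partner, and with `step=0.0` it would stay in place, so the parameter controls how strongly correlated regions are pulled together.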

In one embodiment, the correlation module 212 may be configured to receive the rearrangement data 205 from the rearrangement module 211. Furthermore, the correlation module 212 may be configured to generate one or more correlation pairs based on the rearrangement of the first set of training features and the second set of training features. Each of the one or more correlation pairs may indicate one or more first labels and the corresponding second labels of the data set of each of the plurality of training data items. Referring to the example of FIG. 3B, a first correlation pair includes region 307₁₁ and region 307₁₂, a second correlation pair includes region 307₂₁ and region 307₂₂, and a third correlation pair includes region 307₃₁ and region 307₃₂. In one example, a correlation pair may indicate a background label and the corresponding foreground label of the data set of each of a plurality of images. The one or more correlation pairs may be stored in the memory 107 as correlation data 206.
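One way to picture the generation of correlation pairs is to pair each background region with the foreground region that lies closest to it after rearrangement. The nearest-center rule, the function name `build_correlation_pairs`, and the example labels below are assumptions made for illustration.

```python
import math
from typing import Dict, List, Tuple


def build_correlation_pairs(
    background_centers: Dict[str, List[float]],
    foreground_centers: Dict[str, List[float]],
) -> List[Tuple[str, str]]:
    """Pair every background label with its nearest foreground label."""
    pairs: List[Tuple[str, str]] = []
    for bg_label, bg_center in background_centers.items():
        fg_label = min(
            foreground_centers,
            key=lambda name: math.dist(bg_center, foreground_centers[name]),
        )
        pairs.append((bg_label, fg_label))
    return pairs


# Hypothetical region centers after rearrangement.
background = {"assembly line": [0.0, 1.0], "production line": [1.0, 0.0]}
foreground = {
    "worker with helmet": [0.1, 0.9],
    "worker without helmet": [0.9, 0.2],
}
pairs = build_correlation_pairs(background, foreground)
```

Each tuple holds a first (background) label and its corresponding second (foreground) label, which is the information a correlation pair is described as indicating.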

In one embodiment, during deployment, the input module 209 is further configured to receive a first set of features 103 of an input data item 101. The input data item 101 may include one of an image, a video, an audio input, a text input, a voice input, and the like. The first set of features 103 may be associated with a first portion of the input data item 101. For example, the input data item 101 may be an image, and the first portion of the input data item 101 may be the background of the image. In one example, the background of the image may include a train in an open area. In another example, the input data item 101 may be written text, and the background may include blue paper. In another example, the background includes a production line environment, whereas the neural network may have been trained in an assembly line environment. In one embodiment, the input module 209 may receive the first set of features 103 from the neural network 102. In another embodiment, the input module 209 may receive the first set of features 103 from one or more sources such as a database, a user, a capture unit, an audio recorder, and the like. The first set of features 103 may be stored in the memory 107 as input data 203.

In one embodiment, the determination module 213 is configured to receive the input data 203 from the input module 209. Furthermore, the determination module 213 is configured to determine a second set of features 108 of the input data item 101. The second set of features 108 is associated with a second portion of the input data item 101, e.g., the foreground of an image. The determination module 213 may determine the second set of features 108 by associating the first set of features 103 with the one or more correlation pairs. The determination module 213 may associate the first set of features 103 with the one or more correlation pairs by generating a transformation matrix Wt 310, as shown in FIG. 3C. In one embodiment, the transformation matrix Wt may be generated using a Riemannian conditioning technique. The determination module 213 may associate the first set of features 103 with the one or more correlation pairs using equation (2), given below:

Here, Arccos denotes the inverse of the cosine function. Those skilled in the art will understand that any inverse trigonometric function other than the one mentioned above may be used to associate the first set of features 103 with the one or more correlation pairs. The second set of features 108 may be stored in the memory 107 as determination data 207. Referring to the above example, the neural network 102 may have been trained to identify a train, or objects around the train, when there are people in the background. The present disclosure enables the neural network 102 to identify the train, or objects around the train, when the train is in an open area. The present disclosure thus overcomes background variations (e.g., sunlight in an open area) in the predictions made by the neural network 102 in different environments, and thereby improves the performance and scalability of the neural network 102.
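The role of the arccosine in associating a first set of features with a correlation pair can be sketched as an angular nearest-neighbor lookup. This does not reproduce equation (2) or the Riemannian conditioning technique: the transformation matrix is simply supplied by the caller (an identity matrix in the example), and the names `angle_between`, `apply_matrix`, and `determine_second_features` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

# (background region center, foreground region center); an assumed representation
Pair = Tuple[List[float], List[float]]


def angle_between(a: Sequence[float], b: Sequence[float]) -> float:
    """Angle in radians between two non-zero vectors, via arccos."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    cosine = max(-1.0, min(1.0, dot / norm))  # clamp against rounding error
    return math.acos(cosine)


def apply_matrix(matrix: List[List[float]], vector: Sequence[float]) -> List[float]:
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def determine_second_features(
    first_features: Sequence[float],
    pairs: List[Pair],
    transform: List[List[float]],
) -> List[float]:
    """Map the background features with `transform`, pick the correlation pair
    whose background center is angularly closest, and return its foreground."""
    mapped = apply_matrix(transform, first_features)
    best = min(pairs, key=lambda pair: angle_between(mapped, pair[0]))
    return best[1]


identity = [[1.0, 0.0], [0.0, 1.0]]  # placeholder for the transformation matrix
pairs = [([1.0, 0.0], [0.2, 0.8]), ([0.0, 1.0], [0.7, 0.3])]
second = determine_second_features([0.1, 0.9], pairs, identity)
```

The clamp before `math.acos` guards against cosine values that drift slightly outside [-1, 1] through floating-point rounding, which would otherwise raise a `ValueError`.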

The other data 208 may include temporary data and temporary files generated by the one or more modules 202 for performing various functions of the system 104. The other data 208 may be stored in the memory 107. The one or more modules 202 may also include other modules 214 for performing various miscellaneous functions of the system 104. For example, the other modules 214 may include a test module. The test module may be configured to generate one or more test features based on the second set of training features and the second set of features 108. Furthermore, the test module may be configured to identify a deviation by comparing the one or more test features with the first set of features 103. The test module identifies the deviation in order to determine whether the correct features (background features) associated with the first portion of the input data item 101 are predicted. The test module may then be configured to modify the one or more test features in response to identifying the deviation. Furthermore, the test module may be configured to modify the second set of features 108, for which purpose the test module performs an inverse transformation. It will be appreciated that the one or more modules 202 may be represented as a single module or as a combination of different modules.

FIG. 4 is an exemplary flowchart illustrating method steps for training the neural network 102 to adapt to a dynamic environment, according to some embodiments of the present disclosure. As shown in FIG. 4, the method 400 may comprise one or more steps. The method 400 may be described in the general context of computer-executable instructions. Generally, computer-executable instructions may include routines, programs, objects, components, data structures, procedures, modules, and functions that perform particular functions or implement particular abstract data types.

The order in which the method 400 is described is not intended to be construed as a limitation, and any number of the described method blocks may be combined in any order to implement the method. Additionally, individual blocks may be deleted from the method without departing from the scope of the subject matter described herein. Furthermore, the method may be implemented in any suitable hardware, software, firmware, or combination thereof.

In step 401, the system 104 receives a first set of features 103 of an input data item 101. The input data item 101 may consist of one of an image, a video, an audio input, a text input, a voice input, and the like. The first set of features 103 is associated with a first portion of the input data item 101. In one embodiment, the system 104 may receive the first set of features 103 from the neural network 102. In another embodiment, the system 104 may receive the first set of features 103 from one or more sources such as a database, a user, a capture unit, an audio recorder, and the like.

In step 402, the system 104 generates one or more first labels and one or more second labels for the data set of each of the plurality of training data items. The system 104 generates the one or more first labels and the one or more second labels by projecting the first set of training features and the second set of training features onto a feature space. The system 104 projects, onto the feature space, the first set of training features and the second set of training features for each of one or more classes of the second portion of a training data item from the plurality of training data items. Furthermore, the system 104 creates a plurality of regions based on the similarity between the first set of training features and the second set of training features. Next, the system 104 identifies a center and a radius of each of the plurality of regions. The system 104 adjusts the radius of each of the plurality of regions by projecting, onto the feature space, the first set of training features and the second set of training features for each of the one or more classes of the second portion of the other training data items from the plurality of training data items. Furthermore, the system 104 generates a label for each of the plurality of regions based on the similarity of the first set of training features and the second set of training features associated with the corresponding region.

In step 403, the system 104 rearranges the first set of training features and the second set of training features on the feature space based on a correlation between the first set of training features and the second set of training features. The system 104 correlates the one or more second labels with the one or more first labels by providing the one or more first labels to a first neural network, from among the one or more neural networks 102, that is associated with the generation of the second set of training features. Furthermore, the one or more second labels are provided to a second neural network, from among the one or more neural networks 102, that is associated with the generation of the first set of training features.

In step 404, the system 104 generates one or more correlation pairs. The system 104 may generate the one or more correlation pairs based on the rearrangement of the first set of training features and the second set of training features. Each of the one or more correlation pairs may indicate one or more first labels and the corresponding second labels of the data set of each of the plurality of training data items.

FIG. 5 is an exemplary flowchart illustrating method steps for adapting the neural network 102 to a dynamic environment, according to some embodiments of the present disclosure. As shown in FIG. 5, the method 500 may comprise one or more steps. The method 500 may be described in the general context of computer-executable instructions. Generally, computer-executable instructions may include routines, programs, objects, components, data structures, procedures, modules, and functions that perform particular functions or implement particular abstract data types.

The order in which the method 500 is described is not intended to be construed as a limitation, and any number of the described method blocks may be combined in any order to implement the method. Additionally, individual blocks may be deleted from the method without departing from the scope of the subject matter described herein. Furthermore, the method may be implemented in any suitable hardware, software, firmware, or combination thereof.

In step 501, the system 104 receives a first set of features 103 of an input data item 101. The input data item 101 may consist of one of an image, a video, an audio input, a text input, a voice input, and the like. The first set of features 103 is associated with a first portion of the input data item 101. In one embodiment, the system 104 may receive the first set of features 103 from the neural network 102. In another embodiment, the system 104 may receive the first set of features 103 from one or more sources such as a database, a user, a capture unit, an audio recorder, and the like.

In step 502, the system 104 determines a second set of features 108 of the input data item 101. The second set of features 108 is associated with a second portion of the input data item 101. The system 104 determines the second set of features 108 by associating the first set of features 103 with one or more correlation pairs. The system 104 may associate the first set of features 103 with the one or more correlation pairs by generating a transformation matrix Wt using a Riemannian conditioning technique. Those skilled in the art will understand that any technique other than the one described above may be used to associate the first set of features 103 with the one or more correlation pairs.

The embodiments described above and the corresponding figures treat the data items (the plurality of training data items and the input data item 101) as images. Accordingly, the first set of features, the second set of features, the first set of training features, the second set of training features, the first labels, the second labels, and so on correspond to the foreground and background of an image. The present invention is not limited to images, and the first set of features and the second set of features may vary according to the type of the input data item 101. For example, the input data item 101 may include a voice input for a speech recognition application of the neural network. In such a case, the first set of features may consist of background features of the voice input, such as the person's accent, and the second set of features may consist of foreground features of the voice input, such as the person's voice. In another example, the input data item 101 may include a voice input in an accent classification application of the neural network. In such a case, the first set of features may consist of background features of the voice input, such as the person's voice, and the second set of features may consist of foreground features of the voice input, such as the person's accent.

<Computer System>
FIG. 6 is a block diagram of an exemplary computer system 600 for implementing embodiments consistent with the present disclosure. In an embodiment, the computer system 600 may be the system 104. Thus, the computer system 600 may be used to adapt the neural network 102 to a dynamic environment. The computer system 600 may include a central processing unit 602 (also referred to as a "CPU" or "processor"). The processor 602 may include at least one data processor. The processor 602 may include specialized processing units such as an integrated system (bus) controller, a memory management control unit, a floating point unit, a graphics processing unit, a digital signal processing unit, and the like. The processor 602 may be used to implement the processor 105 described with reference to FIG. 2.

The processor 602 may be disposed in communication with one or more input/output (I/O) devices (not shown) via an I/O interface 601. The I/O interface 601 may employ communication protocols/methods such as, without limitation, audio, analog, digital, monaural, RCA, stereo, IEEE (Institute of Electrical and Electronics Engineers)-1394, serial bus, universal serial bus (USB), infrared, PS/2, BNC, coaxial, component, composite, digital visual interface (DVI), high-definition multimedia interface (HDMI), radio frequency (RF) antennas, S-Video, VGA, IEEE 802.11a/b/g/n/x, Bluetooth, cellular (e.g., code-division multiple access (CDMA), high-speed packet access (HSPA+), global system for mobile communications (GSM), long-term evolution (LTE), WiMax, and the like), and the like.

Using the I/O interface 601, the computer system 600 may communicate with one or more I/O devices. For example, the input device 610 may be an antenna, a keyboard, a mouse, a joystick, an (infrared) remote control, a camera, a card reader, a fax machine, a dongle, a biometric reader, a microphone, a touch screen, a touchpad, a trackball, a stylus, a scanner, a storage device, a transceiver, a video device/source, and the like. The output device 611 may be a printer, a fax machine, a video display (e.g., a cathode ray tube (CRT), a liquid crystal display (LCD), a light-emitting diode (LED) display, a plasma display, a plasma display panel (PDP), an organic light-emitting diode (OLED) display, and the like), an audio speaker, and the like.

The processor 602 may be disposed in communication with a communication network 609 via a network interface 603. The network interface 603 may communicate with the communication network 609. The network interface 603 may employ connection protocols including, without limitation, direct connect, Ethernet (e.g., twisted pair 10/100/1000 Base-T), transmission control protocol/internet protocol (TCP/IP), token ring, IEEE 802.11a/b/g/n/x, and the like. The communication network 609 may include, without limitation, a direct interconnection, a local area network (LAN), a wide area network (WAN), a wireless network (e.g., using Wireless Application Protocol), the Internet, and the like.

The communication network 609 may include, but is not limited to, a direct interconnection, an e-commerce network, a peer-to-peer (P2P) network, a local area network (LAN), a wide area network (WAN), a wireless network (e.g., using Wireless Application Protocol), the Internet, Wi-Fi, and the like. The first network and the second network may each be a dedicated network or a shared network, the latter representing an association of different types of networks that communicate with each other using various protocols, for example, Hypertext Transfer Protocol (HTTP), Transmission Control Protocol/Internet Protocol (TCP/IP), Wireless Application Protocol (WAP), and the like. Further, the first network and the second network may include a variety of network devices, including routers, bridges, servers, computing devices, storage devices, and the like. The computer system 600 may receive data items 612 (the input data item and the plurality of training data items) via the communication network 609.

In some embodiments, the processor 602 may be disposed in communication with the memory 605 (e.g., RAM, ROM, etc., not shown in FIG. 6) via the storage interface 604. The storage interface 604 may connect to the memory 605, which includes, without limitation, memory drives, removable disk drives, etc., employing connection protocols such as Serial Advanced Technology Attachment (SATA), Integrated Drive Electronics (IDE), IEEE-1394, Universal Serial Bus (USB), Fibre Channel, Small Computer System Interface (SCSI), etc. The memory drives may further include a drum, magnetic disk drive, magneto-optical drive, optical drive, Redundant Array of Independent Disks (RAID), solid-state memory devices, solid-state drives, etc.

The memory 605 may store a collection of program or database components, including, without limitation, a user interface 606, an operating system 607, a web browser 608, and the like. In some embodiments, the computer system 600 may store user/application data, such as the data, variables, and records described in this disclosure. Such databases may be implemented as fault-tolerant, relational, scalable, secure databases such as Oracle® or Sybase®. The memory 605 may be used to implement the memory 107 described with reference to FIG. 2. The memory 605 may be communicatively coupled to the processor 602. The memory 605 may store instructions executable by the one or more processors 602 that, when executed, cause the processor 602 to adapt the neural network 102 to a dynamic environment.

The operating system 607 may facilitate resource management and operation of the computer system 600. Examples of operating systems include, without limitation, APPLE® MACINTOSH® OS X, UNIX®, UNIX-like system distributions (e.g., BERKELEY SOFTWARE DISTRIBUTION™ (BSD), FREEBSD™, NETBSD™, OPENBSD™, etc.), LINUX DISTRIBUTIONS™ (e.g., RED HAT™, UBUNTU™, KUBUNTU™, etc.), IBM™ OS/2, MICROSOFT™ WINDOWS™ (XP™, VISTA™/7/8, 10, etc.), APPLE® IOS™, GOOGLE® ANDROID™, BLACKBERRY® OS, and the like.

In some embodiments, the computer system 600 may implement a web browser 608 stored program component. The web browser 608 may be a hypertext viewing application, for example, MICROSOFT® INTERNET EXPLORER™, GOOGLE® CHROME™, MOZILLA® FIREFOX™, APPLE® SAFARI™, etc. Secure web browsing may be provided using Secure Hypertext Transport Protocol (HTTPS), Secure Sockets Layer (SSL), Transport Layer Security (TLS), etc. The web browser 608 may utilize facilities such as AJAX™, DHTML™, ADOBE® FLASH™, JAVASCRIPT™, JAVA™, application programming interfaces (APIs), etc. In some embodiments, the computer system 600 may implement a mail server (not shown) stored program component. The mail server may be an Internet mail server such as Microsoft Exchange. The mail server may utilize facilities such as ASP™, ACTIVEX™, ANSI™ C++/C#, MICROSOFT® .NET™, CGI SCRIPTS™, JAVA™, JAVASCRIPT™, PERL™, PHP™, PYTHON™, WEBOBJECTS™, etc. The mail server may utilize communication protocols such as Internet Message Access Protocol (IMAP), Messaging Application Programming Interface (MAPI), MICROSOFT® Exchange, Post Office Protocol (POP), Simple Mail Transfer Protocol (SMTP), etc. In some embodiments, the computer system 600 may implement a mail client stored program component. The mail client (not shown) may be a mail viewing application such as APPLE® MAIL™, MICROSOFT® ENTOURAGE™, MICROSOFT® OUTLOOK™, MOZILLA® THUNDERBIRD™, etc.

Furthermore, one or more computer-readable storage media may be utilized in implementing embodiments consistent with the present disclosure. A computer-readable storage medium refers to any type of physical memory on which information or data readable by a processor may be stored. Thus, a computer-readable storage medium may store instructions for execution by one or more processors, including instructions for causing the processor(s) to perform steps or stages consistent with the embodiments described herein. The term "computer-readable medium" should be understood to include tangible items and to exclude carrier waves and transient signals, i.e., to be non-transitory. Examples include random access memory (RAM), read-only memory (ROM), volatile memory, non-volatile memory, hard drives, compact disc read-only memory (CD-ROM), digital video discs (DVDs), flash drives, disks, and any other known physical storage media.

The present disclosure provides a method and system for adapting a neural network to different dynamic environments. The neural network is trained by providing a first set of training features (e.g., background features) and a second set of training features (e.g., foreground features) of a plurality of training data items. The neural network correlates the first set of features with the second set of features to form correlation pairs, each pairing a first label (e.g., a background label) with a second label (e.g., a foreground label). When the neural network receives the first set of features of an input data item (e.g., background features), it determines the second set of features of the input data item (e.g., foreground features) based on the correlation pairs. The neural network can thus be adapted to new and different environments. The present disclosure can be used to develop neural networks under data-scarcity constraints for new environments (e.g., new Internet of Things (IoT) environments), thereby expanding the range of applications of neural networks.
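As an illustrative, non-limiting sketch of the inference step summarized above, the following Python fragment looks up a second (foreground) label from a first (background) feature vector through a table of correlation pairs. The centroid values, the label names, the two-dimensional feature space, and the nearest-centroid matching rule are all assumptions made for illustration; the disclosure does not specify them.

```python
import math

def nearest_label(features, centroids):
    """Return the label whose centroid is closest (Euclidean) to `features`."""
    return min(centroids, key=lambda label: math.dist(features, centroids[label]))

# Hypothetical first-label (background) centroids in a 2-D feature space.
background_centroids = {
    "assembly_line": (0.0, 0.0),
    "production_line": (5.0, 5.0),
}
# Hypothetical second-label (foreground) centroids.
foreground_centroids = {
    "worker_fg_A": (1.0, 2.0),
    "worker_fg_B": (6.0, 4.0),
}
# Correlation pairs (first label -> second label), assumed to come from training.
correlation_pairs = {
    "assembly_line": "worker_fg_A",
    "production_line": "worker_fg_B",
}

def determine_second_features(first_features):
    """Map background features to the paired foreground label and features."""
    first_label = nearest_label(first_features, background_centroids)
    second_label = correlation_pairs[first_label]
    return second_label, foreground_centroids[second_label]

label, feats = determine_second_features((4.2, 5.1))
print(label, feats)
```

In this sketch the input features lie closest to the hypothetical "production_line" background, so the paired foreground label and its features are returned without any retraining for the new environment.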

The terms "an embodiment", "embodiment", "embodiments", "one or more embodiments", "some embodiments", and "one embodiment" mean "one or more (but not all) embodiments of the invention(s)" unless expressly specified otherwise.

The terms "including", "comprising", "having", and variations thereof mean "including but not limited to" unless expressly specified otherwise.

An enumerated listing of items does not imply that any or all of the items are mutually exclusive, unless expressly specified otherwise. The terms "a", "an", and "the" mean "one or more" unless expressly specified otherwise.

A description of an embodiment with several components in communication with each other does not imply that all such components are required. On the contrary, a variety of optional components are described to illustrate the wide variety of possible embodiments of the invention.

When a single device or article is described herein, it will be readily apparent that more than one device/article (whether or not they cooperate) may be used in place of the single device/article. Similarly, where more than one device or article is described herein (whether or not they cooperate), it will be readily apparent that a single device/article may be used in place of the more than one device or article, or that a different number of devices/articles may be used instead of the shown number of devices or programs. The functionality and/or features of a device may alternatively be embodied by one or more other devices that are not explicitly described as having such functionality/features. Thus, other embodiments of the invention need not include the device itself.

The illustrated operations of FIGS. 4 and 5 show certain events occurring in a certain order. In alternative embodiments, certain operations may be performed in a different order, modified, or removed. Moreover, steps may be added to the above-described logic and still conform to the described embodiments. Further, the operations described herein may occur sequentially, or certain operations may be processed in parallel. In addition, operations may be performed by a single processing unit or by distributed processing units.

Finally, the language used in this specification has been selected principally for readability and instructional purposes, and may not have been selected to delineate or circumscribe the inventive subject matter. It is therefore intended that the scope of the invention be limited not by this detailed description, but rather by any claims that issue on an application based hereon. Accordingly, the disclosure of the embodiments of the invention is intended to be illustrative, but not limiting, of the scope of the invention, which is set forth in the following claims.

While various aspects and embodiments have been disclosed herein, other aspects and embodiments will be apparent to those skilled in the art. The various aspects and embodiments disclosed herein are for purposes of illustration and are not intended to be limiting, with the true scope being indicated by the following claims.

100: environment, 101: input data item, 102: neural network, 103: first set of features, 104: system, 105: processor, 106: I/O interface, 107: memory, 108: second set of features, 200: detailed view, 201: data, 202: modules, 203: input data, 204: label data, 205: rearrangement data, 206: correlation data, 207: determination data, 208: other data, 209: input module, 210: label generation module, 211: rearrangement module, 212: correlation module, 213: determination module, 214: other modules, 300: example, 301: first portion, 302: second portion, 303: transformation, 304: second set of features, 305: first set of features, 307: region, 308: feature space, 309: curved hyperplane, 310: transformation matrix, 600: computer system, 601: I/O interface, 602: processor, 603: network interface, 604: storage interface, 605: memory, 606: user interface, 607: operating system, 608: web browser, 609: communication network, 610: input device, 611: output device, 612: data items

Claims

1. A method for adapting a neural network to a dynamic environment, the method comprising:
receiving, by a system, a first set of features of an input data item, the first set of features being associated with a first portion of the input data item;
determining, by the system, a second set of features of the input data item, associated with a second portion of the input data item, by correlating the first set of features with one or more correlation pairs generated using one or more neural networks;
receiving a feature set from the one or more neural networks, the feature set including a first set of training features and a second set of training features for each data set of a plurality of training data items;
generating, for each data set of the plurality of training data items, one or more first labels and one or more second labels by projecting the first set of training features and the second set of training features onto a feature space;
rearranging the first set of training features and the second set of training features in the feature space based on a correlation between the first set of training features and the second set of training features; and
generating the one or more correlation pairs, each indicating one or more first labels and the corresponding second labels for a respective data set of the plurality of training data items, based on the rearrangement of the first set of training features and the second set of training features.

2. The method of claim 1, further comprising:
generating one or more test features based on the second set of training features and the second set of features;
identifying a deviation by comparing the one or more test features with the first set of features; and
modifying the second set of features in response to the deviation.

3. The method of claim 1, wherein the input data item and the plurality of training data items each comprise one of an image, a video, an audio input, a text input, and a voice input.

4. The method of claim 1, wherein each data set of the plurality of training data items comprises an original data item and one or more transformations of the original data item.

5. The method of claim 1, wherein generating the one or more first labels and the one or more second labels comprises:
projecting, for a training data item from the plurality of training data items, the first set of training features and the second set of training features onto the feature space for each of one or more classes of a second portion of the training data item;
creating a plurality of regions based on a similarity between the first set of training features and the second set of training features;
identifying a center and a radius for each of the plurality of regions;
adjusting the radius of each of the plurality of regions by projecting the first set of training features and the second set of training features onto the feature space for each of the one or more classes of the second portion of other training data items from the plurality of training data items; and
generating, for each of the plurality of regions, a label based on the similarity between the first set of training features and the second set of training features associated with the corresponding region.
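By way of a non-limiting illustration of the region steps recited in claim 5, the following Python sketch creates one region from projected feature points, identifies its center and radius, and then adjusts the radius when features from other training data items are projected. The choice of the mean as the center, the maximum distance as the radius, and a radius that only grows are assumptions made for illustration; the claims do not prescribe these rules.

```python
import math

def make_region(points):
    """Build a region: center = mean of points, radius = max distance to center."""
    dim = len(points[0])
    center = tuple(sum(p[i] for p in points) / len(points) for i in range(dim))
    radius = max(math.dist(center, p) for p in points)
    return {"center": center, "radius": radius}

def adjust_radius(region, new_points):
    """Grow the radius (never shrink it) so newly projected points fit inside."""
    farthest = max(math.dist(region["center"], p) for p in new_points)
    region["radius"] = max(region["radius"], farthest)
    return region

# Hypothetical projected features from a first training data item.
region = make_region([(0.0, 0.0), (2.0, 0.0)])    # center (1.0, 0.0), radius 1.0
# Hypothetical projected features from another training data item.
region = adjust_radius(region, [(1.0, 3.0)])       # radius becomes 3.0
print(region)
```

A label for the region could then be derived from the features that fall within the adjusted radius.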

6. The method of claim 5, wherein the number of the plurality of regions is based on a number of the one or more transformations of each of the plurality of training data items and a number of the one or more classes.

7. The method of claim 1, wherein correlating the one or more first labels with the one or more second labels comprises:
providing the one or more first labels to a first neural network, of the one or more neural networks, that is associated with generating the second set of training features; and
providing the one or more second labels to a second neural network, of the one or more neural networks, that is associated with generating the first set of training features.

8. A system for adapting a neural network to a dynamic environment, the system comprising:
one or more processors; and
a memory storing processor-executable instructions that, when executed, cause the one or more processors to perform operations comprising:
receiving a first set of features of an input data item;
determining a second set of features of the input data item by correlating the first set of features with one or more correlation pairs generated using one or more neural networks;
receiving a feature set from the one or more neural networks, the feature set including a first set of training features and a second set of training features for each data set of a plurality of training data items;
generating, for each data set of the plurality of training data items, one or more first labels and one or more second labels by projecting the first set of training features and the second set of training features onto a feature space;
rearranging the first set of training features and the second set of training features in the feature space based on a correlation between the first set of training features and the second set of training features; and
generating the one or more correlation pairs, each indicating one or more first labels and the corresponding second labels for a respective data set of the plurality of training data items, based on the rearrangement of the first set of training features and the second set of training features.

9. The system of claim 8, wherein the one or more processors further perform:
generating one or more test features based on the second set of training features and the second set of features;
identifying a deviation by comparing the one or more test features with the first set of features; and
modifying the second set of features in response to the deviation.

10. The system of claim 8, wherein the input data item and the plurality of training data items each comprise one of an image, a video, an audio input, a text input, and a voice input.

11. The system of claim 8, wherein each data set of the plurality of training data items comprises an original data item and one or more transformations of the original data item.

12. The system of claim 8, wherein the one or more processors generate the one or more first labels and the one or more second labels by:
projecting, for a training data item from the plurality of training data items, the first set of training features and the second set of training features onto the feature space for each of one or more classes of a second portion of the training data item;
creating a plurality of regions based on a similarity between the first set of training features and the second set of training features;
identifying a center and a radius for each of the plurality of regions;
adjusting the radius of each of the plurality of regions by projecting the first set of training features and the second set of training features onto the feature space for each of the one or more classes of the second portion of other training data items from the plurality of training data items; and
generating, for each of the plurality of regions, a label based on the similarity between the first set of training features and the second set of training features associated with the corresponding region.

13. The system of claim 12, wherein the number of the plurality of regions is based on a number of the one or more transformations of each of the plurality of training data items and a number of the one or more classes.

14. The system of claim 8, wherein the one or more processors correlate the one or more first labels with the one or more second labels by:
providing the one or more first labels to a first neural network, of the one or more neural networks, that is associated with generating the second set of training features; and
providing the one or more second labels to a second neural network, of the one or more neural networks, that is associated with generating the first set of training features.