JP6643905B2 - Machine learning method and machine learning device

Info

Publication number: JP6643905B2
Application number: JP2016006161A
Authority: JP
Inventors: 泰金田; 秋山　靖浩; 靖浩秋山; 健人緒方; 吉孝内田
Original assignee: Clarion Co Ltd
Current assignee: Faurecia Clarion Electronics Co Ltd
Priority date: 2016-01-15
Filing date: 2016-01-15
Publication date: 2020-02-12
Anticipated expiration: 2036-01-15
Also published as: JP2017126260A

Description

The present invention relates to machine learning using neural networks and the like.

In recent years, research on the recognition of speech, images, and the like using multilayer neural networks, so-called deep learning, has become increasingly active. This is due, first, to the development of a method that uses a mechanism called an auto-encoder to train multilayer (deep) neural networks with four or more layers, which had previously been difficult to train, and second, to the large improvement in speech and image recognition rates achieved by convolutional neural networks.

The steepest descent method is used as the basic algorithm for the backpropagation learning method used to train neural networks, in deep learning and elsewhere, and also as a technique for various kinds of learning and optimization. This method is a deterministic search. However, because it almost certainly becomes trapped at a local optimum that is not the global optimum, a stochastic search such as the stochastic gradient descent method is usually used instead. In neural network training, the learning rate is a parameter that controls the learning. The initial value of the learning rate is chosen by the experimenter (a human), and the rate is either constant throughout the learning process or changes according to a predetermined schedule. In Patent Document 1, the learning rate is increased when the solution obtained during learning is evaluated highly and decreased when it is evaluated poorly. Other methods for determining the learning rate adaptively have also been proposed, but each is limited in the problems and networks to which it can be applied.
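The role of the learning rate described above can be illustrated with a minimal sketch. The toy objective f(w) = (w - 3)^2, the function name `steepest_descent`, and the two learning rate values are assumptions chosen for illustration and do not appear in the patent.

```python
def steepest_descent(grad, w0, learning_rate, steps):
    """Deterministic steepest descent on a single scalar parameter."""
    w = w0
    for _ in range(steps):
        w = w - learning_rate * grad(w)  # move against the gradient
    return w


def grad_f(w):
    # Gradient of the toy objective f(w) = (w - 3)^2, whose minimum is at w = 3.
    return 2.0 * (w - 3.0)


# A small constant learning rate converges toward the minimum.
w_small = steepest_descent(grad_f, w0=0.0, learning_rate=0.1, steps=50)
# A learning rate that is too large makes every step overshoot, so the search diverges.
w_large = steepest_descent(grad_f, w0=0.0, learning_rate=1.1, steps=50)
```

With the smaller rate the distance to the minimum shrinks by a factor of 0.8 per step, while with the larger rate it grows by a factor of 1.2 per step, which is why the choice of learning rate matters even on this simple problem.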

A genetic algorithm (GA) is one method for stochastic optimization. GA is an optimization method that originally developed independently of neural networks, but it is sometimes used in combination with machine learning methods. For neural networks in particular, many methods that combine backpropagation learning with GA have been developed. The most common are methods that use GA to optimize the structure and weights of a neural network, as in Patent Document 2 and Non-Patent Document 1, but GA is also used to optimize the learning method. In these methods, the GA operations, namely mutation or crossover, are applied after the entire backpropagation learning process has been carried out, and this cycle is repeated. Combinations with other stochastic search methods have also been studied, including combinations with particle swarm optimization methods, a stochastic search method that was developed relatively recently and has been successful. Algorithms that parallelize stochastic gradient descent have also been developed.

Patent Document 1: U.S. Pat. No. 6,269,351
Patent Document 2: U.S. Pat. No. 6,601,053

Non-Patent Document 1: Marshall, S. J. and Harrison, R. F., “Optimization and Training of Feedforward Neural Networks by Genetic Algorithms”, 2nd International Conference on Artificial Neural Networks, pp. 39-43, November 1991.

Despite recent advances in research on deep learning and on backpropagation learning for neural networks, several difficult problems remain in backpropagation learning. The first problem is how to determine the learning rate. The optimal learning rate differs depending on the structure of the neural network and also on the problem. Furthermore, although keeping the learning rate constant throughout learning is relatively common, it is usually better to lower it as learning progresses, so a way of changing it during the learning process has to be devised. There is a large body of literature on how to choose the learning rate. Methods in which the learning rate is determined automatically as a function of time in the learning process have also been proposed. Various other adaptive methods have been devised recently; many of them are ingenious, but they do not always work well. A simpler and more powerful method is needed.

The second problem in backpropagation learning is escaping from dead ends (local minima). That is, depending on the initial values of the weights of the neural network, backpropagation learning may yield only a value far from the minimum for the function to be minimized (such as the error rate). In addition, even if a relatively good value is obtained at one point, further learning may move away from the minimum and never return. The problem is to escape from such dead ends and obtain a value close to the optimum.

Backpropagation learning is easily trapped in dead ends because it is a local search method. As noted above, stochastic gradient descent is usually used for backpropagation. If a stochastic search is being performed anyway, however, an improvement can be expected from combining it with one of the various established stochastic search methods. Conventional combinations of backpropagation learning and GA have this aim, but in the conventional methods GA does not act within a single run of backpropagation learning, so the learning rate cannot be optimized within one run.

To solve the above problems, one aspect of the present invention is a machine learning method executed by a computer having a processor and a storage device connected to the processor. The storage device holds a plurality of programs for realizing a plurality of systems that execute predetermined processing, a plurality of structural parameters corresponding to each of the programs, and a plurality of learning parameters corresponding to the systems. Each learning parameter specifies how the structural parameters are changed in the learning executed by the corresponding system. The machine learning method includes: a first procedure in which the processor causes each system to learn a predetermined data set using the learning parameter corresponding to that system; a second procedure in which the processor evaluates each system by a predetermined evaluation method; and a third procedure in which the processor selects, from the plurality of systems, a first system and a second system that is evaluated more highly than the first system, generates copies of the program and the plurality of structural parameters corresponding to the second system as a program for realizing a copy of the second system and its corresponding structural parameters, generates a copy of the learning parameter corresponding to the second system as the learning parameter corresponding to the copy of the second system, and changes at least one of the learning parameter corresponding to the second system and the learning parameter corresponding to the copy of the second system so that the two differ from each other. The first through third procedures are then executed again for the plurality of systems other than the first system.
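The three procedures above can be sketched as one generation of a small loop. This is only an illustrative sketch under stated assumptions: each "system" is reduced to one structural parameter `w` and one learning parameter `lr`, the evaluation is a toy loss, the lowest-ranked system is taken as the first system and the highest-ranked as the second, and the learning parameter of the copy is halved or doubled. None of these specific choices comes from the patent.

```python
import copy
import random


def loss(w):
    # Toy evaluation (assumption): lower is better, minimum at w = 3.
    return (w - 3.0) ** 2


def train_one_step(system):
    # First procedure: one learning step using the system's own learning parameter.
    system["w"] -= system["lr"] * 2.0 * (system["w"] - 3.0)


def generation(systems, rng):
    for s in systems:
        train_one_step(s)
    # Second procedure: evaluate every system.
    ranked = sorted(systems, key=lambda s: loss(s["w"]))
    second, first = ranked[0], ranked[-1]  # best and worst (assumed selection rule)
    # Third procedure: copy the better system, then change the copy's
    # learning parameter so that it differs from the original's.
    clone = copy.deepcopy(second)
    clone["lr"] = second["lr"] * rng.choice([0.5, 2.0])
    # The first (worst) system is dropped; the copy takes its place.
    survivors = [s for s in systems if s is not first]
    return survivors + [clone]


rng = random.Random(0)
population = [{"w": 0.0, "lr": lr} for lr in (0.05, 0.1, 0.2, 0.4)]
for _ in range(30):
    population = generation(population, rng)
best_loss = min(loss(s["w"]) for s in population)
```

Because the structural parameters are copied unchanged and only the learning parameter is perturbed, the population explores different learning rates around systems that are already performing well.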

According to one aspect of the present invention, the learning rate (or a learning parameter, or an optimization/search control parameter) can be optimized by determining it autonomously.

FIG. 1 is an explanatory diagram of the encoding of the parameters of a multilayer neural network into a chromosome in an embodiment of the present invention.
FIG. 2 is an explanatory diagram of the learning method that combines backpropagation learning and GA in the embodiment of the present invention.
FIG. 3 is an explanatory diagram of an example of changes in the learning rate in the embodiment of the present invention.
FIG. 4 is an explanatory diagram of an example of the Euclidean distances between the best individual and each of the other individuals in the embodiment of the present invention.
FIG. 5 is an explanatory diagram outlining the configuration and operation of the data identification neural network design tool according to the embodiment of the present invention.
FIG. 6 is a block diagram illustrating the hardware configuration of the data identification neural network design tool according to the embodiment of the present invention.

An embodiment of the present invention will now be described. In the method of this embodiment, GA selection and mutation are performed at every step (1 epoch) of parallelized backpropagation learning. As a result, the neural network is optimized as the learning result, just as with conventional backpropagation learning and with conventional methods that combine it with GA, and in addition the learning rate is optimized during the backpropagation learning process, which the conventional methods could not do.

(Overall configuration)
The overall configuration of this embodiment, that is, an outline of the configuration and operation of the data identification neural network design tool 500, is described with reference to FIG. 5. The data identification neural network design tool 500 is a tool with which a user designs a neural network for tasks such as identifying image data. It receives original data 504, which is the data for learning, and teacher information 505, which describes the original data 504. Video or still images can be input as the original data 504, and the teacher information 505 includes information on a plurality of rectangular regions (bounding boxes) that indicate where in the video or still image the information to be identified, for example a vehicle or a pedestrian, is present. The designed neural network is extracted from the learning neural network group 507 in the form of weights and biases for the identification neural network 508. The design result can be tested by supplying data to be identified 509 as test data, and an identification result output 510 is produced as the result.

The learning control computer 501 is a terminal that passes the user's input to the learning control program 502 and passes output from the learning control program to the user. The learning control computer 501 can also run the learning control program 502, the learning data generation program 503, the learning neural network group 507 composed of a plurality of neural networks, and the identification neural network 508, and can store the inputs and outputs of these programs: the original data 504, the teacher information 505, the learning data with teacher information 506, the data to be identified 509, and the identification result output 510. However, the learning data generation program 503, the learning neural network group 507, the identification neural network 508, the original data 504, the teacher information 505, the learning data with teacher information 506, the data to be identified 509, and the identification result output 510 can also be stored and executed on other computers as directed by the learning control program 502 (see FIG. 6).

The user inputs to the learning control computer 501 information specifying which data to use as the original data 504 and the teacher information 505, the learning data generation program 503, and parameters for the learning neural network group 507 and the identification neural network 508. As described later, the parameters for the learning neural network group 507 include the number of neural networks (that is, the number of individuals); a random number seed for randomly determining the weights and biases of the neural networks; the distribution function (such as a normal distribution), mean, and standard deviation that determine the distribution of the weights and biases of the neural networks; and the mean and standard deviation of the initial values of the learning rate. The user does not need to input any value for which the default is used. The input can also include, as stopping conditions for the neural networks, an upper limit on the number of steps (epochs), a target value for the error, and a target value for the learning rate. The user can also input the data to be identified 509 at this time.
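The parameters listed above could be gathered in a single configuration record, as in the following sketch. The class name `LearningGroupConfig`, all field names, and all default values other than the example population size of 20 are hypothetical, and the rule that learning stops once the learning rate falls to its target value is an assumption about how that stopping condition is applied.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LearningGroupConfig:
    population_size: int = 20            # number of neural networks (individuals)
    seed: int = 0                        # random number seed for weights and biases
    weight_distribution: str = "normal"  # distribution function for weights and biases
    weight_mean: float = 0.0
    weight_std: float = 0.1
    lr_init_mean: float = 0.1            # mean of the initial learning rates
    lr_init_std: float = 0.02            # standard deviation of the initial learning rates
    max_epochs: Optional[int] = None     # upper limit on the number of steps (epochs)
    target_error: Optional[float] = None
    target_learning_rate: Optional[float] = None

    def should_stop(self, epoch: int, error: float, learning_rate: float) -> bool:
        """Return True when any stopping condition that was supplied is met."""
        if self.max_epochs is not None and epoch >= self.max_epochs:
            return True
        if self.target_error is not None and error <= self.target_error:
            return True
        if (self.target_learning_rate is not None
                and learning_rate <= self.target_learning_rate):
            return True
        return False


# Only the values that differ from the defaults need to be supplied.
config = LearningGroupConfig(max_epochs=100, target_error=0.01)
```

Leaving a stopping condition as `None` mirrors the statement that the user need not enter values for which defaults are used.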

At the user's instruction, the learning control program 502 runs the learning data generation program 503 to generate the learning data with teacher information 506. The original data 504 consists of frame images of relatively large size (for example, 640 × 480), which the learning neural network group 507 cannot handle as they are. Therefore, a large number of patch images of relatively small size (for example, 24 × 48) are generated from each frame image and used as the learning data with teacher information 506. The learning data generation program 503 uses the teacher information 505 to classify these images into positive examples (images to be detected) and negative examples (images not to be detected), and the learning data with teacher information 506 stores this classification in one-to-one correspondence with the images. The teacher information 505 may also divide the images to be detected into classes, in which case the class is stored in the learning data with teacher information 506. That is, pairs of a patch image and a class are stored.
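The patch generation and labeling described above might look like the following sketch. The patent does not state how patches are positioned or how a patch is matched against a bounding box, so the sliding window, the stride of 8 pixels, the intersection-over-union (IoU) threshold of 0.5, the reading of 24 × 48 as width × height, and the function names are all assumptions. Only patch coordinates are produced here; pixel data is omitted.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x, y, width, height)."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    inter_w = max(0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union else 0.0


def label_patches(frame_size, patch_size, stride, boxes, threshold=0.5):
    """Slide a window over the frame and pair each patch with a 0/1 label."""
    frame_w, frame_h = frame_size
    patch_w, patch_h = patch_size
    samples = []
    for y in range(0, frame_h - patch_h + 1, stride):
        for x in range(0, frame_w - patch_w + 1, stride):
            patch = (x, y, patch_w, patch_h)
            # Positive example if the patch overlaps any teacher bounding box enough.
            positive = any(iou(patch, box) >= threshold for box in boxes)
            samples.append((patch, 1 if positive else 0))
    return samples


# One 640 x 480 frame with a single hypothetical bounding box.
samples = label_patches((640, 480), (24, 48), stride=8, boxes=[(96, 96, 24, 48)])
```

Each element of `samples` is a (patch, label) pair, which corresponds to the one-to-one storage of image and classification described above; a multi-class variant would store a class identifier in place of the 0/1 label.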

At the user's instruction, the learning control program 502 runs the learning neural network group 507 to perform backpropagation learning. That is, the images contained in the learning data with teacher information 506 are input to the neural networks, and the weights and biases of the learning neural network group 507 are repeatedly updated based on the difference between the output and the teacher information. The method for training the plurality of neural networks is described later.

FIG. 6 is a block diagram illustrating the hardware configuration of the data identification neural network design tool 500 according to the embodiment of the present invention.

The data identification neural network design tool 500 of this embodiment can be realized by, for example, computers 600, 610, and 620 interconnected by a network 630.

The computer 600 corresponds to the learning control computer 501 in FIG. 5 and has a CPU (Central Processing Unit) 601, a memory 602, an I/F (Interface) 603, and an HDD (Hard Disk Drive) 604, which are connected to one another. The CPU 601 is a processor that executes programs stored in the memory 602. The memory 602 is a so-called main storage device that stores the programs executed by the CPU 601, the data to be processed, and the like. In this embodiment, the memory 602 stores the learning control program 502. Processing described in this embodiment as being executed by the learning control program 502 is in fact executed by the CPU 601 in accordance with the learning control program 502. The I/F 603 sends and receives data to and from the computers 610 and 620 via the network 630. The HDD 604 is a so-called auxiliary storage device that stores programs executed by the CPU 601, data to be processed, and the like. For example, the learning control program 502 may be stored in the HDD 604 and copied to the memory 602 as needed.

The computer 610 has a CPU 611, a memory 617, an I/F 616, and an HDD 615, which are connected to one another, and further has a GPU (Graphics Processing Unit) 612 connected to the CPU 611. The CPU 611 is a processor that executes programs stored in the memory 617. The memory 617 is a so-called main storage device that stores the programs executed by the CPU 611, the data to be processed, and the like. In this embodiment, the memory 602 stores the learning data generation program 503. Processing described in this embodiment as being executed by the learning data generation program 503 is in fact executed by the CPU 611 in accordance with the learning data generation program 503. The I/F 616 sends and receives data to and from the computers 600 and 620 via the network 630. The HDD 615 is a so-called auxiliary storage device that stores programs executed by the CPU 611 and the like, data to be processed, and the like. In this embodiment, the HDD 615 stores the original data 504 and the teacher information 505.

The GPU 612 is a processor having a plurality of processor cores 613 and a memory 614. In this embodiment, the memory 614 stores the learning neural network group 507 and the learning data with teacher information 506. The operation of the learning neural network group 507 in this embodiment is executed by the GPU 612.

Once the learning data with teacher information 506 has been generated from the original data 504 and the teacher information 505 by the learning data generation program 503, it may be stored in the HDD 615 and then copied to the memory 614 by the CPU 611. Similarly, the learning neural network group 507 may be stored in the HDD 615 or the memory 617 and copied to the memory 614 by the CPU 611.

Each learning neural network in the learning neural network group 507 corresponds to a set of structural parameters stored in the memory 614, such as the weights and biases between the neurons in that network, and to a program that computes an output from those structural parameters and the input learning data and performs learning by a predetermined learning method (for example, backpropagation learning) based on an evaluation of that output. In other words, each learning neural network is a system realized by the GPU 612 executing the corresponding program using the structural parameters.

The computer 620 has a CPU 621, a memory 626, an I/F 625, and an HDD 627, which are connected to one another, and further has a GPU 622 connected to the CPU 621. The CPU 621 is a processor that executes programs stored in the memory 626. The memory 626 is a so-called main storage device that stores the programs executed by the CPU 621, the data to be processed, and the like. The I/F 625 sends and receives data to and from the computers 600 and 610 via the network 630. The HDD 627 is a so-called auxiliary storage device that stores programs executed by the CPU 621 and the like, data to be processed, and the like.

The GPU 622 is a processor having a plurality of processor cores 623 and a memory 624. In this embodiment, the memory 624 stores the identification neural network 508, the data to be identified 509, and the identification result output 510. The operation of the identification neural network 508 in this embodiment is executed by the GPU 622.

The identification neural network 508 and the data to be identified 509 may be stored in the HDD 627 or the memory 626 and copied to the memory 624 by the CPU 621. The identification result output 510 may be copied by the CPU 621 from the memory 624 to the memory 626 or the HDD 627 and, as needed, sent to the computer 600 or another computer via the I/F 625 and the network 630.

FIG. 6 shows only one example of the hardware configuration of the data identification neural network design tool 500, and various modifications are possible in practice. For example, the computer 610 may have a plurality of GPUs 612. In that case, the memory 614 of each GPU 612 may store a learning neural network included in the learning neural network group 507 together with the learning data with teacher information 506, and each GPU 612 may train one learning neural network. Because the plurality of neural networks are then trained in parallel, the time required for learning is reduced.

Alternatively, the functions of any two or all of the computers 600, 610, and 620 may be realized by a single computer. Alternatively, the processing executed by the GPU 612 and the like in the above example may be executed by the CPU 611 and the like. Alternatively, the HDD 604 and the like may be replaced by a type of storage device other than an HDD, such as flash memory.

(Chromosome representation)
In this embodiment, the parameters of a multilayer neural network are encoded into a chromosome of the genetic algorithm (GA), as shown in FIG. 1. Since each individual has exactly one chromosome, "chromosome" and "individual" are synonymous here. FIG. 1(a) shows the example of a three-layer perceptron. Encoding the connection weights 101 on the chromosome is the same as in conventional methods that combine neural networks and GA, but in this embodiment the learning rate 102 is encoded as well. Although FIG. 1(a) shows only the weights among the connection parameters between neurons, the constant terms, that is, the biases, can also be encoded in the chromosome. Because the structure of the neural network is fixed here, the structure is not represented on the chromosome. If the structure is also represented, however, structural optimization in which neurons and inter-neuron connections are changed (for example, deleted) during learning according to predetermined mutation rules can also be achieved with GA. That is, the chromosome can be made variable in length and can describe the parameters of each neuron (so that deleting a neuron deletes its entire description) or the parameters of each connection between neurons (so that deleting a connection deletes its description).
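The encoding described above, connection weights plus a learning rate on one chromosome, can be sketched as follows for a fixed-structure network. The `Chromosome` class, the `encode` and `decode` helpers, and the row-major flattening order are assumptions for illustration; FIG. 1 itself is not reproduced here, and biases are omitted as in FIG. 1(a).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Chromosome:
    weights: List[float]   # flattened connection weights (101 in FIG. 1)
    learning_rate: float   # learning rate carried on the same chromosome (102)


def encode(layer_weights, learning_rate):
    """Flatten per-layer weight matrices (lists of rows) into one chromosome."""
    flat = [w for layer in layer_weights for row in layer for w in row]
    return Chromosome(weights=flat, learning_rate=learning_rate)


def decode(chromosome, shapes):
    """Rebuild the per-layer matrices from the flat list; shapes are (rows, cols)."""
    layers, pos = [], 0
    for rows, cols in shapes:
        layer = [chromosome.weights[pos + r * cols: pos + (r + 1) * cols]
                 for r in range(rows)]
        layers.append(layer)
        pos += rows * cols
    return layers


# A three-layer perceptron with 2 inputs, 3 hidden neurons, and 1 output.
hidden = [[0.1, -0.2], [0.3, 0.4], [-0.5, 0.6]]   # 3 x 2
output = [[0.7, -0.8, 0.9]]                        # 1 x 3
chromosome = encode([hidden, output], learning_rate=0.05)
restored = decode(chromosome, shapes=[(3, 2), (1, 3)])
```

Because the structure is fixed, only the shapes are needed to recover the network from the chromosome; a variable-length chromosome of the kind mentioned above would need to carry structural information as well.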

FIG. 1(b) shows the encoding of a convolutional neural network (CNN), which is often used in image recognition and similar tasks. Compared with FIG. 1(a), the number of parameters is larger and the chromosome is bigger, but the parameters are encoded in the same way.

The chromosome structure need not be identical for all individuals. That is, the computation can be performed using neural networks with different structures (for example, with different connections between neurons, or with different numbers of neurons and different connections between them). Mutation can be performed in the same way in this case as well. In GA, an operation called crossover is used in addition to mutation, and crossover can be performed not only between chromosomes with the same structure but also between chromosomes with different structures. For example, each of two neural networks can be split between two of its layers, or split in two within a particular layer, and the resulting parts can be crossed over and recombined. If the numbers of connections that are cut are made equal, the structure of the neural network can be maintained simply by reconnecting them. When there are too few connections, connections with weight 0 can be introduced, and when there are surplus connections, they can be deleted, so that the structure of the neural network is reconstructed.
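The simplest case mentioned above, splitting two networks between layers and recombining the parts, can be sketched as follows. The sketch assumes both parents have the same structure, so no connections need to be added or deleted; the function name `layer_crossover` and the representation of a network as a list of per-layer weight matrices are hypothetical.

```python
import copy


def layer_crossover(parent_a, parent_b, cut):
    """Swap the layers above the cut point between two same-structure networks.

    Each parent is a list of per-layer weight matrices; cut is the number of
    lower layers that each child keeps from its first parent.
    """
    child_ab = copy.deepcopy(parent_a[:cut]) + copy.deepcopy(parent_b[cut:])
    child_ba = copy.deepcopy(parent_b[:cut]) + copy.deepcopy(parent_a[cut:])
    return child_ab, child_ba


# Two parents with identical 2-3-1 structure but different weights.
parent_a = [[[1.0, 1.0], [1.0, 1.0], [1.0, 1.0]], [[1.0, 1.0, 1.0]]]
parent_b = [[[2.0, 2.0], [2.0, 2.0], [2.0, 2.0]], [[2.0, 2.0, 2.0]]]
child_ab, child_ba = layer_crossover(parent_a, parent_b, cut=1)
```

The deep copies keep the children independent of their parents, so later weight updates to a child do not alter a parent that remains in the population.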

Coding into chromosomes is not limited to multilayer neural networks and can also be applied to other kinds of systems for learning or for optimization and search. That is, the structural parameters of the system (corresponding to the connection weights in a neural network) and the learning parameters, or the parameters that control the optimization or search process (corresponding to the learning rate), can be coded, and the mutation and crossover operations can be applied to them.

(Learning method)
A learning method that combines backpropagation learning with a GA is described below with reference to FIG. 2. This method is called the LOG-BP learning method (learning-rate-optimizing genetic back-propagation learning method). In this method, a plurality of the chromosomes described in the preceding section are prepared and backpropagation learning is performed on them in parallel, so that the weights on those chromosomes mutate autonomously. The learning rate, in addition, is mutated stochastically.

First, the computer in which the program of FIG. 2 (that is, the program included in the learning neural network group 507) is installed (the computer 610 in the example of FIG. 6) initializes the chromosomes (201). The number of individuals can be variable, but here it is fixed (for example, 20). The weights and the learning rate of each chromosome are determined by random numbers. The weights may be initialized in the same way as in ordinary backpropagation learning; for example, the weights and biases may be drawn from normally distributed random numbers. The learning rates are likewise spread out moderately using random numbers, for example according to a normal distribution. It is considered desirable to make the learning rates relatively large, to the extent that divergence does not become too frequent. The parameters that determine these initial values can be input from outside via the learning control program 502. That is, the mean, standard deviation, and distribution shape of the learning rates and of the weights, as well as the random seed, can be specified before learning starts.
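
As an illustration only, the initialization in process 201 might be sketched in Python as follows. The dictionary layout, the name init_population, and every default value other than the population size of 20 are assumptions made for this sketch and do not appear in the embodiment. Redrawing non-positive learning rates is likewise an assumption.

```python
import random


def init_population(n_individuals=20, n_weights=10, weight_std=0.1,
                    lr_mean=0.1, lr_std=0.03, seed=0):
    """Create chromosomes, each holding its weights and its own learning rate."""
    rng = random.Random(seed)          # the seed can be supplied from outside
    population = []
    for _ in range(n_individuals):
        # Weights drawn from a normal distribution, as in ordinary backpropagation.
        weights = [rng.gauss(0.0, weight_std) for _ in range(n_weights)]
        # Learning rate also drawn from a normal distribution; redraw until positive.
        lr = rng.gauss(lr_mean, lr_std)
        while lr <= 0.0:
            lr = rng.gauss(lr_mean, lr_std)
        population.append({"weights": weights, "lr": lr})
    return population


population = init_population()
```

Passing the same seed reproduces the same initial population, which corresponds to specifying the random seed before learning starts.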

Next, the computer performs one step (1 epoch) of backpropagation learning for each individual (203). This step corresponds to one generation in the GA. For example, when image data is to be learned, as many images as possible are prepared in advance as training data, and part of them is set aside as verification data. Evaluation data consisting of images in the same format is also prepared as needed. The computer then trains on all of the training data once (learning with image data is described later). When mini-batch stochastic gradient descent is used (that is, a method that trains on a moderate number of samples at a time, unlike steepest descent, which trains on all of the training data at once, and unlike basic stochastic gradient descent, which trains on one sample at a time), the training data stored in an array is divided into mini-batches, and each mini-batch is backpropagated once. The value encoded in each chromosome is used as the learning rate.

Next, the computer updates the weights on each chromosome with the weights changed by learning (204). That is, in this method the weight values on a chromosome are not changed externally; they are updated autonomously on the basis of random numbers and each individual's learning. In other words, the method realizes Lamarckian inheritance, in which acquired traits are inherited as they are, rather than Darwinian inheritance. If the weight update is regarded as a mutation, however, this process is consistent with the basic form of a GA.

Next, the computer evaluates each updated individual (an egg produced by asexual reproduction of the original individual) using the verification data (205). If some individual has a sufficient evaluation value, the computation may end here (206). If no individual has a sufficient evaluation value, the computer performs selection based on the evaluation values; when the evaluation result is an error rate, lower values are better. That is, the individual (chromosome) with the largest value, which is the worst-evaluated one, is killed (deleted), and the individual with the smallest value, which is the best-evaluated one, is copied (a dizygotic twin is produced) (207). The number of individuals therefore remains unchanged. It is also possible, however, to make the number of individuals gradually increase or decrease by making the generation (copy) probability and the death probability unequal. For the individual generated by copying, the computer mutates the learning rate η on the chromosome according to the following equations.

η' = fη (with probability 0.5)
η' = η/f (with probability 0.5)

That is, which of the two equations is applied is decided by a random number so that both are equally probable. f is a value of, for example, about 1.2, and the rule is close to one used in adaptive backpropagation learning methods (a rule that originally has nothing to do with GAs). The change of learning rate by the above equations is only an example, however. As long as the learning rate of the best-evaluated individual and that of the individual generated by copying it are determined so as to differ, the learning rates may be determined by another method, for example by changing both of them. Also, in the above example the worst-evaluated individual is deleted and a copy of the best-evaluated individual is generated, but the targets of deletion and copying need not be limited to the worst and best individuals, as long as the individual that is copied is evaluated better than the individual that is deleted.
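
The selection of process 207 and the two mutation equations above might be sketched as follows. This is a minimal sketch, assuming that each individual is a dictionary with a "weights" list and an "lr" value and that the evaluation values are error rates (lower is better). The names mutate_learning_rate and select_and_mutate are hypothetical.

```python
import copy
import random


def mutate_learning_rate(lr, f=1.2, rng=random):
    """Return f * lr or lr / f, each with probability 0.5."""
    return lr * f if rng.random() < 0.5 else lr / f


def select_and_mutate(population, errors, f=1.2, rng=random):
    """Replace the worst individual with a copy of the best whose learning rate is mutated."""
    worst = max(range(len(errors)), key=errors.__getitem__)
    best = min(range(len(errors)), key=errors.__getitem__)
    if worst == best:                       # all evaluations equal: nothing to do
        return population
    child = copy.deepcopy(population[best])
    child["lr"] = mutate_learning_rate(child["lr"], f, rng)
    population[worst] = child               # population size stays unchanged
    return population


pop = [{"weights": [0.0], "lr": 0.1},
       {"weights": [1.0], "lr": 0.2},
       {"weights": [2.0], "lr": 0.3}]
select_and_mutate(pop, errors=[0.5, 0.1, 0.9], rng=random.Random(0))
```

After the call, the third individual (error 0.9) has been replaced by a copy of the second (error 0.1) with a learning rate of either 0.2 × 1.2 or 0.2 / 1.2.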

The deletion of a chromosome in process 207 is one example of processing for excluding that chromosome from subsequent machine learning. The chromosome may actually be deleted from the memory 614, or it may be left in the memory 614 and excluded from the machine learning by another means, for example by having the learning control program 502 control learning so that no machine learning is performed on that chromosome in subsequent epochs. The same applies to the deletion of chromosomes in the following description.

The computer repeats the above step (epoch) until an adequate solution is obtained or until almost no further change occurs (209). The condition for stopping the computation may be chosen in the same way as in ordinary backpropagation learning. Processes 202 and 208 initialize and update the parameters involved in this iteration.
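
The overall loop of FIG. 2 might be sketched as follows. To keep the sketch self-contained, the neural network is replaced by a toy one-weight model y = w·x trained by per-sample gradient descent. That substitution, the data, the function names, and all numeric defaults are assumptions made for illustration and are not taken from the embodiment. The comments map each part to the process numbers used above.

```python
import copy
import random


def train_one_epoch(ind, data):
    """One pass of per-sample gradient descent for the toy model y = w * x."""
    for x, y in data:
        grad = 2.0 * (ind["w"] * x - y) * x
        ind["w"] -= ind["lr"] * grad       # uses the rate encoded on the chromosome


def error(ind, data):
    """Mean squared error on the given data; lower is better."""
    return sum((ind["w"] * x - y) ** 2 for x, y in data) / len(data)


def log_bp(train, valid, n_individuals=8, max_epochs=100, target=1e-6, f=1.2, seed=0):
    rng = random.Random(seed)
    # 201: initialize weights and learning rates with random numbers.
    pop = [{"w": rng.gauss(0.0, 1.0), "lr": rng.uniform(0.05, 0.2)}
           for _ in range(n_individuals)]
    for epoch in range(max_epochs):                 # 202, 208, 209: iteration control
        for ind in pop:                             # 203, 204: learn and keep learned weights
            train_one_epoch(ind, train)
        errs = [error(ind, valid) for ind in pop]   # 205: evaluate on verification data
        best = min(range(len(pop)), key=errs.__getitem__)
        if errs[best] <= target:                    # 206: sufficient evaluation value
            break
        worst = max(range(len(pop)), key=errs.__getitem__)
        if worst != best:                           # 207: replace worst by mutated copy of best
            child = copy.deepcopy(pop[best])
            child["lr"] = child["lr"] * f if rng.random() < 0.5 else child["lr"] / f
            pop[worst] = child
    return pop[best], errs[best], epoch + 1


train = [(-1.0, -3.0), (-0.5, -1.5), (0.5, 1.5), (1.0, 3.0)]
valid = [(-0.8, -2.4), (0.3, 0.9), (0.9, 2.7)]
best_individual, best_error, epochs_used = log_bp(train, valid)
```

In this toy setting the individuals run sequentially; the embodiment performs the per-individual learning in parallel.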

The above description has selection and mutation performed at every step, but in practice it is considered better to choose and control the average number of selections and mutations per step. If exactly one selection and mutation were performed at each step, the number would be excessive when the population is small and insufficient when the population is large. Therefore the average number of selections and mutations per step is fixed in advance, and the actual number at each step is determined stochastically. If selection is too frequent, the search range is considered to narrow too quickly, and if it is too infrequent, the search range is considered to remain too wide. Controlling the number of selections appropriately has the effect that a wide area is searched at the start of the computation and the search range is narrowed gradually, so that a solution can be found successfully.
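
One way to determine the actual number of selections stochastically while fixing its average might be sketched as follows. The embodiment does not specify the distribution, so the rule used here (the integer part of the mean, plus one more with probability equal to the fractional part) and the name num_selections are assumptions.

```python
import math
import random


def num_selections(mean_per_step, rng=random):
    """Draw an integer count whose expected value equals mean_per_step."""
    base = math.floor(mean_per_step)
    # Add one extra selection with probability equal to the fractional part.
    extra = 1 if rng.random() < (mean_per_step - base) else 0
    return base + extra


rng = random.Random(0)
counts = [num_selections(0.6, rng) for _ in range(10000)]
average = sum(counts) / len(counts)     # close to the requested mean of 0.6
```

With a mean below 1, most steps perform either no selection or exactly one, so the population is disturbed less often than with one selection at every step.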

When another learning system or an optimization or search system is used instead of a neural network, evaluation is performed at each step of the iteratively executed learning, optimization, or search, selection is performed on the basis of the result, and the values of the learning parameters that control the learning process, or of the optimization and search parameters that control the optimization or search process, are mutated. For this mutation, when a distance (a difference, in the case of scalar values) can be defined between values of such a parameter, a parameter value at a small distance from the current one may be generated using random numbers. When no distance can be defined, some different value may be selected by a random number.
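
The two cases of mutation for a general parameter might be sketched as follows. The Gaussian perturbation for the case where a distance is defined, the scale value, the example option names, and the function name mutate_parameter are all assumptions made for illustration.

```python
import random


def mutate_parameter(value, choices=None, scale=0.1, rng=random):
    """Mutate a control parameter.

    If `choices` is None, a distance is assumed to be defined and a nearby
    value is generated. Otherwise a different value is picked at random.
    """
    if choices is None:
        return value + rng.gauss(0.0, scale)
    others = [c for c in choices if c != value]
    return rng.choice(others) if others else value


rng = random.Random(0)
new_method = mutate_parameter("sgd", choices=["sgd", "momentum", "adagrad"], rng=rng)
new_scalar = mutate_parameter(1.0, scale=0.1, rng=rng)
```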

(Supplementary notes on the learning method and its range of application)
Six points concerning variations of the LOG-BP learning method and the extension of its range of application are described below. First, each individual can refer to an evaluation value based on the verification data and reflect it in its learning. In the algorithm above the same evaluation value is also used for selection, but since selection can be regarded as external, the evaluation value used for selection can be given separately. For example, evaluation data may be provided in addition to the verification data and used for selection. That is, different criteria can be used as the criterion applied by each individual (the criterion in backpropagation learning) and as the external selection criterion (the criterion in the GA).

Second, in the method described above the structure and parameters of the neural network are not changed by selection and mutation. Although no such extension is made in the evaluation below, the method can be extended to optimize the structure and parameters, using mutation and crossover. For example, each chromosome may include information indicating the number of neurons of its neural network and the presence or absence of each connection between neurons, and the computer 610 may, in process 207, mutate the chromosome according to a predetermined mutation rule, thereby cutting connections between neurons or eliminating neurons. The same applies to the addition of neurons described below. When a system other than a neural network is used, part of it can be changed or deleted by a similar method.

Third, neurons can also be added by mutation. When a neuron is added, making its weights small is considered sufficient to avoid invalidating the learning already done (if the weights are 0, the result is the same as not adding the neuron). In that case, however, the added neuron may never become active. If neurons are added without increasing the training data, overfitting occurs and the evaluation value tends to improve only in appearance. It is therefore considered desirable to define the evaluation function so that the evaluation value is better when the neural network is smaller. For example, a term based on the minimum description length (MDL) can be added as part of the evaluation value (in other words, a description length penalty is imposed). That is, the description length of the neural network model is made part of the evaluation value. When the chromosome describes the structure of the neural network, it can be regarded as a description of the model, so the length of the chromosome can be used as the description length. When a system other than a neural network is used, part of it can likewise be added.
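
The two ideas in this paragraph might be illustrated as follows: adding a neuron whose outgoing weight is 0 leaves the output unchanged, and a penalty proportional to chromosome length favors the smaller of two equally accurate networks. The tiny tanh network, the linear form of the penalty, the coefficient 1e-5, and all names are assumptions made for this sketch.

```python
import math


def forward(x, w_in, w_out):
    """Tiny one-hidden-layer network: tanh hidden units, linear output."""
    hidden = [math.tanh(w * x) for w in w_in]
    return sum(h * v for h, v in zip(hidden, w_out))


def add_neuron(w_in, w_out, w_in_new=0.5, w_out_new=0.0):
    """Append one hidden neuron; an outgoing weight of 0 keeps the output unchanged."""
    return w_in + [w_in_new], w_out + [w_out_new]


def penalized_error(error_rate, chromosome_length, penalty_per_gene=1e-5):
    """Error rate plus a description length penalty (lower is better)."""
    return error_rate + penalty_per_gene * chromosome_length


w_in, w_out = [0.3, -1.2], [0.7, 0.4]
w_in2, w_out2 = add_neuron(w_in, w_out)
same_output = forward(0.9, w_in, w_out) == forward(0.9, w_in2, w_out2)

small_net = penalized_error(0.10, chromosome_length=1000)   # about 0.11
large_net = penalized_error(0.10, chromosome_length=5000)   # about 0.15
```

As the paragraph notes, a neuron added with weight exactly 0 may never become active, so in practice a small nonzero value could be used instead, at the cost of slightly perturbing the output.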

Fourth, as already noted in the supplementary remarks above, the LOG-BP learning method can not only be extended within its application to neural networks as described, but can also be extended to other learning methods. Suppose that there is a system (corresponding to the neural network) that performs classification, detection, or the like, and that there is a learning method for training it. Suppose further that the learning is performed iteratively and that there are parameters that control the learning. If a method of evaluating the effect of learning during the iteration is given, the present learning method (the LOG learning method) can be applied in the same way as to backpropagation learning of a neural network. That is, the parameters that determine the structure of the system are represented as a chromosome, a plurality of chromosomes are initialized, learning is started, and learning, evaluation, and mutation and selection are repeated. The learning control parameter to be mutated need not be real-valued; it suffices that a method of mutating it to another value is given in place of the two mutation equations above.

Fifth, in the embodiment above all individuals are neural networks of the same kind, but even when individuals are neural networks of different kinds, or use other learning methods, they can be evaluated, selected, and mutated by the above method provided that a common evaluation function is used and the learning parameters and the way to mutate them are specified. That is, neural networks and other learning methods can be mixed. Specifically, for example, the learning neural network group 507 may include a plurality of neural networks that differ in the presence or absence of connections between neurons, or a plurality of neural networks that differ in the number of neurons and in the presence or absence of connections between neurons. Backpropagation learning may be used for all of them, or a different learning method may be used for each neural network. This makes it possible to identify the best among neural networks of different kinds or among different learning methods.

Sixth, the embodiment above is based on learning on a single computer (for example, the computer 610 of FIG. 6), but a plurality of computers (for example, a plurality of computers 610) may be prepared and one chromosome assigned to each, so that the processors of these computers (for example, the CPU 611 or GPU 612 of each computer 610) perform the computation in parallel. The evaluation values of the chromosomes can be gathered on one computer, where selection is performed. When proceeding to the next epoch, the chromosomes on one or more of the computers must be replaced, but this can be done by exchanging a small amount of data between computers. Ordinary backpropagation learning with different parameters on a plurality of computers can be realized with conventional techniques, but compared with that, the method described above has the advantage that a value closer to the optimum is more likely to be found, and found faster. Alternatively, as described with reference to FIG. 6, the computer 610 may have a plurality of GPUs 612, and the same processing may be executed using each GPU 612.

(Example of learning an image dataset)
This section describes a method of learning a pedestrian image dataset according to the learning method above. One example of a pedestrian image dataset is the Caltech pedestrian dataset. The same method can be applied when face images, object images, character images, or the like are used instead of pedestrian images. The dataset used in this learning contains a plurality of videos. Annotation data is attached separately from the videos and includes bounding box data indicating the position and size of each pedestrian. The videos consist of one or more videos for training and one or more videos for testing.

Half of the training data are positive examples, generated as follows. The bounding boxes specified in the above dataset are cropped and normalized to a size of 24 × 48 to prepare 100,000 images. These are combined with the 100,000 images obtained by flipping them horizontally, and the resulting 200,000 images are each used twice as positive examples.

The negative examples that make up the remaining half of the training data are generated as follows. First, 200,000 images of size 24 × 48 cropped from regions of the Caltech pedestrian dataset outside the bounding boxes are used. The positions of these initial negative examples are determined by random numbers. Images of sizes other than 24 × 48 can be cropped and resized, but it is also possible to crop only images of exactly 24 × 48. Next, a CNN with a single convolutional layer, trained with the 200,000 positive examples and 200,000 negative examples, is applied to the original dataset, and a further 200,000 negative examples are generated from the regions it misrecognizes. That is, images that the CNN judges to contain a pedestrian but that lie outside the bounding boxes become new negative examples. Together these give 400,000 negative examples, and with the positive examples a total of 800,000 images are prepared.
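
The random placement of the initial negative examples might be sketched as follows. The sketch assumes that "outside the bounding boxes" means having no overlap with any pedestrian box, that boxes are given as (x, y, width, height), and that a frame is 640 × 480 pixels. The names overlaps and sample_negative_boxes and the retry limit are hypothetical.

```python
import random


def overlaps(a, b):
    """True if boxes a and b, each given as (x, y, w, h), share any area."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah


def sample_negative_boxes(frame_w, frame_h, pedestrian_boxes, n,
                          crop_w=24, crop_h=48, rng=random, max_tries=10000):
    """Pick up to n random 24 x 48 crop positions that avoid all pedestrian boxes."""
    negatives = []
    tries = 0
    while len(negatives) < n and tries < max_tries:
        tries += 1
        x = rng.randint(0, frame_w - crop_w)
        y = rng.randint(0, frame_h - crop_h)
        box = (x, y, crop_w, crop_h)
        if not any(overlaps(box, p) for p in pedestrian_boxes):
            negatives.append(box)
    return negatives


pedestrians = [(100, 100, 30, 60)]
negatives = sample_negative_boxes(640, 480, pedestrians, 50, rng=random.Random(0))
```

The second batch of negatives, taken from regions that a trained CNN misrecognizes, would use the same overlap test on the detector's outputs instead of on random positions.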

The values in these data range from 0 to 255 in original data such as the Caltech pedestrian dataset. They are converted to floating-point numbers in a range of approximately -1 to 1 and further corrected so that their mean is 0.
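
This conversion might be sketched as follows. The exact scaling constant and whether the mean is taken per image or over the whole dataset are not stated, so dividing by 127.5 and subtracting the mean of the list passed in are assumptions, as is the name normalize.

```python
def normalize(pixels):
    """Map 0..255 values to roughly -1..1, then shift so that the mean is 0."""
    scaled = [p / 127.5 - 1.0 for p in pixels]      # 0 -> -1.0, 255 -> 1.0
    mean = sum(scaled) / len(scaled)
    return [s - mean for s in scaled]


normalized = normalize([0, 128, 255, 64])
```

Because the mean is subtracted after scaling, the result can fall slightly outside -1 to 1, which matches the description of the range as approximate.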

The structure and hyperparameters of the CNN to be used are described below. Assuming two convolutional layers, an example is as follows.

・First convolutional layer: filter size 5 × 5, 16 filters, nonlinear (activation) function: ReLU
・First pooling layer: max pooling, size 2 × 2
・Second convolutional layer: filter size 3 × 3, 26, 28, or 32 filters, nonlinear function: ReLU
・Second pooling layer: max pooling, size 2 × 2
・Hidden layer (one layer): 50 neurons
・Output layer: logistic regression, 2 neurons (the expected output is [1, 0] or [0, 1])
・Mini-batch size: 250 (smaller than in the Deep learning tutorial on which this is based, but possibly still too large)
Here, ReLU denotes the piecewise linear function f(x) = if x < 0 then 0 else x.
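
As a worked check of this structure, the size of the feature maps after each layer can be computed for a 24 × 48 input. The sketch assumes convolutions without padding and with stride 1, non-overlapping 2 × 2 pooling, and that 24 is the width and 48 the height; none of these details is stated in the description, and the function names are hypothetical.

```python
def relu(x):
    """f(x) = 0 if x < 0 else x."""
    return 0.0 if x < 0 else x


def conv_out(size, kernel):
    """Output size of a convolution with no padding and stride 1."""
    return size - kernel + 1


def pool_out(size, pool):
    """Output size of non-overlapping pooling."""
    return size // pool


def feature_shape(width=24, height=48, n_filters2=32):
    w, h = conv_out(width, 5), conv_out(height, 5)   # 20 x 44 after the first convolution
    w, h = pool_out(w, 2), pool_out(h, 2)            # 10 x 22 after the first pooling
    w, h = conv_out(w, 3), conv_out(h, 3)            # 8 x 20 after the second convolution
    w, h = pool_out(w, 2), pool_out(h, 2)            # 4 x 10 after the second pooling
    return w, h, n_filters2 * w * h                  # inputs to the 50-neuron hidden layer


shape_32 = feature_shape(n_filters2=32)
```

Under these assumptions the hidden layer receives 1,040, 1,120, or 1,280 inputs for 26, 28, or 32 second-layer filters, respectively.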

(Change in learning rate)
FIG. 3 shows examples of how the learning rate changes during learning in this embodiment. The mean and standard deviation of the learning rate can be measured at every epoch, but in this figure they are plotted every 5 epochs. When the initial value of the learning rate is too low (FIG. 3(b)), its mean first increases and then decreases, whereas in FIG. 3(a) it does not increase. In FIG. 3(b) the initial value was somewhat too low, and this is considered to have been corrected autonomously. The standard deviation of the learning rate is made relatively large in the initial state and usually decreases as learning progresses, although it sometimes increases. In either case, the learning rate is adjusted autonomously. When machine learning of another system, or optimization or search, is performed instead of neural network learning, another learning parameter that controls the learning process, or a parameter that controls the optimization or search process, is adjusted autonomously in place of the learning rate.

The values of the learning rate (or learning parameter, or optimization and search parameter) and their changes can be displayed during learning or at the end of learning by means such as a graph like FIG. 3 or a table.
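
A table of the mean and standard deviation of the learning rate every 5 epochs might be produced as follows. The layout of history (one list of learning rates per epoch), the use of the population standard deviation, and the name learning_rate_summary are assumptions made for this sketch.

```python
import statistics


def learning_rate_summary(history, every=5):
    """Return (epoch, mean, standard deviation) rows sampled every `every` epochs.

    history[e] is the list of learning rates of all individuals at epoch e.
    """
    rows = []
    for epoch in range(0, len(history), every):
        rates = history[epoch]
        rows.append((epoch, statistics.mean(rates), statistics.pstdev(rates)))
    return rows


history = [[0.1, 0.2, 0.3] for _ in range(11)]     # made-up values for 11 epochs
rows = learning_rate_summary(history)
for epoch, mean, std in rows:
    print(f"{epoch:4d}  mean={mean:.4f}  std={std:.4f}")
```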

(Supplementary notes on learning performance and related matters)
First, the number of individuals (number of chromosomes) is discussed. A larger number makes it more probable that a solution closer to the optimum is found, but unless all individuals can be computed in parallel, a larger number also takes more computation time. As an example of a value that balances computation time against search range, about 12 individuals may be used. With 12 individuals, an experiment of up to 100 epochs takes, for example, about 8 hours using a GPU.

Second, a method of maintaining diversity, that is, a method of controlling how global the search is, is described. If selection and mutation are frequent, all individuals tend to become copies of a single individual. If that individual is located in a region that contains only a local optimum far from the global optimum, a satisfactory solution cannot be reached. It is considered necessary to keep the probability of selection and mutation per epoch at 5% or less (0.6 individuals or less when there are 12 individuals). Lowering the probability of selection and mutation keeps the search more global even as learning proceeds. Conversely, raising it makes the search local relatively early.

Third, global search performance is discussed. About 12 chromosomes cannot be said to be sufficient for a global search. In that case 12 locations are searched at first, but because individuals generated by copying gradually increase, the search soon narrows to the neighborhoods of about 3 points even when the above method of maintaining diversity is applied. How many regions are being searched can be estimated by computing and displaying the Euclidean distance between the best individual found so far and each current individual. FIG. 4 shows an example of such a display. The four values at the right of each row are the Euclidean distances of the weights of each layer. Individual 9 (the row whose leftmost number is 9) is the best individual, and it can be seen that the neighborhoods of at least two locations are being searched in parallel.
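
The per-layer Euclidean distance used for this display might be computed as follows. The sketch assumes that each individual's weights are stored as one flat list per layer; the name layer_distances and the example values are hypothetical.

```python
import math


def layer_distances(best, individual):
    """Euclidean distance between two individuals, computed separately for each layer."""
    return [math.sqrt(sum((a - b) ** 2 for a, b in zip(layer_b, layer_i)))
            for layer_b, layer_i in zip(best, individual)]


best = [[0.0, 0.0], [1.0]]
other = [[3.0, 4.0], [1.0]]
distances = layer_distances(best, other)    # one value per layer
```

Individuals whose distances to the best individual are all small are searching the same neighborhood, so counting the groups of similar rows gives an estimate of how many regions are still being explored.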

Fourth, the deletion of diverged individuals is discussed. In the simple backpropagation algorithm used in this experiment, the weights often diverge (become "nan") during backpropagation. Such an individual is a zombie, that is, an individual with no prospect of yielding a solution however long computation continues, and should be deleted. Logic for deleting such individuals can be built in, but because their evaluation values become extremely poor, they are deleted preferentially by this algorithm, so special logic is not strictly necessary. In that case, however, the frequency of selection and mutation must be higher than the frequency at which zombies arise. When many zombies arise, building in logic to remove them makes the computation more efficient.

For example, in process 207 the computer 610 may delete all chromosomes whose evaluation satisfies a predetermined condition (for example, is worse than a predetermined value) and generate as many chromosome copies as were deleted. When a plurality of chromosomes are deleted, the computer 610 may, for example, generate as many copies of the best-evaluated chromosome as there were deleted chromosomes and change the learning rates so that the best-evaluated chromosome and its copies all have different learning rates. Alternatively, the computer 610 may select as many top-evaluated chromosomes as were deleted, generate one copy of each selected chromosome, and change at least one of the learning rates of each source chromosome and its copy so that the two differ.
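
The first variant, deleting every chromosome worse than a threshold and refilling the population from the best one, might be sketched as follows. Treating a "nan" evaluation as a zombie, the scheme for making the learning rates differ (f, 1/f, f², 1/f², and so on), and the name replace_bad are assumptions made for this sketch.

```python
import copy
import math


def replace_bad(population, errors, threshold, f=1.2):
    """Delete individuals whose error is NaN or above threshold; refill with copies of the best."""
    def is_bad(e):
        return math.isnan(e) or e > threshold

    keep = [i for i, e in enumerate(errors) if not is_bad(e)]
    if not keep:
        return population                   # nothing usable to copy from
    best = min(keep, key=lambda i: errors[i])
    survivors = [population[i] for i in keep]
    children = []
    for k in range(1, len(population) - len(survivors) + 1):
        child = copy.deepcopy(population[best])
        # Alternate lr*f, lr/f, lr*f^2, lr/f^2, ... so that the best and its copies all differ.
        power = (k + 1) // 2
        child["lr"] = child["lr"] * f ** power if k % 2 else child["lr"] / f ** power
        children.append(child)
    return survivors + children


pop = [{"w": [1.0], "lr": 0.1}, {"w": [float("nan")], "lr": 0.5},
       {"w": [2.0], "lr": 0.2}, {"w": [9.0], "lr": 0.3}]
new_pop = replace_bad(pop, errors=[0.2, float("nan"), 0.05, 0.9], threshold=0.5)
```

In this example the zombie and the individual with error 0.9 are removed, and two copies of the individual with error 0.05 are added with learning rates 0.2 × 1.2 and 0.2 / 1.2.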

(Summary of the embodiments of the present invention)
As described above, this embodiment relates to a new learning method that incorporates GA techniques into the backpropagation learning process. In this method, a neural network is represented on a computer (as a program and data) as an individual having one chromosome (data), and the hyperparameters of the neural network, that is, the weights of the connections between neurons and the like, are encoded (represented) in each individual's chromosome. The learning rate of the neural network is encoded in each chromosome as well. A plurality of individuals are prepared and computed in parallel, and GA selection and mutation are performed at each step (1 epoch) of the parallelized backpropagation learning. That is, a poorly performing individual is deleted and replaced with a copy of a well-performing individual whose learning rate has been mutated.

The method of the present embodiment is not limited to neural network learning and can also be applied to other kinds of machine learning. That is, when iterative learning is performed on data such as images, speech, or documents and the result can be evaluated numerically, the learning parameters that control the machine learning are coded on chromosomes, and GA selection and mutation are performed after each step of the parallelized learning.

Furthermore, the method of the present invention can also be applied to optimization and search. That is, when optimization or search is performed by repeatedly moving within a search space and the current point in the search space can be evaluated numerically, the optimization/search control parameters that control the optimization or search are coded on chromosomes, and GA selection and mutation are performed after each step of the parallelized optimization or search.
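As an illustration of the search case, the sketch below runs several searchers over the same objective, each carrying its current point and a step size as its control parameter, and applies selection and mutation after every step. The local move (accept a Gaussian perturbation only if it improves the objective), the `sphere` test function, and all names and constants are assumptions chosen for this example; the text does not prescribe a particular search procedure.

```python
import random


def sphere(point):
    """Toy objective to minimize: squared distance from the origin."""
    return sum(v * v for v in point)


def evolve_search(objective, dim=2, pop_size=6, steps=200, seed=1):
    rng = random.Random(seed)
    # Each chromosome holds the current point in the search space and the
    # step size that controls how far one move may go.
    population = [
        {"point": [rng.uniform(-5.0, 5.0) for _ in range(dim)],
         "step": 10 ** rng.uniform(-2.0, 0.0)}
        for _ in range(pop_size)
    ]
    best_trace = []
    for _ in range(steps):
        for s in population:  # one search step per individual
            candidate = [v + rng.gauss(0.0, s["step"]) for v in s["point"]]
            if objective(candidate) < objective(s["point"]):
                s["point"] = candidate
        population.sort(key=lambda s: objective(s["point"]))  # best first
        best = population[0]
        # Selection and mutation: replace the worst searcher with a copy of the
        # best one whose step size is perturbed.
        population[-1] = {"point": list(best["point"]),
                          "step": best["step"] * 2.0 ** rng.uniform(-1.0, 1.0)}
        best_trace.append(objective(best["point"]))
    return population, best_trace
```

Calling `evolve_search(sphere)` returns the final population and the best objective value after each step. Because a move is accepted only when it improves the objective and the best searcher is never deleted, that trace never increases.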

The greatest effect of the present embodiment is that the learning rate (or the learning parameters, or the optimization/search control parameters) is determined autonomously. That is, by performing selection and mutation after each step of back propagation learning (or learning, optimization, or search), the neural network (or system) obtained as the learning result is optimized, as in the conventional back propagation learning method (or learning method, optimization method, or search method) and in methods that combine it with GA. At the same time, the learning rate (or the learning parameters, or the optimization/search control parameters) used during the back propagation learning process (or learning, optimization, or search process) can also be optimized, which conventional methods could not do. That is, the selection and mutation performed at each step bring the average value of the learning rate (or learning parameter, or optimization/search control parameter) close to the optimum for that step, and the average follows the optimum as it changes with the progress of the learning (or optimization, or search). The learning rate (or learning parameter, or optimization/search control parameter) usually takes a relatively large value early in the learning (or optimization, or search) and can be lowered along an optimal schedule as the learning proceeds; when it is better not to lower it, it is not lowered.

At the same time, the present embodiment has the effect that the search range in the learning (or optimization, or search) can be controlled appropriately. A global search can be performed early in the learning (or optimization, or search), and the search range can be narrowed as the learning progresses. Searching globally in the early stage lowers the probability of falling into a local optimum, and in the later stage a narrow range can be searched efficiently in parallel. For this control to be appropriate, however, the frequency of selection and mutation must itself be controlled appropriately.

The present invention is not limited to the embodiments described above and includes various modifications. For example, the above embodiments have been described in detail for a better understanding of the present invention, and the invention is not necessarily limited to configurations having all of the elements described. For example, other elements may be added to, deleted from, or substituted for part of the configuration of the above embodiments.

Some or all of the above configurations, functions, processing units, processing means, and the like may be realized in hardware, for example by designing them as an integrated circuit. The above configurations, functions, and the like may also be realized in software by having a processor interpret and execute programs that implement the respective functions. Information such as the programs, tables, and files that implement each function can be stored in a storage device such as a non-volatile semiconductor memory, a hard disk drive, or an SSD (Solid State Drive), or in a computer-readable non-transitory data storage medium such as an IC card, an SD card, or a DVD.

The control lines and information lines shown are those considered necessary for the explanation, and not all control lines and information lines of a product are necessarily shown. In practice, almost all components may be considered to be interconnected.

501 Learning control computer
502 Learning control program
503 Learning data generation program
504 Original data
505 Teacher information
506 Learning data with teacher information
507 Group of neural networks for learning
508 Neural network for identification
509 Data to be identified
510 Identification result output

Claims

A machine learning method executed by a computer having a processor and a storage device connected to the processor,
The storage device holds a plurality of programs for realizing a plurality of systems that execute predetermined processing, a plurality of structural parameters corresponding to each of the plurality of programs, and a plurality of learning parameters corresponding to the plurality of systems,
Each of the learning parameters is a parameter that specifies a change in the structural parameter in learning performed by each of the systems,
The machine learning method includes:
A first procedure in which the processor causes each system to learn a predetermined data set by using a learning parameter corresponding to each system;
A second procedure in which the processor evaluates each of the systems according to a predetermined evaluation method;
A third procedure in which the processor selects, from the plurality of systems, a first system and a second system having a higher evaluation than the first system, generates a program for realizing a copy of the second system and a plurality of structural parameters corresponding thereto by copying the program and the plurality of structural parameters corresponding to the second system, generates a copy of the learning parameter corresponding to the second system as the learning parameter corresponding to the copy of the second system, and changes at least one of the learning parameter corresponding to the second system and the learning parameter corresponding to the copy of the second system so that the two differ from each other,
wherein the first to third procedures are executed again for the plurality of systems other than the first system.

The machine learning method according to claim 1, wherein
The combination of the plurality of structural parameters corresponding to each system and the learning parameters corresponding to each system is held as one chromosome in the genetic algorithm,
The machine learning method further includes a step in which the processor uses random numbers to determine initial values of the plurality of structural parameters corresponding to the respective systems and initial values of learning parameters corresponding to the respective systems,
In the third procedure, the processor generates a copy of the chromosome corresponding to the second system and uses a random number to determine the value of at least one of the learning parameter corresponding to the second system and the learning parameter corresponding to the copy of the second system.

The machine learning method according to claim 1, wherein
Each of the systems is a neural network,
The plurality of structural parameters corresponding to each system include weights of connections between neurons in the neural network,
In the first procedure, the processor causes the respective systems to learn the predetermined data set by back propagation learning,
The learning parameter is a learning rate indicating the magnitude of the amount of change of the weights in the back propagation learning.

The machine learning method according to claim 3, wherein
The combination of the plurality of structural parameters corresponding to each system and the learning parameters corresponding to each system is held as one chromosome in the genetic algorithm,
Each chromosome further includes information indicating the presence or absence of connection between the neurons,
In the third procedure, the processor changes the presence or absence of the connection between the neurons based on a predetermined mutation rule.

The machine learning method according to claim 4, wherein
Each of the chromosomes further includes information indicating the number of neurons included in each of the systems,
In the third procedure, the processor changes the number of neurons based on a predetermined mutation rule.

The machine learning method according to claim 4, wherein
Each chromosome further includes information indicating the number of stages of the neural network included in each system,
In the third procedure, the processor changes the number of stages of the neural network based on a predetermined mutation rule.

The machine learning method according to claim 3, wherein
The plurality of systems include a plurality of neural networks that differ in the connections between the neurons, or a plurality of neural networks that differ in both the number of the neurons and the connections between the neurons.

The machine learning method according to claim 1, wherein
In the third procedure, when the evaluations of two or more systems are lower than a predetermined value, the processor generates, for as many systems other than the two or more systems as there are systems among the two or more systems, copies of the programs for realizing those systems, copies of the plurality of structural parameters corresponding to those systems, and copies of the learning parameters corresponding to those systems,
wherein the first to third procedures are executed again for the plurality of systems other than the two or more systems.

The machine learning method according to claim 1, wherein
The computer has a plurality of the processors,
Each of the plurality of systems is assigned to each of the plurality of processors;
In the first procedure, each of the processors causes the system assigned to it to learn the predetermined data set using the learning parameter.

A machine learning device having a processor and a storage device connected to the processor,
The storage device holds a plurality of systems each including a program executed by the processor and a plurality of structural parameters for the program, and a learning parameter corresponding to each system,
Each of the learning parameters is a parameter that specifies a change in the structural parameter in learning performed by each of the systems,
The processor executes:
A first procedure for causing each system to learn a predetermined data set by using a learning parameter corresponding to each system;
A second procedure for evaluating each of the systems according to a predetermined evaluation method;
A third procedure for selecting, from the plurality of systems, a first system and a second system having a higher evaluation than the first system, generating a copy of the second system, generating a copy of the learning parameter corresponding to the second system as the learning parameter corresponding to the copy of the second system, and changing at least one of the learning parameter corresponding to the second system and the learning parameter corresponding to the copy of the second system so that the two differ from each other,
wherein the first to third procedures are executed again for the plurality of systems other than the first system.