WO2020087254A1 - Optimization method for convolutional neural network, and related product - Google Patents
Optimization method for convolutional neural network, and related product
- Publication number
- WO2020087254A1 (PCT/CN2018/112569)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- model
- layer
- replacement
- loss value
- replaced
- Prior art date
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Definitions
- The invention relates to the technical field of communications and artificial intelligence, and in particular to an optimization method for convolutional neural networks and related products.
- In recent years, deep convolutional neural networks, as a class of machine learning models, have achieved excellent results in computer vision and other fields, even exceeding average human performance on some tasks such as image classification and recognition and the game of Go.
- Convolutional neural networks generally include multiple convolutional layers, interspersed with pooling layers, linear rectification layers, and the like.
- The top of the network generally has one or more fully connected layers, topped by a loss function layer used for training.
- Transfer learning is a method of developing and training machine learning models; its purpose is to transfer a model M trained in domain A to domain B at lower cost, through methods such as retraining. Transfer learning is widely used with deep convolutional neural networks, but training such networks takes a long time and is costly.
- The embodiments of the present invention provide an optimization method for a convolutional neural network and related products.
- A trained model can be applied to the target domain after only simple retraining, which has the advantage of reducing costs.
- In a first aspect, an embodiment of the present invention provides an optimization method for a convolutional neural network.
- The method includes the following steps: obtaining a pre-trained model M; retraining the pre-trained model M on a data set D of a specified domain to obtain an initial model M0, and performing a layer replacement operation on the initial model M0; repeatedly performing the layer replacement operation to obtain multiple third intermediate models M3 and multiple loss values; and selecting the third intermediate model M3 with the smallest loss value as the output model.
- The layer replacement operation includes: determining, based on a bipartite graph maximum matching algorithm, that a standard convolutional layer e in the initial model M0 is suitable to be replaced with an efficient convolutional layer, and determining the effect gain of a first intermediate model M1 obtained by replacing the standard convolutional layer e with the efficient convolutional layer; renormalizing the parameters of the first intermediate model M1 to obtain a second intermediate model M2; initializing and retraining the second intermediate model M2 to obtain a third intermediate model M3; and calculating the loss value of the third intermediate model M3.
- The determination, based on the bipartite graph maximum matching algorithm, that the standard convolutional layer e in the initial model M0 is suitable to be replaced with an efficient convolutional layer specifically includes:
- finding, in the initial model M0, a group convolutional layer containing Ng groups such that the change in the importance of intra-layer connections is minimal.
- Optionally, the loss value includes the loss Lw, where Lw is the loss value defined by the loss formula described below.
- In a second aspect, an optimization device for a convolutional neural network is provided. The device includes:
- an obtaining unit, used to obtain the pre-trained model M;
- a training unit, used to retrain the pre-trained model M on the data set D of the specified domain to obtain the initial model M0;
- a replacement unit, used to perform the layer replacement operation on the initial model M0;
- a selection unit, used to control the replacement unit to repeatedly perform the layer replacement operation to obtain multiple third intermediate models M3 and multiple loss values, and to select the third intermediate model M3 with the smallest loss value as the output model.
- Optionally, the replacement unit is specifically used to find, in the initial model M0, a group convolutional layer containing Ng groups such that the change in the importance of intra-layer connections is minimal.
- Optionally, the loss value includes the loss Lw, where Lw is the loss value defined by the loss formula described below.
- In a third aspect, a computer-readable storage medium is provided, which stores a program for electronic data exchange, wherein the program causes a terminal to perform the method provided in the first aspect.
- The technical solution of the present application proposes a brand-new scheme for optimizing a convolutional neural network by replacing convolutional layers.
- In the prior art, it is difficult to select which convolutional layers should be replaced, and the replaced model is difficult to train.
- Optimization schemes that are not based on layer replacement often require a large amount of GPU computing resources, and the training time is usually very long.
- With this scheme, an optimized convolutional neural network model can be obtained within a few hours using only one NVIDIA Titan Xp GPU; it therefore saves time, improves efficiency, and reduces costs.
- FIG. 1 is a schematic flowchart of an optimization method for a convolutional neural network provided by this application.
- FIG. 2 is a schematic diagram of initializing parameters in a replacement layer provided by this application.
- FIG. 3 is a schematic structural diagram of an optimization device for a convolutional neural network provided by this application.
- The present application proposes an optimization method for convolutional neural networks based on transfer learning and convolutional layer replacement.
- The goal of this optimization method is to reduce the resource occupation and computation cost of a convolutional neural network model on a specific domain D (also called the target domain) while losing as little task performance as possible.
- The input accepted by this method is a pre-trained deep convolutional neural network model and a data set of the target domain; the output is an optimized, layer-replaced convolutional neural network model trained on the target domain data set.
- This optimized convolutional neural network model can be used in the target domain D.
- The input to this method is a pre-trained model, that is, a model pre-trained on a large data set that can solve general problems.
- The optimization method includes the following steps:
- Step S101: Obtain a pre-trained model M.
- Step S102: Retrain the pre-trained model M on the data set D of the specified domain to obtain an initial model M0, and perform a layer replacement operation on the initial model M0; the layer replacement operation includes the following steps S103-S106.
- Step S103: Based on the bipartite graph maximum matching algorithm, determine that a standard convolutional layer e in the initial model M0 is suitable to be replaced with an efficient convolutional layer, and determine the effect gain of the first intermediate model M1 obtained by replacing the standard convolutional layer e with the efficient convolutional layer.
- The technical solution of the present application uses a method based on the maximal bipartite matching algorithm to determine which standard convolutional layers are suitable to be replaced with efficient convolutional layers, and the effect gain that the replacement can bring.
- The problem to be solved by the bipartite graph maximum matching algorithm can be formally described as formula (1).
- In the formula, Ng refers to the number of groups of the group convolution in the replacement target, L is the number of layers in the entire network, and Cl is the number of input channels at layer l.
- The formula describes an optimization problem.
- The goal of the optimization is to find a layer replacement target, that is, a group convolutional layer containing Ng groups (when Ng equals the number of channels, the group convolution becomes a depthwise separable convolutional layer), such that the change in the importance of intra-layer connections is minimal; the importance measure, formula (2), is the L2 norm of all weights in each connection.
- Step S104: Renormalize the parameters of the first intermediate model M1 to obtain a second intermediate model M2.
- Step S105: Initialize and retrain the second intermediate model M2 to obtain a third intermediate model M3.
- Step S106: Calculate the loss value of the third intermediate model M3.
- The loss value may be calculated as follows:
- The loss value Lw is obtained by summing the loss values of all L layers.
- The loss value of each layer is the weighted average of two terms: the L2 norm of all weights (first term) and the L2 norm of the remaining weights after layer replacement (second term).
- An indicator of whether the k-th connection between the c-th input channel and the f-th output channel at layer l should be deleted takes the value 0 or 1.
- λ and λg are the weights of the weighted average.
- Step S107: Repeat the layer replacement operation to obtain multiple third intermediate models M3 and multiple loss values; select the third intermediate model M3 with the smallest loss value as the output model.
- The connections given by the bipartite graph maximum matching are used to initialize the model after layer replacement; the parameters used for initialization are the original parameters from the pre-trained model. Before initialization, the method additionally renormalizes the parameters to ensure that the initialized model can be trained to convergence as quickly as possible.
- This application also rearranges the channel order of the output by adding a pointwise convolution layer after the replaced convolutional layer. Refer to FIG. 2, which is a schematic diagram of initializing the parameters in a replacement layer.
- The technical solution of the present application proposes a brand-new scheme for optimizing a convolutional neural network by replacing convolutional layers.
- Optimization schemes that are not based on layer replacement often require a large amount of GPU computing resources, and the training time is usually very long.
- With this scheme, an optimized convolutional neural network model can be obtained within a few hours using only one NVIDIA Titan Xp GPU.
- FIG. 3 provides an optimization device for a convolutional neural network.
- The device includes:
- an obtaining unit 301, used to obtain the pre-trained model M;
- a training unit 302, configured to retrain the pre-trained model M on the data set D of the specified domain to obtain the initial model M0;
- a replacement unit 303, configured to perform the layer replacement operation on the initial model M0, where the layer replacement operation includes: determining, based on the bipartite graph maximum matching algorithm, that the standard convolutional layer e in the initial model M0 is suitable to be replaced with an efficient convolutional layer, and determining the effect gain of the first intermediate model M1 obtained by replacing the standard convolutional layer e with the efficient convolutional layer; renormalizing the parameters of the first intermediate model M1 to obtain the second intermediate model M2; initializing and retraining the second intermediate model M2 to obtain the third intermediate model M3; and calculating the loss value of the third intermediate model M3;
- a selection unit 304, configured to control the replacement unit to repeatedly perform the layer replacement operation to obtain multiple third intermediate models M3 and multiple loss values, and to select the third intermediate model M3 with the smallest loss value as the output model.
- An embodiment of the present invention also provides a computer storage medium, wherein the computer storage medium stores a computer program for electronic data exchange, and the computer program causes a computer to execute some or all of the steps of any of the convolutional neural network optimization methods described in the above method embodiments.
- An embodiment of the present invention also provides a computer program product; the computer program product includes a non-transitory computer-readable storage medium storing a computer program, and the computer program is operable to cause a computer to execute some or all of the steps of any of the convolutional neural network optimization methods described in the above method embodiments.
- In the several embodiments provided in this application, it should be understood that the disclosed device may be implemented in other ways.
- The device embodiments described above are merely illustrative.
- The division into units is only a division by logical function; in actual implementation there may be other division manners. For example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
- The mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices, or units, and may be electrical or take other forms.
- The units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
- The functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist alone physically, or two or more units may be integrated into one unit.
- The above integrated unit may be implemented in the form of hardware or in the form of a software program module.
- If the integrated unit is implemented in the form of a software program module and sold or used as an independent product, it may be stored in a computer-readable memory.
- The technical solution of the present invention, in essence, or the part contributing to the prior art, or all or part of the technical solution, may be embodied in the form of a software product; the computer software product is stored in a memory and includes several instructions to enable a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the methods described in the various embodiments of the present invention.
- The aforementioned memory includes various media that can store program code, such as a USB flash drive, read-only memory (ROM), random access memory (RAM), removable hard disk, magnetic disk, or optical disk.
- The program may be stored in a computer-readable memory, and the memory may include a flash disk, read-only memory (ROM), random access memory (RAM), magnetic disk, optical disk, and so on.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Health & Medical Sciences (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Evolutionary Computation (AREA)
- Artificial Intelligence (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Health & Medical Sciences (AREA)
- Image Analysis (AREA)
Abstract
An optimization method for a convolutional neural network, and a related product. The method comprises: obtaining a pre-trained model M; retraining the pre-trained model M on a data set D of a specified domain to obtain an initial model M0, and performing a layer replacement operation on the initial model M0, wherein the layer replacement operation comprises: determining, on the basis of a bipartite graph maximum matching algorithm, that a standard convolutional layer e in the initial model M0 is suitable to be replaced with a high-efficiency convolutional layer, and determining the effect gain of a first intermediate model M1 obtained by replacing the standard convolutional layer e with the high-efficiency convolutional layer; renormalizing parameters of the first intermediate model M1 to obtain a second intermediate model M2; initializing and retraining the second intermediate model M2 to obtain a third intermediate model M3; and calculating a loss value of the third intermediate model M3; repeatedly performing the layer replacement operation to obtain multiple third intermediate models M3 and multiple loss values; and selecting the third intermediate model M3 having the smallest loss value as an output model. The method is low in cost.
Description
The invention relates to the technical field of communications and artificial intelligence, and in particular to an optimization method for convolutional neural networks and related products.
In recent years, deep convolutional neural networks, as a class of machine learning models, have achieved excellent results in computer vision and other fields, even exceeding average human performance on some tasks such as image classification and recognition and the game of Go. Convolutional neural networks generally include multiple convolutional layers, interspersed with pooling layers, linear rectification layers, and the like; the top of the network generally has one or more fully connected layers, topped by a loss function layer used for training.
Transfer learning is a method of developing and training machine learning models; its purpose is to transfer a model M trained in domain A to domain B at lower cost, through methods such as retraining. Transfer learning is widely used with deep convolutional neural networks, but training such networks takes a long time and is costly.
Summary of the Invention
The embodiments of the present invention provide an optimization method for a convolutional neural network and related products. A trained model can be applied to the target domain after only simple retraining, which has the advantage of reducing costs.
In a first aspect, an embodiment of the present invention provides an optimization method for a convolutional neural network. The method includes the following steps:
obtaining a pre-trained model M;
retraining the pre-trained model M on a data set D of a specified domain to obtain an initial model M0, and performing a layer replacement operation on the initial model M0;
the layer replacement operation includes: determining, based on a bipartite graph maximum matching algorithm, that a standard convolutional layer e in the initial model M0 is suitable to be replaced with an efficient convolutional layer, and determining the effect gain of a first intermediate model M1 obtained by replacing the standard convolutional layer e with the efficient convolutional layer; renormalizing the parameters of the first intermediate model M1 to obtain a second intermediate model M2; initializing and retraining the second intermediate model M2 to obtain a third intermediate model M3; and calculating the loss value of the third intermediate model M3;
repeatedly performing the layer replacement operation to obtain multiple third intermediate models M3 and multiple loss values; and selecting the third intermediate model M3 with the smallest loss value as the output model.
Optionally, the determination, based on the bipartite graph maximum matching algorithm, that the standard convolutional layer e in the initial model M0 is suitable to be replaced with an efficient convolutional layer specifically includes:
finding, in the initial model M0, a group convolutional layer containing Ng groups such that the change in the importance of intra-layer connections is minimal;
the importance is the L2 norm of all weights in each connection.
Optionally, the loss value includes the loss Lw, where Lw is the loss value defined by the loss formula described below.
In a second aspect, an optimization device for a convolutional neural network is provided. The device includes:
an obtaining unit, used to obtain the pre-trained model M;
a training unit, used to retrain the pre-trained model M on the data set D of the specified domain to obtain the initial model M0;
a replacement unit, used to perform the layer replacement operation on the initial model M0, where the layer replacement operation includes: determining, based on the bipartite graph maximum matching algorithm, that the standard convolutional layer e in the initial model M0 is suitable to be replaced with an efficient convolutional layer, and determining the effect gain of the first intermediate model M1 obtained by replacing the standard convolutional layer e with the efficient convolutional layer; renormalizing the parameters of the first intermediate model M1 to obtain the second intermediate model M2; initializing and retraining the second intermediate model M2 to obtain the third intermediate model M3; and calculating the loss value of the third intermediate model M3;
a selection unit, used to control the replacement unit to repeatedly perform the layer replacement operation to obtain multiple third intermediate models M3 and multiple loss values, and to select the third intermediate model M3 with the smallest loss value as the output model.
Optionally, the replacement unit is specifically used to find, in the initial model M0, a group convolutional layer containing Ng groups such that the change in the importance of intra-layer connections is minimal;
the importance is the L2 norm of all weights in each connection.
Optionally, the loss value includes the loss Lw, where Lw is the loss value defined by the loss formula described below.
In a third aspect, a computer-readable storage medium is provided, which stores a program for electronic data exchange, wherein the program causes a terminal to perform the method provided in the first aspect.
Implementing the embodiments of the present invention has the following beneficial effects:
It can be seen that the technical solution of the present application proposes a brand-new scheme for optimizing a convolutional neural network by replacing convolutional layers. In the prior art, it is difficult to select which convolutional layers should be replaced, and the replaced model is difficult to train. Optimization schemes that are not based on layer replacement often require a large amount of GPU computing resources, and the training time is usually very long. With this scheme, an optimized convolutional neural network model can be obtained within a few hours using only one NVIDIA Titan Xp GPU; it therefore saves time, improves efficiency, and reduces costs.
In order to explain the technical solutions in the embodiments of the present invention more clearly, the drawings required for the description of the embodiments are briefly introduced below. Obviously, the drawings in the following description show some embodiments of the present invention; for those of ordinary skill in the art, other drawings can be obtained based on these drawings without creative effort.
FIG. 1 is a schematic flowchart of an optimization method for a convolutional neural network provided by this application.
FIG. 2 is a schematic diagram of initializing parameters in a replacement layer provided by this application.
FIG. 3 is a schematic structural diagram of an optimization device for a convolutional neural network provided by this application.
The technical solutions in the embodiments of the present invention will be described clearly and completely below with reference to the drawings in the embodiments of the present invention. Obviously, the described embodiments are only some of the embodiments of the present invention rather than all of them. Based on the embodiments of the present invention, all other embodiments obtained by a person of ordinary skill in the art without creative effort fall within the protection scope of the present invention.
The terms "first", "second", "third", "fourth", and so on in the description, the claims, and the drawings of the present invention are used to distinguish different objects, not to describe a specific order. In addition, the terms "including" and "having" and any variations thereof are intended to cover non-exclusive inclusion. For example, a process, method, system, product, or device that includes a series of steps or units is not limited to the listed steps or units, but optionally also includes steps or units that are not listed, or optionally also includes other steps or units inherent to the process, method, product, or device.
Reference herein to an "embodiment" means that a specific feature, result, or characteristic described in connection with the embodiment may be included in at least one embodiment of the present invention. The appearance of this phrase in various places in the specification does not necessarily refer to the same embodiment, nor to an independent or alternative embodiment mutually exclusive with other embodiments. Those skilled in the art understand, explicitly and implicitly, that the embodiments described herein can be combined with other embodiments.
The present application proposes an optimization method for convolutional neural networks based on transfer learning and convolutional layer replacement. The goal of this optimization method is to reduce the resource occupation and computation cost of a convolutional neural network model on a specific domain D (also called the target domain) while losing as little task performance as possible. The input accepted by this method is a pre-trained deep convolutional neural network model and a data set of the target domain; the output is an optimized, layer-replaced convolutional neural network model trained on the target domain data set, and this optimized convolutional neural network model can be used in the target domain D.
As shown in FIG. 1, the input to this method is a pre-trained model, that is, a model pre-trained on a large data set that can solve general problems. As shown in FIG. 1, the optimization method includes the following steps:
Step S101: Obtain a pre-trained model M.
Step S102: Retrain the pre-trained model M on the data set D of the specified domain to obtain an initial model M0, and perform a layer replacement operation on the initial model M0; the layer replacement operation includes the following steps S103-S106.
Step S103: Based on the bipartite graph maximum matching algorithm, determine that a standard convolutional layer e in the initial model M0 is suitable to be replaced with an efficient convolutional layer, and determine the effect gain of the first intermediate model M1 obtained by replacing the standard convolutional layer e with the efficient convolutional layer.
There are two core problems to be solved in this application: how to select the standard convolutional layers to be replaced and the replacement targets, and how to train the layer-replaced model on the target data set.
Because a deep convolutional neural network model often contains dozens of convolutional layers and there are many possible replacement choices, an enumeration algorithm would run into a severe combinatorial explosion problem; enumeration greatly increases the overhead and is therefore inefficient.
The technical solution of the present application uses a method based on the maximal bipartite matching algorithm to determine which standard convolutional layers are suitable to be replaced with efficient convolutional layers, and the effect gain that the replacement can bring. The problem to be solved by the bipartite graph maximum matching algorithm can be formally described as formula (1).
In the formula, Ng refers to the number of groups of the group convolution in the replacement target, L is the number of layers in the entire network, and Cl is the number of input channels at layer l; the remaining symbols denote, respectively, the importance of the connection between the c-th input channel and the f-th output channel at layer l, and whether the connection between the c-th input channel and the f-th output channel at layer l should be deleted.
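In the published text, formula (1) itself is rendered as an image. As a non-authoritative reconstruction from the surrounding definitions (the symbols I^l_{c,f} for connection importance, γ^l_{c,f} ∈ {0,1} for the deletion indicator, and F_l for the number of output channels at layer l are assumed names, not the patent's own notation), the optimization problem can be sketched as:

```latex
% Hypothetical reconstruction of formula (1): choose the deletion pattern
% (equivalently, a bipartite matching between input and output channels)
% that minimizes the total importance of the deleted intra-layer connections,
% subject to the retained connections forming a group convolution with N_g groups.
\begin{equation}
\min_{\gamma}\; \sum_{l=1}^{L} \sum_{c=1}^{C_l} \sum_{f=1}^{F_l}
  \gamma^{l}_{c,f}\, I^{l}_{c,f}
\qquad \text{s.t.}\;\;
\bigl\{(c,f) : \gamma^{l}_{c,f} = 0\bigr\}
\text{ forms a group convolution with } N_g \text{ groups}
\tag{1}
\end{equation}
```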
The formula describes an optimization problem. The goal of the optimization is to find a layer replacement target, that is, a group convolutional layer containing Ng groups (when Ng equals the number of channels, the group convolution becomes a depthwise separable convolutional layer), such that the change in the importance of intra-layer connections is minimal; the importance measure, formula (2), is the L2 norm of all weights in each connection.
In formula (2), the meaning of the deletion indicator is unchanged; the remaining symbols refer to the connection weight between the c-th input channel and the f-th output channel at layer l, and to the k-th element of that weight, respectively.
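Formula (2) likewise appears only as an image in the published text. Given the statement that the importance is the L2 norm of all weights in each connection, a plausible reconstruction (with w^l_{c,f,k} as an assumed symbol for the k-th element of the connection weight W^l_{c,f}) is:

```latex
% Hypothetical reconstruction of formula (2): the importance of the connection
% between input channel c and output channel f at layer l is the L2 norm of
% that connection's kernel weights (k indexes the kernel elements).
\begin{equation}
I^{l}_{c,f} = \bigl\| W^{l}_{c,f} \bigr\|_{2}
            = \sqrt{\sum_{k} \bigl( w^{l}_{c,f,k} \bigr)^{2}}
\tag{2}
\end{equation}
```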
Step S104: Renormalize the parameters of the first intermediate model M1 to obtain a second intermediate model M2.
Step S105: Initialize and retrain the second intermediate model M2 to obtain a third intermediate model M3.
Step S106: Calculate the loss value of the third intermediate model M3.
The loss value may be calculated as follows: the loss value Lw is obtained by summing the loss values of all L layers. The loss value of each layer is the weighted average of two terms: the L2 norm of all weights (first term) and the L2 norm of the remaining weights after layer replacement (second term). An indicator of whether the k-th connection between the c-th input channel and the f-th output channel at layer l should be deleted takes the value 0 or 1. λ and λg are the weights of the weighted average.
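The loss formula also appears only as an image in the published text. A sketch consistent with the verbal description (summing over the L layers a weighted combination of the L2 norm of all weights and the L2 norm of the weights remaining after replacement; the element-wise deletion indicator γ^l_{c,f,k} and the weights λ and λg are assumed notation) is:

```latex
% Hypothetical reconstruction of the loss value L_w: per layer, a weighted
% combination of the L2 norm of all weights (first term) and the L2 norm of
% the weights remaining after layer replacement (second term); gamma = 1
% marks a deleted connection element.
\begin{equation}
L_w = \sum_{l=1}^{L} \Biggl[
  \lambda \,\bigl\| W^{l} \bigr\|_{2}
  + \lambda_g \sqrt{\sum_{c,f,k} \bigl(1-\gamma^{l}_{c,f,k}\bigr)\bigl(w^{l}_{c,f,k}\bigr)^{2}}
\Biggr]
\end{equation}
```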
Step S107: Repeat the layer replacement operation to obtain multiple third intermediate models M3 and multiple loss values; select the third intermediate model M3 with the smallest loss value as the output model.
The connections given by the bipartite graph maximum matching are used to initialize the model after layer replacement; the parameters used for initialization are the original parameters from the pre-trained model. Before initialization, the method additionally renormalizes the parameters to ensure that the initialized model can be trained to convergence as quickly as possible. This application also rearranges the channel order of the output by adding a pointwise convolution layer after the replaced convolutional layer. Refer to FIG. 2, which is a schematic diagram of initializing the parameters in a replacement layer.
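As a concrete illustration of this step, the following PyTorch snippet (our own sketch, not code from the patent; the function name, the assumption that channels are already permuted into contiguous groups, and the divisibility of channel counts by the group number are all ours) builds a grouped convolution initialized from a standard convolution's original weights and appends a pointwise (1x1) convolution that restores the output channel order:

```python
import torch
import torch.nn as nn

def replace_with_grouped_conv(conv: nn.Conv2d, n_groups: int,
                              out_perm: torch.Tensor) -> nn.Sequential:
    """Hypothetical sketch: replace a standard convolution with a grouped
    convolution initialized from the original weights, followed by a
    pointwise convolution that rearranges the output channel order.
    Assumes in/out channel counts are divisible by n_groups and that the
    bipartite matching has already permuted channels into contiguous groups."""
    c_in, c_out = conv.in_channels, conv.out_channels
    grouped = nn.Conv2d(c_in, c_out, conv.kernel_size, stride=conv.stride,
                        padding=conv.padding, groups=n_groups, bias=False)
    gi, go = c_in // n_groups, c_out // n_groups
    with torch.no_grad():
        # Copy each group's block of the original weight tensor; connections
        # outside these blocks are the ones the matching marked for deletion.
        for g in range(n_groups):
            grouped.weight[g * go:(g + 1) * go] = \
                conv.weight[g * go:(g + 1) * go, g * gi:(g + 1) * gi]
        # Pointwise convolution acting as a fixed permutation matrix that
        # maps output channel src back to position dst for downstream layers.
        pointwise = nn.Conv2d(c_out, c_out, kernel_size=1, bias=False)
        pointwise.weight.zero_()
        for dst, src in enumerate(out_perm.tolist()):
            pointwise.weight[dst, src, 0, 0] = 1.0
    return nn.Sequential(grouped, pointwise)
```

For example, replace_with_grouped_conv(nn.Conv2d(64, 128, 3, padding=1), n_groups=4, out_perm=torch.randperm(128)) produces a drop-in module with the same input and output shape as the original layer.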
The technical solution of the present application proposes a brand-new scheme for optimizing a convolutional neural network by replacing convolutional layers. In the prior art, it is difficult to select which convolutional layers should be replaced, and the replaced model is difficult to train. Optimization schemes that are not based on layer replacement often require a large amount of GPU computing resources, and the training time is usually very long. With this scheme, an optimized convolutional neural network model can be obtained within a few hours using only one NVIDIA Titan Xp GPU.
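Viewed end to end, steps S101-S107 form a simple search loop. The following Python sketch is our own pseudocode-style summary of that control flow; every helper (retrain, replace_layer, renormalize, initialize, compute_loss) is a hypothetical placeholder for the corresponding step, not an API defined by the patent:

```python
def optimize(pretrained_model, dataset_d, candidate_replacements):
    """Hypothetical sketch of steps S101-S107: try each candidate layer
    replacement, renormalize, retrain, and keep the lowest-loss model."""
    m0 = retrain(pretrained_model, dataset_d)          # S101-S102
    best_model, best_loss = None, float("inf")
    for layer_e, n_groups in candidate_replacements:   # repetition (S107)
        m1 = replace_layer(m0, layer_e, n_groups)      # S103: bipartite matching
        m2 = renormalize(m1)                           # S104
        m3 = retrain(initialize(m2), dataset_d)        # S105
        loss = compute_loss(m3)                        # S106
        if loss < best_loss:
            best_model, best_loss = m3, loss
    return best_model                                  # smallest loss wins
```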
Referring to FIG. 3, FIG. 3 provides an optimization device for a convolutional neural network. The device includes:
an obtaining unit 301, used to obtain the pre-trained model M;
a training unit 302, configured to retrain the pre-trained model M on the data set D of the specified domain to obtain the initial model M0;
a replacement unit 303, configured to perform the layer replacement operation on the initial model M0, where the layer replacement operation includes: determining, based on the bipartite graph maximum matching algorithm, that the standard convolutional layer e in the initial model M0 is suitable to be replaced with an efficient convolutional layer, and determining the effect gain of the first intermediate model M1 obtained by replacing the standard convolutional layer e with the efficient convolutional layer; renormalizing the parameters of the first intermediate model M1 to obtain the second intermediate model M2; initializing and retraining the second intermediate model M2 to obtain the third intermediate model M3; and calculating the loss value of the third intermediate model M3;
a selection unit 304, configured to control the replacement unit to repeatedly perform the layer replacement operation to obtain multiple third intermediate models M3 and multiple loss values, and to select the third intermediate model M3 with the smallest loss value as the output model.
An embodiment of the present invention also provides a computer storage medium, wherein the computer storage medium stores a computer program for electronic data exchange, and the computer program causes a computer to execute some or all of the steps of any of the convolutional neural network optimization methods described in the above method embodiments.
An embodiment of the present invention also provides a computer program product; the computer program product includes a non-transitory computer-readable storage medium storing a computer program, and the computer program is operable to cause a computer to execute some or all of the steps of any of the convolutional neural network optimization methods described in the above method embodiments.
It should be noted that, for simplicity of description, the foregoing method embodiments are all expressed as a series of action combinations, but those skilled in the art should know that the present invention is not limited by the described sequence of actions, because according to the present invention certain steps may be performed in other orders or simultaneously. Secondly, those skilled in the art should also know that the embodiments described in the specification are all optional embodiments, and the actions and modules involved are not necessarily required by the present invention.
In the above embodiments, the description of each embodiment has its own emphasis. For a part not detailed in one embodiment, refer to the related descriptions of other embodiments.
In the several embodiments provided in this application, it should be understood that the disclosed device may be implemented in other ways. For example, the device embodiments described above are merely illustrative; the division into units is only a division by logical function, and in actual implementation there may be other division manners, for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices, or units, and may be electrical or take other forms.
The units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist alone physically, or two or more units may be integrated into one unit. The above integrated unit may be implemented in the form of hardware or in the form of a software program module.
If the integrated unit is implemented in the form of a software program module and sold or used as an independent product, it may be stored in a computer-readable memory. Based on such an understanding, the technical solution of the present invention, in essence, or the part contributing to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a memory and includes several instructions to enable a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the methods described in the various embodiments of the present invention. The aforementioned memory includes various media that can store program code, such as a USB flash drive, read-only memory (ROM), random access memory (RAM), removable hard disk, magnetic disk, or optical disk.
A person of ordinary skill in the art may understand that all or part of the steps of the various methods in the above embodiments may be completed by a program instructing relevant hardware; the program may be stored in a computer-readable memory, and the memory may include a flash disk, read-only memory (ROM), random access memory (RAM), magnetic disk, optical disk, and so on.
The embodiments of the present invention have been described in detail above, and specific examples are used herein to explain the principles and implementations of the present invention. The descriptions of the above embodiments are only used to help understand the method and core idea of the present invention. Meanwhile, for those of ordinary skill in the art, there will be changes in the specific implementations and application scope according to the ideas of the present invention. In summary, the content of this specification should not be construed as limiting the present invention.
Claims (8)
- An optimization method for a convolutional neural network, characterized in that the method includes the following steps: obtaining a pre-trained model M; retraining the pre-trained model M on a data set D of a specified domain to obtain an initial model M0, and performing a layer replacement operation on the initial model M0; the layer replacement operation includes: determining, based on a bipartite graph maximum matching algorithm, that a standard convolutional layer e in the initial model M0 is suitable to be replaced with an efficient convolutional layer, and determining the effect gain of a first intermediate model M1 obtained by replacing the standard convolutional layer e with the efficient convolutional layer; renormalizing the parameters of the first intermediate model M1 to obtain a second intermediate model M2; initializing and retraining the second intermediate model M2 to obtain a third intermediate model M3; and calculating the loss value of the third intermediate model M3; repeatedly performing the layer replacement operation to obtain multiple third intermediate models M3 and multiple loss values; and selecting the third intermediate model M3 with the smallest loss value as the output model.
- The method according to claim 1, characterized in that the determination, based on the bipartite graph maximum matching algorithm, that the standard convolutional layer e in the initial model M0 is suitable to be replaced with an efficient convolutional layer specifically includes: finding, in the initial model M0, a group convolutional layer containing Ng groups such that the change in the importance of intra-layer connections is minimal; the importance is the L2 norm of all weights in each connection.
- An optimization device for a convolutional neural network, characterized in that the device includes: an obtaining unit, used to obtain the pre-trained model M; a training unit, used to retrain the pre-trained model M on the data set D of the specified domain to obtain the initial model M0; a replacement unit, used to perform the layer replacement operation on the initial model M0, where the layer replacement operation includes: determining, based on the bipartite graph maximum matching algorithm, that the standard convolutional layer e in the initial model M0 is suitable to be replaced with an efficient convolutional layer, and determining the effect gain of the first intermediate model M1 obtained by replacing the standard convolutional layer e with the efficient convolutional layer; renormalizing the parameters of the first intermediate model M1 to obtain the second intermediate model M2; initializing and retraining the second intermediate model M2 to obtain the third intermediate model M3; and calculating the loss value of the third intermediate model M3; and a selection unit, used to control the replacement unit to repeatedly perform the layer replacement operation to obtain multiple third intermediate models M3 and multiple loss values, and to select the third intermediate model M3 with the smallest loss value as the output model.
- The device according to claim 4, characterized in that the replacement unit is specifically used to find, in the initial model M0, a group convolutional layer containing Ng groups such that the change in the importance of intra-layer connections is minimal; the importance is the L2 norm of all weights in each connection.
- A computer-readable storage medium storing a program for electronic data exchange, wherein the program causes a terminal to perform the method provided in any one of claims 1-3.
- A computer program product comprising a non-transitory computer-readable storage medium storing a computer program, the computer program being operable to cause a computer to perform the method provided in any one of claims 1-3.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2018/112569 WO2020087254A1 (en) | 2018-10-30 | 2018-10-30 | Optimization method for convolutional neural network, and related product |
CN201880083507.4A CN111602145A (en) | 2018-10-30 | 2018-10-30 | Optimization method of convolutional neural network and related product |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2018/112569 WO2020087254A1 (en) | 2018-10-30 | 2018-10-30 | Optimization method for convolutional neural network, and related product |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2020087254A1 true WO2020087254A1 (en) | 2020-05-07 |
Family
ID=70463304
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2018/112569 WO2020087254A1 (en) | 2018-10-30 | 2018-10-30 | Optimization method for convolutional neural network, and related product |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN111602145A (en) |
WO (1) | WO2020087254A1 (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113128670B (en) * | 2021-04-09 | 2024-03-19 | 南京大学 | Neural network model optimization method and device |
CN114648671A (en) * | 2022-02-15 | 2022-06-21 | 成都臻识科技发展有限公司 | Detection model generation method and device based on deep learning |
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20150006444A1 (en) * | 2013-06-28 | 2015-01-01 | Denso Corporation | Method and system for obtaining improved structure of a target neural network |
CN105844653A (en) * | 2016-04-18 | 2016-08-10 | 深圳先进技术研究院 | Multilayer convolution neural network optimization system and method |
CN106485324A (en) * | 2016-10-09 | 2017-03-08 | 成都快眼科技有限公司 | A kind of convolutional neural networks optimization method |
CN108319988A (en) * | 2017-01-18 | 2018-07-24 | 华南理工大学 | A kind of accelerated method of deep neural network for handwritten Kanji recognition |
Also Published As
Publication number | Publication date |
---|---|
CN111602145A (en) | 2020-08-28 |
Legal Events

Date | Code | Title | Description
---|---|---|---
 | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 18938612; Country of ref document: EP; Kind code of ref document: A1
 | NENP | Non-entry into the national phase | Ref country code: DE
 | 32PN | Ep: public notification in the ep bulletin as address of the adressee cannot be established | Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 111021)
 | 122 | Ep: pct application non-entry in european phase | Ref document number: 18938612; Country of ref document: EP; Kind code of ref document: A1