CN108171649B - Image stylization method for keeping focus information - Google Patents
Image stylization method for keeping focus information
- Publication number
- CN108171649B CN108171649B CN201711292746.4A CN201711292746A CN108171649B CN 108171649 B CN108171649 B CN 108171649B CN 201711292746 A CN201711292746 A CN 201711292746A CN 108171649 B CN108171649 B CN 108171649B
- Authority
- CN
- China
- Prior art keywords
- image
- network
- loss
- stylized
- focus
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Links
- 238000000034 method Methods 0.000 title claims abstract description 33
- 230000008447 perception Effects 0.000 claims abstract description 32
- 238000006243 chemical reaction Methods 0.000 claims abstract description 26
- 238000010586 diagram Methods 0.000 claims description 30
- 238000013528 artificial neural network Methods 0.000 claims description 17
- 238000012549 training Methods 0.000 claims description 13
- 239000013598 vector Substances 0.000 claims description 8
- 230000004913 activation Effects 0.000 claims description 7
- 238000004364 calculation method Methods 0.000 claims description 6
- 239000011159 matrix material Substances 0.000 claims description 5
- 238000005457 optimization Methods 0.000 claims description 3
- 238000012546 transfer Methods 0.000 abstract description 5
- 230000000694 effects Effects 0.000 abstract description 4
- 238000011478 gradient descent method Methods 0.000 description 3
- 230000001902 propagating effect Effects 0.000 description 2
- 238000013459 approach Methods 0.000 description 1
- 238000013135 deep learning Methods 0.000 description 1
- 230000007547 defect Effects 0.000 description 1
- 230000001537 neural effect Effects 0.000 description 1
- 238000012545 processing Methods 0.000 description 1
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformations in the plane of the image
- G06T3/04—Context-preserving transformations, e.g. by using an importance map
Landscapes
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Editing Of Facsimile Originals (AREA)
- Image Analysis (AREA)
- Image Processing (AREA)
Abstract
The invention relates to an image stylization method that preserves focus information. A "focus position difference" is added to the conventional image stylization method as a penalty term: the sum of the perceptual loss and the focus loss is taken as the total loss, and the Adam algorithm is used to adjust the weights of the image conversion network to obtain an optimized network. After an image is fed into the optimized network, a stylized image is generated that retains the focus information of the original image and blends the style in more naturally. The method not only keeps the main semantic content and the focus information of the original image in the generated stylized image, but also avoids the simple texture-overlay style transfer of conventional methods, so that the result better highlights the subject of the original image.
Description
Technical Field
The invention relates to the technical fields of image processing and deep learning, and in particular to an image stylization method that preserves focus information.
Background
Existing methods generate an image with a residual neural network, compute a perceptual loss by comparing the feature maps obtained when the generated image, the original image and the style image pass through a VGG network, and train the residual network by back-propagation so that it produces images with the required style and content. The perceptual loss consists of two parts: the content loss, obtained by comparing high-level features of the original image and the generated image in the VGG network; and the style loss, obtained by comparing low-level features of the style image and the generated image in the VGG network.
For example, document 1 (Johnson J, Alahi A, Li F. Perceptual Losses for Real-Time Style Transfer and Super-Resolution, 2016) discusses an image-difference measure called the "perceptual loss". Instead of directly comparing the pixels of two images, it compares the features the images produce when passed through a neural network. By comparing the high-level style and texture information of the images together with their shape and contour information, the perceptual loss is computed, and a neural network is finally trained that can apply a given style to any input image.
For example, document 2 (Gatys L A, Ecker A S, Bethge M. A Neural Algorithm of Artistic Style [J]. Computer Science, 2015) discusses a method in which an image with randomly initialized pixels is repeatedly modified by gradient descent to minimize its loss under a trained neural network, finally yielding an image that fuses a given style and content. The gradient-descent method of document 2 uses VGG-19 as the loss network: the information of the target content image and the style image is recorded in one forward pass, and after the image being optimized is passed forward through the network, its difference from the targets is obtained, the loss and gradient are computed, and the image is updated.
In the method of document 1, a residual neural network is trained with the perceptual loss, and the trained network is bound to one specific image style. An image requiring style conversion is fed into this network, and its stylized version is obtained after a single forward pass. However, the stylized image obtained in this way is stylized as a whole, uniformly and without emphasis, much as if the texture information of the target style image were simply superimposed onto the image to be converted. The stylization effect is therefore mediocre.
The image stylization process of document 2 acts directly on the target image: generating one stylized image usually requires several hundred forward and backward propagation passes, during which an image with randomly initialized pixels is repeatedly modified by gradient descent until it approaches the desired result. This method shares the drawback of document 1 that stylization is applied uniformly, without differentiation or emphasis. Moreover, because every stylized image requires many rounds of forward and backward propagation, generating a stylized image takes a long time.
Disclosure of Invention
The invention aims to overcome the shortcomings of the prior art and provide a stylization method that preserves the focus information of an image, so that the stylized image does not lose the information the original image was intended to express and retains its focus and emphasis.
To achieve this purpose, the technical solution provided by the invention comprises the following steps:
S1, building a residual neural network as the image conversion network;
The residual neural network has 12 layers and includes 5 residual blocks, each containing two convolutional layers with 3×3 kernels. This gives the network strong representational capacity, sufficient to record the information of the target style image. Only one style is designated for each training run; a large number of images with different contents are then fed through the network to obtain stylized images, so that the image conversion network is trained to record a target style and to stylize any content image (an illustrative sketch of such a network is given after the step list below).
S2, sending the image to be processed into an image conversion network to obtain a stylized image;
S3, using a VGG network as the perceptual loss network: first inputting the target style image into the network to capture the target style information, then feeding the image to be processed and the generated stylized image into the network respectively, and computing the perceptual loss;
The perceptual loss consists of two parts: one part, called the content loss, measures the difference between the content contours of the generated image and the original image; the other part, called the style loss, measures the difference in tone and texture between the generated image and the target style image;
S4, feeding the generated stylized image and the original image into the focus loss network respectively, computing the matrix product, and taking the root mean square error between the two results as the focus loss;
The focus loss network used in this step is a trained residual neural network with an 18-layer structure, which differs from the image conversion network in its number of layers; the weights of its final Softmax layer form a 1000×512 matrix, so that each classification result corresponds to a 512-dimensional vector; multiplying this vector with the activations of the network's last convolutional layer reveals the part of the image to which the network implicitly pays particular attention;
S5, taking the sum of the perceptual loss and the focus loss as the total loss, and adjusting the weights of the image conversion network with the Adam algorithm;
S6, taking the next image from the training set, inputting it into the adjusted image conversion network, and repeating steps S2 to S5 until the maximum number of iterations is reached, to obtain the optimized network;
S7, inputting the image to be stylized into the optimized network to obtain a stylized image that preserves the focus information.
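The architecture described in step S1 can be illustrated with a minimal sketch in PyTorch. The residual-block count (5) and kernel size (3×3) follow the description above; the stem and output convolutions, channel width and use of instance normalization are assumptions made only for illustration, not details fixed by the patent.

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """One residual block: two 3x3 convolutions with a skip connection."""
    def __init__(self, channels: int):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.InstanceNorm2d(channels, affine=True),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.InstanceNorm2d(channels, affine=True),
        )

    def forward(self, x):
        return x + self.body(x)

class ImageConversionNet(nn.Module):
    """Image conversion network: a stem, 5 residual blocks, and an output layer."""
    def __init__(self, channels: int = 64):
        super().__init__()
        self.stem = nn.Sequential(
            nn.Conv2d(3, channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
        )
        self.blocks = nn.Sequential(*[ResidualBlock(channels) for _ in range(5)])
        self.head = nn.Conv2d(channels, 3, kernel_size=3, padding=1)

    def forward(self, x):
        return self.head(self.blocks(self.stem(x)))
```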
Further, the specific steps of calculating the perceptual loss in step S3 are as follows:
S31, selecting the feature maps of the four layers relu1_2, relu2_2, relu3_3 and relu4_3 of the perceptual loss network as the style feature maps, and the feature map of the relu3_3 layer as the content feature map;
S32, first propagating the target style image once through the perceptual loss network, and capturing and saving the style feature maps of all selected layers as the target style feature maps for the whole training process;
S33, reading an image from the data set, inputting it into the perceptual loss network as the target content image, and capturing and saving its content feature map as the target content feature map for this iteration;
S34, inputting the image read in step S33 into the image conversion network to obtain the generated stylized image; then inputting the generated stylized image into the perceptual loss network to obtain its content feature map and style feature maps;
S35, calculating the mean squared error between the content feature map of the stylized image in step S34 and the target content feature map in step S33 as the content-loss part of the perceptual loss;
S36, calculating the mean squared error between the style feature maps of the stylized image in step S34 and the target style feature maps in step S32 as the style-loss part of the perceptual loss;
S37, assuming that the feature map of the j-th layer has size C×H×W, the perceptual loss is calculated with the following formula:
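The original formula is rendered as an image in the patent and is not reproduced in this text. A plausible reconstruction, based on the mean-squared-error description of steps S35-S36 and the feature-reconstruction loss of document 1 (an assumption, not a quotation of the patent), is:

```latex
\ell_j(\hat{y}, y) \;=\; \frac{1}{C\,H\,W}\,\bigl\lVert \phi_j(\hat{y}) - \phi_j(y) \bigr\rVert_2^2
```

where \(\phi_j(\cdot)\) denotes the layer-j feature map of the perceptual loss network, \(\hat{y}\) is the generated stylized image, and \(y\) is the target content image (for the content term at relu3_3) or the target style image (for the style terms at relu1_2, relu2_2, relu3_3 and relu4_3).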
S38, adding up the losses of all layers to obtain the total perceptual loss.
Further, the specific steps of calculating the focus loss in step S4 are as follows:
S41, extracting the weights of the last Softmax layer of the focus loss network;
S42, taking a content image from the data set and obtaining the stylized image produced by passing it through the image conversion network; then scaling and normalizing both the content image and the generated stylized image;
S43, propagating the preprocessed content image and stylized image forward once through the focus loss network to obtain the classification result of each image and the activations of the last convolutional layer;
S44, extracting the corresponding vectors from the weights of step S41 according to the index of each classification result, and multiplying these vectors with the activations of step S43 to obtain the initial focus information of the content image and the stylized image;
S45, scaling the initial focus information to the size of the content image and normalizing it to the range 0-256 to obtain the focus localization maps of the content image and the stylized image;
S46, calculating the difference between the focus localization maps of the content image and the stylized image to obtain the focus loss, which is computed with the following formula:
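Here too the original formula is an image in the patent; a plausible reconstruction consistent with the "root mean square error of the focus localization maps" described in steps S4 and S46 (an assumption) is:

```latex
L_{\mathrm{focus}} \;=\; \sqrt{\frac{1}{H\,W}\sum_{x=1}^{W}\sum_{y=1}^{H}\bigl(F_{c}(x,y) - F_{s}(x,y)\bigr)^{2}}
```

where \(F_{c}\) and \(F_{s}\) are the focus localization maps of the content image and the stylized image, both scaled to H×W and normalized to the range 0-256.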
compared with the prior art, the principle of the scheme is as follows:
in the traditional image stylization method, only the perception loss needs to be calculated, and then a residual error neural network (image conversion network) is trained through back propagation, so that the image conversion network generates a picture which meets the requirements and has a certain specific style and content; according to the scheme, the focus position difference is added into a traditional image stylization method as a punishment item, namely the sum of the perception loss and the focus loss is used as the total loss, and the weight of the image conversion network is adjusted by using an Adam algorithm to obtain an optimized network; after a picture is input into the optimized network, a stylized image which does not change the focus information of the original picture and is not superimposed by simple textures is generated.
Compared with the prior art, the scheme has the following two advantages:
1. The generated stylized image still retains the main semantic content of the original image and keeps its focus information.
2. The simple texture-overlay style transfer of conventional methods is avoided, and the resulting image better highlights the subject of the original image.
Drawings
FIG. 1 is a block diagram of a method for stylizing an image that maintains focus information in accordance with the present invention;
FIG. 2 is a schematic diagram of the perceptual loss network of the present invention;
FIG. 3 is a comparison of the original image, the stylized image obtained by the method proposed by Gatys et al. in document 2, and the focus localization map of the stylized image obtained by the present invention.
Detailed Description
The invention will be further illustrated with reference to specific examples:
referring to fig. 1, in the image stylizing method for maintaining focus information according to this embodiment, X is an image to be stylized, and is different in each iterative training and is also a current target content image; xs is a given, desired target style image; y is an image which is converted by the current image conversion network, integrates the content of X and the style of Xs, and keeps the focus information of X unchanged;
the method comprises the following specific steps:
S1, building a residual neural network as the image conversion network;
S2, feeding the image to be processed into the image conversion network to obtain a stylized image;
S3, using a VGG network as the perceptual loss network: first inputting the target style image into the network to capture the target style information, then feeding the image to be processed and the generated stylized image into the network respectively, and computing the perceptual loss; the specific steps for computing the perceptual loss are as follows:
S31, selecting the feature maps of the four layers relu1_2, relu2_2, relu3_3 and relu4_3 of the perceptual loss network as the style feature maps, and the feature map of the relu3_3 layer as the content feature map;
S32, first propagating the target style image once through the perceptual loss network, and capturing and saving the style feature maps of all selected layers as the target style feature maps for the whole training process;
S33, reading an image from the data set, inputting it into the perceptual loss network as the target content image, and capturing and saving its content feature map as the target content feature map for this iteration;
S34, inputting the image read in step S33 into the image conversion network to obtain the generated stylized image; then inputting the generated stylized image into the perceptual loss network to obtain its content feature map and style feature maps;
S35, calculating the mean squared error between the content feature map of the stylized image in step S34 and the target content feature map in step S33 as the content-loss part of the perceptual loss;
S36, calculating the mean squared error between the style feature maps of the stylized image in step S34 and the target style feature maps in step S32 as the style-loss part of the perceptual loss;
S37, assuming that the feature map of the j-th layer has size C×H×W, the perceptual loss is calculated with the formula given above;
S38, adding up the losses of all layers to obtain the total perceptual loss;
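As an illustration of steps S31-S38, the following sketch computes the perceptual loss with a pretrained VGG-16 from torchvision; the layer indices 3, 8, 15 and 22 correspond to relu1_2, relu2_2, relu3_3 and relu4_3. The equal weighting of the terms, the omission of preprocessing, and the assumption that the target style image is resized to the training resolution (so the style feature maps match in shape) are illustrative choices, not statements of the patent.

```python
import torch
import torch.nn.functional as F
from torchvision.models import vgg16

STYLE_LAYERS = {3: "relu1_2", 8: "relu2_2", 15: "relu3_3", 22: "relu4_3"}
CONTENT_LAYER = 15  # relu3_3

vgg = vgg16(pretrained=True).features.eval()
for p in vgg.parameters():
    p.requires_grad_(False)  # the perceptual loss network stays fixed

def extract_features(img: torch.Tensor) -> dict:
    """Collect the feature maps of the selected VGG-16 layers for `img`."""
    feats, x = {}, img
    for idx, layer in enumerate(vgg):
        x = layer(x)
        if idx in STYLE_LAYERS:
            feats[idx] = x
        if idx >= max(STYLE_LAYERS):
            break
    return feats

def perceptual_loss(stylized, content_img, target_style_feats):
    """Content loss (relu3_3) plus style loss (four layers), each as an MSE (S35-S38)."""
    gen = extract_features(stylized)
    content = extract_features(content_img)
    content_loss = F.mse_loss(gen[CONTENT_LAYER], content[CONTENT_LAYER])
    style_loss = sum(F.mse_loss(gen[i], target_style_feats[i]) for i in STYLE_LAYERS)
    return content_loss + style_loss
```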
S4, feeding the generated stylized image and the original image into the focus loss network respectively, computing the matrix product, and taking the root mean square error between the two results as the focus loss; the specific steps for computing the focus loss are as follows:
S41, using the ResNet-18 residual neural network as the focus loss network, and extracting the weights of its last Softmax layer;
S42, taking a content image from the data set and obtaining the stylized image produced by passing it through the image conversion network; then scaling and normalizing both the content image and the generated stylized image;
S43, propagating the preprocessed content image and stylized image forward once through the focus loss network to obtain the classification result of each image and the activations of the last convolutional layer;
S44, extracting the corresponding vectors from the weights of step S41 according to the index of each classification result, and multiplying these vectors with the activations of step S43 to obtain the initial focus information of the content image and the stylized image;
S45, scaling the initial focus information to the size of the content image and normalizing it to the range 0-256 to obtain the focus localization maps of the content image and the stylized image;
S46, calculating the difference between the focus localization maps of the content image and the stylized image to obtain the focus loss, computed with the formula given above;
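The focus loss of steps S41-S46 follows a class-activation-map style computation: the Softmax (fully connected) weights of the predicted class are multiplied with the last convolutional activations of ResNet-18 to obtain a focus localization map, and the maps of the content image and the stylized image are compared by root mean square error. The sketch below uses torchvision's pretrained ResNet-18; the absence of input preprocessing and the exact normalization details are simplifying assumptions.

```python
import torch
import torch.nn.functional as F
from torchvision.models import resnet18

resnet = resnet18(pretrained=True).eval()
for p in resnet.parameters():
    p.requires_grad_(False)          # the focus loss network stays fixed
fc_weights = resnet.fc.weight.data   # 1000 x 512 Softmax-layer weight matrix (S41)

def focus_map(img: torch.Tensor, out_size) -> torch.Tensor:
    """Focus localization map for a (N, 3, H, W) batch (S43-S45)."""
    x = img
    for name, module in resnet.named_children():
        if name == "avgpool":
            break
        x = module(x)                             # last conv activations: (N, 512, h, w)
    logits = resnet.fc(torch.flatten(resnet.avgpool(x), 1))
    cls = logits.argmax(dim=1)                    # classification result (S43)
    w = fc_weights[cls]                           # 512-dim vector per image (S44)
    cam = torch.einsum("nc,nchw->nhw", w, x)      # matrix product -> initial focus info
    cam = F.interpolate(cam.unsqueeze(1), size=out_size,
                        mode="bilinear", align_corners=False)  # scale to image size
    cam = cam - cam.amin()
    return 256.0 * cam / (cam.amax() + 1e-8)      # normalize to the range 0-256 (S45)

def focus_loss(content_img, stylized_img):
    """Root mean square error between the two focus localization maps (S46)."""
    size = content_img.shape[-2:]
    return torch.sqrt(F.mse_loss(focus_map(stylized_img, size),
                                 focus_map(content_img, size)))
```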
S5, taking the sum of the perceptual loss and the focus loss as the total loss, and adjusting the weights of the image conversion network with the Adam algorithm;
S6, taking the next image from the training set, inputting it into the adjusted image conversion network, and repeating steps S2 to S5 until the maximum number of iterations is reached, to obtain the optimized network;
S7, inputting the image to be stylized into the optimized network to obtain a stylized image that preserves the focus information.
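Putting the steps together, a minimal training loop in the spirit of steps S2-S6 could look like the sketch below. It assumes the ImageConversionNet, extract_features, perceptual_loss and focus_loss sketches above, plus a style_image tensor, a dataloader of content images and a max_iterations value supplied by the caller. The learning rate and the unweighted sum of the two losses are assumptions; the patent only states that the total loss is the sum of the perceptual loss and the focus loss and that Adam adjusts the weights.

```python
import torch

transform_net = ImageConversionNet()
optimizer = torch.optim.Adam(transform_net.parameters(), lr=1e-3)

# S32: capture the target style feature maps once, before training begins.
target_style_feats = extract_features(style_image)

for step, content_img in enumerate(dataloader):           # S6: next image from the training set
    stylized = transform_net(content_img)                  # S2: generate a stylized image
    loss = (perceptual_loss(stylized, content_img, target_style_feats)  # S3: perceptual loss
            + focus_loss(content_img, stylized))                         # S4: focus loss
    optimizer.zero_grad()
    loss.backward()                                        # S5: adjust weights with Adam
    optimizer.step()
    if step + 1 >= max_iterations:                         # S6: stop at the maximum iteration count
        break
```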
The method solves the problems of conventional image stylization, in which the style transfer is crude and rigid, the style does not blend well with the original content, and the focus information of the original content image is shifted or lost. The stylized result retains the main semantic information the original content image was meant to express, looks more natural, avoids the simple texture-overlay style transfer of conventional methods, and better highlights the subject of the original image.
The above-mentioned embodiments are merely preferred embodiments of the present invention, and the scope of the present invention is not limited thereto; variations based on the shape and principle of the present invention are intended to fall within the scope of protection of the present invention.
Claims (2)
1. An image stylization method that preserves focus information, characterized by: the method comprises the following steps:
S1, building a residual neural network as an image conversion network;
S2, feeding the image to be processed into the image conversion network to obtain a stylized image;
S3, using a VGG network as a perceptual loss network: first inputting a target style image into the network to capture the target style information, then feeding the image to be processed and the generated stylized image into the network respectively, and computing the perceptual loss;
S4, feeding the generated stylized image and the original image into a focus loss network respectively, computing the matrix product, and taking the root mean square error between the two results as the focus loss;
S5, taking the sum of the perceptual loss and the focus loss as the total loss, and adjusting the weights of the image conversion network with the Adam algorithm;
S6, taking the next image from the training set, inputting it into the adjusted image conversion network, and repeating steps S2 to S5 until the maximum number of iterations is reached, to obtain an optimized network;
S7, inputting the image to be stylized into the optimized network to obtain a stylized image that preserves focus information;
the specific steps of calculating the focus loss in step S4 are as follows:
S41, using the ResNet-18 residual neural network as the focus loss network, and extracting the weights of its last Softmax layer;
S42, taking an original image from the data set and obtaining the stylized image produced by passing it through the image conversion network; then scaling and normalizing both the original image and the generated stylized image;
S43, propagating the preprocessed original image and stylized image forward once through the focus loss network to obtain the classification result of each image and the activations of the last convolutional layer;
S44, extracting the corresponding vectors from the weights of step S41 according to the index of each classification result, and multiplying these vectors with the activations of step S43 to obtain the initial focus information of the original image and the stylized image;
S45, scaling the initial focus information to the size of the original image and normalizing it to the range 0-256 to obtain the focus localization maps of the original image and the stylized image;
S46, calculating the difference between the focus localization maps of the original image and the stylized image to obtain the focus loss.
2. An image stylization method that preserves focus information, as defined by claim 1, wherein: the specific steps of calculating the perceptual loss in step S3 are as follows:
S31, selecting the feature maps of the four layers relu1_2, relu2_2, relu3_3 and relu4_3 of the perceptual loss network as the style feature maps, and the feature map of the relu3_3 layer as the content feature map;
S32, propagating the target style image once through the perceptual loss network, and capturing and saving the style feature maps of all selected layers as the target style feature maps for the whole training process;
S33, reading an image from the data set, inputting it into the perceptual loss network as the target content image, and capturing and saving its content feature map as the target content feature map for this iteration;
S34, inputting the image read in step S33 into the image conversion network to obtain the generated stylized image; inputting the generated stylized image into the perceptual loss network to obtain its content feature map and style feature maps;
S35, calculating the mean squared error between the content feature map of the stylized image in step S34 and the target content feature map in step S33 as the content-loss part of the perceptual loss;
S36, calculating the mean squared error between the style feature maps of the stylized image in step S34 and the target style feature maps in step S32 as the style-loss part of the perceptual loss;
S37, assuming that the feature map of the j-th layer has size C×H×W, the perceptual loss is calculated with the following formula:
S38, adding up the losses of all layers to obtain the total perceptual loss.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711292746.4A CN108171649B (en) | 2017-12-08 | 2017-12-08 | Image stylization method for keeping focus information |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711292746.4A CN108171649B (en) | 2017-12-08 | 2017-12-08 | Image stylization method for keeping focus information |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108171649A CN108171649A (en) | 2018-06-15 |
CN108171649B (en) | 2021-08-17
Family
ID=62525490
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201711292746.4A Active CN108171649B (en) | 2017-12-08 | 2017-12-08 | Image stylization method for keeping focus information |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108171649B (en) |
Families Citing this family (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109144641B (en) * | 2018-08-14 | 2021-11-02 | 四川虹美智能科技有限公司 | Method and device for displaying image through refrigerator display screen |
CN109345446B (en) * | 2018-09-18 | 2022-12-02 | 西华大学 | Image style transfer algorithm based on dual learning |
CN109559363B (en) * | 2018-11-23 | 2023-05-23 | 杭州网易智企科技有限公司 | Image stylization processing method and device, medium and electronic equipment |
CN111860823B (en) * | 2019-04-30 | 2024-06-11 | 北京市商汤科技开发有限公司 | Neural network training method, neural network image processing method, neural network training device, neural network image processing equipment and storage medium |
TWI730467B (en) * | 2019-10-22 | 2021-06-11 | 財團法人工業技術研究院 | Method of transforming image and network for transforming image |
CN111160138A (en) * | 2019-12-11 | 2020-05-15 | 杭州电子科技大学 | Fast face exchange method based on convolutional neural network |
WO2022204868A1 (en) * | 2021-03-29 | 2022-10-06 | 深圳高性能医疗器械国家研究院有限公司 | Method for correcting image artifacts on basis of multi-constraint convolutional neural network |
CN113469923B (en) * | 2021-05-28 | 2024-05-24 | 北京达佳互联信息技术有限公司 | Image processing method and device, electronic equipment and storage medium |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2006009257A1 (en) * | 2004-07-23 | 2006-01-26 | Matsushita Electric Industrial Co., Ltd. | Image processing device and image processing method |
CN105913377A (en) * | 2016-03-24 | 2016-08-31 | 南京大学 | Image splicing method for reserving image correlation information |
CN106952224A (en) * | 2017-03-30 | 2017-07-14 | 电子科技大学 | A kind of image style transfer method based on convolutional neural networks |
CN107292875A (en) * | 2017-06-29 | 2017-10-24 | 西安建筑科技大学 | A kind of conspicuousness detection method based on global Local Feature Fusion |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20090310863A1 (en) * | 2008-06-11 | 2009-12-17 | Gallagher Andrew C | Finding image capture date of hardcopy medium |
- 2017
- 2017-12-08 CN CN201711292746.4A patent/CN108171649B/en active Active
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2006009257A1 (en) * | 2004-07-23 | 2006-01-26 | Matsushita Electric Industrial Co., Ltd. | Image processing device and image processing method |
CN105913377A (en) * | 2016-03-24 | 2016-08-31 | 南京大学 | Image splicing method for reserving image correlation information |
CN106952224A (en) * | 2017-03-30 | 2017-07-14 | 电子科技大学 | A kind of image style transfer method based on convolutional neural networks |
CN107292875A (en) * | 2017-06-29 | 2017-10-24 | 西安建筑科技大学 | A kind of conspicuousness detection method based on global Local Feature Fusion |
Non-Patent Citations (1)
Title |
---|
Perceptual Losses for Real-Time Style Transfer and Super-Resolution; Justin Johnson et al.; ECCV 2016: Computer Vision – ECCV 2016; 2016-09-17; pp. 694-711 *
Also Published As
Publication number | Publication date |
---|---|
CN108171649A (en) | 2018-06-15 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108171649B (en) | Image stylization method for keeping focus information | |
CN113240580B (en) | Lightweight image super-resolution reconstruction method based on multi-dimensional knowledge distillation | |
CN109344288B (en) | Video description combining method based on multi-modal feature combining multi-layer attention mechanism | |
Yang et al. | Deep plastic surgery: Robust and controllable image editing with human-drawn sketches | |
WO2021093620A1 (en) | Method and system for high-resolution image inpainting | |
CN109977942B (en) | Scene character recognition method based on scene classification and super-resolution | |
CN107330127B (en) | Similar text detection method based on text picture retrieval | |
CN110634170B (en) | Photo-level image generation method based on semantic content and rapid image retrieval | |
Liu et al. | Effective image super resolution via hierarchical convolutional neural network | |
Li et al. | Context-aware semantic inpainting | |
CN114418853B (en) | Image super-resolution optimization method, medium and equipment based on similar image retrieval | |
US20240020810A1 (en) | UNIVERSAL STYLE TRANSFER USING MULTl-SCALE FEATURE TRANSFORM AND USER CONTROLS | |
Tang et al. | Attribute-guided sketch generation | |
CN114549913A (en) | Semantic segmentation method and device, computer equipment and storage medium | |
Wan et al. | Generative adversarial learning for detail-preserving face sketch synthesis | |
Xing et al. | Few-shot single-view 3d reconstruction with memory prior contrastive network | |
Li et al. | High-resolution network for photorealistic style transfer | |
Ma et al. | SwinFG: A fine-grained recognition scheme based on swin transformer | |
EP4075328A1 (en) | Method and device for classifying and searching for a 3d model on basis of deep attention | |
Gilbert et al. | Disentangling structure and aesthetics for style-aware image completion | |
CN115187456A (en) | Text recognition method, device, equipment and medium based on image enhancement processing | |
CN114170460A (en) | Multi-mode fusion-based artwork classification method and system | |
CN117576248B (en) | Image generation method and device based on gesture guidance | |
CN116740069B (en) | Surface defect detection method based on multi-scale significant information and bidirectional feature fusion | |
CN117994623A (en) | Image feature vector acquisition method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||