
CN117725242B - Image searching method, device, equipment and medium - Google Patents

Image searching method, device, equipment and medium

Info

Publication number
CN117725242B
Authority
CN
China
Prior art keywords
image
sample
information
images
extractors
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310358027.7A
Other languages
Chinese (zh)
Other versions
CN117725242A (en)
Inventor
陈诚
宋德嘉
汤旭
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shuhang Technology Beijing Co ltd
Original Assignee
Shuhang Technology Beijing Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shuhang Technology Beijing Co ltd filed Critical Shuhang Technology Beijing Co ltd
Priority to CN202310358027.7A
Publication of CN117725242A
Application granted
Publication of CN117725242B
Legal status: Active
Anticipated expiration


Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiment of the application discloses an image searching method, apparatus, device, and medium, applied to the technical field of image processing. The method comprises the following steps: obtaining a first image and inputting it into an image generator; generating a second image through the image generator; inputting the second image into an image extractor and performing feature extraction processing on it through the image extractor; taking the feature extraction result output by the image extractor associated with the second image as the image feature information of the second image; performing stitching processing on the image feature information of the second images to obtain stitching feature information; determining the image feature information and the image classification information of the first image through the stitching feature information; obtaining an image set; and searching the image set for images matching the first image, based on the image feature information and image classification information of the first image and of each image to be matched. By adopting the embodiment of the application, the image searching precision can be improved.

Description

Image searching method, device, equipment and medium
Technical Field
The present application relates to the field of image processing technologies, and in particular, to an image searching method, apparatus, device, and medium.
Background
Currently, returning, for an image input by a user, a plurality of images matching that image (i.e., similar images) is one of the important scenarios in the search field. Existing image search techniques typically extract image features directly from the input image and search an image database for multiple similar images, based on the similarity between the input image's features and the image features of each image in the database. However, in this way, the feature information contained in the image features participating in the image search process is relatively limited, resulting in low image search accuracy. Therefore, how to improve image search accuracy is a problem to be solved.
Disclosure of Invention
The embodiment of the application provides an image searching method, device, equipment and medium, which can improve the image searching precision.
In one aspect, an embodiment of the present application provides an image searching method, including:
Acquiring a first image, inputting the first image into N image generators, and respectively performing image reorganization processing on the first image through the N image generators to generate N second images; each image generator generates one second image; N is an integer greater than 1;
Inputting N second images into R image extractors respectively, carrying out feature extraction processing on the N second images through the R image extractors, and taking feature extraction results output by the image extractor associated with each second image in the R image extractors as image feature information of each second image respectively; r is an integer greater than or equal to N;
performing stitching processing on the image characteristic information of the N second images to obtain stitching characteristic information, and determining the image characteristic information and the image classification information of the first image through the stitching characteristic information;
acquiring an image set; the image set comprises a plurality of images to be matched, wherein one image to be matched corresponds to one image characteristic information and one image classification information;
and searching the images to be matched with the first image from the image set based on the image characteristic information and the image classification information of the first image and the image characteristic information and the image classification information of each image to be matched.
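The five claimed steps above can be strung together as a compact end-to-end sketch. All components here (strip-shuffle generators, average-pooling extractor stages, and random projection heads for the feature and classification information) are illustrative stand-ins of our own, not the patent's actual networks:

```python
import numpy as np

rng = np.random.default_rng(7)
N, R = 3, 4                                # N image generators, R image extractors

def generate(img, k):
    """Image reorganization: split img into k blocks (vertical strips here)
    and stitch them back in random order, preserving the image size."""
    strips = np.array_split(img, k, axis=1)
    return np.concatenate([strips[j] for j in rng.permutation(k)], axis=1)

def pool(x):
    """One toy extractor stage: 2x2 average pooling stands in for a network stage."""
    return x.reshape(x.shape[0] // 2, 2, x.shape[1] // 2, 2).mean(axis=(1, 3))

first = rng.random((32, 32))               # the first image
seconds = [generate(first, k) for k in (8, 6, 4)]   # N second images

features = []
for i, img in enumerate(seconds, start=1):
    x = img
    for _ in range(R - N + i):             # first R-N+i cascaded extractors
        x = pool(x)
    features.append(x.ravel())             # feature result for this second image

stitched = np.concatenate(features)        # stitching feature information
W_feat = rng.standard_normal((stitched.size, 128))
W_cls = rng.standard_normal((stitched.size, 10))
feature_info = stitched @ W_feat           # image feature information of the first image
class_info = int(np.argmax(stitched @ W_cls))   # image classification information
print(feature_info.shape, class_info)
```

The last step of the claim (searching the image set) would then compare `feature_info` and `class_info` against the precomputed feature and classification information of each image to be matched.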
In one aspect, an embodiment of the present application provides an image search apparatus, including:
The processing module is used for acquiring a first image, inputting the first image into N image generators, and respectively performing image reorganization processing on the first image through the N image generators to generate N second images; each image generator generates one second image; N is an integer greater than 1;
The processing module is further used for inputting the N second images into R image extractors respectively, carrying out feature extraction processing on the N second images through the R image extractors, and taking feature extraction results output by the image extractor associated with each second image in the R image extractors as image feature information of each second image respectively; r is an integer greater than or equal to N;
The processing module is also used for carrying out splicing processing on the image characteristic information of the N second images to obtain splicing characteristic information, and determining the image characteristic information and the image classification information of the first image through the splicing characteristic information;
The acquisition module is used for acquiring an image set; the image set comprises a plurality of images to be matched, wherein one image to be matched corresponds to one image characteristic information and one image classification information;
The processing module is further used for searching the image to be matched with the first image from the image set based on the image characteristic information and the image classification information of the first image and the image characteristic information and the image classification information of each image to be matched.
In one aspect, an embodiment of the present application provides an electronic device, including a processor and a memory, where the memory is configured to store a computer program, the computer program including program instructions, and the processor is configured to invoke the program instructions to perform some or all of the steps in the above method.
In one aspect, embodiments of the present application provide a computer readable storage medium storing a computer program comprising program instructions for performing part or all of the steps of the above method when executed by a processor.
Accordingly, according to one aspect of the present application, there is provided a computer program product or computer program comprising computer instructions which, when executed by a processor, implement some or all of the steps of the above method.
In the embodiment of the application, a first image can be acquired and input into N image generators, and the N image generators respectively perform image reorganization processing on the first image to generate N second images. The image generators thus yield a plurality of second images corresponding to the first image, realizing data enhancement; each of these second images can be understood to contain the image information of the first image. The N second images are input into R image extractors, which perform feature extraction processing on them, and the feature extraction result output by the image extractor associated with each second image among the R image extractors is taken as the image feature information of that second image. Stitching processing is performed on the image feature information of the N second images to obtain stitching feature information, and the image feature information and the image classification information of the first image are determined through the stitching feature information. The image feature information extracted from the second images by the image extractors can serve as multi-stage feature information of the first image, and the image feature information obtained from this multi-stage feature information can represent the image features of the first image at a deeper level, i.e., the image feature information is richer. An image set is acquired, and the images matching the first image are searched from the image set based on the image feature information and image classification information of the first image and of each image to be matched. In this way, the images matching the first image can be determined comprehensively through the deeper image feature information together with the image classification information, so that the image searching precision can be improved.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings required for the description of the embodiments will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present application, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a schematic diagram of an image searching scene provided in an embodiment of the present application;
Fig. 2 is a schematic flow chart of an image searching method according to an embodiment of the present application;
fig. 3 is a schematic view of a scene for acquiring a second image according to an embodiment of the present application;
fig. 4a is a schematic view of a scene for acquiring image feature information according to an embodiment of the present application;
Fig. 4b is a schematic diagram of a second scene for acquiring image feature information according to an embodiment of the present application;
FIG. 5 is a schematic diagram of an image search model according to an embodiment of the present application;
FIG. 6 is a schematic diagram of an image searching process according to an embodiment of the present application;
fig. 7 is a second schematic flow chart of an image searching method according to an embodiment of the present application;
FIG. 8 is a schematic diagram of a training image search model according to an embodiment of the present application;
Fig. 9 is a schematic structural diagram of an image searching device according to an embodiment of the present application;
fig. 10 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
The following description of the embodiments of the present application will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present application, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
The image searching method provided by the embodiment of the application is implemented in an electronic device, which may be a server or a terminal. The server may be an independent physical server, a server cluster or distributed system formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, big data, and artificial intelligence platforms. The terminal may be, but is not limited to, a smart phone, a tablet computer, a notebook computer, a desktop computer, etc.
A schematic diagram of an image search scenario proposed based on the image search method may be shown in fig. 1, and fig. 1 proposes a network architecture, where the network architecture may include a service server and a user terminal cluster, where the user terminal cluster may include one or more user terminals, and the number of user terminals in the user terminal cluster will not be limited here. A communication connection may exist between user terminals in a cluster of user terminals. Meanwhile, any user terminal in the user terminal cluster can be in communication connection with the service server, so that each user terminal in the user terminal cluster can perform data interaction with the service server through the communication connection. The communication connection is not limited to a connection manner, and may be directly or indirectly connected through a wired communication manner, may be directly or indirectly connected through a wireless communication manner, or may be other manners, and the present application is not limited herein. In addition, it can be understood that the electronic device according to the embodiment of the present application may be the service server shown in fig. 1, or may be any one of the user terminals in the user terminal cluster shown in fig. 1.
For example, as shown in fig. 1, in the embodiment of the present application, a service server may obtain any first image, for example, a first image uploaded by any user terminal, and implement image searching of the first image by using the image searching method provided by the present application. Specifically, a first image is acquired, the first image is input into N image generators, the N image generators respectively perform image reorganization processing on the first image to obtain N second images, and one image generator can obtain one second image; inputting N second images into R image extractors respectively to obtain image characteristic information of each second image, performing stitching processing on the image characteristic information of each second image to obtain stitching characteristic information, determining the image characteristic information and the image classification information of the first image through the stitching characteristic information, and obtaining an image set, wherein the image set comprises a plurality of images to be matched, and one image to be matched corresponds to one image characteristic information and one image classification information; and searching the images to be matched with the first image from the image set based on the image characteristic information and the image classification information of the first image, the image characteristic information of each image to be matched and the image classification information thereof.
It is to be understood that determining the image feature information of any second image may be performed by inputting that second image into the R image extractors, performing feature extraction processing on it through the R image extractors, and taking the feature extraction result output by the image extractor associated with that second image among the R image extractors as its image feature information. Among the R image extractors, for any two adjacent image extractors, the output of the former image extractor is the input of the latter image extractor.
It can be understood that the image feature information and the image classification information of the images to be matched in the image set are determined in the same manner as the image feature information and the image classification information of the first image. By the method, deeper and richer image characteristic information can be extracted, so that more accurate image matching can be realized, and the image searching precision is improved.
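Once feature information and classification information exist for both the first image and every image to be matched, the final matching step can be sketched as follows. This is a hedged illustration: cosine similarity gated by a classification-consistency check is one plausible realization, and the patent text here does not fix the exact similarity measure or ranking rule:

```python
import numpy as np

rng = np.random.default_rng(1)

def cosine(a, b):
    """Cosine similarity between two feature vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def search(query_feat, query_cls, candidates, top_k=3):
    """Rank candidate images: prefer candidates whose classification
    information matches the query's, then sort by feature similarity."""
    scored = [
        (name, cosine(query_feat, feat), cls == query_cls)
        for name, (feat, cls) in candidates.items()
    ]
    scored.sort(key=lambda t: (t[2], t[1]), reverse=True)
    return [name for name, _, _ in scored[:top_k]]

# Toy query and candidate set: each candidate carries (feature_info, class_info).
query_feat = rng.standard_normal(128)
candidates = {f"img_{j}": (rng.standard_normal(128), j % 4) for j in range(10)}
matches = search(query_feat, query_cls=2, candidates=candidates)
print(matches)
```

With this ranking rule, candidates sharing the query's classification information always rank above the rest, so classification information acts as a coarse filter on top of the feature similarity.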
Alternatively, in some embodiments, the electronic device may perform the image search method according to actual service requirements to improve image search accuracy. The technical scheme of the application can be applied to any image searching scenario, for example photographing-based shopping, commodity recommendation, copyright protection, and image similarity recommendation. When the first image to be searched is obtained, the images matching the first image, i.e., images similar to the first image, can be searched through the technical scheme of the application. Taking the image similarity recommendation scenario as an example, an image containing a dog may be input to search for images similar to the dog in that image. The application scenario is not limited here. That is, the technical scheme of the present application can be applied to any type of image search task, such as a pet-type image search task or a clothing-type image search task. For example, in a pet-type image search task, a user may upload an image containing a pet, and the electronic device searches for similar images matching it, where each similar image contains pet information similar to the pet in the uploaded image.
Optionally, the data related to the present application, such as the first image, the searched image to be matched with the first image, etc., may be stored in a database, or may be stored in a blockchain, such as by a blockchain distributed system, which is not limited by the present application.
In the embodiment of the present application, when a scenario of acquiring related data such as user information is referred to, for example, acquiring a first image uploaded by a user, permission or consent of the user needs to be obtained. That is, when the embodiments of the present application are applied to a particular product or technology, the collection, use and processing of relevant user data complies with relevant laws and regulations and standards for the relevant country and region. For example, prompt information can be sent out in the form of an interactive interface to prompt a user about which data is collected or acquired, and the type, content and the like of the data can be prompted to the user in a list mode, and the relevant data can be further collected, processed and the like only after confirmation operation or instruction for allowing the data to be collected is received on the interactive interface.
It can be understood that the above scenario is merely an example, and does not constitute a limitation on the application scenario of the technical solution provided by the embodiment of the present application, and the technical solution of the present application may also be applied to other scenarios. For example, as one of ordinary skill in the art can know, with the evolution of the system architecture and the appearance of new service scenarios, the technical solution provided by the embodiment of the present application is also applicable to similar technical problems.
Based on the above description, the embodiments of the present application propose an image search method, which can be performed by the above-mentioned electronic device. Referring to fig. 2, fig. 2 is a flowchart of an image searching method according to an embodiment of the present application. As shown in fig. 2, the flow of the image searching method according to the embodiment of the present application may include the following:
s101, acquiring a first image, inputting the first image into N image generators, and respectively carrying out image recombination processing on the first image through the N image generators to generate N second images.
The first image may be an image uploaded by the user terminal, or may be a video frame captured from a video, or may be an image downloaded on a network, or may be an image obtained from a service database (such as a database constructed in a pet identification scene), or may be obtained from a multimedia document such as a blog, a note, etc. published by the user. The source of the first image is not limited herein.
In some embodiments, the process and principle of the image reorganization processing performed by each image generator are the same. Taking any one image generator (for example, denoted as the i-th image generator) as an example, the image reorganization processing may be: splitting the first image into K image blocks through the i-th of the N image generators, stitching the K image blocks into a stitched image with the same image size as the first image according to an image stitching rule, and taking the stitched image as the second image generated by the i-th image generator. The image stitching rule may be to randomly stitch the K image blocks to obtain a randomly reorganized image whose image size is the same as that of the first image. K is a positive integer; it may be set according to an empirical value or obtained by training the image generator.
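The split-and-stitch step can be sketched as follows. This is a minimal NumPy illustration in which the K blocks are taken as vertical strips (one possible layout; the patent does not prescribe the block shape), shuffled, and re-stitched to the original size:

```python
import numpy as np

rng = np.random.default_rng(42)

def reorganize(image, k):
    """Split `image` into k image blocks (vertical strips here, one possible
    layout) and stitch them back together in random order. The stitched
    image has the same image size as the input, as the method requires."""
    strips = np.array_split(image, k, axis=1)   # K image blocks
    order = rng.permutation(k)                  # random image stitching rule
    return np.concatenate([strips[j] for j in order], axis=1)

first_image = rng.random((64, 64, 3))
second_image = reorganize(first_image, k=8)
print(second_image.shape)   # (64, 64, 3) -- same size as the first image
```

Note that `reorganize(image, 1)` returns the input unchanged, matching the observation below that K = 1 reproduces the first image.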
Here, i is a positive integer less than or equal to N. The K corresponding to different image generators may differ, and image reorganization at different granularities can be realized through each generator's K. It will be appreciated that the larger K is, the smaller the split image blocks, the finer the granularity of image reorganization, and the more cluttered the image content of the resulting second image compared with that of the first image. Conversely, the smaller K is, the larger the split image blocks, the coarser the granularity of image reorganization, and the less cluttered the image content of the resulting second image compared with that of the first image. That is, the image generators can generate second images containing information of the first image at different granularity levels. An image generator may divide the sample image into K image blocks and randomly shuffle the image blocks to merge them into a new image.
Alternatively, the K values respectively corresponding to the N image generators may be in increasing order, in decreasing order, or in no particular order; this is not limited herein. For example, suppose the K values of the N image generators are in decreasing order. Then the K corresponding to any image generator before the i-th one is greater than the K of the i-th image generator, and the K corresponding to any image generator after the i-th one is less than the K of the i-th image generator. N and K may be preset according to empirical values, or may be obtained by training the N image generators.
It will be appreciated that the image size of each second image is the same as the first image, and that when K is an integer greater than 1, the image content of each second image is different and different from the first image. Correspondingly, if K is equal to 1, the correspondingly generated second image is the first image. By the method, data enhancement can be achieved, a plurality of second images containing the image information of the first image can be obtained, and further extraction of the depth image characteristic information of the first image can be achieved.
For example, fig. 3 is a schematic view of a scene in which a second image is acquired. For ease of understanding, take the N image generators 30 to be 3 image generators (denoted 30a, 30b, and 30c), with K equal to 8 for image generator 30a, 6 for image generator 30b, and 4 for image generator 30c. The first image 31 is input into each image generator, and each performs image reorganization processing on the first image 31 to obtain a correspondingly generated second image 32 (32a, 32b, and 32c). For example, the first image 31 is input to the image generator 30a and split into 8 image blocks 33 (denoted 33a, 33b, 33c, 33d, 33e, 33f, 33g, 33h) by the image generator 30a; the 8 image blocks are stitched to obtain a stitched image 34, which serves as the second image 32a generated by the image generator 30a. Similarly, the first image 31 is input to the image generator 30b and split into 6 image blocks 35 (35a, 35b, 35c, 35d, 35e, 35f); the 6 image blocks are stitched to obtain a stitched image 36, which serves as the second image 32b generated by the image generator 30b. Likewise, the first image 31 is input to the image generator 30c and split into 4 image blocks 37 (37a, 37b, 37c, 37d); the 4 image blocks are stitched to obtain a stitched image 38, which serves as the second image 32c generated by the image generator 30c.
In some embodiments, the technical scheme of the application can be applied to a similarity search scene of images, namely, images similar to a first image can be searched. When the first image is searched for similar images, if the first image contains the target object, the purpose may be to search for images similar to the target object in the first image.
Thus, acquiring the first image may comprise: acquiring an image to be searched, performing object recognition on a target object in the image to be searched, generating an object detection frame in which the recognized target object is located, and determining the image data in the object detection frame as the first image. In this case, the first image contains the target object, and the technical scheme of the application searches for images containing objects similar to that target object. For example, when an object detection frame is generated around a target object (for example, a dog) in the image to be searched, the first image is obtained from the region selected by that frame. The image classification information and image feature information of the first image may then be regarded as those of the image to be searched.
S102, inputting N second images into R image extractors respectively, performing feature extraction processing on the N second images through the R image extractors, and taking feature extraction results output by the image extractor associated with each second image in the R image extractors as image feature information of each second image.
The N second images may be input into the R image extractors, and feature extraction processing is performed on each second image by the R image extractors to obtain the image feature information of each second image. It will be appreciated that the R image extractors can extract feature information of different stages of a second image. Among the R image extractors, the output of the former image extractor is the input of the latter, that is, the latter image extractor performs further feature extraction based on the feature information extracted by the former. Hence the feature information extracted by the former image extractor is low-stage feature information compared with that extracted by the latter image extractor, and the feature information extracted by the latter image extractor is high-stage feature information compared with that extracted by the former.
When the R image extractors perform feature extraction processing on the N second images, feature information of the N second images at different stages can be extracted to serve as image feature information of the N second images. That is, each second image may have an associated image extractor, and the image extractor associated with each second image is different, thereby enabling extraction of different phases of feature information.
Alternatively, the image extractor associated with each second image may be determined randomly or based on K corresponding to each second image, i.e. based on the image reorganization granularity corresponding to each second image. For example, the higher the image reorganization granularity is, the higher the stage of the extracted feature information is; or the lower the image reorganization granularity, the lower the stage of extracted feature information. The association between the image reorganization granularity and the corresponding feature phase may be set according to an empirical value.
For example, when lower image reorganization granularity corresponds to a higher stage of extracted feature information, the association relationship between image reorganization granularity and feature stage may indicate that the image extractor associated with the i-th second image of the N second images is the (R-N+i)-th of the R image extractors, where i is a positive integer less than or equal to N.
For example, suppose there are 3 second images and 4 image extractors, and the image reorganization granularity of the 1st second image is smaller than that of the 2nd second image, which is smaller than that of the 3rd second image. Then the image extractor associated with the 1st second image is the 2nd of the 4 image extractors, the one associated with the 2nd second image is the 3rd image extractor, and the one associated with the 3rd second image is the 4th image extractor.
Thus, taking the i-th second image of the N second images as an example, performing feature extraction processing on the N second images through the R image extractors, and taking the feature extraction result output by the image extractor associated with each second image as its image feature information, may be: performing feature extraction processing on the i-th second image through the first R-N+i image extractors of the R image extractors, and taking the feature extraction result output by the (R-N+i)-th image extractor as the image feature information of the i-th second image.
Any two adjacent image extractors among the first R-N+i image extractors, such as a u-th image extractor and a v-th image extractor, are connected in series: the feature extraction result output by the u-th image extractor serves as the input of the v-th image extractor, u being a positive integer smaller than v, and v being a positive integer.
As another example, when the association relationship between the image reorganization granularity and the corresponding feature stage indicates that the lower the image reorganization granularity, the lower the stage of the extracted feature information, the image extractor associated with the i-th second image of the N second images may be the (R-i+1)-th image extractor of the R image extractors, i being a positive integer less than or equal to N.
For example, if the number of second images is 3 and the number of image extractors is 4, and the image reorganization granularity of the 1st second image of the 3 second images is smaller than that of the 2nd second image, which in turn is smaller than that of the 3rd second image, then the image extractor associated with the 1st second image is the 4th image extractor of the 4 image extractors, the image extractor associated with the 2nd second image is the 3rd image extractor, and the image extractor associated with the 3rd second image is the 2nd image extractor.
Thus, taking the i-th second image of the N second images as an example, performing feature extraction processing on the N second images by the R image extractors and taking the feature extraction result output by the image extractor associated with each second image as the image feature information of that second image may be: performing feature extraction processing on the i-th second image through the first R-i+1 image extractors of the R image extractors, and taking the feature extraction result output by the (R-i+1)-th image extractor as the image feature information of the i-th second image.
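For illustration only, the two index mappings above can be sketched as a small helper (a hedged sketch: the function name and the 1-based indexing convention are assumptions introduced here, not part of the embodiment):

```python
def associated_extractor(i, n, r, lower_granularity_means_higher_stage=True):
    """Return the 1-based index of the image extractor associated with the
    i-th second image (1 <= i <= n), given n image generators and r image
    extractors, under the two association schemes described above."""
    if lower_granularity_means_higher_stage:
        # i-th second image -> (R - N + i)-th image extractor
        return r - n + i
    # i-th second image -> (R - i + 1)-th image extractor
    return r - i + 1

# Worked example from the text: 3 second images, 4 image extractors.
print([associated_extractor(i, 3, 4) for i in (1, 2, 3)])         # [2, 3, 4]
print([associated_extractor(i, 3, 4, False) for i in (1, 2, 3)])  # [4, 3, 2]
```

Under the first scheme the i-th second image passes through the first R-N+i extractors in series; under the second, through the first R-i+1.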
Therefore, feature information of different granularities at different stages corresponding to the first image (i.e., multiple groups of feature information of multiple granularities at multiple feature stages) can be obtained through the N image generators and the R image extractors, and the image feature information of the first image is determined from this feature information of different granularities at different stages. Such image feature information can contain deeper and richer image information, so the accuracy of subsequent image matching based on the image feature information can be improved.
For example, as shown in fig. 4a to 4b, fig. 4a to 4b are schematic views of a scene for acquiring image feature information according to an embodiment of the present application. For ease of understanding, take the N image generators 40 as 3 image generators (shown as 40a, 40b and 40c) and the R image extractors 41 as 5 image extractors (shown as 41a, 41b, 41c, 41d, 41e) as an example; the K corresponding to each image generator 40 decreases in sequence, and the output of each preceding image extractor among the image extractors 41 is the input of the next image extractor. As shown in fig. 4a, in the case where a lower image reorganization granularity corresponds to a higher stage of extracted feature information, the image extractor associated with the second image 42a generated by the image generator 40a is the image extractor 41c, the image extractor associated with the second image 42b generated by the image generator 40b is the image extractor 41d, and the image extractor associated with the second image 42c generated by the image generator 40c is the image extractor 41e. Accordingly, the second image 42a is input to the R image extractors 41, the image extractors 41a-41c perform feature extraction processing on the second image 42a, and the feature information output by the image extractor 41c is taken as the image feature information 43a of the second image 42a; the second image 42b is input to the R image extractors 41, the image extractors 41a-41d perform feature extraction processing on the second image 42b, and the feature information output by the image extractor 41d is taken as the image feature information 43b of the second image 42b; the second image 42c is input to the R image extractors 41, the image extractors 41a-41e perform feature extraction processing on the second image 42c, and the feature information output by the image extractor 41e is taken as the image feature information 43c of the second image 42c.
As another example, as shown in fig. 4b, in the case where a lower image reorganization granularity corresponds to a lower stage of extracted feature information, the image extractor associated with the second image 42a generated by the image generator 40a is the image extractor 41e, the image extractor associated with the second image 42b generated by the image generator 40b is the image extractor 41d, and the image extractor associated with the second image 42c generated by the image generator 40c is the image extractor 41c. Accordingly, the second image 42a is input to the R image extractors 41, the image extractors 41a-41e perform feature extraction processing on the second image 42a, and the feature information output by the image extractor 41e is taken as the image feature information 43a of the second image 42a; the second image 42b is input to the R image extractors 41, the image extractors 41a-41d perform feature extraction processing on the second image 42b, and the feature information output by the image extractor 41d is taken as the image feature information 43b of the second image 42b; the second image 42c is input to the R image extractors 41, the image extractors 41a-41c perform feature extraction processing on the second image 42c, and the feature information output by the image extractor 41c is taken as the image feature information 43c of the second image 42c.
S103, performing stitching processing on the image characteristic information of the N second images to obtain stitching characteristic information, and determining the image characteristic information and the image classification information of the first image through the stitching characteristic information.
In some embodiments, the image feature information of the N second images may be stitched to obtain stitching feature information, and the stitching feature information is used as the image feature information of the first image.
In some embodiments, performing stitching processing on the image feature information of the N second images to obtain the stitching feature information may alternatively be: inputting the image feature information of the N second images into the feature convolvers respectively associated with the N second images for convolution processing to obtain convolution feature information corresponding to the N second images, one feature convolver performing convolution processing on the image feature information of one second image to obtain one piece of convolution feature information; then performing stitching processing on the convolution feature information corresponding to the N second images, and taking the stitched convolution feature information as the stitching feature information. The feature convolvers associated with the N second images may be the same or different.
Alternatively, the stitching feature information obtained via the feature convolvers may be used directly as the image feature information of the first image. Alternatively, the stitching feature information determined in the above manner may be input to a feature processor, and the feature processor performs feature processing on the stitching feature information to obtain the image feature information of the first image. For example, the feature processor may perform feature dimension reduction processing on the stitching feature information, and the dimension-reduced stitching feature information is then used as the image feature information of the first image. It can be understood that the image feature information of the first image is obtained by feature fusion of the image feature information of the N second images; compared with directly extracting image feature information from the first image, this can obtain image feature information containing richer image information of the first image.
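As a minimal pure-Python sketch of the stitching path just described (the toy 1-D features, the convolution kernel, and average pooling as a stand-in for the feature processor's dimension reduction are all assumptions for illustration, not the embodiment's actual operators):

```python
def convolve1d(feature, kernel):
    """Valid 1-D convolution, standing in for a feature convolver."""
    k = len(kernel)
    return [sum(feature[j + t] * kernel[t] for t in range(k))
            for j in range(len(feature) - k + 1)]

def stitch_and_reduce(features, kernel, out_dim):
    # Convolve the image feature information of each second image, ...
    convolved = [convolve1d(f, kernel) for f in features]
    # ... stitch (concatenate) the convolution feature information, ...
    stitched = [x for c in convolved for x in c]
    # ... then reduce the dimension (here: simple average pooling as a
    # stand-in for the feature processor's dimension-reduction step).
    step = len(stitched) // out_dim
    return [sum(stitched[j * step:(j + 1) * step]) / step for j in range(out_dim)]

feats = [[1.0, 2.0, 3.0, 4.0], [2.0, 0.0, 1.0, 3.0], [0.5, 0.5, 0.5, 0.5]]
vec = stitch_and_reduce(feats, kernel=[0.5, 0.5], out_dim=3)
print(len(vec))  # 3
```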
In some embodiments, the image classification processing may be performed on the stitching feature information of the first image by using an image classifier, so as to obtain image classification information of the first image. Or performing image classification processing on the image characteristic information of the first image through an image classifier to obtain the image classification information of the first image. It will be appreciated that the image classification information is used to indicate the object classification of the target object contained in the first image. Therefore, when the image characteristic information is determined, the image classification information is synchronously determined, and the image search is comprehensively realized based on the image characteristic information and the image classification information.
It should be appreciated that an image search model may be constructed from the image generators, image extractors, feature convolvers, image classifier, image processor, etc. described above, and trained to obtain the trained image generators, image extractors, feature convolvers, image classifier, image processor, etc. The training process of the image search model can be seen in the related description of the following embodiments.
For example, as shown in fig. 5, fig. 5 is a schematic diagram of an image search model according to an embodiment of the present application. Taking an image search model including N image generators (50a, 50b, ..., 50n), R image extractors (51a, 51b, ..., 51r), N feature convolvers (52a, 52b, ..., 52n), an image classifier 53, and an image processor 54 as an example, the process of extracting image feature information by the image search model is as follows: the first image may be input to the N image generators, and N second images are generated by the N image generators; the N second images (53a, 53b, ..., 53n) are input to the R image extractors, the R image extractors perform feature extraction processing on the N second images, and the feature extraction result output by the image extractor associated with each second image is used as the image feature information (54a, 54b, ..., 54n) of that second image; for example, the second image 53a corresponds to the image feature information 54a, the second image 53b corresponds to the image feature information 54b, ..., and the second image 53n corresponds to the image feature information 54n. The image feature information of each second image is input to the feature convolver associated with it (for example, the second image 53a is associated with the feature convolver 52a, the second image 53b is associated with the feature convolver 52b, ..., and the second image 53n is associated with the feature convolver 52n); the feature convolvers output the convolution feature information of each second image, and the convolution feature information of the N second images is stitched to obtain the stitching feature information 55. The stitching feature information 55 is input to the image classifier 53 and the image processor 54 respectively; the image classifier 53 performs image classification processing on the stitching feature information 55 to obtain the image classification information of the first image, and the image processor 54 performs feature processing on the stitching feature information 55 to obtain the image feature information of the first image.
S104, acquiring an image set.
The image set comprises a plurality of images to be matched, and each image to be matched corresponds to one piece of image feature information and one piece of image classification information.
In some embodiments, the image set may be obtained based on the image classification information of the first image. Specifically, category recognition is performed on the target object in the first image to obtain the object category of the target object, and an object sub-category set associated with the object category is acquired, the object sub-category set including a plurality of object sub-categories. If the object sub-category indicated by the image classification information of the first image belongs to the object sub-category set, an image library is acquired; the image library comprises a plurality of reference images, one reference image corresponding to one object category (i.e., one reference image corresponding to one piece of image classification information). Reference images whose object category is the same as that of the target object (i.e., reference images whose corresponding image classification information indicates an object sub-category belonging to the object sub-category set) are then acquired from the image library, and the acquired reference images are taken as the image set. The category recognition may be performed by a trained object recognition model. The acquired reference images are the images to be matched in the image set.
It will be appreciated that the image classification information of an image indicates the object sub-category corresponding to the object in the image, and object sub-categories are divided at a finer granularity than object categories. For example, if the target object in the first image is a dog whose breed is Husky, the recognized object category may be dog, and the object sub-category indicated by the image classification information may be Husky. Thus, the object category and the finer object sub-category of the object contained in each image can be determined in the above manner.
The object sub-category set associated with an object category includes all object sub-categories under the object category. For example, if the object category is dog, the object sub-categories may be Husky, Corgi, and the like. The object sub-categories under each object category may be preset based on prior knowledge and empirical values.
It can be understood that, through the object category and the image classification information, whether the category recognition result of the object recognition model and the recognition result of the image classifier match can be mutually verified, so as to determine whether the recognized classification results are consistent; poor image search quality caused by an object category recognition error or an image classification information recognition error can thereby be avoided to a large extent. When the object sub-category indicated by the image classification information belongs to the object category of the target object in the first image (i.e., the object sub-category indicated by the image classification information belongs to the object sub-category set), the category recognition result of the object recognition model and the recognition result of the image classifier match, and the subsequent image search process may be performed.
Therefore, when the object sub-category indicated by the image classification information of the first image belongs to the object sub-category set (that is, the object sub-category corresponding to the image classification information matches the object category of the target object), the image library for image search can be acquired. A reference image in the image library may be an image in an image database, or may be an image in a multimedia file (such as a note posted by a user) in a multimedia database. It will be appreciated that the reference images associated with each object sub-category in the object sub-category set may be obtained from the image library as the image set.
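The cross-check and library-filtering steps above might be sketched as follows (a hedged illustration: the dictionary/list data layout and all names here are hypothetical, not from the embodiment):

```python
def build_image_set(object_category, classifier_subcat, subcategory_sets, library):
    """Return the image set for searching, or None when the cross-check
    between the recognized object category and the classifier's object
    sub-category fails.

    subcategory_sets: maps an object category to the set of its sub-categories.
    library: list of (image_id, object_sub_category) reference entries.
    """
    subcats = subcategory_sets.get(object_category, set())
    # Mutual verification: the sub-category indicated by the image
    # classification information must belong to the sub-category set.
    if classifier_subcat not in subcats:
        return None
    # Keep reference images whose sub-category falls under the same category.
    return [img for img, sc in library if sc in subcats]

subcat_sets = {"dog": {"husky", "corgi"}, "cat": {"persian"}}
library = [("a.png", "husky"), ("b.png", "persian"), ("c.png", "corgi")]
print(build_image_set("dog", "husky", subcat_sets, library))  # ['a.png', 'c.png']
```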
It may be understood that a reference image may include a reference object, and the image classification information corresponding to any reference image may be obtained by: performing object recognition on the reference object in the reference image, generating an object detection frame in which the recognized reference object is located in the reference image, determining the image data in the object detection frame as an image to be classified, obtaining the image classification information of the image to be classified, and taking the image classification information of the image to be classified as the image classification information of the reference image. Likewise, the image feature information of the image to be classified may be taken as the image feature information of the reference image; the image feature information of the image to be classified is determined in the same manner as the image feature information of the first image.
In this way, reference images that are similar to the image to be searched, i.e., images containing objects of the same object category, can be preliminarily screened from the image library based on the image classification information, and better-matched similar images can then be further screened based on the image feature information, so as to realize high-precision image search.
S105, searching the images to be matched that match the first image from the image set based on the image feature information and the image classification information of the first image and the image feature information and the image classification information of each image to be matched.
In some embodiments, searching the image set for the images to be matched that match the first image may include: determining the image similarity between the first image and each image to be matched based on the image feature information of the first image and the image feature information of each image to be matched; sorting the images to be matched in the image set according to the image similarity to obtain a sorted image set; reordering the images to be matched in the sorted image set according to the image classification information of the first image and the image classification information of each image to be matched to obtain a reordered image set; and taking a target number of images to be matched intercepted from the reordered image set as the images matching the first image.
Sorting the images to be matched in the image set according to the image similarity may be sorting the images to be matched in descending order of image similarity.
Optionally, reordering (i.e., category reordering) the images to be matched in the sorted image set according to the image classification information of the first image and the image classification information of each image to be matched may be: adjusting the images to be matched whose image classification information is the same as that of the first image to positions before the images to be matched whose image classification information differs from that of the first image, and taking the adjusted image set as the reordered image set. Further, in order to enrich the object sub-categories of the images matching the first image, the adjusted image set may be reordered again according to a category expansion policy and then used as the reordered image set; for example, the order of the adjusted image set is adjusted again so that the intercepted images to be matched correspond to richer object sub-categories, e.g., the number of corresponding object sub-categories is greater than or equal to a specified number. It can be understood that the found images matching the first image may be returned to the user terminal, or the multimedia files in which the found images matching the first image are located may be returned to the user terminal.
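A compact sketch of the two-stage ranking just described, i.e., a similarity sort followed by a stable category reorder (the cosine-similarity measure and all names here are assumptions for illustration):

```python
import math

def cosine(a, b):
    """Cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def search(query_feat, query_class, candidates, top_k):
    """candidates: list of (image_id, feature_vector, classification_info)."""
    # Stage 1: sort in descending order of image similarity.
    ranked = sorted(candidates, key=lambda c: cosine(query_feat, c[1]), reverse=True)
    # Stage 2: stable category reorder - candidates sharing the query's
    # classification information move ahead; Python's sort is stable, so the
    # similarity order is preserved within each group.
    ranked.sort(key=lambda c: c[2] != query_class)
    return [c[0] for c in ranked[:top_k]]

cands = [("x", [0.9, 0.1], "corgi"),
         ("y", [0.8, 0.2], "husky"),
         ("z", [0.1, 0.9], "husky")]
print(search([1.0, 0.0], "husky", cands, 2))  # ['y', 'z']
```

The stable second sort is what lets the classification information "assist" the similarity ranking rather than replace it.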
For example, as shown in fig. 6, fig. 6 is a schematic diagram of an image searching process according to an embodiment of the present application. A reference image is acquired, and object recognition is performed on the reference image to obtain the reference object contained in it; the image data in the object detection frame in which the reference object is located is taken as the image to be classified corresponding to the reference image; category recognition is performed on the reference object in the image to be classified to obtain the object category of the reference object, which is taken as the object category of the reference image; and the image feature information and the image classification information of the image to be classified are determined through the image search model. The object category of the reference object is verified through the image classification information, and when the verification result indicates that the object sub-category corresponding to the image classification information matches the object category of the reference object, the reference image is added to the image library, so as to realize the construction of the image library. When an image to be searched is acquired, object recognition is performed on the image to be searched to obtain the target object contained in it; the image data in the object detection frame in which the target object is located is taken as the first image corresponding to the image to be searched; category recognition is performed on the target object in the first image to obtain the object category of the target object; and the image feature information and the image classification information of the first image are determined through the image search model. The object category of the target object is verified through the image classification information, and when the verification result indicates that the object sub-category corresponding to the image classification information matches the object category of the target object, the image set is acquired from the image library based on the object category of the target object. Image similarity matching is then performed based on the image feature information of the first image and the image feature information of the images to be matched in the image set, i.e., the images to be matched are sorted by similarity based on the image feature information of the first image and that of the images to be matched; the images to be matched are category-reordered based on the image classification information of the first image and that of the images to be matched to obtain a sorted image set; and the images matching the first image are determined from the sorted image set.
By the above method, the images to be matched that better match the first image can be comprehensively determined through the image feature information and the image classification information. The deeper image feature information makes the matched similar-image results more accurate, and the image classification information assists in optimizing the determined matching images, so the image search precision and search quality can be improved, and the search experience of the user is improved.
In the embodiment of the application, a plurality of second images corresponding to the first image can be obtained through the image generators so as to realize data enhancement. It can be understood that the plurality of second images all contain the image information of the first image; the image feature information respectively extracted from the second images through the image extractors can be used as multi-stage feature information of the first image, and the image feature information obtained from this multi-stage feature information can represent deeper image features of the first image, i.e., contains richer feature information. Therefore, the images matching the first image can be comprehensively determined through the deeper image feature information and the image classification information, so that the image search precision can be improved.
Referring to fig. 7, fig. 7 is a flowchart of an image searching method according to an embodiment of the present application, and the method may be performed by the above-mentioned electronic device. As shown in fig. 7, the flow of the image searching method in the embodiment of the present application may include the following steps:
S201, acquiring a first sample image, inputting the first sample image into N initial image generators, and respectively carrying out image recombination processing on the first sample image through the N initial image generators to generate N second sample images.
The process and principle of performing image reorganization processing on the first sample image through the N initial image generators to generate the N second sample images are the same as the process and principle of performing image reorganization processing on the first image through the N image generators to generate the N second images, and are not described in detail herein.
It will be appreciated that an initial image generator is used to generate a second sample image.
S202, inputting N second sample images into R initial image extractors respectively, carrying out feature extraction processing on the N second sample images through the R initial image extractors respectively, and taking sample feature extraction results output by the initial image extractor associated with each second sample image in the R initial image extractors as sample image feature information of each second sample image respectively.
The process and principle of performing feature extraction processing on the N second sample images through the R initial image extractors to obtain the sample image feature information of each second sample image are the same as the process and principle of performing feature extraction processing on the N second images through the R image extractors to obtain the image feature information of each second image, and are not described in detail herein. By this method, sample image feature information of different reorganization granularities at different stages can be obtained.
S203, performing stitching processing on the sample image characteristic information of each second sample image to obtain sample stitching characteristic information, and determining sample image characteristic information and sample image classification information of the first sample image through the sample stitching characteristic information.
The process and principle of determining the sample image feature information and the sample image classification information of the first sample image through the sample stitching feature information are the same as the process and principle of determining the image feature information and the image classification information of the first image through the stitching feature information, and are not described herein.
For example, the sample image feature information of the first sample image may be obtained by: inputting the sample image feature information of the N second sample images into the initial feature convolvers respectively associated with the N second sample images for convolution processing to obtain sample convolution feature information corresponding to the N second sample images, one initial feature convolver performing convolution processing on the sample image feature information of one second sample image to obtain one piece of sample convolution feature information; performing stitching processing on the sample convolution feature information corresponding to the N second sample images, and taking the stitched sample convolution feature information as the sample stitching feature information; and inputting the sample stitching feature information into an initial feature processor, the initial feature processor performing feature processing on the sample stitching feature information to obtain the sample image feature information of the first sample image. As another example, the sample image classification information of the first sample image may be obtained by performing image classification processing on the sample stitching feature information through an initial image classifier.
S204, carrying out feature processing on the sample image feature information of each second sample image to obtain sample processing feature information corresponding to the sample image feature information of each second sample image, and taking the sample image feature information of the first sample image and the sample processing feature information corresponding to each second sample image as N+1 training feature information of the first sample image.
The sample processing feature information corresponding to each second sample image is determined in the same manner. Taking one second sample image as an example, the sample processing feature information may be determined as follows: the sample image feature information of the second sample image is input into the initial feature convolver associated with that second sample image for convolution processing to obtain the sample convolution feature information corresponding to the second sample image; the sample convolution feature information corresponding to the second sample image is input into the initial feature processor associated with the second sample image; and the initial feature processor associated with the second sample image performs feature processing on the sample convolution feature information corresponding to the second sample image to obtain the sample processing feature information corresponding to the sample image feature information of the second sample image. The initial feature processor associated with the second sample image may be the same as or different from the initial feature processor described above that outputs the sample image feature information of the first sample image.
That is, the sample image feature information of each second sample image may be convolved by the corresponding initial feature convolver to obtain the sample convolution feature information of that second sample image; the sample convolution feature information of each second sample image is input to the initial feature processor associated with that second sample image for feature processing to obtain the sample processing feature information corresponding to that second sample image; meanwhile, the sample convolution feature information of the N second sample images is stitched to obtain the sample stitching feature information, and the sample stitching feature information is input to the initial feature processor associated with it for feature processing to obtain the sample image feature information of the first sample image.
Optionally, the sample convolution feature information of the N second sample images may be respectively input into the initial image classifiers associated with them for image classification processing, so as to obtain the sample image classification information corresponding to the N second sample images. The initial image classifiers associated with the N second sample images may be the same as or different from the initial image classifier used to output the sample image classification information of the first sample image.
Therefore, the sample image feature information of the first sample image and the sample processing feature information corresponding to each second sample image can be used as N+1 pieces of training feature information of the first sample image, and the initial image search model can be trained with these N+1 pieces of training feature information. For example, the initial image search model may include N initial image generators, R initial image extractors, N initial feature convolvers, N+1 initial image classifiers, and N+1 initial image processors. It will be appreciated that each second sample image, as well as the first sample image, is associated with one initial image classifier and one initial image processor.
Alternatively, the sample image classification information of the first sample image and the sample image classification information corresponding to each second sample image may be used as N+1 pieces of training classification information of the first sample image, and the initial image search model may be trained jointly with these N+1 pieces of training classification information.
S205, training N initial image generators and R initial image extractors through the N+1 training characteristic information and the sample image classification information to obtain N image generators and R image extractors.
In some embodiments, the N initial image generators and the R initial image extractors may be trained as follows: acquiring an image classification label of the first sample image, and determining classification deviation information based on the image classification label of the first sample image and the sample image classification information; acquiring a contrast image corresponding to the first sample image, and acquiring a contrast feature set of the contrast image, where the contrast feature set includes N+1 pieces of contrast feature information, and each piece of contrast feature information corresponds to one piece of training feature information; determining feature deviation information based on the feature correlation between each piece of training feature information and its corresponding contrast feature information; and iteratively training the N initial image generators and the R initial image extractors through the classification deviation information and the feature deviation information, taking the N iteratively trained initial image generators as the N image generators, and taking the R iteratively trained initial image extractors as the R image extractors.
That is, determining the classification deviation information based on the image classification label of the first sample image and the sample image classification information may mean generating, for each piece of sample image classification information among the N+1 pieces of training classification information, the classification deviation information corresponding to that piece of sample image classification information from the image classification label of the first sample image, and then determining a classification loss value from the classification deviation information corresponding to each piece of sample image classification information. The classification deviation information may be determined by performing a deviation calculation on the image classification label and the sample image classification information with a cross-entropy loss function. It can be appreciated that training the image search model with the sample image classification information enables the image search model to extract more accurate image feature information.
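The per-branch cross-entropy computation and the aggregation into one classification loss value can be sketched as below. The logits and label values are invented for illustration, and summing the N+1 deviations is only one possible aggregation (the patent does not commit to a specific one):

```python
import numpy as np

def cross_entropy(logits, label):
    """Classification deviation between one branch's class scores and the label."""
    shifted = logits - logits.max()          # numerically stable softmax
    log_probs = shifted - np.log(np.exp(shifted).sum())
    return float(-log_probs[label])

# N + 1 pieces of sample image classification information (raw class scores),
# one per classifier branch; values are purely illustrative.
branch_logits = [np.array([2.0, 0.5, -1.0]),
                 np.array([1.5, 1.0, 0.0]),
                 np.array([0.2, 2.2, -0.5]),
                 np.array([3.0, -1.0, 0.1])]
label = 0  # image classification label of the first sample image

# One piece of classification deviation information per branch, then one loss.
deviations = [cross_entropy(l, label) for l in branch_logits]
classification_loss = sum(deviations)
```

Each deviation is non-negative and shrinks toward zero as that branch's predicted probability for the labeled class approaches one.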
It is understood that the contrast image may be a positive sample image or a negative sample image with respect to the first sample image, and that the initial image search model may be trained by contrastive learning based on positive and negative sample images. The N+1 pieces of contrast feature information included in the contrast feature set of the contrast image are determined in the same manner as the N+1 pieces of training feature information of the first sample image: N contrast reorganization images corresponding to the contrast image are generated through the N initial image generators; sample image feature information of the N contrast reorganization images is generated through the R initial image extractors; sample convolution feature information of the N contrast reorganization images is generated through the N initial feature convolvers; and, based on the sample convolution feature information of the N contrast reorganization images, sample processing feature information corresponding to the N contrast reorganization images and sample image feature information of the contrast image are generated through the N+1 initial image processors and used as the N+1 pieces of contrast feature information in the contrast feature set. Thus, each piece of contrast feature information corresponds to one piece of training feature information.
Feature correlation calculation may be performed on each piece of training feature information and its corresponding contrast feature information, so as to obtain the feature correlation between them, and the feature deviation information is determined based on these feature correlations. The feature correlation calculation may be performed with an additive angular margin loss function or a triplet loss function. It can be understood that if the contrast image is a positive sample image, the training target is to make the feature correlation higher; if the contrast image is a negative sample image, the training target is to make the feature correlation lower. The initial image search model can therefore be trained by contrastive learning, so that it learns the correlation between positively correlated images and the difference between negatively correlated images, enabling the model to mine feature information of different granularities and at different stages in the images. It can be understood that the image search model involves two prediction tasks, a feature extraction task and an image classification task, which are trained simultaneously; this realizes multi-task learning and gives the image search model a better model effect. It will be appreciated that there may be one or more contrast images, that one piece of feature deviation information is determined from the contrast feature set corresponding to each contrast image, and that a feature loss value of the image search model is determined based on the one or more pieces of feature deviation information.
For example, the sum of the values indicated by the one or more pieces of feature deviation information is used as the feature loss value.
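A minimal sketch of the contrastive objective follows, using plain cosine similarity in place of the angular-margin or triplet formulations named above (a simplifying assumption), with all vectors randomly generated for illustration:

```python
import numpy as np

def cosine(a, b):
    """Feature correlation between two feature vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def feature_deviation(train_feats, contrast_feats, positive):
    """Deviation for one contrast image: a positive sample is pushed toward
    high correlation, a negative sample toward low correlation."""
    devs = []
    for t, c in zip(train_feats, contrast_feats):  # N + 1 paired features
        sim = cosine(t, c)
        devs.append(1.0 - sim if positive else 1.0 + sim)
    return sum(devs)

rng = np.random.default_rng(1)
train = [rng.standard_normal(8) for _ in range(4)]        # N + 1 training features
pos = [t + 0.1 * rng.standard_normal(8) for t in train]   # near-duplicate: positive
neg = [rng.standard_normal(8) for _ in range(4)]          # unrelated: negative

pos_dev = feature_deviation(train, pos, True)
neg_dev = feature_deviation(train, neg, False)
feature_loss = pos_dev + neg_dev   # sum of the per-contrast-image deviations
```

Because the positive set is a slightly perturbed copy of the training features, its deviation is near zero, while the unrelated negative set contributes a much larger term.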
It will be appreciated that training the initial image search model through the classification deviation information and the feature deviation information means iteratively training the initial image search model through the classification loss value and the feature loss value, and determining the iteratively trained initial image search model as the image search model used to extract image features. The image search model includes the N iteratively trained initial image generators (the N image generators), the R iteratively trained initial image extractors (the R image extractors), the N iteratively trained initial feature convolvers (the N feature convolvers), the N+1 iteratively trained initial image classifiers (the N+1 image classifiers), and the N+1 iteratively trained initial image processors (the N+1 image processors).
For example, as shown in fig. 8, fig. 8 is a schematic diagram of training an image search model according to an embodiment of the present application. The initial image search model includes N initial image generators (image generator 1, image generator 2, ..., image generator N), R initial image extractors (image extractor 1, image extractor 2, ..., image extractor R), N initial feature convolvers (feature convolver 1, feature convolver 2, ..., feature convolver N), N+1 initial image classifiers (image classifier 1, image classifier 2, ..., image classifier N+1), and N+1 initial image processors (image processor 1, image processor 2, ..., image processor N+1). The first sample image is input into the N initial image generators to obtain N second sample images; the N second sample images are input into the R initial image extractors to obtain sample image feature information of the N second sample images, and that feature information is input into the N initial feature convolvers to obtain sample convolution feature information of the N second sample images. The sample convolution feature information of the N second sample images is spliced to obtain sample splicing feature information; the sample convolution feature information of the N second sample images and the sample splicing feature information are respectively input into the N+1 initial image classifiers to obtain sample image classification information of the N second sample images and sample image classification information of the first sample image, and are respectively input into the N+1 initial image processors to obtain sample processing feature information of the N second sample images and sample image feature information of the first sample image, i.e., the N+1 pieces of training feature information. Classification deviation information corresponding to the N second sample images and classification deviation information corresponding to the first sample image are determined from the image classification label of the first sample image, the sample image classification information of the N second sample images, and the sample image classification information of the first sample image, and a classification loss value is determined from these pieces of classification deviation information. A contrast image and its contrast feature set are acquired; feature correlations corresponding to the N second sample images and to the first sample image are determined from the N+1 pieces of training feature information and the N+1 pieces of contrast feature information in the contrast feature set, feature deviation information is determined from the feature correlations, and a feature loss value is determined from the feature deviation information. The initial image search model is trained through the classification loss value and the feature loss value to obtain the trained image search model, which includes the N image generators, the R image extractors, the N feature convolvers, the (N+1)-th image processor among the N+1 initial image processors, and the (N+1)-th image classifier among the N+1 initial image classifiers.
According to the embodiment of the application, the image extractor and the image generator which can extract the deep feature information of the first image can be trained, so that the image search can be realized through the image extractor and the image generator, and the image search precision is improved.
Referring to fig. 9, fig. 9 is a schematic structural diagram of an image searching apparatus according to an embodiment of the present application. It should be noted that the image searching apparatus shown in fig. 9 is used to perform the methods of the embodiments shown in fig. 2 and fig. 7 of the present application; for convenience of explanation, only the portions relevant to the embodiments of the present application are shown, and for specific technical details that are not disclosed, reference is made to the embodiments shown in fig. 2 and fig. 7 of the present application. The image search apparatus 900 may include: a processing module 901 and an acquisition module 902. Wherein:
The processing module 901 is configured to acquire a first image, input the first image into N image generators, and perform image recombination processing on the first image through the N image generators to generate N second images; each image generator is used for generating one second image; N is an integer greater than 1;
The processing module 901 is further configured to input N second images into R image extractors, perform feature extraction processing on the N second images through the R image extractors, and respectively use feature extraction results output by the image extractor associated with each second image in the R image extractors as image feature information of each second image; r is an integer greater than or equal to N;
the processing module 901 is further configured to perform stitching processing on the image feature information of the N second images to obtain stitching feature information, and determine image feature information and image classification information of the first image according to the stitching feature information;
An acquiring module 902, configured to acquire an image set; the image set comprises a plurality of images to be matched, wherein one image to be matched corresponds to one image characteristic information and one image classification information;
The processing module 901 is further configured to search for an image to be matched that matches the first image from the image set based on the image feature information and the image classification information of the first image, the image feature information and the image classification information of each image to be matched.
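The search step performed by the processing module can be sketched as follows: classification information narrows the candidates, and feature similarity ranks them. The scoring rule (exact class match plus cosine similarity) and all data are illustrative assumptions; the patent only requires that both kinds of information be used.

```python
import numpy as np

def search(query_feat, query_cls, candidates):
    """Find the image to be matched: filter by image classification
    information, then rank by cosine similarity of image feature information."""
    best, best_sim = None, -2.0
    for cand in candidates:
        if cand["cls"] != query_cls:     # classification information must agree
            continue
        f = cand["feat"]
        sim = float(query_feat @ f /
                    (np.linalg.norm(query_feat) * np.linalg.norm(f)))
        if sim > best_sim:
            best, best_sim = cand, sim
    return best

q = np.array([1.0, 0.0, 0.0])            # image feature information of the first image
cands = [{"id": "a", "cls": 2, "feat": np.array([0.9, 0.1, 0.0])},
         {"id": "b", "cls": 1, "feat": np.array([1.0, 0.0, 0.0])},
         {"id": "c", "cls": 2, "feat": np.array([0.0, 1.0, 0.0])}]

# Candidate "b" has identical features but the wrong class, so it is excluded;
# among class-2 candidates, "a" is the closest match.
match = search(q, 2, cands)
```

Filtering by classification first keeps the expensive similarity comparison restricted to plausible candidates, which is the precision benefit the abstract claims.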
In some embodiments, when configured to perform image recombination processing on the first image through the N image generators to generate the N second images, the processing module 901 is specifically configured to:
split the first image into K image blocks through the i-th image generator of the N image generators, and splice the K image blocks, according to an image splicing rule, into a spliced image with the same image size as the first image; i is a positive integer less than or equal to N; the K corresponding to any image generator positioned before the i-th image generator among the N image generators is larger than the K corresponding to the i-th image generator, and the K corresponding to any image generator positioned after the i-th image generator among the N image generators is smaller than the K corresponding to the i-th image generator;
The spliced image is taken as the second image generated by the i-th image generator.
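The split-and-resplice operation of one image generator can be sketched as a jigsaw-style shuffle. This is a simplified reading in which each generator cuts the image into a grid of blocks and re-stitches them in random order at the original size; the grid sizes (9, 4, 1 blocks) and the square-image assumption are illustrative.

```python
import numpy as np

def reorganize(image, k_per_side, rng):
    """One 'image generator': split a square image into k_per_side**2 blocks
    and re-stitch them in a random order, keeping the original image size."""
    h, w = image.shape[:2]
    bh, bw = h // k_per_side, w // k_per_side
    blocks = [image[r*bh:(r+1)*bh, c*bw:(c+1)*bw]
              for r in range(k_per_side) for c in range(k_per_side)]
    order = rng.permutation(len(blocks))
    rows = [np.hstack([blocks[order[r*k_per_side + c]] for c in range(k_per_side)])
            for r in range(k_per_side)]
    return np.vstack(rows)

rng = np.random.default_rng(0)
first_image = np.arange(36).reshape(6, 6)

# N generators with decreasing K (here 9, 4, 1 blocks): progressively coarser
# shuffles; K = 1 leaves the image unchanged.
second_images = [reorganize(first_image, k, rng) for k in (3, 2, 1)]
assert all(img.shape == first_image.shape for img in second_images)
```

Each second image preserves exactly the pixels of the first image, only their spatial arrangement changes, which is what lets the extractors focus on features at different granularities.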
In some embodiments, the image extractor associated with the i-th second image of the N second images is the (R-N+i)-th image extractor of the R image extractors; i is a positive integer less than or equal to N;
when configured to perform feature extraction processing on the N second images through the R image extractors and respectively use, as the image feature information of each second image, the feature extraction result output by the image extractor associated with that second image among the R image extractors, the processing module 901 is specifically configured to:
perform feature extraction processing on the i-th second image through the first R-N+i image extractors among the R image extractors, and take the feature extraction result output by the (R-N+i)-th image extractor as the image feature information of the i-th second image; any two adjacent image extractors among the first R-N+i image extractors comprise a u-th image extractor and a v-th image extractor, where u is a positive integer smaller than v and v is a positive integer, and the feature extraction result output by the u-th image extractor is used as the input of the v-th image extractor.
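The chained-extractor arrangement can be sketched as below: the R extractors form a pipeline, each feeding the next, and the i-th second image takes its feature information from depth R-N+i. The per-extractor transform (a tanh of a random linear map) is a placeholder assumption, since the patent leaves the extractor architecture open.

```python
import numpy as np

rng = np.random.default_rng(0)
N, R, dim = 2, 4, 6

# R chained extractors; the output of extractor u feeds extractor u + 1.
extractors = [rng.standard_normal((dim, dim)) for _ in range(R)]

def extract(image_feat, depth):
    """Run the first `depth` extractors in sequence and return the last output."""
    out = image_feat
    for w in extractors[:depth]:
        out = np.tanh(out @ w)   # placeholder for one extractor's transform
    return out

# The N second images, represented here as feature vectors for simplicity.
second_images = [rng.standard_normal(dim) for _ in range(N)]

# The i-th second image (1-based) is associated with extractor R - N + i,
# i.e. its features come from passing through the first R - N + i extractors.
features = [extract(img, R - N + (i + 1)) for i, img in enumerate(second_images)]
```

Later second images (produced by generators with smaller K, hence less shuffled) thus pass through more extractors, so deeper features are extracted from the less fragmented views.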
In some embodiments, the processing module 901, when configured to acquire the first image, is specifically configured to:
acquiring an image to be searched, carrying out object recognition on a target object in the image to be searched, and generating an object detection frame where the recognized target object in the image to be searched is located;
Image data in the object detection frame is determined as a first image.
In some embodiments, the acquiring module 902, when configured to acquire the image set, is specifically configured to:
Identifying the category of the target object in the first image to obtain the object category of the target object;
Obtaining an object sub-category set associated with an object category; the object sub-category set includes a plurality of object sub-categories;
If the object sub-category indicated by the image classification information of the first image belongs to the object sub-category set, acquiring an image library; the image library comprises a plurality of reference images, and one reference image corresponds to one image classification information;
and acquiring reference images of which the object sub-category indicated by the corresponding image classification information belongs to the object sub-category set from an image library, and taking the acquired reference images as an image set.
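The construction of the image set by sub-category filtering can be sketched as follows. The category names and the dictionary-based image library are purely hypothetical stand-ins for whatever storage the implementation uses:

```python
# Object sub-category set associated with a recognized object category
# (e.g. category "shoe"); all names are invented for illustration.
sub_categories = {"sneaker", "boot", "sandal"}

image_library = [
    {"id": 1, "sub_category": "sneaker"},
    {"id": 2, "sub_category": "handbag"},
    {"id": 3, "sub_category": "boot"},
]

# Sub-category indicated by the first image's classification information.
first_image_sub = "sneaker"

# Only if the first image's sub-category belongs to the set is the library
# consulted; reference images whose sub-category also belongs form the image set.
if first_image_sub in sub_categories:
    image_set = [img for img in image_library
                 if img["sub_category"] in sub_categories]
else:
    image_set = []
```

This pre-filtering keeps the subsequent feature comparison within one coarse category, which is what allows classification information to sharpen the final match.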
In some embodiments, the processing module 901 is further for:
Acquiring a first sample image, inputting the first sample image into N initial image generators, and respectively carrying out image recombination processing on the first sample image through the N initial image generators to generate N second sample images; an initial image generator for generating a second sample image;
Respectively inputting N second sample images into R initial image extractors, respectively carrying out feature extraction processing on the N second sample images through the R initial image extractors, and respectively taking sample feature extraction results output by the initial image extractor associated with each second sample image in the R initial image extractors as sample image feature information of each second sample image;
Performing stitching processing on sample image characteristic information of each second sample image to obtain sample stitching characteristic information, and determining sample image characteristic information and sample image classification information of the first sample image according to the sample stitching characteristic information;
Carrying out feature processing on sample image feature information of each second sample image to obtain sample processing feature information corresponding to the sample image feature information of each second sample image, and taking the sample image feature information of the first sample image and the sample processing feature information corresponding to each second sample image as N+1 training feature information of the first sample image;
And training the N initial image generators and the R initial image extractors through the N+1 training characteristic information and the sample image classification information to obtain the N image generators and the R image extractors.
In some embodiments, the processing module 901, when used for training N initial image generators and R initial image extractors through n+1 training feature information and sample image classification information, is specifically used for:
acquiring an image classification label of a first sample image, and determining classification deviation information based on the image classification label of the first sample image and sample image classification information;
Acquiring a contrast image corresponding to the first sample image, and acquiring a contrast characteristic set of the contrast image; the contrast characteristic set comprises N+1 contrast characteristic information, and one contrast characteristic information corresponds to one training characteristic information;
determining feature deviation information based on feature correlation between each training feature information and contrast feature information corresponding to each training feature information;
and iteratively training N initial image generators and R initial image extractors through the classification deviation information and the characteristic deviation information, taking the N initial image generators after the iterative training as N image generators, and taking the R initial image extractors after the iterative training as R image extractors.
For specific implementation manners of the processing module 901 and the obtaining module 902, reference may be made to the related descriptions in the foregoing embodiments, and details will not be further described herein. It should be understood that the description of the beneficial effects obtained by the same method will not be repeated.
The functional units in the embodiments of the present application may be integrated in one module (unit), or each module (unit) may exist alone physically, or two or more modules (units) may be integrated in one module. The integrated modules (units) may be implemented in hardware or in software functional modules, and the application is not limited thereto.
Referring to fig. 10, fig. 10 is a schematic structural diagram of an electronic device according to an embodiment of the present application. As shown in fig. 10, the electronic device 1000 includes: at least one processor 1001, a memory 1002. Optionally, the electronic device may further comprise a network interface. The processor 1001, the memory 1002 and the network interface may exchange data, the network interface is controlled by the processor 1001 to send and receive messages, the memory 1002 is used for storing a computer program, the computer program includes program instructions, and the processor 1001 is used for executing the program instructions stored in the memory 1002. Wherein the processor 1001 is configured to invoke program instructions to perform the above method.
The memory 1002 may include a volatile memory, such as a random-access memory (RAM); the memory 1002 may also include a non-volatile memory, such as a flash memory or a solid-state drive (SSD); the memory 1002 may also include a combination of the above types of memory.
The processor 1001 may be a central processing unit (CPU). In one embodiment, the processor 1001 may also be a graphics processing unit (GPU). The processor 1001 may also be a combination of a CPU and a GPU.
In one possible implementation, the memory 1002 is configured to store program instructions, and the processor 1001 may call the program instructions to perform the following steps:
Acquiring a first image, inputting the first image into N image generators, and respectively carrying out image recombination processing on the first image through the N image generators to generate N second images; an image generator for generating a second image; n is an integer greater than 1;
Inputting N second images into R image extractors respectively, carrying out feature extraction processing on the N second images through the R image extractors, and taking feature extraction results output by the image extractor associated with each second image in the R image extractors as image feature information of each second image respectively; r is an integer greater than or equal to N;
performing stitching processing on the image characteristic information of the N second images to obtain stitching characteristic information, and determining the image characteristic information and the image classification information of the first image through the stitching characteristic information;
acquiring an image set; the image set comprises a plurality of images to be matched, wherein one image to be matched corresponds to one image characteristic information and one image classification information;
and searching the images to be matched with the first image from the image set based on the image characteristic information and the image classification information of the first image and the image characteristic information and the image classification information of each image to be matched.
In some embodiments, when configured to perform image recombination processing on the first image through the N image generators to generate the N second images, the processor 1001 is specifically configured to:
Splitting a first image into K image blocks through an ith image generator in the N image generators, and splicing the K image blocks into spliced images with the same image size as the first image according to an image splicing rule; i is a positive integer less than or equal to N; the K corresponding to the image generator positioned before the ith image generator in the N image generators is larger than the K corresponding to the ith image generator, and the K corresponding to the image generator positioned after the ith image generator in the N image generators is smaller than the K corresponding to the ith image generator;
The stitched image is taken as a second image generated by the ith image generator.
In some embodiments, the image extractor associated with the i-th second image of the N second images is the (R-N+i)-th image extractor of the R image extractors; i is a positive integer less than or equal to N;
The processor 1001 is specifically configured to, when performing feature extraction processing on N second images by R image extractors, and using, as image feature information of each second image, feature extraction results output by the image extractor associated with each second image among the R image extractors, respectively:
perform feature extraction processing on the i-th second image through the first R-N+i image extractors among the R image extractors, and take the feature extraction result output by the (R-N+i)-th image extractor as the image feature information of the i-th second image; any two adjacent image extractors among the first R-N+i image extractors comprise a u-th image extractor and a v-th image extractor, where u is a positive integer smaller than v and v is a positive integer, and the feature extraction result output by the u-th image extractor is used as the input of the v-th image extractor.
In some embodiments, the processor 1001, when configured to acquire the first image, is specifically configured to:
acquiring an image to be searched, carrying out object recognition on a target object in the image to be searched, and generating an object detection frame where the recognized target object in the image to be searched is located;
Image data in the object detection frame is determined as a first image.
In some embodiments, the processor 1001, when configured to acquire the image set, is specifically configured to:
Identifying the category of the target object in the first image to obtain the object category of the target object;
Obtaining an object sub-category set associated with an object category; the object sub-category set includes a plurality of object sub-categories;
If the object sub-category indicated by the image classification information of the first image belongs to the object sub-category set, acquiring an image library; the image library comprises a plurality of reference images, and one reference image corresponds to one image classification information;
and acquiring reference images of which the object sub-category indicated by the corresponding image classification information belongs to the object sub-category set from an image library, and taking the acquired reference images as an image set.
In some embodiments, the processor 1001 is further configured to:
Acquiring a first sample image, inputting the first sample image into N initial image generators, and respectively carrying out image recombination processing on the first sample image through the N initial image generators to generate N second sample images; an initial image generator for generating a second sample image;
Respectively inputting N second sample images into R initial image extractors, respectively carrying out feature extraction processing on the N second sample images through the R initial image extractors, and respectively taking sample feature extraction results output by the initial image extractor associated with each second sample image in the R initial image extractors as sample image feature information of each second sample image;
Performing stitching processing on sample image characteristic information of each second sample image to obtain sample stitching characteristic information, and determining sample image characteristic information and sample image classification information of the first sample image according to the sample stitching characteristic information;
Carrying out feature processing on sample image feature information of each second sample image to obtain sample processing feature information corresponding to the sample image feature information of each second sample image, and taking the sample image feature information of the first sample image and the sample processing feature information corresponding to each second sample image as N+1 training feature information of the first sample image;
And training the N initial image generators and the R initial image extractors through the N+1 training characteristic information and the sample image classification information to obtain the N image generators and the R image extractors.
In some embodiments, the processor 1001 is specifically configured to, when configured to train the N initial image generators and the R initial image extractors with the n+1 training feature information and the sample image classification information, obtain the N image generators and the R image extractors:
acquiring an image classification label of a first sample image, and determining classification deviation information based on the image classification label of the first sample image and sample image classification information;
Acquiring a contrast image corresponding to the first sample image, and acquiring a contrast characteristic set of the contrast image; the contrast characteristic set comprises N+1 contrast characteristic information, and one contrast characteristic information corresponds to one training characteristic information;
determining feature deviation information based on feature correlation between each training feature information and contrast feature information corresponding to each training feature information;
and iteratively training N initial image generators and R initial image extractors through the classification deviation information and the characteristic deviation information, taking the N initial image generators after the iterative training as N image generators, and taking the R initial image extractors after the iterative training as R image extractors.
In specific implementation, the device, the processor, the memory, etc. described in the embodiments of the present application may perform the implementation described in the foregoing method embodiments, or may perform the implementation described in the embodiments of the present application, which is not described herein again.
The embodiment of the application also provides a computer-readable storage medium storing a computer program, where the computer program includes program instructions that, when executed by a processor, cause the processor to perform some or all of the steps performed in the above method embodiments. Alternatively, the computer storage medium may be volatile or non-volatile. The computer-readable storage medium may mainly include a storage program area and a storage data area, where the storage program area may store an operating system, an application program required for at least one function, and the like, and the storage data area may store data created from the use of blockchain nodes, and the like.
The embodiments of the present application further provide a computer program product, which may include a computer program; when the computer program is executed by a processor, some or all of the steps of the above method may be implemented, and details are not repeated here.
References herein to "a plurality" mean two or more. "And/or" describes an association relationship between associated objects and indicates that three relationships may exist; for example, "A and/or B" may indicate: A exists alone, both A and B exist, or B exists alone. The character "/" generally indicates that the associated objects are in an "or" relationship.
Those skilled in the art will appreciate that all or part of the methods of the above embodiments may be implemented by a computer program stored in a computer-readable storage medium; when executed, the program may include the steps of the method embodiments described above. The storage medium may be a magnetic disk, an optical disc, a Read-Only Memory (ROM), a Random Access Memory (RAM), or the like.
The above disclosure describes only some examples of the present application and is not intended to limit its scope; those skilled in the art will understand that implementations of all or part of the above embodiments, as well as equivalent changes made according to the claims of the present application, still fall within the scope of the application.

Claims (11)

1. An image search method, the method comprising:
Acquiring a first image, inputting the first image into N image generators, and performing image recombination processing on the first image through the N image generators respectively to generate N second images; each image generator generates one second image; N is an integer greater than 1; each image generator is configured to split the first image into K image blocks; the second image is a stitched image obtained by stitching the K image blocks according to an image stitching rule; the image stitching rule indicates that the K image blocks are randomly stitched according to the image size of the first image; the image size of the stitched image is the same as the image size of the first image; the values of K corresponding to the N image generators are in descending order or in ascending order; K is a positive integer; each second image is associated with one image extractor among R image extractors, and the position of the image extractor associated with the second image among the R image extractors is determined by the K corresponding to that second image; R is an integer greater than or equal to N;
Inputting the N second images into the R image extractors respectively, performing feature extraction processing on the N second images through the R image extractors, and taking the feature extraction result output by the image extractor associated with each second image in the R image extractors as the image feature information of that second image; any two adjacent image extractors comprise a u-th image extractor and a v-th image extractor, wherein the feature extraction result output by the u-th image extractor is used as the input of the v-th image extractor, and u is a positive integer smaller than v;
performing stitching processing on the image characteristic information of the N second images to obtain stitching characteristic information, and determining the image characteristic information and the image classification information of the first image through the stitching characteristic information;
Acquiring an image set; the image set comprises a plurality of images to be matched, wherein one image to be matched corresponds to one image characteristic information and one image classification information;
And searching the images to be matched with the first image from the image set based on the image characteristic information and the image classification information of the first image and the image characteristic information and the image classification information of each image to be matched.
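The image recombination step of claim 1 — split the first image into K image blocks and randomly stitch them back to the original image size — can be sketched as follows. This is a minimal NumPy illustration, assuming K is a perfect square (a grid × grid layout) and that the image dimensions divide evenly by the grid; the claim itself does not fix the block layout:

```python
import random
import numpy as np

def recombine_image(image: np.ndarray, grid: int, seed=None) -> np.ndarray:
    """Split `image` (H, W, C) into grid*grid equal blocks (K = grid**2),
    shuffle the blocks randomly, and stitch them into an image of the
    same size as the input, per the image stitching rule of claim 1."""
    h, w = image.shape[:2]
    bh, bw = h // grid, w // grid
    blocks = [image[r * bh:(r + 1) * bh, c * bw:(c + 1) * bw]
              for r in range(grid) for c in range(grid)]
    random.Random(seed).shuffle(blocks)          # random stitching order
    rows = [np.concatenate(blocks[r * grid:(r + 1) * grid], axis=1)
            for r in range(grid)]
    return np.concatenate(rows, axis=0)          # same size as the input
```

Running N such generators with decreasing (or increasing) grid values yields the N second images of claim 1, each recombined at a different granularity.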
2. The method according to claim 1, wherein an i-th image generator of the N image generators is configured to stitch K image blocks split from the first image into a stitched image having an image size identical to an image size of the first image as the second image generated by the i-th image generator according to the image stitching rule;
and when K corresponding to the N image generators is in descending order, K corresponding to an image generator positioned before the ith image generator in the N image generators is larger than K corresponding to the ith image generator, and K corresponding to an image generator positioned after the ith image generator in the N image generators is smaller than K corresponding to the ith image generator.
3. The method according to claim 1, wherein, when the image extractor associated with the i-th second image of the N second images is the (R-N+i)-th image extractor of the R image extractors, the performing feature extraction processing on the N second images through the R image extractors, and taking the feature extraction result output by the image extractor associated with each second image in the R image extractors as the image feature information of each second image, comprises:
performing feature extraction processing on the i-th second image through the first (R-N+i) image extractors of the R image extractors, and taking the feature extraction result output by the (R-N+i)-th image extractor as the image feature information of the i-th second image; i is a positive integer less than or equal to N.
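The readout scheme of claim 3 can be sketched with toy extractors: the R extractors form a chain (each output feeds the next), and the i-th second image is read out at the (R-N+i)-th extractor. A minimal illustration, with the extractors stubbed as plain callables (the patent does not specify the extractor architecture):

```python
def chained_features(second_images, extractors):
    """Feed each second image through a prefix of the extractor chain and
    read out its feature at the associated extractor, per claim 3."""
    n, r = len(second_images), len(extractors)
    feats = []
    for i, img in enumerate(second_images, start=1):   # i = 1..N
        x = img
        for extractor in extractors[:r - n + i]:       # first R-N+i extractors
            x = extractor(x)                           # each output feeds the next
        feats.append(x)                                # readout at the (R-N+i)-th
    return feats
```

With R = 4 and N = 2, the first second image is read out at extractor 3 and the second at extractor 4, so images recombined at coarser granularity pass through fewer extraction stages.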
4. The method of claim 1, wherein the acquiring the first image comprises:
acquiring an image to be searched, carrying out object recognition on a target object in the image to be searched, and generating an object detection frame where the recognized target object in the image to be searched is located;
Image data in the object detection frame is determined as the first image.
5. The method of any of claims 1-4, wherein the acquiring a set of images comprises:
identifying the category of the target object in the first image to obtain the object category of the target object;
Obtaining an object sub-category set associated with the object category; the set of object sub-categories includes a plurality of object sub-categories;
if the object sub-category indicated by the image classification information of the first image belongs to the object sub-category set, acquiring an image library; the image library comprises a plurality of reference images, and one reference image corresponds to one object category;
and acquiring reference images with the same object category as the target object from the image library, and taking the acquired reference images as the image set.
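The candidate filtering of claim 5 and the search step of claim 1 can be sketched together: restrict the image set to references sharing the target object's category, then rank by feature similarity among those whose classification information matches. A hypothetical NumPy sketch, assuming cosine similarity as the matching metric (the patent does not fix the metric):

```python
import numpy as np

def search_matches(query_feat, query_class, image_set, top_k=5):
    """Keep candidates whose image classification information matches the
    query's, then rank them by cosine similarity of image feature info."""
    def cos(a, b):
        return float(np.dot(a, b) /
                     (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))
    candidates = [img for img in image_set if img["cls"] == query_class]
    candidates.sort(key=lambda img: cos(query_feat, img["feat"]), reverse=True)
    return candidates[:top_k]
```

The dictionary keys `cls` and `feat` are illustrative stand-ins for the per-image classification information and feature information described in the claims.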
6. The method according to claim 1, wherein the method further comprises:
acquiring a first sample image, inputting the first sample image into N initial image generators, and respectively carrying out image recombination processing on the first sample image through the N initial image generators to generate N second sample images; an initial image generator for generating a second sample image;
Respectively inputting N second sample images into R initial image extractors, respectively carrying out feature extraction processing on the N second sample images through the R initial image extractors, and respectively taking sample feature extraction results output by the initial image extractor associated with each second sample image in the R initial image extractors as sample image feature information of each second sample image;
Performing stitching processing on sample image characteristic information of each second sample image to obtain sample stitching characteristic information, and determining sample image characteristic information and sample image classification information of the first sample image according to the sample stitching characteristic information;
Carrying out feature processing on the sample image feature information of each second sample image to obtain sample processing feature information corresponding to the sample image feature information of each second sample image, and taking the sample image feature information of the first sample image and the sample processing feature information corresponding to each second sample image as N+1 training feature information of the first sample image;
And training the N initial image generators and the R initial image extractors through the N+1 training characteristic information and the sample image classification information to obtain the N image generators and the R image extractors.
7. The method of claim 6, wherein the training the N initial image generators and the R initial image extractors with the n+1 training feature information and the sample image classification information, resulting in the N image generators and the R image extractors, comprises:
acquiring an image classification label of the first sample image, and determining classification deviation information based on the image classification label of the first sample image and the sample image classification information;
Acquiring a contrast image corresponding to the first sample image, and acquiring a contrast characteristic set of the contrast image; the contrast characteristic set comprises N+1 contrast characteristic information, and one contrast characteristic information corresponds to one training characteristic information;
determining feature deviation information based on feature correlation between each training feature information and contrast feature information corresponding to each training feature information;
and iteratively training the N initial image generators and the R initial image extractors through the classification deviation information and the feature deviation information, taking the N initial image generators after iterative training as the N image generators, and taking the R initial image extractors after iterative training as the R image extractors.
8. An image search apparatus, the apparatus comprising:
The processing module is used for acquiring a first image, inputting the first image into N image generators, and performing image recombination processing on the first image through the N image generators respectively to generate N second images; each image generator generates one second image; N is an integer greater than 1; each image generator is configured to split the first image into K image blocks; the second image is a stitched image obtained by stitching the K image blocks according to an image stitching rule; the image stitching rule indicates that the K image blocks are randomly stitched according to the image size of the first image; the image size of the stitched image is the same as the image size of the first image; the values of K corresponding to the N image generators are in descending order or in ascending order; K is a positive integer; each second image is associated with one image extractor among R image extractors, and the position of the image extractor associated with the second image among the R image extractors is determined by the K corresponding to that second image; R is an integer greater than or equal to N;
The processing module is further configured to input the N second images into the R image extractors respectively, perform feature extraction processing on the N second images through the R image extractors, and take the feature extraction result output by the image extractor associated with each second image in the R image extractors as the image feature information of that second image; any two adjacent image extractors comprise a u-th image extractor and a v-th image extractor, wherein the feature extraction result output by the u-th image extractor is used as the input of the v-th image extractor, and u is a positive integer smaller than v;
the processing module is further used for performing stitching processing on the image characteristic information of the N second images to obtain stitching characteristic information, and determining the image characteristic information and the image classification information of the first image through the stitching characteristic information;
the acquisition module is used for acquiring an image set; the image set comprises a plurality of images to be matched, wherein one image to be matched corresponds to one image characteristic information and one image classification information;
The processing module is further configured to search an image to be matched with the first image from the image set based on the image feature information and the image classification information of the first image, and the image feature information and the image classification information of each image to be matched.
9. An electronic device comprising a processor and a memory, wherein the memory is configured to store a computer program comprising program instructions, the processor being configured to invoke the program instructions to perform the method of any of claims 1-7.
10. A computer readable storage medium, characterized in that the computer readable storage medium stores a computer program comprising program instructions which, when executed by a processor, cause the processor to perform the method of any of claims 1-7.
11. A computer program product comprising computer instructions which, when executed by a processor, implement the method of any of claims 1-7.
CN202310358027.7A 2023-04-04 2023-04-04 Image searching method, device, equipment and medium Active CN117725242B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310358027.7A CN117725242B (en) 2023-04-04 2023-04-04 Image searching method, device, equipment and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310358027.7A CN117725242B (en) 2023-04-04 2023-04-04 Image searching method, device, equipment and medium

Publications (2)

Publication Number Publication Date
CN117725242A CN117725242A (en) 2024-03-19
CN117725242B true CN117725242B (en) 2024-10-22

Family

ID=90200324

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310358027.7A Active CN117725242B (en) 2023-04-04 2023-04-04 Image searching method, device, equipment and medium

Country Status (1)

Country Link
CN (1) CN117725242B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111368893A (en) * 2020-02-27 2020-07-03 Oppo广东移动通信有限公司 Image recognition method and device, electronic equipment and storage medium
CN111738735A (en) * 2020-07-23 2020-10-02 腾讯科技(深圳)有限公司 Image data processing method and device and related equipment

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113516723A (en) * 2020-11-19 2021-10-19 腾讯科技(深圳)有限公司 Face picture encryption method and device, computer equipment and storage medium
CN113762055A (en) * 2021-05-17 2021-12-07 腾讯科技(深圳)有限公司 Image processing method and device, electronic equipment and readable storage medium
US20220415027A1 (en) * 2021-06-29 2022-12-29 Shandong Jianzhu University Method for re-recognizing object image based on multi-feature information capture and correlation analysis
CN115687670A (en) * 2023-01-03 2023-02-03 天津恒达文博科技股份有限公司 Image searching method and device, computer readable storage medium and electronic equipment

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111368893A (en) * 2020-02-27 2020-07-03 Oppo广东移动通信有限公司 Image recognition method and device, electronic equipment and storage medium
CN111738735A (en) * 2020-07-23 2020-10-02 腾讯科技(深圳)有限公司 Image data processing method and device and related equipment

Also Published As

Publication number Publication date
CN117725242A (en) 2024-03-19

Similar Documents

Publication Publication Date Title
CN110837579B (en) Video classification method, apparatus, computer and readable storage medium
CN109587554B (en) Video data processing method and device and readable storage medium
US10936911B2 (en) Logo detection
CN107633023B (en) Image duplicate removal method and device
CN106611015B (en) Label processing method and device
CN111125543B (en) Training method of book recommendation sequencing model, computing device and storage medium
CN105117399B (en) Image searching method and device
CN110688524A (en) Video retrieval method and device, electronic equipment and storage medium
CN111783712A (en) Video processing method, device, equipment and medium
CN111783812B (en) Forbidden image recognition method, forbidden image recognition device and computer readable storage medium
CN113542865A (en) Video editing method, device and storage medium
CN109508406A (en) A kind of information processing method, device and computer readable storage medium
Costa et al. Image phylogeny forests reconstruction
CN113590854A (en) Data processing method, data processing equipment and computer readable storage medium
CN113704623B (en) Data recommendation method, device, equipment and storage medium
CN113313065A (en) Video processing method and device, electronic equipment and readable storage medium
CN115129806A (en) Data processing method and device, electronic equipment and computer storage medium
CN111259225A (en) New media information display method and device, electronic equipment and computer readable medium
CN113627576B (en) Code scanning information detection method, device, equipment and storage medium
CN111126457A (en) Information acquisition method and device, storage medium and electronic device
CN111191065B (en) Homologous image determining method and device
CN117725242B (en) Image searching method, device, equipment and medium
CN115278299B (en) Unsupervised training data generation method, device, medium and equipment
US11989922B2 (en) Automated image analysis and indexing
CN113905188B (en) Video stitching dynamic adjustment method, system, electronic device and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant