
CN114782714A - Image matching method and device based on context information fusion - Google Patents

Image matching method and device based on context information fusion

Info

Publication number
CN114782714A
Authority
CN
China
Prior art keywords
image
image block
context information
blocks
block group
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210161767.7A
Other languages
Chinese (zh)
Inventor
周振
俞益洲
李一鸣
乔昕
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Shenrui Bolian Technology Co Ltd
Shenzhen Deepwise Bolian Technology Co Ltd
Original Assignee
Beijing Shenrui Bolian Technology Co Ltd
Shenzhen Deepwise Bolian Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Shenrui Bolian Technology Co Ltd, Shenzhen Deepwise Bolian Technology Co Ltd filed Critical Beijing Shenrui Bolian Technology Co Ltd
Priority to CN202210161767.7A priority Critical patent/CN114782714A/en
Publication of CN114782714A publication Critical patent/CN114782714A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides an image matching method and device based on context information fusion. The method comprises the following steps: segmenting the image blocks to be matched from images A and B with an image segmentation network, and splicing each image block together with the processed whole image into an image block group; feeding the image block groups of images A and B into the two sub-networks of a twin network for feature extraction, and calculating the similarity between an image block group of image A and an image block group of image B from a vector composed of the extracted texture features and context information; and obtaining the matched image blocks in images A and B from these similarities using the Hungarian algorithm. Because the invention uses both the texture features and the context information of the image blocks for matching, the precision of image matching is improved.

Description

Image matching method and device based on context information fusion
Technical Field
The invention belongs to the technical field of artificial intelligence, and particularly relates to an image matching method and device based on context information fusion.
Background
Image matching finds correspondences between a query image block and candidate image blocks based on the similarity of image blocks across images. It underpins many computer vision tasks, such as object recognition, image retrieval and wide-baseline matching. In practice, however, many factors complicate judging whether image blocks from different images are similar: different shooting angles, illumination changes, occlusion, shadows, differing camera characteristics, and so on. Traditional image block matching relies on hand-crafted feature descriptors, which cannot fully account for these factors. In recent years, with the availability of large-scale data, image block feature extraction has gradually shifted from manual design to automatic feature learning.
Most current methods that automatically extract image block features and judge similarity are based on twin (Siamese) neural networks, whose sub-networks generally have identical structures and shared weights. The typical pipeline is: define similar positive sample pairs and dissimilar negative sample pairs, design a twin convolutional neural network, feed different image blocks into different sub-networks, extract image block features through the weight-shared sub-networks, and then either compute the similarity of the extracted features with some similarity measure or predict a match with a classifier. Such methods extract only the texture features of the image blocks; compared with the whole image, an isolated image block loses its context and its relative position with respect to surrounding objects. When different instances of the same category must be matched, the available information is insufficient, and objects with similar textures but large spatial differences are easily mismatched.
Disclosure of Invention
In order to solve the above problems in the prior art, the present invention provides an image matching method and apparatus based on context information fusion.
In order to achieve the above object, the present invention adopts the following technical solutions.
In a first aspect, the present invention provides an image matching method based on context information fusion, including the following steps:
segmenting the image blocks to be matched from images A and B with an image segmentation network, and splicing each image block together with the processed whole image into an image block group;
feeding the image block groups of images A and B into the two sub-networks of a twin network with identical structure and shared weights, extracting texture features from the image blocks and context information from the whole images, and calculating the similarity between an image block group of image A and an image block group of image B from a vector composed of the texture features and the context information;
and obtaining the matched image blocks in images A and B by processing the similarities with the Hungarian algorithm.
Further, images A and B are a multi-view image pair, i.e., two images of the same scene taken from different angles.
Further, the method for obtaining the image block group comprises the following steps:
obtaining the segmentation mask of a target instance, and cropping along its minimum circumscribed rectangle to obtain an image block containing the target instance;
setting the pixels covered by the segmentation mask in the whole image to a background pixel value, and scaling the result to the size of the image block to obtain the processed whole image;
and splicing the image block and the processed whole image along the channel dimension to obtain the image block group.
Furthermore, the twin network uses a grouped convolutional neural network for feature extraction on the image block groups: the first group of convolutions extracts texture features from the image block in the image block group, and the second group of convolutions extracts context information from the processed whole image in the image block group.
Further, the twin network is trained with a training data set consisting of positive sample pairs and negative sample pairs; for two images M and N of the same scene taken from different viewing angles, let M_{i_m, j_m} be the image block group corresponding to the j_m-th image block of the i_m-th class of object in image M, and N_{i_n, j_n} the image block group corresponding to the j_n-th image block of the i_n-th class of object in image N; when i_m = i_n and j_m = j_n, (M_{i_m, j_m}, N_{i_n, j_n}) is a positive sample pair; when i_m ≠ i_n or j_m ≠ j_n, (M_{i_m, j_m}, N_{i_n, j_n}) is a negative sample pair.
Furthermore, the twin network only calculates and outputs the similarity of the image block groups corresponding to the image blocks of the same object class, and the object classes of the image blocks are output by the image segmentation network.
In a second aspect, the present invention provides an image matching apparatus based on context information fusion, including:
the image block segmentation module is used for segmenting the image blocks to be matched from images A and B with an image segmentation network, and splicing each image block together with the processed whole image into an image block group;
the similarity calculation module is used for feeding the image block groups of images A and B into the two sub-networks of a twin network with identical structure and shared weights, extracting texture features from the image blocks and context information from the whole images, and calculating the similarity between an image block group of image A and an image block group of image B from a vector composed of the texture features and the context information;
and the image block matching module is used for obtaining the matched image blocks in images A and B by processing the similarities with the Hungarian algorithm.
Further, the method for obtaining the image block group comprises the following steps:
obtaining the segmentation mask of a target instance, and cropping along its minimum circumscribed rectangle to obtain an image block containing the target instance;
setting the pixels covered by the segmentation mask in the whole image to a background pixel value, and scaling the result to the size of the image block to obtain the processed whole image;
and splicing the image block and the processed whole image along the channel dimension to obtain the image block group.
Further, the twin network is trained with a training data set consisting of positive sample pairs and negative sample pairs; for two images M and N of the same scene taken from different viewing angles, let M_{i_m, j_m} be the image block group corresponding to the j_m-th image block of the i_m-th class of object in image M, and N_{i_n, j_n} the image block group corresponding to the j_n-th image block of the i_n-th class of object in image N; when i_m = i_n and j_m = j_n, (M_{i_m, j_m}, N_{i_n, j_n}) is a positive sample pair; when i_m ≠ i_n or j_m ≠ j_n, (M_{i_m, j_m}, N_{i_n, j_n}) is a negative sample pair.
Furthermore, the twin network only calculates and outputs the similarity of the image block groups corresponding to the image blocks of the same object class, and the object classes of the image blocks are output by the image segmentation network.
Compared with the prior art, the invention has the following beneficial effects.
The image blocks to be matched are segmented from images A and B, each image block is spliced with the processed whole image into an image block group, texture features and context information are extracted from the image block groups by a twin network, the similarity between an image block group of image A and an image block group of image B is calculated from a vector composed of the texture features and the context information, and the matched image blocks in the two images are obtained with the Hungarian algorithm, thereby realizing automatic image block matching. Because the invention uses both the texture features and the context information of the image blocks for matching, the precision of image matching is improved.
Drawings
Fig. 1 is a flowchart of an image matching method based on context information fusion according to an embodiment of the present invention.
Fig. 2 is a schematic diagram of an image block and a processed whole image.
FIG. 3 is a schematic view of the processing flow of the method of the present invention.
Fig. 4 is a block diagram of an image matching apparatus based on context information fusion according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer and more obvious, the present invention is further described below with reference to the accompanying drawings and the detailed description. It should be apparent that the described embodiments are only some embodiments of the present invention, and not all embodiments. All other embodiments, which can be obtained by a person skilled in the art without making any creative effort based on the embodiments in the present invention, belong to the protection scope of the present invention.
Fig. 1 is a flowchart of an image matching method based on context information fusion according to an embodiment of the present invention, including the following steps:
step 101, segmenting the image blocks to be matched from images A and B with an image segmentation network, and splicing each image block together with the processed whole image into an image block group;
step 102, feeding the image block groups of images A and B into the two sub-networks of a twin network with identical structure and shared weights, extracting texture features from the image blocks and context information from the whole images, and calculating the similarity between an image block group of image A and an image block group of image B from a vector composed of the texture features and the context information;
step 103, obtaining the matched image blocks in images A and B by processing the similarities with the Hungarian algorithm.
The embodiment provides an image matching method based on context information fusion. The matching method finds matching image blocks between the two input images A and B. First, the image blocks to be matched are segmented from images A and B; for example, image blocks A1, A2, A3, A4 and B1, B2, B3, B4 of four computers are segmented from two images like the one shown in fig. 2. Then, a matching algorithm pairs up the image blocks of the two images, for example the pairs belonging to the same computer: (A1, B3), (A2, B1), (A3, B4), (A4, B2).
In this embodiment, step 101 obtains the image block groups to be matched. As described above, matching image blocks are to be found between the two input images, so the image blocks to be matched must first be segmented from them. An image block here is an image region containing an object of some class, such as the cup or computer in fig. 2. An image segmentation network can be used to segment image blocks from the whole input image; image segmentation networks typically employ fully convolutional neural networks. Since the image matching in this embodiment usually needs to run in real time, a YOLACT real-time instance segmentation network is adopted. In the prior art, the segmented image blocks are usually fed directly into a twin network for matching; the matching accuracy is generally unsatisfactory because only texture features can be extracted from a single image block and no context information is available. For this reason, in this embodiment each image block and a suitably processed whole image are combined into an image block group, the image block group corresponding to each image block is fed into the twin network for feature extraction, and context information is extracted from the processed whole image. There are many ways to process the whole image; this embodiment does not limit the specific processing method, and a concrete one is given in a later embodiment.
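As a hedged illustration of this step, the sketch below obtains per-instance masks and object classes from an off-the-shelf instance segmentation model. The embodiment itself uses a YOLACT network for real-time segmentation; since no YOLACT interface is specified here, torchvision's Mask R-CNN is substituted purely as a stand-in, and the confidence threshold and variable names are illustrative assumptions.

```python
# Hypothetical stand-in for the instance segmentation step (the embodiment
# uses YOLACT; torchvision's Mask R-CNN is substituted here for illustration).
import torch
import torchvision

model = torchvision.models.detection.maskrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()

image = torch.rand(3, 480, 640)           # placeholder RGB image tensor in [0, 1]
with torch.no_grad():
    output = model([image])[0]

keep = output["scores"] > 0.5             # illustrative confidence threshold
masks = output["masks"][keep, 0] > 0.5    # (N, H, W) boolean masks, one per instance
labels = output["labels"][keep]           # predicted object class of each instance
# Each mask/label pair later yields one image block and its image block group.
```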
In this embodiment, step 102 computes the similarity between the image block groups of image A and those of image B. A twin network consisting of two structurally identical, weight-shared sub-networks is constructed first; to achieve fast matching, a lightweight ResNet18 can be chosen as the sub-network. The twin network then extracts texture features and context information from each input image block group and splices them into a single vector. Finally, the similarity between each image block group of image A and each image block group of image B is computed from these vectors.
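For concreteness, the following is a minimal sketch of such a sub-network and the similarity computation. It assumes PyTorch, a ResNet18 backbone adapted to the 2-channel block-group input, and cosine similarity as the measurement function; it treats the sub-network as a single backbone rather than the grouped-convolution variant described later, and all names and sizes are illustrative.

```python
# Minimal twin-network sketch (illustrative): one weight-shared encoder applied
# to block groups from images A and B, followed by a cosine similarity score.
import torch
import torch.nn as nn
import torchvision

class PatchGroupEncoder(nn.Module):
    def __init__(self, feat_dim=256):
        super().__init__()
        backbone = torchvision.models.resnet18(weights=None)
        # Adapt the first convolution to the 2-channel block-group input
        # (image block + processed whole image stacked along the channel dimension).
        backbone.conv1 = nn.Conv2d(2, 64, kernel_size=7, stride=2, padding=3, bias=False)
        backbone.fc = nn.Linear(backbone.fc.in_features, feat_dim)
        self.backbone = backbone

    def forward(self, x):            # x: (batch, 2, H, W)
        return self.backbone(x)      # (batch, feat_dim) texture + context vector

encoder = PatchGroupEncoder()               # shared weights: one encoder used for both views
group_a = torch.randn(1, 2, 64, 64)         # a block group from image A
group_b = torch.randn(1, 2, 64, 64)         # a block group from image B
similarity = torch.cosine_similarity(encoder(group_a), encoder(group_b)).item()
```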
In this embodiment, step 103 performs the image block matching, using the Hungarian algorithm. The Hungarian algorithm is a graph-theoretic algorithm for finding a maximum matching. In this embodiment, a similarity matrix is built from the image block group similarities obtained in the previous step. For simplicity, assume image A is divided into 3 image blocks and image B into 4; the similarity matrix is shown in Table 1. First, the pair of image block groups with the largest similarity is found; in Table 1 this is A_patch2 and B_patch4 with similarity 0.8 (marked ※). They are taken as a one-to-one match and removed from the matrix, as shown in Table 2. Then the pair with the next-largest remaining similarity is found, A_patch3 and B_patch1 with similarity 0.5 in Table 2, and likewise removed, as shown in Table 3. The process is repeated until no image block groups remain to be matched. The final matching result is: (A_patch2, B_patch4), (A_patch3, B_patch1), (A_patch1, B_patch3).
TABLE 1
           A_patch1   A_patch2   A_patch3
B_patch1   0.10       0.40       0.50
B_patch2   0.20       0.60       0.30
B_patch3   0.30       0.70       0.25
B_patch4   0.50       0.80 ※     0.30

TABLE 2
           A_patch1   A_patch2   A_patch3
B_patch1   0.10       ———        0.50 ※
B_patch2   0.20       ———        0.30
B_patch3   0.30       ———        0.25
B_patch4   ———        ———        ———

TABLE 3
           A_patch1   A_patch2   A_patch3
B_patch1   ———        ———        ———
B_patch2   0.20       ———        ———
B_patch3   0.30 ※     ———        ———
B_patch4   ———        ———        ———
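To make the walkthrough in Tables 1-3 concrete, here is a minimal sketch of the iterative selection it illustrates, using the similarity values copied from Table 1; the variable and function names are hypothetical. Note that this step-by-step procedure is a greedy selection; an exact maximum-weight assignment, as computed by the Hungarian algorithm, could instead be obtained with scipy.optimize.linear_sum_assignment on the negated similarity matrix.

```python
# Greedy matching sketch reproducing the Table 1-3 walkthrough (illustrative).
import numpy as np

# Rows: B_patch1..B_patch4; columns: A_patch1..A_patch3 (values from Table 1).
sim = np.array([
    [0.10, 0.40, 0.50],
    [0.20, 0.60, 0.30],
    [0.30, 0.70, 0.25],
    [0.50, 0.80, 0.30],
])

def greedy_match(sim):
    """Repeatedly take the largest remaining similarity, then remove its row and column."""
    sim = sim.copy()
    matches = []
    while np.isfinite(sim).any():
        b, a = np.unravel_index(np.argmax(sim), sim.shape)
        matches.append((f"A_patch{a + 1}", f"B_patch{b + 1}", float(sim[b, a])))
        sim[b, :] = -np.inf      # B_patch{b+1} is matched, remove its row
        sim[:, a] = -np.inf      # A_patch{a+1} is matched, remove its column
    return matches

print(greedy_match(sim))
# [('A_patch2', 'B_patch4', 0.8), ('A_patch3', 'B_patch1', 0.5), ('A_patch1', 'B_patch3', 0.3)]
```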
As an alternative embodiment, images A and B are a multi-view image pair, i.e., two images of the same scene taken from different angles.
This embodiment defines the images A and B to be matched. In this embodiment, A and B are two views of the same scene. Multi-view images are images of the same scene captured from different angles, so the appearance of the same object differs between the two views, much as the front, top and side views of an object differ, only less markedly. The image matching in this embodiment finds the image blocks corresponding to the same object in the two views.
As an alternative embodiment, the method for obtaining the image block group includes:
obtaining the segmentation mask of a target instance, and cropping along its minimum circumscribed rectangle to obtain an image block containing the target instance;
setting the pixels covered by the segmentation mask in the whole image to a background pixel value, and scaling the result to the size of the image block to obtain the processed whole image;
and splicing the image block and the processed whole image along the channel dimension to obtain the image block group.
This embodiment provides a concrete way of obtaining the image block groups. After instance segmentation of the image (e.g. the computers in fig. 2), a segmentation mask is obtained for each target instance. First, based on the segmentation mask, the minimum circumscribed rectangle of the target instance is computed and cropped out as the image block of that instance. Then, the pixels covered by the instance's segmentation mask in the whole image are set to a background pixel value, and the whole image is scaled to the size of the image block, as shown in fig. 2. Finally, the image block and the processed whole image are spliced along the channel dimension; for example, if the image block has size 1 × 32 × 32 and the processed whole image has size 1 × 32 × 32, the spliced result has size 2 × 32 × 32, i.e. an image with twice the original number of channels, which serves as the input of a twin-network sub-network.
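As a rough illustration of this construction, the sketch below builds one block group from a single-channel image and one instance mask. It uses the axis-aligned bounding box of the mask in place of the minimum circumscribed rectangle for simplicity, and OpenCV for resizing; the background value and array layout are illustrative assumptions.

```python
# Illustrative construction of one image block group from an instance mask.
import cv2
import numpy as np

def make_block_group(image, mask, background_value=0):
    """image: (H, W) array; mask: (H, W) binary mask of one target instance."""
    ys, xs = np.nonzero(mask)
    y0, y1, x0, x1 = ys.min(), ys.max() + 1, xs.min(), xs.max() + 1
    # 1) image block: crop the instance along its (axis-aligned) bounding rectangle
    block = image[y0:y1, x0:x1]
    # 2) processed whole image: blank out the instance pixels, then resize to block size
    whole = image.copy()
    whole[mask > 0] = background_value
    whole = cv2.resize(whole, (block.shape[1], block.shape[0]))
    # 3) splice block and processed whole image along the channel dimension
    return np.stack([block, whole], axis=0)      # shape (2, h, w)
```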
As an optional embodiment, the twin network uses a grouped convolutional neural network for feature extraction on the image block groups: the first group of convolutions extracts texture features from the image block in the image block group, and the second group of convolutions extracts context information from the processed whole image in the image block group.
This embodiment further specifies the structure of the twin network. As described above, this embodiment extracts not only texture features but also context information. To extract both from the image block groups, the sub-networks of the twin network use grouped convolutional neural networks, i.e. a first group of convolutions and a second group of convolutions: the first group extracts texture features from the image block in the image block group, and the second group extracts context information from the processed whole image in the image block group.
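A minimal sketch of how a grouped convolution keeps the two inputs separate is shown below (PyTorch; the channel counts are illustrative assumptions): with groups=2, the first half of the output channels is computed only from the image-block channel and the second half only from the processed whole-image channel.

```python
# Grouped-convolution sketch: group 1 sees only the image block channel
# (texture), group 2 sees only the processed whole-image channel (context).
import torch
import torch.nn as nn

grouped_conv = nn.Conv2d(in_channels=2, out_channels=64,
                         kernel_size=3, padding=1, groups=2)

block_group = torch.randn(1, 2, 64, 64)        # (image block, processed whole image)
features = grouped_conv(block_group)           # (1, 64, 64, 64)
texture_features = features[:, :32]            # outputs of the first convolution group
context_features = features[:, 32:]            # outputs of the second convolution group
```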
As an alternative embodiment, the twin network is trained with a training data set consisting of positive sample pairs and negative sample pairs; for two images M and N of the same scene taken from different viewing angles, let M_{i_m, j_m} be the image block group corresponding to the j_m-th image block of the i_m-th class of object in image M, and N_{i_n, j_n} the image block group corresponding to the j_n-th image block of the i_n-th class of object in image N; when i_m = i_n and j_m = j_n, (M_{i_m, j_m}, N_{i_n, j_n}) is a positive sample pair; when i_m ≠ i_n or j_m ≠ j_n, (M_{i_m, j_m}, N_{i_n, j_n}) is a negative sample pair.
This embodiment defines the training data set used to train the twin network. The training data set consists of positive sample pairs and negative sample pairs; a sample pair is made up of two image block groups, one from each of the two different-view images M and N. Two matched image block groups form a positive sample pair, and two unmatched image block groups form a negative sample pair. Generally, the target instances in images M and N are grouped by object class (the object class of each image block is output by the image segmentation network), such as the computers and cups in fig. 2; there may be one or more target instances of the same object class. For convenience of description, M_{i_m, j_m} denotes the image block group corresponding to the j_m-th image block of the i_m-th class of object in image M, and N_{i_n, j_n} the image block group corresponding to the j_n-th image block of the i_n-th class of object in image N, with instances of the same class numbered consistently across the two views; for example, the 2nd computer in image M and the 2nd computer in image N are the same computer (i.e., they match). Under this assumption, when i_m = i_n and j_m = j_n, (M_{i_m, j_m}, N_{i_n, j_n}) is a positive sample pair, i.e., image block groups with the same class and the same sequence number form positive pairs; when i_m ≠ i_n or j_m ≠ j_n, (M_{i_m, j_m}, N_{i_n, j_n}) is a negative sample pair, i.e., image block groups whose class or sequence number differ form negative pairs. For example, the block groups corresponding to the 2nd computer in image M and the 2nd computer in image N form a positive sample pair, while the block groups corresponding to the 2nd computer in image M and the 3rd computer in image N form a negative sample pair. Positive and negative sample pairs are usually constructed by manual labeling. This embodiment only schematically describes the process of forming positive and negative sample pairs.
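A small sketch of how such positive and negative pairs could be enumerated from labelled block groups of the two views follows; the dict-of-lists data layout, the function names and the assumption of consistent instance numbering across views are all illustrative.

```python
# Illustrative enumeration of positive/negative training pairs.
# groups_m[i][j] / groups_n[i][j]: block group of the j-th instance of class i
# in view M / view N, with instance numbering consistent across the two views.
import itertools

def index_pairs(groups):
    return [(i, j) for i, blocks in groups.items() for j in range(len(blocks))]

def build_pairs(groups_m, groups_n):
    positives, negatives = [], []
    for (im, jm), (i_n, j_n) in itertools.product(index_pairs(groups_m),
                                                  index_pairs(groups_n)):
        pair = (groups_m[im][jm], groups_n[i_n][j_n])
        if im == i_n and jm == j_n:
            positives.append(pair)      # same class, same sequence number
        else:
            negatives.append(pair)      # class or sequence number differs
    return positives, negatives
```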
As an alternative embodiment, the twin network only calculates and outputs the similarity of the image block groups corresponding to the image blocks of the same object class, and the object classes of the image blocks are output by the image segmentation network.
This embodiment provides a way to improve the matching speed. Since only image blocks of the same object class can possibly match, the embodiment reduces computation by processing only the image block groups whose image blocks share the same object class; that is, the twin network computes and outputs similarities only for such pairs of block groups. This also greatly reduces the amount of computation in the subsequent Hungarian matching, so the image matching speed is increased.
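A sketch of this class gating, under the assumption that each image block carries the object class predicted by the segmentation network (names are illustrative): only same-class block group pairs are scored before the assignment step.

```python
# Illustrative class gating: compute similarities only for block groups whose
# image blocks share the same predicted object class.
def class_gated_similarities(blocks_a, blocks_b, similarity_fn):
    """blocks_a / blocks_b: lists of (object_class, block_group) tuples."""
    sims = {}
    for ia, (cls_a, group_a) in enumerate(blocks_a):
        for ib, (cls_b, group_b) in enumerate(blocks_b):
            if cls_a == cls_b:                       # same object class only
                sims[(ia, ib)] = similarity_fn(group_a, group_b)
    return sims
```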
Fig. 4 is a schematic composition diagram of an image matching apparatus based on context information fusion according to an embodiment of the present invention, where the apparatus includes:
the image block segmentation module 11 is configured to segment the image blocks to be matched from images A and B with an image segmentation network, and splice each image block together with the processed whole image into an image block group;
the similarity calculation module 12 is configured to feed the image block groups of images A and B into the two sub-networks of a twin network with identical structure and shared weights, extract texture features from the image blocks and context information from the whole images, and calculate the similarity between an image block group of image A and an image block group of image B from a vector composed of the texture features and the context information;
and the image block matching module 13 is configured to obtain the matched image blocks in images A and B by processing the similarities with the Hungarian algorithm.
The apparatus of this embodiment may be configured to implement the technical solution of the method embodiment shown in fig. 1, and the implementation principle and the technical effect are similar, which are not described herein again. The same applies to the following embodiments, which are not further described.
As an alternative embodiment, the method for obtaining the image block group includes:
obtaining the segmentation mask of a target instance, and cropping along its minimum circumscribed rectangle to obtain an image block containing the target instance;
setting the pixels covered by the segmentation mask in the whole image to a background pixel value, and scaling the result to the size of the image block to obtain the processed whole image;
and splicing the image block and the processed whole image along the channel dimension to obtain the image block group.
As an alternative embodiment, the twin network is trained with a training data set consisting of positive sample pairs and negative sample pairs; for two images M and N of the same scene taken from different viewing angles, let M_{i_m, j_m} be the image block group corresponding to the j_m-th image block of the i_m-th class of object in image M, and N_{i_n, j_n} the image block group corresponding to the j_n-th image block of the i_n-th class of object in image N; when i_m = i_n and j_m = j_n, (M_{i_m, j_m}, N_{i_n, j_n}) is a positive sample pair; when i_m ≠ i_n or j_m ≠ j_n, (M_{i_m, j_m}, N_{i_n, j_n}) is a negative sample pair.
As an alternative embodiment, the twin network only calculates and outputs the similarity of the image block groups corresponding to the image blocks of the same object class, and the object classes of the image blocks are output by the image segmentation network.
The above description is only for the specific embodiment of the present invention, but the scope of the present invention is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present invention are included in the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (10)

1. An image matching method based on context information fusion is characterized by comprising the following steps:
segmenting the image blocks to be matched from images A and B with an image segmentation network, and splicing each image block together with the processed whole image into an image block group;
feeding the image block groups of images A and B into the two sub-networks of a twin network with identical structure and shared weights, extracting texture features from the image blocks and context information from the whole images, and calculating the similarity between an image block group of image A and an image block group of image B from a vector composed of the texture features and the context information;
and obtaining the matched image blocks in images A and B by processing the similarities with the Hungarian algorithm.
2. The image matching method based on context information fusion of claim 1, wherein images A and B are a multi-view image pair, i.e., two images of the same scene taken from different angles.
3. The image matching method based on context information fusion according to claim 1, wherein the obtaining method of the image block group comprises:
obtaining the segmentation mask of a target instance, and cropping along its minimum circumscribed rectangle to obtain an image block containing the target instance;
setting the pixels covered by the segmentation mask in the whole image to a background pixel value, and scaling the result to the size of the image block to obtain the processed whole image;
and splicing the image block and the processed whole image along the channel dimension to obtain the image block group.
4. The image matching method based on context information fusion of claim 1, wherein the twin network uses a grouped convolutional neural network for feature extraction on the image block groups: the first group of convolutions extracts texture features from the image block in the image block group, and the second group of convolutions extracts context information from the processed whole image in the image block group.
5. The image matching method based on context information fusion according to claim 1, wherein the twin network is trained with a training data set consisting of positive sample pairs and negative sample pairs; for two images M and N of the same scene taken from different viewing angles, M_{i_m, j_m} is the image block group corresponding to the j_m-th image block of the i_m-th class of object in image M, and N_{i_n, j_n} is the image block group corresponding to the j_n-th image block of the i_n-th class of object in image N; when i_m = i_n and j_m = j_n, (M_{i_m, j_m}, N_{i_n, j_n}) is a positive sample pair; when i_m ≠ i_n or j_m ≠ j_n, (M_{i_m, j_m}, N_{i_n, j_n}) is a negative sample pair.
6. The image matching method based on context information fusion of claim 5, wherein the twin network only calculates and outputs the similarity of the image block groups corresponding to the image blocks of the same object class, and the object classes of the image blocks are output by the image segmentation network.
7. An image matching apparatus based on context information fusion, comprising:
the image block segmentation module is used for segmenting the image blocks to be matched from images A and B with an image segmentation network, and splicing each image block together with the processed whole image into an image block group;
the similarity calculation module is used for feeding the image block groups of images A and B into the two sub-networks of a twin network with identical structure and shared weights, extracting texture features from the image blocks and context information from the whole images, and calculating the similarity between an image block group of image A and an image block group of image B from a vector composed of the texture features and the context information;
and the image block matching module is used for obtaining the matched image blocks in images A and B by processing the similarities with the Hungarian algorithm.
8. The image matching apparatus based on context information fusion according to claim 7, wherein the method for obtaining the image block group includes:
obtaining the segmentation mask of a target instance, and cropping along its minimum circumscribed rectangle to obtain an image block containing the target instance;
setting the pixels covered by the segmentation mask in the whole image to a background pixel value, and scaling the result to the size of the image block to obtain the processed whole image;
and splicing the image block and the processed whole image along the channel dimension to obtain the image block group.
9. The context information fusion-based image matching apparatus according to claim 7, wherein the twin network is trained with a training data set consisting of positive sample pairs and negative sample pairs; for two images M and N of the same scene taken from different viewing angles, M_{i_m, j_m} is the image block group corresponding to the j_m-th image block of the i_m-th class of object in image M, and N_{i_n, j_n} is the image block group corresponding to the j_n-th image block of the i_n-th class of object in image N; when i_m = i_n and j_m = j_n, (M_{i_m, j_m}, N_{i_n, j_n}) is a positive sample pair; when i_m ≠ i_n or j_m ≠ j_n, (M_{i_m, j_m}, N_{i_n, j_n}) is a negative sample pair.
10. The context information fusion-based image matching apparatus according to claim 9, wherein the twin network calculates and outputs only similarities of image block groups corresponding to image blocks of a same object class, and the object classes of the image blocks are output by the image segmentation network.
CN202210161767.7A 2022-02-22 2022-02-22 Image matching method and device based on context information fusion Pending CN114782714A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210161767.7A CN114782714A (en) 2022-02-22 2022-02-22 Image matching method and device based on context information fusion

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210161767.7A CN114782714A (en) 2022-02-22 2022-02-22 Image matching method and device based on context information fusion

Publications (1)

Publication Number Publication Date
CN114782714A true CN114782714A (en) 2022-07-22

Family

ID=82423564

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210161767.7A Pending CN114782714A (en) 2022-02-22 2022-02-22 Image matching method and device based on context information fusion

Country Status (1)

Country Link
CN (1) CN114782714A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115439938A (en) * 2022-09-09 2022-12-06 湖南智警公共安全技术研究院有限公司 Anti-splitting face archive data merging processing method and system
CN115439938B (en) * 2022-09-09 2023-09-19 湖南智警公共安全技术研究院有限公司 Anti-splitting face archive data merging processing method and system
CN115497633A (en) * 2022-10-19 2022-12-20 联仁健康医疗大数据科技股份有限公司 Data processing method, device, equipment and storage medium
CN115497633B (en) * 2022-10-19 2024-01-30 联仁健康医疗大数据科技股份有限公司 Data processing method, device, equipment and storage medium

Similar Documents

Publication Publication Date Title
CN112132197B (en) Model training, image processing method, device, computer equipment and storage medium
CN106778604B (en) Pedestrian re-identification method based on matching convolutional neural network
CN102708370B (en) Method and device for extracting multi-view angle image foreground target
CN111160407B (en) Deep learning target detection method and system
CN104599275B (en) The RGB-D scene understanding methods of imparametrization based on probability graph model
CN110276264B (en) Crowd density estimation method based on foreground segmentation graph
CN104134234A (en) Full-automatic three-dimensional scene construction method based on single image
CN110298227B (en) Vehicle detection method in unmanned aerial vehicle aerial image based on deep learning
CN110827312B (en) Learning method based on cooperative visual attention neural network
CN108648161A (en) The binocular vision obstacle detection system and method for asymmetric nuclear convolutional neural networks
CN108846404B (en) Image significance detection method and device based on related constraint graph sorting
CN110276768B (en) Image segmentation method, image segmentation device, image segmentation apparatus, and medium
CN113870128B (en) Digital mural image restoration method based on depth convolution countermeasure network
CN110334584B (en) Gesture recognition method based on regional full convolution network
CN114782714A (en) Image matching method and device based on context information fusion
CN102982539A (en) Characteristic self-adaption image common segmentation method based on image complexity
CN111027581A (en) 3D target detection method and system based on learnable codes
CN110956681A (en) Portrait background automatic replacement method combining convolutional network and neighborhood similarity
CN110111346B (en) Remote sensing image semantic segmentation method based on parallax information
CN110852327A (en) Image processing method, image processing device, electronic equipment and storage medium
Zou et al. Microarray camera image segmentation with Faster-RCNN
CN108388901B (en) Collaborative significant target detection method based on space-semantic channel
CN111160107B (en) Dynamic region detection method based on feature matching
CN108399630B (en) Method for quickly measuring distance of target in region of interest in complex scene
CN109816710B (en) Parallax calculation method for binocular vision system with high precision and no smear

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination