
CN108595710A - A rapid massive picture de-duplication method - Google Patents

A rapid massive picture de-duplication method

Info

Publication number
CN108595710A
Authority
CN
China
Prior art keywords
image
hash
picture
img
mapping
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810446311.9A
Other languages
Chinese (zh)
Other versions
CN108595710B (en)
Inventor
杨晓春
王斌
王晓琼
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual
Priority to CN201810446311.9A
Publication of CN108595710A
Application granted
Publication of CN108595710B
Expired - Fee Related
Anticipated expiration

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/22Matching criteria, e.g. proximity measures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/213Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • G06F18/2136Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods based on sparsity criteria, e.g. with an overcomplete basis

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Processing Or Creating Images (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a rapid massive picture de-duplication method. A perceptual hash algorithm generates image fingerprint information for every picture in the set to be de-duplicated; an image hash feature dictionary is then built using multiple groups of random hash mappings in order to remove duplicate pictures. Compared with the prior art, the invention de-duplicates massive image sets quickly by building low-dimensional image features to generate image fingerprints and by building an image hash feature dictionary, which eliminates the time-consuming image feature comparison entirely. Multiple well-chosen hash mappings compensate for the sparsity of the mapping space caused by low-dimensional image features and locality-sensitive hashing. The method can therefore not only extract image features and locate duplicate images quickly, but also reduces the number of image feature comparisons to 0, greatly improving the efficiency of massive image de-duplication while maintaining precision.

Description

A rapid massive picture de-duplication method
Technical field
The present invention relates to the technical field of picture de-duplication, and in particular to a rapid massive picture de-duplication method.
Background
Existing duplicate-image removal methods first extract features such as color, texture and shape from every image in the data set, then use the similarity of these features to measure the similarity between images, and thereby remove duplicate images. Once the number of images reaches a certain scale, however, the time needed to compare features pairwise becomes enormous and hard to accept. Moreover, in a massive image collection most images are unrelated to one another, so comparing them pairwise contributes nothing to de-duplication and merely consumes a large amount of computing time, which makes massive picture de-duplication inefficient and time-consuming.
An MD5 signature is a cryptographic hash. It is irreversible and its code bits are highly discrete; under normal circumstances it uniquely represents the features of the original information, so it is widely used in duplicate-image detection. MD5 signatures are nevertheless very limited. For a txt document, for example, the MD5 value is computed from the document's binary data. If a copy of the document differs from the original by even a subtle modification, its MD5 signature is completely different from that of the original. MD5 signatures can therefore only identify completely unmodified files; they cannot identify images that have been slightly scaled or color-adjusted.
The perceptual hash algorithm, a key technique proposed by Neal Krawetz, does not compute a hash value in the strict sense. Instead, it generates a fingerprint (in string format) for an image simply by judging the differences between adjacent pixels. For two images to be compared, computing the Hamming distance between their fingerprints is enough to judge their similarity, and similar images can then be retrieved. However, the perceptual hash algorithm only speeds up the comparison between two images by generating image fingerprints. It is suited only to image retrieval against a given image database and a given query image; it does not reduce the number of feature comparisons across a large image collection, and the feature comparison between images is undoubtedly the most time-consuming part. It is therefore not suited to fast feature comparison over massive image sets, and cannot solve the problem of fast massive image de-duplication.
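The contrast drawn above between an MD5 signature and a perceptual fingerprint can be illustrated with a minimal sketch. The byte strings and the two 64-bit fingerprint values below are made-up stand-ins, not data from the patent; `md5_hex` and `hamming_distance` are hypothetical helper names.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """Return the MD5 digest of raw bytes as a hex string."""
    return hashlib.md5(data).hexdigest()


def hamming_distance(fp_a: int, fp_b: int) -> int:
    """Count the bit positions in which two integer fingerprints differ."""
    return bin(fp_a ^ fp_b).count("1")


# Two byte strings differing in one byte (stand-ins for a file before and
# after a tiny modification): their MD5 digests no longer match.
original = b"pixel data of some image"
edited = b"pixel data of some imagf"
print(md5_hex(original))
print(md5_hex(edited))

# Two hypothetical 64-bit fingerprints that differ in two bits: the Hamming
# distance stays small, so the images would still be judged similar.
fp1 = 0xF0F0F0F0F0F0F0F0
fp2 = fp1 ^ 0b101
print(hamming_distance(fp1, fp2))
```

A small Hamming distance signals near-identical fingerprints, whereas the MD5 digests give no indication of how close the two inputs were.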
When duplicate images are searched for by comparing image features, every pair of images must be compared, so the time consumed grows rapidly (quadratically in the number of images) as the amount of image data increases. With a suitable hash mapping, similar images are mapped into the same hash bucket, which reduces the number of image feature comparisons and the time needed to search for duplicates. However, when high-dimensional image features are mapped into low-dimensional hash buckets to form an m-dimensional hash feature, the features are very sparse with respect to the m-dimensional Hamming space. If such a sparse low-dimensional feature is the only feature extracted and is treated as the feature of the image itself, a large number of different images end up in the same hash bucket. The images within one bucket then still require repeated, accurate pairwise feature comparison before duplicates can finally be found, and different images may also receive the same hash code, leading to images being deleted by mistake.
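To make the cost argument concrete, the following sketch counts comparisons for an all-pairs search and for a search restricted to images that share a hash bucket. The bucket assignment (`i % 256`, i.e. an 8-bit code spread evenly) is an assumption chosen purely for illustration.

```python
from collections import Counter


def pairwise_comparisons(n: int) -> int:
    """Number of unordered pairs among n images: n * (n - 1) / 2."""
    return n * (n - 1) // 2


def bucketed_comparisons(codes) -> int:
    """Comparisons needed when only images in the same bucket are compared."""
    bucket_sizes = Counter(codes)
    return sum(pairwise_comparisons(size) for size in bucket_sizes.values())


# 10,000 images spread evenly over 256 buckets (illustrative assumption).
codes = [i % 256 for i in range(10_000)]
print(pairwise_comparisons(len(codes)))   # all-pairs cost
print(bucketed_comparisons(codes))        # within-bucket cost
```

Bucketing shrinks the work by more than two orders of magnitude here, but the within-bucket comparisons remain; the method described in this document aims to remove them as well.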
Summary of the invention
The object of the invention is to overcome the deficiencies of the prior art by providing a rapid massive picture de-duplication method.
To achieve this object, the invention is implemented according to the following technical solution:
A rapid massive picture de-duplication method, comprising the following steps:
Step 1: generate image fingerprint information for every picture in the set to be de-duplicated using a perceptual hash algorithm;
Step 2: build an image hash feature dictionary using multiple groups of random hash mappings, so as to remove duplicate pictures.
In a further technical solution, in step 1 every picture is first converted into a single 8*8 grayscale image, and an image fingerprint code is then generated by the image fingerprint generation algorithm, whose formula is: [formula not reproduced in this text]
In a further technical solution, in step 2 the image fingerprint code generated for each picture is mapped to an m-bit hash code by locality-sensitive hashing. Each image fingerprint code is treated as a node of a connected graph, and p different groups of hash mappings are used to hash-encode the fingerprint codes. The number q of hash buckets that two images share under the p different groups of hash functions is taken as the weight of the edge between the two image nodes, which builds the connected graph between images. The p groups of m-bit hash codes are then combined into an (m*p)-bit hash code, so that identical image fingerprint codes are mapped into the same hash bucket; the repeated fingerprint codes in each hash bucket are then deleted, removing the duplicate pictures.
In a preferred embodiment of the invention, p=6 and m=8.
Compared with the prior art, the invention de-duplicates massive image sets quickly by building low-dimensional image features to generate image fingerprints and by building an image hash feature dictionary, which eliminates the time-consuming image feature comparison entirely. Multiple well-chosen hash mappings compensate for the sparsity of the mapping space caused by low-dimensional image features and locality-sensitive hashing. The method can not only extract image features and locate duplicate images quickly, but also reduces the number of image feature comparisons to 0, greatly improving the efficiency of massive image de-duplication while maintaining precision.
Description of the drawings
Fig. 1 is a flow chart of the invention.
Fig. 2 is a diagram of the hash coding in which each 64-bit picture fingerprint feature is mapped to 8 bits in an embodiment of the invention.
Fig. 3 is the connected graph between multiple images in an embodiment of the invention.
Fig. 4 is one of the pictures in an embodiment of the invention.
Fig. 5 is another of the pictures in an embodiment of the invention.
Fig. 6 is a third of the pictures in an embodiment of the invention.
Fig. 7 is a fourth of the pictures in an embodiment of the invention.
Fig. 8 is a fifth of the pictures in an embodiment of the invention.
Detailed description of the embodiments
The invention is further described below with reference to a specific embodiment. The illustrative embodiment and its description serve to explain the invention and do not limit it.
As shown in Fig. 1, the rapid massive picture de-duplication method of this embodiment comprises the following steps:
Step 1: generate image fingerprint information for every picture in the set to be de-duplicated using a perceptual hash algorithm;
Step 2: build an image hash feature dictionary using multiple groups of random hash mappings, so as to remove duplicate pictures.
The perceptual hash algorithm generates image fingerprint information by building low-dimensional features of an image and uses it as a global description of the image for comparison, so as to search for similar images. Study of the perceptual hash algorithm shows that identical images necessarily produce identical fingerprint features. For retrieving identical images, therefore, discarding the high-dimensional image features and extracting only the low-dimensional ones not only speeds up retrieval but is also sufficient to guarantee retrieval precision. In addition, so that identical images remain robust to changes in image size and color, this embodiment uses an image fingerprint generation algorithm whose formula is: [formula not reproduced in this text]
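The patent's fingerprint formula is not available in this text, so the following is only a stand-in sketch of how a 64-bit fingerprint can be derived from an 8*8 grayscale image. It assumes an average-hash style rule (bit = 1 when a cell is at least as bright as the mean of the 64 cells), which may differ from the patent's formula. Images are plain lists of rows of grayscale values, assumed to be at least 8 pixels in each dimension; `shrink_to_8x8` and `fingerprint` are hypothetical names.

```python
def shrink_to_8x8(gray):
    """Block-average a grayscale image (list of rows, values 0..255) to 8x8.

    Assumes the image is at least 8 pixels wide and 8 pixels high.
    """
    height, width = len(gray), len(gray[0])
    small = []
    for r in range(8):
        r0, r1 = r * height // 8, (r + 1) * height // 8
        row = []
        for c in range(8):
            c0, c1 = c * width // 8, (c + 1) * width // 8
            block = [gray[y][x] for y in range(r0, r1) for x in range(c0, c1)]
            row.append(sum(block) / len(block))
        small.append(row)
    return small


def fingerprint(gray) -> int:
    """Return a 64-bit fingerprint: one bit per 8x8 cell, 1 if cell >= mean."""
    cells = [v for row in shrink_to_8x8(gray) for v in row]
    mean = sum(cells) / len(cells)
    bits = 0
    for value in cells:
        bits = (bits << 1) | (1 if value >= mean else 0)
    return bits


# A 16x16 horizontal gradient, the same picture enlarged to 32x32, and a
# uniformly brightened copy (all three are synthetic test images).
base = [[x * 16 for x in range(16)] for _ in range(16)]
scaled = [[(x // 2) * 16 for x in range(32)] for _ in range(32)]
brighter = [[v + 10 for v in row] for row in base]

print(hex(fingerprint(base)))
print(fingerprint(base) == fingerprint(scaled) == fingerprint(brighter))
```

Shrinking to a fixed grid gives tolerance to resizing, and thresholding against the mean gives tolerance to uniform brightness shifts, which matches the robustness goals stated above.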
In this embodiment, every picture is first converted into a single 8*8 grayscale image, and the image fingerprint code is then generated with the formula of the image fingerprint generation algorithm. Specifically, as shown in Fig. 2, random projection is used to map the resulting 64-bit picture fingerprint feature to an 8-bit hash code that represents the image feature. In Fig. 2, A, B and C represent three projection planes of the hash mapping, and the small circles represent pictures. A picture on the counterclockwise side of a plane is numbered 0 and a picture on the clockwise side is numbered 1, and the digits are arranged in the order A, B, C; the picture represented by the red circle in the figure, for example, is numbered 000. Because this embodiment forms 8-bit hash codes, each hash mapping randomly selects 8 projection planes, producing an 8-bit hash code for every picture. Once the global fingerprint codes of all pictures have been obtained, these features must be used to judge whether images are duplicates. This embodiment borrows the idea of locality-sensitive hash mapping, namely that images falling into the same hash bucket have high similarity, and improves the hash mapping algorithm: it proposes building a hash feature dictionary of the images using multiple groups of random hash mappings, which reduces the number of feature comparisons required to 0.
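The random-projection step can be sketched as follows, under stated assumptions: the 64-bit fingerprint is unpacked into a +1/-1 vector, each projection plane is represented by a random Gaussian normal vector, and the side of the plane on which the vector falls yields one bit. Which side is labeled 0 or 1 is arbitrary here, and the function names and the seed are hypothetical.

```python
import random


def fingerprint_to_vector(fp: int, bits: int = 64):
    """Unpack an integer fingerprint into a +1/-1 vector (MSB first)."""
    return [1 if (fp >> (bits - 1 - i)) & 1 else -1 for i in range(bits)]


def make_projection(m: int, bits: int = 64, seed: int = 0):
    """Draw m random plane normals, i.e. one group of m hash functions."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(bits)] for _ in range(m)]


def lsh_code(fp: int, planes) -> int:
    """Map a fingerprint to an m-bit code, one bit per projection plane."""
    vec = fingerprint_to_vector(fp, len(planes[0]))
    code = 0
    for normal in planes:
        side = sum(n * v for n, v in zip(normal, vec))
        code = (code << 1) | (1 if side >= 0 else 0)
    return code


planes = make_projection(m=8, seed=42)   # 8 planes give an 8-bit code
fp = 0x0F0F0F0F0F0F0F0F                  # made-up example fingerprint
print(format(lsh_code(fp, planes), "08b"))
```

Identical fingerprints always receive identical codes, and fingerprints that differ in only a few bits usually land on the same side of most planes, which is the locality-sensitive property the text relies on.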
In this embodiment, the hash feature dictionary of the image fingerprint information is generated as follows. When the fingerprint code generated for each image is mapped to an m-bit hash code by locality-sensitive hashing, similar images are very likely to have the same m-bit hash code; but because of the sparsity of hash coding, different images may also have the same hash code value. Therefore, each image fingerprint code is treated as a node of a connected graph, p different groups of hash mappings are used to hash-encode the images, and the number q of hash buckets that two images share under the p groups of hash functions is taken as the weight of the edge between the two image nodes. For reasonable values of p and m, an edge with a larger weight q indicates a higher similarity between its two images. During de-duplication, the p groups of m-bit hash codes are combined into an (m*p)-bit hash code. Under p different groups of m-bit hash functions, the probability that different images have identical hash codes in all p groups is extremely small. By taking p mappings, judging repetition from the weight q, and computing with the enlarged number of hash code bits, identical images can be mapped accurately into the same hash bucket. If the values of m and p are too small, different images may share the same code value and be deleted by mistake; if the value of p is too large, the computational cost of the mappings grows. The method of this embodiment takes p=6, m=8 to balance the two, which guarantees both the relative uniqueness of the code values and an acceptable computational cost. The specific image hash dictionary creation algorithm is as follows: [algorithm listing not reproduced in this text]
With the image hash feature dictionary built above, identical image fingerprint codes are mapped into the same hash bucket, and the repeated fingerprint codes in each hash bucket are then deleted, removing the duplicate pictures.
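Since the original algorithm listing is not available in this text, the sketch below shows one possible way to build such a dictionary; it is not the patent's program. It assumes random-hyperplane hash groups, concatenates the p codes of m bits into one (m*p)-bit key, and uses a Python dict as the hash feature dictionary. All function names, file names and fingerprint values are hypothetical.

```python
import random
from collections import defaultdict


def make_groups(p: int, m: int, bits: int = 64, seed: int = 0):
    """Create p independent groups of m random plane normals each."""
    rng = random.Random(seed)
    return [[[rng.gauss(0.0, 1.0) for _ in range(bits)] for _ in range(m)]
            for _ in range(p)]


def group_code(fp: int, planes, bits: int = 64) -> int:
    """m-bit code of a fingerprint under one group of planes."""
    vec = [1 if (fp >> i) & 1 else -1 for i in range(bits)]
    code = 0
    for normal in planes:
        dot = sum(n * v for n, v in zip(normal, vec))
        code = (code << 1) | int(dot >= 0)
    return code


def combined_code(fp: int, groups) -> int:
    """Concatenate the p group codes into a single (m*p)-bit key."""
    m = len(groups[0])
    key = 0
    for planes in groups:
        key = (key << m) | group_code(fp, planes)
    return key


def deduplicate(fingerprints, groups):
    """Bucket images by combined code; keep the first image of each bucket."""
    table = defaultdict(list)
    for name, fp in fingerprints.items():
        table[combined_code(fp, groups)].append(name)
    kept = [names[0] for names in table.values()]
    dropped = [name for names in table.values() for name in names[1:]]
    return kept, dropped


groups = make_groups(p=6, m=8)           # the p=6, m=8 setting from the text
fingerprints = {                         # made-up 64-bit fingerprints
    "a.png": 0x0F0F0F0F0F0F0F0F,
    "a - copy.png": 0x0F0F0F0F0F0F0F0F,
    "b.png": 0xF0F0F0F0F0F0F0F0,
}
kept, dropped = deduplicate(fingerprints, groups)
print(kept, dropped)
```

Each image is hashed once and inserted once, so duplicates are found by dictionary lookup alone and no image-to-image feature comparison is performed.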
With p=6, m=8 set, each picture is taken as a vertex of the connected graph, and the number of times two pictures fall into the same hash bucket is taken as the edge weight. This yields the connected graph shown in Fig. 3 (using 6 pictures as an example), in which every vertex is connected by an edge to every other vertex, and each edge weight is the number of times the two pictures fall into the same hash bucket. If pictures 1 and 2 fall into the same hash bucket in all 6 hash mappings, it can be concluded that they are duplicate pictures; whereas pictures 2 and 4 fall into the same hash bucket in only 4 of the 6 hash mappings, so the two pictures can only be said to have a certain similarity and cannot be determined to be duplicates.
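The edge weights of such a graph can be computed directly from the per-group bucket codes. The sketch below uses made-up bucket codes for three pictures under p=6 hash groups; the explicit pairwise loop is for illustration of the graph only and is not how the de-duplication itself avoids comparisons.

```python
from itertools import combinations


def cooccurrence_weights(codes_per_image):
    """Return {(image_a, image_b): q}, where q counts shared buckets.

    codes_per_image maps an image name to its list of p bucket codes,
    one code per hash group.
    """
    weights = {}
    for a, b in combinations(sorted(codes_per_image), 2):
        q = sum(ca == cb
                for ca, cb in zip(codes_per_image[a], codes_per_image[b]))
        weights[(a, b)] = q
    return weights


def duplicate_pairs(weights, p: int):
    """Pairs that share a bucket in all p mappings are treated as duplicates."""
    return [pair for pair, q in weights.items() if q == p]


# Hypothetical bucket codes for three pictures under p=6 hash groups.
codes = {
    "img1": [3, 17, 200, 45, 9, 128],
    "img2": [3, 17, 200, 45, 9, 128],
    "img4": [3, 17, 200, 45, 77, 1],
}
weights = cooccurrence_weights(codes)
print(weights)
print(duplicate_pairs(weights, p=6))
```

Here `img1` and `img2` share all 6 buckets and are reported as duplicates, while `img4` shares only 4 buckets with each of them and is merely similar.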
To further verify that p=6, m=8 guarantees both the relative uniqueness of the code values and an acceptable computational cost, this embodiment tested the proposed method on data sets of 1,000, 5,000 and 10,000 pictures. The results of one experiment are shown below. In this experiment, 4 duplicate pictures were added manually to 637 pictures downloaded from the Internet.
With p=4, m=8, some non-duplicate pictures (Fig. 4 and Fig. 5) are also detected as duplicates. The detailed results are as follows:
./img/res_img/10993710036_2033222c91.png
./img/res_img/10993818044_4c19b86c82.png
are duplicate pictures
./img/res_img/1355787476_32e9f2a30b.png
./img/res_img/2476937534_21b285aa46_n.png
are duplicate pictures
./img/res_img/14073784469_ffb12f3387_n.png
./img/res_img/21652746_cc379e0eea_m.png
are duplicate pictures
./img/res_img/176375506_201859bb92_m- copy .png
./img/res_img/176375506_201859bb92_m.png
are duplicate pictures
./img/res_img/43474673_7bb4465a86- copy .png
./img/res_img/43474673_7bb4465a86.png
are duplicate pictures
./img/res_img/476857510_d2b30175de_n- copy .png
./img/res_img/476857510_d2b30175de_n.png
are duplicate pictures
./img/res_img/5547758_eea9edfd54_n- copy .png
./img/res_img/5547758_eea9edfd54_n.png
are duplicate pictures
0:00:00.083060
As can be seen above, 7 groups of duplicate pictures were detected, yet the two pictures shown in Fig. 4 and Fig. 5 are clearly not duplicates. With p=4, m=8, picture de-duplication therefore has a large error.
With p=6, m=6, the result is better than with p=4, m=8, but errors remain. The detailed results are as follows:
./img/res_img/10993710036_2033222c91.png
./img/res_img/10993818044_4c19b86c82.png
are duplicate pictures
./img/res_img/176375506_201859bb92_m- copy .png
./img/res_img/176375506_201859bb92_m.png
are duplicate pictures
./img/res_img/2520369272_1dcdb5a892_m.png
./img/res_img/525780443_bba812c26a_m.png
are duplicate pictures
./img/res_img/43474673_7bb4465a86- copy .png
./img/res_img/43474673_7bb4465a86.png
are duplicate pictures
./img/res_img/476857510_d2b30175de_n- copy .png
./img/res_img/476857510_d2b30175de_n.png
are duplicate pictures
./img/res_img/5547758_eea9edfd54_n- copy .png
./img/res_img/5547758_eea9edfd54_n.png
are duplicate pictures
0:00:00.068091
As can be seen above, 6 groups of duplicate pictures were detected, yet the two pictures shown in Fig. 6 and Fig. 7 are clearly not duplicates. With p=6, m=6, picture de-duplication therefore still has a large error.
With p=8, m=6, all four added duplicate pictures are detected. In addition, the two slightly deformed pictures 10993710036 and 10993818044 (one obtained from the other by a slight shift) are also detected. The detailed results are as follows:
./img/res_img/10993710036_2033222c91.png
./img/res_img/10993818044_4c19b86c82.png
are duplicate pictures
./img/res_img/176375506_201859bb92_m- copy .png
./img/res_img/176375506_201859bb92_m.png
are duplicate pictures
./img/res_img/43474673_76b4465a86- copy .png
./img/res_img/43474673_7bb4465a86.png
are duplicate pictures
./img/res_img/476857510_d2b30175de_n- copy .png
./img/res_img/476857510_d2b30175de_n.png
are duplicate pictures
./img/res_img/5547758_eea9edfd54_n- copy .png
./img/res_img/5547758_eea9edfd54_n.png
are duplicate pictures
0:00:00.100637
As can be seen above, 5 groups of duplicate pictures were detected, whereas the two pictures shown in Fig. 8 are in fact clearly not duplicates. With p=8, m=6, picture de-duplication therefore has a small error.
With p=6, m=8, the detailed results are as follows:
./img/res_img/10993710036_2033222c91.png
./img/res_img/10993818044_4c19b86c82.png
are duplicate pictures
./img/res_img/176375506_201859bb92_m- copy .png
./img/res_img/176375506_201859bb92_m.png
are duplicate pictures
./img/res_img/43474673_7bb4465a86- copy .png
./img/res_img/43474673_7bb4465a86.png
are duplicate pictures
./img/res_img/476857510_d2b30175de_n- copy .png
./img/res_img/476857510_d2b30175de_n.png
are duplicate pictures
./img/res_img/5547758_eea9edfd54_n- copy .png
./img/res_img/5547758_eea9edfd54_n.png
are duplicate pictures
0:00:00.076153
As can be seen above, 5 groups of duplicate pictures were detected, with almost no difference in effect from p=8, m=6, but the time consumed is lower than with p=8, m=6 (the last line of each result listing is the time consumed).
Therefore, taking everything into account, p=6, m=8 is adopted. The number of image feature comparisons is reduced to 0, which greatly improves the efficiency of massive image de-duplication while maintaining precision.
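One way to read the four settings is by the length of the combined code, m*p bits. The sketch below tabulates that length together with an idealized chance that two unrelated images receive the same combined code. The idealization (every code bit independent and uniformly distributed) is an assumption made here for illustration only; real fingerprints are correlated, so actual collision rates are higher, as the false positives reported for the shorter codes suggest.

```python
def combined_bits(p: int, m: int) -> int:
    """Length in bits of the code formed by concatenating p codes of m bits."""
    return p * m


def ideal_collision_probability(p: int, m: int) -> float:
    """Chance that two unrelated images share the full code, assuming
    independent, uniformly distributed bits (an idealization)."""
    return 2.0 ** -combined_bits(p, m)


# The four (p, m) settings compared in the experiments above.
for p, m in [(4, 8), (6, 6), (8, 6), (6, 8)]:
    print(f"p={p} m={m} bits={combined_bits(p, m)} "
          f"ideal_collision={ideal_collision_probability(p, m):.3e}")
```

Both p=8, m=6 and p=6, m=8 yield 48-bit codes, which is consistent with their nearly identical detection results, while p=6 requires two fewer hash groups per image.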
The technical solution of the invention is not limited to the specific embodiment described above; any technical variation made according to the technical solution of the invention falls within the scope of protection of the invention.

Claims (4)

1. A rapid massive picture de-duplication method, characterized by comprising the following steps:
Step 1: generate image fingerprint information for every picture in the set to be de-duplicated using a perceptual hash algorithm;
Step 2: build an image hash feature dictionary using multiple groups of random hash mappings, so as to remove duplicate pictures.
2. The rapid massive picture de-duplication method according to claim 1, characterized in that: in step 1, every picture is first converted into a single 8*8 grayscale image, and an image fingerprint code is then generated by the image fingerprint generation algorithm, whose formula is: [formula not reproduced in this text]
3. The rapid massive picture de-duplication method according to claim 2, characterized in that: in step 2, the image fingerprint code generated for each picture is mapped to an m-bit hash code by locality-sensitive hashing; each image fingerprint code is treated as a node of a connected graph; p different groups of hash mappings are used to hash-encode the fingerprint codes; the number q of hash buckets that two images share under the p different groups of hash functions is taken as the weight of the edge between the two image nodes, which builds the connected graph between images; the p groups of m-bit hash codes are then combined into an (m*p)-bit hash code, so that identical image fingerprint codes are mapped into the same hash bucket; and the repeated fingerprint codes in each hash bucket are then deleted, removing the duplicate pictures.
4. The rapid massive picture de-duplication method according to claim 3, characterized in that: p=6 and m=8.
CN201810446311.9A 2018-05-11 2018-05-11 Rapid massive picture de-duplication method Expired - Fee Related CN108595710B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810446311.9A CN108595710B (en) 2018-05-11 2018-05-11 Rapid massive picture de-duplication method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810446311.9A CN108595710B (en) 2018-05-11 2018-05-11 Rapid massive picture de-duplication method

Publications (2)

Publication Number Publication Date
CN108595710A true CN108595710A (en) 2018-09-28
CN108595710B CN108595710B (en) 2021-07-13

Family

ID=63636685

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810446311.9A Expired - Fee Related CN108595710B (en) 2018-05-11 2018-05-11 Rapid massive picture de-duplication method

Country Status (1)

Country Link
CN (1) CN108595710B (en)

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109934712A (en) * 2019-01-30 2019-06-25 网联清算有限公司 Account checking method, account checking apparatus and electronic equipment applied to distributed system
CN110458224A (en) * 2019-08-06 2019-11-15 北京字节跳动网络技术有限公司 Image processing method, device, electronic equipment and computer-readable medium
CN110490250A (en) * 2019-08-19 2019-11-22 广州虎牙科技有限公司 A kind of acquisition methods and device of artificial intelligence training set
CN110688514A (en) * 2019-08-30 2020-01-14 中国人民财产保险股份有限公司 Insurance claim settlement image data duplicate checking method and device
CN111078914A (en) * 2019-12-18 2020-04-28 书行科技(北京)有限公司 Method and device for detecting repeated pictures
CN111091118A (en) * 2019-12-31 2020-05-01 北京奇艺世纪科技有限公司 Image recognition method and device, electronic equipment and storage medium
CN111368122A (en) * 2020-02-14 2020-07-03 深圳壹账通智能科技有限公司 Method and device for removing duplicate pictures
CN111382298A (en) * 2018-12-30 2020-07-07 贝壳技术有限公司 Image retrieval method and device based on picture content and electronic equipment
CN117708354A (en) * 2024-02-06 2024-03-15 湖南快乐阳光互动娱乐传媒有限公司 Image indexing method and device, electronic equipment and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105677661A (en) * 2014-09-30 2016-06-15 华东师范大学 Method for detecting repetition data of social media
US20160247537A1 (en) * 2015-02-24 2016-08-25 Plaay, Llc System and method for creating a sports video
CN106570141A (en) * 2016-11-04 2017-04-19 中国科学院自动化研究所 Method for detecting approximately repeated image
CN107729935A (en) * 2017-10-12 2018-02-23 杭州贝购科技有限公司 The recognition methods of similar pictures and device, server, storage medium

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111382298A (en) * 2018-12-30 2020-07-07 贝壳技术有限公司 Image retrieval method and device based on picture content and electronic equipment
CN111382298B (en) * 2018-12-30 2021-04-20 北京房江湖科技有限公司 Image retrieval method and device based on picture content and electronic equipment
CN109934712A (en) * 2019-01-30 2019-06-25 网联清算有限公司 Account checking method, account checking apparatus and electronic equipment applied to distributed system
CN110458224A (en) * 2019-08-06 2019-11-15 北京字节跳动网络技术有限公司 Image processing method, device, electronic equipment and computer-readable medium
CN110490250A (en) * 2019-08-19 2019-11-22 广州虎牙科技有限公司 A kind of acquisition methods and device of artificial intelligence training set
CN110688514A (en) * 2019-08-30 2020-01-14 中国人民财产保险股份有限公司 Insurance claim settlement image data duplicate checking method and device
CN111078914A (en) * 2019-12-18 2020-04-28 书行科技(北京)有限公司 Method and device for detecting repeated pictures
CN111078914B (en) * 2019-12-18 2023-04-18 书行科技(北京)有限公司 Method and device for detecting repeated pictures
CN111091118A (en) * 2019-12-31 2020-05-01 北京奇艺世纪科技有限公司 Image recognition method and device, electronic equipment and storage medium
CN111368122A (en) * 2020-02-14 2020-07-03 深圳壹账通智能科技有限公司 Method and device for removing duplicate pictures
CN117708354A (en) * 2024-02-06 2024-03-15 湖南快乐阳光互动娱乐传媒有限公司 Image indexing method and device, electronic equipment and storage medium
CN117708354B (en) * 2024-02-06 2024-04-30 湖南快乐阳光互动娱乐传媒有限公司 Image indexing method and device, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN108595710B (en) 2021-07-13

Similar Documents

Publication Publication Date Title
CN108595710A (en) A kind of quick mass picture De-weight method
KR101565265B1 (en) Coding of feature location information
CN111767819B (en) Image recognition method, device, electronic equipment and computer readable medium
EP0151316A2 (en) On-line recognition method and apparatus for a handwritten pattern
CN108062478A (en) The malicious code sorting technique that global characteristics visualization is combined with local feature
WO2004062110A1 (en) Data compressing method, program and apparatus
Zenil et al. Image characterization and classification by physical complexity
CN105630765A (en) Place name address identifying method
CN103020321B (en) Neighbor search method and system
US10185887B2 (en) Textual representation of an image
Javed et al. A direct approach for word and character segmentation in run-length compressed documents with an application to word spotting
US8229232B2 (en) Computer vision-based methods for enhanced JBIG2 and generic bitonal compression
CN100541537C (en) A kind of method of utilizing computing machine to the compression of digitizing files
CN107423309A (en) Magnanimity internet similar pictures detecting system and method based on fuzzy hash algorithm
Vázquez et al. Using normalized compression distance for image similarity measurement: an experimental study
TWI847497B (en) Store deduplication processing method, device, equipment and storage medium
CN111563139A (en) Checking method and device for identifying invoice drug name through OCR (optical character recognition) and computer equipment
JP2006351001A (en) Content characteristic quantity extraction method and device, and content identity determination method and device
CN110110120B (en) Image retrieval method and device based on deep learning
CN112182337B (en) Method for identifying similar news from massive short news and related equipment
Besiris et al. Dictionary-based color image retrieval using multiset theory
CN115455966B (en) Safe word stock construction method and safe code extraction method thereof
CN113378163A (en) Android malicious software family classification method based on DEX file partition characteristics
Zhu et al. File Fragment Type Identification Based on CNN and LSTM
KR102497634B1 (en) Method and apparatus for compressing fastq data through character frequency-based sequence reordering

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20210713