
WO2020000879A1 - Image recognition method and apparatus - Google Patents

Image recognition method and apparatus

Info

Publication number
WO2020000879A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
identified
screenshot
recognition
recognition result
Prior art date
Application number
PCT/CN2018/116335
Other languages
French (fr)
Chinese (zh)
Inventor
周恺卉
王长虎
Original Assignee
北京字节跳动网络技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京字节跳动网络技术有限公司
Publication of WO2020000879A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures

Definitions

  • Embodiments of the present application relate to the field of computer technology, and in particular, to an image recognition method and device.
  • the embodiments of the present application provide an image recognition method and device.
  • an embodiment of the present application provides an image recognition method.
  • the method includes: acquiring an image to be identified; inputting the image to be identified into a pre-trained screenshot image recognition model to obtain a recognition result characterizing whether the image to be identified is a screenshot image, wherein the screenshot image recognition model is used to characterize the correspondence between the image to be identified and the recognition result; and in response to the recognition result indicating that the image to be identified is a screenshot image, deleting the image to be identified.
  • the method before acquiring the image to be identified, includes: acquiring a target image; and capturing a preset region of the target image as the image to be identified.
  • the method before acquiring the image to be identified, includes: acquiring a frame sequence of the target video; and selecting a target frame in the frame sequence of the target video as the image to be identified.
  • the screenshot image recognition model is obtained by training in the following steps: obtaining a training sample set, where each training sample includes a sample image and label information used to characterize whether the sample image is a screenshot image; and training the screenshot image recognition model with the sample image as input and the label information corresponding to the input sample image as the desired output.
  • the method further includes: in response to the recognition result indicating that the image to be recognized is not a screenshot image, performing text recognition on the image to be recognized to obtain a text recognition result; determining whether the text recognition result includes a preset text; and in response to determining that the text recognition result includes the preset text, deleting the image to be recognized.
  • an embodiment of the present application provides an image recognition device, the device includes: a to-be-identified image acquisition unit configured to obtain the image to be identified; a recognition unit configured to input the image to be identified into a pre-trained screenshot image recognition model to obtain a recognition result used to characterize whether the image to be identified is a screenshot image, wherein the screenshot image recognition model is used to represent the correspondence between the image to be identified and the recognition result; and a first deletion unit configured to delete the image to be identified in response to the recognition result indicating that the image to be identified is a screenshot image.
  • the apparatus further includes: a pushing unit configured to, in response to the recognition result indicating that the image to be identified is not a screenshot image, push information indicating that the image to be identified is not a screenshot image.
  • the apparatus further includes: a target image acquisition unit configured to acquire a target image; and a capture unit configured to intercept a preset region of the target image as an image to be identified.
  • the apparatus further includes: a frame sequence acquisition unit configured to acquire a frame sequence of the target video; and a selection unit configured to select a target frame in the frame sequence of the target video as an image to be identified.
  • the screenshot image recognition model is obtained by training in the following steps: obtaining a training sample set, where each training sample includes a sample image and label information used to characterize whether the sample image is a screenshot image; and training the screenshot image recognition model with the sample image as input and the label information corresponding to the input sample image as the desired output.
  • the apparatus further includes: a text recognition unit configured to, in response to the recognition result indicating that the image to be recognized is not a screenshot image, perform text recognition on the image to be recognized to obtain a text recognition result; a determination unit configured to determine whether the text recognition result includes a preset text; and a second deleting unit configured to delete the image to be recognized in response to determining that the text recognition result includes the preset text.
  • an embodiment of the present application provides an electronic device.
  • the electronic device includes: one or more processors; and a storage device storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the method as described in any implementation manner of the first aspect.
  • an embodiment of the present application provides a computer-readable medium on which a computer program is stored.
  • the image recognition method and device provided in the embodiments of the present application recognize an image to be recognized by using a screenshot image recognition model. If the image to be recognized is a screenshot image, it is deleted. Thus, both the recognition of the image and the deletion of screenshot images are realized. Because a screenshot image recognition model is used, image review and recognition are more efficient than manual review.
  • FIG. 1 is an exemplary system architecture diagram to which an embodiment of the present application can be applied;
  • FIG. 2 is a flowchart of an embodiment of an image recognition method according to the present application.
  • FIG. 3 is a schematic diagram of an application scenario of the image recognition method according to the present application.
  • FIG. 5 is a schematic structural diagram of an embodiment of an image recognition apparatus according to the present application.
  • FIG. 6 is a schematic structural diagram of a computer system suitable for implementing a server according to an embodiment of the present application.
  • FIG. 1 illustrates an exemplary system architecture 100 to which an image recognition method or an image recognition apparatus of an embodiment of the present application can be applied.
  • the system architecture 100 may include terminal devices 101, 102, and 103, a network 104, and a server 105.
  • the network 104 is a medium for providing a communication link between the terminal devices 101, 102, 103 and the server 105.
  • the network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, among others.
  • the user can use the terminal devices 101, 102, 103 to interact with the server 105 through the network 104 to receive or send messages and the like.
  • Various communication client applications can be installed on the terminal devices 101, 102, and 103, such as photographing applications, picture processing applications, instant messaging tools, email clients, social platform software, and the like.
  • the terminal devices 101, 102, and 103 may be hardware or software.
  • when the terminal devices 101, 102, and 103 are hardware, they can be various electronic devices that support storing and transmitting images, including, but not limited to, smart phones, tablet computers, laptop computers, and desktop computers.
  • when the terminal devices 101, 102, and 103 are software, they can be installed in the electronic devices listed above. They can be implemented as multiple pieces of software or software modules (for example, to provide distributed services) or as a single piece of software or software module. This is not specifically limited here.
  • the server 105 may be a server that provides various services, such as a background server that processes images stored in the terminal devices 101, 102, and 103.
  • the background server may process the received image (for example, identify whether it is a screenshot image), and perform corresponding processing according to the processing result (for example, the recognition result).
  • the image recognition method provided in the embodiment of the present application is generally executed by the server 105, and accordingly, the image recognition device is generally provided in the server 105.
  • the server may be hardware or software.
  • when the server is hardware, it can be implemented as a distributed server cluster consisting of multiple servers or as a single server.
  • when the server is software, it can be implemented as multiple pieces of software or software modules (for example, to provide distributed services) or as a single piece of software or software module. This is not specifically limited here.
  • terminal devices, networks, and servers in FIG. 1 are merely exemplary. Depending on the implementation needs, there can be any number of terminal devices, networks, and servers.
  • the image recognition method includes the following steps:
  • Step 201 Obtain an image to be identified.
  • an execution subject of the image recognition method may obtain an image to be recognized from a terminal in a wired connection manner or a wireless connection manner.
  • the image to be identified may also be stored locally on the execution subject. At this time, the execution subject may directly obtain the image to be identified from the local.
  • the image to be identified may be any image that needs to be identified. In practice, the images to be identified can be specified by a technician or filtered according to certain conditions.
  • the method may include: acquiring a frame sequence of the target video; and selecting a target frame in the frame sequence of the target video as the image to be identified.
  • the target video can be any video.
  • the determination of the target video can be specified by a technician, or it can be filtered according to certain conditions.
  • the target frame may be at least one frame in the above-mentioned frame sequence.
  • the target frame can be specified by a technician, or it can be filtered according to certain conditions.
  • the condition may be: one frame is extracted every 2 seconds.
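  • The frame-selection condition above can be sketched as follows. This is a minimal illustration, not the method of the application: the function name `select_target_frames` and its parameters are hypothetical, and the sketch assumes the frame count and frame rate of the target video are already known.

```python
def select_target_frames(frame_count, fps, interval_seconds=2.0):
    """Return the indices of frames sampled once every `interval_seconds`.

    `frame_count` and `fps` describe the target video; both are assumed to be
    known in advance (hypothetical inputs, not defined by the source text).
    """
    if frame_count < 0 or fps <= 0 or interval_seconds <= 0:
        raise ValueError("frame_count must be >= 0; fps and interval must be > 0")
    # Number of frames between two consecutive samples, at least one frame.
    step = max(1, round(fps * interval_seconds))
    return list(range(0, frame_count, step))


if __name__ == "__main__":
    # A 10-second clip at 25 frames per second, sampled every 2 seconds.
    print(select_target_frames(frame_count=250, fps=25))
```

  • Each returned index identifies one target frame that would then be treated as an image to be identified.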
  • Step 202 Input the image to be identified into a pre-trained screenshot image recognition model to obtain a recognition result characterizing whether the image to be identified is a screenshot image.
  • the above-mentioned execution subject may input the image to be recognized into the pre-trained screenshot image recognition model to obtain a recognition result characterizing whether the image to be recognized is a screenshot image.
  • the screenshot image may be an image recording content displayed on a screen of the electronic device.
  • the recognition results can take many forms. For example, you can use a number to indicate whether the image to be identified is a screenshot. Specifically, “1” may be used to indicate that the image to be identified is a screenshot image. Use "0" to indicate that the image to be identified is not a screenshot.
  • the recognition result may also be a value between 0 and 1, which is used to indicate the probability that the image to be recognized is a screenshot image.
  • the recognition results can also be text, characters, and so on.
  • the form of the recognition result is not limited.
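  • Because the recognition result may be a 0/1 label, a probability, or text, downstream code needs one place that turns it into a yes/no decision. The sketch below is an assumption-laden illustration: the helper name `is_screenshot`, the 0.5 threshold, and the accepted text labels are all hypothetical and are not specified by the source.

```python
def is_screenshot(result, threshold=0.5):
    """Normalize a recognition result into a boolean decision.

    Accepts a bool, a number (0/1 label or probability), or a text label.
    The threshold and the set of accepted text labels are assumptions.
    """
    if isinstance(result, bool):
        return result
    if isinstance(result, (int, float)):
        # Covers both the "1"/"0" labels and a probability between 0 and 1.
        return float(result) >= threshold
    if isinstance(result, str):
        return result.strip().lower() in {"1", "true", "yes", "screenshot"}
    raise TypeError(f"unsupported recognition result type: {type(result).__name__}")
```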
  • the screenshot image recognition model is used to characterize the correspondence between the image to be recognized and the recognition result.
  • the screenshot image model may be a correspondence table storing a large number of images (including screenshot images or non-screenshot images) and recognition results corresponding to the images.
  • the correspondence relationship table may be generated based on statistics of a large number of images and recognition results.
  • the above-mentioned execution subject can match the image to be identified with a large number of images in the correspondence table.
  • an image in the correspondence table whose similarity to the image to be identified exceeds a preset threshold (for example, 95%) can be determined.
  • the recognition result corresponding to the determined image may be used as the recognition result of the image to be recognized.
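  • The correspondence-table variant of the model can be sketched as a nearest-match lookup with a similarity threshold. The source does not say how similarity is measured, so the pixel-equality measure below is purely an assumption chosen to keep the example self-contained; `similarity` and `lookup_recognition_result` are hypothetical names, and images are represented as flat lists of grayscale values.

```python
def similarity(image_a, image_b):
    """Fraction of positions at which two equally sized images agree.

    This measure is an assumption for illustration only.
    """
    if not image_a or len(image_a) != len(image_b):
        return 0.0
    matches = sum(1 for a, b in zip(image_a, image_b) if a == b)
    return matches / len(image_a)


def lookup_recognition_result(image, table, threshold=0.95):
    """Return the result stored for the best match at or above `threshold`.

    `table` is a list of (stored_image, recognition_result) pairs.
    Returns None when no stored image is similar enough.
    """
    best_result, best_score = None, threshold
    for stored_image, result in table:
        score = similarity(image, stored_image)
        if score >= best_score:
            best_result, best_score = result, score
    return best_result
```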
  • the above screenshot image recognition model may also be a neural network.
  • a neural network abstracts the neuron network of the human brain from the perspective of information processing, establishes a simple model, and forms different networks according to different connection methods. It usually consists of a large number of interconnected nodes (or neurons), each of which represents a specific output function, called the activation function. The connection between every two nodes carries a weighting value for the signal passing through it, called a weight (also called a parameter). The output of the network varies with the network's connection mode, weight values, and activation function.
  • a neural network usually includes multiple layers, and each layer includes multiple nodes. Generally, the nodes of the same layer can have the same weight, and the nodes of different layers can have different weights, so the parameters of multiple layers of the neural network can also be different.
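  • The description of nodes, weights, and activation functions can be made concrete with a small forward pass. This is a generic sketch of a fully connected network with a sigmoid activation, not the network used in the application; the names `dense_layer` and `forward` are hypothetical.

```python
import math


def sigmoid(x):
    """A common activation function that maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def dense_layer(inputs, weights, biases, activation=sigmoid):
    """Compute one layer: each node applies the activation to a weighted sum."""
    return [
        activation(sum(w * x for w, x in zip(node_weights, inputs)) + bias)
        for node_weights, bias in zip(weights, biases)
    ]


def forward(inputs, layers):
    """Pass the inputs through each (weights, biases) layer in order."""
    for weights, biases in layers:
        inputs = dense_layer(inputs, weights, biases)
    return inputs
```

  • Changing the weights, the number of layers, or the activation function changes the output, which is the dependency the paragraph above describes.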
  • Step 203 In response to the recognition result indicating that the image to be recognized is a screenshot image, the image to be recognized is deleted.
  • the execution subject may delete the image to be recognized.
  • the above-mentioned execution subject may also push information for indicating that the image to be identified is not a screenshot image.
  • in response to the recognition result indicating that the image to be recognized is not a screenshot image, text recognition is performed on the image to be recognized to obtain a text recognition result; whether the text recognition result includes a preset text is determined; and in response to determining that the text recognition result contains the preset text, the image to be recognized is deleted.
  • the above-mentioned execution subject may perform character recognition on the image to be recognized through various methods to obtain the recognition result.
  • the recognition result may be related information of the text displayed in the image to be recognized.
  • for example, OCR (Optical Character Recognition) may be used.
  • the execution body may determine whether the recognition result (for example, the obtained text) contains a preset text (for example, the name of an operator, etc.). If so, the execution subject may delete the image to be identified.
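  • The two-stage decision described above (screenshot check first, then a preset-text check on the recognized text) can be sketched as follows. The model and the text recognizer are passed in as plain callables, so no particular OCR library is assumed; `review_image`, the 0.5 threshold, and the operator name "ExampleTel" are hypothetical.

```python
def review_image(image, screenshot_model, recognize_text, preset_texts, threshold=0.5):
    """Return "delete" or "keep" for one image.

    `screenshot_model(image)` is assumed to return a probability in [0, 1];
    `recognize_text(image)` is assumed to return the text found in the image.
    """
    # Stage 1: an image recognized as a screenshot is deleted outright.
    if screenshot_model(image) >= threshold:
        return "delete"
    # Stage 2: otherwise, look for any preset text in the recognized text.
    text = recognize_text(image)
    if any(preset in text for preset in preset_texts):
        return "delete"
    return "keep"


if __name__ == "__main__":
    decision = review_image(
        image=b"raw image bytes",
        screenshot_model=lambda img: 0.1,              # stand-in model
        recognize_text=lambda img: "ExampleTel 4G",    # stand-in text recognizer
        preset_texts=["ExampleTel"],
    )
    print(decision)
```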
  • FIG. 3 is a schematic diagram of an application scenario of the image recognition method according to this embodiment.
  • the execution body of the image recognition method is the server 300.
  • the server 300 may first obtain an image 301 to be identified from a terminal. Then, the to-be-recognized image 301 is input to a pre-trained screen image recognition model to obtain a recognition result. If the recognition result indicates that the to-be-recognized image 301 is a screenshot image, the to-be-recognized image 301 is deleted.
  • the image recognition method provided by the above embodiments of the present application uses a screen capture image recognition model to identify an image to be recognized. If the image to be identified is a screenshot, delete it. Thus, the recognition of the image to be recognized and the deletion of the screenshot image are realized. Among them, because the screen capture image recognition model is used, compared with manual review, the efficiency of image review and recognition is improved.
  • FIG. 4 illustrates a process 400 of still another embodiment of the image recognition method.
  • the process 400 of the image recognition method includes the following steps:
  • Step 401 Obtain a target image.
  • the execution subject of the image recognition method may obtain the target image from the terminal in a wired connection or a wireless connection.
  • the target image can be any image.
  • the target image can be specified by a technician, or it can be filtered based on preset conditions.
  • the target image may be stored locally in the execution subject. At this time, the execution subject may also directly obtain the target image from the local.
  • Step 402 Capture a preset area of the target image as the image to be identified.
  • the execution subject may intercept a preset area of the target image as the image to be identified.
  • the preset area may be a part or all of the target image. For example, it can be the top one-fifth of the image.
  • the above-mentioned execution subject may intercept the preset area of the target image in various ways. For example, through some screenshot applications or image processing applications.
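  • Capturing a preset area can be as simple as slicing rows out of the image. The sketch below assumes the image is a list of pixel rows and that the preset area is the top fraction of the image; the function name `crop_top_fraction` is hypothetical, and a real system would more likely use an image processing library.

```python
def crop_top_fraction(image, fraction=0.2):
    """Return a copy of the top `fraction` of an image given as a list of rows."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    if not image:
        return []
    # Keep at least one row so that very small images still yield a region.
    rows = max(1, round(len(image) * fraction))
    return [row[:] for row in image[:rows]]
```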
  • Step 403 Acquire an image to be identified.
  • the execution subject may obtain the to-be-recognized image obtained in step 402. Because the image to be identified is obtained in step 402, it can generally be obtained directly from the local.
  • Step 404 Input the image to be identified into a pre-trained screenshot image recognition model, and obtain a recognition result used to characterize whether the image to be identified is a screenshot image.
  • the above-mentioned screenshot image recognition model may be a model obtained by training an image classification network, such as a Convolutional Neural Network (CNN), based on multiple training samples using a machine learning method.
  • the convolutional neural network (CNN) can be a kind of feed-forward neural network whose artificial neurons respond to surrounding units within a part of the coverage area, and it performs well in image processing.
  • a convolutional neural network may include a convolutional layer, a pooling layer, a depooling layer, and a deconvolution layer.
  • the convolution layer can be used to extract image features.
  • the pooling layer can be used to downsample the input information.
  • the depooling layer can be used to upsample the input information.
  • the deconvolution layer is used to deconvolve the input information: the transpose of the convolution kernel of the convolution layer is used as the convolution kernel of the deconvolution layer to process the input information.
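  • The roles of the convolution layer (feature extraction) and the pooling layer (downsampling) can be illustrated on small nested lists. This is a didactic sketch with hypothetical function names, using stride-1 "valid" convolution in the cross-correlation form common in CNN libraries and non-overlapping max pooling; the source does not specify these details.

```python
def conv2d(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and sum the products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(
                image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh)
                for dj in range(kw)
            )
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def max_pool2d(feature_map, size=2):
    """Downsample by keeping the maximum of each non-overlapping size x size block."""
    return [
        [
            max(
                feature_map[i + di][j + dj]
                for di in range(size)
                for dj in range(size)
            )
            for j in range(0, len(feature_map[0]) - size + 1, size)
        ]
        for i in range(0, len(feature_map) - size + 1, size)
    ]
```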
  • the above screenshot image recognition model can be trained by the following steps:
  • the first step is to obtain a training sample set, where each training sample includes a sample image and annotation information used to characterize whether the sample image is a screenshot image.
  • the label information may be in various forms.
  • the label information may be a numerical value. For example, "0" indicates that it is not a screenshot image, and "1" indicates that it is a screenshot image.
  • the label information may also be text, characters, and so on.
  • the sample image of the training samples in the training sample set is used as input, and the label information corresponding to the input sample image is used as the desired output, and a screenshot image recognition model is trained.
  • the sample images of the training samples can be input into the initial image classification network.
  • the initial image classification network may be various image classification networks. As an example, it may be a residual network (Residual Network, ResNet), VGG, or the like.
  • VGG is a classification model proposed by the Visual Geometry Group (VGG) of the University of Oxford.
  • initial values can be set for the parameters of the initial image classification network. For example, they could be different small random numbers. "Small" ensures that the network does not enter a saturation state due to excessively large weights, which would cause training to fail. "Different" ensures that the network can learn normally. After that, the recognition result of the input sample image can be obtained.
  • the machine learning method is used to train the initial image classification network. Specifically, a preset loss function can first be used to calculate the difference between the recognition result and the label information. Then, based on the calculated difference, the parameters of the initial image classification network can be adjusted. If a preset training end condition is met, the training ends and the trained image classification network is used as the screenshot image recognition model.
  • the training end condition here includes but is not limited to at least one of the following: the training time exceeds a preset duration; the number of training times reaches a preset number of times; and the calculated difference is less than a preset difference threshold.
  • BP (Back Propagation)
  • SGD (Stochastic Gradient Descent)
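  • The training procedure described above (small random initial values, a loss function, parameter updates, and end conditions) can be sketched end to end. To stay short and dependency-free, the sketch swaps the image classification network for a single-layer logistic classifier over hypothetical feature vectors, trained with stochastic gradient descent; all names and hyperparameters are assumptions rather than values from the application.

```python
import math
import random
import time


def predict(weights, bias, features):
    """Probability that a sample is a screenshot under the logistic model."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def train_screenshot_classifier(samples, learning_rate=0.5, max_epochs=1000,
                                max_seconds=5.0, loss_threshold=0.05, seed=0):
    """Train on (features, label) pairs, where label 1 marks a screenshot."""
    rng = random.Random(seed)
    # Different small random initial values, as the text recommends.
    weights = [rng.uniform(-0.01, 0.01) for _ in samples[0][0]]
    bias = 0.0
    start = time.monotonic()
    for _ in range(max_epochs):                      # end condition: number of passes
        order = list(range(len(samples)))
        rng.shuffle(order)
        total_loss = 0.0
        for index in order:
            features, label = samples[index]
            p = predict(weights, bias, features)
            p_safe = min(max(p, 1e-12), 1.0 - 1e-12)  # keep log() finite
            # Cross-entropy loss measures the gap between output and label.
            total_loss -= label * math.log(p_safe) + (1 - label) * math.log(1 - p_safe)
            gradient = p - label
            weights = [w - learning_rate * gradient * x for w, x in zip(weights, features)]
            bias -= learning_rate * gradient
        mean_loss = total_loss / len(samples)
        if mean_loss < loss_threshold:               # end condition: small difference
            break
        if time.monotonic() - start > max_seconds:   # end condition: time budget
            break
    return weights, bias
```

  • The three `break`/loop conditions correspond to the three end conditions listed above; a real image classification network would compute the gradients by back propagation rather than in closed form.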
  • the execution subject of the training step and the image recognition method may be the same or different. If they are the same, the execution subject can store the network structure and parameter values of the trained image recognition model locally after training to obtain the screen image recognition model. If they are different, after the training subject obtains a screen capture image recognition model from training, the network structure and parameter values of the model may be sent to the image recognition method execution subject.
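  • When training and recognition run on different execution subjects, the network structure and parameter values have to be serialized and sent across. The source does not name a format, so JSON is used below purely as an assumption, and `export_model` / `import_model` are hypothetical helper names.

```python
import json


def export_model(structure, parameters):
    """Serialize the network structure and parameter values to a JSON string."""
    return json.dumps({"structure": structure, "parameters": parameters})


def import_model(payload):
    """Restore (structure, parameters) from a string made by `export_model`."""
    data = json.loads(payload)
    return data["structure"], data["parameters"]
```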
  • Step 405 In response to the recognition result indicating that the image to be recognized is a screenshot image, the image to be recognized is deleted.
  • For the specific processing of step 405 and the technical effects brought by it, reference may be made to step 203 of the embodiment corresponding to FIG. 2, and details are not described herein again.
  • the process 400 of the image recognition method in this embodiment adds an image interception step, thereby reducing unnecessary interference information in the image and improving the accuracy of image recognition.
  • this application provides an embodiment of an image recognition device.
  • the device embodiment corresponds to the method embodiment shown in FIG. 2.
  • the device may specifically be used in various electronic devices.
  • the image recognition device 500 in this embodiment includes a to-be-identified image acquisition unit 501, an image recognition unit 502, and a first deletion unit 503.
  • the to-be-identified image acquisition unit 501 is configured to acquire an image to be identified.
  • the image recognition unit 502 is configured to input the image to be recognized into a pre-trained screenshot image recognition model to obtain a recognition result used to characterize whether the image to be recognized is a screenshot image, where the screenshot image recognition model is used to represent the correspondence between the image to be recognized and the recognition result.
  • the first deleting unit 503 is configured to delete the image to be identified in response to the recognition result indicating that the image to be identified is a screenshot image.
  • the apparatus 500 may further include: a push unit (not shown in the figure).
  • the pushing unit is configured to, in response to the recognition result indicating that the image to be identified is not a screenshot image, push information indicating that the image to be identified is not a screenshot image.
  • the apparatus 500 may further include: a target image acquisition unit (not shown in the figure) and a capture unit (not shown in the figure).
  • the target image acquisition unit is configured to acquire a target image.
  • the capturing unit is configured to capture a preset area of the target image as an image to be identified.
  • the apparatus 500 further includes: a frame sequence acquisition unit and a selection unit.
  • the frame sequence obtaining unit is configured to obtain a frame sequence of a target video.
  • the selection unit is configured to select a target frame in a frame sequence of the target video as an image to be identified.
  • the screenshot image recognition model is obtained by training in the following steps: obtaining a training sample set, where each training sample includes a sample image and annotation information used to characterize whether the sample image is a screenshot image; and training the screenshot image recognition model with the sample images of the training samples in the training sample set as input and the annotation information corresponding to the input sample images as the desired output.
  • the apparatus 500 may further include: a text recognition unit (not shown in the figure), a determining unit (not shown in the figure), and a second deleting unit (not shown in the figure).
  • the text recognition unit is configured to, in response to the recognition result indicating that the image to be recognized is not a screenshot image, perform text recognition on the image to be recognized to obtain a text recognition result;
  • the determining unit is configured to determine whether the text recognition result includes a preset text;
  • the second deleting unit is configured to delete the image to be recognized in response to determining that the text recognition result includes the preset text.
  • the above-mentioned image recognition unit 502 inputs the to-be-recognized image obtained by the to-be-recognized image acquisition unit 501 into a pre-trained screen image recognition model, and recognizes the to-be-recognized image. If the image to be identified is a screenshot image, it is deleted by the first deleting unit 503. Thus, the recognition of the image to be recognized and the deletion of the screenshot image are realized. Among them, because the screen capture image recognition model is used, compared with manual review, the efficiency of image review and recognition is improved.
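  • The cooperation of units 501, 502, and 503 can be sketched as a small class that wires three callables together. This is only one possible software arrangement; the class name `ImageRecognitionDevice` and the callable interfaces are hypothetical.

```python
class ImageRecognitionDevice:
    """Wire together an acquisition unit, a recognition unit, and a deletion unit."""

    def __init__(self, acquisition_unit, recognition_unit, deletion_unit):
        self.acquisition_unit = acquisition_unit  # plays the role of unit 501
        self.recognition_unit = recognition_unit  # plays the role of unit 502
        self.deletion_unit = deletion_unit        # plays the role of unit 503

    def run(self):
        """Acquire one image, recognize it, and delete it if it is a screenshot."""
        image = self.acquisition_unit()
        is_screenshot = self.recognition_unit(image)
        if is_screenshot:
            self.deletion_unit(image)
        return is_screenshot
```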
  • FIG. 6 shows a schematic structural diagram of a computer system 600 suitable for implementing a server according to an embodiment of the present application.
  • the server shown in FIG. 6 is only an example, and should not impose any limitation on the functions and scope of use of the embodiments of the present application.
  • the computer system 600 includes a central processing unit (CPU) 601, which can perform various appropriate actions and processes according to a program stored in a read-only memory (ROM) 602 or a program loaded from a storage portion 608 into a random access memory (RAM) 603.
  • in the RAM 603, various programs and data required for the operation of the system 600 are also stored.
  • the CPU 601, the ROM 602, and the RAM 603 are connected to each other through a bus 604.
  • An input / output (I / O) interface 605 is also connected to the bus 604.
  • the following components are connected to the I / O interface 605: an input portion 606 including a keyboard, a mouse, and the like; an output portion 607 including a cathode ray tube (CRT), a liquid crystal display (LCD), and a speaker; a storage portion 608 including a hard disk and the like; a communication section 609 including a network interface card such as a LAN card, a modem, and the like.
  • the communication section 609 performs communication processing via a network such as the Internet.
  • the drive 610 is also connected to the I / O interface 605 as necessary.
  • a removable medium 611 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, etc., is installed on the drive 610 as needed, so that a computer program read therefrom is installed into the storage section 608 as needed.
  • the process described above with reference to the flowchart may be implemented as a computer software program.
  • embodiments of the present disclosure include a computer program product including a computer program carried on a computer-readable medium, the computer program containing program code for performing a method shown in a flowchart.
  • the computer program may be downloaded and installed from a network through the communication section 609, and / or installed from a removable medium 611.
  • the computer-readable medium described in this application may be a computer-readable signal medium or a computer-readable storage medium or any combination of the two.
  • the computer-readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination thereof. More specific examples of computer-readable storage media may include, but are not limited to: electrical connections with one or more wires, portable computer disks, hard disks, random access memory (RAM), read-only memory (ROM), erasable Programming read-only memory (EPROM or flash memory), optical fiber, portable compact disk read-only memory (CD-ROM), optical storage device, magnetic storage device, or any suitable combination of the foregoing.
  • a computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by or in combination with an instruction execution system, apparatus, or device.
  • a computer-readable signal medium may include a data signal that is included in baseband or propagated as part of a carrier wave, and which carries computer-readable program code. Such a propagated data signal may take many forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the foregoing.
  • the computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium, and the computer-readable medium may send, propagate, or transmit a program for use by or in connection with an instruction execution system, apparatus, or device.
  • Program code embodied on a computer-readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
  • Computer program code for performing the operations of this application may be written in one or more programming languages, or a combination thereof, including object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as "C" or similar programming languages.
  • the program code can be executed entirely on the user's computer, partly on the user's computer, as an independent software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server.
  • the remote computer can be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or it can be connected to an external computer (for example, through the Internet using an Internet service provider).
  • each block in the flowchart or block diagram may represent a module, a program segment, or a part of code, which contains one or more executable instructions for implementing a specified logical function.
  • the functions labeled in the blocks may also occur in a different order than those labeled in the drawings. For example, two blocks represented one after the other may actually be executed substantially in parallel, and they may sometimes be executed in the reverse order, depending on the functions involved.
  • each block in the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified function or operation, or by a combination of dedicated hardware and computer instructions.
  • the units described in the embodiments of the present application may be implemented by software or hardware.
  • the described units may also be provided in a processor; for example, this may be described as: a processor includes a to-be-recognized image acquisition unit, an image recognition unit, and a first image deletion unit.
  • the names of these units do not constitute a limitation on the unit itself in some cases.
  • the to-be-recognized image acquisition unit may also be described as "a unit that acquires an image to be recognized".
  • the present application also provides a computer-readable medium, which may be included in the server described in the above embodiments; or may exist alone without being assembled into the server.
  • the computer-readable medium carries one or more programs, and when the one or more programs are executed by the server, the server is caused to: acquire an image to be recognized; input the image to be recognized into a pre-trained screenshot image recognition model to obtain a recognition result that characterizes whether the image to be recognized is a screenshot image, where the screenshot image recognition model characterizes the correspondence between an image to be recognized and a recognition result; and, in response to the recognition result indicating that the image to be recognized is a screenshot image, delete the image to be recognized.

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Analysis (AREA)

Abstract

An image recognition method and apparatus. The method comprises: obtaining an image to be recognized (201); inputting said image to a pre-trained screenshot image recognition model to obtain a recognition result for representing whether said image is a screenshot image (202); and deleting said image in response to the recognition result representing that said image is a screenshot image (203). Recognition of an image to be recognized and deletion of a screenshot image are implemented. Due to use of a screenshot image recognition model, the verification and recognition efficiency of an image is improved compared with manual verification.

Description

Image recognition method and device
This patent application claims priority to Chinese patent application No. 201810680031.4, filed on June 27, 2018 by Beijing ByteDance Network Technology Co., Ltd. and entitled "Image Recognition Method and Device", the entire content of which is incorporated herein by reference.
Technical field
Embodiments of the present application relate to the field of computer technology, and in particular to an image recognition method and device.
Background
With the rapid development of the Internet, and especially the spread of the mobile Internet, videos and images of all kinds of content emerge in an endless stream. To supervise video and image content, the pictures and videos uploaded by users need to be reviewed.
Summary of the invention
The embodiments of the present application propose an image recognition method and device.
In a first aspect, an embodiment of the present application provides an image recognition method. The method includes: acquiring an image to be recognized; inputting the image to be recognized into a pre-trained screenshot image recognition model to obtain a recognition result that characterizes whether the image to be recognized is a screenshot image, where the screenshot image recognition model characterizes the correspondence between an image to be recognized and a recognition result; and, in response to the recognition result indicating that the image to be recognized is a screenshot image, deleting the image to be recognized.
In some embodiments, in response to the recognition result indicating that the image to be recognized is not a screenshot image, information indicating that the image to be recognized is not a screenshot image is pushed.
In some embodiments, before the image to be recognized is acquired, the method includes: acquiring a target image; and cropping a preset region of the target image as the image to be recognized.
In some embodiments, before the image to be recognized is acquired, the method includes: acquiring a frame sequence of a target video; and selecting a target frame from the frame sequence of the target video as the image to be recognized.
In some embodiments, the screenshot image recognition model is trained through the following steps: acquiring a training sample set, where each training sample includes a sample image and annotation information that characterizes whether the sample image is a screenshot image; and taking the sample images of the training samples in the training sample set as input and the annotation information corresponding to each input sample image as the expected output, training to obtain the screenshot image recognition model.
In some embodiments, the method further includes: in response to the recognition result indicating that the image to be recognized is not a screenshot image, performing text recognition on the image to be recognized to obtain a recognition result; determining whether that recognition result includes preset text; and, in response to determining that the recognition result includes the preset text, deleting the image to be recognized.
In a second aspect, an embodiment of the present application provides an image recognition device. The device includes: a to-be-recognized image acquisition unit configured to acquire an image to be recognized; a recognition unit configured to input the image to be recognized into a pre-trained screenshot image recognition model to obtain a recognition result that characterizes whether the image to be recognized is a screenshot image, where the screenshot image recognition model characterizes the correspondence between an image to be recognized and a recognition result; and a first deletion unit configured to delete the image to be recognized in response to the recognition result indicating that the image to be recognized is a screenshot image.
In some embodiments, the device further includes: a pushing unit configured to push, in response to the recognition result indicating that the image to be recognized is not a screenshot image, information indicating that the image to be recognized is not a screenshot image.
In some embodiments, the device further includes: a target image acquisition unit configured to acquire a target image; and a cropping unit configured to crop a preset region of the target image as the image to be recognized.
In some embodiments, the device further includes: a frame sequence acquisition unit configured to acquire a frame sequence of a target video; and a selection unit configured to select a target frame from the frame sequence of the target video as the image to be recognized.
In some embodiments, the screenshot image recognition model is trained through the following steps: acquiring a training sample set, where each training sample includes a sample image and annotation information that characterizes whether the sample image is a screenshot image; and taking the sample images of the training samples in the training sample set as input and the annotation information corresponding to each input sample image as the expected output, training to obtain the screenshot image recognition model.
In some embodiments, the device further includes: a recognition unit configured to perform, in response to the recognition result indicating that the image to be recognized is not a screenshot image, text recognition on the image to be recognized to obtain a recognition result; a determination unit configured to determine whether that recognition result includes preset text; and a second deletion unit configured to delete the image to be recognized in response to determining that the recognition result includes the preset text.
In a third aspect, an embodiment of the present application provides an electronic device. The electronic device includes: one or more processors; and a storage device on which one or more programs are stored. When the one or more programs are executed by the one or more processors, the one or more processors implement the method described in any implementation of the first aspect.
In a fourth aspect, an embodiment of the present application provides a computer-readable medium on which a computer program is stored. When the program is executed by a processor, the method described in any implementation of the first aspect is implemented.
The image recognition method and device provided by the embodiments of the present application recognize an image to be recognized by means of a screenshot image recognition model. If the image to be recognized is a screenshot image, it is deleted. Recognition of the image to be recognized and deletion of screenshot images are thereby achieved. Because a screenshot image recognition model is used, the efficiency of reviewing and recognizing images is improved compared with manual review.
Brief description of the drawings
Other features, objects, and advantages of the present application will become more apparent from the following detailed description of non-limiting embodiments, read with reference to the accompanying drawings:
FIG. 1 is a diagram of an exemplary system architecture to which an embodiment of the present application can be applied;
FIG. 2 is a flowchart of an embodiment of the image recognition method according to the present application;
FIG. 3 is a schematic diagram of an application scenario of the image recognition method according to the present application;
FIG. 4 is a flowchart of another embodiment of the image recognition method according to the present application;
FIG. 5 is a schematic structural diagram of an embodiment of the image recognition device according to the present application;
FIG. 6 is a schematic structural diagram of a computer system suitable for implementing the server of an embodiment of the present application.
Detailed description
The present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the related invention and do not limit it. It should also be noted that, for ease of description, the drawings show only the parts related to the invention.
It should be noted that, where no conflict arises, the embodiments of the present application and the features of those embodiments may be combined with one another. The present application is described in detail below with reference to the drawings and in conjunction with the embodiments.
FIG. 1 shows an exemplary system architecture 100 to which the image recognition method or image recognition device of an embodiment of the present application can be applied.
As shown in FIG. 1, the system architecture 100 may include terminal devices 101, 102, and 103, a network 104, and a server 105. The network 104 is the medium that provides communication links between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired or wireless communication links or fiber-optic cables.
A user may use the terminal devices 101, 102, 103 to interact with the server 105 through the network 104 in order to receive or send messages and the like. Various communication client applications may be installed on the terminal devices 101, 102, 103, such as photography applications, picture processing applications, instant messaging tools, email clients, and social platform software.
The terminal devices 101, 102, 103 may be hardware or software. When they are hardware, they may be any of various electronic devices that support storing and transmitting images, including but not limited to smartphones, tablet computers, laptop computers, and desktop computers. When they are software, they may be installed in the electronic devices listed above, and may be implemented either as multiple pieces of software or software modules (for example, to provide distributed services) or as a single piece of software or software module. No specific limitation is made here.
The server 105 may be a server that provides various services, such as a background server that processes images stored in the terminal devices 101, 102, 103. The background server may process a received image (for example, recognize whether it is a screenshot image) and take corresponding action according to the processing result (for example, the recognition result).
It should be noted that the image recognition method provided by the embodiments of the present application is generally executed by the server 105; accordingly, the image recognition device is generally provided in the server 105.
It should be noted that the server may be hardware or software. When the server is hardware, it may be implemented as a distributed server cluster composed of multiple servers or as a single server. When the server is software, it may be implemented as multiple pieces of software or software modules (for example, to provide distributed services) or as a single piece of software or software module. No specific limitation is made here.
It should be understood that the numbers of terminal devices, networks, and servers in FIG. 1 are merely illustrative. There may be any number of terminal devices, networks, and servers, as the implementation requires.
With continued reference to FIG. 2, a flow 200 of an embodiment of the image recognition method according to the present application is shown. The image recognition method includes the following steps:
Step 201: acquire an image to be recognized.
In this embodiment, the execution subject of the image recognition method (for example, the server shown in FIG. 1) may acquire the image to be recognized from a terminal through a wired or wireless connection. The image to be recognized may also be stored locally on the execution subject, in which case the execution subject may acquire it directly from local storage. The image to be recognized may be any image that needs to be recognized. In practice, it may be specified by a technician or selected according to certain conditions.
In some optional implementations of this embodiment, before the image to be recognized is acquired, the method may include: acquiring a frame sequence of a target video; and selecting a target frame from the frame sequence of the target video as the image to be recognized.
In these implementations, the target video may be any video. The target video may be specified by a technician or selected according to certain conditions. The target frame may be at least one frame in the frame sequence. The target frame may be specified by a technician or selected according to certain conditions. As an example, the condition may be: extract one frame every 2 seconds.
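A minimal sketch of the "one frame every 2 seconds" condition, assuming the frame count and frame rate of the target video are already known. The function name `select_frame_indices` and its parameters are hypothetical and do not come from the application; decoding the video itself is out of scope here.

```python
from typing import List


def select_frame_indices(frame_count: int, fps: float,
                         interval_seconds: float = 2.0) -> List[int]:
    """Return the indices of the frames sampled every `interval_seconds`."""
    if frame_count <= 0 or fps <= 0 or interval_seconds <= 0:
        return []
    # Number of frames between two consecutive samples (at least 1).
    step = max(1, round(fps * interval_seconds))
    return list(range(0, frame_count, step))


# A 10-second clip at 25 frames per second, sampled every 2 seconds.
print(select_frame_indices(250, 25.0))  # [0, 50, 100, 150, 200]
```

Each returned index identifies a target frame that could then be treated as an image to be recognized.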
Step 202: input the image to be recognized into a pre-trained screenshot image recognition model to obtain a recognition result that characterizes whether the image to be recognized is a screenshot image.
In this embodiment, the execution subject may input the image to be recognized into the pre-trained screenshot image recognition model, thereby obtaining a recognition result that characterizes whether the image to be recognized is a screenshot image. A screenshot image may be an image that records the content displayed on the screen of an electronic device. The recognition result may take many forms. For example, a number may indicate whether the image to be recognized is a screenshot image: "1" may indicate that it is a screenshot image, and "0" that it is not. As another example, the recognition result may be a value between 0 and 1 that represents the probability that the image to be recognized is a screenshot image. The recognition result may also be text, characters, and so on. The form of the recognition result is not limited here.
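Because the recognition result may be a 0/1 flag, a probability, or text, downstream code usually has to normalize it. The sketch below shows one way to do so; the 0.5 threshold and the accepted text labels are assumptions chosen for illustration and are not specified in the application.

```python
def is_screenshot(result, threshold: float = 0.5) -> bool:
    """Normalize a recognition result (flag, probability, or text) to a bool."""
    if isinstance(result, bool):
        return result
    if isinstance(result, (int, float)):
        # Covers both the 0/1 flag and a probability in [0, 1].
        return result >= threshold
    if isinstance(result, str):
        # Hypothetical text labels; a real model would define its own.
        return result.strip().lower() in {"1", "true", "yes", "screenshot"}
    raise TypeError(f"unsupported recognition result: {result!r}")


print(is_screenshot(1), is_screenshot(0.2), is_screenshot("screenshot"))
```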
In this embodiment, the screenshot image recognition model characterizes the correspondence between an image to be recognized and a recognition result. As an example, the screenshot image recognition model may be a correspondence table that stores a large number of images (both screenshot and non-screenshot images) together with the recognition result corresponding to each image. The correspondence table may be generated from statistics over a large number of images and recognition results. The execution subject can then match the image to be recognized against the images in the correspondence table and determine an image in the table whose matching degree with the image to be recognized exceeds a preset threshold (for example, 95%). The recognition result corresponding to the determined image may then be used as the recognition result of the image to be recognized.
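The correspondence-table variant can be sketched as follows. The application does not say how the matching degree is computed, so the fraction of identical pixels between two equally sized images is used here purely as an assumed stand-in; `matching_degree` and `lookup` are hypothetical names.

```python
from typing import List, Optional, Sequence, Tuple

Pixels = Sequence[int]  # a flattened grayscale image


def matching_degree(a: Pixels, b: Pixels) -> float:
    """Fraction of positions where two equally sized images agree."""
    if len(a) != len(b) or not a:
        return 0.0
    return sum(1 for x, y in zip(a, b) if x == y) / len(a)


def lookup(table: List[Tuple[Pixels, int]], image: Pixels,
           threshold: float = 0.95) -> Optional[int]:
    """Return the stored result of the best match above `threshold`, if any."""
    best_score, best_result = 0.0, None
    for stored_image, result in table:
        score = matching_degree(stored_image, image)
        if score > threshold and score > best_score:
            best_score, best_result = score, result
    return best_result


table = [((0,) * 100, 1), ((255,) * 100, 0)]  # 1 = screenshot, 0 = not
print(lookup(table, (0,) * 97 + (9,) * 3))     # 97% match with the first entry
```

When no stored image exceeds the threshold, the sketch returns `None`; how such a case is handled is not described in the application.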
In this embodiment, the screenshot image recognition model may also be a neural network. A neural network abstracts the neuron network of the human brain from the perspective of information processing, builds a simple model, and forms different networks according to different connection patterns. It usually consists of a large number of interconnected nodes (also called neurons), each of which represents a specific output function called an activation function. Each connection between two nodes carries a weighting value for the signal passing through that connection, called a weight (also called a parameter), and the output of the network varies with the network's connection pattern, weight values, and activation functions. A neural network usually includes multiple layers, and each layer includes multiple nodes. Generally, nodes in the same layer may share the same weights, while nodes in different layers may have different weights, so the parameters of the layers of a neural network may also differ.
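To make the terms "node", "weight", and "activation function" concrete, the following sketch computes the output of a single node and of one fully connected layer. The sigmoid activation is an assumption made for illustration; the application does not name a particular activation function.

```python
import math
from typing import Callable, List, Sequence


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def neuron_output(inputs: Sequence[float], weights: Sequence[float], bias: float,
                  activation: Callable[[float], float] = sigmoid) -> float:
    """Weighted sum of the incoming signals passed through the activation."""
    return activation(sum(w * x for w, x in zip(weights, inputs)) + bias)


def layer_output(inputs: Sequence[float], weight_rows: Sequence[Sequence[float]],
                 biases: Sequence[float]) -> List[float]:
    """One layer: every node sees the same inputs but has its own weights."""
    return [neuron_output(inputs, w, b) for w, b in zip(weight_rows, biases)]


print(layer_output([1.0, 2.0], [[0.5, -0.25], [0.0, 0.0]], [0.0, 0.0]))
```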
Step 203: in response to the recognition result indicating that the image to be recognized is a screenshot image, delete the image to be recognized.
In this embodiment, in response to the recognition result indicating that the image to be recognized is a screenshot image, the execution subject may delete the image to be recognized.
In some optional implementations of this embodiment, in response to the recognition result indicating that the image to be recognized is not a screenshot image, the execution subject may also push information indicating that the image to be recognized is not a screenshot image.
In some optional implementations of this embodiment, in response to the recognition result indicating that the image to be recognized is not a screenshot image, text recognition is performed on the image to be recognized to obtain a recognition result; whether that recognition result includes preset text is determined; and, in response to determining that the recognition result includes the preset text, the image to be recognized is deleted.
In these implementations, in response to the recognition result indicating that the image to be recognized is not a screenshot image, the execution subject may perform text recognition on the image to be recognized by various methods to obtain a recognition result. This recognition result may be information related to the text displayed in the image to be recognized. As an example, OCR (Optical Character Recognition) technology may be used to perform text recognition on the image to be recognized and thereby obtain the text displayed in it. The execution subject may then determine whether the recognition result (for example, the obtained text) contains preset text (for example, the name of a carrier). If it does, the execution subject may delete the image to be recognized.
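The preset-text check can be sketched as below. The OCR step is deliberately left out: the sketch takes the already recognized text as a string, and the case-insensitive substring match, the function names, and the carrier name "ExampleTel" are all assumptions for illustration.

```python
from typing import Iterable


def contains_preset_text(recognized_text: str, preset_texts: Iterable[str]) -> bool:
    """Case-insensitive check for any preset text in the OCR output."""
    haystack = recognized_text.casefold()
    return any(p.casefold() in haystack for p in preset_texts if p)


def should_delete(is_screenshot: bool, recognized_text: str,
                  preset_texts: Iterable[str]) -> bool:
    """Delete screenshots outright; otherwise fall back to the text check."""
    if is_screenshot:
        return True
    return contains_preset_text(recognized_text, preset_texts)


print(should_delete(False, "ExampleTel 4G  12:30", ["exampletel"]))
```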
With continued reference to FIG. 3, FIG. 3 is a schematic diagram of an application scenario of the image recognition method according to this embodiment. In the application scenario of FIG. 3, the execution subject of the image recognition method is a server 300. The server 300 may first acquire an image 301 to be recognized from a terminal. The image 301 is then input into the pre-trained screenshot image recognition model to obtain a recognition result. If the recognition result indicates that the image 301 is a screenshot image, the image 301 is deleted.
The image recognition method provided by the above embodiment of the present application recognizes an image to be recognized by means of a screenshot image recognition model. If the image to be recognized is a screenshot image, it is deleted. Recognition of the image to be recognized and deletion of screenshot images are thereby achieved. Because a screenshot image recognition model is used, the efficiency of reviewing and recognizing images is improved compared with manual review.
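Steps 201 to 203, together with the optional push for non-screenshot images, can be sketched as a single function. The model, the delete action, and the push action are passed in as callables because the application does not prescribe them; the names and the 0.5 threshold are hypothetical.

```python
from typing import Callable, List


def review_image(image: bytes,
                 model: Callable[[bytes], float],
                 delete: Callable[[bytes], None],
                 push: Callable[[str], None],
                 threshold: float = 0.5) -> str:
    """Run the recognition model, then delete or report the image."""
    probability = model(image)           # step 202: obtain the recognition result
    if probability >= threshold:         # step 203: screenshot, so delete it
        delete(image)
        return "deleted"
    push("image is not a screenshot")    # optional push for non-screenshots
    return "kept"


deleted: List[bytes] = []
messages: List[str] = []
# Stand-in model: treats any image whose bytes start with b"S" as a screenshot.
stub_model = lambda img: 0.9 if img.startswith(b"S") else 0.1
print(review_image(b"S-image", stub_model, deleted.append, messages.append))
print(review_image(b"photo", stub_model, deleted.append, messages.append))
```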
With further reference to FIG. 4, a flow 400 of another embodiment of the image recognition method is shown. The flow 400 of the image recognition method includes the following steps:
Step 401: acquire a target image.
In this embodiment, the execution subject of the image recognition method may acquire the target image from a terminal through a wired or wireless connection. The target image may be any image. In practice, it may be specified by a technician or selected according to preset conditions. The target image may also be stored locally on the execution subject, in which case the execution subject may acquire it directly from local storage.
Step 402: crop a preset region of the target image as the image to be recognized.
In this embodiment, the execution subject may crop a preset region of the target image as the image to be recognized. The preset region may be part or all of the target image; for example, it may be the upper fifth of the image. In practice, the execution subject may crop the preset region of the target image in various ways, for example by means of a screenshot application or a picture processing application.
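A minimal sketch of cropping the upper fifth of a target image, assuming the image is held as a list of pixel rows. The helper name `crop_top_fraction` is hypothetical; a real system would more likely use an image library.

```python
from typing import List


def crop_top_fraction(image: List[List[int]], fraction: float = 0.2) -> List[List[int]]:
    """Return a copy of the top `fraction` of the rows of `image`."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    if not image:
        return []
    rows = max(1, int(len(image) * fraction))  # keep at least one row
    return [row[:] for row in image[:rows]]


target = [[r] * 3 for r in range(10)]   # a 10-row, 3-column image
print(crop_top_fraction(target))        # the first 2 rows
```

Setting `fraction` to 1 returns the whole image, which matches the statement that the preset region may be all of the target image.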
Step 403: acquire the image to be recognized.
In this embodiment, the execution subject may acquire the image to be recognized that was cropped in step 402. Because the image to be recognized was cropped in step 402, it can generally be acquired directly from local storage.
Step 404: input the image to be recognized into a pre-trained screenshot image recognition model to obtain a recognition result that characterizes whether the image to be recognized is a screenshot image.
In this embodiment, the screenshot image recognition model may be a model obtained by using a machine learning method to train an image classification network, such as a convolutional neural network (CNN), on multiple training samples. A convolutional neural network may be a feed-forward neural network whose artificial neurons respond to surrounding units within a limited coverage area, and it performs very well in image processing. A convolutional neural network may include convolutional layers, pooling layers, unpooling layers, and deconvolution layers. A convolutional layer may be used to extract image features. A pooling layer may be used to downsample the input information. An unpooling layer may be used to upsample the input information. A deconvolution layer deconvolves the input information, processing it with the transpose of the convolution kernel of a convolutional layer as its own kernel.
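The roles of the convolutional layer (feature extraction) and the pooling layer (downsampling) can be illustrated on a tiny example. This is a plain-Python sketch with assumed choices: no padding, stride 1 for the convolution, and non-overlapping max pooling. As in most deep learning libraries, the "convolution" here does not flip the kernel, so it is strictly a cross-correlation.

```python
from typing import List

Matrix = List[List[float]]


def conv2d(image: Matrix, kernel: Matrix) -> Matrix:
    """Slide `kernel` over `image` (no padding, stride 1)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(image) - kh + 1, len(image[0]) - kw + 1
    return [[sum(image[i + di][j + dj] * kernel[di][dj]
                 for di in range(kh) for dj in range(kw))
             for j in range(out_w)]
            for i in range(out_h)]


def max_pool2d(feature_map: Matrix, size: int = 2) -> Matrix:
    """Downsample by taking the maximum of each non-overlapping window."""
    return [[max(feature_map[i + di][j + dj]
                 for di in range(size) for dj in range(size))
             for j in range(0, len(feature_map[0]) - size + 1, size)]
            for i in range(0, len(feature_map) - size + 1, size)]


image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
features = conv2d(image, [[1, 0], [0, 1]])
print(features)              # [[6, 8], [12, 14]]
print(max_pool2d(features))  # [[14]]
```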
As an example, the screenshot image recognition model may be trained through the following steps:
In the first step, a training sample set is acquired, where each training sample includes a sample image and annotation information indicating whether the sample image is a screenshot image. In practice, whether each sample image is a screenshot image may be labeled manually, which yields the annotation information for each sample image. The annotation information may take various forms. As one example, it may be a numerical value: "0" indicates that the image is not a screenshot image, and "1" indicates that it is. As another example, the annotation information may be text, characters, and so on.
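One possible in-memory representation of such a training sample is sketched below. The class name `TrainingSample`, the field names, and the text labels accepted by `encode_label` are all hypothetical; the application only requires that each sample pair an image with annotation information.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingSample:
    image_path: str     # hypothetical: location of the sample image
    is_screenshot: int  # 1 = screenshot image, 0 = not a screenshot image

    def __post_init__(self) -> None:
        # Reject anything other than the two numeric labels described in the text.
        if self.is_screenshot not in (0, 1):
            raise ValueError("label must be 0 or 1")


def encode_label(text_label: str) -> int:
    """Map a textual annotation (hypothetical vocabulary) to the numeric form."""
    mapping = {"screenshot": 1, "not_screenshot": 0}
    return mapping[text_label]
```

Normalizing textual annotations to 0 and 1 at load time keeps the later loss computation simple, since the expected output is then a plain number.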
In the second step, the sample images of the training samples in the training sample set are used as input, the annotation information corresponding to each input sample image is used as the expected output, and the screenshot image recognition model is obtained by training.
Specifically, the sample image of a training sample may be input into an initial image classification network. The initial image classification network may be any of various image classification networks, for example a deep residual network (ResNet) or VGG. VGG is a classification model proposed by the Visual Geometry Group (VGG) of a university. In practice, initial values may be set for the initial image classification network, for example a set of different small random numbers. "Small" keeps the network from entering saturation because of overly large weights, which would cause training to fail; "different" ensures that the network can learn normally. The recognition result for the input sample image can then be obtained. With the annotation information corresponding to the input sample image as the expected output of the initial image classification network, the network is trained using a machine learning method. Specifically, a preset loss function may first be used to compute the difference between the obtained recognition result and the annotation information. The parameters of the initial image classification network may then be adjusted based on that difference. When a preset training end condition is satisfied, training ends and the trained initial image classification network is used as the screenshot image recognition model. The training end condition includes, but is not limited to, at least one of the following: the training time exceeds a preset duration; the number of training iterations reaches a preset number; the computed difference is less than a preset difference threshold.
Various methods may be used to adjust the parameters of the initial image classification network based on the difference between the obtained recognition result and the annotation information corresponding to the input training sample. For example, the back propagation (BP) algorithm or the stochastic gradient descent (SGD) algorithm may be used.
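The training procedure and its three end conditions can be sketched as follows. This is a simplified illustration under stated assumptions, not the application's implementation: a one-layer logistic model over numeric feature vectors stands in for the image classification network, binary cross-entropy stands in for the preset loss function, and all names and default values (`train`, `learning_rate`, `max_steps`, `max_seconds`, `loss_threshold`) are hypothetical. In a multi-layer network, back propagation would supply the gradients that are computed directly here.

```python
import math
import random
import time
from typing import List, Sequence, Tuple

Sample = Tuple[Sequence[float], int]  # (feature vector standing in for an image, 0/1 label)


def predict(weights: Sequence[float], bias: float, features: Sequence[float]) -> float:
    """Probability that the sample is a screenshot, from a one-layer logistic model."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    z = max(-60.0, min(60.0, z))  # keep math.exp within range
    return 1.0 / (1.0 + math.exp(-z))


def cross_entropy(p: float, label: int) -> float:
    """Assumed loss function: difference between the prediction and the annotation."""
    p = min(max(p, 1e-12), 1.0 - 1e-12)
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))


def train(samples: List[Sample], learning_rate: float = 0.5, max_steps: int = 5000,
          max_seconds: float = 5.0, loss_threshold: float = 0.05,
          seed: int = 0) -> Tuple[List[float], float, str]:
    """Train with SGD; return the weights, the bias, and which end condition fired."""
    rng = random.Random(seed)
    # Different small random initial weights, as the text recommends.
    weights = [rng.uniform(-0.01, 0.01) for _ in samples[0][0]]
    bias = 0.0
    start = time.monotonic()
    for _ in range(max_steps):
        # End condition: training time exceeds the preset duration.
        if time.monotonic() - start > max_seconds:
            return weights, bias, "max_seconds"
        # End condition: computed difference is below the preset threshold.
        mean_loss = sum(cross_entropy(predict(weights, bias, x), y)
                        for x, y in samples) / len(samples)
        if mean_loss < loss_threshold:
            return weights, bias, "loss_threshold"
        # One SGD update on a single randomly chosen sample.
        features, label = rng.choice(samples)
        error = predict(weights, bias, features) - label  # gradient of the loss w.r.t. z
        weights = [w - learning_rate * error * x for w, x in zip(weights, features)]
        bias -= learning_rate * error
    # End condition: number of training iterations reaches the preset number.
    return weights, bias, "max_steps"
```

The third return value reports which end condition stopped training, which makes it easy to tell a converged run from one that ran out of time or iterations. Evaluating the mean loss over the whole sample set at every step is affordable only for tiny data sets; a practical system would check it less often.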
It should be noted that the execution subject of the training steps may be the same as or different from the execution subject of the image recognition method. If they are the same, the execution subject may, after training the screenshot image recognition model, store the network structure and parameter values of the trained model locally. If they are different, the execution subject of the training steps may, after training the screenshot image recognition model, send the network structure and parameter values of the model to the execution subject of the image recognition method.
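Whether the model is stored locally or sent to another execution subject, its network structure and parameter values need a serialized form. The sketch below assumes JSON as that form; the function names `export_model` and `import_model` and the dictionary layout are hypothetical, and the application does not prescribe any particular format.

```python
import json
from typing import Any, Dict, List, Tuple


def export_model(structure: Dict[str, Any], parameters: Dict[str, List[float]]) -> str:
    """Serialize the network structure and parameter values to a JSON string."""
    return json.dumps({"structure": structure, "parameters": parameters}, sort_keys=True)


def import_model(payload: str) -> Tuple[Dict[str, Any], Dict[str, List[float]]]:
    """Recover the network structure and parameter values from a JSON string."""
    data = json.loads(payload)
    return data["structure"], data["parameters"]
```

The same string can be written to a local file or sent over a network connection, so one code path covers both cases described above.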
Step 405: in response to the recognition result indicating that the image to be identified is a screenshot image, delete the image to be identified.
For the specific processing of step 405 and its technical effects, refer to step 203 of the embodiment corresponding to FIG. 2; details are not repeated here.
As can be seen from FIG. 4, compared with the embodiment corresponding to FIG. 2, the flow 400 of the image recognition method in this embodiment adds an image cropping step, which reduces unnecessary interfering information in the image and improves image recognition accuracy.
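The overall flow (crop, recognize, delete) can be sketched end to end as follows. Everything here is an assumption made for illustration: images are lists of pixel rows kept in a dictionary that stands in for storage, the trained model is passed in as a callable named `is_screenshot`, and deletion removes the stored entry from which the image to be identified was cropped.

```python
from typing import Callable, Dict, List

Image = List[List[int]]  # hypothetical stand-in: rows of grayscale pixel values


def process_target_image(image_id: str, store: Dict[str, Image],
                         is_screenshot: Callable[[Image], bool],
                         top_fraction: float = 0.2) -> bool:
    """Run the flow on one stored image; return True if the image was deleted."""
    target = store[image_id]                         # acquire the target image
    rows = max(1, int(len(target) * top_fraction))
    to_identify = target[:rows]                      # steps 402-403: crop and acquire
    if is_screenshot(to_identify):                   # step 404: recognition result
        del store[image_id]                          # step 405: delete
        return True
    return False
```

In the test harness for this sketch, a stub that reports a screenshot whenever the first pixel row is uniform plays the role of the trained model; a real deployment would call the trained screenshot image recognition model instead.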
With further reference to FIG. 5, as an implementation of the methods shown in the figures above, this application provides an embodiment of an image recognition apparatus. The apparatus embodiment corresponds to the method embodiment shown in FIG. 2, and the apparatus may be applied in various electronic devices.
As shown in FIG. 5, the image recognition apparatus 500 of this embodiment includes a to-be-identified image acquisition unit 501, an image recognition unit 502, and a first deletion unit 503. The to-be-identified image acquisition unit 501 is configured to acquire an image to be identified. The image recognition unit 502 is configured to input the image to be identified into a pre-trained screenshot image recognition model to obtain a recognition result indicating whether the image to be identified is a screenshot image, where the screenshot image recognition model characterizes the correspondence between images to be identified and recognition results. The first deletion unit 503 is configured to delete the image to be identified in response to the recognition result indicating that the image to be identified is a screenshot image.
In this embodiment, for the specific processing of the to-be-identified image acquisition unit 501, the image recognition unit 502, and the first deletion unit 503 of the image recognition apparatus 500, and for the technical effects they bring, refer to the descriptions of steps 201-203 in the embodiment corresponding to FIG. 2; details are not repeated here.
In some optional implementations of this embodiment, the apparatus 500 may further include a pushing unit (not shown in the figure). The pushing unit is configured to, in response to the recognition result indicating that the image to be identified is not a screenshot image, push information indicating that the image to be identified is not a screenshot image.
In some optional implementations of this embodiment, the apparatus 500 may further include a target image acquisition unit (not shown in the figure) and a cropping unit (not shown in the figure). The target image acquisition unit is configured to acquire a target image. The cropping unit is configured to crop a preset region of the target image to serve as the image to be identified.
In some optional implementations of this embodiment, the apparatus 500 further includes a frame sequence acquisition unit and a selection unit. The frame sequence acquisition unit is configured to acquire the frame sequence of a target video. The selection unit is configured to select a target frame from the frame sequence of the target video to serve as the image to be identified.
In some optional implementations of this embodiment, the screenshot image recognition model is trained through the following steps: acquiring a training sample set, where each training sample includes a sample image and annotation information indicating whether the sample image is a screenshot image; and using the sample images of the training samples in the training sample set as input and the annotation information corresponding to each input sample image as the expected output, training to obtain the screenshot image recognition model.
In some optional implementations of this embodiment, the apparatus 500 may further include a recognition unit (not shown in the figure), a determination unit (not shown in the figure), and a second deletion unit (not shown in the figure). The recognition unit is configured to, in response to the recognition result indicating that the image to be identified is not a screenshot image, perform text recognition on the image to be identified to obtain a text recognition result. The determination unit is configured to determine whether the text recognition result includes preset text. The second deletion unit is configured to delete the image to be identified in response to determining that the text recognition result includes the preset text.
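The two deletion paths (screenshot recognition first, then text recognition against preset text) can be sketched as a single decision function. The names `should_delete`, `is_screenshot`, `recognize_text`, and `preset_texts` are hypothetical, and both recognizers are passed in as callables because the application does not tie the units to a particular model or text recognition engine.

```python
from typing import Any, Callable, Iterable


def should_delete(image: Any,
                  is_screenshot: Callable[[Any], bool],
                  recognize_text: Callable[[Any], str],
                  preset_texts: Iterable[str]) -> bool:
    """Return True if the image should be deleted by either deletion unit."""
    # First deletion path: the model reports a screenshot image.
    if is_screenshot(image):
        return True
    # Second deletion path: text recognition runs only on non-screenshot images.
    text = recognize_text(image)
    # Empty preset strings are ignored so they cannot match every image.
    return any(preset and preset in text for preset in preset_texts)
```

Text recognition is invoked only when the image is not a screenshot, which mirrors the "in response to" ordering of the recognition, determination, and second deletion units.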
In this embodiment, the image recognition unit 502 inputs the image to be identified acquired by the to-be-identified image acquisition unit 501 into the pre-trained screenshot image recognition model, which recognizes the image. If the image to be identified is a screenshot image, the first deletion unit 503 deletes it. The image to be identified is thus recognized, and screenshot images are deleted. Because a screenshot image recognition model is used, image review and recognition are more efficient than with manual review.
Referring now to FIG. 6, a schematic structural diagram is shown of a computer system 600 suitable for implementing a server of an embodiment of this application. The server shown in FIG. 6 is only an example and should not impose any limitation on the functions or scope of use of the embodiments of this application.
As shown in FIG. 6, the computer system 600 includes a central processing unit (CPU) 601, which can perform various appropriate actions and processing according to a program stored in a read-only memory (ROM) 602 or a program loaded from a storage portion 608 into a random access memory (RAM) 603. The RAM 603 also stores various programs and data required for the operation of the system 600. The CPU 601, the ROM 602, and the RAM 603 are connected to one another through a bus 604. An input/output (I/O) interface 605 is also connected to the bus 604.
The following components are connected to the I/O interface 605: an input portion 606 including a keyboard, a mouse, and the like; an output portion 607 including a cathode ray tube (CRT) or liquid crystal display (LCD), a speaker, and the like; a storage portion 608 including a hard disk and the like; and a communication portion 609 including a network interface card such as a LAN card or a modem. The communication portion 609 performs communication processing via a network such as the Internet. A drive 610 is also connected to the I/O interface 605 as needed. A removable medium 611, such as a magnetic disk, an optical disc, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 610 as needed, so that a computer program read from it can be installed into the storage portion 608 as needed.
In particular, according to embodiments of the present disclosure, the process described above with reference to the flowchart may be implemented as a computer software program. For example, embodiments of the present disclosure include a computer program product that includes a computer program carried on a computer-readable medium, the computer program containing program code for performing the method shown in the flowchart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication portion 609, and/or installed from the removable medium 611. When the computer program is executed by the central processing unit (CPU) 601, the functions defined in the method of this application are performed.
It should be noted that the computer-readable medium described in this application may be a computer-readable signal medium, a computer-readable storage medium, or any combination of the two. A computer-readable storage medium may be, for example but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of these. More specific examples of computer-readable storage media may include, but are not limited to: an electrical connection with one or more wires, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of these. In this application, a computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by, or in combination with, an instruction execution system, apparatus, or device. A computer-readable signal medium, by contrast, may include a data signal propagated in baseband or as part of a carrier wave and carrying computer-readable program code. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of these. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium that can send, propagate, or transmit a program for use by, or in combination with, an instruction execution system, apparatus, or device. Program code contained on a computer-readable medium may be transmitted over any appropriate medium, including but not limited to wireless, wire, optical cable, RF, and the like, or any suitable combination of these.
Computer program code for performing the operations of this application may be written in one or more programming languages or a combination thereof, including object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. Where a remote computer is involved, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
The flowcharts and block diagrams in the accompanying drawings illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of this application. In this regard, each block in a flowchart or block diagram may represent a module, a program segment, or a portion of code that contains one or more executable instructions for implementing the specified logical function. It should also be noted that in some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the drawings. For example, two blocks shown in succession may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functions involved. It should also be noted that each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
The units described in the embodiments of this application may be implemented in software or in hardware. The described units may also be provided in a processor; for example, a processor may be described as including a to-be-identified image acquisition unit, an image recognition unit, and a first deletion unit. The names of these units do not, in some cases, limit the units themselves; for example, the to-be-identified image acquisition unit may also be described as "a unit that acquires an image to be identified".
In another aspect, this application also provides a computer-readable medium, which may be included in the server described in the above embodiments, or may exist separately without being assembled into the server. The computer-readable medium carries one or more programs that, when executed by the server, cause the server to: acquire an image to be identified; input the image to be identified into a pre-trained screenshot image recognition model to obtain a recognition result indicating whether the image to be identified is a screenshot image, where the screenshot image recognition model characterizes the correspondence between images to be identified and recognition results; and, in response to the recognition result indicating that the image to be identified is a screenshot image, delete the image to be identified.
The above description covers only preferred embodiments of this application and explains the technical principles applied. Those skilled in the art should understand that the scope of the invention involved in this application is not limited to technical solutions formed by the specific combination of the above technical features. It also covers other technical solutions formed by any combination of the above technical features or their equivalents without departing from the inventive concept, for example technical solutions formed by replacing the above features with (but not limited to) technical features having similar functions disclosed in this application.

Claims (14)

  1. An image recognition method, comprising:
    acquiring an image to be identified;
    inputting the image to be identified into a pre-trained screenshot image recognition model to obtain a recognition result indicating whether the image to be identified is a screenshot image, wherein the screenshot image recognition model characterizes the correspondence between images to be identified and recognition results;
    in response to the recognition result indicating that the image to be identified is a screenshot image, deleting the image to be identified.
  2. The method according to claim 1, wherein the method further comprises:
    in response to the recognition result indicating that the image to be identified is not a screenshot image, pushing information indicating that the image to be identified is not a screenshot image.
  3. The method according to claim 1, wherein, before the acquiring of the image to be identified, the method comprises:
    acquiring a target image;
    cropping a preset region of the target image to serve as the image to be identified.
  4. The method according to claim 1, wherein, before the acquiring of the image to be identified, the method comprises:
    acquiring a frame sequence of a target video;
    selecting a target frame from the frame sequence of the target video to serve as the image to be identified.
  5. The method according to any one of claims 1-4, wherein the screenshot image recognition model is trained through the following steps:
    acquiring a training sample set, wherein each training sample includes a sample image and annotation information indicating whether the sample image is a screenshot image;
    using the sample images of the training samples in the training sample set as input and the annotation information corresponding to each input sample image as the expected output, training to obtain the screenshot image recognition model.
  6. The method according to any one of claims 1-4, wherein the method further comprises:
    in response to the recognition result indicating that the image to be identified is not a screenshot image, performing text recognition on the image to be identified to obtain a recognition result;
    determining whether the recognition result includes preset text;
    in response to determining that the recognition result includes the preset text, deleting the image to be identified.
  7. An image recognition apparatus, comprising:
    a to-be-identified image acquisition unit configured to acquire an image to be identified;
    a recognition unit configured to input the image to be identified into a pre-trained screenshot image recognition model to obtain a recognition result indicating whether the image to be identified is a screenshot image, wherein the screenshot image recognition model characterizes the correspondence between images to be identified and recognition results;
    a first deletion unit configured to delete the image to be identified in response to the recognition result indicating that the image to be identified is a screenshot image.
  8. The apparatus according to claim 7, wherein the apparatus further comprises:
    a pushing unit configured to, in response to the recognition result indicating that the image to be identified is not a screenshot image, push information indicating that the image to be identified is not a screenshot image.
  9. The apparatus according to claim 7, wherein the apparatus further comprises:
    a target image acquisition unit configured to acquire a target image;
    a cropping unit configured to crop a preset region of the target image to serve as the image to be identified.
  10. The apparatus according to claim 7, wherein the apparatus further comprises:
    a frame sequence acquisition unit configured to acquire a frame sequence of a target video;
    a selection unit configured to select a target frame from the frame sequence of the target video to serve as the image to be identified.
  11. The apparatus according to any one of claims 7-10, wherein the screenshot image recognition model is trained through the following steps:
    acquiring a training sample set, wherein each training sample includes a sample image and annotation information indicating whether the sample image is a screenshot image;
    using the sample images of the training samples in the training sample set as input and the annotation information corresponding to each input sample image as the expected output, training to obtain the screenshot image recognition model.
  12. The apparatus according to any one of claims 7-10, wherein the apparatus further comprises:
    a recognition unit configured to, in response to the recognition result indicating that the image to be identified is not a screenshot image, perform text recognition on the image to be identified to obtain a recognition result;
    a determination unit configured to determine whether the recognition result includes preset text;
    a second deletion unit configured to delete the image to be identified in response to determining that the recognition result includes the preset text.
  13. A server, comprising:
    one or more processors;
    a storage device on which one or more programs are stored;
    wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method according to any one of claims 1-6.
  14. A computer-readable medium on which a computer program is stored, wherein the program, when executed by a processor, implements the method according to any one of claims 1-6.
PCT/CN2018/116335 2018-06-27 2018-11-20 Image recognition method and apparatus WO2020000879A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810680031.4A CN109002842A (en) 2018-06-27 2018-06-27 Image-recognizing method and device
CN201810680031.4 2018-06-27

Publications (1)

Publication Number Publication Date
WO2020000879A1 true WO2020000879A1 (en) 2020-01-02

Family

ID=64602070

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/116335 WO2020000879A1 (en) 2018-06-27 2018-11-20 Image recognition method and apparatus

Country Status (2)

Country Link
CN (1) CN109002842A (en)
WO (1) WO2020000879A1 (en)

Cited By (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111291644A (en) * 2020-01-20 2020-06-16 北京百度网讯科技有限公司 Method and apparatus for processing information
CN111310693A (en) * 2020-02-26 2020-06-19 腾讯科技(深圳)有限公司 Intelligent labeling method and device for text in image and storage medium
CN111353470A (en) * 2020-03-13 2020-06-30 北京字节跳动网络技术有限公司 Image processing method and device, readable medium and electronic equipment
CN111353434A (en) * 2020-02-28 2020-06-30 北京市商汤科技开发有限公司 Information identification method, device, system, electronic equipment and storage medium
CN111597966A (en) * 2020-05-13 2020-08-28 北京达佳互联信息技术有限公司 Expression image recognition method, device and system
CN111767918A (en) * 2020-02-21 2020-10-13 北京沃东天骏信息技术有限公司 Picture identification method and device
CN111797645A (en) * 2020-07-08 2020-10-20 北京京东振世信息技术有限公司 Method and apparatus for identifying bar code
CN111815505A (en) * 2020-07-14 2020-10-23 北京字节跳动网络技术有限公司 Method, apparatus, device and computer readable medium for processing image
CN111860284A (en) * 2020-07-15 2020-10-30 上海钧正网络科技有限公司 Safety management method, device, medium and server for battery replacement cabinet
CN111914822A (en) * 2020-07-23 2020-11-10 腾讯科技(深圳)有限公司 Text image labeling method and device, computer readable storage medium and equipment
CN111950591A (en) * 2020-07-09 2020-11-17 中国科学院深圳先进技术研究院 Model training method, interaction relation recognition method and device and electronic equipment
CN112287757A (en) * 2020-09-25 2021-01-29 北京百度网讯科技有限公司 Water body identification method and device, electronic equipment and storage medium
CN112541543A (en) * 2020-12-11 2021-03-23 深圳市优必选科技股份有限公司 Image recognition method and device, terminal equipment and storage medium
CN112905843A (en) * 2021-03-17 2021-06-04 北京文香信息技术有限公司 Information processing method and device based on video stream and storage medium
CN112989986A (en) * 2021-03-09 2021-06-18 北京京东乾石科技有限公司 Method, apparatus, device and storage medium for identifying crowd behavior
CN113221920A (en) * 2021-05-20 2021-08-06 北京百度网讯科技有限公司 Image recognition method, device, equipment, storage medium and computer program product
CN113361404A (en) * 2021-06-02 2021-09-07 北京百度网讯科技有限公司 Method, apparatus, device, storage medium and program product for recognizing text
CN113419915A (en) * 2021-07-21 2021-09-21 北京百度网讯科技有限公司 Cloud terminal desktop stillness determination method and device
CN113538450A (en) * 2020-04-21 2021-10-22 百度在线网络技术(北京)有限公司 Method and device for generating image
CN113610968A (en) * 2021-08-17 2021-11-05 北京京东乾石科技有限公司 Target detection model updating method and device
CN113643136A (en) * 2021-09-01 2021-11-12 京东科技信息技术有限公司 Information processing method, system and device

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111368902A (en) * 2020-02-28 2020-07-03 北京三快在线科技有限公司 Data labeling method and device
CN113741680A (en) * 2020-05-27 2021-12-03 北京字节跳动网络技术有限公司 Information interaction method and device
CN113546398A (en) * 2021-07-30 2021-10-26 重庆五诶科技有限公司 Chess and card game method and system based on artificial intelligence algorithm
CN113961526B (en) * 2021-11-22 2024-10-25 北京达佳互联信息技术有限公司 Method and device for detecting screen capturing picture
CN114693629A (en) * 2022-03-25 2022-07-01 北京城市网邻信息技术有限公司 Image recognition method and device, electronic equipment and readable medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1761204A (en) * 2005-11-18 2006-04-19 郑州金惠计算机系统工程有限公司 System for blocking off erotic images and unhealthy information in internet
CN103605992A (en) * 2013-11-28 2014-02-26 国家电网公司 Sensitive image recognizing method in interaction of inner and outer power networks
CN106446932A (en) * 2016-08-30 2017-02-22 上海交通大学 Machine learning and picture identification-based evolvable prohibited picture batch processing method

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104598902A (en) * 2015-01-29 2015-05-06 百度在线网络技术(北京)有限公司 Method and device for identifying screenshot and browser
CN105654057A (en) * 2015-12-31 2016-06-08 中国建设银行股份有限公司 Picture auditing system and picture auditing method based on picture contents
CN107133629B (en) * 2016-02-29 2020-09-04 百度在线网络技术(北京)有限公司 Picture classification method and device and mobile terminal
CN106599937A (en) * 2016-12-29 2017-04-26 池州职业技术学院 Bad image filtering device
CN108124191B (en) * 2017-12-22 2019-07-12 北京百度网讯科技有限公司 A kind of video reviewing method, device and server

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
HAN ZHONGHAI: "The design and Implementation of Photo-sensitive Recognition System Based on Prohibited Gallery", CHINA MASTER'S THESES FULL-TEXT DATABASE, 15 May 2012 (2012-05-15), pages 33 - 48, ISSN: 1674-0246 *

Cited By (32)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111291644B (en) * 2020-01-20 2023-04-18 北京百度网讯科技有限公司 Method and apparatus for processing information
CN111291644A (en) * 2020-01-20 2020-06-16 北京百度网讯科技有限公司 Method and apparatus for processing information
CN111767918A (en) * 2020-02-21 2020-10-13 北京沃东天骏信息技术有限公司 Picture identification method and device
CN111310693B (en) * 2020-02-26 2023-08-29 腾讯科技(深圳)有限公司 Intelligent labeling method, device and storage medium for text in image
CN111310693A (en) * 2020-02-26 2020-06-19 腾讯科技(深圳)有限公司 Intelligent labeling method and device for text in image and storage medium
CN111353434A (en) * 2020-02-28 2020-06-30 北京市商汤科技开发有限公司 Information identification method, device, system, electronic equipment and storage medium
CN111353470A (en) * 2020-03-13 2020-06-30 北京字节跳动网络技术有限公司 Image processing method and device, readable medium and electronic equipment
CN111353470B (en) * 2020-03-13 2023-08-01 北京字节跳动网络技术有限公司 Image processing method and device, readable medium and electronic equipment
CN113538450A (en) * 2020-04-21 2021-10-22 百度在线网络技术(北京)有限公司 Method and device for generating image
CN113538450B (en) * 2020-04-21 2023-07-21 百度在线网络技术(北京)有限公司 Method and device for generating image
US11810333B2 (en) 2020-04-21 2023-11-07 Baidu Online Network Technology (Beijing) Co., Ltd. Method and apparatus for generating image of webpage content
CN111597966A (en) * 2020-05-13 2020-08-28 北京达佳互联信息技术有限公司 Expression image recognition method, device and system
CN111597966B (en) * 2020-05-13 2023-10-10 北京达佳互联信息技术有限公司 Expression image recognition method, device and system
CN111797645A (en) * 2020-07-08 2020-10-20 北京京东振世信息技术有限公司 Method and apparatus for identifying bar code
CN111950591B (en) * 2020-07-09 2023-09-01 中国科学院深圳先进技术研究院 Model training method, interaction relation recognition device and electronic equipment
CN111950591A (en) * 2020-07-09 2020-11-17 中国科学院深圳先进技术研究院 Model training method, interaction relation recognition method and device and electronic equipment
CN111815505A (en) * 2020-07-14 2020-10-23 北京字节跳动网络技术有限公司 Method, apparatus, device and computer readable medium for processing image
CN111860284A (en) * 2020-07-15 2020-10-30 上海钧正网络科技有限公司 Safety management method, device, medium and server for battery replacement cabinet
CN111914822B (en) * 2020-07-23 2023-11-17 腾讯科技(深圳)有限公司 Text image labeling method, device, computer readable storage medium and equipment
CN111914822A (en) * 2020-07-23 2020-11-10 腾讯科技(深圳)有限公司 Text image labeling method and device, computer readable storage medium and equipment
CN112287757B (en) * 2020-09-25 2024-04-26 北京百度网讯科技有限公司 Water body identification method and device, electronic equipment and storage medium
CN112287757A (en) * 2020-09-25 2021-01-29 北京百度网讯科技有限公司 Water body identification method and device, electronic equipment and storage medium
CN112541543A (en) * 2020-12-11 2021-03-23 深圳市优必选科技股份有限公司 Image recognition method and device, terminal equipment and storage medium
CN112541543B (en) * 2020-12-11 2023-11-24 深圳市优必选科技股份有限公司 Image recognition method, device, terminal equipment and storage medium
CN112989986A (en) * 2021-03-09 2021-06-18 北京京东乾石科技有限公司 Method, apparatus, device and storage medium for identifying crowd behavior
CN112905843A (en) * 2021-03-17 2021-06-04 北京文香信息技术有限公司 Information processing method and device based on video stream and storage medium
CN113221920A (en) * 2021-05-20 2021-08-06 北京百度网讯科技有限公司 Image recognition method, device, equipment, storage medium and computer program product
CN113221920B (en) * 2021-05-20 2024-01-12 北京百度网讯科技有限公司 Image recognition method, apparatus, device, storage medium, and computer program product
CN113361404A (en) * 2021-06-02 2021-09-07 北京百度网讯科技有限公司 Method, apparatus, device, storage medium and program product for recognizing text
CN113419915A (en) * 2021-07-21 2021-09-21 北京百度网讯科技有限公司 Cloud terminal desktop stillness determination method and device
CN113610968A (en) * 2021-08-17 2021-11-05 北京京东乾石科技有限公司 Target detection model updating method and device
CN113643136A (en) * 2021-09-01 2021-11-12 京东科技信息技术有限公司 Information processing method, system and device

Also Published As

Publication number Publication date
CN109002842A (en) 2018-12-14

Similar Documents

Publication Publication Date Title
WO2020000879A1 (en) Image recognition method and apparatus
WO2019242222A1 (en) Method and device for use in generating information
WO2020006963A1 (en) Method and apparatus for generating image detection model
CN108427939B (en) Model generation method and device
WO2020000876A1 (en) Model generating method and device
WO2020006961A1 (en) Image extraction method and device
US11436863B2 (en) Method and apparatus for outputting data
WO2019237657A1 (en) Method and device for generating model
CN109740018B (en) Method and device for generating video label model
CN109993150B (en) Method and device for identifying age
JP7394809B2 (en) Methods, devices, electronic devices, media and computer programs for processing video
CN109034069B (en) Method and apparatus for generating information
US11087140B2 (en) Information generating method and apparatus applied to terminal device
CN108235116B (en) Feature propagation method and apparatus, electronic device, and medium
WO2020029608A1 (en) Method and apparatus for detecting burr of electrode sheet
CN108235004B (en) Video playing performance test method, device and system
CN109583389B (en) Drawing recognition method and device
CN110070076B (en) Method and device for selecting training samples
CN110046571B (en) Method and device for identifying age
CN109816023B (en) Method and device for generating picture label model
CN108399401B (en) Method and device for detecting face image
CN109949213B (en) Method and apparatus for generating image
CN110008926B (en) Method and device for identifying age
CN111258414B (en) Method and device for adjusting screen
CN111292333B (en) Method and apparatus for segmenting an image

Legal Events

Date Code Title Description
121 Ep: the EPO has been informed by WIPO that EP was designated in this application
  Ref document number: 18923991; Country of ref document: EP; Kind code of ref document: A1
NENP Non-entry into the national phase
  Ref country code: DE
32PN Ep: public notification in the EP bulletin as address of the addressee cannot be established
  Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 06/05/2021)
122 Ep: PCT application non-entry in European phase
  Ref document number: 18923991; Country of ref document: EP; Kind code of ref document: A1