
CN111797326B - False news detection method and system integrating multi-scale visual information - Google Patents

False news detection method and system integrating multi-scale visual information

Info

Publication number
CN111797326B
Authority
CN
China
Prior art keywords
scale
frequency domain
network
representation
input image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010459132.6A
Other languages
Chinese (zh)
Other versions
CN111797326A (en
Inventor
曹娟
亓鹏
谢添
刘浩远
郭俊波
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Computing Technology of CAS
Original Assignee
Institute of Computing Technology of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Computing Technology of CAS
Priority to CN202010459132.6A
Publication of CN111797326A
Application granted
Publication of CN111797326B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9536 Search customisation based on social or collaborative filtering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Databases & Information Systems (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Collating Specific Patterns (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a false news detection method integrating multi-scale visual information, which comprises the following steps: a frequency domain characteristic acquisition step, namely constructing a frequency domain sub-network model by using a convolutional neural network, and acquiring a frequency domain characteristic representation of an input image through the frequency domain sub-network model; a semantic feature acquisition step, namely constructing a pixel domain sub-network model by using a convolutional neural network, and acquiring semantic feature representation of the input image through the pixel domain sub-network model; and an image detection step, wherein the frequency domain feature representation and the semantic feature representation are fused to obtain an image representation of the input image, and the prediction probability of the input image as a false news picture is obtained according to the image representation. The invention also provides a false news detection system fused with the multi-scale visual information, a computer readable storage medium and a data processing device comprising the computer readable storage medium.

Description

False news detection method and system integrating multi-scale visual information
Technical Field
The invention relates to the field of news credibility authentication research, in particular to a false news detection method integrating multi-scale visual information.
Background
In recent years, social media has become an important news platform thanks to its timeliness, low cost, strong interactivity and low entry threshold, and people have grown used to getting the latest news on social media and freely publishing their own views. However, the convenience and openness of social media also make it easy for false news to spread, which has many negative social impacts. For example, during the month before a large voting campaign, each participant read on average 1-3 false news items published by well-known media. Such false news inevitably misleads voters and may even affect the outcome of the vote. Therefore, how to automatically detect false news by technical means has become a problem to be solved in the self-media era.
Advances in multimedia technology have driven the transition from traditional text-based news to news based on multimedia content. Compared with text alone, multimedia content describes news events better, is more credible and attracts readers' attention more easily. However, this trend also creates new opportunities for false news. False news often uses highly misleading or even tampered pictures to attract and mislead readers, which speeds up its spread. Statistics on a microblog dataset show that more than 40% of false news items include pictures. Thus, visual content has become a non-negligible part of false news.
Existing false news detection methods mainly focus on text content and social context. With the popularity of multimedia content, researchers have begun to detect false news in combination with visual information. Work based on visual information can be divided into three categories: work based on visual statistical features, on visual forensic features, and on visual semantic features.
Work based on visual statistical features uses statistics of the pictures in news, such as the number of pictures, picture popularity and picture type, to help identify false news. But these statistics are too basic to characterize the complex visual patterns of false news.
Visual forensic features are commonly used to detect picture tampering. To verify the authenticity of news pictures, some work has used visual forensic features, such as blockiness, to help detect false news. For example, the multimedia verification task held by MediaEval in 2015 and 2016 provided 7 visual forensic features to help detect tampering and misuse of multimedia content. Based on these forensic features, L. Wu et al. designed higher-level forensic features and combined them with text features and user features to solve the news verification problem. However, most forensic features are designed manually to detect specific tampering traces, and cannot detect false news pictures that are real, untampered pictures. In addition, these manual features require expert design, are labor intensive, and cannot capture complex patterns. These limitations cause visual forensic features to perform poorly in actual false news detection tasks.
With the popularity of convolutional neural networks, most work based on multimedia content uses pre-trained deep convolutional neural networks to obtain a generic visual representation and fuses it with textual information to detect false news. Jin et al. were the first to fuse multimodal content through a deep neural network to address false news detection; Wang et al. proposed an event adversarial neural network that uses multimodal features to detect newly emerging false news events; Dhruv et al. proposed an autoencoder-based method that learns a shared representation of multimodal information for false news detection. However, this work focuses on how to fuse information from different modalities and neglects effective modeling of the visual modality itself. Lacking task-related information, the generic visual representations adopted in this work cannot reflect the essential characteristics of false news pictures, which weakens the contribution of visual content to false news detection.
Disclosure of Invention
Aiming at the problems, the invention provides a false news detection method integrating multi-scale visual information, which comprises the following steps: a frequency domain characteristic acquisition step, namely constructing a frequency domain sub-network model by using a convolutional neural network, and acquiring a frequency domain characteristic representation of an input image through the frequency domain sub-network model; a semantic feature acquisition step, namely constructing a pixel domain sub-network model by using a convolutional neural network, and acquiring semantic feature representation of the input image through the pixel domain sub-network model; and an image detection step, wherein the frequency domain feature representation and the semantic feature representation are fused to obtain an image representation of the input image, and the prediction probability of the input image as a false news picture is obtained according to the image representation.
In the false news detection method of the invention, the frequency domain feature acquisition step specifically comprises: constructing a large-scale network of the frequency domain sub-network model by using a convolutional neural network; performing a block discrete cosine transform on the input image to obtain large-scale histograms of the input image corresponding to a plurality of frequencies; sampling the large-scale histograms to obtain a plurality of large-scale multidimensional vectors; fusing the plurality of large-scale multidimensional vectors through the large-scale network to obtain a large-scale frequency domain feature representation $l_{large}$ of the input image; constructing a small-scale network of the frequency domain sub-network model by using a convolutional neural network; dividing the input image into a plurality of image blocks of the same size, and performing a block discrete cosine transform on the image blocks to obtain small-scale histograms corresponding to the image blocks at a plurality of frequencies; selecting a plurality of small-scale histograms in a high frequency band for sampling to obtain a plurality of small-scale multidimensional vectors; fusing the plurality of small-scale multidimensional vectors through the small-scale network to obtain a small-scale frequency domain feature representation $l_{small}$ of the input image; and splicing and fusing $l_{large}$ and $l_{small}$ to obtain the frequency domain feature representation $l_F$ of the input image.
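Two of the preprocessing operations named in this step, dividing an image into equally sized blocks and sampling a histogram to a fixed length, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the patent does not say how the sampling is done, so linear interpolation is used here as a stand-in, and the helper names `tile` and `resample` are hypothetical.

```python
def tile(image, size):
    """Split a 2-D image (list of rows) into non-overlapping size x size patches."""
    h, w = len(image), len(image[0])
    return [
        [row[left:left + size] for row in image[top:top + size]]
        for top in range(0, h - size + 1, size)
        for left in range(0, w - size + 1, size)
    ]


def resample(hist, length):
    """Interpolate a histogram (list of counts) onto `length` evenly spaced points.

    Assumption: linear interpolation; the source only says the histograms are "sampled".
    """
    if length == 1 or len(hist) == 1:
        return [float(hist[0])] * length
    out = []
    for i in range(length):
        pos = i * (len(hist) - 1) / (length - 1)  # fractional position in the source
        lo = int(pos)
        hi = min(lo + 1, len(hist) - 1)
        frac = pos - lo
        out.append(hist[lo] * (1 - frac) + hist[hi] * frac)
    return out


# Toy data: a 4 x 6 "image" whose pixel value is its row-major index.
image = [[r * 6 + c for c in range(6)] for r in range(4)]
patches = tile(image, 2)          # six 2 x 2 patches
vector = resample([0, 10], 5)     # a 2-bin histogram stretched to 5 values
```

In the described method the patch size would be 128 and the target length 250; the toy sizes above only keep the example readable.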
In the false news detection method of the invention, the semantic feature acquisition step specifically comprises: constructing a cyclic fusion network by using a convolutional neural network; acquiring first feature maps of the input of the cyclic fusion network at multiple scales, up-sampling the first feature maps to obtain second feature maps of the same size, and performing channel splicing on the second feature maps to obtain a global context knowledge representation as the output of the cyclic fusion network; taking the output of the cyclic fusion network in the current round as the input of the cyclic fusion network in the next round, and connecting a plurality of the cyclic fusion networks in series to form the pixel domain sub-network model; and taking the input image as the input of the pixel domain sub-network model, and taking the global context knowledge representation obtained after a preset number of iterations as the semantic feature representation $l_P$ of the input image.
In the false news detection method of the invention, the image detection step specifically comprises: obtaining the image representation $u$ from the frequency domain feature representation $l_F$ and the semantic feature representation $l_P$, $u = \alpha l_F + (1-\alpha) l_P$; projecting the image representation $u$ to the false news picture target space and the real news picture target space with a fully connected layer to obtain the prediction probability $p$, and taking the cross entropy error $L$ between the prediction probability $p$ and the true value $y$ as the loss function, $p = \mathrm{softmax}(W_c u + b_c)$, $L = -\sum [y \log p + (1-y) \log(1-p)]$; wherein $\alpha$ is a normalized weight,
$$\alpha = \frac{\exp(F(l_F))}{\exp(F(l_F)) + \exp(F(l_P))},$$
$F(l_F) = v^T \tanh(W_F l_F + b_F)$, $F(l_P) = v^T \tanh(W_F l_P + b_F)$, $W_c$ and $W_F$ are weight matrices, $b_c$ and $b_F$ are biases, $v^T$ is the transposed weight vector, and softmax and tanh are activation functions.
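The attention-weighted fusion and the loss described in this step can be traced with a small numeric sketch. It uses only the Python standard library and toy two-dimensional vectors. Two points are assumptions rather than statements of the patent: the normalized weight α is taken to be a softmax over the two scores F(l_F) and F(l_P), and index 0 of the output is taken to be the "false news picture" class. All parameter values are made up for illustration.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def matvec(W, x):
    return [dot(row, x) for row in W]


def softmax(z):
    m = max(z)  # subtract the maximum for numerical stability
    e = [math.exp(x - m) for x in z]
    s = sum(e)
    return [x / s for x in e]


def score(l, W_F, b_F, v):
    """Attention score F(l) = v^T tanh(W_F l + b_F)."""
    hidden = [math.tanh(h + b) for h, b in zip(matvec(W_F, l), b_F)]
    return dot(v, hidden)


def fuse(l_F, l_P, W_F, b_F, v):
    """u = alpha * l_F + (1 - alpha) * l_P, with alpha assumed to be a softmax weight."""
    alpha = softmax([score(l_F, W_F, b_F, v), score(l_P, W_F, b_F, v)])[0]
    u = [alpha * f + (1 - alpha) * p for f, p in zip(l_F, l_P)]
    return alpha, u


def predict(u, W_c, b_c):
    """p = softmax(W_c u + b_c)."""
    return softmax([z + b for z, b in zip(matvec(W_c, u), b_c)])


def cross_entropy(p_fake, y, eps=1e-12):
    """L = -[y log p + (1 - y) log(1 - p)] for a single sample."""
    return -(y * math.log(p_fake + eps) + (1 - y) * math.log(1 - p_fake + eps))


# Toy parameters (hypothetical values, not taken from the patent).
l_F, l_P = [1.0, 0.0], [0.0, 1.0]
W_F, b_F, v = [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0], [1.0, -1.0]
W_c, b_c = [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]

alpha, u = fuse(l_F, l_P, W_F, b_F, v)
p = predict(u, W_c, b_c)
loss = cross_entropy(p[0], y=1)  # y = 1: the picture is labeled as false news
```

With these toy values the frequency domain feature receives the larger score, so α exceeds 0.5 and u lies closer to l_F than to l_P.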
The invention also provides a false news detection system integrating multi-scale visual information, comprising: a frequency domain feature acquisition module, used for constructing a frequency domain sub-network model by using a convolutional neural network and obtaining the frequency domain feature representation of the input image through the frequency domain sub-network model; a semantic feature acquisition module, used for constructing a pixel domain sub-network model by using a convolutional neural network and obtaining the semantic feature representation of the input image through the pixel domain sub-network model; and an image detection module, used for fusing the frequency domain feature representation with the semantic feature representation to obtain the image representation of the input image, and obtaining, according to the image representation, the prediction probability that the input image is a false news picture.
In the false news detection system of the invention, the frequency domain feature acquisition module specifically comprises: a large-scale frequency domain feature representation acquisition module, used for acquiring a large-scale frequency domain feature representation of the input image, wherein a large-scale network of the frequency domain sub-network model is constructed by using a convolutional neural network; a block discrete cosine transform is performed on the input image to obtain large-scale histograms of the input image corresponding to a plurality of frequencies; the large-scale histograms are sampled to obtain a plurality of large-scale multidimensional vectors; and the plurality of large-scale multidimensional vectors are fused through the large-scale network to obtain the large-scale frequency domain feature representation $l_{large}$ of the input image; a small-scale frequency domain feature representation acquisition module, used for acquiring a small-scale frequency domain feature representation of the input image, wherein a small-scale network of the frequency domain sub-network model is constructed by using a convolutional neural network; the input image is divided into a plurality of image blocks of the same size, and a block discrete cosine transform is performed on the image blocks to obtain small-scale histograms corresponding to the image blocks at a plurality of frequencies; a plurality of small-scale histograms in a high frequency band are selected for sampling to obtain a plurality of small-scale multidimensional vectors; and the plurality of small-scale multidimensional vectors are fused through the small-scale network to obtain the small-scale frequency domain feature representation $l_{small}$ of the input image; and a splicing and fusion module, used for splicing and fusing $l_{large}$ and $l_{small}$ to obtain the frequency domain feature representation $l_F$ of the input image.
In the false news detection system of the invention, the semantic feature acquisition module specifically comprises: a cyclic fusion network construction module, used for constructing a cyclic fusion network by using a convolutional neural network, wherein first feature maps of the input of the cyclic fusion network are acquired at multiple scales, the first feature maps are up-sampled to obtain second feature maps of the same size, and channel splicing is performed on the second feature maps to obtain a global context knowledge representation as the output of the cyclic fusion network; a cyclic fusion network serial module, used for taking the output of the cyclic fusion network in the current round as the input of the cyclic fusion network in the next round, and connecting a plurality of cyclic fusion networks in series to form the pixel domain sub-network model; and a semantic feature acquisition module, used for taking the input image as the input of the pixel domain sub-network model, and taking the global context knowledge representation obtained after a preset number of iterations as the semantic feature representation $l_P$ of the input image.
In the false news detection system of the invention, the image detection module specifically comprises: an image representation acquisition module, used for obtaining the image representation $u$ from the frequency domain feature representation $l_F$ and the semantic feature representation $l_P$, $u = \alpha l_F + (1-\alpha) l_P$; and a prediction probability acquisition module, used for projecting the image representation $u$ to the false news picture target space and the real news picture target space with a fully connected layer to obtain the prediction probability $p$, and taking the cross entropy error $L$ between the prediction probability $p$ and the true value $y$ as the loss function, where $p = \mathrm{softmax}(W_c u + b_c)$, $L = -\sum [y \log p + (1-y) \log(1-p)]$; wherein $\alpha$ is a normalized weight,
$$\alpha = \frac{\exp(F(l_F))}{\exp(F(l_F)) + \exp(F(l_P))},$$
$F(l_F) = v^T \tanh(W_F l_F + b_F)$, $F(l_P) = v^T \tanh(W_F l_P + b_F)$, $W_c$ and $W_F$ are weight matrices, $b_c$ and $b_F$ are biases, $v^T$ is the transposed weight vector, and softmax and tanh are activation functions.
The present invention also proposes a computer readable storage medium storing computer executable instructions for performing false news detection incorporating multi-scale visual information as described above.
The invention also proposes a data processing apparatus comprising a computer readable storage medium as described above, a processor of the data processing apparatus retrieving and executing computer executable instructions in the computer readable storage medium to perform false news detection incorporating multi-scale visual information.
Drawings
Fig. 1 is a flow chart of a false news detection method of the present invention.
Fig. 2 is a schematic diagram of a false information detection model of the present invention.
FIG. 3 is a schematic diagram of a data processing apparatus of the present invention.
Detailed Description
In order to make the purposes, technical schemes and advantages of the invention more clear, the false news detection method and system for fusing multi-scale visual information provided by the invention are further described in detail below with reference to the accompanying drawings. It should be understood that the detailed description and specific examples, while indicating the invention, are intended for purposes of illustration only and are not intended to limit the scope of the invention.
While studying the visual patterns of false news (namely, the pictures accompanying false news), the inventors found that false news pictures include not only maliciously tampered pictures but also real pictures misused to represent unrelated events. The prior art is only suitable for modeling one particular type of false news picture and cannot capture the essential characteristics of false news pictures. The inventors found that false news pictures have distinct characteristics at the physical and semantic levels, which show up clearly in the frequency domain and the pixel domain respectively. Therefore, the invention designs a deep learning model tailored to the characteristics of false news pictures, mines the latent visual patterns of pictures in the frequency domain and the pixel domain, and represents and fuses them efficiently, thereby improving the automatic screening of false news by means of visual content.
The invention aims to effectively and automatically detect false news, and mainly solves the technical problem of how to establish an effective deep learning model for false news detection based on visual contents of news.
The key point of the invention is a deep learning model that fully captures and fuses the multi-scale visual information of a picture in the frequency domain and the pixel domain, so that false news can be detected automatically from visual content. Specifically, it includes two key points, modeling the physical characteristics of false news pictures and modeling their semantic characteristics:
1) A multi-scale Convolutional Neural Network (CNN) for the frequency domain information is designed for capturing the physical characteristics of different levels of false news pictures.
At the physical level, false news pictures tend to be of low quality, showing for example traces of multiple compression and of tampering; such traces often exhibit a certain periodicity in the frequency domain and can therefore be modeled with a CNN. For a typical false news picture, such as a tampered picture, the tampered area tends to have undergone more rounds of compression than the untampered area, so different parts of the tampered picture exhibit different compression characteristics. Therefore, in order to consider both the overall characteristics and the local anomalies of a picture, the invention designs a multi-scale CNN for frequency domain information, which captures the physical characteristics of false news pictures at different levels.
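One commonly cited reason why repeated compression leaves a periodic trace is double quantization: values quantized with one step size and then re-quantized with another can only land in some of the histogram bins, and the empty bins recur at a fixed period. The sketch below illustrates that effect on synthetic integers. It is not the patent's own procedure; the step sizes `q1 = 7` and `q2 = 3` and the value range are arbitrary choices for the demonstration.

```python
from collections import Counter


def quantize(x, step):
    """Index of the quantization bin that x falls into."""
    return round(x / step)


def double_quantized_histogram(values, q1, q2):
    """Histogram of bin indices after quantizing with q1, dequantizing, then quantizing with q2."""
    hist = Counter()
    for x in values:
        once = quantize(x, q1) * q1       # first compression: quantize and dequantize
        hist[quantize(once, q2)] += 1     # second compression with a different step
    return hist


values = range(0, 211)  # synthetic stand-in for coefficient magnitudes

single = Counter(quantize(x, 3) for x in values)
double = double_quantized_histogram(values, q1=7, q2=3)

empty_single = [b for b in range(70) if single[b] == 0]
empty_double = [b for b in range(70) if double[b] == 0]
```

Here `empty_single` is empty, while `empty_double` contains exactly the bins whose index modulo 7 is 1, 3, 4 or 6, a pattern that repeats every 7 bins. Regular structure of this kind in a coefficient histogram is something a one-dimensional convolution can pick up.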
2) A cyclic fusion network aiming at pixel domain information is designed for effectively extracting and fusing the characteristics of false news pictures on different semantic levels.
At the semantic level, false news pictures show stylistic characteristics such as strong visual impact and emotional provocation. These characteristics can appear in visual features at different levels, so multi-scale visual features should be considered together in order to model the semantic characteristics of false news pictures well. Different layers of a CNN can learn multi-scale features at different levels of abstraction, but when a CNN learns multi-scale visual features layer by layer, its limited receptive field leaves the learned features short of context information and limits their representation capability. Therefore, the invention designs a cyclic fusion network that guides the feature learning of the CNN with global context knowledge and fuses the multi-scale CNN features, thereby effectively extracting and fusing the characteristics of false news pictures at different semantic levels.
The invention is described below with reference to the drawings and the detailed description.
One of the main goals of the invention is to use visual content to automatically screen the news published by users for false information, so the specific task can be defined as judging, from its visual content, whether a piece of news is false news.
False news pictures have distinct characteristics in the frequency domain and the pixel domain. Therefore, in order to fully model the visual characteristics of false news pictures, the invention designs a deep learning model that mines the latent visual patterns of pictures in the frequency domain and the pixel domain and represents and fuses them efficiently, thereby improving the automatic screening of false news by means of visual content.
Fig. 1 is a flow chart of a false news detection method of the present invention. As shown in fig. 1, the false news detection method of the present invention includes:
Step S1, constructing a frequency domain sub-network model by using a convolutional neural network, and obtaining a frequency domain feature representation of an input image through the frequency domain sub-network model; the frequency domain sub-network model consists of two CNN models with similar structures and is used for extracting physical characteristics of the input image at different scales;
The frequency domain sub-network model consists of two similar CNN networks: a small-scale network and a large-scale network. The invention uses the complete input image to train the large-scale network, and uses the 128 × 128 pixel image blocks into which the input image is divided to train the small-scale network. The two single-scale sub-networks have similar model architectures. Taking the large-scale network as an example, a block discrete cosine transform (DCT) is first applied to the input image to obtain histograms of the DCT coefficients of the picture at 64 frequencies. In particular, the invention performs a one-dimensional Fourier transform on these histograms to enhance the effect of the CNN. Since a CNN requires a fixed-size input, these histograms are sampled to obtain 64 250-dimensional vectors, denoted $\{H_0, H_1, \dots, H_{63}\}$. After this preprocessing, each input vector $H_i$ is fed into a large-scale CNN with shared weights to obtain the corresponding feature representation $w_i$. The CNN consists of three convolution blocks and a fully connected layer, each convolution block consisting of a one-dimensional convolution layer and a max pooling layer. To accelerate convergence of the model, the number of filters in the convolution layers is set to increase. The feature vectors $\{w_0, w_1, \dots, w_{63}\}$ of the 64 frequencies are spliced and fused to obtain the large-scale frequency domain feature representation $l_{large}$ of the input image. In the small-scale network, a block DCT is applied to each 128 × 128 image block; to reduce the number of parameters, only the first 9 high-frequency entries of the 64 frequencies are selected for building DCT coefficient histograms. All 128 × 128 image blocks are input into the small-scale CNN, and the resulting feature vectors are spliced and fused to obtain the small-scale frequency domain feature representation $l_{small}$ of the input image. Finally, $l_{large}$ and $l_{small}$ are spliced and fused to obtain the final frequency domain feature representation $l_F$ of the input image, which serves as an input to the fusion sub-network.
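The first preprocessing stage described above, a block DCT followed by one coefficient histogram per frequency, might look as follows in plain Python. This is a sketch under assumptions: an 8 × 8 block size is inferred from the "64 frequencies" in the text, coefficients are rounded to integers before counting, and the names `dct2` and `dct_histograms` are hypothetical. The subsequent Fourier transform, sampling and CNN stages are not shown.

```python
import math
from collections import Counter

N = 8  # assumed block size: 8 x 8 blocks give the 64 frequencies mentioned in the text


def dct_matrix(n=N):
    """Orthonormal DCT-II basis: C[k][i] = a(k) * cos(pi * (2i + 1) * k / (2n))."""
    rows = []
    for k in range(n):
        a = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([a * math.cos(math.pi * (2 * i + 1) * k / (2 * n)) for i in range(n)])
    return rows


C = dct_matrix()


def dct2(block):
    """2-D DCT of an N x N block, computed as C * block * C^T."""
    tmp = [[sum(C[k][i] * block[i][j] for i in range(N)) for j in range(N)] for k in range(N)]
    return [[sum(tmp[k][j] * C[m][j] for j in range(N)) for m in range(N)] for k in range(N)]


def dct_histograms(image):
    """Return 64 histograms, one per frequency, of rounded block-DCT coefficients."""
    hists = [Counter() for _ in range(N * N)]
    h, w = len(image), len(image[0])
    for top in range(0, h - N + 1, N):
        for left in range(0, w - N + 1, N):
            block = [row[left:left + N] for row in image[top:top + N]]
            coeffs = dct2(block)
            for k in range(N):
                for m in range(N):
                    hists[k * N + m][round(coeffs[k][m])] += 1
    return hists


# Toy input: a constant 16 x 16 image, i.e. four 8 x 8 blocks.
image = [[10.0] * 16 for _ in range(16)]
hists = dct_histograms(image)
```

For the constant toy image every block has a DC coefficient of 80 and zero AC coefficients, so `hists[0]` counts the value 80 four times and every other histogram counts the value 0 four times. A real photograph would spread each histogram over many values.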
Step S2, constructing a pixel domain sub-network model by using a convolutional neural network, and obtaining the semantic feature representation of the input image through the pixel domain sub-network model; the pixel domain sub-network consists of a cyclic fusion network, which comprises two stages, GCK-guided feature extraction and multi-scale feature fusion (GCK: global context knowledge), used respectively for extracting and fusing feature maps of the input image at different semantic levels;
The pixel domain sub-network model is composed of a cyclic fusion network. The backbone of the cyclic fusion network is a simple CNN; on top of it, a representation of global context knowledge (GCK) is constructed by fusing multi-scale features, and cyclic connections are built between the GCK and different layers of the CNN. Assume that the basic CNN backbone consists of $L$ layers, each of which produces a feature map $X$. $X_l$, the output of the $l$-th CNN layer, can be written as
$$X_l = f_l(W_l * X_{l-1}), \quad l \in [1, L]$$
where $*$ denotes the convolution operation; $W_l$ is the weight (including the bias term) of the $l$-th convolution layer, which is randomly initialized and optimized during training; and $f_l(\cdot)$ is a composite of several specific functions, such as activation and pooling. $X_0$ and $X_L$ denote the input and the final output of the CNN. Four of the $L$ layers are selected and fused by the cyclic fusion network. The network comprises two stages: multi-scale feature fusion and GCK-guided feature extraction. Let $S = \{r_m, m \in [1, 4]\}$ denote the set of selected layers, where $r_m \in [1, L]$ is the index of a selected layer. In the multi-scale feature fusion stage, the representation GCK of global context knowledge is obtained first. Specifically, after the input image passes through the CNN, a group of multi-scale feature maps $\{X_r, r \in S\}$ is obtained. The invention applies a 1 × 1 convolution to reduce the number of channels of these feature maps and upsamples the feature maps of different scales to the same size. Then all the upsampled feature maps $\{F_r, r \in S\}$ are spliced along the channel dimension, and a 1 × 1 convolution is applied to promote information fusion among channels and reduce the feature dimension, finally yielding the GCK. Formally, the GCK is defined as follows:
$$GCK = \sigma\big(W * \mathrm{Cat}(\{F_r, r \in S\})\big)$$
where Cat is the channel concatenation operation, $*$ denotes the convolution operation, W is the weight matrix, and σ is the activation function. In the GCK-guided feature extraction stage, a cyclic connection is built between the GCK and each selected CNN layer. With this cyclic connection, the input of each selected CNN layer includes both the output of the previous layer and the GCK. Letting t denote the time step of the cyclic network (i.e., the number of cycles), $X_l$ ($l \in S$) can be rewritten as
Figure BDA0002510380260000082
where $X_l(t)$ and GCK(t) denote the output of the l-th CNN layer and the GCK at time step t, respectively; $*$ denotes the convolution operation; $W_l$ and $f_l$ are the weight matrix and the composite function (including an activation function, a pooling operation, and so on) that transfer the feature map of layer (l-1) to layer l; $U_l$ and $g_l$ are the weight matrix and the composite function used to obtain the GCK for the l-th layer; $V_l$ is the weight matrix of the 1×1 convolution layer of the l-th layer; σ is the activation function; and Cat is the channel concatenation operation. The model parameters are shared across time steps. After t iterations, the global context knowledge representation GCK(t) of the last time step is taken as the final semantic feature representation $l_P$ of the pixel domain sub-network and serves as input to the fusion sub-network.
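The multi-scale feature fusion stage described above can be illustrated with a minimal pure-Python sketch. It is not the patented implementation: the function names, the nearest-neighbour upsampling, the use of ReLU for σ, and the toy weights are all assumptions made for illustration. Feature maps are plain nested lists indexed as channel, row, column.

```python
def upsample_nearest(fmap, size):
    """Nearest-neighbour upsampling of a C x H x W feature map to C x size x size."""
    out = []
    for channel in fmap:
        h, w = len(channel), len(channel[0])
        out.append([[channel[i * h // size][j * w // size] for j in range(size)]
                    for i in range(size)])
    return out


def conv1x1(fmap, weights):
    """1x1 convolution: every output channel is a per-pixel weighted sum of input channels."""
    h, w = len(fmap[0]), len(fmap[0][0])
    return [[[sum(wk * fmap[k][i][j] for k, wk in enumerate(row))
              for j in range(w)] for i in range(h)]
            for row in weights]


def relu(fmap):
    """Element-wise ReLU, used here as a stand-in for the activation sigma (assumption)."""
    return [[[max(0.0, v) for v in r] for r in ch] for ch in fmap]


def build_gck(feature_maps, reduce_weights, fuse_weights, size):
    """Reduce channels (1x1 conv), upsample to a common size, concatenate, fuse (1x1 conv)."""
    resized = [upsample_nearest(conv1x1(x, w), size)
               for x, w in zip(feature_maps, reduce_weights)]
    stacked = [ch for fmap in resized for ch in fmap]  # Cat: channel concatenation
    return relu(conv1x1(stacked, fuse_weights))


# Toy example: a 2-channel 2x2 map and a 2-channel 1x1 map (hypothetical values).
x1 = [[[1, 2], [3, 4]], [[1, 2], [3, 4]]]
x2 = [[[2.0]], [[4.0]]]
gck = build_gck([x1, x2], [[[0.5, 0.5]], [[0.5, 0.5]]], [[1.0, -1.0]], 2)
```

In this sketch each selected layer is first reduced to a single channel, so the concatenated tensor has one channel per selected layer; a real model would keep more channels and learn the weights.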
Step S3, fusing the frequency domain feature representation and the semantic feature representation to obtain an image representation of the input image, and obtaining, from the image representation, the predicted probability that the input image is a false news picture; the fusion sub-network uses an attention mechanism to dynamically fuse the feature vectors obtained from the frequency domain and pixel domain sub-networks, and classifies the input image as a false news picture or a real news picture;
The physical and semantic features of a picture are complementary for false news detection, so the invention fuses them through the fusion sub-network, i.e., it uses the output $l_F$ of the frequency domain sub-network and the output $l_P$ of the pixel domain sub-network to predict whether the input picture is a false news picture. Intuitively, not all features contribute equally to false news detection: some visual features matter more than others when judging whether a given picture is a false news picture or a real news picture. For example, for tampered pictures with obvious traces of tampering, physical features perform better than semantic features; for misleading images that have not undergone heavy recompression, semantic features are more effective. The present invention therefore highlights the more valuable features through an attention mechanism, and the enhanced image representation u is calculated as follows:
$$F(l_F) = v^T \tanh(W_F l_F + b_F)$$
$$F(l_P) = v^T \tanh(W_F l_P + b_F)$$
$$\alpha = \frac{\exp(F(l_F))}{\exp(F(l_F)) + \exp(F(l_P))}$$
$$u = \alpha l_F + (1 - \alpha) l_P$$
where $W_F$ is a weight matrix, $b_F$ is the bias, $v^T$ is the transposed weight vector, tanh is the activation function, and $F(\cdot)$ is the scoring function that measures the importance of each feature vector. A softmax activation function then yields the normalized weights $\alpha$ and $1 - \alpha$ corresponding to the feature vectors $l_F$ and $l_P$, and the weighted sum of the feature vectors is taken as the high-level representation u of the image. The vector v is randomly initialized and optimized during network training.
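As a rough illustration of the attention-based fusion, the following pure-Python sketch scores the two feature vectors with the shared parameters, normalizes the scores with a two-way softmax, and forms the weighted sum. The function names and the toy parameter values are assumptions; in the described system $W_F$, $b_F$ and v would be learned.

```python
import math


def score(vec, W, b, v):
    """F(l) = v^T tanh(W l + b): scalar importance score of one feature vector."""
    hidden = [math.tanh(sum(wij * x for wij, x in zip(row, vec)) + bi)
              for row, bi in zip(W, b)]
    return sum(vi * hi for vi, hi in zip(v, hidden))


def attention_fuse(l_F, l_P, W, b, v):
    """Return (alpha, u) with u = alpha * l_F + (1 - alpha) * l_P."""
    s_F, s_P = score(l_F, W, b, v), score(l_P, W, b, v)
    m = max(s_F, s_P)  # subtract the max before exp for numerical stability
    e_F, e_P = math.exp(s_F - m), math.exp(s_P - m)
    alpha = e_F / (e_F + e_P)
    u = [alpha * f + (1.0 - alpha) * p for f, p in zip(l_F, l_P)]
    return alpha, u


# Toy usage with hypothetical 2-dimensional features and parameters.
alpha, u = attention_fuse([1.0, 0.0], [0.0, 1.0],
                          [[1.0, 0.0], [0.0, 0.0]], [0.0, 0.0], [1.0, 0.0])
```

Because the same $W_F$, $b_F$ and v score both vectors, the two inputs must have the same dimensionality, which the formula $u = \alpha l_F + (1 - \alpha) l_P$ also requires.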
The feature vector u is then projected into the target space of two classes, false news pictures and real news pictures, by a fully connected layer with a softmax activation, giving the probability distribution:
$$p = \mathrm{softmax}(W_c u + b_c),$$
where $W_c$ is a weight matrix and $b_c$ is the bias. The loss function is defined as the cross-entropy error between the predicted probability distribution and the ground truth:
$$L = -\sum [\, y \log p + (1 - y) \log(1 - p) \,]$$
where y is the ground-truth label of the input image (1 for a false news picture, 0 for a real news picture) and p is the predicted probability that the image is a false news picture.
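The classification layer and loss can be sketched in a few lines of pure Python. This is only an illustration under stated assumptions: the function names are hypothetical, class index 1 is taken to be the false news class (matching y = 1), and probabilities are clipped by a small `eps` to avoid taking the logarithm of zero.

```python
import math


def softmax(z):
    """Numerically stable softmax over a list of logits."""
    m = max(z)
    exps = [math.exp(x - m) for x in z]
    total = sum(exps)
    return [e / total for e in exps]


def predict_fake_probability(u, W_c, b_c):
    """p = softmax(W_c u + b_c)[1]: probability of the false news class (index 1, assumed)."""
    logits = [sum(w * x for w, x in zip(row, u)) + b for row, b in zip(W_c, b_c)]
    return softmax(logits)[1]


def cross_entropy(labels, probs, eps=1e-12):
    """L = -sum(y log p + (1 - y) log(1 - p)) over a batch of samples."""
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1.0 - eps)  # clip to keep log() finite
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return -total


# With all-zero weights the classifier is undecided between the two classes.
p = predict_fake_probability([1.0, 2.0], [[0.0, 0.0], [0.0, 0.0]], [0.0, 0.0])
```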
The invention also provides a false news detection system, whose overall framework is shown in FIG. 2 and which mainly comprises three parts: a frequency domain sub-network, a pixel domain sub-network, and a fusion sub-network. The frequency domain sub-network consists of two CNN models with similar structures and extracts physical features of the input image at different scales; the pixel domain sub-network consists of a cyclic fusion network comprising two stages, GCK (global context knowledge) guided feature extraction and multi-scale feature fusion, which respectively extract and fuse feature maps from different semantic levels of the input image. The fusion sub-network uses an attention mechanism to dynamically fuse the feature vectors obtained from the frequency domain and pixel domain sub-networks and classifies the input image as a false news picture or a real news picture.
1. Frequency domain sub-network model
Details of the frequency domain sub-network model are shown in the upper half of FIG. 2. The model consists of two similar CNNs: a small-scale network and a large-scale network. The invention trains the large-scale network on the complete input image and trains the small-scale network on the 128×128-pixel image blocks into which the input image is divided. The two single-scale sub-networks have similar architectures. Taking the large-scale network as an example, a block discrete cosine transform (DCT) is first applied to the input image to obtain histograms of the picture's DCT coefficients at 64 frequencies. In particular, the present invention applies a one-dimensional Fourier transform to these histograms to improve the effectiveness of the CNN. Because a CNN requires fixed-size input, the histograms are sampled to yield 64 250-dimensional vectors, denoted $\{H_0, H_1, \ldots, H_{63}\}$. After this preprocessing, each input vector $H_i$ is fed into the weight-sharing large-scale CNN to obtain the corresponding feature representation $w_i$. The CNN consists of three convolution blocks and a fully connected layer, each convolution block consisting of a one-dimensional convolution layer and a max-pooling layer. To accelerate convergence of the model, the number of filters increases from one convolution layer to the next. The feature vectors $\{w_0, w_1, \ldots, w_{63}\}$ of the 64 frequencies are concatenated and fused to obtain the large-scale frequency domain feature representation $l_{large}$ of the input image. In the small-scale network, a block DCT is applied to each 128×128 image block; to reduce the number of parameters, only the first 9 high-frequency terms of the 64 frequencies are used to build the DCT coefficient histograms.
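The first preprocessing step, building one DCT coefficient histogram per frequency, can be sketched as follows. The sketch assumes 8×8 blocks (which gives the 64 frequencies mentioned above), an orthonormal DCT-II, a grayscale image stored as a nested list, and histogram bins obtained by rounding coefficients to integers; none of these details is stated in the text. The later Fourier transform and the sampling to 250 dimensions are omitted.

```python
import math
from collections import Counter

N = 8  # assumed block size: 8 x 8 blocks give 64 frequencies


def dct2_block(block):
    """Orthonormal 2-D DCT-II of an N x N block given as nested lists."""
    def c(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            s = 0.0
            for x in range(N):
                for y in range(N):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = c(u) * c(v) * s
    return out


def dct_histograms(image):
    """One histogram (Counter of rounded coefficients) per frequency, over all full blocks."""
    hists = [Counter() for _ in range(N * N)]
    h, w = len(image), len(image[0])
    for i in range(0, h - N + 1, N):
        for j in range(0, w - N + 1, N):
            block = [row[j:j + N] for row in image[i:i + N]]
            coeffs = dct2_block(block)
            for u in range(N):
                for v in range(N):
                    hists[u * N + v][round(coeffs[u][v])] += 1
    return hists


# A constant 16 x 16 image: all energy ends up in the DC frequency of each block.
flat_hists = dct_histograms([[10] * 16 for _ in range(16)])
```

For the constant image, the DC histogram holds a single bin and every other frequency collects only zeros, which is the expected behaviour of a DCT on flat content.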
All 128×128 image blocks are fed into the small-scale CNN, and the resulting feature vectors are concatenated and fused to obtain the small-scale frequency domain feature representation $l_{small}$ of the input image. Finally, $l_{large}$ and $l_{small}$ are concatenated and fused to obtain the final frequency domain feature representation $l_F$ of the input image, which serves as input to the fusion sub-network.
2. Pixel domain subnetwork model
The details of the pixel domain sub-network are shown in the lower part of FIG. 2; it mainly consists of a cyclic fusion network. The backbone of the network is a simple CNN; on top of it, a representation of global context knowledge (GCK) is constructed by fusing multi-scale features, and cyclic connections are built between the GCK and different layers of the CNN. Assume that the basic CNN backbone consists of L layers, each of which produces a feature map X. $X_l$, the output of the l-th CNN layer, can be written as
$$X_l = f_l(W_l * X_{l-1}), \quad l \in [1, L]$$
where $*$ denotes the convolution operation; $W_l$ is the weight (including the bias term) of the l-th convolutional layer, randomly initialized and optimized during training; and $f_l(\cdot)$ is a composition of several specific functions, such as activation and pooling. $X_0$ and $X_L$ denote the input and the final output of the CNN. Four layers are selected from the L layers and fused using the cyclic fusion network, which comprises two stages: multi-scale feature fusion and GCK-guided feature extraction. Let $S = \{r_m, m \in [1, 4]\}$ denote the set of selected layers, where $r_m \in [1, L]$ is the index of a selected layer. In the multi-scale feature fusion stage, a representation GCK of global context knowledge is obtained first. Specifically, after the input image passes through the CNN, a group of multi-scale feature maps $\{X_r, r \in S\}$ is obtained. The present invention applies a 1×1 convolution to reduce the number of channels of these feature maps and upsamples the feature maps of different scales to the same size. All the upsampled feature maps $\{F_r, r \in S\}$ are then concatenated along the channel dimension, and a 1×1 convolution is applied to promote information fusion among channels and reduce the feature dimension, finally yielding the GCK. Formally, the GCK is defined as follows:
$$\mathrm{GCK} = \sigma\big(W * \mathrm{Cat}(\{F_r, r \in S\})\big)$$
where Cat is the channel concatenation operation, $*$ denotes the convolution operation, W is the weight matrix, and σ is the activation function. In the GCK-guided feature extraction stage, a cyclic connection is built between the GCK and each selected CNN layer. With this cyclic connection, the input of each selected CNN layer includes both the output of the previous layer and the GCK. Letting t denote the time step of the cyclic network (i.e., the number of cycles), $X_l$ ($l \in S$) can be rewritten as
Figure BDA0002510380260000112
where $X_l(t)$ and GCK(t) denote the output of the l-th CNN layer and the GCK at time step t, respectively; $*$ denotes the convolution operation; $W_l$ and $f_l$ are the weight matrix and the composite function (including an activation function, a pooling operation, and so on) that transfer the feature map of layer (l-1) to layer l; $U_l$ and $g_l$ are the weight matrix and the composite function used to obtain the GCK for the l-th layer; $V_l$ is the weight matrix of the 1×1 convolution layer of the l-th layer; σ is the activation function; and Cat is the channel concatenation operation. The model parameters are shared across time steps. After t iterations, the global context knowledge representation GCK(t) of the last time step is taken as the final semantic feature representation $l_P$ of the pixel domain sub-network and serves as input to the fusion sub-network.
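The control flow of the GCK-guided stage (shared parameters across time steps, with each selected layer also receiving the GCK from the previous step) can be illustrated with a deliberately simplified toy loop. It does not reproduce the exact update equation: features are flat vectors, each layer is a scalar scaling followed by ReLU, the GCK is added to the pre-activation of selected layers, and the fusion step is replaced by a plain average. All of these choices, and the function name, are assumptions.

```python
def relu(vec):
    """Element-wise ReLU on a flat vector."""
    return [max(0.0, v) for v in vec]


def run_cyclic_fusion(x0, layer_w, gck_w, selected, steps):
    """Toy recurrence: the same weights are reused at every time step, and each
    selected layer also receives the GCK computed at the previous time step."""
    gck = [0.0] * len(x0)  # GCK before the first step: no context yet (assumption)
    for _ in range(steps):
        x, picked = x0, []
        for l, w in enumerate(layer_w, start=1):
            pre = [w * v for v in x]  # stand-in for W_l * X_{l-1}
            if l in selected:
                # cyclic connection: inject the previous step's GCK into this layer
                pre = [p + gck_w[l] * g for p, g in zip(pre, gck)]
            x = relu(pre)
            if l in selected:
                picked.append(x)
        # stand-in for multi-scale fusion: average the selected layers' outputs
        gck = [sum(vals) / len(picked) for vals in zip(*picked)]
    return gck  # GCK of the last time step, used as the semantic representation


# Two layers, both selected, hypothetical weights.
one_step = run_cyclic_fusion([1.0], [1.0, 1.0], {1: 0.5, 2: 0.5}, {1, 2}, 1)
two_steps = run_cyclic_fusion([1.0], [1.0, 1.0], {1: 0.5, 2: 0.5}, {1, 2}, 2)
```

The second call differs from the first only in the number of time steps, which shows how the context computed in one pass changes the features extracted in the next.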
3. Fusion sub-network model
The physical and semantic features of a picture are complementary for false news detection, so the invention fuses them through the fusion sub-network, i.e., it uses the output $l_F$ of the frequency domain sub-network and the output $l_P$ of the pixel domain sub-network to predict whether the input picture is a false news picture. Intuitively, not all features contribute equally to false news detection: some visual features matter more than others when judging whether a given picture is a false news picture or a real news picture. For example, for tampered pictures with obvious traces of tampering, physical features perform better than semantic features; for misleading images that have not undergone heavy recompression, semantic features are more effective. The present invention therefore highlights the more valuable features through an attention mechanism, and the enhanced image representation u is calculated as follows:
$$F(l_F) = v^T \tanh(W_F l_F + b_F)$$
$$F(l_P) = v^T \tanh(W_F l_P + b_F)$$
$$\alpha = \frac{\exp(F(l_F))}{\exp(F(l_F)) + \exp(F(l_P))}$$
$$u = \alpha l_F + (1 - \alpha) l_P$$
where $W_F$ is a weight matrix, $b_F$ is the bias, tanh is the activation function, $v^T$ is the transposed weight vector, and $F(\cdot)$ is the scoring function that measures the importance of each feature vector. A softmax activation function then yields the normalized weights $\alpha$ and $1 - \alpha$ corresponding to the feature vectors $l_F$ and $l_P$, and the weighted sum of the feature vectors is taken as the high-level representation u of the image. The vector v is randomly initialized and optimized during network training.
The feature vector u is then projected into the target space of two classes, false news pictures and real news pictures, by a fully connected layer with a softmax activation function, giving the probability distribution:
$$p = \mathrm{softmax}(W_c u + b_c),$$
where $W_c$ is a weight matrix and $b_c$ is the bias. The loss function is defined as the cross-entropy error between the predicted probability distribution and the ground truth:
$$L = -\sum [\, y \log p + (1 - y) \log(1 - p) \,]$$
where y is the ground-truth label of the input image (1 for a false news picture, 0 for a real news picture) and p is the predicted probability that the image is a false news picture.
FIG. 3 is a schematic diagram of a data processing apparatus of the present invention. As shown in fig. 3, the embodiment of the present invention further provides a computer-readable storage medium, and a data processing apparatus. The computer readable storage medium of the present invention stores computer executable instructions that, when executed by a processor of a data processing apparatus, implement the above-described false news detection method that fuses multi-scale visual information. Those of ordinary skill in the art will appreciate that all or a portion of the steps of the above-described methods may be performed by a program that instructs associated hardware (e.g., processor, FPGA, ASIC, etc.), which may be stored on a readable storage medium such as read only memory, magnetic or optical disk, etc. All or part of the steps of the embodiments described above may also be implemented using one or more integrated circuits. Accordingly, each module in the above embodiments may be implemented in the form of hardware, for example, by an integrated circuit, or may be implemented in the form of a software functional module, for example, by a processor executing a program/instruction stored in a memory to implement its corresponding function. Embodiments of the invention are not limited to any specific form of combination of hardware and software.
The invention effectively identifies false news based on the visual content of news messages and, compared with the prior art, greatly improves performance without requiring additional data. In particular, for the task of detecting false news from visual content, the present invention improves accuracy by at least 11.8 percentage points over the prior art on publicly available datasets.
The above embodiments are only for illustrating the present invention, not for limiting the present invention, and various changes and modifications may be made by one of ordinary skill in the relevant art without departing from the spirit and scope of the present invention, and therefore, all equivalent technical solutions are also within the scope of the present invention, and the scope of the present invention is defined by the claims.

Claims (8)

1. A false news detection method integrating multi-scale visual information is characterized by comprising the following steps:
a frequency domain characteristic acquisition step, namely constructing a large-scale network of a frequency domain sub-network model by using a convolutional neural network; performing block discrete cosine transform on an input image to obtain a large-scale histogram corresponding to the input image at a plurality of frequencies; sampling the large-scale histogram to obtain a plurality of large-scale multidimensional vectors; fusing the plurality of large-scale multidimensional vectors through the large-scale network to obtain a large-scale frequency domain feature representation of the input image; constructing a small-scale network of the frequency domain sub-network model by using a convolutional neural network; dividing the input image into a plurality of image blocks with the same size, and performing block discrete cosine transform on the image blocks to obtain small-scale histograms corresponding to the image blocks on a plurality of frequencies; selecting a plurality of small-scale histograms in a high frequency band for sampling to obtain a plurality of small-scale multidimensional vectors; fusing the plurality of small-scale multidimensional vectors through the small-scale network to obtain a small-scale frequency domain feature representation of the input image; splicing and fusing the large-scale frequency domain feature representation and the small-scale frequency domain feature representation to obtain a frequency domain feature representation of the input image;
a semantic feature acquisition step, namely constructing a pixel domain sub-network model by using a convolutional neural network, and acquiring semantic feature representation of the input image through the pixel domain sub-network model;
and an image detection step, wherein the frequency domain feature representation and the semantic feature representation are fused to obtain an image representation of the input image, and the prediction probability of the input image as a false news picture is obtained according to the image representation.
2. The false news detection method of claim 1, wherein the semantic feature acquisition step specifically includes:
constructing a cyclic fusion network by using a convolutional neural network; acquiring a first characteristic diagram of the input of the cyclic fusion network on multiple scales, up-sampling the first characteristic diagram to obtain a second characteristic diagram with the same size, and performing channel splicing on the second characteristic diagram to obtain a global context knowledge representation as the output of the cyclic fusion network;
taking the output of the cycle fusion network of the round as the input of the cycle fusion network of the next round, and connecting a plurality of the cycle fusion networks in series to form a sub-network model of the pixel domain;
taking the input image as the input of the pixel domain sub-network model, and taking the global context knowledge representation obtained after a preset number of iterations as the semantic feature representation $l_P$ of the input image.
3. The false news detection method of claim 1, wherein the image detection step specifically includes:
obtaining the image representation u from the frequency domain feature representation $l_F$ and the semantic feature representation $l_P$, $u = \alpha l_F + (1 - \alpha) l_P$;
projecting the image representation u to a false news picture target space and a true news picture target space respectively by using a fully connected layer to obtain the prediction probability p, and taking the cross-entropy error L between the prediction probability p and the true value y as a loss function, wherein $p = \mathrm{softmax}(W_c u + b_c)$, $L = -\sum [y \log p + (1 - y) \log(1 - p)]$;
wherein $\alpha$ is a normalized weight,
$$\alpha = \frac{\exp(F(l_F))}{\exp(F(l_F)) + \exp(F(l_P))},$$
$F(l_F) = v^T \tanh(W_F l_F + b_F)$, $F(l_P) = v^T \tanh(W_F l_P + b_F)$, $W_c$ and $W_F$ are weight matrices, $b_c$ and $b_F$ are biases, $v^T$ is the transposed weight vector, and softmax and tanh are activation functions.
4. A false news detection system incorporating multi-scale visual information, comprising:
the frequency domain feature acquisition module is used for constructing a frequency domain sub-network model by using a convolutional neural network, and obtaining the frequency domain feature representation of the input image through the frequency domain sub-network model; the method comprises a large-scale frequency domain feature representation acquisition module, a small-scale frequency domain feature representation acquisition module and a splicing and fusing module, wherein:
the large-scale frequency domain feature representation acquisition module is used for acquiring a large-scale frequency domain feature representation of the input image; constructing a large-scale network of the frequency domain sub-network model by using a convolutional neural network; performing a block discrete cosine transform on the input image to obtain a large-scale histogram of the input image corresponding to a plurality of frequencies; sampling the large-scale histogram to obtain a plurality of large-scale multidimensional vectors; fusing the plurality of large-scale multidimensional vectors through the large-scale network to obtain a large-scale frequency domain feature representation of the input image;
the small-scale frequency domain feature representation acquisition module is used for acquiring a small-scale frequency domain feature representation of the input image; constructing a small-scale network of the frequency domain sub-network model by using a convolutional neural network; dividing the input image into a plurality of image blocks with the same size, and performing block discrete cosine transform on the image blocks to obtain small-scale histograms corresponding to the image blocks on a plurality of frequencies; selecting a plurality of small-scale histograms in a high frequency band for sampling to obtain a plurality of small-scale multidimensional vectors; fusing the plurality of small-scale multidimensional vectors through the small-scale network to obtain a small-scale frequency domain feature representation of the input image;
the splicing and fusing module is used for splicing and fusing the large-scale frequency domain feature representation and the small-scale frequency domain feature representation to obtain the frequency domain feature representation of the input image;
the semantic feature acquisition module is used for constructing a pixel domain sub-network model by using a convolutional neural network, and acquiring semantic feature representation of the input image through the pixel domain sub-network model;
and the image detection module is used for fusing the frequency domain feature representation with the semantic feature representation to obtain the image representation of the input image, and obtaining the prediction probability of the input image as a false news picture according to the image representation.
5. The false news detection system of claim 4, wherein the semantic feature acquisition module specifically includes:
the cyclic fusion network construction module is used for constructing a cyclic fusion network by using a convolutional neural network; acquiring a first characteristic diagram of the input of the cyclic fusion network on multiple scales, up-sampling the first characteristic diagram to obtain a second characteristic diagram with the same size, and performing channel splicing on the second characteristic diagram to obtain a global context knowledge representation as the output of the cyclic fusion network;
the loop fusion network serial module is used for taking the output of the loop fusion network of the current loop as the input of the loop fusion network of the next loop, and connecting a plurality of loop fusion networks in series to form the pixel domain sub-network model;
the semantic feature acquisition module is used for taking the input image as the input of the pixel domain sub-network model, and taking the global context knowledge representation obtained after a preset number of iterations as the semantic feature representation $l_P$ of the input image.
6. The false news detection system of claim 4, wherein the image detection module specifically includes:
an image representation acquisition module for obtaining the image representation u from the frequency domain feature representation $l_F$ and the semantic feature representation $l_P$, $u = \alpha l_F + (1 - \alpha) l_P$;
a prediction probability obtaining module for projecting the image representation u to the false news picture target space and the true news picture target space respectively by using a fully connected layer to obtain the prediction probability p, and taking the cross-entropy error L between the prediction probability p and the true value y as a loss function, wherein $p = \mathrm{softmax}(W_c u + b_c)$, $L = -\sum [y \log p + (1 - y) \log(1 - p)]$;
wherein $\alpha$ is a normalized weight,
$$\alpha = \frac{\exp(F(l_F))}{\exp(F(l_F)) + \exp(F(l_P))},$$
$F(l_F) = v^T \tanh(W_F l_F + b_F)$, $F(l_P) = v^T \tanh(W_F l_P + b_F)$, $W_c$ and $W_F$ are weight matrices, $b_c$ and $b_F$ are biases, $v^T$ is the transposed weight vector, and softmax and tanh are activation functions.
7. A computer readable storage medium storing computer executable instructions for performing the false news detection method integrating multi-scale visual information as claimed in any one of claims 1 to 3.
8. A data processing apparatus comprising the computer readable storage medium of claim 7, the processor of the data processing apparatus retrieving and executing computer executable instructions in the computer readable storage medium to perform false news detection incorporating multi-scale visual information.
CN202010459132.6A 2020-05-27 2020-05-27 False news detection method and system integrating multi-scale visual information Active CN111797326B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010459132.6A CN111797326B (en) 2020-05-27 2020-05-27 False news detection method and system integrating multi-scale visual information

Publications (2)

Publication Number Publication Date
CN111797326A 2020-10-20
CN111797326B 2023-05-12

Family

ID=72806353

Country Status (1)

Country Link
CN (1) CN111797326B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant