
CN104679902A - Information abstract extraction method in conjunction with cross-media fuse - Google Patents

Information abstract extraction method in conjunction with cross-media fuse

Info

Publication number
CN104679902A
Authority
CN
China
Prior art keywords
image
data
dimensional data
same
same dimensional
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201510123093.1A
Other languages
Chinese (zh)
Other versions
CN104679902B (en)
Inventor
裴廷睿
赵津锋
李哲涛
崔荣峻
吴相润
关屋大雄
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xiangtan University
Original Assignee
Xiangtan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xiangtan University
Priority to CN201510123093.1A
Publication of CN104679902A
Application granted
Publication of CN104679902B
Expired - Fee Related
Anticipated expiration

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Image Analysis (AREA)

Abstract

The invention proposes an information abstract extraction method combined with cross-media fusion. First, the input multimedia data (text, image, audio, video, etc.) is classified by data type. The heterogeneous multimedia data is then converted to a common dimension and given text labels, yielding same-dimensional images and text labels. Next, the same-dimensional image data is clustered and the text labels are checked for relevance. The same-dimensional images in each class are then fused into a single image. Finally, a cross-media information summary is generated. Through the summary, users can view the fused image for each class of information and quickly access the corresponding multimedia data.

Description

An Information Abstract Extraction Method Combined with Cross-media Fusion

Technical field

The invention relates to an information abstract extraction method combined with cross-media fusion, and belongs to the field of information extraction.

Background

We live in an information age in which the volume of information is growing rapidly. The Internet adds a large amount of information every day, and the ways in which information is stored are increasingly diverse; text, images, audio, and video are the basic forms of multimedia resources. Today many types of media data coexist and their organization is complex, yet different types of media data express the same semantics from different perspectives. Information extraction therefore needs to cross from one medium to another by following the connections that exist between media. How to cross the boundaries between media, and how to extract the latent correlations among them, is a current challenge for information extraction.

For big data in which multiple media forms coexist, existing methods rely mainly on feature recognition within a single medium and struggle to bridge the semantic gap between media. For example, the visual features of an image and the auditory features of audio have different dimensionality, so their similarity cannot be measured directly. As a result, existing information extraction methods cannot readily provide users with an intuitive thumbnail (or information summary). How to classify and extract large volumes of mixed multimedia data is one of the key technical problems that information extraction urgently needs to solve, and it is an active research topic.

Mature techniques such as text mining, image feature extraction, audio scene recognition, speech recognition, video scene segmentation, and key frame extraction can each extract the semantic information of a single medium. The open question is how to combine these algorithms so that feature information of different dimensionality can be extracted within one multimedia information extraction system. The invention addresses this by using images as an intermediate-dimensional medium.

Summary of the invention

To address the above problems, the invention proposes an information abstract extraction method combined with cross-media fusion. Converting data of different dimensionality into images of a common dimension overcomes the difficulty of crossing the multimedia semantic gap. Image clustering then classifies and extracts the multimedia data indirectly and produces a cross-media information summary.

The invention proposes an information abstract extraction method combined with cross-media fusion. First, the input multimedia data (text, image, audio, video, etc.) is classified by data type. The different-dimensional multimedia data is then converted to a common dimension and given text labels, yielding same-dimensional images and text labels. Next, the same-dimensional image data is clustered and the text labels are checked for relevance. The same-dimensional images in each class are then fused into a single image. Finally, a cross-media information summary is generated. Through the summary, users can view the fused image for each class of information and quickly access the corresponding multimedia data.

The information abstract extraction method combined with cross-media fusion proposed by the invention comprises the following steps:

Step 1: Classify the input multimedia data (text, image, audio, video) by data type into original text data, original image data, original audio data, and original video data;

Step 2: Set a standard value for the image data dimension (image pixels), build a same-dimensional image sample library with text labels, and convert the different-dimensional multimedia data to the common dimension, using the processing method appropriate to each data type;

Step 3: For the processed same-dimensional image data, determine a threshold according to the accuracy required of the clustering and cluster the data with an image clustering algorithm. Run a text label relevance check using the text labels of each class, and re-cluster the data that fails the check until the number of failing items is less than the threshold. This yields the addresses of the same-dimensional image data in each class, i.e. the index;

Step 4: Fuse the clustered same-dimensional image data according to a fusion rule to obtain a fused image for each class of same-dimensional image data;

Step 5: Generate the information summary from the fused image and the index of each class of same-dimensional image data.

Compared with existing methods, the invention has the following advantages:

1. The semantics of different-dimensional multimedia data are expressed as same-dimensional image data, which crosses the boundaries between media and allows image processing algorithms to be applied to multimedia data;

2. Image clustering is combined with a text label relevance check, which safeguards the accuracy of the classification and the strong correlation among the data in each class.

Description of the drawings

Fig. 1 is a flow chart of the method of the invention;

Fig. 2 is a flow chart of the method for converting different-dimensional data to a common dimension;

Fig. 3 is a schematic diagram of same-dimensional image data clustering and the text label relevance check.

Detailed description of the embodiments

The invention is described in further detail below with reference to the drawings and specific embodiments:

Step 1: Classify the input multimedia data (text, image, audio, video) by data type into original text data, original image data, original audio data, and original video data.
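
Step 1 can be sketched as follows. This is only an illustration under the assumption that each input is a file whose type can be guessed from its name; the patent does not say how the data type is determined, and the function name `classify_by_type` and the bucket names are hypothetical.

```python
import mimetypes
from collections import defaultdict

KNOWN_KINDS = ("text", "image", "audio", "video")

def classify_by_type(paths):
    """Group file paths into text / image / audio / video buckets by guessed MIME type."""
    buckets = defaultdict(list)
    for path in paths:
        mime, _ = mimetypes.guess_type(path)
        # The top-level MIME type (the part before "/") names the bucket.
        kind = mime.split("/")[0] if mime else "unknown"
        if kind not in KNOWN_KINDS:
            kind = "unknown"  # anything outside the four media types is set aside
        buckets[kind].append(path)
    return dict(buckets)
```

Files that do not fall into one of the four media types end up in an `unknown` bucket, which is a choice made for this sketch rather than something the source describes.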

Step 2: Referring to Fig. 2, set a standard value for the image data dimension (image pixels), build a same-dimensional image sample library with text labels, and convert the different-dimensional multimedia data to the common dimension, using the processing method appropriate to each data type;

The classification yields groups of original text data, original image data, original audio data, and original video data. The groups of original text data, original image data, original audio data, and original video data are each processed into same-dimensional image data. The detailed steps are as follows:

1) Processing the original text data into same-dimensional image data:

a) Preprocessing: use a text mining technique (such as text mining based on semantic understanding) to extract the keywords of each group of text passages in the original text data as labels;

b) Map each group of text data to same-dimensional image data according to its label keywords and the sample library. One group of text may correspond to multiple labels and multiple same-dimensional images, each taken from the corresponding sample images in the library.
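
The text branch can be sketched as a keyword lookup against the labelled sample library. The sketch assumes the library is a dictionary from label to sample image and uses plain word counting in place of the semantic text mining the description mentions; `extract_labels` and `text_to_sample_images` are hypothetical names.

```python
import re
from collections import Counter

def extract_labels(text, vocabulary, top_k=3):
    """Return up to top_k vocabulary words that occur in the text, most frequent first."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if w in vocabulary)
    return [w for w, _ in counts.most_common(top_k)]

def text_to_sample_images(text, sample_library, top_k=3):
    """Map a text to (label, sample image) pairs drawn from the labelled sample library."""
    labels = extract_labels(text, sample_library.keys(), top_k)
    # One text may yield several labels, and therefore several same-dimensional images.
    return [(label, sample_library[label]) for label in labels]
```

Because every sample image in the library already has the standard dimension, the images returned for a text are same-dimensional by construction.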

2) Processing the original image data into same-dimensional image data:

a) Preprocess the original image data with an algorithm that enhances key features (for example, by removing background regions) to obtain the processed image;

b) Scale the processed image with an image scaling technique (such as bicubic interpolation and inverse wavelet interpolation) to same-dimensional image data (the same dimension as the sample library);

c) Compare the same-dimensional image data with the sample library using a recognition method (such as an image feature extraction algorithm based on visual information) to obtain the text label of the image, and store the resulting label.
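
The scaling and labelling steps for images can be illustrated with a small sketch. It assumes grayscale images stored as lists of pixel rows, substitutes nearest-neighbour sampling for the bicubic or wavelet interpolation named in the description, and substitutes a sum of absolute pixel differences for the unspecified recognition method. Both function names are hypothetical.

```python
def resize_nearest(image, out_h, out_w):
    """Resize a 2-D list of pixels to out_h x out_w by nearest-neighbour sampling."""
    in_h, in_w = len(image), len(image[0])
    return [
        [image[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]

def nearest_sample_label(image, sample_library):
    """Return the label of the sample image with the smallest sum of absolute pixel differences."""
    def distance(a, b):
        return sum(abs(x - y) for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))
    return min(sample_library, key=lambda label: distance(image, sample_library[label]))
```

The image must be resized to the library's dimension before `nearest_sample_label` is called, since the pixel-wise comparison only makes sense between images of equal size.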

3) Processing the original audio data into same-dimensional image data:

a) Preprocess the original audio data with algorithms that extract key features such as the audio scene (for example, audio scene recognition based on probabilistic latent semantic analysis) and the spoken-language semantics (for example, neural-network-based speech recognition) to obtain the extracted text labels;

b) Match the extracted text labels against the sample library to obtain same-dimensional image data. One group of audio may correspond to multiple labels and multiple same-dimensional images, each taken from the corresponding sample images in the library.

4) Processing the original video data into same-dimensional image data:

a) Preprocess the original video data with a scene segmentation algorithm (such as a semantics-based video scene segmentation algorithm), splitting each video into a number of video segments, one per scene;

b) For each video segment, apply a key frame extraction method (such as multi-feature fusion key frame extraction based on a clustering algorithm) to obtain key frame images; the key frame images of each video form a set;

c) Enhance the key features of the key frame images with a suitable algorithm (for example, by removing background regions);

d) Scale the processed images with an image scaling technique (such as bicubic interpolation and inverse wavelet interpolation) to same-dimensional image data (the same dimension as the sample library);

e) Compare the same-dimensional image data with the sample library using a recognition method to obtain the text label of each image, and store the resulting label.
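
Key frame selection within one video segment can be sketched as picking the frame closest to the segment's average. This is a deliberately simplified stand-in for the clustering-based multi-feature method the description gives as an example; it assumes each frame has already been reduced to a numeric feature vector, and `key_frame_index` is a hypothetical name.

```python
def key_frame_index(segment):
    """Return the index of the frame whose feature vector is closest to the segment mean."""
    n, dim = len(segment), len(segment[0])
    mean = [sum(frame[d] for frame in segment) / n for d in range(dim)]

    def sq_dist(frame):
        # Squared Euclidean distance from the frame to the segment mean.
        return sum((frame[d] - mean[d]) ** 2 for d in range(dim))

    return min(range(n), key=lambda i: sq_dist(segment[i]))
```

The selected frame would then go through the same enhancement, scaling, and labelling steps as an ordinary image.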

Step 3: Referring to Fig. 3, for the processed same-dimensional image data, determine a threshold according to the accuracy required of the clustering and cluster the data with an image clustering algorithm (such as image clustering based on a genetic algorithm). Run a text label relevance check using the text labels of each class, and re-cluster the data that fails the check until the number of failing items is less than the threshold. This yields the index, which holds the addresses of the same-dimensional image data in each class. The detailed steps are as follows:

1) For the processed same-dimensional image data, determine the threshold according to the accuracy required of the clustering. The smaller the threshold, the more classes there are and the more precise the classification; conversely, the larger the threshold, the fewer the classes;

2) Cluster with an image clustering algorithm (such as image clustering based on a genetic algorithm) and store the addresses of the clustered same-dimensional images. For the images clustered into the same class, extract their corresponding text labels and run the text label relevance check against the image clustering result;

3) For the clustered same-dimensional image data, if the number of items that fail the text label relevance check is greater than the threshold, remove the failing items from their class so that they become unclustered same-dimensional image data again, and re-cluster them with the same or a different clustering method until the number of failing items is less than the threshold;

4) Store the classification result as addresses to obtain the index, which holds the addresses of the same-dimensional image data in each class.
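
The cluster, check, and re-cluster loop of Step 3 can be sketched as follows. The sketch makes several assumptions that are not in the source: each item is reduced to a single numeric feature, a toy gap-based clustering replaces the genetic-algorithm clustering, an item passes the relevance check if it carries its cluster's most common label, and every item has at least one label. All function names are hypothetical.

```python
from collections import Counter

def cluster_by_gap(items, gap):
    """Toy 1-D clustering: sort ids by feature and split where neighbours differ by more than gap."""
    ordered = sorted(items, key=lambda item_id: items[item_id][0])
    clusters, current = [], [ordered[0]]
    for prev, cur in zip(ordered, ordered[1:]):
        if items[cur][0] - items[prev][0] > gap:
            clusters.append(current)
            current = []
        current.append(cur)
    clusters.append(current)
    return clusters

def split_by_label_check(cluster, items):
    """Keep members that carry the cluster's most common label; return (kept, rejected)."""
    counts = Counter(label for i in cluster for label in sorted(items[i][1]))
    top_label = counts.most_common(1)[0][0]
    kept = [i for i in cluster if top_label in items[i][1]]
    rejected = [i for i in cluster if top_label not in items[i][1]]
    return kept, rejected

def build_index(items, threshold, gap=1.0):
    """Cluster, check labels, and re-cluster rejected items until fewer than threshold remain."""
    index, pending = [], dict(items)
    while pending:
        rejected = []
        for cluster in cluster_by_gap(pending, gap):
            kept, bad = split_by_label_check(cluster, pending)
            index.append(kept)       # addresses (ids) of one class
            rejected.extend(bad)     # these return to the unclustered pool
        if len(rejected) < threshold:
            return index, rejected
        pending = {i: items[i] for i in rejected}
    return index, []
```

Here `items` maps an address to a `(feature, labels)` pair, and the function returns the index together with any items still unassigned when the loop stops. Each round keeps at least one item per cluster, so the pool of rejected items shrinks and the loop terminates.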

Step 4: Fuse the clustered same-dimensional image data according to a fusion rule (for example, selecting the image that contains more targets) to obtain a fused image for each class of same-dimensional image data;

Retrieve the same-dimensional image data of each class in turn by index and fuse it according to the fusion rule, obtaining a fused image for each class.
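
The example fusion rule, selecting the image with more targets, can be sketched as below. The sketch assumes the index is a list of classes, each a list of image addresses, and that a caller-supplied `count_targets` function reports how many targets an image contains; that function and the name `fuse_index` are hypothetical, since the source does not define how targets are counted.

```python
def fuse_index(index, images, count_targets):
    """For each class in the index, return the member image with the most targets."""
    fused = []
    for cluster in index:
        # The fusion rule here is selection: keep the single richest image of the class.
        best_id = max(cluster, key=lambda image_id: count_targets(images[image_id]))
        fused.append(images[best_id])
    return fused
```

Other fusion rules, such as combining pixels from several images, would fit the same interface; only the selection rule is mentioned in the description.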

Step 5: Generate the information summary from the fused image and the index of each class of same-dimensional image data;

The fused images and the index are combined into the information summary, through which the user can view the fused images and access the corresponding multimedia data.
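
One possible shape for the resulting summary is a list of entries, each pairing a fused image with the addresses of the media in its class. The JSON layout and the name `build_digest` are assumptions made for this sketch; the source does not specify a storage format.

```python
import json

def build_digest(fused_images, index):
    """Pair each fused image with the addresses of the media items in its class."""
    entries = [
        {"class": k, "fused_image": image, "addresses": list(addresses)}
        for k, (image, addresses) in enumerate(zip(fused_images, index))
    ]
    return json.dumps(entries, indent=2)
```

A viewer could show each `fused_image` as a thumbnail and follow `addresses` back to the original text, image, audio, or video items.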

Claims (8)

1. An information abstract extraction method combined with cross-media fusion, characterized in that the input multimedia data (text, image, audio, video, etc.) is first classified by data type; the different-dimensional multimedia data is then converted to a common dimension and given text labels, yielding same-dimensional images and text labels; the same-dimensional image data is then clustered and a text label relevance check is performed; the same-dimensional images of each class are then fused into one image; and finally a cross-media information summary is generated; the method comprising at least the following steps:
Step 1: classifying the input multimedia data (text, image, audio, video) by data type into original text data, original image data, original audio data, and original video data;
Step 2: setting a standard value for the image data dimension (image pixels), building a same-dimensional image sample library with text labels, and converting the different-dimensional multimedia data to the common dimension, using the processing method appropriate to each data type;
Step 3: for the processed same-dimensional image data, determining a threshold according to the accuracy required of the clustering, clustering with an image clustering algorithm, performing a text label relevance check using the text labels of each class, and re-clustering the data that fails the check until the number of failing items is less than the threshold, thereby obtaining the addresses of the same-dimensional image data in each class, i.e. the index;
Step 4: fusing the clustered same-dimensional image data according to a fusion rule to obtain a fused image for each class of same-dimensional image data;
Step 5: generating the information summary from the fused image and the index of each class of same-dimensional image data.
2. The information abstract extraction method combined with cross-media fusion according to claim 1, characterized in that the process in step 2 of converting the original text data into same-dimensional image data further comprises at least the following steps:
1) preprocessing: using a text mining technique to extract the keywords of each group of text passages in the original text data as labels;
2) mapping each group of text data to same-dimensional image data according to its label keywords and the sample library, wherein one group of text may correspond to multiple labels and multiple same-dimensional images, each taken from the corresponding sample images.
3. The information abstract extraction method combined with cross-media fusion according to claim 1, characterized in that the process in step 2 of converting the original image data into same-dimensional image data further comprises at least the following steps:
1) preprocessing the original image data with an algorithm that enhances key features to obtain the processed image;
2) scaling the processed image with an image scaling technique to same-dimensional image data (the same dimension as the sample library);
3) comparing the same-dimensional image data with the sample library using an image recognition method to obtain the text label of the image, and storing the resulting label.
4. The information abstract extraction method combined with cross-media fusion according to claim 1, characterized in that the process in step 2 of converting the original audio data into same-dimensional image data further comprises at least the following steps:
1) preprocessing the original audio data with algorithms that extract key features such as the audio scene and the spoken-language semantics to obtain the extracted text labels;
2) matching the extracted text labels against the sample library to obtain same-dimensional image data, wherein one group of audio may correspond to multiple labels and multiple same-dimensional images, each taken from the corresponding sample images.
5. The information abstract extraction method combined with cross-media fusion according to claim 1, characterized in that the process in step 2 of converting the original video data into same-dimensional image data further comprises at least the following steps:
1) preprocessing the original video data with a scene segmentation algorithm, splitting each video into a number of video segments, one per scene;
2) for each video segment, applying a key frame extraction method to obtain key frame images, the key frame images of each video forming a set;
3) enhancing the key features of the key frame images with a suitable algorithm;
4) scaling the processed images with an image scaling technique to same-dimensional image data (the same dimension as the sample library);
5) comparing the same-dimensional image data with the sample library using an image recognition method to obtain the text label of each image, and storing the resulting label.
6. The information abstract extraction method combined with cross-media fusion according to claim 1, characterized in that the process in step 3 of clustering the same-dimensional images and building the index further comprises at least the following steps:
1) for the processed same-dimensional image data, determining the threshold according to the accuracy required of the clustering, wherein the smaller the threshold, the more classes there are and the more precise the classification, and conversely the fewer the classes;
2) clustering with an image clustering algorithm, storing the addresses of the clustered same-dimensional images, and, for the images clustered into the same class, extracting their corresponding text labels and checking the relevance between the text labels and the image clustering result;
3) for the clustered same-dimensional image data, if the number of items that fail the check is greater than the threshold, removing the failing items from their class so that they become unclustered same-dimensional image data again, and re-clustering them with the same or a different method until the number of failing items is less than the threshold;
4) storing the classification result as addresses to obtain the index, which holds the addresses of the same-dimensional image data in each class.
7. The information abstract extraction method combined with cross-media fusion according to claim 1, characterized in that the process in step 4 of fusing the same-dimensional image data of each class into one image further comprises at least the following step:
1) retrieving the same-dimensional image data of each class in turn by index and fusing it according to a fusion rule, thereby obtaining a fused image for each class of same-dimensional image data.
8. The information abstract extraction method combined with cross-media fusion according to claim 1, characterized in that the process in step 5 of generating the information summary from the fused same-dimensional image and the index of each class of information further comprises at least the following step:
1) generating the information summary from the obtained fused images and the index.
CN201510123093.1A 2015-03-20 2015-03-20 A kind of informative abstract extracting method of combination across Media Convergence Expired - Fee Related CN104679902B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510123093.1A CN104679902B (en) 2015-03-20 2015-03-20 A kind of informative abstract extracting method of combination across Media Convergence

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510123093.1A CN104679902B (en) 2015-03-20 2015-03-20 A kind of informative abstract extracting method of combination across Media Convergence

Publications (2)

Publication Number Publication Date
CN104679902A true CN104679902A (en) 2015-06-03
CN104679902B CN104679902B (en) 2017-11-28

Family

ID=53314944

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510123093.1A Expired - Fee Related CN104679902B (en) 2015-03-20 2015-03-20 A kind of informative abstract extracting method of combination across Media Convergence

Country Status (1)

Country Link
CN (1) CN104679902B (en)

Cited By (23)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105142096A (en) * 2015-08-14 2015-12-09 湘潭大学 Neural network-based cross-media data fusion method in internet of things
CN105245370A (en) * 2015-10-13 2016-01-13 湘潭大学 A self-adaptive hierarchical cross-media data fusion method in the Internet of Things
CN106686403A (en) * 2016-12-07 2017-05-17 腾讯科技(深圳)有限公司 Video preview generation method, device, server and system
WO2017092574A1 (en) * 2015-12-01 2017-06-08 慧科讯业有限公司 Mixed data type data based data mining method
CN106997387A (en) * 2017-03-28 2017-08-01 中国科学院自动化研究所 The multi-modal automaticabstracting matched based on text image
WO2017128438A1 (en) * 2016-01-31 2017-08-03 深圳市博信诺达经贸咨询有限公司 Method and system for application of big data
CN107437100A (en) * 2017-08-08 2017-12-05 重庆邮电大学 A kind of picture position Forecasting Methodology based on the association study of cross-module state
CN107885845A (en) * 2017-11-10 2018-04-06 广州酷狗计算机科技有限公司 Audio frequency classification method and device, computer equipment and storage medium
CN108388942A (en) * 2018-02-27 2018-08-10 四川云淞源科技有限公司 Information intelligent processing method based on big data
CN110472075A (en) * 2018-05-09 2019-11-19 中国互联网络信息中心 A kind of isomeric data classification storage method and system based on machine learning
CN110489475A (en) * 2019-08-14 2019-11-22 广东电网有限责任公司 A kind of multi-source heterogeneous data processing method, system and relevant apparatus
CN110532426A (en) * 2019-08-27 2019-12-03 新华智云科技有限公司 It is a kind of to extract the method and system that Multi-media Material generates video based on template
CN110837560A (en) * 2019-11-15 2020-02-25 北京字节跳动网络技术有限公司 Label mining method, device, equipment and storage medium
WO2020048308A1 (en) * 2018-09-03 2020-03-12 腾讯科技(深圳)有限公司 Multimedia resource classification method and apparatus, computer device, and storage medium
CN111291204A (en) * 2019-12-10 2020-06-16 河北金融学院 Multimedia data fusion method and device
CN111488490A (en) * 2020-03-31 2020-08-04 北京奇艺世纪科技有限公司 Video clustering method, device, server and storage medium
CN111767395A (en) * 2020-06-30 2020-10-13 平安国际智慧城市科技股份有限公司 Abstract generation method and system based on picture
CN112860905A (en) * 2021-04-08 2021-05-28 深圳壹账通智能科技有限公司 Text information extraction method, device and equipment and readable storage medium
CN112925902A (en) * 2021-02-22 2021-06-08 新智认知数据服务有限公司 Method and system for intelligently extracting text abstract in case text and electronic equipment
CN113505201A (en) * 2021-07-29 2021-10-15 宁波薄言信息技术有限公司 Contract extraction method based on SegaBert pre-training model
CN117371533A (en) * 2023-11-01 2024-01-09 深圳市马博士网络科技有限公司 Method and device for generating data tag rule
CN117573870A (en) * 2023-11-20 2024-02-20 中国人民解放军国防科技大学 A text label extraction method, device, equipment and medium for multi-modal data
CN119293239A (en) * 2024-12-09 2025-01-10 阿里云飞天(杭州)云计算技术有限公司 Data classification method and work order classification method

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050203864A1 (en) * 2001-06-27 2005-09-15 Ontrak Data International, Inc. System and method for data management
CN101021849A (en) * 2006-09-14 2007-08-22 浙江大学 Transmedia searching method based on content correlation
CN102693321A (en) * 2012-06-04 2012-09-26 常州南京大学高新技术研究院 Cross-media information analysis and retrieval method
CN103646094A (en) * 2013-12-18 2014-03-19 上海紫竹数字创意港有限公司 System and method for automatic extraction and generation of audiovisual product content abstract
CN104166982A (en) * 2014-06-30 2014-11-26 复旦大学 Image optimization clustering method based on typical correlation analysis

Cited By (35)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105142096B (en) * 2015-08-14 2018-10-19 湘潭大学 Across media data fusion method based on neural network in Internet of Things
CN105142096A (en) * 2015-08-14 2015-12-09 湘潭大学 Neural network-based cross-media data fusion method in internet of things
CN105245370A (en) * 2015-10-13 2016-01-13 湘潭大学 A self-adaptive hierarchical cross-media data fusion method in the Internet of Things
CN105245370B (en) * 2015-10-13 2019-03-19 湘潭大学 Across the media data fusion method of adaptive layered in a kind of Internet of Things
WO2017092574A1 (en) * 2015-12-01 2017-06-08 慧科讯业有限公司 Mixed data type data based data mining method
CN106815253A (en) * 2015-12-01 2017-06-09 慧科讯业有限公司 Mining method based on mixed data type data
WO2017128438A1 (en) * 2016-01-31 2017-08-03 深圳市博信诺达经贸咨询有限公司 Method and system for application of big data
CN106686403A (en) * 2016-12-07 2017-05-17 腾讯科技(深圳)有限公司 Video preview generation method, device, server and system
CN106997387B (en) * 2017-03-28 2019-08-09 中国科学院自动化研究所 Multimodal Automatic Summarization Method Based on Text-Image Matching
CN106997387A (en) * 2017-03-28 2017-08-01 中国科学院自动化研究所 Multimodal automatic summarization method based on text-image matching
CN107437100A (en) * 2017-08-08 2017-12-05 重庆邮电大学 Image location prediction method based on cross-modal association learning
CN107885845A (en) * 2017-11-10 2018-04-06 广州酷狗计算机科技有限公司 Audio classification method and device, computer equipment and storage medium
CN108388942A (en) * 2018-02-27 2018-08-10 四川云淞源科技有限公司 Information intelligent processing method based on big data
CN110472075A (en) * 2018-05-09 2019-11-19 中国互联网络信息中心 Heterogeneous data classification and storage method and system based on machine learning
US11798278B2 (en) 2018-09-03 2023-10-24 Tencent Technology (Shenzhen) Company Limited Method, apparatus, and storage medium for classifying multimedia resource
WO2020048308A1 (en) * 2018-09-03 2020-03-12 腾讯科技(深圳)有限公司 Multimedia resource classification method and apparatus, computer device, and storage medium
CN110489475A (en) * 2019-08-14 2019-11-22 广东电网有限责任公司 Multi-source heterogeneous data processing method, system and related apparatus
CN110532426A (en) * 2019-08-27 2019-12-03 新华智云科技有限公司 Method and system for generating video by extracting multimedia material based on a template
CN110837560A (en) * 2019-11-15 2020-02-25 北京字节跳动网络技术有限公司 Label mining method, device, equipment and storage medium
CN110837560B (en) * 2019-11-15 2022-03-15 北京字节跳动网络技术有限公司 Label mining method, device, equipment and storage medium
CN111291204B (en) * 2019-12-10 2023-08-29 河北金融学院 Multimedia data fusion method and device
CN111291204A (en) * 2019-12-10 2020-06-16 河北金融学院 Multimedia data fusion method and device
CN111488490A (en) * 2020-03-31 2020-08-04 北京奇艺世纪科技有限公司 Video clustering method, device, server and storage medium
CN111488490B (en) * 2020-03-31 2024-08-02 北京奇艺世纪科技有限公司 Video clustering method, device, server and storage medium
CN111767395A (en) * 2020-06-30 2020-10-13 平安国际智慧城市科技股份有限公司 Abstract generation method and system based on picture
CN111767395B (en) * 2020-06-30 2023-12-26 平安国际智慧城市科技股份有限公司 Abstract generation method and system based on pictures
CN112925902B (en) * 2021-02-22 2024-01-30 新智认知数据服务有限公司 Method, system and electronic equipment for intelligently extracting text abstract from case text
CN112925902A (en) * 2021-02-22 2021-06-08 新智认知数据服务有限公司 Method and system for intelligently extracting text abstract in case text and electronic equipment
CN112860905A (en) * 2021-04-08 2021-05-28 深圳壹账通智能科技有限公司 Text information extraction method, device and equipment and readable storage medium
CN113505201A (en) * 2021-07-29 2021-10-15 宁波薄言信息技术有限公司 Contract extraction method based on SegaBert pre-training model
CN117371533B (en) * 2023-11-01 2024-05-24 深圳市马博士网络科技有限公司 Method and device for generating data tag rule
CN117371533A (en) * 2023-11-01 2024-01-09 深圳市马博士网络科技有限公司 Method and device for generating data tag rule
CN117573870A (en) * 2023-11-20 2024-02-20 中国人民解放军国防科技大学 A text label extraction method, device, equipment and medium for multi-modal data
CN117573870B (en) * 2023-11-20 2024-05-07 中国人民解放军国防科技大学 A method, device, equipment and medium for extracting text tags from multimodal data
CN119293239A (en) * 2024-12-09 2025-01-10 阿里云飞天(杭州)云计算技术有限公司 Data classification method and work order classification method

Also Published As

Publication number Publication date
CN104679902B (en) 2017-11-28

Similar Documents

Publication Publication Date Title
CN104679902B (en) Information abstract extraction method in conjunction with cross-media fusion
CN108009228B (en) Method, device and storage medium for setting content label
EP3477506B1 (en) Video detection method, server and storage medium
Ding et al. Learning topical translation model for microblog hashtag suggestion.
CN105027162B (en) Image analysis apparatus, image analysis system, method for analyzing image
CN101996191B (en) Method and system for searching for two-dimensional cross-media element
CN105593851A (en) A method and an apparatus for tracking microblog messages for relevancy to an entity identifiable by an associated text and an image
CN102508923A (en) Automatic video annotation method based on automatic classification and keyword marking
Elkasrawi et al. What you see is what you get? Automatic Image Verification for Online News Content
CN108427925A (en) Copy video detection method based on continuous copy frame sequence
Maigrot et al. Mediaeval 2016: A multimodal system for the verifying multimedia use task
CN105912684A (en) Cross-media retrieval method based on visual features and semantic features
Amato et al. Searching and annotating 100M Images with YFCC100M-HNfc6 and MI-File
WO2024188044A1 (en) Video tag generation method and apparatus, electronic device, and storage medium
CN118076982A (en) Information extraction and structuring method
Papadopoulos et al. Image clustering through community detection on hybrid image similarity graphs
Guo et al. Saliency detection on sampled images for tag ranking
Wang et al. An efficient refinement algorithm for multi-label image annotation with correlation model
Tang et al. Label-specific training set construction from web resource for image annotation
Chou et al. Multimodal video-to-near-scene annotation
Zhang et al. Multi-modal tag localization for mobile video search
Inayathulla et al. Supervised Deep Learning Approach for Generating Dynamic Summary of the Video
Balasundaram et al. Unsupervised learning‐based recognition and extraction for intelligent automatic video retrieval
CN107391613A (en) Automatic multi-document disambiguation method and device for industrial safety topics
Taileb et al. Multimodal automatic image annotation method using association rules mining and clustering

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20171128

CF01 Termination of patent right due to non-payment of annual fee