KR101981924B1 - Media contents discriminating method - Google Patents

Info

Publication number
KR101981924B1
Authority
KR
South Korea
Prior art keywords
fingerprint
query
content
matching
fingerprints
Prior art date
Application number
KR1020150169037A
Other languages
Korean (ko)
Other versions
KR20170063077A (en)
Inventor
박지현
김정현
서용석
유원영
임동혁
서영호
손욱호
Original Assignee
한국전자통신연구원
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 한국전자통신연구원
Priority to KR1020150169037A
Publication of KR20170063077A
Application granted
Publication of KR101981924B1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/40 Information retrieval; Database structures therefor; File system structures therefor of multimedia data, e.g. slideshows comprising image and additional audio data
    • G06F16/43 Querying
    • G06F16/48 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Library & Information Science (AREA)

Abstract

Embodiments of the present invention relate to a method for identifying media content. According to an embodiment of the present invention, a method for identifying media content includes: storing, in a database, some of the fingerprints extracted from content to be compared, wherein some of the stored fingerprints are contiguous with each other; retrieving from the database a matching fingerprint that matches a query fingerprint extracted from query content; calculating a time interval between the retrieved matching fingerprint and the next fingerprint stored in the database; extracting a next query fingerprint from the query content using the calculated time interval; and determining that the content to be compared and the query content are the same media content when matching fingerprints are found for all query fingerprints extracted from the query content. According to embodiments of the present invention, it is possible to reduce the amount of data to be managed for media content identification.

Description

Media contents discriminating method

Embodiments of the invention relate to a method of identifying media content.

In the media field, fingerprinting technology extracts unique features (fingerprints) of media content such as music or movies, stores them in a database, and recognizes unknown media content based on them.

The size of a fingerprint depends on how it is created, but is typically very small, from 1/1,000 to 1/5,000 of the original media content size. However, as the amount of media content on the Internet soars, the volume of fingerprints to be stored in a database also increases.

Most fingerprint systems manage fingerprints in an in-memory database for fast retrieval. As the volume of fingerprints increases, the required memory capacity also increases rapidly.

Domestic Publication No. 10-2004-0086350 (Efficient Storage of Fingerprint)

Embodiments of the present disclosure provide a method of storing in a database only some of the fingerprints extracted from media content, and identifying media content using only that partial fingerprint database.

In accordance with an aspect of the present invention, there is provided a method for identifying media content, the method comprising: storing, in a database, some of the fingerprints extracted from content to be compared, wherein some of the stored fingerprints are contiguous with each other; retrieving from the database a matching fingerprint that matches a query fingerprint extracted from query content; calculating a time interval between the retrieved matching fingerprint and the next fingerprint stored in the database; extracting a next query fingerprint from the query content using the calculated time interval; and determining that the content to be compared and the query content are the same media content when matching fingerprints are found for all query fingerprints extracted from the query content.

According to embodiments of the present invention, it is possible to reduce the amount of data to be managed for media content identification.

According to embodiments of the present invention, it is possible to reduce the cost of building and operating a system for media content identification.

FIG. 1 is an exemplary diagram for explaining a process of identifying media content using a fingerprint;
FIG. 2 is an exemplary diagram for explaining an editing section in media content;
FIG. 3 is an exemplary diagram for explaining a case where media contents that are substantially the same are determined not to be the same due to the existence of an editing section;
FIG. 4 is an exemplary diagram for describing fingerprints stored in a database according to an embodiment of the present invention;
FIG. 5 is an exemplary diagram for explaining a query fingerprint selection method according to an embodiment of the present invention;
FIG. 6 is a flowchart illustrating a media content search process according to an embodiment of the present invention;
FIG. 7 is a block diagram illustrating an apparatus according to an embodiment of the present invention.

In describing the embodiments of the present invention, a detailed description of a related known function or configuration will be omitted when it is determined that it may unnecessarily obscure the subject matter of the present invention.

Hereinafter, embodiments of the present invention will be described with reference to the accompanying drawings.

FIG. 1 is an exemplary diagram for explaining a process of identifying media content using a fingerprint.

When a search is performed using all of the fingerprints extracted from the media content used for a query (hereinafter referred to as query content), the search takes a long time. Therefore, if a search is performed using only some of the fingerprints extracted from the query content, the search time can be shortened.

For example, suppose that a search is performed using four fingerprints Q1, Q2, Q3, and Q4 among all fingerprints extracted from the query content, as shown in FIG. 1. Hereinafter, a fingerprint used for a query is called a query fingerprint.

The media content identification apparatus may search the database for content containing the four query fingerprints Q1, Q2, Q3, and Q4, and select that content as a candidate for comparison.

In addition, the media content identification device may calculate the time intervals T1, T2, and T3 between the fingerprints in the candidate content that match the query fingerprints. Hereinafter, a fingerprint matching a query fingerprint is called a matching fingerprint.

The media content identification device compares the time intervals t1, t2, and t3 between the query fingerprints Q1, Q2, Q3, and Q4 with the time intervals T1, T2, and T3 between the matching fingerprints R1, R2, R3, and R4. If the differences between them, that is, between T1 and t1, between T2 and t2, and between T3 and t3, are less than a threshold, the candidate can be determined to be media content with the same content as the query content.
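
The interval comparison described above can be sketched as follows. This is a minimal Python illustration, not the implementation described in the patent: the function names `intervals` and `same_by_intervals`, the use of timestamps in seconds, and the rule that every interval difference must fall below the threshold are assumptions made for the example.

```python
def intervals(timestamps):
    """Return the gaps between consecutive timestamps (e.g. t1, t2, t3)."""
    return [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]


def same_by_intervals(query_times, match_times, threshold):
    """Compare query-fingerprint spacing with matching-fingerprint spacing.

    query_times: positions of Q1..Qn in the query content (hypothetical, seconds)
    match_times: positions of R1..Rn in the candidate content (hypothetical, seconds)
    """
    if len(query_times) != len(match_times):
        return False
    query_gaps = intervals(query_times)    # t1, t2, t3
    match_gaps = intervals(match_times)    # T1, T2, T3
    # Assumption: every pair of intervals must differ by less than the threshold.
    return all(abs(T - t) < threshold for T, t in zip(match_gaps, query_gaps))
```

Calling `same_by_intervals` with four query positions and four matching positions checks three interval pairs, mirroring the Q1 to Q4 and R1 to R4 example in FIG. 1. Note that both contents may start at different absolute positions; only the spacing is compared.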

According to the media content identification method described with reference to FIG. 1, it is not possible to know in advance which part of the query content will be used for the actual query, and thus which part of the content to be compared will be matched. All fingerprints must therefore be stored in the database, which requires a lot of storage space.

According to an embodiment of the present invention, a method is proposed in which only some of the fingerprints extracted from the content to be compared are stored in the database, and media content is identified using only this partial fingerprint database.

For this purpose, it should be considered that substantially identical media content may be determined not to be identical because of an editing section present in the media content.

For example, media content such as a movie or TV program generally consists of a title, main content, ending credits, and the like.

However, such media content may be edited by a company or the like that provides it. Such edits can be made to any of the title, main content, and ending credits. For example, as shown in FIG. 2, the media content screened in a movie theater may include title 1 and title 2, but the media content broadcast on TV may not include title 2.

That is, even for substantially the same media content, the lengths may differ because of an editing section, which may cause media content that is substantially the same to be determined not to be the same. This will be described with reference to FIG. 3.

FIG. 3 is an exemplary diagram for explaining a case where media contents that are substantially identical to each other are determined not to be identical due to the existence of an editing section.

In the embodiment described with reference to FIG. 3, it is assumed that some of the fingerprints extracted from the content to be compared are stored in the database. The fingerprints stored in the database are assumed to be spaced at a set time interval.

On the other hand, the query content is assumed to be substantially the same media content as the content to be compared, but edited to add a title portion, so that it is longer than the content to be compared. In addition, it is assumed that a search is performed using query fingerprints extracted at a set time interval.

In this case, the query fingerprint 301 extracted from the query content should be mapped to the fingerprint 302 extracted from the content to be compared, but since the fingerprint 302 is not stored in the database, the content to be compared may be determined to be media content different from the query content.

That is, even though the two media contents are substantially the same content, since the fingerprint matching the query fingerprint does not exist in the database, the two media contents may be determined to be different media contents.
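
The mismatch described above can be reproduced with a small Python sketch. It is a simplified model rather than the patent's method: fingerprints are identified only by their position in seconds, a match requires a fingerprint stored at exactly the corresponding position, and the names `stored_positions`, `query_hits`, and `title_offset` are hypothetical.

```python
def stored_positions(duration, interval):
    """Positions (in seconds) of the fingerprints kept in the database."""
    return set(range(0, duration, interval))


def query_hits(query_duration, query_interval, title_offset, stored):
    """For each fixed-interval query position, report whether a stored fingerprint exists.

    Assumption: a query fingerprint at position q corresponds to position
    q - title_offset in the content to be compared, and a match requires a
    fingerprint stored at exactly that position.
    """
    hits = []
    for q in range(0, query_duration, query_interval):
        original_position = q - title_offset
        hits.append(original_position in stored)
    return hits
```

With fingerprints stored every 10 seconds, an unedited query sampled every 10 seconds hits a stored fingerprint each time, whereas a query with a 4-second title added misses every time, even though the underlying content is the same.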

Therefore, for accurate retrieval, there is a need for a method for selecting a query fingerprint using time information of fingerprints stored in a database.

FIG. 4 is an exemplary diagram for describing fingerprints stored in a database according to an embodiment of the present invention.

According to an embodiment of the present invention, only some of the fingerprints extracted from the media content may be stored in the database. The spacing between the stored fingerprints may or may not be uniform, depending on the intention of the system operator.

On the other hand, assuming that the length of the query fingerprint is L, the length of each fingerprint stored in the database may be greater than L. For example, to allow for differences in playback time caused by differences in encoding methods, it may be about 1.5 times L.

In addition, as described above, in consideration of the possibility that an editing section exists, some stored sections may span a longer time than others. Hereinafter, a stored fingerprint section that spans a longer time than the other sections is referred to as a synchronization block. The synchronization block can be used to adjust the timing at which query fingerprints are extracted.
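
One way to picture such a partial database is the following Python sketch. The text does not specify where synchronization blocks are placed or how long they are, so the periodic placement (`sync_every`), the block length (`sync_len`), the gap between stored sections (`gap`), and the names `StoredBlock` and `select_blocks` are all assumptions made for illustration; only the 1.5 × L length of ordinary sections comes from the description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoredBlock:
    start: float    # start position in the content to be compared (seconds)
    length: float   # time span covered by the stored fingerprints (seconds)
    is_sync: bool   # True for a synchronization block


def select_blocks(duration, query_len, gap, sync_every, sync_len):
    """Choose which sections of the content to store in the database.

    Ordinary sections span 1.5 * query_len; every sync_every-th section is a
    longer synchronization block (hypothetical placement rule).
    """
    blocks = []
    start = 0.0
    index = 0
    while start < duration:
        is_sync = index % sync_every == 0
        length = sync_len if is_sync else 1.5 * query_len
        length = min(length, duration - start)   # do not run past the end
        blocks.append(StoredBlock(start, length, is_sync))
        start += length + gap                    # skip the unstored gap
        index += 1
    return blocks
```

For a 100-second content with `query_len=2`, `gap=20`, `sync_every=3`, and `sync_len=10`, this keeps four sections covering 26 seconds in total, so roughly a quarter of the fingerprints would be stored under these assumed parameters.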

FIG. 5 is an exemplary diagram for explaining a query fingerprint selection method according to an embodiment of the present invention.

First, as shown in (a) of FIG. 5, it is assumed that some of the fingerprints extracted from the content to be compared are stored.

Thereafter, when the query content is input, the media content identification device may perform a search by extracting one query fingerprint from the query content, as shown in (b) of FIG. 5. When extracting the initial query fingerprint, the device may skip a set time interval so that the query fingerprint is extracted from the main content portion. In (b) of FIG. 5, it is assumed that a matching fingerprint matching the query fingerprint exists in the synchronization block.

Thereafter, as illustrated in (c) of FIG. 5, the media content identification device may calculate the time interval between the matching fingerprint and the next fingerprint stored in the database. Assuming that the calculated time interval is t1, the media content identification device may extract the next query fingerprint at the position t1 after the current query fingerprint and perform a search using the extracted fingerprint.

Subsequently, as illustrated in (d) of FIG. 5, the media content identification apparatus may extract the query fingerprint at a predetermined time interval t2 to perform a search.

Then, as shown in (e) of FIG. 5, when a synchronization block is found, the media content identification device may calculate the time interval t3 between the matching fingerprint in the synchronization block and the next fingerprint stored in the database. The media content identification device may then perform a search by extracting the next query fingerprint at the position t3 after the last extracted query fingerprint.

The media content identification apparatus may identify the media content by repeating the processes of (c) to (e) of FIG. 5.

FIG. 6 is a flowchart illustrating a media content search process according to an embodiment of the present invention. According to an embodiment, at least one of the steps illustrated in FIG. 6 may be omitted.

In step 601, the media content identification apparatus may extract a query fingerprint at position t of the query content.

In step 603, the media content identification apparatus may search for a matching fingerprint that matches the query fingerprint. If a matching fingerprint is found, the process proceeds to step 605. Otherwise, the process proceeds to step 617, where it is determined that the content to be compared is not the same as the query content.

In step 605, when the matching fingerprint is found, the media content identification device may check whether the matching fingerprint exists in the synchronization block.

In step 607, if the matching fingerprint is present in the synchronization block, the media content identification device may calculate the time interval t1 between the matching fingerprint and the next fingerprint stored in the database.

In step 609, the media content identification apparatus may calculate the time t at which to extract the next query fingerprint from the query content by reflecting the calculated time interval t1. On the other hand, in step 615, when the matching fingerprint does not exist in the synchronization block, the media content identification apparatus may calculate the time t at which to extract the next query fingerprint from the query content by reflecting the set time interval t2.

In step 611, the media content identification device may check whether the calculated time t exceeds the duration of the query content. If the calculated time t is shorter than the duration of the query content, the process returns to step 601 to continue query fingerprint extraction and matching fingerprint search.

In step 613, when the calculated time t is longer than the duration of the query content, the media content identification device may determine that the content to be compared is the same media content as the query content.
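
The flow of FIG. 6 can be sketched in Python as follows. This is a hedged illustration under several assumptions that are not in the original text: fingerprints are compared by exact equality (real systems would use a similarity measure), the database is a time-ordered list for a single content to be compared, the "next fingerprint" after a synchronization-block match is taken to be the next stored fingerprint outside the block, and the names `StoredFingerprint`, `find_match`, `next_outside_sync`, `is_same_content`, and the `extract` callback are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Hashable, Optional, Sequence


@dataclass(frozen=True)
class StoredFingerprint:
    time: float           # position in the content to be compared
    value: Hashable       # fingerprint value (exact-match stand-in)
    in_sync_block: bool   # True if stored as part of a synchronization block


def find_match(db: Sequence[StoredFingerprint], value: Hashable) -> Optional[int]:
    """Step 603: return the index of the stored fingerprint equal to value, if any."""
    for index, stored in enumerate(db):
        if stored.value == value:
            return index
    return None


def next_outside_sync(db: Sequence[StoredFingerprint], index: int) -> Optional[StoredFingerprint]:
    """Return the first later stored fingerprint that is not in a synchronization block."""
    for stored in db[index + 1:]:
        if not stored.in_sync_block:
            return stored
    return None


def is_same_content(db: Sequence[StoredFingerprint],
                    extract: Callable[[float], Hashable],
                    query_duration: float,
                    start_t: float,
                    t2: float) -> bool:
    """Walk the query content, re-timing the next extraction after each match."""
    t = start_t
    while t < query_duration:                       # step 611: stop past the end
        index = find_match(db, extract(t))          # steps 601 and 603
        if index is None:
            return False                            # step 617: not the same content
        matched = db[index]
        nxt = next_outside_sync(db, index) if matched.in_sync_block else None  # step 605
        if nxt is not None:
            t += nxt.time - matched.time            # steps 607 and 609: use interval t1
        else:
            t += t2                                 # step 615: use the set interval t2
    return True                                     # step 613: all queries matched
```

As a usage example, suppose the database holds a synchronization block covering seconds 10 to 14 plus single fingerprints at seconds 20, 30, 40, and 50, and the query is the same content with a 3-second title prepended. Starting at `start_t=15` with `t2=10`, the first query lands inside the synchronization block, the computed interval re-aligns the next extraction with the stored fingerprint at second 20, and the remaining queries match at the fixed interval. The sketch relies on the first extraction landing in a stored section, as the description of FIG. 5 assumes.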

FIG. 7 is a block diagram illustrating an apparatus according to an embodiment of the present invention.

Referring to FIG. 7, an apparatus according to an embodiment of the present invention includes a query content input unit 710, a matching content search unit 720, and a database 730. According to an embodiment, at least one of the aforementioned components may be omitted.

The query content input unit 710 may receive query content from a user or a system operator.

The matching content search unit 720 may search whether the comparison target content that matches the query content is stored.

For example, the matching content search unit 720 may extract a query fingerprint at a predetermined position of the query content and search whether a matching fingerprint matching the extracted query fingerprint is stored.

When the matching fingerprint matching the query fingerprint is not stored, the matching content search unit 720 may determine that the comparison target content and the query content are different media contents.

When the matching fingerprint exists in the synchronization block, the matching content search unit 720 may extract the next query fingerprint from the query content by applying the set first time interval. If the matching fingerprint does not exist in the synchronization block, the matching content search unit 720 may extract the next query fingerprint from the query content by applying the set second time interval.

When the matching fingerprints matching all the query fingerprints extracted from the query content are found, the matching content search unit 720 may determine that the content to be compared is the same media content as the query content. Accordingly, the matching content search unit 720 may output information about the content to be compared as a search result.

The database 730 may store a fingerprint of content to be compared.

Embodiments of the present invention described above may be implemented in any of various ways. For example, embodiments of the present invention may be implemented using hardware, software, or a combination thereof. If implemented in software, it may be implemented as software running on one or more processors utilizing various operating systems or platforms. In addition, such software may be written using any of a number of suitable programming languages, and may also be compiled into machine code or intermediate code executable in a framework or virtual machine.

In addition, when embodiments of the present invention are executed on one or more processors, they may be implemented as a processor-readable medium (e.g., memory, floppy disk, hard disk, compact disk, optical disk, or magnetic tape) on which one or more programs for performing the methods of the various embodiments discussed above are recorded.

Claims (10)

1. A media content identification method performed by a media content identification device for determining whether contents are the same, the method comprising:
storing, in a database, a plurality of fingerprints selected from the fingerprints extracted from content to be compared;
extracting, as a query fingerprint, a fingerprint selected from the fingerprints extracted from query content;
retrieving from the database a matching fingerprint that matches the query fingerprint;
calculating a time interval t between the matching fingerprint and a next fingerprint stored in the database;
extracting a next query fingerprint from the query content using t; and
determining that the query content and the content to be compared are the same when matching fingerprints matching all of the next query fingerprints extracted from the query content are found,
wherein the plurality of fingerprints selected from the fingerprints extracted from the content to be compared include a synchronization block that is a set of consecutive fingerprints.
2. The method according to claim 1,
wherein extracting, as a query fingerprint, a fingerprint selected from the fingerprints extracted from the query content comprises
extracting, as the query fingerprint, a fingerprint located after a preset interval from the first fingerprint of the query content.
3. (Deleted)
4. The method according to claim 1,
wherein calculating the time interval t comprises,
if the matching fingerprint is present in the synchronization block, calculating the time interval between the matching fingerprint and the next fingerprint outside the synchronization block as t.
5. The method according to claim 1,
wherein extracting the next query fingerprint using t comprises
extracting, as the next query fingerprint, a fingerprint spaced t apart from the query fingerprint.
6. An apparatus comprising:
a query content input unit for receiving query content;
a database that stores a plurality of fingerprints selected from the fingerprints extracted from content to be compared; and
a matching content search unit that extracts a query fingerprint from the query content, retrieves a matching fingerprint matching the query fingerprint from the database, calculates a time interval t between the matching fingerprint and a next fingerprint stored in the database, extracts a next query fingerprint from the query content using t, and determines that the query content is the same as the content to be compared when a matching fingerprint matching the extracted next query fingerprint is found,
wherein the plurality of fingerprints selected from the fingerprints extracted from the content to be compared include a synchronization block that is a set of consecutive fingerprints.
7. The apparatus according to claim 6, wherein the matching content search unit
extracts, as the query fingerprint, a fingerprint located after a predetermined interval from the first fingerprint of the query content.
8. (Deleted)
9. The apparatus according to claim 6,
wherein the matching content search unit,
if the matching fingerprint is present in the synchronization block, calculates the time interval between the matching fingerprint and the next fingerprint outside the synchronization block as t.
10. The apparatus according to claim 6,
wherein the matching content search unit
extracts, as the next query fingerprint, a fingerprint spaced t apart from the query fingerprint.

Priority Applications (1)

Application Number Priority Date Filing Date Title
KR1020150169037A KR101981924B1 (en) 2015-11-30 2015-11-30 Media contents discriminating method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
KR1020150169037A KR101981924B1 (en) 2015-11-30 2015-11-30 Media contents discriminating method

Publications (2)

Publication Number Publication Date
KR20170063077A KR20170063077A (en) 2017-06-08
KR101981924B1 KR101981924B1 (en) 2019-08-30

Family

ID=59221354

Family Applications (1)

Application Number Title Priority Date Filing Date
KR1020150169037A KR101981924B1 (en) 2015-11-30 2015-11-30 Media contents discriminating method

Country Status (1)

Country Link
KR (1) KR101981924B1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11503365B2 (en) 2019-10-29 2022-11-15 Samsung Electronics Co., Ltd. Electronic apparatus and control method thereof

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR102141411B1 (en) 2018-03-02 2020-08-05 (주)미래기술 The content based clean cloud systems and method
KR102439201B1 (en) * 2020-09-14 2022-09-01 네이버 주식회사 Electronic device for synchronizing multimedia content and audio source and operating method thereof
KR20240071129A (en) * 2022-11-15 2024-05-22 삼성전자주식회사 Electronic apparatus and method for controlling thereof

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120077542A1 (en) 2010-05-05 2012-03-29 Rhoads Geoffrey B Methods and Arrangements Employing Mixed-Domain Displays
KR101315970B1 (en) * 2012-05-23 2013-10-08 (주)엔써즈 Apparatus and method for recognizing content using audio signal
KR101494309B1 (en) 2013-10-16 2015-02-23 강릉원주대학교산학협력단 Asymmetric fingerprint matching system for digital contents and providing method thereof

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2003067466A2 (en) 2002-02-05 2003-08-14 Koninklijke Philips Electronics N.V. Efficient storage of fingerprints
KR100916310B1 (en) * 2007-06-05 2009-09-10 주식회사 코난테크놀로지 System and Method for recommendation of music and moving video based on audio signal processing

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
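박만수 et al., Audio fingerprinting technique robust to real noise environments, Telecommunications Review, Vol. 16, No. 3, pp. 435-446, 2006.
서진수 et al., Robust audio fingerprinting using compressed-domain features, The Journal of the Acoustical Society of Korea, Vol. 28, No. 4, pp. 375-382, 2009.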

Also Published As

Publication number Publication date
KR20170063077A (en) 2017-06-08

Similar Documents

Publication Publication Date Title
US8886531B2 (en) Apparatus and method for generating an audio fingerprint and using a two-stage query
TWI447601B (en) Improving audio/video fingerprint search accuracy using multiple search combining
KR101981924B1 (en) Media contents discriminating method
WO2018107914A1 (en) Video analysis platform, matching method, and accurate advertisement push method and system
CN101821734B (en) Detection and classification of matches between time-based media
JP4398242B2 (en) Multi-stage identification method for recording
US20110173185A1 (en) Multi-stage lookup for rolling audio recognition
US11044520B2 (en) Handling of video segments in a video stream
CN102405639A (en) Verification and synchronization of files obtained separately from a video content
CN106557545B (en) Video retrieval method and device
CN103198293A (en) System and method for fingerprinting video
KR102233175B1 (en) Method for determining signature actor and for identifying image based on probability of appearance of signature actor and apparatus for the same
KR20070121810A (en) Synthesis of composite news stories
US20190362405A1 (en) Comparing audiovisual products
CN104504333A (en) Malicious code detection method and device of ELF (executable and linkable format) file
EP3745727A1 (en) Method and device for data processing
JP6495792B2 (en) Speech recognition apparatus, speech recognition method, and program
JP2009510509A (en) Method and apparatus for automatically generating a playlist by segmental feature comparison
US20150010288A1 (en) Media information server, apparatus and method for searching for media information related to media content, and computer-readable recording medium
CN106598997B (en) Method and device for calculating text theme attribution degree
KR101472016B1 (en) Creation method of complex file having image file and additional data inserted in the image file and data record apparatus recording the complex file
US8044290B2 (en) Method and apparatus for reproducing first part of music data having plurality of repeated parts
KR101672123B1 (en) Apparatus and method for generating caption file of edited video
KR100939215B1 (en) Creation apparatus and search apparatus for index database
US20110072117A1 (en) Generating a Synthetic Table of Contents for a Volume by Using Statistical Analysis

Legal Events

Date Code Title Description
A201 Request for examination
E902 Notification of reason for refusal
E701 Decision to grant or registration of patent right
GRNT Written decision to grant