JP2011029795A - System and method for providing annotation information of video content - Google Patents
- Publication number
- JP2011029795A, JP2009171668A
- Authority
- JP
- Japan
- Prior art keywords
- term
- video content
- terms
- annotation information
- genre
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Television Signal Processing For Recording (AREA)
Description
The present invention is a system that adds annotation information to arbitrary scenes of video content, such as broadcast programs and self-made videos, and is applied to video devices such as recording devices, video cameras, and editing devices.
Editing is indispensable for using video content, including self-made videos, more comfortably, efficiently, and actively, but various factors keep it from being easy to carry out.
The first step in editing video content is searching for the desired scenes and classifying them, which imposes a heavy burden in time, physical effort, and mental effort.
Searching for highlight scenes and commercial message scenes (hereinafter, CMs) is the prime example. Beyond this, necessary and unnecessary scenes must be separated to bring the content to an appropriate length. It is no exaggeration to say that most of the time spent editing video content accurately according to the production intent goes to finding and classifying these scenes.
For CMs, methods that detect them automatically from changes in the image or in the audio mode have long been in practical use.
Techniques that automatically detect highlight scenes using audio level detection and the like have also been proposed, but what counts as a highlight scene varies greatly from viewer to viewer, so automation is difficult except for specific kinds of video content.
For example, a magnificent view in a travel program moves everyone, but even an ordinary place becomes a nostalgic and valuable highlight scene for a person who once lived there or visited it.
Whether in a high school baseball program or a self-made video of a school athletic meet, a scene in which one's own child appears is an important highlight scene regardless of the image or the result.
Likewise, for political or economic content in a news program, agreement or disagreement with one's own opinion in each scene makes it a highlight scene.
Thus highlight scenes include scenes that an individual finds nostalgic, moving, or infuriating, among many others, and their detection cannot be automated across the board.
Even within video content that one wants to keep, some scenes are ones that the user does not want to see or show, or that are unnecessary or wasteful.
As described above, in editing video content for various purposes, human judgment and information centered on text (character information and the like) are indispensable for finding the target scene positions and classifying them appropriately, and this imposes a burden in time, physical effort, and mental effort.
For this reason, methods for supporting this human work by improving workability and efficiency, that is, for supporting the addition of information (annotation information) to scenes to be edited, have been proposed in various forms, particularly for professional use in the broadcasting and video content industries.
Input means for annotation information that attaches character information and the like to an arbitrary scene of such video content include the following.
One method uses a remote operation device (remote control) and a GUI (graphic user interface) keyboard on the display screen: kana characters are selected one at a time with the cursor buttons of the remote control and converted to kanji to register a term.
This method is used in many current video devices, but it requires practice, and even with practice it is a complicated operation with inefficient character input. In practice it is limited to editing the title of video content offline (when not playing) and cannot be used online (during playback).
Another method registers terms by entering characters one at a time on an external keyboard.
This method is efficient once mastered, but it is difficult to keep the added characters and terms consistent; the added information becomes ad hoc and burdens later searches.
Another method registers terms using voice recognition.
General voice recognition requires practice, cannot avoid misrecognition, and places a heavy load on the system. As with a keyboard, it is difficult to keep the added characters and terms consistent, and when there are multiple viewers the utterances are a disturbance.
Another method uses a dictionary for editing.
This method can make the added terms consistent, and methods with a dictionary for each genre have been proposed. However, these dictionaries are merely templates for voice recognition, and unless efficient methods for searching and selecting the registered terms are found, online use in particular is difficult.
Examples of prior art documents on annotation information addition using voice recognition and a dictionary are Japanese Patent Application Laid-Open No. 2004-86124 and Japanese Patent Application Laid-Open No. 2004-153964, which relate to a metadata (annotation information) production device and a search device for content production. In both, the misrecognition rate of voice recognition and disturbing utterances are problems. The latter is a system in which the produced video and audio content is played back to confirm the information to be made into metadata, and the metadata is produced by voice input into a computer or the like and then searched. Because the produced video and audio content must be checked in advance, it is difficult to use in real time, for example for highlight scenes of a program being broadcast.
Furthermore, Japanese Patent Application Laid-Open No. 2007-140198 relates to a metadata (annotation information) creation device that creates metadata related to video and audio content. It addresses the problem that keywords are assigned erroneously with voice recognition and aims to create keywords based on importance. However, because the operator's voice is registered as keywords by voice recognition, the misrecognition rate and disturbing utterances remain. In addition, the importance is registered mechanically in advance and is not necessarily the optimal weighting for a given scene.
Therefore, as above, it is difficult to use in real time, for example for highlight scenes of a program being broadcast.
To overcome the technical background described above, the invention provides an annotation information adding system that does not require a skilled operator or prior knowledge of the scenes, covers all genres of video content, and lets anyone, without special practice, add information centered on character information to any scene in real time while viewing online (currently broadcast) video content, with consideration for the viewing environment. The added information suits the scene, is consolidated for editing searches, and is added with high accuracy and at high speed.
Moreover, to reach a cost that allows wide adoption in general-purpose home recording devices, the main application of the present invention, the annotation information adding system can be realized with devices, components, and assembly techniques that are already widely available on the market, without any special ones.
To solve the above problems, claim 1 provides
a system for adding annotation information to an arbitrary scene of video content, including self-made video, the system comprising a video device and a user interface device for it, wherein
the video device comprises
an editing term dictionary in which heading terms, which are impression terms expressing the impression of viewing a scene and are common to scenes of all genres of video content, and genre terms, which are terms specific to a genre of video content, are associated by genre of video content and by hierarchy level;
the user interface device comprises
means for designating, sequentially from the start of viewing the video content, scene positions to which annotation information is to be added, sequentially selecting the heading term and the genre term of the editing term dictionary for each designated scene, and transmitting the signal information of these designations and selections to the video device; and
the video device further comprises
an annotation information creation unit that creates annotation information data from the editing term dictionary based on the signal information received from the user interface device.
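The editing term dictionary described above can be pictured as a two-level mapping: heading (impression) terms shared by every genre, each linked per genre to genre-specific terms. The following is a minimal sketch under that reading, not the patent's implementation; the class name `EditingTermDictionary`, its methods, and the sample terms are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class EditingTermDictionary:
    """Heading (impression) terms shared by all genres, each linked to genre terms."""

    # heading term -> genre -> genre-specific terms (one hierarchy level, assumed)
    entries: dict[str, dict[str, list[str]]] = field(default_factory=dict)

    def register(self, heading: str, genre: str, term: str) -> None:
        """Link a genre-specific term to a heading term for one genre."""
        terms = self.entries.setdefault(heading, {}).setdefault(genre, [])
        if term not in terms:
            terms.append(term)

    def heading_terms(self) -> list[str]:
        """Heading terms are the same whatever the genre."""
        return list(self.entries)

    def genre_terms(self, heading: str, genre: str) -> list[str]:
        """Genre terms reachable from a heading term within one genre."""
        return list(self.entries.get(heading, {}).get(genre, []))


# Hypothetical sample terms, for illustration only.
dictionary = EditingTermDictionary()
dictionary.register("moving", "sports", "home run")
dictionary.register("moving", "travel", "mountain view")
dictionary.register("funny", "sports", "blooper")
```

Keeping the heading terms as the outer key means the first selection step is identical for every genre, and only the second step depends on the genre of the content being viewed.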
In claim 2,
the heading terms and the genre terms of the editing term dictionary are each configured with at most 12 terms per group.
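One way to keep every selectable group within the 12-term limit stated above is to split longer term lists into consecutive groups. The sketch below assumes that overflow is handled by paging into further groups, which the text does not specify; `split_into_groups` and the placeholder terms are hypothetical.

```python
MAX_TERMS_PER_GROUP = 12  # per-group limit stated in the text above


def split_into_groups(terms: list[str], limit: int = MAX_TERMS_PER_GROUP) -> list[list[str]]:
    """Split a term list into consecutive groups of at most `limit` terms."""
    if limit < 1:
        raise ValueError("limit must be at least 1")
    return [terms[i:i + limit] for i in range(0, len(terms), limit)]


# 30 placeholder terms end up in groups of 12, 12, and 6.
groups = split_into_groups([f"term{n}" for n in range(1, 31)])
```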
In claim 3,
information on the degree of impression of a scene is registered in the editing term dictionary, and the selected degree of impression is used as annotation information.
In claim 4,
the video device includes a remote control signal receiving unit;
the user interface device is a remote control having at least 20 operation buttons, and the designation and selection signal information is a remote control transmission signal;
the scene position is designated and the heading term and the genre term of the editing term dictionary are selected by operating the remote control buttons; and
the video device receives the signal information at the remote control signal receiving unit, and the annotation information creation unit creates the annotation information data from the terms of the editing term dictionary.
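The button-driven flow described above (designate a scene position, select a heading term, then select a genre term) can be sketched as a small three-step session. This is an illustration under assumptions: the names `AnnotationSession` and `Annotation`, the use of seconds for the scene position, and the mapping of button presses to list indices are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Annotation:
    position_s: float   # scene position in seconds (assumed unit)
    heading_term: str
    genre_term: str


class AnnotationSession:
    """Collects one annotation per mark -> heading term -> genre term sequence."""

    def __init__(self, dictionary: dict[str, list[str]]):
        self.dictionary = dictionary          # heading term -> genre terms
        self.annotations: list[Annotation] = []
        self._position: Optional[float] = None
        self._heading: Optional[str] = None

    def mark_scene(self, position_s: float) -> None:
        """Step 1: designate the scene position; discards any unfinished choice."""
        self._position = position_s
        self._heading = None

    def select_heading(self, index: int) -> list[str]:
        """Step 2: pick a heading term by index; returns the genre terms to display."""
        if self._position is None:
            raise RuntimeError("mark a scene first")
        self._heading = list(self.dictionary)[index]
        return self.dictionary[self._heading]

    def select_genre_term(self, index: int) -> Annotation:
        """Step 3: pick a genre term by index and store the finished annotation."""
        if self._position is None or self._heading is None:
            raise RuntimeError("select a heading term first")
        record = Annotation(self._position, self._heading,
                            self.dictionary[self._heading][index])
        self.annotations.append(record)
        self._position = self._heading = None
        return record


# Hypothetical dictionary contents and button presses.
session = AnnotationSession({"moving": ["home run", "final out"], "funny": ["blooper"]})
session.mark_scene(754.0)
shown = session.select_heading(0)
result = session.select_genre_term(1)
```

Because each step only offers a short list, every choice fits on a handful of buttons and no character entry is needed during playback.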
In claim 5,
the video device includes a voice recognition unit;
the user interface device is a microphone for voice input, and the designation and selection signal information is a microphone voice signal;
the scene position is designated and the heading term and the genre term of the editing term dictionary are selected by uttering at most 30 kinds of sounds into the microphone; and
the voice recognition unit of the video device recognizes the microphone voice signal as signal information, and the annotation information creation unit creates the annotation information data from the terms of the editing term dictionary.
In claim 6,
the video device includes genre selection means that automatically selects the genre of the video content from the EPG (Electronic Program Guide) genre.
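Automatic genre selection from the EPG genre can be sketched as a lookup with one folding rule. The folding of anime/special effects into movie follows the FIG. 2 description in this document; the genre label strings, the function name `select_genre`, and the fallback to self-made video when no EPG genre is available are assumptions made for illustration.

```python
from typing import Optional

# Hypothetical EPG main-genre labels; real EPG data carries its own genre codes.
GENRE_FOLDING = {"anime/special effects": "movie"}
SELF_MADE = "self-made video"


def select_genre(epg_genre: Optional[str]) -> str:
    """Pick the dictionary genre: the EPG genre when present, else self-made video."""
    if epg_genre is None:  # assumption: content without EPG data is self-made
        return SELF_MADE
    genre = epg_genre.strip().lower()
    return GENRE_FOLDING.get(genre, genre)
```

For example, `select_genre("News")` returns `"news"`, while `select_genre(None)` returns `"self-made video"`.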
In claim 7,
the video device includes time-shift playback (chasing playback) means and pauses playback while annotation information is being edited.
In claim 8,
the name of the person who added the annotation information is registered in the annotation information.
In claim 9,
the video device further includes a dictionary download unit that downloads the editing term dictionary over a communication line, and a keyboard input unit for an external keyboard used to register terms.
In claim 10,
the video device builds an editing term dictionary for each individual program from either the EPG data or data downloaded from the Internet.
In claim 11,
annotation information data is created by selecting terms from an editing term dictionary in which heading terms common to all genres of video content and genre terms, which are terms specific to a genre of video content, are associated by genre of video content and by hierarchy level.
In claim 12,
the heading terms common to all genres are terms expressing the impression of viewing a scene.
Simply by developing a dictionary in which heading terms common to scenes of all genres of video content are linked in a hierarchical structure to the genre-specific terms related to them, and without imposing any particularly large technical development burden, it is possible to realize a video content annotation information adding system that can be applied to video devices such as recording devices, video cameras, and editing devices, is easy to use and practical, and can be operated in real time with consideration for the viewing environment.
FIG. 1 shows an example of the overall configuration of the system of the present invention.
The video device is a recording device, video camera, editing device, or the like, and the system is configured so that it can be operated through the UI (user interface) device by either the remote control, via a remote control transmission signal, or the microphone, via a microphone voice signal.
A main display and a sub-display 10, each a display such as a television or liquid crystal display, are connected to the video device. The video content recording, playback, and display unit receives broadcast programs from the antenna input and video content from the external video input, records and plays them back, and outputs a video signal for display on the main display.
The video content 33 is recorded in the video content storage unit 31 under each video content title 32.
The GUI display unit of the GUI (graphic user interface) unit outputs the GUI display signal 8 for annotation information editing to a display.
The GUI display signal 8 is switched to either the main display or the sub-display 10 by the display changeover switch 40.
In this configuration diagram, the display changeover switch 40 is set to the B side so that the editing display appears on the sub-display 10, and viewing the video content and editing it can proceed independently. When the sub-display 10 is not used, the display changeover switch 40 is set to the A side and the GUI display signal 8 is superimposed on the video content shown on the main display.
The following description uses the example in which the sub-display 10 is not used and the GUI display signal 8 is superimposed on the video content on the main display.
The GUI control unit of the GUI unit executes the various controls related to editing based on the UI reception signal from the UI device information recognition unit, in which signal information from the remote control of the UI device is handled by the remote control signal receiving unit and signal information from the microphone is handled by the voice recognition unit.
The genre selection unit of the GUI unit acquires the genre of the video content from EPG (Electronic Program Guide) data and selects the dictionary terms for the genre of the video content being viewed.
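The behavior of the display changeover switch 40 described above (B side: editing GUI on the sub-display 10; A side: GUI superimposed on the video) can be summarized as a small routing function. This is only a sketch; the names `SwitchPosition` and `route_outputs` and the `"video"`/`"gui"` labels are hypothetical.

```python
from enum import Enum


class SwitchPosition(Enum):
    A = "A"  # sub-display not used: GUI superimposed on the video content
    B = "B"  # GUI shown on the sub-display, video alone on the other display


def route_outputs(switch: SwitchPosition) -> dict[str, list[str]]:
    """Return which signals each display shows for a given switch position."""
    if switch is SwitchPosition.B:
        return {"main": ["video"], "sub": ["gui"]}
    return {"main": ["video", "gui"], "sub": []}
```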
The editing term dictionary 19 forms the basis of the present invention. Based on signal information from the UI device, either the remote control signal via the remote control signal receiving unit or the microphone voice signal via the voice recognition unit, the dictionary term selection unit selects a term from the editing term dictionary 19 and outputs the selected term. This output is shown on the display through the GUI display unit, and the annotation information creation unit stores the selected term, according to the operation of the UI device, as annotation information data 37 in the annotation information database 34.
Annotation information data 37 is created for each video content title 32 in the annotation information database 34, and time information and related information are associated with the annotation information data 37.
The annotation information database 34 configured as described above is searched by the annotation information data search unit, and the results are displayed as selected terms on either the main display or the sub-display 10.
The editing term dictionary 19 acquires necessary information such as title information, genre information, program information, and performer information from the EPG data of broadcast program video content through the dictionary registration unit 16. The dictionary term download unit 17 can download the latest term data at any time via the Internet communication signal 26, and keyboard signals from an externally connected keyboard can be written to the dictionary through the dictionary term keyboard input unit.
The above is an example of the overall configuration of the system of the present invention.
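The annotation information database described above, with annotation data kept per video content title and tied to time information and related information, can be sketched as follows. This is an illustrative model only: the names `AnnotationData` and `AnnotationDatabase`, the `"HH:MM:SS"` time format, the free-text related information, and the exact-match search are assumptions, and the sample records are invented.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class AnnotationData:
    term: str          # term selected from the editing term dictionary
    time_info: str     # time information for the scene, assumed "HH:MM:SS"
    related_info: str  # related information, assumed to be free text


class AnnotationDatabase:
    """Annotation data grouped by video content title."""

    def __init__(self) -> None:
        self._by_title: dict[str, list[AnnotationData]] = defaultdict(list)

    def add(self, title: str, data: AnnotationData) -> None:
        """Store one annotation under its video content title."""
        self._by_title[title].append(data)

    def search(self, term: str) -> list[tuple[str, AnnotationData]]:
        """Return (title, data) pairs whose term matches exactly."""
        return [(title, d) for title, items in self._by_title.items()
                for d in items if d.term == term]


# Invented sample records, for illustration only.
db = AnnotationDatabase()
db.add("Evening News", AnnotationData("agree", "00:12:30", "budget debate"))
db.add("School Sports Day", AnnotationData("moving", "00:41:05", "relay race"))
db.add("Evening News", AnnotationData("moving", "00:20:10", "rescue story"))
hits = db.search("moving")
```

Because every stored term comes from the shared dictionary, an exact-match search is enough to find the same kind of scene across titles.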
FIG. 2 shows an example of genre classification of video content.
Video content of current digital broadcast programs has 12 genres defined in the EPG data, with sub-genres defined beneath them.
These can be used as they are, but in this example, as shown in FIG. 2, the EPG main genre anime/special effects is folded into movies and self-made video is added, for a total of 12 categories.
In this example, apart from self-made video, the main genres based on the broadcast standard are used as they are so that the genre can be selected automatically from the EPG, but the content may instead be classified by another method and the genre selected manually.
For self-made video, the configuration allows the kind of video created to be registered and selected as a sub-genre as appropriate.
As described above, for video content spanning a wide variety of genres, the goal is to add character information and the like to any scene, while viewing online (currently broadcast) video content, without prior knowledge of the scenes and without special practice, such that the information suits the scene, is consolidated for editing searches, and is added with high accuracy and at high speed. To this end, the present invention identifies heading terms that are common to scenes of all genres of video content, are easy for everyone to understand, and are limited in number for searching; prepares a dictionary built on these heading terms; and solves the problem by having the user select the necessary terms from this dictionary in sequence.
The principle of the dictionary structure of the present invention is as follows.
The first step of what purpose the viewer (user) selects the video and audio scene as the scene for editing including the highlight scene is the viewer's individual preference, experience, This is based on the emotions and the five senses based on the environment and circumstances, and in some cases, the bodily sensations, and this is common to all video content genre scenes.
These physiological fields are actively conducted as research on brain waves, recognition technology using them, recognition control, etc., and in the future it is expected to be used for advanced human judgments such as editing video contents. In order to edit directly by using brain waves or the like, it is currently difficult to adapt to mass-produced products because it is indispensable to use an interface for acquiring biological information of heavy equipment.
However, instead of this, using a dictionary that uses terms that represent impressions based on human emotions, the five senses, and, in some cases, bodily sensations as a headline term, has a significant advantage over other methods of character input. Have.
The first merit is that, as described above, the term for expressing an impression is the impression itself, which can be easily understood by everyone and can be used in common for all video content genre scenes.
Therefore, by using this as a heading term, an efficient registration method of terms necessary for annotation editing of an arbitrary scene of video content, which has been difficult until now, and an efficient selection method of registered terms Can be solved.
Next, the most important merit is to use the noun term that is the subject of the following adjectives by using the term representing the impression, that is, the adjective that describes the noun or the similar adjective system as the heading term.
Specifically, by using a term representing an impression as a headline term, it is possible to associate and register genre-specific noun terms related to this in a hierarchically intuitive, simple and limited manner. At the same time, the user selects a term representing an impression that is a heading term, and hierarchically guides a genre-specific noun term associated with this term to the guidance display. It is possible to select easily.
For example, the effect of inducing the following term cannot be obtained in the order of eye-were or other headline classification methods.
In addition to the inductive effect, the method of assigning meanings from the technical terms first requires heading terms separately for each genre of video content, and it is difficult to share the headings. It is not a dictionary structure with a common hierarchical structure such as a common heading term.
Having a headline term common to all genres not only reduces the burden on the device and system development side, but also has great significance in terms of habituation and efficiency on the user side of the device and system.
Terms expressing impressions are enough on their own to give meaning to a variety of video scenes and to classify most editing scenes. In addition, genre-specific terms for each genre of video content are linked hierarchically to these impression terms, and using them makes it possible to attach meaning, centered on character information, to any scene of video content.
Deepening the hierarchy makes it possible to attach complex character information. Alternatively, without deepening the hierarchy later, only the genre terms indispensable for editing may be registered in the dictionary, on the expectation that they will serve as recommended terms for editing.
In this embodiment, adjectival terms common to all video content genres that express every kind of impression a person may form from a video or audio scene, such as emotions (joy, anger, sorrow, and pleasure) and preferences (likes and dislikes), were identified and consolidated into impression terms. They are further divided into positive impressions, for scenes that leave a good impression, and negative impressions, for scenes that leave a bad impression.
This exploits the fact that a scene to be edited leaves either a positive or a negative impression.
Furthermore, for convenience, the positive and negative impressions are each divided into a person type, for impressions of people, and a scene type, for things and events other than people. This gives four types in total (referred to below as impression categories), with 38 terms used as heading terms, although terms can be added or changed.
The original objective is achieved by a dictionary structure in which terms specific to the various genres of video content (referred to below as genre terms) are registered hierarchically under these heading terms (referred to below as impression terms).
Heading terms such as emotions (joy, anger, sorrow, and pleasure) and preferences (likes and dislikes) can be used in common for every genre of video content. Dictionary embodiments that use these heading terms are shown below for seven representative genres of video content.
FIG. 3 shows an example of the dictionary structure for a baseball program.
In the first level of the hierarchy 103 of the edit term dictionary 19, which is organized by video content genre 101, the impression categories 104 described above are divided into four types by combining positive or negative impression with scene type or person type. In the second level, a total of 38 adjectival impression terms 105 are registered in this embodiment under these impression categories 104.
To explain the dictionary structure, only impression terms common to video content of all genres, such as emotions and preferences, are shown. In this scheme, however, if the first level is divided appropriately, up to 12 × 12 = 144 impression terms can be registered in the second level, so terms for any impression that people ordinarily feel, including the five senses and bodily sensations, can also be registered.
The terms are assigned with easily understood, appropriate impression categories and the full range of video content genres in mind. Details are described later.
The configuration up to this point is common to all genres.
In the third level, genre terms 106, which are noun terms specific to baseball programs, are registered under each of the second-level impression terms 105 that is relevant to baseball programs. In this example, for some of these genre terms 106, such as player, batting, pitching, base running, and base stealing, more detailed noun terms are also registered in the fourth level as baseball genre terms 106 related to the third-level baseball genre terms 106.
In the case of baseball, positive impressions arise in scenes where the favorite team wins or plays well and in scenes where a favorite player appears, and baseball-specific genre terms 106 are associated with the impression term 105 for each such scene.
In this example, CM (commercials) is registered under the impression term 105 "boring / dull", and it can be selected and attached as annotation information. In programs other than baseball, scenes that the user does not want children to see or does not want to see personally can likewise be annotated by selecting an appropriate impression term 105, so appropriate editing-related terms can be selected and attached to any scene.
For example, even for the same grand slam scene, if it is hit by the favorite team the user selects, level by level, "positive impression, scene type", "amazing / wonderful", "batting", "grand slam", and this becomes the annotation information data 37. In the opposite case, the user selects "negative impression, scene type", "frustrating", "pitching", "grand slam", and this becomes the annotation information data 37.
Furthermore, for games such as high school baseball or a game that decides the championship, "positive impression, scene type", "moved / grateful / thrilled", "batting", "grand slam" is registered so that it can be selected level by level and attached. Which of the two to choose is left to the user's impression.
For a scene in which a favorite pitcher appears, "positive impression, person type", "favorite / fan of", "player", "pitcher" is registered so that it can be selected level by level.
As described above, anyone who knows baseball can readily link genre terms 106 to impression terms 105 by calling the relevant scenes to mind. The same applies to other genres.
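The level-by-level selection in the baseball example can be pictured as a walk through a nested dictionary. The Python below is only a sketch under stated assumptions: the nested-dict layout, the `BASEBALL_DICT` name, and the `select_path` helper are hypothetical and are not the data format of the edit term dictionary 19. Only the example terms are taken from the text.

```python
# Hypothetical slice of an edit term dictionary for the baseball genre.
# Levels: impression category -> impression term -> genre term -> detail term.
BASEBALL_DICT = {
    "positive impression, scene type": {
        "amazing / wonderful": {"batting": {"grand slam": {}}},
        "moved / grateful / thrilled": {"batting": {"grand slam": {}}},
    },
    "negative impression, scene type": {
        "frustrating": {"pitching": {"grand slam": {}}},
    },
    "positive impression, person type": {
        "favorite / fan of": {"player": {"pitcher": {}}},
    },
}


def select_path(dictionary, choices):
    """Walk the hierarchy one level per choice and return the chosen path.

    Raises KeyError when a choice is not registered under the previous one,
    which mirrors the idea that only linked terms are offered at each level.
    """
    node = dictionary
    path = []
    for choice in choices:
        node = node[choice]
        path.append(choice)
    return path


annotation = select_path(
    BASEBALL_DICT,
    ["positive impression, scene type", "amazing / wonderful", "batting", "grand slam"],
)
print(" > ".join(annotation))
```

Because each choice narrows the next level to the terms linked to it, the user never has to scan terms that are irrelevant to the impression already chosen.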
Further, as shown in FIG. 3, in the edit term dictionary 19 of the present invention, the impression categories 104, impression terms 105, and genre terms 106 at every level are organized into groups of at most 12 entries.
The reason for this is described later.
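The group limit of 12 and the resulting capacity of 12 × 12 = 144 second-level terms can be checked with a few lines of Python. This is an illustrative sketch only: the function names and the nested-dict representation are assumptions, and the only figures taken from the text are the limit of 12 and the product 144.

```python
MAX_GROUP_SIZE = 12  # from the text: at most 12 entries per group


def capacity(levels, group_size=MAX_GROUP_SIZE):
    """Number of entries available at the given level when every group is full."""
    return group_size ** levels


def oversized_groups(node, path=()):
    """Return the paths of all groups that hold more than MAX_GROUP_SIZE entries."""
    found = []
    if len(node) > MAX_GROUP_SIZE:
        found.append(path)
    for key, child in node.items():
        found.extend(oversized_groups(child, path + (key,)))
    return found


print(capacity(2))  # second level: 12 groups of 12 entries
```

A check such as `oversized_groups` could be run whenever terms are added, so that every group stays selectable with one press of a 12-key pad.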
The genre term hierarchy can be deepened further, but in this embodiment the fifth level lets the user select the degree of the impression, that is, the strength of the emotion or preference, using a five-step star symbol.
Because the present invention uses impression terms as heading terms, one of its features is that editing information based on the degree of impression, and hence the important editing scenes, can be set easily.
The symbol may instead be character information, and the number of steps may be reduced or increased freely.
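As a small illustration of the fifth level, the degree of impression can be rendered as a star string with a configurable number of steps. The rendering with filled and empty stars and the `degree_symbol` name are assumptions made for this sketch. The text only states that five steps of star symbols are used and that the number of steps may change.

```python
def degree_symbol(level, steps=5):
    """Render a degree of impression as filled and empty stars.

    `steps` defaults to the five steps of the embodiment but can be
    reduced or increased, as the text allows.
    """
    if not 1 <= level <= steps:
        raise ValueError(f"level must be between 1 and {steps}")
    return "★" * level + "☆" * (steps - level)


print(degree_symbol(3))
```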
FIG. 4 shows an example of the dictionary structure for a cooking program.
The first, second, and fifth levels are the same as for the baseball program above.
The impression terms 105 for cooking programs and the like become more suitable when terms related to taste and smell, such as "delicious" and "unappetizing", are registered. Besides taste and smell, terms for the other senses and for bodily sensations are also candidates for impression terms, and the first and second levels may be designed with frequency of use in mind. Details are described later.
In the third and fourth levels, genre terms specific to cooking programs are registered in association with the impression terms.
In a typical cooking program the key points are the recipe, the cooking method, the presentation, and the finished dish, but the structure also accommodates cases in which the impression rests largely on people, such as performers and participants.
The proper nouns of performers in the fourth level are obtained for each program from the EPG data of the broadcast program.
For many broadcast programs, performer information is broadcast as EPG data and can be used directly. For more detailed personal names, place names, and terms related to the program content, terms can also be downloaded from an external Internet site and used as dictionary terms for each program. The same applies to video content of other genres.
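Filling the fourth level with performer names taken from program data might look like the following sketch. The record layout (`{"performers": [...]}`), the `register_performers` name, and the placeholder names are all hypothetical. Real EPG data has its own format, which the text does not describe. The limit of 12 follows the group size used throughout the dictionary.

```python
def register_performers(person_terms, epg_record, limit=12):
    """Copy performer names from a hypothetical EPG record into a term group.

    Existing terms are kept, duplicates are ignored, and the group is
    capped at `limit` entries.
    """
    group = dict(person_terms)  # leave the caller's dictionary untouched
    for name in epg_record.get("performers", []):
        if len(group) >= limit:
            break
        group.setdefault(name, {})
    return group


fourth_level = register_performers(
    {"Host A": {}},
    {"performers": ["Guest B", "Host A", "Guest C"]},
)
print(list(fourth_level))
```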
FIG. 5 shows an example of the dictionary structure for a news program.
The first, second, and fifth levels are the same as for the baseball program above.
In the third and fourth levels, genre terms specific to news programs are registered in association with the impression terms.
A typical news program carries content from a variety of sub-genres, so these are registered under "news content" as politics, economy, culture, society, international affairs, advanced technology, environment, hobbies, and welfare, and further entries can be added.
FIG. 6 shows an example of the dictionary structure for a movie program.
The first, second, and fifth levels are the same as for the baseball program above.
In the third and fourth levels, genre terms specific to movie programs are registered in association with the impression terms.
Both the scene type and the person type are used frequently for movie programs, so a variety of specific terms can be selected, and the names of stars are also registered.
The names of these stars, such as actors and actresses, are an example of terms downloaded from the Internet site mentioned above. If they are treated simply as performers, they can be obtained from the EPG data.
For movie programs, defining finer genres such as action, comedy, science fiction, animation, and true-story films makes it possible to use genre terms specialized for each of them.
The same applies to video content of genres other than movies, provided that the finer genres can be clearly defined.
FIG. 7 shows an example of the dictionary structure for a travel program.
The first, second, and fifth levels are the same as for the baseball program above.
In the third and fourth levels, genre terms specific to travel programs are registered in association with the impression terms.
As with the movie programs above, both the scene type and the person type are used frequently for travel programs, so a variety of specific genre terms can be selected, and specific names are registered under "land, nature, cultural heritage".
Under "accommodation, vehicles", the types of each are registered.
FIG. 8 shows an example of the dictionary structure for a classical music program.
The first, second, and fifth levels are the same as for the baseball program above.
In the third and fourth levels, genre terms specific to classical music programs are registered in association with the impression terms.
Specific names are registered under "piece title, conductor, performer, soloist, singer".
As with cooking programs, music programs yield more detailed annotation information when impression terms based on hearing are available. This is described later.
FIG. 9 shows an example of the dictionary structure for a comedy program.
The first, second, and fifth levels are the same as for the baseball program above.
In the third and fourth levels, genre terms specific to comedy programs are registered in association with the impression terms.
Specific names are registered under "host, performer, participant, singer".
As the seven representative broadcast program genres above show, impression terms such as emotions (joy, anger, sorrow, and pleasure) and preferences (likes and dislikes) can be used in common for every genre of video content and form the basis for building the dictionary.
For proper nouns, which are large in volume and change rapidly, two methods are available: obtaining proper nouns such as performers' names from the program EPG data, as in the cooking program of Example 4, and, for more detailed information, obtaining proper nouns such as personal names and place names related to various programs from a specific site via Internet communication or the like, as in the movie program of Example 6.
Proper nouns can also be registered individually by the method described later.
Organizing the dictionary by genre, and further by program, in this way limits the number of terms (characters) to choose from and keeps them relevant. It eliminates cumbersome operations such as picking one entertainer or one place name out of a dictionary that lists countless entertainers and place names in, for example, syllabary order.
FIG. 10 shows an example of the dictionary structure for a self-made wedding video.
The first, second, and fifth levels are the same as for the baseball program above.
In the third and fourth levels, genre terms specific to weddings are registered in association with the impression terms.
Specific names are registered under "friends / acquaintances" and "ceremony hall".
For self-made videos, it is essential that each individual register proper nouns such as names, place names, and organizations. Such names are registered as follows.
FIG. 11 shows an example of specific scene registration and character registration.
In FIG. 3, CM scenes are registered as a term related to "boring / dull". In this example, specific scenes such as several types of CMs are instead registered together under a single dictionary entry, so that the start, middle, and end of such scenes can be designated directly.
Furthermore, in this embodiment, proper nouns such as the names of the friends and acquaintances in FIG. 10 are registered as shown in FIG. 11. Entry 12 of the first level is assigned to character registration. In the second level, entry 1 is assigned to kanji, entry 2 to kana, and entry 3 to the alphabet and numerals, and the individual characters are registered in the fourth level. Selecting these characters lets each user register in advance, or enter as needed, personally required terms beyond the dictionary terms of FIGS. 3 to 9, which makes it possible to create character information of even higher utility.
Compared with character input on a mobile phone or character input using the cursor buttons of a remote control, this method of entering characters by selecting them from an array greatly reduces the number of button presses and greatly improves the efficiency of character input.
FIG. 12 shows an example of a remote control.
The remote control, the remote operation device of current video apparatuses, is mainly of the button type using infrared wireless transmission.
The present invention intends to use the button-type remote control of the digital broadcast video apparatuses in wide use today as the UI device.
The cursor buttons 49 are normally used for various editing operations and also for entering characters such as the title of video content. As explained above, however, entering characters with these buttons is extremely inefficient and impractical, so the present invention enters characters using the following three types of buttons instead of the cursor buttons 49.
In normal operation of the video apparatus, the channel buttons are twelve channel-switching buttons numbered 1 to 12. Here they are used to select a term within each group of the edit term dictionary, whose groups, as described above, contain at most 12 terms.
The color buttons 44 are four buttons (blue, red, green, and yellow) for executing various functions, and the chapter button 45 is a button for adding chapter marks.
During normal viewing of video content, these buttons perform their normal functions such as channel selection. In the character input and editing mode described later, they are reassigned as buttons for character input and editing.
By operating these three types of buttons, 17 buttons in all and 20 at most, the user attaches the desired terms and symbols as character information to any scene of the video content while viewing it.
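The count of 17 buttons is consistent with twelve channel buttons, four color buttons, and one chapter button. The sketch below spells that out. The button labels, the list names, and the `button_kind` helper are illustrative assumptions, and the breakdown into 12 + 4 + 1 is inferred from the description rather than quoted from it.

```python
CHANNEL_BUTTONS = [str(n) for n in range(1, 13)]     # assumed labels "1".."12"
COLOR_BUTTONS = ["blue", "red", "green", "yellow"]   # the four color buttons 44
CHAPTER_BUTTONS = ["chapter"]                        # the chapter button 45

EDIT_BUTTONS = CHANNEL_BUTTONS + COLOR_BUTTONS + CHAPTER_BUTTONS


def button_kind(button):
    """Classify a button label as 'channel', 'color', or 'chapter'."""
    if button in CHANNEL_BUTTONS:
        return "channel"
    if button in COLOR_BUTTONS:
        return "color"
    if button in CHAPTER_BUTTONS:
        return "chapter"
    raise ValueError(f"not an editing button: {button!r}")


print(len(EDIT_BUTTONS))
```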
The remote control transmission signal produced by these button operations is received by the remote control signal receiving unit of the video apparatus and drives the control unit and the dictionary term selection unit.
Similarly, when a microphone is used as the UI device, the user speaks the names of the remote control buttons: the chapter button, the numbers 1 to 12 ("ichi, ni, san, shi or yon, go, roku, nana or shichi, hachi, kyuu or ku, juu, juuichi, juuni"), and the blue, red, green, and yellow color buttons ("ao, aka, midori, ki or kiiro"). The speech recognition unit of the video apparatus recognizes these utterances as signal information in the same way as signals from the remote control and performs the same operations.
With remote control operation, the user has to check the position of each button before pressing it. Where the viewing environment permits, using this kind of speech recognition removes the need to check the remote control buttons each time, so the user can speak while watching the display screen and enter characters quickly.
Because speech recognition here only has to distinguish a very small vocabulary used as signal information, a high recognition rate can be achieved even with a greatly simplified configuration.
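Because the spoken vocabulary only has to cover the button names, recognition output can be reduced to a small lookup table. The sketch below is an assumption-laden illustration: the romanized readings, the table name, and the `recognize` helper are hypothetical, the utterance for the chapter button is left out because its exact wording is not given here, and actual speech recognition is outside the scope of the example.

```python
# Hypothetical mapping from romanized utterances to button labels.
VOICE_TO_BUTTON = {
    "ichi": "1", "ni": "2", "san": "3", "shi": "4", "yon": "4",
    "go": "5", "roku": "6", "nana": "7", "shichi": "7", "hachi": "8",
    "kyuu": "9", "ku": "9", "juu": "10", "juuichi": "11", "juuni": "12",
    "ao": "blue", "aka": "red", "midori": "green",
    "ki": "yellow", "kiiro": "yellow",
}


def recognize(utterance):
    """Map a recognized word to a button label, or None if it is not in the vocabulary."""
    return VOICE_TO_BUTTON.get(utterance.strip().lower())


print(recognize("midori"))
```

Alternative readings such as "shi" and "yon" map to the same button, so the vocabulary stays small while tolerating the way different users say a number.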
As described above, the present invention supports both remote control operation and voice operation, depending on the viewing environment.
FIG. 13 shows an example of the operation flow.
It shows the flow of operations performed with the UI device on the video apparatus (labelled "apparatus body" in the figure) to create the desired character information and other data using the system configuration and dictionary structure described above.
Before viewing the video content, the user first selects its genre. For broadcast programs, as opposed to self-made videos, the genre can also be selected automatically from the program EPG data.
When the video content has started and a scene appears to which character information or the like is to be added, a scene position designation signal for that scene is transmitted to the apparatus body, as shown in STEP2.
In response, the GUI display unit of the apparatus body displays editing information, such as character information from the dictionary, on the viewing screen.
The user then selects the desired character information level by level with the UI device. When the character information for one scene has been attached, the annotation information data 37 containing it is created and the screen display is restored.
This is repeated until viewing of the video content ends.
Because the process can be carried out with the simple operations described above, it enables online, real-time input and editing of character information for all video content genres, which has been difficult until now. The operation of the system is described in detail below.
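The operation flow of FIG. 13 can be summarized as a loop over the scenes the viewer marks. The following Python is a rough sketch, not the apparatus logic: the `run_session` name, the record layout, the log messages, and the tiny example dictionary are assumptions introduced for illustration.

```python
def run_session(dictionary, marked_scenes):
    """Simulate the annotation loop for a list of (position, choices) pairs.

    For each marked scene: the scene position is sent, the dictionary is
    shown, terms are selected level by level, an annotation record is
    created, and the viewing screen is restored.
    """
    records = []
    log = []
    for position, choices in marked_scenes:
        log.append("scene position sent")
        log.append("dictionary shown on viewing screen")
        node = dictionary
        for choice in choices:
            node = node[choice]  # KeyError if the term is not registered here
        records.append({"position": position, "terms": list(choices)})
        log.append("annotation data created")
        log.append("viewing screen restored")
    return records, log


example_dict = {"positive impression, scene type": {"amazing / wonderful": {"batting": {}}}}
records, log = run_session(
    example_dict,
    [(125.0, ["positive impression, scene type", "amazing / wonderful", "batting"])],
)
print(records)
```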
FIG. 14 shows an example of genre selection.
For digital broadcast programs, the EPG data of the program is normally sent with the broadcast signal, so the video apparatus can select the genre automatically from this information. The embodiment shown here, however, is the case in which the genre is specified with the UI device (self-made videos have no EPG data, so their genre is selected in this way).
When the editing mode is selected from the main menu of the video apparatus, the control unit sets the apparatus to the character input and editing mode described below, and the apparatus then accepts signals from the remote control or the microphone as signals for character input and editing.
å
ã«èª¬æã®å³ïŒã«ç€ºãæ åã³ã³ãã³ãã®ãžã£ã³ã«ãè£
眮ïŒã®ãªã¢ã³ã³ïŒãŸãã¯ãã€ã¯ããã©ã³ïŒã䜿ã£ãŠéžæããå Žåã®å®æœäŸã§ããã
å³ã®å·ŠåŽã«ã¯ã¡ã€ã³ãžã£ã³ã«ïŒïŒïŒãéžæçªå·ïŒïŒã®ïŒããïŒïŒãŸã§éžæé
ç®ïŒïŒãšããŠè¡šç€ºãããŠããã
æŽã«ã«ã©ãŒãã¿ã³ïŒïŒã«å¯Ÿå¿ããŠãã¡ã³ã¯ã·ã§ã³å
容ïŒïŒã衚瀺ãããŠããã
æåã®æ®µéã§ã¯ã«ãŒãœã«ïŒïŒã¯è¡šç€ºãããŠããªããããªã¢ã³ã³æäœïŒïŒã®å ŽåãïŒããã¿ã³ãæŒãããšã«ããéžæçªå·ïŒïŒã®ïŒã§ããã¹ããŒãã®ã©ã€ã³ã«ã«ãŒãœã«ïŒïŒã衚瀺ãããããã確èªããè¯ããã°ãç·ããã¿ã³ã§ã次ã«é²ãããããžã£ã³ã«ãééããŠæŒããå Žåã«ã¯ãèµ€ããã¿ã³ãæŒãããšã«ãããåæ¶ãããã«ãŒãœã«ã¯æ¶æ»
ãåéžæããããšãã§ããã
é³å£°æäœïŒïŒã®å Žåã¯ãã€ã¯ããã©ã³ã«ãäºããšçºå£°ããããšã«ããã«ãŒãœã«ã衚瀺ããããã確èªããããã°ããããªããšçºå£°ããããšã«ããã¡ã€ã³ãžã£ã³ã«ãéžæãããã
This is an embodiment in which the genre of the video content shown in FIG. 2, described above, is selected using the remote controller or the microphone of the apparatus.
On the left side of the figure, the main genres 101 are displayed as selection items 47, each with a selection number 46.
Function contents 48 are also displayed for the corresponding color buttons 44.
Initially the cursor is not displayed. With the remote control operation 51, pressing the number button for sports displays the cursor on the sports line. If this is correct, the user proceeds with the "green" button. If the wrong genre was pressed, pressing the "red" button cancels the choice, the cursor disappears, and the user can select again.
With the voice operation 52, saying "ni" ("two") into the microphone displays the cursor, and saying "midori" ("green") confirms the main genre.
When the main genre has been selected, the sub genres 101 are displayed as shown on the right side of the figure. To select baseball with the remote control operation 51, the user presses the number button for baseball, which displays the cursor on the baseball line, and then presses the "blue" button to complete the genre selection.
With the voice operation 52, the sub genre 101 is selected by saying "ichi" ("one") and "ao" ("blue").
If a wrong button was pressed or the genre needs to be changed, it can be corrected with the "red" and "green" operations.
This completes the genre selection for the video content, and the control unit enters the character input editing mode, in which the character information or the like that the viewer needs can be added to any scene between the start and the end of the broadcast program.
FIG. 15 shows an example of the term selection function operations.
In this embodiment, the operations for character input editing are carried out with 17 buttons on the remote controller (the chapter button, the channel buttons and the color buttons) or by voice recognition. The chapter button designates the scene position. The twelve channel buttons are used to choose the selection number 46 of a selection item 47, and the color buttons 44 perform the remaining operations.
In this example, the color buttons 44 are assigned as follows: the "blue" button to "select", the "green" button to "back", the "yellow" button to "next", and the "red" button to "cancel". Normal operation uses only the "blue" button; the figure outlines the exceptional operations, such as changing a selection or jumping to the next hierarchy level.
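The button assignment described for FIG. 15 can be sketched as a small lookup table plus a menu state. This is only an illustration under stated assumptions: `MenuState` and `press` are hypothetical names that do not appear in the specification, and only the "select" and "cancel" functions are modeled.

```python
from dataclasses import dataclass
from typing import Optional

# Color button assignment as described for FIG. 15.
COLOR_BUTTON_FUNCTIONS = {
    "blue": "select",
    "green": "back",
    "yellow": "next",
    "red": "cancel",
}


@dataclass
class MenuState:
    """Hypothetical state of one selection screen (not from the specification)."""
    items: list                      # selection items, numbered from 1
    cursor: Optional[int] = None     # selection number under the cursor
    chosen: Optional[str] = None     # item confirmed with "select"


def press(state: MenuState, button: str) -> MenuState:
    """Apply one channel (digit) button or color button press to the menu."""
    if button.isdigit():
        number = int(button)
        if 1 <= number <= len(state.items):
            state.cursor = number    # show the cursor on that line
        return state
    function = COLOR_BUTTON_FUNCTIONS.get(button)
    if function == "select" and state.cursor is not None:
        state.chosen = state.items[state.cursor - 1]
    elif function == "cancel":
        state.cursor = None          # the cursor disappears; choose again
        state.chosen = None
    return state
```

Pressing "1" and then "blue" on a menu confirms its first item, and "red" clears the cursor so that the user can select again.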
FIG. 16 shows an example of scene position designation.
In this example, character information or the like is added to a grand slam (bases-loaded home run) scene while the user is watching the sports/baseball video content whose genre was selected earlier.
When the viewer (user) feels that a scene is worth editing, the time position of the scene can be designated, in the case of the remote control operation 51, by pressing the chapter button on the remote controller.
In the case of the voice operation 52, the time position of the scene can be designated by saying "koko" ("here") into the microphone.
The utterance for time position designation is shown here as "koko", but another utterance such as "chapter" may be used instead; it is enough to decide which function corresponds to which utterance.
With either the remote control operation or the voice operation, there is no need at this point to think about the terms to be used as annotation information. The user simply designates the time position according to feeling and impression. This is a key point of the present invention.
The dictionary function then starts and, in this example, the display moves to the first hierarchy level of the editing term dictionary 19.
In connection with the above, the video being viewed can also be paused automatically and the first hierarchy level displayed, using the recording and playback of the video apparatus and the time-shift playback (chase playback) means of the display unit. Even though the editing takes only a short time, the viewer then never misses an important scene that follows the highlight scene and can carry out character input editing without worry.
FIG. 17 shows an example of the first hierarchy level of the editing term dictionary, displayed after the scene position designation.
In this example, the first hierarchy level of the editing term dictionary 19 for a baseball program, described earlier, is displayed by the GUI display unit.
The first hierarchy level also shows menu entries that give access to the specific scene registration and manual registration described earlier.
When a term is selected for the time position designated above, a viewer who is a fan of the batting team selects, as the impression category 104, "positive impression: scenes", whose selection number 46 is 1.
With the remote control operation 51, pressing the "1" button displays the cursor, and the "blue" button confirms the selection. "Positive impression: scenes" is then stored by the annotation information creation unit in the first hierarchy level of the annotation information data 37 in the annotation information database 34.
Similarly, with the voice operation 52, the user says "ichi" and "ao".
If the viewer favors the player himself, a term can instead be selected and registered from the dictionary of FIG. 3 along a path such as "positive impression: people", "favorite/fan", "player", "infielder".
Conversely, a fan of the opposing team will naturally select "negative impression: scenes" as the impression category 104.
Of course, a neutral viewer can register a fine play by either team with a positive impression term 104.
This concludes the first hierarchy level display; the display then moves to the second hierarchy level.
FIG. 18 shows an example of the second hierarchy level of the editing term dictionary, reached from the first hierarchy level display.
It shows the impression terms 105 of the second hierarchy level when "positive impression: scenes" was selected as the impression category 104 in the first hierarchy level.
If the most suitable impression term 105 is "awesome/wonderful" (given as selection number 46 of 4), the remote control operation 51 is "3", "blue", and the voice operation 52 is "san", "ao".
This concludes the second hierarchy selection screen. The selected impression term 105 is stored in the second hierarchy level of the annotation information data 37, and the display moves to the third hierarchy level.
FIG. 19 shows an example of the third hierarchy level of the editing term dictionary, reached from the second hierarchy level display.
It shows the genre terms 106 of the third hierarchy level when "awesome/wonderful" was selected as the impression term 105 in the second hierarchy level.
If the most suitable genre term 106 is "hit", with selection number 46 of 1, the remote control operation 51 is "1", "blue", and the voice operation 52 is "ichi", "ao".
This concludes the third hierarchy selection screen. The selected genre term 106 is stored in the third hierarchy level of the annotation information data 37, and the display moves to the fourth hierarchy level.
FIG. 20 shows an example of the fourth hierarchy level of the editing term dictionary, reached from the third hierarchy level display.
It shows the genre terms 106 of the fourth hierarchy level when "hit" was selected as the genre term 106 in the third hierarchy level.
If the most suitable genre term 106 is "grand slam", with selection number 46 of 1, the remote control operation 51 is "1", "blue", and the voice operation 52 is "ichi", "ao".
This concludes the fourth hierarchy selection screen. The selected genre term 106 is stored in the fourth hierarchy level of the annotation information data 37, and the display moves to the fifth hierarchy level.
FIG. 21 shows an example of the fifth hierarchy level of the editing term dictionary, reached from the fourth hierarchy level display.
It is the display shown when "grand slam" was selected as the genre term 106 in the fourth hierarchy level.
As described for FIG. 3, the fifth hierarchy level is where a character or symbol expressing the degree of the impression of the scene is selected. In this example, level 4 of the five levels is chosen by selecting the selection number 46 of "4". The remote control operation 51 is "4", "blue", and the voice operation 52 is "yon", "ao".
Because the present invention uses adjectival terms that express impressions as heading terms, editing information based on the degree of impression, and consequently the important scenes for editing, can be set easily.
This concludes the fifth hierarchy level display. The data above is stored as the fifth hierarchy level of the annotation information data 37 and the apparatus returns to normal operation; if playback was paused by the time-shift playback (chase playback) means, it resumes and the normal viewing screen is shown.
While watching a program, the viewer (user) only has to select an impression category and an adjectival impression term; the subsequent genre terms need not be thought up, because the guidance display leads the viewer to them. In this way the most suitable annotation information, such as character information, can be added to any scene of the video content.
The annotation information data 37 is completed in real time by repeating the operations described above, from scene position designation through the fifth hierarchy level.
This is the greatest benefit of making adjectival impression terms the heading terms and associating them with the genre terms, which are nouns, that follow.
The user designates the time position of a scene by emotion, the five senses, or bodily sensation and selects a suitable impression term; from there the guidance display leads to the most suitable genre term.
Corrections of operating mistakes during editing, cancellations, and similar operations can be made freely using the term selection function operations of FIG. 15 described above.
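The five-level walk through the dictionary can be illustrated with a nested mapping in which each chosen term exposes only the terms related to it. The sketch below assumes a tiny hypothetical excerpt of the editing term dictionary containing only the path used in the baseball example; the names `DICTIONARY` and `annotate` are illustrative and not taken from the specification.

```python
# Hypothetical excerpt of the editing term dictionary: each key is a term
# and each value holds the terms offered at the next hierarchy level.
DICTIONARY = {
    "positive impression: scenes": {       # level 1: impression category
        "awesome/wonderful": {             # level 2: impression term
            "hit": {                       # level 3: genre term
                "grand slam": {},          # level 4: genre term
            },
        },
    },
}

DEGREE_LEVELS = 5  # level 5 is a degree of impression from 1 to 5


def annotate(dictionary: dict, terms: list, degree: int) -> list:
    """Follow the chosen terms level by level, then append the degree."""
    node = dictionary
    path = []
    for term in terms:
        if term not in node:
            raise KeyError(f"{term!r} is not offered at level {len(path) + 1}")
        path.append(term)
        node = node[term]   # only the related terms are offered next
    if not 1 <= degree <= DEGREE_LEVELS:
        raise ValueError("degree must be between 1 and 5")
    path.append(str(degree))
    return path
```

Because each level narrows the choices to terms related to the previous selection, the user never has to recall a genre term unaided; a term that is not offered at the current level is simply rejected.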
Character information and the like can be added with the same simple remote control operation 51 or voice operation 52 not only for live broadcast programs but also when recorded video content is viewed again.
It can also be used for video content 33 loaded as removable video content, rather than video content recorded in the video apparatus.
In this embodiment, all editing is done with a total of 17 buttons, or 17 utterances, in both the remote control method and the voice recognition method. Because no channel switching takes place while annotation information is being edited, the channel buttons are assigned to term selection; the other functions may be assigned to different buttons. An important point of the present invention is that it can be realized with a remote controller having as few as 20 buttons, including power and similar operations, or with at most 30 kinds of voice signal.
As described above, the speech recognition of the present invention does not have the user read dictionary terms aloud directly, which is the kind of recognition where the recognition rate becomes a problem. Speech serves only as signal information for selecting terms in the dictionary.
Grammar-based speech recognition that judges context is therefore unnecessary; recognition by simple acoustic pattern matching is sufficient, and the recognition rate can be raised without placing a heavy load on the system.
Because the vocabulary is limited to 30 utterances or fewer, speaker-specific registration is also easy when needed.
For still higher accuracy, the form can be adapted to the usage environment: for example, a headset microphone, an utterance switch, or an earphone-type microphone that picks up eardrum vibration.
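Since speech is used only as a selection signal, the recognizer's output can be treated as one label from a small closed vocabulary. The following sketch assumes that an acoustic matcher (not shown, and not specified here) has already produced such a label; the table contains only the utterances that appear in this description, and the names `VOICE_COMMANDS` and `to_signal` are hypothetical.

```python
MAX_VOCABULARY = 30  # upper bound on distinct utterances stated in the text

# Utterances used in the examples of this description and the signals they
# stand for. A complete table would also cover the remaining numbers and colors.
VOICE_COMMANDS = {
    "koko": ("scene", None),        # designate the scene position ("here")
    "ichi": ("number", 1),
    "ni": ("number", 2),
    "san": ("number", 3),
    "yon": ("number", 4),
    "ao": ("color", "blue"),        # "select"
    "midori": ("color", "green"),
}


def to_signal(utterance: str):
    """Map a recognized utterance label to a signal, or None if unknown."""
    return VOICE_COMMANDS.get(utterance.strip().lower())
```

An unknown utterance maps to `None` and can simply be ignored, which is one reason a closed vocabulary keeps recognition errors cheap.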
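As described above, in the present invention the genre-specific genre terms 106 associated with an impression term 105, which is itself based on the impression of the scene, are selected from menus on the display unit, with the impression term guiding the choice. For video content of every genre, even a user who is not a specialist such as an announcer therefore does not have to think up terms or hesitate over choices; operation is simple, requires no special training, and allows intuitive real-time editing. In this embodiment, everything from designating the time position of a scene to entering the characters and symbols of the first through fifth hierarchy levels is done with a short sequence of button operations.
With voice input, if the utterance time plus system operation time averages 0.5 seconds, the information can be entered in as little as 5.5 seconds.
This embodiment emphasizes reliability, so a separate "select" operation follows each selection number 46. If editing speed has priority, the apparatus can be configured to move directly to the next hierarchy level as soon as a selection number 46 is specified, which cuts the time above almost in half.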
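The timing figures can be checked with a short calculation. The sketch assumes that one annotation consists of one scene position designation plus, at each of the five hierarchy levels, a selection number and a "select" confirmation; this count of 11 operations is an inference that matches the stated 5.5 seconds at 0.5 seconds per operation, not a figure quoted from the text.

```python
LEVELS = 5  # hierarchy levels of the editing term dictionary


def input_time(seconds_per_operation: float, confirm_each_level: bool = True) -> float:
    """Shortest time to annotate one scene, in seconds."""
    operations_per_level = 2 if confirm_each_level else 1   # number (+ "select")
    operations = 1 + LEVELS * operations_per_level          # 1 = scene position
    return operations * seconds_per_operation
```

With confirmation at each level this gives 11 operations, and without it 6 operations, which is consistent with the statement that direct selection cuts the time almost in half.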
As described above, a major feature of the present invention, which uses a dictionary whose heading terms are impression terms, is that annotation information consisting mainly of characters, with the editing terms best suited to each scene, can be added and edited in real time for video content of every genre through simple, intuitive operations that need no special training.
For video content being viewed for the first time, the user may, for example, add in real time only the position of each highlight scene and the impression terms 105 up to the second hierarchy level as annotation information, and register the subsequent genre terms in detail at the next playback or editing session.
When marks that specify only a time position, like ordinary chapter marks, are used heavily, it is hard to work out later what each mark meant.
Adding at least the impression term 105 makes the meaning of the scene clear and makes later editing quick and efficient.
For this usage, the apparatus can be set to return automatically to the normal viewing screen once the impression term 105 of the second hierarchy level has been selected.
FIG. 22 shows examples of five-sense impression terms.
The positive and negative impression terms 105 described so far, which are based on emotions such as joy, anger, sorrow and pleasure and on preferences such as likes and dislikes, can be used in common for video content of every genre. FIG. 22 additionally summarizes the adjectival impression terms 105 relating to the five senses of sight, hearing, taste, smell and touch, which are assigned to and registered in the impression categories 104 numbered 5 to 10 in the first hierarchy level.
Naturally, visual impressions are the most numerous for video content, so they are divided into three categories: shape impressions, spatial impressions, and brightness and color-tone impressions.
For auditory impressions, terms are registered so that music programs can also be handled.
Smell is assigned together with taste, and terms are registered so that gourmet and cooking programs can be handled.
For tactile impressions, impression terms are registered that the viewer feels as if he or she were the main character, for example in an action film.
Taken together, the positive and negative impression terms based on emotions and preferences, the five-sense impression terms described above, and, where necessary, bodily-sensation impression terms that fall outside the five senses, such as "sleepy", "tired" or "drunk", fully cover the impressions of a scene as a person's reaction to the stimulus of the scene's images and sound.
By assigning terms relevant to the genre of each video content to these five-sense and bodily-sensation impression terms 105, and registering the related genre terms 106 appropriately, a suitable heading term can be selected for editing any kind of scene.
However, taste and smell impression terms, for example, are hardly ever used in editing baseball programs, while they are important for cooking programs. The impression terms for the five senses and bodily sensations therefore need not all be supported for every genre of video content; it is enough to make available, genre by genre, the ones needed for that genre, such as sports programs, cooking programs, music programs, travel programs, dramas and action films.
Also, an impression term 105 for the five senses or bodily sensations already carries the meaning of an impression that was seen, heard, tasted, smelled, touched or felt in the scene, so genre terms 106 do not necessarily have to be assigned to the third and fourth hierarchy levels; the necessary genre terms 106 only need to be associated as appropriate.
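Making sense-related impression categories available genre by genre amounts to a per-genre filter on the first hierarchy level. The sketch below is an assumption-laden illustration: the category names are paraphrased from the description, and the assignment of categories to genres is hypothetical, as are the names `SENSE_CATEGORIES_BY_GENRE` and `categories_for`.

```python
# Categories based on emotion and preference apply to every genre.
COMMON_CATEGORIES = ["positive impression", "negative impression"]

# Hypothetical assignment of five-sense categories to genres.
SENSE_CATEGORIES_BY_GENRE = {
    "cooking": ["visual", "taste and smell"],
    "music": ["visual", "auditory"],
    "baseball": ["visual"],
}


def categories_for(genre: str) -> list:
    """First-level impression categories to offer for the given genre."""
    return COMMON_CATEGORIES + SENSE_CATEGORIES_BY_GENRE.get(genre, [])
```

Under this assignment a cooking program offers the taste and smell category while a baseball program does not, and a genre with no entry falls back to the common categories only.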
FIG. 23 shows an example of annotation information data.
It shows the annotation information data 37 produced by the annotation in the examples so far. The title section of the video content records the genre, the program name, the broadcast station name, and the broadcast start and end times. Below the title, the name of the editor who added the annotation information is registered as the individual information 108 in the related information 36. Several different users can thus edit character information and the like for the same title.
Below this information are the times designated and selected between the start and end of the broadcast, each with the annotation information (characters, symbols, and so on) of every hierarchy level: the impression category 104, the impression term 105, and the genre terms 106 are each selected level by level.
A glance at this list is enough to understand the time position of each scene and what it contains.
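The layout described for FIG. 23, a title section followed by one line per annotated scene, maps naturally onto a record with a list of scene entries. The field names below are hypothetical and chosen only to mirror the items listed in the description; the specification does not define a storage format.

```python
from dataclasses import dataclass, field


@dataclass
class SceneAnnotation:
    time: str      # designated time position within the broadcast
    levels: list   # terms or symbols of hierarchy levels 1 to 5


@dataclass
class AnnotationData:
    genre: str
    program: str
    station: str
    start: str
    end: str
    editor: str    # name of the person who added the annotations
    scenes: list = field(default_factory=list)

    def listing(self) -> list:
        """One line per scene: the time, then the terms level by level."""
        return [f"{s.time}  " + " > ".join(s.levels) for s in self.scenes]
```

Printing `listing()` gives the kind of at-a-glance overview the description refers to, with each scene's time position followed by its selected terms.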
One entry in the list is a commercial (CM) that was annotated as "boring/dull". In the same way, unwanted scenes, for example when editing a self-made video, can be picked out by selecting "boring/dull".
As described earlier, such scenes can also be selected and registered from the specific scenes shown in the earlier figures.
FIG. 24 shows an example of editing after viewing.
As explained so far, annotation information is added in response to what the viewer has just watched, so strictly speaking the designated scene position lags behind the scene itself.
The time position therefore only needs to be reset to a scene that precedes the designated one, and this correction to the preceding scene can be made automatically.
How far to move back varies. A commercial, for example, can be designated almost as soon as it starts; a grand slam is only recognizable after several seconds, so the scene runs from the pitcher's throw to the home run; and a highlight scene in a film or drama may be introduced starting about a minute earlier. The automatic fine adjustment of the time position can accordingly be specified as a range, based on the genre of the video content and the selected term.
To make editing still more effective, the editing point (cut point) of the adjacent video can be detected automatically and used as the time position.
Automatic detection of such editing points (cut points) is described in various publications.
It is also possible to review the video content as a whole and correct the annotations, for example whether the chosen annotation terms are appropriate and whether the degrees of impression are consistent.
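The two corrections described for editing after viewing, moving the mark back by a range that depends on genre and term, and then aligning it with a detected cut point, can be combined as follows. This is a sketch under assumptions: the look-back values are hypothetical (the text gives only "almost immediately", "several seconds" and "about a minute"), the cut points are assumed to come from a separate detector, and `corrected_start` is an illustrative name.

```python
import bisect

# Hypothetical look-back ranges in seconds, keyed by (genre, term).
LOOKBACK_SECONDS = {
    ("any", "commercial"): 0.0,          # recognizable almost at once
    ("baseball", "grand slam"): 10.0,    # "several seconds" (assumed value)
    ("drama", "highlight"): 60.0,        # "about a minute"
}


def corrected_start(marked: float, genre: str, term: str, cut_points: list) -> float:
    """Shift a marked time back, then snap to the last cut point before it.

    `cut_points` must be sorted in ascending order (seconds from the start).
    """
    lookback = LOOKBACK_SECONDS.get(
        (genre, term), LOOKBACK_SECONDS.get(("any", term), 0.0)
    )
    target = max(0.0, marked - lookback)
    i = bisect.bisect_right(cut_points, target)
    return cut_points[i - 1] if i > 0 else target
```

If no cut point lies at or before the shifted time, the shifted time itself is returned, so the function still works when cut detection is unavailable.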
Secondary editing after viewing, as described above, yields accurate, high-quality annotation information data 37.
Using the time information in this annotation information data 37, the user can not only cut, join and edit but also access the video content at random and freely perform editing such as creating playlists. Widening the range of uses of video content in this way is the ultimate aim of the present invention; one example is shown below.
FIG. 25 shows an example of database search.
The most distinctive feature of the present invention is that impression terms serve as heading terms, and the genre terms specific to the genre of the video content that relate to them are selected and registered. The terms offered for selection are therefore limited and suitable terms can be chosen, which also keeps the load on the search system low for operations such as fuzzy search.
The finished annotation information data 37 reflects the editor's intentions and can be personal data tailored to the individual user.
Therefore, when the database holds annotation information data 37 from several editors, selecting the individual information 108, that is, the name of the person who added the annotation information, allows a search based on the impressions of that editor, so that scenes of video content matching the intent of the search can be found.
The annotation information database 34, organized by video content title, can also be processed as data in various ways. Searches can be run under any condition, such as the genre of the video content, the category at each hierarchy level, terms, or symbols, to find detailed content within the video content.
In FIG. 25, the editor, the genre of the video content, the search information is input independently from the first layer to the fifth layer, and the character of the video content that meets the search condition from the given scene 107 such as character information of a plurality of video content. The outline in the case of detecting the information etc. addition scene 107 is shown.
It is also free to search the edited terms collectively without being independent.
The above search results can be used for playlists of various types of highlight scenes.
The range of use of the video content is expanded on the basis of various search results such as the highlight digest version of each impression term of the entire video content, and also the degree of impression.
There is no need to explain that it is also effective for editing unnecessary scenes and scenes that you do not want to see other than highlight scenes.
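As a rough illustration of turning search results into a highlight playlist, the sketch below selects scenes by impression term and a minimum degree of impression, then orders them for digest playback. The tuple layout, the numeric degree scale, and the `build_playlist` name are assumptions made for this example.

```python
def build_playlist(scenes, impression_term, min_degree=1):
    """Build a digest playlist from annotated scenes.

    scenes: iterable of (title, time_sec, impression_term, degree) tuples,
            where degree is a hypothetical integer strength of the impression.
    """
    selected = [s for s in scenes if s[2] == impression_term and s[3] >= min_degree]
    # Strongest impressions first; ties fall back to title, then scene time.
    selected.sort(key=lambda s: (-s[3], s[0], s[1]))
    return [(title, time_sec) for title, time_sec, _, _ in selected]


# Illustrative data only
scenes = [
    ("Cup final", 754, "exciting", 3),
    ("Cup final", 1310, "disappointing", 2),
    ("League match", 402, "exciting", 1),
    ("League match", 2875, "exciting", 3),
]

playlist = build_playlist(scenes, "exciting", min_degree=2)
```

Selecting on a different impression term, for example one used to mark unwanted scenes, would give the list of scenes to cut instead of the list to play.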
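As one way of extending the present invention, the editing term dictionary 19 may be updated with the latest data over an Internet connection, as shown in the drawings, or data for an individual program may be distributed or downloaded before the program starts.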
As another way of extending the present invention, if the hierarchical structure of the EPG data of broadcast programs can be made to link directly with the editing term dictionary 19, the device configuration is simplified and the range of applications expands further.
To summarize the explanation so far, the benefits on the system development side are as follows.
In the dictionary structure, adjectival terms that express impressions (impression terms), which can be used for various editing scenes including highlight scenes in every genre of video content, serve as heading terms, and the genre terms related to each heading term are registered separately for each genre of video content. The terms in the dictionary are therefore limited within each genre, the burden of building the dictionary is small, and genre terms can be freely added or deleted according to how often they are used.
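One way to picture this dictionary structure is a fixed list of heading terms shared by all genres, plus a per-genre list of genre terms under each heading term. The class below is a hypothetical sketch of that idea; the class name, method names, and sample terms are illustrative and do not come from the patent.

```python
class EditingTermDictionary:
    """Heading terms are shared by every genre; genre terms are stored per genre."""

    def __init__(self, heading_terms):
        self.heading_terms = list(heading_terms)
        self._genre_terms = {}  # (genre, heading term) -> list of genre terms

    def register(self, genre, heading, term):
        if heading not in self.heading_terms:
            raise KeyError(f"unknown heading term: {heading}")
        terms = self._genre_terms.setdefault((genre, heading), [])
        if term not in terms:
            terms.append(term)

    def delete(self, genre, heading, term):
        terms = self._genre_terms.get((genre, heading), [])
        if term in terms:
            terms.remove(term)

    def candidates(self, genre, heading):
        """Genre terms offered to the user for one genre and one heading term."""
        return list(self._genre_terms.get((genre, heading), []))


# Illustrative data only
dictionary = EditingTermDictionary(["exciting", "moving"])
dictionary.register("soccer", "exciting", "goal")
dictionary.register("soccer", "exciting", "penalty kick")
dictionary.register("cooking", "exciting", "flambe")

soccer_terms = dictionary.candidates("soccer", "exciting")
```

Because a lookup is keyed by genre and heading term, the user only ever chooses from a short list, which is the property the text relies on to keep both dictionary construction and search light.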
Because the genre classification of EPG data can be used as the genre of the video content, the dictionary structure is easy to standardize, and EPG data also makes it possible to hold terms for individual programs. The dictionary data can be downloaded over a communication line such as the Internet, so it can be freely updated to the latest version.
Because the terms used in the dictionary are limited, searching the completed annotation information data requires no fuzzy search or similar processing, so accurate and efficient searches can be performed without burdening the device.
Hardware development of the device is easy, because the button functions of the button-type remote controls already widely available on the market can be used as they are.
For voice operation, annotation information data can be created by recognizing speech as at most 30 pieces of signal information, so voice recognition also places little burden on the device.
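The point about a vocabulary of at most 30 signals can be illustrated with a small lookup table. The sketch below covers only the step from an already recognized word to a signal code; speech recognition itself is out of scope, and the vocabulary, the signal names, and `make_decoder` are hypothetical.

```python
MAX_COMMANDS = 30  # upper bound on distinct signals stated in the text


def make_decoder(vocabulary):
    """vocabulary: dict mapping a recognized word to a signal code."""
    if len(vocabulary) > MAX_COMMANDS:
        raise ValueError("vocabulary exceeds the 30-command limit")
    table = {word.strip().lower(): code for word, code in vocabulary.items()}

    def decode(utterance):
        # Exact lookup only; anything outside the vocabulary returns None.
        return table.get(utterance.strip().lower())

    return decode


# Illustrative vocabulary only
decode = make_decoder({
    "mark": "SCENE_MARK",
    "one": "SELECT_1",
    "two": "SELECT_2",
    "enter": "CONFIRM",
})
```

With a closed vocabulary the decoder needs no fuzzy matching: an utterance either maps to one of the known signals or is rejected.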
On the other hand, the benefits on the system user side are as follows.
Character information and the like can be entered, with impression terms as heading terms, by using the button functions of a familiar button-type remote control just as they are, without affecting the viewing environment. Operation therefore feels natural, every genre of video content is covered, and anyone can easily keep up with a broadcast and use the system in real time.
Depending on the viewing environment, a microphone can also be connected so that character information and the like is entered by voice recognition. Because at most about 30 distinct utterances are used, no special training is needed and misrecognition is rare.
Operating the system according to the impression felt while viewing guides the user to appropriate editing terms, so anyone can create optimal annotation information, and each person can build private annotation information data to suit their own preferences. The degree of impression can also be registered easily, so searches can be tailored to each person's types and degrees of impression.
Annotation information added to scenes during viewing can be searched in order to jump directly to a given scene by random access, or to play back only favorite scenes from multiple video contents consecutively as a digest.
Using time-shift (chase playback) means ensures that no scene is missed partway through the video content.
Terms for new information and the like can be downloaded by genre or by individual program through broadcast data or a communication line, so annotation information can be created with the latest terms best suited to the video content.
As described above, the present invention is an annotation information providing system that requires no special devices, components, or assembly techniques and can be realized with devices, components, and assembly techniques that are already widely available on the market. It can be used widely, not only in general-purpose home recording devices, video cameras, and editing devices but also in professional video equipment.
1 Video device
2 UI (user interface) device
3 Remote control
4 Microphone
5 Remote control transmission signal
6 Microphone voice signal
7 Video signal
8 GUI (graphical user interface) display signal
9 Main display
10 Sub display
11 [...] unit
12 Control unit
13 Display unit
14 Video content recording, playback, and display unit
15 Received signal
16 Dictionary registration unit
17 Dictionary term download unit
18 Dictionary term keyboard input unit
19 Editing term dictionary
20 Dictionary term selection unit
21 Remote control signal reception unit
22 Device information recognition unit
23 Genre selection unit
24 Voice recognition unit
25 Annotation information creation unit
26 Internet communication signal
27 Keyboard signal
28 Dictionary data
29 Selected term
30 Annotation information data search unit
31 Video content storage unit (or installed video content)
32 Title
33 Video content
34 Annotation information database
35 Time information
36 Related information
37 Annotation information data
38 Display selection switch
39 Antenna input
40 External video input
41 Channel buttons
42 Color buttons
43 Chapter buttons
44 Selection number
45 Selection item
46 Function content
47 Cursor buttons
48 Cursor
49 Remote control operation
50 Voice operation
101 Genre
102 Genre classification
103 Hierarchy
104 Impression classification
105 Impression term
106 Genre term
107 Scene with character information etc. added
108 Individual information
Claims (12)
A system for adding annotation information to an arbitrary scene of video content, including self-made video, the system comprising a video device and a user interface device, wherein
the video device comprises
an editing term dictionary in which heading terms, which are impression terms that express the impression of viewing a scene and are common to scenes of all genres of video content, and genre terms, which are terms specific to a genre of video content, are associated with each other by genre and by hierarchy level of the video content,
the user interface device comprises
means for designating, sequentially from the start of viewing of the video content, a scene position to which annotation information is to be added, sequentially selecting the heading term and the genre term of the editing term dictionary for the designated scene, and transmitting signal information of the designation and selection to the video device, and
the video device further comprises
an annotation information creation unit that creates annotation information data from the editing term dictionary on the basis of the signal information received from the user interface device.
The system for adding annotation information to video content according to claim 1, wherein
the video device includes a remote control signal reception unit,
the user interface device is a remote control having at least 20 operation buttons, and the signal information of the designation and selection is a remote control transmission signal,
the scene position is designated, and the heading term and the genre term of the editing term dictionary are selected, by operating the buttons of the remote control, and
the video device receives the signal information at the remote control signal reception unit and creates the annotation information data from the terms of the editing term dictionary in the annotation information creation unit.
The system for adding annotation information to video content according to claim 1, wherein
the video device includes a voice recognition unit,
the user interface device is a microphone for voice input, and the signal information of the designation and selection is a microphone voice signal,
the scene position is designated, and the heading term and the genre term of the editing term dictionary are selected, by uttering at most 30 kinds of sounds into the microphone, and
the video device recognizes the microphone voice signal as signal information at the voice recognition unit and creates the annotation information data from the terms of the editing term dictionary in the annotation information creation unit.
The system for adding annotation information to video content according to claim 1, wherein the video device further comprises a dictionary download unit that downloads the editing term dictionary over a communication line and a keyboard input unit for an external keyboard used for term registration.
A method for adding annotation information to video content, wherein annotation information data is created by selecting terms from an editing term dictionary in which
heading terms common to all genres of video content and
genre terms, which are terms specific to a genre of video content,
are associated with each other by genre and by hierarchy level of the video content.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2009171668A JP2011029795A (en) | 2009-07-23 | 2009-07-23 | System and method for providing annotation information of video content |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2009171668A JP2011029795A (en) | 2009-07-23 | 2009-07-23 | System and method for providing annotation information of video content |
Publications (1)
Publication Number | Publication Date |
---|---|
JP2011029795A true JP2011029795A (en) | 2011-02-10 |
Family
ID=43638061
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
JP2009171668A Pending JP2011029795A (en) | 2009-07-23 | 2009-07-23 | System and method for providing annotation information of video content |
Country Status (1)
Country | Link |
---|---|
JP (1) | JP2011029795A (en) |
-
2009
- 2009-07-23 JP JP2009171668A patent/JP2011029795A/en active Pending
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2019021088A1 (en) * | 2017-07-24 | 2019-01-31 | International Business Machines Corporation | Navigating video scenes using cognitive insights |
US10970334B2 (en) | 2017-07-24 | 2021-04-06 | International Business Machines Corporation | Navigating video scenes using cognitive insights |
WO2021039129A1 (en) * | 2019-08-29 | 2021-03-04 | Sony Corporation | Information processing device, information processing method, and program |
US20220283700A1 (en) * | 2019-08-29 | 2022-09-08 | Sony Group Corporation | Information processing device, information processing method, and program |
JP7605113B2 (en) | 2019-08-29 | 2024-12-24 | Sony Group Corporation | Information processing device, information processing method, and program |
JP7697467B2 (en) | 2020-07-15 | 2025-06-24 | Sony Group Corporation | Information processing device, information processing method, and program |