
TWI794512B - System and apparatus for augmented reality and method for enabling filming using a real-time display

Info

Publication number
TWI794512B
Authority
TW
Taiwan
Prior art keywords
display
image
camera
relative
displayed
Application number
TW108120717A
Other languages
Chinese (zh)
Other versions
TW202032978A (en)
Inventor
承恩 許
勒內 阿馬多爾
威廉 赫爾瓦世
麥可 派樂席亞
Original Assignee
美商亞沃公司
Priority claimed from US 16/210,951 (US10740958B2)
Application filed by 美商亞沃公司
Publication of TW202032978A
Application granted
Publication of TWI794512B

Landscapes

  • Processing Or Creating Images (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

A system for real-time updates to a display based upon the location of a camera or a detected location of a human viewing the display or both is disclosed. The system enables real-time filming of an augmented reality display that reflects realistic perspective shifts. The display may be used for filming, or may be used as a “game” or informational screen in a physical location, or other applications. The system also enables the use of real-time special effects that are centered upon an actor or other human to be visualized on a display, with appropriate perspective shift for the location of the human relative to the display and the location of the camera relative to the display.

Description

System and apparatus for augmented reality and method for enabling filming using a real-time display

The present invention relates to augmented reality projection of backgrounds for filmmaking and other purposes. More specifically, the invention relates to enabling real-time filming of a projection screen while properly computing the appropriate perspective shifts within the projected content to correspond to real-time camera movement, or to the movement of an individual relative to the screen.

Various solutions exist for so-called "green screen" filming. Traditional green screen filming relies on actors being shot in front of (or sometimes wrapped in) a single color. Then, in post-production, digital objects, scenes, characters, movement, and the like can be added to the scene. For example, in Ironman® and related Marvel® films, Robert Downey Jr. occasionally performs in a physical Iron Man suit, but also performs in a green suit that enables post-production artists to add movement and animation to the Iron Man suit that would not be feasible, or would be difficult or expensive, to create using physical props and costumes.

The drawback of green screen filming is that, for actors and directors who must all imagine the characters they are speaking with, the locations in which they are operating, or even the table at which they are sitting, green screen filming reduces an individual's overall sense of immersion in the scene. Many times the effect on the quality of the performance and the film is negligible, but it can lower quality. It also makes it far more difficult for the director to determine, while filming, whether a scene is "right." The post-production process may add unusual things or require a particular perspective that was not captured. As a result, many of these types of scenes end up being corrected in reshoots. Reshoots and additional post-production add to the cost of production and the time required to complete a film. The rendering required for green screen footage, if it must be corrected, can consume significant periods of time, depending on the length of the scene, in some cases up to days or weeks.

There are also certain filming techniques that attempt to bridge this gap by providing, "in place of" a green screen, a rendered scene (e.g., a three-dimensional, high-quality scene) on a large display, or by providing a scene already rendered for display. These systems typically film the individuals, then use computer vision techniques to detect the individuals' positions, movements, and the like. Video of the individuals can then be digitally superimposed within the scene.

The drawback of these techniques is that they typically incorporate a significant delay. The individuals must be captured on film, computer vision must then be applied to that video, and only then can the individuals be superimposed in an existing three-dimensional digital scene. The delay is typically several seconds. In the best implementations, the delay is approximately 3 to 5 frames of video. This may not sound like much, on the order of seconds, but if a character within a scene is to react to something happening in the digital scene, the reaction can appear perceptibly delayed to a watching audience, or special cues must be arranged so that the actors can react appropriately. Generally speaking, it is easier to add such actions later in post-production, where objects in the scene take their cues from the actors' feigned reactions.

There are also augmented reality systems that superimpose objects within "reality," or within a video of reality that is delivered with essentially no lag. These systems rely on trackers that monitor the position of the wearer of an augmented reality headset (most are headsets, but other forms exist) so as to continuously update the positions of the generated objects within the real scene. Less complex systems rely only on motion trackers, while more robust systems rely on external trackers, such as cameras or fixed infrared tracking systems with fixed infrared points (and a tracker on the headset), or infrared points on the headset (and a fixed infrared tracker in a known position relative to the headset). These infrared points may be referred to as beacons or tracking points. Still other systems rely at least in part on infrared depth mapping of the room or space, or on LIDAR depth mapping of the space. Other depth mapping techniques are also known. A depth map yields the physical location of the geometry associated with a position in the sensor's field of view. These systems enable the augmented reality system to intelligently place characters or other objects within the space (e.g., not inside a table or in a wall) at an appropriate distance from the augmented reality viewer. However, augmented reality systems typically present only from the perspective of, and only to, an individual viewer.

Virtual reality systems are similar, but fully render an alternate reality in which an individual is placed. The degree of immersion varies, and the world in which a user is placed varies in quality and interactivity. But again, these systems present almost entirely from the single perspective of one user. First-person perspectives like those of augmented reality and virtual reality are occasionally used in traditional film and television shoots, but they are not in general use.

In a related field, there are interactive screens with which a user can interact, for example, to play a game in a mall or shopping center or to search for a store within the mall. These screens typically incorporate limited functionality. Some functionality can physically track an individual's interaction with the screen, for example, as is commonly used to enable interaction with a game. But these screens generally do not react or alter themselves based upon an attempt to recreate the perspective of a given scene for a particular individual interacting with the screen.

Finally, post-production special effects can add lighting, objects, or other elements to actors or individuals. For example, "lightning" may be projected outward from Thor's hammer in a Marvel film, or laser beams may emanate from Iron Man's hands. However, there is currently no system in which live, real-time effects can be applied to an actor and adjusted relative to that actor's position within the scene.

100: system
110: camera
112: associated tracker
120: workstation
130: display
142: associated tracker
144: associated tracker
150: network
200: computing device
210: processor
212: memory
214: storage
216: network interface
218: I/O interface
300: system
310: camera
312: tracker
313: communication interface
314: tracking system
315: communication interface
316: media generation
320: workstation
321: communication interface
322: position calculation
323: resource storage
324: image generation
325: calibration function
326: management/user interface
330: display
335: communication interface
336: image rendering
342: tracker
344: tracker
346: communication interface
347: tracking system
348: communication interface
349: tracking system
410: camera
430: display
434: crosshair
436: background object
438: background object
442: tracker
444: tracker
510: camera
530: display
536: background object
538: background object
542: tracker
544: tracker
560: actor
562: actor
610: camera
630: display
636: background object
638: background object
642: tracker
644: tracker
660: actor
662: actor
710: camera
730: display
736: background object
738: background object
739: touchscreen sensor
742: tracker
744: tracker
762: human
805: step
810: step
820: step
825: step
830: step
840: step
850: step
895: step
905: step
910: step
920: step
930: step
935: step
950: step
960: step
965: step
995: step
1005: step
1010: step
1020: step
1030: step
1040: step
1050: step
1095: step
1105: step
1110: step
1120: step
1130: step
1135: step
1150: step
1160: step
1165: step
1170: step
1175: step
1195: step
1205: step
1210: step
1220: step
1230: step
1235: step
1240: step
1250: step
1260: step
1270: step
1275: step
1295: step

FIG. 1 is a block diagram of a system for generating and capturing augmented reality displays.
FIG. 2 is a block diagram of a computing device.
FIG. 3 is a functional diagram of a system for generating and capturing augmented reality displays.
FIG. 4 is a functional diagram of the calibration of a system for generating and capturing augmented reality displays.
FIG. 5 is a functional diagram of camera position tracking for a system for generating and capturing augmented reality displays.
FIG. 6 is a functional diagram of camera position tracking, while moving, for a system for generating and capturing augmented reality displays.
FIG. 7 is a functional diagram of human position tracking, as the human moves, for a system for dynamically updating an augmented reality screen for interaction with a viewer.
FIG. 8 is a flowchart of a process for camera and display calibration.
FIG. 9 is a flowchart of a process for position tracking.
FIG. 10 is a flowchart of a process for calculating camera position during position tracking.
FIG. 11 is a flowchart of a process for human position tracking.
FIG. 12 is a flowchart of a process for human position tracking and superimposition of AR objects in conjunction with a human.

Throughout this description, elements appearing in the figures are assigned three-digit reference designators, where the most significant digit is the figure number and the two least significant digits are specific to the element. An element not described in conjunction with a figure may be presumed to have the same characteristics and function as a previously described element having a reference designator with the same least significant digits.

Related Application Information

This patent claims priority to U.S. Provisional Patent Application No. 62/685,386, filed June 15, 2018 and entitled "AUGMENTED REALITY BACKGROUND FOR USE IN MOTION PICTURE FILMING".

This patent claims priority to U.S. Provisional Patent Application No. 62/685,388, filed June 15, 2018 and entitled "AUGMENTED REALITY WALL WITH VIEWER TRACKING AND INTERACTION".

This patent claims priority to U.S. Provisional Patent Application No. 62/685,390, filed June 15, 2018 and entitled "AUGMENTED REALITY WALL WITH COMBINED VIEWER AND CAMERA TRACKING".

This patent is also a continuation-in-part of U.S. Non-Provisional Patent Application No. 16/210,951, filed December 5, 2018 and entitled "AUGMENTED REALITY BACKGROUND FOR USE IN LIVE-ACTION MOTION PICTURE FILMING", which claims priority to U.S. Provisional Patent Application No. 62/595,427, filed December 6, 2017 and entitled "AUGMENTED REALITY BACKGROUND FOR USE IN LIVE-ACTION MOTION PICTURE FILMING".

The disclosures of each of these applications are incorporated by reference.

Description of Apparatus

Referring now to FIG. 1, a block diagram of a system 100 for generating and capturing augmented reality displays is shown. The system 100 includes a camera 110, an associated tracker 112, a workstation 120, a display 130, and associated trackers 142 and 144, all interconnected by a network 150.

The camera 110 is preferably a digital film camera, such as those from RED®, or another high-end camera used to capture video content for theatrical release or for release as television programming. Increasingly, consumer digital cameras are nearly as good as these professional-grade cameras. Thus, in some cases, lower-end cameras manufactured primarily for capturing still images or movies for home or online consumption may also be used. The camera is preferably digital, but in some cases an actual traditional film camera may be used in conjunction with the display 130, as discussed below. The camera may be, or may incorporate, a computing device, discussed below with reference to FIG. 2.

The camera 110 incorporates or is attached to a tracker 112. The physical relationship between the camera 110 and the tracker is such that the position of the tracker relative to the lens (or, more accurately, the focal point of the lens) is known or can be known. This known distance and relationship allows the overall system to derive an appropriate perspective from the viewpoint of the camera lens by extrapolating from a tracker that is not at the precise viewpoint of the camera lens.
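
By way of a concrete sketch (not taken from the patent itself): if the tracking system reports the tracker's position and orientation, and the tracker-to-focal-point offset has been measured once for the rig, the lens position follows from a single rigid-body transform. All names below are illustrative.

import numpy as np

def lens_position(tracker_pos, tracker_rot, lens_offset_local):
    # tracker_pos: tracker position reported by the tracking system (3-vector).
    # tracker_rot: 3x3 rotation matrix for the tracker's orientation.
    # lens_offset_local: measured tracker-to-focal-point offset, expressed
    # in the tracker's own frame (a fixed property of the camera rig).
    p = np.asarray(tracker_pos, dtype=float)
    R = np.asarray(tracker_rot, dtype=float)
    d = np.asarray(lens_offset_local, dtype=float)
    return p + R @ d

# Example: tracker mounted 5 cm above and 12 cm behind the focal point.
print(lens_position([2.0, 1.5, 4.0], np.eye(3), [0.0, -0.05, 0.12]))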

Thus, for example, the tracker 112 may incorporate an infrared LED (or LED array) having a known configuration such that an infrared camera can detect the infrared LED(s) and thereby derive a very accurate distance, position, and orientation relative to the infrared camera. Other trackers may be fiducial markers, visible LEDs (or other lights), or physical characteristics (such as shapes or computer-visible images). The tracker 112 may be a so-called inside-out tracker, in which the tracker 112 is a camera that tracks external LEDs or other markers. Various tracking schemes are known, and essentially any of them may be employed in the present system.

In this description, the word "tracker" is used to refer generally to a component used to perform position and orientation tracking. The trackers 142 and 144 discussed below may be counterparts to the tracker 112 discussed here. Tracking typically involves at least one fixed "tracker" and one moving "tracker." The fixed tracker(s) are used so that the position of the moving tracker can be accurately tracked. But which of the fixed tracker and the moving tracker actually performs the act of tracking (e.g., noticing movement) varies from system to system. Thus, as used herein, the particulars are not specifically relevant, except to note that relative positions are known and tracked and that, thereby, the position of the camera 110 (more accurately, the camera 110 lens) can be tracked in three-dimensional space. The camera preferably employs a set of infrared LED lights tracked by a pair of infrared cameras, which thereby derive the relative positions of the infrared LED lights (attached to the camera 110).

In fact, the camera 110 may be multiple cameras, though only one camera 110 is shown. In some cases, for example, a camera 110 may be mounted within, behind, or in a known position relative to the display and/or the trackers 142 and 144. Such a camera may be used to track the position of an individual in front of the display 130. For example, the system may operate to shift the perspective of a scene or series of images shown on the display 130 in response to position information detected for a human (e.g., a human head) in front of the display 130 viewing content on the display 130, rather than tracking the camera 110 itself. In such a case, the display 130 may operate less as a background for filming content and more as an interactive display suitable for operating as a "game" or for presenting other content to a human viewer.

To enable this interaction, the camera 110 may be or include an infrared camera coupled with an infrared illuminator, or a LIDAR, or an RGB camera coupled with suitable programming for tracking a human face or head. As discussed herein, in those cases in which a human is tracked rather than a camera (like camera 110), the scene presented on the display may be updated based upon the human face rather than the camera. In other contexts, discussed more fully below, both a human and an associated camera (like camera 110) may be tracked, so that the camera 110 can film an augmented reality background and so that an augmented reality augmentation (discussed below), visible only on the display 130, can be generated for individuals. Although tracking of the camera 110 is important, it may not be used in some particular implementations, or it may be used in conjunction with human tracking in other particular implementations. These are discussed more fully below.

The workstation 120 is a computing device, discussed below with reference to FIG. 2, responsible for using the trackers 112, 142, 144 to calculate the position of the camera relative to the display 130. The workstation 120 may be a personal computer or a workstation-class computer incorporating a relatively high-end processor designed for video game world/virtual reality rendering or for graphics processing (such as a computer designed for rendering three-dimensional graphics for computer-aided design (CAD) or for three-dimensionally rendered film production). These types of computing devices may incorporate specially designed, special-purpose hardware (such as one or more graphics processing units (GPUs)) and instruction sets designed for vector graphics processing, shading, ray tracing, applying textures, and other capabilities. GPUs typically employ memory that is faster than that of a general-purpose central processing unit, and instruction sets better tailored to the types of mathematical processing that graphics processing routinely requires.

The workstation 120 uses the network (or other computing systems) to interact with at least the tracker 112 and the trackers 142, 144, and to interact with the display 130. The workstation 120 may also communicate with the camera 110 that is capturing live-action data. Alternatively, the camera 110 may store the data it captures on its own system (e.g., storage capability native to the camera 110 or inserted into the camera 110), on another remote system (a live digital image storage system), or both.

The display 130 is a large display screen, or a display screen capable of filling a scene, serving as a background for filming live-action actors in front of the display. A typical display may be approximately 20 to 25 feet wide by 15 to 20 feet high, although various aspect ratios may be used, and screens of different sizes (e.g., to fill a window of an actual physical set, or to fill an entire wall of a warehouse-sized building) are feasible. Although shown as a two-dimensional display, the display 130 may be hemispherical or nearly hemispherical, designed to serve as a "dome" on which a scene completely surrounding the actors and a filming camera may be displayed. The use of a hemisphere enables more dynamic footage involving live actors in a fully realized scene, with cameras capturing the scene from different angles simultaneously.

The display 130 may be a single large LED or LCD or other-format display, such as the displays used in conjunction with large screens at sporting events. The display 130 may be a merger of many smaller displays placed so close to one another that there are no empty spaces or gaps. The display 130 may be a projector projecting onto a screen. Various forms of display 130 may be used.

The display 130 displays a scene (or more than one scene), and any objects within it, from the perspective of the camera 110 (or of a person, discussed below), behind or in conjunction with any live actors operating in front of the display 130. As the camera moves within the trackers' field of view, the workstation 120 can use the trackers 112, 142, 144 to derive the appropriate perspective in real time.

The trackers 142, 144 are trackers (discussed above) oriented in a known relationship to the display 130. In a typical setup, two trackers 142, 144 are employed, each in a known relationship to one of the top corners of the display 130. As may be understood, additional or fewer trackers may be employed depending on the overall system setup. The known relationship of the tracker(s) 142, 144 to the display 130 is used to determine the full extent of the size of the display 130, and to derive the appropriate perspective for the camera 110 to be shown on the display 130 based upon the positions provided by the trackers 112, 142, 144 and calculated by the workstation 120. The trackers 112, 142, 144 may be, or may include, a computing device as discussed below with reference to FIG. 2.

The network 150 is a computer network, which may include the internet, but may also include other connectivity systems such as Ethernet, wireless internet, Bluetooth®, and other communication types. For some aspects of the network 150, serial and parallel connections, such as USB®, may also be used. The network 150 enables communication between the various components making up the system 100.

Turning now to FIG. 2, a block diagram of a computing device 200 is shown, representative of the camera 110 (in some cases), the workstation 120, and the trackers 112, 142, and 144 (as the case may be) of FIG. 1. The computing device 200 may be, for example, a desktop or laptop computer, a server computer, a tablet, a smartphone, or another mobile device. The computing device 200 may include software and/or hardware for providing the functionality and features described herein. The computing device 200 may therefore include one or more of: logic arrays, memories, analog circuits, digital circuits, software, firmware, and processors. The hardware and firmware components of the computing device 200 may include various specialized units, circuits, software, and interfaces for providing the functionality and features described herein.

The computing device 200 has a processor 210 coupled to a memory 212, storage 214, a network interface 216, and an I/O interface 218. The processor 210 may be or include one or more microprocessors, special-purpose processors for particular functions, field-programmable gate arrays (FPGAs), application-specific integrated circuits (ASICs), programmable logic devices (PLDs), and programmable logic arrays (PLAs).

The memory 212 may be or include RAM, ROM, DRAM, SRAM, and MRAM, and may include firmware, such as static data or fixed instructions, BIOS, system functions, configuration data, and other routines used during the operation of the computing device 200 and the processor 210. The memory 212 also provides a storage area for data and instructions associated with applications and data handled by the processor 210. As used herein, the term "memory" corresponds to the memory 212 and explicitly excludes transitory media such as signals or waveforms.

The storage 214 provides non-volatile, bulk, or long-term storage of data or instructions in the computing device 200. The storage 214 may take the form of a magnetic or solid-state disk, tape, CD, DVD, or another reasonably high-capacity addressable or serial storage medium. Multiple storage devices may be provided to, or be available to, the computing device 200. Some of these storage devices may be external to the computing device 200, such as network storage or cloud-based storage. As used herein, the terms "storage" and "storage medium" explicitly exclude transitory media such as signals or waveforms. In some cases, such as those involving solid-state memory devices, the memory 212 and the storage 214 may be a single device.

The network interface 216 includes an interface to a network, such as the network 150 (FIG. 1). The network interface 216 may be wired or wireless.

The I/O interface 218 interfaces the processor 210 to peripherals (not shown) such as displays, video and still cameras, microphones, keyboards, and USB® devices.

FIG. 3 is a functional diagram of a system 300 for generating and capturing augmented reality backgrounds for filming. The system 300 includes a camera 310, a tracker 312, a tracker 342, a tracker 344, a display 330, and a workstation 320.

The camera 310, the tracker 312, the workstation 320, the display 330, and the trackers 342 and 344 each include a communication interface 315, 313, 321, 335, 346, and 348, respectively. Each of the communication interfaces 315, 313, 321, 335, 346, and 348 is responsible for enabling its device or component to communicate data with the other devices or components. The communication interfaces 315, 313, 321, 335, 346, and 348 may be implemented in software, with some portion of their capabilities implemented using hardware.

The camera 310 also includes media generation 316, responsible for capturing media (e.g., a scene) and storing that media to a storage location. The storage location may be local to the camera 310 (not shown), or may be remote on a server (or servers) or workstation computer (also not shown). Any typical process for capturing and storing digital images or traditional film images may be used. However, in cases in which the camera itself incorporates the tracker 312, the data associated with tracking may be communicated using the workstation 320. Or, in some cases, visual data captured by the media generation 316 of the camera 310 may be used to augment the tracking data provided by the tracker 312, and that data may be provided to the workstation 320.

The trackers 312, 342, and 344 each include a tracking system 314, 347, 349. As discussed above, a tracking system may take many forms, and one device may track another or vice versa. The relevant point is that the tracker 312, attached to the camera 310 in a known relative position, may be tracked relative to the display 330 with reference to the trackers 342, 344. In some cases, more or fewer trackers may be used. The trackers 312, 342, 344 are operable to track the camera 310, but may also track a human in front of the display 330.

The display 330 includes image rendering 336. This is a functional description intended to encompass many things, including instructions for generating images on the screen, storage for those instructions, one or more frame buffers (which in some cases may be disabled for speed), and any screen refresh system in communication with the workstation 320. The display 330 displays the images provided by the workstation 320 for display on the display 330. As directed by the workstation 320, the images shown on the display are updated, based upon the trackers 312, 342, 344, to correspond to the current position of the camera 310 lens.

The workstation 320 includes position calculation 322, resource storage 323, image generation 324, a calibration function 325, and a management/user interface 326.

The position calculation 322 uses the data generated by the tracking systems 314, 347, 349 in each of the trackers 312, 342, 344 to generate position information for the camera 310 (or a human, or both), based upon the known relationships between the trackers 342, 344 and the display, and between the tracker 312 and the camera 310 lens. In the most typical cases, the relative distances may be used geometrically to derive the distance and height of the camera 310 (in reality, the tracker 312 on the camera 310) relative to the display 330. The position calculation 322 uses that data to derive a position. The details of a typical calculation are presented below with respect to FIGS. 4 and 5.
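
As a minimal sketch of the kind of geometry this implies, assuming the two display trackers sit at the ends of the display's top edge and the tracking system also reports a downward angle to the camera's tracker (as in the discussion of FIG. 5 below), the camera's position can be recovered from the two measured distances. The function and parameter names, including elev_deg, are illustrative, not from the patent.

import math

def camera_from_corner_distances(w, d1, d2, elev_deg):
    # w: span between the two top-corner trackers (meters).
    # d1, d2: measured distances from the left and right trackers to the
    # camera's tracker. elev_deg: downward angle from the top edge to the
    # camera, which resolves the remaining ambiguity (a circle of points).
    x = (d1**2 - d2**2 + w**2) / (2 * w)         # offset along the top edge
    r = math.sqrt(max(d1**2 - x**2, 0.0))        # radial distance from edge
    drop = r * math.sin(math.radians(elev_deg))  # height below the top edge
    depth = r * math.cos(math.radians(elev_deg)) # distance in front of wall
    return x, drop, depth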

The resource storage 323 is a storage medium, and potentially a database or data structure, used to store the data used to generate the images on the display 330. The resource storage 323 may store three-dimensional maps of locations; associated textures and colors; any animation data; any characters (including their own three-dimensional models along with texture and animation data); and any special effects or other elements that a director or art director wishes to incorporate into a live-action background. These resources are used by the image generation 324, discussed below.

The image generation 324 is essentially a modified video game graphics engine. The image generation 324 may be more complex, and may incorporate functions and elements not present in a video game graphics engine, but generally speaking, a video game graphics engine is software designed to present a three-dimensional world to a viewer on a two-dimensional display. The world is made up of the elements stored in the resource storage 323, as described by a map file or other file format suitable for defining the elements, and any actions, within an overall background scene. The image generation 324 may include a scripting language that enables the image generation 324 to cause events to occur, to trigger events, or to time events involving other resources, animations, or the background. The scripting language may be designed in such a way that it is relatively simple for a person who is not computer-savvy to trigger events. Or, a technical director may be employed to ensure that the scripts operate smoothly.
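
Purely as an illustration of the kind of script hook described, a toy trigger table might look like the following; the event names and scene structure here are hypothetical, not from the patent.

# Toy version of a script-driven event trigger: a table maps event names
# to actions that mutate the scene state.
triggers = {
    "thunderclap": lambda scene: scene.update({"sky": "lightning"}),
    "door_open": lambda scene: scene.update({"door": "open"}),
}

def fire(event, scene):
    # Run the scripted action bound to an event name, if one exists.
    action = triggers.get(event)
    if action is not None:
        action(scene)

scene = {"sky": "overcast", "door": "closed"}
fire("thunderclap", scene)  # scene["sky"] becomes "lightning"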

The calibration function 325 operates to set a baseline position for the camera 310 and baseline characteristics for the display 330. Initially, the image generation 324 and the position calculation 322 are unaware of the actual size and dimensions of the display, so the system 300 must typically be calibrated. There are various ways to calibrate a system like this one. For example, a user could hold a tracker at each corner of the display and make a "note" to the software of which corner is which. This is time-consuming and not particularly user-friendly. A film crew would loathe such a cumbersome setup process every time the scene changes.

An alternative procedure involves merely setting up trackers with known positions relative to the top of the display (for a display of any size, the two top display corners). Next, the image generation 324 may be instructed to enter a calibration mode and to display a crosshair or other symbol at the center of the display 330. The tracker 312 may then be held at the center of the display, at the position noted in the software. By finding the exact center (or close enough, within tolerances), the calibration function 325 can extrapolate the full size of the display. The three points of the tracker 342, the tracker 344, and the tracker 312 define a plane, so the calibration function 325 can determine the angle and placement of the display plane. Also, the distance from the center of the display to the upper-left corner is the same as the distance from the center of the display to the lower-right corner, and likewise for the opposite pair of corners. The calibration function 325 can therefore determine the full size of the display. Once those two elements are known, the display can be defined in terms readily translatable to a traditional video game engine and to the image generation 324.
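
A minimal sketch of that extrapolation, assuming the tracking system reports all three points in a common frame (names illustrative): the top edge gives the width, the corner-to-center geometry gives the height, and the three points give the plane.

import numpy as np

def calibrate_display(top_left, top_right, center):
    # All arguments are 3-vectors in the tracking system's frame: the two
    # top-corner tracker positions and the tracker held at the on-screen
    # crosshair (display center).
    tl, tr, c = (np.asarray(v, dtype=float) for v in (top_left, top_right, center))
    width = np.linalg.norm(tr - tl)             # top edge spans the width
    top_mid = (tl + tr) / 2.0
    height = 2.0 * np.linalg.norm(c - top_mid)  # center sits halfway down
    normal = np.cross(tr - tl, c - tl)          # three points define a plane
    normal /= np.linalg.norm(normal)
    return width, height, normal

w, h, n = calibrate_display([0, 10, 0], [15, 10, 0], [7.5, 5, 0])
# w = 15.0, h = 10.0, n = unit normal of the display plane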

The management/user interface 326 may be a more traditional user interface on the display 330, or on a separate display of the workstation 320. The management/user interface 326 may enable an administrator of the system 300 to adjust certain settings, switch between different scenes, cause actions to occur, design and trigger scripted actions, add or remove objects or background characters, or restart the system. Other functions are also possible.

FIG. 4 is a functional diagram of the calibration of a system for generating and capturing augmented reality backgrounds for filming. FIG. 4 includes a camera 410 (associated tracker not shown), a display 430, and trackers 442, 444. The display 430 incorporates various background objects 436, 438.

The camera 410 is brought close to the crosshair 434 shown on the display 430. The distances from the trackers 442 and 444 may be noted by the system. As discussed above with respect to FIG. 3, this enables the calibration to account for the position of the display, as a two-dimensional plane, relative to the camera 410 as the camera is moved away from the display 430. Also, the center position enables the system to determine the overall height and width of the display 430 without manual input from a user. If anything goes wrong, recalibration is relatively simple: merely re-enter calibration mode and place the camera 410 back at the crosshair 434.

In the case of tracking a human, calibration may be avoided entirely by knowing the absolute position of the human-tracking camera 410 (or other sensor) relative to the display 430 itself. In such cases, no calibration at all may be needed, at least for the camera or other sensor that tracks the user's position relative to the display 430.

FIG. 5 is a functional diagram of camera position tracking for a system for generating and capturing augmented reality backgrounds for filming. Here, the same camera 510 (associated tracker not shown), display 530, and trackers 542, 544 are shown. Since the system has been calibrated, the camera 510 is shown moved away from the display 530. The trackers 542 and 544 can calculate their distances from the camera and any direction (e.g., the angle downward or upward from the calibration point), and geometry can be used to derive the distance and angle (e.g., the perspective) from the camera 510 lens to a center point of the display 530. Together, that data yields the appropriate perspective for the display. That perspective may be used to shift the background in a way that convincingly simulates the effect of an individual moving through a particular scene (e.g., as though the camera were a person and, as that person's position changes, the background changes appropriately based upon that position).
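
The patent does not spell out the projection math, but rendering a scene for a tracked viewpoint onto a fixed, flat screen is commonly realized with a generalized off-axis (asymmetric-frustum) perspective projection. The sketch below, with illustrative names, computes near-plane frustum extents from the calibrated screen corners and the tracked lens position under that assumption.

import numpy as np

def off_axis_frustum(pa, pb, pc, pe, near):
    # pa, pb, pc: screen lower-left, lower-right, upper-left corners (world).
    # pe: tracked eye/lens position. near: near-plane distance.
    pa, pb, pc, pe = (np.asarray(v, dtype=float) for v in (pa, pb, pc, pe))
    vr = pb - pa; vr /= np.linalg.norm(vr)            # screen right axis
    vu = pc - pa; vu /= np.linalg.norm(vu)            # screen up axis
    vn = np.cross(vr, vu); vn /= np.linalg.norm(vn)   # screen normal
    va, vb, vc = pa - pe, pb - pe, pc - pe            # eye-to-corner vectors
    d = -np.dot(va, vn)                               # eye-to-screen distance
    left = np.dot(vr, va) * near / d
    right = np.dot(vr, vb) * near / d
    bottom = np.dot(vu, va) * near / d
    top = np.dot(vu, vc) * near / d
    return left, right, bottom, top  # usable as asymmetric frustum extents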

As shown here, actors 562 and 560 are present in front of the screen. Background objects 536 and 538 are present, but from the perspective of the camera 510, the background object 538 is "behind" the actor 560. The crosshair shown may not be visible during a performance, but is shown to illustrate the relative positions of the camera and the center of the display.

FIG. 6 is a functional diagram of camera position tracking, while moving, for a system for generating and capturing augmented reality backgrounds for filming. This is the same scene as that shown in FIG. 5, but after the camera 610 has shifted to the right relative to the display 630. Here, the actors 662 and 660 have remained relatively fixed, but the position of the camera 610 has changed. From the calculated perspective of the camera 610, the background object 638 has moved out from "behind" the actor 660. This is because the position of the camera 610 (the viewer) has shifted to the right, and an object that was, from that perspective, slightly behind the actor 660 has now moved out from behind the actor. In contrast, again based upon the shift in perspective, the background object 636 has now moved to "behind" the actor 662.

The tracker 642 and the tracker 644 may be used by the system, together with a tracker (not shown) associated with the camera 610, to derive the appropriate new perspective in real time and to alter the display accordingly. As live actors operate in front of the display 630, the workstation computer (FIG. 3) can update the images shown on the display to appropriately reflect the perspective. The crosshair shown may not be visible during a performance, but is shown here to demonstrate the relative positions of the camera and the center of the display.

FIG. 6 is shown with only a single display 630. There are situations in which multiple displays may be used with multiple cameras to produce more than a single perspective (e.g., for coverage of a scene), in which the same, or a different, perspective on the same scene may be filmed on one or more displays. For example, the refresh rate of a single display is typically up to 60 Hz. Film is typically shot at 24 frames per second, with some more modern options using 36 or 48 frames per second. A 60 Hz display can therefore refresh itself up to 60 times per second, more than covering the necessary 24 frames and nearly covering 36 frames.

In this case, two cameras may be used, each with a shutter synchronized to every other frame of the images shown on the display 630. Using this, the trackers 642 and 644 can track the positions of both cameras, and the associated workstation can alternate between images intended for the first camera and images intended for the second camera. In this way, different perspectives on the same background can be captured using the same display. Polarized lenses may also be used on the cameras (or by a person, as discussed below) to similar effect.
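
A back-of-envelope check of this interleaving, using the figures from the text (a sketch; the function name is illustrative):

def per_camera_rate(refresh_hz, num_cameras):
    # Display refreshes divided round-robin among cameras whose shutters
    # are synchronized to "their" refreshes.
    return refresh_hz / num_cameras

# A 60 Hz wall shared by two cameras yields 30 distinct frames per second
# for each camera: enough for 24 fps filming, just short of 36 fps.
assert per_camera_rate(60, 2) == 30.0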

Alternatively, multiple displays may be provided, one for each camera angle. In some cases, these multiple displays may form a complete sphere or hemisphere within which actors or crew are placed for filming (or within which humans are placed to participate in a game). In such cases, the perspectives may be based upon trackers fixed to cameras pointed in different directions, thereby enabling the system to render the same scene from multiple perspectives, so that coverage of the scene may be provided from multiple perspectives.

FIG. 7 is a functional diagram of human position tracking, as the human moves, for a system for dynamically updating an augmented reality screen for interaction with a viewer. This is similar to the diagrams shown in FIGS. 4 through 6, but here at least one camera 710 is fixed relative to the display and tracks a human 762. Although a human is discussed here, other objects (e.g., robots, horses, dogs, and the like) may be tracked, with similar functionality employed. Additionally or alternatively, the trackers 742 and 744 may track the human 762. These trackers 742 and 744, and the camera 710, may rely upon LIDAR, infrared sensors and illuminators, RGB cameras coupled with face or eye tracking, fiducial markers, or other tracking schemes to detect the presence of a human in front of the display 730 and to update that human's position, or relative position, with respect to the display 730.

As with the camera tracking system of FIGS. 4 through 6, the background objects 736 and 738 may have their associated perspectives updated as the human 762 moves. This may be based upon an estimate of the eye position of the human 762, or upon the general position of the human's mass. Using such a display 730, a virtual or augmented reality world can be "shown" to a human 762 in a way that appears to appropriately track that user's movement, as though the display were a real window. Movement of the human 762 may cause the display 730 to update appropriately, including appropriate occlusion by the background objects 736 and 738 as the human 762 moves.

In some cases, a touchscreen sensor 739 may be integrated into, or form a part of, the display 730. This touchscreen sensor 739 is described as a touchscreen sensor, and it may be capacitive or resistive touchscreen technology. However, the touchscreen sensor may instead rely upon motion tracking based on the trackers 742 and 744 and/or the camera 710 (e.g., raising an arm or pointing at the display 730) to enable "touch" functionality for the display 730. Using this touchscreen sensor 739, interaction with the images shown on the display 730 may be enabled. Alternatively, the touchscreen sensor 739 may be an individual's own mobile device, such as a tablet or mobile phone. In some cases, for example, a user may use his or her phone to interact with the display.

Description of Processes

Referring now to FIG. 8, a flowchart of a process for camera and display calibration is shown. The flowchart has both a start 805 and an end 895, but the process may be repeated as many times as needed if the system is moved, drifts out of calibration, or a user otherwise so desires.

Following the start 805, the process begins by enabling a calibration mode at 810. Here, a user or administrator operates the workstation or other control device to enter a mode specifically designed for calibration. As discussed above, there are various options for calibration, but the option used herein is the one disclosed in this flowchart. In this calibration mode, once the calibration mode is enabled at 810, a crosshair or similar indicator is shown on the display.

Next, the user may be prompted at 820 to bring the tracker to the display. On-screen guidance or prompts may be provided; the display may be more complex than a crosshair, and may include an outline of a camera rig or of the tracker to be brought to the display. In this way, the user may be prompted as to what to do to complete the calibration process.

If the tracker has not been brought to the display ("no" at 825), the user may be prompted again at 820. If the user has brought the tracker to the display (presumably in the correct position), the user may confirm the baseline position at 830. This may be by pressing a button, by exiting calibration mode, or through some other confirmation (e.g., not moving the camera for 10 seconds while in calibration mode).

The baseline information is then known (e.g., the relative positions of the trackers and the display, and of the camera and its associated tracker, and the position of the center of the display). That information may be stored at 840.

Next, the system may generate the relative positions and the display size at 850. At this stage, this data is used to define the plane of the display and to set the size of the display.

One example of this may be a display that is 10 meters high by 15 meters wide in total. With the camera's tracker moved to the center point of the display, the tracking system may detect that the camera's tracker is approximately 9.01 meters from each display tracker, at a particular angle. The Pythagorean theorem can be used to determine that, if the hypotenuse of a triangle making up 1/8 of the display area (the line to the center of the display) is 9.01 meters, and the distance between the two trackers on the display (the top edge of the display) is 15 meters, then the other two sides are 7.5 meters (half of the top edge) and 5 meters, respectively. The height of the display is therefore 10 meters.
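
That worked example can be checked directly (a short sketch using the numbers above):

import math

half_top = 15.0 / 2               # half the tracker-to-tracker span (m)
hypotenuse = 9.01                 # measured tracker-to-center distance (m)
half_height = math.sqrt(hypotenuse**2 - half_top**2)
print(round(2 * half_height, 2))  # ~10.0 m: the display's full height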

The process may then end at 895.

FIG. 9 is a flowchart of a process for position tracking. The process has a start 905 and an end 995, but the process may take place many times and may repeat, as shown in the figure itself.

Following the start 905, the process begins by performing calibration at 910. Calibration is described above with respect to FIG. 8.

Once calibration is complete, the position of the camera relative to the display may be detected at 920. This position is detected using the two distances (the distance from each tracker). From there, the plane of the display itself being known, the relative positions of the camera and the display can be detected. Tracking systems are known to perform these functions in various ways.

Next, the three-dimensional scene (for example, one used in filming) is displayed at 930. This scene is one produced by an art director or director and contains the assets and any other animated or scripted actions the director desires. This is discussed in more detail below.

If no movement is detected ("no" at 935), the scene remains relatively fixed and continues to be displayed. There may be animation occurring in the scene (for example, wind blowing) or other baseline activity, but the perspective of the scene remains unchanged.

If movement is detected ("yes" at 935), the new position of the camera relative to the display is detected at 950. This new position information is used at 960 to calculate the camera's new perspective relative to the display.

If the process (for example, filming) is not complete ("no" at 965), the process continues at 930. If the process is complete ("yes" at 965), the process ends at 995.
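A minimal sketch of this track-and-update loop (FIG. 9, steps 930 through 965) might look like the following. It is hypothetical; the `tracker` and `renderer` interfaces are assumptions, and the perspective computation itself is detailed with FIG. 10 below:

```python
def tracking_loop(tracker, renderer, scene, done):
    """Re-render the scene whenever the camera moves (FIG. 9, 930-965)."""
    last_pos = tracker.camera_position()           # position relative to the display
    renderer.draw(scene, camera_pos=last_pos)      # step 930: initial perspective
    while not done():                              # step 965: completion check
        pos = tracker.camera_position()            # step 950: new camera position
        if pos != last_pos:                        # step 935: movement check
            renderer.draw(scene, camera_pos=pos)   # step 960: new perspective
            last_pos = pos
```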

FIG. 10 is a flowchart of a process for calculating the camera position during position tracking. The process begins at start 1005 and ends at 1095, but it may repeat each time the camera moves after calibration.

The process is efficient enough that it can be completed substantially in real time, so that no visible "lag" relative to the actor or camera is perceptible in the display. One element in achieving this absence of "lag" is the deliberate disregard of tracking data related to camera orientation (as opposed to position data). Specifically, tracking systems tend to provide a large amount of data not only for position (for example, an (x, y, z) position in three-dimensional space, more generally defined as vectors from the tracker(s)), but also for orientation.

Orientation data indicates the particular orientation in which the tracker is held at that position. This is provided because most trackers are designed for augmented reality and virtual reality tracking. Trackers are designed to track "heads" and "hands," and those objects require orientation data as well as position data (for example, that the head is "looking up") in order to accurately provide the user with an associated VR or AR view. For the purpose of employing such trackers in the present system, that data is generally irrelevant. Accordingly, any such data provided is ignored, discarded, or not considered unless it is needed for some other purpose. In general, it is assumed that the camera is always substantially (if not actually) facing the display, somewhere on a parallel plane, because moving the camera to a different position would cause the illusion to disappear. In situations involving a dome or hemispherical setup, the orientation data may be used, but relying on it can slow processing significantly and introduce lag. Similarly, other systems that rely on computer vision or detection introduce lag for similarly computation-intensive reasons.
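As a hypothetical illustration of this point (the pose structure shown is an assumption, not any particular tracker vendor's interface), the system would read only the positional part of each six-degree-of-freedom sample:

```python
from dataclasses import dataclass

@dataclass
class Pose:
    position: tuple      # (x, y, z) in meters, relative to the tracker origin
    orientation: tuple   # quaternion (w, x, y, z) supplied by the tracker

def camera_position(sample: Pose) -> tuple:
    """Keep only the position; the orientation channel is deliberately
    discarded, since the camera is assumed to face the display."""
    return sample.position
```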

The process for displaying this scene is a variation of a typical scene representation that has long been used in the context of three-dimensional graphics rendering for video games, augmented reality, and virtual reality environments. The mathematics used to render real-time 3D computer graphics typically consists of using a perspective projection matrix to map three-dimensional points onto a two-dimensional plane (the display). A perspective projection matrix is typically defined as centered (on-axis), as follows:

$$P = \begin{bmatrix} \dfrac{n}{r} & 0 & 0 & 0 \\ 0 & \dfrac{n}{t} & 0 & 0 \\ 0 & 0 & -\dfrac{f+n}{f-n} & -\dfrac{2fn}{f-n} \\ 0 & 0 & -1 & 0 \end{bmatrix}$$

where $n$ and $f$ are the near- and far-plane distances, and $r$ and $t$ are the half-width and half-height of the view volume at the near plane (the centered case, in which $l = -r$ and $b = -t$).

The view volume can be shifted by rendering with an off-center perspective projection matrix, using the following:

$$P = \begin{bmatrix} \dfrac{2n}{r-l} & 0 & \dfrac{r+l}{r-l} & 0 \\ 0 & \dfrac{2n}{t-b} & \dfrac{t+b}{t-b} & 0 \\ 0 & 0 & -\dfrac{f+n}{f-n} & -\dfrac{2fn}{f-n} \\ 0 & 0 & -1 & 0 \end{bmatrix}$$

Determining the values of $l$, $r$, $b$, and $t$ that correspond to the camera position produces an accurate view-dependent perspective shift. The method for determining the view-dependent values of $l$, $r$, $b$, and $t$ is as follows. First, a suitably scaled virtual screen is placed in the 3D scene at the desired virtual position to represent the physical screen. Next, the corners of the screen are determined at 1010 using the tracker information and the calibration procedure discussed above. The screen axes used in conventional 3D graphics engines can then be calculated. Specifically, with $p_a$, $p_b$, and $p_c$ denoting the lower-left, lower-right, and upper-left screen corners, the right, up, and front (normal) unit vectors of the screen can be computed at 1020 as follows:

$$\hat{v}_r = \frac{p_b - p_a}{\lVert p_b - p_a \rVert}, \qquad \hat{v}_u = \frac{p_c - p_a}{\lVert p_c - p_a \rVert}, \qquad \hat{v}_n = \frac{\hat{v}_r \times \hat{v}_u}{\lVert \hat{v}_r \times \hat{v}_u \rVert}$$

Next, the extents of the view frustum as seen from the viewer position can be produced at 1030 by calculating the vectors from the known camera position $p_e$ to the display corners:

$$v_a = p_a - p_e, \qquad v_b = p_b - p_e, \qquad v_c = p_c - p_e$$

Next, the display must be scaled appropriately to account for the camera's distance from the display. To accomplish this, a ratio of the distances of the camera's near plane and of the screen plane is calculated at 1040. This ratio can be used as a scale factor, because the frustum extents are specified at the near plane:

$$d = -\,\hat{v}_n \cdot v_a, \qquad s = \frac{n}{d}$$

where $d$ is the perpendicular distance from the camera to the screen plane and $n$ is the near-plane distance.

Finally, using the projected camera perspective at 1050, the view-dependent extents apply these vectors and the scale ratio to the scene. To accomplish this, $l$, $r$, $b$, and $t$ are calculated as follows:

$$l = s\,(\hat{v}_r \cdot v_a), \qquad r = s\,(\hat{v}_r \cdot v_b), \qquad b = s\,(\hat{v}_u \cdot v_a), \qquad t = s\,(\hat{v}_u \cdot v_c)$$
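Putting steps 1010 through 1050 together, the computation can be sketched compactly as follows. This is a hypothetical numpy illustration of the off-axis projection described above, not code from the patent; the function name, corner labeling, and OpenGL-style matrix convention are assumptions:

```python
import numpy as np

def off_axis_projection(pa, pb, pc, pe, n, f):
    """Build an off-center projection matrix from three screen corners
    (pa lower-left, pb lower-right, pc upper-left) and eye position pe."""
    pa, pb, pc, pe = map(np.asarray, (pa, pb, pc, pe))

    # Screen axes (step 1020): right, up, and front (normal) unit vectors.
    vr = (pb - pa) / np.linalg.norm(pb - pa)
    vu = (pc - pa) / np.linalg.norm(pc - pa)
    vn = np.cross(vr, vu)
    vn /= np.linalg.norm(vn)

    # Vectors from the camera position to the display corners (step 1030).
    va, vb, vc = pa - pe, pb - pe, pc - pe

    # Distance to the screen plane and near-plane scale ratio (step 1040).
    d = -np.dot(vn, va)
    s = n / d

    # View-dependent frustum extents (step 1050).
    l, r = s * np.dot(vr, va), s * np.dot(vr, vb)
    b, t = s * np.dot(vu, va), s * np.dot(vu, vc)

    # Off-center perspective projection matrix, as given above.
    return np.array([
        [2 * n / (r - l), 0.0, (r + l) / (r - l), 0.0],
        [0.0, 2 * n / (t - b), (t + b) / (t - b), 0.0],
        [0.0, 0.0, -(f + n) / (f - n), -2 * f * n / (f - n)],
        [0.0, 0.0, -1.0, 0.0],
    ])
```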

Calculating an accurate camera position is important for the trompe-l'œil effect to work correctly. To ensure that this happens quickly, the tracking system can update and execute concurrently in a separate thread (CPU core), independent of other threads, to minimize latency. The workstation's other systems (for example, the renderer itself) can read the current telemetry data (position, orientation) from the tracking system on each frame update. Minimized motion-to-photon latency is achieved by keeping the rendering loop at 60 Hz or higher. In the inventors' experience, if the rendering system reads the tracking data at 60 FPS, the motion-to-photon latency (for example, from camera movement to on-screen display) is approximately 16.66 milliseconds. At this low level of latency, the result is essentially or practically imperceptible to human vision and to the camera.
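One way to structure this decoupling is sketched below. This is a hypothetical illustration; the thread layout, the `tracker.read_position()` call, and the renderer interface are all assumptions:

```python
import threading, time

latest_position = (0.0, 0.0, 0.0)   # most recent camera position
lock = threading.Lock()

def tracker_thread(tracker):
    """Poll the tracker as fast as it updates, independent of rendering."""
    global latest_position
    while True:
        pos = tracker.read_position()        # blocking read of telemetry
        with lock:
            latest_position = pos

def render_loop(renderer, scene, fps=60):
    """Sample the newest pose once per frame; at 60 FPS the worst-case
    motion-to-photon latency is about one frame, ~16.66 ms."""
    frame_time = 1.0 / fps
    while True:
        start = time.perf_counter()
        with lock:
            pos = latest_position
        renderer.draw(scene, camera_pos=pos)
        time.sleep(max(0.0, frame_time - (time.perf_counter() - start)))
```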

FIG. 11 is a flowchart of a process for human position tracking. The process begins at 1105 and ends at 1195. FIG. 11 is quite similar to the tracking described with reference to FIG. 9, and the tracking can occur in much the same way as described above. Only the differences specific to human position tracking and their relevance to the associated process are described in detail below.

As in FIG. 9, calibration is performed at 1110. If there is no external camera, this may be unnecessary. However, an initial calibration may be needed so that accurate human tracking in front of a display can take place. This can be as simple as positively defining the position(s) of the camera and/or tracker(s) relative to the display, so that human tracking can occur accurately.

Once calibration has occurred at 1110, the position of a human in front of the display can be detected at 1120. This detection may rely on infrared light, LIDAR, fiducial markers, RGB cameras combined with image-processing software, or other similar methods. In any case, an actual detection of the tracked human's eye position, or an estimate of the eye position, is detected and/or generated as part of this process.
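As one hypothetical sketch of such an estimate (assuming a pinhole camera model and a separate depth reading, for example from LIDAR; none of these specifics are prescribed here), a detected face center can be back-projected to a 3D eye-position estimate:

```python
import numpy as np

def estimate_eye_position(bbox_center_px, eye_depth_m, fx, fy, cx, cy):
    """Back-project a detected face center to a 3D eye-position estimate.

    bbox_center_px: (u, v) pixel center of the detected face.
    eye_depth_m: distance to the viewer, e.g. from LIDAR or a depth sensor.
    fx, fy, cx, cy: pinhole intrinsics of the RGB camera watching the viewer.
    Returns (x, y, z) in the camera's coordinate frame; a calibration
    transform would then map it into display coordinates.
    """
    u, v = bbox_center_px
    x = (u - cx) / fx * eye_depth_m
    y = (v - cy) / fy * eye_depth_m
    return np.array([x, y, eye_depth_m])
```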

This information is used in much the same way as the tracker-based camera detection of FIG. 9; specifically, a three-dimensional scene is displayed at 1130 with the perspective properly tied to the human eye position. In this way, the scene on the display is rendered so that it looks "correct" to the detected human.

Then, if no movement is detected (for example, by the camera(s) or tracker(s)) ("no" at 1135), the same scene continues to be displayed. The scene itself may be active (for example, things can happen on the display, such as a series of events, a sunrise, other live or animated actors or actions, gunfire, rainfall, or any number of other actions), but no perspective shift is processed, because the tracked human has not moved.

If movement is detected at 1135 (for example, by the camera(s) or tracker(s)) ("yes" at 1135), the human's new position relative to the display is detected at 1150. Here, an updated eye position, or an estimated eye position, is generated and/or detected by the camera(s) and/or tracker(s).

Next, at 1160, the human's new perspective relative to the display is calculated and displayed. Here, the scene is altered to reflect the perspective shift detected from the movement of the human's head, body, or eyes. This can happen extremely quickly, so that the scene updates essentially in real time with no discernible lag to a human viewer. In this way, the scene can appear to be a "portal" or "window" into a virtual or augmented reality world.
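Because the mathematics is identical to the camera-tracking case, the same off-axis projection sketched with FIG. 10 can simply be driven by the estimated eye position instead of the camera position (a hypothetical reuse of the illustrative `off_axis_projection` function defined earlier):

```python
def render_for_viewer(renderer, scene, eye_pos, pa, pb, pc, n=0.1, f=1000.0):
    """Re-render the 'window' scene from the tracked viewer's eye point."""
    proj = off_axis_projection(pa, pb, pc, eye_pos, n, f)
    renderer.draw(scene, projection=proj, eye=eye_pos)
```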

The system may also track interaction using "touch" sensors or virtual touch sensors, as discussed above with respect to FIG. 7. If such a "touch" is detected (which may simply be a mid-air interaction) ("yes" at 1165), the interaction can be processed at 1170. This processing may be updating the scene (for example, a user selecting an option on the display), interacting with someone shown on the screen (for example, shaking hands), firing a weapon, or causing some other shift in the display.

If there is no interaction ("no" at 1165), or after any interaction has been processed at 1170, a determination can be made at 1175 as to whether the process is complete. This may occur when the user is no longer in the frame of the camera or tracker(s) tracking the human and a timeout occurs, or it may be triggered externally by an administrator. Other reasons for process completion ("yes" at 1175) are also possible. Thereafter, the process ends at 1195.

However, until the process is complete ("yes" at 1175), the scene continues to be displayed at 1130, movement is detected at 1135, and the scene is updated as a whole according to movement and interaction at 1150 through 1175.

FIG. 12 is a flowchart of a process for human position tracking combined with the superimposition of AR objects on a human. The process begins at 1205 and ends at 1295, but it may occur many times. As with FIG. 11, this figure largely corresponds to FIG. 9; only the aspects that differ are discussed here.

After start 1205, the process begins with calibration at 1210. Here, there may be multiple cameras and/or trackers. Accordingly, the camera tracking the human (if it is fixed relative to the display) may not require any calibration. However, the camera filming the scene using the display as an active background may require calibration, as described with respect to FIGS. 8 and 9.

Next, the human's position is detected at 1220. Here, the tracker(s) and/or camera(s) that track a human's position relative to the display determine where the human is relative to the display. This information is necessary to enable the system to place augmented reality objects convincingly relative to the human.

Next, the camera's position is detected at 1230. This is the camera filming the display as a background, with the human superimposed in front of it. Here, the camera's position is calculated so that the scene shown on the display can be rendered appropriately.

Although this detection at 1230 is discussed with respect to a camera, it may instead be the detection of a human viewer (for example, a spectator) watching the scene. In that case, a human tracking system and method similar to those discussed with respect to FIG. 11 would be employed.

Next, at 1240, the three-dimensional scene is shown with any desired AR objects. This step has at least three subcomponents. First, the perspective of the scene itself (and any associated perspective shift) must be accurately reflected on the display. This is done using the camera's position alone.

Second, the individual's position relative to the camera and the display must be generated. This is based on the detected position of the human relative to the display, followed by detection of the camera's relative position. These two pieces of data can be combined to determine the relative positions of the camera, the human, and the display.

Thereafter, one or more augmented reality objects must be rendered. Generally, these will be selected in advance as part of a "special effect" for use in the scene being filmed. In a very basic example, an individual whose position relative to the display is known may be seen to "glow," with a bright light around his or her body. This glow can be presented on the display, and because it is updated in real time and depends on the camera position and the human position, it can appear to the camera as the scene is being recorded. Using real-time human tracking, the glow can follow the user; at this step 1240, the glow is simply represented as "surrounding" the user, or however the special-effects artist has indicated it should appear.

Other examples of augmentation include light beams fired from the hands (automatically, or upon a specific command such as pressing a fire button, by a special-effects supervisor or assistant director), a halo above an individual's head as he or she walks, or visible "wings" on a human's back that appear only behind that human. Other effects that appear to come from or emanate from a tracked human can also be applied.

In this case, if no movement is detected ("no" at 1235), the effect continues. The effect itself (for example, a glow, fog, pulse, or beam) may have its own continuing independent animation, but it will not "move" with the human.

If movement is detected ("yes" at 1235), both the human's new position and the camera's new position are detected at 1250, and the display is updated to reflect the perspective of the scene as well as the new position of the digital effect. Note that if an actor is in front of the display, with an accurate perspective shift reflected for, say, a halo above that human's head, the halo will appear "closer" to the camera than the background. Accordingly, the associated perspective shift will be smaller for the halo (because it is closer) than for the background.
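As a hypothetical sketch of this step (the scene API and names are assumptions), the effect can be anchored to the tracked body position in the virtual scene, so that the off-axis projection automatically gives the halo and the distant background their different perspective shifts:

```python
def update_ar_scene(scene, human_pos, head_offset=(0.0, 0.3, 0.0)):
    """Pin the halo effect just above the tracked human's head.

    human_pos: tracked (x, y, z) position of the human relative to the
        display; the halo is placed slightly above it in the virtual
        scene, near the screen plane, so its perspective shift under the
        off-axis projection is smaller than the distant background's.
    """
    halo = scene.get_object("halo")              # pre-authored effect
    halo.position = (human_pos[0] + head_offset[0],
                     human_pos[1] + head_offset[1],
                     human_pos[2] + head_offset[2])
```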

New perspective data for the camera and the AR object(s) is calculated at 1260, and the new perspective is updated to the display at 1270.

At 1275, a determination is made as to whether the process is complete. If it is not complete ("no" at 1275), the process continues at 1240 with the display of the three-dimensional scene and any AR object(s). If the process is complete ("yes" at 1275), the process ends at end 1295.

Closing Comments

Throughout this description, the embodiments and examples shown should be regarded as examples, rather than as limitations on the disclosed or claimed apparatus and processes. Although many of the examples presented herein involve specific combinations of method acts or system elements, it should be understood that those acts and elements may be combined in other ways to accomplish the same objectives. With regard to flowcharts, additional and fewer steps may be taken, and the steps as shown may be combined or further refined to achieve the methods described herein. Acts, elements, and features discussed only in connection with one embodiment are not intended to be excluded from a similar role in other embodiments.

As used herein, "plurality" means two or more. As used herein, a "set" of items may include one or more of such items. As used herein, whether in the written description or the claims, the terms "comprising," "including," "carrying," "having," "containing," "involving," and the like are to be understood as open-ended, that is, to mean including but not limited to. Only the transitional phrases "consisting of" and "consisting essentially of" are closed or semi-closed transitional phrases, respectively, with respect to the claims. The use of ordinal terms such as "first," "second," "third," and so on in the claims to modify a claim element does not by itself connote any priority, precedence, or order of one claim element over another, or the temporal order in which acts of a method are performed, but is used merely as a label to distinguish one claim element having a certain name from another element having the same name (but for the use of the ordinal term). As used herein, "and/or" means that the listed items are alternatives, but the alternatives also include any combination of the listed items.

100: system
110: camera
112: associated tracker
120: workstation
130: display
142: associated tracker
144: associated tracker
150: network

Claims (20)

1. A system for augmented reality, comprising: a display for displaying images generated by a computing device in communication with the display; a sensor for detecting a position of the display relative to a camera, the camera capturing an image of the display from a movable position relative to the display; the computing device in communication with the display for: displaying an image on the display, based on the position relative to the display determined using the sensor, so as to correspond to a first perspective of objects shown in the image; and continuously adjusting the image shown on the display, based on an updated current position relative to the display determined using the sensor, so as to correspond to additional perspectives appropriate to the objects shown in the image; and the camera simultaneously capturing a composite image comprising a human, a figure, or a physical object in front of the display and the image shown on the display, the image on the display showing the first perspective and the additional perspectives based on the position and the updated current position.

2. The system of claim 1, further comprising: a second display for displaying images generated by a computing device in communication with the second display; the computing device in communication with the second display for: displaying a second image on the second display, based on the position relative to the second display determined using the sensor, so as to correspond to a second perspective of objects shown in the image; and continuously adjusting the second image shown on the second display, based on the updated current position relative to the second display determined using the sensor, so as to correspond to further additional perspectives appropriate to the objects shown in the second image; and the camera further capturing, based on the position and the updated current position, the second image shown on the second display, the second image on the second display showing the second perspective and the further additional perspectives.
3. The system of claim 1, further comprising: a second sensor for detecting a second position of a second display relative to a second camera; the second display for displaying images generated by the computing device in communication with the second display; the computing device in communication with the second display for: displaying a second image on the second display, based on the second position relative to the second display determined using the second sensor, so as to correspond to a second perspective of objects shown in the image; and continuously adjusting the second image shown on the second display, based on an updated second current position relative to the second display determined using the second sensor, so as to correspond to further additional perspectives appropriate to the objects shown in the second image; and the second camera capturing, based on the second position and the updated second current position, the second image shown on the second display, the second image on the second display showing the second perspective and the further additional perspectives.

4. The system of claim 1, wherein the position is a position of the camera viewing the display.

5. The system of claim 1, further comprising: a tracker for tracking movement of a human participant; and wherein the computing device is further for continuously adjusting the image shown on the display by altering the appearance of the image shown on the display, based on a perspective from the camera's position relative to the display, to incorporate elements responsive to actions of the human participant.

6. The system of claim 5, wherein the elements positioned relative to the human participant are visual elements that remain fixed relative to a particular part of the human participant's body as the human participant moves in front of the display.

7. The system of claim 1, further comprising: at least one touch sensor for enabling physical interaction with the display; wherein the computing device is further for receiving input from a user via the at least one touch sensor; and altering the image shown on the display in response to the input.
8. An apparatus for augmented reality, the apparatus comprising non-volatile machine-readable media storing a program having instructions which, when executed by a processor, cause the processor to: detect, using a sensor, a position of a display relative to a camera, the camera for capturing an image of the display from a movable position relative to the display; display an image on the display, based on the position relative to the display determined using the sensor, so as to correspond to a first perspective of objects shown in the image; continuously adjust the image shown on the display, based on an updated current position relative to the display determined using the sensor, so as to correspond to additional perspectives appropriate to the objects shown in the image; and use the camera to simultaneously capture a composite image comprising a human, a figure, or a physical object in front of the display and the image shown on the display, the image showing the first perspective and the additional perspectives based on the position and the updated current position.

9. The apparatus of claim 8, wherein the instructions further cause the processor to: display a second image on a second display, based on the position relative to the second display determined using the sensor, so as to correspond to a second perspective of objects shown in the image; continuously adjust the second image shown on the second display, based on the updated current position relative to the second display determined using the sensor, so as to correspond to further additional perspectives appropriate to the objects shown in the second image; and use the camera, based on the position and the updated current position, to capture the second image shown on the second display showing the second perspective and the further additional perspectives.

10. The apparatus of claim 8, wherein the instructions further cause the processor to: detect, using a second sensor, a second position of a second display relative to a second camera; display a second image on the second display, based on the second position relative to the second display determined using the second sensor, so as to correspond to a second perspective of objects shown in the image; continuously adjust the second image shown on the second display, based on an updated second current position relative to the second display determined using the second sensor, so as to correspond to further additional perspectives appropriate to the objects shown in the second image; and use the second camera to capture the second image shown on the second display, the second image showing the second perspective and the further additional perspectives based on the second position and the updated second current position.
11. The apparatus of claim 8, wherein the position is a position of the camera viewing the display.

12. The apparatus of claim 8, wherein the instructions further cause the processor to: use a tracker to track movement of a human viewer relative to the display; and continuously adjust the image shown on the display by altering the appearance of the image shown on the display, based on a perspective from the camera's position relative to the display, to incorporate elements responsive to actions of a human participant.

13. The apparatus of claim 12, wherein the elements positioned relative to the human participant are visual elements that remain fixed relative to a particular part of the human participant's body as the human participant moves in front of the display.

14. The apparatus of claim 8, wherein the instructions further cause the processor to: enable physical interaction with the display using at least one touch sensor; receive input from a user via the at least one touch sensor; and alter the image shown on the display in response to the input.

15. The apparatus of claim 8, further comprising: the processor; a memory; wherein the processor and the memory comprise circuits and software for performing the instructions on the storage medium.

16. A method for enabling filming using a real-time display, the method comprising: detecting, using a sensor, a position of the display relative to a camera, the camera for capturing an image of the display from a movable position relative to the display; displaying an image on the display, based on the position relative to the display determined using the sensor, so as to correspond to a first perspective of objects shown in the image; continuously adjusting the image shown on the display, based on an updated current position relative to the display determined using the sensor, so as to correspond to additional perspectives appropriate to the objects shown in the image; and using the camera to simultaneously capture a composite image comprising a human, a figure, or a physical object in front of the display and the image shown on the display, the image showing the first perspective and the additional perspectives based on the position and the updated current position.
17. The method of claim 16, further comprising: displaying a second image on a second display, based on the position relative to the second display determined using the sensor, so as to correspond to a second perspective of objects shown in the image; continuously adjusting the second image shown on the second display, based on the updated current position relative to the second display determined using the sensor, so as to correspond to further additional perspectives appropriate to the objects shown in the second image; and using the camera, based on the position and the updated current position, to capture the second image shown on the second display, the second image on the second display showing the second perspective and the further additional perspectives.

18. The method of claim 16, further comprising: using a second sensor to detect a second position of a second display relative to a second camera; displaying a second image on the second display, based on the second position relative to the display determined using the second sensor, so as to correspond to a second perspective of objects shown in the image; continuously adjusting the second image shown on the second display, based on an updated second current position relative to the second display determined using the second sensor, so as to correspond to further additional perspectives appropriate to the objects shown in the second image; and using the second camera, based on the second position and the updated second current position, to capture the second image shown on the second display, the second image showing the second perspective and the further additional perspectives.

19. The method of claim 16, wherein the position is a position of the camera viewing the display.

20. The method of claim 16, further comprising: using a tracker to track movement of a human viewer relative to the display; and continuously adjusting the image shown on the display by altering the appearance of the image shown on the display, based on a perspective from the camera's position relative to the display, to incorporate elements responsive to actions of a human participant.
TW108120717A 2018-06-15 2019-06-14 System and apparatus for augmented reality and method for enabling filming using a real-time display TWI794512B (en)

Applications Claiming Priority (8)

Application Number Priority Date Filing Date Title
US201862685388P 2018-06-15 2018-06-15
US201862685386P 2018-06-15 2018-06-15
US201862685390P 2018-06-15 2018-06-15
US62/685,386 2018-06-15
US62/685,390 2018-06-15
US62/685,388 2018-06-15
US16/210,951 US10740958B2 (en) 2017-12-06 2018-12-05 Augmented reality background for use in live-action motion picture filming
US16/210,951 2018-12-05

Publications (2)

Publication Number Publication Date
TW202032978A TW202032978A (en) 2020-09-01
TWI794512B true TWI794512B (en) 2023-03-01

Family

ID=73643679

Family Applications (1)

Application Number Title Priority Date Filing Date
TW108120717A TWI794512B (en) 2018-06-15 2019-06-14 System and apparatus for augmented reality and method for enabling filming using a real-time display

Country Status (1)

Country Link
TW (1) TWI794512B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090096994A1 (en) * 2007-10-10 2009-04-16 Gerard Dirk Smits Image projector with reflected light tracking
US20150348326A1 (en) * 2014-05-30 2015-12-03 Lucasfilm Entertainment CO. LTD. Immersion photography with dynamic matte screen
US20180061121A1 (en) * 2016-08-26 2018-03-01 Magic Leap, Inc. Continuous time warp and binocular time warp for virtual and augmented reality display systems and methods
US20180139431A1 (en) * 2012-02-24 2018-05-17 Matterport, Inc. Capturing and aligning panoramic image and depth data


Also Published As

Publication number Publication date
TW202032978A (en) 2020-09-01

Similar Documents

Publication Publication Date Title
US10819946B1 (en) Ad-hoc dynamic capture of an immersive virtual reality experience
US10078917B1 (en) Augmented reality simulation
US9779538B2 (en) Real-time content immersion system
JP6340017B2 (en) An imaging system that synthesizes a subject and a three-dimensional virtual space in real time
US9396588B1 (en) Virtual reality virtual theater system
US8878846B1 (en) Superimposing virtual views of 3D objects with live images
TWI567659B (en) Theme-based augmentation of photorepresentative view
US12022357B1 (en) Content presentation and layering across multiple devices
US20160267699A1 (en) Avatar control system
CN111602104B (en) Method and apparatus for presenting synthetic reality content in association with identified objects
CN111930223A (en) Movable display for viewing and interacting with computer-generated environments
CN106843790B (en) Information display system and method
US20240070973A1 (en) Augmented reality wall with combined viewer and camera tracking
CA3199128A1 (en) Systems and methods for augmented reality video generation
KR20190074911A (en) Method for providing realistic type image contents and server using the same
WO2019241712A1 (en) Augmented reality wall with combined viewer and camera tracking
JP2022051978A (en) Image processing device, image processing method, and program
US11000756B2 (en) Pre-visualization device
US20240163527A1 (en) Video generation method and apparatus, computer device, and storage medium
US9600939B1 (en) Augmented reality platform using captured footage from multiple angles
Pietroszek Volumetric filmmaking
TWI794512B (en) System and apparatus for augmented reality and method for enabling filming using a real-time display
US10740958B2 (en) Augmented reality background for use in live-action motion picture filming
Hieda et al. Digital facial augmentation for interactive entertainment
US20240233297A1 (en) Image processing apparatus, image processing method, and computer-readable storage medium