KR102559721B1

KR102559721B1 - Control method of electronic apparatus for selectively restore images according to field of view of user

Info

Publication number: KR102559721B1
Application number: KR1020220153506A
Authority: KR
Inventors: 장경익
Original assignee: 주식회사 지디에프랩
Priority date: 2022-11-16
Filing date: 2022-11-16
Publication date: 2023-07-26

Abstract

Disclosed is a control method of an electronic device. The control method comprises: receiving streaming data including first image data corresponding to a preset viewing range; tracking the direction of a user's eyeball and identifying a viewing area of the user; selecting, from the first image data, target image data matching at least a part of the identified viewing area; inputting the target image data into an artificial intelligence (AI) model trained to perform high-resolution restoration, and acquiring restored image data; and replacing the target image data of the first image data with the restored image data to acquire second image data, and outputting the second image data. A streaming service for high-resolution images thereby becomes feasible.

Description

Method for controlling an electronic device that selectively restores an image according to a user's viewing area {CONTROL METHOD OF ELECTRONIC APPARATUS FOR SELECTIVELY RESTORE IMAGES ACCORDING TO FIELD OF VIEW OF USER}

The present disclosure relates to a control method of an electronic device that provides images and, more particularly, to a control method of an electronic device that restores, at high resolution, the part of an image that matches a user's viewing area and provides the result.

Since the COVID-19 pandemic, the market for immersive content such as VR (Virtual Reality) and AR (Augmented Reality) has gained momentum and is growing in size.

However, the mobile traffic cost of AR/VR content burdens both service providers and users, which is widening the service and information gap in the 5G era.

In particular, it is difficult to play ultra-high-definition immersive content in real time on commonly used consumer devices (PCs, smartphones). Moreover, because VR content requires processing roughly six times as many pixels as ordinary video, there are limits imposed both by the communication environment and by edge-device specifications (e.g., the need for expensive GPU equipment).

Registered Patent Publication No. 10-1965746

The present disclosure provides a control method of an electronic device that receives low-resolution image data and provides real-time video, while performing high-resolution restoration on the user's viewing area, which is sensed in real time, using an artificial intelligence model suited to that area, and outputs the resulting video.

The objects of the present disclosure are not limited to those mentioned above. Other objects and advantages of the present disclosure that are not mentioned can be understood from the following description and will be more clearly understood from the embodiments of the present disclosure. It will also be readily apparent that the objects and advantages of the present disclosure may be realized by the means indicated in the claims and combinations thereof.

A control method of an electronic device according to an embodiment of the present disclosure includes: receiving streaming data including first image data corresponding to a preset viewing range; tracking the direction of a user's eyeball to identify the user's viewing area within the preset viewing range; selecting, from the first image data, target image data that matches at least a part of the identified viewing area; inputting the target image data to an artificial intelligence model trained to perform high-resolution restoration, thereby obtaining restored image data; and replacing the target image data of the first image data with the restored image data to obtain second image data, and outputting the second image data.

The receiving of the streaming data may include receiving a plurality of partial images into which the first image data is divided by region, and receiving data on a plurality of neural network models each trained to perform high-resolution restoration on a respective one of the plurality of partial images. The selecting of the target image data may include selecting, from among the plurality of partial images, at least one partial image included in the user's viewing area. The obtaining of the restored image data may include inputting each selected partial image to the neural network model that matches that partial image.
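As a rough illustration of the per-region selection described above, the following Python sketch assumes that the first image data is split into a uniform grid of tiles and that the viewing area is an axis-aligned pixel rectangle. The grid layout, the `Rect` type, and the helpers `tiles_in_view` and `restore_selected` are hypothetical names introduced for illustration; the disclosure does not prescribe how regions are shaped or indexed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Axis-aligned pixel rectangle; right and bottom edges are exclusive."""
    left: int
    top: int
    right: int
    bottom: int


def tiles_in_view(view, frame_w, frame_h, cols, rows):
    """Return (row, col) indices of the grid tiles that overlap the viewing area."""
    tile_w = frame_w // cols
    tile_h = frame_h // rows
    selected = []
    for r in range(rows):
        for c in range(cols):
            tile = Rect(c * tile_w, r * tile_h, (c + 1) * tile_w, (r + 1) * tile_h)
            overlaps = (tile.left < view.right and view.left < tile.right
                        and tile.top < view.bottom and view.top < tile.bottom)
            if overlaps:
                selected.append((r, c))
    return selected


def restore_selected(tiles, models, selected):
    """Run each selected tile through the model associated with that tile.

    `tiles` and `models` are dicts keyed by (row, col); each model is any
    callable standing in for a trained neural network (an assumption).
    """
    return {key: models[key](tiles[key]) for key in selected}
```

Only the tiles returned by `tiles_in_view` are passed to a model, so tiles outside the viewing area incur no restoration cost in this sketch.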

The control method of the electronic device may include: receiving, by the electronic device from a server, streaming data including a plurality of image frames into which Virtual Reality (VR) content is divided by time; and receiving, by the electronic device from the server, data on a plurality of first neural network models trained to perform high-resolution restoration on each of the plurality of image frames. Each of the plurality of first neural network models may include a plurality of second neural network models trained to perform high-resolution restoration on each of a plurality of partial images into which each image frame is divided by region. The control method may further include: selecting at least one partial image included in at least one image frame, based on the time interval of the VR content being played by the electronic device and the user's viewing area sensed during that time interval; and performing high-resolution restoration on each selected partial image through the second neural network model that matches that partial image.
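The two-level lookup described above (a first model per image frame, containing a second model per partial image) can be sketched as a nested table. The nested-dict layout, the fixed frame rate, and the function names below are assumptions made for illustration only; the models themselves are represented by placeholder values.

```python
def frame_index(playback_s, fps):
    """Map a playback position in seconds to a frame index, assuming a fixed frame rate."""
    return int(playback_s * fps)


def select_second_models(first_models, playback_s, fps, visible_tiles):
    """Pick the per-tile (second) models for the frame currently being played.

    first_models: dict mapping frame index -> dict mapping tile key -> model,
                  a hypothetical stand-in for the first/second model hierarchy.
    visible_tiles: tile keys that fall inside the user's viewing area.
    """
    idx = frame_index(playback_s, fps)
    second_models = first_models[idx]
    return idx, {tile: second_models[tile] for tile in visible_tiles}
```

For example, with playback at 0.5 s and 30 frames per second, the lookup resolves to frame 15 and returns only the models for the tiles the user is currently looking at.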

The control method of the electronic device may further include dividing the user's viewing area into a Foveal area of a certain viewing angle that includes the central point toward which the user's gaze is directed, a Blend area that surrounds the Foveal area and corresponds to a viewing angle larger than that of the Foveal area, and a Peripheral area that excludes the Foveal area and the Blend area. In this case, the obtaining of the restored image data may include inputting first target image data matching the Foveal area to the artificial intelligence model to obtain first restored image data, and performing interpolation correction on second target image data matching the Blend area to obtain second restored image data.
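One way to picture the three-way split is to classify each image location by its angular offset from the gaze direction. The sketch below assumes circular regions centered on the gaze; the default angles (30 and 60 degrees, full width) and the function names are illustrative assumptions, not values taken from the disclosure.

```python
import math


def angle_between_deg(a, b):
    """Angle in degrees between two 3D direction vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    cos = max(-1.0, min(1.0, dot / norm))  # clamp against rounding error
    return math.degrees(math.acos(cos))


def classify_region(offset_deg, foveal_deg=30.0, blend_deg=60.0):
    """Classify a location by its angular offset from the gaze center.

    foveal_deg and blend_deg are full viewing angles, so the offset is
    compared against half of each.
    """
    if offset_deg <= foveal_deg / 2:
        return "foveal"      # restored with the AI model
    if offset_deg <= blend_deg / 2:
        return "blend"       # restored with interpolation correction
    return "peripheral"      # everything outside the Foveal and Blend areas
```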

Here, the control method of the electronic device may further include changing the viewing angle range of the Foveal area based on at least one of the speed at which the electronic device receives streaming data and the speed at which the electronic device performs high-resolution restoration through the artificial intelligence model.
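The disclosure does not give a formula for this adjustment. As one hedged possibility, the sketch below uses only the restoration speed and assumes that restoration time grows with the area of the Foveal region, that is, roughly with the square of its viewing angle. The function name, the square-root rule, and the clamping limits are all assumptions.

```python
import math


def adjust_foveal_angle(current_deg, restore_ms, budget_ms,
                        min_deg=10.0, max_deg=60.0):
    """Rescale the Foveal viewing angle so restoration fits the per-frame budget.

    Assumes restoration time is proportional to the square of the viewing
    angle, so the angle is scaled by sqrt(budget / measured time) and then
    clamped to [min_deg, max_deg].
    """
    if restore_ms <= 0 or budget_ms <= 0:
        raise ValueError("timings must be positive")
    scale = math.sqrt(budget_ms / restore_ms)
    return max(min_deg, min(max_deg, current_deg * scale))
```

When restoration takes four times the budget the angle is halved, and when the device has headroom the angle grows until it reaches the upper limit.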

In addition, the control method of the electronic device may further include identifying the speed at which the user's viewing area changes, and changing the viewing angle range of the Foveal area based on the identified speed.
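The text does not say in which direction the Foveal angle should change with gaze speed. The sketch below assumes, purely for illustration, that the angle narrows as the gaze moves faster, and it measures gaze speed from two yaw samples with wraparound at 360 degrees. All thresholds and names are hypothetical.

```python
def gaze_speed_deg_per_s(prev_yaw, curr_yaw, dt_s):
    """Angular speed between two yaw samples, taking the shorter way around 360 degrees."""
    delta = (curr_yaw - prev_yaw + 180.0) % 360.0 - 180.0
    return abs(delta) / dt_s


def foveal_angle_for_speed(speed, slow=30.0, fast=300.0,
                           wide_deg=45.0, narrow_deg=15.0):
    """Linearly interpolate the Foveal angle between wide_deg and narrow_deg.

    Assumption: a slow gaze gets the wide angle, a fast gaze the narrow one.
    """
    if speed <= slow:
        return wide_deg
    if speed >= fast:
        return narrow_deg
    t = (speed - slow) / (fast - slow)
    return wide_deg + t * (narrow_deg - wide_deg)
```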

According to an embodiment of the present disclosure, a control method of a system including a server and an electronic device includes: transmitting, by the server to the electronic device, streaming data including first image data corresponding to a 360-degree viewing range; identifying, by the electronic device, a user's viewing area by tracking the direction of the user's eyeball; selecting, by the electronic device, target image data matching the identified viewing area from among the first image data; inputting, by the electronic device, the target image data to an artificial intelligence model trained to perform high-resolution restoration, thereby obtaining restored image data; and replacing, by the electronic device, the target image data of the first image data with the restored image data to obtain second image data, and outputting the second image data.

The control method of an electronic device according to the present disclosure receives and provides low-resolution image data in real time while upscaling only the portion of the user's viewing area on which the gaze rests, thereby reducing the load on both the communication environment and the device. As a result, the control method enables a streaming service for high-definition video such as VR/AR.

FIG. 1 is a block diagram for explaining the schematic operation of a server and an electronic device for providing a streaming service of image data according to an embodiment of the present disclosure;
FIG. 2A is a block diagram for explaining the configuration of an electronic device according to an embodiment of the present disclosure;
FIG. 2B is a block diagram for explaining the configuration of a server according to an embodiment of the present disclosure;
FIG. 3 is a flowchart for explaining the operation of an electronic device according to an embodiment of the present disclosure;
FIG. 4 is a diagram for explaining an operation in which an electronic device divides a user's viewing area according to an embodiment of the present disclosure;
FIG. 5 is a diagram for explaining an operation in which an electronic device selectively uses a plurality of artificial intelligence models trained for each region of image data according to an embodiment of the present disclosure;
FIG. 6 is a diagram for explaining an operation in which an electronic device selectively uses a plurality of artificial intelligence models trained for each time interval of image data according to an embodiment of the present disclosure; and
FIG. 7 is a block diagram for explaining the configuration of an electronic device according to various embodiments of the present disclosure.

Before the present disclosure is described in detail, the conventions used in this specification and the drawings are explained.

First, the terms used in this specification and the claims are general terms selected in consideration of their functions in the various embodiments of the present disclosure. However, these terms may vary depending on the intention of those skilled in the art, legal or technical interpretation, and the emergence of new technologies. Some terms have also been arbitrarily selected by the applicant. Such terms may be interpreted according to the meanings defined in this specification and, where no specific definition is given, may be interpreted based on the overall content of this specification and common technical knowledge in the art.

In addition, the same reference numerals or symbols in the drawings attached to this specification indicate parts or components that perform substantially the same function. For convenience of description and understanding, the same reference numerals or symbols are used across different embodiments. That is, even if components having the same reference numeral appear in a plurality of drawings, this does not mean that the drawings depict a single embodiment.

Also, in this specification and the claims, terms including ordinal numbers such as "first" and "second" may be used to distinguish between components. These ordinal numbers are used to distinguish identical or similar components from one another, and the meaning of a term should not be construed narrowly because of them. For example, the order of use or arrangement of a component combined with such an ordinal number should not be limited by that number. Where necessary, the ordinal numbers may be used interchangeably.

In this specification, singular expressions include plural expressions unless the context clearly indicates otherwise. In this application, terms such as "comprise" or "consist of" are intended to specify the presence of the features, numbers, steps, operations, components, parts, or combinations thereof described in the specification, and should be understood as not precluding the presence or addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof.

In the embodiments of the present disclosure, terms such as "module," "unit," and "part" refer to components that perform at least one function or operation, and such components may be implemented in hardware, in software, or in a combination of hardware and software. In addition, a plurality of "modules," "units," "parts," and the like may be integrated into at least one module or chip and implemented by at least one processor, except where each needs to be implemented in separate dedicated hardware.

Also, in the embodiments of the present disclosure, when a part is said to be connected to another part, this includes not only a direct connection but also an indirect connection through another medium. In addition, saying that a part includes a component means that it may further include other components, rather than excluding them, unless specifically stated otherwise.

FIG. 1 is a block diagram illustrating the schematic operation of a server and an electronic device for providing a streaming service of image data according to an embodiment of the present disclosure.

The electronic device 100 may provide various images based on communication with the server 200. The electronic device 100 may be implemented as a smartphone, a tablet PC, a TV, a VR device, an AR device, a wearable device (e.g., glasses, a watch, or an HMD (Head Mounted Device)), a console device, a set-top box, or another control device.

The server 200 may provide images through the electronic device 100 by supporting a streaming service. The server 200 may be implemented as a system including one or more computers. The server 200 may transmit image data constituting various kinds of content, such as VR content, AR content, 2D content, and 3D content, to the electronic device 100 in the form of streaming data. The server 200 may provide streaming services for content in various fields such as broadcasting, gaming, tourism, entertainment, IoT monitoring, military, medical, CCTV, and the metaverse.

In this case, real-time video may be provided as the electronic device 100 outputs the image data included in the streaming data. For example, when the electronic device 100 is implemented as a VR device that includes a display (e.g., an HMD), the electronic device 100 may display VR video covering a certain viewing range (e.g., 360 degrees) based on the image data received from the server 200. Alternatively, when the electronic device 100 is implemented as a control device that controls a VR device, the electronic device 100 may transmit the image data received from the server 200 to the VR device so that the VR video is provided through the VR device.

FIG. 2A is a block diagram for explaining the configuration of an electronic device according to an embodiment of the present disclosure.

Referring to FIG. 2A, the electronic device 100 may include a memory 110, a processor 120, and a communication unit 130. The electronic device 100 may further include at least one of a display 140 and a sensor unit 150.

The memory 110 stores an operating system (OS) for controlling the overall operation of the components of the electronic device 100, as well as at least one instruction or piece of data related to those components.

The memory 110 may include non-volatile memory such as ROM and flash memory, and may include volatile memory such as DRAM. The memory 110 may also include a hard disk, a solid-state drive (SSD), and the like.

According to an embodiment, the memory 110 may include at least one artificial intelligence model trained to perform high-resolution restoration on at least one image.

As an example, the artificial intelligence model may be a neural network model that is trained by updating the weights between nodes belonging to different layers, and may be a deep-learning CNN (Convolutional Neural Network) model including at least three convolutional layers, but is not limited thereto.

When at least one low-resolution image (e.g., the value of each pixel) is input, the artificial intelligence model may output a high-resolution image composed of more pixels than the low-resolution image.

As an example, the artificial intelligence model may include a feature extraction unit for extracting features from an image and a restoration unit that generates pixels (i.e., upsampling) and determines the values of the generated pixels, but is not limited thereto. Here, the feature extraction unit and the restoration unit may each include one or more layers.

The processor 120 controls the overall operation of the electronic device 100.

Specifically, the processor 120 is connected to the memory 110 and may perform operations according to various embodiments of the present disclosure by executing at least one instruction stored in the memory 110.

The processor 120 may include a general-purpose processor such as a CPU, an AP, or a DSP (Digital Signal Processor), a graphics-dedicated processor such as a GPU or a VPU (Vision Processing Unit), or an AI-dedicated processor such as an NPU. An AI-dedicated processor may be designed with a hardware structure specialized for training or running a specific artificial intelligence model.

The processor 120 may control a viewing area extraction module 121, a high-resolution restoration module 122, a display module 123, and the like. These modules are functionally defined, and each may be implemented in software and/or hardware.

The viewing area extraction module 121 identifies the user's viewing area by tracking the direction of the user's eyeball. The viewing area extraction module 121 may extract the user's viewing area based on the direction in which the user is looking within a preset viewing range (e.g., 360 degrees).

For example, when the electronic device 100 is an HMD-type VR device, the viewing area extraction module 121 may track the pupil by capturing the user's eyeball through at least one image sensor, infrared sensor, or the like. The viewing area extraction module 121 may then obtain the user's viewing area by identifying the direction in which the user is looking based on the position of the pupil. As a specific example, the viewing area extraction module 121 may obtain, as the user's viewing area, a range of a preset viewing angle (e.g., 30, 45, or 60 degrees) centered on the direction in which the user is looking, but is not limited thereto.
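To make the "preset viewing angle centered on the gaze direction" concrete, the sketch below converts a horizontal gaze angle (yaw) into a range of pixel columns in a 360-degree frame. It assumes an equirectangular layout in which column 0 corresponds to 0 degrees of yaw; that layout and both function names are assumptions, since the disclosure does not specify a projection.

```python
def view_range_deg(gaze_yaw, fov_deg):
    """Return the (start, end) yaw of a viewing area centered on the gaze, wrapped to [0, 360)."""
    half = fov_deg / 2
    return (gaze_yaw - half) % 360.0, (gaze_yaw + half) % 360.0


def yaw_range_to_columns(start, end, width):
    """Map a yaw range to pixel-column spans of a 360-degree frame of the given width.

    Returns one (x0, x1) span, or two spans when the range wraps past 360 degrees.
    """
    x0 = round(start / 360.0 * width)
    x1 = round(end / 360.0 * width)
    if x0 < x1:
        return [(x0, x1)]
    return [(x0, width), (0, x1)]
```

A gaze near the seam of the panorama, such as a yaw of 10 degrees with a 60-degree viewing angle, yields two column spans, one at each edge of the frame.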

As another example, the viewing area extraction module 121 may identify the posture (i.e., orientation) of the HMD worn by the user based on at least one of a gravity sensor and a gyro sensor, and may treat the direction in which the HMD is facing as the direction in which the user is looking.

When the electronic device 100 is an HMD-type VR device, the viewing area extraction module 121 may extract the viewing area through at least one sensor (e.g., an image sensor) provided in the sensor unit 150 of the electronic device 100. On the other hand, when the electronic device 100 is a control device that controls a VR device, the viewing area extraction module 121 may communicate with the sensor unit (e.g., an image sensor) of the VR device through the communication unit 130 to extract the viewing area in which the user is looking.

The high-resolution restoration module 122 restores at least a part of the image data received from the server 200.

Specifically, for smooth streaming, the electronic device 100 may receive streaming data including low-quality image data from the server 200.

In this case, the high-resolution restoration module 122 may select, from the low-quality image data, target image data that matches at least a part of the user's viewing area, and perform high-resolution restoration on the target image data.

The high-resolution restoration module 122 may obtain high-resolution restored image data for at least a part of the viewing area by performing high-resolution restoration through the artificial intelligence model stored in the memory 110. As a result, final image data may be obtained that contains high-resolution image data for part of the viewing area and low-resolution image data for the remainder.
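The mixed-resolution result described above can be sketched with plain nested lists. Because the restored region has more pixels than the region it replaces, this sketch first enlarges the whole low-resolution frame by pixel repetition and then pastes the restored patch over the target region. That compositing strategy, the integer scale factor `s`, and the `restore` callable (a stand-in for the trained model) are assumptions for illustration.

```python
def upscale_nearest(img, s):
    """Enlarge an image (list of rows) by repeating every pixel s times in both directions."""
    return [[px for px in row for _ in range(s)] for row in img for _ in range(s)]


def crop(img, top, left, h, w):
    """Cut an h-by-w region out of an image."""
    return [row[left:left + w] for row in img[top:top + h]]


def compose(first, top, left, h, w, s, restore):
    """Build the final frame: cheap upscale everywhere, restored patch in the target region.

    first:   low-resolution frame (list of rows)
    top, left, h, w: target region in low-resolution coordinates
    restore: callable mapping a low-resolution patch to a patch s times larger
    """
    canvas = upscale_nearest(first, s)
    patch = restore(crop(first, top, left, h, w))
    for dy, row in enumerate(patch):
        y = top * s + dy
        canvas[y][left * s:left * s + len(row)] = row
    return canvas
```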

The display module 123 visually outputs the image data obtained through streaming. The display module 123 may output video through the display 140 of the electronic device 100, or may support video output by controlling at least one display device connected to the electronic device 100.

In an embodiment, the display module 123 may visually output the final image data in which high-resolution restoration has been performed on at least a part of the viewing area.

The communication unit 130 communicates with at least one external device using various wired and wireless communication methods, and may include circuits, modules, chips, and the like corresponding to those methods.

The communication unit 130 may be connected to external devices through various networks.

Depending on its area or scale, the network may be a Personal Area Network (PAN), a Local Area Network (LAN), a Wide Area Network (WAN), or the like. Depending on its openness, the network may be an intranet, an extranet, the Internet, or the like.

The communication unit 130 may be connected to external devices through various wireless communication methods such as LTE (Long-Term Evolution), LTE-A (LTE Advanced), 5G (5th Generation) mobile communication, CDMA (Code Division Multiple Access), WCDMA (Wideband CDMA), UMTS (Universal Mobile Telecommunications System), WiBro (Wireless Broadband), GSM (Global System for Mobile Communications), TDMA (Time Division Multiple Access), Wi-Fi, Wi-Fi Direct, Bluetooth, NFC (Near Field Communication), and Zigbee.

또한, 통신부(130)는 이더넷(Ethernet), 광 네트워크(optical network), USB(Universal Serial Bus), 선더볼트(ThunderBolt) 등의 유선 통신 방식을 통해 외부 장치들과 연결될 수도 있다.In addition, the communication unit 130 may be connected to external devices through a wired communication method such as Ethernet, optical network, Universal Serial Bus (USB), or Thunderbolt.

이 밖에도, 통신부(130)는 통상적으로 이용되는 다양한 통신 방식을 통해 외부 장치와 통신을 수행할 수 있다.In addition, the communication unit 130 may communicate with an external device through various commonly used communication methods.

The electronic device 100 may receive real-time streaming data from the server 200 through the communication unit 130. The electronic device 100 may also receive data of at least one artificial intelligence model from the server 200 through the communication unit 130.

Here, the data of the artificial intelligence model may include, but is not limited to, data on the weights constituting the artificial intelligence model.

The display 140 is a component for visually displaying various images, and may be implemented as a liquid crystal display (LCD), light emitting diodes (LED), organic light emitting diodes (OLED), a transparent OLED (TOLED), a micro LED, or the like. In addition to a flat display, the display 140 may be implemented as a curved display, a flexible display, a foldable display, or the like. The display 140 may be driven and controlled by the display module 123 described above to output various images.

Specifically, the display 140 may be implemented as a curved display for providing an image over a wide area around the user's eyes. In this case, the display 140 may include separate display panels for providing images to the user's left eye and right eye, respectively.

In addition, when the electronic device 100 is an AR device such as AR glasses, the display 140 may be composed of various optical systems for showing the real surrounding environment and a virtually output image at the same time. In this case, at least part of the display 140 may be implemented as a transparent display.

The sensor unit 150 may include various sensors for obtaining information related to the user of the electronic device 100 or to the surroundings. For example, the sensor unit 150 may include at least one image sensor (e.g., a camera) for photographing the user's eyes or the surroundings, as well as an acceleration sensor, a gyro sensor, a geomagnetic sensor, a gravity sensor, and the like for detecting the movement speed, movement direction, and posture (e.g., the orientation about each of three mutually perpendicular axes) of the electronic device 100. In addition, the sensor unit 150 may include at least one contact sensor or proximity sensor for detecting whether the electronic device 100, implemented as a VR/AR device, is being worn.

The sensor unit 150 may also include a motion sensor for detecting the motion of at least part of the user's body. For example, when the electronic device 100 implemented as a VR device or an AR device includes a controller that the user operates by hand, various sensors may be provided for detecting the movement speed, movement direction, posture, and the like of the controller.

Meanwhile, FIG. 2B is a block diagram for explaining the configuration of a server according to an embodiment of the present disclosure.

Referring to FIG. 2B, the server 200 may include a memory 210, a processor 220, a communication unit 230, and the like.

The memory 210 is a component for storing an operating system (OS) for controlling the overall operation of the components of the server 200, and at least one instruction or piece of data related to the components of the server 200.

The memory 210 may include non-volatile memory such as ROM and flash memory, and may include volatile memory such as DRAM. The memory 210 may also include a hard disk, a solid state drive (SSD), and the like.

The processor 220 is a component for controlling the server 200 as a whole.

Specifically, the processor 220 is connected to the memory 210 and may perform operations according to various embodiments of the present disclosure by executing at least one instruction stored in the memory 210.

The processor 220 may include a general-purpose processor such as a CPU, an AP, or a digital signal processor (DSP), a graphics-dedicated processor such as a GPU or a vision processing unit (VPU), or an artificial-intelligence-dedicated processor such as an NPU. The artificial-intelligence-dedicated processor may be designed with a hardware structure specialized for training or using a specific artificial intelligence model.

Referring to FIG. 2B, the processor 220 may control an image processing module 221, a streaming module 222, an artificial intelligence training module 223, and the like. These modules are functionally defined, and each may be implemented in software and/or hardware.

The image processing module 221 is a module for encoding or decoding image data. Specifically, the image processing module 221 may compress a high-resolution image to obtain a low-resolution image, or restore a low-resolution image to obtain a high-resolution image.

In one embodiment, the image processing module 221 may compress a high-resolution image into a low-resolution image and thereby provide it as data in a form that is easy to stream.

As a specific example, the image processing module 221 may divide a high-resolution VR image covering a 360-degree viewing range into a plurality of divided images, and convert each of the divided images into a low-resolution image, thereby providing them in a form that is easy to stream.
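The divide-then-downscale step described above can be sketched as follows. This is a minimal illustration only: the names `split_into_tiles` and `downsample`, the block-averaging method, and the tile sizes are assumptions made for the example, since the disclosure does not specify how the image processing module 221 compresses each divided image.

```python
from typing import List

Image = List[List[int]]  # grayscale image stored as rows of pixel values


def split_into_tiles(image: Image, tile_h: int, tile_w: int) -> List[Image]:
    """Split an image into non-overlapping tiles, in row-major order."""
    tiles = []
    for top in range(0, len(image), tile_h):
        for left in range(0, len(image[0]), tile_w):
            tiles.append([row[left:left + tile_w] for row in image[top:top + tile_h]])
    return tiles


def downsample(tile: Image, factor: int) -> Image:
    """Lower the resolution by averaging each factor x factor block (integer mean).

    Block averaging is an assumed stand-in for whatever compression is used.
    """
    out = []
    for top in range(0, len(tile), factor):
        out_row = []
        for left in range(0, len(tile[0]), factor):
            block = [tile[y][x]
                     for y in range(top, min(top + factor, len(tile)))
                     for x in range(left, min(left + factor, len(tile[0])))]
            out_row.append(sum(block) // len(block))
        out.append(out_row)
    return out


# A toy 4x8 "panorama" split into two 4x4 tiles, each reduced to 2x2.
panorama = [[x + 10 * y for x in range(8)] for y in range(4)]
low_res_tiles = [downsample(t, 2) for t in split_into_tiles(panorama, 4, 4)]
```

In this toy case the first tile is reduced to `[[5, 7], [25, 27]]`; a real system would use an actual video codec or resampling filter in place of `downsample`.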

The streaming module 222 is a component for transmitting real-time streaming data. The streaming module 222 may be connected to the electronic device 100 through the communication unit 230 and may transmit streaming data that includes image data.

Specifically, the streaming module 222 may transmit streaming data including a plurality of low-resolution divided images to the electronic device 100.

The artificial intelligence training module 223 is a module for training an artificial intelligence model that performs high-resolution restoration.

Here, the artificial intelligence model may be a CNN model that outputs a high-resolution image when a low-resolution image is input. For example, it may be a super resolution (SR) model, which is trained on pairs consisting of a high-resolution target image, defined as the ground truth (GT), and the corresponding low-resolution (LR) version of that image, so that the model learns to restore the LR image to the GT.

For example, the artificial intelligence training module 223 may use both a high-resolution version and a low-resolution version of a VR video covering a 360-degree viewing range.

In this case, the artificial intelligence training module 223 may divide the low-resolution VR video (360-degree viewing range) according to a patch size suited to the input of the artificial intelligence model. Likewise, the artificial intelligence training module 223 may divide the high-resolution VR video (360-degree viewing range) according to a patch size suited to the output of the artificial intelligence model. The artificial intelligence training module 223 may then train the artificial intelligence model using each of the resulting low-resolution/high-resolution image pairs. In this case, the artificial intelligence training module 223 may train a separate artificial intelligence model for each spatially separated divided image.
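As a rough sketch of how matching low-resolution and high-resolution patches might be paired for training, assuming the high-resolution frame is exactly `scale` times larger than the low-resolution frame in each dimension (the function name `paired_patches` and this fixed-scale assumption are illustrative and not taken from the disclosure):

```python
from typing import List, Tuple

Image = List[List[int]]


def paired_patches(lr: Image, hr: Image, lr_patch: int, scale: int) -> List[Tuple[Image, Image]]:
    """Cut an LR frame and its HR counterpart into spatially aligned patch pairs.

    Each LR patch is lr_patch x lr_patch; the matching HR patch covers the same
    region and is (lr_patch * scale) x (lr_patch * scale).
    """
    hr_patch = lr_patch * scale
    pairs = []
    for top in range(0, len(lr) - lr_patch + 1, lr_patch):
        for left in range(0, len(lr[0]) - lr_patch + 1, lr_patch):
            lr_p = [row[left:left + lr_patch] for row in lr[top:top + lr_patch]]
            hr_top, hr_left = top * scale, left * scale
            hr_p = [row[hr_left:hr_left + hr_patch] for row in hr[hr_top:hr_top + hr_patch]]
            pairs.append((lr_p, hr_p))
    return pairs


# Toy data: an 8x8 HR frame and a 4x4 LR frame (scale factor 2).
hr_frame = [[x + 8 * y for x in range(8)] for y in range(8)]
lr_frame = [[0] * 4 for _ in range(4)]
pairs = paired_patches(lr_frame, hr_frame, lr_patch=2, scale=2)
```

Each element of `pairs` is one (input, target) example; training a separate model per spatial region would simply mean routing each pair to the model assigned to its position.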

In addition, because the image of a VR video may change over time even within the same viewing area, the artificial intelligence training module 223 may separate the images by time and train each artificial intelligence model to restore its corresponding images.

Meanwhile, the artificial intelligence training module 223 may classify the training images according to their std value (standard deviation) and use each image to train the artificial intelligence model matched to that std value.

The std value reflects the approximate structure of each image: the larger the std value, the larger the standard deviation among pixel values, so the image can be interpreted as more complex. Specifically, the artificial intelligence training module 223 may classify a plurality of training images into several groups according to ranges of the std value. For example, images with std values from 0 to 50 may be classified into a total of five groups: 0 to 10, 10 to 20, 20 to 30, 30 to 40, and 40 to 50. Each group may then be used to train a separate artificial intelligence model.
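The grouping by std value can be illustrated with a short sketch. The bin width of 10 and the upper limit of 50 follow the example above; which group a value lying exactly on a boundary (such as 10) belongs to is not stated in the disclosure, so the half-open bins used here are an assumption, as are the function names.

```python
from statistics import pstdev
from typing import Dict, List


def std_group(pixels: List[int], bin_width: float = 10.0, max_std: float = 50.0) -> int:
    """Return the std bin index: 0 for [0, 10), 1 for [10, 20), ..., 4 for [40, 50]."""
    std = min(pstdev(pixels), max_std)
    last_bin = int(max_std // bin_width) - 1
    return min(int(std // bin_width), last_bin)


def group_images(images: Dict[str, List[int]]) -> Dict[int, List[str]]:
    """Map each std bin index to the names of the training images that fall in it."""
    groups: Dict[int, List[str]] = {}
    for name, pixels in images.items():
        groups.setdefault(std_group(pixels), []).append(name)
    return groups


samples = {
    "flat": [100, 100, 100, 100],  # std 0  -> group 0
    "soft": [88, 112, 88, 112],    # std 12 -> group 1
    "busy": [55, 145, 55, 145],    # std 45 -> group 4
}
groups = group_images(samples)
```

Each key of `groups` would then select the training set for one of the per-group models.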

Specifically, because the std value may vary with the field of the video (e.g., biomedical x-ray images, VR video for games, news video), the artificial intelligence training module 223 may use the images of each group to train artificial intelligence models matched to different fields. For example, the training images of a first group may be used to train an artificial intelligence model for high-resolution restoration of x-ray images, and the training images of a second group may be used to train an artificial intelligence model for high-resolution restoration of VR video.

The streaming module 222 may then transmit data of at least one artificial intelligence model trained to perform high-resolution restoration to the electronic device 100. Here, the data of the artificial intelligence model may include the weights between nodes of the artificial intelligence model.

The communication unit 230 is a component for communicating with at least one external device through various wired and wireless communication methods, and may include circuits, modules, chips, and the like corresponding to each communication method. The communication unit 230 may be connected to external devices through various networks and may provide a streaming service through the electronic device 100.

FIG. 3 is a flowchart for explaining the operation of an electronic device according to an embodiment of the present disclosure. FIG. 3 assumes a situation in which the electronic device 100 receives a streaming service for VR video or AR video from the server 200.

Referring to FIG. 3, the electronic device 100 may receive streaming data including first image data corresponding to a preset viewing range (S310).

For example, the electronic device 100 may receive, in real time, first image data corresponding to a VR video covering a 360-degree viewing range. The first image data may be low-resolution image data obtained by compressing the original (high-resolution) data.

The electronic device 100 may also receive the first image data as a plurality of partial images divided by region. Each region may correspond to a viewing area whose viewing-angle range is distinct from those of the other regions, measured about the central axis of the 360-degree range.

In this case, the electronic device 100 may receive, from the server 200, data of a plurality of neural network models each trained to perform high-resolution restoration on a respective one of the partial images.

When the data of the plurality of neural network models for high-resolution restoration of the individual divided images is received separately in this way, each transfer is relatively small and fast. For example, with the layer structure of the neural network model already stored in the electronic device 100, only the neural network model data (e.g., the weights between nodes) for performing high-resolution restoration on each divided image may be received.

The electronic device 100 may then track the direction of the user's eyes to identify the user's viewing area within the preset viewing range (S320). The electronic device 100 may identify the user's viewing area through the viewing area extraction module 121.

For example, the electronic device 100 may acquire an image of the user's eye through an infrared camera and convert the image to gray levels to detect the pupil region. The electronic device 100 may then track the direction of the user's eye based on the center coordinates of the pupil, and thereby detect the gaze direction matching the eye direction. As a result, a viewing area spanning a certain viewing-angle range centered on the gaze direction can be identified.
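A minimal sketch of the pupil-center step, assuming the pupil shows up as the darkest region of the gray-level eye image. The fixed `threshold`, the centroid calculation, and the normalised `gaze_offset` are illustrative assumptions; the disclosure does not specify the detection algorithm, and practical eye trackers rely on per-user calibration that is omitted here.

```python
from typing import List, Optional, Tuple


def pupil_center(gray: List[List[int]], threshold: int = 50) -> Optional[Tuple[float, float]]:
    """Return the centroid (x, y) of pixels darker than threshold, or None if there are none."""
    xs, ys = [], []
    for y, row in enumerate(gray):
        for x, value in enumerate(row):
            if value < threshold:
                xs.append(x)
                ys.append(y)
    if not xs:
        return None
    return (sum(xs) / len(xs), sum(ys) / len(ys))


def gaze_offset(center: Tuple[float, float], width: int, height: int) -> Tuple[float, float]:
    """Offset of the pupil center from the image center, normalised to [-1, 1] per axis."""
    half_w, half_h = (width - 1) / 2, (height - 1) / 2
    return ((center[0] - half_w) / half_w, (center[1] - half_h) / half_h)


# Toy 5x5 eye image: bright background with a dark 2x2 "pupil" at the upper right.
eye = [[200] * 5 for _ in range(5)]
for y in (1, 2):
    for x in (3, 4):
        eye[y][x] = 10

center = pupil_center(eye)
offset = gaze_offset(center, width=5, height=5)
```

Here `center` is `(3.5, 1.5)` and `offset` is `(0.75, -0.25)`, which a calibrated system would map to a gaze direction.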

As a specific example, the viewing area may be set as the area within one of various viewing-angle ranges, such as 30, 45, or 60 degrees, centered on the user's gaze direction.

The electronic device 100 may divide the user's viewing area into a Foveal area, a Blend area, and a Peripheral area.

In this regard, FIG. 4 is a diagram for explaining an operation in which an electronic device according to an embodiment of the present disclosure divides the user's viewing area.

Referring to FIG. 4, the Foveal area is an area of a certain viewing angle that includes the central point toward which the user's gaze is directed. The Blend area surrounds the Foveal area and corresponds to a viewing angle larger than that of the Foveal area. The Peripheral area is the remaining area outside the Foveal and Blend areas.

For example, the Foveal area may correspond to a viewing-angle range within 30 degrees, the Blend area to a range between 30 and 45 degrees, and the Peripheral area to a range of 45 degrees or more, but the ranges are not limited to these values.
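Using the example thresholds above (30 and 45 degrees), the three-way split can be sketched as a simple classification of the angle between a point and the gaze direction. The function name and the handling of points lying exactly on a boundary are assumptions for illustration.

```python
def classify_region(angle_deg: float, foveal_deg: float = 30.0, blend_deg: float = 45.0) -> str:
    """Classify a point by its angular distance from the gaze direction.

    Within foveal_deg -> "foveal"; between foveal_deg and blend_deg -> "blend";
    blend_deg or more -> "peripheral".
    """
    if angle_deg < 0:
        raise ValueError("angle_deg must be non-negative")
    if angle_deg <= foveal_deg:
        return "foveal"
    if angle_deg < blend_deg:
        return "blend"
    return "peripheral"


regions = [classify_region(a) for a in (10, 30, 40, 45, 90)]
```

With the default thresholds, `regions` is `["foveal", "foveal", "blend", "peripheral", "peripheral"]`.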

The electronic device 100 may then select, from the first image data, target image data that matches at least part of the identified viewing area (S330).

Specifically, the electronic device 100 may identify the Foveal area within the user's viewing area and select, from the first image data, the target image data corresponding to the Foveal area. That is, one or more of the divided images that fall within the Foveal area may be selected.
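One way to picture the tile selection is to treat the 360-degree range as equally wide tiles along the horizontal (yaw) axis and pick the tiles near the gaze direction. This is a sketch under assumptions: the equal-width layout, the use of tile centers as the inclusion test, and the function names are all hypothetical, since the disclosure says only that divided images included in the Foveal area are selected.

```python
from typing import List


def angular_distance(a_deg: float, b_deg: float) -> float:
    """Smallest absolute difference between two yaw angles, with wrap-around at 360."""
    d = abs(a_deg - b_deg) % 360.0
    return min(d, 360.0 - d)


def select_foveal_tiles(gaze_yaw_deg: float, tile_count: int, foveal_deg: float = 30.0) -> List[int]:
    """Return indices of tiles whose center lies within foveal_deg of the gaze direction."""
    tile_width = 360.0 / tile_count
    selected = []
    for index in range(tile_count):
        center = (index + 0.5) * tile_width
        if angular_distance(gaze_yaw_deg, center) <= foveal_deg:
            selected.append(index)
    return selected


# Twelve 30-degree tiles; the user looks toward yaw 350, close to the wrap-around seam.
foveal_tiles = select_foveal_tiles(350.0, tile_count=12)
```

In this example `foveal_tiles` is `[0, 11]`: the tiles on both sides of the seam are picked, which is why the wrap-around in `angular_distance` matters.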

The electronic device 100 may then input the target image data into an artificial intelligence model trained to perform high-resolution restoration, and obtain restored image data (S340).

In this case, the electronic device 100 may input first target image data matching the Foveal area into the artificial intelligence model to obtain first restored image data. Specifically, the electronic device 100 may input each of the one or more partial images included in the Foveal area into the neural network model that matches that partial image.

In addition, the electronic device 100 may perform interpolation correction on second target image data matching the Blend area to obtain second restored image data. The interpolation correction may take various conventionally known forms, such as bicubic interpolation and cubic interpolation.

The electronic device 100 may then replace the target image data within the first image data with the restored image data to obtain second image data, and output the second image data (S350). The video corresponding to the second image data may be output through the display 140 of the electronic device 100, or through at least one display device connected to the electronic device 100.
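Steps S340 and S350 can be summarised in a short sketch: tiles in the Foveal area go through a super-resolution model, tiles in the Blend area go through interpolation, and every other tile is passed through unchanged. The function `restore_selected` and the placeholder callables are hypothetical; strings stand in for image data so that the replacement logic is easy to follow.

```python
from typing import Callable, Dict, Iterable, TypeVar

Tile = TypeVar("Tile")


def restore_selected(
    first_tiles: Dict[int, Tile],
    foveal_ids: Iterable[int],
    blend_ids: Iterable[int],
    sr_model: Callable[[Tile], Tile],
    interpolate: Callable[[Tile], Tile],
) -> Dict[int, Tile]:
    """Build the second image data by replacing only the selected tiles."""
    second_tiles = dict(first_tiles)  # untouched tiles are carried over as received
    for idx in foveal_ids:
        second_tiles[idx] = sr_model(first_tiles[idx])     # AI-based restoration
    for idx in blend_ids:
        second_tiles[idx] = interpolate(first_tiles[idx])  # interpolation correction
    return second_tiles


# Placeholder "tiles" and stand-in restoration functions.
first = {0: "t0", 1: "t1", 2: "t2"}
second = restore_selected(
    first,
    foveal_ids=[0],
    blend_ids=[1],
    sr_model=lambda t: f"SR({t})",
    interpolate=lambda t: f"interp({t})",
)
```

The result is `{0: "SR(t0)", 1: "interp(t1)", 2: "t2"}`, and `first` itself is left unmodified.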

Here, the restored image data is image data that has undergone at least one of high-resolution restoration by the artificial intelligence model and interpolation correction.

As a result, only part of the user's viewing area is selectively output in high resolution, so a high-quality content experience can be provided while the load required for streaming is minimized.

Meanwhile, FIG. 5 is a diagram for explaining an operation in which an electronic device according to an embodiment of the present disclosure selectively uses a plurality of artificial intelligence models trained for respective regions of the image data. Although FIG. 5 does not illustrate an image covering a 360-degree viewing range, it divides the image into regions in the same way and is therefore used as an example for ease of explanation.

Referring to FIG. 5, an image 510 may be divided into a plurality of tiles. In this case, a separate artificial intelligence model may be trained for each tile (region). The server 200 may transmit a plurality of partial images, each corresponding to one of the tiles, to the electronic device 100. The server 200 may also transmit, to the electronic device 100, data of each of the plurality of artificial intelligence models trained to restore the respective partial images in high resolution.

Thereafter, the electronic device 100 may selectively perform high-resolution restoration on the divided images selected according to the user's viewing area or a user command, so that only the matching artificial intelligence models are used.

Meanwhile, in one embodiment, the electronic device 100 may receive the first image data as a plurality of image frames divided by time, and may receive data of a plurality of first neural network models each trained to perform high-resolution restoration on a respective one of the image frames. That is, division by time may be used in addition to division by space.

In this case, each of the first neural network models may include a plurality of second neural network models, each trained to perform high-resolution restoration on a respective one of the partial images into which the image frame is divided by region. That is, division by space may be applied together with division by time.

In this regard, FIG. 6 is a diagram for explaining an operation in which an electronic device according to an embodiment of the present disclosure selectively uses a plurality of artificial intelligence models trained for respective time intervals of the image data. FIG. 6 assumes a situation in which VR content that sequentially includes a plurality of image frames (image frame 1, 2, 3, 4, etc.), for a virtual experience of moving through various zones on a map in turn, is provided in real time through the electronic device 100.

In this case, the electronic device 100 may receive the plurality of image frames from the server 200, and may receive the data of a first neural network model (model 1, 2, 3, 4, etc.) for performing high-resolution restoration on each image frame. Each image frame may be received as a plurality of divided images split by region, and the data of a plurality of second neural network models for high-resolution restoration of the respective divided images may also be received.

Referring to FIG. 6, the electronic device 100 may identify the time interval in which the VR content is being played and thereby select at least one of the image frames described above. In addition, based on the user's viewing area detected in real time, the electronic device 100 may select at least one divided image within the image frame and input the selected divided image into the matching second neural network model to perform high-resolution restoration.
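The two-level lookup (time interval first, then region) can be sketched as a table keyed by a (time segment, tile) pair. The segment boundaries in `SEGMENT_STARTS`, the function names, and the string placeholders for models are hypothetical values chosen for illustration.

```python
from bisect import bisect_right
from typing import Dict, Tuple

# Hypothetical start times (in seconds) of the intervals covered by image frames 1 to 4.
SEGMENT_STARTS = [0.0, 10.0, 20.0, 30.0]


def segment_index(playback_time: float) -> int:
    """Return the index of the time interval containing playback_time."""
    if playback_time < SEGMENT_STARTS[0]:
        raise ValueError("playback_time precedes the first segment")
    return bisect_right(SEGMENT_STARTS, playback_time) - 1


def pick_model(models: Dict[Tuple[int, int], str], playback_time: float, tile: int) -> str:
    """Select the second neural network model for the current interval and tile."""
    return models[(segment_index(playback_time), tile)]


# Placeholder registry: one model name per (segment, tile) combination.
models = {(seg, tile): f"model_{seg + 1}_tile_{tile}" for seg in range(4) for tile in range(3)}
chosen = pick_model(models, playback_time=12.5, tile=2)
```

Here `chosen` is `"model_2_tile_2"`, because 12.5 seconds falls in the second interval; only that model needs to be loaded for the selected tile.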

Meanwhile, the Foveal area, Blend area, and other areas into which the user's viewing area is divided may be adjusted dynamically according to the system environment or the communication environment.

In one embodiment, the electronic device 100 may change the viewing-angle range of the Foveal area based on at least one of the speed at which the electronic device 100 receives streaming data and the speed at which the electronic device 100 performs high-resolution restoration through the artificial intelligence model. Changing the viewing-angle range means that the radius of the Foveal area changes from the viewpoint of the user looking at the image.

The speed of receiving streaming data may mean, for example, the amount of data or the number of image frames received per unit time. The speed of performing high-resolution restoration may correspond to the number of pixels restored, the number of image frames restored, or the amount of data restored per unit time.

For example, the viewing-angle range of the Foveal area may become smaller as the speed of receiving streaming data decreases, and larger as that speed increases. Likewise, the viewing-angle range of the Foveal area may become smaller as the speed of performing high-resolution restoration decreases, and larger as that speed increases.

As a specific example, when the speed of performing high-resolution restoration falls below a first threshold, the viewing-angle range of the Foveal area may be reduced at a constant rate until the speed is at or above the first threshold again. In this case, the viewing-angle range corresponding to the outer edge of the Blend area is maintained, but as the Foveal area shrinks, the viewing-angle range corresponding to the inner boundary of the Blend area narrows, so the Blend area may gradually grow. However, if the speed of performing high-resolution restoration falls below a second threshold that is lower than the first threshold, the Foveal area may disappear and be replaced entirely by the Blend area, so that only interpolation correction over the whole Blend area is performed in real time.

As another example, while the speed of performing high-resolution restoration is at or above the first threshold, the viewing-angle range of the Foveal area may gradually increase. In this case, the viewing-angle range corresponding to the outer edge of the Blend area may increase along with it, or that range may be kept as it is so that the Blend area gradually narrows. As the Foveal area widens, the speed of performing high-resolution restoration may gradually decrease, and when the speed reaches the first threshold again, the viewing-angle range of the Foveal area may stop increasing and be maintained.
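The threshold behaviour in the two examples above can be sketched as a single update rule applied once per control step. The step size, the upper bound `blend_outer_deg`, the exact-equality "hold" condition, and the function name are assumptions for illustration; the variant shown keeps the outer edge of the Blend area fixed.

```python
def update_foveal_angle(
    foveal_deg: float,
    restore_speed: float,
    first_threshold: float,
    second_threshold: float,
    step_deg: float = 1.0,
    blend_outer_deg: float = 45.0,
) -> float:
    """Return the Foveal viewing angle for the next step, given the restoration speed.

    Assumes second_threshold < first_threshold.
    """
    if restore_speed < second_threshold:
        return 0.0  # Foveal area disappears; the Blend area (interpolation only) takes over
    if restore_speed < first_threshold:
        return max(0.0, foveal_deg - step_deg)  # shrink at a constant rate
    if restore_speed == first_threshold:
        return foveal_deg  # speed is back at the threshold: hold the current size
    return min(blend_outer_deg, foveal_deg + step_deg)  # headroom available: grow


angle = 30.0
for speed in (12.0, 12.0, 10.0, 8.0, 8.0, 2.0):
    angle = update_foveal_angle(angle, speed, first_threshold=10.0, second_threshold=3.0)
```

Over this sequence the angle goes 31, 32, 32, 31, 30, and finally 0 once the speed drops below the second threshold.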

In addition, in an embodiment, the electronic device 100 may change the viewing angle range of the foveal area based on the speed at which the user's viewing area changes.

The speed at which the user's viewing area changes (hereinafter, the "view change speed") is proportional to the number of pixels that leave the user's viewing area and the number that newly enter it per unit time as the user's gaze direction changes. The faster the user's eye movement or head turn, the higher the view change speed.
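As a rough illustration of this definition, the view change speed can be computed from the sets of pixel coordinates covered by the viewing area in two consecutive unit time intervals. The function names, the rectangular viewing area, and the proportionality constant of 1 are assumptions made for this sketch.

```python
def view_change_speed(prev_pixels: set, curr_pixels: set, dt: float) -> float:
    """Pixels that left plus pixels that entered the viewing area, per unit time."""
    if dt <= 0:
        raise ValueError("dt must be positive")
    left = prev_pixels - curr_pixels      # pixels no longer in the viewing area
    entered = curr_pixels - prev_pixels   # pixels newly inside the viewing area
    return (len(left) + len(entered)) / dt


def viewport(x0: int, y0: int, width: int, height: int) -> set:
    """Hypothetical helper: pixel coordinates of a rectangular viewing area."""
    return {(x, y) for x in range(x0, x0 + width) for y in range(y0, y0 + height)}
```

For example, shifting a 4x3 viewing area one pixel to the right drops one column of three pixels and gains another, so six pixels change in that interval.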

Here, the faster the user's view change speed, the smaller the electronic device 100 may set the viewing angle range of the real-time foveal area, which requires high-resolution reconstruction by the artificial intelligence model. As a result, delays in the VR experience caused by reaching the processing load limit during rapid user movement can be prevented.
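One simple way to realize such an inverse relationship is a clamped linear mapping from view change speed to foveal viewing angle. The disclosure does not specify the mapping; the linear form, the function name, and all default values below are assumptions.

```python
def foveal_angle_for_speed(speed: float,
                           max_speed: float = 1000.0,
                           min_deg: float = 2.0,
                           max_deg: float = 10.0) -> float:
    """Return a smaller foveal viewing angle for a faster view change speed."""
    # Normalize the speed to [0, 1]; speeds above max_speed are clamped.
    ratio = min(max(speed / max_speed, 0.0), 1.0)
    # ratio 0 gives the widest foveal area, ratio 1 the narrowest.
    return max_deg - ratio * (max_deg - min_deg)
```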

In this case, the electronic device 100 may also predict the user's view change speed in real time. For example, the electronic device 100 may obtain history data by monitoring the user's view change speed over a plurality of unit time intervals. The electronic device 100 may then train an artificial intelligence model for prediction on the history data so that the model predicts the user's view change speed in the next unit time interval.

Here, the artificial intelligence model for prediction may be a Recurrent Neural Network (RNN) or Long Short-Term Memory (LSTM) based model that predicts the view change speed from the values of the pixels included in the viewing area in each successive unit time interval. The artificial intelligence model for prediction may also include CNN-based object recognition layers for recognizing at least one object included in the viewing area. For example, feature vectors related to object recognition, or the object recognition results, may also be input to the RNN or LSTM based model. As a result, the model is trained on the relationship between the characteristics (e.g., type, color, size, and number) of objects appearing in the viewing area in real time and the view change speed, and can predict the view change speed for the situation that follows. For example, the model can learn that the user's view change speed increases markedly when a specific type of object appears.

Thereafter, the electronic device 100 may input the user's view change speed for each of a plurality of unit time intervals running from the past to the present into the artificial intelligence model, and thereby predict the user's view change speed for the next unit time interval. The view change speed predicted for each unit time interval may be applied in real time to the adjustment of the viewing angle range described above.

For example, the viewing angle range of the foveal area in each unit time interval may be set according to the user's view change speed predicted for that interval, but the disclosure is not limited thereto.
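The data flow of this prediction step can be sketched without a neural network library. The sketch below builds (past window, next speed) training pairs from the history data and uses an exponential moving average purely as a stand-in for the RNN or LSTM based model; the stand-in, the class name `SpeedPredictor`, and the window and smoothing values are assumptions and do not represent the model of the disclosure.

```python
from collections import deque


def make_training_pairs(history: list, window: int) -> list:
    """Turn a view change speed history into (past window, next speed) pairs."""
    return [
        (history[i:i + window], history[i + window])
        for i in range(len(history) - window)
    ]


class SpeedPredictor:
    """Stand-in predictor: exponential moving average, NOT an RNN or LSTM."""

    def __init__(self, window: int = 4, alpha: float = 0.5):
        self.recent = deque(maxlen=window)  # most recent unit time intervals
        self.alpha = alpha                  # weight given to newer observations

    def observe(self, speed: float) -> None:
        self.recent.append(speed)

    def predict_next(self) -> float:
        if not self.recent:
            return 0.0
        speeds = list(self.recent)
        estimate = speeds[0]
        for speed in speeds[1:]:
            estimate = self.alpha * speed + (1 - self.alpha) * estimate
        return estimate
```

The value returned by `predict_next` would then drive the viewing angle range chosen for the next unit time interval.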

FIG. 7 is a block diagram illustrating the configuration of an electronic device according to various embodiments of the present disclosure.

Referring to FIG. 7, the electronic device 100 may further include a user input unit 160, an audio output unit 170, and the like, in addition to a memory 110, a processor 120, a communication unit 130, a display 140, and a sensor unit 150.

The user input unit 160 is a component for receiving various commands or information from a user. The user input unit 160 may be implemented as at least one button, a touch pad, a touch screen, a microphone, a camera, a sensor, or the like. In addition, the electronic device 100 may be connected to various user input devices (e.g., controllers) each having at least one of a keypad, a button, a motion sensor, an acceleration sensor, a gyro sensor, and the like.

The audio output unit 170 is a component for outputting various information audibly, and may include a speaker, an earphone/headphone terminal, and the like. For example, the electronic device 100 may receive streaming data including image data and audio data from the server 200, and may output, through the audio output unit 170, the section of the audio data that corresponds to the section of the image data being reproduced.
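As a small illustration of matching an audio section to an image reproduction section, the sketch below converts a range of image frame indices into a range of audio sample indices. The function name and the assumption of constant integer frame and sample rates are hypothetical; the disclosure does not specify how the sections are aligned.

```python
def audio_range_for_frames(first_frame: int, last_frame: int,
                           fps: int, sample_rate: int) -> tuple:
    """Return the [start, end) audio sample range for frames first..last inclusive."""
    if fps <= 0 or sample_rate <= 0:
        raise ValueError("fps and sample_rate must be positive")
    # Integer arithmetic avoids floating-point drift over long content.
    start = first_frame * sample_rate // fps
    end = (last_frame + 1) * sample_rate // fps
    return start, end
```

For instance, the second one-second span of 30 frames-per-second video maps to the second block of 48,000 samples at a 48 kHz sample rate.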

Meanwhile, two or more of the various embodiments described above may be combined with each other and implemented together, as long as they do not conflict with or contradict each other.

Meanwhile, the various embodiments described above may be implemented in a recording medium readable by a computer or a similar device, using software, hardware, or a combination thereof.

In a hardware implementation, the embodiments described in the present disclosure may be implemented using at least one of application specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field programmable gate arrays (FPGAs), processors, controllers, micro-controllers, microprocessors, and other electrical units for performing functions.

In some cases, the embodiments described in this specification may be implemented by the processor itself. In a software implementation, embodiments such as the procedures and functions described in this specification may be implemented as separate software modules. Each of these software modules may perform one or more of the functions and operations described in this specification.

Meanwhile, computer instructions or computer programs for performing the processing operations of the server, the electronic device, and the like according to the various embodiments of the present disclosure described above may be stored in a non-transitory computer-readable medium. When executed by a processor of a specific device, the computer instructions or computer programs stored in such a non-transitory computer-readable medium cause that device to perform the processing operations of the server, the electronic device, and the like according to the various embodiments described above.

A non-transitory computer-readable medium is a medium that stores data semi-permanently and can be read by a device, as opposed to a medium that stores data only for a short moment, such as a register, cache, or memory. Specific examples of non-transitory computer-readable media include a CD, DVD, hard disk, Blu-ray disc, USB, memory card, and ROM.

Although preferred embodiments of the present disclosure have been shown and described above, the present disclosure is not limited to the specific embodiments described. Various modifications may of course be made by those of ordinary skill in the art to which the disclosure pertains without departing from the gist of the present disclosure as claimed in the claims, and such modifications should not be understood separately from the technical idea or outlook of the present disclosure.

100: electronic device    110: memory
120: processor    130: communication unit
140: display    150: sensor unit
200: server

Claims

A control method of an electronic device, the method comprising:
receiving streaming data including first image data corresponding to a preset viewing range;
identifying a viewing area of a user within the preset viewing range by tracking a direction of the user's eyeballs;
selecting, from among the first image data, target image data that matches at least a part of the identified viewing area;
obtaining restored image data by inputting the target image data to an artificial intelligence model trained to perform high-resolution reconstruction;
obtaining second image data by replacing the target image data of the first image data with the restored image data, and outputting the second image data;
dividing the user's viewing area into a foveal area having a certain viewing angle and including the center point to which the user's gaze is directed, a blend area surrounding the foveal area and corresponding to a viewing angle greater than that of the foveal area, and a peripheral area excluding the foveal area and the blend area;
changing a viewing angle range of the foveal area based on at least one of a speed at which the electronic device receives the streaming data and a speed at which the electronic device performs high-resolution reconstruction through the artificial intelligence model;
identifying a speed at which the viewing area of the user changes; and
changing the viewing angle range of the foveal area based on the identified speed,
wherein the obtaining of the restored image data comprises:
obtaining first restored image data by inputting first target image data that matches the foveal area to the artificial intelligence model; and
obtaining second restored image data by performing interpolation correction on second target image data that matches the blend area.
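As an illustrative aside, the division into foveal, blend, and peripheral areas and the per-area processing recited above can be sketched as follows. The sketch treats each viewing angle as an angular radius around the gaze point, uses a fixed degrees-per-pixel scale, and takes the AI restoration and the interpolation correction as caller-supplied functions; all of these, along with the function names, are assumptions.

```python
import math


def classify_region(pixel, gaze, deg_per_pixel, foveal_deg, blend_deg):
    """Label a pixel by its angular distance from the gaze point."""
    dx = pixel[0] - gaze[0]
    dy = pixel[1] - gaze[1]
    eccentricity = math.hypot(dx, dy) * deg_per_pixel
    if eccentricity <= foveal_deg:
        return "foveal"
    if eccentricity <= blend_deg:
        return "blend"
    return "peripheral"


def restore_frame(pixels, gaze, deg_per_pixel, foveal_deg, blend_deg,
                  ai_restore, interpolate):
    """Replace target pixels with restored values; the rest pass through."""
    restored = {}
    for position, value in pixels.items():
        region = classify_region(position, gaze, deg_per_pixel, foveal_deg, blend_deg)
        if region == "foveal":
            restored[position] = ai_restore(value)    # first restored image data
        elif region == "blend":
            restored[position] = interpolate(value)   # second restored image data
        else:
            restored[position] = value                # peripheral area is left as received
    return restored
```

A practical implementation would operate on image tiles or tensors rather than single pixels; the per-pixel loop only keeps the example short.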

The control method of claim 1,
wherein the receiving of the streaming data comprises:
receiving a plurality of partial images into which the first image data is divided by region; and
receiving data for a plurality of neural network models trained to perform high-resolution reconstruction on each of the plurality of partial images,
wherein the selecting of the target image data comprises selecting, from among the plurality of partial images, at least one partial image included in the user's viewing area, and
wherein the obtaining of the restored image data comprises inputting the selected at least one partial image to at least one neural network model that matches each of the selected at least one partial image.

The control method of claim 1, further comprising:
receiving, by the electronic device, streaming data including a plurality of image frames into which Virtual Reality (VR) content is divided by time; and
receiving, by the electronic device, data for a plurality of first neural network models trained to perform high-resolution reconstruction on each of the plurality of image frames,
wherein each of the plurality of first neural network models includes a plurality of second neural network models trained to perform high-resolution reconstruction on each of a plurality of partial images into which each image frame is divided by region, and
wherein the control method further comprises:
selecting at least one partial image included in at least one image frame based on a time interval within the VR content reproduced by the electronic device and a viewing area of the user detected during the time interval; and
performing high-resolution reconstruction on the selected at least one partial image through at least one second neural network model that matches each of the selected at least one partial image.
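As an illustrative aside, the two-level selection recited above (a first model per image frame, containing a second model per partial image) can be pictured as a nested lookup. The dictionary layout, the function names, and the use of plain callables in place of neural network models are assumptions made for this sketch.

```python
def select_models(models: dict, frame_index: int, visible_tiles: list) -> dict:
    """Pick the per-tile models for one frame.

    models is assumed to map frame_index -> {tile_id: model}, mirroring a
    first model per frame that contains second models per partial image.
    """
    tile_models = models[frame_index]
    return {tile: tile_models[tile] for tile in visible_tiles if tile in tile_models}


def restore_tiles(tiles: dict, selected: dict) -> dict:
    """Run each selected partial image through its matching model."""
    return {tile_id: model(tiles[tile_id]) for tile_id, model in selected.items()}
```

Only the partial images inside the user's viewing area for the current time interval are looked up, so models for the other partial images are never invoked.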

(Deleted)

A control method of a system including a server and an electronic device, the method comprising:
transmitting, by the server, streaming data including first image data corresponding to a viewing range of 360 degrees to the electronic device;
identifying, by the electronic device, a viewing area of a user by tracking a direction of the user's eyeballs;
selecting, by the electronic device, target image data that matches the identified viewing area from among the first image data;
obtaining, by the electronic device, restored image data by inputting the target image data to an artificial intelligence model trained to perform high-resolution reconstruction;
obtaining, by the electronic device, second image data by replacing the target image data of the first image data with the restored image data, and outputting the second image data;
dividing, by the electronic device, the user's viewing area into a foveal area having a certain viewing angle and including the center point to which the user's gaze is directed, a blend area surrounding the foveal area and corresponding to a viewing angle greater than that of the foveal area, and a peripheral area excluding the foveal area and the blend area;
changing, by the electronic device, a viewing angle range of the foveal area based on at least one of a speed at which the electronic device receives the streaming data and a speed at which the electronic device performs high-resolution reconstruction through the artificial intelligence model;
identifying, by the electronic device, a speed at which the viewing area of the user changes; and
changing, by the electronic device, the viewing angle range of the foveal area based on the identified speed,
wherein the obtaining of the restored image data comprises:
obtaining first restored image data by inputting first target image data that matches the foveal area to the artificial intelligence model; and
obtaining second restored image data by performing interpolation correction on second target image data that matches the blend area.