WO2024148975A1 - A shooting method and device - Google Patents
A shooting method and device
- Publication number: WO2024148975A1
- PCT application: PCT/CN2023/134757 (CN2023134757W)
- Authority: WIPO (PCT)
- Prior art keywords: target, frame, motion, shooting, information
Classifications
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M1/00—Substation equipment, e.g. for use by subscribers
- H04M1/72—Mobile telephones; Cordless telephones, i.e. devices for establishing wireless links to base stations without route selection
- H04M1/724—User interfaces specially adapted for cordless or mobile telephones
- H04M1/72403—User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality
- H04M1/7243—User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality with interactive means for internal management of messages
- H04M1/72439—User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality with interactive means for internal management of messages for image or video messaging
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N23/00—Cameras or camera modules comprising electronic image sensors; Control thereof
- H04N23/60—Control of cameras or camera modules
- H04N23/63—Control of cameras or camera modules by using electronic viewfinders
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N23/00—Cameras or camera modules comprising electronic image sensors; Control thereof
- H04N23/60—Control of cameras or camera modules
- H04N23/67—Focus control based on electronic image sensor signals
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M2250/00—Details of telephonic subscriber devices
- H04M2250/52—Details of telephonic subscriber devices including functional features of a camera
Definitions
- the embodiments of the present application relate to the field of electronic technology, and in particular, to a shooting method and device.
- the camera functions of electronic devices such as mobile phones and tablet computers are becoming increasingly powerful and can be used to take pictures and record videos.
- the mobile phone can display the shooting pictures captured by the camera in real time in the user interface.
- the user can obtain images with different effects.
- the mobile phone can focus according to the focus selected by the user.
- the mobile phone can also provide users with the function of automatically tracking the focus and focusing.
- however, during automatic focus tracking, the mobile phone may not be able to track the shooting target in the shooting picture in time, resulting in inaccurate focusing results and a low focusing success rate.
- as a result, the photos or videos taken are blurred, reducing the user experience.
- the embodiments of the present application provide a shooting method and device for solving the problem that, during the process of automatic tracking and focusing of a mobile phone, a shooting target in a shooting picture cannot be tracked in time, resulting in inaccurate focusing results.
- a shooting method which is applied to an electronic device having a camera, wherein the camera includes a motion sensor and an image sensor, and the method includes:
- the electronic device acquires the first event data collected by the motion sensor and the first frame of the original image collected by the image sensor. Afterwards, the electronic device performs moving target detection based on the first event data to determine the first motion information of the shooting target; the first motion information is used to characterize the motion of the shooting target. And the electronic device performs target detection based on the first frame of the original image to determine the first target detection information of the shooting target; the first target detection information is used to characterize the position of the shooting target in the shooting picture.
- the electronic device updates the first target detection information of the shooting target based on the first motion information of the shooting target. After that, the electronic device determines the first ROI focus area according to the first motion information and the updated first target detection information; and focuses the camera according to the first ROI focus area to capture the second frame of the original image. Finally, the electronic device displays the first target image, which is generated by the second frame of the original image after the focus is completed.
- the embodiment of the present application can perform target detection on the shooting target in the first frame of the original image and output the first target detection information.
- the first target detection information is used to represent the static information of the shooting target in the shooting picture.
- the first motion information is used to represent the motion situation information of the shooting target in the shooting picture.
- the first target detection information of the shooting target is updated using the first motion information.
- the focus area is determined based on the first motion information and the updated first target detection information to complete automatic focus.
- the second frame of the original image is collected and the first target image is displayed.
- the first target image can be the image displayed on the shooting preview interface.
- the embodiment of the present application can collect the first event data using an event camera with a high shooting frame rate, and update the first target detection information with the first motion information corresponding to the first event data.
- the first motion information corresponding to the first event data collected by the event camera with a high shooting frame rate can guide the first target detection information in a timely manner, thereby improving the situation where the first target detection result is lagging, so that the first target detection information is closer to the motion state of the shooting target in the real scene, and improves the probability of successful focusing in the motion scene.
- the embodiment of the present application determines the focus area through the first motion information corresponding to the shooting target in the motion state.
- since the first motion information of the shooting target has already been obtained, it can be used directly to perform the autofocus process. Furthermore, there is no need to traverse the event signals of all positions from near to far in the shooting picture. The focus area can be accurately determined, which facilitates subsequent shooting of clear pictures or videos and improves the user experience.
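- to make the data flow above concrete, the following Python sketch wires the described steps together in order. It is an illustrative outline only, not the patent's implementation: every parameter (read_events, capture_frame, detect_moving_target, detect_target, update_detection, select_roi, focus_camera) is a hypothetical callable standing in for a module the text describes.

```python
def autofocus_step(read_events, capture_frame, detect_moving_target,
                   detect_target, update_detection, select_roi, focus_camera):
    """One pass of the shooting method summarized above (data flow only)."""
    event_data = read_events()                      # first event data (motion sensor)
    first_frame = capture_frame()                   # first frame of the original image
    motion_info = detect_moving_target(event_data)  # first motion information
    detection_info = detect_target(first_frame)     # first target detection information
    # Update the possibly lagging detection result with the fresher motion info.
    detection_info = update_detection(detection_info, motion_info)
    roi = select_roi(motion_info, detection_info)   # first ROI focus area
    focus_camera(roi)                               # move the focus motor
    return capture_frame()                          # second frame -> first target image
```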
- the process in which the electronic device performs target detection according to the first frame of the original image and determines the first target detection information of the shooting target includes:
- the electronic device performs scene detection and target detection according to the first frame of the original image to determine first target detection information and scene information of the shooting target; the scene information is used to characterize the shooting scene of the shooting target.
- the electronic device determines a first region of interest (ROI) focus area according to the first motion information and the updated first target detection information, including:
- the electronic device determines a first ROI focus area according to the first motion information, the scene information and the updated first target detection information.
- the embodiment of the present application can also perform scene detection on the shooting target in the first frame of the original image and output scene information.
- the scene information is used to represent the shooting scene in the shooting picture.
- the embodiment of the present application can determine the focus area based on the motion information, scene information and updated target detection information, thereby completing automatic focus.
- the embodiment of the present application can also determine the focus area in combination with the shooting scene in the shooting picture.
- for example, the shooting picture may include shooting scenes such as the moon, the blue sky, or the sun.
- the focus area can also be comprehensively determined based on the motion information, the scene information and the updated target detection information. For example, if the user wants to shoot some shooting scenes, the electronic device can focus on the shooting scenes later. In turn, the user's experience is improved.
- the process in which the electronic device performs moving target detection according to the first event data includes:
- the electronic device divides the first event data to generate multiple event frame images. If the motion rate corresponding to the shooting target changes, the number of event frame images generated within a unit time length is adjusted according to the motion rate; the number of event frame images is positively correlated with the motion rate corresponding to the shooting target; then, the electronic device performs moving target detection on the event frame images.
- the embodiment of the present application can segment the first event data collected by the motion sensor to generate multiple frames of event frame images. Afterwards, multiple groups of event frame images are accumulated, and a series of motion trajectories in the multiple groups of event frame images are analyzed. Moreover, in the embodiment of the present application, the number of frames of event frame images generated per unit time length is related to the motion rate of the shooting target. The greater the motion rate corresponding to the shooting target, the more frames of event frame images are generated per unit time length. Conversely, the smaller the motion rate corresponding to the shooting target, the fewer frames of event frame images are generated per unit time length.
- the embodiment of the present application provides that the number of frames of the event frame image generated within a unit time length is positively correlated with the motion rate of the shooting target.
- the more event frame images are generated per unit time, the more first motion information corresponding to the event frame images is obtained.
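- as a hedged sketch of this positive correlation, the helper below maps a measured motion rate to an event-frame count per second. The linear law and all constants (base_fps, gain, max_fps) are assumptions for illustration; the text only requires that the count grow with the motion rate.

```python
def event_frames_per_second(motion_rate: float,
                            base_fps: float = 30.0,
                            gain: float = 2.0,
                            max_fps: float = 960.0) -> int:
    """Number of event frame images to form per unit time.

    Positively correlated with the shooting target's motion rate, as the
    text requires; the specific linear mapping is an assumed example.
    """
    return int(min(max_fps, base_fps + gain * max(0.0, motion_rate)))
```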
- an electronic device divides first event data into multiple event frame images; the process includes: the electronic device divides the first event data into multiple event frame images according to a preset event data amount; wherein each event frame image corresponds to a preset event data amount.
- the embodiment of the present application can divide the first event data into multiple groups of event frame images, such as k groups of event frame images, according to the preset event data amount, and each event frame image can correspond to the same preset event data amount.
- the preset event data amount may be expressed as a number of event signals.
- each event frame image may correspond to a different preset event data amount.
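- a minimal sketch of this fixed-amount segmentation follows, assuming the preset event data amount is expressed as a number of event signals; the Event tuple fields follow the DVS event description given later in this document.

```python
from typing import List, NamedTuple

class Event(NamedTuple):
    x: int         # pixel column
    y: int         # pixel row
    t: float       # timestamp in seconds
    polarity: int  # +1 for a brightness increase, -1 for a decrease

def slice_by_event_count(events: List[Event],
                         events_per_frame: int) -> List[List[Event]]:
    """Divide the first event data into event-frame groups, each holding
    the same preset event data amount (a fixed number of event signals)."""
    return [events[i:i + events_per_frame]
            for i in range(0, len(events), events_per_frame)]
```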
- the first target detection information includes one or more of a first target frame of the photographed target, a size of an area corresponding to the first target frame, or coordinates corresponding to the first target frame;
- the first motion information includes one or more of a second target frame corresponding to the photographed target, a region size corresponding to the second target frame, or coordinates corresponding to the second target frame, and a confidence level corresponding to the photographed target;
- the first motion information includes one or more of a motion profile corresponding to the photographed target, a region size corresponding to the motion profile, or coordinates corresponding to the motion profile, and a confidence level corresponding to the photographed target.
- the first motion information also includes one or more of the motion direction, motion speed or motion type corresponding to the photographed target; the motion type includes at least one or more of reciprocating motion, rotational motion, lateral motion, longitudinal motion or bouncing motion.
- the first target detection information includes one or more of the first target frame of the shooting target, the area size corresponding to the first target frame, or the coordinates corresponding to the first target frame;
- the first motion information includes one or more of the second target frame corresponding to the shooting target, the area size corresponding to the second target frame, or the coordinates corresponding to the second target frame, and the confidence corresponding to the shooting target.
- when subsequently performing the focus task, the mobile phone needs to calculate the average depth value in the second target frame. Since the second target frame includes some other areas in addition to the shooting target itself, the average depth value of the motion contour area corresponding to the shooting target is more accurate than that of the second target frame. Therefore, the motion information can include not only the second target frame, but also one or more of the motion contour area corresponding to the shooting target, the area size corresponding to the motion contour, or the coordinates corresponding to the motion contour, as well as the confidence corresponding to the shooting target.
- the first motion information may also include the motion direction of the shooting target, the motion speed of the shooting target, and the motion type of the shooting target.
- the electronic device may determine the subsequent focus area according to the motion direction of the shooting target, the motion speed of the shooting target, and the motion type of the shooting target, so as to make the focus area more accurate, thereby improving the focus accuracy of the electronic device during the focusing process.
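- the fields enumerated above can be grouped into simple data structures. The following Python dataclasses are one possible layout, assuming pixel-coordinate boxes in (x, y, width, height) form; the field names are illustrative, not taken from the patent.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import List, Optional, Tuple

class MotionType(Enum):
    RECIPROCATING = auto()
    ROTATIONAL = auto()
    LATERAL = auto()
    LONGITUDINAL = auto()
    BOUNCING = auto()

@dataclass
class TargetDetectionInfo:
    """First target detection information: static position in the picture."""
    target_box: Tuple[int, int, int, int]  # first target frame (x, y, w, h)
    area: int                              # size of the region the box covers
    confidence: float                      # confidence for the photographed target

@dataclass
class MotionInfo:
    """First motion information derived from the event data."""
    target_box: Tuple[int, int, int, int]       # second target frame (x, y, w, h)
    contour: List[Tuple[int, int]]              # motion profile of the target
    confidence: float                           # confidence for the photographed target
    direction: Optional[Tuple[float, float]] = None  # unit motion direction
    speed: Optional[float] = None                    # e.g. pixels per second
    motion_type: Optional[MotionType] = None
```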
- the collection time of the first event data is before the collection time of the second frame of the original image
- the electronic device updates the first target detection information of the shooting target based on the first motion information of the shooting target; including: the electronic device can adjust the position of the first target frame in the first target detection information based on the position of the second target frame in the first motion information.
- the electronic device can update the target detection information of the first frame original image based on the first motion information of the event frame image.
- the detection accuracy of the motion target detection of the shooting target in the event frame image is higher than that of the target detection of the shooting target in the first frame original image. Therefore, the target detection information of the second frame original image can be guided by the first motion information of the event frame image to complete the update of the first target detection information in the first frame original image, so that the first target detection information is more accurate and closer to the motion state of the shooting target in the real scene.
- the time interval between the second frame of the original image and the updated first frame of the original image is reduced.
- the second frame of the original image is guided based on the first motion information that is closer to the real scene, which is more accurate than the first target detection result corresponding to the first frame of the original image, so as to improve the focus accuracy and success probability in the subsequent focusing process.
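- one simple way to realize this update step is to pull the lagging first target frame toward the fresher second target frame. The convex blend below, and its 0.7 weight, are assumptions for illustration; the text only states that the position is adjusted based on the motion information.

```python
def update_target_box(detection_box, motion_box, weight: float = 0.7):
    """Adjust the first target frame using the second target frame's position.

    Both boxes are (x, y, w, h) tuples; weight is the trust placed in the
    fresher motion information (an assumed value).
    """
    return tuple(int(round(weight * m + (1.0 - weight) * d))
                 for d, m in zip(detection_box, motion_box))
```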
- the process in which the electronic device determines the first ROI focus area according to the first motion information, the scene information, and the updated first target detection information includes:
- the electronic device obtains a first ROI focus area by screening the second target frame in the first motion information and the first target frame in the updated first target detection information based on a preset screening rule;
- the preset screening rule includes comparing the confidence of the photographed target in the second target frame with the confidence of the photographed target in the first target frame, and determining the first target frame or the second target frame with a high confidence as the first ROI focus area;
- the preset screening rule also includes detecting the correlation between the first motion information and the scene information and the correlation between the first target detection information and the scene information, and determining the first target frame or the second target frame with a high correlation as the first ROI focus area;
- the preset screening rule also includes comparing the priorities corresponding to the first motion information and the first target detection information, and determining the first target frame or the second target frame with a higher priority as the first ROI focus area.
- the embodiments of the present application can determine the focus area based on preset screening rules.
- the preset screening rules can be preset based on the confidence level corresponding to the first target frame and the second target frame; they can also be preset based on the correlation between the first motion information and the scene information and the correlation between the first target detection information and the scene information; they can also be preset based on the priority level corresponding to the first motion information and the first target detection information.
- the above preset screening rules can be combined to determine the focus area based on the combined preset screening rules.
- the accuracy of determining the focus area is improved, which facilitates subsequent precise focusing.
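- a sketch of how the three screening rules above could be applied in sequence, using the MotionInfo and TargetDetectionInfo layouts sketched earlier. The rule order, the tie-breaking, and the scene_correlation callable are all assumptions; the text leaves the correlation measure and the rule combination open.

```python
def select_roi(motion_info, detection_info, scene_info=None,
               scene_correlation=None,
               motion_priority: int = 1, detection_priority: int = 0):
    """Screen the second and first target frames into the first ROI focus area."""
    # Rule 1: the frame whose target has the higher confidence wins.
    if motion_info.confidence != detection_info.confidence:
        return (motion_info.target_box
                if motion_info.confidence > detection_info.confidence
                else detection_info.target_box)
    # Rule 2: the frame with the higher correlation to the scene information wins.
    if scene_info is not None and scene_correlation is not None:
        m_corr = scene_correlation(motion_info, scene_info)
        d_corr = scene_correlation(detection_info, scene_info)
        if m_corr != d_corr:
            return (motion_info.target_box if m_corr > d_corr
                    else detection_info.target_box)
    # Rule 3: fall back to the preset priorities of the two information sources.
    return (motion_info.target_box if motion_priority > detection_priority
            else detection_info.target_box)
```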
- the electronic device focuses on the second frame of the original image according to the first ROI focus area, including: the electronic device calculates the average depth value in the first ROI focus area. Then, the electronic device determines the distance moved by the focus motor of the camera and the focus position according to the average depth value; the electronic device controls the focus motor of the camera to move the distance and move to the focus position.
- the embodiment of the present application calculates the average depth value of the focus area and determines the movement of the focus motor according to the average depth value. Finally, the electronic device controls the camera's focus motor to move the determined distance and to the focus position.
- the electronic device controls the focus motor to focus the lens according to the focus area.
- the first motion information can update the first target detection information in advance
- the first target detection information can be made closer to the shooting target in the real scene. Therefore, in the process of determining the ROI focus area and subsequent focusing, the focus area and the focus position to be focused can be determined more accurately.
- the shooting target in the shooting picture can be focused in a timely and accurate manner. In particular, it can effectively improve the focusing accuracy and focusing speed of different shooting targets in complex scenes (such as scenes with shooting targets in motion). It is convenient for subsequent clear shooting of the shooting target, and improves the image quality of the captured image and the success rate of shooting.
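- the depth-to-motor step can be sketched as follows, assuming a per-pixel depth map aligned with the shooting picture and a device-specific calibrated mapping from average depth to focus-motor position; both assumptions go beyond what the text specifies.

```python
import numpy as np

def focus_from_roi(depth_map: np.ndarray, roi, depth_to_motor_position):
    """Average the depth inside the ROI focus area and derive the motor target.

    depth_map: per-pixel depth image aligned with the shooting picture.
    roi: (x, y, w, h) first ROI focus area in pixel coordinates.
    depth_to_motor_position: calibrated lookup from object distance to the
        focus position (device specific; assumed to exist).
    """
    x, y, w, h = roi
    mean_depth = float(depth_map[y:y + h, x:x + w].mean())  # average depth value
    return depth_to_motor_position(mean_depth)              # focus position to move to
```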
- the first event data is event data collected between a first moment and a second moment
- the first moment is a time when the first frame of the original image is collected
- the second moment is a time when the second frame of the original image is collected
- the electronic device detects the moving target according to the second event data, and determines the second motion information of the shooting target;
- the second event data is the event data collected between the first moment and the third moment; the third moment lies between the first moment and the second moment, and at least one third moment is included between the first moment and the second moment;
- the electronic device predicts the motion of the shooting target according to the second motion information of the shooting target, and determines the prediction information of the shooting target;
- the prediction information includes one or more of a prediction target frame corresponding to the shooting target, a size of the prediction target frame, and coordinates of the prediction target frame;
- the electronic device may perform target detection according to the first frame of the original image to determine second target detection information of the photographed target; and based on the second motion information of the photographed target, update the second target detection information of the photographed target;
- the electronic device determines a second ROI focus area according to the second motion information and the updated second target detection information; and controls the focus motor of the camera to focus on the shooting target according to the predicted target frame and the second ROI focus area.
- the embodiment of the present application can also perform a focus tracking process, which is to continue to focus on the focus point after completing the focusing task and keep the focus point in an aligned state. It can also be understood as keeping the focus on the shooting target during the subsequent shooting process.
- the first event data is the event data collected between the first moment and the second moment
- the first moment is the moment of collection of the first frame of the original image
- the second moment is the moment of collection of the second frame of the original image.
- the second event data is the event data collected between the first moment and the third moment
- the third moment is between the first moment and the second moment
- at least one third moment is included between the first moment and the second moment. That is to say, after focusing based on the first frame of the original image, the electronic device can perform a subsequent focus tracking process.
- the embodiment of the present application can also predict the motion of the shooting target and output prediction information during the process of moving target detection; the prediction information is used to represent the predicted motion information of the shooting target. Then, automatic focus tracking is performed based on the output prediction information, so that the user can decide the best shooting time according to the automatic focus tracking function and shoot clearer pictures or videos.
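- the prediction step admits many models; the sketch below uses the simplest one, constant-velocity extrapolation from the motion direction and speed in the second motion information. The linear model and the dt horizon are assumptions, since the text does not fix a prediction method.

```python
def predict_target_box(box, direction, speed, dt: float):
    """Extrapolate the predicted target frame dt seconds ahead.

    box: (x, y, w, h) current target frame; direction: unit vector (dx, dy);
    speed: pixels per second. Constant velocity is an assumed model.
    """
    x, y, w, h = box
    dx, dy = direction
    return (int(round(x + dx * speed * dt)),
            int(round(y + dy * speed * dt)), w, h)
```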
- the method further includes: the electronic device displays a predicted target frame; the predicted target frame is presented in the first target image, and the predicted target frame is used to indicate the predicted position of the photographed target.
- the embodiment of the present application can display the predicted target box in the prediction information on the display screen to indicate the predicted position of the shooting target.
- the user can obtain the moving position of the shooting target within a period of time in the future based on the predicted target box.
- the user's playability and interactivity with the electronic device are increased, and the user's experience is improved.
- the process in which the electronic device displays the first target image includes: displaying the first target image on the shooting preview interface, where the shooting preview interface includes a photo preview interface or a video preview interface.
- the electronic device can display the first target image on the display screen.
- the shooting preview interface may include a photo preview interface or a video preview interface.
- the first target image can be represented as a preview image, that is, an image after the focus is completed.
- the user can view the focused shooting target based on the first target image to facilitate subsequent photo operations or video operations.
- the process in which the electronic device captures the second frame of the original image includes: after the electronic device detects the photo operation, capturing the second frame of the original image and generating the first target image obtained by taking the photo;
- the process in which the electronic device displays the first target image includes: displaying a thumbnail corresponding to the first target image.
- the process in which the electronic device captures the second frame of the original image includes: after the electronic device detects the recording operation, capturing a second frame of original image and displaying the first target image in a second preset area of the recording interface; wherein the electronic device generates a video frame in the target video obtained by recording based on the second frame of original image.
- the embodiment of the present application provides that in the scene of taking pictures, after the electronic device detects the photo operation, it collects the second frame of original image and generates the first target image obtained by taking pictures. And a thumbnail of the first target image is displayed in a preset area in the shooting preview interface. The first target image is used to represent the image obtained by the user taking pictures.
- the electronic device detects the recording operation, it collects the second frame of original image; and displays the first target image in the second preset area of the recording interface.
- the second frame of original image is used to generate a video frame in the target video obtained by recording. Therefore, the electronic device can collect and display the image obtained by taking pictures or recording videos according to the shooting timing decided by the user, so that the user can see the clear image captured, which improves the user experience.
- an electronic device comprising a memory and one or more processors; the memory is coupled to the processor; wherein computer program code is stored in the memory, and the computer program code comprises computer instructions, and when the computer instructions are executed by the processor, the electronic device executes the shooting method as described in the first aspect above.
- a computer-readable storage medium in which instructions are stored; when the instructions are executed, the computer can execute the shooting method as described in the first aspect.
- FIG1 is a schematic diagram of the hardware structure of an electronic device provided in an embodiment of the present application.
- FIG2 is a schematic diagram of a software structure of an electronic device provided in an embodiment of the present application.
- FIG3 is a schematic diagram of a flow chart of a photographing method provided in an embodiment of the present application.
- FIG4 is a schematic diagram of a preview interface provided in an embodiment of the present application.
- FIG5 is a schematic diagram of a photographing target provided by an embodiment of the present application.
- FIG6 is a schematic diagram of a frame combining method using a fixed event rate according to an embodiment of the present application.
- FIG7 is a schematic diagram of an event monitored by a motion sensor provided in an embodiment of the present application.
- FIG8 is a schematic diagram of motion information provided by an embodiment of the present application.
- FIG9 is a schematic diagram of a mask of event data provided in an embodiment of the present application.
- FIG10 is a schematic diagram of updating target detection information provided by an embodiment of the present application.
- FIG11 is a schematic diagram of adaptive segmentation of data volume corresponding to event data provided by an embodiment of the present application.
- FIG12 is a schematic diagram of determining a ROI focus area provided in an embodiment of the present application.
- FIG13 is a schematic diagram of a focus tracking process provided in an embodiment of the present application.
- FIG14 is a schematic diagram of performing a focus tracking task provided by an embodiment of the present application.
- FIG15 is a schematic diagram of turning on a motion sensor provided in an embodiment of the present application.
- FIG16 is a schematic diagram of switching a motion mode provided in an embodiment of the present application.
- FIG17 is a schematic diagram of the structure of an electronic device provided in an embodiment of the present application.
- in the embodiments of the present application, “at least one of the following” or similar expressions refer to any combination of the listed items, including any combination of single items or plural items.
- for example, at least one of a, b, or c can represent: a, b, c, ab, ac, bc, or abc, where each of a, b, and c can be single or multiple.
- words such as “first” and “second” are used to distinguish between identical or similar items with substantially identical functions and effects.
- words such as “first” and “second” do not limit the quantity or order of execution, nor do they necessarily indicate that the items referred to are different.
- words such as “exemplary” or “for example” are used to indicate examples, illustrations or explanations. Any embodiment or design described as “exemplary” or “for example” in the embodiments of the present application should not be construed as being more preferred or advantageous than other embodiments or designs.
- rather, words such as “exemplary” or “for example” are intended to present related concepts in a concrete way for ease of understanding.
- the network architecture and business scenarios described in the embodiments of the present application are intended to more clearly illustrate the technical solutions of the embodiments of the present application, and do not constitute a limitation on the technical solutions provided in the embodiments of the present application.
- a person of ordinary skill in the art can appreciate that with the evolution of the network architecture and the emergence of new business scenarios, the technical solutions provided in the embodiments of the present application are also applicable to similar technical problems.
- users can use the camera function in electronic devices such as mobile phones to take photos and record videos.
- the user can manually select the focus in the shooting picture, and then the mobile phone can focus according to the focus selected by the user.
- the mobile phone can also provide an automatic focus function.
- the mobile phone can perform motion target detection on the target, and automatically track the focus and focus according to the motion detection result.
- the motion target detection result can be used as a supplement to information such as texture and color.
- the motion target detection result corresponding to the static image is obtained with a lag.
- by the time the motion target detection result is obtained, the shooting target may have entered the next position or state. Therefore, there will be situations where the shooting target in the shooting picture cannot be focused on in time, and the captured photos or videos will be blurred, reducing the user experience.
- the mobile phone can also use all the event data collected by the event camera (moving from near to far positions), calculate the value corresponding to the focus function based on the event rate, and determine the position with the largest value as the focus position; then automatically track the focus and focus.
- however, in the above process of automatically tracking the focus and focusing, a large amount of data needs to be traversed for calculation, and the work efficiency is low. Calculation errors may also lead to repeated focusing, reducing the accuracy of the focusing result.
- the embodiment of the present application provides a shooting method that can be applied to various scenes, such as photography scenes (shooting static scenes and shooting dynamic scenes) and monitoring scenes.
- the embodiment of the present application can obtain the first event data collected by the motion sensor and the first frame of the original image collected by the image sensor.
- the moving target is detected to determine the first motion information of the shooting target; the first motion information is used to characterize the motion of the shooting target.
- the target detection is performed to determine the first target detection information of the shooting target; the first target detection information is used to characterize the position of the shooting target in the shooting picture.
- the first target detection information of the shooting target is updated.
- the first ROI focus area is determined.
- the camera is focused to collect the second frame of the original image.
- the first target image is displayed on the display screen, and the first target image is generated by the second frame of the original image after focusing.
- the electronic device in the process of performing target detection according to the first frame of the original image and determining the first target detection information of the shooting target, includes: performing scene detection and target detection according to the first frame of the original image to determine the first target detection information and scene information of the shooting target; the scene information is used to characterize the shooting scene of the shooting target. Afterwards, the electronic device determines the first ROI focus area according to the first motion information, the scene information and the updated first target detection information.
- the embodiment of the present application can use an event camera with a high shooting frame rate to collect event data, and the motion information corresponding to the event data can update the target detection information.
- the motion information corresponding to the event data collected by the event camera with a high shooting frame rate can guide the target detection information in a timely manner, thereby improving the situation where the target detection result is lagging, so that the target detection information is closer to the motion state of the shooting target in the real scene, and improves the probability of successful focusing in the motion scene.
- the embodiment of the present application determines the focus area through the motion information corresponding to the shooting target in motion. Since the first motion information of the shooting target has been obtained, the motion information can be directly used to perform the autofocus process. Furthermore, there is no need to traverse the event signals of all positions from near to far in the shooting picture. It avoids errors and low work efficiency during the traversal process, and can accurately determine the focus area, which is convenient for subsequent capture of clear pictures or videos, and improves the user experience.
- the embodiment of the present application can also predict the motion of the shooting target and output prediction information during the process of moving target detection; the prediction information is used to represent the predicted motion information of the shooting target. Then, automatic focus tracking is performed based on the output prediction information, so that the user can decide the best shooting time according to the automatic focus tracking function and shoot clearer pictures or videos.
- the electronic device may be a mobile phone, a tablet computer, a wearable device, a vehicle-mounted device, an augmented reality (AR)/virtual reality (VR) device, a laptop computer, an ultra-mobile personal computer (UMPC), a netbook, a personal digital assistant (PDA) and other mobile terminals, or may be a professional camera and other equipment.
- the embodiments of the present application do not specifically limit the type of the electronic device.
- the operating system installed on the electronic device may be any of various operating systems; this application does not limit the specific type of electronic equipment or the type of operating system installed on it.
- FIG1 shows a schematic structural diagram of a mobile phone 100 .
- the mobile phone 100 may include a processor 110, an external memory interface 120, an internal memory 121, a Universal Serial Bus (USB) interface 130, a charging management module 140, a power management module 141, a battery 142, an antenna 1, an antenna 2, a mobile communication module 150, a wireless communication module 160, an audio module 170, a speaker 170A, a receiver 170B, a microphone 170C, an earphone interface 170D, a sensor module 180, a button 190, a motor 191, an indicator 192, a camera 193, a display screen 194, and a Subscriber Identification Module (SIM) card interface 195, etc.
- the sensor module 180 may include a pressure sensor 180A, a gyroscope sensor 180B, an air pressure sensor 180C, a magnetic sensor 180D, an acceleration sensor 180E, a distance sensor 180F, a proximity light sensor 180G, a fingerprint sensor 180H, a temperature sensor 180J, a touch sensor 180K, an ambient light sensor 180L, a bone conduction sensor 180M, an image sensor 180N and a motion sensor 180O, etc.
- the structure illustrated in the embodiment of the present application does not constitute a specific limitation on the mobile phone 100.
- the mobile phone 100 may include more or fewer components than shown in the figure, or combine some components, or separate some components, or arrange the components differently.
- the components shown in the figure may be implemented in hardware, software, or a combination of software and hardware.
- the processor 110 may include one or more processing units, for example, the processor 110 may include an application processor (AP), a modem processor, a graphics processor (GPU), an image signal processor (ISP), a controller, a memory, a video codec, a digital signal processor (DSP), a baseband processor, and/or a neural-network processing unit (NPU), etc.
- Different processing units may be independent devices or integrated in one or more processors.
- the controller may be the nerve center and command center of the mobile phone 100.
- the controller may generate an operation control signal according to the instruction operation code and the timing signal to complete the control of fetching and executing instructions.
- the processor 110 may also be provided with a memory for storing instructions and data.
- the memory in the processor 110 is a cache memory.
- the memory may store instructions or data that the processor 110 has just used or cyclically used. If the processor 110 needs to use the instruction or data again, it may be directly called from the memory. This avoids repeated access, reduces the waiting time of the processor 110, and thus improves the efficiency of the system.
- the wireless communication function of the mobile phone 100 can be implemented through the antenna 1, the antenna 2, the mobile communication module 150, the wireless communication module 160, the modem processor and the baseband processor.
- the wireless communication module 160 can provide wireless communication solutions including wireless local area networks (WLAN) (such as wireless fidelity (Wi-Fi) network), Bluetooth (BT), global navigation satellite system (GNSS), frequency modulation (FM), near field communication (NFC), infrared (IR) and the like applied on the mobile phone 100.
- the wireless communication module 160 can be one or more devices integrating at least one communication processing module.
- the wireless communication module 160 receives electromagnetic waves via the antenna 2, modulates the frequency of the electromagnetic wave signal and performs filtering, and sends the processed signal to the processor 110.
- the wireless communication module 160 can also receive the signal to be sent from the processor 110, modulate the frequency of the signal, amplify the signal, and convert it into electromagnetic waves for radiation through the antenna 2.
- the antenna 1 of the mobile phone 100 is coupled to the mobile communication module 150, and the antenna 2 is coupled to the wireless communication module 160, so that the mobile phone 100 can communicate with the network and other devices through wireless communication technology.
- the wireless communication technology may include Global System For Mobile Communications (GSM), General Packet Radio Service (GPRS), Code Division Multiple Access (CDMA), Wideband Code Division Multiple Access (WCDMA), Time-Division Code Division Multiple Access (TD-SCDMA), Long Term Evolution (LTE), BT, GNSS, WLAN, NFC, FM, and/or IR technology.
- the GNSS may include the Global Positioning System (GPS), the Global Navigation Satellite System (GLONASS), the BeiDou Navigation Satellite System (BDS), the Quasi-Zenith Satellite System (QZSS), and/or Satellite-Based Augmentation Systems (SBAS).
- the mobile phone 100 implements the display function through a GPU, a display screen 194, and an application processor.
- the GPU is a microprocessor for image processing, which connects the display screen 194 and the application processor.
- the GPU is used to perform mathematical and geometric calculations for graphics rendering.
- the processor 110 may include one or more GPUs that execute program instructions to generate or change display information.
- the display screen 194 is used to display images, videos, etc.
- the display screen 194 includes a display panel.
- the mobile phone 100 may include 1 or N display screens 194, where N is a positive integer greater than 1.
- the mobile phone 100 can realize the shooting function through ISP, camera 193, video codec, GPU, display screen 194 and application processor.
- the ISP is used to process the data fed back by the camera 193. For example, when taking a photo, the shutter is opened, and the light is transmitted to the camera photosensitive element through the lens. The light signal is converted into an electrical signal, and the camera photosensitive element transmits the electrical signal to the ISP for processing and converts it into an image visible to the naked eye.
- the ISP can also perform algorithm optimization on the noise, brightness, and skin color of the image. The ISP can also optimize the exposure, color temperature and other parameters of the shooting scene. In some embodiments, the ISP can be set in the camera 193.
- the camera 193 is used to capture still images or videos.
- the object generates an optical image through the lens and projects it onto the photosensitive element.
- the photosensitive element can be a charge coupled device (CCD) or a complementary metal oxide semiconductor (CMOS) phototransistor.
- the photosensitive element converts the optical signal into an electrical signal, and then passes the electrical signal to the ISP to be converted into a digital image signal.
- the ISP outputs the digital image signal to the DSP for processing.
- the DSP converts the digital image signal into an image signal in a standard RGB, YUV or other format.
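- for reference, one standard form of the RGB-to-YUV conversion mentioned here is the BT.601 matrix shown below; which exact matrix a given DSP uses is device specific.

```python
import numpy as np

# BT.601 RGB -> YUV conversion matrix (inputs as floats in 0..1).
_BT601 = np.array([[ 0.299,  0.587,  0.114],
                   [-0.147, -0.289,  0.436],
                   [ 0.615, -0.515, -0.100]])

def rgb_to_yuv_bt601(rgb: np.ndarray) -> np.ndarray:
    """Convert an (..., 3) RGB array to YUV with the BT.601 coefficients."""
    return rgb @ _BT601.T
```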
- the mobile phone 100 may include 1 or N cameras 193, where N is a positive integer greater than 1.
- the camera 193 can be set in the mobile phone 100 as a part of the electronic device. In some implementations, the camera can also be set outside the electronic device and connected to the electronic device by wired or wireless means. For example, the camera can be connected to the electronic device via Bluetooth or a mobile hotspot, and the electronic device can control the camera by sending instructions to it.
- the mobile phone 100 may also include a camera module, which may be disposed in the camera 193 or at other locations inside the mobile phone 100.
- the camera module includes a lens, a focus motor, a base, a circuit board, and a photosensitive element.
- the base is fixedly connected to one side of the circuit board.
- the focus motor is located on the side of the base away from the circuit board and is fixedly connected to the periphery of the base.
- the lens is installed in the middle of the focus motor.
- the photosensitive element is fixed to the side of the circuit board facing the lens.
- the lens is used to collect the light signal reflected by the photographed target.
- the focus motor is used to drive the lens to move in a direction parallel to the optical axis.
- the optical axis refers to a line passing through the center of the lens.
- the mobile phone 100 can control the focus motor to move the lens to a focus position, thereby completing the focusing process.
- the focus motor can be a voice coil motor (VCM), a shape memory alloy (SMA) motor, a ceramic motor (Piezo Motor, PM), and a stepper motor (STM), etc.
- the external memory interface 120 can be used to connect an external memory card, such as a Micro SD card, to expand the storage capacity of the mobile phone 100.
- the external memory card communicates with the processor 110 through the external memory interface 120 to implement a data storage function. For example, files such as music and videos can be stored in the external memory card.
- the internal memory 121 may be used to store computer executable program codes, which include instructions.
- the processor 110 executes various functional applications and data processing of the mobile phone 100 by running the instructions stored in the internal memory 121.
- the internal memory 121 may include a program storage area and a data storage area.
- the mobile phone 100 can implement audio functions such as music playing and recording through the audio module 170, the speaker 170A, the receiver 170B, the microphone 170C, the earphone interface 170D, and the application processor.
- the image sensor 180N can be used to detect objects within the range of the camera, and each photosensitive unit corresponds to a pixel in the image sensor.
- the image sensor 180N may include a first color (red green blue, RGB) image sensor, a second color (red yellow blue, RYB) image sensor, a black and white image sensor, and an infrared image sensor, etc., which is not limited in the embodiment of the present application.
- the image sensor 180N is used to collect original images, wherein the original images may include RGB images, RYB images, black and white images, and infrared images, etc.
- the image sensor 180N is taken as an RGB image sensor and the original image is taken as an RGB image as an example for subsequent description.
- the image can be the second frame RGB image.
- each photosensitive unit is covered with a red, green, or blue color filter. After receiving light, the photosensitive unit generates a corresponding current whose magnitude corresponds to the light intensity. Therefore, the electrical signal directly output by the photosensitive unit is analog, and it is then converted into a digital signal. Finally, all the digital signals obtained are output in the form of a digital image matrix to a dedicated DSP processing chip for processing.
- the RGB image sensor outputs the full-frame image of the shooting area in frame format.
- the motion sensor 180O may include a variety of different types of visual sensors, such as a frame-based motion detection visual sensor (Motion Detection Vision Sensor, MDVS) and an event-based motion detection visual sensor. It can be used to detect moving targets within the range of the camera and collect event data of moving objects; the event data includes motion profiles or motion trajectories, etc.
- the motion sensor 180O may include a motion detection (MD) visual sensor.
- MD is a type of visual sensor that detects motion information.
- the motion information originates from the relative motion between the camera and the target.
- the motion information may be the camera motion, the target motion, or both the camera and the target motion.
- Motion detection visual sensors include frame-based motion detection and event-based motion detection. Frame-based motion detection visual sensors require exposure integration and obtain motion information through frame differences. Event-based motion detection visual sensors do not require integration and obtain motion information through asynchronous event detection.
- the motion sensor 180O may include a Motion Detection Vision Sensor (MDVS), a Dynamic Vision Sensor (DVS), an Active Pixel Sensor (APS), an infrared sensor, a laser sensor, or an Inertial Measurement Unit (IMU), etc.
- the DVS may specifically include a Dynamic and Active-pixel Vision Sensor (DAVIS) and an Asynchronous Time-based Image Sensor (ATIS), etc.
- DVS draws on the characteristics of biological vision.
- Each pixel simulates a neuron and independently responds to relative changes in light intensity (hereinafter referred to as "light intensity changes").
- For example, if the motion sensor is a DVS, when the relative change in light intensity exceeds a threshold, the pixel outputs an event signal including the pixel's position, a timestamp, and characteristic information of the light intensity.
- the event data, event frame images, motion information, etc. mentioned can all be collected by the motion sensor.
- multiple image sensors can be set in the same camera.
- a single-lens dual-sensor camera has an RGB image sensor and a motion sensor both set in one camera.
- in a dual-lens dual-sensor camera and in a single-lens three-sensor camera, these sensors are used to image the same photographed object.
- a light splitter can be further provided between the lens and the sensors to split the light entering from one lens onto multiple sensors, ensuring that each sensor receives light.
- the number of processors can be one or more. The embodiments of the present application do not specifically limit the arrangement of components and the number of components.
- RGB pixels and event pixels may also be combined into a single hybrid image sensor.
- the event pixels can be evenly inserted into the spatial arrangement of RGB pixels to achieve space division multiplexing.
- RGB pixels can also be provided together with event pixels in a photodiode, and then the corresponding RGB data and event data are read out respectively through two different readout circuits.
- an RGB image sensor and a motion sensor are provided in the mobile phone 100.
- the RGB image sensor can be provided in an RGB camera
- the motion sensor can be provided in an event camera.
- the RGB camera and the event camera each form a conical field of view according to their own viewing angle. When collecting images, the two conical fields of view corresponding to the RGB camera and the event camera need to overlap, and the photographed scene should fall within the overlapping region.
- to this end, the RGB camera needs to be arranged side by side with the event camera on the same plane, without their positions overlapping, so that the two conical fields of view have an overlapping region.
- the RGB image sensor and the motion sensor can also be arranged in one camera.
- the RGB image sensor and the motion sensor provided in the embodiment of the present application can cooperate to capture the shooting picture including the shooting target, and can track the shooting target in a timely and accurate manner.
- camera calibration is usually performed to make the image information collected by the RGB image sensor and the motion information collected by the motion sensor more accurate.
- the calibration of RGB image sensors and motion sensors is completed by determining the relationship between the 3D geometric position of a point on the surface of a space object and the corresponding point in the collected image. Furthermore, an imaging geometric model needs to be established during calibration; the parameters of this geometric model can be summarized as calibration parameters.
- the calibration parameters include internal parameters, external parameters, and distortion parameters.
- the calibration parameters can be solved by experiments and calculations, and the process of solving the calibration parameters by experiments and calculations is called the calibration process.
- the calibration process of the RGB image sensor and the motion sensor in the embodiment of the present application can be performed before leaving the factory, or it can be performed regularly before the user uses the camera.
- the specific calibration process and the execution time of the calibration process are not specifically limited in the embodiment of the present application.
- the key 190 includes a power key, a volume key, etc.
- the key 190 can be a mechanical key or a touch key.
- the mobile phone 100 can receive key input and generate key signal input related to the user settings and function control of the mobile phone 100.
- Motor 191 can generate vibration prompts. Motor 191 can be used for incoming call vibration prompts, and can also be used for touch vibration feedback.
- the indicator 192 may be an indicator light, which may be used to indicate the charging status, power changes, messages, missed calls, and notifications, etc.
- FIG. 2 is a software structure block diagram of the mobile phone 100 according to an embodiment of the present invention.
- the layered architecture divides the software into several layers, each with a clear role and division of labor.
- the layers communicate with each other through software interfaces.
- the Android system is divided into five layers, from top to bottom: the application layer, the application framework layer, the Android runtime and system library, the hardware abstraction layer (HAL), and the kernel layer.
- the application layer can include a series of application packages.
- the application package may include applications such as camera, gallery, calendar, call, map, navigation, WLAN, Bluetooth, music, video, short message, etc.
- the application framework layer provides application programming interface (API) and programming framework for the applications in the application layer.
- the application framework layer includes some predefined functions.
- the application framework layer may include a window manager, a content provider, a view system, a telephony manager, a resource manager, a notification manager, and the like.
- the window manager is used to manage window programs.
- the window manager can obtain the display screen size, determine whether there is a status bar, lock the screen, capture the screen, etc.
- Content providers are used to store and retrieve data and make it accessible to applications.
- the data may include videos, images, audio, calls made and received, browsing history and bookmarks, phone books, etc.
- the view system includes visual controls, such as controls for displaying text, controls for displaying images, etc.
- the view system can be used to build applications.
- a display interface can be composed of one or more views.
- a display interface including a text notification icon can include a view for displaying text and a view for displaying images.
- the phone manager is used to provide communication functions of the mobile phone 100, such as management of call status (including answering, hanging up, etc.).
- the resource manager provides various resources for applications, such as localized strings, icons, images, layout files, video files, and so on.
- the notification manager enables applications to display notification information in the status bar. It can be used to convey notification-type messages and can disappear automatically after a short stay without user interaction. For example, the notification manager is used to notify download completion, message reminders, etc.
- the notification manager can also be a notification that appears in the system top status bar in the form of a chart or scroll bar text, such as notifications of applications running in the background, or a notification that appears on the screen in the form of a dialog window. For example, a text message is displayed in the status bar, a prompt sound is emitted, an electronic device vibrates, an indicator light flashes, etc.
- Android Runtime includes core libraries and virtual machines. Android Runtime is responsible for the scheduling and management of the Android system.
- the core library consists of two parts: one part is the function that needs to be called by the Java language, and the other part is the Android core library.
- the application layer and the application framework layer run in a virtual machine.
- the virtual machine executes the Java files of the application layer and the application framework layer as binary files.
- the virtual machine is used to perform functions such as object life cycle management, stack management, thread management, security and exception management, and garbage collection.
- the system library can include multiple functional modules, such as surface manager, media library, 3D graphics processing library (such as OpenGL ES), 2D graphics engine (such as SGL), etc.
- the surface manager is used to manage the display subsystem and provide the fusion of 2D and 3D layers for multiple applications.
- the media library supports playback and recording of a variety of commonly used audio and video formats, as well as static image files, etc.
- the media library can support a variety of audio and video encoding formats, such as: MPEG4, H.264, MP3, AAC, AMR, JPG, PNG, etc.
- the 3D graphics processing library is used to implement 3D graphics drawing, image rendering, synthesis and layer processing, etc.
- a 2D graphics engine is a drawing engine for 2D drawings.
- the hardware abstraction layer runs in user space, encapsulates the kernel layer driver, and provides a calling interface to the upper layer.
- the hardware abstraction layer can include acquisition module, target detection module, scene detection module, moving target detection module, prediction module, ROI selection module and AF module.
- the acquisition module is used to acquire RGB data and event data; the target detection module is used to perform target detection on the collected RGB data and generate target detection information; the scene detection module is used to perform scene detection on the collected RGB data and generate scene information; the moving target detection module is used to perform moving target detection on the collected event data and generate motion information of the shooting target; the prediction module is used to perform motion prediction based on the motion information of the shooting target and generate prediction information; the ROI selection module is used to determine the ROI focus area based on the scene information, motion information and updated target detection information; the AF module is used to control the focus motor to move the lens to the focus position according to the ROI focus area to complete the focusing task; the AF module is also used to perform the tracking task according to the prediction information corresponding to the shooting target.
- the kernel layer is the layer between hardware and software.
- the kernel layer contains at least display driver, camera driver, audio driver, image sensor driver and motion sensor driver.
- the motion sensor generally adopts either an asynchronous readout mode based on an event stream (hereinafter referred to as "event stream-based readout mode" or "asynchronous readout mode") or a synchronous readout mode based on frame scanning (hereinafter referred to as "frame scanning-based readout mode" or "synchronous readout mode").
- the motion sensor is sensitive to motion, and static areas in a normal environment usually do not generate light intensity change events (also referred to as "events" in the application embodiments). When events are arranged in a certain order, they can be called event streams. Generally, depending on the specific application scenario, the amount of signal data required to be read per unit time in the above two readout modes may be significantly different, and the methods required to output the read data are also different.
- the following takes the dynamic vision sensor (DVS) in the motion sensor as an example to illustrate the asynchronous readout mode.
- each event can be expressed as <x, y, t, p>, where (x, y) represents the pixel position where the event occurs, t represents the time when the event occurs, and p represents the polarity information of the event.
- when a light intensity change event occurs, the pixel can output a data signal indicating the event. Therefore, in the event stream-based asynchronous readout mode, the pixels of the motion sensor are divided into pixels that generate light intensity change events and pixels that do not. Through the coordinates and timestamps associated with the pixels, the spatiotemporal position of each light intensity change event can be uniquely determined, and all events can then be organized into an event stream in order of occurrence.
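- The <x, y, t, p> representation and the ordering of events into a stream can be sketched as follows; the class name and microsecond timestamps are assumptions for illustration:

```python
from dataclasses import dataclass

@dataclass
class Event:
    x: int   # pixel column where the light intensity change occurred
    y: int   # pixel row
    t: int   # timestamp (e.g. in microseconds)
    p: int   # polarity: +1 for brighter, -1 for darker

def to_event_stream(events):
    """Order events by occurrence time; (x, y, t) uniquely places each event."""
    return sorted(events, key=lambda e: e.t)

stream = to_event_stream([Event(3, 7, 105, +1), Event(3, 8, 101, -1)])
print(stream)   # the t=101 event comes first in the stream
```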
- the synchronous readout mode based on frame scanning does not distinguish whether the pixel generates a light intensity change event. Regardless of whether a light intensity change event occurs at a certain pixel, the data signal generated by the pixel is read.
- the motion sensor scans the pixel array circuit in a predetermined order, synchronously reads the polarity information p indicating the light intensity of each pixel (the polarity information of the event is the same as above), and outputs it in sequence as the first frame data, the second frame data, etc.
- the motion sensor can output event frames at ultra-high frame rates: the time interval between adjacent event frames is less than 1ms, and the equivalent frame rate exceeds 1000 frames per second (FPS). If a pixel does not trigger an event, the polarity value of that pixel is 0.
- the amount of data read by the motion sensor for each frame is of the same size, and the amount of data remains unchanged over time.
- the frames are output at equal time intervals, for example, they can be output at rates such as 30 frames per second, 60 frames per second, and 120 frames per second.
- the faster the target moves, the more signals the motion sensor captures of the target. Therefore, in fast motion detection, events are triggered only where there is movement, and no events occur where there is no movement. Meanwhile, the faster the object moves, the more event signal data is captured per unit time.
- the shooting target in the shooting picture can be a person, animal, doll, building or plant that the user is interested in, and the shooting target can remain motionless and/or move continuously during the shooting process.
- the shooting target in the shooting picture can be one or more.
- the method may include:
- Step S301 The mobile phone obtains RGB data and first event data in response to a start operation of a shooting function.
- an application program (APP) with a shooting function is installed in the mobile phone, and the user can use the application to shoot pictures and videos.
- After the mobile phone detects the start operation input by the user, in response to the operation, the mobile phone can display the shooting picture captured by the camera in real time in the viewfinder window.
- the mobile phone will display each frame of the shooting picture in the viewfinder window at a certain frame rate (frames per second, FPS). Taking the frame rate of 30 frames per second as an example, the mobile phone can display 30 frames of shooting pictures captured by the camera within 1 second.
- the acquisition module can respond to a start operation input by the user to synchronously start the RGB sensor and the motion sensor to obtain corresponding RGB data and first event data to perform subsequent shooting tasks.
- the user's start-up operation of the shooting function may include a click operation triggered by the user in the photo or video preview state; for example, the user clicks on the camera APP icon.
- the mobile phone enters the photo or video preview state, and displays the photo or video preview state interface by obtaining RGB data and event data.
- for example, the user clicks on the shooting control in the preview interface.
- the mobile phone enters the photo state or video state, and generates the shot photos or videos by obtaining RGB data and event data. It is understandable that the user can click the shooting control to complete the current photo or video.
- the camera application can be started. After starting the camera application, the mobile phone can enter a shooting mode and display a preview interface in a preview state.
- the mobile phone can also start the RGB sensor and the motion sensor in response to the user's voice command or quick gesture operation, and the embodiment of the present application does not limit the operation of triggering the start of the RGB sensor and the motion sensor.
- the output of the RGB sensor is an image sequence composed of continuous RGB frame images, and each frame of the RGB image includes information such as the color and brightness of the object at the time of shooting. It can also be understood that the RGB image captured by the RGB sensor is static.
- the RGB sensor has a lower frame rate than the motion sensor.
- the RGB sensor can output RGB data at a frame rate greater than or equal to 24fps.
- the motion sensor has the characteristics of high frame rate, sensitive operation and low power consumption.
- the frame rate of the motion sensor can be between 1000fps and 15000fps. This high frame rate characteristic makes it very suitable for capturing high-speed dynamic images.
- the motion sensor can trigger an event signal. And the faster the shooting target moves, the denser the event data collected by the motion sensor.
- the frame rate of the motion sensor can be adaptively adjusted according to the movement speed of the shooting target. When the movement speed of the shooting target is slower, the frame rate of the motion sensor is reduced. When the movement speed of the shooting target is faster, the frame rate of the motion sensor increases.
- the minimum frame rate of the motion sensor must be greater than or equal to the frame rate of the RGB sensor.
- the maximum frame rate of the motion sensor can be less than or equal to the preset frame rate, such as 1000fps.
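- A minimal sketch of this adaptive behavior, assuming a simple proportional rule (the tuning constant fps_per_unit_speed is hypothetical), with the bounds stated above:

```python
def adapt_motion_sensor_fps(motion_speed: float, rgb_fps: float = 30.0,
                            max_fps: float = 1000.0,
                            fps_per_unit_speed: float = 50.0) -> float:
    """Faster targets -> higher frame rate, clamped to [rgb_fps, max_fps]."""
    desired = motion_speed * fps_per_unit_speed
    return max(rgb_fps, min(desired, max_fps))

print(adapt_motion_sensor_fps(0.1))   # slow target: clamped up to 30.0 (RGB rate)
print(adapt_motion_sensor_fps(50.0))  # fast target: clamped down to 1000.0
```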
- the motion sensor only captures image changes, so the output data is less, and the power consumption requirement is lower even if it works for a long time.
- the motion sensor generates events in response to motion changes. Since static areas will not trigger events, most events are generated in areas where moving objects exist.
- the specific principle of the motion sensor is: there is an independent photoelectric sensing module for each pixel in the motion sensor. When the brightness change at the pixel exceeds the set threshold, event data (sometimes also called pulse data) will be generated and output.
- When the shooting picture includes a shooting target in a moving state, the motion sensor generates and outputs event data only when the brightness of a pixel changes. In this way, redundant data (such as data corresponding to pixels with no brightness change) can be greatly reduced, thereby improving the computational efficiency of the post-processing algorithm.
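- The per-pixel principle can be sketched as follows; using log intensity and the threshold value 0.2 are assumptions for illustration, not the sensor's actual circuit:

```python
import math

def pixel_events(samples, threshold: float = 0.2):
    """samples: iterable of (timestamp, intensity) for one pixel.
    Yields (t, polarity) events when the log-intensity change since the
    last event exceeds the threshold; static input yields no data."""
    it = iter(samples)
    _, first_intensity = next(it)
    ref = math.log(first_intensity)
    for t, intensity in it:
        delta = math.log(intensity) - ref
        if abs(delta) > threshold:
            yield (t, +1 if delta > 0 else -1)
            ref = math.log(intensity)   # reset the reference after each event

# Constant brightness produces nothing; a rise then a fall produce two events.
print(list(pixel_events([(0, 1.0), (10, 1.0), (20, 1.5), (30, 0.9)])))
```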
- the subsequent focusing process can be performed by obtaining the RGB data collected by the RGB sensor.
- Step S302 The mobile phone performs scene detection and target detection according to the RGB data to determine first target detection information and scene information of the photographed target.
- the RGB data collected by the RGB sensor includes N frames of RGB images, where N is a positive integer greater than or equal to 1. After the mobile phone turns on the RGB sensor to output RGB data in real time, it can process the output RGB data in real time.
- the target detection module and the scene detection module can perform target detection and scene detection on each frame of RGB image in the RGB data to generate scene information and target detection information of the photographed target.
- the mobile phone performs target detection on the first frame of RGB image in the RGB data to determine the first target detection information of the photographed target; and performs scene detection based on the first frame of RGB image to determine the scene information of the photographed target.
- the first target detection information is used to indicate the subject in the shooting picture.
- the subject in the embodiment of the present application may be a person, an animal, a doll, etc. that the user is interested in.
- the range other than the subject can be called the shooting background.
- the first target detection information may include one or more of a first target frame of the photographed target, a size of an area corresponding to the first target frame, and coordinates corresponding to the first target frame.
- the shooting target may be a shooting subject.
- the scene information is used to indicate the shooting background in the shooting picture, and may include scene types, such as blue sky, forest, sun, building, moon and other scenes.
- the first target detection information is used to characterize the relevant information of the shooting target within the detection range of the RGB sensor, which can be understood as the static information of the shooting target.
- the shooting targets may include pedestrians, trees, and cars.
- the first target detection information may include the first target frame A corresponding to pedestrians, the area size and coordinates of the first target frame A; it may also include the first target frame B corresponding to trees, the area size and coordinates of the first target frame B; the first target frame C corresponding to cars, the area size and coordinates of the first target frame C.
- Scene information includes the sun.
- the shooting target includes a stationary shooting target and/or a moving shooting target.
- the mobile phone detects all shooting targets regardless of whether the shooting target is stationary or moving, and then generates target detection information of the shooting target. It is understandable that when there is a moving shooting target, the mobile phone can also perform moving target detection on the first frame of RGB image.
- a perception algorithm can be used to perform target detection and scene detection on every other frame or every few frames of RGB images in the RGB data.
- the target detection algorithm can be any one or a combination of Fast R-CNN, Faster R-CNN, R-FCN, YOLO, SSD, and RetinaNet, or other neural network or non-neural network algorithms.
- the embodiment of the present application does not specifically limit the detection algorithm corresponding to target detection and scene detection, and can be selected according to actual scene requirements and image requirements.
- Step S303 The mobile phone performs moving target detection according to the first event data to determine first motion information of the shooting target.
- the mobile phone can use two different processing methods to detect event data during the process of detecting event data.
- the asynchronous readout mode is exemplarily described below.
- each event is a point cloud data packet of <x, y, t, p>, where (x, y) represents the pixel position where the event occurs, t represents the time when the event occurs, and p represents the polarity information of the event.
- the interval between two adjacent events is very close, such as 1 ⁇ s.
- the mobile phone can segment the first event data to generate multiple event frame images. If the motion rate corresponding to the shooting target changes, the number of frames of the event frame image generated within a unit time length is adjusted according to the motion rate; the number of frames of the event frame image is positively correlated with the motion rate corresponding to the shooting target. Afterwards, the mobile phone can perform moving target detection on the event frame image. It can be understood that in the embodiment of the present application, the number of frames of the event frame image generated within a unit time length is related to the motion rate of the shooting target. The greater the motion rate corresponding to the shooting target, the more frames of the event frame image generated within the unit time length. Conversely, the smaller the motion rate corresponding to the shooting target, the fewer frames of the event frame image generated within the unit time length.
- the mobile phone can divide the first event data according to the preset event data volume to generate multiple event frame images; wherein each event frame image corresponds to the preset event data volume.
- the mobile phone can perform frame processing on the first event data to obtain event frame images. For example, by counting the event rate (Event Rate), a fixed number of events is synthesized into one event frame image.
- FIG. 6 is a schematic diagram of merging frames using a fixed event rate.
- the embodiment of the present application can split the first event data into multiple frames.
- Referring to Figure 6, each black dot represents an event signal, that is, a point cloud data packet of <x, y, t, p>.
- one frame of event frame image can be synthesized according to every 5 event signals. That is, the preset event data volume is 5. Afterwards, the synthesized event frame image is used as the input event data for subsequent detection.
- the embodiment of the present application only uses the above-mentioned 5 event signals as an example, which does not represent the actual number of events. The number of events can be designed and adjusted according to the actual usage scenario.
- the embodiment of the present application does not limit the specific preset amount of event data for synthesizing one frame, which can be adaptively adjusted according to the actual application scenario. For example, the faster the shooting target moves, the more event data the motion sensor collects, the less event data is synthesized into one frame, and the shorter the frame processing time.
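- Both framing strategies described above (a fixed number of events per frame, or a preset period per group) can be sketched as follows; the per-pixel polarity accumulation used for rendering is one common choice, assumed here for illustration:

```python
import numpy as np

def frames_by_count(events, n_per_frame=5):
    """Split time-ordered (x, y, t, p) events into frames of n_per_frame
    events each (a trailing partial frame is dropped)."""
    return [events[i:i + n_per_frame]
            for i in range(0, len(events) - n_per_frame + 1, n_per_frame)]

def frames_by_period(events, period_us=100):
    """Group events into consecutive windows of period_us microseconds."""
    groups = {}
    for x, y, t, p in events:
        groups.setdefault(t // period_us, []).append((x, y, t, p))
    return [groups[k] for k in sorted(groups)]

def accumulate(frame_events, width=8, height=8):
    """Render one group of events as an image by summing polarity per pixel."""
    img = np.zeros((height, width), dtype=np.int32)
    for x, y, _, p in frame_events:
        img[y, x] += p
    return img

events = [(1, 1, t, +1) for t in range(0, 500, 50)]   # 10 events, 50 us apart
print(len(frames_by_count(events)), len(frames_by_period(events)))  # 2 5
print(accumulate(frames_by_count(events)[0])[1, 1])                 # 5
```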
- the frame processing method includes different methods such as cumulative summation, cumulative summation and polarity, voxel grid (Voxel Grid), time surface (Time Surface), etc.
- the embodiment of the present application does not specifically limit the frame processing method.
- the mobile phone may also set a preset period, and divide the first event data into multiple groups of event data according to the preset period. Then, subsequent detection is directly performed on the event data within each preset period.
- the preset period is 100 ⁇ s. In this way, the event data corresponding to each 100 ⁇ s is taken as a group of event data, and subsequent detection is performed.
- the synchronous readout mode is exemplarily described below.
- When the motion sensor readout mode is the synchronous readout mode, the amount of data read for each frame of the motion sensor is the same, and the amount of data remains unchanged over time. Meanwhile, the frames are output at equal time intervals. Since the motion sensor outputs high-frame-rate event frames, there is no need to perform frame merging processing on the event data; the mobile phone can directly perform subsequent detection on one or more event frames.
- the first event data collected by the motion sensor can obtain M frames of event frame images, where M is a positive integer greater than or equal to 1. That is, the event frame image is an image generated by the above-mentioned first event data, specifically including an image generated by the corresponding motion trajectory information when the photographed target is in motion, or the event frame image can be used to identify information when the photographed target moves within the detection range of the motion sensor within a period of time.
- Figure 7 shows the event detected by the motion sensor, that is, the motion sensor can detect moving objects within the detection range and the corresponding outlines and positions.
- after the mobile phone turns on the motion sensor to output the first event data in real time, it can process the output first event data in real time.
- the motion target detection module can perform motion target detection and target tracking detection on each event frame image in the first event data, such as performing motion target detection on the first event data to determine the first motion information of the shooting target. That is, the mobile phone can use the motion sensor to monitor the movement of the shooting target within the detection range, use the monitored shooting target as the tracking target, and then obtain the motion information of the shooting target when it moves.
- the shooting target can be an object moving within the detection range of the motion sensor, and the number of shooting targets can be one or more.
- target tracking refers to tracking a moving target in a continuous image sequence to obtain motion information of the moving target in each frame of the image, so as to facilitate the subsequent determination of the motion trajectory of the moving target.
- the mobile phone can use a tracking algorithm to perform target tracking detection on each event frame image, wherein the tracking algorithm can be a centroid tracking algorithm (centroid), a correlation tracking algorithm (correlation) or an edge tracking algorithm (edge), etc., and the embodiments of the present application do not impose any restrictions on this.
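- As a minimal sketch of one of the tracking algorithms named above, centroid tracking takes the target's per-frame position to be the centroid of its event pixels; the array-based event frame format is an assumption:

```python
import numpy as np

def centroid(event_frame: np.ndarray):
    """event_frame: 2D array, nonzero where events fired. Returns (x, y)."""
    ys, xs = np.nonzero(event_frame)
    if len(xs) == 0:
        return None                      # no events: target not visible
    return (float(xs.mean()), float(ys.mean()))

def track(frames):
    """Per-frame centroid trajectory of the moving target."""
    return [centroid(f) for f in frames]

f1 = np.zeros((6, 6)); f1[2, 2] = f1[2, 3] = 1
f2 = np.zeros((6, 6)); f2[3, 3] = f2[3, 4] = 1
print(track([f1, f2]))   # [(2.5, 2.0), (3.5, 3.0)]: moved down-right
```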
- the first motion information may include one or more of a second target frame corresponding to the photographed target, a region size corresponding to the second target frame, coordinates corresponding to the second target frame, and a confidence level corresponding to the photographed target.
- the motion information is used to represent information when the photographed target moves within the detection range of the motion sensor.
- the motion information may include the second target frame A corresponding to the car, the area size corresponding to the second target frame A, the coordinates corresponding to the second target frame A, and the confidence level of 0.98 corresponding to the photographed target; the second target frame B corresponding to the pedestrian, the area size corresponding to the second target frame B, the coordinates corresponding to the second target frame B, and the confidence level of 0.94 corresponding to the photographed target; the second target frame C corresponding to the two-wheeled vehicle, the area size corresponding to the second target frame C, the coordinates corresponding to the second target frame C, and the confidence level of 0.87 corresponding to the photographed target.
- the first motion information may include one or more of a motion profile corresponding to the target, a region size corresponding to the motion profile, coordinates corresponding to the motion profile, and a confidence level corresponding to the target.
- FIG. 9 is a schematic diagram of a mask of event data provided in an embodiment of the present application.
- the first motion information may include a motion profile corresponding to a pedestrian, a region size corresponding to the motion profile, coordinates corresponding to the motion profile, and a confidence level corresponding to the pedestrian.
- when the mobile phone performs the focus task later, it needs to calculate the average depth value in the second target frame. Since the second target frame includes some other areas in addition to the shooting target itself, the average depth value of the motion contour area corresponding to the shooting target is more accurate than the average depth value of the whole second target frame.
- the first motion information may include not only the second target frame, but also one or more of the motion contour area corresponding to the shooting target, the area size corresponding to the motion contour, or the coordinates corresponding to the motion contour, and the confidence level corresponding to the shooting target.
- the embodiments of the present application do not specifically limit the specific content included in the motion information.
- the first motion information may further include information corresponding to a motion trajectory when the photographed target moves within the detection range of the motion sensor.
- the first motion information may further include one or more of the motion direction of the target, the motion speed of the target, and the motion type of the target; the motion type includes at least one or more of reciprocating motion, rotational motion, lateral motion, longitudinal motion, or bouncing motion.
- the motion speed may be a change trend of the speed of the shooting target in the current event frame image compared to the shooting target in the previous event frame image, including but not limited to speed trend state quantities such as faster, slower, and even more levels of speed trend state quantities, such as fast, faster, very fast, slow, slower, very slow, etc.
- the motion direction may also be a change in direction of the shooting target in the current event frame image compared to the shooting target in the previous event frame image, including but not limited to direction trend state quantities such as left, right, up, down, unchanged, and even more levels of direction trend state quantities, such as upper left, lower left, upper right, lower right, left, right, up, down, unchanged, etc.
- the motion type may be a motion change trend of the shooting target in the current event frame image compared to the shooting target in the previous event frame image.
- the mobile phone can segment the event data collected by the motion sensor according to the preset number of event frame images. Afterwards, multiple sets of preset number of event frame images are accumulated, and a series of motion trajectories in the multiple sets of preset number of event frame images are analyzed. By calculating optical flow, motion vectors, etc., the motion characteristics of the moving target, such as the motion direction, motion speed, and other information, are obtained.
- the first event data may be divided into a plurality of groups of event frame images with a preset number of frames, such as k groups; each group with the preset number of frames may correspond to, for example, two event frame images.
- the division method may be based on a set number of frames, a random number of frames, or a change in motion trajectory, etc., which may be adjusted according to the actual application scenario.
- the positions of the events in each group of preset number of event frame images are analyzed to determine the region where the shooting target is located in each two frames of event frame images, such as the motion region in the first group of two frames of event frame images is motion region A, and the motion region in the second group of two frames of event frame images is motion region B.
- the first motion information of the shooting target, such as motion direction, motion speed, and motion type, is determined according to the change from motion region A to motion region B.
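- A minimal sketch of deriving motion direction and speed from the change of motion region A to motion region B; representing each region by its center and the coarse direction labels are assumptions for illustration:

```python
import math

def motion_features(region_a, region_b, dt_s: float):
    """region_*: (cx, cy) centers of the motion regions found in two groups
    of event frame images; dt_s: time between the groups in seconds.
    Returns a coarse direction label and a speed in pixels per second."""
    dx = region_b[0] - region_a[0]
    dy = region_b[1] - region_a[1]
    speed = math.hypot(dx, dy) / dt_s
    if dx == 0 and dy == 0:
        direction = "unchanged"
    elif abs(dx) >= abs(dy):
        direction = "right" if dx > 0 else "left"
    else:
        direction = "down" if dy > 0 else "up"
    return direction, speed

print(motion_features((10.0, 20.0), (16.0, 20.0), 0.01))  # ('right', 600.0)
```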
- the motion target detection algorithm can be any one or a combination of background difference method, inter-frame difference method, mixed Gaussian model, optical flow method, block matching and optical flow estimation, or other neural network algorithms or non-neural network algorithms.
- the embodiments of the present application are not limited to this and can be selected according to actual scene requirements and image requirements.
- Step S304 The mobile phone updates the first target detection information based on the first motion information of the photographed target.
- Since the motion of the target has already occurred, if the RGB sensor is triggered to collect RGB images and perform target detection on them based only on the current area and motion characteristics of the target, the target may have already entered the next position or state, and the target detection result of the RGB image will be delayed.
- the focus task for the second frame RGB image is performed based on the target detection information of the first frame RGB image.
- the focus task for the third frame RGB image is performed based on the target detection information of the second frame RGB image.
- the operating frame rate of the RGB sensor is usually no more than 24fps to 30fps. Taking an operating frame rate of 15fps as an example, even if the moving target detection and target detection results for the RGB image are accurate, when the focus task for the second frame RGB image is performed based on the target detection information of the first frame RGB image, the target detection information corresponding to the shooting target to be focused is already at least 1/15s (i.e. 0.0666s) old.
- the time difference between the target detection information of the first frame of the RGB image and the subsequent execution of the focus task is greater than 1/15s; and in the process of the RGB sensor performing motion target detection and target detection, there will be a delay in the channel, and the target detection result of the RGB image will be more delayed.
- the position of the shooting target in the event frame image may be closer to the actual position of the shooting target in the shooting scene than the position of the shooting target in the RGB image.
- the position of the shooting target in the event frame image and the position of the shooting target in the RGB image may also be the same.
- the RGB sensor captures the first frame of RGB images
- the motion sensor captures the first event data.
- the capture time of the first event data is before the capture time of the second frame of RGB images
- the first event data includes the first frame of event frame images.
- the position of the shooting target in the first frame of event frame image is the same as the position of the shooting target in the first frame of RGB image.
- the motion sensor captures the second frame of event frame image.
- at this time, the RGB sensor has not yet captured the next RGB image.
- the shooting target has moved, and the position of the shooting target in the second frame of event frame image is closer to the actual position of the shooting target in the shooting scene than the position of the shooting target in the RGB image.
- the detection accuracy of the motion target detection of the shooting target in the event frame image is higher than that of the target detection of the shooting target in the RGB image. Therefore, the first frame RGB image can be updated by the motion information of the second frame event frame image, that is, the second frame event frame image guides the target detection information of the second frame RGB image, and the update of the first target detection information in the first frame RGB image is completed, so that the first target detection information is more accurate and closer to the motion state of the shooting target in the real scene.
- the mobile phone in a scenario where there is a shooting target in motion, can update the first target detection information based on the first motion information of the shooting target.
- the mobile phone can update the target detection information of the RGB image based on the motion information of the event frame image. It can also be understood that under the premise that the acquisition time of the event frame image is earlier than the acquisition time of the RGB image, the mobile phone can guide the target detection information of the RGB image based on the motion information of the event frame image.
- the first target detection information corresponding to the first RGB image is updated based on the motion information corresponding to the current event frame image. Then, the second RGB image is guided.
- the mobile phone can update the first target frame of the first target detection information based on the second target frame of the first motion information.
- the specific updating process can be: based on the position of the second target frame in the first motion information, adjust the position of the first target frame in the first target detection information.
- the second target frame in the first motion information is closer to the real position of the target, that is, the second target frame of the first motion information is used to represent the position B of the target in the shooting picture; and the first target frame of the first target detection information is used to represent the position A of the target in the shooting picture. Therefore, the second target frame is used to update the first target frame, and the position of the target in the shooting picture is updated.
- the updated first target frame is closer to the real position of the target in the shooting scene.
- the first target frame of the first target detection information can also be updated based on the motion profile of the first motion information.
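- The update step can be sketched as follows, assuming an (x, y, w, h) box format; keeping the detected box size while adopting the newer position is one simple choice, not the only possible update:

```python
def update_target_frame(first_frame, second_frame):
    """Move the first target frame (RGB detection, position A) to the
    position of the second target frame (event motion info, position B)."""
    _, _, w, h = first_frame
    x2, y2, _, _ = second_frame
    return (x2, y2, w, h)

first = (100, 80, 40, 90)    # position A from the first frame RGB image
second = (112, 78, 38, 92)   # position B from the newer event frame image
print(update_target_frame(first, second))   # (112, 78, 40, 90)
```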
- the acquisition time of the second event frame image is earlier than that of the second RGB image; the mobile phone can guide the target detection information of the second RGB image based on the motion information of the second event frame image.
- the acquisition time of the fourth event frame image is earlier than that of the third RGB image; the mobile phone can guide the target detection information of the third RGB image based on the motion information of the fourth event frame image.
- Tests show that for fast-moving targets, when the equivalent frame rate of the motion sensor reaches 200fps, the time difference between the target detection information of the RGB image and the subsequent execution of the focus task is less than 5ms. When the equivalent frame rate of the motion sensor reaches 1000fps, the time difference between the target detection information of the RGB image and the subsequent execution of the focus task is less than 1ms.
- the first motion information of the shooting target is ahead of the first target detection information in time, so the motion situation of the shooting target can be obtained in advance.
- an event frame image whose acquisition time is earlier than that of the second frame RGB image is used to guide the second frame RGB image.
- the first frame RGB image is updated using an event frame image whose acquisition time is earlier than that of the second frame RGB image. Then, the time interval between the second frame RGB image and the updated first frame RGB image is reduced. And guiding the RGB image based on motion information that is closer to the real scene is more accurate than relying solely on the target detection result corresponding to the RGB image, so as to improve the focusing accuracy and success probability in the subsequent focusing process.
- the frame rate of the motion sensor can be adaptively adjusted according to the motion speed of the shooting target.
- the frame rate of the motion sensor increases.
- the mobile phone updates the target detection information of the RGB image based on the motion information of the event frame image, it can use the more accurate position of the target to update the target detection information of the RGB image. This can improve the focus accuracy and success rate in the subsequent focusing process.
- the equivalent frame rate of the motion sensor can be adaptively increased.
- the time interval between frames can be adaptively segmented and output according to the data volume corresponding to the event data.
- event frame images are output between the first RGB image frame and the second RGB image frame
- 5 event frame images are output between the second RGB image frame and the third RGB image frame. It is understandable that the faster the movement speed of the shooting target, the shorter the time interval between frames, and the more event frame images are output per unit time, so the target detection information of the shooting target is more accurate.
- the embodiments of the present application only use the numbers indicated in the figure as an exemplary illustration, which does not represent the actual number of event frames.
- the number of event frame images in the actual time interval between frames can be 30, 40, 50 or 100, etc.
- Step S305 The mobile phone determines a first ROI focus area according to the scene information, the first motion information and the updated first target detection information.
- the mobile phone can determine the region of interest (ROI) corresponding to the shooting target, that is, the first ROI focus area, according to the actual size and position of the shooting target in the shooting picture, so as to facilitate subsequent focusing according to the size and position of the first ROI focus area.
- the ROI selection module can filter out the first ROI focus area according to the scene information, the second target frame in the first motion information, and the first target frame in the updated target detection information.
- the mobile phone can also present the first ROI focus area on the display screen.
- the first ROI focus area may be screened from the scene information, the second target frame in the first motion information, and the first target frame in the updated target detection information according to a preset screening rule.
- the first ROI focus area may also be screened from the second target frame in the first motion information and the first target frame in the updated first target detection information based on a preset screening rule.
- the preset screening rule may include setting priorities corresponding to the first motion information, the first target detection information, and the scene information, and determining the first ROI focus area according to the information corresponding to the highest priority.
- the priority of the first motion information is greater than the priority of the first target detection information, and the first target detection information is greater than the scene information.
- the scene information includes the sun; the first target frame in the first target detection information includes the target frame corresponding to the car; the second target frame in the first motion information includes the target frame corresponding to the pedestrian; based on the preset screening rule, the target frame corresponding to the pedestrian is determined as the first ROI focus area, and the first ROI focus area is displayed.
- the preset screening rule further includes comparing the priorities corresponding to the first motion information and the first target detection information, and determining the first target frame or the second target frame with a higher priority as the first ROI focus area.
- the preset screening rules may also include detecting the correlation between the first motion information and the scene information and the correlation between the first target detection information and the scene information, and determining the first target frame or the second target frame with a high correlation as the first ROI focus area.
- the second target frame in the first motion information includes a target frame corresponding to a pedestrian and a target frame corresponding to a pet; the scene information includes a blue sky. Based on a preset screening rule, in a scene of blue sky, the target frame corresponding to the pedestrian is determined as the first ROI focus area, and the first ROI focus area is displayed.
- the second target frame in the first motion information includes a target frame corresponding to a pedestrian and a target frame corresponding to a pet; the scene information includes grass. Based on a preset screening rule, in a scene of grass, the target frame corresponding to the pet is determined as the first ROI focus area.
- the preset screening rule may also include determining a first ROI focus area according to the distances from the center points of the first target frame in the first target detection information and the second target frame in the first motion information to the center point of the captured image. When the distance from the center point of the target frame to the center point of the captured image is less than or equal to a preset threshold, the target frame is determined to be the first ROI focus area.
- the first target frame in the first target detection information includes a target frame corresponding to a car;
- the second target frame in the first motion information includes a target frame corresponding to a pedestrian and a target frame corresponding to a puppy;
- the center point of the target frame corresponding to the pedestrian is less than or equal to a preset threshold from the center point of the captured image.
- the target frame corresponding to the pedestrian is determined as the first ROI focus area, and the first ROI focus area is displayed.
- the preset screening rule may further include setting a priority corresponding to the shooting target, and determining the target frame corresponding to the shooting target with the highest priority as the first ROI focus area.
- the objects to be photographed include: trees, people, cats, dogs, birds, bicycles, buses, motorcycles, trucks, cars, trains, boats, puppets, kites, balloons, robot vacuum cleaners, smart TVs, plates, cups, and handbags.
- the priority of the object category to which the object belongs can be divided into four levels, with humans as the first priority, animals as the second priority, vehicles as the third priority, and the rest as the fourth priority.
- the first target frame in the target detection information includes the target frame corresponding to the motorcycle; the second target frame in the motion information includes the target frame corresponding to the dog; based on the preset screening rules, the target frame corresponding to the dog is determined as the first ROI focus area.
- the preset screening rule may also include comparing the confidence of the photographed target in the second target frame with the confidence of the photographed target in the first target frame, and determining the first target frame or the second target frame with a high confidence as the first ROI focus area.
- the confidence level of a target such as a pedestrian in the first motion information is 0.94
- the confidence level of a target such as a two-wheeled vehicle in the first target detection information is 0.87.
- the target frame corresponding to the pedestrian is determined as the first ROI focus area.
- the preset screening rules may also include detecting whether the first motion information includes a motion type.
- the second target frame corresponding to the first motion information is adjusted, and the second target frame is determined as the first ROI focus area.
- the second target frame in the first motion information includes a target frame corresponding to a pedestrian and a target frame corresponding to a puppy; wherein the target frame corresponding to the pedestrian corresponds to a motion type such as rotational motion.
- the target frame corresponding to the pedestrian is reduced according to a preset ratio, and the reduced target frame corresponding to the pedestrian is determined as the first ROI focus area.
- the above-mentioned preset screening rules can be combined, and the first ROI focus area is determined according to the combined preset screening rules.
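- One way to combine the preset screening rules above is to rank candidate target frames by category priority, then confidence, then distance from the picture center; the dictionary format and field names are assumptions for illustration:

```python
import math

PRIORITY = {"person": 1, "animal": 2, "vehicle": 3, "other": 4}

def select_roi(candidates, picture_center):
    """candidates: list of dicts with keys 'category', 'confidence', 'center'.
    Returns the candidate chosen as the first ROI focus area."""
    def rank(c):
        dist = math.dist(c["center"], picture_center)
        return (PRIORITY.get(c["category"], 4), -c["confidence"], dist)
    return min(candidates, key=rank)

candidates = [
    {"category": "vehicle", "confidence": 0.98, "center": (300, 260)},
    {"category": "person",  "confidence": 0.94, "center": (330, 250)},
]
# The pedestrian wins: category priority outranks the car's higher confidence.
print(select_roi(candidates, picture_center=(320, 240)))
```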
- the preset screening rules of the first ROI focus area in the embodiment of the present application can also be combined with other preset strategies to provide different first ROI focus area determination methods in different scenarios.
- the embodiment of the present application does not limit the first ROI focus area determination method.
- Step S306 The mobile phone focuses the camera according to the first ROI focus area to capture a second frame of RGB image.
- the mobile phone after determining the first ROI focus area, can automatically focus on the first ROI focus area.
- through the AF algorithm, the autofocus (AF) module can adjust the focal length of the lens and control the focus motor of the camera to move to the focus position based on the relevant information of the first ROI focus area, such as the position information of its four sides in the RGB image.
- a preview RGB image is collected and displayed. The user can see the preview RGB image after focusing in the preview interface.
- the AF module calculates the average depth value in the first ROI focus area. Then, the AF module determines the distance the focus motor of the camera moves and the focus position according to the average depth value. Finally, the focus motor of the camera is controlled to move the distance and move to the focus position. After the camera is controlled to focus, a second frame of RGB image is collected.
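- A minimal sketch of this focusing step, assuming a depth image (e.g. from a TOF sensor) aligned with the RGB image; the linear depth-to-motor mapping is hypothetical, since a real AF module uses a calibrated lens model:

```python
import numpy as np

def focus_position_from_roi(depth_image: np.ndarray, roi):
    """roi: (x, y, w, h) in depth-image coordinates. Returns the average
    depth in the ROI and a motor position derived from it."""
    x, y, w, h = roi
    avg_depth = float(depth_image[y:y + h, x:x + w].mean())
    # Hypothetical mapping: nearer targets need larger motor displacement.
    motor_position = int(1000 / max(avg_depth, 0.1))
    return avg_depth, motor_position

depth = np.full((480, 640), 2.0)     # background about 2 m away
depth[200:280, 300:380] = 1.2        # the target inside the ROI is nearer
print(focus_position_from_roi(depth, (300, 200, 80, 80)))   # (1.2, 833)
```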
- the embodiment of the present application only takes the first frame RGB image and the second frame RGB image as examples, and during the focusing process, the second frame RGB image is focused according to the first frame RGB image.
- the AF module may query the first ROI focus area in the first frame RGB image for the corresponding area in the depth image collected by the time of flight (TOF) sensor; then, the average depth value of the first ROI focus area is obtained through the area corresponding to the first ROI focus area in the depth image.
- the autofocus module may also use the phase detection (PD) information with variable filter length in the RGB image to calculate the average depth value of the first ROI focus area.
- the first target detection information corresponding to the first frame RGB image is updated according to the first motion information corresponding to the first event data.
- the first ROI focus area is determined according to the scene information, the first motion information and the updated first target detection information.
- the focus motor is controlled to complete the focus of the lens according to the first ROI focus area. Because the first motion information can update the first target detection information in advance, the first target detection information can be made closer to the shooting target in the real scene. Therefore, in the process of determining the first ROI focus area and subsequent focusing, the first ROI focus area and the focus position to be focused can be determined more accurately. Then, the shooting target in the shooting picture can be focused in time and accurately. In particular, it can effectively improve the focusing accuracy and focusing speed of different shooting targets in complex scenes (such as scenes with shooting targets in motion). It is convenient for subsequent clear shooting of the shooting target, and improves the image quality of the shot image and the shooting success rate.
- the method performed by the mobile phone may further include:
- Step S307 The mobile phone predicts the motion of the shooting target and determines the prediction information of the shooting target.
- the prediction module in the mobile phone can predict the position of the future shooting target based on the event data.
- the prediction module determines the prediction information of the shooting target according to the predicted position of the shooting target.
- the prediction information is used to represent the predicted motion information of the photographed target, which may include the predicted location of the photographed target.
- the prediction information includes a predicted target frame corresponding to the photographed target, a size of the predicted target frame, coordinates of the predicted target frame, and a confidence level of the predicted target frame. It is understandable that the predicted photographed target is in motion.
- the motion trajectory of the future shooting target can be predicted according to the second motion information corresponding to the second event data.
- Recall the focusing process above: moving-target detection is performed according to the first event data to determine the first motion information of the shooting target, and target detection is performed according to the first frame of the original image to determine the first target detection information of the shooting target.
- Here, the first event data is the event data collected between the first moment and the second moment, where the first moment is the acquisition moment of the first frame RGB image and the second moment is the acquisition moment of the second frame RGB image.
- The first ROI focus area is determined according to the scene information, the first motion information and the updated first target detection information; the focusing process is then completed, and the second frame of the original image is collected at the second moment.
- During prediction, the prediction module makes its prediction based on the second motion information corresponding to the second event data.
- The second event data is the event data collected between the first moment and a third moment, where the third moment lies between the first moment and the second moment, and at least one third moment exists between the first moment and the second moment.
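A minimal sketch of how the second event data could be sliced out of the event stream by these moments, using the `<x, y, t, p>` event layout described earlier in the document; the timestamps below are made up for illustration:

```python
def events_in_window(events, t_start, t_end):
    """Select <x, y, t, p> events with t_start <= t < t_end."""
    return [e for e in events if t_start <= e[2] < t_end]

events = [(10, 20, 0.001, +1), (11, 20, 0.012, -1), (40, 60, 0.030, +1)]
t1, t3, t2 = 0.0, 0.02, 0.033       # first, third and second moments
second_event_data = events_in_window(events, t1, t3)
print(second_event_data)            # events available before the next RGB frame
```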
- Specifically, the mobile phone can perform moving-target detection based on the second event data and determine the second motion information of the shooting target.
- The prediction module can then predict the motion of the shooting target based on this second motion information and determine the prediction information of the shooting target.
- In this way, the mobile phone can keep tracking the shooting target after completing focusing on it in the first frame RGB image.
- During prediction, the prediction module can predict the motion trajectory of the shooting target within a preset future time period based on at least one of the trajectory, direction, speed and type of the target's motion within the detection range, and thereby obtain the prediction information of the shooting target. In another implementable manner, a function describing how the center point of the shooting target's position changes over time may be fitted from the monitored trajectory, direction and/or speed of the target within the detection range; a predicted center point is then computed from this function, and the prediction information is determined from that predicted center point.
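A minimal sketch of the center-point fitting variant just described, using a linear least-squares fit per axis; the linear form of the change function and the sample values are assumptions, since the patent does not fix the form of the fitted function:

```python
import numpy as np

def predict_center(times, centers, t_future):
    """Fit center-vs-time with a linear least-squares model and extrapolate.

    times:   1-D sequence of observation timestamps (s)
    centers: (N, 2) sequence of box-center positions at those timestamps
    """
    t = np.asarray(times, dtype=float)
    c = np.asarray(centers, dtype=float)
    kx, bx = np.polyfit(t, c[:, 0], 1)     # x(t) = kx * t + bx
    ky, by = np.polyfit(t, c[:, 1], 1)     # y(t) = ky * t + by
    return kx * t_future + bx, ky * t_future + by

times = [0.00, 0.01, 0.02, 0.03]
centers = [(100, 200), (104, 199), (108, 199), (112, 198)]
print(predict_center(times, centers, t_future=0.05))   # roughly (120, 197)
```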
- Step S308: The mobile phone performs a focus-tracking task based on the prediction information of the shooting target.
- Focus tracking is a continuous process: after the focusing task is completed, the device keeps focusing on the focus point so that it remains locked on. It can also be understood as keeping the shooting target in focus throughout the subsequent shooting process.
- Specifically, the mobile phone can perform target detection based on the first frame of the original image to determine the second target detection information of the shooting target; update the second target detection information based on the second motion information of the shooting target; determine the second ROI focus area based on the second motion information and the updated second target detection information; and finally control the camera's focus motor to track the shooting target based on the predicted target frame in the prediction information and the second ROI focus area.
- In practice, the prediction module sends the generated prediction information to the ROI selection module; after determining the second ROI focus area, the ROI selection module sends the second ROI focus area and the prediction information of the shooting target to the AF module.
- The AF module then performs the focus-tracking process based on the predicted target frame and the second ROI focus area.
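The patent does not specify how the AF module combines the predicted target frame with the second ROI focus area; the sketch below shows one plausible heuristic, a weighted blend of the two box centers, purely as an assumption (the weight `alpha` is invented for illustration):

```python
def track_focus(predicted_box, roi_box, alpha=0.6):
    """Blend the current ROI with the predicted box to aim the focus motor.

    Boxes are (x, y, w, h). Weighting the aim point toward the prediction
    lets the motor start moving before the target fully arrives.
    """
    px = predicted_box[0] + predicted_box[2] / 2
    py = predicted_box[1] + predicted_box[3] / 2
    rx = roi_box[0] + roi_box[2] / 2
    ry = roi_box[1] + roi_box[3] / 2
    aim = (alpha * px + (1 - alpha) * rx, alpha * py + (1 - alpha) * ry)
    return aim   # the AF module would convert this point to a motor position

print(track_focus((120, 190, 40, 80), (112, 195, 40, 80)))
```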
- For example, during focusing the first ROI focus area contains a pedestrian.
- During tracking, the second ROI focus area also contains that pedestrian and keeps moving and changing with the pedestrian.
- The mobile phone can also control the display screen to present the first ROI focus area and the prediction area, where the prediction area contains the predicted target frame from the prediction information of the shooting target.
- The mobile phone can display the first ROI focus area and the prediction area in different ways to distinguish them.
- For example, the first ROI focus area and the prediction area are marked in different colors (for example, one as a yellow rectangular box and the other as a red rectangular box), or with different line types (for example, one solid and the other dotted, or with different line thicknesses for the two marking boxes).
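For illustration only, the sketch below uses OpenCV drawing calls to render the two boxes differently, a solid yellow rectangle for the ROI focus area and red dashes for the prediction area; the box coordinates are made up, and OpenCV is an assumed stand-in for the phone's actual overlay rendering:

```python
import numpy as np
import cv2  # OpenCV, used here only for drawing

frame = np.zeros((480, 640, 3), dtype=np.uint8)
roi_box, pred_box = (200, 150, 80, 160), (260, 140, 80, 160)

# ROI focus area: solid yellow rectangle (BGR color order)
x, y, w, h = roi_box
cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 255), 2)

# Prediction area: red rectangle drawn as short dashes. OpenCV has no
# built-in dashed rectangle, so we step along each edge manually.
x, y, w, h = pred_box
for x0 in range(x, x + w, 12):
    cv2.line(frame, (x0, y), (min(x0 + 6, x + w), y), (0, 0, 255), 2)
    cv2.line(frame, (x0, y + h), (min(x0 + 6, x + w), y + h), (0, 0, 255), 2)
for y0 in range(y, y + h, 12):
    cv2.line(frame, (x, y0), (x, min(y0 + 6, y + h)), (0, 0, 255), 2)
    cv2.line(frame, (x + w, y0), (x + w, min(y0 + 6, y + h)), (0, 0, 255), 2)
```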
- The AF module may also send the first ROI focus area and the prediction area to applications with camera functions other than the camera application.
- Those applications may perform other image processing, such as frame-interpolation, deblurring and snapshot algorithms, based on the ROI focus area and the prediction area.
- In this way, the embodiment of the present application can predict the future position of the shooting target and perform subsequent focus tracking based on the prediction information. On the premise that the embodiment can focus in time, the shooting target can also be tracked quickly according to the prediction information, avoiding the situation in which, when the shooting target moves too fast, tracking delay causes the target to leave the frame or tracking to fail altogether. This also helps the user choose the best shooting moment later and capture sharper images.
- Step S309: The mobile phone detects the user's photo-taking operation, and generates and displays the first target image.
- In the preview state, the preview interface may include a shooting control 401.
- The user can tap the shooting control 401 in the preview interface; after detecting this photo-taking operation, the mobile phone starts shooting and generates the first target image, i.e., a clear photo taken by the user.
- For example, after detecting the photo-taking operation, the mobile phone captures the second frame RGB image and generates from it the first target image obtained by taking the photo.
- A thumbnail corresponding to the first target image is displayed in the first preset area of the shooting preview interface.
- The first target image can thus be the image the user obtains after taking the photo.
- The mobile phone can display the first target image on a shooting preview interface, where the shooting preview interface includes a photo preview interface or a video preview interface.
- The above embodiments take the mobile phone entering the photo preview state as an example; similarly, the mobile phone can also enter the video preview state.
- In some embodiments, the preview interface may include multiple mode controls 402, for example a video mode control, a portrait mode control and a panoramic mode control.
- When the mobile phone detects that the user taps the video mode control, it enters the video mode.
- The process in the video preview state is the same as the process in the shooting preview state described above and is not repeated here.
- In video mode, the user can still tap the shooting control 401 to trigger recording.
- After detecting the user's recording operation on the shooting control 401, the mobile phone starts the recording task; for example, the focusing and tracking process continues.
- After detecting the recording operation, the mobile phone collects the second frame of the original image and displays the first target image in the second preset area of the recording interface. Understandably, the video frames in the target video obtained by recording can be generated from the second frame of the original image, from which the final, sharp target video is produced. After the mobile phone generates a photo or finishes recording a video, it can save it in the memory and control the display control 403 to show the completed photo or video.
- In some embodiments, the motion sensor can be kept on in real time, or it can be turned on in response to a user's start operation, like the RGB sensor. It can also be turned on automatically, or the user can be prompted to turn it on, when the mobile phone recognizes a moving target in the RGB image.
- For example, the mobile phone can use AI algorithms (such as target detection, scene detection and moving-target detection algorithms) to analyze the RGB image currently captured by the camera; if a moving target is identified, the mobile phone automatically turns on the motion sensor.
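The AI recognition itself is beyond a short example; as a stand-in, the sketch below uses a crude frame-difference test to decide when to auto-enable the motion sensor and emit the first prompt message. The thresholds and the flag handling are assumptions for illustration, not the patent's detector.

```python
import numpy as np

MOTION_PIXEL_FRACTION = 0.02   # assumed threshold: 2% of pixels changing

def has_moving_target(prev_gray: np.ndarray, cur_gray: np.ndarray) -> bool:
    """Crude frame-difference stand-in for the AI moving-target detector."""
    diff = np.abs(cur_gray.astype(np.int16) - prev_gray.astype(np.int16))
    return (diff > 25).mean() > MOTION_PIXEL_FRACTION

motion_sensor_on = False
prev, cur = np.zeros((480, 640)), np.zeros((480, 640))
cur[200:280, 300:380] = 255                     # a bright object appears
if has_moving_target(prev, cur) and not motion_sensor_on:
    motion_sensor_on = True
    print("Motion shooting mode is now on")     # the first prompt message
```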
- When the mobile phone automatically turns on the motion sensor, it can also output a first prompt message telling the user that the motion sensor is now on; for example: "Motion shooting mode is now on."
- Alternatively, after recognizing a moving shooting target, the mobile phone outputs a second prompt message asking the user whether the motion sensor should be turned on; for example: "Turn on motion shooting mode?" If the user chooses to turn on the motion shooting mode, or no user instruction is received within a preset time period, the mobile phone turns on the motion sensor; otherwise, it does not.
- While the motion sensor is in use, either the asynchronous readout mode or the synchronous readout mode may be used; it is also possible to use the asynchronous readout mode for a certain period and then switch to the synchronous readout mode, or vice versa.
- the embodiments of the present application do not limit the readout mode used by the motion sensor.
- In one implementable manner, the motion sensor can continuously keep historical statistics on, and analyze in real time, the light-intensity change events generated in the pixel array circuit throughout the acquisition and parsing process. Once a switching condition is met, a mode-switch signal is sent to switch from the asynchronous readout mode to the synchronous readout mode, or from the synchronous readout mode to the asynchronous readout mode. This adaptive switching repeats until the reading of all event signals is complete.
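As an illustrative sketch of such event-rate-driven switching, the following controller averages a short history of event rates and flips mode when thresholds are crossed; the thresholds and the 10-sample window are assumptions, since the patent only states that historical statistics drive the switch:

```python
from collections import deque

HIGH_RATE = 50_000     # events/s above which frame (synchronous) readout wins
LOW_RATE = 5_000       # events/s below which event-stream (async) readout wins

class ReadoutController:
    """Switch between asynchronous and synchronous readout on event rate."""
    def __init__(self):
        self.mode = "async"
        self.history = deque(maxlen=10)

    def observe(self, events_per_second: float) -> str:
        self.history.append(events_per_second)
        avg = sum(self.history) / len(self.history)
        if self.mode == "async" and avg > HIGH_RATE:
            self.mode = "sync"      # dense motion: fixed-size frames are cheaper
        elif self.mode == "sync" and avg < LOW_RATE:
            self.mode = "async"     # sparse motion: read only the active pixels
        return self.mode

ctrl = ReadoutController()
for rate in (2_000, 80_000, 90_000, 1_000):
    print(ctrl.observe(rate))
```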
- The user can also select the readout mode manually. Illustratively, the asynchronous readout mode can be the first motion mode and the synchronous readout mode can be the second motion mode.
- The user can tap the mode-switch control 1501 in the camera application interface to select the first motion mode or the second motion mode, or to switch from one to the other in either direction.
- The user can also enter the properties interface of the camera application and tap the mode control displayed there to select the first motion mode or the second motion mode.
- For example, a Settings application icon is displayed on the home screen of the mobile phone, and the user can tap the Settings application icon.
- In response to this tap, the mobile phone displays the user interface of the Settings application, which lists all the applications installed on the phone; the user can tap the camera application in the list to open the property page corresponding to the camera application.
- The mobile phone then displays the property page of the camera application, which includes the first motion mode control and the second motion mode control for the camera application.
- The user can tap the first motion mode control to turn on the first motion mode for the camera application.
- In this way, the embodiment of the present application can output the first prompt message and the second prompt message, and let the user choose the motion sensor's readout mode.
- The user is thus aware of the activation of the motion sensor, or can authorize whether it may be activated.
- This increases the interactivity between the user and the mobile phone and improves the user experience.
- multiple embodiments of the present application can be combined, and the combined scheme can be implemented.
- some operations in the process of each method embodiment are optionally combined, and/or the order of some operations is optionally changed.
- the execution order between the steps of each process is only exemplary and does not constitute a restriction on the execution order between the steps.
- a person of ordinary skill in the art will think of a variety of ways to reorder the operations described in the embodiments of the present application.
- the process details involved in a certain embodiment of the present application are also applicable to other embodiments in a similar manner, or different embodiments can be used in combination.
- steps in the method embodiment may be equivalently replaced by other possible steps.
- some steps in the method embodiment may be optional and may be deleted in certain usage scenarios.
- other possible steps may be added to the method embodiment.
- An embodiment of the present application also provides an electronic device, such as the above-mentioned mobile phone.
- As shown in FIG. 17, the electronic device may include one or more processors 1610, a memory 1620 and a communication interface 1630.
- the memory 1620 , the communication interface 1630 , and the processor 1610 are coupled together.
- the memory 1620 , the communication interface 1630 , and the processor 1610 may be coupled together via a bus 1640 .
- the communication interface 1630 is used for data transmission with other devices.
- the memory 1620 stores computer program code.
- The computer program code includes computer instructions; when the computer instructions are executed by the processor 1610, the electronic device executes the shooting method in the embodiments of the present application.
- the processor 1610 can be a processor or a controller, for example, a central processing unit (CPU), a general processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA) or other programmable logic devices, transistor logic devices, hardware components or any combination thereof. It can implement or execute various exemplary logic blocks, modules and circuits described in conjunction with the present disclosure.
- the processor can also be a combination that implements computing functions, such as a combination of one or more microprocessors, a combination of DSP and microprocessors, and the like.
- the bus 1640 may be a Peripheral Component Interconnect (PCI) bus or an Extended Industry Standard Architecture (EISA) bus.
- the bus 1640 may be divided into an address bus, a data bus, a control bus, etc.
- For ease of representation, FIG. 17 uses only one thick line, but this does not mean that there is only one bus or only one type of bus.
- An embodiment of the present application further provides a computer-readable storage medium in which computer program code is stored; when the above processor executes the computer program code, the electronic device executes the relevant method steps in the method embodiments.
- An embodiment of the present application also provides a computer program product; when the computer program product runs on a computer, it causes the computer to execute the relevant method steps in the above method embodiments.
- The electronic device, computer storage medium and computer program product provided in this application are all used to execute the corresponding methods provided above; the beneficial effects they can achieve are therefore those of the corresponding methods and are not repeated here.
- the disclosed devices and methods can be implemented in other ways.
- The device embodiments described above are merely illustrative.
- The division into modules or units is only a division by logical function; other divisions are possible in actual implementation.
- For example, multiple units or components can be combined or integrated into another device, or some features can be ignored or not executed.
- In addition, the mutual couplings, direct couplings or communication connections shown or discussed may be implemented through interfaces, and the indirect couplings or communication connections between devices or units may be electrical, mechanical or of other forms.
- the units described as separate components may or may not be physically separated, and the components shown as units may be one physical unit or multiple physical units, that is, they may be located in one place or distributed in multiple different places. Some or all of the units may be selected according to actual needs to achieve the purpose of the present embodiment.
- each functional unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit.
- the above-mentioned integrated unit may be implemented in the form of hardware or in the form of software functional units.
- the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a readable storage medium.
- The technical solution of the embodiments of the present application, in essence, or the part that contributes to the prior art, or all or part of the technical solution, can be embodied in the form of a software product. The software product is stored in a storage medium and includes several instructions that enable a device (which may be a single-chip microcomputer, a chip, or the like) or a processor to execute all or part of the steps of the methods described in the embodiments of the present application.
- The aforementioned storage medium includes media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disc.
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Business, Economics & Management (AREA)
- General Business, Economics & Management (AREA)
- Human Computer Interaction (AREA)
- Computer Networks & Wireless Communication (AREA)
- Studio Devices (AREA)
- Image Analysis (AREA)
Abstract
An embodiment of the present application provides a shooting method and device, relating to the field of electronic technology. In response to a start operation on a shooting function, first event data collected by a motion sensor and a first frame of an original image collected by an image sensor are acquired. Moving-target detection is performed according to the first event data to determine first motion information of a shooting target, and target detection is performed according to the first frame of the original image to determine first target detection information of the shooting target. The first target detection information is then updated based on the first motion information of the shooting target. Finally, a first region-of-interest (ROI) focus area is determined according to the first motion information and the updated first target detection information, and the camera is focused according to the first ROI focus area to collect a second frame of the original image. This improves focusing accuracy and focusing speed, and thereby the user experience.
Description
This application claims priority to Chinese patent application No. 202310035833.0, entitled "一种拍摄方法及设备" ("A shooting method and device"), filed with the China National Intellectual Property Administration on January 10, 2023, the entire contents of which are incorporated herein by reference.
在本申请实施例中,在确定第一ROI对焦区域后,手机就可以根据该对第一ROI对焦区域进行自动对焦。在一种实现方式中,自动对焦(autofocus,AF)模块可以根据第一ROI对焦区域的相关信息,如第一ROI对焦区域四个边对应在RGB图像上的位置信息,通过AF算法,调整镜头的焦距,控制摄像头的对焦马达移动至对焦位置。在镜头移动至对焦位置后,采集并显示预览RGB图像。用户可以在预览界面中看到对焦后的预览RGB图像。
在另一种实现方式中,AF模块计算第一ROI对焦区域中的平均深度值。之后,AF模块根据平均深度值,确定摄像头的对焦马达移动的距离以及对焦位置。最后,控制摄像头的对焦马达移动距离以及移动至对焦位置。当控制摄像头进行对焦后,采集第二帧RGB图像。
可以理解的是,本申请实施例仅以第一帧RGB图像和第二帧RGB图像作为示例,在对焦过程中,根据第一帧RGB图像对第二帧RGB图像进行对焦。
示例性的,在AF模块获取第一ROI对焦区域的平均深度值的过程中,AF模块可以根据第一帧RGB图像中的第一ROI对焦区域,查询该第一ROI对焦区域在飞行时间(time of flight,TOF)传感器所采集的深度图像中对应的区域;之后,通过第一ROI对焦区域在深度图像中对应的区域,得到第一ROI对焦区域的平均深度值。再例如,在自动对焦模块获取第一ROI对焦区域的平均深度值的过程中,自动对焦模块还可以使用RGB图像中的滤波长度可变的相位检测(phase detection,PD)信息,计算第一ROI对焦区域的平均深度值。
可见,通过根据第一事件数据对应的第一运动信息,更新第一帧RGB图像对应的第一目标检测信息。并且根据场景信息、第一运动信息以及更新后的第一目标检测信息,确定第一ROI对焦区域。之后,根据第一ROI对焦区域控制对焦马达将镜头完成对焦。因为第一运动信息可以预先对第一目标检测信息进行更新,能够使第一目标检测信息更贴近在真实场景下的拍摄目标。所以在确定第一ROI对焦区域以及后续对焦的过程中,能够更精确的确定第一ROI对焦区域以及待对焦的对焦位置。进而能够及时且精准地对焦到拍摄画面中拍摄目标。尤其能够有效地提高在复杂场景(如存在包括运动状态的拍摄目标的场景)中,对不同的拍摄目标的对焦准确度以及对焦速度。便于后续对拍摄目标的清晰拍摄,提高了拍摄图像的图像质量以及拍摄的成片率。
在本申请实施例中,手机在执行完成对焦过程之后,还可以包括:
步骤S307、手机对拍摄目标的运动情况进行预测,确定拍摄目标的预测信息。
在本申请实施例中,手机中的预测模块可以基于事件数据,对未来拍摄目标的位置进行预测。预测模块根据预测的拍摄目标的位置,确定拍摄目标的预测信息。其中,预测信息用于表示预测
的拍摄目标的运动情况信息。运动情况信息可以包括拍摄目标所在的位置。例如,预测信息包括拍摄目标对应的预测目标框、预测目标框的大小、预测目标框的坐标以及预测目标框的置信度。可以理解的是,预测的拍摄目标处于运动状态。
在一种可实现的方式中,在预测模块对未来预设时长内拍摄目标的位置进行预测的过程中,可以根据第二事件数据对应的第二运动信息,对未来拍摄目标的运动轨迹进行预测。
具体地,参见图13,在上述对焦过程中,根据第一事件数据进行运动目标检测,确定拍摄目标的第一运动信息。根据第一帧原始图像进行目标检测,确定拍摄目标的第一目标检测信息。其中,第一事件数据为第一时刻与第二时刻之间采集到的事件数据,第一时刻为第一帧RGB图像的采集时刻,第二时刻为第二帧RGB图像的采集时刻。之后,根据场景信息、第一运动信息和更新后的第一目标检测信息,确定第一ROI对焦区域;进而完成对焦过程,并在第二时刻采集第二帧原始图像。
预测模块在预测的过程中,可以根据第二事件数据对应的第二运动信息进行预测。其中,第二事件数据为第一时刻与第三时刻之间采集到的事件数据,第三时刻位于第一时刻与第二时刻之间,且第一时刻与第二时刻之间包括至少一个第三时刻。具体地,手机可以根据第二事件数据进行运动目标检测,确定拍摄目标的第二运动信息。之后,预测模块可以根据拍摄目标的第二运动信息对拍摄目标的运动情况进行预测,确定拍摄目标的预测信息。由此,手机可以在完成对第一帧RGB图像中的拍摄目标的对焦后,持续对拍摄目标进行后续追焦。
在预测的过程中,预测模块可以根据拍摄目标的在检测范围内进行运动时的运动轨迹、运动方向、运动速度以及运动类型中的至少一个,对未来的预设时长内拍摄目标的运动轨迹进行预测,得到拍摄目标的预测信息。
在另一种可实现的方式中,还可以是根据监测到的拍摄目标在检测范围内进行运动时的运动轨迹、运动方向和/或运动速度,拟合出拍摄目标所在位置的中心点随时间变化的变化函数,然后根据该变化函数计算出预测中心点,并根据该预测中心点确定预测信息。
步骤S308、手机基于拍摄目标的预测信息,执行追焦任务。
本申请实施例中,追焦过程是一个变化的过程,是为在完成对焦任务后,继续对焦点进行连续对焦,一直保持这个对焦点处于被对准的状态。也可以理解为,在后续的拍摄过程中保持对拍摄目标的对焦的过程。
在预测模块确定拍摄目标的预测信息之后,手机可以根据第一帧原始图像进行目标检测,确定拍摄目标的第二目标检测信息。之后,基于拍摄目标的第二运动信息,更新拍摄目标的第二目标检测信息。接着,根据第二运动信息和更新后的第二目标检测信息,确定第二ROI对焦区域。最后,根据预测信息中的预测目标框和第二ROI对焦区域,控制摄像头的对焦马达对拍摄目标进行追焦。
也就是说,预测模块可以将生成的预测信息发送至ROI选择模块。也就是说,ROI选择模块在确定出第二ROI对焦区域之后,可以将第二ROI对焦区域和拍摄目标的预测信息发送至AF模块。AF模块在基于预测目标框和第二ROI对焦区域执行追焦过程。示例性的,在对焦过程中,第一ROI对焦区域包括行人。那么在追焦过程中,随着行人的移动状态,第二ROI对焦区域也包括该行人,并随着该行人持续移动且变化。当然,手机还可以控制显示屏呈现第一ROI对焦区域以及预测区域;其中,预测区域包括拍摄目标的预测信息中的预测目标框。
示例性的,参见图14,当拍摄画面中存在运动的行人时,用户可以在拍摄画面中看到行人对应的对焦区域以及预测区域。其中,手机可以采用不同的方式显示第一ROI对焦区域和预测区域以示区别。例如,第一ROI对焦区域和预测区域为不同颜色的标记(例如,一个为黄色矩形框,另一个为红色矩形框),或者为不同线型的标记(例如,一个为实线,另一个为虚线;或者两个标记框的线性粗细不同)等。
在一些实施例中,AF模块还可以将第一ROI对焦区域和预测区域发送给除了相机应用之外具有摄像功能的其他应用。其他应用可以基于ROI对焦区域和预测区域进行插帧算法、deblur算法以及抓拍算法等多种其他图像处理。
可见,本申请实施例能够对未来拍摄目标的位置进行预测,并且根据预测信息执行后续追焦
过程。由此,在上述本申请实施例可以及时进行对焦的前提下,也可以根据预测信息对拍摄目标迅速进行追焦。避免拍摄目标运动速度过快时,可能因为追焦延迟导致拍摄目标出镜以及甚至出现追焦失败的情况。同时,便于后续用户可以决策出最佳的拍摄时机,拍摄出更清晰的图像。
步骤S309、手机检测到用户的拍照操作,生成并显示第一目标图像。
在本申请的实施例中,在预览状态下,参见图4中的(B),预览界面上可以包括拍摄控件401。用户可以点击预览界面中的拍摄控件401,手机检测到用户的拍照操作后,开始拍摄并生成第一目标图像,即用户拍摄的清晰照片。例如,手机检测到拍照操作后,采集第二帧RGB图像并生成拍照得到的第一目标图像。之后,在在拍摄预览界面的第一预设区域,显示第一目标图像对应的缩略图。第一目标图像可以为用户拍照完成得到的图像。
可以理解的是,手机可以在拍摄预览界面上显示第一目标图像,拍摄预览界面包括拍摄拍照预览界面或录像预览界面。
上述本申请实施例以手机进入拍照预览状态作为示例,同理,手机还可以进入录像预览状态。在一些实施例中,预览界面上可以包括多个模式控件402。例如,录像模式控件、人像模式控件以及全景模式控件等。当手机检测到用户点击录像模式控件后,进入录像模式。手机在录像预览状态下涉及的过程与上述在拍摄预览状态涉及的过程相同,在此不再赘述。
而在录像模式下,用户仍可以点击拍摄控件401触发录像。手机检测到用户点击拍摄控件401的录像操作后,开始执行录像任务。例如,持续进行对焦和追焦过程,手机检测到录像操作后,采集第二帧原始图像;并在录像界面的第二预设区域显示第一目标图像。可以理解的是,可以根据第二帧原始图像生成录像获得的目标视频中的视频帧。进而生成最后清晰的目标视频。可以理解的是,手机在生成照片或录制完成视频之后,可以将其保存在存储器中以及控制显示控件403显示拍摄完成的照片或视频。
在本申请一些实施例中,运动传感器可以实时开启,也可以和RGB传感器一样响应于用户的启动操作开启。当然,还可以待手机识别出RGB图像中存在处于运动状态的拍摄目标时自动开启或提示用户开启。
示例性的,手机可以使用AI算法(例如目标检测算法、场景检测算法、运动目标检测算法等)对当前摄像头采集的RGB图像进行识别,若识别存在处于运动状态的拍摄目标时,则手机自动开启运动传感器。
参见图15中的(A),在手机自动开启运动传感器时,还可以输出第一提示信息,提示用户现已开启运动传感器;例如:现已开启运动拍摄模式。或者,参见图15中的(B),在手机识别存在处于运动状态的拍摄目标后,输出第二提示信息,提示用户是否需要开启运动传感器;例如:是否需要开启运动拍摄模式。若用户选择开启运动拍摄模式或者在预设时间段内未接收到用户的指示,则手机开启运动传感器。否则,手机不开启运动传感器。
在本申请一些实施例中,在使用运动传感器的过程中,可以采用异步读出模式和同步读出模式中的一种模式。还可以采用一定时间异步读出模式之后,切换至同步读出模式。还可以采用一定时间同步读出模式之后,切换至异步读出模式。本申请实施例对运动传感器采用的读出模式不作限定。
在一种可实现方式中,运动传感器可以在整个采集和解析的过程中,持续地对像素阵列电路中产生的光照强度变化事件进行历史统计和实时分析。一旦满足切换条件就发送模式切换信号,以使得从异步读出模式切换为同步读出模式,或者从同步读出模式切换为异步读出模式。该自适应地切换过程不断重复,直至对所有事件信号的读取完成。
在另一种可实现方式中,用户也可以自行对读出模式进行选择。示例性的,异步读出模式可以为第一运动模式,同步读出模式可以为第二运动模式。参见图16中的(A),用户可以点击相机应用界面中的模式切换控件1501,来选择第一运动模式或第二运动模式;或者从第一运动模式切换至第二运动模式,再或者从第二运动模式切换至第一运动模式。用户还可以进入相机应用对应的属性界面后,点击属性界面中显示的模式控件,来选择第一运动模式或第二运动模式。
例如,参见图16中(B),手机的主屏幕界面上显示有设置应用图标,用户可以点击设置应用图标,如图16中的(C)所示,手机响应于用户对设置应用图标的点击操作,显示设置应用的
用户界面。其中,在设置应用中显示手机中安装的所有应用程序的列表。用户可以点击列表中的相机应用,以触发进入相机应用对应的属性页面。
如图16中的(D)所示,手机显示相机应用对应的属性页面,相机应用对应的属性页面中包括相机应用对应的第一运动模式控件和第二运动模式控件。用户可以点击第一运动模式控件,以开启相机应用对应的第一运动模式。
可见,本申请实施例可以输出第一提示信息以及第二提示信息,能够使用户自行选择运动传感器的读出模式。并且还可以让用户感知到运动传感器的开启,或者让用户授权运动传感器是否可以开启。增加了用户与手机之间的交互性,提高用户的使用体验。
在一些方案中,可以对本申请的多个实施例进行组合,并实施组合后的方案。可选的,各方法实施例的流程中的一些操作任选地被组合,并且/或者一些操作的顺序任选地被改变。并且,各流程的步骤之间的执行顺序仅是示例性的,并不构成对步骤之间执行顺序的限制,各步骤之间还可以是其他执行顺序。并非旨在表明执行次序是可以执行这些操作的唯一次序。本领域的普通技术人员会想到多种方式来对本申请实施例所描述的操作进行重新排序。另外,应当指出的是,本申请某个实施例涉及的过程细节同样以类似的方式适用于其他实施例,或者,不同实施例之间可以组合使用。
此外,方法实施例中的某些步骤可等效替换成其他可能的步骤。或者,方法实施例中的某些步骤可以是可选的,在某些使用场景中可以删除。或者,可以在方法实施例中增加其他可能的步骤。
并且,各方法实施例之间可以单独实施,或结合起来实施。
本申请实施例还提供一种电子设备,比如可以是上述手机,如图17所示,该电子设备可以包括一个或者多个处理器1610、存储器1620和通信接口1630。
其中,存储器1620、通信接口1630与处理器1610耦合。例如,存储器1620、通信接口1630与处理器1610可以通过总线1640耦合在一起。
其中,通信接口1630用于与其他设备进行数据传输。存储器1620中存储有计算机程序代码。计算机程序代码包括计算机指令,当计算机指令被处理器1610执行时,使得电子设备执行本申请实施例中的后台应用程序的控制方法。
其中,处理器1610可以是处理器或控制器,例如可以是中央处理器(Central Processing Unit,CPU),通用处理器,数字信号处理器(Digital Signal Processor,DSP),专用集成电路(Application-Specific Integrated Circuit,ASIC),现场可编程门阵列(Field Programmable Gate Array,FPGA)或者其他可编程逻辑器件、晶体管逻辑器件、硬件部件或者其任意组合。其可以实现或执行结合本公开内容所描述的各种示例性的逻辑方框,模块和电路。处理器也可以是实现计算功能的组合,例如包含一种或多种微处理器组合,DSP和微处理器的组合等等。
其中,总线1640可以是外设部件互连标准(Peripheral Component Interconnect,PCI)总线或扩展工业标准结构(Extended Industry Standard Architecture,EISA)总线等。上述总线1640可以分为地址总线、数据总线、控制总线等。为便于表示,图17中仅用一条粗线表示,但并不表示仅有一根总线或一种类型的总线。
本申请实施例还提供一种计算机可读存储介质,该计算机存储介质中存储有计算机程序代码,当上述处理器执行该计算机程序代码时,电子设备执行上述方法实施例中的相关方法步骤。
本申请实施例还提供了一种计算机程序产品,当该计算机程序产品在计算机上运行时,使得计算机执行上述方法实施例中的相关方法步骤。
其中,本申请提供的电子设备、计算机存储介质或者计算机程序产品均用于执行上文所提供的对应的方法,因此,其所能达到的有益效果可参考上文所提供的对应的方法中的有益效果,此处不再赘述。
通过以上实施方式的描述,所属领域的技术人员可以清楚地了解到,为描述的方便和简洁,仅以上述各功能模块的划分进行举例说明,实际应用中,可以根据需要而将上述功能分配由不同的功能模块完成,即将装置的内部结构划分成不同的功能模块,以完成以上描述的全部或者部分功能。
在本申请所提供的几个实施例中,应该理解到,所揭露的装置和方法,可以通过其他的方式实现。例如,以上所描述的装置实施例仅仅是示意性的,例如,模块或单元的划分,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式,例如多个单元或组件可以结合或者可以集成到另一个装置,或一些特征可以忽略,或不执行。另一点,所显示或讨论的相互之间的耦合或直接耦合或通信连接可以是通过一些接口,装置或单元的间接耦合或通信连接,可以是电性,机械或其他的形式。
作为分离部件说明的单元可以是或者也可以不是物理上分开的,作为单元显示的部件可以是一个物理单元或多个物理单元,即可以位于一个地方,或者也可以分布到多个不同地方。可以根据实际的需要选择其中的部分或者全部单元来实现本实施例方案的目的。
另外,在本申请各个实施例中的各功能单元可以集成在一个处理单元中,也可以是各个单元单独物理存在,也可以两个或两个以上单元集成在一个单元中。上述集成的单元既可以采用硬件的形式实现,也可以采用软件功能单元的形式实现。
集成的单元如果以软件功能单元的形式实现并作为独立的产品销售或使用时,可以存储在一个可读取存储介质中。基于这样的理解,本申请实施例的技术方案本质上或者说做出贡献的部分或者该技术方案的全部或部分可以以软件产品的形式体现出来,该软件产品存储在一个存储介质中,包括若干指令用以使得一个设备(可以是单片机,芯片等)或处理器(processor)执行本申请各个实施例所述方法的全部或部分步骤。而前述的存储介质包括:U盘、移动硬盘、只读存储器(read only memory,ROM)、随机存取存储器(random access memory,RAM)、磁碟或者光盘等各种可以存储程序代码的介质。
The above is merely a specific implementation of this application, but the protection scope of this application is not limited thereto; any change or replacement within the technical scope disclosed by this application shall be covered by the protection scope of this application. Therefore, the protection scope of this application shall be subject to the protection scope of the claims.
Claims (16)
- A shooting method, applied to an electronic device having a camera, the camera including a motion sensor and an image sensor, characterized by comprising: in response to a start operation on a shooting function, acquiring first event data collected by the motion sensor and a first raw image frame collected by the image sensor; performing moving-target detection according to the first event data to determine first motion information of a shooting target, the first motion information being used to characterize the motion of the shooting target; performing target detection according to the first raw image frame to determine first target detection information of the shooting target, the first target detection information being used to characterize the position of the shooting target in the shooting picture; updating the first target detection information of the shooting target based on the first motion information of the shooting target; determining a first region-of-interest (ROI) focus area according to the first motion information and the updated first target detection information; focusing the camera according to the first ROI focus area to capture a second raw image frame; and displaying a first target image, the first target image being generated from the in-focus second raw image frame.
- The method according to claim 1, characterized in that the performing target detection according to the first raw image frame to determine first target detection information of the shooting target comprises: performing scene detection and target detection according to the first raw image frame to determine the first target detection information and scene information of the shooting target, the scene information being used to characterize the shooting scene of the shooting target; and the determining a first ROI focus area according to the first motion information and the updated first target detection information comprises: determining the first ROI focus area according to the first motion information, the scene information, and the updated first target detection information.
- The method according to claim 2, characterized in that the performing moving-target detection according to the first event data comprises: slicing the first event data to generate multiple event frame images; if the motion rate corresponding to the shooting target changes, adjusting, according to the motion rate, the number of event frame images generated per unit time, the number of event frame images being positively correlated with the motion rate corresponding to the shooting target; and performing moving-target detection on the event frame images.
- The method according to claim 3, characterized in that the slicing the first event data to generate multiple event frame images comprises: slicing the first event data according to a preset event data amount to generate the multiple event frame images, wherein each event frame image corresponds to the preset event data amount.
- The method according to any one of claims 2 to 4, characterized in that the first target detection information includes one or more of a first target frame of the shooting target, a region size corresponding to the first target frame, or coordinates corresponding to the first target frame; and the first motion information includes one or more of a second target frame corresponding to the shooting target, a region size corresponding to the second target frame, or coordinates corresponding to the second target frame, together with a confidence corresponding to the shooting target; or the first motion information includes one or more of a motion contour corresponding to the shooting target, a region size corresponding to the motion contour, or coordinates corresponding to the motion contour, together with a confidence corresponding to the shooting target.
- The method according to claim 5, characterized in that the first motion information further includes one or more of a motion direction, a motion speed, or a motion type corresponding to the shooting target, the motion type including at least one or more of reciprocating motion, rotational motion, lateral motion, longitudinal motion, or jumping motion.
- The method according to claim 5 or 6, characterized in that the collection time of the first event data precedes the collection time of the second raw image frame, and the updating the first target detection information of the shooting target based on the first motion information of the shooting target comprises: adjusting the position of the first target frame in the first target detection information based on the position of the second target frame in the first motion information.
- The method according to any one of claims 5 to 7, characterized in that the determining the first ROI focus area according to the first motion information, the scene information, and the updated first target detection information comprises: screening, based on preset screening rules, the second target frame in the first motion information and the first target frame in the updated first target detection information to obtain the first ROI focus area; wherein the preset screening rules include comparing the confidence of the shooting target in the second target frame with the confidence of the shooting target in the first target frame, and determining whichever of the first target frame or the second target frame has the higher confidence as the first ROI focus area; the preset screening rules further include detecting the degree of correlation between the first motion information and the scene information and between the first target detection information and the scene information, and determining whichever of the first target frame or the second target frame has the higher correlation as the first ROI focus area; and the preset screening rules further include comparing the priorities corresponding to the first motion information and the first target detection information, and determining whichever of the first target frame or the second target frame has the higher priority as the first ROI focus area.
- The method according to any one of claims 1 to 8, characterized in that focusing for the second raw image frame according to the first ROI focus area comprises: calculating an average depth value within the first ROI focus area; determining, according to the average depth value, the distance the focus motor of the camera is to move and the focus position; and controlling the focus motor of the camera to move that distance to the focus position.
- The method according to claim 9, characterized in that the first event data is event data collected between a first moment and a second moment, the first moment being the collection moment of the first raw image frame and the second moment being the collection moment of the second raw image frame, and the method further comprises: performing moving-target detection according to second event data to determine second motion information of the shooting target, the second event data being event data collected between the first moment and a third moment, the third moment lying between the first moment and the second moment, and at least one third moment being included between the first moment and the second moment; predicting the motion of the shooting target according to the second motion information to determine prediction information of the shooting target, the prediction information including one or more of a predicted target frame corresponding to the shooting target, the size of the predicted target frame, or the coordinates of the predicted target frame; performing target detection according to the first raw image frame to determine second target detection information of the shooting target; updating the second target detection information of the shooting target based on the second motion information of the shooting target; determining a second ROI focus area according to the second motion information and the updated second target detection information; and controlling the focus motor of the camera to track focus on the shooting target according to the predicted target frame and the second ROI focus area.
- The method according to claim 10, characterized in that the method further comprises: displaying the predicted target frame, the predicted target frame being presented in the first target image and used to indicate the predicted position of the shooting target.
- The method according to any one of claims 1 to 11, characterized in that the displaying a first target image comprises: displaying the first target image on a shooting preview interface, the shooting preview interface including a photo preview interface or a video preview interface.
- The method according to any one of claims 1 to 12, characterized in that the capturing a second raw image frame comprises: after a photographing operation is detected, capturing the second raw image frame and generating the photographed first target image; and the displaying a first target image comprises: displaying a thumbnail corresponding to the first target image in a first preset region of the shooting preview interface.
- The method according to any one of claims 1 to 13, characterized in that the capturing a second raw image frame comprises: after a recording operation is detected, capturing the second raw image frame and displaying the first target image in a second preset region of the recording interface; and the method further comprises: generating, from the second raw image frame, video frames of the target video obtained by recording.
- An electronic device, characterized in that the electronic device includes a memory and one or more processors, the memory being coupled to the processors, wherein the memory stores computer program code, the computer program code includes computer instructions, and the computer instructions, when executed by the processors, cause the electronic device to perform the shooting method according to any one of claims 1 to 14.
- A computer-readable storage medium, characterized in that the computer-readable storage medium stores instructions which, when run on a computer, cause the computer to perform the shooting method according to any one of claims 1 to 14.
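Several of the claimed steps lend themselves to short, non-normative sketches. To make the event slicing of claims 3 and 4 concrete: events are cut into frames of a fixed preset event count, so a faster-moving target (which generates events at a higher rate) naturally yields more event frames per unit time. The event tuple layout and the preset count below are assumptions for illustration only.

```python
from typing import Iterable, List, Tuple

# An event as (timestamp_us, x, y, polarity); this layout is assumed.
Event = Tuple[int, int, int, int]

PRESET_EVENT_COUNT = 5000  # hypothetical preset event data amount per frame

def slice_events(events: Iterable[Event],
                 count: int = PRESET_EVENT_COUNT) -> List[List[Event]]:
    """Cut the event stream into event frames of a fixed event count.

    Because each frame always holds `count` events, a faster-moving
    target (higher event rate) yields more frames per unit time --
    the positive correlation required by claim 3.
    """
    frames, current = [], []
    for ev in events:
        current.append(ev)
        if len(current) == count:
            frames.append(current)
            current = []
    if current:  # trailing partial frame, kept for completeness
        frames.append(current)
    return frames
```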
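The screening rules of claim 8 can likewise be sketched as a small cascade: compare confidences first, fall back to scene correlation, then to a fixed priority. The Candidate structure and the order in which the three rules are tried are illustrative assumptions; the claim lists the rules without mandating an order.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    box: tuple          # (x, y, w, h) of the first or second target frame
    confidence: float   # detector confidence for the shooting target
    scene_corr: float   # degree of correlation with the scene information
    priority: int       # priority of the information source

def select_roi(det: Candidate, mot: Candidate, eps: float = 1e-3) -> tuple:
    """Pick the ROI focus area from the image-detection frame (det)
    and the motion-information frame (mot) per the screening rules."""
    if abs(det.confidence - mot.confidence) > eps:                # rule 1
        return det.box if det.confidence > mot.confidence else mot.box
    if abs(det.scene_corr - mot.scene_corr) > eps:                # rule 2
        return det.box if det.scene_corr > mot.scene_corr else mot.box
    return det.box if det.priority >= mot.priority else mot.box   # rule 3
```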
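For the focusing step of claim 9, a worked sketch: average the depth over the ROI, then map that distance to a motor position via a calibration curve. The calibration samples and the interpolation used here are assumptions; a real module would use the camera vendor's calibration data.

```python
import numpy as np

def focus_from_roi(depth_map: np.ndarray, roi: tuple,
                   depth_to_position) -> tuple:
    """Compute the focus motor target from the ROI's average depth.

    depth_map: per-pixel depth in metres.
    roi: (x, y, w, h) focus area in pixel coordinates.
    depth_to_position: calibration function metres -> motor position
                       (assumed; normally supplied by the module vendor).
    """
    x, y, w, h = roi
    region = depth_map[y:y + h, x:x + w]
    avg_depth = float(np.mean(region))      # average depth value in the ROI
    target = depth_to_position(avg_depth)   # desired focus motor position
    return avg_depth, target

# Example calibration stub: linear interpolation over measured points.
cal_depths = [0.1, 0.3, 1.0, 3.0, 10.0]    # metres (assumed samples)
cal_codes = [900, 600, 350, 200, 120]      # motor codes (assumed samples)
depth_to_position = lambda d: float(np.interp(d, cal_depths, cal_codes))
```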
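Finally, the motion prediction of claims 10 and 11 can be illustrated by a constant-velocity extrapolation of the target box from the first moment and an intermediate third moment to the second moment. The box layout and the choice of a linear model are assumptions; any short-horizon predictor (a Kalman filter, for instance) would serve.

```python
def predict_target_box(box_t1, box_t3, t1, t3, t2):
    """Linearly extrapolate the target box from the first moment (t1)
    and an intermediate third moment (t3) to the second moment (t2).

    Boxes are (cx, cy, w, h); times share one unit (e.g. microseconds).
    """
    if t3 == t1:
        return box_t3
    k = (t2 - t1) / (t3 - t1)  # extrapolation factor, > 1 when t2 > t3
    return tuple(a + (b - a) * k for a, b in zip(box_t1, box_t3))

# Example: the box centre moves from (100, 80) to (112, 80) between
# t1=0 and t3=10; at t2=30 the predicted centre is (136, 80).
print(predict_target_box((100, 80, 40, 40), (112, 80, 40, 40), 0, 10, 30))
```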
Applications Claiming Priority (2)

Application Number | Priority Date | Filing Date | Title
---|---|---|---
CN202310035833.0A | 2023-01-10 | 2023-01-10 | A shooting method and device (一种拍摄方法及设备)
CN202310035833.0 | | |
Publications (1)

Publication Number | Publication Date
---|---
WO2024148975A1 (zh) | 2024-07-18
Family
ID=91770789
Family Applications (1)

Application Number | Title | Priority Date | Filing Date
---|---|---|---
PCT/CN2023/134757 | WO2024148975A1 (zh) | 2023-01-10 | 2023-11-28

Country Status (2)

Country | Link
---|---
CN | CN118338122A (zh)
WO | WO2024148975A1 (zh)
Citations (7)

Publication number | Priority date | Publication date | Assignee | Title
---|---|---|---|---
JP2009273023A (ja) | 2008-05-09 | 2009-11-19 | Fujifilm Corp | Imaging device, imaging method, focus control method, and program
CN103369227A (zh) | 2012-03-26 | 2013-10-23 | Lenovo (Beijing) Co., Ltd. | Method for photographing a moving object and electronic device
CN106324945A (zh) | 2015-06-30 | 2017-01-11 | ZTE Corporation | Contactless autofocus method and device
WO2021258321A1 (zh) | 2020-06-24 | 2021-12-30 | Huawei Technologies Co., Ltd. | Image acquisition method and apparatus
CN114466129A (zh) | 2020-11-09 | 2022-05-10 | Zeku Technology (Shanghai) Corp., Ltd. | Image processing method and apparatus, storage medium, and electronic device
CN115037871A (zh) | 2021-03-05 | 2022-09-09 | Guangdong Oppo Mobile Telecommunications Corp., Ltd. | Focus control method and apparatus, electronic device, and computer-readable storage medium
CN115297262A (zh) | 2022-08-09 | 2022-11-04 | China Telecom Corp., Ltd. | Focusing method and apparatus, storage medium, and electronic device
Application events
- 2023-01-10: CN application CN202310035833.0A filed; published as CN118338122A (zh), status active, pending
- 2023-11-28: PCT application PCT/CN2023/134757 filed; published as WO2024148975A1 (zh)
Also Published As

Publication number | Publication Date
---|---
CN118338122A (zh) | 2024-07-12
Legal Events

Date | Code | Title | Description
---|---|---|---
| 121 | Ep: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 23915733; Country of ref document: EP; Kind code of ref document: A1