
CN118102044A - Point cloud data generation method, device, equipment and medium based on 3D Gaussian splatter - Google Patents


Info

Publication number
CN118102044A
CN118102044A
Authority
CN
China
Prior art keywords
point cloud
cloud data
video
gaussian
target object
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202410225434.5A
Other languages
Chinese (zh)
Inventor
刘祖渊
杨白云
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Star River Vision Technology Beijing Co ltd
Original Assignee
Star River Vision Technology Beijing Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Star River Vision Technology Beijing Co ltd filed Critical Star River Vision Technology Beijing Co ltd
Priority to CN202410225434.5A
Publication of CN118102044A
Legal status: Pending

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/81Monomedia components thereof
    • H04N21/816Monomedia components thereof involving special video data, e.g. 3D video
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00Three dimensional [3D] modelling, e.g. data description of 3D objects

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computer Graphics (AREA)
  • Geometry (AREA)
  • Software Systems (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Processing Or Creating Images (AREA)

Abstract

The application relates to a point cloud data generation method, device, equipment and medium based on 3D Gaussian splatter, applied to the technical field of video implantation. The method comprises the following steps: acquiring a limited-view video containing a complete target object; processing the limited-view video based on 3D Gaussian splatter to generate a predicted panoramic video of the target object; and carrying out point cloud construction based on the predicted panoramic video to generate point cloud data of the target object. The method and the device improve the efficiency of point cloud data creation.

Description

Point cloud data generation method, device, equipment and medium based on 3D Gaussian splatter
Technical Field
The application relates to the technical field of video implantation, in particular to a point cloud data generation method, device, equipment and medium based on 3D Gaussian splatter.
Background
Point cloud data is a type of data capable of expressing the spatial contour and specific position of an object. It is a valuable data resource that can help users better understand and reproduce the three-dimensional world, and it can be used in various 3D video processing applications, such as creating realistic three-dimensional characters and scenes in animation, or generating three-dimensional models or videos from the point cloud data.
Most conventional 3D video processing methods first extract a depth image from the 3D video and then calculate the 3D coordinates corresponding to each pixel from the depth image and the parameters of the photographing device, thereby generating point cloud data. However, the quality of the depth image and the resolution of the 3D video directly affect the quality and density of the resulting point cloud. Generating a high-quality, high-density point cloud requires high-quality, high-resolution depth images, which in turn requires collecting and processing a large amount of video and image data. Processing that data consumes substantial computing resources and time, making these methods costly and inefficient.
Disclosure of Invention
In order to reduce the cost of point cloud data generation and improve the efficiency of point cloud data generation, the application provides a point cloud data generation method, device, equipment and medium based on 3D Gaussian splatter.
In a first aspect, the present application provides a point cloud data generating method based on 3D gaussian splatter, which adopts the following technical scheme:
a point cloud data generation method based on 3D gaussian splatter, comprising:
acquiring a limited view video containing a complete target object;
processing the limited view video based on 3D Gaussian splatter to generate a predicted panoramic video of the target object;
And carrying out point cloud construction based on the prediction panoramic video, and generating point cloud data of the target object.
By adopting the above technical scheme, the 3D Gaussian splatter technology is applied to point cloud data generation. The approach has strong universality and is suitable for complex 3D environments: point cloud data generation can be completed automatically from videos captured under only a small number of views, without depending on specific requirements, which gives it strong flexibility and extensibility. The generation is not affected by factors such as image quality and occlusion, and the input image data does not need to be labeled, making the process simple and fast. This reduces the cost of point cloud data generation, improves its efficiency, and allows wide application in 3D video fields that require three-dimensional models, such as animation production, virtual reality and game development.
Optionally, the processing the limited view video based on 3D gaussian splatter, and generating the predicted panoramic video of the target object includes:
performing data preprocessing on the limited view video to generate sparse point cloud data;
Acquiring data points of the sparse point cloud data, and representing the data points as a 3D Gaussian function;
optimizing the 3D Gaussian function and the sparse point cloud data to generate an optimization function;
generating a video to be spliced based on the optimization function, the limited view video and the sparse point cloud data;
and generating a prediction panoramic video based on the video to be spliced and the limited view video.
Optionally, the performing point cloud construction based on the predicted panoramic video, and generating the point cloud data of the target object includes:
Decomposing and preprocessing the prediction panoramic video to generate an image sequence;
extracting the characteristics of each frame of image in the image sequence, and determining ORB characteristics;
performing feature point matching on the ORB features according to continuous frames of the image sequence, and determining pose information of shooting equipment;
converting the ORB characteristic into a three-dimensional coordinate point;
And constructing point cloud data of the target object based on the three-dimensional coordinate points.
Optionally, the converting the ORB feature to a three-dimensional coordinate point includes:
performing feature matching on the ORB features in all the frame images to determine a frame image group with the same ORB feature;
determining any two adjacent frame images in the frame image group as a left image and a right image;
Calculating a horizontal offset between pixel coordinates in the left image and the right image for the same ORB feature;
and calculating a three-dimensional coordinate point of the ORB characteristic based on the horizontal offset and a triangulation method.
Optionally, after the converting the ORB feature into a three-dimensional coordinate point based on the pose information, the method further includes:
performing closed loop detection based on the pose information;
upon detecting the generation of a closed loop, global optimization is performed to correct for accumulated drift generated by the closed loop.
Optionally, after the constructing the point cloud data of the target object based on the three-dimensional coordinate points, the method further includes:
Performing noise removal optimization processing on the point cloud data to generate point cloud data to be stored;
Obtaining a target storage format;
And storing the point cloud data to be stored based on the target storage format.
In a second aspect, the present application provides a point cloud data generating device based on 3D gaussian splatter, which adopts the following technical scheme:
a point cloud data generation apparatus based on 3D gaussian splatter, comprising:
The visual angle video acquisition module is used for acquiring limited visual angle videos containing complete target objects;
The panoramic video generation module is used for processing the limited-view video based on 3D Gaussian splatter and generating a predicted panoramic video of the target object;
and the point cloud data generation module is used for carrying out point cloud construction based on the prediction panoramic video and generating point cloud data of the target object.
By adopting the above technical scheme, the 3D Gaussian splatter technology is applied to point cloud data generation, giving the device strong universality and suitability for complex 3D environments. Point cloud data generation can be completed automatically from videos captured under only a small number of views, without depending on specific requirements, which provides strong flexibility and extensibility. The generation is not affected by factors such as image quality and occlusion, the input image data does not need to be labeled, and the process is simple and fast, reducing the cost of point cloud data generation, improving its efficiency, and enabling wide application in 3D video fields that require three-dimensional models, such as animation production, virtual reality and game development.
In a third aspect, the present application provides an electronic device, which adopts the following technical scheme:
An electronic device comprising a processor coupled with a memory;
The processor is configured to execute a computer program stored in the memory, so as to cause the electronic device to perform the 3D Gaussian splatter-based point cloud data generation method according to any one of the first aspects.
In a fourth aspect, the present application provides a computer readable storage medium, which adopts the following technical scheme:
A computer readable storage medium storing a computer program that can be loaded by a processor to execute the 3D Gaussian splatter-based point cloud data generation method according to any one of the first aspects.
Drawings
Fig. 1 is a flow chart of a point cloud data generating method based on 3D gaussian splatter according to an embodiment of the present application.
Fig. 2 is a block diagram of a point cloud data generating device based on 3D gaussian splatter according to an embodiment of the present application.
Fig. 3 is a block diagram of an electronic device according to an embodiment of the present application.
Detailed Description
The present application will be described in further detail with reference to the accompanying drawings.
The embodiment of the application provides a point cloud data generation method based on 3D Gaussian splatter, which can be executed by an electronic device. The electronic device may be a server or a terminal device: the server may be an independent physical server, a server cluster or a distributed system formed by a plurality of physical servers, or a cloud server providing cloud computing services; the terminal device may be, but is not limited to, a smart phone, a tablet computer, a desktop computer, and the like.
"3D Gaussian splatter" is the rendering of 3D Gaussian Splatting used throughout this application; the two terms are interchangeable. When a 3D virtual object is to be implanted in a video, the 3D virtual object must be made consistent with the spatial perspective of the video in order to improve the implantation effect. The point cloud technology in the present application is therefore used to quickly locate the spatial and perspective position, and the 3D coordinates are transformed into the perspective position in the video to improve the video fusion effect.
Fig. 1 is a flow chart of a point cloud data generating method based on 3D gaussian splatter according to an embodiment of the present application.
As shown in fig. 1, the main flow of the method is described as follows (steps S101 to S103):
Step S101, obtaining a limited view video containing a complete target object.
In this embodiment, the target object is the three-dimensional object for which point cloud data is to be generated; it may be any three-dimensional object such as a person, a building, a doll or a fruit. Before the point cloud is generated, the target object is photographed with a device such as a camera or a mobile phone. During shooting, the target object should be presented completely at the center of the picture, and the video is captured over a certain range of angles, yielding a limited-view video. It should be noted that "center position" only means that the target object should be kept as close to the center of the picture as possible; this is not an absolute requirement. The surrounding angle, shooting duration and video resolution are not limited, and the user may select different illumination conditions and backgrounds according to actual requirements, increasing the diversity and authenticity of the video.
Step S102, processing the limited view video based on the 3D Gaussian splatter to generate a predicted panoramic video of the target object.
For step S102: performing data preprocessing on the limited-view video to generate sparse point cloud data; acquiring the data points of the sparse point cloud data and representing each data point as a 3D Gaussian function; performing optimization processing on the 3D Gaussian functions and the sparse point cloud data to generate an optimization function; generating a video to be spliced based on the optimization function, the limited-view video and the sparse point cloud data; and generating a predicted panoramic video based on the video to be spliced and the limited-view video.
In this embodiment, the limited-view video shot by the photographing device is preprocessed to obtain sparse point cloud data, which is the partial point cloud corresponding to the part of the target object visible in the limited-view video. Because the target object in the limited-view video is incomplete, the preprocessed sparse point cloud is not the point cloud of the complete target object. In the sparse point cloud data, each data point carries position information and radiation information. The position information is the position of the data point in the world coordinate system, that is, relative to a fixed reference frame, and is usually represented by a three-dimensional vector such as (x, y, z), where x, y and z are the coordinates of the point on the x, y and z axes respectively. The radiation information refers to appearance attributes of the data point, such as color, reflection and luminescence: the color is the surface color at the point and can be represented in a color space such as RGB or HSV; reflection is the reflection characteristic of the surface at the point, such as diffuse reflection, specular reflection or refraction; and luminescence indicates whether the surface at the point emits light, for example fluorescence or phosphorescence.
After the data points are obtained, each data point is represented by a 3D Gaussian function with the following parameters: position coordinates, a covariance matrix, an opacity and spherical harmonics. The position coordinates represent the position of the data point in space; the covariance matrix determines the shape of the Gaussian function by describing its distribution in different directions; the opacity is used for subsequent rendering; and the spherical harmonics fit the view-dependent appearance, representing the 3D spatial information of the target object. After the conversion is completed, optimization processing is carried out, comprising alternate optimization and density control. Alternate optimization is an iterative optimization method used to adjust the parameters of the Gaussian functions: by minimizing an error function, parameters such as position, spherical harmonics and covariance matrix can be optimized step by step to better approximate the real scene. Density control is mainly used to ensure that the number and distribution of Gaussian functions are appropriate for the scene, by adding or deleting Gaussian functions.
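The Gaussian parameterization described above can be sketched as follows. This is an illustrative simplification, not the application's implementation: the covariance is reduced to a diagonal (axis-aligned) matrix, and all field names are our own.

```python
import math
from dataclasses import dataclass, field
from typing import List

@dataclass
class GaussianPoint:
    """Illustrative container for one 3D Gaussian primitive: position,
    a diagonal covariance (a full 3x3 matrix in practice), an opacity
    for rendering, and spherical-harmonic color coefficients."""
    position: List[float]   # (x, y, z) in world coordinates
    scale: List[float]      # diagonal entries of the covariance matrix
    opacity: float          # used during alpha blending
    sh_coeffs: List[float] = field(default_factory=lambda: [0.0] * 9)

    def density(self, p: List[float]) -> float:
        """Unnormalized Gaussian falloff at point p (diagonal covariance)."""
        q = sum((p[i] - self.position[i]) ** 2 / self.scale[i] for i in range(3))
        return math.exp(-0.5 * q)

g = GaussianPoint(position=[0.0, 0.0, 0.0], scale=[1.0, 1.0, 1.0], opacity=0.8)
print(g.density([0.0, 0.0, 0.0]))  # → 1.0 at the Gaussian's center
```

During alternate optimization, each of these parameters would be adjusted iteratively to minimize a rendering error; the sketch only shows the data layout and density evaluation.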
In this embodiment, density control is a method for adjusting the number and distribution of Gaussian functions to match the complexity and detail of the scene. The basic idea is to divide space into a grid and then decide, from the number and variance of the points in each grid cell, whether to add or delete a Gaussian function. If the number or variance of points in a grid cell is too small or too large, the Gaussian functions of that cell are insufficient to represent the scene information and need adjustment. A Gaussian function is added by randomly selecting a point from the grid cell and initializing a Gaussian function with its position, color and variance; a Gaussian function is deleted by randomly selecting one Gaussian function from the grid cell and removing it from the scene. This makes the number and distribution of Gaussian functions more reasonable and improves the reconstruction quality of the scene.
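The grid-based add/delete logic described above can be sketched roughly as follows. The cell size and count thresholds are arbitrary illustrative values, and the "Gaussians" here are plain coordinate tuples rather than full primitives.

```python
import random
from collections import defaultdict

def density_control(points, cell=1.0, min_pts=2, max_pts=8, seed=0):
    """Sketch of grid-based density control: bucket points into cubic
    cells, clone a random point in under-populated cells and drop a
    random point in over-populated ones (thresholds are illustrative)."""
    rng = random.Random(seed)
    grid = defaultdict(list)
    for p in points:
        key = tuple(int(c // cell) for c in p)  # which cell the point falls in
        grid[key].append(p)
    out = []
    for cell_pts in grid.values():
        if len(cell_pts) < min_pts:        # too sparse: densify
            cell_pts = cell_pts + [rng.choice(cell_pts)]
        elif len(cell_pts) > max_pts:      # too dense: prune
            cell_pts = list(cell_pts)
            cell_pts.pop(rng.randrange(len(cell_pts)))
        out.extend(cell_pts)
    return out

# One lonely point in one cell, ten crowded points in another.
pts = [(0.1, 0.1, 0.1)] + [(5.0 + i * 0.01, 5.0, 5.0) for i in range(10)]
balanced = density_control(pts)
print(len(balanced))  # → 11: the sparse cell gained a clone, the dense cell lost one
```

A real implementation would also consult the per-cell variance, as the text notes, and would split or prune Gaussians rather than raw points.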
After the optimization processing generates the optimization function, a video to be spliced that matches the limited-view video is generated from the optimization function and the sparse point cloud data, and the video to be spliced is stitched together with the limited-view video to obtain the predicted panoramic video. The predicted panoramic video contains the complete target object: together, the video to be spliced and the limited-view video cover the target object from all angles. For example, suppose the target object is a teddy bear doll and its front face is taken as 0 degrees. If the limited-view video is shot while rotating clockwise by 30 degrees starting from 0 degrees, then 0 to 30 degrees is the limited-view video, and the generated video to be spliced covers the remaining views from 31 to 360 degrees. The 31-to-360-degree video to be spliced is directly appended to the 0-to-30-degree limited-view video, yielding a complete 0-to-360-degree predicted panoramic video.
And step S103, performing point cloud construction based on the predicted panoramic video, and generating point cloud data of the target object.
For step S103: decomposing and preprocessing the predicted panoramic video to generate an image sequence; extracting the features of each frame of image in the image sequence and determining the ORB features; performing feature point matching on the ORB features across consecutive frames of the image sequence and determining the pose information of the photographing device; converting the ORB features into three-dimensional coordinate points; and constructing the point cloud data of the target object based on the three-dimensional coordinate points.
Further, feature matching is carried out on ORB features in all frame images, and a frame image group with the same ORB features is determined; determining any two adjacent frame images in the frame image group as a left image and a right image; calculating a horizontal offset between pixel coordinates in the left and right images for the same ORB feature; and calculating to obtain the three-dimensional coordinate point of the ORB characteristic based on the horizontal offset and a triangulation method.
In this embodiment, an ORB-SLAM method is used to produce the point cloud data. First, the predicted panoramic video is decomposed and preprocessed: the complete predicted panoramic video is decomposed into an image sequence using video editing software or a professional video processing library. Then, feature extraction is performed on each frame of the image sequence with the ORB-SLAM feature extractor to obtain ORB features. The feature extractor comprises two parts, feature point detection and feature description: feature points are found by the Oriented FAST algorithm, and each feature point is then converted into a binary feature vector by the Rotated BRIEF algorithm. The binary feature vector, also called a binary descriptor, is a feature vector containing only 1s and 0s, generally a string of 128 to 512 bits. To make the ORB features scale- and rotation-invariant, ORB uses an image pyramid to handle objects at different scales and assigns each feature point an orientation to guarantee rotation invariance.
Feature point matching is carried out on the acquired ORB features across consecutive frames of the image sequence, and frames sharing the same ORB features are grouped into a frame image group. Using stereoscopic vision, any two adjacent frame images in the group are taken as the left image and the right image. According to the pose information of the photographing device, the horizontal offset between the pixel coordinates of the same ORB feature in the left and right images is calculated, and the three-dimensional coordinate point of the ORB feature is computed from the horizontal offset by triangulation, so that the point cloud data of the target object is constructed from the three-dimensional coordinate points.
In this embodiment, whether two ORB features are the same is determined by calculating the Hamming distance between the ORB feature descriptors in two frames of images: the smaller the Hamming distance, the more similar the two feature points. ORB-SLAM uses a fast matching strategy: it first predicts the position of the feature point according to a motion model of the photographing device and then searches for the best match within the predicted range. Feature matching yields feature point correspondences between the two frames, and the pose transformation matrix of the photographing device is then solved using the epipolar geometric constraint, giving the pose information of the photographing device. The horizontal offset is determined from this pose information; it is the pixel distance of the same feature point in the horizontal direction between the left and right images, namely the parallax, obtained as the difference between the horizontal coordinates of the two pixel points. After the horizontal offset is acquired, the three-dimensional coordinate point of each ORB feature is calculated from it by triangulation, and finally the three-dimensional coordinate points of all ORB features are combined to construct the point cloud data.
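The Hamming-distance matching described above can be illustrated with a toy example, using 8-bit descriptors instead of the 256-bit descriptors real ORB produces:

```python
def hamming(d1: int, d2: int) -> int:
    """Hamming distance between two binary descriptors packed as ints:
    XOR the bits, then count the ones that differ."""
    return bin(d1 ^ d2).count("1")

def best_match(query: int, candidates: list) -> int:
    """Index of the candidate descriptor closest to the query."""
    return min(range(len(candidates)), key=lambda i: hamming(query, candidates[i]))

# Toy 8-bit "descriptors" (real ORB descriptors are 256-bit).
q = 0b10110010
cands = [0b10110011, 0b01001101, 0b10110010]
print(best_match(q, cands))  # → 2, the identical descriptor (distance 0)
```

In ORB-SLAM the candidate list would be restricted to the window predicted by the motion model rather than scanned exhaustively, which is what makes the matching fast.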
The basic idea of triangulation is to use the similarity of triangles and the geometric relationships between feature points under multiple views to calculate the positions of three-dimensional points. The depth Z of a three-dimensional point is calculated from the parallax and the baseline, and its X and Y coordinates are obtained from the two-dimensional image, finally yielding the three-dimensional coordinates (X, Y, Z). The depth conversion formula is:
Z = f × b / d
wherein Z is the depth of the three-dimensional point, b is the baseline length, f is the focal length of the photographing device, and d is the horizontal offset.
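A minimal numeric sketch of the depth formula Z = f × b / d follows. The pinhole back-projection for X and Y is our addition for completeness, and all numbers (focal length, baseline, disparity, principal point) are made-up illustrative values:

```python
def triangulate(u: float, v: float, disparity: float,
                f: float, baseline: float, cx: float, cy: float):
    """Depth from disparity for a rectified stereo pair (Z = f*b/d),
    then pinhole back-projection to recover X and Y from the pixel
    coordinates (u, v) relative to the principal point (cx, cy)."""
    Z = f * baseline / disparity
    X = (u - cx) * Z / f
    Y = (v - cy) * Z / f
    return X, Y, Z

# Assumed numbers: f = 700 px, baseline = 0.1 m, disparity = 14 px,
# and the feature sits exactly at the principal point.
X, Y, Z = triangulate(u=960.0, v=540.0, disparity=14.0,
                      f=700.0, baseline=0.1, cx=960.0, cy=540.0)
print(Z)  # → 5.0 metres; X = Y = 0 at the principal point
```

Note that depth grows as disparity shrinks, which is why distant points (small d) are reconstructed less accurately than near ones.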
In this embodiment, closed-loop detection is performed based on pose information; upon detecting the generation of a closed loop, global optimization is performed to correct for accumulated drift generated by the closed loop.
Global optimization refers to optimizing the structure of the entire map and the trajectory of the photographing device using a nonlinear optimization method, such as graph optimization or Bundle Adjustment, so that they are more consistent with the observed data, thereby reducing errors and drift. Global optimization is performed after a closed loop is detected. A closed loop means the photographing device has returned to a previously visited position; at that moment the current pose and the map can be corrected using prior map information, eliminating the accumulated drift. The effect of the optimization is to improve the precision and consistency of the map, bringing it closer to the real scene.
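As a toy stand-in for this global optimization step, the sketch below distributes the loop-closure error linearly along a 1D trajectory. Real systems use pose-graph optimization or Bundle Adjustment over full 6-DoF poses, as the text notes; this only illustrates why closing the loop removes accumulated drift.

```python
def correct_drift(trajectory, loop_start=0):
    """When a loop closure says the last pose should coincide with the
    pose at loop_start, spread the accumulated error linearly over the
    trajectory. Poses are 1D positions for clarity (real poses are 6-DoF)."""
    error = trajectory[-1] - trajectory[loop_start]
    n = len(trajectory) - 1
    return [p - error * i / n for i, p in enumerate(trajectory)]

# The device returns to its start, but odometry says it ended 0.5 m away.
drifted = [0.0, 1.0, 2.0, 1.0, 0.5]
corrected = correct_drift(drifted)
print(corrected[-1])  # → 0.0: the loop now closes exactly
```

Intermediate poses are adjusted proportionally to how far along the loop they were accumulated, which is the intuition behind correcting the whole trajectory rather than only its endpoint.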
In this embodiment, the point cloud data is subjected to denoising optimization processing to generate point cloud data to be stored; a target storage format is obtained; and the point cloud data to be stored is stored based on the target storage format.
The denoising optimization process comprises denoising, filtering and optimization, which use algorithms to improve the quality and effect of the point cloud data, for example by removing noise points, smoothing surfaces and reducing the data volume. Denoising removes noise points, that is, points that do not belong to the surface of the target object, which may be caused by the scanning device, environmental interference, object motion and similar factors; denoising methods include those based on neighborhood analysis, statistical analysis and filtering. Filtering smooths the point cloud data to reduce the effects of noise and aliasing while maintaining the point cloud geometry; filtering methods include those based on convolution, surface reconstruction and projection. Optimization simplifies and compresses the point cloud data to reduce data volume and storage space while maintaining the topology and visual effect of the point cloud. Optimization methods mainly fall into two types: sampling-based methods, which select a representative subset of the original point cloud as the new point cloud, such as random sampling, uniform sampling and curvature-based sampling; and clustering-based methods, which divide the points of the original point cloud into several clusters and then use the center or average point of each cluster as the new point cloud, such as K-means clustering, hierarchical clustering and grid-based clustering.
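One of the denoising families named above, statistical analysis, can be sketched as follows: each point's mean distance to its k nearest neighbours is computed, and points whose mean distance is far above average are treated as noise. This is a brute-force O(n²) illustration with arbitrary parameter values, not a production routine.

```python
import math

def remove_outliers(points, k=3, std_ratio=1.0):
    """Statistical outlier removal sketch: drop points whose mean
    distance to their k nearest neighbours exceeds the global
    mean + std_ratio * standard deviation of that statistic."""
    mean_d = []
    for p in points:
        ds = sorted(math.dist(p, q) for q in points if q is not p)
        mean_d.append(sum(ds[:k]) / k)
    mu = sum(mean_d) / len(mean_d)
    sigma = math.sqrt(sum((d - mu) ** 2 for d in mean_d) / len(mean_d))
    thresh = mu + std_ratio * sigma
    return [p for p, d in zip(points, mean_d) if d <= thresh]

# Ten points along a line, plus one isolated noise point far away.
cloud = [(x * 0.1, 0.0, 0.0) for x in range(10)] + [(50.0, 50.0, 50.0)]
clean = remove_outliers(cloud)
print(len(clean))  # the far-away noise point is dropped
```

Libraries such as PCL expose the same idea with an efficient nearest-neighbour index instead of the quadratic scan used here.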
The point cloud data to be stored is stored according to the target storage format, and format conversion is performed, that is, the point cloud data to be stored is converted into the required format to facilitate subsequent processing and application. Point cloud data comes in various formats, such as PLY, PCD, LAS and XYZ, which differ mainly in what they store and how: some formats store only the coordinates of the points, while others also store colors, normal vectors and other information; some formats are binary, while others are text. Format conversion can be performed in many ways, for example using specialized software or libraries such as CloudCompare or PCL, or by writing a program.
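The text-based XYZ and PLY formats mentioned above can be illustrated with a minimal writer. This emits only coordinates; real PLY files may declare additional properties such as color or normals, and binary variants also exist.

```python
def to_xyz(points):
    """Serialize points to the plain-text XYZ format: one 'x y z' line
    per point (some XYZ variants append 'r g b' on each line)."""
    return "\n".join("{:.6f} {:.6f} {:.6f}".format(*p) for p in points)

def to_ply_header(n_points):
    """Minimal ASCII PLY header for a coordinates-only cloud."""
    return "\n".join([
        "ply", "format ascii 1.0",
        f"element vertex {n_points}",
        "property float x", "property float y", "property float z",
        "end_header",
    ])

pts = [(0.0, 0.0, 0.0), (1.0, 2.0, 3.0)]
ply_text = to_ply_header(len(pts)) + "\n" + to_xyz(pts)
print(ply_text.splitlines()[0])  # → "ply", the PLY magic line
```

Parsing back is the mirror image: skip lines until `end_header`, then read one whitespace-separated coordinate triple per line.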
Fig. 2 is a block diagram of a point cloud data generating device 200 based on 3D gaussian splatter according to an embodiment of the present application.
As shown in fig. 2, the point cloud data generating apparatus 200 based on 3D gaussian splatter mainly includes:
A view video acquisition module 201, configured to acquire a limited view video including a complete target object;
a panoramic video generation module 202, configured to process the limited view video based on 3D gaussian splatter, and generate a predicted panoramic video of the target object;
The point cloud data generating module 203 is configured to perform point cloud construction based on the predicted panoramic video, and generate point cloud data of the target object.
The panoramic video generation module 202 is specifically configured to perform data preprocessing on the limited-view video to generate sparse point cloud data; acquire the data points of the sparse point cloud data and represent them as 3D Gaussian functions; perform optimization processing on the 3D Gaussian functions and the sparse point cloud data to generate an optimization function; generate a video to be spliced based on the optimization function, the limited-view video and the sparse point cloud data; and generate a predicted panoramic video based on the video to be spliced and the limited-view video.
The point cloud data generation module 203 comprises:
the image sequence generation module is used for carrying out decomposition pretreatment on the predicted panoramic video to generate an image sequence;
The image feature determining module is used for extracting the features of each frame of image in the image sequence and determining ORB features;
the pose information determining module is used for carrying out feature point matching on ORB features according to continuous frames of the image sequence and determining pose information of shooting equipment;
The three-dimensional coordinate conversion module is used for converting ORB characteristics into three-dimensional coordinate points;
And the point cloud data construction module is used for constructing point cloud data of the target object based on the three-dimensional coordinate points.
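ORB descriptors are binary strings, so the feature point matching performed by the pose information determining module reduces to nearest-neighbour search under the Hamming distance. A stdlib-only sketch with descriptors stored as Python ints (in practice a library such as OpenCV would supply extraction and matching; the thresholds here are illustrative):

```python
def hamming(d1: int, d2: int) -> int:
    """Hamming distance between two binary ORB descriptors stored as ints."""
    return bin(d1 ^ d2).count("1")

def match(descs_a, descs_b, max_dist=64):
    """Greedy nearest-neighbour matching; returns (index_a, index_b) pairs."""
    pairs = []
    for i, da in enumerate(descs_a):
        j, best = min(enumerate(hamming(da, db) for db in descs_b),
                      key=lambda t: t[1])
        if best <= max_dist:
            pairs.append((i, j))
    return pairs

# Toy 8-bit "descriptors" (real ORB descriptors are 256 bits):
a = [0b10110010, 0b01010101]
b = [0b10110011, 0b11110000]
print(match(a, b))  # → [(0, 0), (1, 1)]
```

The matched pairs across consecutive frames are what the pose estimation step consumes.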
In one embodiment, the three-dimensional coordinate conversion module is specifically configured to perform feature matching on the ORB features across all frame images and determine frame image groups sharing the same ORB feature; determine any two adjacent frame images in a frame image group as a left image and a right image; calculate the horizontal offset between the pixel coordinates of the same ORB feature in the left and right images; and compute the three-dimensional coordinate point of the ORB feature from the horizontal offset by triangulation.
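The triangulation step above can be sketched with the standard stereo relations, treating the two adjacent frames as a rectified left/right pair. The focal length, principal point, and baseline below are illustrative values, not parameters from the patent:

```python
def triangulate(u_left, u_right, v, f, cx, cy, baseline):
    """Recover a 3D point from the horizontal offset (disparity) of one
    feature observed in two views, using the standard stereo relations."""
    disparity = u_left - u_right
    if disparity <= 0:
        raise ValueError("non-positive disparity: feature at infinity or mismatched")
    z = f * baseline / disparity   # depth from disparity
    x = (u_left - cx) * z / f      # back-project through the left camera
    y = (v - cy) * z / f
    return (x, y, z)

print(triangulate(u_left=420.0, u_right=400.0, v=240.0,
                  f=800.0, cx=320.0, cy=240.0, baseline=0.1))
# → (0.5, 0.0, 4.0)
```

Larger horizontal offsets correspond to nearer points; features with near-zero disparity carry no usable depth and are typically discarded.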
In one embodiment, the point cloud data generation apparatus 200 based on 3D Gaussian splatter further includes:
the closed loop detection module is used for performing closed loop detection based on pose information;
and the drift correction module is used for executing global optimization to correct accumulated drift generated by the closed loop when the closed loop is detected.
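As a toy illustration of drift correction (the patent's global optimization would typically be a full pose-graph optimization), the residual revealed by a loop closure can be distributed linearly along the trajectory; 1D translations stand in for full poses here:

```python
def correct_drift(poses, loop_error):
    """Distribute the accumulated drift revealed by a loop closure linearly
    over the trajectory (a stand-in for full pose-graph optimisation)."""
    n = len(poses) - 1  # assumes at least two poses
    return [p - loop_error * i / n for i, p in enumerate(poses)]

# Trajectory that should return to 0.0 but drifted to 0.5:
poses = [0.0, 1.0, 2.0, 1.0, 0.5]
print(correct_drift(poses, loop_error=0.5))  # last pose is pulled back to 0.0
```

Later poses receive larger corrections, reflecting the fact that drift accumulates over time.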
In one embodiment, the point cloud data generation apparatus 200 based on 3D Gaussian splatter further includes:
The to-be-stored data generation module is used for performing noise-removal optimization processing on the point cloud data to generate the point cloud data to be stored;
the target format acquisition module is used for acquiring a target storage format;
and the data storage processing module is used for carrying out storage processing on the point cloud data to be stored based on the target storage format.
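One common form of the noise-removal optimization mentioned above is statistical outlier removal: a point is dropped when its mean distance to its k nearest neighbours is far above the cloud-wide average. A stdlib-only sketch (the neighbour count and threshold are illustrative choices, not the patent's):

```python
import math
from statistics import mean, pstdev

def remove_outliers(points, k=2, std_ratio=1.0):
    """Drop points whose mean distance to their k nearest neighbours
    exceeds mean + std_ratio * std over the whole cloud."""
    avg_knn = []
    for p in points:
        d = sorted(math.dist(p, q) for q in points if q is not p)
        avg_knn.append(mean(d[:k]))
    mu, sigma = mean(avg_knn), pstdev(avg_knn)
    return [p for p, d in zip(points, avg_knn) if d <= mu + std_ratio * sigma]

cloud = [(0, 0, 0), (0.1, 0, 0), (0, 0.1, 0), (0.1, 0.1, 0), (5, 5, 5)]
print(remove_outliers(cloud))  # the isolated point (5, 5, 5) is dropped
```

This brute-force version is O(n²); a production implementation would use a k-d tree for the neighbour search.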
In one example, a module in any of the above apparatuses may be one or more integrated circuits configured to implement the above methods, for example: one or more application specific integrated circuits (application specific integrated circuit, ASIC), or one or more digital signal processors (digital signal processor, DSP), or one or more field programmable gate arrays (field programmable gate array, FPGA), or a combination of at least two of these integrated circuit forms.
For another example, when a module in an apparatus is implemented in the form of a processing element scheduling a program, the processing element may be a general-purpose processor, such as a central processing unit (central processing unit, CPU) or another processor capable of invoking a program. For another example, the modules may be integrated together and implemented in the form of a system-on-chip (SoC).
It will be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working process of the apparatus and modules described above may refer to the corresponding process in the foregoing method embodiment, which is not repeated herein.
Fig. 3 is a block diagram of an electronic device 300 according to an embodiment of the present application.
As shown in FIG. 3, the electronic device 300 includes a processor 301 and a memory 302, and may further include one or more of an information input/information output (I/O) interface 303, a communication component 304, and a communication bus 305.
Wherein the processor 301 is configured to control the overall operation of the electronic device 300 to complete all or part of the steps of the above-described 3D Gaussian splatter-based point cloud data generation method; the memory 302 is used to store various types of data to support operation at the electronic device 300, which may include, for example, instructions for any application or method operating on the electronic device 300, as well as application-related data. The memory 302 may be implemented by any type of volatile or non-volatile memory device or a combination thereof, such as one or more of static random access memory (Static Random Access Memory, SRAM), electrically erasable programmable read-only memory (Electrically Erasable Programmable Read-Only Memory, EEPROM), erasable programmable read-only memory (Erasable Programmable Read-Only Memory, EPROM), programmable read-only memory (Programmable Read-Only Memory, PROM), read-only memory (Read-Only Memory, ROM), magnetic memory, flash memory, a magnetic disk, or an optical disk.
The I/O interface 303 provides an interface between the processor 301 and other interface modules, which may be a keyboard, a mouse, buttons, and the like. These buttons may be virtual buttons or physical buttons. The communication component 304 is used for wired or wireless communication between the electronic device 300 and other devices. The wireless communication may be, for example, Wi-Fi, Bluetooth, near field communication (Near Field Communication, NFC), 2G, 3G, or 4G, or a combination of one or more of them; accordingly, the communication component 304 may include a Wi-Fi part, a Bluetooth part, and an NFC part.
The electronic device 300 may be implemented by one or more application specific integrated circuits (Application Specific Integrated Circuit, ASIC), digital signal processors (Digital Signal Processor, DSP), digital signal processing devices (Digital Signal Processing Device, DSPD), programmable logic devices (Programmable Logic Device, PLD), field programmable gate arrays (Field Programmable Gate Array, FPGA), controllers, microcontrollers, microprocessors, or other electronic components for performing the 3D Gaussian splatter-based point cloud data generation method as set forth in the above embodiments.
The communication bus 305 may include a pathway to transfer information between the aforementioned components. The communication bus 305 may be a PCI (Peripheral Component Interconnect) bus or an EISA (Extended Industry Standard Architecture) bus, or the like. The communication bus 305 may be divided into an address bus, a data bus, a control bus, and the like.
The electronic device 300 may include, but is not limited to, mobile terminals such as mobile phones, notebook computers, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablet computers), PMPs (portable multimedia players), car terminals (e.g., car navigation terminals), and the like, and fixed terminals such as digital TVs, desktop computers, and the like, and may also be a server, and the like.
The application also provides a computer readable storage medium, wherein the computer readable storage medium is stored with a computer program, and the computer program realizes the steps of the point cloud data generation method based on 3D Gaussian splatter when being executed by a processor.
The computer readable storage medium may include: a USB flash disk, a removable hard disk, a read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), a magnetic disk, an optical disk, or other various media capable of storing program codes.
The terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus.
The above description is only illustrative of the preferred embodiments of the present application and of the principles of the technology employed. It will be appreciated by persons skilled in the art that the scope of the application is not limited to the specific combinations of the features described above, but also covers other embodiments formed by any combination of the above features or their equivalents without departing from the spirit of the application, for example, embodiments formed by replacing the above features with technical features of similar functions disclosed in (but not limited to) the present application.

Claims (9)

1. A point cloud data generation method based on 3D Gaussian splatter, comprising:
acquiring a limited view video containing a complete target object;
processing the limited view video based on 3D Gaussian splatter to generate a predicted panoramic video of the target object;
and performing point cloud construction based on the predicted panoramic video to generate point cloud data of the target object.
2. The method of claim 1, wherein the processing the limited view video based on 3D Gaussian splatter to generate a predicted panoramic video of the target object comprises:
performing data preprocessing on the limited view video to generate sparse point cloud data;
Acquiring data points of the sparse point cloud data, and representing the data points as a 3D Gaussian function;
optimizing the 3D Gaussian function and the sparse point cloud data to generate an optimization function;
generating a video to be spliced based on the optimization function, the limited view video and the sparse point cloud data;
and generating a prediction panoramic video based on the video to be spliced and the limited view video.
3. The method of claim 1, wherein the performing point cloud construction based on the predicted panoramic video and generating the point cloud data of the target object comprises:
Decomposing and preprocessing the prediction panoramic video to generate an image sequence;
extracting the characteristics of each frame of image in the image sequence, and determining ORB characteristics;
performing feature point matching on the ORB features according to continuous frames of the image sequence, and determining pose information of shooting equipment;
converting the ORB characteristic into a three-dimensional coordinate point;
And constructing point cloud data of the target object based on the three-dimensional coordinate points.
4. The method of claim 3, wherein the converting the ORB feature to a three dimensional coordinate point comprises:
performing feature matching on the ORB features in all the frame images to determine a frame image group with the same ORB feature;
determining any two adjacent frame images in the frame image group as a left image and a right image;
Calculating a horizontal offset between pixel coordinates in the left image and the right image for the same ORB feature;
and calculating a three-dimensional coordinate point of the ORB characteristic based on the horizontal offset and a triangulation method.
5. The method of claim 3, further comprising, after the converting the ORB feature into a three-dimensional coordinate point:
performing closed loop detection based on the pose information;
upon detecting the generation of a closed loop, global optimization is performed to correct for accumulated drift generated by the closed loop.
6. The method of claim 5, further comprising, after the constructing the point cloud data of the target object based on the three-dimensional coordinate points:
Performing noise removal optimization processing on the point cloud data to generate point cloud data to be stored;
Obtaining a target storage format;
And storing the point cloud data to be stored based on the target storage format.
7. A point cloud data generation apparatus based on 3D Gaussian splatter, comprising:
the view video acquisition module, configured to acquire a limited view video containing a complete target object;
The panoramic video generation module is used for processing the limited-view video based on 3D Gaussian splatter and generating a predicted panoramic video of the target object;
and the point cloud data generation module is used for carrying out point cloud construction based on the prediction panoramic video and generating point cloud data of the target object.
8. An electronic device comprising a processor coupled to a memory;
The processor is configured to execute a computer program stored in the memory to cause the electronic device to perform the method of any one of claims 1 to 6.
9. A computer readable storage medium comprising a computer program or instructions which, when run on a computer, cause the computer to perform the method of any of claims 1 to 6.
CN202410225434.5A 2024-02-29 2024-02-29 Point cloud data generation method, device, equipment and medium based on 3D Gaussian splatter Pending CN118102044A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410225434.5A CN118102044A (en) 2024-02-29 2024-02-29 Point cloud data generation method, device, equipment and medium based on 3D Gaussian splatter


Publications (1)

Publication Number Publication Date
CN118102044A true CN118102044A (en) 2024-05-28

Family

ID=91149156

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410225434.5A Pending CN118102044A (en) 2024-02-29 2024-02-29 Point cloud data generation method, device, equipment and medium based on 3D Gaussian splatter

Country Status (1)

Country Link
CN (1) CN118102044A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN118521720A (en) * 2024-07-23 2024-08-20 浙江核新同花顺网络信息股份有限公司 Virtual person three-dimensional model determining method and device based on sparse view angle image



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination