WO2024150322A1

WO2024150322A1 - Point group processing device, point group processing method, and point group processing program

Info

Publication number: WO2024150322A1
Application number: PCT/JP2023/000446
Authority: WO
Inventors: 大我吉田; 直人阿部; 潤島村
Original assignee: 日本電信電話株式会社
Priority date: 2023-01-11
Filing date: 2023-01-11
Publication date: 2024-07-18

Abstract

This point group processing device includes: an input unit that receives a combination of a 3D point group which is on the surface of an object that was measured, and a measured posture in which the 3D point group was measured; a foreground extraction unit that extracts, on the basis of the measured posture, a foreground point group which is the point group of a foreground portion, from the 3D point group; and a point group integration unit that integrates the foreground point group and a background point group which is the point group of a background portion and which was found in advance.

Description

Point cloud processing device, point cloud processing method, and point cloud processing program

The disclosed technology relates to a point cloud processing device, a point cloud processing method, and a point cloud processing program.

A measuring device called LiDAR (Light Detection and Ranging, Laser Imaging Detection and Ranging) uses the reflection of light such as lasers to measure the distance to objects in space, and by combining this with location information obtained from GPS (Global Positioning System) etc., it obtains point cloud data, which is a collection of points with three-dimensional coordinates.

Due to the nature of LIDAR, which measures one point at a time, the number of points that can be measured per unit time is smaller than with camera images, and long periods of measurement are necessary to obtain a high-density point cloud. However, when measuring the point cloud of a moving object, the measurement must be completed in a short period of time, making it difficult to measure a high-density point cloud with LIDAR.

As a method for increasing the density of a low-density point cloud, for example, the technology shown in Patent Document 1 can be used. By converting the point cloud into a depth image from the camera's viewpoint and then converting the densified depth image back into a point cloud, it is possible to increase the density of the point cloud in the range captured by the camera image.

JP 2021-174406 A

While a typical LiDAR can measure a point cloud in all directions, 360 degrees horizontally, the angle of view of the camera images used in the technology shown in Patent Document 1 is limited, and in order to densify a point cloud over a wide area, it is necessary to prepare a large number of cameras and take images in all directions. In that case, it takes time to apply densification processing to all camera images, making it unusable for applications that require real-time performance.

The disclosed technology has been made in consideration of the above points, and aims to provide a point cloud processing device, a point cloud processing method, and a point cloud processing program that can measure a three-dimensional point cloud over a wide area, including moving objects, in a short period of time.

A first aspect of the present disclosure is a point cloud processing device that includes an input unit that accepts a combination of a three-dimensional point cloud on a measured surface of an object and a measurement orientation in which the three-dimensional point cloud was measured, a foreground extraction unit that extracts a foreground point cloud, which is a point cloud of the foreground portion, from the three-dimensional point cloud based on the measurement orientation, and a point cloud integration unit that integrates the foreground point cloud with a background point cloud, which is a point cloud of the background portion that has been determined in advance.

A second aspect of the present disclosure is a point cloud processing method in which a computer receives a combination of a three-dimensional point cloud on the measured surface of an object and a measurement orientation in which the three-dimensional point cloud was measured, extracts a foreground point cloud, which is a point cloud of the foreground part, from the three-dimensional point cloud based on the measurement orientation, and integrates the foreground point cloud with a background point cloud, which is a point cloud of the background part that has been determined in advance.

The third aspect of the present disclosure is a point cloud processing program that causes a computer to function as the point cloud processing device of the first aspect.

The disclosed technology makes it possible to measure 3D point clouds over a wide area, including moving objects, in a short period of time.

FIG. 1 is a schematic block diagram of an example of a computer that functions as a point cloud processing device according to a first, second, third, and fourth embodiments. 1 is a block diagram showing a configuration of a point cloud processing device according to a first embodiment; 11A and 11B are diagrams for explaining a method of extracting a foreground point cloud. 4 is a flowchart showing a point cloud processing routine of the point cloud processing device of the first embodiment. 4 is a flowchart showing a point cloud processing routine of the point cloud processing device of the first embodiment. A diagram showing an arrangement of multiple lidars and multiple cameras. FIG. 11 is a block diagram showing a configuration of a point cloud processing device according to a second embodiment. 10 is a flowchart showing a point cloud processing routine of the point cloud processing device of the second embodiment. FIG. 13 is a block diagram showing a configuration of a point cloud processing device according to a third embodiment. 13 is a flowchart showing a point cloud processing routine of the point cloud processing device of the third embodiment. FIG. 13 is a block diagram showing a configuration of a point cloud processing device according to a fourth embodiment.

Below, an example of an embodiment of the disclosed technology will be described with reference to the drawings. Note that the same reference symbols are used for identical or equivalent components and parts in each drawing. Also, the dimensional ratios in the drawings have been exaggerated for the convenience of explanation and may differ from the actual ratios.

<Outline of this embodiment>
In this embodiment, a foreground point cloud representing an object including a moving body is extracted from a three-dimensional point cloud measured by a lidar, and the foreground point cloud is integrated with a background point cloud representing a stationary background that has been measured separately over a long period of time. In this way, a high-density three-dimensional point cloud can be measured over a long period of time by measuring the background point cloud representing a stationary background in advance. In addition, by narrowing the range to only the area representing the moving body and measuring the three-dimensional point cloud, the amount of data to be transmitted and processed can be reduced, and measurement with high real-time performance can be realized. Therefore, point cloud data with high measurement accuracy can be output at a higher speed than before. In addition, since point cloud data can be measured at high speed in a short period of time, it is possible to measure a three-dimensional space including a moving body in a short period of time or to measure it in a moving image in real time.

[First embodiment]
<Configuration of the point cloud processing device according to this embodiment>
FIG. 1 is a block diagram showing the hardware configuration of a point cloud processing device 10 according to the present embodiment.

As shown in FIG. 1, the point cloud processing device 10 has a CPU (Central Processing Unit) 11, a ROM (Read Only Memory) 12, a RAM (Random Access Memory) 13, a storage 14, an input unit 15, a display unit 16, and a communication interface (I/F) 17. Each component is connected to each other so that they can communicate with each other via a bus 19.

The CPU 11 is a central processing unit that executes various programs and controls each part. That is, the CPU 11 reads the programs from the ROM 12 or storage 14, and executes the programs using the RAM 13 as a working area. The CPU 11 controls each of the above components and performs various calculation processes according to the programs stored in the ROM 12 or storage 14. In this embodiment, the ROM 12 or storage 14 stores a point cloud processing program for measuring a three-dimensional point cloud. The point cloud processing program may be a single program, or may be a group of programs consisting of multiple programs or modules.

ROM 12 stores various programs and data. RAM 13 temporarily stores programs or data as a working area. Storage 14 is composed of a HDD (Hard Disk Drive) or SSD (Solid State Drive) and stores various programs including the operating system and various data.

The input unit 15 includes a pointing device such as a mouse, and a keyboard, and is used to perform various inputs including a combination of a three-dimensional point cloud on the surface of an object and the measurement attitude at which the three-dimensional point cloud was measured, and a combination of a camera image capturing an area corresponding to the three-dimensional point cloud and the shooting attitude at which the camera image was captured. Any device such as a lidar, depth camera, stereo camera, or sonar can be used to measure the three-dimensional point cloud, and for example, a three-dimensional point cloud measured by a lidar and a camera image capturing an area corresponding to the three-dimensional point cloud are input. The measurement direction of the three-dimensional point cloud and the shooting direction of the camera image do not need to be the same, but the closer they are, the more likely it is that blind spots and coloring errors during correspondence can be suppressed. The input camera image is captured by a camera, and each pixel has either or both of brightness and color information, for example, a pixel value represented by RGB.

Here, the measurement attitude of the three-dimensional point cloud can be expressed, for example, as a combination of a rotation matrix, a scaling matrix, and a translation matrix for converting the three-dimensional point cloud into the same absolute coordinate system as the background point cloud. Note that when the input three-dimensional point cloud and the background point cloud are expressed in the same absolute coordinate system, when the measurement attitude is estimated by the foreground point cloud integration unit 204 described later, or when the measurement attitude is estimated by the point cloud integration unit 104, it is not necessary to input the measurement attitude of the three-dimensional point cloud. Also, the shooting attitude of the camera image can be expressed, for example, as a combination of the internal parameters, external parameters, and distortion coefficients of the camera. Note that the external parameters of the camera may be expressed as a combination of a rotation matrix and a translation vector with respect to the absolute coordinate system, or may be expressed as a combination of a rotation matrix and a translation vector with respect to the lidar coordinate system. Also, the three-dimensional point cloud may be expressed in the form of a depth map recorded as an image having a pixel-by-pixel depth value from a certain viewpoint.

The display unit 16 is, for example, a liquid crystal display, and displays various information including the measured three-dimensional point cloud. The display unit 16 may be a touch panel type and function as the input unit 15.

The communication interface 17 is an interface for communicating with other devices, and uses standards such as Ethernet (registered trademark), FDDI, and Wi-Fi (registered trademark).

Next, the functional configuration of the point cloud processing device 10 will be described. Figure 2 is a block diagram showing an example of the functional configuration of the point cloud processing device 10.

As shown in FIG. 2, the point cloud processing device 10 functionally comprises a correspondence unit 101, a coloring unit 102, a foreground extraction unit 103, a point cloud integration unit 104, and an output unit 105.

The correspondence unit 101 uses the 3D point cloud, the camera image, and either or both of the measurement attitude and the shooting attitude to determine the correspondence between each point of the 3D point cloud and each pixel of the camera image. Note that there may be pixels with no corresponding points or points with no corresponding pixels.

The coloring unit 102 assigns color information of each pixel in the camera image to the corresponding points in the three-dimensional point cloud based on the correspondence relationship identified by the correspondence unit 101, and outputs the colored three-dimensional point cloud. At this time, the three-dimensional coordinates corresponding to each pixel may be estimated using the camera image and the shooting posture, and a three-dimensional point cloud to which color information has been assigned may be created and output based on the color information of each pixel and the estimated three-dimensional coordinates.

The foreground extraction unit 103 extracts a foreground point cloud, which is a point cloud of the foreground part, from the three-dimensional point cloud.

Specifically, the foreground extraction unit 103 receives a three-dimensional point cloud as input, extracts only points that correspond to the object being photographed, and outputs them as a foreground point cloud. To extract the foreground, a three-dimensional coordinate range occupied by the object may be determined in advance, and only points within that range may be extracted. Alternatively, a reference point cloud that does not include the foreground may be measured in advance, and only points in the input point cloud that do not have any reference point in the vicinity may be regarded as foreground points (see Figure 3).

Figure 3 shows an example in which the reference point cloud is converted into a depth map from the camera's viewpoint, the input 3D point cloud is converted into a depth map from the camera's viewpoint, and the foreground is extracted by thresholding the difference between each pixel to obtain the foreground point cloud.

In addition, image processing such as image recognition may be applied to the camera image to determine the two-dimensional region of the target object, and corresponding points may be extracted based on the correspondence between the pixels and points identified by the correspondence unit 101.

The point cloud integration unit 104 integrates the foreground point cloud with a background point cloud, which is a point cloud of the background portion that has been calculated in advance.

Specifically, the point cloud integration unit 104 receives the background point cloud, the foreground point cloud, and the positional information of the foreground point cloud, and outputs a three-dimensional point cloud in which all point clouds are superimposed. If the positional information of the foreground point cloud is not input, the foreground point cloud may be displayed at a predetermined position, or separately calculated positional information may be used as the positional information of the foreground point cloud. Note that if the background point cloud and the foreground point cloud are expressed in the same absolute coordinate system, it is not necessary to input the positional information of the foreground point cloud.

The output unit 105 outputs the three-dimensional point cloud output from the point cloud integration unit 104 as data. For example, the integrated three-dimensional point cloud may be output as a file. The three-dimensional point cloud may also be output in the form of an image or video from a certain viewpoint. The output method may include saving to a file, transmitting to another system, or displaying directly on a screen.

<Action of the point cloud processing device according to this embodiment>
Next, the operation of the point cloud processing device 10 will be described.

Figures 4 and 5 are flowcharts showing the flow of point cloud processing by the point cloud processing device 10. Point cloud processing is performed by the CPU 11 reading out a point cloud processing program from the ROM 12 or storage 14, expanding it into the RAM 13 and executing it. In addition, it is assumed that the data input to the point cloud processing device 10 is acquired by an equipment configuration including a camera and a LIDAR, and the camera and the LIDAR are installed so that the object to be measured is within the shooting range of the camera and the measurement range of the LIDAR. It is assumed that a background point cloud obtained in advance has been input to the point cloud processing device 10. Note that the point cloud processing is an example of a point cloud processing method. In addition, if the installation positions of the camera and the LIDAR can be fixed, it is possible to omit the process of calculating the coordinate conversion formula each time by performing calibration in advance and determining the measurement attitude of the three-dimensional point cloud and the shooting attitude of the camera.

First, in step S101, the CPU 11 acquires a reference point cloud, which is a three-dimensional point cloud measured by the lidar when there is no object to be photographed, and a measured attitude.

In step S102, the CPU 11 obtains a combination of the 3D point cloud measured by the rider and the measurement attitude at which the 3D point cloud was measured, and a combination of the camera image captured by the camera and the shooting attitude at which the camera image was captured.

In step S103, the CPU 11, as the correspondence unit 101, uses the 3D point cloud, the camera image, and either or both of the measurement attitude and the shooting attitude to derive a transformation formula for the coordinates between the 3D point cloud and the camera image as a correspondence relationship between the 3D point cloud and the camera image.

Next, steps S104 and S105 are repeatedly executed for each pixel in the camera image.

In step S104, the CPU 11, as the coloring unit 102, estimates the depth value of the pixel in the camera image based on the conversion formula derived in step S103, estimates the three-dimensional coordinates, and generates a corresponding three-dimensional point, thereby densifying the point cloud. As an example, a densified depth map is obtained using the method described in Patent Document 1. Specifically, the three-dimensional point cloud is converted into a depth map, pixels without depth values are complemented with the depth values of surrounding pixels, the depth map is corrected so as to be as consistent as possible with the camera image, and the corrected depth map is converted back into a three-dimensional point cloud, thereby obtaining a densified three-dimensional point cloud.

In step S105, the CPU 11, functioning as the coloring unit 102, assigns color information of the pixel to the three-dimensional point generated in step S104.

In step S106, the CPU 11, as the foreground extraction unit 103, converts the reference point group into a depth map viewed from the same viewpoint as the camera image based on the conversion formula derived in step S103, and also converts the densified 3D point group into a depth map viewed from the same viewpoint as the camera image. Note that if a camera image from a camera viewpoint corresponding to the reference point group can be acquired, densification may also be performed on the depth map of the reference point group. If a camera image corresponding to the reference point group cannot be acquired, the depth values of pixels with no corresponding points may be interpolated using nearest neighbor interpolation or the like.

Then, the process of step S107 is executed for each pixel in the depth map.

In step S107, the CPU 11, functioning as the foreground extraction unit 103, deletes the depth value of the pixel in question in accordance with the difference in the depth value of the pixel in the depth map. Specifically, when the depth value of the pixel in the depth map of the reference point group is D _b and the depth value of the same pixel in the depth map of the input 3D point group is _Di , the CPU 11 deletes the depth value of the pixel that satisfies the following formula:

D _b −D _i <d

Note that D is a positive value proportional to the distance from the camera, and is expressed so that the farther away it is, the larger the value becomes. d is a parameter that represents the threshold value.

In step S108, the CPU 11, functioning as the foreground extraction unit 103, converts the depth map after the processing of step S107 back into a point cloud to obtain a foreground point cloud. Specifically, the depth map is converted back into a three-dimensional point cloud by converting each pixel into three-dimensional coordinates using a conversion equation derived using camera internal parameters including the focal length of the camera lens. Note that the foreground extraction unit 103, instead of the coloring unit 102, may color the point cloud by adding color information of the pixel of the camera image corresponding to the pixel of the depth map to the corresponding point.

Next, in step S109, the CPU 11, functioning as the point cloud integration unit 104, acquires point cloud position information of the foreground point cloud in order to integrate the foreground point cloud with the background point cloud. Specifically, in order to display the foreground point cloud at an arbitrary position, it is sufficient to acquire predetermined position information or point cloud position information input from an external system.

In step S109, the CPU 11, as the point cloud integration unit 104, integrates the foreground point cloud and the background point cloud based on the point cloud position information of the foreground point cloud and the point cloud position information of the background point cloud that has been determined in advance. Here, the point cloud position information of the foreground point cloud and the point cloud position information of the background point cloud are position information in the same coordinate system.

In step S110, the CPU 11, functioning as the output unit 105, outputs the three-dimensional point cloud output from the point cloud integration unit 104 as data. Specifically, it outputs the integrated point cloud to a file or another system, or displays it on the display unit 16.

In step S111, the CPU 11 determines whether or not to continue measuring the three-dimensional point cloud. If the measurement of the three-dimensional point cloud is to be continued, the process returns to step S102, and a new combination of the three-dimensional point cloud measured by the lidar and the measurement attitude at which the three-dimensional point cloud was measured, and a new combination of the camera image taken by the camera and the shooting attitude at which the camera image was taken, is obtained. On the other hand, if the measurement of the three-dimensional point cloud is not to be continued, the point cloud processing routine is terminated.

As described above, the point cloud processing device according to this embodiment accepts a combination of a 3D point cloud on the measured surface of an object and the measurement orientation in which the 3D point cloud was measured, extracts a foreground point cloud, which is a point cloud of the foreground portion, from the 3D point cloud, and combines the foreground point cloud with a background point cloud, which is a point cloud of the background portion that has been determined in advance. This makes it possible to measure a 3D point cloud of a wide range, including moving objects, in a short period of time.

In addition to adding color information to the point cloud, by increasing the point density, a higher density point cloud can be obtained even in a short measurement time.

In addition, the input 3D point cloud is converted into a depth map from the camera's viewpoint, and compared with a reference depth map created from a reference point cloud measured in advance. Only points corresponding to pixels whose difference with the reference depth map is equal to or exceeds a threshold are extracted as foreground point clouds, allowing for high-speed extraction of only points corresponding to moving objects.

Furthermore, in conventional technology, when measuring a three-dimensional point cloud that represents a large space including a moving object at high resolution, the state of the moving object changes during measurement of one frame, making measurement difficult. On the other hand, in this embodiment, a background point cloud that does not contain a moving object is measured in advance, and a sparse point cloud is measured in a short time for the area representing the moving object, and then densified using high-resolution technology, thereby achieving measurement in a short time.

[Second embodiment]
Next, a point cloud processing apparatus according to a second embodiment will be described. Note that parts having the same configuration as those in the first embodiment will be given the same reference numerals and the description thereof will be omitted.

The second embodiment differs from the first embodiment in that two or more combinations of lidar and cameras are used to measure and integrate the foreground point cloud.

<Configuration of the point cloud processing device according to this embodiment>
FIG. 1 is a block diagram showing the hardware configuration of a point cloud processing device 210 according to this embodiment.

As shown in FIG. 1, the point cloud processing device 210 has a CPU 11, a ROM 12, a RAM 13, a storage 14, an input unit 15, a display unit 16, and a communication interface 17.

The input unit 15 is used to perform various inputs, including two or more sets, where one set is a combination of a measured three-dimensional point cloud and the measurement attitude at which the three-dimensional point cloud was measured, and another set is a combination of a camera image of an area corresponding to the three-dimensional point cloud and the shooting attitude at which the camera image was taken. The following describes the process of performing two sets of input.

Specifically, the data input to the input unit 15 can be acquired by an equipment configuration with two cameras 20A, 20B and two LIDARs 30A, 30B as shown in FIG. 6. Multiple cameras 20A, 20B and LIDARs 30A, 30B are installed so that the object to be measured falls within the shooting range of the cameras 20A, 20B and the measurement range of the LIDARs 30A, 30B.

For example, the input unit 15 accepts a combination of a three-dimensional point cloud measured by a rider 30A and the measured attitude of the rider 30A who measured the three-dimensional point cloud, and a combination of a camera image of an area corresponding to the three-dimensional point cloud taken by a camera 20A and the shooting attitude of the camera 20A who took the camera image. The input unit 15 also accepts a combination of a three-dimensional point cloud measured by a rider 30B and the measured attitude of the rider 30B who measured the three-dimensional point cloud, and a combination of a camera image of an area corresponding to the three-dimensional point cloud taken by a camera 20B and the shooting attitude of the camera 20B who took the camera image.

Next, the functional configuration of the point cloud processing device 210 will be described. Figure 7 is a block diagram showing an example of the functional configuration of the point cloud processing device 210.

As shown in FIG. 7, the point cloud processing device 210 functionally comprises correspondence units 201A and 201B, coloring units 202A and 202B, foreground extraction units 203A and 203B, a foreground point cloud integration unit 204, a point cloud integration unit 205, and an output unit 206.

Similar to the correspondence unit 101 described in the first embodiment above, the correspondence unit 201A uses the three-dimensional point cloud measured by the rider 30A, the camera image captured by the camera 20A, and either or both of the measurement attitude of the rider 30A and the shooting attitude of the camera 20A to determine the correspondence between each point of the three-dimensional point cloud measured by the rider 30A and each pixel of the camera image captured by the camera 20A.

In addition, similar to correspondence unit 201A, correspondence unit 201B uses the three-dimensional point cloud measured by rider 30B, the camera image captured by camera 20B, and either or both of the measurement attitude of rider 30B and the shooting attitude of camera 20B to determine the correspondence between each point of the three-dimensional point cloud measured by rider 30B and each pixel of the camera image captured by camera 20B.

Similar to the coloring unit 102 described in the first embodiment above, the coloring unit 202A assigns color information of each pixel of the camera image captured by the camera 20A to the corresponding points of the three-dimensional point cloud measured by the LIDAR 30A based on the correspondence relationship identified by the correspondence unit 201A, and outputs the colored three-dimensional point cloud.

Similar to coloring unit 202A, coloring unit 202B assigns color information of each pixel of the camera image captured by camera 20B to the corresponding points of the three-dimensional point cloud measured by LIDAR 30B based on the correspondence relationship identified by correspondence unit 201B, and outputs the colored three-dimensional point cloud.

Similar to the foreground extraction unit 103 described in the first embodiment above, the foreground extraction unit 203A extracts a foreground point cloud, which is a point cloud of the foreground portion, from the three-dimensional point cloud colored by the coloring unit 202A.

Similar to foreground extraction unit 203A, foreground extraction unit 203B extracts a foreground point cloud, which is a point cloud of the foreground portion, from the three-dimensional point cloud colored by coloring unit 202B.

The foreground point cloud integration unit 204 converts the foreground point clouds extracted by each of the foreground extraction units 203A and 203B into the same coordinate system and outputs a foreground point cloud in which multiple foreground point clouds are superimposed. For example, each foreground point cloud may be converted into the same absolute coordinate system based on the measured orientation corresponding to each foreground point cloud, and then superimposed. The foreground point cloud integration unit 204 may also integrate the foreground point clouds extracted by each of the foreground extraction units 203A and 203B, and align them so that the misalignment of the joints is minimized. In other words, the foreground point cloud integration unit 204 corrects the position of the foreground point clouds so that the difference in the area common to the two foreground point clouds extracted by each of the foreground extraction units 203A and 203B is minimized, and then integrates the foreground point clouds.

The point cloud integration unit 205 integrates the integrated foreground point cloud and background point cloud, similar to the point cloud integration unit 104 described in the first embodiment above.

The output unit 206 outputs the three-dimensional point cloud output from the point cloud integration unit 205 as data, similar to the output unit 105 described in the first embodiment above.

<Action of the point cloud processing device according to this embodiment>
Next, the operation of the point cloud processing device 210 will be described.

FIG. 8 is a flowchart showing the flow of point cloud processing by the point cloud processing device 210. Point cloud processing is performed by the CPU 11 reading out a point cloud processing program from the ROM 12 or storage 14, expanding it into the RAM 13 and executing it. Furthermore, it is assumed that the data input to the point cloud processing device 210 is acquired by an equipment configuration of cameras 20A, 20B and LIDARs 30A, 30B, and that the cameras 20A, 20B and LIDARs 30A, 30B are each installed so that the object to be photographed that is to be measured falls within the photographing range and measurement range. It is assumed that a previously determined background point cloud has been input to the point cloud processing device 210.

First, in step S201, the CPU 11 acquires a reference point cloud, which is a three-dimensional point cloud measured by the lidar 30A in the absence of an object to be photographed, and a measured attitude.

In parallel with step S201, in step S203, the CPU 11 acquires a reference point cloud, which is a three-dimensional point cloud measured by the lidar 30B in the absence of an object to be photographed, and a measured attitude.

Then, in step S202, the CPU 11 extracts a foreground point cloud from the three-dimensional point cloud measured by the LIDAR 30A.

This step S202 is realized by the same processing as steps S102 to S108 described in the first embodiment above, and involves obtaining a 3D point cloud, densifying it, coloring the point cloud, and extracting the foreground point cloud.

In parallel with step S202, in step S204, the CPU 11 extracts a foreground point cloud from the three-dimensional point cloud measured by the LIDAR 30B.

This step S204 is realized by the same processing as steps S102 to S108 described in the first embodiment above, and involves obtaining a 3D point cloud, densifying it, coloring the point cloud, and extracting the foreground point cloud.

Then, in step S205, the CPU 11, functioning as the foreground point cloud integration unit 204, integrates the foreground point clouds extracted by each of the foreground extraction units 203A and 203B, and aligns them so that the misalignment of the joints is minimized.

In this case, the foreground point clouds can be integrated into a single point cloud by aligning their positions by rotating and translating the foreground point clouds based on their respective measured orientations. However, there are cases where perfect alignment cannot be achieved due to differences in the time at which each point cloud is measured. In this regard, for example, the technology described in Non-Patent Document 1 can be used to calculate rotation and translation so that areas common to multiple point clouds overlap as much as possible.

[Non-patent document 1]: RUSINKIEWICZ, Szymon; LEVOY, Marc. Efficient variants of the ICP algorithm. In: Proceedings third international conference on 3-D digital imaging and modeling. IEEE, 2001. p. 145-152.

Next, in step S206, the CPU 11, functioning as the point cloud integration unit 205, acquires point cloud position information for integrating the foreground point cloud with the background point cloud. Specifically, when integrating the foreground point clouds, if one of the foreground point clouds is used as a reference and the other foreground point clouds are integrated by calculating rotation and translation for aligning them with the reference foreground point cloud, the measured orientation of the reference foreground point cloud may be used as the point cloud position information. Furthermore, in order to reduce the positional deviation before and after integrating multiple foreground point clouds, the foreground point cloud position information may be determined so that the center of gravity does not change before and after the integration. Furthermore, in order to display the foreground point cloud at an arbitrary position, predetermined position information or point cloud position information input from an external system may be used.

In step S207, the CPU 11, functioning as the point cloud integration unit 205, integrates the foreground point cloud and the background point cloud based on the point cloud position information of the foreground point cloud and the point cloud position information of the background point cloud.

In step S208, the CPU 11, as the output unit 206, outputs the three-dimensional point cloud output from the point cloud integration unit 205 as data.

In step S209, the CPU 11 determines whether or not to continue measuring the three-dimensional point cloud. If the measurement of the three-dimensional point cloud is to be continued, the process returns to steps S202 and S204, and a new combination of the three-dimensional point cloud measured by the riders 30A and 30B and the measurement attitude at which the three-dimensional point cloud was measured, and a new combination of the camera images taken by the cameras 20A and 20B and the shooting attitude at which the camera images were taken, is obtained. On the other hand, if the measurement of the three-dimensional point cloud is not to be continued, the point cloud processing routine is terminated.

As described above, the point cloud processing device of the second embodiment can measure multiple moving objects and perform measurements with fewer blind spots by measuring and integrating foreground point clouds using multiple combinations of lidar and cameras.

In addition, by correcting the positions of the foreground point clouds and integrating them so that the difference in areas common to two or more foreground point clouds is minimized, it is possible to reduce the visual discomfort even when there is a difference in the measurement time between measuring instruments and simple integration would result in misalignment.

In addition, when a lidar measures a 3D point cloud horizontally, the ground is captured in a fan shape, and ground points remain in the point cloud output from the coloring section. If the foreground point cloud is integrated in this state, the fan-shaped parts will coincide and alignment will fail, so it is necessary to delete the point cloud in parts other than the foreground in advance. Although it is possible to perform threshold processing based on the height coordinates of the point cloud, in this embodiment, the foreground point cloud is extracted based on the difference with the reference point cloud measured initially. This makes it possible to handle cases where there are slopes, walls, obstacles, etc.

[Third embodiment]
Next, a point cloud processing apparatus according to a third embodiment will be described. Note that parts having the same configuration as those in the first and second embodiments will be given the same reference numerals and descriptions thereof will be omitted.

The third embodiment differs from the second embodiment in that coloring is performed after integrating multiple foreground point groups.

<Configuration of the point cloud processing device according to this embodiment>
FIG. 1 is a block diagram showing the hardware configuration of a point cloud processing device 310 according to this embodiment.

As shown in FIG. 1, the point cloud processing device 310 has a CPU 11, a ROM 12, a RAM 13, a storage 14, an input unit 15, a display unit 16, and a communication interface 17.

As in the second embodiment, the input unit 15 is used to perform various inputs, including two or more sets, with one set being a combination of a measured three-dimensional point cloud and the measurement attitude at which the three-dimensional point cloud was measured, and another being a combination of a camera image capturing an area corresponding to the three-dimensional point cloud and the shooting attitude at which the camera image was captured. The following describes the process of performing two sets of input.

Specifically, the data input to the input unit 15 can be acquired using the equipment configuration shown in FIG. 6 above, which includes two cameras 20A and 20B and two riders 30A and 30B.

Next, the functional configuration of the point cloud processing device 310 will be described. Figure 9 is a block diagram showing an example of the functional configuration of the point cloud processing device 310.

As shown in FIG. 9, the point cloud processing device 310 functionally comprises correspondence units 301A and 301B, foreground extraction units 302A and 302B, a foreground point cloud integration unit 303, a coloring unit 304, a point cloud integration unit 305, and an output unit 306.

Similar to the correspondence unit 201A described in the second embodiment above, the correspondence unit 301A uses the three-dimensional point cloud measured by the rider 30A, the camera image captured by the camera 20A, and either or both of the measurement attitude of the rider 30A and the shooting attitude of the camera 20A to determine the correspondence between each point of the three-dimensional point cloud measured by the rider 30A and each pixel of the camera image captured by the camera 20A.

In addition, similar to the correspondence unit 201B described in the second embodiment above, the correspondence unit 301B uses the three-dimensional point cloud measured by the rider 30B, the camera image captured by the camera 20B, and either or both of the measurement attitude of the rider 30B and the shooting attitude of the camera 20B to determine the correspondence between each point of the three-dimensional point cloud measured by the rider 30B and each pixel of the camera image captured by the camera 20B.

The foreground extraction unit 302A extracts a foreground point cloud, which is a point cloud of the foreground portion, from the three-dimensional point cloud measured by the LIDAR 30A.

Specifically, like the foreground extraction unit 103 described in the first embodiment above, the foreground extraction unit 302A receives the three-dimensional point cloud measured by the LIDAR 30A as input, extracts only the points that correspond to the object being photographed, and outputs them as a foreground point cloud.

Similar to foreground extraction unit 302A, foreground extraction unit 302B extracts a foreground point cloud, which is a point cloud of the foreground portion, from the three-dimensional point cloud measured by LIDAR 30B.

Similar to the foreground point cloud integration unit 204 described in the second embodiment above, the foreground point cloud integration unit 303 converts the foreground point clouds extracted by each of the foreground extraction units 302A and 302B into the same coordinate system and outputs a foreground point cloud in which multiple foreground point clouds are superimposed. In addition, the foreground point cloud integration unit 303 may integrate the foreground point clouds extracted by each of the foreground extraction units 302A and 302B, and align them so that the misalignment of the joints is minimized.

The coloring unit 304 assigns a color to each point of the integrated foreground point group based on the correspondence relationship identified by the correspondence units 301A and 301B, determined based on the color information of the corresponding pixel in the camera image captured by the cameras 20A and 20B. The color to be assigned may be, for example, the median value for each dimension of RGB, or a representative value of the largest cluster may be used after clustering.

Specifically, based on the correspondence relationship specified by correspondence unit 301A, color information of each pixel of the camera image captured by camera 20A is assigned to the corresponding point of the integrated foreground point cloud, and based on the correspondence relationship specified by correspondence unit 301B, color information of each pixel of the camera image captured by camera 20B is assigned to the corresponding point of the integrated foreground point cloud. Then, the color information to be assigned to each point of the integrated foreground point cloud is determined by majority vote, and the colored 3D point cloud is output.

The point cloud integration unit 305 integrates the integrated foreground point cloud and background point cloud, similar to the point cloud integration unit 205 described in the second embodiment above.

The output unit 306 outputs the three-dimensional point cloud output from the point cloud integration unit 305 as data, similar to the output unit 206 described in the second embodiment above.

<Action of the point cloud processing device according to this embodiment>
Next, the operation of the point cloud processing device 310 will be described.

FIG. 10 is a flowchart showing the flow of point cloud processing by the point cloud processing device 310. Point cloud processing is performed by the CPU 11 reading out a point cloud processing program from the ROM 12 or storage 14, expanding it into the RAM 13 and executing it. Furthermore, it is assumed that the data input to the point cloud processing device 310 is acquired by an equipment configuration of cameras 20A, 20B and LIDARs 30A, 30B, and that the cameras 20A, 20B and LIDARs 30A, 30B are each installed so that the object to be photographed that is to be measured falls within the photographing range and measurement range. It is assumed that a previously determined background point cloud has been input to the point cloud processing device 310.

First, in step S301, the CPU 11 acquires a reference point cloud, which is a three-dimensional point cloud measured by the lidar 30A in the absence of an object to be photographed, and a measured attitude.

In parallel with step S301, in step S303, the CPU 11 acquires a reference point cloud, which is a three-dimensional point cloud measured by the lidar 30B in the absence of an object to be photographed, and a measured attitude.

Then, in step S302, the CPU 11 extracts a foreground point cloud from the three-dimensional point cloud measured by the LIDAR 30A.

This step S302 is realized by the same processing as steps S102, S103, and S106 to S108 described in the first embodiment above, and involves obtaining a 3D point cloud and extracting a foreground point cloud.

In parallel with step S302, in step S304, the CPU 11 extracts a foreground point cloud from the three-dimensional point cloud measured by the LIDAR 30B.

This step S304 is realized by the same processing as steps S102, S103, and S106 to S108 described in the first embodiment above, and involves obtaining a 3D point cloud and extracting a foreground point cloud.

Then, in step S305, the CPU 11, functioning as the foreground point cloud integration unit 303, integrates the foreground point clouds extracted by each of the foreground extraction units 302A and 302B, and aligns them so that the misalignment of the joints is minimized.

In step S306, the CPU 11, as the coloring unit 304, assigns a color determined from the color information of the pixels in the camera image to each point of the integrated foreground point cloud based on the correspondence relationship identified by the correspondence units 301A and 301B. In addition, the CPU 11 may densify the point cloud by estimating the depth value of each pixel in the camera image based on a conversion formula, estimating the three-dimensional coordinates, and generating corresponding three-dimensional points.

Next, in step S307, the CPU 11, functioning as the point cloud integration unit 305, acquires point cloud position information for integrating the foreground point cloud with the background point cloud.

In step S308, the CPU 11, functioning as the point cloud integration unit 305, integrates the foreground point cloud and the background point cloud based on the point cloud position information of the foreground point cloud and the point cloud position information of the background point cloud.

In step S309, the CPU 11, as the output unit 306, outputs the three-dimensional point cloud output from the point cloud integration unit 205 as data.

In step S310, the CPU 11 determines whether or not to continue measuring the three-dimensional point cloud. If the measurement of the three-dimensional point cloud is to be continued, the process returns to steps S302 and S304, and a new combination of the three-dimensional point cloud measured by the riders 30A and 30B and the measurement attitude at which the three-dimensional point cloud was measured, and a new combination of the camera images taken by the cameras 20A and 20B and the shooting attitude at which the camera images were taken, is obtained. On the other hand, if the measurement of the three-dimensional point cloud is not to be continued, the point cloud processing routine is terminated.

As described above, the point cloud processing device of the third embodiment colors the point cloud after integrating the foreground point cloud, and can determine the color to be used based on multiple camera images, thereby reducing color unevenness at the boundary positions of multiple point clouds.

When coloring a 3D point cloud, the color of each pixel in the camera images taken by the camera is referenced for the 3D point cloud measured by the LIDAR, and the color of each point is determined. If the LIDAR and the camera are separate devices, parallax will inevitably occur, and calibration errors will also occur. In this embodiment, by determining the color of each point based on multiple camera images, it is possible to compensate for blind spots caused by parallax and reduce coloring errors caused by errors by majority vote.

In addition, when densifying a three-dimensional point cloud in the coloring section, points that did not exist in the original point cloud are added, which may result in points being generated in positions different from the actual positions, which may result in poor integration of the foreground point cloud. In this embodiment, by integrating the foreground point cloud before densifying the point cloud, more accurate alignment can be achieved.

[Fourth embodiment]
Next, a point cloud processing apparatus according to a fourth embodiment will be described. Note that parts having the same configuration as those in the first embodiment will be denoted by the same reference numerals and the description thereof will be omitted.

The fourth embodiment differs from the first embodiment in that the timing of measuring the 3D point cloud and taking camera images is controlled so that they are performed simultaneously, and the measured and captured data is input continuously.

<Configuration of the point cloud processing device according to this embodiment>
FIG. 1 is a block diagram showing the hardware configuration of a point cloud processing device 410 according to this embodiment.

As shown in FIG. 1, the point cloud processing device 410 has a CPU 11, a ROM 12, a RAM 13, a storage 14, an input unit 15, a display unit 16, and a communication interface 17.

Next, the functional configuration of the point cloud processing device 410 will be described. Figure 11 is a block diagram showing an example of the functional configuration of the point cloud processing device 410.

As shown in FIG. 11, the point cloud processing device 410 functionally comprises an imaging control unit 400, a correspondence unit 101, a coloring unit 102, a foreground extraction unit 103, a point cloud integration unit 104, and an output unit 105.

The imaging control unit 400 repeatedly controls the timing of the lidar's measurement of the 3D point cloud and the timing of the camera's image capture to correspond to each other, and inputs the results to the correspondence unit 101.

Specifically, the imaging control unit 400 minimizes the time lag between the 3D point cloud measured by the LIDAR and the camera images taken by the camera, and controls so that measurements and imaging are performed simultaneously. For example, the time may be set in advance to perform measurements and imaging periodically, or an external system may send measurement and imaging signals to multiple measuring devices simultaneously. In addition, measurements are performed continuously by the LIDAR and imaging is performed by the camera, and the measured 3D point cloud and captured camera images are input sequentially to the corresponding unit 101.

Note that other configurations and operations of the point cloud processing device 410 according to the fourth embodiment are similar to those of the first embodiment, and therefore will not be described.

As described above, the point cloud processing device of the fourth embodiment controls the measurement of a three-dimensional point cloud and the capture of camera images, thereby making it possible to suppress coloring of the point cloud caused by a difference in the timing of measurement and capture, and to prevent deviations in alignment when integrating foreground point clouds, and by capturing images continuously, it is possible to capture a time series of a scene including moving objects as three-dimensional information.

When measuring a 3D point cloud that represents an object, including a moving object, in real time, if the point cloud is measured and images are taken at a fixed frame rate, the subsequent processing (densification, point cloud integration) may not be able to keep up. On the other hand, if the next frame is processed after all processing is completed, it will take the same amount of time as sequential processing, even if each process can be parallelized. Therefore, in this embodiment, by timing each process to finish and issuing instructions to measure the point cloud and take images in advance to the lidar and camera, it is possible to achieve measurement at the highest possible frame rate.

The imaging control unit 400 may be configured with multiple PCs that are time-synchronized in advance, or multiple PCs that are connected via a network and operate in cooperation with each other.

The control by the imaging control unit 400 may also be applied to the point cloud processing device of the second and third embodiments.

<Modification>
The present invention is not limited to the above-described embodiment, and various modifications and applications are possible without departing from the spirit and scope of the present invention.

For example, in the above embodiment, the number of lidars and cameras is the same, but this is not limited to the above. As long as the measurement range of the lidar is covered by the shooting range of any of the cameras, the number of lidars and cameras does not have to be the same.

In addition, although the example has been described with respect to a case where there is one background point group calculated in advance, the present invention is not limited to this. Even if there are multiple background point groups calculated in advance, they can be integrated in advance.

Although the example has been described with a case where a camera image is input, the present invention is not limited to this. There may be a case where no camera image is input. In this case, color information is not added to the 3D point cloud.

Although the above description has been given of an example in which the densification of a three-dimensional point cloud is performed using the method described in Patent Document 1, the present invention is not limited to this. The densification of a three-dimensional point cloud may be performed using other densification methods.

In addition, although the example described above is a case where the reference point cloud is measured at the beginning of point cloud processing, the present invention is not limited to this. A depth map may be obtained from a previously determined background point cloud and used as the depth map for the reference point cloud.

The reference point cloud may also be measured by a different LIDAR from the 3D point cloud used to extract the foreground point cloud. In this case, the reference point cloud measured by the different LIDAR can be converted into a depth map viewed from the same viewpoint as the camera image.

Furthermore, various processes that the CPU reads and executes software (programs) in each of the above embodiments may be executed by various processors other than the CPU. Examples of processors in this case include PLDs (Programmable Logic Devices) such as FPGAs (Field-Programmable Gate Arrays) whose circuit configuration can be changed after manufacture, and dedicated electrical circuits such as ASICs (Application Specific Integrated Circuits), which are processors with circuit configurations designed specifically to execute specific processes. Furthermore, point cloud processing may be executed by one of these various processors, or may be executed by a combination of two or more processors of the same or different types (for example, multiple FPGAs, and a combination of a CPU and an FPGA, etc.). Moreover, the hardware structure of these various processors is, more specifically, an electrical circuit that combines circuit elements such as semiconductor elements.

In addition, in each of the above embodiments, the point cloud processing program is described as being pre-stored (installed) in the storage 14, but this is not limiting. The program may be provided in a form stored in a non-transitory storage medium such as a CD-ROM (Compact Disk Read Only Memory), a DVD-ROM (Digital Versatile Disk Read Only Memory), or a USB (Universal Serial Bus) memory. The program may also be downloaded from an external device via a network.

The following notes are further provided with respect to the above embodiment.

(Additional Note 1)
A point cloud processing device,
Memory,
at least one processor coupled to the memory;
Including,
The processor,
receiving a combination of a measured 3D point cloud on a surface of an object and a measurement orientation at which the 3D point cloud was measured;
extracting a foreground point cloud, which is a point cloud of a foreground portion, from the three-dimensional point cloud based on the measured orientation;
A point cloud processing device configured to integrate the foreground point cloud with a background point cloud, which is a point cloud of a background portion that has been obtained in advance.

(Additional Note 2)
A non-transitory storage medium storing a program executable by a computer to perform point cloud processing,
The point cloud processing includes:
receiving a combination of a measured 3D point cloud on a surface of an object and a measurement orientation at which the 3D point cloud was measured;
extracting a foreground point cloud, which is a point cloud of a foreground portion, from the three-dimensional point cloud based on the measured orientation;
A non-transitory storage medium that integrates the foreground point cloud with a background point cloud that is a point cloud of a background portion that has been obtained in advance.

10, 210, 310, 410 Point cloud processing device 11 CPU
14 Storage 15 Input unit 16 Display unit 20A, 20B Camera 30A, 30B LIDAR 101 Correspondence unit 102, 304 Coloring unit 103 Foreground extraction unit 104, 205, 305 Point cloud integration unit 105, 206, 306 Output unit 201A, 201B, 301A, 301B Correspondence unit 202A, 202B Coloring unit 203A, 203B, 302A, 302B Foreground extraction unit 204, 303 Foreground point cloud integration unit 400 Shooting control unit

Claims

an input unit that receives a combination of a three-dimensional point cloud on a surface of a measured object and a measurement orientation at which the three-dimensional point cloud was measured;
a foreground extraction unit that extracts a foreground point cloud, which is a point cloud of a foreground portion, from the three-dimensional point cloud based on the measured orientation;
a point cloud integration unit that integrates the foreground point cloud with a background point cloud that is a point cloud of a background portion that is obtained in advance;
A point cloud processing device comprising:
Further comprising a foreground point cloud integration unit;
the input unit receives two or more combinations of the three-dimensional point cloud and the measurement orientation;
The foreground extraction unit extracts the foreground points from each set of the three-dimensional point clouds;
The foreground point cloud integration unit integrates the foreground point clouds extracted from each set of the three-dimensional point clouds,
The point cloud processing device according to claim 1 , wherein the point cloud integration unit integrates the integrated foreground point cloud and the background point cloud.
The corresponding part,
A coloring portion,
The input unit further receives a combination of a camera image and a shooting attitude at which the camera image was captured, the combination corresponding to the combination of the three-dimensional point cloud and the measurement attitude;
the correspondence unit specifies a correspondence relationship between each point of the three-dimensional point cloud and each pixel of the camera image based on at least one of the measurement attitude and the shooting attitude;
The point cloud processing apparatus according to claim 1 , wherein the coloring unit assigns color information of each pixel of the camera image to a corresponding point in the three-dimensional point cloud based on the correspondence relationship.
Further comprising a foreground point cloud integration unit;
the input unit receives two or more combinations of the three-dimensional point cloud and the measurement attitude, and receives two or more combinations of a camera image and a shooting attitude at which the camera image was captured;
The foreground extraction unit extracts the foreground points from each set of the three-dimensional point clouds;
The foreground point cloud integration unit integrates the foreground point clouds extracted from each set of the three-dimensional point clouds,
the correspondence unit specifies, for each pair, a correspondence relationship between each point of the three-dimensional point cloud and each pixel of the camera image based on the measurement attitude and the shooting attitude;
The point cloud processing device according to claim 3 , wherein the coloring unit assigns color information of a corresponding pixel in the camera image to each point in the integrated foreground point cloud based on the correspondence relationship identified for each pair.
The point cloud processing device according to claim 3, wherein the coloring unit further densifies the three-dimensional point cloud.
The point cloud processing device according to claim 1, wherein the foreground extraction unit converts the received three-dimensional point cloud into a depth map from a specific viewpoint in which each pixel has a pixel value corresponding to the depth value of the object, compares the converted depth map with a reference depth map obtained in advance, and extracts, as the foreground point cloud, a point cloud consisting of points corresponding to pixels whose difference from the depth value of the corresponding pixel in the reference depth map is equal to or greater than a threshold value.
The point cloud processing device according to claim 2, wherein the foreground point cloud integration unit corrects the positions of the foreground point clouds and integrates them so that the difference between areas common to two or more foreground point clouds is minimized.
The point cloud processing device according to claim 3, further comprising an imaging control unit that repeatedly controls the measurement timing of the three-dimensional point cloud and the capture timing of the camera images to correspond to each other, and repeatedly inputs the three-dimensional point cloud and the camera images to the correspondence unit.
The computer
receiving a combination of a measured 3D point cloud on a surface of an object and a measurement orientation at which the 3D point cloud was measured;
extracting a foreground point cloud, which is a point cloud of a foreground portion, from the three-dimensional point cloud based on the measured orientation;
The foreground point cloud is integrated with a background point cloud, which is a point cloud of a background portion that has been obtained in advance.
A point cloud processing program for causing a computer to function as a point cloud processing device according to any one of claims 1 to 8.