WO2021128314A1 - Image processing method and device, image processing system and storage medium - Google Patents

Info

Publication number
WO2021128314A1
Authority
WO
WIPO (PCT)
Prior art keywords
dimensional
image area
projection model
initial
target image
Prior art date
Application number
PCT/CN2019/129364
Other languages
French (fr)
Chinese (zh)
Inventor
徐斌
陈晓智
Original Assignee
深圳市大疆创新科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳市大疆创新科技有限公司
Priority to PCT/CN2019/129364 (WO2021128314A1)
Priority to CN201980094989.8A (CN113661513A)
Publication of WO2021128314A1

Classifications

    • G: PHYSICS
    • G01: MEASURING; TESTING
    • G01C: MEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
    • G01C11/00: Photogrammetry or videogrammetry, e.g. stereogrammetry; Photographic surveying
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00: Geometric image transformations in the plane of the image
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00: Geometric image transformations in the plane of the image
    • G06T3/40: Scaling of whole images or parts thereof, e.g. expanding or contracting

Definitions

  • the embodiments of the present invention relate to the field of image processing technology, and in particular to an image processing method, device, image processing system, and storage medium.
  • the embodiment of the invention discloses an image processing method, equipment, image processing system and storage medium, which can adaptively perform three-dimensional target detection on different types of images, thereby improving the efficiency and effectiveness of image processing.
  • an embodiment of the present invention provides an image processing method, including:
  • an embodiment of the present invention provides an image processing device, including: a memory and a processor,
  • the memory is used to store programs
  • the processor is configured to execute a program stored in the memory, and when the program is executed, the processor is configured to:
  • an embodiment of the present invention provides an image processing system, including:
  • An embodiment of the present invention provides a computer-readable storage medium in which a computer program is stored; when the computer program is executed by a processor, the steps of the method described in the first aspect are implemented.
  • The embodiment of the present invention can process the initial image captured by the shooting device to determine the two-dimensional target image area of the initial image and the semantic information of the target object contained in that area; determine the initial three-dimensional coordinates of the target object according to the projection model of the two-dimensional target image area and the semantic information of the target object; extract the regional feature points of the two-dimensional target image area and determine the three-dimensional coordinate adjustment information of the target object based on those feature points; and determine the target three-dimensional coordinates of the target object according to the initial three-dimensional coordinates and the three-dimensional coordinate adjustment information.
  • three-dimensional target detection can be adaptively performed on different types of images, which improves the efficiency and effectiveness of image processing.
  • Fig. 1 is a schematic structural diagram of an image processing system provided by an embodiment of the present invention
  • FIG. 2 is a schematic flowchart of an image processing method provided by an embodiment of the present invention.
  • FIG. 3 is a schematic diagram of determining three-dimensional coordinates based on a pinhole imaging model according to an embodiment of the present invention
  • FIG. 4 is a schematic diagram of determining three-dimensional coordinates based on an equidistant projection model according to an embodiment of the present invention
  • FIG. 5 is a schematic flowchart of another image processing method provided by an embodiment of the present invention.
  • FIG. 6 is a schematic flowchart of another image processing method provided by an embodiment of the present invention.
  • FIG. 7 is a schematic structural diagram of an image processing method provided by an embodiment of the present invention.
  • FIG. 8 is a schematic flowchart of yet another image processing method provided by an embodiment of the present invention.
  • FIG. 9 is a schematic structural diagram of an image processing method provided by an embodiment of the present invention.
  • Fig. 10 is a schematic structural diagram of an image processing device provided by an embodiment of the present invention.
  • ADAS: Advanced Driving Assistant System.
  • The vision camera can capture surrounding scene information.
  • The monocular camera costs less, so it is a more widely chosen sensor. Therefore, how to perform three-dimensional detection of dynamic obstacles with a monocular camera, including the three-dimensional position, three-dimensional size, and orientation of each obstacle, is a problem that urgently needs to be solved.
  • the two-dimensional or three-dimensional detection of the target object on the image mostly adopts the convolutional neural network in deep learning.
  • the convolutional neural network can be used to extract features from the image, and then the category and circumscribed rectangle of the corresponding target object can be estimated directly from the features.
  • the detection results can generally be divided into three aspects.
  • One is the three-dimensional size of the target object, which is usually expressed by the length, width, and height of the circumscribed cuboid of the target object obtained by the detection.
  • One is the three-dimensional position of the target object, usually represented by the coordinates of the center of the circumscribed cuboid in the camera coordinate system, which can be converted to coordinates in other coordinate systems if necessary.
  • One is the orientation information of the target object, usually expressed by the pitch angle, yaw angle, and roll angle of the circumscribed cuboid.
  • A monocular camera mounted on a vehicle captures an image of the vehicle's surroundings, and the image contains a target object (such as another vehicle). The image can then be semantically recognized by a neural network to identify the target object. If it is a car, it can be assigned preset three-dimensional information corresponding to a car, such as a height of 1.5 meters; combining this preset information with the camera parameters of the monocular camera finally yields the three-dimensional information of the car.
  • For any photographing device such as a camera, its parameters, such as the focal length (f) and optical center coordinates (cx, cy), are known. Therefore, when performing three-dimensional object detection, the three-dimensional coordinates of an object can be roughly estimated from its size in the image. Put simply, near objects appear large and far objects appear small: the smaller the object appears in the image, the farther away it is.
  • The embodiment of the present invention provides an image processing method that processes the initial image captured by the camera to determine the two-dimensional target image area of the initial image and the semantic information of the target object contained in that area; determines the initial three-dimensional coordinates of the target object according to the projection model of the two-dimensional target image area and the semantic information of the target object; extracts the area feature points of the two-dimensional target image area and determines the three-dimensional coordinate adjustment information of the target object based on those feature points; and determines the target three-dimensional coordinates of the target object according to the initial three-dimensional coordinates and the three-dimensional coordinate adjustment information.
  • three-dimensional target detection can be adaptively performed on different types of images, which improves the efficiency and effectiveness of image processing.
  • the image processing method provided in the embodiments of the present invention may be executed by an image processing system, where the image processing system may include a photographing device and an image processing device.
  • In some embodiments, the photographing device may be integrated with the image processing device; in other embodiments, the photographing device may be spatially independent of the image processing device, with the two establishing a communication connection in a wired or wireless manner.
  • In some embodiments, the photographing device may be set on a movable platform configured with a load (such as a photographing device, an infrared detection device, a surveying instrument, etc.); in other embodiments, the photographing device may be spatially independent of the movable platform.
  • the photographing device may include, but is not limited to, a sports camera, a panoramic camera, a fisheye camera, and the like. In some embodiments, the number of the photographing device may be one or more.
  • the movable platform includes movable equipment such as drones, robots capable of autonomous movement, unmanned vehicles, and unmanned ships.
  • the image processing device may include one or more of a smart phone, a tablet computer, a laptop computer, and a wearable device.
  • the embodiments of the present invention can be applied to the detection of obstacles in mobile platforms such as ADAS, autonomous driving, robots, drones, etc., to realize obstacle avoidance functions and subsequent path planning functions.
  • The movable platform may be a vehicle, the imaging device may be a fisheye camera, and the image processing device may be a computing platform mounted on the vehicle. Specifically, the fisheye camera is installed on the rearview mirror of the vehicle to obtain an environmental image of the side of the vehicle, which is then sent to the on-board computing platform for processing. The fisheye camera provides a wider field of view than ordinary lenses, thereby reducing or avoiding blind areas in the camera's field of view.
  • FIG. 1 is a schematic structural diagram of an image processing system provided by an embodiment of the present invention.
  • The image processing system includes a photographing device 11 and an image processing device 12, wherein a communication connection between the photographing device 11 and the image processing device 12 can be established wirelessly.
  • a communication connection between the photographing device 11 and the image processing device 12 may also be established through a wired communication connection.
  • the photographing device 11 may be provided on the image processing device 12.
  • the photographing device 11 and the image processing device 12 are independent of each other, and the image processing device 12 may include one or more of a smart phone, a tablet computer, a laptop computer, and a wearable device.
  • The image processing device 12 may obtain the initial image taken by the photographing device 11 and process it to determine the two-dimensional target image area of the initial image and the semantic information of the target object contained in that area; determine the initial three-dimensional coordinates of the target object according to the projection model of the two-dimensional target image area and the semantic information of the target object; extract the area feature points of the two-dimensional target image area and determine the three-dimensional coordinate adjustment information of the target object based on those feature points; and determine the target three-dimensional coordinates of the target object according to the initial three-dimensional coordinates and the three-dimensional coordinate adjustment information.
  • FIG. 2 is a schematic flowchart of an image processing method provided by an embodiment of the present invention.
  • the method may be executed by an image processing device, and the specific explanation of the image processing device is as described above.
  • the method of the embodiment of the present invention includes the following steps.
  • S201 Acquire an initial image photographed by the photographing device.
  • the image processing device may obtain the initial image captured by the photographing device.
  • the initial image may be obtained by photographing the target object by a photographing device.
  • the explanation of the photographing device is as described above, and will not be repeated here.
  • S202 Process the initial image, and determine the two-dimensional target image area of the initial image and the semantic information of the target object contained in the two-dimensional target image area.
  • the image processing device may process the initial image to determine the two-dimensional target image area of the initial image and the semantic information of the target object contained in the two-dimensional target image area.
  • the image processing device may perform image processing and semantic analysis on the initial image to determine the semantic information of the target object.
  • the semantic information includes, but is not limited to, the category and location information of the target object.
  • the semantic information of the target object includes the semantics of different types of objects such as cars and drones.
  • the location information includes two-dimensional coordinates of the target object.
  • When the image processing device processes the initial image to determine the two-dimensional target image area of the initial image and the semantic information of the target object contained in the two-dimensional target image area, it may process the initial image according to the first neural network to make that determination.
  • the first neural network is a convolutional neural network.
  • S203 Determine the initial three-dimensional coordinates of the target object according to the projection model of the two-dimensional target image area and the semantic information of the target object.
  • the image processing device may determine the initial three-dimensional coordinates of the target object according to the projection model of the two-dimensional target image area and the semantic information of the target object.
  • The projection model includes, but is not limited to, any one of a pinhole imaging model, a Snell window projection model, an equal-area projection model, an equidistant projection model, and a stereographic projection model.
  • When determining the initial three-dimensional coordinates of the target object according to the projection model of the two-dimensional target image area and the semantic information of the target object, the image processing device may acquire the projection model of the two-dimensional target image area, obtain the parameters of the photographing device, and determine the initial three-dimensional coordinates of the two-dimensional target image area according to the projection model of the two-dimensional target image area, the semantic information of the target object, and the parameters of the photographing device.
  • the projection model may be a preset projection model.
  • the parameters of the photographing device include an internal parameter and an external parameter
  • the internal parameter includes the focal length of the photographing device
  • the external parameter includes the optical center of the photographing device
  • When the image processing device acquires the projection model of the two-dimensional target image area, it may determine the category information of the two-dimensional target image area according to the feature point information of the two-dimensional target image area, and determine the projection model of the two-dimensional target image area according to that category information.
  • If the image processing device determines, according to the feature point information of the two-dimensional target image area, that the category information of the two-dimensional target image area is a fisheye image, the projection model of the two-dimensional target image area is any one of the Snell window projection model, equal-area projection model, equidistant projection model, and stereographic projection model corresponding to the fisheye image.
  • When the image processing device acquires the projection model of the two-dimensional target image area, it may process the initial image according to the first neural network to obtain the projection model of the initial image, and determine that the projection model of the initial image is the projection model of the two-dimensional target image area.
  • Alternatively, when the image processing device acquires the projection model of the two-dimensional target image area, it may process the initial image according to a third neural network to obtain the projection model of the initial image, and determine that the projection model of the initial image is the projection model of the two-dimensional target image area.
  • the third neural network may be a convolutional neural network, and the third neural network is different from the first neural network.
  • The projection model includes a first projection model. When the image processing device determines the initial three-dimensional coordinates of the two-dimensional target image area according to the projection model of the two-dimensional target image area, the semantic information of the target object, and the parameters of the photographing device, it may obtain the image height of the target object in the two-dimensional target image area and the actual height of the target object, and use the first projection model to determine the initial three-dimensional coordinates of the two-dimensional target image area according to the image height, the actual height, the semantic information of the target object, and the parameters of the photographing device.
  • The first projection model includes a pinhole imaging model.
  • When the image processing device uses the first projection model to determine the initial three-dimensional coordinates of the two-dimensional target image area based on the image height, the actual height, the semantic information of the target object, and the parameters of the photographing device, it may use the first projection model to determine the actual distance of the target object from the shooting device according to the image height, the actual height, and the parameters of the shooting device, obtain the two-dimensional coordinates of the target object in the image coordinate system, and use the first projection model to determine the initial three-dimensional coordinates of the two-dimensional target image area according to the parameters of the photographing device, the actual distance, the two-dimensional coordinates, and the semantic information of the target object.
  • FIG. 3 is a schematic diagram of determining three-dimensional coordinates based on a pinhole imaging model according to an embodiment of the present invention.
  • f is the focal length of the camera
  • h is the image height of the target object in the two-dimensional target image area
  • H is the actual height of the target object
  • D is the distance from the target object to the camera.
  • By similar triangles, f/D = h/H.
  • Since the category of the target object in each two-dimensional target image area can be predicted, H can be roughly obtained (for example, if the target object is a small car, H can be assumed to be 1 meter), and f is an internal parameter of the camera, so the actual distance D of the target object from the camera can be calculated as D = f × H / h. D is the coordinate of the target object in the Z direction in the camera coordinate system.
  • The X and Y coordinates of the target object in the camera coordinate system can then be calculated, as shown in the following formula (1): X = (u − cx) × D / f, Y = (v − cy) × D / f, where cx and cy are the optical center coordinates.
  • u and v are the coordinates of the target object in the image coordinate system, which can be approximated by using the coordinates of the center point of the target object.
  • After the image processing device obtains the two-dimensional coordinates of the target object in the image coordinate system, it can use those coordinates, the camera internal parameters, and the semantic information of the target object to calculate the initial three-dimensional coordinates of the two-dimensional target image area where the target object is located.
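The pinhole case above can be sketched in a few lines of Python. This is a minimal illustration rather than the patent's implementation: the names `Intrinsics` and `pinhole_initial_xyz` are hypothetical, the focal length is assumed to be expressed in pixels, and the X and Y expressions are the standard pinhole back-projection, assumed here to correspond to formula (1).

```python
from dataclasses import dataclass


@dataclass
class Intrinsics:
    """Camera internal parameters (hypothetical container, units in pixels)."""
    f: float   # focal length
    cx: float  # optical center, x
    cy: float  # optical center, y


def pinhole_initial_xyz(u, v, h_pixels, H_meters, cam):
    """Rough initial 3D position of the object center in the camera frame.

    u, v      -- center of the 2D target image area (pixels)
    h_pixels  -- image height h of the object (pixels)
    H_meters  -- assumed actual height H for the predicted category (meters)
    """
    if h_pixels <= 0:
        raise ValueError("image height must be positive")
    # Similar triangles: f / D = h / H, so D = f * H / h.
    D = cam.f * H_meters / h_pixels
    # Standard pinhole back-projection of the pixel (u, v) at depth D.
    X = (u - cam.cx) * D / cam.f
    Y = (v - cam.cy) * D / cam.f
    return X, Y, D
```

With f = 1000 pixels, an assumed height H = 1.5 m and an image height h = 100 pixels, the depth works out to D = 1000 × 1.5 / 100 = 15 m; the estimate is only as good as the assumed category height.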
  • The projection model includes a second projection model, which the image processing device may use when determining the initial three-dimensional coordinates of the two-dimensional target image area according to the projection model of the two-dimensional target image area, the semantic information of the target object, and the parameters of the photographing device.
  • The second projection model includes any one of a Snell window projection model, an equal-area projection model, an equidistant projection model, and a stereographic projection model; in other embodiments, the second projection model may also include other projection models, which are not specifically limited here.
  • When the image processing device uses the second projection model to determine the initial three-dimensional coordinates of the two-dimensional target image area based on the image distance, the two-dimensional coordinates, the semantic information of the target object, and the parameters of the shooting device, it may use the second projection model to determine the shooting angle of view of the shooting device according to the image distance, the two-dimensional coordinates, and the parameters of the shooting device, and obtain the actual distance of the target object from the photographing device, so as to use the second projection model to determine the initial three-dimensional coordinates of the two-dimensional target image area according to the shooting angle of view, the actual distance, and the semantic information of the target object.
  • When the image processing device uses the second projection model to determine the initial three-dimensional coordinates of the two-dimensional target image area based on the shooting angle of view, the actual distance, and the semantic information of the target object, it may use the second projection model to determine the physical distance of the target object according to the shooting angle of view and the actual distance, and determine the initial three-dimensional coordinates of the two-dimensional target image area according to the parameters of the shooting device, the shooting angle of view, the two-dimensional coordinates, the physical distance, and the semantic information of the target object.
  • FIG. 4 can be used as an example for description.
  • FIG. 4 is a schematic diagram of determining three-dimensional coordinates based on an equidistant projection model according to an embodiment of the present invention. In FIG. 4, f is the focal length of the camera, t is the distance of the target object in the two-dimensional target image area, r is the actual distance of the target object, and z is the distance from the camera to the object. According to the principle of equidistant projection, the following formula (2) holds:
  • u, v are the pixel coordinates of the target object in the image coordinate system
  • f, cx, and cy are the internal camera parameters.
  • h is the height of the target object in the two-dimensional target image area
  • H is the actual height of the target object
  • The category to which the target object in the two-dimensional target image area belongs can be predicted based on the semantic information of the target object, and then H can be roughly obtained (for example, if the target object is a car, it can be assumed that H is 1 meter).
  • u and v are the coordinates of the target object in the image coordinate system, which can be approximated by the coordinates of the center point of the target object. Therefore, θ can be calculated first, and then the value of r. Finally, the X, Y, and Z coordinates of the target object in the camera coordinate system, that is, the initial three-dimensional coordinates, can be obtained.
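The order of computation described above (θ first, then r, then X, Y, Z) can be illustrated with the following Python sketch. It rests on several assumptions that are not stated in the text: θ is taken to be the angle between the optical axis and the ray to the object, the equidistant relation is taken in its textbook form t = f × θ with t measured from the optical center (cx, cy), r is read as the lateral offset of the object from the optical axis, and the depth z along the optical axis is supplied by the caller (for example from an assumed object height). The function name `equidistant_initial_xyz` is hypothetical.

```python
import math


def equidistant_initial_xyz(u, v, z, f, cx, cy):
    """Initial 3D coordinates under an equidistant (t = f * theta) fisheye model.

    u, v       -- center of the 2D target image area (pixels)
    z          -- assumed depth of the object along the optical axis (meters)
    f, cx, cy  -- camera internal parameters (pixels)
    """
    du, dv = u - cx, v - cy
    t = math.hypot(du, dv)       # image distance from the optical center
    if t == 0.0:
        return 0.0, 0.0, z       # object lies on the optical axis
    theta = t / f                # equidistant projection: t = f * theta
    if theta >= math.pi / 2:
        raise ValueError("ray is at or beyond 90 degrees; depth z is undefined")
    r = z * math.tan(theta)      # lateral offset from the optical axis
    # Split r along the image-plane direction of the pixel offset.
    return r * du / t, r * dv / t, z
```

A point imaged at t = f × π/4 from the optical center corresponds to θ = 45°, so its lateral offset r equals its depth z.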
  • The above describes how to calculate the initial three-dimensional coordinates of the target object from the internal parameters of the camera, the two-dimensional coordinates of the target object in the image coordinate system, the semantic information of the target object, the image distance of the target object in the two-dimensional target image area, and the projection model, with the pinhole imaging and equidistant projection models described in detail.
  • For other projection models, the initial X, Y, and Z coordinates can also be calculated, which will not be repeated here.
  • The image processing device may also detect the two-dimensional target image area according to a second neural network to obtain the area information of the two-dimensional target image area, wherein the area information includes any one or more of category information, three-dimensional size information, orientation information, and two-dimensional circumscribed rectangle information of the two-dimensional target image area.
  • The second neural network may be a convolutional neural network, and the second neural network is different from the first neural network and the third neural network.
  • S204 Perform area feature point extraction on the two-dimensional target image area, and determine three-dimensional coordinate adjustment information of the target object based on the area feature points.
  • the image processing device may perform area feature point extraction on the two-dimensional target image area, and determine the three-dimensional coordinate adjustment information of the target object based on the area feature points.
  • When the image processing device extracts the area feature points of the two-dimensional target image area, it may process the two-dimensional target image area according to the second neural network to extract the feature point information of the two-dimensional target image area.
  • the second neural network is a convolutional neural network. In some embodiments, the second neural network is different from the first neural network and the third neural network.
  • S205 Determine the target three-dimensional coordinates of the target object according to the initial three-dimensional coordinates and the three-dimensional coordinate adjustment information.
  • the image processing device may determine the target three-dimensional coordinates of the target object according to the initial three-dimensional coordinates and the three-dimensional coordinate adjustment information.
  • The image processing device may process the two-dimensional target image area according to the second neural network, determine the three-dimensional coordinate adjustment information of the two-dimensional target image area, and determine the target three-dimensional coordinates of the two-dimensional target image area according to the three-dimensional coordinate adjustment information and the initial three-dimensional coordinates.
  • When the image processing device determines the target three-dimensional coordinates of the two-dimensional target image area according to the three-dimensional coordinate adjustment information and the initial three-dimensional coordinates, it may adjust the initial three-dimensional coordinates according to the three-dimensional coordinate adjustment information, and determine that the adjusted three-dimensional coordinates are the target three-dimensional coordinates of the two-dimensional target image area.
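The text does not say what form the three-dimensional coordinate adjustment information takes. The sketch below assumes, purely for illustration, that it is a per-axis additive offset (dx, dy, dz) predicted by the second neural network; the function name `apply_adjustment` is hypothetical.

```python
def apply_adjustment(initial_xyz, adjustment_xyz):
    """Return target coordinates as initial coordinates plus an additive offset.

    The additive form is an assumption; a scale factor or a learned residual in
    another parameterization would fit the description equally well.
    """
    if len(initial_xyz) != 3 or len(adjustment_xyz) != 3:
        raise ValueError("both arguments must have exactly three components")
    return tuple(c + d for c, d in zip(initial_xyz, adjustment_xyz))
```

Under this assumption, the geometric estimate supplies a coarse position and the network only has to predict a small correction to it.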
  • The image processing equipment can process the initial image captured by the photographing device to determine the two-dimensional target image area of the initial image and the semantic information of the target object contained in that area; determine the initial three-dimensional coordinates of the target object according to the projection model of the two-dimensional target image area and the semantic information of the target object; extract regional feature points from the two-dimensional target image area and determine the three-dimensional coordinate adjustment information of the target object based on those feature points; and determine the target three-dimensional coordinates of the target object according to the initial three-dimensional coordinates and the three-dimensional coordinate adjustment information.
  • three-dimensional target detection can be adaptively performed on different types of images, which improves the efficiency and effectiveness of image processing.
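Steps S201 to S205 can be read as a short pipeline. The Python sketch below shows only the control flow; every stage is passed in as a callable because the text does not fix how the networks or the projection-model geometry are implemented. All names (`detect_3d` and its parameters) are hypothetical.

```python
from typing import Any, Callable, Dict, Iterable, List, Tuple

Vec3 = Tuple[float, float, float]


def detect_3d(
    initial_image: Any,
    detect_regions: Callable[[Any], Iterable[Tuple[Any, Any]]],
    estimate_initial_xyz: Callable[[Any, Any], Vec3],
    predict_adjustment: Callable[[Any], Any],
    apply_adjustment: Callable[[Vec3, Any], Vec3],
) -> List[Dict[str, Any]]:
    """Run the S202-S205 stages on an already acquired image (S201)."""
    results = []
    # S202: 2D target image areas and the semantics of the objects in them.
    for region, semantics in detect_regions(initial_image):
        # S203: initial coordinates from the projection model and semantics.
        initial = estimate_initial_xyz(region, semantics)
        # S204: adjustment information from region feature points.
        adjustment = predict_adjustment(region)
        # S205: target coordinates from initial coordinates and adjustment.
        target = apply_adjustment(initial, adjustment)
        results.append(
            {"semantics": semantics, "initial": initial, "target": target}
        )
    return results
```

Keeping the stages as injected callables mirrors the description, in which the projection model and the networks may differ between embodiments while the order of the steps stays the same.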
  • FIG. 5 is a schematic flowchart of another image processing method provided by an embodiment of the present invention.
  • the method may be executed by an image processing device, and the specific explanation of the image processing device is as described above.
  • the method of the embodiment of the present invention includes the following steps.
  • S501 Acquire an initial image photographed by the photographing device.
  • the image processing device may obtain the initial image captured by the photographing device.
  • the explanation of the photographing device is as described above, and will not be repeated here.
  • S502 Process the initial image, and determine the two-dimensional target image area of the initial image and the semantic information of the target object contained in the two-dimensional target image area.
  • the image processing device may process the initial image to determine the two-dimensional target image area of the initial image and the semantic information of the target object contained in the two-dimensional target image area.
  • the specific implementation is as described above and will not be repeated here.
  • S503 Acquire a projection model of the two-dimensional target image area, where the projection model is a preset projection model.
  • the image processing device may obtain a projection model of the two-dimensional target image area, and the projection model is a preset projection model.
  • the explanation of the projection model is as described above, and will not be repeated here.
  • S504 Determine the initial three-dimensional coordinates of the target object according to the projection model of the two-dimensional target image area and the semantic information of the target object.
  • the image processing device may determine the initial three-dimensional coordinates of the target object according to the projection model of the two-dimensional target image area and the semantic information of the target object.
  • the specific implementation is as described above and will not be repeated here.
  • S505 Perform area feature point extraction on the two-dimensional target image area, and determine three-dimensional coordinate adjustment information of the target object based on the area feature points.
  • the image processing device may perform area feature point extraction on the two-dimensional target image area, and determine the three-dimensional coordinate adjustment information of the target object based on the area feature points.
  • S506 Determine the target three-dimensional coordinates of the target object according to the initial three-dimensional coordinates and the three-dimensional coordinate adjustment information.
  • the image processing device may determine the target three-dimensional coordinates of the target object according to the initial three-dimensional coordinates and the three-dimensional coordinate adjustment information.
  • the specific implementation is as described above and will not be repeated here.
  • the image processing device can process the initial image captured by the photographing device to determine the two-dimensional target image area of the initial image and the semantic information of the target object contained in that area, determine the initial three-dimensional coordinates of the target object according to the preset projection model and the semantic information of the target object, extract regional feature points from the two-dimensional target image area and determine the three-dimensional coordinate adjustment information of the target object based on those feature points, and thereby determine the target three-dimensional coordinates of the target object according to the initial three-dimensional coordinates and the three-dimensional coordinate adjustment information.
  • three-dimensional target detection can be adaptively performed on different types of images, which improves the efficiency and effectiveness of image processing.
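
The flow of steps S501 to S506 can be sketched as a short pipeline. This is only a minimal illustration under stated assumptions: the names `detect_3d`, `detect_region`, `initial_coords` and `adjustment` are hypothetical, the three callables stand in for the networks and the preset projection model described above, and combining the two results by simple addition is an assumption, since the text does not state how the adjustment information is applied.

```python
from typing import Callable, Sequence, Tuple

Vec3 = Tuple[float, float, float]
Box = Tuple[float, float, float, float]  # (u_min, v_min, u_max, v_max), hypothetical layout


def detect_3d(
    image: Sequence,
    detect_region: Callable[[Sequence], Tuple[Box, dict]],
    initial_coords: Callable[[Box, dict], Vec3],
    adjustment: Callable[[Sequence, Box], Vec3],
) -> Vec3:
    """Sketch of S501-S506; every callable is a placeholder (assumption)."""
    # S502: 2D target image area plus semantic information of the target object.
    box, semantics = detect_region(image)
    # S503/S504: the preset projection model is assumed to live inside `initial_coords`.
    x0, y0, z0 = initial_coords(box, semantics)
    # S505: adjustment information derived from regional feature points.
    dx, dy, dz = adjustment(image, box)
    # S506: combine both results; simple addition is assumed here.
    return (x0 + dx, y0 + dy, z0 + dz)
```

Passing the stages in as callables keeps the sketch independent of any particular network or projection model, which mirrors the way the embodiments below swap only the projection model step.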
  • FIG. 6 is a schematic flowchart of another image processing method provided by an embodiment of the present invention.
  • the method may be executed by an image processing device, and the specific explanation of the image processing device is as described above.
  • the method of the embodiment of the present invention includes the following steps.
  • S601 Acquire an initial image photographed by the photographing device.
  • the image processing device may obtain the initial image captured by the photographing device.
  • the explanation of the photographing device is as described above, and will not be repeated here.
  • S602 Process the initial image, and determine the two-dimensional target image area of the initial image and the semantic information of the target object contained in the two-dimensional target image area.
  • the image processing device may process the initial image to determine the two-dimensional target image area of the initial image and the semantic information of the target object contained in the two-dimensional target image area.
  • the specific implementation is as described above and will not be repeated here.
  • S603 Process the initial image according to the third neural network to obtain a projection model of the initial image, and determine that the projection model of the initial image is the projection model of the two-dimensional target image area.
  • the image processing device may process the initial image according to a third neural network to obtain a projection model of the initial image, and determine that the projection model of the initial image is the projection model of the two-dimensional target image area.
  • FIG. 7 is a schematic structural diagram of an image processing method provided by an embodiment of the present invention.
  • the image processing device may process the initial image 72 through the first neural network 71.
  • the two-dimensional target image area is detected according to the second neural network 74 to obtain the area information 75 of the two-dimensional target image area, where the area information 75 includes any one or more of category information, three-dimensional size information, orientation information, and two-dimensional circumscribed matrix information of the two-dimensional target image area.
  • the projection model 77 of the two-dimensional target image area is determined by the third neural network 76, wherein the projection model 77 is at least one determined from a plurality of projection models.
  • the corresponding projection model can be selected automatically, and various projection models can be accommodated within the same algorithm framework to handle all types of monocular images, which helps to improve the efficiency and effectiveness of image processing.
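
One way to picture the automatic selection is as a classification over candidate projection models. The sketch below assumes that the third neural network outputs one raw score per candidate; the label set `PROJECTION_MODELS`, the function names, and the use of a softmax are illustrative assumptions and are not taken from the source.

```python
import math
from typing import Dict, List, Sequence

# Hypothetical label set; the source only says the model is chosen from several.
PROJECTION_MODELS = ("pinhole", "equidistant", "stereographic")


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over raw classifier scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def select_projection_model(scores: Sequence[float]) -> Dict[str, object]:
    """Pick the most probable projection model from hypothetical network scores."""
    if len(scores) != len(PROJECTION_MODELS):
        raise ValueError("one score per projection model is expected")
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return {"model": PROJECTION_MODELS[best], "confidence": probs[best]}
```

Returning the confidence alongside the label would let a caller fall back to a preset projection model when the score is low, although the source does not describe such a fallback.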
  • S604 Determine the initial three-dimensional coordinates of the target object according to the projection model of the two-dimensional target image area and the semantic information of the target object.
  • the image processing device may determine the initial three-dimensional coordinates of the target object according to the projection model of the two-dimensional target image area and the semantic information of the target object.
  • the specific implementation is as described above and will not be repeated here.
  • S605 Perform area feature point extraction on the two-dimensional target image area, and determine three-dimensional coordinate adjustment information of the target object based on the area feature points.
  • the image processing device may perform area feature point extraction on the two-dimensional target image area, and determine the three-dimensional coordinate adjustment information of the target object based on the area feature points.
  • S606 Determine the target three-dimensional coordinates of the target object according to the initial three-dimensional coordinates and the three-dimensional coordinate adjustment information.
  • the image processing device may determine the target three-dimensional coordinates of the target object according to the initial three-dimensional coordinates and the three-dimensional coordinate adjustment information.
  • the specific implementation is as described above, and will not be repeated here.
  • the image processing device may process the initial image captured by the photographing device to determine the two-dimensional target image area of the initial image and the semantic information of the target object contained in that area, determine the initial three-dimensional coordinates of the target object according to the semantic information of the target object and the projection model of the two-dimensional target image area obtained by processing the initial image with the third neural network, extract regional feature points from the two-dimensional target image area and determine the three-dimensional coordinate adjustment information of the target object based on those feature points, and thereby determine the target three-dimensional coordinates of the target object according to the initial three-dimensional coordinates and the three-dimensional coordinate adjustment information.
  • the projection model can be determined through a neural network to adaptively perform three-dimensional target detection on different types of images, which improves the efficiency and effectiveness of image processing.
  • FIG. 8 is a schematic flowchart of another image processing method provided by an embodiment of the present invention.
  • the method may be executed by an image processing device, and the specific explanation of the image processing device is as described above.
  • the method of the embodiment of the present invention includes the following steps.
  • S801 Acquire an initial image photographed by the photographing device.
  • the image processing device may obtain the initial image captured by the photographing device.
  • the explanation of the photographing device is as described above, and will not be repeated here.
  • S802 Process the initial image, and determine the two-dimensional target image area of the initial image and the semantic information of the target object contained in the two-dimensional target image area.
  • the image processing device may process the initial image to determine the two-dimensional target image area of the initial image and the semantic information of the target object contained in the two-dimensional target image area.
  • the specific implementation is as described above and will not be repeated here.
  • S803 Process the initial image according to the first neural network to obtain a projection model of the initial image, and determine that the projection model of the initial image is the projection model of the two-dimensional target image area.
  • the image processing device may process the initial image according to the first neural network to obtain the projection model of the initial image, and determine that the projection model of the initial image is the projection model of the two-dimensional target image area.
  • the image processing device may process the two-dimensional target image area according to the second neural network to extract feature point information of the two-dimensional target image area, determine the category information of the two-dimensional target image area according to that feature point information, and determine the projection model of the two-dimensional target image area according to the category information.
  • take FIG. 9 as an example.
  • FIG. 9 is a schematic structural diagram of an image processing method provided by an embodiment of the present invention. As shown in FIG. 9, the two-dimensional target image area 92 is processed to extract feature point information 93 of the two-dimensional target image area 92. The two-dimensional target image area is detected according to the second neural network 94 to obtain the area information 95 of the two-dimensional target image area.
  • the projection model 96 of the two-dimensional target image area 92 is determined by the first neural network 91, and the initial three-dimensional coordinates 981 of the target object are determined according to the projection model 96 of the two-dimensional target image area, the parameters 97 of the photographing device, and the semantic information of the target object; regional feature points are extracted from the two-dimensional target image area, and the three-dimensional coordinate adjustment information 982 of the target object is determined based on the regional feature points; the target three-dimensional coordinates 99 of the target object are then determined according to the initial three-dimensional coordinates 981 and the three-dimensional coordinate adjustment information 982.
  • the neural network for judging the projection model and the neural network for extracting the feature point information of the two-dimensional target image area can be shared, reducing the amount of calculation and complexity, and further improving the efficiency of image processing.
  • S804 Determine the initial three-dimensional coordinates of the target object according to the projection model of the two-dimensional target image area and the semantic information of the target object.
  • the image processing device may determine the initial three-dimensional coordinates of the target object according to the projection model of the two-dimensional target image area and the semantic information of the target object.
  • when determining the initial three-dimensional coordinates of the target object according to the projection model of the two-dimensional target image area and the semantic information of the target object, the image processing device may obtain the parameters of the photographing device and determine the initial three-dimensional coordinates of the two-dimensional target image area according to the projection model of the two-dimensional target image area, the semantic information of the target object, and the parameters of the photographing device.
  • the parameters of the photographing device include an internal parameter and an external parameter, the internal parameter includes the focal length of the photographing device, and the external parameter includes the optical center of the photographing device.
  • S805 Perform area feature point extraction on the two-dimensional target image area, and determine three-dimensional coordinate adjustment information of the target object based on the area feature points.
  • the image processing device may perform area feature point extraction on the two-dimensional target image area, and determine the three-dimensional coordinate adjustment information of the target object based on the area feature points.
  • S806 Determine the target three-dimensional coordinates of the target object according to the initial three-dimensional coordinates and the three-dimensional coordinate adjustment information.
  • the image processing device may determine the target three-dimensional coordinates of the target object according to the initial three-dimensional coordinates and the three-dimensional coordinate adjustment information.
  • the image processing device may process the initial image captured by the photographing device to determine the two-dimensional target image area of the initial image and the semantic information of the target object contained in that area, process the initial image according to the first neural network to obtain a projection model of the two-dimensional target image area, determine the initial three-dimensional coordinates of the target object according to that projection model and the semantic information of the target object, extract regional feature points from the two-dimensional target image area and determine the three-dimensional coordinate adjustment information of the target object based on those feature points, and thereby determine the target three-dimensional coordinates of the target object according to the initial three-dimensional coordinates and the three-dimensional coordinate adjustment information.
  • the projection model of the two-dimensional target image area can be determined from the category information of the two-dimensional target image area, which is in turn determined according to the feature point information, so that three-dimensional target detection is adaptively performed on different types of images, thereby improving the efficiency and effectiveness of image processing.
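
For the pinhole case, steps S804 and S806 can be illustrated with similar triangles: an object of real height H that appears h pixels tall under a focal length of f pixels lies at depth Z = f·H/h. The sketch below is an assumption-laden illustration rather than the patented computation: the function names, the use of the box centre, the units, and the additive use of the adjustment information are all hypothetical.

```python
from typing import Tuple

Vec3 = Tuple[float, float, float]


def pinhole_initial_coords(
    u: float, v: float,       # pixel centre of the 2D target image area
    image_height_px: float,   # height of the area in the image, in pixels
    actual_height_m: float,   # assumed real height of the object class, in metres
    focal_px: float,          # focal length of the photographing device, in pixels
    cx: float, cy: float,     # optical centre (principal point), in pixels
) -> Vec3:
    """Back-project a box centre with the pinhole model (camera frame)."""
    if image_height_px <= 0 or focal_px <= 0:
        raise ValueError("image height and focal length must be positive")
    # Similar triangles: h_px / f = H / Z, hence Z = f * H / h_px.
    z = focal_px * actual_height_m / image_height_px
    x = (u - cx) * z / focal_px
    y = (v - cy) * z / focal_px
    return (x, y, z)


def apply_adjustment(initial: Vec3, delta: Vec3) -> Vec3:
    """S806 sketch: an additive correction is an assumption, not stated in the text."""
    return (initial[0] + delta[0], initial[1] + delta[1], initial[2] + delta[2])
```

In this reading, the actual height would come from the semantic information (for example, a typical height for the detected category), while the focal length and optical centre come from the parameters of the photographing device.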
  • FIG. 10 is a schematic structural diagram of an image processing device according to an embodiment of the present invention.
  • the image processing device includes: a memory 1001 and a processor 1002.
  • the image processing device further includes a data interface 1003, and the data interface 1003 is used to transfer data information between the image processing device and other devices.
  • the memory 1001 may include a volatile memory, a non-volatile memory, or a combination of the foregoing types of memory.
  • the processor 1002 may be a central processing unit (CPU).
  • the processor 1002 may further include a hardware chip.
  • the above-mentioned hardware chip may be an application-specific integrated circuit (ASIC), a programmable logic device (PLD) or a combination thereof.
  • the above-mentioned PLD may be a complex programmable logic device (CPLD), a field-programmable gate array (FPGA), or any combination thereof.
  • the memory 1001 is used to store programs, and the processor 1002 can call the programs stored in the memory 1001 to perform the following steps:
  • when the processor 1002 processes the initial image to determine the two-dimensional target image area of the initial image and the semantic information of the target object contained in the two-dimensional target image area, it is specifically configured to:
  • process the initial image according to the first neural network, and determine the two-dimensional target image area of the initial image and the semantic information of the target object contained in the two-dimensional target image area.
  • when the processor 1002 performs regional feature point extraction on the two-dimensional target image area, it is specifically configured to:
  • process the two-dimensional target image area according to the second neural network to extract feature point information of the two-dimensional target image area.
  • processor 1002 is further configured to:
  • the area information includes any one or more of category information, three-dimensional size information, orientation information, and two-dimensional circumscribed matrix information of the two-dimensional target image area.
  • when the processor 1002 determines the initial three-dimensional coordinates of the target object according to the projection model of the two-dimensional target image area and the semantic information of the target object, it is specifically configured to:
  • the initial three-dimensional coordinates of the two-dimensional target image area are determined according to the projection model of the two-dimensional target image area, the semantic information of the target object, and the parameters of the photographing device.
  • the projection model is a preset projection model.
  • when the processor 1002 acquires the projection model of the two-dimensional target image area, it is specifically configured to:
  • a projection model of the two-dimensional target image area is determined.
  • when the processor 1002 acquires the projection model of the two-dimensional target image area, it is specifically configured to:
  • the projection model of the initial image is the projection model of the two-dimensional target image area.
  • when the processor 1002 acquires the projection model of the two-dimensional target image area, it is specifically configured to:
  • the projection model of the initial image is the projection model of the two-dimensional target image area.
  • the projection model includes a first projection model; when the processor 1002 determines the initial three-dimensional coordinates of the two-dimensional target image area according to the projection model of the two-dimensional target image area, the semantic information of the target object, and the parameters of the photographing device, it is specifically configured to:
  • the first projection model is used to determine the initial three-dimensional coordinates of the two-dimensional target image area according to the image height, the actual height, the semantic information of the target object, and the parameters of the photographing device.
  • when the processor 1002 uses the first projection model to determine the initial three-dimensional coordinates of the two-dimensional target image area according to the image height, the actual height, the semantic information of the target object, and the parameters of the photographing device, it is specifically configured to:
  • the first projection model is used to determine the initial three-dimensional coordinates of the two-dimensional target image area.
  • the projection model includes a second projection model; when the processor 1002 determines the initial three-dimensional coordinates of the two-dimensional target image area according to the projection model of the two-dimensional target image area, the semantic information of the target object, and the parameters of the photographing device, it is specifically configured to:
  • the second projection model is used to determine the initial three-dimensional coordinates of the two-dimensional target image area according to the image distance, the two-dimensional coordinates, the semantic information of the target object, and the parameters of the photographing device.
  • when the processor 1002 uses the second projection model to determine the initial three-dimensional coordinates of the two-dimensional target image area according to the image distance, the two-dimensional coordinates, the semantic information of the target object, and the parameters of the photographing device, it is specifically configured to:
  • the second projection model is used to determine the initial three-dimensional coordinates of the two-dimensional target image area according to the shooting angle of view, the actual distance, and the semantic information of the target object.
  • when the processor 1002 uses the second projection model to determine the initial three-dimensional coordinates of the two-dimensional target image area according to the shooting angle of view, the actual distance, and the semantic information of the target object, it is specifically configured to:
  • when the processor 1002 determines the target three-dimensional coordinates of the target object according to the initial three-dimensional coordinates and the three-dimensional coordinate adjustment information, it is specifically configured to:
  • the parameters of the photographing device include internal parameters and external parameters; the internal parameters include the focal length of the photographing device, and the external parameters include the optical center of the photographing device.
  • the first projection model includes a pinhole imaging model.
  • the second projection model includes any one of a Snell's window projection model, an equisolid-angle (equal-area) projection model, an equidistant projection model, and a stereographic projection model.
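
The difference between the two kinds of projection model can be illustrated by how an image distance r (pixel distance from the optical centre) maps to a viewing angle θ. The sketch below uses the textbook mappings r = f·tan θ (pinhole), r = f·θ (equidistant) and r = 2f·tan(θ/2) (stereographic), then places the point at a given actual distance along the viewing ray. Treating these as the models intended by the source, as well as the function names and the camera-frame convention, are assumptions.

```python
import math
from typing import Tuple


def incidence_angle(r_px: float, focal_px: float, model: str) -> float:
    """Angle between the viewing ray and the optical axis for image radius r."""
    if model == "pinhole":
        return math.atan(r_px / focal_px)                # r = f * tan(theta)
    if model == "equidistant":
        return r_px / focal_px                           # r = f * theta
    if model == "stereographic":
        return 2.0 * math.atan(r_px / (2.0 * focal_px))  # r = 2f * tan(theta / 2)
    raise ValueError(f"unknown projection model: {model}")


def back_project(
    u: float, v: float, cx: float, cy: float,
    focal_px: float, distance_m: float, model: str,
) -> Tuple[float, float, float]:
    """Place a point at `distance_m` along the ray through pixel (u, v)."""
    du, dv = u - cx, v - cy
    r = math.hypot(du, dv)
    if r == 0.0:
        # The pixel lies on the optical axis.
        return (0.0, 0.0, distance_m)
    theta = incidence_angle(r, focal_px, model)
    # Scale the in-image direction so the lateral offset equals sin(theta) * distance.
    s = math.sin(theta) * distance_m / r
    return (du * s, dv * s, math.cos(theta) * distance_m)
```

Only `incidence_angle` depends on the projection model, which is one way to read the claim that several models can share the same algorithm framework.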
  • the image processing device can process the initial image captured by the photographing device to determine the two-dimensional target image area of the initial image and the semantic information of the target object contained in that area, determine the initial three-dimensional coordinates of the target object according to the projection model of the two-dimensional target image area and the semantic information of the target object, extract regional feature points from the two-dimensional target image area and determine the three-dimensional coordinate adjustment information of the target object based on those feature points, and thereby determine the target three-dimensional coordinates of the target object according to the initial three-dimensional coordinates and the three-dimensional coordinate adjustment information.
  • various projection models can be adapted within the same algorithm framework to adapt to all types of monocular images, which improves the efficiency and effectiveness of image processing.
  • An embodiment of the present invention also provides an image processing system, including: a photographing device and the above-mentioned image processing equipment.
  • the image processing device can process the initial image captured by the photographing device to determine the two-dimensional target image area of the initial image and the semantic information of the target object contained in that area, determine the initial three-dimensional coordinates of the target object according to the projection model of the two-dimensional target image area and the semantic information of the target object, extract regional feature points from the two-dimensional target image area and determine the three-dimensional coordinate adjustment information of the target object based on those feature points, and thereby determine the target three-dimensional coordinates of the target object according to the initial three-dimensional coordinates and the three-dimensional coordinate adjustment information.
  • three-dimensional target detection can be adaptively performed on different types of images, which improves the efficiency and effectiveness of image processing.
  • an embodiment of the present invention also provides a computer-readable storage medium that stores a computer program; when the computer program is executed by a processor, it implements the method described in the embodiment corresponding to FIG. 2, FIG. 5, FIG. 6 or FIG. 8 of the present invention, and can also implement the device of the embodiment corresponding to FIG. 10, which will not be repeated here.
  • the computer-readable storage medium may be an internal storage unit of the device described in any of the foregoing embodiments, such as a hard disk or memory of the device.
  • the computer-readable storage medium may also be an external storage device of the device, such as a plug-in hard disk equipped on the device, a Smart Media Card (SMC), a Secure Digital (SD) card, a flash card, etc.
  • the computer-readable storage medium may also include both an internal storage unit of the device and an external storage device.
  • the computer-readable storage medium is used to store the computer program and other programs and data required by the terminal.
  • the computer-readable storage medium can also be used to temporarily store data that has been output or will be output.

Abstract

An image processing method and device, an image processing system and a storage medium. Said method comprises: acquiring an initial image photographed by a photographing apparatus; processing the initial image, to determine a two-dimensional target image area of the initial image and semantic information of a target object contained in the two-dimensional target image area; according to a projection model of the two-dimensional target image area and the semantic information of the target object, determining the initial three-dimensional coordinates of the target object; performing area feature point extraction on the two-dimensional target image area, and determining, on the basis of the area feature points, three-dimensional coordinate adjustment information of the target object; and according to the initial three-dimensional coordinates and the three-dimensional coordinate adjustment information, determining the target three-dimensional coordinates of the target object. By means of this implementation, three-dimensional target detection can be adaptively performed on different types of images, improving the efficiency and effectiveness of image processing.

Description

Image processing method, device, image processing system, and storage medium
Technical field
The embodiments of the present invention relate to the field of image processing technology, and in particular to an image processing method, device, image processing system, and storage medium.
Background
At present, most schemes for three-dimensional target detection on monocular images are aimed at images obtained by ordinary pinhole imaging projection, and few schemes address other projection models. If an ordinary monocular three-dimensional target detection algorithm is applied directly to images obtained by other projection methods (such as fisheye images), the three-dimensional target detection performance on those images degrades. Therefore, how to perform three-dimensional target detection on different types of images more effectively has become an urgent problem to be solved.
Summary of the invention
The embodiments of the present invention disclose an image processing method, device, image processing system, and storage medium that can adaptively perform three-dimensional target detection on different types of images, thereby improving the efficiency and effectiveness of image processing.
In a first aspect, an embodiment of the present invention provides an image processing method, including:
acquiring an initial image captured by a photographing device;
processing the initial image to determine the two-dimensional target image area of the initial image and the semantic information of the target object contained in the two-dimensional target image area;
determining the initial three-dimensional coordinates of the target object according to the projection model of the two-dimensional target image area and the semantic information of the target object;
performing regional feature point extraction on the two-dimensional target image area, and determining the three-dimensional coordinate adjustment information of the target object based on the regional feature points;
determining the target three-dimensional coordinates of the target object according to the initial three-dimensional coordinates and the three-dimensional coordinate adjustment information.
In a second aspect, an embodiment of the present invention provides an image processing device, including a memory and a processor;
the memory is configured to store a program;
the processor is configured to execute the program stored in the memory, and when the program is executed, the processor is configured to:
acquire an initial image captured by a photographing device;
process the initial image to determine the two-dimensional target image area of the initial image and the semantic information of the target object contained in the two-dimensional target image area;
determine the initial three-dimensional coordinates of the target object according to the projection model of the two-dimensional target image area and the semantic information of the target object;
perform regional feature point extraction on the two-dimensional target image area, and determine the three-dimensional coordinate adjustment information of the target object based on the regional feature points;
determine the target three-dimensional coordinates of the target object according to the initial three-dimensional coordinates and the three-dimensional coordinate adjustment information.
In a third aspect, an embodiment of the present invention provides an image processing system, including:
A photographing device and the image processing device described in the first aspect above.
In a fourth aspect, an embodiment of the present invention provides a computer-readable storage medium storing a computer program; when the computer program is executed by a processor, the steps of the method described in the first aspect above are implemented.
The embodiments of the present invention can process the initial image captured by the photographing device to determine the two-dimensional target image area of the initial image and the semantic information of the target object contained in the two-dimensional target image area; determine the initial three-dimensional coordinates of the target object according to the projection model of the two-dimensional target image area and the semantic information of the target object; extract regional feature points from the two-dimensional target image area and determine the three-dimensional coordinate adjustment information of the target object based on the regional feature points; and then determine the target three-dimensional coordinates of the target object according to the initial three-dimensional coordinates and the three-dimensional coordinate adjustment information. In this way, three-dimensional target detection can be performed adaptively on different types of images, which improves the efficiency and effectiveness of image processing.
Description of the drawings
To describe the technical solutions in the embodiments of the present application more clearly, the drawings used in the embodiments are briefly introduced below. The drawings described below are only some embodiments of the present application; those of ordinary skill in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic structural diagram of an image processing system provided by an embodiment of the present invention;
FIG. 2 is a schematic flowchart of an image processing method provided by an embodiment of the present invention;
FIG. 3 is a schematic diagram of determining three-dimensional coordinates based on a pinhole imaging model, provided by an embodiment of the present invention;
FIG. 4 is a schematic diagram of determining three-dimensional coordinates based on an equidistant projection model, provided by an embodiment of the present invention;
FIG. 5 is a schematic flowchart of another image processing method provided by an embodiment of the present invention;
FIG. 6 is a schematic flowchart of yet another image processing method provided by an embodiment of the present invention;
FIG. 7 is a schematic architecture diagram of an image processing method provided by an embodiment of the present invention;
FIG. 8 is a schematic flowchart of yet another image processing method provided by an embodiment of the present invention;
FIG. 9 is a schematic architecture diagram of an image processing method provided by an embodiment of the present invention;
FIG. 10 is a schematic structural diagram of an image processing device provided by an embodiment of the present invention.
Detailed description
The technical solutions in the embodiments of the present invention are described clearly below with reference to the accompanying drawings. The described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
Some embodiments of the present invention are described in detail below with reference to the accompanying drawings. Where no conflict arises, the following embodiments and the features in them can be combined with each other.
In fields such as advanced driver assistance systems (Advanced Driving Assistant System, ADAS), autonomous driving, unmanned aerial vehicles, and robot navigation, obtaining the three-dimensional coordinate positions of surrounding dynamic obstacles is very important. Among the many available sensors, a vision camera can capture information about the surrounding scene, and a monocular camera is a common choice because of its low cost. How to perform three-dimensional detection of dynamic obstacles with a monocular camera, including the obstacle's three-dimensional coordinate position, three-dimensional size, and orientation, is therefore a problem in urgent need of a solution. In addition, many kinds of lenses and projection models exist: besides ordinary lenses that follow pinhole imaging, there are fisheye lenses with an ultra-wide field of view, whose projection differs from ordinary pinhole imaging. This makes three-dimensional target detection across the various kinds of monocular photographing devices more difficult.
In one embodiment, two-dimensional or three-dimensional detection of a target object in an image mostly uses a convolutional neural network from deep learning. For two-dimensional target detection, a convolutional neural network can extract features from the image, and the category and bounding rectangle of the corresponding target object are then estimated directly from the features.
For three-dimensional target detection of a specific target object, the detection result can generally be divided into three parts. The first is the three-dimensional size of the target object, usually expressed as the length, width, and height of the detected bounding cuboid of the target object. The second is the three-dimensional position of the target object, usually expressed as the coordinates of the center of the bounding cuboid in the camera coordinate system, which can be converted to coordinates in another coordinate system as needed. The third is the orientation of the target object, usually expressed as the pitch, yaw, and roll angles of the bounding cuboid. Three-dimensional detection of a target object with a monocular camera often requires multiple frames captured at different times, combined with motion information of the camera or of the movable platform carrying the camera. To perform three-dimensional detection from a single frame of a monocular camera, the target must be semantically recognized and the result combined with preset empirical three-dimensional values corresponding to the recognized semantic information; a result obtained this way is relatively inaccurate and provides only rough three-dimensional information.
For example, suppose a monocular camera mounted on a vehicle captures an image of the vehicle's surroundings, and the image contains a target object (such as another vehicle). The image can first be semantically recognized by a neural network. If the target object is recognized as a car, it can be assigned the preset three-dimensional information corresponding to a car, such as a height of 1.5 meters, and this preset information is combined with the camera parameters of the monocular camera to obtain the three-dimensional information of the car.
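As an illustration only, the three parts of a three-dimensional detection result and the preset empirical values described above could be represented as follows. The names `Box3D`, `PRIOR_HEIGHT_M`, and `prior_height` are hypothetical and do not appear in the application; the 1.5 m car height is the example value given above.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass
class Box3D:
    """Bounding cuboid of a detected object (hypothetical container)."""
    size: Tuple[float, float, float]         # length, width, height in meters
    center: Tuple[float, float, float]       # X, Y, Z in the camera coordinate system
    orientation: Tuple[float, float, float]  # pitch, yaw, roll in radians


# Preset empirical heights per semantic class; only the car value comes from the text.
PRIOR_HEIGHT_M: Dict[str, float] = {"car": 1.5}


def prior_height(label: str) -> float:
    """Look up the preset real-world height for a recognized class."""
    return PRIOR_HEIGHT_M[label]
```

A lookup table keeps the empirical values separate from the geometry, so they can be tuned per class without touching the projection code.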
In one embodiment, the parameters of any photographing device, such as a camera, are known, for example the focal length (f) and the optical center coordinates (cx, cy). When performing three-dimensional object detection, the three-dimensional coordinates of an object can therefore be roughly estimated from its size in the image. Put simply, near objects appear large and far objects appear small: the smaller an object appears in the image, the farther away it is.
The embodiments of the present invention provide an image processing method that processes the initial image captured by the photographing device to determine the two-dimensional target image area of the initial image and the semantic information of the target object contained in the two-dimensional target image area; determines the initial three-dimensional coordinates of the target object according to the projection model of the two-dimensional target image area and the semantic information of the target object; extracts regional feature points from the two-dimensional target image area and determines the three-dimensional coordinate adjustment information of the target object based on the regional feature points; and then determines the target three-dimensional coordinates of the target object according to the initial three-dimensional coordinates and the three-dimensional coordinate adjustment information. In this way, three-dimensional target detection can be performed adaptively on different types of images, which improves the efficiency and effectiveness of image processing.
The image processing method provided in the embodiments of the present invention may be executed by an image processing system, which may include a photographing device and an image processing device. In some embodiments, the photographing device may be integrated with the image processing device; in some embodiments, the photographing device may be spatially independent of the image processing device, with the two establishing a communication connection in a wired or wireless manner. In some embodiments, the photographing device may be mounted on a movable platform configured with a load (such as a photographing device, an infrared detection device, or a surveying instrument); in other embodiments, the photographing device may be spatially independent of the movable platform. In some embodiments, the photographing device may include, but is not limited to, an action camera, a panoramic camera, a fisheye camera, and the like. In some embodiments, there may be one or more photographing devices. In some embodiments, the movable platform includes movable equipment such as unmanned aerial vehicles, robots capable of autonomous movement, unmanned vehicles, and unmanned ships. In some embodiments, the image processing device may include one or more of a smartphone, a tablet computer, a laptop computer, and a wearable device. In some embodiments, the embodiments of the present invention can be applied to movable platforms such as ADAS, autonomous driving systems, robots, and unmanned aerial vehicles to perceive obstacles, enabling obstacle avoidance and subsequent functions such as path planning. In one embodiment, the movable platform may be a vehicle, the photographing device may be a fisheye camera, and the image processing device may be a computing platform mounted on the vehicle. Specifically, the fisheye camera is installed on a rearview mirror of the vehicle to capture images of the environment to the side of the vehicle, which are then sent to the on-board computing platform for processing; it provides a wider field of view than an ordinary lens, thereby reducing or eliminating blind spots in the camera's field of view.
The image processing system provided by the embodiments of the present invention is described schematically below, taking FIG. 1 as an example.
Referring to FIG. 1, FIG. 1 is a schematic structural diagram of an image processing system provided by an embodiment of the present invention. The image processing system includes a photographing device 11 and an image processing device 12, which may establish a communication connection wirelessly. In some scenarios, the photographing device 11 and the image processing device 12 may also establish a communication connection over a wired link. In some embodiments, the photographing device 11 may be arranged on the image processing device 12. In other embodiments, the photographing device 11 and the image processing device 12 are independent of each other, and the image processing device 12 may include one or more of a smartphone, a tablet computer, a laptop computer, and a wearable device.
In the embodiments of the present invention, the image processing device 12 may acquire the initial image captured by the photographing device 11 and process it to determine the two-dimensional target image area of the initial image and the semantic information of the target object contained in the two-dimensional target image area. It determines the initial three-dimensional coordinates of the target object according to the projection model of the two-dimensional target image area and the semantic information of the target object, extracts regional feature points from the two-dimensional target image area, determines the three-dimensional coordinate adjustment information of the target object based on the regional feature points, and determines the target three-dimensional coordinates of the target object according to the initial three-dimensional coordinates and the three-dimensional coordinate adjustment information. In this way, three-dimensional target detection can be performed adaptively on different types of images, which improves the efficiency and effectiveness of image processing.
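The overall flow can be sketched as a small driver function. This is a minimal sketch under stated assumptions, not the applicant's implementation: `detect_3d`, `Detection2D`, and the three callables are hypothetical names, and combining the initial coordinates with the adjustment by element-wise addition is an assumption, since the text only says the target coordinates are determined "according to" both.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Sequence, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class Detection2D:
    box: Tuple[float, float, float, float]  # (u_min, v_min, u_max, v_max) in pixels
    label: str                              # semantic class of the target object


def detect_3d(
    image: Any,
    detect_2d: Callable[[Any], Sequence[Detection2D]],
    initial_coords: Callable[[Detection2D], Vec3],
    adjustment: Callable[[Any, Detection2D], Vec3],
) -> List[Vec3]:
    """Run the described steps for every target found in one image."""
    results: List[Vec3] = []
    for det in detect_2d(image):        # 2D target image area + semantic information
        init = initial_coords(det)      # projection model + semantics -> initial XYZ
        delta = adjustment(image, det)  # regional feature points -> adjustment
        # Assumption: the adjustment is an additive offset on the initial coordinates.
        results.append((init[0] + delta[0], init[1] + delta[1], init[2] + delta[2]))
    return results
```

Passing the three stages in as callables keeps the sketch independent of any particular network or projection model.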
The image processing method provided by the embodiments of the present invention is described schematically below with reference to the accompanying drawings.
Referring to FIG. 2, FIG. 2 is a schematic flowchart of an image processing method provided by an embodiment of the present invention. The method may be executed by an image processing device, which is as described above. Specifically, the method of this embodiment includes the following steps.
S201: Acquire an initial image captured by the photographing device.
In the embodiments of the present invention, the image processing device may acquire the initial image captured by the photographing device. In some embodiments, the initial image may be obtained by the photographing device photographing the target object. The photographing device is as described above and is not described again here.
S202: Process the initial image, and determine the two-dimensional target image area of the initial image and the semantic information of the target object contained in the two-dimensional target image area.
In the embodiments of the present invention, the image processing device may process the initial image to determine the two-dimensional target image area of the initial image and the semantic information of the target object contained in the two-dimensional target image area.
In one embodiment, the image processing device may perform image processing and semantic analysis on the initial image to determine the semantic information of the target object. In some embodiments, the semantic information includes, but is not limited to, the category and position information of the target object. In one example, the semantic information of the target object includes the semantics of different categories of objects, such as cars and unmanned aerial vehicles. In some embodiments, the position information includes the two-dimensional coordinates of the target object.
In one embodiment, when processing the initial image to determine the two-dimensional target image area and the semantic information of the target object contained in it, the image processing device may process the initial image with a first neural network. In some embodiments, the first neural network is a convolutional neural network.
Determining the two-dimensional target image area of the initial image and the semantic information of the target object contained in it facilitates the subsequent determination of the initial three-dimensional coordinates of the target object.
S203: Determine the initial three-dimensional coordinates of the target object according to the projection model of the two-dimensional target image area and the semantic information of the target object.
In the embodiments of the present invention, the image processing device may determine the initial three-dimensional coordinates of the target object according to the projection model of the two-dimensional target image area and the semantic information of the target object.
In some embodiments, the projection model includes, but is not limited to, any one of a pinhole imaging model, a Snell's window projection model, an equal-area (equisolid angle) projection model, an equidistant projection model, a stereographic projection model, and the like.
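For orientation, the sketch below gives the textbook radial mapping of four of these models: the distance from the optical center on the image plane as a function of the incidence angle θ and the focal length f. These formulas are standard lens-model definitions rather than text from the application, the function name `image_radius` and the model keys are hypothetical, and the Snell's window model is omitted.

```python
import math


def image_radius(model: str, f: float, theta: float) -> float:
    """Radial image distance for a ray at incidence angle theta (radians)."""
    if model == "pinhole":        # perspective projection: r = f * tan(theta)
        return f * math.tan(theta)
    if model == "equidistant":    # r = f * theta
        return f * theta
    if model == "equisolid":      # equal-area: r = 2 f sin(theta / 2)
        return 2.0 * f * math.sin(theta / 2.0)
    if model == "stereographic":  # r = 2 f tan(theta / 2)
        return 2.0 * f * math.tan(theta / 2.0)
    raise ValueError(f"unknown projection model: {model}")
```

All four mappings agree for small angles and diverge toward the edge of the field of view, which is why the projection model has to be known before image measurements are converted to three-dimensional coordinates.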
In one embodiment, when determining the initial three-dimensional coordinates of the target object according to the projection model of the two-dimensional target image area and the semantic information of the target object, the image processing device may obtain the projection model of the two-dimensional target image area, obtain the parameters of the photographing device, and determine the initial three-dimensional coordinates of the two-dimensional target image area according to the projection model, the semantic information of the target object, and the parameters of the photographing device. In some embodiments, the projection model may be a preset projection model.
In some embodiments, the parameters of the photographing device include internal parameters and external parameters; the internal parameters include the focal length of the photographing device, and the external parameters include the optical center of the photographing device.
In one embodiment, when obtaining the projection model of the two-dimensional target image area, the image processing device may determine category information of the two-dimensional target image area according to feature point information of the two-dimensional target image area, and determine the projection model of the two-dimensional target image area according to that category information.
For example, if the image processing device determines from the feature point information that the category of the two-dimensional target image area is a fisheye image, it may determine that the projection model of the two-dimensional target image area is any one of the projection models corresponding to fisheye images, such as a Snell's window projection model, an equal-area (equisolid angle) projection model, an equidistant projection model, or a stereographic projection model.
In one embodiment, when obtaining the projection model of the two-dimensional target image area, the image processing device may process the initial image with the first neural network to obtain the projection model of the initial image, and take the projection model of the initial image as the projection model of the two-dimensional target image area.
In one embodiment, when obtaining the projection model of the two-dimensional target image area, the image processing device may process the initial image with a third neural network to obtain the projection model of the initial image, and take the projection model of the initial image as the projection model of the two-dimensional target image area. In some embodiments, the third neural network may be a convolutional neural network, and the third neural network is different from the first neural network.
In one embodiment, the projection model includes a first projection model. When determining the initial three-dimensional coordinates of the two-dimensional target image area according to the projection model, the semantic information of the target object, and the parameters of the photographing device, the image processing device may obtain the image height of the target object in the two-dimensional target image area and the actual height of the target object, and use the first projection model to determine the initial three-dimensional coordinates of the two-dimensional target image area according to the image height, the actual height, the semantic information of the target object, and the parameters of the photographing device. In some embodiments, the first projection model includes a pinhole imaging model.
In one embodiment, when using the first projection model to determine the initial three-dimensional coordinates in this way, the image processing device may use the first projection model to determine the actual distance between the target object and the photographing device according to the image height, the actual height, and the parameters of the photographing device; obtain the two-dimensional coordinates of the target object in the image coordinate system; and use the first projection model to determine the initial three-dimensional coordinates of the two-dimensional target image area according to the parameters of the photographing device, the actual distance, the two-dimensional coordinates, and the semantic information of the target object.
Taking FIG. 3 as an example, FIG. 3 is a schematic diagram of determining three-dimensional coordinates based on a pinhole imaging model, provided by an embodiment of the present invention. As shown in FIG. 3, f is the focal length of the camera, h is the image height of the target object in the two-dimensional target image area, H is the actual height of the target object, and D is the distance from the target object to the camera. From FIG. 3, f/D = h/H. For object detection, h is the height of the target object in the two-dimensional target image area. Because the category of the target object in each two-dimensional target image area can be predicted, H can be obtained approximately (for example, if the target object is a car, H may be assumed to be 1 meter), and f is a known internal parameter of the camera, so the actual distance D between the target object and the camera can be calculated. D is the Z coordinate of the target object in the camera coordinate system. Similarly, using the pinhole imaging projection model, the X and Y coordinates of the target object in the camera coordinate system can be calculated from the optical center (cx, cy) of the camera and the actual distance D, as shown in formula (1):
Figure PCTCN2019129364-appb-000001
Here, u and v are the coordinates of the target object in the image coordinate system, which can be approximated by the coordinates of the center point of the target object.
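The pinhole case can be sketched in a few lines. The distance follows directly from f/D = h/H as stated above; the X and Y expressions are the standard pinhole back-projection and are an assumption here, because formula (1) is only available as an image. The function name `pinhole_initial_xyz` and its parameter names are hypothetical, and f and h are assumed to be in pixels.

```python
from typing import Tuple


def pinhole_initial_xyz(
    u: float, v: float,  # pixel coordinates of the target center
    h_pixels: float,     # image height of the target, in pixels
    H_meters: float,     # assumed real height for the recognized class
    f: float, cx: float, cy: float,  # focal length and optical center, in pixels
) -> Tuple[float, float, float]:
    """Rough X, Y, Z of the target in the camera coordinate system."""
    if h_pixels <= 0:
        raise ValueError("image height must be positive")
    D = f * H_meters / h_pixels  # from f / D = h / H
    X = (u - cx) * D / f         # standard pinhole back-projection (assumed)
    Y = (v - cy) * D / f
    return X, Y, D
```

Because D is inversely proportional to h, any error in the assumed height H scales the whole estimate, which is why the method treats this result as initial coordinates to be adjusted later.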
In one embodiment, after calculating the two-dimensional coordinates of the target object in the image coordinate system, the image processing device may calculate the initial three-dimensional coordinates of the two-dimensional target image area in which the target object is located from those two-dimensional coordinates, the internal parameters of the camera, and the semantic information of the target object.
在一个实施例中,所述投影模型包括第二投影模型;所述图像处理设备在根据所述二维目标图像区域的投影模型、所述目标物体的语义信息和所述拍摄装置的参数,确定所述二维目标图像区域的初始三维坐标时,可以获取所述目标物体在所述二维目标图像区域中的图像距离,并获取所述目标物体在图像坐标系中的二维坐标,以根据所述图像距离、所述二维坐标、所述目标物体的语义信息和所述拍摄装置的参数,利用所述第二投影模型确定所述二维目标图像区域的初始三维坐标。在某些实施例中,所述第二投影模型包括斯涅耳窗投影模型、等积投影模型、等距投影模型、体视投影模型中的任意一种;在其他实施例中,所述第二投影模型还可以包括其他投影模型,在此不做具体限定。In one embodiment, the projection model includes a second projection model; the image processing device determines according to the projection model of the two-dimensional target image area, the semantic information of the target object, and the parameters of the photographing device When the initial three-dimensional coordinates of the two-dimensional target image area, the image distance of the target object in the two-dimensional target image area can be obtained, and the two-dimensional coordinates of the target object in the image coordinate system can be obtained, according to The image distance, the two-dimensional coordinates, the semantic information of the target object, and the parameters of the photographing device are used to determine the initial three-dimensional coordinates of the two-dimensional target image area by using the second projection model. In some embodiments, the second projection model includes any one of a Snell window projection model, an isometric projection model, an isometric projection model, and a stereo projection model; in other embodiments, the second projection model The two-projection model may also include other projection models, which are not specifically limited here.
在一个实施例中,所述图像处理设备在根据所述图像距离、所述二维坐标、所述目标物体的语义信息和所述拍摄装置的参数,利用所述第二投影模型确定所述二维目标图像区域的初始三维坐标时,可以根据所述图像距离、所述二维坐标和所述拍摄装置的参数,利用所述第二投影模型确定所述拍摄装置的拍摄视角,并获取所述目标物体距离所述拍摄装置的实际距离,以根据所述所述拍摄视角、所述实际距离和所述目标物体的语义信息,利用所述第二投影模型确定所述二维目标图像区域的初始三维坐标。In one embodiment, the image processing device uses the second projection model to determine the second projection model based on the image distance, the two-dimensional coordinates, the semantic information of the target object, and the parameters of the shooting device. When determining the initial three-dimensional coordinates of the target image area, the second projection model may be used to determine the shooting angle of view of the shooting device according to the image distance, the two-dimensional coordinates, and the parameters of the shooting device, and to obtain the The actual distance of the target object from the photographing device, so as to use the second projection model to determine the initial two-dimensional target image area according to the photographing angle of view, the actual distance, and the semantic information of the target object. Three-dimensional coordinates.
In one embodiment, when using the second projection model to determine the initial three-dimensional coordinates of the two-dimensional target image area according to the shooting angle of view, the actual distance, and the semantic information of the target object, the image processing device may use the second projection model to determine the physical distance of the target object according to the shooting angle of view and the actual distance, and determine the initial three-dimensional coordinates of the two-dimensional target image area according to the parameters of the photographing device, the shooting angle of view, the two-dimensional coordinates, the physical distance, and the semantic information of the target object.
Specifically, FIG. 4 can be taken as an example. FIG. 4 is a schematic diagram of determining three-dimensional coordinates based on an equidistant projection model according to an embodiment of the present invention. As shown in FIG. 4, f is the focal length of the camera, t is the distance of the target object in the two-dimensional target image area, r is the actual distance of the target object, and z is the distance from the camera to the object. According to the principle of equidistant projection, the following formula (2) holds:
Figure PCTCN2019129364-appb-000002
where u and v are the pixel coordinates of the target object in the image coordinate system, and f, cx, and cy are the camera intrinsic parameters.
对于目标物体检测,h为目标物体在二维目标图像区域上的高度,H为目标物体的实际高度,则存在如下公式(3):For target object detection, h is the height of the target object in the two-dimensional target image area, and H is the actual height of the target object, then the following formula (3) exists:
Figure PCTCN2019129364-appb-000003
In one embodiment, the category of the target object in the two-dimensional target image area can be predicted from the semantic information of the target object, so H can be roughly obtained (for example, if the target object is a car, H may be assumed to be 1 meter). The coordinates u and v of the target object in the image coordinate system are known and can be approximated by the coordinates of the center point of the target object. Therefore, θ can be calculated first, then r, and finally the X, Y, and Z coordinates of the target object in the camera coordinate system, that is, the initial three-dimensional coordinates.
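The order of computation described above (θ first, then r, then X, Y, Z) can be sketched as follows. This is a minimal illustration under stated assumptions, not the formulas of the source, which are contained in figures not reproduced here: it uses the textbook equidistant relation t = f·θ, takes t as the pixel distance from (cx, cy), interprets r as the lateral offset z·tan θ from the optical axis, and takes the depth z along the optical axis as an input (the source derives the distance from the height prior H and the image height h through formula (3)). The function name `equidistant_backproject` is hypothetical.

```python
import math
from typing import Tuple


def equidistant_backproject(u: float, v: float, f: float,
                            cx: float, cy: float,
                            z: float) -> Tuple[float, float, float]:
    """Back-project pixel (u, v) to camera coordinates (X, Y, Z).

    Assumes an ideal equidistant camera (t = f * theta) with f in pixels
    and a known depth z along the optical axis.
    """
    dx, dy = u - cx, v - cy
    t = math.hypot(dx, dy)          # image distance from the principal point
    if t == 0.0:
        return (0.0, 0.0, z)        # the ray coincides with the optical axis
    theta = t / f                   # incidence angle from t = f * theta
    if theta >= math.pi / 2:
        raise ValueError("ray is at or beyond 90 degrees; depth z is undefined")
    r = z * math.tan(theta)         # lateral offset from the optical axis
    return (r * dx / t, r * dy / t, z)
```

For example, with f = 100 pixels, a pixel lying 100·π/4 pixels to the right of the principal point corresponds to θ = 45°, so at depth z = 10 the lateral offset is also about 10.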
In one embodiment, the above describes how to calculate the initial three-dimensional coordinates of the target object from the camera intrinsic parameters, the two-dimensional coordinates of the target object in the image coordinate system, the semantic information of the target object, the image distance of the target object in the two-dimensional target image area, and the projection model. Two models, pinhole imaging and equidistant projection, are described in detail; for other projection models, the initial X, Y, and Z coordinates can be calculated in the same way, which will not be repeated here.
In one embodiment, the image processing device may also detect the two-dimensional target image area according to a second neural network to obtain area information of the two-dimensional target image area, where the area information includes any one or more of category information, three-dimensional size information, orientation information, and two-dimensional circumscribed matrix information of the two-dimensional target image area. In some embodiments, the second neural network may be a convolutional neural network, and the second neural network is different from the first neural network and the third neural network.
S204:对所述二维目标图像区域进行区域特征点提取,并基于所述区域特征点确定所述目标物体的三维坐标调整信息。S204: Perform area feature point extraction on the two-dimensional target image area, and determine three-dimensional coordinate adjustment information of the target object based on the area feature points.
本发明实施例中,图像处理设备可以对所述二维目标图像区域进行区域特征点提取,并基于所述区域特征点确定所述目标物体的三维坐标调整信息。In the embodiment of the present invention, the image processing device may perform area feature point extraction on the two-dimensional target image area, and determine the three-dimensional coordinate adjustment information of the target object based on the area feature points.
In one embodiment, when performing regional feature point extraction on the two-dimensional target image area, the image processing device may process the two-dimensional target image area according to the second neural network to extract feature point information of the two-dimensional target image area. In some embodiments, the second neural network is a convolutional neural network. In some embodiments, the second neural network is different from the first neural network and the third neural network.
S205:根据所述初始三维坐标和所述三维坐标调整信息,确定所述目标物体的目标三维坐标。S205: Determine the target three-dimensional coordinates of the target object according to the initial three-dimensional coordinates and the three-dimensional coordinate adjustment information.
本发明实施例中,图像处理设备可以根据所述初始三维坐标和所述三维坐标调整信息,确定所述目标物体的目标三维坐标。In the embodiment of the present invention, the image processing device may determine the target three-dimensional coordinates of the target object according to the initial three-dimensional coordinates and the three-dimensional coordinate adjustment information.
In one embodiment, when determining the target three-dimensional coordinates of the target object according to the initial three-dimensional coordinates and the three-dimensional coordinate adjustment information, the image processing device may process the two-dimensional target image area according to the second neural network to determine the three-dimensional coordinate adjustment information of the two-dimensional target image area, and determine the target three-dimensional coordinates of the two-dimensional target image area according to the three-dimensional coordinate adjustment information and the initial three-dimensional coordinates.
In one embodiment, when determining the target three-dimensional coordinates of the two-dimensional target image area according to the three-dimensional coordinate adjustment information and the initial three-dimensional coordinates, the image processing device may adjust the initial three-dimensional coordinates according to the three-dimensional coordinate adjustment information, and determine the adjusted three-dimensional coordinates to be the target three-dimensional coordinates of the two-dimensional target image area.
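The adjustment step can be illustrated with a short sketch. The source does not state the form of the three-dimensional coordinate adjustment information, so this example assumes it is a per-axis additive offset (a residual predicted by the network); the name `refine_coordinates` is hypothetical.

```python
from typing import Tuple

Vec3 = Tuple[float, float, float]


def refine_coordinates(initial: Vec3, adjustment: Vec3) -> Vec3:
    """Apply an additive (dX, dY, dZ) offset to the initial coordinates.

    The additive form is an assumption made for illustration only.
    """
    x, y, z = initial
    dx, dy, dz = adjustment
    return (x + dx, y + dy, z + dz)
```

Under this assumption the geometric estimate supplies a coarse position and the network only has to learn a small correction, but other forms (for example a multiplicative depth scale) would fit the description equally well.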
In the embodiment of the present invention, the image processing device may process the initial image captured by the photographing device to determine the two-dimensional target image area of the initial image and the semantic information of the target object contained in the two-dimensional target image area; determine the initial three-dimensional coordinates of the target object according to the projection model of the two-dimensional target image area and the semantic information of the target object; extract regional feature points from the two-dimensional target image area to determine the three-dimensional coordinate adjustment information of the target object based on the regional feature points; and thereby determine the target three-dimensional coordinates of the target object according to the initial three-dimensional coordinates and the three-dimensional coordinate adjustment information. Through this implementation, three-dimensional target detection can be performed adaptively on different types of images, which improves the efficiency and effectiveness of image processing.
具体请参见图5,图5是本发明实施例提供的另一种图像处理方法的流程示意图,所述方法可以由图像处理设备执行,其中,图像处理设备具体解释如前所述。具体地,本发明实施例的所述方法包括如下步骤。Please refer to FIG. 5 for details. FIG. 5 is a schematic flowchart of another image processing method provided by an embodiment of the present invention. The method may be executed by an image processing device, and the specific explanation of the image processing device is as described above. Specifically, the method of the embodiment of the present invention includes the following steps.
S501:获取拍摄装置拍摄得到的初始图像。S501: Acquire an initial image photographed by the photographing device.
本发明实施例中,图像处理设备可以获取拍摄装置拍摄得到的初始图像。在某些实施例中,所述拍摄装置的解释如前所述,此处不再赘述。In the embodiment of the present invention, the image processing device may obtain the initial image captured by the photographing device. In some embodiments, the explanation of the photographing device is as described above, and will not be repeated here.
S502:对所述初始图像进行处理,确定所述初始图像的二维目标图像区域和所述二维目标图像区域所包含的目标物体的语义信息。S502: Process the initial image, and determine the two-dimensional target image area of the initial image and the semantic information of the target object contained in the two-dimensional target image area.
本发明实施例中,图像处理设备可以对所述初始图像进行处理,确定所述初始图像的二维目标图像区域和所述二维目标图像区域所包含的目标物体的语义信息。具体实施例如前所述,此处不再赘述。In the embodiment of the present invention, the image processing device may process the initial image to determine the two-dimensional target image area of the initial image and the semantic information of the target object contained in the two-dimensional target image area. The specific implementation is as described above and will not be repeated here.
S503:获取所述二维目标图像区域的投影模型,所述投影模型为预设的投影模型。S503: Acquire a projection model of the two-dimensional target image area, where the projection model is a preset projection model.
本发明实施例中,图像处理设备可以获取所述二维目标图像区域的投影模型,所述投影模型为预设的投影模型。在某些实施例中,所述投影模型的解释如前所述,此处不再赘述。In the embodiment of the present invention, the image processing device may obtain a projection model of the two-dimensional target image area, and the projection model is a preset projection model. In some embodiments, the explanation of the projection model is as described above, and will not be repeated here.
S504:根据所述二维目标图像区域的投影模型和所述目标物体的语义信息,确定所述目标物体的初始三维坐标。S504: Determine the initial three-dimensional coordinates of the target object according to the projection model of the two-dimensional target image area and the semantic information of the target object.
本发明实施例中,图像处理设备可以根据所述二维目标图像区域的投影模型和所述目标物体的语义信息,确定所述目标物体的初始三维坐标。具体实施例如前所述,此处不再赘述。In the embodiment of the present invention, the image processing device may determine the initial three-dimensional coordinates of the target object according to the projection model of the two-dimensional target image area and the semantic information of the target object. The specific implementation is as described above and will not be repeated here.
S505:对所述二维目标图像区域进行区域特征点提取,并基于所述区域特征点确定所述目标物体的三维坐标调整信息。S505: Perform area feature point extraction on the two-dimensional target image area, and determine three-dimensional coordinate adjustment information of the target object based on the area feature points.
本发明实施例中,图像处理设备可以对所述二维目标图像区域进行区域特征点提取,并基于所述区域特征点确定所述目标物体的三维坐标调整信息。具体实施例如前所述,此处不再赘述。In the embodiment of the present invention, the image processing device may perform area feature point extraction on the two-dimensional target image area, and determine the three-dimensional coordinate adjustment information of the target object based on the area feature points. The specific implementation is as described above and will not be repeated here.
S506:根据所述初始三维坐标和所述三维坐标调整信息,确定所述目标物体的目标三维坐标。S506: Determine the target three-dimensional coordinates of the target object according to the initial three-dimensional coordinates and the three-dimensional coordinate adjustment information.
本发明实施例中,图像处理设备可以根据所述初始三维坐标和所述三维坐标调整信息,确定所述目标物体的目标三维坐标。具体实施例如前所述,此处不再赘述。In the embodiment of the present invention, the image processing device may determine the target three-dimensional coordinates of the target object according to the initial three-dimensional coordinates and the three-dimensional coordinate adjustment information. The specific implementation is as described above and will not be repeated here.
In the embodiment of the present invention, the image processing device may process the initial image captured by the photographing device to determine the two-dimensional target image area of the initial image and the semantic information of the target object contained in the two-dimensional target image area; determine the initial three-dimensional coordinates of the target object according to the preset projection model and the semantic information of the target object; extract regional feature points from the two-dimensional target image area to determine the three-dimensional coordinate adjustment information of the target object based on the regional feature points; and thereby determine the target three-dimensional coordinates of the target object according to the initial three-dimensional coordinates and the three-dimensional coordinate adjustment information. Through this implementation, three-dimensional target detection can be performed adaptively on different types of images, which improves the efficiency and effectiveness of image processing.
具体请参见图6,图6是本发明实施例提供的又一种图像处理方法的流程示意图,所述方法可以由图像处理设备执行,其中,图像处理设备具体解释如前所述。具体地,本发明实施例的所述方法包括如下步骤。Please refer to FIG. 6 for details. FIG. 6 is a schematic flowchart of another image processing method provided by an embodiment of the present invention. The method may be executed by an image processing device, and the specific explanation of the image processing device is as described above. Specifically, the method of the embodiment of the present invention includes the following steps.
S601:获取拍摄装置拍摄得到的初始图像。S601: Acquire an initial image photographed by the photographing device.
本发明实施例中,图像处理设备可以获取拍摄装置拍摄得到的初始图像。在某些实施例中,所述拍摄装置的解释如前所述,此处不再赘述。In the embodiment of the present invention, the image processing device may obtain the initial image captured by the photographing device. In some embodiments, the explanation of the photographing device is as described above, and will not be repeated here.
S602:对所述初始图像进行处理,确定所述初始图像的二维目标图像区域和所述二维目标图像区域所包含的目标物体的语义信息。S602: Process the initial image, and determine the two-dimensional target image area of the initial image and the semantic information of the target object contained in the two-dimensional target image area.
本发明实施例中,图像处理设备可以对所述初始图像进行处理,确定所述初始图像的二维目标图像区域和所述二维目标图像区域所包含的目标物体的语义信息。具体实施例如前所述,此处不再赘述。In the embodiment of the present invention, the image processing device may process the initial image to determine the two-dimensional target image area of the initial image and the semantic information of the target object contained in the two-dimensional target image area. The specific implementation is as described above and will not be repeated here.
S603:根据第三神经网络对所述初始图像进行处理,得到所述初始图像的投影模型,并确定所述初始图像的投影模型为所述二维目标图像区域的投影模型。S603: Process the initial image according to the third neural network to obtain a projection model of the initial image, and determine that the projection model of the initial image is the projection model of the two-dimensional target image area.
本发明实施例中,所述图像处理设备可以根据第三神经网络对所述初始图像进行处理,得到所述初始图像的投影模型,并确定所述初始图像的投影模型 为所述二维目标图像区域的投影模型。In the embodiment of the present invention, the image processing device may process the initial image according to a third neural network to obtain a projection model of the initial image, and determine that the projection model of the initial image is the two-dimensional target image The projection model of the area.
Specifically, FIG. 7 can be taken as an example. FIG. 7 is a schematic architecture diagram of an image processing method according to an embodiment of the present invention. As shown in FIG. 7, the image processing device may process the initial image 72 through the first neural network 71 to obtain the two-dimensional target image area 73. The two-dimensional target image area is detected according to the second neural network 74 to obtain the area information 75 of the two-dimensional target image area, where the area information 75 includes any one or more of category information, three-dimensional size information, orientation information, and two-dimensional circumscribed matrix information of the two-dimensional target image area. The projection model 77 of the two-dimensional target image area is determined by the third neural network 76, where the projection model 77 is at least one projection model determined from a plurality of projection models. The initial three-dimensional coordinates 791 of the target object are determined according to the projection model 77 of the two-dimensional target image area, the semantic information of the target object, and the parameters 78 of the photographing device; regional feature points are extracted from the two-dimensional target image area, and the three-dimensional coordinate adjustment information 792 of the target object is determined based on the regional feature points; and the target three-dimensional coordinates 710 of the target object are determined according to the initial three-dimensional coordinates 791 and the three-dimensional coordinate adjustment information 792.
Through this implementation, the corresponding projection model can be selected automatically, so that the various projection models are handled adaptively within the same algorithm framework and all types of monocular images can be processed, which helps to improve the efficiency and effectiveness of image processing.
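The data flow of FIG. 7 can be sketched as a small orchestration function. This is only an illustration of the order of the steps: the four callables stand in for the first neural network 71, the third neural network 76, the geometric calculation, and the adjustment branch, all of their signatures are hypothetical, and the final combination assumes an additive adjustment.

```python
from typing import Any, Callable, Tuple

Vec3 = Tuple[float, float, float]


def detect_target_3d(
    image: Any,
    camera_params: Any,
    detect_region: Callable[[Any], Tuple[Any, Any]],   # stands in for network 71
    select_projection: Callable[[Any], str],           # stands in for network 76
    initial_coords: Callable[[str, Any, Any, Any], Vec3],
    estimate_adjustment: Callable[[Any], Vec3],        # adjustment branch
) -> Vec3:
    """Run the steps of FIG. 7 in order and return the target coordinates."""
    # 2D target image area and semantic information of the target object.
    region, semantics = detect_region(image)
    # Projection model chosen from the whole initial image.
    projection_model = select_projection(image)
    # Initial 3D coordinates from the model, semantics and camera parameters.
    x, y, z = initial_coords(projection_model, region, semantics, camera_params)
    # Adjustment information from the region (additive form is an assumption).
    dx, dy, dz = estimate_adjustment(region)
    return (x + dx, y + dy, z + dz)
```

Passing the stages in as callables keeps the sketch independent of any particular network framework.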
S604:根据所述二维目标图像区域的投影模型和所述目标物体的语义信息,确定所述目标物体的初始三维坐标。S604: Determine the initial three-dimensional coordinates of the target object according to the projection model of the two-dimensional target image area and the semantic information of the target object.
本发明实施例中,图像处理设备可以根据所述二维目标图像区域的投影模型和所述目标物体的语义信息,确定所述目标物体的初始三维坐标。具体实施例如前所述,此处不再赘述。In the embodiment of the present invention, the image processing device may determine the initial three-dimensional coordinates of the target object according to the projection model of the two-dimensional target image area and the semantic information of the target object. The specific implementation is as described above and will not be repeated here.
S605:对所述二维目标图像区域进行区域特征点提取,并基于所述区域特征点确定所述目标物体的三维坐标调整信息。S605: Perform area feature point extraction on the two-dimensional target image area, and determine three-dimensional coordinate adjustment information of the target object based on the area feature points.
本发明实施例中,图像处理设备可以对所述二维目标图像区域进行区域特征点提取,并基于所述区域特征点确定所述目标物体的三维坐标调整信息。具体实施例如前所述,此处不再赘述。In the embodiment of the present invention, the image processing device may perform area feature point extraction on the two-dimensional target image area, and determine the three-dimensional coordinate adjustment information of the target object based on the area feature points. The specific implementation is as described above and will not be repeated here.
S606:根据所述初始三维坐标和所述三维坐标调整信息,确定所述目标物体的目标三维坐标。S606: Determine the target three-dimensional coordinates of the target object according to the initial three-dimensional coordinates and the three-dimensional coordinate adjustment information.
本发明实施例中,图像处理设备可以根据所述初始三维坐标和所述三维坐标调整信息,确定所述目标物体的目标三维坐标。具体实施例如前所述,此处 不再赘述。In the embodiment of the present invention, the image processing device may determine the target three-dimensional coordinates of the target object according to the initial three-dimensional coordinates and the three-dimensional coordinate adjustment information. The specific implementation is as described above, and will not be repeated here.
In the embodiment of the present invention, the image processing device may process the initial image captured by the photographing device to determine the two-dimensional target image area of the initial image and the semantic information of the target object contained in the two-dimensional target image area; determine the initial three-dimensional coordinates of the target object according to the semantic information of the target object and the projection model of the two-dimensional target image area obtained by processing the initial image with the third neural network; extract regional feature points from the two-dimensional target image area to determine the three-dimensional coordinate adjustment information of the target object based on the regional feature points; and thereby determine the target three-dimensional coordinates of the target object according to the initial three-dimensional coordinates and the three-dimensional coordinate adjustment information. Through this implementation, the projection model can be determined by a neural network, so that three-dimensional target detection can be performed adaptively on different types of images, which improves the efficiency and effectiveness of image processing.
具体请参见图8,图8是本发明实施例提供的又一种图像处理方法的流程示意图,所述方法可以由图像处理设备执行,其中,图像处理设备具体解释如前所述。具体地,本发明实施例的所述方法包括如下步骤。Please refer to FIG. 8 for details. FIG. 8 is a schematic flowchart of another image processing method provided by an embodiment of the present invention. The method may be executed by an image processing device, and the specific explanation of the image processing device is as described above. Specifically, the method of the embodiment of the present invention includes the following steps.
S801:获取拍摄装置拍摄得到的初始图像。S801: Acquire an initial image photographed by the photographing device.
本发明实施例中,图像处理设备可以获取拍摄装置拍摄得到的初始图像。在某些实施例中,所述拍摄装置的解释如前所述,此处不再赘述。In the embodiment of the present invention, the image processing device may obtain the initial image captured by the photographing device. In some embodiments, the explanation of the photographing device is as described above, and will not be repeated here.
S802:对所述初始图像进行处理,确定所述初始图像的二维目标图像区域和所述二维目标图像区域所包含的目标物体的语义信息。S802: Process the initial image, and determine the two-dimensional target image area of the initial image and the semantic information of the target object contained in the two-dimensional target image area.
本发明实施例中,图像处理设备可以对所述初始图像进行处理,确定所述初始图像的二维目标图像区域和所述二维目标图像区域所包含的目标物体的语义信息。具体实施例如前所述,此处不再赘述。In the embodiment of the present invention, the image processing device may process the initial image to determine the two-dimensional target image area of the initial image and the semantic information of the target object contained in the two-dimensional target image area. The specific implementation is as described above and will not be repeated here.
S803:根据第一神经网络对所述初始图像进行处理,得到所述初始图像的投影模型,并确定所述初始图像的投影模型为所述二维目标图像区域的投影模型。S803: Process the initial image according to the first neural network to obtain a projection model of the initial image, and determine that the projection model of the initial image is the projection model of the two-dimensional target image area.
In the embodiment of the present invention, the image processing device may process the initial image according to the first neural network to obtain the projection model of the initial image, and determine that the projection model of the initial image is the projection model of the two-dimensional target image area.
In one embodiment, the image processing device may process the two-dimensional target image area according to the second neural network to extract feature point information of the two-dimensional target image area, determine the category information of the two-dimensional target image area according to the feature point information of the two-dimensional target image area, and determine the projection model of the two-dimensional target image area according to the category information of the two-dimensional target image area.
Specifically, FIG. 9 can be taken as an example. FIG. 9 is a schematic architecture diagram of an image processing method according to an embodiment of the present invention. As shown in FIG. 9, the image processing device may process the two-dimensional target image area 92 through the first neural network 91 to extract the feature point information 93 of the two-dimensional target image area 92. The two-dimensional target image area is detected according to the second neural network 94 to obtain the area information 95 of the two-dimensional target image area. The projection model 96 of the two-dimensional target image area 92 is determined by the first neural network 91. The initial three-dimensional coordinates 981 of the target object are determined according to the projection model 96 of the two-dimensional target image area, the parameters 97 of the photographing device, and the semantic information of the target object; regional feature points are extracted from the two-dimensional target image area, and the three-dimensional coordinate adjustment information 982 of the target object is determined based on the regional feature points; and the target three-dimensional coordinates 99 of the target object are determined according to the initial three-dimensional coordinates 981 and the three-dimensional coordinate adjustment information 982.
通过这种实施方式,可以实现将判断投影模型的神经网络和提取二维目标图像区域的特征点信息的神经网络进行共享,减小了计算量和复杂度,进一步提高了图像处理的效率。Through this embodiment, the neural network for judging the projection model and the neural network for extracting the feature point information of the two-dimensional target image area can be shared, reducing the amount of calculation and complexity, and further improving the efficiency of image processing.
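The sharing described here can be sketched as one feature extraction feeding two consumers. This is a minimal illustration under assumptions: `backbone`, `category_head`, and `adjustment_head` are hypothetical callables standing in for parts of a network, and the category-to-model table is an invented example of how category information might be mapped to a projection model.

```python
from typing import Any, Callable, Dict, Tuple

Vec3 = Tuple[float, float, float]


def run_shared_backbone(
    region: Any,
    backbone: Callable[[Any], Any],
    category_head: Callable[[Any], str],
    adjustment_head: Callable[[Any], Vec3],
    category_to_model: Dict[str, str],
) -> Tuple[str, Vec3]:
    """Extract features once and reuse them for both outputs."""
    features = backbone(region)              # computed a single time
    category = category_head(features)       # category information of the area
    projection_model = category_to_model[category]
    adjustment = adjustment_head(features)   # reuses the same features
    return projection_model, adjustment
```

Because the backbone runs only once per region, the cost of selecting the projection model is reduced to the small classification head.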
S804:根据所述二维目标图像区域的投影模型和所述目标物体的语义信息,确定所述目标物体的初始三维坐标。S804: Determine the initial three-dimensional coordinates of the target object according to the projection model of the two-dimensional target image area and the semantic information of the target object.
本发明实施例中,图像处理设备可以根据所述二维目标图像区域的投影模型和所述目标物体的语义信息,确定所述目标物体的初始三维坐标。In the embodiment of the present invention, the image processing device may determine the initial three-dimensional coordinates of the target object according to the projection model of the two-dimensional target image area and the semantic information of the target object.
In one embodiment, when determining the initial three-dimensional coordinates of the target object according to the projection model of the two-dimensional target image area and the semantic information of the target object, the image processing device may obtain the parameters of the photographing device, and determine the initial three-dimensional coordinates of the two-dimensional target image area according to the projection model of the two-dimensional target image area, the semantic information of the target object, and the parameters of the photographing device. In some embodiments, the parameters of the photographing device include internal parameters and external parameters, the internal parameters include the focal length of the photographing device, and the external parameters include the optical center of the photographing device.
S805:对所述二维目标图像区域进行区域特征点提取,并基于所述区域特征点确定所述目标物体的三维坐标调整信息。S805: Perform area feature point extraction on the two-dimensional target image area, and determine three-dimensional coordinate adjustment information of the target object based on the area feature points.
本发明实施例中,图像处理设备可以对所述二维目标图像区域进行区域特征点提取,并基于所述区域特征点确定所述目标物体的三维坐标调整信息。In the embodiment of the present invention, the image processing device may perform area feature point extraction on the two-dimensional target image area, and determine the three-dimensional coordinate adjustment information of the target object based on the area feature points.
S806:根据所述初始三维坐标和所述三维坐标调整信息,确定所述目标物体的目标三维坐标。S806: Determine the target three-dimensional coordinates of the target object according to the initial three-dimensional coordinates and the three-dimensional coordinate adjustment information.
本发明实施例中,图像处理设备可以根据所述初始三维坐标和所述三维坐标调整信息,确定所述目标物体的目标三维坐标。In the embodiment of the present invention, the image processing device may determine the target three-dimensional coordinates of the target object according to the initial three-dimensional coordinates and the three-dimensional coordinate adjustment information.
In the embodiment of the present invention, the image processing device may process the initial image captured by the photographing device to determine the two-dimensional target image area of the initial image and the semantic information of the target object contained in the two-dimensional target image area; process the initial image according to the first neural network to obtain the projection model of the two-dimensional target image area; determine the initial three-dimensional coordinates of the target object according to that projection model and the semantic information of the target object; extract regional feature points from the two-dimensional target image area to determine the three-dimensional coordinate adjustment information of the target object based on the regional feature points; and thereby determine the target three-dimensional coordinates of the target object according to the initial three-dimensional coordinates and the three-dimensional coordinate adjustment information. Through this implementation, the projection model of the two-dimensional target image area can be determined from the category information of the two-dimensional target image area, which is itself determined from the feature point information, so that three-dimensional target detection can be performed adaptively on different types of images, which improves the efficiency and effectiveness of image processing.
请参见图10,图10是本发明实施例提供的一种图像处理设备的结构示意图。具体的,所述图像处理设备包括:存储器1001、处理器1002。Please refer to FIG. 10, which is a schematic structural diagram of an image processing device according to an embodiment of the present invention. Specifically, the image processing device includes: a memory 1001 and a processor 1002.
在一种实施例中,所述图像处理设备还包括数据接口1003,所述数据接口1003,用于传递图像处理设备和其他设备之间的数据信息。In an embodiment, the image processing device further includes a data interface 1003, and the data interface 1003 is used to transfer data information between the image processing device and other devices.
所述存储器1001可以包括易失性存储器(volatile memory);存储器1001也可以包括非易失性存储器(non-volatile memory);存储器1001还可以包括上述种类的存储器的组合。所述处理器1002可以是中央处理器(central processing unit,CPU)。所述处理器1002还可以进一步包括硬件芯片。上述硬件芯片可以是专用集成电路(application-specific integrated circuit,ASIC),可编程逻辑器件(programmable logic device,PLD)或其组合。上述PLD可以是复杂可编程逻辑器件(complex programmable logic device,CPLD),现场可编程逻辑门阵列(field-programmable gate array,FPGA)或其任意组合。The memory 1001 may include a volatile memory (volatile memory); the memory 1001 may also include a non-volatile memory (non-volatile memory); the memory 1001 may also include a combination of the foregoing types of memories. The processor 1002 may be a central processing unit (CPU). The processor 1002 may further include a hardware chip. The above-mentioned hardware chip may be an application-specific integrated circuit (ASIC), a programmable logic device (PLD) or a combination thereof. The above-mentioned PLD may be a complex programmable logic device (CPLD), a field-programmable gate array (FPGA), or any combination thereof.
所述存储器1001用于存储程序,所述处理器1002可以调用存储1001中存储的程序,用于执行如下步骤:The memory 1001 is used to store programs, and the processor 1002 can call the programs stored in the memory 1001 to perform the following steps:
获取拍摄装置拍摄得到的初始图像;Acquiring an initial image taken by the photographing device;
对所述初始图像进行处理,确定所述初始图像的二维目标图像区域和所述二维目标图像区域所包含的目标物体的语义信息;Processing the initial image to determine the two-dimensional target image area of the initial image and the semantic information of the target object contained in the two-dimensional target image area;
根据所述二维目标图像区域的投影模型和所述目标物体的语义信息,确定所述目标物体的初始三维坐标;Determine the initial three-dimensional coordinates of the target object according to the projection model of the two-dimensional target image area and the semantic information of the target object;
对所述二维目标图像区域进行区域特征点提取,并基于所述区域特征点确定所述目标物体的三维坐标调整信息;Performing regional feature point extraction on the two-dimensional target image area, and determining the three-dimensional coordinate adjustment information of the target object based on the regional feature points;
根据所述初始三维坐标和所述三维坐标调整信息,确定所述目标物体的目标三维坐标。Determine the target three-dimensional coordinates of the target object according to the initial three-dimensional coordinates and the three-dimensional coordinate adjustment information.
进一步地,所述处理器1002对所述初始图像进行处理,确定所述初始图像的二维目标图像区域和所述二维目标图像区域所包含的目标物体的语义信息时,具体用于:Further, when the processor 1002 processes the initial image to determine the two-dimensional target image area of the initial image and the semantic information of the target object contained in the two-dimensional target image area, it is specifically used for:
根据第一神经网络对所述初始图像进行处理,确定所述初始图像的二维目标图像区域和所述二维目标图像区域所包含的目标物体的语义信息。The initial image is processed according to the first neural network, and the two-dimensional target image area of the initial image and the semantic information of the target object contained in the two-dimensional target image area are determined.
进一步地,所述处理器1002对所述二维目标图像区域进行区域特征点提取时,具体用于:Further, when the processor 1002 performs regional feature point extraction on the two-dimensional target image region, it is specifically configured to:
根据第二神经网络对所述二维目标图像区域进行处理,以提取所述二维目标图像区域的特征点信息。The two-dimensional target image area is processed according to the second neural network to extract feature point information of the two-dimensional target image area.
进一步地,所述处理器1002还用于:Further, the processor 1002 is further configured to:
根据第二神经网络对所述二维目标图像区域进行检测,得到所述二维目标图像区域的区域信息;Detecting the two-dimensional target image area according to the second neural network to obtain the area information of the two-dimensional target image area;
其中,所述区域信息包括所述二维目标图像区域的类别信息、三维尺寸信息、朝向信息、二维外接矩阵信息中的任意一种或多种。Wherein, the area information includes any one or more of category information, three-dimensional size information, orientation information, and two-dimensional circumscribed matrix information of the two-dimensional target image area.
进一步地,所述处理器1002根据所述二维目标图像区域的投影模型和所述目标物体的语义信息,确定所述目标物体的初始三维坐标时,具体用于:Further, when the processor 1002 determines the initial three-dimensional coordinates of the target object according to the projection model of the two-dimensional target image area and the semantic information of the target object, it is specifically configured to:
获取所述二维目标图像区域的投影模型;Acquiring a projection model of the two-dimensional target image area;
获取所述拍摄装置的参数;Acquiring the parameters of the photographing device;
根据所述二维目标图像区域的投影模型、所述目标物体的语义信息和所述拍摄装置的参数,确定所述二维目标图像区域的初始三维坐标。The initial three-dimensional coordinates of the two-dimensional target image area are determined according to the projection model of the two-dimensional target image area, the semantic information of the target object, and the parameters of the photographing device.
进一步地,所述投影模型为预设的投影模型。Further, the projection model is a preset projection model.
进一步地,所述处理器1002获取所述二维目标图像区域的投影模型时,具体用于:Further, when the processor 1002 acquires the projection model of the two-dimensional target image area, it is specifically configured to:
根据所述二维目标图像区域的特征点信息确定所述二维目标图像区域的类别信息;Determining the category information of the two-dimensional target image area according to the feature point information of the two-dimensional target image area;
根据所述二维目标图像区域的类别信息,确定所述二维目标图像区域的投影模型。According to the category information of the two-dimensional target image area, a projection model of the two-dimensional target image area is determined.
进一步地,所述处理器1002获取所述二维目标图像区域的投影模型时,具体用于:Further, when the processor 1002 obtains the projection model of the two-dimensional target image area, it is specifically configured to:
根据第一神经网络对所述初始图像进行处理,得到所述初始图像的投影模型;Processing the initial image according to the first neural network to obtain a projection model of the initial image;
确定所述初始图像的投影模型为所述二维目标图像区域的投影模型。It is determined that the projection model of the initial image is the projection model of the two-dimensional target image area.
进一步地,所述处理器1002获取所述二维目标图像区域的投影模型时,具体用于:Further, when the processor 1002 obtains the projection model of the two-dimensional target image area, it is specifically configured to:
根据第三神经网络对所述初始图像进行处理,得到所述初始图像的投影模型;Processing the initial image according to a third neural network to obtain a projection model of the initial image;
确定所述初始图像的投影模型为所述二维目标图像区域的投影模型。It is determined that the projection model of the initial image is the projection model of the two-dimensional target image area.
进一步地,所述投影模型包括第一投影模型;所述处理器1002根据所述二维目标图像区域的投影模型、所述目标物体的语义信息和所述拍摄装置的参数,确定所述二维目标图像区域的初始三维坐标时,具体用于:Further, the projection model includes a first projection model; when the processor 1002 determines the initial three-dimensional coordinates of the two-dimensional target image area according to the projection model of the two-dimensional target image area, the semantic information of the target object, and the parameters of the photographing device, it is specifically configured to:
获取所述目标物体在所述二维目标图像区域中的图像高度和目标物体的实际高度;Acquiring the image height of the target object in the two-dimensional target image area and the actual height of the target object;
根据所述图像高度、所述实际高度、所述目标物体的语义信息和所述拍摄装置的参数,利用所述第一投影模型确定所述二维目标图像区域的初始三维坐标。The first projection model is used to determine the initial three-dimensional coordinates of the two-dimensional target image area according to the image height, the actual height, the semantic information of the target object, and the parameters of the photographing device.
进一步地,所述处理器1002根据所述图像高度、所述实际高度、所述目标物体的语义信息和所述拍摄装置的参数,利用所述第一投影模型确定所述二维目标图像区域的初始三维坐标时,具体用于:Further, when the processor 1002 uses the first projection model to determine the initial three-dimensional coordinates of the two-dimensional target image area according to the image height, the actual height, the semantic information of the target object, and the parameters of the photographing device, it is specifically configured to:
根据所述图像高度、所述实际高度和所述拍摄装置的参数,利用所述第一投影模型确定所述目标物体距离所述拍摄装置的实际距离;Using the first projection model to determine the actual distance of the target object from the shooting device according to the image height, the actual height, and the parameters of the shooting device;
获取所述目标物体在图像坐标系中的二维坐标;Acquiring the two-dimensional coordinates of the target object in the image coordinate system;
根据所述拍摄装置的参数、所述实际距离、所述二维坐标和所述目标物体的语义信息,利用所述第一投影模型确定所述二维目标图像区域的初始三维坐标。According to the parameters of the photographing device, the actual distance, the two-dimensional coordinates, and the semantic information of the target object, the first projection model is used to determine the initial three-dimensional coordinates of the two-dimensional target image area.
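The three steps above match standard pinhole geometry and can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: it treats the "actual distance" as depth along the optical axis, expresses the focal length in pixels, and assumes the actual height is already known (for example, from a per-category size prior derived from the semantic information). The names `Intrinsics`, `pinhole_depth`, and `pinhole_back_project` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Intrinsics:
    """Hypothetical container for pinhole camera intrinsics (all in pixels)."""
    fx: float  # focal length along the image x axis
    fy: float  # focal length along the image y axis
    cx: float  # principal point, x
    cy: float  # principal point, y


def pinhole_depth(image_height_px: float, actual_height: float, fy: float) -> float:
    """Similar triangles: h / f = H / Z, hence Z = f * H / h."""
    if image_height_px <= 0:
        raise ValueError("image height must be positive")
    return fy * actual_height / image_height_px


def pinhole_back_project(u: float, v: float, depth: float, k: Intrinsics):
    """Lift pixel (u, v) at the given depth into camera coordinates."""
    x = (u - k.cx) * depth / k.fx
    y = (v - k.cy) * depth / k.fy
    return (x, y, depth)


if __name__ == "__main__":
    k = Intrinsics(fx=1000.0, fy=1000.0, cx=640.0, cy=360.0)
    z = pinhole_depth(image_height_px=150.0, actual_height=1.5, fy=k.fy)
    print(pinhole_back_project(740.0, 360.0, z, k))
```

Keeping the depth estimate separate from the back-projection mirrors the order of the steps in the text: distance first, then the two-dimensional coordinates, then the three-dimensional point.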
进一步地,所述投影模型包括第二投影模型;所述处理器1002根据所述二维目标图像区域的投影模型、所述目标物体的语义信息和所述拍摄装置的参数,确定所述二维目标图像区域的初始三维坐标时,具体用于:Further, the projection model includes a second projection model; when the processor 1002 determines the initial three-dimensional coordinates of the two-dimensional target image area according to the projection model of the two-dimensional target image area, the semantic information of the target object, and the parameters of the photographing device, it is specifically configured to:
获取所述目标物体在所述二维目标图像区域中的图像距离;Acquiring the image distance of the target object in the two-dimensional target image area;
获取所述目标物体在图像坐标系中的二维坐标;Acquiring the two-dimensional coordinates of the target object in the image coordinate system;
根据所述图像距离、所述二维坐标、所述目标物体的语义信息和所述拍摄装置的参数,利用所述第二投影模型确定所述二维目标图像区域的初始三维坐标。The second projection model is used to determine the initial three-dimensional coordinates of the two-dimensional target image area according to the image distance, the two-dimensional coordinates, the semantic information of the target object, and the parameters of the photographing device.
进一步地,所述处理器1002根据所述图像距离、所述二维坐标、所述目标物体的语义信息和所述拍摄装置的参数,利用所述第二投影模型确定所述二维目标图像区域的初始三维坐标时,具体用于:Further, when the processor 1002 uses the second projection model to determine the initial three-dimensional coordinates of the two-dimensional target image area according to the image distance, the two-dimensional coordinates, the semantic information of the target object, and the parameters of the photographing device, it is specifically configured to:
根据所述图像距离、所述二维坐标和所述拍摄装置的参数,利用所述第二投影模型确定所述拍摄装置的拍摄视角;Using the second projection model to determine the shooting angle of view of the shooting device according to the image distance, the two-dimensional coordinates, and the parameters of the shooting device;
获取所述目标物体距离所述拍摄装置的实际距离;Acquiring the actual distance of the target object from the photographing device;
根据所述拍摄视角、所述实际距离和所述目标物体的语义信息,利用所述第二投影模型确定所述二维目标图像区域的初始三维坐标。The second projection model is used to determine the initial three-dimensional coordinates of the two-dimensional target image area according to the shooting angle of view, the actual distance, and the semantic information of the target object.
进一步地,所述处理器1002根据所述拍摄视角、所述实际距离和所述目标物体的语义信息,利用所述第二投影模型确定所述二维目标图像区域的初始三维坐标时,具体用于:Further, when the processor 1002 uses the second projection model to determine the initial three-dimensional coordinates of the two-dimensional target image area according to the shooting angle of view, the actual distance, and the semantic information of the target object, it is specifically configured to:
根据所述拍摄视角和所述实际距离,利用所述第二投影模型确定所述目标物体的物理距离;Determine the physical distance of the target object by using the second projection model according to the shooting angle of view and the actual distance;
根据所述拍摄装置的参数、所述拍摄视角、所述二维坐标、所述物理距离和所述目标物体的语义信息,确定所述二维目标图像区域的初始三维坐标。Determine the initial three-dimensional coordinates of the two-dimensional target image area according to the parameters of the photographing device, the photographing angle of view, the two-dimensional coordinates, the physical distance, and the semantic information of the target object.
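The second-projection-model steps above can be sketched for one concrete case. The example below assumes the equidistant model (r = f·θ), a focal length in pixels, and a known actual distance measured along the viewing ray; it also reads the "physical distance" as the off-axis offset d·sin θ, which is only one possible interpretation of the text. The function names are hypothetical and this is not the patented implementation.

```python
import math


def view_angle_equidistant(u: float, v: float, f: float, cx: float, cy: float) -> float:
    """Angle between the viewing ray and the optical axis, assuming r = f * theta."""
    r = math.hypot(u - cx, v - cy)  # image distance from the principal point
    return r / f


def fisheye_back_project(u: float, v: float, distance: float,
                         f: float, cx: float, cy: float):
    """Lift pixel (u, v) to camera coordinates given the distance along the ray."""
    theta = view_angle_equidistant(u, v, f, cx, cy)
    phi = math.atan2(v - cy, u - cx)      # azimuth around the optical axis
    lateral = distance * math.sin(theta)  # assumed meaning of "physical distance"
    return (lateral * math.cos(phi),
            lateral * math.sin(phi),
            distance * math.cos(theta))


if __name__ == "__main__":
    # A pixel at the principal point lies on the optical axis.
    print(fisheye_back_project(640.0, 360.0, 5.0, 300.0, 640.0, 360.0))
```

Unlike the pinhole case, the angle is recovered directly from the radial image distance, so rays at or beyond 90 degrees from the optical axis remain representable.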
进一步地,所述处理器1002根据所述初始三维坐标和所述三维坐标调整信息,确定所述目标物体的目标三维坐标时,具体用于:Further, when the processor 1002 determines the target three-dimensional coordinates of the target object according to the initial three-dimensional coordinates and the three-dimensional coordinate adjustment information, it is specifically configured to:
根据第二神经网络对所述二维目标图像区域进行处理,确定所述二维目标图像区域的三维坐标调整信息;Processing the two-dimensional target image area according to the second neural network, and determining the three-dimensional coordinate adjustment information of the two-dimensional target image area;
根据所述三维坐标调整信息和所述初始三维坐标,确定所述二维目标图像区域的目标三维坐标。Determine the target three-dimensional coordinates of the two-dimensional target image area according to the three-dimensional coordinate adjustment information and the initial three-dimensional coordinates.
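The text does not specify how the adjustment information is combined with the initial coordinates. One common choice, shown here purely as an assumption, is to treat the adjustment as an additive residual offset predicted by the second neural network. The function name `refine_coordinates` is hypothetical.

```python
def refine_coordinates(initial_xyz, adjustment_xyz):
    """Apply an additive offset (assumed form of the adjustment) to a 3D point."""
    if len(initial_xyz) != 3 or len(adjustment_xyz) != 3:
        raise ValueError("both inputs must have exactly three components")
    return tuple(p + d for p, d in zip(initial_xyz, adjustment_xyz))


if __name__ == "__main__":
    print(refine_coordinates((1.0, 0.0, 10.0), (0.5, -0.25, 1.0)))
```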
进一步地,所述拍摄装置的参数包括内参和外参,所述内参包括所述拍摄装置的焦距,所述外参包括所述拍摄装置的光心。Further, the parameters of the shooting device include internal parameters and external parameters, the internal parameters include the focal length of the shooting device, and the external parameters include the optical center of the shooting device.
进一步地,所述第一投影模型包括小孔成像模型。Further, the first projection model includes a pinhole imaging model.
进一步地,所述第二投影模型包括斯涅耳窗投影模型、等积投影模型、等距投影模型、体视投影模型中的任意一种。Further, the second projection model includes any one of a Snell's window projection model, an equal-area projection model, an equidistant projection model, and a stereographic projection model.
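For reference, several of the projection models named above have well-known textbook mapping functions between the incidence angle θ and the radial image distance r. The sketch below inverts them to recover θ from r. The formulas are the standard ones and are not taken from this document; the Snell's window model is omitted because no closed form is given here. The dictionary keys and the function `incidence_angle` are hypothetical names.

```python
import math

# Inverse mapping functions theta(r, f); the forward form is noted per entry.
INVERSE_MAPPINGS = {
    "pinhole": lambda r, f: math.atan(r / f),                       # r = f * tan(theta)
    "equidistant": lambda r, f: r / f,                              # r = f * theta
    "equal_area": lambda r, f: 2.0 * math.asin(r / (2.0 * f)),      # r = 2f * sin(theta / 2)
    "stereographic": lambda r, f: 2.0 * math.atan(r / (2.0 * f)),   # r = 2f * tan(theta / 2)
}


def incidence_angle(model: str, r: float, f: float) -> float:
    """Return the angle to the optical axis for radial image distance r."""
    try:
        inverse = INVERSE_MAPPINGS[model]
    except KeyError:
        raise ValueError(f"unknown projection model: {model}") from None
    return inverse(r, f)


if __name__ == "__main__":
    for name in INVERSE_MAPPINGS:
        print(name, incidence_angle(name, 200.0, 800.0))
```

A lookup table keyed by model name is one simple way to let a single back-projection routine serve images from different lens types.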
本发明实施例中,图像处理设备可以对拍摄装置拍摄得到的初始图像进行处理,以确定初始图像的二维目标图像区域和二维目标图像区域所包含的目标物体的语义信息,并根据二维目标图像区域的投影模型和目标物体的语义信息,确定目标物体的初始三维坐标,以及通过对二维目标图像区域进行区域特征点提取,以基于区域特征点确定目标物体的三维坐标调整信息,从而根据初始三维坐标和三维坐标调整信息,确定目标物体的目标三维坐标。通过这种实施方式,可以在同一个算法框架内对各类投影模型自适应,以适配所有类型的单目图像,提高了图像处理的效率和有效性。In the embodiment of the present invention, the image processing device may process the initial image captured by the photographing device to determine the two-dimensional target image area of the initial image and the semantic information of the target object contained in that area; determine the initial three-dimensional coordinates of the target object according to the projection model of the two-dimensional target image area and the semantic information of the target object; extract regional feature points from the two-dimensional target image area and determine the three-dimensional coordinate adjustment information of the target object based on those feature points; and determine the target three-dimensional coordinates of the target object according to the initial three-dimensional coordinates and the three-dimensional coordinate adjustment information. Through this implementation, the various projection models can be handled adaptively within a single algorithm framework so as to accommodate all types of monocular images, which improves the efficiency and effectiveness of image processing.
本发明实施例还提供了一种图像处理系统,包括:拍摄装置以及上述图像处理设备。本发明实施例中,图像处理设备可以对拍摄装置拍摄得到的初始图像进行处理,以确定初始图像的二维目标图像区域和二维目标图像区域所包含的目标物体的语义信息,并根据二维目标图像区域的投影模型和目标物体的语义信息,确定目标物体的初始三维坐标,以及通过对二维目标图像区域进行区域特征点提取,以基于区域特征点确定目标物体的三维坐标调整信息,从而根据初始三维坐标和三维坐标调整信息,确定目标物体的目标三维坐标。通过这种实施方式,可以自适应地对不同类型的图像进行三维目标检测,提高了图像处理的效率和有效性。An embodiment of the present invention also provides an image processing system, including a photographing device and the above-mentioned image processing device. In the embodiment of the present invention, the image processing device may process the initial image captured by the photographing device to determine the two-dimensional target image area of the initial image and the semantic information of the target object contained in that area; determine the initial three-dimensional coordinates of the target object according to the projection model of the two-dimensional target image area and the semantic information of the target object; extract regional feature points from the two-dimensional target image area and determine the three-dimensional coordinate adjustment information of the target object based on those feature points; and determine the target three-dimensional coordinates of the target object according to the initial three-dimensional coordinates and the three-dimensional coordinate adjustment information. Through this implementation, three-dimensional target detection can be performed adaptively on different types of images, which improves the efficiency and effectiveness of image processing.
本发明的实施例还提供了一种计算机可读存储介质,所述计算机可读存储介质存储有计算机程序,所述计算机程序被处理器执行时实现本发明图2、图5、图6或图8所对应实施例中描述的方法,也可实现图10所述本发明所对应实施例的设备,在此不再赘述。An embodiment of the present invention also provides a computer-readable storage medium storing a computer program. When executed by a processor, the computer program implements the method described in the embodiments corresponding to FIG. 2, FIG. 5, FIG. 6, or FIG. 8 of the present invention, and can also implement the device of the embodiment corresponding to FIG. 10, which is not repeated here.
所述计算机可读存储介质可以是前述任一实施例所述的设备的内部存储单元,例如设备的硬盘或内存。所述计算机可读存储介质也可以是所述设备的外部存储设备,例如所述设备上配备的插接式硬盘,智能存储卡(Smart Media Card,SMC),安全数字(Secure Digital,SD)卡,闪存卡(Flash Card)等。进一步地,所述计算机可读存储介质还可以既包括所述设备的内部存储单元也包括外部存储设备。所述计算机可读存储介质用于存储所述计算机程序以及所述终端所需的其他程序和数据。所述计算机可读存储介质还可以用于暂时地存储已经输出或者将要输出的数据。The computer-readable storage medium may be an internal storage unit of the device described in any of the foregoing embodiments, such as a hard disk or memory of the device. The computer-readable storage medium may also be an external storage device of the device, such as a plug-in hard disk equipped on the device, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash card. Further, the computer-readable storage medium may include both an internal storage unit of the device and an external storage device. The computer-readable storage medium is used to store the computer program and other programs and data required by the terminal. The computer-readable storage medium can also be used to temporarily store data that has been output or will be output.
以上所揭露的仅为本发明部分实施例而已,当然不能以此来限定本发明之权利范围,因此依本发明权利要求所作的等同变化,仍属本发明所涵盖的范围。The above-disclosed are only some of the embodiments of the present invention, which of course cannot be used to limit the scope of the present invention. Therefore, equivalent changes made according to the claims of the present invention still fall within the scope of the present invention.

Claims (55)

  1. 一种图像处理方法,其特征在于,包括:An image processing method, characterized in that it comprises:
    获取拍摄装置拍摄得到的初始图像;Acquiring an initial image taken by the photographing device;
    对所述初始图像进行处理,确定所述初始图像的二维目标图像区域和所述二维目标图像区域所包含的目标物体的语义信息;Processing the initial image to determine the two-dimensional target image area of the initial image and the semantic information of the target object contained in the two-dimensional target image area;
    根据所述二维目标图像区域的投影模型和所述目标物体的语义信息,确定所述目标物体的初始三维坐标;Determine the initial three-dimensional coordinates of the target object according to the projection model of the two-dimensional target image area and the semantic information of the target object;
    对所述二维目标图像区域进行区域特征点提取,并基于所述区域特征点确定所述目标物体的三维坐标调整信息;Performing regional feature point extraction on the two-dimensional target image area, and determining the three-dimensional coordinate adjustment information of the target object based on the regional feature points;
    根据所述初始三维坐标和所述三维坐标调整信息,确定所述目标物体的目标三维坐标。Determine the target three-dimensional coordinates of the target object according to the initial three-dimensional coordinates and the three-dimensional coordinate adjustment information.
  2. 根据权利要求1所述的方法,其特征在于,所述对所述初始图像进行处理,确定所述初始图像的二维目标图像区域和所述二维目标图像区域所包含的目标物体的语义信息,包括:The method according to claim 1, wherein processing the initial image to determine the two-dimensional target image area of the initial image and the semantic information of the target object contained in the two-dimensional target image area comprises:
    根据第一神经网络对所述初始图像进行处理,确定所述初始图像的二维目标图像区域和所述二维目标图像区域所包含的目标物体的语义信息。The initial image is processed according to the first neural network, and the two-dimensional target image area of the initial image and the semantic information of the target object contained in the two-dimensional target image area are determined.
  3. 根据权利要求1所述的方法,其特征在于,所述对所述二维目标图像区域进行区域特征点提取,包括:The method according to claim 1, wherein said extracting regional feature points of said two-dimensional target image region comprises:
    根据第二神经网络对所述二维目标图像区域进行处理,以提取所述二维目标图像区域的特征点信息。The two-dimensional target image area is processed according to the second neural network to extract feature point information of the two-dimensional target image area.
  4. 根据权利要求1所述的方法,其特征在于,所述方法还包括:The method according to claim 1, wherein the method further comprises:
    根据第二神经网络对所述二维目标图像区域进行检测,得到所述二维目标图像区域的区域信息;Detecting the two-dimensional target image area according to the second neural network to obtain the area information of the two-dimensional target image area;
    其中,所述区域信息包括所述二维目标图像区域的类别信息、三维尺寸信息、朝向信息、二维外接矩阵信息中的任意一种或多种。Wherein, the area information includes any one or more of category information, three-dimensional size information, orientation information, and two-dimensional circumscribed matrix information of the two-dimensional target image area.
  5. 根据权利要求1或2所述的方法,其特征在于,所述根据所述二维目标图像区域的投影模型和所述目标物体的语义信息,确定所述目标物体的初始三维坐标,包括:The method according to claim 1 or 2, wherein the determining the initial three-dimensional coordinates of the target object according to the projection model of the two-dimensional target image area and the semantic information of the target object comprises:
    获取所述二维目标图像区域的投影模型;Acquiring a projection model of the two-dimensional target image area;
    获取所述拍摄装置的参数;Acquiring the parameters of the photographing device;
    根据所述二维目标图像区域的投影模型、所述目标物体的语义信息和所述拍摄装置的参数,确定所述二维目标图像区域的初始三维坐标。The initial three-dimensional coordinates of the two-dimensional target image area are determined according to the projection model of the two-dimensional target image area, the semantic information of the target object, and the parameters of the photographing device.
  6. 根据权利要求5所述的方法,其特征在于,所述投影模型为预设的投影模型。The method according to claim 5, wherein the projection model is a preset projection model.
  7. 根据权利要求5所述的方法,其特征在于,所述获取所述二维目标图像区域的投影模型,包括:The method according to claim 5, wherein said obtaining the projection model of the two-dimensional target image area comprises:
    根据所述二维目标图像区域的特征点信息确定所述二维目标图像区域的类别信息;Determining the category information of the two-dimensional target image area according to the feature point information of the two-dimensional target image area;
    根据所述二维目标图像区域的类别信息,确定所述二维目标图像区域的投影模型。According to the category information of the two-dimensional target image area, a projection model of the two-dimensional target image area is determined.
  8. 根据权利要求5所述的方法,其特征在于,所述获取所述二维目标图像区域的投影模型,包括:The method according to claim 5, wherein said obtaining the projection model of the two-dimensional target image area comprises:
    根据第一神经网络对所述初始图像进行处理,得到所述初始图像的投影模型;Processing the initial image according to the first neural network to obtain a projection model of the initial image;
    确定所述初始图像的投影模型为所述二维目标图像区域的投影模型。It is determined that the projection model of the initial image is the projection model of the two-dimensional target image area.
  9. 根据权利要求5所述的方法,其特征在于,所述获取所述二维目标图像区域的投影模型,包括:The method according to claim 5, wherein said obtaining the projection model of the two-dimensional target image area comprises:
    根据第三神经网络对所述初始图像进行处理,得到所述初始图像的投影模型;Processing the initial image according to a third neural network to obtain a projection model of the initial image;
    确定所述初始图像的投影模型为所述二维目标图像区域的投影模型。It is determined that the projection model of the initial image is the projection model of the two-dimensional target image area.
  10. 根据权利要求5-9任一项所述的方法,其特征在于,所述投影模型包括第一投影模型;所述根据所述二维目标图像区域的投影模型、所述目标物体的语义信息和所述拍摄装置的参数,确定所述二维目标图像区域的初始三维坐标,包括:The method according to any one of claims 5-9, wherein the projection model comprises a first projection model, and determining the initial three-dimensional coordinates of the two-dimensional target image area according to the projection model of the two-dimensional target image area, the semantic information of the target object, and the parameters of the photographing device comprises:
    获取所述目标物体在所述二维目标图像区域中的图像高度和目标物体的实际高度;Acquiring the image height of the target object in the two-dimensional target image area and the actual height of the target object;
    根据所述图像高度、所述实际高度、所述目标物体的语义信息和所述拍摄装置的参数,利用所述第一投影模型确定所述二维目标图像区域的初始三维坐标。The first projection model is used to determine the initial three-dimensional coordinates of the two-dimensional target image area according to the image height, the actual height, the semantic information of the target object, and the parameters of the photographing device.
  11. 根据权利要求10所述的方法,其特征在于,所述根据所述图像高度、所述实际高度、所述目标物体的语义信息和所述拍摄装置的参数,利用所述第一投影模型确定所述二维目标图像区域的初始三维坐标,包括:The method according to claim 10, wherein using the first projection model to determine the initial three-dimensional coordinates of the two-dimensional target image area according to the image height, the actual height, the semantic information of the target object, and the parameters of the photographing device comprises:
    根据所述图像高度、所述实际高度和所述拍摄装置的参数,利用所述第一投影模型确定所述目标物体距离所述拍摄装置的实际距离;Using the first projection model to determine the actual distance of the target object from the shooting device according to the image height, the actual height, and the parameters of the shooting device;
    获取所述目标物体在图像坐标系中的二维坐标;Acquiring the two-dimensional coordinates of the target object in the image coordinate system;
    根据所述拍摄装置的参数、所述实际距离、所述二维坐标和所述目标物体的语义信息,利用所述第一投影模型确定所述二维目标图像区域的初始三维坐标。According to the parameters of the photographing device, the actual distance, the two-dimensional coordinates, and the semantic information of the target object, the first projection model is used to determine the initial three-dimensional coordinates of the two-dimensional target image area.
  12. 根据权利要求5-9任一项所述的方法,其特征在于,所述投影模型包括第二投影模型;所述根据所述二维目标图像区域的投影模型、所述目标物体的语义信息和所述拍摄装置的参数,确定所述二维目标图像区域的初始三维坐标,包括:The method according to any one of claims 5-9, wherein the projection model comprises a second projection model, and determining the initial three-dimensional coordinates of the two-dimensional target image area according to the projection model of the two-dimensional target image area, the semantic information of the target object, and the parameters of the photographing device comprises:
    获取所述目标物体在所述二维目标图像区域中的图像距离;Acquiring the image distance of the target object in the two-dimensional target image area;
    获取所述目标物体在图像坐标系中的二维坐标;Acquiring the two-dimensional coordinates of the target object in the image coordinate system;
    根据所述图像距离、所述二维坐标、所述目标物体的语义信息和所述拍摄装置的参数,利用所述第二投影模型确定所述二维目标图像区域的初始三维坐标。The second projection model is used to determine the initial three-dimensional coordinates of the two-dimensional target image area according to the image distance, the two-dimensional coordinates, the semantic information of the target object, and the parameters of the photographing device.
  13. 根据权利要求12所述的方法,其特征在于,所述根据所述图像距离、所述二维坐标、所述目标物体的语义信息和所述拍摄装置的参数,利用所述第二投影模型确定所述二维目标图像区域的初始三维坐标,包括:The method according to claim 12, wherein using the second projection model to determine the initial three-dimensional coordinates of the two-dimensional target image area according to the image distance, the two-dimensional coordinates, the semantic information of the target object, and the parameters of the photographing device comprises:
    根据所述图像距离、所述二维坐标和所述拍摄装置的参数,利用所述第二投影模型确定所述拍摄装置的拍摄视角;Using the second projection model to determine the shooting angle of view of the shooting device according to the image distance, the two-dimensional coordinates, and the parameters of the shooting device;
    获取所述目标物体距离所述拍摄装置的实际距离;Acquiring the actual distance of the target object from the photographing device;
    根据所述拍摄视角、所述实际距离和所述目标物体的语义信息,利用所述第二投影模型确定所述二维目标图像区域的初始三维坐标。The second projection model is used to determine the initial three-dimensional coordinates of the two-dimensional target image area according to the shooting angle of view, the actual distance, and the semantic information of the target object.
  14. 根据权利要求13所述的方法,其特征在于,所述根据所述拍摄视角、所述实际距离和所述目标物体的语义信息,利用所述第二投影模型确定所述二维目标图像区域的初始三维坐标,包括:The method according to claim 13, wherein using the second projection model to determine the initial three-dimensional coordinates of the two-dimensional target image area according to the shooting angle of view, the actual distance, and the semantic information of the target object comprises:
    根据所述拍摄视角和所述实际距离,利用所述第二投影模型确定所述目标物体的物理距离;Determine the physical distance of the target object by using the second projection model according to the shooting angle of view and the actual distance;
    根据所述拍摄装置的参数、所述拍摄视角、所述二维坐标、所述物理距离和所述目标物体的语义信息,确定所述二维目标图像区域的初始三维坐标。Determine the initial three-dimensional coordinates of the two-dimensional target image area according to the parameters of the photographing device, the photographing angle of view, the two-dimensional coordinates, the physical distance, and the semantic information of the target object.
  15. 根据权利要求4所述的方法,其特征在于,所述根据所述初始三维坐标和所述三维坐标调整信息,确定所述目标物体的目标三维坐标,包括:The method according to claim 4, wherein the determining the target three-dimensional coordinates of the target object according to the initial three-dimensional coordinates and the three-dimensional coordinate adjustment information comprises:
    根据第二神经网络对所述二维目标图像区域进行处理,确定所述二维目标图像区域的三维坐标调整信息;Processing the two-dimensional target image area according to the second neural network, and determining the three-dimensional coordinate adjustment information of the two-dimensional target image area;
    根据所述三维坐标调整信息和所述初始三维坐标,确定所述二维目标图像区域的目标三维坐标。Determine the target three-dimensional coordinates of the two-dimensional target image area according to the three-dimensional coordinate adjustment information and the initial three-dimensional coordinates.
  16. 根据权利要求1所述的方法,其特征在于,所述拍摄装置的参数包括内参和外参,所述内参包括所述拍摄装置的焦距,所述外参包括所述拍摄装置的光心。The method according to claim 1, wherein the parameters of the shooting device include internal parameters and external parameters, the internal parameters include the focal length of the shooting device, and the external parameters include the optical center of the shooting device.
  17. 根据权利要求10所述的方法,其特征在于,所述第一投影模型包括小孔成像模型。The method according to claim 10, wherein the first projection model comprises a pinhole imaging model.
  18. 根据权利要求12所述的方法,其特征在于,所述第二投影模型包括斯涅耳窗投影模型、等积投影模型、等距投影模型、体视投影模型中的任意一种。The method according to claim 12, wherein the second projection model comprises any one of a Snell's window projection model, an equal-area projection model, an equidistant projection model, and a stereographic projection model.
  19. An image processing device, comprising a memory and a processor, wherein:
    the memory is configured to store a program;
    the processor is configured to execute the program stored in the memory, and when the program is executed, the processor is configured to:
    acquire an initial image captured by a photographing device;
    process the initial image to determine a two-dimensional target image area of the initial image and semantic information of a target object contained in the two-dimensional target image area;
    determine initial three-dimensional coordinates of the target object according to a projection model of the two-dimensional target image area and the semantic information of the target object;
    perform regional feature point extraction on the two-dimensional target image area, and determine three-dimensional coordinate adjustment information of the target object based on the regional feature points; and
    determine target three-dimensional coordinates of the target object according to the initial three-dimensional coordinates and the three-dimensional coordinate adjustment information.
  20. The device according to claim 19, wherein, when processing the initial image to determine the two-dimensional target image area of the initial image and the semantic information of the target object contained in the two-dimensional target image area, the processor is specifically configured to:
    process the initial image according to a first neural network to determine the two-dimensional target image area of the initial image and the semantic information of the target object contained in the two-dimensional target image area.
  21. The device according to claim 19, wherein, when performing regional feature point extraction on the two-dimensional target image area, the processor is specifically configured to:
    process the two-dimensional target image area according to a second neural network to extract feature point information of the two-dimensional target image area.
  22. The device according to claim 19, wherein the processor is further configured to:
    detect the two-dimensional target image area according to a second neural network to obtain area information of the two-dimensional target image area;
    wherein the area information includes any one or more of category information, three-dimensional size information, orientation information, and two-dimensional circumscribed matrix information of the two-dimensional target image area.
  23. The device according to claim 19 or 21, wherein, when determining the initial three-dimensional coordinates of the target object according to the projection model of the two-dimensional target image area and the semantic information of the target object, the processor is specifically configured to:
    acquire the projection model of the two-dimensional target image area;
    acquire parameters of the photographing device; and
    determine initial three-dimensional coordinates of the two-dimensional target image area according to the projection model of the two-dimensional target image area, the semantic information of the target object, and the parameters of the photographing device.
  24. The device according to claim 23, wherein the projection model is a preset projection model.
  25. The device according to claim 23, wherein, when acquiring the projection model of the two-dimensional target image area, the processor is specifically configured to:
    determine category information of the two-dimensional target image area according to the feature point information of the two-dimensional target image area; and
    determine the projection model of the two-dimensional target image area according to the category information of the two-dimensional target image area.
  26. The device according to claim 23, wherein, when acquiring the projection model of the two-dimensional target image area, the processor is specifically configured to:
    process the initial image according to a first neural network to obtain a projection model of the initial image; and
    determine the projection model of the initial image to be the projection model of the two-dimensional target image area.
  27. The device according to claim 23, wherein, when acquiring the projection model of the two-dimensional target image area, the processor is specifically configured to:
    process the initial image according to a third neural network to obtain a projection model of the initial image; and
    determine the projection model of the initial image to be the projection model of the two-dimensional target image area.
  28. The device according to any one of claims 23-27, wherein the projection model comprises a first projection model; and when determining the initial three-dimensional coordinates of the two-dimensional target image area according to the projection model of the two-dimensional target image area, the semantic information of the target object, and the parameters of the photographing device, the processor is specifically configured to:
    acquire an image height of the target object in the two-dimensional target image area and an actual height of the target object; and
    determine, using the first projection model, the initial three-dimensional coordinates of the two-dimensional target image area according to the image height, the actual height, the semantic information of the target object, and the parameters of the photographing device.
  29. The device according to claim 28, wherein, when determining, using the first projection model, the initial three-dimensional coordinates of the two-dimensional target image area according to the image height, the actual height, the semantic information of the target object, and the parameters of the photographing device, the processor is specifically configured to:
    determine, using the first projection model, an actual distance between the target object and the photographing device according to the image height, the actual height, and the parameters of the photographing device;
    acquire two-dimensional coordinates of the target object in an image coordinate system; and
    determine, using the first projection model, the initial three-dimensional coordinates of the two-dimensional target image area according to the parameters of the photographing device, the actual distance, the two-dimensional coordinates, and the semantic information of the target object.
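For illustration only, the distance and back-projection steps recited for the first (pinhole) projection model might be sketched as follows. This is a minimal sketch and not the claimed implementation: it assumes the "actual distance" is the depth along the optical axis, that the focal lengths `fx`, `fy` and the principal point `cx`, `cy` are given in pixels, and that the actual height comes from a lookup keyed on the semantic class. The function name and all parameter names are hypothetical.

```python
def pinhole_initial_xyz(u, v, image_height_px, real_height_m, fx, fy, cx, cy):
    """Estimate camera-frame (X, Y, Z) of an object under a pinhole model.

    Hypothetical helper: (u, v) is the object's pixel position,
    image_height_px its height in the image, real_height_m its assumed
    physical height, and fx, fy, cx, cy the camera intrinsics in pixels.
    """
    if image_height_px <= 0:
        raise ValueError("image height must be positive")
    # Similar triangles: h_img / f = H_real / Z, so Z = f * H_real / h_img.
    z = fy * real_height_m / image_height_px
    # Back-project the pixel to the estimated depth.
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return x, y, z
```

With these assumptions, an object 1.5 m tall that spans 100 pixels under a 1000-pixel focal length is placed 15 m in front of the camera.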
  30. The device according to any one of claims 23-27, wherein the projection model comprises a second projection model; and when determining the initial three-dimensional coordinates of the two-dimensional target image area according to the projection model of the two-dimensional target image area, the semantic information of the target object, and the parameters of the photographing device, the processor is specifically configured to:
    acquire an image distance of the target object in the two-dimensional target image area;
    acquire two-dimensional coordinates of the target object in an image coordinate system; and
    determine, using the second projection model, the initial three-dimensional coordinates of the two-dimensional target image area according to the image distance, the two-dimensional coordinates, the semantic information of the target object, and the parameters of the photographing device.
  31. The device according to claim 30, wherein, when determining, using the second projection model, the initial three-dimensional coordinates of the two-dimensional target image area according to the image distance, the two-dimensional coordinates, the semantic information of the target object, and the parameters of the photographing device, the processor is specifically configured to:
    determine, using the second projection model, a shooting angle of view of the photographing device according to the image distance, the two-dimensional coordinates, and the parameters of the photographing device;
    acquire an actual distance between the target object and the photographing device; and
    determine, using the second projection model, the initial three-dimensional coordinates of the two-dimensional target image area according to the shooting angle of view, the actual distance, and the semantic information of the target object.
  32. The device according to claim 31, wherein, when determining, using the second projection model, the initial three-dimensional coordinates of the two-dimensional target image area according to the shooting angle of view, the actual distance, and the semantic information of the target object, the processor is specifically configured to:
    determine, using the second projection model, a physical distance of the target object according to the shooting angle of view and the actual distance; and
    determine the initial three-dimensional coordinates of the two-dimensional target image area according to the parameters of the photographing device, the shooting angle of view, the two-dimensional coordinates, the physical distance, and the semantic information of the target object.
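As a rough illustration of the second-projection-model path, the sketch below assumes an equidistant fisheye model (r = f·θ), takes the "image distance" to be the radial pixel distance from the principal point, and takes the "actual distance" to be the range along the viewing ray. None of these interpretations is confirmed by the claim text, and the function and parameter names are hypothetical.

```python
import math


def equidistant_initial_xyz(u, v, distance_m, f_px, cx, cy):
    """Estimate camera-frame (X, Y, Z) under an assumed equidistant model.

    Hypothetical helper: (u, v) is the object's pixel position, distance_m
    the assumed range along the viewing ray, f_px the focal length in
    pixels, and (cx, cy) the principal point.
    """
    du, dv = u - cx, v - cy
    r = math.hypot(du, dv)      # image distance from the principal point
    theta = r / f_px            # equidistant model: r = f * theta
    phi = math.atan2(dv, du)    # azimuth of the pixel around the optical axis
    # Convert the ray direction (theta, phi) and range to Cartesian coordinates.
    x = distance_m * math.sin(theta) * math.cos(phi)
    y = distance_m * math.sin(theta) * math.sin(phi)
    z = distance_m * math.cos(theta)
    return x, y, z
```

A pixel at the principal point maps to a point straight ahead on the optical axis, while a pixel at radius f·π/2 maps to a point 90 degrees off axis.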
  33. The device according to claim 22, wherein, when determining the target three-dimensional coordinates of the target object according to the initial three-dimensional coordinates and the three-dimensional coordinate adjustment information, the processor is specifically configured to:
    process the two-dimensional target image area according to the second neural network to determine three-dimensional coordinate adjustment information of the two-dimensional target image area; and
    determine target three-dimensional coordinates of the two-dimensional target image area according to the three-dimensional coordinate adjustment information and the initial three-dimensional coordinates.
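The claim does not state how the adjustment information is combined with the initial coordinates. One simple possibility, shown here purely as an assumption, is a per-axis additive offset predicted by the network; the function name is hypothetical.

```python
def apply_adjustment(initial_xyz, adjustment_xyz):
    """Combine initial coordinates with an assumed additive per-axis offset."""
    if len(initial_xyz) != 3 or len(adjustment_xyz) != 3:
        raise ValueError("both inputs must have exactly three components")
    # Assumption: target = initial + offset, applied independently per axis.
    return tuple(p + d for p, d in zip(initial_xyz, adjustment_xyz))
```

Other combinations (for example a multiplicative depth scale) would fit the claim wording equally well; the additive form is chosen only because it is the simplest to illustrate.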
  34. The device according to claim 19, wherein the parameters of the photographing device include intrinsic parameters and extrinsic parameters, the intrinsic parameters include the focal length of the photographing device, and the extrinsic parameters include the optical center of the photographing device.
  35. The device according to claim 28, wherein the first projection model comprises a pinhole imaging model.
  36. The device according to claim 30, wherein the second projection model comprises any one of a Snell's window projection model, an equal-area projection model, an equidistant projection model, and a stereographic projection model.
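For reference, three of the listed fisheye models have well-known textbook forms relating the radial image distance r to the incidence angle θ: equidistant r = f·θ, equal-area (equisolid-angle) r = 2f·sin(θ/2), and stereographic r = 2f·tan(θ/2). The sketch below inverts those forms to recover θ from r. It assumes the claim's model names correspond to these standard definitions, omits the Snell's window model, and uses a hypothetical function name and model keys.

```python
import math


def incidence_angle(r, f, model):
    """Recover the incidence angle theta (radians) from radial distance r.

    r and f must share the same unit (for example pixels). The model names
    are illustrative keys, not identifiers taken from the patent.
    """
    if model == "equidistant":        # r = f * theta
        return r / f
    if model == "equisolid":          # r = 2 f sin(theta / 2)
        ratio = r / (2.0 * f)
        if ratio > 1.0:
            raise ValueError("r exceeds the image circle of the equisolid model")
        return 2.0 * math.asin(ratio)
    if model == "stereographic":      # r = 2 f tan(theta / 2)
        return 2.0 * math.atan(r / (2.0 * f))
    raise ValueError(f"unsupported projection model: {model}")
```

All three inverses agree closely near the optical axis and diverge toward the edge of the field of view, which is why the choice of model matters most for objects far from the image center.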
  37. An image processing system, comprising:
    a photographing device configured to capture an initial image; and
    an image processing device configured to:
    acquire the initial image captured by the photographing device;
    process the initial image to determine a two-dimensional target image area of the initial image and semantic information of a target object contained in the two-dimensional target image area;
    determine initial three-dimensional coordinates of the target object according to a projection model of the two-dimensional target image area and the semantic information of the target object;
    perform regional feature point extraction on the two-dimensional target image area, and determine three-dimensional coordinate adjustment information of the target object based on the regional feature points; and
    determine target three-dimensional coordinates of the target object according to the initial three-dimensional coordinates and the three-dimensional coordinate adjustment information.
  38. The system according to claim 37, wherein, when processing the initial image to determine the two-dimensional target image area of the initial image and the semantic information of the target object contained in the two-dimensional target image area, the image processing device is specifically configured to:
    process the initial image according to a first neural network to determine the two-dimensional target image area of the initial image and the semantic information of the target object contained in the two-dimensional target image area.
  39. The system according to claim 37, wherein, when performing regional feature point extraction on the two-dimensional target image area, the image processing device is specifically configured to:
    process the two-dimensional target image area according to a second neural network to extract feature point information of the two-dimensional target image area.
  40. The system according to claim 37, wherein the image processing device is further configured to:
    detect the two-dimensional target image area according to a second neural network to obtain area information of the two-dimensional target image area;
    wherein the area information includes any one or more of category information, three-dimensional size information, orientation information, and two-dimensional circumscribed matrix information of the two-dimensional target image area.
  41. The system according to claim 37 or 39, wherein, when determining the initial three-dimensional coordinates of the target object according to the projection model of the two-dimensional target image area and the semantic information of the target object, the image processing device is specifically configured to:
    acquire the projection model of the two-dimensional target image area;
    acquire parameters of the photographing device; and
    determine initial three-dimensional coordinates of the two-dimensional target image area according to the projection model of the two-dimensional target image area, the semantic information of the target object, and the parameters of the photographing device.
  42. The system according to claim 41, wherein the projection model is a preset projection model.
  43. The system according to claim 41, wherein, when acquiring the projection model of the two-dimensional target image area, the image processing device is specifically configured to:
    determine category information of the two-dimensional target image area according to the feature point information of the two-dimensional target image area; and
    determine the projection model of the two-dimensional target image area according to the category information of the two-dimensional target image area.
  44. The system according to claim 41, wherein, when acquiring the projection model of the two-dimensional target image area, the image processing device is specifically configured to:
    process the initial image according to a first neural network to obtain a projection model of the initial image; and
    determine the projection model of the initial image to be the projection model of the two-dimensional target image area.
  45. The system according to claim 41, wherein, when acquiring the projection model of the two-dimensional target image area, the image processing device is specifically configured to:
    process the initial image according to a third neural network to obtain a projection model of the initial image; and
    determine the projection model of the initial image to be the projection model of the two-dimensional target image area.
  46. The system according to any one of claims 41-45, wherein the projection model comprises a first projection model; and when determining the initial three-dimensional coordinates of the two-dimensional target image area according to the projection model of the two-dimensional target image area, the semantic information of the target object, and the parameters of the photographing device, the image processing device is specifically configured to:
    acquire an image height of the target object in the two-dimensional target image area and an actual height of the target object; and
    determine, using the first projection model, the initial three-dimensional coordinates of the two-dimensional target image area according to the image height, the actual height, the semantic information of the target object, and the parameters of the photographing device.
  47. The system according to claim 46, wherein, when determining, using the first projection model, the initial three-dimensional coordinates of the two-dimensional target image area according to the image height, the actual height, the semantic information of the target object, and the parameters of the photographing device, the image processing device is specifically configured to:
    determine, using the first projection model, an actual distance between the target object and the photographing device according to the image height, the actual height, and the parameters of the photographing device;
    acquire two-dimensional coordinates of the target object in an image coordinate system; and
    determine, using the first projection model, the initial three-dimensional coordinates of the two-dimensional target image area according to the parameters of the photographing device, the actual distance, the two-dimensional coordinates, and the semantic information of the target object.
  48. The system according to any one of claims 41-45, wherein the projection model comprises a second projection model; and when determining the initial three-dimensional coordinates of the two-dimensional target image area according to the projection model of the two-dimensional target image area, the semantic information of the target object, and the parameters of the photographing device, the image processing device is specifically configured to:
    acquire an image distance of the target object in the two-dimensional target image area;
    acquire two-dimensional coordinates of the target object in an image coordinate system; and
    determine, using the second projection model, the initial three-dimensional coordinates of the two-dimensional target image area according to the image distance, the two-dimensional coordinates, the semantic information of the target object, and the parameters of the photographing device.
  49. The system according to claim 48, wherein, when determining, using the second projection model, the initial three-dimensional coordinates of the two-dimensional target image area according to the image distance, the two-dimensional coordinates, the semantic information of the target object, and the parameters of the photographing device, the image processing device is specifically configured to:
    determine, using the second projection model, a shooting angle of view of the photographing device according to the image distance, the two-dimensional coordinates, and the parameters of the photographing device;
    acquire an actual distance between the target object and the photographing device; and
    determine, using the second projection model, the initial three-dimensional coordinates of the two-dimensional target image area according to the shooting angle of view, the actual distance, and the semantic information of the target object.
  50. The system according to claim 49, wherein, when determining, using the second projection model, the initial three-dimensional coordinates of the two-dimensional target image area according to the shooting angle of view, the actual distance, and the semantic information of the target object, the image processing device is specifically configured to:
    determine, using the second projection model, a physical distance of the target object according to the shooting angle of view and the actual distance; and
    determine the initial three-dimensional coordinates of the two-dimensional target image area according to the parameters of the photographing device, the shooting angle of view, the two-dimensional coordinates, the physical distance, and the semantic information of the target object.
  51. The system according to claim 40, wherein, when determining the target three-dimensional coordinates of the target object according to the initial three-dimensional coordinates and the three-dimensional coordinate adjustment information, the image processing device is specifically configured to:
    process the two-dimensional target image area according to the second neural network to determine three-dimensional coordinate adjustment information of the two-dimensional target image area; and
    determine target three-dimensional coordinates of the two-dimensional target image area according to the three-dimensional coordinate adjustment information and the initial three-dimensional coordinates.
  52. The system according to claim 37, wherein the parameters of the photographing device include intrinsic parameters and extrinsic parameters, the intrinsic parameters include the focal length of the photographing device, and the extrinsic parameters include the optical center of the photographing device.
  53. The system according to claim 46, wherein the first projection model comprises a pinhole imaging model.
  54. The system according to claim 48, wherein the second projection model comprises any one of a Snell's window projection model, an equal-area projection model, an equidistant projection model, and a stereographic projection model.
  55. A computer-readable storage medium storing a computer program, wherein, when executed by a processor, the computer program implements the steps of the method according to any one of claims 1 to 18.
PCT/CN2019/129364 2019-12-27 2019-12-27 Image processing method and device, image processing system and storage medium WO2021128314A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/CN2019/129364 WO2021128314A1 (en) 2019-12-27 2019-12-27 Image processing method and device, image processing system and storage medium
CN201980094989.8A CN113661513A (en) 2019-12-27 2019-12-27 Image processing method, image processing device, image processing system and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2019/129364 WO2021128314A1 (en) 2019-12-27 2019-12-27 Image processing method and device, image processing system and storage medium

Publications (1)

Publication Number Publication Date
WO2021128314A1 (en) 2021-07-01

Family

ID=76573529

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/129364 WO2021128314A1 (en) 2019-12-27 2019-12-27 Image processing method and device, image processing system and storage medium

Country Status (2)

Country Link
CN (1) CN113661513A (en)
WO (1) WO2021128314A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113887290A (en) * 2021-08-31 2022-01-04 际络科技(上海)有限公司 Monocular 3D detection method and device, electronic equipment and storage medium

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101839692A (en) * 2010-05-27 2010-09-22 西安交通大学 Method for measuring three-dimensional position and stance of object with single camera
US20100315505A1 (en) * 2009-05-29 2010-12-16 Honda Research Institute Europe Gmbh Object motion detection system based on combining 3d warping techniques and a proper object motion detection
US20150279046A1 (en) * 2012-09-28 2015-10-01 Raytheon Company System for correcting rpc camera model pointing errors using 2 sets of stereo image pairs and probabilistic 3-dimensional models
CN107248178A (en) * 2017-06-08 2017-10-13 上海赫千电子科技有限公司 A kind of fisheye camera scaling method based on distortion parameter
CN109741241A (en) * 2018-12-26 2019-05-10 斑马网络技术有限公司 Processing method, device, equipment and the storage medium of fish eye images
CN109764858A (en) * 2018-12-24 2019-05-17 中公高科养护科技股份有限公司 A kind of photogrammetric survey method and system based on monocular camera
CN110060202A (en) * 2019-04-19 2019-07-26 湖北亿咖通科技有限公司 A kind of initial method and system of monocular SLAM algorithm
CN110070025A (en) * 2019-04-17 2019-07-30 上海交通大学 Objective detection system and method based on monocular image

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP4950787B2 (en) * 2007-07-12 2012-06-13 株式会社東芝 Image processing apparatus and method
CN110148196B (en) * 2018-09-12 2022-03-25 腾讯大地通途(北京)科技有限公司 Image processing method and device and related equipment
CN109919999B (en) * 2019-01-31 2021-06-11 深兰科技(上海)有限公司 Target position detection method and device

Also Published As

Publication number Publication date
CN113661513A (en) 2021-11-16

Similar Documents

Publication Publication Date Title
WO2020103427A1 (en) Object detection method, related device and computer storage medium
WO2019042426A1 (en) Augmented reality scene processing method and apparatus, and computer storage medium
CN106960454B (en) Depth of field obstacle avoidance method and equipment and unmanned aerial vehicle
CN109479082B (en) Image processing method and apparatus
WO2020258286A1 (en) Image processing method and device, photographing device and movable platform
WO2020237565A1 (en) Target tracking method and device, movable platform and storage medium
JP2004334819A (en) Stereo calibration device and stereo image monitoring device using same
EP2610778A1 (en) Method of detecting an obstacle and driver assist system
CN113508420A (en) Object tracking device and object tracking method
CN111213153A (en) Target object motion state detection method, device and storage medium
WO2018102990A1 (en) System and method for rectifying a wide-angle image
JP2020052647A (en) Object detection device, object detection method, computer program for object detection, and vehicle control system
CN107749069B (en) Image processing method, electronic device and image processing system
WO2020124517A1 (en) Photographing equipment control method, photographing equipment control device and photographing equipment
TW202029134A (en) Driving detection method, vehicle and driving processing device
JP6617150B2 (en) Object detection method and object detection apparatus
WO2020024182A1 (en) Parameter processing method and apparatus, camera device and aircraft
CN105335959A (en) Quick focusing method and device for imaging apparatus
WO2021128314A1 (en) Image processing method and device, image processing system and storage medium
US11539871B2 (en) Electronic device for performing object detection and operation method thereof
JP2018073275A (en) Image recognition device
TWI499999B (en) The 3D ring car image system based on probability calculation and its obtaining method
TWI658431B (en) Image processing method, image processing device and computer readable storage medium
CN111656404A (en) Image processing method and system and movable platform
CN113011212B (en) Image recognition method and device and vehicle

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19957217

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19957217

Country of ref document: EP

Kind code of ref document: A1