CN118226804A - Method for determining a mission plan for a mobile device - Google Patents
- Publication number
- CN118226804A (application CN202311757063.7A)
- Authority
- CN
- China
- Prior art keywords
- mobile device
- motion
- environment
- plan
- determined
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/06—Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
-
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05B—CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
- G05B19/00—Programme-control systems
- G05B19/02—Programme-control systems electric
- G05B19/418—Total factory control, i.e. centrally controlling a plurality of machines, e.g. direct or distributed numerical control [DNC], flexible manufacturing systems [FMS], integrated manufacturing systems [IMS] or computer integrated manufacturing [CIM]
- G05B19/4189—Total factory control, i.e. centrally controlling a plurality of machines, e.g. direct or distributed numerical control [DNC], flexible manufacturing systems [FMS], integrated manufacturing systems [IMS] or computer integrated manufacturing [CIM] characterised by the transport system
- G05B19/41895—Total factory control, i.e. centrally controlling a plurality of machines, e.g. direct or distributed numerical control [DNC], flexible manufacturing systems [FMS], integrated manufacturing systems [IMS] or computer integrated manufacturing [CIM] characterised by the transport system using automatic guided vehicles [AGV]
-
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05D—SYSTEMS FOR CONTROLLING OR REGULATING NON-ELECTRIC VARIABLES
- G05D1/00—Control of position, course, altitude or attitude of land, water, air or space vehicles, e.g. using automatic pilots
- G05D1/20—Control system inputs
- G05D1/24—Arrangements for determining position or orientation
- G05D1/246—Arrangements for determining position or orientation using environment maps, e.g. simultaneous localisation and mapping [SLAM]
- G05D1/2464—Arrangements for determining position or orientation using environment maps, e.g. simultaneous localisation and mapping [SLAM] using an occupancy grid
-
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05D—SYSTEMS FOR CONTROLLING OR REGULATING NON-ELECTRIC VARIABLES
- G05D1/00—Control of position, course, altitude or attitude of land, water, air or space vehicles, e.g. using automatic pilots
- G05D1/60—Intended control result
- G05D1/617—Safety or protection, e.g. defining protection zones around obstacles or avoiding hazards
- G05D1/622—Obstacle avoidance
- G05D1/633—Dynamic obstacles
-
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05B—CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
- G05B2219/00—Program-control systems
- G05B2219/30—Nc systems
- G05B2219/32—Operator till task planning
- G05B2219/32252—Scheduling production, machining, job shop
-
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05D—SYSTEMS FOR CONTROLLING OR REGULATING NON-ELECTRIC VARIABLES
- G05D2109/00—Types of controlled vehicles
- G05D2109/10—Land vehicles
Landscapes
- Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Physics & Mathematics (AREA)
- Business, Economics & Management (AREA)
- Automation & Control Theory (AREA)
- Economics (AREA)
- Entrepreneurship & Innovation (AREA)
- Strategic Management (AREA)
- Human Resources & Organizations (AREA)
- Quality & Reliability (AREA)
- Radar, Positioning & Navigation (AREA)
- Remote Sensing (AREA)
- Aviation & Aerospace Engineering (AREA)
- Educational Administration (AREA)
- Development Economics (AREA)
- Game Theory and Decision Science (AREA)
- Manufacturing & Machinery (AREA)
- General Engineering & Computer Science (AREA)
- Marketing (AREA)
- Operations Research (AREA)
- Tourism & Hospitality (AREA)
- General Business, Economics & Management (AREA)
- Theoretical Computer Science (AREA)
- Control Of Position, Course, Altitude, Or Attitude Of Moving Bodies (AREA)
Abstract
Method for determining a mission plan for a mobile device, in particular a robot, a drone or a transport vehicle, which is to be moved in an environment according to a motion plan, comprising: providing (310, 324) one or more actions (302) that the mobile device (110) should perform in the environment; providing (200) device and/or environment information, comprising: a current position and/or orientation (202), an environment and/or task description (304), and a motion map (500) having motion information (510) about the motions of entities different from the mobile device, in particular persons; determining (240, 320) the mission plan (232, 324) based on the device and/or environment information and the one or more actions such that the one or more actions can be performed; and providing (244) the mission plan (324) for the mobile device.
Description
Technical Field
The invention relates to a method for determining a mission plan for a mobile device, a computing unit and a computer program for carrying out the method, and a mobile device.
Background
Mobile devices are used in various fields, in particular robots, unmanned aerial vehicles or at least partially automatically moving vehicles, such as so-called AGVs ("Automated Guided Vehicles"). For example, a large number of such mobile devices may be used in an environment such as a factory hall.
Disclosure of Invention
According to the invention, a method for determining a mission plan for a mobile device, a computing unit and a computer program for performing the method, and a mobile device are proposed, having the features of the independent patent claims. Advantageous embodiments are the subject matter of the dependent claims and the following description.
The present invention relates to mobile devices, in particular robots, unmanned aerial vehicles or at least partially automatically moving vehicles, such as so-called AGVs ("Automated Guided Vehicles"), and to the use of a large number of such mobile devices in environments such as factory halls. Typically, such mobile devices are able to orient themselves (orientieren sich) in the environment and to navigate there, i.e. to follow a predefined motion path independently. For this purpose, the mobile device may, for example, have suitable sensor means, such as a lidar and/or a camera, and a corresponding drive unit.
Examples of such mobile devices (also called mobile working devices) are in general robots and/or unmanned aerial vehicles and/or transport vehicles, which can also move semi-automatically or (fully) automatically (on land, on water or in the air). Robots that come into consideration are, for example, domestic robots, such as vacuuming and/or sweeping robots, floor- or street-cleaning devices or lawn-mowing robots, but also other so-called service robots, as well as passenger or goods vehicles (i.e. so-called land vehicles, for example in warehouses), or aircraft, such as so-called unmanned aerial vehicles (drones), or watercraft. Autonomous or automated passenger or cargo vehicles may also be regarded as mobile devices.
A mission plan can be determined or predefined for such a mobile device. Various device and/or environment information can be used here, i.e. in particular the current position and/or orientation of the mobile device (in the environment) as well as an environment and/or task description, in particular a programmatic description of the environment and/or task with knowledge of the environment ("domain knowledge"). In addition, one or more actions (or tasks) that the mobile device should perform in the environment may be provided or predefined. Such an action may be, for example, that the mobile device should be loaded (beladen) or unloaded at a particular location, or that another activity should be performed, such as simply waiting for a particular time. For example, a time specification (Vorgabe) for a later movement may be generated from this.
The current position and/or orientation of the mobile device, and, for example, the positions of obstacles, may be determined using the sensor means already mentioned. Furthermore, an environment description may be predefined, which includes, for example, the paths and (static) obstacles and the locations of possible destinations.
For example, these actions may be predefined in an automated manner and/or by a user according to the current requirements; this may depend, for example, on the current use of the mobile device. For example, it may be provided that the mobile device picks up a first component at position A, brings it to position B, unloads it there, and then picks up a second component at position C. Based on the specific action (or actions) and by taking into account the device and/or environment information, possibly also in particular the environment description, a task or action plan may be determined which comprises the mentioned action or actions, possibly also with a time component. In this context, one may also speak of a mission planner. It should be mentioned here that further actions for other mobile devices may also be taken into account; for example, a desired action may be assigned to one or another mobile device depending on the situation.
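As a concrete illustration of such actions and the resulting mission plan, the following minimal Python sketch represents actions as simple records and the mission plan as an ordered list of such actions; the class and field names are illustrative assumptions, not taken from the patent.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Action:
    """One action the mobile device should perform (illustrative fields)."""
    name: str                       # e.g. "pick_up", "move", "unload", "wait"
    location: Optional[str] = None  # orientation point where the action takes place
    duration_s: float = 0.0         # optional time component

# Mission plan in the sense described above: pick up a first component at A,
# bring it to B, unload it there, then pick up a second component at C.
mission_plan: List[Action] = [
    Action("pick_up", location="A"),
    Action("move", location="B"),
    Action("unload", location="B"),
    Action("move", location="C"),
    Action("pick_up", location="C"),
]
```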
Based on the mission plan (or navigation plan), which is itself rather abstract, a motion plan can then be determined, according to which the mobile device should then move in the environment. In particular, the motion plan may comprise a motion path, in particular also a trajectory, along which the mobile device should move in the environment. While the motion path generally only comprises the path along which the mobile device should move, the trajectory may also comprise a velocity and possibly an acceleration with which the mobile device should move along the motion path. In addition, the motion plan may also include time specifications, such as data on when and/or where the mobile device should be, and for how long. In addition to the motion path or trajectory itself, it is often desired that the mobile device reach a predefined destination in the environment as quickly as possible. In the context of motion planning, one often speaks of agents instead of mobile devices.
Based on the current position and/or orientation of the mobile device, it is possible to predict what will happen in the environment. The mission plan, and thus also the motion plan, can then be determined, specifically in such a way that the mobile device can perform the one or more actions. In this way it is ensured, for example, that the mobile device is actually able to reach a specific location. The determination of the mission plan thus in particular also comprises determining a motion plan, or the motion plan may be determined based on the mission plan; here, too, this is done based on the device and/or environment information. In this context, one also generally speaks of a motion planner. Here, for example, an optimization or optimization method can be used, with which the motion path is found with which the mobile device can perform the predefined actions fastest.
The mobile device can then be moved according to the mission plan or, in particular, according to the motion plan or motion path (determined therefrom), in particular a trajectory. In particular, a motion control variable for the mobile device can be determined based on the mission plan or the motion path, which is then provided and/or based on which the mobile device is then moved, so that the mobile device follows the motion path.
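A minimal sketch of how a trajectory and a motion control variable derived from it might be represented is given below; the data fields and the simple speed-and-heading command are assumptions chosen for illustration, not the patent's controller.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class TrajectoryPoint:
    x: float   # position in the environment (m)
    y: float
    t: float   # time at which this point should be reached (s)
    v: float   # desired speed along the path (m/s)

def velocity_command(current_xy: Tuple[float, float],
                     target: TrajectoryPoint) -> Tuple[float, float]:
    """Very simple motion control variable: a speed and a heading towards
    the next trajectory point (a placeholder for a real trajectory tracker)."""
    dx = target.x - current_xy[0]
    dy = target.y - current_xy[1]
    return target.v, math.atan2(dy, dx)

trajectory: List[TrajectoryPoint] = [
    TrajectoryPoint(0.0, 0.0, 0.0, 0.0),
    TrajectoryPoint(2.0, 0.0, 2.0, 1.0),
    TrajectoryPoint(2.0, 3.0, 5.0, 1.0),
]
speed, heading = velocity_command((0.5, 0.1), trajectory[1])
```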
However, as already indicated, task or motion and path planning (determining the task or motion plan) poses particular challenges in the case of dynamic environments in which entities other than the mobile device are also moving. This applies in particular to persons as entities, whose movements, unlike those of other mobile devices (whose mission plans can likewise be planned), are generally not predefinable. Approaches that assume only a static environment without considering such dynamic parts are hardly suitable for this, even though they generally allow planning far in advance ("long-term horizon").
Therefore, a motion map (Bewegungskarte) is now additionally used as device and/or environment information, which has motion information about the movements of entities different from the mobile device, in particular persons. This information is then also taken into account when determining the mission plan and, if applicable, the motion path. Such motion information may include, in particular, the speed ρ, the direction angle θ, the motion probability p and the observation ratio q of the persons (i.e. whether a person has been observed at a particular location). For example, such information may be provided for each of a plurality of cells in the environment (or at least for those areas of the environment in which the mobile device can move).
In order to obtain such a motion map, an environment model may in particular be used, by means of which the motion patterns of entities (e.g. a flow of people) are extracted from a provided dataset. Such a dataset may, for example, be generated from long-term observations of the environment or of similar environments. Using the environment model, the environment can be discretized (diskretisieren), in particular in 2D, into the mentioned cells (or grid cells), so that motion information can then be assigned to them. For this purpose, the method proposed in Kucner, Tomasz Piotr, et al., "Enabling flow awareness for mobile robots in partially observable environments", IEEE Robotics and Automation Letters 2.2 (2017): 1093-1100, can be used, for example. The motion map is then in particular a so-called CLiFF map.
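The following sketch illustrates, under assumed names and values, how such a CLiFF-style motion map could be held in memory: the environment is discretized into grid cells, and each cell stores the motion information (ρ, θ, p, q) described above.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

@dataclass
class CellMotionInfo:
    """Motion information per grid cell, as described above (illustrative)."""
    rho: float    # typical speed of persons in the cell (m/s)
    theta: float  # dominant direction angle (rad)
    p: float      # motion probability
    q: float      # observation ratio (how often a person was observed here)

CELL_SIZE_M = 1.0  # assumed cell size; the text mentions cells of about 1 m^2

def cell_index(x: float, y: float) -> Tuple[int, int]:
    """Discretize a continuous 2D position into a grid cell index."""
    return int(x // CELL_SIZE_M), int(y // CELL_SIZE_M)

# A (toy) motion map: cell index -> motion information
motion_map: Dict[Tuple[int, int], CellMotionInfo] = {
    (0, 0): CellMotionInfo(rho=1.2, theta=0.0, p=0.8, q=0.6),
    (0, 1): CellMotionInfo(rho=0.3, theta=3.14, p=0.1, q=0.2),
}

print(motion_map[cell_index(0.4, 1.7)])
```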
The motion map may in particular also be represented or used as a graph (or grid graph) with nodes and edges. For example, each node in the graph represents the center of a grid cell. The edges between the nodes are in particular bidirectional, which means that the cost w_edge of traversing an edge is or can be different in each direction. The weight of each edge may be calculated as a weighted sum of the step cost c_step (Schrittkosten) and the cost c_flow of crossing the flow of people between the two grid cells concerned. The cost of crossing the flow of people may in turn be related to the relative angle between the heading θ_agent of the mobile device ("agent") and the flow direction θ, the flow velocity ρ, the motion probability p and the observation ratio q, as shown in the following equation:

w_edge = w1 · c_step + w2 · c_flow = w1 · c_step + w2 · [cos(θ − θ_agent) · ρ · p · q].
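A direct, minimal implementation of this edge-weight equation might look as follows; the weighting factors w1 and w2 are assumed to be free tuning parameters, and the sign convention of the flow term follows the equation as given above.

```python
import math

def edge_weight(c_step: float, rho: float, p: float, q: float,
                theta: float, theta_agent: float,
                w1: float = 1.0, w2: float = 1.0) -> float:
    """w_edge = w1 * c_step + w2 * c_flow with
    c_flow = cos(theta - theta_agent) * rho * p * q (equation above);
    w1 and w2 are assumed to be free weighting factors."""
    c_flow = math.cos(theta - theta_agent) * rho * p * q
    return w1 * c_step + w2 * c_flow

# The flow term changes sign with the relative angle, so traversing the same
# edge in its two directions generally yields different weights.
forward  = edge_weight(c_step=1.0, rho=1.2, p=0.8, q=0.6, theta=0.0, theta_agent=0.0)
backward = edge_weight(c_step=1.0, rho=1.2, p=0.8, q=0.6, theta=0.0, theta_agent=math.pi)
```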
In particular, the environment map or graph may be stored in the environment model in order to obtain the actual rewards (in the sense of a cost function) when the mobile device (agent) interacts with the environment. For example, when the mobile device moves from one node to another, the resulting cost is the sum of the weights of all edges traversed (abfahren) in the direction of movement.
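Building on this, the cost of a movement (and hence a reward in the sense of a negative cost) can be accumulated over the traversed edges, as in the following sketch with hypothetical edge weights.

```python
from typing import Dict, List, Tuple

# Directed edge weights (hypothetical values), e.g. produced by the
# edge-weight formula above: (from_node, to_node) -> weight.
edge_weights: Dict[Tuple[str, str], float] = {
    ("A", "B"): 1.4, ("B", "A"): 0.9,
    ("B", "C"): 2.1, ("C", "B"): 1.0,
}

def path_cost(path: List[str]) -> float:
    """Cost of a movement = sum of the weights of all edges traversed in the
    direction of movement; the reward can be taken as the negative cost."""
    return sum(edge_weights[(u, v)] for u, v in zip(path, path[1:]))

cost = path_cost(["A", "B", "C"])   # 1.4 + 2.1
reward = -cost
```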
Creating a long-term mission plan based on a 2D map, and thus creating a motion plan (which, like the map, may be in 2D form), can be very inefficient in view of the computational effort required. In one embodiment, a reduced (reduziert) motion map is therefore used as the motion map for determining the mission plan and (possibly) the motion plan. The reduced motion map may be determined or created from a higher-level motion map (which may itself look like the motion map described above in general terms). If the higher-level motion map is used as a graph with nodes and edges, this may also apply to the reduced motion map.
As mentioned, the motion map may include corresponding motion information for different locations in the environment (e.g. at the mentioned cells or nodes, respectively). The number of such locations should be reduced to a first number, while the higher-level motion map comprises a second number of locations with motion information, the second number being larger or even significantly larger than the first number.
The locations in the reduced motion map or the corresponding graph comprise, for example, only specific predefined orientation points (Orientierungspunkte); the orientation points may be, for example, assembly locations for the mobile device, component shelves or loading or unloading points, etc. The reduced motion map or the corresponding graph may be used to determine a mission plan and/or a motion plan for the mobile device, the orientation points then being taken into account. Each edge of the (reduced) graph is here in particular also bidirectional and represents the cost (i.e. the cost of crossing the flow of people and the distance) based on the two corresponding nodes of the 2D graph.
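A reduced motion map of this kind can be represented, for example, as a directed graph; the sketch below uses the networkx library for convenience, with assumed orientation-point names and direction-dependent edge weights.

```python
import networkx as nx

# Reduced motion map as a directed graph of orientation points.  Each pair of
# nodes is connected by two directed edges, so the cost of crossing the flow
# of people can differ per direction.  Node names and weights are assumptions.
G = nx.DiGraph()
edges = [
    ("assembly", "shelf_1", 2.0), ("shelf_1", "assembly", 3.5),
    ("assembly", "loading_dock", 4.0), ("loading_dock", "assembly", 4.2),
    ("shelf_1", "loading_dock", 1.5), ("loading_dock", "shelf_1", 2.8),
]
for u, v, w in edges:
    G.add_edge(u, v, weight=w)

# A mission or motion planner can now query the cheapest route between two
# orientation points, taking the direction-dependent flow costs into account.
route = nx.shortest_path(G, "shelf_1", "loading_dock", weight="weight")
```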
In one embodiment, the mission plan is determined by using a machine learning algorithm, in particular reinforcement learning. More specifically, the reinforcement learning is in particular hierarchical reinforcement learning.
In reinforcement learning, a sequence of actions is typically created in order to achieve the task goal, i.e. the actions predefined for the mobile device, in particular also including the movements in between, are created as a sequence. In hierarchical reinforcement learning, by contrast, the task goal is broken down into multiple options or hierarchy levels: at a first hierarchy level, for example, the order of the component transports is considered, and at a second hierarchy level, for example, the picking up of a component or its transport back to the assembly area. Each hierarchy level or option or sub-option may here represent a minimal (minimalistisch) task, which may then be combined into small action sequences (third hierarchy level). This makes the computation particularly efficient and fast.
For example, the hierarchical reinforcement learning may be based on Dyna-Q. Dyna-Q is a conceptual algorithm that illustrates how real experience and simulated experience are combined when creating a policy (or action scheme).
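For orientation, a compact tabular Dyna-Q sketch is shown below; the environment interface env_step and all parameters are placeholders, and the code is only meant to show how real transitions and simulated (model-replayed) transitions both update the Q-values.

```python
import random
from collections import defaultdict

def dyna_q(env_step, states, actions, episodes=100, n_planning=10,
           alpha=0.1, gamma=0.95, eps=0.1):
    """Minimal tabular Dyna-Q sketch: each real transition updates Q directly
    and is stored in a model; the stored model is then replayed for additional
    simulated updates.  env_step(s, a) -> (s_next, reward, done) is assumed."""
    Q = defaultdict(float)      # (state, action) -> value
    model = {}                  # (state, action) -> (next state, reward)
    for _ in range(episodes):
        s, done = random.choice(states), False
        while not done:
            if random.random() < eps:
                a = random.choice(actions)          # explore
            else:
                a = max(actions, key=lambda a_: Q[(s, a_)])
            s_next, r, done = env_step(s, a)        # real experience
            target = r + gamma * max(Q[(s_next, a_)] for a_ in actions)
            Q[(s, a)] += alpha * (target - Q[(s, a)])
            model[(s, a)] = (s_next, r)
            for _ in range(n_planning):             # simulated experience
                (ps, pa), (pn, pr) = random.choice(list(model.items()))
                sim_target = pr + gamma * max(Q[(pn, a_)] for a_ in actions)
                Q[(ps, pa)] += alpha * (sim_target - Q[(ps, pa)])
            s = s_next
    return Q
```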
The proposed method allows task and motion planning at a high level, this high level being integrated mainly into the planning level of the overall autonomous system. For example, the motion patterns of persons are used as prediction information and provided as input to a mission-planning pipeline, which then outputs a mission plan to a motion planner in order to calculate a motion profile for the mobile device.
The invention can then be applied, for example, in a factory setting to perform an electric bicycle assembly task in which the mobile device collects the different components of the electric bicycle, namely handlebars, rear wheel, frame, battery and saddle, from different places and brings them successively to the assembly area, while not driving against the flow of people in the factory.
The mission plan may also be re-determined repeatedly; this may be done, for example, depending on trigger criteria, for example when a predefined period of time (e.g. 10 seconds or 1 minute) has elapsed since the mission plan was last determined, when there is a change in the motion map, or when there is a change in the one or more actions (e.g. because the mobile device should now perform other or additional actions).
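Such trigger criteria could be checked, for example, as in the following sketch; the 10-second period is just the example value mentioned above, and the function signature is an assumption.

```python
import time

def should_replan(last_plan_time: float, map_changed: bool,
                  actions_changed: bool, period_s: float = 10.0) -> bool:
    """Re-determine the mission plan if any trigger criterion named above
    holds: a predefined period has elapsed, the motion map has changed,
    or the set of actions has changed."""
    elapsed = time.time() - last_plan_time
    return elapsed >= period_s or map_changed or actions_changed
```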
The computing unit according to the invention, for example a control device or a control unit of a mobile device, or a server or another computer, is set up, in particular in terms of programming, to carry out the method according to the invention.
The invention also relates to a mobile device, such as a robot, a drone or an at least partially automatically moving transport vehicle (e.g. an AGV), which is set up to obtain a mission or motion plan or motion control variables. The mobile device then has a drive system and a control or regulating unit for actuating the drive system based on the mission or motion plan and/or the motion control variables. The mobile device may also have a computing unit according to the invention, i.e. the motion plan may be determined or adapted on the mobile device or its computing unit. However, it may also be expedient for this to take place on a higher-level computing unit, for example a server or a so-called cloud, from which the mobile device then obtains the mission or motion plan or the motion control variables.
It is also advantageous to implement the method according to the invention in the form of a computer program or computer program product with program code for performing all method steps, since this incurs particularly low costs, in particular if the executing control device is also used for other tasks and is therefore present anyway. Finally, a machine-readable storage medium is provided, on which the computer program described above is stored. Suitable storage media or data carriers for providing the computer program are, in particular, magnetic, optical and electrical memories, such as hard disks, flash memories, EEPROMs, DVDs, etc. The program may also be downloaded via computer networks (Internet, intranet, etc.). Such a download may take place in a wired or cabled manner or wirelessly (e.g. via a WLAN network, a 3G, 4G, 5G or 6G connection, etc.).
Drawings
Further advantages and embodiments of the invention can be derived from the description and the drawing.
The invention is schematically illustrated in the drawings on the basis of embodiments and is described below with reference to the drawings.
Fig. 1 schematically shows an apparatus in which the method according to the invention can be performed.
Fig. 2 schematically shows the flow of the method according to the invention in a preferred embodiment.
Fig. 3 schematically shows a flow of a method according to the invention in another preferred embodiment.
Fig. 4 and fig. 5 schematically show motion maps that may be used in the method according to the invention.
Detailed Description
An apparatus 100 in which the method according to the invention may be performed is shown schematically in fig. 1. The apparatus comprises, for example, a computing unit 104, for example a higher-level server, and, for example, a mobile device 110 that moves or is to move in an environment 120. Here, the mobile device 110 is to move, for example, according to a motion plan with a motion path 130, i.e. the mobile device 110 is to follow the motion path 130. The motion path leads, for example, along a path through the obstacles 121, 122, 123. It should be mentioned that only an illustrative and exemplary environment with an exemplary motion path is shown here in order to explain the basic principle.
The mobile device may be, for example, a so-called AGV. The mobile device 110 has, for example, a drive system 114 and a control or regulating unit 112 for actuating the drive system based on a motion plan or motion control variables. In addition, the mobile device 110 has a computing unit 112, which is designed as a control unit and can be connected to the computing unit 104, for example wirelessly, for data transmission. In addition, the mobile device 110 has, for example, a lidar sensor 113 for navigation.
Fig. 2 schematically shows the sequence of the method according to the invention in a preferred embodiment. In particular, this is a sequence of how a mobile device or an autonomous system may be operated. In step 200, the environment is detected and thereby, for example, the position and/or orientation of the mobile device in the environment is determined and provided (202). This can be done, for example, using the mentioned lidar sensor.
In step 210, a prediction or forecast may be made as to how the situation or the other mobile devices (or agents) may or will behave within a predefined period of time in the future.
Based on the prediction, the planning for the mobile device may be carried out in step 220. This initially comprises, in step 230, the planning of actions or tasks, i.e. a mission plan 232 for the mobile device, for which the actions to be performed by the mobile device in the environment are provided.
Based on these actions or the mission plan 232, a motion plan 242 is determined in particular in step 240, which may be provided accordingly in step 244. Based on the motion plan 242, a motion control variable 252 may in turn be determined in step 250 and then provided in step 254 in order to move the mobile device.
As described above, further aspects may be taken into account when determining the task or motion plan, as will be explained in more detail below.
Fig. 3 schematically shows the sequence of a method according to the invention in a further preferred embodiment. In particular, this is an overview of how the individual steps or aspects may be related to one another.
Block 310 denotes the determination (or planning) of the actions (or tasks) of the mobile device; one may speak of an action or task planner or an action or task planning module. It may in particular be automated and based, for example, on a PDDL solver (PDDL stands for "Planning Domain Definition Language") and Fast Downward (a heuristic-based PDDL solver).
The user 300 may, for example, specify or provide particular actions 302 that should or can be performed by the mobile device. Furthermore, an environment and/or task description 304 (so-called "domain knowledge") is taken into account here. The environment and/or task description or "domain knowledge" includes, in particular, knowledge about the problem to be solved and the actions, such as the objects (components) in the problem and their locations (in the environment), initial locations and target locations, which actions the mobile device can perform and the corresponding preconditions and effects of the actions, and also, for example, the costs of traversing (Durchquerung) between different nodes in the 2D graph, which can initially be initialized with the distances between the different nodes. During the learning process, the costs may be updated based on the cost of crossing the flow of people in each direction.
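Purely for illustration, such "domain knowledge" could be collected in a plain data structure as sketched below; the patent itself refers to PDDL and Fast Downward, so this dictionary form and all names and values in it are stand-in assumptions.

```python
# Illustrative "domain knowledge" held as plain Python data; the layout and
# all entries are assumptions standing in for a PDDL-style description.
domain_knowledge = {
    "objects": {
        "frame":   {"initial": "shelf_1", "goal": "assembly"},
        "battery": {"initial": "shelf_2", "goal": "assembly"},
    },
    "actions": {
        "pick_up": {"precondition": "robot at object location, gripper free",
                    "effect": "object loaded"},
        "unload":  {"precondition": "object loaded, robot at goal location",
                    "effect": "object at goal, gripper free"},
    },
    # Traversal costs between nodes of the 2D graph, initially the distances;
    # during learning they are updated with the cost of crossing the flow of
    # people, which may differ per direction.
    "traversal_cost": {
        ("shelf_1", "assembly"): 12.0, ("assembly", "shelf_1"): 12.0,
        ("shelf_2", "assembly"): 7.5,  ("assembly", "shelf_2"): 7.5,
    },
}
```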
Thus, for example, it can be taken into account where in the environment the different locations for the different tasks or targets are, so that, for example, a particular order can be determined that is particularly advantageous for the overall process. The result output here is the (initial) mission plan 312 that is to be executed by the mobile device. This may in particular be an action or mission plan in which not only the actions themselves but also, for example, the relationships between the actions, such as time dependencies, are determined.
Block 320 indicates that a mission plan is then determined further based on the action or mission plan 312. In particular, hierarchical reinforcement learning 322 may be used for this. The result or output of the machine learning algorithm comprises the (updated) mission plan 324. Via this mission plan, the mobile device interacts with an environment model 330 in which the reduced motion map is integrated. The environment model may be based on an MDP (MDP stands for "Markov Decision Process").
The reduced motion map 332 in turn influences the machine learning algorithm via feedback or a reward (Reward) for the execution of the mission plan (see arrow 334); the motion plan is determined, inter alia, based on the reduced motion map 332.
Via the environment model 330, the environment and/or task description ("domain knowledge") may then also be updated, see arrow 336; this may likewise be based on PDDL.
The determination of the mission and motion plan, in particular by using a machine learning algorithm, is explained in more detail below on the basis of examples.
According to a first algorithm, the machine learning algorithm may begin with an optimistic initialization. The mission planner then creates a first optimistic plan based on the current environment map ("domain knowledge"), i.e. the distances between the various locations or nodes (orientation points) in the reduced motion map. Since the mission plan consists only of state-action pairs, options and sub-options must be extracted from the mission plan in order to supply (bedienen) the hierarchical reinforcement learning. The mobile device uses the plan with the options to initialize the transition, reward and Q-value functions. The Q-value function represents, for example, in the form of a Q-value, the quality of a possible action given the state, option and sub-option, and is used to select an action.
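An optimistic initialization of a Q-table keyed by state, option, sub-option and action could look as follows; the optimistic start value and the key names are assumptions.

```python
from collections import defaultdict

OPTIMISTIC_Q = 10.0   # assumed optimistic start value (rough upper bound on return)

def make_q_table():
    """Q-value table keyed by (state, option, sub_option, action); every entry
    that has not been visited yet starts at an optimistic value, so unexplored
    actions initially look attractive (optimistic initialization)."""
    return defaultdict(lambda: OPTIMISTIC_Q)

Q = make_q_table()
Q[("at_shelf_1", "deliver_frame", "fetch", "move_to_assembly")]  # -> 10.0
```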
The mission plan can be updated using a second algorithm by means of hierarchical reinforcement learning. In each training step, an action may be selected for the mobile device based on the current state, option and sub-option; this action may then be executed in order to interact with the environment and to obtain the actual reward from the environment model for updating the Q-value function and the task and/or environment description ("domain knowledge").
The mission planner can then create a new plan based on the updated environment map in order to update the Q-value function. As soon as the current option is completed (beenden), the options and sub-options are updated. The learning cycle may be repeated until the reward converges or a maximum number of episodes is reached.
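The learning cycle described in the last two paragraphs might be sketched as follows; the environment interface, the option assignment and the convergence test are simplified assumptions, not the patent's exact algorithm.

```python
import random
from collections import defaultdict

def train(env_step, option_of, actions, episodes=200,
          alpha=0.1, gamma=0.95, eps=0.1, tol=1e-3):
    """Sketch of the learning cycle described above.  env_step(s, a) is assumed
    to return (next_state, reward, option_done, episode_done); option_of(s)
    returns the current (option, sub_option) for a state."""
    Q = defaultdict(lambda: 10.0)            # optimistic initialization
    prev_return = None
    for _ in range(episodes):
        s, ep_return, done = "start", 0.0, False
        option, sub_option = option_of(s)
        while not done:
            if random.random() < eps:
                a = random.choice(actions)
            else:
                a = max(actions, key=lambda a_: Q[(s, option, sub_option, a_)])
            s_next, r, option_done, done = env_step(s, a)
            best_next = max(Q[(s_next, option, sub_option, a_)] for a_ in actions)
            td = r + gamma * best_next - Q[(s, option, sub_option, a)]
            Q[(s, option, sub_option, a)] += alpha * td
            if option_done:                  # current option finished -> update options
                option, sub_option = option_of(s_next)
            s, ep_return = s_next, ep_return + r
        if prev_return is not None and abs(ep_return - prev_return) < tol:
            break                            # reward has (roughly) converged
        prev_return = ep_return
    return Q
```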
In a third algorithm, the first and second algorithms may be integrated in order to obtain, as a result, an updated Q-value function and a final, effective mission plan.
Motion maps that can be used in the method according to the invention are shown schematically in fig. 4 and fig. 5. Fig. 4 shows a motion map 400, which may be a higher-level motion map as described above. Here, for example, motion information relating to the movement of persons is included for a specific area or part of the environment (for example, a corridor). As already mentioned, the motion information may include, in particular, the speed ρ, the direction angle θ, the motion probability p and the observation ratio q of the persons (i.e. whether or not a person is present at a particular location). Each cell here has its own motion information. The size of the cells may, for example, be in the range of one square meter or less (e.g. for environments with areas of hundreds or thousands of square meters).
In particular, the motion information specifies statistical values or probabilities. In this way, the flow of movement of persons can be represented. For example, in one corridor there may be a high probability that many people move in a particular direction, while in another corridor hardly any people are encountered.
Since computations based on such a detailed motion map are computationally expensive, a reduced motion map can, as mentioned, be determined or generated from the higher-level motion map.
Fig. 5 shows a reduced motion map 500, which also has motion information 510. In particular, some orientation points 520 to 526 (as well as some unlabeled orientation points) are also shown here. The orientation points may be, for example, locations (Stellen) of relevant objects in the environment, such as places where the mobile device is to be loaded or unloaded, or they may be curves or intersections.
Furthermore, connections 530 between two orientation points in each case are shown. Of the motion information 410 essentially present in the higher-level motion map 400, only the cost of crossing the flow of people, calculated from the motion information according to the above equation, is still used in the reduced motion map 500. As described above, these costs are expressed as the weights of the bidirectional edges.
This reduced motion map allows a significantly simpler and faster calculation of what the motion plan should look like for the mobile device to get from one orientation point to another. For example, if the mobile device is to travel from point 522 to point 520, the shortest path would lead through point 521. However, if it is taken into account that many people there are moving towards the mobile device, a faster path leads, for example, through points 523, 524, 525, 526.
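The effect described here can be reproduced with a small flow-aware shortest-path example; the node labels reuse the reference numerals of fig. 5 purely as identifiers, and all weights are invented for illustration (networkx is again used for convenience).

```python
import networkx as nx

# Two views of the reduced graph from Fig. 5, with the figure's reference
# numerals used as node labels and purely illustrative edge weights.
dist = nx.Graph()   # pure distances
dist.add_weighted_edges_from([
    (522, 521, 2.0), (521, 520, 2.0),
    (522, 523, 1.5), (523, 524, 1.5), (524, 525, 1.5), (525, 526, 1.5), (526, 520, 1.5),
])

flow = nx.DiGraph()  # distance plus cost of crossing the flow of people
for u, v, w in dist.edges(data="weight"):
    flow.add_edge(u, v, weight=w)
    flow.add_edge(v, u, weight=w)
# Heavy oncoming flow on the direct corridor 522 -> 521 -> 520:
flow[522][521]["weight"] += 10.0
flow[521][520]["weight"] += 10.0

print(nx.shortest_path(dist, 522, 520, weight="weight"))   # [522, 521, 520]
print(nx.shortest_path(flow, 522, 520, weight="weight"))   # detour via 523..526
```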
Claims (14)
1. Method for determining a mission plan for a mobile device, in particular a robot, a drone or an at least partially automatically moving vehicle, which mobile device is to be moved in an environment according to a motion plan, the method comprising:
-providing (310, 324) one or more actions (302) that the mobile device (110) should perform in the environment;
Providing (200) device and/or environment information, comprising:
-a current position and/or orientation (202) of the mobile device,
-An environment and/or task description (304), and
-A motion map (500) with motion information (510) about motions of an entity, in particular a person, different from the mobile device;
Determining (240, 320) the mission plan (232, 324) based on the device and/or environment information and the one or more actions such that the one or more actions can be performed; and
-providing (244) the mission plan (324), in particular a motion plan (242) determined therefrom, for the mobile device, and in particular moving the mobile device in the environment according to the mission plan (324), in particular the motion plan (242).
2. The method according to claim 1, wherein the motion map (500) is a reduced motion map which comprises motion information for a first number of locations (520-526) in the environment and/or of connections (530) between the locations,
wherein the reduced motion map (500) is or has been determined from a higher-level motion map in that the motion information (410) of the higher-level motion map for a second number of locations in the environment and/or of connections between the locations is reduced to the first number.
3. The method of claim 2, wherein the locations (520-526) of the reduced motion map comprise pre-determined orientation points in the environment.
4. The method according to any one of the preceding claims, wherein the mission plan is determined by using a machine learning algorithm (322), in particular reinforcement learning.
5. The method of claim 4, wherein the reinforcement learning is hierarchical reinforcement learning.
6. The method according to any of the preceding claims, wherein the motion map (500) is determined by using an environment model (330).
7. The method according to any one of the preceding claims, wherein determining the mission plan comprises determining (230) a motion plan, or wherein, based on the one or more actions, a motion plan is determined based on the mission plan.
8. The method of claim 7, wherein the one or more actions are determined based on the motion map.
9. The method according to any of the preceding claims, wherein the mission plan (324) is redetermined when at least one trigger criterion is present, wherein the at least one trigger criterion comprises at least one of the following trigger criteria:
a predefined duration has elapsed since the motion plan was last determined,
-There is a change in the motion map, and
-There is a change in the one or more actions.
10. The method of any of the preceding claims, the method further comprising:
Determining (250) a motion control variable (252) for the mobile device based on the mission or motion plan, and
Providing (254) the motion control variable and/or moving the mobile device based on the motion control variable.
11. A computing unit comprising a processor configured to perform the method according to any of the preceding claims.
12. Mobile device (110), in particular a robot, a drone or an at least partially automatically moving vehicle, which is set up to obtain a mission or motion plan determined according to any one of claims 1 to 9 or a motion control variable determined according to claim 10,
wherein the mobile device has a drive system (114) and a control or regulating unit (112) for actuating the drive system based on the mission or motion plan and/or the motion control variable, and in particular has a computing unit (112) according to claim 11.
13. A computer program comprising instructions which, when the program is executed by a computer, cause the computer to perform the method according to any one of claims 1 to 10.
14. Computer readable data carrier on which a computer program according to claim 13 is stored.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
DE102022213838.4A DE102022213838A1 (en) | 2022-12-19 | 2022-12-19 | Method for determining a task schedule for a mobile device |
DE102022213838.4 | 2022-12-19 |
Publications (1)
Publication Number | Publication Date |
---|---|
CN118226804A true CN118226804A (en) | 2024-06-21 |
Family
ID=91278948
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202311757063.7A Pending CN118226804A (en) | 2022-12-19 | 2023-12-19 | Method for determining a mission plan for a mobile device |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN118226804A (en) |
DE (1) | DE102022213838A1 (en) |
- 2022-12-19: DE DE102022213838.4A patent/DE102022213838A1/en active Pending
- 2023-12-19: CN CN202311757063.7A patent/CN118226804A/en active Pending
Also Published As
Publication number | Publication date |
---|---|
DE102022213838A1 (en) | 2024-06-20 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |