[go: up one dir, main page]
More Web Proxy on the site http://driver.im/

CN111882184B - Multi-agent system null space behavior control dynamic task priority planning method - Google Patents

Multi-agent system null space behavior control dynamic task priority planning method Download PDF

Info

Publication number
CN111882184B
CN111882184B CN202010677791.7A CN202010677791A CN111882184B CN 111882184 B CN111882184 B CN 111882184B CN 202010677791 A CN202010677791 A CN 202010677791A CN 111882184 B CN111882184 B CN 111882184B
Authority
CN
China
Prior art keywords
task
priority
tasks
planning
zero
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010677791.7A
Other languages
Chinese (zh)
Other versions
CN111882184A (en
Inventor
黄捷
张祯毅
陈宇韬
蔡逢煌
齐义文
王武
柴琴琴
林琼斌
李卓敏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fuzhou University
Original Assignee
Fuzhou University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fuzhou University filed Critical Fuzhou University
Priority to CN202010677791.7A priority Critical patent/CN111882184B/en
Publication of CN111882184A publication Critical patent/CN111882184A/en
Application granted granted Critical
Publication of CN111882184B publication Critical patent/CN111882184B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/06Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063Operations research, analysis or management
    • G06Q10/0631Resource planning, allocation, distributing or scheduling for enterprises or organisations
    • G06Q10/06316Sequencing of tasks or work
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/04Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"

Landscapes

  • Business, Economics & Management (AREA)
  • Human Resources & Organizations (AREA)
  • Engineering & Computer Science (AREA)
  • Strategic Management (AREA)
  • Economics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Marketing (AREA)
  • Game Theory and Decision Science (AREA)
  • Development Economics (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Educational Administration (AREA)
  • Feedback Control In General (AREA)

Abstract

The invention relates to a zero-space behavior control dynamic task priority planning method for a multi-agent system, which comprises the following steps of firstly, establishing a composite task by a zero-space projection method; secondly, converting the dynamic task priority planning problem into a switching mode optimal control problem; and finally, solving the optimal control problem by using a mixed integer optimization method to obtain an optimal composite task track. The method solves the problem of dynamic task priority planning of multi-agent system zero-space behavior control, does not need artificial switching conditions for setting task priorities, reduces the workload of researchers, has good expandability, and can be applied to large-quantity task priority dynamic planning which is difficult to process by a traditional logic method. In addition, the method considers the prediction information of the future state of the intelligent agent in the process of task priority switching, so that the method has more ideal switching effect compared with the traditional logic method.

Description

Multi-agent system null space behavior control dynamic task priority planning method
Technical Field
The invention relates to the technical field of intelligent robots, in particular to a zero-space behavior control dynamic task priority planning method for a multi-agent system.
Background
The multi-agent system has the characteristics of autonomy, fault tolerance, flexibility, expandability, cooperative capability and the like, and is widely applied to the fields of military affairs and civil affairs. Multiple autonomous mobile robots, a typical multi-agent system, are generally required to work in unstructured environments and to perform multiple tasks including moving along preset trajectories, avoiding static and dynamic obstacles, maintaining formation and aggregation, etc. However, these tasks may collide during execution, for example, an autonomous mobile robot cannot move along a preset trajectory while avoiding obstacles appearing on the preset trajectory.
Behavior control has been extensively studied in the last decade as one of the ways to resolve task conflicts. The method solves the conflict between tasks by setting task priorities and designing a collaborative scheme with the tasks with the priorities. The conventional behavior control method mainly includes a hierarchical control method (the layered control approach) and a motion mode control method (the motor schema control approach), where the hierarchical control method is a competitive method, that is, in the process of executing a task, only a task with a high priority is executed, and a task with a low priority can be executed only after the task with the high priority is completed; the motion pattern control method is a collaborative method in which the velocity outputs of all tasks are weighted and executed as the final task output, so that none of the tasks is completely completed. Thereafter, the literature (g.antonelli and s.chiaverini, "Kinematic control of planes of autonomous vehicles," IEEE Transactions on Robotics, vol.22, no.6, pp.1285-1292, 2006.) proposes a null-space based behavior control that projects low priority tasks to the null-space of high priority tasks while ensuring that high priority tasks are fully executed, so that sub-priority tasks are executed with the redundancy of the system, and will not be executed if high priority tasks use the full degrees of freedom of the system. Therefore, the null-space behavior control has a good potential in handling task conflicts. However, the priority of a task is determined by the supervision of the task and is usually set in advance and cannot be changed during the execution of the task. This may degrade the control performance of the null-space behavior control method, limiting its application in dynamic and unknown environments. For example, when an obstacle does not exist, setting the obstacle avoidance task to the highest priority level may affect the execution of other tasks.
In this regard, researchers have conducted some research. A finite state machine is a state transition mechanism in which the state of a system automatically transitions from one state to another state based on a state transition trigger condition. Under the framework of zero-space behavior control, different task priority sequences are hidden in the state of each finite-state machine, and the task priority is planned by designing a trigger condition of state transition. In addition, the fuzzy logic method is to design some fuzzy rules to realize the task priority planning. However, both of these methods require a manual switching rule of task priorities designed in advance, which is very difficult when the number of tasks is very large or in a dynamic and unknown environment, and both of these methods use only current state information to the autonomous robot without predicting a future state, thereby degrading performance of priority switching.
Disclosure of Invention
In view of this, the invention provides a multi-agent system zero-space behavior control dynamic task priority planning method, which converts a task priority dynamic planning problem into an optimization problem to be processed, and obtains an optimal composite task track by solving the optimization problem, aiming at the defects that the existing zero-space behavior control priority dynamic planning needs a person to set priority switching rules in advance and is difficult to process a large number of task priority dynamic plans.
The invention is realized by adopting the following scheme: a multi-agent system null space behavior control dynamic task priority planning method comprises the following steps:
step S1: designing a basic task rho of an intelligent agent by using a zero-space-based behavior control method, enumerating the possibility of the priorities of all tasks, and combining the designed basic behaviors into a composite task in different priority orders by using a zero-space projection method;
step S2: converting the dynamic task priority planning problem into a switching mode optimal control problem at each sampling moment by using a model predictive control technology; and step S3: and solving the optimal control problem by using a mixed integer optimization method to obtain an optimal composite task track.
Further, the step S1 specifically includes the following steps:
step S11: design of basic tasks:
a basic task is formed by a task variable rho epsilon R m Coding is carried out, which is a code matching with the system configuration x epsilon R n A function of correlation, and can be expressed as:
ρ=g(x) (1)
differentiating it to obtain:
Figure GDA0003793684590000031
wherein J (x) epsilon R m×n Is a task Jacobian matrix related to system configuration, v belongs to R n Is the speed of the system; by locally linear inverse mapping, referenced velocity v d Expressed as:
Figure GDA0003793684590000032
where ρ is d A function of the desired task is represented,
Figure GDA0003793684590000041
representing a pseudo-inverse of a jacobian matrix; in practice, the integral of the reference velocity will result in a numerical drift of the position of the agent, and therefore the following form of closed-loop inverse kinematics (closed-loop inverse kinematics) is used to compensate this numerical drift:
Figure GDA0003793684590000042
where a is a suitably constant positive definite gain matrix,
Figure GDA0003793684590000043
is a mission error;
step S12: and (3) constructing a compound task:
a compound task is a combination of a plurality of basic behaviors in a certain priority order; setting up
Figure GDA0003793684590000044
For a task function, where i ∈ N r ,N r ={1,...,r},m i A dimension representing a task space; defining a time-dependent priority function y i (i,t):N r ×[0,∞]→N r (ii) a At the same time, a task hierarchy is defined with the following rules:
a task i with y (i) priority cannot interfere with a task j with y (j) priority, if y (i) ≧ y (j),
Figure GDA0003793684590000045
i≠j;
the mapping relation from the speed to the task speed is determined by the Jacobian matrix of the task
Figure GDA0003793684590000046
i∈N r Representing;
the dimension of task i with the lowest priority may be larger than
Figure GDA0003793684590000047
Thus ensuring that dimension n is greater than the total dimension of all tasks;
the value of y (i) is assigned by the task supervisor according to the needs of the task and the sensor information;
by assigning a given priority to the basic tasks, the speed of the compound task at time t is expressed as:
Figure GDA0003793684590000048
wherein v is i,t Is the speed of task j with priority y (j, t) = i; equation (5) is the composite task formed after all the basic tasks are subjected to zero-space projection.
Further, the specific content of step S2 is:
considering a group of n (n is more than or equal to 2) autonomous robots and a series of m (m is more than or equal to 2) tasks of each autonomous robot; defining state vectors
Figure GDA0003793684590000051
Is composed of
Figure GDA0003793684590000052
Wherein each element represents a state of one robot; introducing a binary variable
Figure GDA0003793684590000053
Wherein each element w i (t)∈{0,1} M (ii) a The dynamics of the robot are written as a convex combination of dynamics in different modes:
Figure GDA0003793684590000054
wherein
Figure GDA0003793684590000055
Is w i The (j) th element of (a),
Figure GDA0003793684590000056
the jth task for the ith robot has a velocity in the form of equation (5); the total number of task speeds is M = M! (ii) a
Figure GDA00037936845900000517
One constraint of (a) is expressed as:
Figure GDA0003793684590000057
this constraint is called a class of special-ordered-set-of-type-one, which guarantees that at least and only one mode is activated at any time;
defining the tracking error of the ith robot as:
Figure GDA0003793684590000058
wherein
Figure GDA0003793684590000059
Is the desired task function for the ith robot. The cost function is defined as follows:
Figure GDA00037936845900000510
wherein u i Denotes the relaxation variable, K i And P i Is a normal number; thus, one optimal control problem is described as:
Figure GDA00037936845900000511
Figure GDA00037936845900000512
Figure GDA00037936845900000513
Figure GDA00037936845900000514
wherein
Figure GDA00037936845900000515
In the initial state, the state of the device is as follows,
Figure GDA00037936845900000516
the position of an obstacle detected by the ith robot at time T is shown, and T is a prediction range; the relaxation vector is
Figure GDA0003793684590000061
Constraint (10 d) ensures that all relaxation variables are non-negative, making (10 c) a soft bundle for the local minimum problem.
Further, the specific content of step S3 is:
different initial states at each sampling time in the task execution process
Figure GDA00037936845900000614
Such that the problem (10) (10 a) (10 b) (10 c) (10 d) must be solved repeatedly; however, problem (10) contains integer variables that are difficult to solve; the problem (10) can be effectively solved on line by adopting a real-time model predictive control method, and dynamic task priority planning is provided; the algorithm adopts a direct method of first dispersion and then optimization and external convection of an integer relaxation technology;
time domain [0, T]Divided into N intervals characterised by equidistant grid points, i.e. 0= t 0 <t 1 <···<t N-1 <t N =T,
Figure GDA0003793684590000062
In each interval, a binary variable is assumed
Figure GDA0003793684590000063
Is constant and its value can only be changed at grid points; the binary variable is relaxed to the real variable and satisfied
Figure GDA0003793684590000064
Using the multi-objective targeting method, the problem (10) is transformed into a nonlinear programming problem and is represented as:
Figure GDA0003793684590000065
Figure GDA0003793684590000066
Figure GDA0003793684590000067
Figure GDA0003793684590000068
Figure GDA0003793684590000069
wherein
Figure GDA00037936845900000610
s is the current sample and is the current sample,
Figure GDA00037936845900000611
predicting at sample s, x for k step state i,s|k Is composed of
Figure GDA00037936845900000612
The ith element of (1); in the formula (11), the first and second groups,
Figure GDA00037936845900000613
is a function obtained by solving the nonlinear dynamical equation (6) through numerical integration, and the constraint (11 c) ensures that the state trajectory is continuous in the whole prediction range after the solution of the problem (11), namely the equation 11 (a) 11 (b) 11 (c) 11 (d) 11 (e), reaches convergence; the nonlinear programming problem shown in equation (11) can be solved efficiently by a standard nonlinear programming problem solver using a sequential quadratic programming or interior point method without using an integer optimization algorithm;
after solving equations 11 (a), 11 (b), 11 (c), 11 (d), 11 (e), oneThe method of summary rounding is applied from
Figure GDA0003793684590000071
In-taking binary variables
Figure GDA0003793684590000072
The steps are as follows:
Figure GDA0003793684590000073
Figure GDA0003793684590000074
wherein j =1,.. A, M, k =0,.. A, N-1, the optimal compound task trajectory volume is formula (12 a) and formula (12 b), with the following proposition for formula (12 a) (12 b);
if it is used
Figure GDA0003793684590000075
Is measurable and satisfies
Figure GDA0003793684590000076
Then function
Figure GDA0003793684590000077
Is converted from the formulas (12 a) (12 b) by using the zero order hold, and satisfies
Figure GDA0003793684590000078
Wherein
Figure GDA0003793684590000079
Satisfy the requirement of
Figure GDA00037936845900000710
Compared with the prior art, the invention has the following beneficial effects:
(1) The invention takes task supervision controlled by zero-space behavior as a research object, and researches aiming at the problem of dynamic task priority planning; the invention can effectively solve the problem of dynamic task priority planning of multi-agent system null space behavior control, avoids the switching rule of artificial predesigned tasks and reduces the workload of researchers. Meanwhile, the proposed method has good expandability and can be used for processing the problem of dynamic task priority planning in a large number of tasks and a dynamic unknown environment, and the traditional method needs a design rule in advance, so that great difficulty can be brought to designers under the conditions of a large number of tasks and an unknown environment, and the designers are difficult to consider all possible complex conditions, thereby influencing the performance of task priority switching. The method does not need to artificially set the switching condition of the task priority, thereby reducing the workload of researchers, having good expandability and being applied to the dynamic planning of a large number of task priorities which are difficult to process by the traditional logic method. In addition, the method not only considers the current state information of the autonomous robot but also considers the future state information in the process of switching the task priorities, and further improves the task priority switching effect.
(2) The method introduces a model predictive control technology into the dynamic task priority planning of the zero-space behavior control, converts the planning problem into an optimization problem at each moment for processing, and obtains the optimal task priority track by solving the designed optimization problem. The method does not need to artificially set task priority switching conditions, and can be applied to large-quantity task priority dynamic planning which is difficult to process by the existing method.
Drawings
FIG. 1 is a schematic block diagram of a method of an embodiment of the invention.
FIG. 2 shows an embodiment of the present invention with N
Figure GDA0003793684590000082
N-1 initial guesses of
Figure GDA0003793684590000081
Example graph of the direct discretization method of initial guessing.
FIG. 3 is a continuous trace diagram after the solution to the problem (11) is converged according to an embodiment of the present invention.
Fig. 4 is a robot trajectory diagram of the model predictive control method and the conventional method according to the embodiment of the present invention.
Fig. 5 is a diagram illustrating a distance between the robot and an obstacle according to the embodiment of the present invention.
Fig. 6 is a tracking error map of the robot according to the embodiment of the present invention.
FIG. 7 is a track diagram of a compound task according to an embodiment of the present invention.
Detailed Description
The invention is further explained by the following embodiments in conjunction with the drawings.
It should be noted that the following detailed description is exemplary and is intended to provide further explanation of the disclosure. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of exemplary embodiments according to the present application. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, and it should be understood that when the terms "comprises" and/or "comprising" are used in this specification, they specify the presence of stated features, steps, operations, devices, components, and/or combinations thereof, unless the context clearly indicates otherwise.
As shown in fig. 1, the present embodiment provides a method for zero-space behavior control dynamic task priority planning for a multi-agent system, which includes the following steps:
step S1: designing a basic task rho of an intelligent agent by using a zero-space-based behavior control method, enumerating the possibility of the priorities of all tasks, and combining the designed basic behaviors into a composite task in different priority orders by using a zero-space projection method; if there are m basic tasks, there are theoretically m! A composite task;
step S2: converting the dynamic task priority planning problem into a switching mode optimal control problem at each sampling moment by using a model predictive control technology; the method mainly comprises the following steps: introducing a binary variable, performing evagination processing, and constructing a target function and a constraint condition, so as to convert the dynamic task priority planning problem into a switching mode optimal control problem;
and step S3: and solving the optimal control problem by using a mixed integer optimization method to obtain an optimal composite task track. The main contents comprise: discretization, relaxation and projection to integer sets.
In this embodiment, the step S1 specifically includes the following steps:
step S11: design of basic tasks:
a basic task is formed by a task variable rho epsilon R m Coding is carried out, which is a code matching with the system configuration x epsilon R n A function of correlation, and can be expressed as:
ρ=g(x) (1)
differentiation thereof gives:
Figure GDA0003793684590000101
wherein J (x) ∈ R m×n Is a task Jacobian matrix related to system configuration, v belongs to R n Is the speed of the system; by locally linear inverse mapping, the velocity v of the reference d Expressed as:
Figure GDA0003793684590000102
where ρ is d Expressing the expected task function (which is a position function related to time, namely the track of the intelligent agent and is the expected task function of behavior control)
Figure GDA0003793684590000103
Representing a pseudo-inverse of a jacobian matrix; in practical applications, the integral of the reference velocity will result in a numerical drift of the position of the agent, and therefore the following form of closed-loop inverse kinematics (closed-loop inverse kinematics) is used to compensate for this numerical drift:
Figure GDA0003793684590000104
where a is a suitably constant positive definite gain matrix,
Figure GDA0003793684590000105
is a task error;
step S12: and (3) constructing a composite task:
a compound task is a combination of a plurality of basic behaviors in a certain priority order; setting up
Figure GDA0003793684590000106
Is a task function where i ∈ N r ,N r ={1,...,r},m i A dimension representing a task space; defining a time-dependent priority function y i (i,t):N r ×[0,∞]→N r (ii) a At the same time, a task hierarchy is defined with the following rules:
a task i with y (i) priority cannot interfere with a task j with y (j) priority, if y (i) ≧ y (j),
Figure GDA0003793684590000111
i≠j;
the mapping relation from the speed to the task speed is determined by the Jacobian matrix of the task
Figure GDA0003793684590000112
i∈N r Represents;
the dimension of task i with the lowest priority may be larger than
Figure GDA0003793684590000113
Thereby to obtainEnsuring that dimension n is greater than the total dimension of all tasks;
the value of y (i) is assigned by the task supervisor according to the needs of the task and the sensor information;
by assigning a given priority to the basic tasks, the speed of the compound task at time t is expressed as:
Figure GDA0003793684590000114
wherein v is i,t Is the speed of task j with priority y (j, t) = i. Formula (5) is a composite task formed after all basic tasks are subjected to zero-space projection, and is different from the conventional method in that the priority of the tasks is time-varying, for example, the basic task with the highest priority is not fixed and is determined according to the requirements of the tasks, namely, the subsequent optimization problem and the solution thereof. The null-space projection method is the part I-J + J in the formula, wherein J + is the Jacobian pseudo inverse matrix of a high-priority task, J is the Jacobian matrix of a low priority, and the physical meaning of the part is to project a task with a sub-optimal priority onto the null-space of the high-priority task, wherein I-J + J is the so-called null-space projection.
In this embodiment, the specific content of step S2 is:
considering a group of n (n is more than or equal to 2) autonomous robots and a series of m (m is more than or equal to 2) tasks of each autonomous robot; defining state vectors
Figure GDA0003793684590000115
Is composed of
Figure GDA0003793684590000116
Wherein each element represents a state of a robot; introducing a binary variable
Figure GDA0003793684590000117
Wherein each element w i (t)∈{0,1} M (ii) a The dynamics of the robot are written as a convex combination of dynamics in different modes:
Figure GDA0003793684590000118
wherein
Figure GDA0003793684590000121
Is w i The (j) th element of (a),
Figure GDA0003793684590000122
the jth task for the ith robot has a velocity in the form of equation (5); the total number of task speeds is M = M! (ii) a
Figure GDA0003793684590000123
One constraint of (a) is expressed as:
Figure GDA0003793684590000124
this constraint is called a class of special-ordered-set-of-type-one, which guarantees that at least and only one mode is activated at any time;
defining the tracking error of the ith robot as:
Figure GDA0003793684590000125
wherein
Figure GDA0003793684590000126
Is the expected task function of the ith robot; the cost function is defined as follows:
Figure GDA0003793684590000127
wherein u is i Denotes the relaxation variable, K i And P i Is a normal number; thus, one optimal control problem is described as:
Figure GDA0003793684590000128
Figure GDA0003793684590000129
Figure GDA00037936845900001210
Figure GDA00037936845900001211
wherein
Figure GDA00037936845900001212
In the initial state, the state of the device is as follows,
Figure GDA00037936845900001213
the position of the obstacle detected by the ith robot at time T is determined, and T is a prediction range; the relaxation vector is
Figure GDA00037936845900001214
Constraint (10 d) ensures that all relaxation variables are non-negative, making (10 c) a soft-beam of the local minimum problem.
In this embodiment, the specific content of step S3 is:
different initial states at each sampling time in the task execution process
Figure GDA00037936845900001215
Such that the problem (10) (10 a) (10 b) (10 c) (10 d) must be solved repeatedly; however, problem (10) contains integer variables that are difficult to solve; the problem (10) can be effectively solved on line by adopting a real-time model predictive control method, and dynamic task priority planning is provided; the algorithm adopts a direct method of first dispersion and then optimization and external convection of an integer relaxation technology; the question 10 is the whole formula of (10 a), (10 b), (10 c) and (10 d), and the whole formula constitutes oneAn optimization problem, where 10 (a) is a cost function and 10 b) (10 c) (10 d) is its constraint, is a solution to the optimization problem, which is to solve 10 (a) on the basis of the constraints (10 b) (10 c) (10 d), so that the problem 10 refers to all equations.
Will time domain [0, T]Divided into N intervals characterised by equidistant grid points, i.e. 0= t 0 <t 1 <···<t N-1 <t N =T,
Figure GDA0003793684590000131
At each interval, a binary variable is assumed
Figure GDA0003793684590000132
Is constant and its value can only be changed at grid points; the binary variable is relaxed to the real variable and satisfied
Figure GDA0003793684590000133
Using the multi-objective targeting method, the problem (10) is transformed into a nonlinear programming problem and is represented as:
Figure GDA0003793684590000134
Figure GDA0003793684590000135
Figure GDA0003793684590000136
Figure GDA0003793684590000137
Figure GDA0003793684590000138
wherein
Figure GDA0003793684590000139
s is the current sample and the current sample,
Figure GDA00037936845900001310
predicting at samples s, x for k step states i,s|k Is composed of
Figure GDA00037936845900001311
The ith element of (1); an example of such discretization in the initial guess is shown in fig. 2. In the formula (11), the first and second groups,
Figure GDA00037936845900001312
is a function obtained by solving the nonlinear dynamical equation (6) by numerical integration, and the constraint (11 c) ensures that the state trajectory is continuous in the whole prediction range after the solution of the problem (11), i.e. the equation 11 (a) 11 (b) 11 (c) 11 (d) 11 (e), reaches convergence; a diagram of the state of the solution and the pattern trajectory is shown in fig. 3. The nonlinear programming problem shown in equation (11) can be solved efficiently by a standard nonlinear programming problem solver using a sequential quadratic programming or interior point method without using an integer optimization algorithm;
after solving equations 11 (a), 11 (b), 11 (c), 11 (d), 11 (e), a summary rounding method is applied from
Figure GDA0003793684590000141
In-taking binary variables
Figure GDA0003793684590000142
The steps are as follows:
Figure GDA0003793684590000143
Figure GDA0003793684590000144
the optimal composite task trajectory body is a formula (12 a) and a formula (12 b), the optimal composite task trajectory is embodied in the formula (12 a) and the formula (12 b), the weight of one composite task obtained by solving an optimization problem is 1, the other composite tasks are 0, the solution is the optimal composite task at the current moment, the basic composite tasks are composed in a certain priority sequence, the priority sequence at the moment is the optimal priority sequence, and the optimal composite tasks at different moments can be obtained in the whole time field, so that the optimal composite task trajectory is obtained. The following propositions are given for the equations (12 a) and (12 b);
if it is not
Figure GDA0003793684590000145
Is measurable and satisfies
Figure GDA0003793684590000146
Then function
Figure GDA0003793684590000147
Is converted from the formulas (12 a) (12 b) by using zero-order hold, and satisfies
Figure GDA0003793684590000148
Wherein
Figure GDA0003793684590000149
Satisfy the requirement of
Figure GDA00037936845900001410
Preferably, in this embodiment, the basic task ρ of the agent is designed by using a zero-space-based behavior control method, and the basic behaviors are combined into the composite task in different priority orders by using a zero-space projection method.
Preferably, in this embodiment, first, a binary variable is introduced
Figure GDA0003793684590000151
Second, it is used forCarrying out outward bulging treatment; finally, an objective function is constructed
Figure GDA0003793684590000152
And constraint conditions, and further converting the dynamic task priority planning problem into a switching mode optimal control problem at each sampling moment.
Preferably, in this embodiment, in order to process an optimization problem including integer variables, a real-time model prediction control method is provided, and first, a multi-target shooting method (Multiple shooting) is used to discretize the optimization problem of continuous time, then, a relaxation process is performed, and finally, a sum-up-rounding method is used to project a solution to an integer set, so as to obtain an optimal composite task trajectory.
Preferably, in the present embodiment, an embodiment is given to show the effectiveness and superiority of the proposed method for dynamic task priority planning for null-space behavior control of multi-agent system based on model predictive control.
1. Design of basic tasks
a. Moving tasks
In the movement task, the robot moves to a target point along a predetermined trajectory. The corresponding task function may be defined as:
ρ B =p∈R 2 (14)
wherein p = [ p ] x ,p y ] T Is the position of the robot. The speed of the corresponding desired movement task may be expressed as:
Figure GDA0003793684590000153
wherein J B =I 2 Jacobian matrix for mobile tasks, I 2 The matrix of the unit is expressed by,
Figure GDA0003793684590000154
for pseudo-inverses of the Jacobian matrix of the moving task, p B,d For desired mission function, Λ B To moveThe constant of the task positively determines the gain matrix,
Figure GDA0003793684590000155
is the error of the movement task.
b. Obstacle avoidance task
In the obstacle avoidance task, the robot must avoid the obstacle detected on the preset track by the sensor. The corresponding task function may be defined as:
ρ A =||p-p O ||∈R (16)
wherein p is O =[p xO ,p yO ] T Is the position of the obstacle. The speed of the corresponding desired obstacle avoidance task may be expressed as:
Figure GDA0003793684590000161
wherein
Figure GDA0003793684590000162
For the jacobian matrix of the obstacle avoidance task,
Figure GDA0003793684590000163
Figure GDA0003793684590000164
pseudo-inverse of jacobian matrix for obstacle avoidance tasks, Λ A A constant positive definite gain matrix for the obstacle avoidance task,
Figure GDA0003793684590000165
d represents a safe distance for avoiding the task error of the obstacle avoidance task.
2. Construction of composite tasks
The first composite task takes the obstacle avoidance task as the highest priority task, and the final composite task speed obtained according to the null-space projection method is as follows:
Figure GDA0003793684590000166
wherein
Figure GDA0003793684590000167
Is a zero space of the obstacle avoidance task.
The second compound task takes the mobile task as the highest priority task, and the final compound task speed obtained according to the null-space projection method is as follows:
Figure GDA0003793684590000168
wherein
Figure GDA0003793684590000169
Is a zero space of the obstacle avoidance task.
3. Simulation comparison and analysis
In the given simulation case, the model prediction control method is compared with a traditional logic method, wherein the traditional logic method switches task priorities by judging the distance between the obstacle and the robot and the safe distance, namely, the obstacle avoidance task is the highest priority when the distance between the obstacle and the robot is smaller than the safe distance, and the mobile task is the highest priority when the distance between the obstacle and the intelligent body is larger than the safe distance. The preset tracks of the three robots are respectively
Figure GDA0003793684590000171
Figure GDA0003793684590000172
The obstacle avoidance task gain is 15, the moving task gain is 13, the safety distance is 1m, and the positions of the three obstacles are P O1 =[10;10.5],P O2 =[8;12.5],P O3 =[16;11.5]The sampling frequency of the model predictive control method and the sampling frequency of the traditional logic method are both 20HZ, and the prediction time range of the model predictive control method is 3s. The robot trajectory of the proposed model predictive control method and the conventional method is shown in fig. 4, the distance between the robot and the obstacle is shown in fig. 5, and the tracking error of the robot is shown in fig. 5As shown in fig. 6. The result shows that the model prediction control method is obviously superior to the traditional method under the same sampling frequency. The trajectory of the compound task is shown in fig. 7. The conventional method frequently switches to avoid an obstacle when encountering the obstacle, resulting in oscillation of a trajectory and violation of a safety distance. On the other hand, the proposed model predictive control method can smoothly complete the task by only using two priority switches under the condition of not violating the safety distance constraint. This can be explained by the fact that the model predictive control method is able to predict obstacles in advance, thereby preparing priority switching in advance. This demonstrates the effectiveness of the proposed model predictive control method and its advantages over traditional logic-based methods.
The above description is only a preferred embodiment of the present invention, and all equivalent changes and modifications made in accordance with the claims of the present invention should be covered by the present invention.

Claims (3)

1. A method for planning the priority of a dynamic task controlled by the zero-space behavior of a multi-agent system is characterized in that: the method comprises the following steps:
step S1: designing basic tasks of an intelligent agent by using a zero-space-based behavior control method, then enumerating the possibility of the priorities of all tasks, and combining the designed basic tasks into a composite task in different priority sequences by using a zero-space projection method;
step S2: converting the dynamic task priority planning problem into a switching mode optimal control problem at each sampling moment by using a model predictive control technology;
and step S3: solving the optimal control problem by applying a mixed integer optimization method to obtain an optimal composite task track;
the step S1 specifically comprises the following steps:
step S11: design of basic tasks:
by a base task rho epsilon R m Coding is carried out, which is a function of the system configuration x epsilon R n A function of correlation, and is expressed as:
ρ=g(x) (1)
differentiating to obtain:
Figure FDA0003793684580000011
wherein J (x) epsilon R m×n Is a task Jacobian matrix related to system configuration, v belongs to R n Is the speed of the system; by locally linear inverse mapping, reference velocity v d Expressed as:
Figure FDA0003793684580000012
where ρ is d A function of the desired task is represented,
Figure FDA0003793684580000013
representing a pseudo-inverse of a jacobian matrix; integration of the reference velocity results in a numerical drift of the position of the agent, and therefore the following form of closed loop inverse kinematics is used to compensate for this numerical drift:
Figure FDA0003793684580000021
where a is a constant positive definite gain matrix,
Figure FDA0003793684580000022
is a mission error;
step S12: and (3) constructing a compound task:
a compound task is a combination of a plurality of basic tasks in a certain priority order; setting up
Figure FDA0003793684580000023
Is a task function where i' is e.N r ,N r ={1,...,r},m i′ A dimension representing a task space; defining a time-dependent priority functionThe number y (i', t) N r ×[0,∞]→N r (ii) a At the same time, a task hierarchy is defined with the following rules:
a task i ' with y (i ') priority cannot interfere with a task j with y (j) priority, if y (i ') > y (j),
Figure FDA0003793684580000024
i′≠j;
the mapping relation from the speed to the task speed is determined by the Jacobian matrix of the task
Figure FDA0003793684580000025
Represents;
the dimension of task i with lowest priority is greater than
Figure FDA0003793684580000026
Ensuring that dimension n is greater than the total dimension of all tasks;
the value of y (i') is assigned by the task supervisor according to the needs of the task and the sensor information;
the basic task is assigned a given priority, and the speed of the compound task at time t is expressed as:
Figure FDA0003793684580000027
wherein v is i′,t Is the speed of task j with priority function y (j, t) = i'; equation (5) is the composite task formed after all the basic tasks are subjected to zero-space projection.
2. The multi-agent system null-space behavior control dynamic task priority planning method according to claim 1, characterized in that:
considering n 'autonomous robots and m' tasks of each autonomous robot, n 'is not less than 2, m' is not less than 2; defining state vectors
Figure FDA0003793684580000028
Is composed of
Figure FDA0003793684580000029
Wherein each element represents a state of a robot; introducing a binary variable
Figure FDA0003793684580000031
Wherein each element w i (t)∈{0,1} M (ii) a The dynamics of the robot are written as a convex combination of dynamics in different modes:
Figure FDA0003793684580000032
wherein
Figure FDA0003793684580000033
A velocity in the form of formula (5) for the jth task of the ith robot; the total number of the task speeds is M = M'! (ii) a
Figure FDA0003793684580000034
One constraint of (2) is expressed as:
Figure FDA0003793684580000035
this constraint is called a class of special ordered sets, which guarantees that at least and only one mode is activated at any time;
defining the tracking error of the ith robot as:
Figure FDA0003793684580000036
wherein
Figure FDA0003793684580000037
Is a function of the expected task of the ith robot(ii) a The cost function is defined as follows:
Figure FDA0003793684580000038
wherein u is i (t) represents a relaxation variable, K i And P i Is a normal number; the optimal control problem is described as:
Figure FDA0003793684580000039
Figure FDA00037936845800000310
Figure FDA00037936845800000311
Figure FDA00037936845800000312
wherein
Figure FDA00037936845800000313
In the initial state of the process, the process is carried out,
Figure FDA00037936845800000314
the position of the obstacle detected by the ith robot at the time T is determined, and T' is a prediction range; the relaxation vector is
Figure FDA00037936845800000315
3. The multi-agent system null-space behavior control dynamic task priority planning method according to claim 2, characterized in that:
different initial states at each sampling time in the task execution process
Figure FDA0003793684580000041
Such that the problem (10) (10 a) (10 b) (10 c) (10 d) must be solved repeatedly; the problem (10) can be effectively solved on line by adopting a real-time model predictive control technology, and dynamic task priority planning is provided; the model predictive control technology adopts a direct method of first dispersion and then optimization and an external convection of an integer relaxation technology;
will be in time domain [0,T']Is divided into N intervals which are characterized by equidistant grid points,
Figure FDA0003793684580000042
at each interval, a binary variable is assumed
Figure FDA0003793684580000043
Is constant and its value can only be changed at grid points; the binary variable is relaxed to the real variable and satisfied
Figure FDA0003793684580000044
Using the multi-objective targeting method, the problem (10) is transformed into a nonlinear programming problem and is represented as:
Figure FDA0003793684580000045
Figure FDA0003793684580000046
Figure FDA0003793684580000047
Figure FDA0003793684580000048
Figure FDA0003793684580000049
wherein
Figure FDA00037936845800000410
s is the current sample and is the current sample,
Figure FDA00037936845800000411
predicting at samples s, x for k step states i,s|k Is composed of
Figure FDA00037936845800000412
The ith element of (1);
after solving equations 11 (a), 11 (b), 11 (c), 11 (d), 11 (e), a rounding method is employed to get
Figure FDA00037936845800000413
In-taking binary variables
Figure FDA00037936845800000414
The steps are as follows:
Figure FDA00037936845800000415
Figure FDA00037936845800000416
wherein j = 1.., M, k = 0.., N-1, the optimal composite task trajectory volume is equation (12 a) and equation (12 b),
the following propositions are given for the equations (12 a) and (12 b);
if it is not
Figure FDA0003793684580000051
Is measurable and satisfies
Figure FDA0003793684580000052
Then function
Figure FDA0003793684580000053
Is converted from the formulas (12 a) (12 b) by using the zero order hold, and satisfies
Figure FDA0003793684580000054
Wherein
Figure FDA0003793684580000055
Satisfy the requirement of
Figure FDA0003793684580000056
CN202010677791.7A 2020-07-14 2020-07-14 Multi-agent system null space behavior control dynamic task priority planning method Active CN111882184B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010677791.7A CN111882184B (en) 2020-07-14 2020-07-14 Multi-agent system null space behavior control dynamic task priority planning method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010677791.7A CN111882184B (en) 2020-07-14 2020-07-14 Multi-agent system null space behavior control dynamic task priority planning method

Publications (2)

Publication Number Publication Date
CN111882184A CN111882184A (en) 2020-11-03
CN111882184B true CN111882184B (en) 2022-10-14

Family

ID=73150276

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010677791.7A Active CN111882184B (en) 2020-07-14 2020-07-14 Multi-agent system null space behavior control dynamic task priority planning method

Country Status (1)

Country Link
CN (1) CN111882184B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112965722B (en) * 2021-03-03 2022-04-08 深圳华大九天科技有限公司 Verilog-A model optimization method, electronic device and computer readable storage medium
CN113467465B (en) * 2021-07-22 2023-08-04 福州大学 Robot system-oriented human-in-loop decision modeling and control method
CN114469642B (en) * 2022-01-20 2024-05-10 深圳华鹊景医疗科技有限公司 Rehabilitation robot control method and device and rehabilitation robot

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107065547A (en) * 2017-04-07 2017-08-18 西北工业大学 A kind of autonomous rendezvous strategy of noncooperative target based on kernel method
CN108237532A (en) * 2016-12-23 2018-07-03 深圳光启合众科技有限公司 The gait control method, apparatus and robot of multi-foot robot
CN108829113A (en) * 2018-09-01 2018-11-16 哈尔滨工程大学 A kind of adaptive kernel action amalgamation method of multi-robot formation
CN109079780A (en) * 2018-08-08 2018-12-25 北京理工大学 Distributed mobile mechanical arm task hierarchy optimization control method based on generalized coordinates

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120078585A1 (en) * 2010-06-29 2012-03-29 University Of Connecticut Method and system for constructing geometric skeletons and medial zones of rigid and non-rigid shapes

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108237532A (en) * 2016-12-23 2018-07-03 深圳光启合众科技有限公司 The gait control method, apparatus and robot of multi-foot robot
CN107065547A (en) * 2017-04-07 2017-08-18 西北工业大学 A kind of autonomous rendezvous strategy of noncooperative target based on kernel method
CN109079780A (en) * 2018-08-08 2018-12-25 北京理工大学 Distributed mobile mechanical arm task hierarchy optimization control method based on generalized coordinates
CN108829113A (en) * 2018-09-01 2018-11-16 哈尔滨工程大学 A kind of adaptive kernel action amalgamation method of multi-robot formation

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
宗立军等.自由漂浮空间机器人多约束混合整数预测控制.《宇航学报》.2016,(第08期),全文. *

Also Published As

Publication number Publication date
CN111882184A (en) 2020-11-03

Similar Documents

Publication Publication Date Title
CN111882184B (en) Multi-agent system null space behavior control dynamic task priority planning method
Castillo et al. Soft computing for control of non-linear dynamical systems
Ebrahimi et al. Intelligent Robust Fuzzy-Parallel Optimization Control of a Continuum Robot Manipulator
Xiao et al. Time-varying nonholonomic robot consensus formation using model predictive based protocol with switching topology
Tan et al. Robust model-free control for redundant robotic manipulators based on zeroing neural networks activated by nonlinear functions
Chen et al. Dynamic task priority planning for null-space behavioral control of multi-agent systems
Campos et al. PSO tuning for fuzzy PD+ I controller applied to a mobile robot trajectory control
Zhang et al. Reinforcement learning behavioral control for nonlinear autonomous system
Cui et al. Noise-resistant adaptive gain recurrent neural network for visual tracking of redundant flexible endoscope robot with time-varying state variable constraints
Santibañez et al. Global asymptotic stability of a tracking sectorial fuzzy controller for robot manipulators
Chemori et al. A new fast nonlinear model predictive control of parallel manipulators: Design and experiments
Scibilia et al. Modeling Nonlinear Dynamics in Human–Machine Interaction
Boufera et al. Fuzzy inference system optimization by evolutionary approach for mobile robot navigation
Lyu et al. CBF-inspired Weighted Buffered Voronoi Cells for Distributed Multi-agent Collision Avoidance
Cursi et al. Augmenting loss functions of feedforward neural networks with differential relationships for robot kinematic modelling
Duy et al. Designing hedge algebraic controller and optimizing by genetic algorithm for serial robots adhering trajectories
Bai et al. Adaptive Hybrid Optimization Learning-Based Accurate Motion Planning of Multi-Joint Arm
Franklin Historical perspective and state of the art in connectionist learning control
Kim Independent joint adaptive fuzzy control of robot manipulator
Esmaieli et al. Design Intelligent Robust Back stepping Controller
Zhang et al. A behavior-based adaptive dynamic programming method for multiple mobile manipulators coordination control
Xiang et al. Interactive natural motion planning for robot systems based on representation space
Vasilopoulos et al. Technical report: A hierarchical deliberative-reactive system architecture for task and motion planning in partially known environments
Ding et al. Adaptive optimal tracking control for continuum robots with uncertain dynamics
Zhang et al. Real-Time Joint Angular-Acceleration Planning for Vision-Based Kinematically Redundant Manipulator in Dynamic Environment

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant