CN109743210A - Unmanned aerial vehicle network multi-user access control method based on deep reinforcement learning - Google Patents
Unmanned aerial vehicle network multi-user access control method based on deep reinforcement learning
- Publication number
- CN109743210A (application number CN201910074944.6A)
- Authority
- CN
- China
- Prior art keywords
- access
- unmanned aerial vehicle
- base station
- network
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Abstract
The invention belongs to the field of wireless communication technology and relates to a multi-user access control method for unmanned aerial vehicle (UAV) networks based on deep reinforcement learning. Exploiting the inherent regularities of the environment that deep reinforcement learning can capture, the invention proposes a deep reinforcement learning framework adapted to the multi-user access case in UAV networks, and realizes a deep-reinforcement-learning-based multi-user access control scheme that operates without global network information. Compared with conventional access control, the proposed scheme achieves higher system throughput and fewer handovers. Moreover, by adjusting the handover penalty term, different trade-offs between throughput and handover count can be realized, and performance is guaranteed under different handover penalties.
Description
Technical Field
The invention belongs to the technical field of wireless communication, and relates to an unmanned aerial vehicle network multi-user access control method based on deep reinforcement learning.
Background
Conventional access control techniques use threshold comparison: a metric (e.g., received signal strength) and an appropriate threshold are selected, and when the received signal strength that the user equipment (UE) measures from its serving base station falls below the threshold, the UE accesses a base station that can provide a received signal strength above it. However, in a drone network that uses unmanned aerial vehicles as base stations, base-station mobility causes the relative distance between base station and user to change frequently, so the received signal strength at the user fluctuates drastically; conventional access control then triggers frequent handovers, incurring substantial extra signaling overhead. In addition, when multiple UEs hand over simultaneously, conventional access control can only guarantee the throughput of a single user, not the throughput of the entire system.
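The threshold comparison rule described above can be sketched in a few lines of Python; the function name and the dBm values are illustrative, not taken from the patent:

```python
def threshold_handover(current_bs, rss, threshold):
    """Conventional threshold-based access control: stay on the current
    base station while its received signal strength (RSS) clears the
    threshold; otherwise switch to the strongest base station that does.
    `rss` maps base-station id -> measured RSS (e.g., in dBm)."""
    if rss[current_bs] >= threshold:
        return current_bs  # no handover needed
    # candidate base stations whose RSS clears the threshold
    candidates = {bs: p for bs, p in rss.items() if p >= threshold}
    if not candidates:
        return current_bs  # nothing better available: keep the current link
    return max(candidates, key=candidates.get)
```

With a fast-moving drone base station the serving RSS crosses the threshold often, so this rule is exactly what produces the frequent handovers criticized above.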
Disclosure of Invention
To solve the frequent-handover problem of conventional access control in unmanned aerial vehicle networks and to guarantee the overall network throughput under multi-user access, the invention focuses on the long-term throughput of the overall system and the number of handovers. Deep reinforcement learning performs well in complex dynamic decision problems; to overcome the difficulty of collecting global network information in the drone network environment, the invention exploits the inherent regularities of that environment, proposes a deep reinforcement learning framework suited to multi-user access in drone networks, and realizes a deep-reinforcement-learning-based multi-user access control scheme that works when global network information is unknown.
In the invention, a system model is established in which unmanned aerial vehicles serve ground users as mobile base stations: each drone moves along a preset trajectory and provides downlink transmission service to ground UEs. Each UE is regarded as an independent decision maker and selects a suitable drone base station to access in each time slot. The decision process is handed over entirely to the UEs; the drone base stations are only responsible for receiving access requests and providing transmission service. No information is exchanged among the UEs during decision making, i.e., each UE's decision depends only on the network information it can observe itself, which reduces the overall signaling overhead.
To solve the multi-user access decision problem, the invention proposes a deep reinforcement learning framework with distributed decision-making and centralized training: a central node is responsible for training the neural network parameters of all UEs. In this framework, each UE is equipped with a neural network of identical structure and obtains its access policy by feeding its local network information into that network; the central node collects experience information from each UE, trains the neural network parameters, and transmits the trained parameters to the users after each training stage is completed. After a UE obtains the trained parameters from the central node, it updates its local neural network. Separating the decision process from the training process means a UE only needs to run the trained network, which reduces the computational complexity at the UE.
To address the difficulty of collecting base-station position information in the drone network, the state design avoids position information and mainly uses quantities such as the user's received signal strength, which can be measured directly and locally. To avoid frequent handovers and guarantee the throughput of the whole network under multi-user conditions, the deep reinforcement learning reward function considers not only the user's own throughput but also handover suppression for the UE and the impact of a single UE's access action on the other affected UEs.
To better capture and learn the pattern of received-signal-strength changes at the UE, a long short-term memory (LSTM) network is introduced into the neural network design. The network is deliberately simple: the LSTM extracts features, which are then processed by a three-layer fully connected network to produce the corresponding access decision output.
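As a rough illustration of this architecture, the following numpy sketch runs a single LSTM layer over M time steps and feeds the final hidden state through three fully connected layers; the layer sizes, gate ordering, and weight layout are assumptions, not the patent's actual configuration:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_features(x_seq, Wx, Wh, b):
    """Run one LSTM layer over x_seq of shape (M, D) and return the final
    hidden state as the feature vector.  Gates are stacked in the order
    (input, forget, cell, output) in Wx (4H, D), Wh (4H, H), b (4H,)."""
    H = Wh.shape[1]
    h = np.zeros(H)
    c = np.zeros(H)
    for x in x_seq:
        z = Wx @ x + Wh @ h + b
        i, f, g, o = z[:H], z[H:2 * H], z[2 * H:3 * H], z[3 * H:]
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)  # cell state update
        h = sigmoid(o) * np.tanh(c)                   # hidden state
    return h

def access_policy(x_seq, params):
    """LSTM feature extraction followed by three fully connected layers;
    the last layer scores the K candidate base stations and the argmax
    is the access decision."""
    h = lstm_features(x_seq, params["Wx"], params["Wh"], params["b"])
    for W, b in params["fc"][:-1]:
        h = np.maximum(0.0, W @ h + b)  # ReLU on the hidden FC layers
    W, b = params["fc"][-1]
    return int(np.argmax(W @ h + b))
```

In practice the weights would come from the central node's training; random parameters here merely exercise the shapes.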
Compared with conventional access control, the proposed scheme achieves higher system throughput and fewer handovers. Moreover, by adjusting the handover penalty term, different trade-offs between throughput and handover count can be realized, and performance is guaranteed under different handover penalty settings.
Drawings
FIG. 1 shows a system model of a drone network in accordance with the present invention;
FIG. 2 illustrates a deep reinforcement learning framework model in accordance with the present invention;
FIG. 3 shows a structural model of a neural network in the present invention;
FIG. 4 shows the throughput and the number of handovers of the access control scheme proposed by the present invention compared to a conventional access control scheme.
Detailed Description
The invention is described in detail below with reference to the drawings and simulation examples so that those skilled in the art can better understand the invention.
FIG. 1 shows the system model of the present invention. The wireless communication system has two parts: drone base stations and ground UEs. The drone base stations fly along fixed trajectories in the air and serve the ground UEs. Because a drone base station flies in the air, the channel has two components, line of sight (LOS) and non-line of sight (NLOS), whose occurrence proportions are determined mainly by the elevation angle between the drone and the ground user. Both components include large-scale fading, determined mainly by the distance between the UE and the base station, and small-scale fading, which follows a Rician distribution for the LOS component and a Rayleigh distribution for the NLOS component. Specifically, the channel gain model between the jth drone base station and the ith ground UE may be expressed as:

g_{i,j}(t) = P^{LOS}_{i,j} h^{LOS}_{i,j} + P^{NLOS}_{i,j} h^{NLOS}_{i,j}

where P^{LOS}_{i,j} and P^{NLOS}_{i,j} respectively denote the occurrence proportions of the LOS and NLOS components, and h^{LOS}_{i,j} and h^{NLOS}_{i,j} the corresponding channel gains; f denotes the carrier frequency and v the speed of light; μ_{LOS} and μ_{NLOS} are the attenuation factors of the LOS and NLOS components; l_{i,j} is the distance between the drone base station and the UE; and α_{LOS} and α_{NLOS} are the path loss exponents for LOS and NLOS, respectively.
In the established system model, every drone has the same transmission power. Because the channel gain model contains small-scale fading, the UE performs sampling averaging on the received signal during access selection to suppress it; the average received signal strength can be expressed as:

\bar{P}_{i,j}(t) = \frac{1}{N} \sum_{n=1}^{N} P_t \, g_{i,j}(n)

where P_t is the transmission power of the drone base station and N is the number of signal samples averaged.
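A minimal simulation of this sampling-averaging step might look as follows; the LOS probability, the Rician K-factor, and the helper names are illustrative assumptions rather than values from the patent:

```python
import numpy as np

def small_scale_gain(rng, los, k_factor=10.0):
    """One small-scale power sample: Rician fading for a LOS occurrence,
    Rayleigh fading for NLOS; both normalized to unit mean power."""
    scatter = (rng.normal() + 1j * rng.normal()) / np.sqrt(2.0)
    if los:
        h = np.sqrt(k_factor / (k_factor + 1.0)) + scatter / np.sqrt(k_factor + 1.0)
    else:
        h = scatter
    return np.abs(h) ** 2

def average_rss(p_t, g_large, n_samples, rng, p_los=0.8, k_factor=10.0):
    """Sampling-average of the received signal strength over N samples,
    \\bar P = (1/N) sum_n P_t * g(n): the small-scale fading averages
    toward unit mean, leaving P_t times the large-scale gain."""
    samples = [p_t * g_large * small_scale_gain(rng, rng.random() < p_los, k_factor)
               for _ in range(n_samples)]
    return float(np.mean(samples))
```

Averaging many samples makes the estimate concentrate around the large-scale received power, which is why the UE uses it for access selection.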
Because all drone base stations transmit on the same spectrum resource, a ground UE that accesses one drone for transmission receives interference from the other drones. The SINR at the user can be expressed as:

\Gamma_{i,j}(t) = \frac{\bar{P}_{i,j}(t)}{\sum_{k \in \mathcal{K}, k \neq j} \bar{P}_{i,k}(t) + \sigma^2}

where \mathcal{K} denotes the set of drone base stations in the network and σ² the noise power.
The user selects a suitable drone base station to access in each time slot. A base station accessed by several users within a single time slot serves them by time division multiple access (TDMA), i.e., the slot is divided evenly into as many equal subslots as there are accessed users. The reception rate of the UE may be expressed as:

\omega_i(t) = \frac{B}{N_j(t)} \log_2\left(1 + \Gamma_{i,j}(t)\right)

where B is the frequency bandwidth used for base station transmission and N_j(t) is the number of users accessing the base station at that time.
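Under the definitions above, the SINR and TDMA rate computations can be sketched as follows; the noise-power value and the dictionary-based representation of received powers are illustrative:

```python
import numpy as np

NOISE_POWER = 1e-13  # sigma^2 in watts; illustrative value

def sinr(rss, serving):
    """SINR at the UE: averaged received power from the serving drone
    over the sum of co-channel interference from all other drones plus
    noise.  `rss` maps base-station id -> averaged received power (W)."""
    interference = sum(p for bs, p in rss.items() if bs != serving)
    return rss[serving] / (interference + NOISE_POWER)

def tdma_rate(bandwidth, rss, serving, n_users):
    """Per-UE rate under TDMA: the slot is split evenly among the
    N_j(t) accessed users, so each gets B / N_j(t) of the bandwidth
    at spectral efficiency log2(1 + SINR)."""
    return bandwidth / n_users * np.log2(1.0 + sinr(rss, serving))
```

Note the coupling this creates between users: every extra UE on the same drone halves, thirds, etc. everyone's rate, which is why single-user metrics cannot guarantee system throughput.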
Fig. 2 shows the proposed deep reinforcement learning framework. It consists of three parts: the drone base stations, the central node, and the UEs. The drone base stations are responsible for transmission service, the central node trains the neural network parameters of the UEs, and each UE makes a suitable base-station access selection in every decision phase. Each UE carries a neural network identical in structure to the central node's; its parameters are obtained from the central node and can be regarded as a replica of the central node's network. Each UE is treated as an independent individual in the framework: no information is exchanged between UEs, and each UE independently selects a drone base station to access and is responsible for reporting its own network information to the central node.
For a single UE, the other users and the drone base stations can be regarded as the environment. The overall information flow therefore has two parts: the interaction between the UE and the environment, and the exchange of experience information and network parameters between the UE and the central node. In each access selection stage, every UE selects a suitable drone base station according to its own state. Since we mainly aim to maximize user throughput, and a user's reception rate depends chiefly on the received signal strength and the number of users accessing the base station, the numbers of user connections and the received signal strengths serve as the main state elements. The state can be expressed as:

s_i(t) = \{ u_{i,0}(t-1), \ldots, u_{i,K-1}(t-1), \bar{P}_{i,0}(t-1), \ldots, \bar{P}_{i,K-1}(t-1), \bar{P}_{i,0}(t), \ldots, \bar{P}_{i,K-1}(t), N_0(t-1), \ldots, N_{K-1}(t-1), \omega_i(t-1) \}

where u_{i,j} is a binary access indicator variable: "1" means the base station is accessed and "0" means it is not selected. The state thus contains the user's access indicators u_{i,j}(t-1) at the previous time instant, the received signal strengths \bar{P}_{i,j} at the previous and current time instants, the number of access users N_j(t-1) of each base station at the previous instant, and ω_i(t-1), the throughput of the UE at the previous instant.
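The state elements listed here (access indicators, received signal strengths at two instants, per-base-station user counts, and the previous throughput) can be assembled into a flat vector; the function and argument names below are illustrative:

```python
def build_state(prev_access, rss_prev, rss_now, prev_loads, prev_throughput):
    """Flatten the UE's local observations into the state vector s_i(t):
    access indicators u_{i,j}(t-1), RSS per base station at t-1 and t,
    per-base-station user counts N_j(t-1), and the UE's own throughput
    omega_i(t-1).  All list arguments have length K (one entry per
    candidate base station)."""
    return (list(prev_access) + list(rss_prev) + list(rss_now)
            + list(prev_loads) + [prev_throughput])
```

For K base stations the vector has 4K + 1 entries, which fixes the input dimension of the UE's neural network.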
After making its access selection, the UE sends an access request to the selected drone base station, and the drone provides transmission service to the UE once it receives the request. After all UE access decisions are made, the environment information is updated: each drone base station counts its own number of access users and sends the new network information to each UE, forming the UE's new state. All UEs transmit their state transition — the original state, the access selection made, the resulting throughput, and the new state — to the central node, which calculates each UE's reward and completes the experience information. The final reward function may be expressed as:

r_i(t) = \omega_i(t) + \eta \, \Delta\omega_i(t) - C \cdot \mathbb{1}\{ a_i(t) \neq a_i(t-1) \}

where Δω_i(t) denotes the impact of the UE's access selection on the performance of the other relevant users; a_i(t) and a_i(t-1) denote the access actions taken by the user at times t and t-1, respectively; C denotes the penalty for creating a handover; and η is a control factor.
After collecting the experience information of all UEs, the central node stores it in local memory in queue form and pools the experience of all users. The central node then draws random samples from this memory as the minibatch for the current training round and trains the neural network parameters by stochastic gradient descent. After each round of training, the central node sends the trained parameters to every UE. Upon receiving the new parameters, a UE updates its local network and uses the updated network, together with its new state, to make its next handover decision.
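The queue-form experience store and random-minibatch sampling at the central node could be sketched as follows; the class and method names are our own, and the gradient update itself is omitted:

```python
import random
from collections import deque

class ReplayBuffer:
    """Queue-style experience store at the central node: UEs push
    (state, action, reward, next_state) tuples, and training draws
    uniform random minibatches for stochastic-gradient updates."""

    def __init__(self, capacity):
        # deque with maxlen: once full, the oldest experience drops out
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state):
        self.buffer.append((state, action, reward, next_state))

    def sample(self, batch_size):
        # uniform sampling without replacement from the pooled experience
        return random.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)
```

Pooling all UEs' transitions in one buffer is what lets a single set of trained parameters serve every UE's identical network.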
FIG. 3 shows the neural network architecture employed in the present invention. It consists of two parts: an LSTM network and a fully connected network. The LSTM network extracts temporal-continuity features from the input parameters and takes the data of M consecutive time instants as input; the fully connected network processes the features extracted by the LSTM to produce the corresponding access policy.
Fig. 4 shows the system throughput and handover count of the proposed access control technique under different handover penalty coefficients; the results are measured over a test run of 1000 time slots. Compared with conventional access control methods (one based on received signal strength and one based on a learning algorithm), the proposed method achieves higher system throughput with fewer handovers. The proposed technique attains the best performance under every handover penalty setting, and adjusting the handover penalty term realizes different trade-offs between the number of handovers and the system throughput.
Claims (4)
1. An unmanned aerial vehicle network multi-user access control method based on deep reinforcement learning is used for a system which takes an unmanned aerial vehicle as a mobile base station to provide service for ground user UE, and is characterized in that the control method comprises the following steps:
constructing a deep reinforcement learning framework with distributed decision-making and centralized training, namely configuring a neural network of identical structure for each UE, each UE independently obtaining its policy for accessing a drone base station from its own neural network; and meanwhile providing a central node equipped with the same neural network, which collects experience information from each UE and trains the neural network parameters, the central node transmitting the trained parameters to each UE after each training stage is completed.
2. The deep reinforcement learning-based multi-user access control method for the unmanned aerial vehicle network according to claim 1, wherein the specific method for the central node to collect experience information from each UE is as follows:
the UE needs to select a proper action according to its own state, and obtains a corresponding reward after execution, and the throughput of the UE is mainly related to the number of access users of the base station and the strength of the received signal, so the state of the UE is expressed as:
si(t)={ui,0(t-1),…,ui,K-1(t-1),
ωi(t-1)}
wherein u isi,jThe defined access indicating variable is a binary indicating variable, namely, 1 indicates that the base station is accessed, and 0 indicates that the base station is not selected to be accessed; the state includes the access indicator variable u of the user at the last momenti,j(t-1), the last time and the received signal strength at this timeAndnumber N of access users of each base station at last moment0(t-1),ωi(t-1) indicates that the UE was at the previous timeThroughput;
after making its access selection, the UE sends an access request to the selected drone base station, and the drone provides transmission service for the UE after receiving the request;
after all UE access decisions are made, the environment information is updated: the drone base stations count their own numbers of access users and send the new network information to each UE to form the UE's new state; all UEs transmit the original state, the access selection made, the throughput obtained, and the new state to the central node, and the central node calculates the reward function of each UE and completes the experience information:

r_i(t) = \omega_i(t) + \eta \, \Delta\omega_i(t) - C \cdot \mathbb{1}\{ a_i(t) \neq a_i(t-1) \}

wherein ω_i(t) represents the throughput of the UE at the current time instant; Δω_i(t) represents the change in throughput of the other relevant users after the UE's access selection, defined as the impact on those users' performance; a_i(t) and a_i(t-1) denote the access actions taken by the user at times t and t-1, respectively; C denotes the penalty for creating a handover; and η is a control coefficient.
3. The unmanned aerial vehicle network multi-user access control method based on deep reinforcement learning of claim 2, wherein the specific method for the central node to train neural network parameters is as follows:
after the central node collects the experience information of all UEs, it stores all the information in a local memory in queue form, pooling the experience information of all users; it then draws random samples from this memory as the training samples of the current round and trains the neural network parameters by stochastic gradient descent.
4. The deep reinforcement learning-based unmanned aerial vehicle network multi-user access control method according to claim 3, wherein the neural network is composed of an LSTM network and a fully connected network: the LSTM network extracts temporal-continuity features from the input parameters and takes the data of M consecutive time instants as input; the fully connected network processes the features extracted by the LSTM network to obtain the corresponding access policy.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910074944.6A CN109743210B (en) | 2019-01-25 | 2019-01-25 | Unmanned aerial vehicle network multi-user access control method based on deep reinforcement learning |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910074944.6A CN109743210B (en) | 2019-01-25 | 2019-01-25 | Unmanned aerial vehicle network multi-user access control method based on deep reinforcement learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109743210A true CN109743210A (en) | 2019-05-10 |
CN109743210B CN109743210B (en) | 2020-04-17 |
Family
ID=66366151
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910074944.6A Active CN109743210B (en) | 2019-01-25 | 2019-01-25 | Unmanned aerial vehicle network multi-user access control method based on deep reinforcement learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109743210B (en) |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110351252A * | 2019-06-19 | 2019-10-18 | 南京航空航天大学 | Adaptive medium access control method for unmanned aerial vehicle ad hoc networks capable of synchronous switching
CN110458283A * | 2019-08-13 | 2019-11-15 | 南京理工大学 | Method for maximizing global throughput in a static environment based on deep reinforcement learning
CN110661566A (en) * | 2019-09-29 | 2020-01-07 | 南昌航空大学 | Unmanned aerial vehicle cluster networking method and system adopting depth map embedding |
CN111083767A (en) * | 2019-12-23 | 2020-04-28 | 哈尔滨工业大学 | Heterogeneous network selection method based on deep reinforcement learning |
CN111884740A (en) * | 2020-06-08 | 2020-11-03 | 江苏方天电力技术有限公司 | Unmanned aerial vehicle channel optimal allocation method and system based on frequency spectrum cognition |
CN112947541A (en) * | 2021-01-15 | 2021-06-11 | 南京航空航天大学 | Unmanned aerial vehicle intention track prediction method based on deep reinforcement learning |
CN113342030A (en) * | 2021-04-27 | 2021-09-03 | 湖南科技大学 | Multi-unmanned aerial vehicle cooperative self-organizing control method and system based on reinforcement learning |
CN114422086A (en) * | 2022-01-25 | 2022-04-29 | 南京航空航天大学 | Unmanned aerial vehicle ad hoc network self-adaptive MAC protocol method based on flow prediction and consensus algorithm |
CN115454646A (en) * | 2022-09-29 | 2022-12-09 | 电子科技大学 | Multi-agent reinforcement learning acceleration method for clustered unmanned aerial vehicle decision making |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113938907A (en) * | 2020-07-13 | 2022-01-14 | 华为技术有限公司 | Communication method and communication device |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104950906A (en) * | 2015-06-15 | 2015-09-30 | 中国人民解放军国防科学技术大学 | Unmanned aerial vehicle remote measuring and control system and method based on mobile communication network |
CN106094606A (en) * | 2016-05-19 | 2016-11-09 | 南通航运职业技术学院 | A kind of unmanned surface vehicle navigation and control remote-controlled operation platform |
CN107205225A (en) * | 2017-08-03 | 2017-09-26 | 北京邮电大学 | The switching method and apparatus for the unmanned aerial vehicle onboard base station predicted based on user trajectory |
US9826415B1 (en) * | 2016-12-01 | 2017-11-21 | T-Mobile Usa, Inc. | Tactical rescue wireless base station |
CN107680151A (en) * | 2017-09-27 | 2018-02-09 | 千寻位置网络有限公司 | Strengthen the method and its application of the indicative animation fulfillment capability in Web3D |
CN108684047A (en) * | 2018-07-11 | 2018-10-19 | 北京邮电大学 | A kind of unmanned plane carries small base station communication system and method |
CN108733051A (en) * | 2017-04-17 | 2018-11-02 | 英特尔公司 | The advanced sensing of autonomous vehicle and response |
CN109195135A * | 2018-08-06 | 2019-01-11 | 同济大学 | Base station selection method based on deep reinforcement learning in LTE-V
Patent Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104950906A (en) * | 2015-06-15 | 2015-09-30 | 中国人民解放军国防科学技术大学 | Unmanned aerial vehicle remote measuring and control system and method based on mobile communication network |
CN106094606A (en) * | 2016-05-19 | 2016-11-09 | 南通航运职业技术学院 | A kind of unmanned surface vehicle navigation and control remote-controlled operation platform |
US9826415B1 (en) * | 2016-12-01 | 2017-11-21 | T-Mobile Usa, Inc. | Tactical rescue wireless base station |
CN108733051A (en) * | 2017-04-17 | 2018-11-02 | 英特尔公司 | The advanced sensing of autonomous vehicle and response |
CN107205225A (en) * | 2017-08-03 | 2017-09-26 | 北京邮电大学 | The switching method and apparatus for the unmanned aerial vehicle onboard base station predicted based on user trajectory |
CN107680151A (en) * | 2017-09-27 | 2018-02-09 | 千寻位置网络有限公司 | Strengthen the method and its application of the indicative animation fulfillment capability in Web3D |
CN108684047A (en) * | 2018-07-11 | 2018-10-19 | 北京邮电大学 | A kind of unmanned plane carries small base station communication system and method |
CN109195135A * | 2018-08-06 | 2019-01-11 | 同济大学 | Base station selection method based on deep reinforcement learning in LTE-V
Non-Patent Citations (2)
Title |
---|
YANG CAO, ET AL.: "Deep Reinforcement Learning for User Access Control in UAV Networks", 2018 IEEE INTERNATIONAL CONFERENCE ON COMMUNICATION SYSTEMS (ICCS) *
WEI Yongming et al.: "Research on UAV time-sequence image localization based on CNN and Bi-LSTM", 《电光与控制》 (Electronics Optics & Control) *
Cited By (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110351252A * | 2019-06-19 | 2019-10-18 | 南京航空航天大学 | Adaptive medium access control method for unmanned aerial vehicle ad hoc networks capable of synchronous switching
CN110458283A * | 2019-08-13 | 2019-11-15 | 南京理工大学 | Method for maximizing global throughput in a static environment based on deep reinforcement learning
CN110661566A (en) * | 2019-09-29 | 2020-01-07 | 南昌航空大学 | Unmanned aerial vehicle cluster networking method and system adopting depth map embedding |
CN110661566B (en) * | 2019-09-29 | 2021-11-19 | 南昌航空大学 | Unmanned aerial vehicle cluster networking method and system adopting depth map embedding |
CN111083767B (en) * | 2019-12-23 | 2021-07-27 | 哈尔滨工业大学 | Heterogeneous network selection method based on deep reinforcement learning |
CN111083767A (en) * | 2019-12-23 | 2020-04-28 | 哈尔滨工业大学 | Heterogeneous network selection method based on deep reinforcement learning |
CN111884740A (en) * | 2020-06-08 | 2020-11-03 | 江苏方天电力技术有限公司 | Unmanned aerial vehicle channel optimal allocation method and system based on frequency spectrum cognition |
CN112947541A (en) * | 2021-01-15 | 2021-06-11 | 南京航空航天大学 | Unmanned aerial vehicle intention track prediction method based on deep reinforcement learning |
CN113342030A (en) * | 2021-04-27 | 2021-09-03 | 湖南科技大学 | Multi-unmanned aerial vehicle cooperative self-organizing control method and system based on reinforcement learning |
CN114422086A (en) * | 2022-01-25 | 2022-04-29 | 南京航空航天大学 | Unmanned aerial vehicle ad hoc network self-adaptive MAC protocol method based on flow prediction and consensus algorithm |
CN114422086B (en) * | 2022-01-25 | 2024-07-02 | 南京航空航天大学 | Unmanned aerial vehicle self-networking self-adaptive MAC protocol method based on flow prediction and consensus algorithm |
CN115454646A (en) * | 2022-09-29 | 2022-12-09 | 电子科技大学 | Multi-agent reinforcement learning acceleration method for clustered unmanned aerial vehicle decision making |
CN115454646B (en) * | 2022-09-29 | 2023-08-25 | 电子科技大学 | Multi-agent reinforcement learning acceleration method for clustered unmanned plane decision |
Also Published As
Publication number | Publication date |
---|---|
CN109743210B (en) | 2020-04-17 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109743210B (en) | Unmanned aerial vehicle network multi-user access control method based on deep reinforcement learning | |
Cao et al. | Deep reinforcement learning for multi-user access control in non-terrestrial networks | |
CN110809306B (en) | Terminal access selection method based on deep reinforcement learning | |
CN113162679A (en) | DDPG algorithm-based IRS (inter-Range instrumentation System) auxiliary unmanned aerial vehicle communication joint optimization method | |
CN113873434B (en) | Communication network hotspot area capacity enhancement oriented multi-aerial base station deployment method | |
CN110531617A (en) | Multiple no-manned plane 3D hovering position combined optimization method, device and unmanned plane base station | |
CN114900225B (en) | Civil aviation Internet service management and access resource allocation method based on low-orbit giant star base | |
CN110380776B (en) | Internet of things system data collection method based on unmanned aerial vehicle | |
CN111526592B (en) | Non-cooperative multi-agent power control method used in wireless interference channel | |
CN113055078B (en) | Effective information age determination method and unmanned aerial vehicle flight trajectory optimization method | |
Cao et al. | Deep reinforcement learning for multi-user access control in UAV networks | |
CN113255218B (en) | Unmanned aerial vehicle autonomous navigation and resource scheduling method of wireless self-powered communication network | |
CN115499921A (en) | Three-dimensional trajectory design and resource scheduling optimization method for complex unmanned aerial vehicle network | |
CN114980169A (en) | Unmanned aerial vehicle auxiliary ground communication method based on combined optimization of track and phase | |
Chen et al. | An actor-critic-based UAV-BSs deployment method for dynamic environments | |
Najla et al. | Machine learning for power control in D2D communication based on cellular channel gains | |
CN101217345B (en) | A detecting method on vertical layered time space code message system | |
CN116866974A (en) | Federal learning client selection method based on deep reinforcement learning | |
CN114268348A (en) | Honeycomb-free large-scale MIMO power distribution method based on deep reinforcement learning | |
CN105722201A (en) | Femtocell network interference alignment optimizing method | |
CN116634450A (en) | Dynamic air-ground heterogeneous network user association enhancement method based on reinforcement learning | |
CN115915402A (en) | Method for optimizing NOMA communication coverage of user scheduling, power distribution and unmanned aerial vehicle track in combined manner | |
CN114980205A (en) | QoE (quality of experience) maximization method and device for multi-antenna unmanned aerial vehicle video transmission system | |
CN114745032A (en) | Non-cellular large-scale MIMO intelligent distributed beam selection method | |
Mehbodniya et al. | A dynamic weighting of attributes in heterogeneous wireless networks using fuzzy linguistic variables |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |