research-article

Reinforcement learning with smart contracts on blockchains

Published: 01 November 2023

Abstract

In recent years, Machine Learning and Blockchain technologies have been at the forefront of innovation in both research and applications. Machine Learning is predominantly used to extract knowledge from data, while Blockchain excels at providing a ‘public ledger’ on which data are recorded securely, consistently and irreversibly. Machine Learning may use data stored on Blockchains and seek to exploit distributed computing resources. Conversely, Blockchain may exploit Machine Learning to capitalize on user data and to establish marketplaces for Machine Learning models. In this work we propose a combination of Machine Learning, in particular Reinforcement Learning (RL) and Imitation Learning (IL), with Blockchain. RL allows a software agent to interact with its environment and learn, via ‘trial and error’ techniques, based exclusively on its own activity, experiences and observations. The agent learns from the rewards and penalties that it receives immediately from its environment in response to its interactions. Designing such a reward/penalize mechanism is challenging: designers must ensure that the agent's immediate environment consistently recognizes and rewards desirable behaviour, and that the rewarding mechanism cannot be tapped, corrupted or circumvented. We approach this through a coordinated combination of RL and IL. We propose an expert trainer software agent (the Trainer Agent) that records its own behaviour in its environment in demonstration files and distributes these files via Blockchain to other, receiving software agents (Trainee agents). Trainees are trained using RL techniques (i.e. reward/penalize) in conjunction with IL (based on the demo files), so that they can imitate the trainer's good practices and be trained effectively. Demo files are ‘stored’ on smart-contract Blockchains, which ultimately reward Trainer Agents pro rata, according to the degree to which each Trainer has contributed to improving the Trainee agents' models. The immutable structure of the Blockchain and the unmodifiable nature of its smart contracts secure the demo files and lend credibility to all interactions among the stakeholders involved. The developed decentralized application (dApp) fully automates the workflow of trading demonstration files and of training the Trainee agents.
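
The pro-rated reward for trainers can be illustrated with a small sketch. This is not the paper's contract logic: the function name `pro_rata_rewards`, the fixed token pool, and the per-trainer integer improvement score are all assumptions made for illustration. Each trainer receives a share of the pool proportional to how much its demonstration files improved the trainee's model. Integer arithmetic is used because token amounts are typically handled as whole units.

```python
from typing import Dict


def pro_rata_rewards(improvements: Dict[str, int], pool: int) -> Dict[str, int]:
    """Split `pool` tokens among trainers in proportion to measured improvement.

    `improvements` maps a hypothetical trainer id to an integer score, for
    example the trainee's gain in mean episode reward scaled to an integer.
    Negative scores are clipped to zero, so a demo that hurt the trainee
    earns nothing.
    """
    clipped = {trainer: max(0, score) for trainer, score in improvements.items()}
    total = sum(clipped.values())
    if total == 0:
        # Nobody improved the trainee, so nothing is paid out.
        return {trainer: 0 for trainer in clipped}
    # Floor division mirrors integer token arithmetic; any remainder stays in the pool.
    return {trainer: pool * score // total for trainer, score in clipped.items()}


rewards = pro_rata_rewards(
    {"trainer_a": 30, "trainer_b": 10, "trainer_c": -5}, pool=1000
)
print(rewards)
```

Clipping negative scores is one possible design choice; a real mechanism could equally penalize trainers whose demonstrations degrade the trainee.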

Highlights

A Trainer Software Agent (TA) makes its behaviour available via blockchains.
Trainee Agents seek effective training to interact with environments similar to the TA's.
Reinforcement Learning is used for the training, i.e., reward/penalize.
We propose enriching RL with behavioural cloning and imitation learning.
Blockchain smart contracts are utilized for ‘storing’ demonstration files.
Blockchains assure the quality and non-modifiability of the demonstration files.
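
One common way to make a file tamper-evident is to record a cryptographic digest of it in an append-only store and recompute the digest on retrieval. The sketch below illustrates that idea only; it is an assumption about how non-modifiability could be checked, not a description of the paper's contracts. The class `DemoRegistry`, its methods, and the id `demo-001` are hypothetical, and a Python dictionary stands in for on-chain storage.

```python
import hashlib
from typing import Dict


class DemoRegistry:
    """In-memory stand-in for an on-chain registry of demonstration files."""

    def __init__(self) -> None:
        self._digests: Dict[str, str] = {}

    def register(self, demo_id: str, content: bytes) -> str:
        """Record the SHA-256 digest of a demo file; entries are write-once."""
        if demo_id in self._digests:
            raise ValueError(f"{demo_id} is already registered")
        digest = hashlib.sha256(content).hexdigest()
        self._digests[demo_id] = digest
        return digest

    def verify(self, demo_id: str, content: bytes) -> bool:
        """Return True only if `content` matches the digest recorded for `demo_id`."""
        expected = self._digests.get(demo_id)
        return expected is not None and hashlib.sha256(content).hexdigest() == expected


registry = DemoRegistry()
registry.register("demo-001", b"recorded trainer actions")

# A trainee recomputes the digest of the file it received before training on it.
ok = registry.verify("demo-001", b"recorded trainer actions")
tampered = registry.verify("demo-001", b"recorded trainer actions!")
print(ok, tampered)
```

Refusing to overwrite an existing entry mimics the write-once behaviour the highlights attribute to the blockchain.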

Published In

Future Generation Computer Systems, Volume 148, Issue C
Nov 2023
637 pages

Publisher

Elsevier Science Publishers B. V.
Netherlands

Author Tags

1. Blockchain
2. Ethereum
3. Smart contracts
4. Machine learning
5. Reinforcement learning
6. Imitation learning
7. ML-Agents

Qualifiers

• Research-article
