Handling Realistic Noise in Multi-Agent Systems with Self-Supervised Learning and Curiosity

Márton Szemenyei; Patrik Reizinger

doi:10.2478/jaiscr-2022-0009

Journal of Artificial Intelligence and Soft Computing Research

Volume 12 (2022): Issue 2 (April 2022)

Open Access

Feb 2022


DOI: https://doi.org/10.2478/jaiscr-2022-0009

Language: English

Page range: 135 - 148

Submitted on: Sep 27, 2021

Accepted on: Dec 18, 2021

Published on: Feb 23, 2022

Published by: SAN University

In partnership with: Paradigm Publishing Services

Publication frequency: 4 issues per year

Keywords: deep reinforcement learning, multi-agent environment, autonomous driving, robot soccer, self-supervised learning

Related subjects: Computer sciences, Databases and data mining, Artificial intelligence

© 2022 Márton Szemenyei, Patrik Reizinger, published by SAN University
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License.
