HiroPoseEstimation: A Dataset of Pose Estimation for Kid-Size Humanoid Robot

Authors

  • Amik Rafly Azmi Ulya, Sepuluh Nopember Institute of Technology, Surabaya
  • Nathanael Hutama Harsono, Sepuluh Nopember Institute of Technology, Surabaya
  • Eko Mulyanto Yuniarno, Sepuluh Nopember Institute of Technology, Surabaya
  • Mauridhi Hery Purnomo, Sepuluh Nopember Institute of Technology, Surabaya

DOI:

https://doi.org/10.25126/jitecs.202383568

Abstract

Pose estimation is a field of computer vision research that involves detecting, associating, and tracking keypoints on body parts. Its applications include health monitoring, sign language understanding, human gesture control, elderly activity detection, sports analysis, and humanoid robotics. Because the anatomy of a humanoid robot resembles that of a human, pose estimation techniques can also be applied to humanoid robots. The Humanoid League is a major domain of the RoboCup competition, featuring soccer matches between humanoid robots, and pose estimation can be used there to measure a robot's performance. Nevertheless, little research has been done on this subject, and a new dataset is needed to address the gap. This work introduces HiroPoseEstimation, a kid-size humanoid robot dataset covering several types of robots in various poses based on movements in a soccer game. The dataset is evaluated with both bottom-up and top-down approaches, using Keypoint Mask R-CNN and a single-stage encoder-decoder model. Both methods demonstrate good performance on the proposed dataset.

Published

2023-12-15

How to Cite

Rafly Azmi Ulya, A., Hutama Harsono, N., Mulyanto Yuniarno, E., & Hery Purnomo, M. (2023). HiroPoseEstimation: A Dataset of Pose Estimation for Kid-Size Humanoid Robot. Journal of Information Technology and Computer Science, 8(3), 231–240. https://doi.org/10.25126/jitecs.202383568
