HiroPoseEstimation: A Dataset of Pose Estimation for Kid-Size Humanoid Robot
DOI:
https://doi.org/10.25126/jitecs.202383568
Abstract
Pose estimation is a field of computer vision research that involves detecting, associating, and tracking data points on body parts. It is used in health monitoring, sign language understanding, human gesture control, elderly activity monitoring, sports analysis, and humanoid robot pose estimation. The anatomy of a humanoid robot is similar to that of a human, which forms the basis for applying pose estimation to humanoid robots. The Humanoid League is a major domain of the RoboCup competition, featuring soccer matches between humanoid robots, and pose estimation is used there to measure a robot's performance. Nevertheless, little research has been done on this subject, and a new dataset is needed to address the problem. This work introduces HiroPoseEstimation, a dataset of kid-size humanoid robots that covers several types of robots in various poses based on movements in a soccer game. The dataset is evaluated with both bottom-up and top-down approaches, using Keypoint Mask R-CNN and a single-stage encoder-decoder model. Both methods demonstrate good performance on the proposed dataset.
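The abstract does not state which metric underlies the reported performance. As an illustration only, the sketch below computes Object Keypoint Similarity (OKS), the score used in COCO-style keypoint evaluation, for one predicted pose against one annotated pose. Treating OKS as the metric is an assumption, and the function name `oks`, its parameters, and the per-keypoint `sigmas` values are hypothetical rather than taken from the dataset.

```python
import math


def oks(pred, gt, visible, area, sigmas):
    """Object Keypoint Similarity between a predicted and an annotated pose.

    pred, gt : lists of (x, y) coordinates, one pair per keypoint
    visible  : list of bools, True where the keypoint is annotated
    area     : object scale s^2, e.g. the annotated region area in pixels
    sigmas   : per-keypoint falloff constants (hypothetical values here)
    """
    total, count = 0.0, 0
    for (px, py), (gx, gy), is_visible, sigma in zip(pred, gt, visible, sigmas):
        if not is_visible:
            continue  # unannotated keypoints do not contribute to the score
        d2 = (px - gx) ** 2 + (py - gy) ** 2  # squared pixel distance
        k = 2.0 * sigma  # COCO convention: k_i = 2 * sigma_i
        total += math.exp(-d2 / (2.0 * area * k * k))
        count += 1
    return total / count if count else 0.0


# Hypothetical three-keypoint robot pose, with one prediction off by one pixel.
annotated = [(10.0, 10.0), (20.0, 20.0), (30.0, 30.0)]
predicted = [(11.0, 10.0), (20.0, 20.0), (30.0, 30.0)]
score = oks(predicted, annotated, [True, True, True], 100.0, [0.05, 0.05, 0.05])
```

A score of 1.0 means every annotated keypoint was matched exactly, and the score decays toward 0 as predictions drift relative to the object's scale. Average precision is then typically reported over a range of OKS thresholds.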
License
Creative Commons Attribution-ShareAlike 3.0 International (CC BY-SA 3.0)