
56                                                                UEC Int’l Mini-Conference No.53

                  Optimized Graph Neural Network Approach for 3D Human Pose Estimation

                                        Acevedo-Bringas Luis*, Hiroki Takahashi
                                       The University of Electro-Communications
                                               Department of Informatics
                                               *a2440012@gl.cc.uec.ac.jp

                                                     Introduction

              3D Human Pose Estimation (HPE) in videos aims to predict the locations of the human body's joints
              in 3D space.

              • Graph-based models: human body as graph, joints as nodes, spatial, temporal & relational learning.
              • Transformer-based models: self-attention, long-range dependencies, global context.

              We choose the graph-based models because they effectively model the human body's kinematic and
              structural relationships, leading to more interpretable and computationally efficient pose estimation.
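              To make the graph view of the body concrete, the following is a minimal sketch, not the GLA-GCN implementation. It assumes a hypothetical 5-joint toy skeleton (the joint names and bone list are illustrative, not the Human3.6M layout) and shows how joints become nodes, bones become edges, and one neighbourhood-aggregation step mixes each joint's features with those of its kinematic neighbours. A real graph convolution layer would additionally apply a learned weight matrix and a non-linearity; plain Python lists are used here only to keep the sketch dependency-free.

```python
import math

# Hypothetical toy skeleton for illustration only (not the Human3.6M joint set).
JOINTS = ["pelvis", "spine", "head", "left_hand", "right_hand"]
BONES = [(0, 1), (1, 2), (1, 3), (1, 4)]  # undirected edges between joint indices


def normalized_adjacency(num_joints, bones):
    """Return D^{-1/2} (A + I) D^{-1/2} as a nested list."""
    # Start from the identity so every joint keeps its own feature (self-loop).
    a = [[1.0 if i == j else 0.0 for j in range(num_joints)]
         for i in range(num_joints)]
    for i, j in bones:
        a[i][j] = 1.0
        a[j][i] = 1.0
    deg = [sum(row) for row in a]
    return [[a[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(num_joints)]
            for i in range(num_joints)]


def propagate(adj, features):
    """One aggregation step: each joint receives a weighted sum of its own
    feature and the features of the joints it is connected to."""
    num_joints = len(features)
    num_channels = len(features[0])
    return [[sum(adj[i][k] * features[k][c] for k in range(num_joints))
             for c in range(num_channels)]
            for i in range(num_joints)]


ADJ = normalized_adjacency(len(JOINTS), BONES)

# Made-up 2D key points used as per-joint input features.
POSE_2D = [[0.0, 0.0], [0.0, 1.0], [0.0, 2.0], [-1.0, 1.0], [1.0, 1.0]]
AGGREGATED = propagate(ADJ, POSE_2D)
```

              Because the adjacency matrix is sparse and fixed by the skeleton, each joint only exchanges information with the joints it is physically connected to, which is one reason graph-based models can be cheaper than full self-attention over all joints.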

              Objective: Develop a graph-based model capable of detecting and refining human key points with a
              good trade-off between accuracy and computational cost.

                                                     Methodology

                                            Figure 1. Baseline architecture (GLA-GCN [1])

              Metrics:
              • MPJPE↓: Average Euclidean distance between the predicted joints and the ground truth (GT).
              • PCK↑: Percentage of correctly predicted key points.

              Datasets:
              For the 1st, 2nd and 3rd stages, the experiments are conducted using the Human3.6M and
              MPI-INF-3DHP datasets.
              For the 4th stage, the experiments will use the AIT-Soccer-FIFA-Skeletal dataset.

              Expected contributions:
              Stage 1: Implement global and local attention for the key points.
              Stage 2: Implement human tracking.
              Stage 3: Search for the best trade-off configuration between computation cost and accuracy.
              Stage 4: Implement the algorithm for soccer games analysis.

              Figure 2. Global interaction of a key point [2].
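              The two metrics listed above, MPJPE and PCK, can be sketched as follows. This is a minimal illustration under stated assumptions, not the evaluation code used in the experiments: the function names, the sample coordinates, and the distance threshold passed to `pck` are all hypothetical, and dataset-specific steps such as root alignment are omitted.

```python
import math


def mpjpe(pred, gt):
    """Mean per-joint Euclidean distance between predicted and
    ground-truth joints (lower is better)."""
    dists = [math.dist(p, g) for p, g in zip(pred, gt)]
    return sum(dists) / len(dists)


def pck(pred, gt, threshold):
    """Percentage of joints whose prediction lies within `threshold`
    of the ground truth (higher is better)."""
    correct = sum(1 for p, g in zip(pred, gt) if math.dist(p, g) <= threshold)
    return 100.0 * correct / len(pred)


# Made-up 3-joint example; the coordinates carry no real meaning.
GT = [(0.0, 0.0, 0.0), (100.0, 0.0, 0.0), (0.0, 200.0, 0.0)]
PRED = [(0.0, 0.0, 30.0), (100.0, 40.0, 0.0), (0.0, 200.0, 200.0)]

ERROR = mpjpe(PRED, GT)        # per-joint errors are 30, 40 and 200
SCORE = pck(PRED, GT, 150.0)   # threshold value is an assumption
```

              In this toy example the third joint is far off, so it raises the MPJPE noticeably while lowering the PCK by exactly one joint out of three. This is why the two metrics are usually reported together: MPJPE is sensitive to the size of each error, whereas PCK only counts whether a joint falls inside the threshold.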

             References
             [1] B. X. B. Yu, Z. Zhang, et al., "GLA-GCN: Global-local adaptive graph convolutional network for 3D human pose estimation
             from monocular video," in Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2023.
             [2] T. Wang, H. Liu, et al., "Interweaved graph and attention network for 3D human pose estimation," in IEEE
             International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023.