UEC Int’l Mini-Conference No.53 33
TABLE I: Ablation Study: Evaluation of Different Pose Extraction Methods.
Method               Pose Extractor  Input Poses  Accuracy  Precision  Recall  F1-score
Baseline             OpenPose        4            60.2      59.0       60.0    55.0
Ours (Pitcher Pose)  OpenPose        1            68.2      68.0       68.0    66.0
Cropped Pitcher      OpenPose        1            63.8      65.0       64.0    64.0
Cropped Pitcher      MediaPipe       1            63.1      66.0       63.0    60.0
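
The four metrics reported in Table I can all be derived from a confusion matrix. The following is a minimal sketch of that computation, not the evaluation code used in this study. It assumes macro averaging over classes, which is an assumption since the averaging scheme is not stated here. The function name `macro_metrics` and the toy three-class matrix are hypothetical and are not data from our experiments.

```python
def macro_metrics(cm):
    """Return accuracy and macro-averaged precision, recall and F1, in percent.

    cm[i][j] is the number of samples whose true class is i and whose
    predicted class is j (rows = ground truth, columns = predictions).
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[k][k] for k in range(n))

    precisions, recalls, f1s = [], [], []
    for k in range(n):
        tp = cm[k][k]
        predicted_k = sum(cm[i][k] for i in range(n))  # column sum
        actual_k = sum(cm[k])                          # row sum
        p = tp / predicted_k if predicted_k else 0.0
        r = tp / actual_k if actual_k else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)

    return {
        "accuracy": 100.0 * correct / total,
        "precision": 100.0 * sum(precisions) / n,
        "recall": 100.0 * sum(recalls) / n,
        "f1": 100.0 * sum(f1s) / n,
    }


# Hypothetical, deliberately imbalanced toy matrix (NOT results from this paper).
toy_cm = [
    [9, 1, 0],
    [2, 2, 0],
    [1, 0, 1],
]
metrics = macro_metrics(toy_cm)
for name, value in metrics.items():
    print(f"{name}: {value:.1f}")
```

Under macro averaging, recall is the mean of the per-class accuracies, so on imbalanced data it can fall well below overall accuracy, as it does for the toy matrix above.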
D. Classification Tasks Results
To comprehensively evaluate our method, we performed experiments on three classification tasks: six-class pitch type classification, fastball vs. non-fastball, and fast (fastball, sinker, slider) vs. slow (changeup, curveball, knuckle-curve) pitch classification. These tasks were selected to explore the robustness of our model in differentiating pitch types, particularly in the context of automated pitch recognition systems.

In the six-class classification task shown in Figure 2a, our model achieved an overall accuracy of 68.2%. The confusion matrix indicates that fastballs had the highest correct classification rate, at 84.7%. However, there was significant confusion between sliders and fastballs, with 52.1% of sliders being misclassified as fastballs. This likely stems from the similarity in the initial body mechanics of these pitches, highlighting the challenge of distinguishing pitches whose deliveries differ only subtly.

For the binary classification of fastball vs. non-fastball, the confusion matrix in Figure 2b shows that fastballs were correctly identified 85.0% of the time, while non-fastballs had a slightly lower accuracy of 79.9%. This indicates that while the model is proficient at recognizing fastballs, there is some overlap with non-fastball pitches, potentially due to similarities in initial movements that can be deceptive.

In the fast vs. slow pitch classification based on the grouping proposed by Li et al. [11], the confusion matrix in Figure 2c reveals that the model correctly identified 89.8% of fast pitches but had a lower accuracy of 71.8% for slow pitches, with 28.2% of slow pitches misclassified as fast. The rationale behind this grouping is that pitch speed significantly influences the pitcher’s body movements; fast pitches typically involve more explosive and direct motions, while slow pitches require more nuanced mechanics. The confusion likely arises because some slow pitches, such as the changeup, can mimic the initial explosive movement of fast pitches before decelerating.

E. Comparison with State of the Art
We compared our results with previous state-of-the-art methods introduced by Piergiovanni and Ryoo [7] on the MLB-YouTube dataset, which are considered benchmarks for this dataset. The first approach uses an I3D [15] model that processes entire video clips as input to capture spatio-temporal features through 3D convolutions. The second approach utilizes an InceptionV3 [16] model trained on pose heatmaps extracted using OpenPose, focusing on body pose information while still leveraging dense video data. In contrast, our method employs skeleton-based pose estimation to specifically isolate the pitcher’s movements, utilizing an ST-GCN model to classify pitches based on keypoint dynamics. As shown in Table II, our approach substantially improves performance, achieving an average class accuracy of 58.5%, compared with 34.5% (I3D) and 36.4% (InceptionV3) for the previous work. The accuracy obtained by our method can be attributed to focusing exclusively on the pitcher’s pose, thereby enabling the model to better capture the subtle body mechanics associated with different pitch types.

TABLE II: Comparison with the State of the Art on the MLB-YouTube dataset.
Method           Avg Class Accuracy (%)
Random           17.0
I3D [7]          34.5
InceptionV3 [7]  36.4
Baseline         30.0
Ours             58.5

IV. DISCUSSION
This study introduces a skeleton-based approach to baseball pitch classification, utilizing pose estimation and ST-GCN to analyze pitcher body movements. By focusing on skeletal data, our method avoids the dependency on ball trajectory tracking prevalent in traditional systems. This design allows