Chendi Li, Yufan Xu, Sina Mahdipour Saravani, Ponnuswamy Sadayappan: Accelerated Auto-Tuning of GPU Kernels for Tensor Computations. ICS 2024: 549-561