Kaushik Roy, Qi Zhang, Manas Gaur, Amit P. Sheth: Knowledge Infused Policy Gradients with Upper Confidence Bound for Relational Bandits. ECML/PKDD (1) 2021: 35-50