Philipp Dufter, Martin Schmitt, Hinrich Schütze: Increasing Learning Efficiency of Self-Attention Networks through Direct Position Interactions, Learnable Temperature, and Convoluted Attention. COLING 2020: 3630-3636