Licheng Yu, Zhe Lin, Xiaohui Shen, Jimei Yang, Xin Lu, Mohit Bansal, Tamara L. Berg: MAttNet: Modular Attention Network for Referring Expression Comprehension. CVPR 2018: 1307-1315