


Tao Lei 0001
Person information
- affiliation: Apple, AI/ML, CA, USA
- affiliation (2021 - 2023): Google, Mountain View, CA, USA
- affiliation (2017 - 2021): ASAPP, Inc., New York, USA
- affiliation (PhD 2017): Massachusetts Institute of Technology, Cambridge, USA
Other persons with the same name
- Tao Lei: disambiguation page
- Tao Lei 0002: Northwestern Polytechnical University, Xi'an, China
- Tao Lei 0003: Shaanxi University of Science and Technology, School of Electronic Information and Artificial Intelligence, Xinyang, China (and 4 more)
- Tao Lei 0004: Chinese Academy of Sciences, Institute of Optics and Electronics, Chengdu, China (and 1 more)
- Tao Lei 0005: Wuhan University, School of Cyber Science and Engineering, China
- Tao Lei 0006: Beijing University of Posts and Telecommunications, School of Information and Communication Engineering, China
2020 – today
- 2026
[i30]Bailin Wang, Dan Friedman, Tao Lei, Chong Wang:
SPLA: Block Sparse Plus Linear Attention for Long Context Modeling. CoRR abs/2601.22379 (2026)
[i29]Chong Wang, Nan Du, Tom Gunter, Tao Lei, Kulin Seth, Senyu Tong, Jianyu Wang, Guoli Yin, Xiyou Zhou, Kelvin Zou, Ruoming Pang:
Parallel Track Transformers: Enabling Fast GPU Inference with Reduced Synchronization. CoRR abs/2602.07306 (2026)
- 2025
[c32]Haotian Sun, Tao Lei, Bowen Zhang, Yanghao Li, Haoshuo Huang, Ruoming Pang, Bo Dai, Nan Du:
EC-DIT: Scaling Diffusion Transformers with Adaptive Expert-Choice Routing. ICLR 2025
[c31]Bairu Hou, Qibin Chen, Jianyu Wang, Guoli Yin, Chong Wang, Nan Du, Ruoming Pang, Shiyu Chang, Tao Lei:
Instruction-Following Pruning for Large Language Models. ICML 2025
[i28]Bairu Hou, Qibin Chen, Jianyu Wang, Guoli Yin, Chong Wang, Nan Du, Ruoming Pang, Shiyu Chang, Tao Lei:
Instruction-Following Pruning for Large Language Models. CoRR abs/2501.02086 (2025)
[i27]Yixiao Li, Xianzhi Du, Ajay Jaiswal, Tao Lei, Tuo Zhao, Chong Wang, Jianyu Wang:
IDEA Prune: An Integrated Enlarge-and-Prune Pipeline in Generative Language Model Pretraining. CoRR abs/2503.05920 (2025)
[i26]Mark Lee, Tom Gunter, Chang Lan, John Peebles, Hanzhi Zhou, Kelvin Zou, Sneha Bangalore, Chung-Cheng Chiu, Nan Du, Xianzhi Du, Philipp Dufter, Ruixuan Hou, Haoshuo Huang, Dongseong Hwang, Xiang Kong, Jinhao Lei, Tao Lei, Meng Li, Li Li, Jiarui Lu, Zhiyun Lu, Yiping Ma, David Qiu, Vivek Rathod, Senyu Tong, Zhucheng Tu, Jianyu Wang, Yongqiang Wang, Zirui Wang, Floris Weers, Sam Wiseman, Guoli Yin, Bowen Zhang, Xiyou Zhou, Danyang Zhuo, Cheng Leong, Ruoming Pang:
AXLearn: Modular Large Model Training on Heterogeneous Infrastructure. CoRR abs/2507.05411 (2025)
- 2024
[c30]Brandon McKinzie, Zhe Gan, Jean-Philippe Fauconnier, Sam Dodge, Bowen Zhang, Philipp Dufter, Dhruti Shah, Xianzhi Du, Futang Peng, Anton Belyi, Haotian Zhang, Karanjeet Singh, Doug Kang, Hongyu Hè, Max Schwarzer, Tom Gunter, Xiang Kong, Aonan Zhang, Jianyu Wang, Chong Wang, Nan Du, Tao Lei, Sam Wiseman, Mark Lee, Zirui Wang, Ruoming Pang, Peter Grasch, Alexander Toshev, Yinfei Yang:
MM1: Methods, Analysis and Insights from Multimodal LLM Pre-training. ECCV (29) 2024: 304-323
[i25]Brandon McKinzie, Zhe Gan, Jean-Philippe Fauconnier, Sam Dodge, Bowen Zhang, Philipp Dufter, Dhruti Shah, Xianzhi Du, Futang Peng, Floris Weers, Anton Belyi, Haotian Zhang, Karanjeet Singh, Doug Kang, Ankur Jain, Hongyu Hè, Max Schwarzer, Tom Gunter, Xiang Kong, Aonan Zhang, Jianyu Wang, Chong Wang, Nan Du, Tao Lei, Sam Wiseman, Guoli Yin, Mark Lee, Zirui Wang, Ruoming Pang, Peter Grasch, Alexander Toshev, Yinfei Yang:
MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training. CoRR abs/2403.09611 (2024)
[i24]Haotian Sun, Tao Lei, Bowen Zhang, Yanghao Li, Haoshuo Huang, Ruoming Pang, Bo Dai, Nan Du:
EC-DIT: Scaling Diffusion Transformers with Adaptive Expert-Choice Routing. CoRR abs/2410.02098 (2024)
- 2023
[c29]Joshua Ainslie, Tao Lei, Michiel de Jong, Santiago Ontañón, Siddhartha Brahma, Yury Zemlyanskiy, David C. Uthus, Mandy Guo, James Lee-Thorp, Yi Tay, Yun-Hsuan Sung, Sumit Sanghai:
CoLT5: Faster Long-Range Transformers with Conditional Computation. EMNLP 2023: 5085-5100
[c28]Jinhyuk Lee, Zhuyun Dai, Sai Meher Karthik Duddu, Tao Lei, Iftekhar Naim, Ming-Wei Chang, Vincent Y. Zhao:
Rethinking the Role of Token Retrieval in Multi-Vector Retrieval. NeurIPS 2023
[c27]Tao Lei, Junwen Bai, Siddhartha Brahma, Joshua Ainslie, Kenton Lee, Yanqi Zhou, Nan Du, Vincent Y. Zhao, Yuexin Wu, Bo Li, Yu Zhang, Ming-Wei Chang:
Conditional Adapters: Parameter-efficient Transfer Learning with Fast Inference. NeurIPS 2023
[i23]Joshua Ainslie, Tao Lei, Michiel de Jong, Santiago Ontañón, Siddhartha Brahma, Yury Zemlyanskiy, David C. Uthus, Mandy Guo, James Lee-Thorp, Yi Tay, Yun-Hsuan Sung, Sumit Sanghai:
CoLT5: Faster Long-Range Transformers with Conditional Computation. CoRR abs/2303.09752 (2023)
[i22]Jinhyuk Lee, Zhuyun Dai, Sai Meher Karthik Duddu, Tao Lei, Iftekhar Naim, Ming-Wei Chang, Vincent Y. Zhao:
Rethinking the Role of Token Retrieval in Multi-Vector Retrieval. CoRR abs/2304.01982 (2023)
[i21]Tao Lei, Junwen Bai, Siddhartha Brahma, Joshua Ainslie, Kenton Lee, Yanqi Zhou, Nan Du, Vincent Y. Zhao, Yuexin Wu, Bo Li, Yu Zhang, Ming-Wei Chang:
Conditional Adapters: Parameter-efficient Transfer Learning with Fast Inference. CoRR abs/2304.04947 (2023)
[i20]Dewen Zeng, Nan Du, Tao Wang, Yuanzhong Xu, Tao Lei, Zhifeng Chen, Claire Cui:
Learning to Skip for Language Modeling. CoRR abs/2311.15436 (2023)
- 2022
[c26]Jing Pan, Tao Lei, Kwangyoun Kim, Kyu Jeong Han, Shinji Watanabe:
SRU++: Pioneering Fast Recurrence with Attention for Speech Recognition. ICASSP 2022: 7872-7876
[c25]Yanqi Zhou, Tao Lei, Hanxiao Liu, Nan Du, Yanping Huang, Vincent Y. Zhao, Andrew M. Dai, Zhifeng Chen, Quoc V. Le, James Laudon:
Mixture-of-Experts with Expert Choice Routing. NeurIPS 2022
[i19]Yanqi Zhou, Tao Lei, Hanxiao Liu, Nan Du, Yanping Huang, Vincent Y. Zhao, Andrew M. Dai, Zhifeng Chen, Quoc Le, James Laudon:
Mixture-of-Experts with Expert Choice Routing. CoRR abs/2202.09368 (2022)
[i18]Yujie Qian, Jinhyuk Lee, Sai Meher Karthik Duddu, Zhuyun Dai, Siddhartha Brahma, Iftekhar Naim, Tao Lei, Vincent Y. Zhao:
Multi-Vector Retrieval as Sparse Alignment. CoRR abs/2211.01267 (2022)
- 2021
[c24]Darsh J. Shah, Lili Yu, Tao Lei, Regina Barzilay:
Nutri-bullets: Summarizing Health Studies by Composing Segments. AAAI 2021: 13780-13788
[c23]Darsh J. Shah, Lili Yu, Tao Lei, Regina Barzilay:
Nutri-bullets Hybrid: Consensual Multi-document Summarization. NAACL-HLT 2021: 5213-5222
[i17]Darsh J. Shah, Lili Yu, Tao Lei, Regina Barzilay:
Nutri-bullets: Summarizing Health Studies by Composing Segments. CoRR abs/2103.11921 (2021)
[i16]Darsh J. Shah, Lili Yu, Tao Lei, Regina Barzilay:
Nutribullets Hybrid: Multi-document Health Summarization. CoRR abs/2104.03465 (2021)
[i15]Jing Pan, Tao Lei, Kwangyoun Kim, Kyu Jeong Han, Shinji Watanabe:
SRU++: Pioneering Fast Recurrence with Attention for Speech Recognition. CoRR abs/2110.05571 (2021)
- 2020
[c22]Lili Yu, Howard Chen, Sida I. Wang, Tao Lei, Yoav Artzi:
Interactive Classification by Asking Informative Questions. ACL 2020: 2664-2680
[c21]Kyle Swanson, Lili Yu, Tao Lei:
Rationalizing Text Matching: Learning Sparse Alignments via Optimal Transport. ACL 2020: 5609-5626
[c20]Alexander Lin, Jeremy Wohlwend, Howard Chen, Tao Lei:
Autoregressive Knowledge Distillation through Imitation Learning. EMNLP (1) 2020: 6121-6133
[c19]Ziheng Wang, Jeremy Wohlwend, Tao Lei:
Structured Pruning of Large Language Models. EMNLP (1) 2020: 6151-6162
[c18]Jing Pan, Joshua Shapiro, Jeremy Wohlwend, Kyu Jeong Han, Tao Lei, Tao Ma:
ASAPP-ASR: Multistream CNN and Self-Attentive SRU for SOTA Speech Recognition. INTERSPEECH 2020: 16-20
[i14]Jing Pan, Joshua Shapiro, Jeremy Wohlwend, Kyu Jeong Han, Tao Lei, Tao Ma:
ASAPP-ASR: Multistream CNN and Self-Attentive SRU for SOTA Speech Recognition. CoRR abs/2005.10469 (2020)
[i13]Kyle Swanson, Lili Yu, Tao Lei:
Rationalizing Text Matching: Learning Sparse Alignments via Optimal Transport. CoRR abs/2005.13111 (2020)
[i12]Alexander Lin, Jeremy Wohlwend, Howard Chen, Tao Lei:
Autoregressive Knowledge Distillation through Imitation Learning. CoRR abs/2009.07253 (2020)
2010 – 2019
- 2019
[c17]Jeremy Wohlwend, Ethan R. Elenberg, Sam Altschul, Shawn Henry, Tao Lei:
Metric Learning for Dynamic Text Classification. DeepLo@EMNLP-IJCNLP 2019: 143-152
[i11]Kyle Swanson, Lili Yu, Christopher Fox, Jeremy Wohlwend, Tao Lei:
Building a Production Model for Retrieval-Based Chatbots. CoRR abs/1906.03209 (2019)
[i10]Ziheng Wang, Jeremy Wohlwend, Tao Lei:
Structured Pruning of Large Language Models. CoRR abs/1910.04732 (2019)
[i9]Jeremy Wohlwend, Ethan R. Elenberg, Samuel Altschul, Shawn Henry, Tao Lei:
Metric Learning for Dynamic Text Classification. CoRR abs/1911.01026 (2019)
[i8]Lili Yu, Howard Chen, Sida Wang, Yoav Artzi, Tao Lei:
Interactive Classification by Asking Informative Questions. CoRR abs/1911.03598 (2019)
- 2018
[c16]Darsh J. Shah, Tao Lei, Alessandro Moschitti, Salvatore Romeo, Preslav Nakov:
Adversarial Domain Adaptation for Duplicate Question Detection. EMNLP 2018: 1056-1063
[c15]Tao Lei, Yu Zhang, Sida I. Wang, Hui Dai, Yoav Artzi:
Simple Recurrent Units for Highly Parallelizable Recurrence. EMNLP 2018: 4470-4481
[i7]Darsh J. Shah, Tao Lei, Alessandro Moschitti, Salvatore Romeo, Preslav Nakov:
Adversarial Domain Adaptation for Duplicate Question Detection. CoRR abs/1809.02255 (2018)
- 2017
[b1]Tao Lei:
Interpretable neural models for natural language processing. Massachusetts Institute of Technology, Cambridge, USA, 2017
[c14]Tao Lei, Wengong Jin, Regina Barzilay, Tommi S. Jaakkola:
Deriving Neural Architectures from Sequence and Graph Kernels. ICML 2017: 2024-2033
[c13]Tianxiao Shen, Tao Lei, Regina Barzilay, Tommi S. Jaakkola:
Style Transfer from Non-Parallel Text by Cross-Alignment. NIPS 2017: 6830-6841
[i6]Tao Lei, Wengong Jin, Regina Barzilay, Tommi S. Jaakkola:
Deriving Neural Architectures from Sequence and Graph Kernels. CoRR abs/1705.09037 (2017)
[i5]Tianxiao Shen, Tao Lei, Regina Barzilay, Tommi S. Jaakkola:
Style Transfer from Non-Parallel Text by Cross-Alignment. CoRR abs/1705.09655 (2017)
[i4]Tao Lei, Yu Zhang, Yoav Artzi:
Training RNNs as Fast as CNNs. CoRR abs/1709.02755 (2017)
- 2016
[c12]Tao Lei, Regina Barzilay, Tommi S. Jaakkola:
Rationalizing Neural Predictions. EMNLP 2016: 107-117
[c11]Youyang Gu, Tao Lei, Regina Barzilay, Tommi S. Jaakkola:
Learning to refine text based recommendations. EMNLP 2016: 2103-2108
[c10]Tianxiao Shen, Tao Lei, Regina Barzilay:
Making Dependency Labeling Simple, Fast and Accurate. HLT-NAACL 2016: 1089-1094
[c9]Tao Lei, Hrishikesh Joshi, Regina Barzilay, Tommi S. Jaakkola, Kateryna Tymoshenko, Alessandro Moschitti, Lluís Màrquez:
Semi-supervised Question Retrieval with Gated Convolutions. HLT-NAACL 2016: 1279-1289
[c8]Mitra Mohtarami, Yonatan Belinkov, Wei-Ning Hsu, Yu Zhang, Tao Lei, Kfir Bar, Scott Cyphers, James R. Glass:
SLS at SemEval-2016 Task 3: Neural-based Approaches for Ranking in Community Question Answering. SemEval@NAACL-HLT 2016: 828-835
[i3]Tao Lei, Regina Barzilay, Tommi S. Jaakkola:
Rationalizing Neural Predictions. CoRR abs/1606.04155 (2016)
- 2015
[j2]Yonatan Belinkov, Tao Lei, Regina Barzilay, Amir Globerson:
Erratum: "Exploring Compositional Architectures and Word Vector Representations for Prepositional Phrase Attachment". Trans. Assoc. Comput. Linguistics 3: 101 (2015)
[c7]Tao Lei, Regina Barzilay, Tommi S. Jaakkola:
Molding CNNs for text: non-linear, non-consecutive convolutions. EMNLP 2015: 1565-1575
[c6]Tao Lei, Yuan Zhang, Lluís Màrquez i Villodre, Alessandro Moschitti, Regina Barzilay:
High-Order Low-Rank Tensors for Semantic Role Labeling. HLT-NAACL 2015: 1150-1160
[i2]Tao Lei, Regina Barzilay, Tommi S. Jaakkola:
Molding CNNs for text: non-linear, non-consecutive convolutions. CoRR abs/1508.04112 (2015)
[i1]Tao Lei, Hrishikesh Joshi, Regina Barzilay, Tommi S. Jaakkola, Kateryna Tymoshenko, Alessandro Moschitti, Lluís Màrquez i Villodre:
Denoising Bodies to Titles: Retrieving Similar Questions with Recurrent Convolutional Models. CoRR abs/1512.05726 (2015)
- 2014
[j1]Yonatan Belinkov, Tao Lei, Regina Barzilay, Amir Globerson:
Exploring Compositional Architectures and Word Vector Representations for Prepositional Phrase Attachment. Trans. Assoc. Comput. Linguistics 2: 561-572 (2014)
[c5]Yuan Zhang, Tao Lei, Regina Barzilay, Tommi S. Jaakkola, Amir Globerson:
Steps to Excellence: Simple Inference with Refined Scoring of Dependency Trees. ACL (1) 2014: 197-207
[c4]Tao Lei, Yu Xin, Yuan Zhang, Regina Barzilay, Tommi S. Jaakkola:
Low-Rank Tensors for Scoring Dependency Structures. ACL (1) 2014: 1381-1391
[c3]Yuan Zhang, Tao Lei, Regina Barzilay, Tommi S. Jaakkola:
Greed is Good if Randomized: New Inference for Dependency Parsing. EMNLP 2014: 1013-1024
- 2013
[c2]Tao Lei, Fan Long, Regina Barzilay, Martin C. Rinard:
From Natural Language Specifications to Program Input Parsers. ACL (1) 2013: 1294-1303
- 2012
[c1]S. R. K. Branavan, Nate Kushman, Tao Lei, Regina Barzilay:
Learning High-Level Planning from Text. ACL (1) 2012: 126-135
last updated on 2026-05-06 00:04 CEST by the dblp team
all metadata released as open data under CC0 1.0 license