Ryan Park, Rafael Rafailov, Stefano Ermon, Chelsea Finn: Disentangling Length from Quality in Direct Preference Optimization. ACL (Findings) 2024: 4998-5017