Noah Lee, Jiwoo Hong, James Thorne: Evaluating the Consistency of LLM Evaluators. COLING 2025: 10650-10659