Publications

You can also find my articles on my Google Scholar profile.

Judging the Judges: Evaluating Alignment and Vulnerabilities in LLMs-as-Judges

Published in arXiv, 2024

A comprehensive study of the LLM-as-a-judge paradigm in a controlled setup, revealing new findings about its strengths and weaknesses.

Published in Reinforcement Learning Conference, 2024

We propose a novel benchmark Markov decision process (MDP) for sepsis treatment in the ICU, built using medical data from real patients.