Dhruv N.

Commented on Overcoming Challenges with LLM Judges: Seeking Sol... · Posted in Discussions

Jane W., curious where you landed on this. At Google we’d lean more on encoder-style models rather than generative decoders, to get much more stable scores that we could tune with human ratings.