Concerns Over Claude's Performance in Retrieval Evaluations

Jason

·Nov 22, 2023 08:34 PM

Retrieval Evals and Claude: If you look at our Phoenix retrieval evals with Claude, we weren't shy in saying there was a big gap in performance versus GPT-4. I would be very hesitant about running retrieval evals with Claude. More results were released by Greg Kamradt, and they point to fairly big issues with retrieval: https://x.com/SteveMoraco/status/1727370446788530236?s=20

2 comments

Xander S.
·
That's wild.
Xander S.
·
Huge delta.
