Retrieval Evals and Claude:
If you look at our Phoenix retrieval evals with Claude, we weren't shy about saying there was a big gap in performance versus GPT-4. I would be very hesitant to run retrieval evals with Claude.
Greg Kamradt has released more results: fairly big issues with retrieval.
https://x.com/SteveMoraco/status/1727370446788530236?s=20