We put a lot of work into the following research from Aparna. These are some of the first Evals for Gemini Pro. The results on the Needle in a Haystack fact retrieval test were not great.
One caveat: other Evals we have run have been pretty good, so we are not sure why fact retrieval is so poor.
https://x.com/aparnadhinak/status/1744771295940669689?s=20