I am hitting the OpenAI API token limit while using the QAEvaluator. I see that there is a MapReducer type of evaluator that handles long contexts by breaking them into chunks and aggregating the results afterward. However, I could not find specific examples of how to use it. Can you point me to an example? If there isn't one, could you give me some pointers on how to perform the same evaluation as the QAEvaluator using the map-reduce technique? TIA! https://github.com/Arize-ai/phoenix/blob/e24d7212ace403f0e396de027a7cfb9bd4a14657/packages/phoenix-evals/src/phoenix/evals/evaluators.py#L270
It seems likely that the MapReducer can be used to handle long contexts by breaking them into smaller chunks, evaluating each chunk, and then aggregating the results. However, specific examples of using MapReducer with a QAEvaluator are not directly available in the data sources. The MapReducer is designed to evaluate data that is too large for a single context window (source). Would you like more detailed guidance on how to implement this approach?

Sources:
- https://github.com/Arize-ai/phoenix/blob/main/packages/phoenix-evals/src/phoenix/evals/evaluators.py
💡 Hint: Mention RunLLM in the thread for followups.
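The chunk-then-aggregate idea described above can be sketched without any framework code. This is a minimal illustration, not the MapReducer API itself: `evaluate_chunk` is a placeholder for whatever per-chunk LLM call you would make (e.g. a QA-correctness prompt over one chunk of the reference text), and `map_reduce_qa_eval` is a hypothetical helper name.

```python
# Minimal map-reduce evaluation sketch. `evaluate_chunk` is a stand-in for a
# per-chunk LLM scoring call; it is NOT a real phoenix-evals function.

def chunk_text(text: str, max_chars: int) -> list[str]:
    """Split the reference text into fixed-size character chunks."""
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]

def map_reduce_qa_eval(question, answer, context, evaluate_chunk, max_chars=100_000):
    # Map: score the answer against each chunk independently (0 or 1).
    chunk_scores = [
        evaluate_chunk(question, answer, chunk)
        for chunk in chunk_text(context, max_chars)
    ]
    # Reduce: treat the answer as correct if any chunk supports it.
    return max(chunk_scores)
```

The reduce step here is a simple `max`, but it could equally be a second LLM call that combines the per-chunk verdicts, which is closer to the chunk-evaluate-combine flow the answer above describes.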
Thanks for the reply! I set `max_tokens` to None per the documentation to remove the limit, but it didn't help. Here is the error I am getting. To be fair, I am indeed running the evaluation on a very long body of text.
2025-02-28 08:47:07,694 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 400 Bad Request"
Exception in worker on attempt 5: Error code: 400 - {'error': {'message': "Invalid 'messages[0].content': string too long. Expected a string with maximum length 1048576, but got a string with length 1081807 instead.", 'type': 'invalid_request_error', 'param': 'messages[0].content', 'code': 'string_above_max_length'}}
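The 400 error above is a hard per-message character cap on OpenAI's side (1,048,576 characters), so no `max_tokens` setting will avoid it; the text has to be split before it is sent. A minimal sketch of that pre-splitting, assuming nothing about phoenix-evals: the limit constant is taken from the error message itself, and the overlap size is an arbitrary illustrative choice so that facts spanning a chunk boundary are not lost.

```python
# Keep each request body under the limit reported in the 400 error above.
# Helper name and overlap value are illustrative, not phoenix-evals settings.

OPENAI_MAX_MESSAGE_CHARS = 1_048_576  # from the 'string_above_max_length' error

def split_with_overlap(text: str, limit: int = 1_000_000, overlap: int = 2_000) -> list[str]:
    """Split `text` into pieces of at most `limit` chars, with each piece
    repeating the last `overlap` chars of the previous one."""
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + limit])
        if start + limit >= len(text):
            break
        start += limit - overlap
    return chunks
```

Each resulting piece can then be evaluated separately and the per-chunk results aggregated, as in the map-reduce approach discussed above.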
