Hi! I'm trying to compare different retrieval strategies (MODES) using Arize and LlamaIndex. Here's a snippet from my code:
for mode in MODES:
    query_engine = vector_store_index.as_query_engine(similarity_top_k=5, vector_store_query_mode=mode)
    px.close_app()
    px.launch_app()
    for query in queries[0:2]:
        query_engine.query(query)
    retrieved_documents = get_retrieved_documents(px.active_session())
I'm relaunching Phoenix for each mode, but the length of retrieved_documents keeps increasing incrementally despite the reset, which leads me to believe that it's not actually relaunching. Am I doing something wrong?
It appears that the px.close_app() method is not fully clearing the session data before relaunching Phoenix. This could be causing the incremental increase in the length of retrieved_documents. To ensure that all stored data is deleted, you should use the delete_data=True parameter with the close_app method.

Here's the corrected code snippet:

for mode in MODES:
    query_engine = vector_store_index.as_query_engine(similarity_top_k=5, vector_store_query_mode=mode)
    px.close_app(delete_data=True)
    px.launch_app()
    for query in queries[0:2]:
        query_engine.query(query)
    retrieved_documents = get_retrieved_documents(px.active_session())

Would you like more detail?

Sources:
- https://github.com/Arize-ai/phoenix/issues/2218
- https://github.com/Arize-ai/phoenix/issues/2047
💡 Hint: Mention RunLLM in the thread for followups.
To switch projects using the using_project context manager, you can wrap your code for each mode within the context manager. This will ensure that traces for each mode are sent to a separate project. Here’s how you can modify your code:

from phoenix.trace import using_project

for mode in MODES:
    project_name = f"project_{mode}"
    with using_project(project_name):
        query_engine = vector_store_index.as_query_engine(similarity_top_k=5, vector_store_query_mode=mode)
        for query in queries[0:2]:
            query_engine.query(query)
        retrieved_documents = get_retrieved_documents(px.active_session())
        print(f"Retrieved documents for {mode}: {retrieved_documents}")

This code will dynamically switch the project for each mode, ensuring that traces are separated by project.

Would you like more details on any specific part?

Sources:
- using_project context manager
- Phoenix Tracing Documentation
RunLLM That doesn't seem to work. That still gives me the incremental error.
To get retrieved documents from a specific project using px.Client, you need to specify the project_name parameter in the get_retrieved_documents function. Here’s how you can modify your code:

from phoenix.trace import using_project
import phoenix as px

client = px.Client()

for mode in MODES:
    project_name = f"project_{mode}"
    with using_project(project_name):
        query_engine = vector_store_index.as_query_engine(similarity_top_k=5, vector_store_query_mode=mode)
        for query in queries[0:2]:
            query_engine.query(query)
        retrieved_documents = get_retrieved_documents(client, project_name=project_name)
        print(f"Retrieved documents for {mode} in project {project_name}: {retrieved_documents}")

This ensures that the retrieved documents are fetched from the specified project.

Would you like more details on any specific part?

Sources:
- get_retrieved_documents function
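For intuition on why scoping the read by project matters, here is a toy model in plain Python. It is only a sketch: toy_using_project, record, and documents are hypothetical names made up for this illustration, and nothing here reflects how Phoenix actually stores or queries traces. It shows that a read across everything keeps growing as modes are added, while a read filtered by project name stays constant.

```python
from collections import defaultdict
from contextlib import contextmanager
from contextvars import ContextVar
from typing import List, Optional

# Hypothetical stand-ins; none of these names come from Phoenix.
_current_project = ContextVar("project", default="default")
_store = defaultdict(list)  # project name -> documents recorded under it


@contextmanager
def toy_using_project(name: str):
    """Route records made inside the block to the project called `name`."""
    token = _current_project.set(name)
    try:
        yield
    finally:
        _current_project.reset(token)


def record(document: str) -> None:
    """Store a document under whichever project is currently active."""
    _store[_current_project.get()].append(document)


def documents(project_name: Optional[str] = None) -> List[str]:
    """Return one project's documents, or every project's when no name is given."""
    if project_name is not None:
        return list(_store[project_name])
    return [doc for docs in _store.values() for doc in docs]


unscoped_counts, scoped_counts = [], []
for mode in ["mode_a", "mode_b", "mode_c"]:  # placeholder mode names
    project = f"project_{mode}"
    with toy_using_project(project):
        for query in ["q1", "q2"]:
            record(f"{mode}:{query}")
    unscoped_counts.append(len(documents()))       # read across all projects
    scoped_counts.append(len(documents(project)))  # read filtered by project

print(unscoped_counts)  # [2, 4, 6]
print(scoped_counts)    # [2, 2, 2]
```

In this toy, writing to separate projects is not enough by itself; the read also has to be filtered by project, which is the role project_name plays in the snippet above.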
Ahh but I'm not using the hosted version of Arize so don't really have a client defined.
OH!?
At least with close/launch app, the incremental number was coming from close_app not clearing the session data.
Yeah you still have a local database file that's not closed
can you try px.delete_all() after close_app?
The px.delete_all() function in Phoenix is used to delete all persisted data under the PHOENIX_WORKING_DIR, effectively resetting your session for the next launch. This includes traces, evaluations, and any other data stored in the working directory.

Here’s how you can use it in your code:

for mode in MODES:
    query_engine = vector_store_index.as_query_engine(similarity_top_k=5, vector_store_query_mode=mode)
    px.close_app(delete_data=True)
    px.delete_all()
    px.launch_app()
    for query in queries[0:2]:
        query_engine.query(query)
    retrieved_documents = get_retrieved_documents(px.active_session())
    print(f"Retrieved documents for {mode}: {retrieved_documents}")

This ensures that all session data is cleared before relaunching Phoenix.

Would you like more details on any specific part?

Sources:
- Phoenix Session API
- Phoenix Session Code
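To make the "local database file" point concrete, here is a small self-contained simulation in plain Python. It is a sketch under loud assumptions: the temporary directory, traces.json, and the launch, close, and delete_all helpers are all hypothetical and do not mirror Phoenix's real storage layout or API. It only shows the general pattern: if a session reloads what the previous one left on disk, counts accumulate unless the persisted data is deleted before relaunching.

```python
import json
import shutil
import tempfile
from pathlib import Path

# Hypothetical stand-in for a working directory; not Phoenix's real layout.
working_dir = Path(tempfile.mkdtemp())
db_file = working_dir / "traces.json"


def launch() -> list:
    """Start a 'session' by loading whatever an earlier session left on disk."""
    return json.loads(db_file.read_text()) if db_file.exists() else []


def close(traces: list) -> None:
    """End a 'session', leaving its traces on disk."""
    working_dir.mkdir(parents=True, exist_ok=True)
    db_file.write_text(json.dumps(traces))


def delete_all() -> None:
    """Remove everything persisted under the working directory."""
    shutil.rmtree(working_dir, ignore_errors=True)


def run_modes(reset: bool) -> list:
    """Record two traces per mode and report how many each session ends up with."""
    delete_all()  # start every experiment from an empty directory
    counts = []
    for mode in ["mode_a", "mode_b", "mode_c"]:  # placeholder mode names
        if reset:
            delete_all()
        traces = launch()
        traces += [f"{mode}:q1", f"{mode}:q2"]
        counts.append(len(traces))
        close(traces)
    return counts


counts_without_reset = run_modes(reset=False)
counts_with_reset = run_modes(reset=True)
delete_all()  # clean up the temporary directory

print(counts_without_reset)  # [2, 4, 6]
print(counts_with_reset)     # [2, 2, 2]
```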
Hah setting delete_data to True had the same effect. I’m going to try the project method. Seems cleaner and what one would do in a prod-like setup.
Thanks for the suggestions!
