I successfully managed to implement Phoenix as per the demo. Two questions: a) Can we change the embedding model to an open-source embedding model with different dimensions? b) For the LLM part, is there a way to swap in an open-source LLM rather than using GPT-3.5 or GPT-4?
Hey Steve, thanks for checking it out! I'm not sure which demo you're referring to. In general, Phoenix isn't tied to any particular LLM or embedding model, so you should be fine to swap things out.
Also worth noting: if you're generating the embeddings in a Python notebook/data frame outside of a framework, we do have simple convenience embedding generators in the Arize library that you can leverage with Phoenix. They work with almost any Hugging Face model. https://docs.arize.com/arize/embeddings/7.-troubleshoot-embedding-data/let-arize-generate-your-embeddings
Thanks Mikyo, Xander S., Jason for your quick response. The code I was following was: https://colab.research.google.com/github/Arize-ai/phoenix/blob/main/tutorials/llama_index_search_and_retrieval_tutorial.ipynb . I was able to successfully use an open-source LLM for the synthesis. For the embedding model, I think the issue is just that the pre-indexed sample data has vectors of a different length than, say, an open-source model from Hugging Face would produce. The main thing, I guess, is that the "classify_relevance(query_text, document_text, evals_model_name)" function only accepts GPT-3.5 or GPT-4. Is there a way to change it to use, say, a LlamaIndex, LangChain, or Hugging Face framework model?
Hey Steve H., gotcha. Yes, we are actively working on an LLM eval framework, and it definitely focuses on OpenAI at the moment, since it's yielding the best results and because we want to thoroughly test the metrics before rolling out more models. But we certainly will roll out more, and we could certainly support the framework model classes for ease of use. Feel free to drop us a GitHub issue and we will alert you when we push out the enhancement! Would love to understand which models you intend to use for evaluation.
Steve H. Thanks for the feedback. Would love to know which open-source models you're planning on using for evaluations. As you noted, the vector store used in the notebook was built using OpenAI's text-embedding-ada-002 embeddings. You will need to rebuild the index using your open-source embedding model.
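To make the dimension issue concrete, here's a minimal self-contained sketch (not from the notebook; the vector lengths are the only real-world detail, and the function is a hypothetical illustration): text-embedding-ada-002 produces 1536-dimensional vectors, so a query embedded with a smaller open-source model (e.g. a 384-dimensional MiniLM-style model) simply cannot be compared against that index, which is why the index has to be rebuilt with the same model used at query time.

```python
import math

def cosine_similarity(a, b):
    # Similarity search only makes sense between vectors of the same dimension.
    if len(a) != len(b):
        raise ValueError(f"dimension mismatch: {len(a)} vs {len(b)}")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Index built with OpenAI text-embedding-ada-002: 1536-dimensional vectors.
index_vector = [0.1] * 1536
# Query embedded with a hypothetical 384-dimensional open-source model.
query_vector = [0.1] * 384

try:
    cosine_similarity(index_vector, query_vector)
except ValueError as err:
    # Mixing models fails outright; rebuild the index with the query-time model.
    print(err)
```

The same logic applies regardless of vector store: embeddings in the index and embeddings at query time must come from the same model.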
