I successfully managed to implement Phoenix as per the demo. Two questions: a) Can we change the embedding model to an open-source embedding model with different dimensions? b) For the LLM part, is there a way to swap in an open-source LLM rather than using GPT-3.5 or GPT-4?
Hey Steve, thanks for checking it out! I'm not sure which demo you're referring to. In general, Phoenix isn't tied to any particular LLM or embedding model, so you should be fine to swap things out.
Also worth noting: if you're generating the embeddings in a Python notebook/data frame outside of a framework, we do have simple convenience embedding generators in the Arize library that you can leverage with Phoenix. They work with almost any Hugging Face model. https://docs.arize.com/arize/embeddings/7.-troubleshoot-embedding-data/let-arize-generate-your-embeddings
Thanks Mikyo, Xander S., Jason for your quick response. The code I was following was: https://colab.research.google.com/github/Arize-ai/phoenix/blob/main/tutorials/llama_index_search_and_retrieval_tutorial.ipynb . I was able to successfully use an open-source LLM for the synthesis. For the embedding model, I think the issue is just that the pre-indexed sample data has vectors of a different length than, say, an open-source model from Hugging Face would produce. The main thing, I guess, is that the "classify_relevance(query_text, document_text, evals_model_name)" function only accepts GPT-3.5 or GPT-4. Is there a way to change it to use, say, a LlamaIndex, LangChain, or Hugging Face framework model?
Hey Steve H., gotcha. Yes, we are actively working on an LLM eval framework, and it definitely focuses on OpenAI at the moment, since it's yielding the best results and because we want to thoroughly test the metrics before rolling out more models. But we certainly will roll out more, and we could certainly support the framework model classes for ease of use. Feel free to drop us a GitHub issue and we will alert you when we push out the enhancement! Would love to understand which models you intend to use for evaluation.
Steve H. Thanks for the feedback. Would love to know which open-source models you're planning on using for evaluations. As you noted, the vector store used in the notebook was built using OpenAI's text-embedding-ada-002 embeddings. You will need to rebuild the index using your open-source embedding model.
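To make the dimension issue concrete, here's a minimal self-contained sketch (not from the notebook; the vector lengths are the only real-world detail, and the function is a hypothetical illustration): text-embedding-ada-002 produces 1536-dimensional vectors, so a query embedded with a smaller open-source model (e.g. a 384-dimensional MiniLM-style model) simply cannot be compared against that index, which is why the index has to be rebuilt with the same model used at query time.

```python
import math

def cosine_similarity(a, b):
    # Similarity search only makes sense between vectors of the same dimension.
    if len(a) != len(b):
        raise ValueError(f"dimension mismatch: {len(a)} vs {len(b)}")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Index built with OpenAI text-embedding-ada-002: 1536-dimensional vectors.
index_vector = [0.1] * 1536
# Query embedded with a hypothetical 384-dimensional open-source model.
query_vector = [0.1] * 384

try:
    cosine_similarity(index_vector, query_vector)
except ValueError as err:
    # Mixing models fails outright; rebuild the index with the query-time model.
    print(err)
```

The same logic applies regardless of vector store: embeddings in the index and embeddings at query time must come from the same model.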
