Hey Jason/Adarsh - we were building AI bots for travel clients, and a lot of them had no question:answer pairs because they'd never implemented any Q&A-type features before. To work around this, we had to run an extensive debugging process: a large part of each travel company's staff would interact with the bots and give feedback on the answers to commonly asked questions. We'd feed that feedback into a knowledge graph and use it to constrain the output. (Some people we talked to fed it back into the vector database instead, but I found that sub-optimal, since a lot of the sample queries were badly phrased questions that needed additional context before they could be acted on, e.g. "i want to go trip to the south".)
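A rough sketch of the idea in Python, to show the shape of it rather than the actual implementation. `FeedbackGraph`, `constrained_reply`, the sample destinations, and the placeholder answer text are all made up for illustration. It assumes staff feedback has already been reduced to two kinds of facts: which destinations belong to a region, and which answer wording was approved for a destination and topic. A vague query that only names a region gets a clarifying question instead of a guessed answer.

```python
import re
from dataclasses import dataclass, field


@dataclass
class FeedbackGraph:
    """Tiny in-memory stand-in for a knowledge graph built from staff feedback."""
    # region -> destinations that staff confirmed belong to it (hypothetical data)
    region_to_destinations: dict[str, set[str]] = field(default_factory=dict)
    # (destination, topic) -> answer wording that staff approved
    approved_answers: dict[tuple[str, str], str] = field(default_factory=dict)

    def add_destination(self, region: str, destination: str) -> None:
        self.region_to_destinations.setdefault(region, set()).add(destination)

    def add_approved_answer(self, destination: str, topic: str, answer: str) -> None:
        self.approved_answers[(destination, topic)] = answer


def constrained_reply(graph: FeedbackGraph, query: str, topic: str) -> str:
    """Reply only with reviewed content, or ask for the missing context."""
    tokens = set(re.findall(r"[a-z]+", query.lower()))

    # Case 1: the query names a known destination, so use the approved answer.
    for destinations in graph.region_to_destinations.values():
        for dest in sorted(destinations):
            if dest in tokens:
                answer = graph.approved_answers.get((dest, topic))
                if answer is not None:
                    return answer
                return (f"I don't have a reviewed answer about {topic} "
                        f"for {dest.title()} yet.")

    # Case 2: the query only names a region, so ask which destination is meant.
    for region, destinations in graph.region_to_destinations.items():
        if region in tokens:
            options = ", ".join(d.title() for d in sorted(destinations))
            return f"Which destination in the {region} do you mean: {options}?"

    # Case 3: nothing in the graph matches, so ask for more context.
    return "Could you tell me which destination you're asking about?"


# Hypothetical feedback collected from staff testing the bot.
graph = FeedbackGraph()
graph.add_destination("south", "goa")
graph.add_destination("south", "kerala")
graph.add_approved_answer("goa", "best time", "<reviewed answer for Goa goes here>")

print(constrained_reply(graph, "i want to go trip to the south", "best time"))
```

The point of routing through the graph is that the badly phrased query never gets matched against stored sample questions; it resolves to a region node, and the graph tells the bot what context is still missing.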
Happy to chat more about our approach. It worked for us, and I agree there's a huge problem here for companies that aren't large enterprises and don't have existing Q/A sets (and I don't think synthetic data is a truly helpful answer here).