Great paper from Anthropic that's very relevant to the observability community. It describes a very clever way to do hierarchical clustering on users' conversations with an LLM, which lets them break down how people actually use Claude and identify misuse that got past the initial safety guardrails.
I could see these techniques integrating into Arize/Phoenix very nicely, and I'm sure the team was already headed down this path.
There are a lot of great ideas in the paper. I personally love the mix of new- and old-school ML: they use LLMs alongside everyone's favorite, K-Means clustering.
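
To make the "old-school" half concrete, here is a rough sketch of two-level K-Means. It is not the paper's actual pipeline. It assumes every conversation has already been turned into an embedding vector (the toy 2D points below stand in for those), and it leaves out the LLM steps entirely. The names `kmeans`, `two_level_kmeans`, and `TOY_EMBEDDINGS` are made up for illustration, and the farthest-point initialization is my own choice to keep the demo deterministic.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def sq_dist(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid(points: List[Vector]) -> List[float]:
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(col) / len(points) for col in zip(*points)]


def kmeans(points: List[Vector], k: int,
           max_iters: int = 100) -> Tuple[List[int], List[List[float]]]:
    """Plain K-Means (Lloyd's algorithm). Requires 1 <= k <= len(points)."""
    # Deterministic farthest-point initialization (an assumption of this
    # sketch, chosen so the demo gives the same result on every run).
    centers = [list(points[0])]
    while len(centers) < k:
        farthest = max(points, key=lambda p: min(sq_dist(p, c) for c in centers))
        centers.append(list(farthest))

    labels: List[int] = []
    for _ in range(max_iters):
        # Assignment step: each point goes to its nearest center.
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centers[j]))
                      for p in points]
        if new_labels == labels:
            break  # assignments stopped changing, so we have converged
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = centroid(members)
    return labels, centers


def two_level_kmeans(points: List[Vector], k_leaf: int,
                     k_top: int) -> Tuple[List[int], List[int]]:
    """Cluster the points, then cluster the resulting centers again.

    Returns (leaf label per point, parent label per leaf cluster).
    """
    leaf_labels, leaf_centers = kmeans(points, k_leaf)
    parent_of_leaf, _ = kmeans(leaf_centers, k_top)
    return leaf_labels, parent_of_leaf


# Hypothetical stand-ins for conversation embeddings: four tight groups,
# arranged as two nearby pairs.
TOY_EMBEDDINGS: List[Vector] = [
    (0, 0), (0, 1), (1, 0),        # group A
    (5, 0), (5, 1), (6, 0),        # group B (close to A)
    (50, 50), (50, 51), (51, 50),  # group C
    (55, 50), (55, 51), (56, 50),  # group D (close to C)
]

leaf_labels, parent_of_leaf = two_level_kmeans(TOY_EMBEDDINGS, k_leaf=4, k_top=2)
for point, leaf in zip(TOY_EMBEDDINGS, leaf_labels):
    print(point, "leaf cluster", leaf, "parent cluster", parent_of_leaf[leaf])
```

In a real setup the "new-school" half would sit on either side of this: an LLM producing the text that gets embedded, and an LLM writing a readable name for each leaf and parent cluster. Those parts are only described here, not implemented.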