Evaluating and uncovering open LLMs

May 31, 2023

When choosing a model, we're stuck in the middle between classic NLP benchmarks (e.g. MMLU) and qualitative chatbot rankings. Neither is exactly what we want.


2 Comments

Elliot Tower

May 31, 2023

Just FWIW, you might want to call it Chatbot Arena rather than ChatArena, because that’s how they stylize it on the site, and there’s also another project named ChatArena (multi-agent language games for LLMs).
