AI groups rush to redesign model testing and create new benchmarks

November 9, 2024

Rapidly advancing technology is surpassing current methods of evaluating and comparing large language models