EvoWiki: Evaluating LLMs on Evolving Knowledge
Oct 10, 1010
Wei Tang
Yixin Cao
Yang Deng
Jiahao Ying
Bo Wang
Yizhe Yang
Yuyue Zhao
Qi Zhang

A benchmark for evaluating LLMs’ ability to track, update, and reason over evolving knowledge.