Tweet

Community Notes highlight a great example of the “objectivity” of benchmarks. Despite “showing” just objective data that leads to obvious conclusions, and despite its caveat about being “the same code”, the benchmark is also wrong. 🤷‍♂️ https://x.com/BenjDicken/status/1857449788893286484
