Benchmarks & Evals
BullShit Bench
Peter Gostev publishes BullShit Bench
Peter Gostev published BullShit Bench, a new community evaluation highlighted in the week's evals and benchmarks roundup. Rather than raw capability, it measures how models handle nonsensical or unfounded claims.