I didn't realize how much time I spent on cleanups until regex let me stop.
A new benchmark pitting AI against previously unseen maths problems shows systems still fall short of top human expertise.