Join Nostr
2025-10-22 05:40:25 UTC
in reply to

Terence Tao on Nostr: The author's account of this story in Sections 1 and 7 of make for fascinating ...

The author's account of this story in Sections 1 and 7 of https://borisalexeev.com/pdf/erdos707.pdf make for fascinating reading. Some takeaways from this story:

1. Both human and AI-powered literature reviews can still fail to turn up relevant hits. There is still room for improvement in this area.

2. The Lean formalization allowed for "instant refereeing": the Lean community at https://leanprover.zulipchat.com/#narrow/channel/113488-general/topic/Lean.20in.20the.20wild/with/546368325 were able to verify the proof within thirty minutes of the announcement.

3. There appears to be one very specific use case in which "vibe coding" can actually be used responsibly and effectively: generating a formal proof artefact for a statement that has already been formalized by human experts. But even then, there is the potential for human-generated error in the statement formalization.

4. This also appears to be one of the very few use cases where LLM output can be used responsibly in a research paper. Importantly, no LLM-generated output was directly placed into the main body of the text (other than when quoting an excerpt from the LLM-generated Lean code for illustrative purposes); instead, such output was only used in completely verifiable contexts (in this case, in generating code that can be type checked by Lean).

Given recent experience, I can imagine that there will be some breathless reports that "LLMs actually solved an Erdos problem for real this time!". The truth however is extremely nuanced, and really requires a detailed study of the situation before jumping to any conclusions.