Woah! This notion of an interpolation threshold and double descent was a key insight that I was missing. Looking at the publication dates of the papers detailing double descent, most of them came out right after I had finished my Deep Learning course. As I never worked in industry, I missed this development. My initial impression is that this seems to explain the generation-by-generation improvements of these models.
Well, this is going to fill my reading list for the next couple of days.