2024-08-21 03:35:57

ynniv on Nostr:

Every time I see AI mentioned it’s either “wired” or “tired”. Lots of people are casting their votes in both camps, but the ones that bother me the most are the people who have been programming since the 90s throwing shade: “We saw this decades ago, and it’s happening all over again! I have to rewrite everything that comes out of an AI!”

Which, as far as I can tell, is exceptionally nearsighted. First, those AI algorithms in the 90s were terrible hacks that happened to work. Do you know how SIFT feature detection works? It’s amazing that it provided anything useful at all, so it’s not surprising that it had its limits. N-gram word prediction? Sure, it strung words together, but the results were meager. Looking at ChatGPT and recalling ELIZA? Tell me more about that thing you read about in college; it was already lore by then.
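If you never wrote one, here’s roughly the whole trick behind n-gram word prediction, sketched in Python (a toy corpus and made-up names, purely for illustration): count what follows what, and guess the most frequent successor.

```python
from collections import Counter, defaultdict

# Toy corpus. Real systems counted n-grams over far more text,
# but the mechanism was the same: tally what follows what.
corpus = "the cat sat on the mat the cat ate the rat".split()

# word -> Counter of the words seen immediately after it
successors = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    successors[prev][nxt] += 1

def predict_next(word):
    """Guess the most frequent word observed after `word`."""
    if word not in successors:
        return None  # unseen context: the model has nothing to say
    return successors[word].most_common(1)[0][0]

print(predict_next("the"))  # 'cat' (seen twice, beating 'mat' and 'rat')
print(predict_next("sat"))  # 'on'
```

That’s the whole machine. It strings words together; it doesn’t do much else.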

If you don’t see the difference between then and now, let me summarize in three words: deep neural networks. All of those 90s-era algorithms were weird hacks, but they were weird hacks that we could understand. No one was unsure of how to fix a broken Bayesian network. You built one, found the cases where it didn’t work, and improved the model.
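Here’s what I mean by understandable: a toy Bayesian network, small enough to write out by hand (the numbers are invented for illustration). Every parameter sits in plain sight, so when the model is wrong you can point at the exact entry that’s wrong and change it.

```python
# Toy Bayesian network: Rain -> WetGrass, with hand-written tables.
p_rain = 0.2
p_wet_given_rain = {True: 0.9, False: 0.1}

def p_wet():
    """Marginal probability of wet grass, read straight off the tables."""
    return (p_rain * p_wet_given_rain[True]
            + (1 - p_rain) * p_wet_given_rain[False])

def p_rain_given_wet():
    """Bayes' rule, every term of which you can inspect."""
    return p_rain * p_wet_given_rain[True] / p_wet()

print(p_rain_given_wet())  # ~0.692: every step is legible
```

Then came deep neural networks.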

No one liked deep neural networks. Not because they were new, but because you couldn’t understand them. They were magic. There was no class on hand-tuning hidden-layer weights, no step debugger, no optimization procedure you could walk through by hand. The only reason they were even useful is that they trained themselves. To the extent that a programmer made them, it was by experimenting with connectivity and training methodology. They were worse than magic: they were black magic.
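To make that concrete, here’s everything a programmer actually writes, sketched with PyTorch (a toy XOR task; the layer sizes and learning rate are arbitrary picks, not anyone’s canonical recipe). We choose the connectivity and the training procedure. The weights, the part that does the work, come from training.

```python
import torch
from torch import nn

# The human-written part: connectivity and training methodology.
model = nn.Sequential(
    nn.Linear(2, 16),  # layer shape: chosen by experiment
    nn.ReLU(),         # activation: chosen by experiment
    nn.Linear(16, 1),
)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.MSELoss()

# Toy task: XOR, which no single linear layer can learn.
x = torch.tensor([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = torch.tensor([[0.], [1.], [1.], [0.]])

for _ in range(2000):
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()   # the self-teaching part: backpropagation
    optimizer.step()

print(loss.item())      # near zero: it learned the task
print(model[0].weight)  # now try to "read" what it learned; good luck
```

There is no line in there that sets a weight, and no line you could step through to watch the reasoning happen.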

And now, now that the incomprehensible black magic, made of essentially a trillion if statements, can string together some code, the people who know the most about that kind of work look at the results and say, “eh, it really isn’t that great”.

They’re not wrong. Even the best models struggle with basic things. A good programmer can plan circles around them, understand the underlying value that’s being delivered, conduct user testing, manage a project timeline, etc. Even the best model isn’t going to compete with a good developer.

Then finally they conclude, without the slightest hesitation, “these are hard problems that we just don’t know how to solve yet.” And this is the cherry on the proverbial sundae! Yes! You’re right! We don’t know how to solve these problems! But we also don’t know how to build the one you’re using right now, because we don’t build them… they train themselves.

Of course, they wouldn’t exist without us. And they tend to get better when we change things. But the way that we improve them is more like chemistry than programming. We think about the fundamentals, make a hypothesis, tweak something, then measure the output. Nothing about tweaking the number of training tokens, layer shapes, connectivity, activation mechanisms, or energy models requires you to be better than the best staff engineer in the world. You don’t need to derive the rules for writing better software, or for being creative. All you need to do is know what networks have been tried and come up with a new one that might learn a little better. Because the models teach themselves.
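That hypothesis-tweak-measure loop fits in a screenful of code. A sketch (same toy task as before; the grid of knobs is invented for illustration, and real runs sweep far more than this):

```python
import itertools
import torch
from torch import nn

def train_and_score(hidden, activation, x, y, steps=2000):
    """Train a tiny net with the given knobs and report its final loss."""
    model = nn.Sequential(nn.Linear(2, hidden), activation(),
                          nn.Linear(hidden, 1))
    opt = torch.optim.SGD(model.parameters(), lr=0.1)
    loss_fn = nn.MSELoss()
    for _ in range(steps):
        opt.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        opt.step()
    return loss.item()

x = torch.tensor([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = torch.tensor([[0.], [1.], [1.], [0.]])

# Hypothesis: a wider layer, or a different activation, learns this better.
# Tweak one thing, train, measure. That's the whole discipline.
for hidden, act in itertools.product([4, 16, 64], [nn.ReLU, nn.Tanh]):
    print(f"hidden={hidden:3d} act={act.__name__:5s} "
          f"loss={train_and_score(hidden, act, x, y):.4f}")
```

Nothing in that loop asks you to be a better programmer than the model might become. It asks you to run experiments.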

Phew!
Ok.

Now tell me how foolish I am to think they might keep getting better for a while.
Author Public Key
npub12akj8hpakgzk6gygf9rzlm343nulpue3pgkx8jmvyeayh86cfrus4x6fdh