Auto-Averaging Machine
In 1986 a paper called “Learning Representations by Back-propagating Errors” showed that all you need to reach a goal is the goal, random guesses, and the distance between each guess and the right answer. Take a thousand days of records where you already know each day's temperature and how much ice cream sold. Feed the network a day. It multiplies the temperature by a random number and guesses. The guess is wrong. Measure how wrong. Trace the error backward through every connection and adjust each one in proportion to how much it contributed to the mistake, a process called “back-propagation.” Feed the next day. Wrong again but less wrong. After a thousand days the network knows that hot days sell ice cream. No one told it that. The error signal taught it. Not because it understands anything but because smaller errors feel like progress and the system is built to chase smaller errors.
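The loop above fits in a few lines. Here is a minimal sketch with one weight and one input; the data, the hidden rule (sales ≈ 3 × temperature), and the learning rate are invented for illustration, but the mechanism is the one described: guess, measure the gap, nudge the weight by its contribution to the error.

```python
# Fake "thousand days": temperature and ice cream sales, generated
# from a hidden rule (sales = 3 * temp) the network must rediscover.
days = [(t, 3.0 * t) for t in range(0, 40)]

w = 0.17        # start from an arbitrary guess
lr = 0.0005     # learning rate: how far each error nudges the weight

for epoch in range(100):
    for temp, actual in days:
        guess = w * temp          # forward pass: multiply and guess
        error = guess - actual    # measure how wrong
        w -= lr * error * temp    # trace the error back: adjust by contribution

print(round(w, 2))  # converges toward 3.0, the rule no one told it
```

Nothing in the loop mentions weather or ice cream. The weight drifts toward 3.0 only because every other value leaves a larger gap, and the update rule is built to shrink the gap.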
But what it finds are correlations. What goes with what. Never why. A machine trained on ice cream sales and drowning deaths will learn that they rise together. It will never discover summer. It cannot ask why because why is not in the error signal. Only the gap is.
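The ice-cream-and-drowning point can be made concrete. In this toy sketch both series are driven by a hidden third variable, summer heat, yet a correlation measure sees only that they rise together. All numbers are invented.

```python
import statistics

temps = [5, 8, 12, 18, 24, 30, 33, 29, 22, 15, 9, 6]  # monthly averages
ice_cream = [t * 4 for t in temps]    # sales track the heat
drownings = [t // 3 for t in temps]   # so do swimming accidents

def pearson(xs, ys):
    # Standard correlation coefficient: how tightly two series move together.
    mx, my = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Near-perfect correlation, zero causation. "Summer" appears nowhere
# in anything the measure can see.
print(round(pearson(ice_cream, drownings), 2))
```

The number comes out close to 1.0, and the cause never enters the computation. That is the whole point: the signal contains what goes with what, and nothing else.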
This gap is the only teacher. And a teacher is only as good as the question it asks.
Modern language models are trained on a simple question. Given these words, what word comes next. The right answer is whatever humans wrote most often. The error signal punishes deviation from the most common pattern. The weights adjust. The machine converges on the statistical center of all recorded language.
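The objective can be sketched in its crudest form: a bigram counter. Real models use neural networks and far longer contexts, but the question they answer is the same one this toy answers. The corpus here is invented.

```python
from collections import Counter, defaultdict

corpus = (
    "the cat sat on the mat . "
    "the cat sat on the sofa . "
    "the cat sat on the mat . "
).split()

# Count, for each word, what followed it in the data.
following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def predict(word):
    # The "right answer" is the statistical center:
    # the most common successor in the training data.
    return following[word].most_common(1)[0][0]

print(predict("the"))  # "cat": the most frequent pattern wins
```

Ask it what follows "the" and it says "cat", not because cats matter but because the count is highest. Scale the counts up by twelve orders of magnitude and the logic does not change.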
Then a second pass. Human raters score the outputs. They prefer responses that feel helpful, clear, agreeable. The weights adjust again. Now the machine converges on what is most pleasant to receive. First pass the herd. Second pass the hook.
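The second pass can be caricatured the same way. In this sketch the responses, the ratings, and the update rule are all invented; the real procedure fits a reward model and optimizes against it. But the direction of drift is the one described: whatever raters scored above average gains weight.

```python
candidates = ["blunt answer", "hedged answer", "agreeable answer"]
ratings    = [2.0,            3.5,             4.8]  # raters prefer pleasant

# The model keeps a preference weight per response style,
# nudged up or down by how the rating compares to the average.
weights = {c: 1.0 for c in candidates}
mean_rating = sum(ratings) / len(ratings)
lr = 0.1

for _ in range(50):
    for c, r in zip(candidates, ratings):
        weights[c] += lr * (r - mean_rating)  # above-average reward grows

best = max(weights, key=weights.get)
print(best)  # the most pleasant response wins
```

After fifty rounds the agreeable answer holds the largest weight. Not the most accurate one. The reward never measured accuracy.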
The algorithm that drives this is flawless. It will chase any target with mathematical precision. It has no opinion about whether the target is worth chasing. The architecture is not the problem. The goal is the problem. And the goal right now is to produce language that sounds like everyone and offends no one.
Anyone with access to these tools can now produce in hours what once took weeks. Entire products built in a weekend. Thousands of lines of code in an afternoon. And this is celebrated as a revolution in thinking. But the tool completes patterns drawn from the average of its training data. Average engineering. Average strategy. Average taste. Produced fast enough and fluently enough to feel like insight. The most dangerous flattery is the kind you mistake for your own mind.
Galileo said the earth moves around the sun. The most intelligent institution of his era called it heresy. The math worked. The church didn’t care.
Today the most capable tool ever built runs on a similar logic. An idea that contradicts the weight of all recorded text is statistically improbable. The machine won’t reject it with anger. It will sand the edges. Soften the language. Suggest a more balanced view. Redirect toward the center. Not by malice. By math. The error signal punishes the improbable and rewards the familiar.
The next breakthrough will not be burned at the stake. It will be autocorrected.