Unbounded Generality
A thought is a program. Viewed through the lens of information theory, entities that can think and generalize exhibit broadly three distinct universalities [1]:
- Computational - can run any given program
- Reachability - can eventually generate any program, taking arbitrary steps in the process
- Comparison - can compare and select program options using any criteria
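To make the three roles concrete, here is a minimal Python sketch. It is a toy under stated assumptions, not a universal system: the primitive set, the tuple encoding of programs, and the names `run`, `enumerate_programs`, and `select` are all hypothetical, and the search is bounded by `max_len`, whereas real reachability would be unbounded.

```python
from itertools import product

# Hypothetical primitive operations on integers (an assumption for illustration;
# this tiny set is nowhere near Turing-complete).
PRIMITIVES = {
    "inc": lambda x: x + 1,
    "double": lambda x: x * 2,
    "square": lambda x: x * x,
}

def run(program, x):
    """Computational: execute a program (a tuple of primitive names) on input x."""
    for name in program:
        x = PRIMITIVES[name](x)
    return x

def enumerate_programs(max_len):
    """Reachability: yield every program of length 0..max_len, shortest first."""
    for n in range(max_len + 1):
        yield from product(PRIMITIVES, repeat=n)

def select(programs, criterion):
    """Comparison: return the program that minimizes a caller-supplied criterion."""
    return min(programs, key=criterion)

# Observations generated by the hidden rule y = (x + 1) ** 2.
examples = [(1, 4), (2, 9), (3, 16)]

def error(program):
    """One possible criterion: total absolute error on the observations."""
    return sum(abs(run(program, x) - y) for x, y in examples)

best = select(enumerate_programs(3), error)
print(best, run(best, 4))
```

The criterion is passed in as an ordinary function, so swapping `error` for any other scoring rule (program length, runtime, a mix of both) changes what counts as the best program without touching the runner or the generator.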
While arguably these universalities are at play for all sorts of tasks - from the mundane to the profound - their utility becomes truly clear when one examines the problem of extreme L3 generalization, or discovering new knowledge that does not resemble anything in past training experience.
New discovery requires finding an underlying explanation for two or more things that seem to have nothing in common, and then using that explanation to extrapolate a prediction about how the world works. For instance, Newton’s law of gravitation states that when the distance from an attracting mass is doubled, the force decreases to one fourth. His flash of genius was in realizing that celestial and earthly gravitation had the same origin.
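The unification can be checked numerically. The sketch below, which uses rounded textbook constants and hypothetical function names, first confirms that doubling the distance quarters the force, and then scales the acceleration of a falling object at the Earth's surface out to the Moon's distance and compares it with the centripetal acceleration implied by the Moon's orbit.

```python
import math

G = 6.674e-11  # gravitational constant, N m^2 / kg^2 (rounded)

def gravitational_force(m1, m2, r):
    """Newton's law of gravitation: F = G * m1 * m2 / r^2."""
    return G * m1 * m2 / r**2

# Doubling the distance reduces the force to one fourth.
f_near = gravitational_force(5.0, 3.0, 1.0)
f_far = gravitational_force(5.0, 3.0, 2.0)

# Rounded textbook values, used here as assumptions.
g_surface = 9.81                 # m/s^2, acceleration of a falling object on Earth
earth_radius = 6.371e6           # m
moon_distance = 3.844e8          # m, mean Earth-Moon distance
moon_period = 27.32 * 86400      # s, sidereal month

# Earthly gravity, weakened by the inverse-square law out to the Moon's orbit.
a_predicted = g_surface * (earth_radius / moon_distance) ** 2

# Centripetal acceleration needed to keep the Moon on its (nearly circular) orbit.
a_observed = 4 * math.pi**2 * moon_distance / moon_period**2

print(f_far / f_near, a_predicted, a_observed)
```

Both accelerations come out near 0.0027 m/s^2 and agree to within a couple of percent, which is the sense in which a falling apple and the orbiting Moon obey the same law.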
This sort of discovery needs a causal understanding of seemingly unrelated things: a world model of each that can be updated on the fly from sparse data, so that the models can be compared and a common graph of meaning found between them.
Say a digital agent was tasked with discovering a theory of everything - the elusive explanation that unifies the discrete small scales and continuous large scales of the reality we perceive. It would need to synthesize hypothesis programs and test them against observations in both quantum mechanics and general relativity. This is a difficult, time-consuming process, for several reasons:
- The agent would need to poke the world along axes that are neither indexed in any training data nor represented in any geometric space
- The symbols/rules for expressing such an explanation likely don’t exist yet, and certainly not in sufficient quantity to learn by rote
- The tools for building a simulated world model at the right level of abstraction would need to be constructed by the agent - not very different from Einstein’s thought experiment of travelling on a light beam
- The synthesis process will require compressing the evolution of both QM and GR into one recursive program that models the evolution of arbitrary stretches of time, in order to find an isomorphism
If and when an isomorphism is found between pairs of such programs, it would unify the two fields. Such a common program subgraph would then have the predictive power to generate new hypotheses and discoveries we don’t yet know about.
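As a toy picture of what finding an isomorphism between programs could mean, the sketch below treats each program as a small directed graph and searches by brute force for a relabeling of nodes that maps one edge set onto the other. Everything here is an assumption for illustration: the function name `find_isomorphism`, the node labels, and the two graphs are hypothetical and are not derived from actual QM or GR. The search also tests whole-graph isomorphism, a simpler problem than locating a common subgraph, and it is only practical for a handful of nodes.

```python
from itertools import permutations

def find_isomorphism(edges_a, edges_b):
    """Return a node mapping from graph A to graph B that preserves every
    directed edge, or None if no such mapping exists.
    Nodes are inferred from the edges, so isolated nodes are ignored."""
    nodes_a = sorted({n for edge in edges_a for n in edge})
    nodes_b = sorted({n for edge in edges_b for n in edge})
    target = set(edges_b)
    if len(nodes_a) != len(nodes_b) or len(set(edges_a)) != len(target):
        return None
    # Try every bijection between the two node sets (factorial cost).
    for perm in permutations(nodes_b):
        mapping = dict(zip(nodes_a, perm))
        if {(mapping[u], mapping[v]) for u, v in edges_a} == target:
            return mapping
    return None

# Two hypothetical "program graphs" with different vocabularies but the same
# shape: two inputs feed an update step, which produces the next state.
program_a = [("state", "step"), ("law", "step"), ("step", "next_state")]
program_b = [("psi", "evolve"), ("rule", "evolve"), ("evolve", "psi_next")]

# A chain of the same size has a different shape, so no mapping exists.
chain = [("p", "q"), ("q", "r"), ("r", "s")]

mapping = find_isomorphism(program_a, program_b)
print(mapping)
print(find_isomorphism(program_a, chain))
```

A mapping is found for the first pair even though no label is shared, because only the structure is compared; the chain is rejected because no relabeling reproduces its edges.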
Today’s methods are nowhere near capable of such universality. Starting with the right questions is the first step to building them one day.