TLDR: The first in a planned series of three or more papers, which constitute the first major inroad into the compositional learning programme, and a substantial step towards bridging agent foundations theory with practical algorithms.
Official Abstract: We propose novel algorithms for sequence prediction based on ideas from stringology. These algorithms are time and space efficient and satisfy mistake bounds related to particular stringological complexity measures of the sequence. In this work (the first in a series) we focus on two such measures: (i) the size of the smallest straight-line program that produces the sequence, and (ii) the number of states in the minimal automaton that can compute any symbol in the sequence when given its position, written in a fixed base, as input. These measures are interesting because multiple rich classes of sequences studied in combinatorics on words (automatic sequences, morphic sequences, Sturmian words) have low complexity and hence high predictability in this sense.
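To make the two measures concrete, here is a small illustration (mine, not from the paper). A straight-line program is a grammar in which every nonterminal has exactly one production, so it derives a single string; and the Thue-Morse sequence is a standard example of an automatic sequence, computed by a two-state automaton reading the binary expansion of the position.

```python
# A straight-line program (SLP): a grammar in which every nonterminal has
# exactly one rule, so it derives exactly one string. Here each rule is
# either a terminal character or a pair of indices of earlier rules.
def expand_slp(rules):
    """Expand an SLP; rules[i] is a char or a pair (j, k) with j, k < i."""
    out = []
    for rule in rules:
        if isinstance(rule, str):
            out.append(rule)
        else:
            j, k = rule
            out.append(out[j] + out[k])
    return out[-1]

# 7 rules produce a 13-character prefix of the Fibonacci word
# (F1 = "b", F2 = "a", Fn = F(n-1) F(n-2)); the compression
# ratio grows exponentially as rules are added.
fib_slp = ["b", "a", (1, 0), (2, 1), (3, 2), (4, 3), (5, 4)]
print(expand_slp(fib_slp))  # abaababaabaab

# The Thue-Morse sequence is 2-automatic: a 2-state automaton reading the
# binary expansion of n computes the n-th symbol.
def thue_morse(n):
    state = 0                  # state = parity of 1-bits seen so far
    for bit in format(n, "b"):
        if bit == "1":
            state ^= 1         # transition: toggle on digit 1
    return state               # the output symbol is the final state

print("".join(str(thue_morse(n)) for n in range(16)))  # 0110100110010110
```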
The most serious criticism of the learning-theoretic alignment agenda (LTA), and of agent foundations research more broadly, is the gap between the theory on one hand and practically relevant algorithms on the other. Until now, all the mathematical models in LTA have fallen into one of two categories:
- Simplistic toy models that are far from anything that might be called "general" intelligence (a category under which I include deep learning), such as linear multi-armed bandits or tabular Markov decision processes.
- Computationally infeasible algorithms such as AIXI, or even bounded versions of AIXI.
To bridge this gap, I proposed the idea of compositional learning theory: the search for algorithms that can exploit compositional patterns in the data to achieve statistical and computational efficiency over highly expressive (compositional) hypothesis classes. In the present line of work, I make the first major inroad towards implementing this programme. Specifically, I demonstrate efficient algorithms for deterministic sequence prediction that make few mistakes on sequences expressible via small straight-line programs (exhibited in this paper), or via (substantial) generalizations thereof that will appear in later papers of the series.
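The mistake-bound flavor of such results can be illustrated with the classical halving algorithm (a textbook method, not the algorithm of this paper): if the target sequence is generated by some program in a finite class, predicting by majority vote over the programs still consistent with the history makes at most logarithmically many mistakes in the class size. Roughly, the paper's contribution is achieving bounds of this kind time- and space-efficiently for the much richer class of small straight-line programs.

```python
import itertools

def halving_predict(hypotheses, sequence):
    """Classical halving algorithm: predict each symbol by majority vote over
    the hypotheses still consistent with the history. If some hypothesis
    generates the sequence exactly, mistakes <= log2(len(hypotheses))."""
    alive = list(hypotheses)
    mistakes = 0
    for n, true_bit in enumerate(sequence):
        votes = [h(n) for h in alive]
        prediction = int(2 * sum(votes) >= len(votes))  # majority vote (ties -> 1)
        if prediction != true_bit:
            mistakes += 1
        # eliminate every hypothesis that got this symbol wrong
        alive = [h for h, v in zip(alive, votes) if v == true_bit]
    return mistakes

# Toy class: the 8 binary sequences of period 3, so at most log2(8) = 3 mistakes.
hypotheses = [lambda n, p=p: p[n % 3] for p in itertools.product((0, 1), repeat=3)]
print(halving_predict(hypotheses, [0, 1, 1] * 5))  # 1 (well within the bound of 3)
```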
There are several ways this might be useful:
- Conservatively, it leads to models of agents that reason using Occam's razor, models which are still "toy" but more realistic than AIXI.
- The complexity measures we discover might turn out to be a good mathematical model for the generalization power of deep learning (this hypothesis should, in principle, be empirically testable).
- Ambitiously, it can produce a practical way of building AI that bypasses deep learning altogether.
It is notable that this work was enabled by bringing computational learning theory together with fields that have until now been quite disparate from it, such as stringology and combinatorics on words. This seems to explain why these results went undiscovered until now (we do use a recent result, but weaker algorithms were available previously), and lends hope to a continuing bounty of low-hanging fruit.
While the present results are limited to the toy setting of deterministic sequence prediction, there seems to be no barrier of principle to generalizing them much further, and we already have some concrete leads for generalizing to stochastic process prediction and to reward maximization in interactive environments.
Going forward, we plan to strengthen these results by introducing progressively more powerful compositional languages, and to generalize them to robust reinforcement learning, followed by metacognitive agents and ultimately formal computational realism. At that point, we will arguably have solved "inner alignment" and the bounded version of the "diamond maximizer problem" (some work will be needed to corroborate this), in which case the road to a full solution of technical AI alignment will be open.