The economist Herbert Stein once said that “whatever cannot continue forever must stop”, a remark now known as Stein’s Law. We can generalize this to “Stein’s Principle”.

The universe will almost certainly last for many billions of years. In addition, let’s assume that the utility of a mind’s life doesn’t depend on the absolute time period in which that life occurs.

Logically, either human-derived civilization must exist for most of the universe’s lifespan, or not. If it does not, this falls into what Nick Bostrom calls an existential risk scenario. But if it does, and if we (very reasonably) assume that the population is steady or increasing, then the vast majority of future utility lies in time periods over a million years from now. This is Bostrom’s conclusion in ‘Astronomical Waste’.

However, we can break it down further. Let X be the set of possible states of future civilization. We know that there is at least one x in X which is stable over very long time periods – once humans and their progeny go extinct, we will stay extinct. We also know there is at least one x which is unstable. (For example, the world where governments have just started a nuclear war will rapidly become very different, with very high probability.) Hence, we can define a partition P of X, with each x in X falling into exactly one of P_1, P_2, P_3, … P_n. Some of the P_i are stable, like extinction, in that a state within P_i will predictably evolve only into other states in P_i. Other P_j are unstable, and may evolve outside their boundaries with nontrivial per-year probability.

One can quickly see that, after a million years, human civilization will have wound up in a stable bucket with probability exponentially close to 1. (Formally, one can prove this with absorbing Markov chains.) But we already know that the vast majority of human utility occurs more than a million years from now. Hence, Stein’s Principle tells us that any unstable bucket must have very little *intrinsic* utility; its utility lies almost entirely in which stable bucket might come after it.
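The Markov-chain argument can be illustrated with a toy absorbing chain. The bucket labels and per-year transition probabilities below are invented purely for illustration; the point is only that any constant per-year chance of leaving the unstable buckets compounds into near-certainty over a million years.

```python
import numpy as np

# Toy chain over buckets: two unstable states (U1, U2) and two
# stable, absorbing states (S1 = extinction, S2 = locked-in outcome).
# All per-year probabilities here are made up for illustration.
P = np.array([
    [0.989, 0.010, 0.0005, 0.0005],  # U1: mostly stays unstable, small leak
    [0.010, 0.989, 0.0005, 0.0005],  # U2: likewise
    [0.0,   0.0,   1.0,    0.0   ],  # S1: absorbing
    [0.0,   0.0,   0.0,    1.0   ],  # S2: absorbing
])

start = np.array([1.0, 0.0, 0.0, 0.0])  # begin in unstable bucket U1

for years in (100, 10_000, 1_000_000):
    dist = start @ np.linalg.matrix_power(P, years)
    p_unstable = dist[0] + dist[1]
    print(f"after {years:>9,} years, P(still unstable) = {p_unstable:.3e}")
```

With these numbers the chance of remaining unstable after n years is exactly 0.999^n, so it decays exponentially: still likely after a century, negligible after ten thousand years, and indistinguishable from zero after a million.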

Of course, one obvious consequence is Bostrom’s original argument: any bucket with a significant level of x-risk must be unstable, and so its intrinsic utility is relatively unimportant compared to the utility of reducing x-risk. But even excluding x-risk, there are other consequences. For example, for a multipolar scenario to be stable, it must include some extremely reliable mechanism that prevents both the conquest of the other agents by any one of them and the emergence of a new agent more powerful than the existing ones. Without such a mechanism, the utility of any such world will be dominated by that of the stable scenario which inevitably succeeds it.

And further, each stable bucket might itself contain stable and unstable sub-buckets, where a stable sub-bucket locks the world into it but an unstable one allows movement to elsewhere in the enclosing bucket. Hence, in a singleton scenario, buckets where the singleton might replace itself with dissimilar entities are unstable; buckets where the replacements are always similar in important respects are stable.

Iterated systems in general are subject to both cyclic and chaotic attractors. How are you ruling those out? Is it in how you defined the partition? If so, I missed it.

A cyclic attractor (e.g. x1 -> x2 -> x3 -> x1) would be contained within a single bucket of the partition.

Isn’t ‘chaotic attractor’ an oxymoron? If it is chaotic, how can one predict with very high confidence that it will continue (or not continue) doing anything?

Here’s a toy example. Pick some function of one variable that behaves chaotically when iterated. Call it f. Now consider a two-dimensional system that evolves according to g(x,y) = (f(x), y/2). The dynamics of this system are as follows: wherever you start, you converge exponentially towards the subset defined by y=0, but the way you move around that subset is unpredictable.

This sort of behaviour is quite common in “real” systems: there’s an attracting subset that almost all orbits converge toward; once you get near it, your motion is predictable in that it stays near the attractor but unpredictable in that you can quickly find yourself anywhere in the attractor.
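Concretely, one can run the toy system g(x, y) = (f(x), y/2) from the comment above. The choice of the logistic map f(x) = 4x(1−x) as the chaotic one-dimensional map, and the starting point, are my own additions for illustration:

```python
# Toy system from the comment above: g(x, y) = (f(x), y/2),
# with f the logistic map at its fully chaotic parameter r = 4.
def f(x):
    return 4.0 * x * (1.0 - x)

def g(x, y):
    return f(x), y / 2.0

x, y = 0.3, 1.0
for step in range(60):
    x, y = g(x, y)

# y collapses exponentially toward the attracting subset y = 0:
# after 60 steps it is exactly 1 / 2**60.
print(f"y after 60 steps: {y:.3e}")
# x, meanwhile, still wanders unpredictably over (0, 1); two nearby
# starting x-values would have completely diverged by now.
print(f"x after 60 steps: {x:.6f}")
```

So the orbit’s distance from the attractor is perfectly predictable, while its position along the attractor is not.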

This description works for simple preferences, but when an agent’s preference is itself a computation (as with an FAI whose preference is defined as whatever is chosen by a particular setup initially constructed out of humans), there are multiple levels to what’s going on. The outer AI may be stable in the sense that it manages to gain control over its environment, while its preference (as a process) may at the same time lack a powerful agent, and so remain “unstable”, perhaps for much longer if it was designed with that possibility in mind, with a reliable mechanism that prevents an AI takeover inside the AI’s preference.

When formulating the law, you write “must stop” here instead of “won’t continue”. On a philosophical tangent, I wonder whether “must stop” means that it should be stopped, rather than going extinct by itself. Do you see a difference here?

As for multipolar world stability and the mechanism protecting it from a new hegemony: on a purely intuitive level, a stable multipolar world seems unlikely, and would presumably be replaced by a world with a new hegemon. What kind of mechanism could prevent that?

Will we see a relatively long period of oscillation between multipolar conditions and world hegemonies, until there are no poles left?