This is a high-level introduction meant to give an overview of, and motivation for, some of my current projects under the heading of “coherence.” Nothing here is meant to be rigorous or exhaustive.
Intro
When humans reason about objects in our environments, we understand ourselves to be reasoning about abstractions of a purported “real thing.” These abstractions tend to be composed of high-level properties and behavioral models which are incredibly useful as the basis of various types of deductive and inductive reasoning chains.
This type of reasoning is overtly present in mathematical thinking. In mathematics, not only do we abstract familiar objects into collections of axioms or properties; we also routinely go in the opposite direction, producing, as it were, mathematical objects which are defined purely in terms of the axioms that specify their behaviors. This is fine and good for mathematical purposes, since in mathematics all that we do is engage with these objects via strict forms of syntactic reasoning for which the axiomatic properties are the necessary and sufficient inputs.
On the other hand, when we are dealing with the physical world, it is not generally possible to construct, directly from a collection of properties, an object which satisfies these properties. Nevertheless, this doesn’t keep us from bundling together collections of properties, and thinking about–or indeed, worrying very much about–objects which might satisfy this bundle of properties.
Cue the meme, you know the one.
All of this to say, when we use abstraction to reason about the world (which is all of the time), we must obviously do so very carefully. Our formal system may scare us with the implication that “A exists” implies “end of ze world,” while reality may ultimately notify us that “A cannot possibly exist.”
Reasoning about objects abstractly is, in a basic sense, the only way that we can reason about objects. Even so, there are important commonalities in the way that we often “over-abstract,” and carefully attending to some of these patterns may enable us to address some systematic errors in our abstractions and thereby take us closer to reality.
Often, the point of an abstraction is to isolate all of the ways that we will meaningfully need to interact with an object (its interface) and the behaviors associated with that interface, and to then discard all of the other things about the object (its constitution). Unless we are butchering something, operating on it, or disassembling it, the question of “what it is made of” is usually not relevant. We can forget that the object is made of pieces, subsystems, parts.
Thus it is that we find ourselves reasoning about atomic (spherical) cows, or rational agents optimizing utility functions.
But in forgetting that objects are made of parts, we sometimes forget another very important property associated with parts, which property extends all the way into the behavioral interface which we care about. This is the fact that all behaviors of a system are generated by its parts. In a very brutish sort of way, the configuration of parts–and even the sheer number of parts–sets a basic limit to the kinds of behaviors that a system can express. This is an interesting philosophical point in itself, but let’s content ourselves for now with observations like “the set of behaviors exhibited by a human cannot easily be expressed by an object the size of a bacterium” or “space is an important resource in computational complexity theory.”
So when we remember that objects are made of finite collections of parts, we must also remember that the behaviors of these objects must stay within the limitations imposed by their parts, which we will chiefly understand in terms of computational and informational limitations. And while this isn’t so much of a problem for spherical cows, it turns out to be quite a big problem for things like “rational agents.”
When we take an abstraction like a rational agent and reintroduce computational and informational constraints, we will see that a very interesting thing happens: The singular, coherent behavior represented by the abstraction will prove too big a burden to be supported by the resources of a system with finite parts. Inevitably, then, there will be a rupture of coherence, such that the behavior of the system must fracture into partially coherent fragments. That is, physical parts ==> abstract limitations ==> behavioral parts! Behavioral parts may or may not correspond to physical parts.
Partially coherent objects are more difficult to reason about than fully coherent objects, which is why we often over-abstract toward full coherence. But many of the problems that we face and features of our lives owe their existence to incoherence. So a science of coherence which can instruct us about why incoherence arises and to what degree it can be mitigated seems worth undertaking. Developing this science is the goal of this program/project.
At a personal level and at the risk of digression, I want to note that at almost every stage of my professional career, I have been fascinated by phenomena that went by the name of coherence. In wave physics, coherence is a property of a wave which allows it to maintain its shape and direction–a fixed relationship among different parts which allows these parts reliably to add together within the beam region and reliably to interfere destructively outside of the beam region. Partially coherent light is difficult to simulate directly, but there are interesting tricks which allow one to decompose a partially coherent beam into coherent components. In quantum mechanics, the locally coherent but mutually incoherent structures within the global wave function define the branching structure of the purported many worlds of certain QM interpretations. In deep learning, coherence appears to play a role in the inductive bias of stochastic gradient descent: generalizing structures in parameter space will tend to have coherent, reinforcing gradients; spurious structures will have incoherent and globally vanishing gradients.
These examples refer to very narrow forms of coherence which are at most motivating for and analogous to the thing that we will be considering. Still, they point to what is universal about coherence, which is that it filters things into and from existence: when the parts within a collection interact incoherently, the would-be whole dissipates into noise and is overtaken or outcompeted by objects comprising coherently interacting parts.
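As a toy illustration of the narrow, wave-like sense of coherence above (not of the broader notion this program is after), here is a minimal Python sketch. It treats each part as a unit phasor and compares the magnitude of the sum when all phases agree with the magnitude when phases are drawn at random. The function names, the unit amplitudes, and the uniform phase distribution are all assumptions made for the example.

```python
import math
import random


def resultant_norm(phases):
    """Magnitude of the sum of unit phasors with the given phases (radians)."""
    x = sum(math.cos(p) for p in phases)
    y = sum(math.sin(p) for p in phases)
    return math.hypot(x, y)


def compare_coherence(n=10_000, seed=0):
    """Return (coherent_norm, incoherent_norm) for n unit-amplitude parts."""
    rng = random.Random(seed)
    # Coherent case: every part shares the same phase, so contributions add.
    coherent = [0.0] * n
    # Incoherent case: phases are independent and uniform, so contributions
    # largely cancel each other.
    incoherent = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n)]
    return resultant_norm(coherent), resultant_norm(incoherent)


if __name__ == "__main__":
    coh, incoh = compare_coherence()
    print(f"coherent sum:   {coh:.1f}")
    print(f"incoherent sum: {incoh:.1f}")
```

With aligned phases the sum grows in proportion to the number of parts, whereas with random phases it typically grows only like the square root of that number, so the incoherent whole becomes negligible by comparison as the collection gets larger.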
But the positioning of the program is somewhat deeper: At the scale of humans and human systems, there is a deep connection between coherence and things that we care about. Coherence is what we seek, within ourselves, within our communities, within our world. Incoherence can be deeply uncomfortable, painful, or outright destructive. There is much nuance here, and part of the project will involve developing the concepts and formalizations needed for properly engaging with this nuance. The program further hopes that if we understand many of our ailments, personally, societally, or cosmically, as failures of coherence, then this might serve as the first step toward addressing these ailments.
The program will center around several paradigmatic questions:
- What?’s: What is coherence? Can we formally define it?
- Why?’s: What are the limitations or obstacles to coherence? Why aren’t systems maximally coherent by default?
- How?’s: How do systems move toward coherence? What are the mechanisms, norms, and practices that promote coherence?
This overview will survey these paradigmatic questions, address some potential challenges to the coherence paradigm, and highlight connections to existing research programs.
The paradigmatic problems of coherence
What is coherence?
At a high level, we will think of coherence as a measure of how well a system’s behaviors are aligned with the goal of maintaining the system’s stable existence. In principle, we could generalize this description to more arbitrary goals, but this invites several problems. Focusing on stable existence gives the definition a nice flavor of generalizing our observation above: Coherence is a basic condition for parts to form a stable whole.
The focus on stability situates the notion of coherence somewhere close to ideas like active inference, which characterizes certain behaviors that must be exhibited by a system maintaining a Markov blanket. But the basic framework of active inference is an opinionated choice about how systems should respond to computational barriers in Bayesian inference, rather than a framework for measuring aspects of how well arbitrary systems organize to achieve stability.
Formalizing a definition of coherence is an ongoing project.
Why are systems incoherent?
Why aren’t many systems coherent by default? What is difficult about the formation of a coherent system? This question points to the computational and informational constraints which tend to bound the coherence of systems (or, better, to generate incoherences relative to idealized variants of the system having more computational and informational resources).
One of the main ideas that will arise is that, given a set of observations, it is computationally easier to learn phenomenal/local/high-level patterns within these observations than it is to construct causal explanations of those patterns relative to some kind of low-level world model–even though the latter allows for better generalization and adaptivity to broader environmental variations. The high-level patterns learned by a system (which sometimes have an active/participatory component) can be thought of as parts of the system which couple to local parts or contexts in the environment. In general, there is no guarantee that these parts will be able to interact coherently with each other across a broad spectrum of environmental contexts.
This framing understands coherence in terms of how systems understand both the world and their own behavior via hierarchies of multi-scale models. Coherence between two parts or patterns can exist when these parts have been anchored to a sufficiently low-level model to jointly explain, situate, and arbitrate between both parts. Thus, for a system to find coherence generally requires both 1) for there to exist an understanding of why each part itself exists (i.e. a causal account for a local pattern in terms of a lower-level model) and 2) for there to exist some effective process of integrating parts on the basis of this understanding.
The first requirement is generally the result of an expensive computational and informational process, and thus can be limited by the availability of these resources. We can label this general area as the informatics of integration. The second requirement tends to invoke the existence of some privileged part which is responsible for a kind of governance process. Even if such a part exists by fiat, the process can still depend heavily on what can quite naturally be called the trust relationship between parts of the system and this governing part. In other cases, the parts must also be able to participate in a process of selecting a governing part or otherwise opting into an integration process. We can label this general area as the trust dynamics of integration (pending better naming).
How do systems become more coherent?
If we can understand the obstacles to coherence, the next step is to study the processes which show effectiveness for cutting through these obstacles.
With respect to the informatics of integration, the main focus is on understanding the bootstrapping of causal explanations from phenomenal patterns. There is an interesting claim here, which needs further evaluation and articulation, around the idea that local patterns precede more globally coherent explanations, not just because they are easier to learn, but also because they aid in the discovery of the relevant explanations needed for coherence.
The trust dynamics of integration is an area that I’m still working to properly frame, and generally a possible area for future work.
FAQs
Here are some questions or challenges that I’m anticipating.
Incoherence and Diversity
The framework might be seen as penalizing diversity as a kind of incoherence. However, it’s important to note that incoherence is defined relative to an informational state. Certain forms of diversity may be incoherent from a God’s-eye view, but given the informational state of the system itself, pursuing exactly opposite strategies may be a fully coherent hedging strategy.
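To make the hedging point concrete, here is a small Python sketch under assumptions that are entirely mine: a two-state world, a system that splits its resources between a bet on each state, even-money payouts, and expected log growth as the objective. None of this is part of the framework itself; it is only meant to show how the best split depends on the system's informational state.

```python
import math


def expected_log_growth(f, p_a=0.5, payout=2.0):
    """Expected log growth when a fraction f backs state A and 1 - f backs B.

    p_a is the system's credence that the world is in state A. The backed
    fraction is multiplied by `payout` if its state turns out to be the real
    one, and is lost otherwise (hypothetical toy payoff rule).
    """
    return p_a * math.log(payout * f) + (1.0 - p_a) * math.log(payout * (1.0 - f))


def best_split(p_a=0.5, steps=99):
    """Grid-search the fraction f in (0, 1) that maximizes expected log growth."""
    candidates = [i / (steps + 1) for i in range(1, steps + 1)]
    return max(candidates, key=lambda f: expected_log_growth(f, p_a))


if __name__ == "__main__":
    # Maximal uncertainty: backing both opposite strategies equally is optimal.
    print(best_split(p_a=0.5))
    # Near-certainty about A: almost everything should back A.
    print(best_split(p_a=0.99))
```

In this toy setup, an observer who already knows the true state would regard the half of the resources backing the wrong state as wasted, yet for a system with even credences the equal split is the best available choice. As the credence sharpens, the optimal split moves toward a single strategy.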
Coherence and Ecologies
How does the coherence framework view complex objects such as ecologies? I don’t yet have much of an answer for this, and I think this will be an interesting case for playing with the framework.
Ecologies appear to have a high amount of incoherence, since much of the behavioral expression within the ecology is evidently zero-sum. On the other hand, it’s unclear that there is a simple “edit” to the behaviors of the parts of the system which can retain the stability of the overall system while reducing incoherence. This shows that while incoherence may measure actions which can be deleted from a system without reducing its overall stability, there is often no simple intervention which achieves this. Perhaps this could lead to some notion of local coherence relative to available interventions.
Related work
This framing relates closely to and is greatly inspired by Richard Ngo’s Scale Free Intelligence post, though it approaches many of the topics from a different direction. It also builds on my own reflections on the roles of world models and reflexive awareness in intelligence. There are close ties to the setting of active inference but differences in the objective. Work on world modeling in reinforcement learning touches on similar topics and should generally be expected to be convergent with the work here, but has a different emphasis.
From a different standpoint, there seem to be a lot of connections between the projects within this framework and the field of developmental psychology. It may turn out that much of the work here ends up looking like ways of placing these more descriptive frameworks into a rigorous setting that helps elucidate their dynamical and teleological foundations.