How are humans and AI systems alike and how are they different? This is an extremely salient question for the present moment as society anticipates further development of AI systems and prepares to adapt.
When we ask this question, often the most salient dimensions are those associated with intelligence itself–features like capability, robustness, adaptability, and so on. But other, somewhat orthogonal dimensions can be equally important, and have increasingly entered the conversation. For instance, intelligence can exist independently from agency as well as basic drives such as self-preservation. It’s important to understand and characterize these dimensions for the systems that are emerging. The question of whether AI systems are or can be conscious looms large as another example of an orthogonal feature that could be monumentally important but equally difficult to ascertain.
This article concerns such a complementary feature, which is so mundane as to be often overlooked (though it comes up most commonly, funnily enough, in discussions of AI consciousness). This is the fact that AI systems are, by and large, digital systems. This isn’t really a surprising fact to most people, given the mastery that humans have successively achieved in domains like computer hardware, computer software, and learning algorithms.
But in this article, I’ll argue that it should still be seen as a striking fact. Digital embodiment is a very different way of existing than the analog mode in which human intelligence exists. The fact of digital implementation means that AI systems have access to qualities and primitives which are foreign and even seemingly absurd from the perspective of organic intelligences.
We’ll argue that these primitives are potentially very important for addressing various coordination problems. Most digital primitives pertain directly to matters of informational flow which are ubiquitous in coordination problems under names like informational asymmetry, moral hazard, observability, or privacy. Thus, it would be surprising if the ability to more strictly control informational flows relative to a given intelligent system was not transformational for certain coordination problems.
We will start by reviewing the essential nature of digital systems and the primitives that they afford; we’ll then review a collection of coordination dilemmas where digital intelligence changes the nature of the game. Probably nothing here is especially surprising or new for people who are working in these spaces. This article merely aims to provide a sharper frame to help extrapolate to new applications.
Understanding Digital Systems
The first subsections are as much a reminder of what digital systems (such as AI agents) are as a reminder of what analog systems (you) are not. Readers less interested in mathematics can safely skip to the next section.
Digital Abstraction and Environmental Isolation
What makes a system digital? Typical answers to this question will say something like “Digital systems perform operations on discrete domains instead of continuous operations on continuous domains.” This is true, but it misses the bigger picture of what properties a digital system is designed to satisfy, which lead to the discrete design.
The bigger picture that we will attempt to illustrate in this section is that a digital system is an attempt to realize a pure, abstract system–the kind that mathematicians reason about in their heads–in the real physical world. The major difficulty here is to construct a system whose dynamics are properly isolated from the vagaries of the environment in which the system must necessarily be embedded. Another way of putting this is that the system should be closed to influence or informational leakage from the environment. This section introduces a modeling framework and presents some stylized arguments for why environmental isolation tends to imply discretization.
Our discussion will likely be applicable to various abstract systems, but we’ll be particularly interested in computational systems, which we model as deterministic state machines. Such a system consists of a state space, $L$, and a state transition function, $\tau_L : L \to L$, which takes a current state $\ell_t$ and maps it to the next state $\ell_{t+1}$. (A digital system will want $L$ to be a finite/discrete domain, but we’ll start continuous and see what happens.)
What does it mean for our abstract system to be implemented in the world? Let $U$ be a set representing the possible states of the world, and let $\tau : U \to U$ be the world’s state transition function.
We’ll say that the world realizes an abstract state machine $\tau_L$ if there exists a mapping $m : U \to L$ such that the following commutation relationship holds:
$$ m \circ \tau = \tau_L \circ m $$
This relationship implies a kind of causal closure: any information about the world which is thrown away by $m$ cannot later influence $\ell$. To get a better picture of what this means, let’s assume that $U$ has a product structure, $U = S \times E$, where $S$ represents the state of the system and $E$ represents its environment, and that $m$ only depends on the system state, i.e., $m = m_S \circ \pi_S$, where $\pi_S$ is the projection from $U$ onto $S$. Our digital abstraction relationship says that the environment cannot causally affect the behavior of the system, since $m(\tau(s,e)) = \tau_L(m_S(s))$ for all $e \in E$.
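To make the commutation condition concrete, here is a minimal Python sketch on a finite toy world. Everything in it is an assumption made for illustration: the eight system states, the four environment states, the particular update rules, and the names `tau_leaky` and `realizes`. It only shows what it looks like for the condition to hold for one world update and fail for another.

```python
from itertools import product

# Hypothetical toy world: system state s in {0..7}, environment state e in {0..3}.
S = range(8)
E = range(4)

def tau(u):
    """World update: the system counts up mod 8; the environment evolves on its own."""
    s, e = u
    return ((s + 1) % 8, (3 * e + 1) % 4)

def tau_leaky(u):
    """A different world update in which the environment leaks into the system."""
    s, e = u
    return ((s + 1 + e) % 8, (3 * e + 1) % 4)

def m(u):
    """Abstraction map m = m_S after pi_S: keep the system state mod 4, discard e."""
    s, _ = u
    return s % 4

def tau_L(l):
    """Abstract state machine on L = {0, 1, 2, 3}."""
    return (l + 1) % 4

def realizes(world_update):
    """Check m(world_update(u)) == tau_L(m(u)) for every world state u."""
    return all(m(world_update(u)) == tau_L(m(u)) for u in product(S, E))

print(realizes(tau))        # the isolated world realizes tau_L
print(realizes(tau_leaky))  # the leaky world does not
```

In the leaky world, the abstract next state depends on $e$, which is exactly the information that $m$ throws away, so no choice of $\tau_L$ could repair the relationship.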
We can make a stylized argument to show why it isn’t possible to have this kind of causal closure when the abstract system is continuous.
Let $\tau_S = \pi_S \circ \tau$ be the restriction of $\tau$ to $S$. We’ll suppose $\tau_S$ decomposes into two sequential steps:
$$ \tau_S = \tau_2 \circ \tau_1 $$
where $\tau_1 : S \times E \to S$ captures any coupling from the environment, and $\tau_2 : S \to S$ captures the internal system update. We’ll also model $S$ and $L$ as metric spaces and assume that there exist $\epsilon_2 > \epsilon_1 > 0$ such that for any $s \in S$, the set $\{\tau_1(s,e) : e \in E\}$ contains the ball of radius $\epsilon_1$ around $s$ and is contained in the ball of radius $\epsilon_2$ around $s$. This essentially means that the environmental effect on the system state is isotropic; environmental effects can move the system state in any direction by at least $\epsilon_1$ but no further than $\epsilon_2$.
If $m_S \circ \tau_2$ is a continuous, non-constant function, then our abstraction breaks. We just need to find an $s$ and $s'$ with $||s - s'|| < \epsilon_1$ where $m_S(\tau_2(s)) \neq m_S(\tau_2(s'))$ (such a pair will exist, at least when $S$ is connected, since the composition is continuous and non-constant). Because environmental effects can move the state anywhere within $\epsilon_1$ of $s$, we can choose $e$ and $e'$ such that $\tau_1(s,e) = s$ and $\tau_1(s,e') = s'$, so that $m(\tau(s,e)) \neq m(\tau(s,e'))$. The abstract next state then depends on the environment, which contradicts the closure condition.
On the other hand, we can preserve the abstraction if we are willing to make $\tau_S$ discrete. We can define a mesh $G \subset S$ with spacing $\delta > 2\epsilon_2$, and let $r : S \to G$ be the nearest-point rounding function, which maps each point to its closest mesh point. We can then define a discretized system update:
$$\tau'_2 = r \circ \tau_2 \circ r$$
The rounding on the right erases any environmental perturbation stemming from $\tau_1$, assuming that $s$ started on a mesh point. The rounding on the left ensures that the system state ends on a mesh point after the internal update, so that the commutation condition can be preserved as an inductive invariant. In this case, having made $\tau_S$ discrete, our abstraction is effectively discrete as well, since $m_S$ only sees values on the mesh $G$.
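The effect of the two rounding steps can be seen in a small simulation. The following is a sketch under stated assumptions, not a model of any real hardware: the state space is the real line, the perturbation bound `EPS2`, the mesh spacing `DELTA`, and the internal update `tau_2` are all arbitrary illustrative choices, picked so that `DELTA > 2 * EPS2` as the argument requires.

```python
import random

EPS2 = 0.1    # assumed bound on the environmental perturbation
DELTA = 0.25  # assumed mesh spacing, chosen so that DELTA > 2 * EPS2

def r(s):
    """Nearest-point rounding onto the mesh G = {k * DELTA}."""
    return round(s / DELTA) * DELTA

def tau_1(s, rng):
    """Environmental coupling: shift s by at most EPS2."""
    return s + rng.uniform(-EPS2, EPS2)

def tau_2(s):
    """Internal update (illustrative choice): advance one mesh cell, wrapping at 2.0."""
    return (s + DELTA) % 2.0

def run(steps, seed, discretize):
    """Iterate the system from s = 0.0; the seed stands in for the environment."""
    rng = random.Random(seed)
    s = 0.0
    for _ in range(steps):
        s = tau_1(s, rng)                          # environment perturbs the state
        s = r(tau_2(r(s))) if discretize else tau_2(s)
    return s

# Two different environments (seeds), same initial system state.
print(run(1003, seed=1, discretize=True), run(1003, seed=2, discretize=True))
print(run(1003, seed=1, discretize=False), run(1003, seed=2, discretize=False))
```

With rounding, the final state is the same mesh point regardless of the seed, because every perturbation is erased before it can accumulate. Without rounding, the perturbations add up as a random walk and the two environments lead to different final states.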
In practice, this version of $\tau_S$ is unphysical. We can’t snap the physical state to a specific point on a mesh. But we can create potential wells which form basins of attraction to combat environmentally induced drift. Each full basin is then mapped by $m_S$ to a single point in the discrete abstraction space. The physical system can then implement the abstracted logic as long as it can manage to move the state into a vicinity corresponding to the target basin.
To recap: The goal of this development is to shift the way that we think about digital systems from “systems with discrete domains” to “systems with a shocking level of environmental isolation, which use discretization to achieve this.”
Informational and Computational Closure
One interpretation of what we’ve said so far is that digital systems can have full informational closure, in the sense that they only make use of information from inside themselves and no information from the environment.
Another lens on this is that digital systems are closed, computationally. Usually, our models of systems are coarse-grainings of smaller scale dynamics (e.g. cellular biology coarse-grains molecular and atomic physics, which coarse-grains various field theories, which coarse-grain…). We understand that these models usually hold only in an approximate way, because dynamics from the lower scale may feed upward into the higher scale model in the form of noise or other distortions.
We can describe error, as we have done so far, as a kind of leakage of information from “the environment” (even if, in this case, the environment may be merely the thing itself at a smaller scale). Or we could describe it as a kind of underapproximation error: The system is doing important operations which the model doesn’t capture.
This takes us back to our original claim that digital computation is a way of realizing a pure, abstract system in the real world. A common adage is that all models are wrong but some are useful. Many models are obviously nothing more than models (though it’s not uncommon for people to forget this in certain contexts). But informationally closed abstractions blur the line greatly. It’s not uncommon, or terribly invalid, to treat digital systems as an exception to the rule. Digital abstractions can be correct models of the thing they are trying to model.
(Obviously, being completely closed to environmental information is an extreme, albeit a useful one. To avoid muddying the waters, we can mostly think of any such information as being packed into the initial state of a digital system. This doesn’t dilute any of the practical benefits we discuss below.)
Primitives of Digital Systems
We can translate the somewhat abstract ideas of informational and computational closure into concrete primitives, generally related to the flow of information, which prove useful for coordination problems:
Input transparency. I can run a digital system inside a box, and know that it is only making use of the information contained in the initial state. If I run a deterministic physics simulation on my computer, I know that the output is not influenced by something that I said to my kids while it was running. The same holds if I run deterministic LLM inference with a fixed input.
Replicability. I can create an exact replica of a digital system. The amounts of information needed to specify the system, its state, and its inputs are all finite. The digital abstraction is separated from low-level elements of state, such as quantum state, which are known to be impossible to clone. This makes possible procedures for reliably and exactly transferring this finite quantity of information and transforming it into a working implementation (e.g. I can copy the code, data files, etc. and run them somewhere else).
Verifiability. Claims about the behavior of digital systems can be trustlessly verified (e.g. “I put a digital system into state $\ell_0$ and ran it for $T$ steps, perhaps with information inputs $(i_t)_T$, and it reached state $\ell_T$ without using any information not contained in these inputs”). This follows directly from replicability. The equivalence between digital systems and their formal models also enables other methods of formal proof and verification to be applied to digital systems.
Input provenance. One consequence of verifiability is input provenance. It is possible not just to know that a digital system only acted on specific input, but also to prove this so that a non-trusting party can verify it.
Deletion. When a digital system thermalizes state information into its environment, the digital wall of abstraction means that this information is now completely unavailable to the system in the future. That is, deletion of information is possible.
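The link between replicability and verifiability can be sketched as verification by replay. In the Python sketch below, the transition function `step`, the claim format, and the function names are all hypothetical; the hash is used only as a convenient deterministic update, and a real protocol would need more (for instance, agreement on which code is being run).

```python
import hashlib

def step(state, inp):
    """Hypothetical deterministic transition: fold the input into the state."""
    return hashlib.sha256((state + "|" + inp).encode()).hexdigest()

def run(initial_state, inputs):
    """Run the abstract machine from initial_state over the given inputs."""
    state = initial_state
    for inp in inputs:
        state = step(state, inp)
    return state

def make_claim(initial_state, inputs):
    """Prover side: publish the initial state, the inputs, and the claimed final state."""
    return {
        "initial": initial_state,
        "inputs": list(inputs),
        "final": run(initial_state, inputs),
    }

def verify(claim):
    """Verifier side: replay the run on an independent replica and compare."""
    return run(claim["initial"], claim["inputs"]) == claim["final"]

claim = make_claim("l0", ["quote request", "counteroffer"])
print(verify(claim))  # an honest claim replays to the same final state

# A claim that hides an extra input no longer matches on replay.
hidden = dict(claim, final=run("l0", ["quote request", "bribe", "counteroffer"]))
print(verify(hidden))
```

The verifier never has to trust the prover’s hardware: because the system is exactly replicable, re-running it is the check, and an undeclared input shows up as a mismatch in the final state.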
Digital Intelligence
Perhaps this phrase, digital intelligence, means more after reading the previous section than it might have previously.
Up until this moment in time, there has never been a system (that we know of) which is both digital and highly intelligent. Humans are intelligent and not digital. Computers are digital but, heretofore, not very intelligent. Now, we seem to have something that is increasingly intelligent but can operate in a fully closed/digital manner. What do we make of this?
Take a moment to run through the above primitives and imagine what they would mean for human intelligence. Each of them looks like an immediate absurdity or impossibility. To me, it would be highly surprising if the fact of digital essence weren’t transformative for the application of intelligence to many contexts, particularly surrounding problems in coordination.
To me, the observation that intelligence can now be digital, and all that this entails, has an interesting status. It’s something that many builders intuitively understand and appreciate. But it hasn’t received a lot of explicit mention or commentary. I think that the main reason for this is that, while the digital nature of AI certainly helps with and even transforms the nature of many problems in reasoning about AI systems and coordination problems, there are many questions that still feel very difficult or intractable; basic questions of alignment are often among them.
As we move toward applications, we’ll take a ceteris paribus approach toward alignment-type questions: Human alignment isn’t a fully solved problem, but even so, the above digital primitives would feel extremely empowering for certain human coordination problems. We’ll also steer toward applications where alignment may be less central.
Applications to Coordination
Many problems of coordination are strongly influenced by the flow of information. Specific problems related to informational flow have different names in different contexts: Informational asymmetry, moral hazard, adverse selection, privacy, and so on.
In this section, we’ll survey some of these contexts and see how the informational flow primitives afforded by digital systems prove relevant.
Principal-agent problems
The principal-agent problem concerns a principal who wishes to have an agent who acts in the interest of the principal instead of the interests of the agent. The problem, of course, is that agents will often tend to act in their own interests, which may be different from the interests of the principal.
The principal-agent problem is addressed by creating structures / incentives which try to align the incentives of the agent with those of the principal. There are various ways of doing this:
- If the principal’s interest is largely a material good or profit, a degree of alignment can be achieved by giving the agent a proportional share in this interest. Various systems of equity sharing exist to enable such interest sharing.
- An agent might make a commitment to act in the interests of the principal (within their limited capacity as an agent and subject to various constraints) carrying along certain consequences if the agent is found to be violating the commitment. This is the case for many public officials and professionals who commit to act in the interests of society or specific clients, or risk consequences ranging from the loss of a license to imprisonment. The commitment gates entry to a given professional opportunity, i.e. an opportunity for a type of reward.
The second approach is more general, but generally also more expensive and less tolerant to certain kinds of agents. The main difficulty is an informational one, commonly going under the name moral hazard: the agent may take actions or respond to incentives in a way that isn’t observable to the principal, or more generally to the body responsible for enforcing the agent’s commitments.
Monitoring actions and informational flows for intelligent agents is typically a very difficult problem because intelligent agents tend to be humans with open informational channels and action spaces. The problem is much different for digital agents, where these spaces are closed.
Various digital primitives come into play here. Suppose that I want to validate that my agent’s actions are not based on disallowed factors such as bribery. I can replicate my agent (replication), run it on a curated collection of inputs (deletion, input transparency) which I can check for evidence of bribery attempts, and check that its decisions do not change materially. I can make this whole process automated and verifiable so that I don’t need to own or trust the infrastructure (verification).
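The audit described above can be sketched in a few lines of Python. This is only an illustration under heavy assumptions: `ToyAgent` is a hypothetical stand-in for a deterministic digital agent (deliberately written with a corrupt rule so that the audit has something to find), the `side_payment` field is an invented marker for a bribery attempt, and `copy.deepcopy` stands in for exact replication.

```python
import copy

class ToyAgent:
    """Hypothetical stand-in for a deterministic digital agent."""

    def __init__(self, threshold):
        self.threshold = threshold

    def decide(self, application):
        # Deliberately corrupt policy: approve on merit OR when a side payment is offered.
        on_merit = application["score"] >= self.threshold
        bribed = application.get("side_payment", 0) > 0
        return on_merit or bribed

def scrub(application, disallowed=("side_payment",)):
    """Build the curated input: the same application with disallowed fields removed."""
    return {k: v for k, v in application.items() if k not in disallowed}

def audit(agent, applications):
    """Return the inputs on which a fresh replica decides differently after scrubbing."""
    flagged = []
    for app in applications:
        original = copy.deepcopy(agent).decide(app)          # replica sees everything
        curated = copy.deepcopy(agent).decide(scrub(app))    # replica sees curated input
        if original != curated:
            flagged.append(app)
    return flagged

applications = [
    {"score": 80},
    {"score": 40, "side_payment": 500},
    {"score": 90, "side_payment": 10},
]
print(audit(ToyAgent(threshold=70), applications))
```

Only the second application is flagged: its approval flips once the side payment is removed. The third contains a bribery attempt too, but the decision does not materially depend on it, which is the distinction the audit is meant to draw. Each replica is discarded after one decision, so nothing it saw carries over to the next check.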
Information asymmetries and externalities in markets
Main articles:
- Digital intelligence and market alignment
- Constitutional companies (unpublished)
Buyer’s information paradox and risky coordination problems (LLM-written)
Two related coordination problems share a common informational structure.
The first is the buyer’s inspection paradox: a buyer cannot evaluate an asset—information, an algorithm, a financial position—without acquiring it, but acquisition transfers value before payment. The second is risky coordination: I would benefit from revealing my preference for A if and only if you also prefer A, but unilateral revelation carries a cost, whether reputational (dating), strategic (signaling interest in an acquisition), or safety-related (declaring opposition under a regime).
Both problems reduce to the same informational requirement. Some party’s sensitive information must be processed against another party’s sensitive information, a narrow agreed-upon output must be emitted, and nothing else must persist. The information enters, a decision comes out, and everything else must vanish.
Humans cannot satisfy this requirement. A human evaluator who sees an algorithm now knows the algorithm; a human matchmaker who learns two parties’ preferences now holds those preferences. Reputational and legal scaffolding can constrain what such a person is likely to do with the information, but the information itself has been placed somewhere it cannot be unplaced. The primitives needed—replication and deletion—are simply unavailable.
A digital agent supplies them directly. I instantiate an agent for a specific transaction; its weights, code, and inputs are fully specified, and both parties can verify in advance, by running copies of it on test inputs, that it behaves as claimed. The agent receives the sensitive information from one or both parties, performs the agreed evaluation, and emits the narrow output. The instance is then destroyed along with all information it acquired, leaving no trace in any context that persists past the transaction.
A seller’s algorithm can be shown to an ephemeral agent which runs the buyer’s benchmarks and reports performance; the algorithm itself never reaches the buyer and the agent does not survive the evaluation. Two firms exploring an acquisition can each give an agent their reservation prices; the agent confirms or denies overlap and then ceases to exist, leaving neither side knowing the other’s number. Two people interested in each other can each register that interest with an agent which reveals mutuality only if it exists and otherwise expires having told no one anything.
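The acquisition example can be written down as a toy single-use agent. The class name `EphemeralAgent`, its methods, and the party labels are hypothetical, and the sketch makes one assumption that should be stated loudly: clearing a Python dictionary merely stands in for destroying the instance. Actual deletion guarantees would have to come from the infrastructure the agent runs on, not from this code.

```python
class EphemeralAgent:
    """Hypothetical single-use agent: takes two sealed inputs, emits one bit, wipes itself."""

    def __init__(self):
        self._inputs = {}
        self._destroyed = False

    def submit(self, party, reservation_price):
        """Accept a sealed reservation price from 'seller' or 'buyer'."""
        if self._destroyed:
            raise RuntimeError("instance already destroyed")
        if party not in ("seller", "buyer"):
            raise ValueError("unknown party")
        self._inputs[party] = reservation_price

    def resolve(self):
        """Emit the narrow agreed output (is there overlap?) and erase everything else."""
        if self._destroyed or set(self._inputs) != {"seller", "buyer"}:
            raise RuntimeError("need one seller and one buyer submission on a live instance")
        overlap = self._inputs["buyer"] >= self._inputs["seller"]
        self._inputs.clear()      # stands in for destroying the instance and its state
        self._destroyed = True
        return overlap

agent = EphemeralAgent()
agent.submit("seller", 100)   # lowest price the seller would accept
agent.submit("buyer", 120)    # highest price the buyer would pay
print(agent.resolve())        # one bit leaves the agent; neither number does
```

The only thing either firm learns is whether a deal is possible. Before relying on such an agent, both parties could run copies of it on test inputs to check that `resolve` behaves as claimed, which is the advance verification step described above.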
What is qualitatively new here, relative to human escrow agents, brokers, and matchmakers, is that the trusted-intermediary role is filled by something whose behavior is verifiable in advance and whose state can be reliably erased afterward. Both properties fall out directly from the digital nature of the agent.