Exploring Graph Cellular Automata

How we can find local rules that produce specific global behavior

Adam Mehdi
Towards Data Science

--

1) Cellular Automaton Basics

The best way to introduce cellular automata is by example. Figure 1 shows a canonical example of a simple cellular automaton. It is called Rule 110 because its set of rules can be specified by the binary sequence 01101110 (a 1 wherever the updated cell is black). Rule 110 begins with the first row of cells, which contains a single black cell. Each subsequent row is a time step, obtained by applying the rules to each cell, updating its state based on its current state and those of its neighbors. For example, the leftmost rule in Figure 1 updates a black cell to be white if both of its neighbors are black.

Figure 1: Rule 110, with an initial state of a single black cell, run for 10, 100, 250, 1000, and 10000 time steps, respectively. Generated using Mathematica 13 [10].

As Figure 1 makes evident, nontrivial patterns emerge; despite its simple initial state and simple rules, Rule 110 exhibits complex global behavior [8].

However, a comparable magnitude of complexity does not arise in all such cellular automata. Figure 2 shows Rule 111, which is identical to Rule 110 in every respect except one update rule: three white cells produce a black cell instead of a white cell. And yet it exhibits an apparently regular pattern of horizontal lines, quite a bit less interesting than the randomly interspersed triangles of Rule 110. This highly non-intuitive difference in complexity arises from a single-bit change in the transition function.
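The figures here were generated with Mathematica, but both rules are simple enough to simulate directly. Below is a minimal Python sketch; the function name, the periodic boundary, and the default width are our own illustrative choices:

```python
def run_elementary_ca(rule, steps, width=None):
    """Run an elementary cellular automaton from a single black cell.
    `rule` is the Wolfram rule number (0-255); bit k of its binary
    expansion gives the updated state for neighborhood pattern k
    (e.g. pattern 110 in binary = 6, so bit 6 of the rule)."""
    width = width or (2 * steps + 1)
    table = [(rule >> k) & 1 for k in range(8)]
    row = [0] * width
    row[width // 2] = 1                    # single black cell
    history = [row]
    for _ in range(steps):
        # Periodic boundary: the row wraps around at the edges.
        row = [table[(row[(i - 1) % width] << 2)
                     | (row[i] << 1)
                     | row[(i + 1) % width]]
               for i in range(width)]
        history.append(row)
    return history

rows110 = run_elementary_ca(110, 20)
rows111 = run_elementary_ca(111, 20)
```

Printing `rows110` with `#` for black cells reproduces the left-growing triangles of Figure 1, while `rows111` shows the alternating horizontal stripes of Figure 2: its single extra rule (three white cells produce a black cell) makes the entire white background blink on and off.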

Figure 2: Rule 111, with an initial state of a single black cell, run for 10, 100, 250, 1000, and 10000 time steps, respectively. Generated using Mathematica 13 [10].

Why you might want to read this

We’ve taken the first steps toward creating digital systems capable of repairing themselves. For those working in artificial intelligence, this is an important step because it opens up a frontier of achievable, programmable behavior previously inaccessible to us. Now let us explain how we’ve done so.

First we formalize classical cellular automata such as Rule 110 to generalize them into graph cellular automata. We will use that formalism to apply graph neural networks; this allows us to learn transition functions that give rise to desired states of graph cellular automata. Incidentally, this allows us to create a self-repairing bunny graph.

This article loosely follows the paper “Learning Graph Cellular Automata” but makes many fewer assumptions on the reader’s knowledge [1].

2) Graph Cellular Automata

2.1) Cellular Automaton Formalism

There is a strong resemblance between various instantiations of cellular automata. Each uses some structure to represent state and applies simple rules to it, and from the repeated application of these rules, complex behavior arises. This resemblance can be more than just described, however; it can be formalized into a general mathematical framework. Using this framework, new types of cellular automata such as graph cellular automata can be straightforwardly described as an extension of the familiar Rule 110. That framework is as follows.

A cellular automaton is a 4-tuple (D, Σ, N, τ) [1], where D is the domain (the set of cells), Σ is the set of states a cell can take, N is the neighborhood function mapping each cell to its neighbors, and τ is the transition function.

Components of a cellular automaton. The math may seem hairy, but it points to a definite picture; see Table 1 for examples.

The transition function τ determines the specific instantiation of the cellular automaton, such as Rule 110 or Rule 111, and the domain D, state set Σ, and neighborhood function N determine the various classes of cellular automata, as shown in Table 1. Table 1 also shows that graph cellular automata are generalizations of classical and gridded cellular automata; when cells are represented as nodes, and the constraint that neighboring cells are nearby in space is relaxed, what results is a graph cellular automaton.

Table 1. A comparison of classical (e.g. Rule 110), gridded (e.g. Conway’s Game of Life), and graph cellular automata.
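To make the 4-tuple concrete, here is a minimal Python sketch of a graph cellular automaton in this formalism. The class, the triangle graph, and the majority-vote rule are all illustrative inventions for this article, not drawn from [1]:

```python
from dataclasses import dataclass
from typing import Callable, Dict, Hashable, List

@dataclass
class GraphCA:
    """A graph cellular automaton as the 4-tuple (D, Sigma, N, tau).
    D is given implicitly by the keys of the adjacency list, Sigma by
    whatever states the cells hold, N by the adjacency itself, and tau
    maps (own state, neighbor states) to a new state."""
    adjacency: Dict[Hashable, List[Hashable]]
    tau: Callable

    def neighbors(self, i):
        """The neighborhood function N."""
        return self.adjacency[i]

    def step(self, state):
        """Synchronously apply tau to every cell, one time step."""
        return {i: self.tau(state[i], [state[j] for j in self.neighbors(i)])
                for i in state}

# Example: a (hypothetical) majority-vote rule on a triangle graph.
ca = GraphCA({0: [1, 2], 1: [0, 2], 2: [0, 1]},
             tau=lambda s, ns: int(sum(ns) + s >= 2))
state = {0: 1, 1: 1, 2: 0}
state = ca.step(state)
```

Swapping in a grid-shaped adjacency and a Rule 110 lookup for `tau` recovers the classical case, which is exactly the sense in which graph cellular automata generalize it.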

Transition Rule

It is the transition function τ that is responsible for the complex behavior in a cellular automaton (we use “transition rule” and “transition function” interchangeably here). Changing a single bit in Rule 110, which has been shown to be as complex as a Turing machine [7], yields Rule 111, which exhibits repetitive behavior. Thus the task of evolving global behavior in graphs is essentially that of finding an interesting transition function τ.

The transition function τ aggregates the state of each cell and its neighbors to produce an updated state. To bake in anisotropic behavior (i.e. weigh different neighbors differently), the transition function must also act on the edge weights. Formally, τ becomes

Formal definition of the transition rule τ.
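The equation itself appears above only as an image; schematically, it has the following shape (our reconstruction based on the surrounding description, so treat the notation as indicative rather than verbatim):

```latex
s_i' \;=\; \tau\!\left(s_i,\; \{\,(s_j,\, e_{ji}) : j \in N(i)\,\}\right)
```

That is, the updated state of cell i depends on its own state, the states of its neighbors, and the weights of the edges connecting them.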

When studying cellular automata, we are usually interested only in the transition functions that give rise to complex behavior. It would therefore be useful to know ahead of time, before running the cellular automaton, what behavior a transition function produces. However, such foreknowledge is not always possible.

Stephen Wolfram conceived of his Principle of Computational Equivalence while observing simple cellular automata such as Rule 110 and Rule 111 [9]. According to it, any system in nature that exhibits some nontrivial sense of complexity “can perform computations up to a maximal (‘universal’) level of computational power” [7]. In other words, to predict the state of Rule 110 at time step t, at least t steps of computation must be used; there are no shortcuts or cutting-ahead, even with complex systems as simple as Rule 110.

3) Graph Neural Network

3.1) Approximating the Transition Function

The Principle of Computational Equivalence dictates that we cannot immediately tell the behavior a transition function produces; we cannot skip ahead of the computation of cellular automata. But nevertheless, the reverse mapping is tractable: Given a desired behavior, it is possible to learn a transition function that gives rise to it. We do so using the methodology of deep learning.

We use deep learning to approximate arbitrary mappings between some domain (e.g. a cellular automaton’s initial state) and some codomain (e.g. a cellular automaton’s state at some time step t). This is possible because neural networks — essentially interleaved compositions of affine transformations with learnable parameters and nonlinear transformations — can approximate any function arbitrarily closely by adjusting their parameters to some configuration [1]. And to search for such a configuration of parameters, we use a greedy optimization algorithm called gradient descent.

Thus we are using a neural network to learn the mapping from initial state to desired behavior of a cellular automaton. However, in so doing, we will not learn the mapping explicitly, but rather learn the transition function τ. In order to compute the desired behavior, then, we will repeatedly apply this approximation of τ to the initial state of the cellular automaton.

For this exploration, we will approximate τ for graph cellular automata. Since the domain D is a graph, we use a graph neural network. Graph neural networks make up a function class that maps graphs to real-valued matrices,

GNN definition, where G is a graph, n is the number of vertices, and d is the dimension of each vertex embedding such that d<<n.

3.2) Message Passing

Figure 3: In message passing, each vertex’s state is updated by aggregating its state and those of its neighbors during each iteration. The blue vertex reaches out to its neighbors, the black vertices, and its state is updated based on each of their states.

Initially, every vertex i is assigned an arbitrarily-initialized vector of dimension d, and these are collected together into a real-valued matrix H of shape (n, d). H is then updated by the message-passing pipeline as follows.

Message-passing, the backbone of GNNs.

The message-passing pipeline comprises three steps:

Explanation of the message passing pipeline.

By iteratively applying this local message-passing pipeline, global information of the graph propagates to the vertices: Since each step sees all vertices receive information from their neighbors, if message-passing is applied for t iterations, all vertices receive information from those vertices a distance of at most t from them. Therefore, message-passing is a locally-acting function that updates a vertex’s state based on that of its neighbors, and in being applied repeatedly, it gives rise to global behavior. In this way, the message-passing is analogous to the local transition function τ, which evolves complex global patterns in cellular automata. And indeed, we will approximate the transition function (omitting edge weights) as a message-passing neural network, and we will learn τ for some task by optimizing this neural network.
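A bare-bones, parameter-free version of this pipeline can be sketched as follows. The averaging aggregator and the path graph are our illustrative choices; a real GNN interleaves learnable transformations with the aggregation:

```python
import numpy as np

def message_passing_step(H, adj):
    """One message-passing iteration: each vertex replaces its state
    with the mean of its own state and those of its neighbors."""
    H_new = np.zeros_like(H)
    for i in range(len(adj)):
        neighborhood = [i] + list(adj[i])   # self plus neighbors
        H_new[i] = H[neighborhood].mean(axis=0)
    return H_new

# Path graph 0-1-2-3 with one-hot initial states, so that H[i, j] > 0
# exactly when information from vertex j has reached vertex i.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
H = np.eye(4)
for _ in range(3):
    H = message_passing_step(H, adj)
```

After one iteration, vertex 0 has only heard from vertex 1; after three, information from vertex 3 (distance 3 away) has reached it, illustrating how t iterations propagate information over a radius of t.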

4) Experiment

4.1) Task

Our goal is to learn a transition function that evolves a graph cellular automaton to a specific final state. Let us couch that in a formalism, as it will make the optimization problem straightforward.

Definitions for our task.

Here, we will use the graph neural network defined in Sec 3.2 for τ_θ. And for G, we will use the Stanford Bunny, a graph constructed from a point cloud of a bunny by treating each point as a vertex and connecting each to its nearby vertices in 3D space [1]. The state vector (s_i-bar) therefore encodes each vertex’s coordinates in the point cloud, and τ_θ will evolve a graph of arbitrary structure to the bunny in 3D space.

4.2) Training

We will now optimize the parameters θ of τ_θ to minimize an L2 loss:

Loss function to minimize.

In (4), t is the final time step and k = 1, …, K indexes the minibatches (i.e. partitions of the vertices).

Explanation of Equation (4), the loss function of the GNN for our task.
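Since Equation (4) appears above only as an image, here is a sketch of its shape consistent with the surrounding description (an L2 loss over minibatches, with τ_θ applied t times; our notation, not a verbatim reproduction):

```latex
\mathcal{L}(\theta) \;=\; \frac{1}{K} \sum_{k=1}^{K}
  \bigl\lVert \tau_\theta^{\,t}\!\left(\bar{S}_k\right) - S^{*}_k \bigr\rVert_2^2
```

Here τ_θ^t denotes t-fold application of the learned transition rule to the initial state S̄_k, and S*_k is the corresponding slice of the target state.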

Here, p^(i) is p as applied at time step i, for each minibatch k and for each epoch (i.e. each time we have seen minibatch k). This procedure is called backpropagation through time. With it, we can learn a parameter configuration of our neural network τ_θ such that it approximates the target transition rule’s mapping:

The target transition rule maps an initial state to a final state. We will approximate this mapping with a GNN.

There are a few remaining tricks that have been found to aid training. One is to draw from a pool of different initial states instead of optimizing from a single initial state S-bar, replacing each sample. This helps create persistent structures in the graph cellular automaton, since the model learns to map a diversity of states to the target state. Perhaps most importantly, however, we can modulate the final time step t. We will see interesting results from differing definitions of t in Sec. 4.3 [1, 5].
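As a heavily simplified sketch of this training loop, the following optimizes a linear stand-in for τ_θ with backpropagation through time on a toy graph. The linear rule, the finite-difference gradients, the pool size, and all dimensions are illustrative choices for this article, not the setup of [1]:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 4-vertex ring graph with self-loops; rows are normalized so that
# multiplying by A averages each vertex's state with its neighbors'.
A = np.array([[1., 1, 0, 1],
              [1., 1, 1, 0],
              [0., 1, 1, 1],
              [1., 0, 1, 1]])
A /= A.sum(axis=1, keepdims=True)

n, d, t = 4, 2, 2                      # vertices, state dim, final time step
W = 0.5 * rng.normal(size=(d, d))      # the parameters theta of our toy tau
S_target = rng.normal(size=(n, d))     # the desired final state S*

def rollout(W, S0):
    """Apply the (linear) transition rule t times: aggregate, then transform."""
    S = S0
    for _ in range(t):
        S = A @ S @ W
    return S

def loss(W, S0):
    return np.mean((rollout(W, S0) - S_target) ** 2)   # an L2 loss

# A small pool of initial states stands in for the pool trick above; the
# gradient of the unrolled computation (backpropagation through time) is
# estimated with finite differences so the sketch needs no autodiff library.
pool = [rng.normal(size=(n, d)) for _ in range(8)]
initial_loss = np.mean([loss(W, S0) for S0 in pool])

lr, eps = 0.05, 1e-6
losses = []
for step in range(300):
    S0 = pool[step % len(pool)]        # draw an initial state from the pool
    base = loss(W, S0)
    grad = np.zeros_like(W)
    for i in range(d):
        for j in range(d):
            Wp = W.copy()
            Wp[i, j] += eps
            grad[i, j] = (loss(Wp, S0) - base) / eps
    W -= lr * grad                     # gradient descent step
    losses.append(base)

final_loss = np.mean([loss(W, S0) for S0 in pool])
```

The essential structure carries over to the real experiment: unroll the transition rule for t steps, compare the result to the target state, and push the gradient back through all t applications.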

4.3) Results

Figure 4: Results of a model trained for several values of the maximum time step t*: t*=10 (1st row), t*=20 (2nd row), and 10 ≤ t ≤ 20 randomly chosen for each step during training (3rd row). Error is measured by the L2 loss, Equation (4) with K=1. Adapted from [2], the supplement of Grattarola et al.

The model was trained with varying final time steps t. Its outputs for a fixed initial state are displayed in Figure 4.

Discussion of Figure 4.

Hence, at t=20, each model successfully produces the rabbit when used as a transition rule in a graph cellular automaton, but only the model whose results are displayed in the 3rd row produces a persisting rabbit.

What is significant about this experiment, however, is not that we can create a bunny-shaped graph. Its significance lies in the fact that this bunny self-organizes, and is capable of self-repair. These are coordinated global behaviors, and yet they result not from top-down design but from iterated application of locally-acting functions, a bottom-up mechanism shared by complex systems across the natural world.

We began this exploration with basic cellular automata, marveling at the complexity that can arise from simple update rules applied to a simple initial state. But due to computational irreducibility, the kind of complexity that might arise from such cellular automata cannot be predicted. Nevertheless, we induced specific global behavior in a graph cellular automaton by replacing simple update rules with graph neural networks. Therefore, we have shown these graph neural cellular automata to be a tool for engineering rudimentary self-organizing systems.

Citations

[1] Grattarola, Daniele, Lorenzo Livi, and Cesare Alippi. “Learning graph cellular automata.” Advances in Neural Information Processing Systems 34 (2021).

[2] Grattarola, Daniele, Lorenzo Livi, and Cesare Alippi. “Learning graph cellular automata, Supplemental Material.” Advances in Neural Information Processing Systems 34 (2021).

[3] Hamilton, William L. “Graph representation learning.” Synthesis Lectures on Artificial Intelligence and Machine Learning 14.3 (2020).

[4] Howard, Jeremy, and Sylvain Gugger. Deep Learning for Coders with fastai and PyTorch. O’Reilly Media, 2020.

[5] Mordvintsev, Alexander, et al. “Growing neural cellular automata.” Distill 5.2 (2020): e23.

[6] Perraudin, Nathanaël, et al. “GSPBOX: A toolbox for signal processing on graphs.” arXiv preprint arXiv:1408.5781 (2014).

[7] Weisstein, Eric W. “Computational Irreducibility.” mathworld.wolfram.com/ (2002).

[8] Weisstein, Eric W. “Rule 110.” mathworld.wolfram.com/ (2002).

[9] Wolfram, Stephen. A new kind of science. Vol. 5. Champaign: Wolfram media, 2002.

[10] Wolfram Research, Inc., Mathematica, Version 13.0, Champaign, IL (2021).
