Blog

Chemical Turing Machines

October 21, 2024

Finite state automata (FSA) can be modeled with reactions of the form $A + B \to C + D.$ FSAs operate over regular languages, so our job is to find some reaction which corresponds to determining whether or not a given "word" (sequence of inputs to the reaction) is a member of the language.

Generally, we will be associating languages of various grammars in the Chomsky hierarchy to certain combinations of "aliquots" added to a one-pot reaction, and in this case we want our aliquots to be potassium iodide and silver nitride. Take the language over the alphabet ${a,b}$ consisting of all words with at least one $a$ and one $b.$ Now associate $a$ with some part of $\text{KIO}_3$ and $b$ with some part of $\text{AgNO}_3.$ Then, the reaction $$ \text{KIO}_3 + \text{AgNO}_3 \to \text{AgIO}_3 (\text{s}) + \text{KNO}_3 $$ only occurs when both of the reactants are present in solution, so the word is in the language if and only if silver iodide is present. (Or, equivalently, heat is released).

Type-2 grammars consist of languages that can be modeled with pushdown automatas, which differ from FSAs in that they have a stack that can store strings of arbitrary sizes. We call these languages "context-free languages", and the reactions which we associate to context-free languages are those with intermediaries. Again, because of automata equivalence, we can consider the simple case of the Dyck language: the collection of parentheses-strings that never contain more closed parentheses than open parentheses at any $i$ and contain exactly equal amounts of closed and open parentheses at $i=n.$

We associate this with the $pH$ reaction of sodium hydroxide and acetic acid (vinegar), with the amounts of each aliquot normalized to create identical disturbances in the $pH$ of the solution. Note that as $pH$ indicator aliquot is present at the beginning and end of the reaction (we associate it with the start-and-end token), the aliquot serves as the intermediary (the "stack", if you will). So, if $pH \geq \text{midpoint } pH$ throughout the reaction, but is $\text{midpoint } pH$ at the end, the reactor accepts the word. If not, it does not.

Incidentally, you can interpret this as the enthalpy yield $Y_{\Delta H} (\%)$ of the computation, defined as $$ Y_{\Delta H} (\%) = \frac{\text{reaction heat during computation}}{\text{formation heat of input string}} \times 100. $$ Dyck words maximize the enthalpy yield, whereas all other input sequences with imbalanced numbers of parentheses have lower enthalpy yields. Implication: all PDAs are doing something like "enthalpy maximization" in their computation. Couldn't find a good reference or exposition here, but something to look into.

How do we model Turing machines? You can think of a Turing machine as a "two-stack" PDA, where each stack corresponds to moving left or right on the tape. Physically, this implies that we want to model TMs with a reaction with at least two interdependent intermediaries, and we want it to be "expressive" enough to model "non-linearities". Oscillatory redox reactions are a natural choice, of which the Belousov-Zhabotinsky (BZ) reaction is the most famous.

A typical BZ reaction involves the combination of sodium bromate and malonic acid, with the main structure as follows: $$ 3\text{BrO}_3^- + 3\text{CH}_2(\text{COOH})_2 + 3\text{H}^+ \to 3\text{BrCH}(\text{COOH})_2 + 4\text{CO}_2 + 2\text{HCOOH} + 5\text{H}_2\text{O}. $$

BZ reactions have a ton of macro-structure. Color changes as a function of the amount of oxidized catalyst, the proportions of the reactants and products fluctuate periodically, and even spatial patterns emerge from micro-heterogeneity in concentrations (e.g. reaction-diffusion waves, Pack patterns). These properties are incredibly interesting in and of themselves, but all we need for modeling TMs is that the reaction is sensitive to the addition of small amounts of aliquot.

Consider the language $L_3 = \{a^nb^nc^n \mid n \geq 0\}.$ Dueñas-Díez and Pérez-Mercader associate the letter $a$ with sodium bromate and $b$ with malonic acid. $c$ must somehow be dependent on the concentrations of $a$ and $b,$ so we associate $c$ with the $pH$ of the one-pot reactor, which we can read with sodium hydroxide. An aliquot of the rubidium catalyst maps to the start-and-end token.

Oscillation frequency $f$ is proportional to $[\text{BrO}_3]^\alpha \times [\text{CH}_2(\text{COOH})_2]^{\beta} \times [\text{NaOH}]^{-\gamma},$ but it can also be modeled as a nonlinear function of the difference between the maximum redox value of the reaction and the mean redox value of a given oscillation, that is: $$ D = V_{\text{max}} + \left( V_T + \frac{V_P - V_T}{2}\right), $$ where $V_T$ and $V_P$ are the trough and peak potentials, respectively, and $V_\text{max}$ is the maximum potential. Ultimately, the final frequency of the reaction can be modeled as a quadratic in $D_{\#}$ to high-precision ($\#$ denotes the start-and-end token, so it can be taken to be the last timestep in reaction coordinates).

What actually allows word-by-word identification though, is the sensitivity of the oscillatory patterns to the concentrations of specific intermediaries:

The various "out-of-order" signatures for words not in $L_3$ can be explained as follows. Each symbol has an associated distinct pathway in the reaction network. Hence, if the aliquot added is for the same symbol as the previous one, the pathway is not changed but reinforced. In contrast, when the aliquot is different, the reaction is shifted from one dominant pathway to another pathway, thus reconfiguring the key intermediate concentrations and, in turn, leading to distinctive changes in the oscillatory patterns. The change from one pathway, say 1, to say pathway 2 impacts the oscillations differently than going from pathway 2 to pathway 1. This is what allows the machine to give unique distinctive behaviors for out-of-order substrings.1

Thermodynamically, characterizing word acceptance is a little bit more involved than that of PDAs or FSAs, but it can still be done. Define the area of a word as $$ A^{\text(Word)} = V_{\text{max}} + \tau' - \int_{t_{\#} + 30}^{t_{\#} + \tau} V_\text{osc}(t) \, dt, $$ where $t_{\#}$ is the time in reaction coordinates where the end token is added, $\tau'$ is the time interval between symbols, $V_\text{max}$ is the maximum redox potential, and $V_\text{osc}$ is the measured redox potential by the Nerst equation $$ V_\text{osc} = V_0 + \frac{RT}{nF} \ln \left( \frac{[Ru(bpy)^{3+}_{3}]}{[Ru(bpy)^{2+}_{3}]} \right), $$ where $Ru(bpy)^{3+}_{3}$ and $Ru(bpy)^{2+}_{3}$ are the concentrations of the oxidized and reduced catalyst of the BZ reaction, respectively. Now, the Gibbs free energy can be related to the redox potential as so: $$ \Delta G_\text{osc} = -nFV_\text{osc}, $$ so the area of a word can be rewritten in terms of the free energy as $$ A^{\text(Word)} = - \frac{1}{n_eF} \left( \Delta G ' \times \tau ' - \int_{t_{\#} + 30}^{t_{\#} + \tau} \Delta G_\text{osc}(t) dt\right). $$ Accepted words all have some constant value of $A^{\text(Word)},$ while rejected words have a value that is dependent on the word.

$L_3$ is a context-sensitive language, so it is only a member of the Type-1 grammar not the Type-0 grammar. However, for our purposes (realizing some practical implementation of a TM) it is roughly equivalent, as any finite TM can be realized as a two-stack PDA, and this models a two-stack PDA quite well.

References

1

Dueñas-Díez, M., & Pérez-Mercader, J. (2019). How Chemistry Computes: Language Recognition by Non-Biochemical Chemical Automata. From Finite Automata to Turing Machines. iScience, 19, 514-526. https://doi.org/10.1016/j.isci.2019.08.007

2

Magnasco, M. O. (1997). Chemical Kinetics is Turing Universal. Physical Review Letters, 78(6), 1190-1193. https://doi.org/10.1103/PhysRevLett.78.1190

3

Dueñas-Díez, M., & Pérez-Mercader, J. (2019). Native chemical automata and the thermodynamic interpretation of their experimental accept/reject responses. In The Energetics of Computing in Life and Machines, D.H. Wolpert, C. Kempes, J.A. Grochow, and P.F. Stadler, eds. (SFI Press), pp. 119–139.

4

Hjelmfelt, A., Weinberger, E. D., & Ross, J. (1991). Chemical implementation of neural networks and Turing machines. Proceedings of the National Academy of Sciences, 88(24), 10983-10987. https://doi.org/10.1073/pnas.88.24.10983


Species as Canonical Referents of Super-Organisms

October 17, 2024

A species is a reproductively isolated population. In essence, it consists of organisms which can only breed with each other, so its ability to self-replicate is entirely self-contained. In practice, the abstraction only applies well to macroflora and macrofauna, which is still enough to inform our intuitions of super-organismal interaction.

Interspecific interactions can frequently be modeled by considering the relevant species as agents in their own right. Agents motivated by self-sustention to acquire resources, preserve the health of their subagents, and bargain or compete with others on the same playing field as themselves. Parasitism, predation, pollination—all organismal interactions generalizable to super-organismal interactions.

Optimization of the genome does not occur at the level of the organism, nor does it occur at the level of the tribe. It occurs on the level of the genome, and selects for genes which encode traits which are more fit. From this perspective, it makes sense for "species" to be a natural abstraction. Yet, I claim there are properties which species have that make them particularly nice examples of super-organisms in action. Namely:

However, it is precisely because species have such nice properties that we should be incredibly cautious when using them as intuition pumps for other kinds of super-organisms, such as nation-states, companies, or egregores. For instance:

These "issues" are downstream from horizontal boundaries between other super-organisms we want to consider being less strong than the divides between idealized species. While Schelling was able to develop doctines of mutually-assured destruction for Soviet-American relations, many other nation-state interactions are heavily mediated by immigration and economic intertwinement. It makes less sense to separate China and America than it does to separate foxes and rabbits.

Don't species run into the same issues as well? Humans are all members of one species, and we manage to have absurd amounts of intraspecial conflict. Similarly, tribal dynamics in various populations are often net negative for the population as a whole. Why shall we uphold species as the canonical referent for superorganisms?

Species are self-sustaining and isolated. The platonic ideal of a species would not only be reproductively isolated, but also resource isolated, in that the only use for the resources which organisms of a species would need to thrive were ones which were unusable for any other purpose. Horizontal differentiation is necessary to generalize agent modeling to systems larger than ourselves, and species possess a kind of horizontal differentiation which is important and powerful.

A corollary of this observation is that insofar as our intuitions for "superorganismal interaction" are based on species-to-species interaction, they should be tuned to the extent to which the superorganisms we have in mind are similar to species. AI-human interaction in worlds where AIs have completely different hardware substrates to humans are notably distinct from ones in which humans have high-bandwidth implants and absurd cognitive enhancement, so they can engage in more symbiotic relationships.

I would be interested in fleshing out these ideas more rigorously, either in the form of case studies or via a debate. If you are interested, feel free to reach out.

Crossposted to LessWrong.

1

One way to establish a boundary between two categories is to define properties which apply to some class of objects which could be sorted into one of the two buckets. But what is the "class of objects" which egregores encompass?! Shall we define a "unit meme" now?

2

I'm aware I'm not fully doing justice to egregores here. I still include them as an example of a "superorganism" because they do describe something incredibly powerful. E.g., explaining phenomena where individuals acting in service of an ideology collectively contravene their own interests.


Probabilistic Logic <=> Reflective Oracles?

June 30, 2024

The Probabilistic Payor's Lemma implies the following cooperation strategy:

Let $A_{1}, \ldots, A_{n}$ be agents in a multiplayer Prisoner's Dilemma, with the ability to return either 'Cooperate' or 'Defect' (which we model as the agents being logical statements resolving to either 'True' or 'False'). Each $A_{i}$ behaves as follows:

$$ , \vdash \Box_{p_{i}} \left( \Box_{\max {p_{1},\ldots, p_{n}}}\bigwedge_{k=1}^n A_{k} \to \bigwedge_{k=1}^n A_{k} \right) \to A_{i} $$

Where $p_i$ represents each individual agents' threshold for cooperation (as a probability in $[0,1]$), $\Box_p , \phi$ returns True if credence in the statement $\phi$ is greater than $p,$ and the conjunction of $A_{1}, \ldots, A_{n}$ represents 'everyone cooperates'. Then, by the PPL, all agents cooperate, provided that all $\mathbb{P}_{A_{i}}$ give credence to the cooperation statement greater than each and every $A_{i}$'s individual thresholds for cooperation.

This formulation is desirable for a number of reasons: firstly, the Payor's Lemma is much simpler to prove than Lob's Theorem, and doesn't carry with it the same strange consequences as a result of asserting an arbitrary modal-fixedpoint; second, when we relax the necessitation requirement from 'provability' to 'belief', this gives us behavior much more similar to how agents actually I read it as it emphasizing the notion of 'evidence' being important.

However, the consistency of this 'p-belief' modal operator rests on the self-referential probabilistic logic proposed by Christiano 2012, which, while being consistent, has a few undesirable properties: the distribution over sentences automatically assigns probability 1 to all True statements and 0 to all False ones (meaning it can only really model uncertainty for statements not provable within the system).

I propose that we can transfer the intuitions we have from probabilistic modal logic to a setting where 'p-belief' is analogous to calling a 'reflective oracle', and this system gets us similar (or identical) properties of cooperation.

Oracles

A probabilistic oracle $O$ is a function from $\mathbb{N} \to [0,1]^\mathbb{N}.$ Here, its domain is meant to represent an indexing of probabilistic oracle machines, which are simply Turing machines allowed to call an oracle for input. An oracle can be queried with tuples of the form $(M, p),$ where $M$ is a probabilistic oracle machine and $p$ is a rational number between 0 and 1. By Fallenstein et. al. 2015, there exists a reflective oracle on each set of queries such that $O(M,p) = 1$ if $\mathbb{P}(M() = 1) > p,$ and $O(M,p) = 0$ if $\mathbb{P}(M() = 0) < 1-p$ (check this).

Notice that a reflective oracle has similar properties to the $Bel$ operator in self-referential probabilistic logic. It has a coherent probability distribution over probabilistic oracle machines (as opposed to sentences), it only gives information about the probability to arbitrary precision via queries ($O(M,p)$ vs. $Bel(\phi)$). So, it would be great if there was a canonical method of relating the two.

Peano Arithmetic is Turing-complete, there exists a method of embedding arbitrary Turing machines in statements in predicate logic and there also exist various methods for embedding Turing machines in PA. We can form a correspondence where implications are preserved: notably, $x\to y$ simply represents the program if TM(x), then TM(y) , and negations just make the original TM output 1 where it outputted 0 and vice versa.

(Specifically, we're identifying non-halting Turing machines with propositions and operations on those propositions with different ways of composing the component associated Turing machines. Roughly, a Turing machine outputting 1 on an input is equivalent to a given sentence being true on that input)

CDT, expected utility maximizing agents with access to the same reflective oracle will reach Nash equilibria, because reflective oracles can model other oracles and other oracles that are called by other probabilistic oracle machines---so, at least in the unbounded setting, we don't have to worry about infinite regresses, because the oracles are guaranteed to halt.

So, we can consider the following bot: $$ A_{i} := O_{i} \left( O_{\bigcap i} \left( \bigwedge_{k=1}^n A_{k}\right) \to \bigwedge_{k=1}^n A_{k}, , p_{i}\right), $$ where $A_i$ is an agent represented by a oracle machine, $O_i$ is the probabilistic oracle affiliated with the agent, $O_{\bigcap i}$ is the closure of all agents' oracles, and $p_{i} \in \mathbb{Q} \cap [0,1]$ is an individual probability threshold set by each agent.

How do we get these closures? Well, ideally $O_{\bigcap i}$ returns $0$ for queries $(M,p)$ if $p < \min{p_{M_1}, \ldots, p_{M_n}}$ and $1$ if $p > \max {p_{M_1}, \ldots, p_{M_n}},$ and randomizes for queries in the middle---for the purposes of this cooperation strategy, this turns out to work.

I claim this set of agents has the same behavior as those acting in accordance with the PPL: they will all cooperate if the 'evidence' for cooperating is above each agents' individual threshold $p_i.$ In the previous case, the 'evidence' was the statement $\Box_{\max {p_{1},\ldots, p_{n}}}\bigwedge_{k=1}^n A_{k} \to \bigwedge_{k=1}^n A_{k}.$ Here, the evidence is the statement $O_{\bigcap i} \left( \bigwedge_{k=1}^n A_{k}\right) \to \bigwedge_{k=1}^n A_{k}.$

To flesh out the correspondence further, we can show that the relevant properties of the $p$-belief operator are found in reflective oracles as well: namely, that instances of the weak distribution axiom schema are coherent and that necessitation holds.

For necessitation, $\vdash \phi \implies \vdash \Box_{p}\phi$ turns into $M_{\phi}() = 1$ implying that $O(M_{\phi},p)=1,$ which is true by the properties of reflective oracles. For weak distributivity, $\vdash \phi \to \psi \implies \vdash \Box_{p} \phi \to \Box_{p}\psi$ can be analogized to 'if it is true that the Turing machine associated with $\phi$ outputs 1 implies that the Turing machine associated with $\psi$ outputs 1, then you should be at least $\phi$-certain that $\psi$-outputs 1, so $O(M_{\phi},p)$ should imply $O(M_{\psi}, p)$ in all cases (because oracles represent true properties of probabilistic oracle machines, which Turing machines can be embedded into).

Models

Moreover, we can consider oracles to be a rough model of the p-belief modal language in which the probabilistic Payor's Lemma holds. We can get an explicit model to ensure consistency (see the links with Christiano's system, as well as its interpretation in neighborhood semantics), but oracles seem like a good intuition pump because they actively admit queries of the same form as $Bel(\phi)>p,$ and they are a nice computable analog.

They're a bit like the probabilistic logic in the sense that a typical reflective oracle just has full information about what the output of a Turing machine will be if it halts, and the probabilistic logic gives $\mathbb{P}(\phi)=1$ to all sentences which are deducible from the set of tautologies in the language. So the correspondence has some meat.

Crossposted to LessWrong.


Rome

March 20, 2024

A post-apocalyptic fever dream. The oldest civilized metropolis. Where sons are pathetic in the eyes of their father, and both are pathetic in the eyes of their grandfathers—all while wearing blackened sunglasses and leather jackets. Grown, not made.

Rome is, perhaps, the first place I recognized as solely for visiting, never living. Unlike Tokyo, one feels this immediately. Japan’s undesirability stems primarily from its inordinate, sprawling bureaucracy that is, for the most part, hidden from the typical visitor. Rome’s undesirability is apparent for all to see—it’s loud, stifling, unmaintained, and requires arduous traversals.

Population c. 100 C.E.: 1 million people.
Population c. 1000 C.E.: 35,000.
Population c. 2024 C.E.: 2.8 million people.

Rebound? Not so fast—the global population in the year 100 was just 200 million.

And this is obvious. The city center is still dominated by the Colosseum, the Imperial fora, and Trajan’s Market. Only the Vittoriano holds a candle to their extant glory. Yet the hordes of tourists still walk down the Via dei Forti Imperiali and congregate in stupidly long lines at the ticket booth to see ruins!

I walked across the city from east to west, passing by a secondary school, flea market, and various patisseries (is that the correct wording?). The pastries were incredible. The flea market reminded me of Mexico, interestingly enough. Felt very Catholic.

(All the buses and trams run late in Rome. This too, is very Catholic, as Orwell picked up on during his time in Catalonia and as anyone visiting a Mexican house would know. Plausibly also Irish?)

Rome’s ivies pervade its structures. Villas, monuments, churches (all 900 of them), and fountains all fall victim to these creepers. It gives the perception of a ruined city, that Roman glory has come and gone—and when one is aware of Italian history, it is very, very hard to perceive Rome as anything else than an overgrown still-surviving bastion against the continuing spirit of the Vandals.

Roman pines, too, are fungiform. Respighi’s tone poem doesn’t do justice to them. Perhaps this is just a Mediterranean vibe? But amongst the monumental Classical, Romanesque, and Neoclassical structures of the Piazza Venezia, these pines are punctual. Don’t really know how else to convey it.

It is difficult to comprehend how the animalistic, gladitorial Roman society became the seat of the Catholic Church. This city is clearly Gaian, and clearly ruled by the very same gods their Pantheon pays homage to. The Christian God is not of nature, it is apart from nature. It is not Dionysian, it is not even Apollonian (because it cannot recognize or respect the Dionysian, and as such cannot exist in context of it, only apart from it). And yet, the Pope persists.

I have not seen the Vatican yet. I want to. I will be back.


Self-Referential Probabilistic Logic Admits the Payor's Lemma

November 28, 2023

In summary: A probabilistic version of the Payor's Lemma holds under the logic proposed in the Definability of Truth in Probabilistic Logic. This gives us modal fixed-point-esque group cooperation even under probabilistic guarantees.

Background

Payor's Lemma: If $\vdash \Box (\Box x \to x) \to x,$ then $\vdash x.$

We assume two rules of inference:

Proof:

  1. $\vdash x \to (\Box x \to x),$ by tautology;
  2. $\vdash \Box x \to \Box (\Box x \to x),$ by 1 via necessitation and distributivity;
  3. $\vdash \Box (\Box x \to x) \to x$, by assumption;
  4. $\vdash \Box x \to x,$ from 2 and 3 by modus ponens;
  5. $\vdash \Box (\Box x \to x),$ from 4 by necessitation;
  6. $\vdash x,$ from 5 and 3 by modus ponens.

The Payor's Lemma is provable in all normal modal logics (as it can be proved in $K,$ the weakest, because it only uses necessitation and distributivity). Its proof sidesteps the assertion of an arbitrary modal fixedpoint, does not require internal necessitation ($\vdash \Box x \implies \vdash \Box \Box x$), and provides the groundwork for Lobian handshake-based cooperation without Lob's theorem.

It is known that Lob's theorem fails to hold in reflective theories of logical uncertainty. However, a proof of a probabilistic Payor's lemma has been proposed, which modifies the rules of inference necessary to be:

The question is then: does there exist a consistent formalism under which these rules of inference hold? The answer is yes, and it is provided by Christiano 2012.

Setup

(Regurgitation and rewording of the relevant parts of the Definability of Truth)

Let $L$ be some language and $T$ be a theory over that language. Assume that $L$ is powerful enough to admit a Godel encoding and that it contains terms which correspond to the rational numbers $\mathbb{Q}.$ Let $\phi_1, \phi_{2} \ldots$ be some fixed enumeration of all sentences in $L.$ Let $\ulcorner \phi \urcorner$ represent the Godel encoding of $\phi.$

We are interested in the existence and behavior of a function $\mathbb{P}: L \to [0,1]^L,$ which assigns a real-valued probability in $[0,1]$ to each well-formed sentence of $L.$ We are guaranteed the coherency of $\mathbb{P}$ with the following assumptions:

  1. For all $\phi, \psi \in L$ we have that $\mathbb{P}(\phi) = \mathbb{P}(\phi \land \psi) + \mathbb{P}(\phi \lor \neg \psi).$
  2. For each tautology $\phi,$ we have $\mathbb{P}(\phi) = 1.$
  3. For each contradiction $\phi,$ we have $\mathbb{P}(\phi) = 0.$

Note: I think that 2 & 3 are redundant (as says John Baez), and that these axioms do not necessarily constrain $\mathbb{P}$ to $[0,1]$ in and of themselves (hence the extra restriction). However, neither concern is relevant to the result.

A coherent $\mathbb{P}$ corresponds to a distribution $\mu$ over models of $L.$ A coherent $\mathbb{P}$ which gives probability 1 to $T$ corresponds to a distribution $\mu$ over models of $T$. We denote a function which generates a distribution over models of a given theory $T$ as $\mathbb{P}_T.$

Syntactic-Probabilistic Correspondence: Observe that $\mathbb{P}_T(\phi) =1 \iff T \vdash \phi.$ This allows us to interchange the notions of syntactic consequence and probabilistic certainty.

Now, we want $\mathbb{P}$ to give sane probabilities to sentences which talk about the probability $\mathbb{P}$ gives them. This means that we need some way of giving $L$ the ability to talk about itself.

Consider the formula $Bel.$ $Bel$ takes as input the Godel encodings of sentences. $Bel(\ulcorner \phi \urcorner)$ contains arbitrarily precise information about $\mathbb{P}(\phi).$ In other words, if $\mathbb{P}(\phi) = p,$ then the statement $Bel(\ulcorner \phi \urcorner) > a$ is True for all $a < p,$ and the statement $Bel(\ulcorner \phi \urcorner) > b$ is False for all $b > p.$ $Bel$ is fundamentally a part of the system, as opposed to being some metalanguage concept.

(These are identical properties to that represented in Christiano 2012 by $\mathbb{P}(\ulcorner \phi \urcorner).$ I simply choose to represent $\mathbb{P}(\ulcorner \phi \urcorner)$ with $Bel(\ulcorner \phi \urcorner)$ as it (1) reduces notational uncertainty and (2) seems to be more in the spirit of Godel's $Bew$ for provability logic).

Let $L'$ denote the language created by affixing $Bel$ to $L.$ Then, there exists a coherent $\mathbb{P}_T$ for a given consistent theory $T$ over $L$ such that the following reflection principle is satisfied:

$$ \forall \phi \in L' ; \forall a,b \in \mathbb{Q} : (a < \mathbb{P}{T}(\phi) < b) \implies \mathbb{P}{T}(a < Bel(\ulcorner \phi \urcorner) < b) = 1. $$

In other words, $a < \mathbb{P}_T(\phi) < b$ implies $T \vdash a < Bel(\ulcorner \phi \urcorner) < b.$

Proof

(From now, for simplicity, we use $\mathbb{P}$ to refer to $\mathbb{P}_T$ and $\vdash$ to refer to $T \vdash.$ You can think of this as fixing some theory $T$ and operating within it).

Let $\Box_p , (\phi)$ represent the sentence $Bel(\ulcorner \phi \urcorner) > p,$ for some $p \in \mathbb{Q}.$ We abbreviate $\Box_p , (\phi)$ as $\Box_p , \phi.$ Then, we have the following:

Probabilistic Payor's Lemma: If $\vdash \Box_p , (\Box_p , x \to x) \to x,$ then $\vdash x.$

Proof as per Demski:

  1. $\vdash x \to (\Box_{p},x \to x),$ by tautology;
  2. $\vdash \Box_{p}, x \to \Box_{p}, (\Box_{p}, x \to x),$ by 1 via weak distributivity,
  3. $\vdash \Box_{p} (\Box_{p} , x \to x) \to x$, by assumption;
  4. $\vdash \Box_{p} , x \to x,$ from 2 and 3 by modus ponens;
  5. $\vdash \Box_{p}, (\Box_{p}, x \to x),$ from 4 by necessitation;
  6. $\vdash x,$ from 5 and 3 by modus ponens.

Rules of Inference:

Necessitation: $\vdash x \implies \vdash \Box_p , x.$ If $\vdash x,$ then $\mathbb{P}(x) = 1$ by syntactic-probabilistic correspondence, so by the reflection principle we have $\mathbb{P}(\Box_p , x) = 1,$ and as such $\vdash \Box_p , x$ by syntactic-probabilistic correspondence.

Weak Distributivity: $\vdash x \to y \implies \vdash \Box_p , x \to \Box_p , y.$ The proof of this is slightly more involved.

From $\vdash x \to y$ we have (via correspondence) that $\mathbb{P}(x \to y) = 1,$ so $\mathbb{P}(\neg x \lor y) = 1.$ We want to prove that $\mathbb{P}(\Box_p , x \to \Box_p , y) = 1$ from this, or $\mathbb{P}((Bel(\ulcorner x \urcorner) \leq p) \lor (Bel(\ulcorner y \urcorner) > p)) = 1.$ We can do casework on $x$. If $\mathbb{P}(x) \leq p,$ then weak distributivity follows from vacuousness. If $\mathbb{P}(x) >p,$ then as $$ \begin{align*}
\mathbb{P}(\neg x \lor y) &= \mathbb{P}(x \land(\neg x \lor y)) + \mathbb{P}(\neg x \land (\neg x \lor y)), \\
1 &= \mathbb{P}(x \land y) + \mathbb{P}(\neg x \lor (\neg x \land y)), \\ 1 &= \mathbb{P}(x \land y) + \mathbb{P}(\neg x), \end{align*} $$ $\mathbb{P}(\neg x) < 1-p,$ so $\mathbb{P}(x \land y) < p,$ and therefore $\mathbb{P}(y) > p.$ Then, $Bel(\ulcorner y \urcorner) > p$ is True by reflection, so by correspondence it follows that $\vdash \Box_p , x \to \Box_p y.$

(I'm pretty sure this modal logic, following necessitation and weak distributivity, is not normal (it's weaker than $K$). This may have some implications? But in the 'agent' context I don't think that restricting ourselves to modal logics makes sense).

Bots

Consider agents $A,B,C$ which return True to signify cooperation in a multi-agent Prisoner's Dilemma and False to signify defection. (Similar setup to Critch's ). Each agent has 'beliefs' $\mathbb{P}_A, \mathbb{P}_B, \mathbb{P}_C : L \to [0,1]^L$ representing their credences over all formal statements in their respective languages (we are assuming they share the same language: this is unnecessary).

Each agent has the ability to reason about their own 'beliefs' about the world arbitrarily precisely, and this allows them full knowledge of their utility function (if they are VNM agents, and up to the complexity of the world-states they can internally represent). Then, these agents can be modeled with Christiano's probabilistic logic! And I would argue it is natural to do so (you could easily imagine an agent having access to its own beliefs with arbitrary precision by, say, repeatedly querying its own preferences).

Then, if $A,B,C$ each behave in the following manner:

where $E = A \land B \land C$ and $e = \max ({ a,b,c }),$ they will cooperate by the probabilistic Payor's lemma.

Proof:

  1. $\vdash \Box_a , (\Box_e , E \to E) \land \Box_b , (\Box_e , E \to E) \land \Box_c , (\Box_e , E \to E) \to A \land B \land C,$ via conjunction;
  2. $\vdash \Box_e , (\Box_e , E \to E) \to E,$ as if the $e$-threshold is satisfied all others are as well;
  3. $\vdash E,$ by probabilistic Payor.

This can be extended to arbitrarily many agents. Moreso, the valuable insight here is that cooperation is achieved when the evidence that the group cooperates exceeds each and every member's individual threshold for cooperation. A formalism of the intuitive strategy 'I will only cooperate if there are no defectors' (or perhaps 'we will only cooperate if there are no defectors').

It is important to note that any $\mathbb{P}$ is going to be uncomputable. However, I think modeling agents as having arbitrary access to their beliefs is in line with existing 'ideal' models (think VNM -- I suspect that this formalism closely maps to VNM agents that have access to arbitrary information about their utility function, at least in the form of preferences), and these agents play well with modal fixedpoint cooperation.

Acknowledgements

This work was done while I was a 2023 Summer Research Fellow at the Center on Long-Term Risk. Many thanks to Abram Demski, my mentor who got me started on this project, as well as Sam Eisenstat for some helpful conversations. CLR was a great place to work! Would highly recommend if you're interested in s-risk reduction.

Crossposted to the AI Alignment Forum.


Geneva

September 13, 2023

Geneva is evil.

It's overpriced, loud, and dirty. Paying ten francs for a medicore street taco is no way to live life. God forbid you visit the city center during the day, and stay as far away from Geneva station as you can. I thought the air was supposed to be good in the Alps?

But above all, it reeks of fakeness.

It calls itself the "Peace Capital", claims it's too good to have twin cities, and prides itself on its cosmopolitanism. On what grounds? Before Hitler's fall, Geneva's only claim to facilitating international diplomacy was hosting the League of Nations -- admittedly the best international governing body we've had thus far, but still. After, every international organization and their shadow backers clamored to have their headquarters (or at least their European headquarters) in Geneva. The UN, WHO, UNHCR, Red Cross, WTO, WIPO, WMO, ILO, ...

Did you know that the largest non-financial services industry in Geneva is watchmaking? Rolex, Patek Philippe, etc. have factories just outside of Geneva proper. To be fair, 'financial services' also excludes commodity trading, of which Geneva is to oil, sugar, grains, and coffee as Rotterdam is to metals. Vitol & Trafigura both have their headquarters in Geneva (and one must wonder whether or not this is for convenience or to take advantage of lax Swiss banking laws...remember Marc Rich?)

Two-thirds of the corporate tax in Geneva comes from commodity trading, banking, and watchmaking. These international organizations? Don't contribute to the economy. (Yes, they bring people & these people use services & this allows Geneva natives to benefit from the overwhelming amount of NGOs and international bodies in their city. Still.)

Tragically, Geneva once had a soul. The 'Protestant Rome' which once served as the birthplace of the Calvinist Revolution was annexed by Catholic France & revolted as a response. The city had opinions that informed its identity -- not a pseudo-identity formed from undeserved arrogance & globalism.

Demographic shifts (mostly French immigration to French-speaking Switzerland) led to Catholics forming the largest religious group in Geneva today, followed by atheists. (I am not blaming immigration for Geneva's soullessness! it is just another piece of the puzzle). This, along with its absurd emphasis on being a truly international city, undergird the sense that Geneve has lost its way.

And the Jet d'Eau... really? Geneva really had to take the self-masturbatory imagery to another level...


Toledo

September 12, 2023

One recounts that Washington Irving, who was traveling in Spain at the time, suggested the name to his brother, a local resident; this explanation ignores the fact that Irving returned to the United States in 1832. Others award the honor to Two Stickney, son of the major who quaintly numbered his sons and named his daughters after States. The most popular version attributes the naming to Willard J. Daniels, a merchant, who reportedly suggested Toledo because it 'is easy to pronounce, is pleasant in sound, and there is no other city of that name on the American continent.'

- The Ohio Guide on the naming of Toledo, Ohio

I found myself in Toledo one night.

I was trying to get from Ann Arbor to Boston via Amtrak (I had no working photo ID, and at the time I didn't realize that TSA takes IDs up to one year expired as valid for domestic travel) and for some reason my connection was in Toledo. Bus from Ann Arbor to Toledo, train from Toledo to Boston. Simple.

(it actually was quite simple --- this won't be some sort of beginning to a trashy horror story. Pinky promise)

Abandoned Factories 1967: Super Bowl 1, Apollo 1 blows up, Ali fights the draft, Thurgood Marshall rises to the court, and Detroit dies.

Woe befell America's automotive capital with the long, hot summer of '67 and some of the bloodiest race riots in American history. Eventually LBJ used the Insurrection Act to send the National Guard to quell the riots, but it left the west of Lake Erie a shell of its former self.

Today, Detroit is almost a ghost town. It's defaulted on its debt (and gone bankrupt!), has the 4th highest murder rate in major cities in the USA, and its former mayor was convicted on 24 felony counts and sentenced to 28 years in prison.

Luckily, I wasn't in Detroit! So you can imagine how surprised I was to find a ramshackle paper mill right next to the train station. And next to that was a junkyard, and next to that was another unused factory, and next to that was... you get the picture.

If you were in front of an abandoned factory at 3AM I would certainly hope you at least took a look around inside. Not that I would ever do such a thing, but it seems like such a missed opportunity...

[pictures incoming! in future updates :D]

Apparently the dereliction of Detroit's manufacturing capacity took Toledo (and eventually, the rest of the Midwest) with it.

Fellow Travelers Mainstays on the Amtrak: Mennonites, punks, and wannabe vagabonds.

Mennonites & the Amish are often mistaken for each other. Both practice a certain amount of technological ascetism (more extreme in the conservative branches), both are Anabaptist derived, and both use the Amtrak as a form of transportation. But you probably see the conservative Mennonites (given their distinctive dress -- 1850s cowboy vibes), especially given that large numbers of them settled in the Midwest.

Punks are interchangeable. Spiky pink hair, silver chains, black skinny jeans, relationships with inappropriate age gaps -- always the same. Entertaining, in small doses.

Wannabe vagabonds: me! :D

Highly, highly recommend talking to the person sitting next to (or in front of, or behind) you on the train. Americans like talking to people, and you probably have nothing better to do. The WiFi is atrocious.

Why Is The Station So BIG MLK Plaza is Toledo's station. It is massive.

Four floors. Gothic. Nearly a century and a half old. A multimillion dollar investment in the mid-20th century. To serve an area that is now dead.

At least National Train Day is a week early in Toledo. The city probably needs the consolation.

Microcosm Walk around Toledo at night. See the emptiness. Feel the emptiness. Get in touch with the dying Rust Belt. And maybe visit the first ever hippoquarium exhibit in a zoo.

Would rec.