Self-Referential Probabilistic Logic Admits the Payor's Lemma
November 28, 2023
In summary: A probabilistic version of the Payor's Lemma holds under the logic proposed in the Definability of Truth in Probabilistic Logic. This gives us modal fixed-point-esque group cooperation even under probabilistic guarantees.
Background
Payor's Lemma: If $\vdash \Box(\Box x \to x) \to x$, then $\vdash x$.
We assume two rules of inference:
- Necessitation: $\vdash x \implies \vdash \Box x$,
- Distributivity: $\vdash \Box(x \to y) \implies \vdash \Box x \to \Box y$.
Proof:
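1. $\vdash x \to (\Box x \to x)$, by tautology;
2. $\vdash \Box x \to \Box(\Box x \to x)$, by 1 via necessitation and distributivity;
3. $\vdash \Box(\Box x \to x) \to x$, by assumption;
4. $\vdash \Box x \to x$, from 2 and 3 by modus ponens;
5. $\vdash \Box(\Box x \to x)$, from 4 by necessitation;
6. $\vdash x$, from 5 and 3 by modus ponens.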
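As a sanity check on the argument above, the lemma can be tested semantically by brute force. The sketch below is not part of the original proof: it assumes standard Kripke semantics (which the post does not use), treats `x` as an atomic proposition, and enumerates every model with up to three worlds. The helper names `box`, `implies`, and `payor_holds_on_all_models` are hypothetical.

```python
from itertools import product

def box(rel, truth, n):
    """Truth value of 'box phi' at each world: phi holds at every successor."""
    return [all(truth[v] for v in range(n) if rel[w][v]) for w in range(n)]

def implies(a, b):
    """Pointwise material implication between two world-indexed truth lists."""
    return [(not p) or q for p, q in zip(a, b)]

def payor_holds_on_all_models(n):
    """Check: whenever box(box x -> x) -> x is true at every world, so is x."""
    for bits in product([False, True], repeat=n * n):
        # rel[w][v] is True when world v is accessible from world w.
        rel = [bits[i * n:(i + 1) * n] for i in range(n)]
        for valuation in product([False, True], repeat=n):
            x = list(valuation)
            premise = implies(box(rel, implies(box(rel, x, n), x), n), x)
            if all(premise) and not all(x):
                return False  # would be a countermodel to the lemma
    return True

print(all(payor_holds_on_all_models(n) for n in (1, 2, 3)))
```

A finite search like this only rules out small countermodels; the syntactic proof is what establishes the lemma.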
The Payor's Lemma is provable in all normal modal logics (as it can be proved in
It is known that Löb's theorem fails to hold in reflective theories of logical uncertainty. However, a proof of a probabilistic Payor's Lemma has been proposed, which modifies the required rules of inference to be:
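- Necessitation: $\vdash x \implies \vdash \Box_p x$,
- Weak Distributivity: $\vdash x \to y \implies \vdash \Box_p x \to \Box_p y$,
where here we take $\Box_p$ to be an operator which returns True if the internal credence of $x$ is greater than $p$ and False if not. (Formalisms incoming.)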
The question is then: does there exist a consistent formalism under which these rules of inference hold? The answer is yes, and it is provided by Christiano 2012.
Setup
(Regurgitation and rewording of the relevant parts of the Definability of Truth)
Let
We are interested in the existence and behavior of a function
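1. For all $\phi, \psi \in L$ we have that $\mathbb{P}(\phi) = \mathbb{P}(\phi \land \psi) + \mathbb{P}(\phi \land \neg\psi)$.
2. For each tautology $\phi$ we have $\mathbb{P}(\phi) = 1$.
3. For each contradiction $\phi$ we have $\mathbb{P}(\phi) = 0$.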
Note: I think that 2 & 3 are redundant (as says John Baez), and that these axioms do not necessarily constrain
A coherent
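The coherence conditions can be checked on a toy example. The sketch below assumes a propositional language with two atoms and a made-up distribution `MU` over its four truth assignments (standing in for models); none of these names or weights come from the post. It takes the additivity condition in the form used in Christiano 2012, $\mathbb{P}(\phi) = \mathbb{P}(\phi \land \psi) + \mathbb{P}(\phi \land \neg\psi)$, and it does not model self-reference.

```python
from fractions import Fraction
from itertools import product

ATOMS = ("a", "b")
# Every truth assignment to the atoms plays the role of one model.
MODELS = list(product([False, True], repeat=len(ATOMS)))
# Hypothetical weights over the four models; they sum to 1.
MU = dict(zip(MODELS, [Fraction(1, 2), Fraction(1, 4), Fraction(1, 8), Fraction(1, 8)]))

def P(sentence):
    """Probability of a sentence: total weight of the models where it is true."""
    return sum(w for m, w in MU.items() if sentence(dict(zip(ATOMS, m))))

phi = lambda v: v["a"] or v["b"]
psi = lambda v: v["a"]

# Condition 1: P(phi) = P(phi and psi) + P(phi and not psi)
additivity = P(phi) == P(lambda v: phi(v) and psi(v)) + P(lambda v: phi(v) and not psi(v))
# Condition 2: tautologies get probability 1
tautology = P(lambda v: v["a"] or not v["a"]) == 1
# Condition 3: contradictions get probability 0
contradiction = P(lambda v: v["a"] and not v["a"]) == 0

print(additivity, tautology, contradiction)
```

Any assignment built this way, by summing the weights of models, satisfies all three conditions automatically, which is the easy direction of the correspondence between coherent assignments and distributions over models.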
Syntactic-Probabilistic Correspondence: Observe that
Now, we want
Consider the formula
(These are identical properties to that represented in Christiano 2012 by
Let
$$ \forall \phi \in L' \; \forall a,b \in \mathbb{Q} : (a < \mathbb{P}_{T}(\phi) < b) \implies \mathbb{P}_{T}(a < Bel(\ulcorner \phi \urcorner) < b) = 1. $$
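In other words, $a < \mathbb{P}_{T}(\phi) < b$ implies $\mathbb{P}_{T}(a < Bel(\ulcorner \phi \urcorner) < b) = 1$.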
Proof
(From now, for simplicity, we use
Let
Probabilistic Payor's Lemma: If $\vdash \Box_p(\Box_p x \to x) \to x$, then $\vdash x$.
Proof as per Demski:
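1. $\vdash x \to (\Box_p x \to x)$, by tautology;
2. $\vdash \Box_p x \to \Box_p(\Box_p x \to x)$, by 1 via weak distributivity;
3. $\vdash \Box_p(\Box_p x \to x) \to x$, by assumption;
4. $\vdash \Box_p x \to x$, from 2 and 3 by modus ponens;
5. $\vdash \Box_p(\Box_p x \to x)$, from 4 by necessitation;
6. $\vdash x$, from 5 and 3 by modus ponens.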
Rules of Inference:
Necessitation: $\vdash x \implies \vdash \Box_p x$.
Weak Distributivity: $\vdash x \to y \implies \vdash \Box_p x \to \Box_p y$.
From
(I'm pretty sure this modal logic, following necessitation and weak distributivity, is not normal (it's weaker than
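The intuition behind the two rules can be illustrated in a finite toy setting. The sketch below is only an analogy, not the reflective construction: it assumes a made-up distribution `weights` over four models, represents sentences as sets of models, and uses a hypothetical `box(p, event)` meaning "credence strictly greater than p". It ignores self-reference entirely.

```python
from fractions import Fraction

# Hypothetical toy distribution over four models.
weights = {
    "w0": Fraction(1, 2),
    "w1": Fraction(1, 4),
    "w2": Fraction(1, 8),
    "w3": Fraction(1, 8),
}

def prob(event):
    """Credence in a sentence, identified with the set of models where it holds."""
    return sum(weights[w] for w in event)

def box(p, event):
    """Toy stand-in for the thresholded belief operator: credence exceeds p."""
    return prob(event) > p

x = {"w0", "w1"}
y = {"w0", "w1", "w2"}   # x -> y holds in every model because x is a subset of y
everything = set(weights)  # a sentence true in every model

thresholds = [Fraction(k, 10) for k in range(10)]  # p = 0, 0.1, ..., 0.9

# Necessitation analogue: a sentence true everywhere has credence 1 > p.
necessitation_ok = all(box(p, everything) for p in thresholds)
# Weak distributivity analogue: if x -> y holds everywhere, box_p x implies box_p y.
weak_distributivity_ok = all((not box(p, x)) or box(p, y) for p in thresholds)

print(necessitation_ok, weak_distributivity_ok)
```

Both checks come down to monotonicity of probability: enlarging the set of models where a sentence holds can only raise its credence.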
Bots
Consider agents
Each agent has the ability to reason about their own 'beliefs' about the world arbitrarily precisely, and this allows them full knowledge of their utility function (if they are VNM agents, and up to the complexity of the world-states they can internally represent). Then, these agents can be modeled with Christiano's probabilistic logic! And I would argue it is natural to do so (you could easily imagine an agent having access to its own beliefs with arbitrary precision by, say, repeatedly querying its own preferences).
Then, if
where
Proof:
via conjunction; as if the highest threshold is satisfied, all others are as well; by the probabilistic Payor's Lemma.
This can be extended to arbitrarily many agents. Moreover, the valuable insight here is that cooperation is achieved when the evidence that the group cooperates exceeds each and every member's individual threshold for cooperation. This formalizes the intuitive strategy 'I will only cooperate if there are no defectors' (or perhaps 'we will only cooperate if there are no defectors').
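The threshold condition can be sketched in a few lines. This is an illustration of the decision rule only, not of the logic behind it: the agent names, the threshold values, and the functions `all_thresholds_met` and `group_threshold` are all hypothetical, and credences are simply given as inputs rather than derived by the agents themselves.

```python
from fractions import Fraction

def group_threshold(thresholds):
    """The binding threshold for the group is the strictest individual one."""
    return max(thresholds.values())

def all_thresholds_met(group_credence, thresholds):
    """True when credence in group cooperation exceeds every member's threshold."""
    return all(group_credence > p for p in thresholds.values())

# Hypothetical individual cooperation thresholds for three agents.
thresholds = {"A": Fraction(3, 5), "B": Fraction(3, 4), "C": Fraction(9, 10)}

for credence in (Fraction(1, 2), Fraction(4, 5), Fraction(19, 20)):
    met = all_thresholds_met(credence, thresholds)
    # Exceeding the maximum threshold is equivalent to exceeding all of them.
    assert met == (credence > group_threshold(thresholds))
    print(credence, met)
```

Because the check reduces to a comparison against the maximum, adding more agents changes the outcome only if a new agent is stricter than everyone already in the group.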
It is important to note that any
Acknowledgements
This work was done while I was a 2023 Summer Research Fellow at the Center on Long-Term Risk. Many thanks to Abram Demski, my mentor who got me started on this project, as well as Sam Eisenstat for some helpful conversations. CLR was a great place to work! Would highly recommend if you're interested in s-risk reduction.
Crossposted to the AI Alignment Forum.