FIAT and negatory goal development

January 18, 2026

We'd like to understand a mind's effects on the world. To do this, we'd like to understand the structures / elements which determine / designate mind trajectories (preferably over long horizons).

Descriptively, minds can often be said to have goals. The minds we currently observe most often have goals, and goal-oriented frames provide short, accurate compressions of mind outputs.

How does a mind come to have a goal? No observed mind, artificial or otherwise, exists absent amental (non-mental, physical) structure. Minds are grown through a deterministic (enough) process, on silicon or biological substrate, such that all goal-drivers have a "physical" representation in the corresponding mind-artifact. "Physical" explanations for goal development include:

Each is a valid explanation in context. Yet I'd argue none properly engage with "mentalistic" aspects of mind: the structures / elements of mind-systems unified by their predominance in minds and lack elsewhere. Insofar as we expect goals to be common properties of minds, we'd expect the structures / elements of minds to contribute to their development.

Tsvi hypothesizes that a dominant driver of human behavior is a reflective process in which a human mind finds itself ascribing to itself the goals it would make sense to have, conditional on the history of its past actions.1 He calls this "the Fictitious Imputed Adopted Telos hypothesis." FIAT provides plausible, "mentalistic" explanations for goal development: a mind's drive for coherence (a mentalistic property) lets it amplify more "hard-coded" drives into goals more general than the drive's initial activation state, by generalizing from commonalities across past experience.
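The loop above can be sketched as a toy program. Everything concrete here is my own invention for illustration (the goal inventory, the retrodiction score, the action names); the point is only the shape of the process: score candidate goals by how well they retrodict the action history, adopt the best fit, and let the adopted goal generalize beyond the actions actually taken.

```python
# Toy FIAT-style loop (my illustration, not Tsvi's formalism):
# a candidate "goal" is scored by how many past actions it retrodicts,
# and the best-fitting goal is then adopted to drive future action.

ACTIONS = ["gather_food", "gather_wood", "rest"]

# Hypothetical candidate goals, each defined by the actions it "explains".
CANDIDATE_GOALS = {
    "survive_winter": {"gather_food", "gather_wood"},
    "maximize_rest": {"rest"},
}

def impute_goal(history):
    """Pick the candidate goal that retrodicts the most past actions."""
    return max(
        CANDIDATE_GOALS,
        key=lambda g: sum(a in CANDIDATE_GOALS[g] for a in history),
    )

def act(goal):
    """Once adopted, the imputed goal generalizes: it licenses any
    action it covers, not just the ones actually taken so far."""
    return sorted(CANDIDATE_GOALS[goal])

history = ["gather_food", "gather_food", "rest"]  # mostly food-gathering
goal = impute_goal(history)  # -> "survive_winter"
future = act(goal)           # now also includes "gather_wood"
```

Note the amplification step: the history never contained "gather_wood", but the imputed telos covers it, which is the sense in which a hard-coded drive gets expanded into something more general.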

Differentiating flavors of "fictitious imputed adopted telos":

With regard to the latter: attractive and repulsive dynamics are not symmetric in a relevant way; the specification of "do(x)" is crisper than the specification of "not do(x)." Even taking a predictive perspective, the space of trajectories ending in "x" is much smaller than the space of trajectories not ending in "x." Expanding x-space seems easier? in the do(x) case, because credit attribution over the causal factors behind ending up on a "not do(x)" trajectory is harder: the space is larger. So you'd expect FIAT to "do worse" here.

(this is what I'm trying to call "negatory goal development")
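The counting asymmetry is easy to make concrete. A minimal sketch, with the state count and trajectory length chosen arbitrarily for illustration: fixing the terminal state x pins down a small slice of trajectory space, while "anything but x" is the large complement.

```python
from itertools import product

# Count length-4 trajectories over 5 states, split by whether
# they end in a designated target state x.
STATES = range(5)
LENGTH = 4
X = 0  # the target terminal state

trajectories = list(product(STATES, repeat=LENGTH))
ending_in_x = [t for t in trajectories if t[-1] == X]
not_ending_in_x = [t for t in trajectories if t[-1] != X]

# 5**4 = 625 total; 5**3 = 125 end in x; 500 do not.
# A causal factor shared by the 125 do(x) trajectories is easier to
# credit than one spread across the 500-trajectory complement.
```

The ratio only worsens as the state space grows: the do(x) slice is a 1/|S| fraction of trajectory space, so any common structure in the "not do(x)" bucket is diluted accordingly.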

Is negatory goal development particularly worse under FIAT than otherwise? Well, I expect non-FIAT / "physical" alternatives to generalize even less well in "not do(x)" cases, so total negatory goal development via e.g. imitative methods is probably smaller.

There's a story you could tell here in which many of the pathological aspects of minds come from FIAT-induced negatory goal development. Bateson's schismogenesis hypotheses are, as far as I can tell, disjoint from this, or at the very least not operating at the same hierarchical level (schismogenesis is agnostic to whether one accepts FIAT). I also think these "negatory" patterns are better explicated in Braitenberg or Braitenberg-adjacent work.

Ideally we'd study goal development in purely epistemic settings? I'm not sure whether our bounded reasoning models are high-fidelity enough, or whether they'd give us any interesting insights.

1

I note that he operates within the frame of "deriving values" rather than "determining goals." I don't do this because I think grounding the analysis in goals is sounder (the observational evidence for "goals" being "real" is much stronger, given that "values" are fundamentally properties internal to minds and are typically argued for either with coherence-like structural arguments or with relatively weak "interpretability"-shaped probes, whether in the neuroscience literature or the more modern interpretability field).