a brief diatribe on safetyism
January 10, 2026
A good friend asked me today: why aren't the AI labs evil?
My load-bearing answer is that I see the moral imperative to preserve the generators of progress as comparable in magnitude to the imperative to prevent harm (even when considering harms posed by superintelligences), and so while I agree unabashed accelerationism is misguided and likely leads to catastrophic outcomes, it is difficult for me to describe those trying as "evil." [1]
(From similar generators, I weakly hold that OpenAI is 'less evil' than Anthropic, because it seems that the effects 'culty' organizations have on the world are worse than those of 'non-culty' ones, for structural reasons like worldview homogeneity / top-down vs bottom-up governance / systematic underrating of illegible-from-current-perspective deep harms.)
Why?
- progress is fragile; progress is necessary for the continued emancipation of sentient beings; progress is really the only way to create self-preserving systems that tend towards greater net emancipation because, albeit near tautologically, 'progress' creates 'slack': a lack of 'slack' indicates that the agency of the constituents has been stripped, while an abundance of 'slack' allows systems to adopt robust and diverse stances; [2]
- risks from superintelligence are immense. from a suffering-focused perspective, it's likely that the most important interventions of the next twenty years revolve around growing AIs to be dispositionally benevolent / not spiteful. extinction is likely. however, I believe past experience should bias us towards expecting naive mitigations of perceived catastrophic risks to have deeply harmful, unintended, adverse consequences, and so I am less sympathetic to arguments of 'you shouldn't build it' (or even 'you shouldn't build it now');
- there's an argument from aesthetics to be made. secretly, these arguments are advocating for norm-preservation, where the norms themselves have been hard-won & illegible yet are adaptive forms of bounded consequentialism. (are the norms intrinsic or systematically enforced?)
A rejoinder: sure, but sane accelerationism has never been tried. We should place our civilizational efforts into becoming wiser, and intentionally take steps forward into doom, if we choose to. Current structures (race-dynamics, etc.) differentially favor progress over care, so you should be skeptical of pro-progress arguments. Incentives are aligned for people to pursue progress in a way they are not in the pursuit of increased wisdom.
I agree that if we could systematically become wiser, we should put effort into systematically becoming wiser. But we don't have a good track record of becoming wiser (at least intentionally), and naive applications of care are likely net harmful.
From my perspective, the proper way to 'apply care' requires taking advantage of the preconditions for progress-generating environments. History is neither completely determined by the initial conditions of pivotal technologies nor beholden to pre-existing convergent pressures. However:
- developing technologies is a robust lever for disruption, and it is near-uniquely encouraged by progress-generating environments;
- path-dependence in the invention and distribution of technologies is real, and intentionally shaping the civilizational arc is possible by counterfactually accelerating a pivotal technology.
In other words, if you truly care, I think you should be ambitious, develop interventions compatible with modern incentive structures, and shut up and calculate when deciding on a plan of action.
[1] Another answer: I'm more agnostic than most on the net-positive nature of current human civilization and its naturally extrapolated trajectories. This isn't load-bearing.
[2] I somewhat buy that "evilness" is synonymous with "intrinsically slack-sucking."