[Disclaimer: Esoteric musings. This post probably won’t make sense if you aren’t familiar with “updateless decision theory.”]
I’ve been skeptical for a while of updateless decision theory, diachronic Dutch books, and dynamic consistency as a rational requirement. I think Hedden's (2015) notion of time-slice rationality1 nicely grounds the cluster of intuitions behind this skepticism.
(Note: Time-slice rationality is a normative standard, i.e., a statement of what it even means to “win” (if rationality = “winning”). This is not to say that winning with respect to time-slice rationality requires believing in time-slice rationality.)
According to this view, in principle “you” are not a unified decision-maker across different time points, any more than your decision-making is unified with other agents’. Yes, you share values with your future self (mostly). But insofar as you have different evidence, there isn’t any privileged epistemic perspective that all the yous at different time points can agree on. Rather:
“You” at time 0 (“0-you”) are a different decision-maker from “you” at time 1 (“1-you”).
What is rational for 1-you depends on 0-you only insofar as 0-you are another agent that 1-you might need to coordinate with. And vice versa.
This is consistent with 0-you being able to make commitments that influence 1-you. It’s also consistent with, e.g., 0-you refusing an offer in a potential money pump that looks good in the short term, because 0-you predict that 1-you will make decisions that are bad from 0-your perspective. More on this later.
Let’s say 0-you and 1-you share a value system V, and define a “diachronic (sure) loss” as a sequence of actions that (certainly) makes things worse with respect to V than some alternative sequence would have. (E.g., some sequence of bets in Sleeping Beauty that “an agent” endorsing EDT and SIA might take.)
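To make the definition concrete, here’s a minimal sketch in Python. The function names, the single-state setup, and the toy money-pump payoffs are my own illustration, not the Sleeping Beauty bets themselves:

```python
# Toy formalization of "diachronic sure loss": a sequence of actions is a
# sure loss relative to the shared value system V if some alternative
# sequence does strictly better under V in every state considered possible.
# Names and payoffs here are illustrative, not from the post.

def is_diachronic_sure_loss(sequence, alternative, states, V):
    """True if `alternative` beats `sequence` under V in every state."""
    return all(V(alternative, s) > V(sequence, s) for s in states)

# A toy money pump: 0-you trade A away for B for a $1 fee, then 1-you trade
# B back for A for another $1 fee. Each trade might look fine to the
# time-slice making it, but the two-trade sequence is dominated by doing
# nothing: you end up holding A either way, just $2 poorer.
def V(sequence, state):
    holdings, cash = "A", 0.0
    for action in sequence:
        if action == "swap_A_for_B" and holdings == "A":
            holdings, cash = "B", cash - 1.0
        elif action == "swap_B_for_A" and holdings == "B":
            holdings, cash = "A", cash - 1.0
    return cash  # V = money kept (final holdings are A in both sequences)

states = ["the_only_state"]  # a *sure* loss: dominated in every state
pump = ["swap_A_for_B", "swap_B_for_A"]
do_nothing = []

print(is_diachronic_sure_loss(pump, do_nothing, states, V))  # True
```

(The Sleeping Beauty case has the same shape, with bets as the actions and the coin flip among the states.)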
On time-slice rationality, then, 1-you are not rationally obligated to make a decision that would avoid a diachronic loss given 0-your decision! 1-you just aren’t deciding a sequence. Rather, 1-you are deciding the time-1 action, and 1-you ought to decide an action that’s best from 1-your perspective.2 If a diachronic loss happens, this is a result of two agents’ decisions, which (I claim) neither 0-you nor 1-you can entirely control.3
Objection: “Consider a framing of Transparent Newcomb in which you just think about the problem at time 0, and then after seeing the boxes you either take one or two at time 1. Then, 0-you can decide for 1-you to one-box, right?”
Response: Sort of. If 0-you really are capable of psychologically binding 1-you to an action that (combined with 0-yours) will avoid a diachronic loss, 0-you ought to do so. (And if you can, then there’s no need for “updateless decision theory” per se here — you can justify diachronic loss avoidance just with time-slice rationality.) But merely intending for 1-you to do some action doesn’t bind 1-you to it. And forming this intention doesn’t rationally obligate 1-you to follow through on it, any more than declaring your endorsement of some social norm rationally obligates others to follow it. What makes 0-you so confident that 1-you will one-box?
Objection: “I know 1-me will one-box because they’ll recognize that’s the rational thing to do.”
Response: Isn’t rationality supposed to be about “winning”? 1-you wouldn’t win from their perspective by deciding to one-box. 0-you would win by having 1-you be predicted to one-box. I don’t see the independent justification for conceiving of “winning” from some perspective other than the agent’s own.
Objection: “Backing up: Isn’t following through on your intentions inherent to rational decisions? Realistically, there’s always some gap in time between when you form an intention and when you follow through on it, so it seems that what is rational for 1-you does depend on 0-your intention.”
Response: In the vast majority of cases, either: 1) Nothing relevant changes about your epistemic state between when 0-you form an intention and when 1-you decide to follow through. Or 2) 0-your “intention” is itself a decision that determines the subsequent behavior, so there’s nothing left for 1-you to decide. These are qualitatively different from a situation like Transparent Newcomb, where 1-you have different information than 0-you and have a decision to make.
Objection: “We nudge our future selves to do things they don’t want to do all the time. Why is it any harder to just bind 1-you to one-box by intending to do so?”
Response: I think in these mundane cases, the kinds of actions you want your future self to do, and the kinds of situations they’ll be in, are much less bizarre and contrary to our natural inclinations than Transparent Newcomb. If 0-you work up the willpower to go to the gym, the sunk-cost inertia and the dream of getting swole make it less costly from 1-your perspective to work out than to go home. If 0-you promise to keep a secret for a friend, then even if 1-you become reasonably confident 1-you’d get away with telling it when convenient, 1-your conscience will (hopefully) make 1-you not want to tell it. By contrast, I struggle to imagine hyping myself up to one-box, then seeing the open boxes right there and feeling worse about taking both boxes. (Especially if I imagine the True Transparent Newcomb, where the money is replaced with whatever I terminally care about.)
Objection: “But 1-me shares my values. Doesn’t 1-me want both of us to receive the $1 million that comes from one-boxing?”
Response: Indeed they do. But, if 0-you didn’t bind 1-you to one-box, then I don’t see how it’s possible for 1-you to make it more likely, via their decision, that the box contains the $1 million. 1-you are certain of the box’s contents — there is no sense in which 1-your decision not to take both boxes is better from 1-your perspective. Don’t blame 1-you, blame 0-your own inability to bind 1-you.
Objection: “1-me isn’t deciding the action ‘one-box,’ they’re deciding the policy ‘one-box given a Transparent Newcomb problem.’ This is the policy that maximizes money, even from 1-my perspective.”
Response: It only maximizes money with respect to the prior, i.e., a perspective that is uncertain of the boxes’ contents. By hypothesis, 1-you are not uncertain of this. So you’re begging the question in favor of updatelessness here.
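To spell out the two calculations being contrasted here, a rough sketch with the standard Transparent Newcomb payoffs ($1,000 in the small box; $1,000,000 in the big box iff one-boxing was predicted). The predictor-accuracy number is my own illustrative assumption:

```python
# Transparent Newcomb with the standard payoffs: the small box holds $1,000;
# the big box holds $1,000,000 iff the predictor predicted one-boxing.
# Both boxes are visible at time 1. `accuracy` is an illustrative assumption.

M, K = 1_000_000, 1_000
accuracy = 0.99  # P(prediction matches the chosen policy); illustrative

# 1-you's perspective: the big box's contents are already known.
def time1_value(two_box, big_box_full):
    return (M if big_box_full else 0) + (K if two_box else 0)

# Whatever the big box contains, two-boxing is exactly $1,000 better for 1-you.
for big_box_full in (True, False):
    assert time1_value(True, big_box_full) == time1_value(False, big_box_full) + K

# The prior / policy perspective: evaluate policies before the prediction
# (and hence the big box's contents) is treated as fixed.
def policy_value(one_box_policy):
    p_full = accuracy if one_box_policy else 1 - accuracy
    return p_full * M + (0 if one_box_policy else K)

print(policy_value(True))   # ~990,000: ex ante value of the one-boxing policy
print(policy_value(False))  # ~11,000:  ex ante value of the two-boxing policy
```

The gap between the two calculations is the whole disagreement: the policy evaluation averages over the prediction using the prior, while 1-you holds the box’s contents fixed, and then two-boxing comes out better by exactly $1,000 either way.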
Objection: “*Shrug*. From 0-my perspective, it’s good for 1-me to believe updatelessness is rational, even if from 1-my perspective it isn’t.”
Response: I’d agree with that! Convincing yourself of updatelessness might be reasonable ex ante. (And no, this doesn’t contradict time-slice rationality. See the note at the start.) But my concern is that the tails will come apart — there will be cases where this instrumental justification for believing in updatelessness doesn’t make sense. Some examples:
Anthropics: People have appealed to updatelessness / diachronic sure losses as a justification for certain epistemic views even when no 0-you who’d want to bind 1-you ever existed!
Logical updatelessness: There’s no 0-me who floated around in a void before knowing whether they’d exist.
Fake “commitments”: If I’m right about the gap between intentions and commitments, people might be systematically overestimating the acausal power of their intentions. I worry they’re making a mistake by deviating from time-slice rationality, in that they’re not really reaping the benefits of a commitment because they could’ve decided otherwise.
Hedden applies this idea more directly to sequential decision-making in “Options and Diachronic Tragedy.”
Carlsmith writes, regarding Parfit’s hitchhiker: “Indeed: if, in the desert, I could set-up some elaborate and costly self-binding scheme – say, a bomb that blows off my arm, in the city, if I don’t pay — such that paying in the city becomes straightforwardly incentivized, I would want to do it. But if that’s true, we might wonder, why not skip all this expensive faff with the bomb, and just, you know, pay in the city?” From the time-slice rationality perspective, this question sounds like, “Imagine that your mom [who values your survival as much as you do] finds you and the driver in the city. And the driver [who is very frail and harmless, so nothing bad will happen if their demand is refused] demands that she burn $5. Why doesn’t she just, you know, burn the money?”
So there's still a version of time-slice FDT that one-boxes in non-transparent Newcomb, because it calculates the effects of its actions using subjunctive dependence.
But it disagrees with UDT in counterfactual mugging: only the time-slice FDT 0-agent cares about both branches, while the time-slice FDT 1-agent who sees tails doesn't care about the other branch (see the sketch below).
And it disagrees with UDT and TDT on Transparent Newcomb, because certainty about the box's contents "supersedes" subjunctive dependence.
Is this accurate, and is there an existing term for time-slice FDT?
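For the counterfactual mugging point above, a quick sketch with the commonly used illustrative payoffs (hand over $100 on tails; receive $10,000 on heads iff Omega predicted you'd hand it over). The specific numbers are assumptions, not from the post:

```python
# Counterfactual mugging with commonly used illustrative numbers: on tails,
# Omega asks for $100; on heads, Omega pays $10,000 iff it predicted you
# would have paid on tails. A fair coin is assumed.

PAY, PRIZE = 100, 10_000

# Ex ante (0-agent / UDT-style) evaluation of the policy "pay when asked":
ex_ante_pay    = 0.5 * PRIZE - 0.5 * PAY   # +4950.0
ex_ante_refuse = 0.0

# Time-slice evaluation by the 1-agent who has already seen tails:
after_tails_pay    = -PAY   # the heads-branch prize is no longer on the table
after_tails_refuse = 0

print(ex_ante_pay, ex_ante_refuse)          # 4950.0 0.0
print(after_tails_pay, after_tails_refuse)  # -100 0
```

Whether the 1-agent who sees tails has any reason to care about the ex ante number is, of course, exactly what's in dispute.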