# Why maximize the expected value? (II)

Suppose that what ultimately matters is the objective goodness of what you do –
where the objective goodness of an action is determined by the action’s actual outcome, not merely by the expected outcome. But suppose that you
usually don’t know for certain what degree of objective goodness any of the
available options will have. You must make your choices by following a rule
that determines which options are eligible purely on the basis of the probabilities that you assign to various
hypotheses about the degree of objective goodness that each of these available options will have. What
reason could there be for you to have a policy of always choosing an option
that has a maximal expected degree of
objective goodness?

For example, suppose that you are making a series of
financial choices, about investments and transactions of various sorts. Suppose
that the only thing that ultimately matters is the actual
monetary payoff of these choices. Moreover, suppose that you must make your
decisions by following a rule that determines which of the available options
are eligible purely on the basis of the probabilities
that you assign to various hypotheses about the monetary payoffs of the
available options. What reason could there be for you to have a policy of
always choosing options that maximize your expected
monetary payoff?

In fact, there is a very simple and appealing answer to this question. Suppose that the probabilities that you assign are “well calibrated” (so
that 80% of the propositions to which you assign a probability of 80% are true,
50% of the propositions to which you assign a probability of 50% are true, and so
on). Being “well calibrated” in this way is one way in which a probability
distribution can count as “accurate”, and so it seems plausible that a rational
thinker will aim to assign probabilities in such a way that they are well
calibrated in this manner.

If you have assigned probabilities in a rational way, it will be rational for you to regard them as well calibrated in this manner.

Then it seems clear that overall you will end up
with a higher monetary payoff in the long run if you always follow the rule of maximizing your expected monetary payoff than
if you follow any of the other rules. Moreover, given our two assumptions (i)
that the only thing that ultimately matters is your actual monetary payoff, and
(ii) that you must follow a rule that singles out the eligible options purely
on the basis of these probabilities, it is hard to see what could be said in
favour of any other decision rule.
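
The long-run claim can be illustrated with a small simulation. This is only a sketch under assumptions that are not in the argument above: the payoffs, the 20% probability, and the "always take the sure payoff" rival rule are all hypothetical, and calibration is modelled simply by letting the risky option succeed with exactly the frequency assigned to it.

```python
import random


def simulate(num_choices=10_000, seed=0):
    """Compare total payoffs of two decision rules over many independent choices."""
    rng = random.Random(seed)
    safe_payoff = 1.0    # hypothetical sure payoff
    risky_payoff = 10.0  # hypothetical payoff if the risky option succeeds
    p_success = 0.2      # probability you assign to success

    ev_total = 0.0
    safe_total = 0.0
    for _ in range(num_choices):
        # Calibration assumption: success really occurs with frequency p_success.
        success = rng.random() < p_success
        risky_result = risky_payoff if success else 0.0

        # Rule 1: choose the option with the higher expected payoff.
        if p_success * risky_payoff > safe_payoff:
            ev_total += risky_result
        else:
            ev_total += safe_payoff

        # Rule 2 (hypothetical rival rule): always take the sure payoff.
        safe_total += safe_payoff
    return ev_total, safe_total


ev_total, safe_total = simulate()
print(ev_total, safe_total)
```

With many independent choices, the expectational rule's total should come out well ahead of the rival's, though any single choice can still go badly.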

Perhaps this idea can be generalized so that it does not
just apply to the special case where all that matters is one’s overall monetary
payoff, but to any case where what ultimately matters is the degree to which
one’s actions are objectively good things to do (at least compared to the
available alternatives). A great deal more would need to be done to make this work, but that’s the general direction that I’m interested in exploring.

## 7 Replies to “Why maximize the expected value? (II)”

1. Heath White says:

Two thoughts, Ralph. First, this model works best if goodness is like money: independent quantities result from many independent (trans)actions, and all quantities are commensurable, cumulative, etc.
Second, you are depending on a law of large numbers in there at least once and maybe twice. In the investment case, you end up with a higher overall return in the long run because overall return is a result of thousands of buy/sell decisions over the course of a lifetime. (Nobody wise would invest all their money in a single lottery ticket with a 1% chance of winning, no matter what the expected payoff.) Also, even then, there will be some unfortunate outliers who lose much of their money. I think real investors employ risk-management strategies, which is not exactly the same as pursuing greatest expected return.

2. Heath — Thank you so much for your two thoughts. Basically, I have to concede that both points are correct.
1. You’re absolutely right that the approach that I’m suggesting makes some highly non-trivial demands on the structure of goodness. In particular, we’d have to be able to compare the goodness of the options that are available in one choice situation with the options that are available in another quite different choice situation; as you say, these degrees of goodness would have to be “commensurable”. As you also rightly say, we also need something like the idea that the combined state of affairs of your choosing two good options, in two different choice situations, is in the relevant way better than the state of affairs of your choosing one good option and one bad one. In that sense, as you put it, I’d have to assume that these degrees of goodness are “cumulative”. And perhaps those assumptions might turn out to be too strong….
2. You’re also right that I’m appealing to a form of the law of large numbers. And since you won’t actually live forever, it’s certainly not guaranteed that if you make your choices by following the rule of maximizing expected goodness, you’ll “do better” in the long run than if you follow an alternative rule. Still, I think that all that I need to argue for is that it’s rational for you to regard it as more probable than not that you’ll do better in the long run by following the expectational rule than by following any alternative. This would be enough to give a non-circular argument in favour of the rule of always making choices that maximize the expected value, I think.
Of course, you’re also right about real investors. They don’t simply maximize their expected monetary payoff. But that’s surely because monetary payoff isn’t directly proportional to utility: monetary payoffs have declining marginal utility, etc.

3. “…it’s rational for you to regard it as more probable than not that you’ll do better in the long run by following the expectational rule than by following any alternative. This would be enough to give a non-circular argument in favour of the rule of always making choices that maximize the expected [objective] value, I think.”
But recall Parfit’s mine shaft case, which you describe in your earlier post. Even if you know that neither you nor anyone else will ever be faced with another choice among options whose outcomes are uncertain, it’s rational for you to choose C in that case, thereby maximizing expected objective value while knowingly failing to maximize objective value. Your argument doesn’t explain why it’s rational to maximize expected objective value in this very short run of one last case.

4. Oh dear — not for the first time, I get that familiar sinking feeling when Mike Otsuka makes an objection to my latest “brainwave”…
I guess I have to agree with Mike’s intuition that maximizing expected value is rational even in a “deathbed” case, and even if the case resembles Parfit’s “mine shaft” case in that one knows for certain that the option that maximizes expected value will not maximize objective value.
The only points that I can think of making here are the following: (i) even in this case, it’s not more probable than not that one will do better by following any alternative decision rule; and (ii) perhaps the requirements of rationality aren’t “designed” for such “deathbed” cases, but for the more normal cases where one needs a rule to follow in a long run of cases.
But perhaps these points aren’t really enough to answer Mike’s objection. So maybe I just need to go back to the drawing board…

5. In Parfit’s mine shaft case:
If you choose A, then all 100 miners will be saved if they’re in the first mine shaft and none will be saved if they’re in the second mine shaft.
If you choose B, then none of the miners will be saved if they’re in the first mine shaft and all 100 will be saved if they’re in the second mine shaft.
If you choose C, then 90 of the 100 miners will be saved, whichever mine shaft they happen to occupy.
We also know that there’s a 50-50 chance the miners are in the first versus the second mine shaft.
So you’re right, Ralph, that
“(i) even in this [mine shaft] case, it’s not more probable than not that one will do better by following any alternative decision rule [to the maximization of expected objective value].”
But we can tweak the case a bit so that now there’s a 51% chance the miners are in the first mine shaft and a 49% chance that they’re in the second mine shaft. C still maximizes expected objective value (measured by numbers of lives saved). Now, however, it’s more probable than not (because there’s a 51% chance) that one will do better by choosing A rather than by maximizing expected value by choosing C. Nevertheless, it remains rational to choose C.
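
The arithmetic behind both versions of the case can be checked with a short sketch. The function names here are hypothetical, and objective value is measured, as above, by the number of lives saved.

```python
# Lives saved by each option: (miners in first shaft, miners in second shaft).
OUTCOMES = {"A": (100, 0), "B": (0, 100), "C": (90, 90)}


def expected_saved(option, p_first):
    """Expected number of miners saved, given the probability of the first shaft."""
    first, second = OUTCOMES[option]
    return p_first * first + (1 - p_first) * second


def prob_beats(option, rival, p_first):
    """Probability that `option` saves strictly more miners than `rival`."""
    states = [(p_first, 0), (1 - p_first, 1)]
    return sum(p for p, i in states if OUTCOMES[option][i] > OUTCOMES[rival][i])


for p_first in (0.5, 0.51):
    best = max(OUTCOMES, key=lambda o: expected_saved(o, p_first))
    print(p_first, best, prob_beats("A", "C", p_first))
```

In both versions C has the highest expected number of lives saved (about 90, against roughly 50 for A), while the probability that A does better than C equals the probability that the miners are in the first shaft.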

6. Mike — You’re completely right about my first response (i) to your previous comment. (What was I thinking?)
So all that I can do now is to try to develop the idea behind my second response (ii):
“perhaps the requirements of rationality aren’t ‘designed’ for such ‘deathbed’ cases, but for the more normal cases where one needs a rule to follow in a long run of cases.”
But I have to confess that this response does seem to me to have a worrying resemblance to some of those rule-consequentialist gambits that I’ve always been suspicious of…

7. conchis says:

Two comments, the first related to what seems to me to be a confusion in your set up of the problem in the previous post, and the second suggesting an alternative view of what the real issue is here.
(1) I wonder whether you’re creating unnecessary confusion by conflating two separate objective-subjective distinctions: (a) objective vs. subjective probability judgments; and (b) objective vs. subjective judgments about the goodness of outcomes.
You seem to insist on a definition of the objective goodness of a prospect that requires both objective probabilities and the objective goodness of outcomes. Conversely, your account of the subjective goodness of prospects seems to deny the relevance of both of these, and requires both subjective probabilities and subjective goodness of outcomes.
There seems to me to be no reason to restrict your definitions in this way. I think your initial intuition that we should maximize the subjective expectation of the objective goodness of the outcomes is on the right track. It does seem to be the objective goodness of outcomes that should ultimately matter. But this doesn’t need to commit us to using objective probabilities as well. The issues are entirely separate.
Indeed, I don’t think we really have much choice but to assess prospects according to subjective probabilities (perhaps subject to some sort of reasonableness requirement). Even the idea that there might be something wrong with choosing option C in Parfit’s mine shaft case relies on subjective probability judgments. Otherwise we should be even more willing to condemn whichever option of A and B turns out to be wrong than we are to condemn C. But the fact that this seems wrong doesn’t have anything to do with whether we should be worried about the subjective or objective goodness of outcomes.
(2) IMHO, the real issue with the standard motivation of decision theory is that the Savage axioms don’t actually constrain rational choice to maximizing the expected goodness of outcomes. The “value” that can be read off the preference representation function can actually be any positive monotonic transformation of goodness: value=f(goodness). Maximizing expected goodness simply corresponds to the special case where we happen to be risk neutral with respect to goodness, and f(goodness)=goodness. But the theory gives no reason for us to be risk-neutral in this way. Indeed, given a sufficient level of risk aversion it’s perfectly consistent with the ranking A~B>C in the Mine Shaft Case.*
The upshot seems to be that if you want to motivate maximizing expected goodness, you probably need to motivate risk neutrality. And this is precisely the issue you’re having trouble with now. It’s relatively easy to motivate risk neutrality in the long run on the basis that it’ll all eventually work out. But, as you note, in one-off situations this seems to break down. Maybe we should be risk-averse in such cases? To be honest, I’m not confident that good motivations for particular risk preferences can be found, other than somewhat arbitrarily backing them out from intuitions about particular cases, but good luck!
*This should not be confused with the claim that maximizing the probability of maximizing goodness is consistent with the axioms. The transformation necessary to generate this decision rule would involve counting only the maximal element(s) of the set of possible outcomes as having positive value. Because this makes value depend on both the goodness of the outcome and its probability, it is not (necessarily) a positive monotonic transformation of goodness (better outcomes with zero probability are valued less than worse outcomes that are possible).
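
How the ranking in the mine shaft case depends on the transformation f can be seen in a small sketch. The three transformations below are hypothetical choices made purely for illustration, goodness is again measured by lives saved, and the probabilities are 50-50. In this sketch the identity and the concave (square-root) transformation both rank C first, while a sufficiently convex transformation yields A~B>C.

```python
# Lives saved by each option: (miners in first shaft, miners in second shaft).
OUTCOMES = {"A": (100, 0), "B": (0, 100), "C": (90, 90)}


def expected_value(option, f, p_first=0.5):
    """Expected value of f(goodness) for an option."""
    first, second = OUTCOMES[option]
    return p_first * f(first) + (1 - p_first) * f(second)


def identity(g):
    """Risk-neutral: value equals goodness."""
    return g


def concave(g):
    """Hypothetical concave transformation (square root)."""
    return g ** 0.5


def convex(g):
    """Hypothetical strongly convex transformation."""
    return (g / 100) ** 10


for f in (identity, concave, convex):
    values = {o: expected_value(o, f) for o in OUTCOMES}
    print(f.__name__, values)
```

All three functions are increasing on the range from 0 to 100, so each is a positive monotonic transformation of goodness; only the attitude to risk differs.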