Extending the model to include rewards
Thus far we cannot talk about an agent who abandons
its pursuit of the goal midway through, since our model
requires the agent to construct a path that goes all the way to t.
A simple extension of the model enables us to consider such an agent.
Suppose we place a reward of r at the target node t, which
will be claimed if the agent reaches t. Standing at a node
v, the agent now has an expanded set of options: it can follow an edge out of v as before, or it can quit taking steps,
incurring no further cost but also not claiming the reward.
The agent will choose the latter option precisely when
either there is no v-t path, or when the minimum cost of a
v-t path exceeds the value of the reward, evaluated in light of
present bias: c(e1(P)) + b(c(e2(P)) + c(e3(P)) + … + c(ek(P))) > br
for all v-t paths P, where k is the number of edges on P.
It is important to note a key feature of this evaluation: the
reward is always discounted by b relative to the cost that is
being incurred in the current period, even if the reward will
be received right after this cost is incurred. (e.g., if the path P
has a single edge, then the agent is comparing c(e1(P)) to br.)
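Concretely, this quitting rule can be sketched as a small function (a hypothetical helper, not code from the paper), where each v-t path is given by its sequence of edge costs:

```python
def quits(costs, b, r):
    """Decide whether a present-biased agent standing at v stops.

    costs: one list of edge costs per v-t path, e.g. [[1, 4], [6]]
    b:     present-bias parameter, 0 < b <= 1
    r:     reward waiting at t
    """
    if not costs:  # no v-t path exists at all
        return True
    # Perceived cost of a path: first edge at face value, the rest
    # (and the reward) discounted by b.
    perceived = [p[0] + b * sum(p[1:]) for p in costs]
    return min(perceived) > b * r
```

For a single-edge path this reduces to comparing c(e1(P)) with br, matching the parenthetical above.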
In what follows, we will consider both these models: the
former fixed-goal model, in which the agent must reach t and
seeks to minimize its cost; and the latter reward model in
which the agent trades off cost incurred against reward at t,
and has the option of stopping partway to t. Aside from this
distinction, both models share the remaining ingredients,
based on traversing an s-t path in G.
It is easy to see that the reward model displays the phenomenon of abandonment, in which the agent spends some
cost to try reaching t, but then subsequently gives up without receiving the reward. Consider for example a three-node
path on nodes s, v1, and t, with edge costs c(s, v1) = 1 and c(v1, t)
= 4. If b = 1/2 and there is a reward of 7 at t, then the agent
will traverse the edge (s, v1) because it evaluates the total cost
of the path at 1 + 4b = 3 < 7b = 3.5. But once it reaches v1, it
evaluates the cost of completing the path at 4 > 7b = 3.5, and
so it quits without reaching t.
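The arithmetic of this example can be checked directly (a sketch; the numbers are exactly those of the text):

```python
# Three-node path s -> v1 -> t with c(s, v1) = 1, c(v1, t) = 4,
# present bias b = 1/2 and a reward of 7 at t.
b, r = 0.5, 7

# At s the agent perceives the whole path as costing 1 + b*4 = 3,
# which is below the perceived reward b*7 = 3.5, so it moves to v1.
at_s = 1 + b * 4
assert at_s < b * r

# At v1 the remaining edge costs 4 up front, and 4 > 3.5, so it quits.
at_v1 = 4
assert at_v1 > b * r
```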
An example involving choice reduction
It is useful to describe a more complex example that shows
the modeling power of this shortest-path formalism, and
also shows how we can use the model to analyze deadlines
as a form of beneficial choice reduction. (As should be clear,
with a time-consistent agent it can never help to reduce the
set of choices; such a phenomenon requires some form of
time-inconsistency.) First we describe the example in text,
and then show how to represent it as a graph.
Imagine a student taking a three-week short course in
which the required work is to complete two small projects
by the end of the course. It is up to the student when to do
the projects, as long as they are done by the end. The student incurs an effort cost of one from any week in which
she does no projects (since even without projects there is
still the lower-level effort of attending class), a cost of four
from any week in which she does one project, and a cost
of nine from any week in which she does both projects.
Finally, the student receives a reward of 16 for completing the course, and she has a present-bias parameter of
b = 1/2.
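Before turning to the graph representation, the week-by-week behavior of this (naive) student can be simulated directly; the state encoding and helper names below are our own illustration, not the paper's:

```python
from itertools import product

WEEK_COST = {0: 1, 1: 4, 2: 9}   # cost of a week with 0, 1, or 2 projects
b, reward = 0.5, 16

def perceived_cost(plan):
    # plan: number of projects done in each remaining week; the current
    # week's cost is felt at face value, later weeks are discounted by b.
    costs = [WEEK_COST[k] for k in plan]
    return costs[0] + b * sum(costs[1:])

def best_plan(weeks_left, projects_left):
    # Enumerate every feasible schedule for the remaining weeks and pick
    # the one the present-biased student currently perceives as cheapest.
    plans = [p for p in product(range(3), repeat=weeks_left)
             if sum(p) == projects_left]
    return min(plans, key=perceived_cost)

weeks_left, projects_left = 3, 2
while weeks_left > 0:
    plan = best_plan(weeks_left, projects_left)
    if perceived_cost(plan) > b * reward:   # cheaper to drop the course
        print("quits with", projects_left, "project(s) left")
        break
    print("does", plan[0], "project(s) this week")
    projects_left -= plan[0]
    weeks_left -= 1
```

Run as written, the student does nothing in weeks 1 and 2 and then quits rather than pay a one-week cost of 9 against a perceived reward of b · 16 = 8; this is precisely the kind of abandonment that a well-placed deadline, as a form of choice reduction, can prevent.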
2. THE GRAPH-THEORETIC MODEL
In order to argue that our graph-theoretic model captures a
variety of phenomena that have been studied in connection
with time-inconsistency, we present a sequence of examples
to illustrate some of the different behaviors that the model
exhibits. We note that the example as shown in Figure 1
already illustrates two simple points: that the path chosen
by the agent can be suboptimal; and that even if the agent
traverses an edge e with the intention of following a path P
that begins with e, it may end up following a different path P′
that also begins with e.
For an edge e in G, let c(e) denote the cost of e; and for
a path P in G, let ei(P) denote the ith edge on P. In terms of
this notation, the agent’s decision is easy to specify: when
standing at a node v, it chooses the path P that minimizes
c(e1(P)) + b(c(e2(P)) + c(e3(P)) + … + c(ek(P))), where k is the
number of edges on P, over all P that run from v to t. It follows the first edge of P to a new node w, and then performs
this computation again.
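This re-planning rule can be sketched as follows for an acyclic graph (our own illustrative helper in an assumed adjacency-list representation, not code from the paper):

```python
def next_step(graph, v, t, b):
    """One step of the present-biased agent in the fixed-goal model.

    graph: dict mapping each node to a list of (neighbor, edge cost)
           pairs; assumed acyclic, with t reachable from v.
    Returns the node the agent moves to from v.
    """
    def all_paths(u):
        # Yield every u-t path as a list of (node, edge cost) pairs.
        if u == t:
            yield []
            return
        for w, c in graph.get(u, []):
            for rest in all_paths(w):
                yield [(w, c)] + rest

    def perceived(path):
        # First edge at face value, remaining edges discounted by b.
        return path[0][1] + b * sum(c for _, c in path[1:])

    # Choose the path that currently looks cheapest, but commit only
    # to its first edge; the agent re-runs this at the next node.
    return min(all_paths(v), key=perceived)[0][0]
```

Because only the first edge is ever committed to, the path actually traversed can differ from every path the agent planned along the way, which is exactly the behavior noted above.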
We begin by observing that Figure 2(a) represents a version
of the Akerlof example from the introduction, in which sending
the package costs c and each day of delay costs 1. Node t represents the state in
which the agent has sent the package, and node vi represents
the state in which the agent has reached day i without sending the package. The agent has the option of going directly
from node s to node t, and this is the shortest s-t path. But if
1 + bc < c, then the agent will instead go from s to v1, intending to complete the path s-v1-t in the next time step. At v1,
however, the agent decides to go to v2, intending to complete
the path v1-v2-t in the next time step. This process continues:
the agent, following exactly the reasoning in the example
from the introduction, is procrastinating and not going to t,
and in the end its path goes all the way to the last node vn (n =
5 in the figure) before finally taking an edge to t. (One minor
change from the set-up in the introduction is that the present-bias effect is also applied to the per-day cost of 1;
this has no real effect on the underlying story.)
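The day-by-day procrastination can be replayed in a few lines (a sketch: the chain length n = 5 matches the figure, while the sending cost c = 3 is our own choice that triggers procrastination at b = 1/2):

```python
b, c, n = 0.5, 3, 5   # present bias, cost of sending, last day in the chain

day = 0               # node v_day: day reached without sending the package
while day < n:
    send_now = c           # perceived cost of taking the edge to t today
    wait = 1 + b * c       # perceived cost of waiting a day, then sending
    if wait < send_now:    # 2.5 < 3: tomorrow always looks cheaper
        day += 1           # procrastinate and move to the next node
    else:
        break
# At v_n there is no further edge to wait on, so the agent finally sends.
print("package sent on day", day)
```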
Figure 2. Path problems that exhibit procrastination, abandonment,
and choice reduction. (a) The Akerlof example. (b) Homework deadlines.