Additional Pedantic Details: If the client guesses the wrong
primary, the retransmission mechanisms discussed in step
4c below forward the request to the current primary. The client’s time stamp t is included to ensure exactly once semantics of execution of requests. 4
2. Primary receives request, assigns sequence
number, and forwards ordered request to replicas.
A view’s primary has the authority to propose the order in
which the system should execute requests. It does so by producing order-req messages in response to client request
messages.
In particular, when the primary p receives message m =
〈request, o, t, c〉 from client c, the primary assigns to the
sc
request a sequence number n in the current view u and re-
lays a message 〈〈order-req, u, n, h , d, ND〉 , m〉 to the back-
n sp
up replicas where n and u indicate the proposed sequence
number and view number for m, digest d = H(m) is the cryp-
tographic one-way hash of m, h = H(h , d) is a cryptographic
n n− 1
hash summarizing the history, and ND is a set of values for
nondeterministic application variables (time in file systems,
locks in databases, etc.) required for executing the request.
Additional Pedantic Details: The primary only takes the above
actions if t > t where t is the highest time stamp previously
cc
received from c.
3. Replica receives ordered request, speculatively
executes it, and responds to the client.
When a replica receives an order-req message, it optimistically assumes that the primary is correct and that other
correct replicas will receive the same request with the same
proposed order. It therefore speculatively executes requests
in the order proposed by the primary and produces a spec-response message that it sends to the client.
In particular, upon receipt of a message 〈〈order-req, u,
n, h , d, ND〉 , m〉 from the primary p, replica i accepts the
n sp
ordered request if m is a well-formed request message, d
is a correct cryptographic hash of m, u is the current view,
n = max + 1 where max is the largest sequence number in
nn
i’s history, and h = H(h , d). Upon accepting the message,
n n− 1
i appends the ordered request to its history, executes the re-
quest using the current application state to produce a reply
r, and sends to c a message 〈〈spec-response, u, n, h , H(r), c,
n
t〉 , i, r, OR〉 where OR = 〈order-req, u, n, h , d, ND〉 .
si n sp
Additional Pedantic Details: A replica may only accept and
speculatively execute requests in sequence-number order, but
message loss or a faulty primary can introduce holes in the
sequence number space. Replica i discards the order-req
message if n ≤ max . If n > max + 1, then i discards the message,
nn
sends a message 〈fill-hole, u, max + 1, n, i〉 to the primary,
n si
and starts a timer. Upon receipt of a message 〈fill-hole, u, k,
n, i〉 from replica i, the primary p sends a 〈〈order-req, u, n',
si
h , d, ND〉 , m'〉 to i for each request m’ that p ordered in k ≤ n’ ≤
n' sp
n during the current view; the primary ignores fill-hole requests
from other views. If i receives the valid order-req messages
89 communications of the acm | november 2008 | vol. 51 | no. 11
needed to fill the holes, it cancels the timer. Otherwise, the replica broadcasts the fill-hole message to all other replicas and
initiates a view change when the timer fires. Any replica j that
receives a fill-hole message from i sends the corresponding
order-req message, if it has received one. If, in the process of
filling in holes in the replica sequence, replica i receives conflicting order-req messages, then the conflicting messages form a
proof of misbehavior (POM) as described in protocol step 4d.
4a. Client receives 3f + 1 matching responses and
completes the request.
In the absence of faults and timeouts, the client receives
matching spec-response messages from all 3f + 1 replicas.
The client can then consider the request and its history to be
complete and delivers the reply r to the application.
3f + 1 identical replies with identical histories suffice to ensure that a client can rely on a response. In particular, 3f + 1
matching responses means all correct servers have executed
the request and all preceding requests in the same order, so
correct servers can always form a majority to vote to keep this
response, even across view changes. 11 In particular, the view
change subprotocol executes across 2f + 1 responsive servers,
but any group of 2f + 1 servers must include at least f + 1 correct
servers and at most f faulty servers. Thus, the correct servers
are always able to vote to keep this response, including both
the application reply and the history of previous actions.
Therefore, upon receiving 3f + 1 distinct messages
〈〈spec-response, u, n, h , H(r), c, t〉 , i, r, OR〉, where i iden-
n si
tifies the replica issuing the response, a client determines
if they match. spec-response messages from distinct replicas match if they have identical u, n, h , H(r), c, t, OR, and r
n
fields.
3. 2. two-phase case
If the network, primary, or some replicas are slow or faulty, the
client c may not receive matching responses from all 3f + 1 replicas. The two-phase case applies when the client receives between 2 f + 1 and 3f matching responses. As Figure 1b illustrates,
steps 1–3 occur as described above, but step 4 is different:
4b. Client receives between 2f + 1 and 3f matching
responses, assembles a commit certificate, and
transmits the commit certificate to the replicas.
The commit certificate is cryptographic proof that a majority of correct servers agree on the ordering of requests up
to and including the client’s request. Protocol steps 5 and 6
complete the second phase of agreement by ensuring that
enough servers have this proof.
5. Replica receives a COMMIT message from a client
containing a commit certificate and acknowledges
with a LOCAL-COMMIT message.
6. Client receives a LOCAL-COMMIT messages from
2f + 1 replicas and completes the request.