CPSC 668 Distributed Algorithms and Systems Fall 2006

Prof. Jennifer Welch
CPSC 668
Set 19: Asynchronous Solvability
1
Problems Solvable in Failure-Prone Asynchronous Systems
• Although consensus is not solvable in failure-prone asynchronous systems (in
neither message passing nor read/write shared memory), there are some
interesting problems that are solvable:
– set consensus (a weakening of consensus)
– approximate agreement (a weakening of consensus)
– renaming - "opposite" of consensus
– k-exclusion - fault-tolerant variant of mutex
Model Assumptions
• asynchronous
• shared memory with read/write registers
• at most f crash failures of procs.
• results can be translated to message
passing if f < n/2 (cf. Chapter 10)
• may be a few asides into message
passing
Set Consensus Motivation
• By judiciously weakening the definition of the
consensus problem, we can overcome the
asynchronous impossibility
• We've already seen a weakening of
consensus:
– weaker termination condition for randomized
algorithms
• How about weakening the agreement
condition?
• One weakening is to allow more than one
decision value:
– allow a set of decisions
Set Consensus Definition
Termination: Eventually, each nonfaulty
processor decides.
k-Agreement: The number of different
values decided on by nonfaulty
processors is at most k.
Validity: Every nonfaulty processor
decides on a value that is the input of
some processor.
Set Consensus Algorithm
• Uses a shared atomic snapshot object X
– can be implemented with read/write
registers
• update your segment of X with your input
• repeatedly scan X until there are at least
n - f nonempty segments
• decide on minimum value appearing in
any segment
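The steps above can be sketched as a small sequential simulation. This is an illustrative sketch, not the slides' code: the function name `set_consensus`, the explicit `schedule` argument, and the use of `None` for an empty segment are assumptions. Atomicity of update and scan is modeled by executing one step at a time; a processor that stops appearing in the schedule models a crash.

```python
from typing import Dict, List, Optional, Sequence


def set_consensus(inputs: Sequence[int], f: int,
                  schedule: Sequence[int]) -> Dict[int, int]:
    """Simulate the snapshot-based algorithm under a given step schedule.

    Each entry of `schedule` lets that processor take one atomic step:
    its update first, then one scan per step until it decides.
    """
    n = len(inputs)
    X: List[Optional[int]] = [None] * n   # snapshot object; None = empty segment
    updated = [False] * n
    decisions: Dict[int, int] = {}
    for i in schedule:
        if i in decisions:
            continue                       # already decided, nothing to do
        if not updated[i]:
            X[i] = inputs[i]               # update own segment with own input
            updated[i] = True
            continue
        view = [v for v in X if v is not None]   # atomic scan
        if len(view) >= n - f:
            decisions[i] = min(view)       # decide on the minimum value seen
    return decisions


# n = 4, f = 1: processor 1 is slow, so the others decide before seeing its input.
print(set_consensus([7, 3, 9, 5], 1, [0, 2, 3, 0, 2, 3, 1, 1]))
```

In this run two different values are decided, which is allowed because k = f + 1 = 2.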
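Correctness of Set Consensus Algorithm
• Termination: at most f procs crash, so at least n - f procs eventually
update their segments and every scan by a nonfaulty proc eventually sees
at least n - f nonempty segments.
• Validity: every decision is some proc's input
• Why does k-agreement hold?
– We'll show it does as long as k > f.
– Sanity check: When k = 1, we have standard consensus. As long as there are
fewer than 1 failures (i.e., none), we can solve the problem.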
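k-Set Agreement Condition
• Let S be the set of minimum values in the final scans of the nonfaulty (nf)
procs; these are the nf decisions.
• Suppose in contradiction |S| > f + 1.
• Let v be the largest value in S, the decision of pi.
• Every other value in S is smaller than v, so none of them appears in pi's
final scan (otherwise pi would have decided on a smaller value).
• So pi's final scan misses at least f + 1 values, i.e., sees fewer than
n - f nonempty segments, contradicting the code.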
Set Consensus Lower Bound
Theorem: There is no algorithm for solving k-set consensus in the presence of
f failures, if f ≥ k.
• Straightforward extensions of consensus
impossibility result fail; even proving the
existence of an initial bivalent configuration is
quite involved.
• Original proof of set-consensus impossibility
used concepts from algebraic topology
• Textbook's proof uses more elementary
machinery, but still rather involved
Approximate Agreement
Motivation
• An alternative way to weaken the
agreement condition for consensus:
• Require that the decisions be close to
each other, but not necessarily equal
• Seems appropriate for continuous-valued problems (as opposed to
discrete)
Approximate Agreement
Definition
Termination: Eventually, each nonfaulty
processor decides.
ε-Agreement: All nonfaulty decisions are
within ε of each other.
Validity: Every nonfaulty decision is
within the range of the input values.
Approximate Agreement
Algorithm
• Assume procs know the range from which
input values are drawn:
– let D be the length of this range
• up to n - 1 procs can fail
• algorithm is structured as a series of
"asynchronous rounds":
– exchange values via a snapshot object, one per
round
– compute midpoint for next round
• continue until spread of values is within ε,
which requires about log2(D/ε) rounds
Approximate Agreement
Algorithm
Initially local variable v = pi's input
Initially local variable r = 1
1. update pi's segment of ASO[r] to be v
2. let scan be set of values obtained by
scanning ASO[r]
3. v := midpoint(scan)
4. if r = log2 (D/) + 1 then decide v and
terminate
5. else r++
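A minimal sketch of the round structure above, simulated sequentially. The names `approx_agreement`, `midpoint`, and the `schedule` argument are illustrative assumptions, and the round bound is read as ⌈log2(D/ε)⌉ + 1 with D ≥ ε > 0. Each schedule entry gives one processor one atomic step (an update or a scan).

```python
import math
from typing import Dict, List, Optional, Sequence


def midpoint(values: Sequence[float]) -> float:
    """Midpoint of the range spanned by the values."""
    return (min(values) + max(values)) / 2


def approx_agreement(inputs: Sequence[float], D: float, eps: float,
                     schedule: Sequence[int]) -> Dict[int, float]:
    n = len(inputs)
    M = math.ceil(math.log2(D / eps)) + 1          # number of rounds (assumed reading)
    # One snapshot object per round; index 0 is unused so rounds run 1..M.
    ASO: List[List[Optional[float]]] = [[None] * n for _ in range(M + 1)]
    v = list(inputs)                               # current local value of each proc
    r = [1] * n                                    # current round of each proc
    wrote = [False] * n                            # has the proc updated ASO[r] yet?
    decisions: Dict[int, float] = {}
    for i in schedule:
        if i in decisions:
            continue
        if not wrote[i]:
            ASO[r[i]][i] = v[i]                    # step 1: update own segment
            wrote[i] = True
            continue
        scan = [x for x in ASO[r[i]] if x is not None]   # step 2: atomic scan
        v[i] = midpoint(scan)                      # step 3
        if r[i] == M:
            decisions[i] = v[i]                    # step 4: decide and terminate
        else:
            r[i] += 1                              # step 5
            wrote[i] = False
    return decisions


# Processor 0 runs alone to completion, then processor 1 runs.
print(approx_agreement([0.0, 8.0], 8.0, 1.0, [0] * 8 + [1] * 8))
```

In this run processor 0 never sees the other input and decides its own value, while processor 1 halves its distance to that value in each round.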
Analysis of Algorithm
Definitions w.r.t. a particular execution:
• M = log2 (D/) + 1
• U0 = set of input values
• Ur = set of all values ever written to
ASO[r]
Helpful Lemma
Lemma (16.8): Consider any round r < M.
Let u be the first value written to ASO[r].
Then every value written to ASO[r+1] (every element of U_{r+1}) lies in the range
[(min(U_r) + u)/2, (max(U_r) + u)/2],
where min(U_r) ≤ (min(U_r) + u)/2 ≤ u ≤ (max(U_r) + u)/2 ≤ max(U_r).
Implications of Lemma
• The range of values written to the ASO object
for round r + 1 is contained within the range
of values written to the ASO object for round
r.
– range(U_{r+1}) ⊆ range(U_r)
• The spread (max - min) of values written to
the ASO object for round r + 1 is at most half
the spread of values written to the ASO object
for round r.
– spread(U_{r+1}) ≤ spread(U_r)/2
Correctness of Algorithm
• Termination: Each proc executes M
asynchronous rounds.
• Validity: The range at each round is
contained in the range at the previous
round.
• ε-Agreement:
spread(U_M) ≤ spread(U_0)/2^M
≤ D/2^M
≤ ε
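A worked instance of the ε-agreement chain, using illustrative values D = 8 and ε = 1 that are not from the slides, and reading the round bound as M = ⌈log2(D/ε)⌉ + 1:

```latex
M = \lceil \log_2(D/\varepsilon) \rceil + 1 = \lceil \log_2 8 \rceil + 1 = 4,
\qquad
\mathrm{spread}(U_M) \le \frac{\mathrm{spread}(U_0)}{2^M} \le \frac{D}{2^M}
= \frac{8}{16} = 0.5 \le \varepsilon = 1 .
```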
Handling Unknown Input Range
• Range might not be known.
• Actual range in an execution might be
much smaller than maximum possible
range.
• First idea: have a preprocessing phase
in which procs try to determine input
range
– but asynchrony and possible failures
make this approach problematic
Handling Unknown Input Range
• Use just one atomic snapshot object
• Dynamically recalculate how many rounds are
needed as more inputs are revealed
• Skip over rounds to try to catch up to
maximum observed round
• Only consider values associated with
maximum observed round
• Still use midpoint
Unknown Input Range Algorithm
shared atomic snapshot object A; initially all segments ⊥
update_i(A, [x, 1, x]), where x is pi's input
repeat
    scan A
    let S be spread of all inputs in non-⊥ segments
    if S = 0 then maxRound := 0
    else maxRound := log2(S/ε)
    let rmax be largest round in non-⊥ segments
    let values be set of candidates in segments with round number rmax
    update pi's segment in A with [x, rmax+1, midpoint(values)]
until rmax ≥ maxRound
decide midpoint(values)
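The pseudocode above can be sketched as a sequential simulation. The names `unknown_range_agreement` and `schedule`, and the representation of a segment as an `(input, round, candidate)` tuple with `None` for ⊥, are illustrative assumptions. Each schedule entry lets one processor take one atomic step, alternating between update and scan.

```python
import math
from typing import Dict, List, Optional, Sequence, Tuple

Segment = Tuple[float, int, float]   # (input, round, candidate)


def midpoint(values: Sequence[float]) -> float:
    return (min(values) + max(values)) / 2


def unknown_range_agreement(inputs: Sequence[float], eps: float,
                            schedule: Sequence[int]) -> Dict[int, float]:
    n = len(inputs)
    A: List[Optional[Segment]] = [None] * n          # None models an empty segment
    pending: List[Segment] = [(x, 1, x) for x in inputs]   # next triple to write
    final: List[Optional[float]] = [None] * n        # set once the until-test passes
    must_update = [True] * n
    decisions: Dict[int, float] = {}
    for i in schedule:
        if i in decisions:
            continue
        if must_update[i]:
            A[i] = pending[i]                        # update own segment
            must_update[i] = False
            if final[i] is not None:
                decisions[i] = final[i]              # loop exited: decide
            continue
        segs = [s for s in A if s is not None]       # atomic scan
        xs = [s[0] for s in segs]
        spread = max(xs) - min(xs)
        max_round = 0 if spread == 0 else math.log2(spread / eps)
        rmax = max(s[1] for s in segs)
        values = [s[2] for s in segs if s[1] == rmax]
        mid = midpoint(values)
        pending[i] = (inputs[i], rmax + 1, mid)
        must_update[i] = True
        if rmax >= max_round:
            final[i] = mid
    return decisions


# Processor 0 finishes alone; processor 1 then wakes up and catches up.
print(unknown_range_agreement([0.0, 8.0], 1.0, [0] * 3 + [1] * 5))
```

In the solo-then-late run, the late processor skips ahead to the maximum observed round and uses only the candidate stored there.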
Analysis of Unknown Input Range Algorithm
Definitions w.r.t. a particular execution:
• U_0 = set of all input values
• U_r = set of all values ever written to A
with round number r
• M = largest r s.t. U_r is not empty
With these changes, correctness proof is
similar to that for known input range
algorithm.
Key Differences in Proof
• Why does termination hold?
– a proc's local maxRound variable can only
increase if another proc wakes up and increases
the spread of the observable inputs. This can
happen at most n - 1 times.
• Why does -agreement hold?
– If pi's input is observed by pj the last time pj
computes its maxRound, same argument as
before.
– Otherwise, when pi wakes up, it ignores its own
input and uses values from maxRound or later.
Renaming
• Procs start with unique names from a
large domain
• Procs should pick new names that are
still distinct but that are from a smaller
domain
• Motivation: Suppose original names are
serial numbers (many digits), but we'd
like the procs to do some kind of time
slicing based on their ids
Renaming Problem Definition
Termination: Eventually every nonfaulty
proc pi decides on a new name yi
Uniqueness: If pi and pj are distinct
nonfaulty procs, then yi ≠ yj.
We are interested in anonymous
algorithms: procs don't have access to
their indices, just to their original names.
Code depends only on your original
name.
Performance of Renaming
Algorithm
• New names should be drawn from
{1,2,…,M}.
• We would like M to be as small as
possible.
• Uniqueness implies M must be at least
n.
• Due to the possibility of failures, M will
actually be larger than n.
Renaming Results
• Algorithm for wait-free case (f = n - 1) with M
= n + f = 2n - 1.
• Algorithm for general f with M = n + f.
• Lower bound that M must be at least n + 1,
for wait-free case.
– Proof similar to impossibility of wait-free
consensus
• Stronger lower bound that M must be at least
n + f, if f is the number of failures
– Proof uses algebraic topology and is related to
lower bound for set consensus
Wait-Free Renaming Algorithm
Shared atomic snapshot object A; initially all segments ⊥
s := 1 // suggestion for my new name
while true do
    update pi's segment of A to be [x,s], where x is pi's original name
    scan A
    if s is also someone else's suggestion then
        let r be rank of x among original names of non-⊥ segments
        let s be r-th smallest positive integer not currently suggested by another proc
    else decide on s for new name and terminate
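A sequential simulation sketch of the loop above. The function name `wait_free_renaming`, the `schedule` argument, and the `(original_name, suggestion)` tuple representation are illustrative assumptions. Each schedule entry gives one processor one atomic update or scan; the algorithm itself uses only original names, and the list index only identifies which segment a processor owns.

```python
import itertools
from typing import Dict, List, Optional, Sequence, Tuple


def wait_free_renaming(names: Sequence[int],
                       schedule: Sequence[int]) -> Dict[int, int]:
    n = len(names)
    A: List[Optional[Tuple[int, int]]] = [None] * n   # None models an empty segment
    s = [1] * n                                       # current suggestion of each proc
    must_update = [True] * n
    decisions: Dict[int, int] = {}
    for i in schedule:
        if i in decisions:
            continue
        if must_update[i]:
            A[i] = (names[i], s[i])                   # publish original name + suggestion
            must_update[i] = False
            continue
        # Atomic scan: suggestions currently held by other processors.
        others = {seg[1] for j, seg in enumerate(A) if seg is not None and j != i}
        if s[i] in others:
            known = sorted(seg[0] for seg in A if seg is not None)
            r = known.index(names[i]) + 1             # rank of own original name
            free = (c for c in itertools.count(1) if c not in others)
            s[i] = next(itertools.islice(free, r - 1, None))   # r-th free integer
            must_update[i] = True
        else:
            decisions[i] = s[i]                       # no conflict: decide
    return decisions


# Three processors with original names 30, 10, 20 taking steps round-robin.
print(wait_free_renaming([30, 10, 20], [0, 1, 2] * 4))
```

In this run all three first suggest 1, collide, and then move to distinct names according to the ranks of their original names.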
Analysis of Renaming Algorithm
Uniqueness: Suppose in contradiction pi
and pj choose same new name, s.
[Figure: timeline of steps, in order: pi's last update before deciding
(suggests s); pi's last scan before deciding s; pj's last scan before
deciding s, which sees s as pi's suggestion and so doesn't decide s.]
Analysis of Renaming Algorithm
• New name space is {1,…,2n - 1}.
• Why?
• rank of a proc pi's original name is at most n
(the largest one)
• worst case is when each of the n - 1 other
procs has suggested a different new name for
itself, say {1,…,n - 1}.
• Then pi suggests n + n - 1 = 2n - 1.
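The worst case above can be checked with a small helper. The name `next_suggestion` is illustrative and not from the slides; it computes the r-th smallest positive integer that no other processor currently suggests.

```python
from typing import Set


def next_suggestion(rank: int, suggested_by_others: Set[int]) -> int:
    """Return the rank-th smallest positive integer not in suggested_by_others."""
    candidate, remaining = 0, rank
    while remaining > 0:
        candidate += 1
        if candidate not in suggested_by_others:
            remaining -= 1        # found one more free name
    return candidate


# n = 4: the proc with the largest original name (rank 4) sees 1, 2, 3 taken.
n = 4
print(next_suggestion(n, set(range(1, n))))   # 2n - 1
```

With rank at most n and at most n - 1 names taken by others, the result never exceeds n + (n - 1) = 2n - 1.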
Analysis of Renaming Algorithm
Termination: Suppose in contradiction
some set T of nonfaulty procs never
decide in some execution.
• Consider the suffix α of the execution in
which
– each proc in T has already done at least
one update and
– only procs in T take steps (others have
either already crashed or decided).
Analysis of Renaming Algorithm
• Let F be the set of new names that are free
(not suggested at the beginning of α by any
proc not in T) -- the trying procs need to
choose new names from this set.
• Let z1, z2,… be the names in F in order.
• By the definition of α, no proc wakes up
during α and reveals an additional original
name, so all procs in T are working with the
same set of original names during α.
• Let pi be proc whose original name has
smallest rank (among this set of original
names). Let r be this rank.
Analysis of Renaming Algorithm
• Eventually procs other than pi stop
suggesting zr as a new name:
– After  starts, every scan indicates a set of
free names that is no larger than F.
– Every trying proc other than pi has a larger
rank and thus continually suggests a new
name for itself that is larger than zr, once it
does the first scan in .
Analysis of Renaming Algorithm
• Eventually pi does suggest zr as its new
name:
– By choice of zr as r-th smallest free new
name, and fact that eventually other trying
procs stop suggesting z1 through zr,
eventually pi sees zr as free name with r-th
smallest rank.
• Contradicts assumption that pi is trying
(i.e., stuck).
• So termination holds.
General Renaming
• Suppose we know that at most f procs will
fail, where f is not necessarily n - 1.
• We can use the wait-free algorithm, but it is
wasteful in the size of the new name space,
2n - 1, if f < n - 1.
• We can do better (if f < n - 1) with a slightly
different algorithm:
– keep track in the snapshot object of whether you
have decided
– an undecided proc suggests a new name only if its
original name is among the f + 1 lowest names of
procs that have not yet decided.
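The eligibility rule in the last bullet can be sketched as a predicate. This shows only that rule, not the full algorithm; the name `may_suggest` and the `(original_name, suggestion, decided)` segment layout are assumptions made for illustration.

```python
from typing import List, Optional, Tuple

Segment = Tuple[int, int, bool]   # (original_name, suggestion, decided)


def may_suggest(my_name: int, segments: List[Optional[Segment]], f: int) -> bool:
    """True if my_name is among the f + 1 lowest original names of undecided procs.

    `segments` is the result of a scan and is assumed to include the caller's
    own segment; None models an empty segment.
    """
    undecided = sorted(seg[0] for seg in segments
                       if seg is not None and not seg[2])
    return my_name in undecided[: f + 1]


scan = [(30, 1, False), (10, 1, False), (20, 2, True), (40, 1, False)]
print(may_suggest(30, scan, f=1))   # 30 is second-lowest among undecided 10, 30, 40
print(may_suggest(40, scan, f=1))   # 40 must wait until a lower-named proc decides
```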
k-Exclusion Problem
• A fault-tolerant version of mutual
exclusion.
• Processors can fail by crashing, even in
the critical section (stay there forever).
• Allow up to k processors to be in the
critical section simultaneously.
• If fewer than k processors fail, then any nonfaulty
processor that wishes to enter the
critical section eventually does so.
k-Exclusion Algorithm
cf. paper by Afek et al. [5].
k-Assignment Problem
• A specialization of k-Exclusion to include:
• Uniqueness: Each proc in the critical section
has a variable called slot, which is an integer
between 1 and m. If pi and pj are in the C.S.
concurrently, then they have different slots.
• Models situation when there is a pool of
identical resources, each of which must be
used exclusively:
– k is number of procs that can be in the pool
concurrently
– m is the number of resources
– To handle failures, m should be larger than k
k-Assignment Algorithm Schema
k-assignment entry section:
    k-exclusion entry section
    renaming using m = 2k-1 names
k-assignment exit section:
    k-exclusion exit section
k-Assignment Algorithm Schema
k-assignment entry section:
    k-exclusion entry section
    request-name for long-lived renaming using m = 2k-1 names
k-assignment exit section:
    release-name for long-lived renaming using m = 2k-1 names
    k-exclusion exit section
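The nesting in the schema can be sketched with lock-based stand-ins. This shows only the composition order: `threading.Semaphore` replaces the k-exclusion algorithm and a lock-protected pool replaces long-lived renaming, and neither stand-in tolerates crashes, unlike the algorithms the slides refer to. The class and method names are hypothetical.

```python
import threading


class KAssignmentSketch:
    """Composition order only; the stand-ins below are NOT fault-tolerant."""

    def __init__(self, k: int) -> None:
        self.k = k
        self.m = 2 * k - 1                            # size of the slot name space
        self._k_exclusion = threading.Semaphore(k)    # stand-in for k-exclusion
        self._pool_lock = threading.Lock()            # stand-in for long-lived renaming
        self._free = set(range(1, self.m + 1))

    def enter(self) -> int:
        self._k_exclusion.acquire()                   # k-exclusion entry section
        with self._pool_lock:                         # request-name
            slot = min(self._free)
            self._free.remove(slot)
        return slot

    def leave(self, slot: int) -> None:
        with self._pool_lock:                         # release-name
            self._free.add(slot)
        self._k_exclusion.release()                   # k-exclusion exit section


ka = KAssignmentSketch(k=2)
a = ka.enter()
b = ka.enter()
print(a, b)          # two concurrent holders get different slots
ka.leave(a)
ka.leave(b)
```

The exit section releases the name before leaving k-exclusion, mirroring the entry order in reverse.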