Priority queues

Transcription

Priority queues
Algorithms and Data Structures, Fall 2011
Priority queues
Rasmus Pagh
Based on slides by Kevin Wayne, Princeton
Algorithms, 4th Edition
Monday, October 24, 11
·
Robert Sedgewick and Kevin Wayne
·
Copyright © 2002–2011
Priority queue
Collections. Insert and delete items. Which item to delete?
Stack. Remove the item most recently added.
Queue. Remove the item least recently added.
Symbol table. Remove any desired item.
Priority queue. Remove the largest (or smallest) item.
operation argument
insert
insert
insert
remove max
insert
insert
insert
remove max
insert
insert
insert
remove max
return
value size
P
Q
E
Q
X
A
M
X
P
L
E
P
1
2
3
2
3
4
5
4
5
6
7
6
contents
(unordered)
P
P
P
P
P
P
P
P
P
P
P
E
Q
Q
E
E
E
E
E
E
E
E
M
E
X
X
X
M
M
M
M
A
A
A
A
A
A
A
P
M
P
P
P
L
L
L
E
E
A sequence of operations on a priority queue
Monday, October 24, 11
P
P
E
E
E
A
A
A
A
A
A
A
2
Priority queue API
Requirement. Generic items are Comparable.
public class MaxPQ<Key extends Comparable<Key>>
MaxPQ()
create a priority queue
MaxPQ(maxN)
create a priority queue of initial capacity maxN
void insert(Key v)
insert a key into the priority queue
Key max()
return the largest key
Key delMax()
return and remove the largest key
boolean isEmpty()
int size()
is the priority queue empty?
number of entries in the priority queue
API for a generic priority queue
3
Monday, October 24, 11
Priority queue applications
•
•
•
•
•
•
•
•
•
•
Event-driven simulation.
[customers in a line, colliding particles]
Numerical computation.
[reducing roundoff error]
Data compression.
[Huffman codes]
Graph searching.
[Dijkstra's algorithm, Prim's algorithm]
Computational number theory.
[sum of powers]
Artificial intelligence.
[A* search]
Statistics.
[maintain largest M values in a sequence]
Operating systems.
[load balancing, interrupt handling]
Discrete optimization.
[bin packing, scheduling]
Spam filtering.
[Bayesian spam filter]
Generalizes: stack, queue (why?)
4
Monday, October 24, 11
Simulation example 1
% java CollisionSystem 100
5
Monday, October 24, 11
Simulation example 1
% java CollisionSystem 100
5
Monday, October 24, 11
Priority queue client example
Challenge. Find the largest M items in a stream of N items (N huge, M large).
•
•
Fraud detection: isolate $$ transactions.
File maintenance: find biggest files or directories.
Constraint. Not enough memory to store N items.
sort key
6
Monday, October 24, 11
Priority queue client example
Challenge. Find the largest M items in a stream of N items (N huge, M large).
•
•
Fraud detection: isolate $$ transactions.
File maintenance: find biggest files or directories.
Constraint. Not enough memory to store N items.
% more tinyBatch.txt
Turing
6/17/1990
vonNeumann 3/26/2002
Dijkstra
8/22/2007
vonNeumann 1/11/1999
Dijkstra
11/18/1995
Hoare
5/10/1993
vonNeumann 2/12/1994
Hoare
8/18/1992
Turing
1/11/2002
Thompson
2/27/2000
Turing
2/11/1991
Hoare
8/12/2003
vonNeumann 10/13/1993
Dijkstra
9/10/2000
Turing
10/12/1993
Hoare
2/10/2005
644.08
4121.85
2678.40
4409.74
837.42
3229.27
4732.35
4381.21
66.10
4747.08
2156.86
1025.70
2520.97
708.95
3532.36
4050.20
sort key
6
Monday, October 24, 11
Priority queue client example
Challenge. Find the largest M items in a stream of N items (N huge, M large).
•
•
Fraud detection: isolate $$ transactions.
File maintenance: find biggest files or directories.
Constraint. Not enough memory to store N items.
% more tinyBatch.txt
Turing
6/17/1990
vonNeumann 3/26/2002
Dijkstra
8/22/2007
vonNeumann 1/11/1999
Dijkstra
11/18/1995
Hoare
5/10/1993
vonNeumann 2/12/1994
Hoare
8/18/1992
Turing
1/11/2002
Thompson
2/27/2000
Turing
2/11/1991
Hoare
8/12/2003
vonNeumann 10/13/1993
Dijkstra
9/10/2000
Turing
10/12/1993
Hoare
2/10/2005
644.08
4121.85
2678.40
4409.74
837.42
3229.27
4732.35
4381.21
66.10
4747.08
2156.86
1025.70
2520.97
708.95
3532.36
4050.20
% java TopM
Thompson
vonNeumann
vonNeumann
Hoare
vonNeumann
5 < tinyBatch.txt
2/27/2000 4747.08
2/12/1994 4732.35
1/11/1999 4409.74
8/18/1992 4381.21
3/26/2002 4121.85
sort key
6
Monday, October 24, 11
Priority queue client example
Challenge. Find the largest M items in a stream of N items (N huge, M large).
Transaction data
use a min-oriented pq
type is Comparable
pq contains
largest M items
Exercise: Theoretical analysis of the above, using
properties of priority queue implementations.
7
Monday, October 24, 11
Priority queue client example
Challenge. Find the largest M items in a stream of N items (N huge, M large).
MinPQ<Transaction> pq = new MinPQ<Transaction>();
use a min-oriented pq
while (StdIn.hasNextLine())
{
String line = StdIn.readLine();
Transaction item = new Transaction(line);
pq.insert(item);
if (pq.size() > M)
pq contains
pq.delMin();
largest M items
}
Transaction data
type is Comparable
Exercise: Theoretical analysis of the above, using
properties of priority queue implementations.
7
Monday, October 24, 11
How to implement a priority queue ADT?
Answer 1: Special case of search tree API.
Answer 2: Can do this simpler!
Recursive definition:
A priority queue (PQ) for n elements can be represented as:
- The maximum element, and
- two recursive PQs for remaining about (n-1)/2 elements each, if n>1.
Problem session
If one defines a PQ recursively as above, how can one implement:
- max?
- insert?
- delMax?
8
Monday, October 24, 11
Complete binary tree
For an elegant implementation of the previous idea, we need an alternative
tree representation.
Complete tree: Perfectly balanced, except for bottom level.
complete tree with N = 16 nodes (height = 4)
Property. Height of complete binary tree with N nodes is ⎣lg N⎦.
Pf. Height only increases when N is a power of 2.
9
Monday, October 24, 11
Binary heap representations
Binary heap. Array representation of a heap-ordered complete binary tree.
Heap-ordered binary tree.
•
•
Keys in nodes.
No smaller than children’s keys.
Array representation.
•
•
i
a[i]
0
-
Take nodes in level order.
1
T
2
S
3
R
S
R
4
P
5
N
6
O
7
A
P
N
O
A
8
E
9 10 11
I H G
E
I
T
No explicit links needed!
1
8 E
9 I
G
T
3 R
2 S
4 P
H
5 N
10 H
6 O
7 A
11 G
Heap representations
10
Monday, October 24, 11
Binary heap properties
Proposition. Largest key is a[1], which is root of binary tree.
indices start at 1
Proposition. Can use array indices to move through tree.
•
•
Parent of node at k is at k/2.
Children of node at k are at 2k and 2k+1.
i
a[i]
0
-
1
T
2
S
3
R
S
R
4
P
5
N
6
O
7
A
P
N
O
A
8
E
9 10 11
I H G
E
I
T
1
8 E
9 I
G
T
3 R
2 S
4 P
H
5 N
10 H
6 O
7 A
11 G
Heap representations
11
Monday, October 24, 11
Promotion in a heap
Scenario. Node's key becomes larger key than its parent's key.
To eliminate the violation:
•
•
Exchange key in node with key in parent.
Repeat until heap order restored.
S
private void swim(int k)
{
while (k > 1 && less(k/2, k))
{
exch(k, k/2);
k = k/2;
}
parent of node at k is at k/2
}
R
P
5
G
E
I
H
T
O
A
violates heap order
(larger key than parent)
G
1
T
2
R
S
P 5
G
E
I
H
O
A
G
Bottom-up heapify (swim)
12
Monday, October 24, 11
Insertion in a heap
Insert. Add node at end, then swim it up.
Cost. At most 1 + lg N compares.
insert
remo
T
R
N
P
public void insert(Key x)
{
pq[++N] = x;
swim(N);
}
E
H
I
G
O
S
A
key to insert
T
R
N
P
E
H
I
G
O
add key to heap
violates heap order
S
T
swim up
R
S
P
E
A
N
I
G
O
sink down
A
H
Heap operation
13
Monday, October 24, 11
Demotion in a heap
Scenario. Node's key becomes smaller than one (or both) of its children's keys.
To eliminate the violation:
•
•
Exchange key in node with key in larger child.
Repeat until heap order restored.
private void sink(int k)
{
children of node
while (2*k <= N)
at k are 2k and 2k+1
{
int j = 2*k;
if (j < N && less(j, j+1)) j++;
if (!less(k, j)) break;
exch(k, j);
k = j;
}
}
violates heap order
(smaller than a child)
2
R
H
5 S
P
E
T
I
N
O
A
G
T
2
5
P
E
R
S
I
10
H
N
O
A
G
Top-down reheapify (sink)
14
Monday, October 24, 11
Delete the maximum in a heap
Delete max. Exchange root with node at end, then sink it down.
Cost. At most 2 lg N compares.
insert
remove the maximum
T
R
N
P
public Key delMax()
E
I
G
{
Key max = pq[1];
exch(1, N--);
N
sink(1);
pq[N+1] = null;
P
return max;
E
I
G
}
H
O
S
A
P
E
key to insert
N
I
G
R
P
add key to heap
violates heap order
S
E
N
I
G
I
R
G
O
A
remove node
from heap
T
S
S
N
violates
heap order
S
prevent
loitering
O
A
A
exchange keys
with root
H
T
P
O
H
R
swim up
E
R
S
T
H
key to remove
T
O
R
P
sink down
I
A
H
E
N
H
O
A
G
Heap operations
15
Monday, October 24, 11
Priority queues with remove, key updates
•
•
Some priority queue ADTs support the removal of a specific item
- Need to specify a “handle” to identify the item
Two ways to do this efficiently for a heap:
- 1. Maintain a hash table that keeps track of the location of each item in
the heap.
- 2. Leave removed items in the heap. When deleting the maximum, check if
the item has been deleted (using a hash table).
•
Some priority queue ADTs give special methods for changing (increasing/
decreasing) the priority of an item.
- Changes that move items towards the front of the queue are fast!
16
Monday, October 24, 11
Priority queues implementation cost summary
order-of-growth of number of comparisons for priority queue with N items
implementation
insert
del max
max
unordered array
1
N
N
ordered array
N
1
1
binary heap
log N
log N
1
d-ary heap
logd N
d logd N
1
Fibonacci
1
impossible
1
log N
†
1
† amortized
Gerth Brodal, Aarhus University
1
1
why impossible?
17
Monday, October 24, 11
Priority queues in Java
• java.util.PriorityQueue
– decreaseKey through remove + insert
– From Java 1.5 API documentation:
An unbounded priority queue based on a priority heap. …
Implementation note: this implementation provides O(log(n))
time for the insertion methods …; linear time for the
remove(Object) and contains(Object) methods; and constant
time for the retrieval methods (peek, element, and size).
• An implementation with better remove
performance can be found e.g. in the
JDSL library.
18
Monday, October 24, 11
Is number of comparisons the right measure of complexity?
Hard drive
Latency: 5ms
(15 million cycles)
SSD
Latency: 50µs
(150,000 cycles)
19
Monday, October 24, 11
Heaps and cache-efficiency
Suppose
- We have a heap of primitive data types (not objects), e.g. integers.
- Our heap is stored in a memory with block size at least d (each memory
access transfers a block of d words of memory).
Memory transfers of binary heap
- 1 transfer to access first log d levels
- 1 transfer for each subsequent level
- Total: 1+log(n)-log(d)
Memory transfers of d-ary heap
- 1 transfer to access each level
- Total: log(n)/log(d)
Problem: Different d needed to optimize for different block sizes.
20
Monday, October 24, 11
van Emde Boas (vEB) layout
Presented in master’s thesis of Prokop (1999):
Alternative way of laying out a complete binary tree in an array.
Observation: A tree of depth d can be “cut up” into 2d/2+1 trees:
- A top tree of the first d/2 levels
- 2d “bottom” trees with root at level d/2+1
vEB layout:
- First store top tree (recursively using vEB layout)
- Then store each bottom tree (recursively using vEB layout)
Standard heap layout (level order)
van Emde Boas layout
21
Monday, October 24, 11
Properties of van Emde Boas layout
1. Can efficiently compute children/parent of node, given a suitable lookup table.
2. The number of memory transfers at any level of the memory hierarchy in a
root-to-leaf traversal is within a factor 2 of optimal.
“Cache-oblivious algorithms” make efficient use of caches without considering
cache parameters.
There exits heaps (such as the funnel heap) that are theoretically better than
a binary heap with vEB layout.
22
Monday, October 24, 11
‣
‣
‣
‣
‣
API
elementary implementations
binary heaps
heapsort
event-driven simulation
23
Monday, October 24, 11
11)
exch(1, 10)
sink(1, 9)
S
Heapsort
R
L
E
E
11)
P
A
X
L
Basic plan for in-place Msort.E
A
A
E
O
E
E
in arbitrary order
4
A
8
heap construction
11)
exch(1, 8)
sink(1, 7)
S
X
L
2
8
E
E
11)
4
A
R
9
5
10
S
E
E
(heap-ordered)
Monday, October 24, 11
T
A
M
RP
7
(in
L6 Xplace)
AE
E
10
A
L X
E
T
X
AE
RMM
LM
L
4
8
M
RM
S
exch(1,ex
sink(1,si
AAP
XRO
E
E
TE EX
starting point (heap-ordered)
SPO
T
LO
9
A
T
3R
EP
R
SE
5
P
SE
A
ex
exch(1,s
sink(1,
1S
O
2
P
AP
A
SX
10 E
TE
L
ML
6X
RO
11E
X
X
sink(3, 11)
exch(1, 10)
7A
AP
A
R
result (sorted)
sink(4, 11)
7
RSE
LTP
sorted
R resultE
L
XO
11
sink(4, 11)
exch(1, 11)
sink(1, 10)
(in place)
RE
E
OT
O
S
M
SE
9
6
EM
3
EX
L
T
SP
RM
starting point (arbitrary order)
sortdown
E
R
11
A
5
LT
build a max-heap
E
M RP
X
T
SL
starting point (arbitrary order)
O
R
S
A
O
exch(1,
2) 11)
sink(5,
sink(1, 1)
P
3
exch(1,
7)
sink(5, 11)
sink(1, 6)
X
L
O
O
M
T
1
SE
1
2
X
T
X
T
exch(1, 3)
sink(1, 2)
start with array of keys
S
M
P
O
heap construction
E
L
M
S
R
P
A
R
E
L
X
T
• Create max-heap with all N keys.
the maximum
key.
exch(1,
9)
•S Repeatedly remove
R
sink(1, 8)
L
E
R
O
X
E
exch(1, 4)
sink(1, 3)
S
S
S
24
e
exch(1,s
starting point (heap-ordered)
starting point (arbitrary order)
Heapsort: heap construction
sink(5, 11)
exch(1, 11)
sink(1, 10)
S
O
L
First pass. Build heap using bottom-up method.
for (int k = N/2; k >= 1; k--)
sink(a, k, N);
sink(4, 11)
A
X
4
8
5
T
9
S
O
10
6
E
3
R
X
7
A
11
E
L
P
M
starting point (arbitrary order)
sink(5, 11)
E
R
R
T
OO
L
E
A
S
R
X
T
L
O
A
X
SR
P
E
L
R
T
S
O
E
L E
SM
R
A
E
T O X P
A A
AA
A
E
X
T
S
P
L E
SM
R
E
E
exch(1, 7)
sink(1, 6)
exch(1, 4)
M E
sink(1, 3)
XS
OE ET EX
result (heap-ordered)
T
E
S
E M
A
RE
M
O
R
exch(1, 8)
sink(1, 7)
exch(1, 5)
O L
sink(1, 4)
R
LL
PO
MM
E
A
A
X
E
E
L
SA
R
TP
A
X
X
S
X
R
LE
sink(1, 11)
exch(1, 10)
sink(1, 9)
S
L
L
L
P
R
T
P
M
M
O
M
A
X
sink(4, 11)
T
P
P
E
E
P
M
A
E
X
T
E
exch(1, 9)
8)
exch(1,sink(1,
6)
M
sink(1, 5)
P
E
E
P
E
E
O
starting point (heap-ordered)
sink(2, 11)
S
M
M
R
L
O
T
S
X
exch(1, 11)T
sink(1, 10)
S
O
T
L
sortdown
sink(3, 11)
2
R
O
M
heap construction
1
S
P
E
E
X
E
E
A
R
exch(1, 10)
sink(1, 9)
S
L
P
L
R
T
S
O
M
O
M
A
X
E
E
P
M
P
R
T
T
E
P
T O X P
X
Heapsort: constructing (left) and sorting down (right) a
sink(3, 11)
Monday, October 24, 11
O
exch(1, 9)
sink(1, 8)
S
X
exch(1, 3)
sink(1, 2)
R
P
E
25
E
A
E
Heapsort: sortdown
Second pass.
•
•
sortdown
heap construction
1
S
R
Remove the maximum, oneT at a Etime.
A
X
2
3
O
4
5
6
9
10
11
sink(5, 11)
L
E
P
M
while (N > 1)
{
sink(4, 11)
exch(a, 1, N--);
O
L
T
sink(a, 1, N);
E
P
M
}
E
exch(1, 10)
sink(1, 9)
E
exch(1, 9)
sink(1, 8)
S
L
E
E
sink(2, 11)
T
L
P
E
E
sink(1, 11)
X
L
R
A
2
T
E
X
4
X
T
P
8
R
E
E
O
result (heap-ordered)
L
9
S
A
3
E
5
10
P
O
1
Heapsort: constructing (left) and sorting down (right) a heap
Monday, October 24, 11
M
S
E
L
S
E
O
A
R
A
L
R
M
S
A
E
exch(1, 7)
sink(1, 6)
X
T
S
P
O
E
X
T
S
M
E
L
E
exch(1, 2)
sink(1, 1)
P
M
R
T
M
A
R
E
L
R
O
X
A
E
exch(1, 8)
sink(1, 7)
S
X
T
S
P
O
A
X
T
S
M
M
E
L
E
exch(1, 3)
sink(1, 2)
R
O
E
L
R
P
A
R
A
E
X
T
E
X
T
P
L
X
T
A
R
P
O
exch(1, 4)
sink(1, 3)
S
O
M
O
O
A
X
M
S
R
P
R
E
A
X
E
L
E
A
R
P
O
X
T
S
R
S
L
E
E
exch(1, 5)
sink(1, 4)
T
O
E
A
E
E
O
starting point (heap-ordered)
M
S
A
R
P
A
X
sink(3, 11)
M
M
R
T
P
L
P
M
L
S
exch(1, 11)
sink(1, 10)
S
O
M
T
7
E
L
P
Leave in array, instead ofMstarting
null
ing
out.
point (arbitrary order)
8
exch(1, 6)
sink(1, 5)
X
M
6
O
E
7
P
11
X
T
result (sorted)
26
Heapsort: mathematical analysis
Proposition. Heap construction uses fewer than 2 N compares and exchanges.
Proposition. Heapsort uses at most 2 N lg N compares and exchanges.
Significance. In-place sorting algorithm with N log N worst-case.
•
•
•
Mergesort: no, linear extra space.
in-place merge possible, not practical
Quicksort: no, quadratic time in worst case.
N log N worst-case quicksort possible,
Heapsort: yes!
not practical
Bottom line. Heapsort is optimal for both time and space, but:
•
•
•
Inner loop longer than quicksort’s.
Makes poor use of cache memory.
Not stable.
27
Monday, October 24, 11
Conclusion
Heaps give a more space- and time-efficient implementation of the priority
queue ADT.
- Yields a very space efficient sorting algorithm.
- Cache-efficiency can be achieved by changing the layout.
Be careful when deleting items from a heap - can take linear time in some
implementations!
We will be using priority queues as building blocks in graph algorithms.
28
Monday, October 24, 11