asynchronously. TransferQueues are useful for example in
supporting messaging frameworks that allow messages to
be either synchronous or asynchronous. The base synchronous support in TransferQueues mirrors our fair synchronous queue. The asynchronous additions differ only by
releasing producers before items are taken.
Although we have improved the scalability of the synchronous queue, there may remain potential for improvement in some contexts. Most of the inter-thread contention
in enqueue and dequeue operations occurs at the memory
containing the head (and, for fair queues, tail). Reducing
such contention by spreading it out is the idea behind
elimination techniques introduced by Shavit and Touitou. 20 These
may be applied to components featuring pairs of operations that collectively effect no change to a data structure,
for example, a concurrent push and pop on a stack. Using
elimination, multiple locations (comprising an arena) are
employed as potential targets of the main atomic instructions underlying these operations. If two threads meet in
one of these lower-traffic areas, they cancel each other out.
Otherwise, the threads must eventually fall back (usually, in
a tree-like fashion) to try the main location.
Elimination techniques have been used by Hendler et al. 4
to improve the scalability of stacks, and by us18 to improve
the scalability of the swapping channels in the java.util.co
ncurrent Exchanger class. Moir et al. 15 have also used elimination in concurrent queues, although at the price of weaker
ordering semantics than desired in some applications due
to stack-like (LIFO) operation of the elimination arena.
Similar ideas could be applied to our synchronous queues.
However, to be worthwhile here, the reduced contention
benefits would need to outweigh the delayed release (lower
throughput) experienced when threads do not meet in arena
locations. In preliminary work, we have found elimination
to be beneficial only in cases of artificially extreme contention. We leave fuller exploration to future work.
References
1. anderson, t.e., lazowska, e.d.,
levy, H.M. the performance
implications of thread management
alternatives for shared-memory
multiprocessors. IEEE Trans.
Comput. 38, 12 (dec. 1989),
1631–1644.
2. andrews, g.r. Concurrent
Programming: Principles and Practice.
benjamin/Cummings, 1991.
3. Hanson, d.r. C Interfaces and
Implementations: Techniques for
Creating Reusable Software.
addison-Wesley, 1997.
4. Hendler, d., shavit, n., yerushalmi, l.
a scalable lock-free stack
algorithm. In Proceedings of the
16th Annual ACM Symposium
on Parallelism in Algorithms and
Architectures (Jun. 2004), 206–215.
5. Herlihy, M. Wait-free synchronization.
ACM Trans. Prog. Lang. Syst. 13, 1
(Jan. 1991), 124–149.
6. Herlihy, M., luchangco, V., Moir, M.
obstruction-free synchronization:
double-ended queues as an
example. In Proceedings of the
23rd International Conference on
Distributed Computing Systems
(May 2003).
7. Herlihy, M. P., Wing, J. M.
linearizability: a correctness
condition for concurrent objects.
ACM Trans. Prog. Lang. Syst. 12, 3
(Jul. 1990), 463–492.
8. Hoare, C.a.r. Communicating
sequential processes. Commun.
ACM 21, 8 (aug. 1978), 666–677.
9. Karlin, a.r., li, K., Manasse, M.s.,
owicki, s. empirical studies
of competitive spinning for a
shared-memory multiprocessor.
In Proceedings of the 13th ACM
Symposium on Operating Systems
Principles (oct. 1991), 41–55.
10. lamport, l. a new solution of
dijkstra’s concurrent programming
problem. Commun. ACM 17, 8
(aug. 1974), 453–455.
11. lamport, l. a fast mutual exclusion
algorithm. ACM Trans. Comput. Syst.
5, 1 (Feb. 1987), 1–11.
12. lea, d. the java.util.concurrent
synchronizer Framework. Sci.
Comput. Prog. 58, 3 (dec. 2005),
293–309.
13. Mellor-Crummey, J. M., scott, M.l.
algorithms for scalable
synchronization on shared-memory
multiprocessors. ACM Trans.
Comput. Syst. 9, 1 (Feb. 1991), 21–65.
14. Michael, M. M., scott, M.l. simple,
fast, and practical non-blocking and
blocking concurrent queue algorithms.
In Proceedings of the 15th ACM
Symposium on Principles of Distributed
Computing (May 1996), 267–275.
15. Moir, M., nussbaum, d., shalev, o.,
shavit, n. using elimination to
implement scalable and lock-free
FIFo queues. In Proceedings of
the 17th Annual ACM Symposium
on Parallelism in Algorithms and
Architectures (Jul. 2005), 253–262.
16. scherer III, W.n. synchronization
and concurrency in user-level
software systems. Ph.d. dissertation,
department of Computer science,
university of rochester
(Jan. 2006).
17. scherer III, W.n., lea, d., scott, M.l.
scalable synchronous queues.
In Proceedings of the 11th ACM
Symposium on Principles and
Practice of Parallel Programming
(Mar. 2006).
18. scherer III, W.n., lea, d., scott, M.l.
a scalable elimination-based
exchange channel. In Proceedings,
Workshop on Synchronization and
Concurrency in Object-Oriented
Languages (oct. 2005). In
conjunction with ooPsla ‘05.
19. scherer III, W.n., scott, M.l.
nonblocking concurrent objects
with condition synchronization. In
Proceedings of the 18th International
Symposium on Distributed Computing
(oct. 2004).
20. shavit, n., touitou, d. elimination
trees and the construction of pools
and stacks. In Proceedings of the
7th Annual ACM Symposium on
Parallel Algorithms and Architectures
(Jul. 1995).
21. treiber, r.K. systems programming:
Coping with parallelism. technical
report rJ 5118, IbM almaden
research Center, apr. 1986.
acknowledgments
We are grateful to Dave Dice, Brian Goetz, David Holmes,
Mark Moir, Bill Pugh, and the PPoPP referees for feedback that significantly improved the presentation of this
paper. This work was supported in part by NSF grants numbers EIA-0080124, CCR-0204344, and CNS-0411127, and by
financial and equipment grants from Sun Microsystems
Laboratories.
William N. Scherer III (scherer@edu)
department of Computer science
rice university, Houston, tX.
Doug Lea ( dl@cs.oswego.edu)
department of Computer science
suny oswego, oswego, ny.
Michael L. Scott ( scott@cs.rochester.edu)
department of Computer science
university of rochester, rochester, ny.
© 2009 aCM 0001-0782/09/0500 $5.00