domains using similar data types are seismic and medical analysis.

FUTURE HARDWARE EVOLUTION: CPU/GPU CONVERGENCE?

Processor features such as instruction formats will likely converge as a result of pressure for a consistent programming model. GPUs may migrate to narrower SIMD widths to increase performance on branching code, while CPUs move to wider SIMD widths to improve instruction efficiency.
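The tension between SIMD width and branching can be made concrete with a toy cost model. The sketch below is only an illustration under a stated assumption: a group of lanes that all agree on a branch costs one pass, while a group that diverges costs two passes because both sides of the branch run with some lanes masked off. The function names and the workload are hypothetical, and real hardware is more nuanced.

```python
def simd_passes(takes_branch, width):
    """Count instruction passes under a simple divergence model (an assumption).

    Elements are processed in groups of `width` lanes. A group whose lanes all
    agree on the branch costs 1 pass; a group that diverges costs 2 passes,
    because both sides of the branch run with some lanes masked off.
    """
    passes = 0
    for start in range(0, len(takes_branch), width):
        group = takes_branch[start:start + width]
        passes += 1 if len(set(group)) == 1 else 2
    return passes


def lane_utilization(takes_branch, width):
    """Fraction of issued lane slots that do useful work."""
    issued_slots = simd_passes(takes_branch, width) * width
    return len(takes_branch) / issued_slots


# Hypothetical workload: 64 elements whose branch outcome flips every 4 elements.
outcomes = [(i // 4) % 2 == 0 for i in range(64)]

for width in (4, 16, 32):
    print(width, simd_passes(outcomes, width), lane_utilization(outcomes, width))
```

In this model a width of 4 keeps every lane busy on the branching workload, while widths of 16 and 32 waste half of their lane slots. On branch-free data the wider widths simply issue fewer passes, which is the instruction-efficiency argument for widening CPU SIMD units.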

The fact remains, however, that some tasks can be executed more efficiently using data-parallel algorithms. Since efficiency is so critical in this era of constrained power consumption, a two-point design that enables the optimal mapping of tasks to each processor model may persist for some time to come.

Further, if hardware continues to lead software, systems will likely have more cores than applications can use at any given time, so offering a choice of processor types increases the chance that more of them are put to work.

Conceivably, a data-parallel system could support the entire feature set of a modern serial CPU core, including a rich set of interthread communication and synchronization mechanisms. The presence of such features, however, may not matter in the longer term, because the more such traditional synchronization features are used, the worse performance scales to high core counts. The fastest applications are not those that port their existing single-threaded or even dual-threaded code across, but those that switch to a different parallel algorithm that scales better because it relies less on general synchronization capabilities.
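Prefix sum is one example of such an algorithm switch. The sketch below contrasts a serial running total, where every step depends on the previous one, with a Hillis-Steele style scan, where each step updates all elements independently and the only synchronization is the boundary between steps. It is a minimal illustration rather than the work-efficient variant, and the function names are hypothetical; the list comprehension merely stands in for what would be a data-parallel kernel on real hardware.

```python
def serial_prefix_sum(values):
    """Running total: each step depends on the result of the previous one."""
    out = []
    total = 0
    for v in values:
        total += v
        out.append(total)
    return out


def data_parallel_prefix_sum(values):
    """Inclusive scan in ceil(log2(n)) steps (Hillis-Steele style).

    Within a step, every output element is computed only from the previous
    step's array, so all elements could be updated at once with no locks.
    Returns the scanned list and the number of steps taken.
    """
    current = list(values)
    offset = 1
    steps = 0
    while offset < len(current):
        # Each element of `nxt` is independent of the others; this
        # comprehension stands in for one data-parallel pass.
        nxt = [
            current[i] + current[i - offset] if i >= offset else current[i]
            for i in range(len(current))
        ]
        current = nxt
        offset *= 2
        steps += 1
    return current, steps


data = [3, 1, 7, 0, 4, 1, 6, 3]
scanned, steps = data_parallel_prefix_sum(data)
print(serial_prefix_sum(data))
print(scanned, steps)
```

For the eight-element input the scan finishes in three steps instead of eight dependent additions. It performs more total additions than the serial loop, which is the trade-off that work-efficient scan algorithms aim to reduce.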

Figure 2 shows a list of algorithms that have been implemented using data-parallel paradigms with varying degrees of success. They are sorted roughly in order of how well they match the data-parallel model.

Data-parallel processors are becoming more broadly available, especially now that consumer GPUs support data-parallel programming environments. This paradigm shift presents a new opportunity for programmers who adapt in time.

guidance from software developers. The first to arrive will have the best chance to drive and shape upcoming data-parallel hardware architectures and development environments to meet the needs of their particular application space.

When programmed effectively, GPUs can be faster than current PC CPUs. The time has come to take advantage of this new processor type by making sure each task in your code base is assigned to the processor and memory model that is optimal for that task.

REFERENCES

1. Govindaraju, N.K., Gray, J., Kumar, R., Manocha, D. 2006. GPUTeraSort: High-performance graphics coprocessor sorting for large database management. Proceedings of the 2006 ACM SIGMOD International Conference on Management of Data; http://research.microsoft.com/research/pubs/view.aspx?msr_tr_id=MSR-TR-2005-183.

2. Krüger, J., Westermann, R. 2003. Linear algebra operators for GPU implementation of numerical algorithms. ACM Transactions on Graphics 22(3).

3. Blythe, D. 2008. The rise of the GPU. Proceedings of the IEEE 96(5).

4. Sengupta, S., Lefohn, A.E., Owens, J.D. 2006. A work-efficient step-efficient prefix sum algorithm. Proceedings of the Workshop on Edge Computing Using New Commodity Architectures: D-26-27.

5. Lefohn, A.E., Kniss, J., Strzodka, R., Sengupta, S., Owens, J.D. 2006. Glift: Generic, efficient, random-access GPU data structures. ACM Transactions on Graphics 25(1).

6. See reference 1.

SUGGESTED FURTHER READING

GPU Gems 2: http://developer.nvidia.com/object/gpu_gems_2_home.html

GPU Gems 3: http://developer.nvidia.com/object/gpu-gems-3.html (Ch 39 on prefix sum)

Glift data structures: http://graphics.cs.ucdavis.edu/~lefohn/work/glift/

Rapidmind:
