His claim basically was that all of Pixar’s fancy rendering stuff (ray tracing, subsurface scattering, and more complicated simulation effects) can happen on GPUs these days. He pointed out a series of papers that covered basically everything that we do that’s really hard.

The conclusion he drew from that was there’s no reason not to run the whole thing on a GPU right now, but the examples he showed us were all isolated examples that don’t play together. It was pretty obvious that of these separate pieces, there was no reasonable way to build a whole system. They all required different data structures for storing geometry and dealing with piles of rays. I guess the point is that kernels are not systems.

PH Exactly.

TD Two things to consider: First, somebody needs to be thinking about how to bridge that. The system-integration problem is really hard.

Second, unless the architecture of GPUs evolves in ways that I don’t expect, they’re going to be attached processors forever and there’s going to be a general-purpose processor somewhere that’s doing some of the work. How the work is allocated between two different heterogeneous kinds of machines is a really important problem, and it’s really hard because optimization strategies on the two kinds of machines are fundamentally different.

PH That is a great point, and I think that is actually the biggest challenge facing us right now, for another important reason, which we haven’t talked about.

As you probably know, AMD acquired ATI and both Intel and AMD are working on building heterogeneous multicore systems that basically combine a CPU and GPU on a single chip. In the future, it might even have some other specialized hardware on it, such as a video codec. This will be our mainstream computing platform.

A laptop, for example, will have one of these single-chip things in it. How are we going to program this thing? How are we going to schedule work on it? How are we going to deal with different instruction sets or different vector units?

I don’t really know, but I do know that people are going to build these things, and we had better start thinking about it. It’s going to be very challenging to figure out.

KA One way to think about this is to figure out what we’re going to mean by GPU and CPU over the next few years, and what is the difference between the two? A lot of people right now think of something that’s data-parallel, with lots of execution units, as a GPU, and something more sequential as a CPU. But that’s not going to be the right distinction down the road.

TD Certainly, a multicore Intel box with 64 or 128 CPUs on it looks an awful lot like a data-parallel machine from 50,000 feet.

KA But it has a fundamental difference, and I think in some underlying way this may get at your issue: ultimately, the way the resources are deployed and harnessed and the way the data is moved around on a CPU is under software control; the way the data and resources are deployed on a GPU is still significantly under non-software control.

There’s a lot of general-purpose computing in there, but the way it’s wired together, the way the data moves, is not general purpose or at least not exposed yet. It’s still a graphics pipeline, or it’s pretty much neutered in something like CUDA. You lose this notion of wiring a bunch of different things together, and you’re pretty much given a single data-parallel space to operate in. I’m simplifying a bit here, but that’s roughly true. In some sense, what makes something a GPU is that the resources aren’t organized by your software control; they are organized by somebody else.

Think back to the old 860. It was an Intel part that had a general-purpose CPU, but it had a little rasterizer thing on the side. It’s very clear that the CPU directed the rasterizer; the rasterizer didn’t direct the CPU. If you open up a GPU, the rasterizer is pretty much what doles out the work that makes the high-performance thing go.

In some sense, it’s that orientation that determines if it is a CPU or a GPU. When GPUs evolve to the point where that’s no longer true, that’s the day that some of your lower-level concerns get addressed. You say, “Gee, they’ve got different data structures, and how do we wire all this stuff up?”

Once you free up the special-purpose stuff to be slaves to the general-purpose stuff, instead of having the general purpose be a slave to the special purpose, that’s what software programmers are used to. That’s what allows you to change data structures and organize the shape of your overall computation.

I don’t think GPUs are so far away from that, and when that threshold is crossed, then there really aren’t GPUs and CPUs anymore. Now there are just resources that are optimized for highly parallel computation.

PH It’s a neat way of thinking about it.

TD Yes, it is.

KA And it gives you a chance to flip your hands up.

