AW I think so. The thing about beautiful code is, first of all, it’s beautiful. Second, it’s a lot easier to maintain.
BC I think elegant is something that we all know when we see it, but how would you describe elegant code?
AW It’s just really clear. I don’t know what it is. In our community we have a listbox where people post questions and answers about coding, and the elegant code is always the shortest code.
BC Is it elegant because it’s the shortest, or is being short a side effect of being elegant?
AW I guess it’s both. All things being equal, less code is always better.
BC I was just thinking of the analog to a proof. The shorter proof is almost always the more elegant proof.
AW It’s the same thing. It’s usually easier to understand.
BC Software has often been compared with civil engineering, but I’m really sick of people describing software as being like a bridge. What do you think the analog for software is?
AW Poetry.
BC Poetry captures the aesthetics, but not the precision.
AW I don’t know, maybe it does.
BC Let’s talk about the data sets a little, because you’re dealing with enormous amounts of data, and it’s column-oriented.
AW The typical data is trades, quotes, and orders. These days, there are about a billion quotes a day just in the United States equities. The order events are probably 2 or 3 billion a day, and there are about 50 million trades. The customers tend to keep track of all that and execute trades during the day as well, but they also keep all the history so they can try different strategies.
I’ve done column-oriented databases since 1974. In the ’50s they were doing column-oriented databases on file systems. It’s the same data type, so of course you would store it by column.
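The point about storing same-typed data by column can be illustrated with a minimal Python sketch. The trade values, the field names, and the `vwap` helper are all hypothetical and are not taken from K or kdb+; the sketch only shows that each column becomes one homogeneous, contiguous array, and that a calculation touches just the columns it needs.

```python
from array import array

# Hypothetical trade records in row-oriented form: one tuple per trade.
rows = [
    (34200000, 101.25, 300),
    (34200500, 101.30, 100),
    (34201000, 101.20, 200),
]

# Column-oriented form: one homogeneous, contiguous array per field.
columns = {
    "time": array("q", (r[0] for r in rows)),   # milliseconds since midnight
    "price": array("d", (r[1] for r in rows)),  # 8-byte floats
    "size": array("q", (r[2] for r in rows)),   # 8-byte integers
}

def vwap(cols):
    """Volume-weighted average price; reads only the price and size columns."""
    notional = sum(p * s for p, s in zip(cols["price"], cols["size"]))
    return notional / sum(cols["size"])
```

Because every element of a column has the same type and width, the nth value sits at a fixed offset, which is what makes a single seek followed by a sequential read possible when the columns live on disk.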
BC Obviously that’s the right choice when you’re dealing with that kind of a data hose. If you were to build a transactional system on K, would you still want it to be column-oriented?
AW Yes, column-oriented databases seem fine. I think the reason they’re fine is because we always set it up so that the hot stuff is in memory. We did that in the ’70s when our memory was 32 K and we did high transaction rates. Now the guys have 128 gig, which is enough for a billion because these records are only 20 or 30 bytes.
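The sizing claim can be checked with simple arithmetic. The sketch below assumes that "128 gig" means decimal gigabytes and that records are fixed-width with no per-record overhead; both are simplifications made for illustration, and the function names are hypothetical.

```python
RECORDS_PER_DAY = 1_000_000_000  # "a billion" records, from the interview
RAM_BYTES = 128 * 10**9          # "128 gig", read as decimal gigabytes (assumption)

def day_footprint_bytes(record_bytes, records=RECORDS_PER_DAY):
    """Raw size of one day of fixed-width records, ignoring any overhead."""
    return records * record_bytes

def fraction_of_ram(record_bytes, ram_bytes=RAM_BYTES):
    """Share of memory that one day of records would occupy."""
    return day_footprint_bytes(record_bytes) / ram_bytes

if __name__ == "__main__":
    for width in (20, 30):
        gb = day_footprint_bytes(width) / 10**9
        print(f"{width} bytes/record: {gb:.0f} GB, {fraction_of_ram(width):.0%} of RAM")
```

Under these assumptions a billion records take 20 to 30 GB, under a quarter of the machine's memory, which leaves room for the smaller derived tables.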
BC So they load the whole thing into memory and then operate on it?
AW All day long all the hot stuff is in memory, and then during the day it takes about two minutes to write the whole thing down to disk and then flip to a new day and start from scratch.
BC In that case, is the data coming from a feed or from disk?
AW Multiple feeds, so the realtime systems and the historical systems are all running 24/7. It’s just that there’s always a quiet time.
BC But the transactions in that system are really appending temporal data to the end of a very large table.
AW Yes, but with all the analytics, they could be doing all kinds of updates to smaller tables. That’s very typical. In fact, we encourage them to do that because all your realtime analytics need to be look-ups. You can’t do any aggregations in realtime, so you have a lot of raw data. You have these billion rows of raw data spread among three tables, maybe. You might have 10 or 20 smaller tables that represent a certain state, such as book. There are also certain calculations that you want to maintain so that you can do either constant-time look-up or binary-search look-up.
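The two look-up styles mentioned here can be sketched in Python. This is an illustration only: the `LastQuote` class and the `asof` function are hypothetical names, a dictionary stands in for the constant-time state table, and the standard-library `bisect` module stands in for the binary search over a time-sorted column.

```python
from bisect import bisect_right

class LastQuote:
    """Small state table: latest (bid, ask) per symbol, updated on every quote."""

    def __init__(self):
        self._state = {}

    def update(self, symbol, bid, ask):
        # Overwrite in place, so the table stays one row per symbol.
        self._state[symbol] = (bid, ask)

    def lookup(self, symbol):
        # Average constant-time look-up; None if the symbol was never quoted.
        return self._state.get(symbol)

def asof(times, values, t):
    """Latest value whose timestamp is <= t, by binary search.

    `times` must be sorted ascending and parallel to `values`.
    """
    i = bisect_right(times, t)
    return values[i - 1] if i else None
```

For example, with `times = [1, 3, 5]` and `values = [10, 30, 50]`, `asof(times, values, 4)` returns the value recorded at time 3. Neither look-up scans or aggregates the raw data at query time; the work is done once, when each event arrives.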
BC You were saying that keeping data in DRAM is incredibly important for your performance. Looking down the track, what do you see in terms of the technologies that are coming? In particular, I’ve got to ask you about Flash and whether you think Flash memory is interesting in terms of its ability to get not DRAM speeds, but much-better-than-disk speeds. Does that pose any sort of change?
AW I think the customers are starting to investigate. It sounds great. It should provide more opportunities for other kinds of mid-range stuff.
Obviously, right now there’s no random access to disk, except for the research people. The average customer’s database is 30 terabytes, a trillion rows. So when they want to say, “Give me all the IBM activity for a certain day,” we teach them, by all means, since it’s column-oriented, take as few columns as you need for whatever it is you need to do. You might need four columns: time, price, size, and something else. You’ve got to do four seeks, because we’ve got all these indexes set up so that’s all in memory. If you want all the IBM activity for a certain day, that’s going to be four seeks and then, boom, you’ll read a few megabytes out of each of those columns. Of course, if you go back to IBM on that day, it will probably be sitting in your file cache.
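The access pattern described here, an in-memory index plus one seek per needed column, can be sketched in Python. Everything below is an assumption made for illustration and not kdb+'s actual on-disk format: one file per column, 8-byte little-endian floats for every value, rows stored sorted by symbol, and an index mapping each symbol to its first row and row count. The names `write_column`, `read_slice`, `query`, and `demo` are hypothetical.

```python
import os
import struct
import tempfile

ITEM = struct.Struct("<d")  # assumption: every column holds 8-byte floats

def write_column(path, values):
    """Write one column as a flat file of fixed-width values."""
    with open(path, "wb") as f:
        for v in values:
            f.write(ITEM.pack(v))

def read_slice(path, start, count):
    """One seek, then one sequential read of `count` values."""
    with open(path, "rb") as f:
        f.seek(start * ITEM.size)
        data = f.read(count * ITEM.size)
    return [v for (v,) in ITEM.iter_unpack(data)]

def query(directory, index, symbol, wanted):
    """Read only the wanted columns for one symbol; the index stays in memory."""
    start, count = index[symbol]
    return {c: read_slice(os.path.join(directory, c), start, count) for c in wanted}

def demo():
    # Rows are sorted by symbol, so each symbol is one contiguous run.
    table = {
        "time": [1.0, 2.0, 3.0, 4.0, 5.0],
        "price": [50.0, 50.5, 120.0, 120.5, 121.0],
        "size": [10.0, 20.0, 100.0, 200.0, 300.0],
        "cond": [0.0, 0.0, 1.0, 0.0, 1.0],
    }
    index = {"AAA": (0, 2), "IBM": (2, 3)}  # symbol -> (first row, row count)
    with tempfile.TemporaryDirectory() as d:
        for name, values in table.items():
            write_column(os.path.join(d, name), values)
        # Three columns requested, so three seeks; "cond" is never opened.
        return query(d, index, "IBM", ["time", "price", "size"])
```

The cost of the query is proportional to the number of columns requested and the size of the symbol's run, not to the width or length of the full table.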
BC That’s assuming, too, that when I’m accessing a file sequentially, it corresponds to sequential accesses on disk, which is not necessarily the case for copy-on-write