or are terrible writers, and often there is a fatal combination of the two. Starting from that base, the typical software engineer produces code that somewhat mirrors the specification, and as things that grow in poison soil themselves become poison, the code is often as confusing as the specification.
What is hwpmc? It is a set of counters that reside on the CPU that can record various types of events of interest to engineers. If you want to know if your code is thrashing the L2 cache or if the compiler is generating suboptimal code that’s messing up the pipeline, this is a system you want to use. Though these things may seem esoteric, if you’re working on high-performance computing, they’re vitally important. As you might imagine, such counters are CPU-specific, but not just by company, with Intel being different from AMD: even the model of CPU bears on the counters that are present, as well as how they are accessed.
The sections covering hwpmc in Intel’s current manual, Intel 64 and IA-32 Architectures Software Developer’s Manual Volume 3B: System Programming Guide, Part 2, encompass 249 pages: 81 to describe the various systems on various chips and 168 to cover all the counters you can use on the chips. That’s a decent-size novel, but of course without the interesting story line. Kudos to Intel’s tech writers, as this is not the worst chip manual I have ever read, but I would still rather have been reading something else. Once I had read through all of this background material, I was a bit worried about what I would see when I opened the file.
But I wasn’t too worried, because I personally knew the programmer who wrote the code. He’s a very diligent engineer who not only is a good coder but also can explain what he has done and why. When I told him that I would be trying to add more chip models to the system he wrote, he sent me a 1,300-word e-mail message detailing just how to add support for new chips and counters to the system.
What’s so great about this software? Well, let’s look at a few snippets of the code. It’s important always to read the header files before the code, because header files are where the structures are defined. If the structures aren’t defined in the header file, you’re doomed from the start. Looking at the top of the very first header file I opened, we see the following code:
#define PMC_VERSION_MAJOR 0x03
#define PMC_VERSION_MINOR 0x00
#define PMC_VERSION_PATCH 0x0000
Why do these lines indicate quality code to me? Is it the capitalization? Spacing? Use of tabs? No, of course not! It’s the fact that there are version numbers. The engineer clearly knew his software would be modified not only by himself but also by others, and he has specifically allowed for that by having major, minor, and patch version numbers. Simple? Yes. Found often? No.
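To make the value of those three numbers concrete, here is a minimal sketch of how they might be used. It assumes, purely for illustration, that the fields are packed into one 32-bit word (8 bits major, 8 bits minor, 16 bits patch) and that two versions are compatible when their major numbers match. The names EXAMPLE_PACK_VERSION, EXAMPLE_PMC_VERSION, example_version_major, and example_version_compatible are hypothetical and are not taken from the header.

```c
#include <assert.h>
#include <stdint.h>

/* The three constants as they appear in the header. */
#define PMC_VERSION_MAJOR 0x03
#define PMC_VERSION_MINOR 0x00
#define PMC_VERSION_PATCH 0x0000

/* Assumed layout: 8 bits major, 8 bits minor, 16 bits patch. */
#define EXAMPLE_PACK_VERSION(maj, min, patch) \
    (((uint32_t)(maj) << 24) | ((uint32_t)(min) << 16) | (uint32_t)(patch))

#define EXAMPLE_PMC_VERSION \
    EXAMPLE_PACK_VERSION(PMC_VERSION_MAJOR, PMC_VERSION_MINOR, PMC_VERSION_PATCH)

/* Catch a field that would overflow its slot at compile time. */
static_assert(PMC_VERSION_MAJOR <= 0xff, "major version must fit in 8 bits");
static_assert(PMC_VERSION_MINOR <= 0xff, "minor version must fit in 8 bits");
static_assert(PMC_VERSION_PATCH <= 0xffff, "patch version must fit in 16 bits");

/* Extract the major field from a packed version word. */
static inline uint32_t example_version_major(uint32_t version)
{
    return version >> 24;
}

/* Assumed policy: two sides can talk only if their major versions match. */
static inline int example_version_compatible(uint32_t caller, uint32_t provider)
{
    return example_version_major(caller) == example_version_major(provider);
}
```

Under these assumptions, a tool built against version 3.1.7 would pass the check against a 3.0.0 provider, while one built against 2.x would be rejected up front instead of failing in some confusing way later.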
The next set of lines—and remember this is only the first file I opened—were also instructive:
/*
 * Kinds of CPUs known
 */
#define __PMC_CPUS() \
    __PMC_CPU(AMD_K7, "AMD K7") \
    __PMC_CPU(AMD_K8, "AMD K8") \
    __PMC_CPU(INTEL_P5, "Intel Pentium") \
    __PMC_CPU(INTEL_P6, "Intel Pentium Pro") \
    __PMC_CPU(INTEL_CL, "Intel Celeron") \
    __PMC_CPU(INTEL_PII, "Intel Pentium II") \
    __PMC_CPU(INTEL_PIII, "Intel Pentium III") \
    __PMC_CPU(INTEL_PM, "Intel Pentium M") \
    __PMC_CPU(INTEL_PIV, "Intel Pentium IV")
Frequent readers of KV might think it was the comment that made me happy, but they would be wrong. It was the translation of constants into intelligible textual names. Nothing is more frustrating when working on a piece of software than having to remember yet another stupid, usually hex, constant. I am not impressed by programmers who can remember they numbered things from 0x100 and that 0x105 happens to be significant. Who cares? I don’t. What I want is code that uses descriptive names. Also note the constants in the code aren’t very long, but are just long enough to make it easy to know in the code which chip we’re talking about.
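A list macro of this shape is commonly expanded more than once, with the per-entry macro redefined each time, so that the symbolic constants and their textual names come from a single table and cannot drift apart. The sketch below shows that general technique on a shortened list. It is an illustration, not a copy of how the header itself expands its list: EXAMPLE_CPU_LIST, EXAMPLE_CPU_ENTRY, the enum, and both lookup functions are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* A shortened, hypothetical list in the style of the header's CPU table. */
#define EXAMPLE_CPU_LIST()                          \
    EXAMPLE_CPU_ENTRY(AMD_K7,   "AMD K7")           \
    EXAMPLE_CPU_ENTRY(AMD_K8,   "AMD K8")           \
    EXAMPLE_CPU_ENTRY(INTEL_P5, "Intel Pentium")

/* First expansion: one enum constant per entry, plus a trailing count. */
#define EXAMPLE_CPU_ENTRY(sym, desc) EXAMPLE_CPU_##sym,
enum example_cpu { EXAMPLE_CPU_LIST() EXAMPLE_CPU_COUNT };
#undef EXAMPLE_CPU_ENTRY

static_assert(EXAMPLE_CPU_COUNT > 0, "CPU list must not be empty");

/* Second expansion: a name table indexed by the enum constants. */
#define EXAMPLE_CPU_ENTRY(sym, desc) [EXAMPLE_CPU_##sym] = desc,
static const char *const example_cpu_names[EXAMPLE_CPU_COUNT] = {
    EXAMPLE_CPU_LIST()
};
#undef EXAMPLE_CPU_ENTRY

/* Map a constant to its readable name; out-of-range values give "unknown". */
static const char *example_cpu_name(int cpu)
{
    if (cpu < 0 || cpu >= EXAMPLE_CPU_COUNT)
        return "unknown";
    return example_cpu_names[cpu];
}

/* Map a readable name back to its constant, or -1 if it is not listed. */
static int example_cpu_from_name(const char *name)
{
    for (int i = 0; i < EXAMPLE_CPU_COUNT; i++) {
        if (strcmp(example_cpu_names[i], name) == 0)
            return i;
    }
    return -1;
}
```

With this arrangement, adding a chip means adding one line to the list; the enum value and the printable name follow automatically, and nobody has to remember a hex constant.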
Figure 1 shows another fine example from the header file. I’ve used this snippet so I can avoid including the whole file. Here, machine-dependent structures are separated from machine-independent structures. It would seem obvious that you want to separate the bits of data that are specific to a certain type of CPU or device from data that is independent, but what seems obvious is rarely done in practice. The fact that the engineer thought about which bits should go where indicates a high level