address with a magic character (NUL)
marking the end? This is a decision
that the dynamic trio of Ken Thompson, Dennis Ritchie, and Brian Kernighan must have made one day in the
early 1970s, and they had full freedom
to choose either way. I have not found
any record of the decision, which I admit is a weak point in its candidacy:
I do not have proof that it was a conscious decision.
As far as I can determine from my
research, however, the address +
length format was preferred by the
majority of programming languages at
the time, whereas the address + magic-marker format was used mostly
in assembly programs. As the C language was a development from assembly to a portable high-level language,
I have a difficult time believing Ken,
Dennis, and Brian gave it no thought.
Using an address + length format would cost one more byte of
overhead than an address + magic-marker format, and their PDP
computer had limited core memory.
In other words, this could have been a
perfectly typical and rational IT or CS
decision, like the many similar decisions we all make every day; but this
one had quite atypical economic consequences.
Hardware development costs. Ini-
tially, Unix had little impact on hard-
ware and instruction set design. The
CPUs that offered string manipula-
tion instructions—for example, Z-80
and DEC VAX—did so in terms of the
far more widespread adr+len model.
Once Unix and C gained traction, how-
ever, the terminated string appeared
on the radar as an optimization tar-
get, and CPU designers started to add
instructions to deal with them. One
example is the Logical String Assist
instructions IBM added to the ES/9000
520-based processors in 1992.1