What is a portable assembly language?
D. J. Bernstein
2005.01.29

``C has been characterized (both admiringly and invidiously) as a portable assembly language,'' Dennis M. Ritchie once wrote.

What exactly does the word ``portable'' mean here? What are the relevant differences between C and an ``unportable'' assembly language? Here are three answers.

First: Portable machine instructions have a unified C syntax. The same operation on different machines is expressed the same way in C. One could write integer addition, for example, as

   traditional Pentium asm: lea [%eax+4],%edx
   traditional PowerPC asm: addi r6,r3,4
   traditional UltraSPARC asm: add %l3,4,%l6

   C for the Pentium: d = a + 4
   C for the PowerPC: d = a + 4
   C for the UltraSPARC: d = a + 4

This type of portability, a unified syntax, is tremendously helpful for the reader. Yes, there are differences between the Pentium instruction set, the PowerPC instruction set, the UltraSPARC instruction set, the AMD64 instruction set, etc., but that doesn't mean we need a completely different instruction _syntax_ for each CPU.

Second: Unportable machine instructions are often difficult to express in C. An old example is multiply-32-bit-by-32-bit-producing-64-bit, which was extremely difficult to express in C until compilers added ``long long'' and learned to recognize the 32*32->64 idiom. A modern example is multiply-64-bit-by-64-bit-producing-128-bit (mulld and mulhdu on 64-bit PowerPC; mul on AMD64), which is now facing the same problem.

Are these instructions essential for programming? Of course not. Every operation can be expressed using simple, portable instructions. The unusual instructions might be faster, but a slowdown is perfectly acceptable for almost all code---perhaps 99.999% of all lines of code today, and an even larger fraction in the future.

On the other hand, there are still occasional chunks of speed-critical code. An expert programmer can save time by taking advantage of various unportable machine instructions.
But convincing a C compiler to do this is often practically impossible. The programmer needs a language that _doesn't_ have this type of portability. In contrast, a traditional assembly language tries to make every machine instruction reasonably easy to express, whether or not other CPUs support similar instructions.

Third: Code in a traditional assembly language is practically unusable on anything other than the target CPU. Occasionally people write machine simulators that can run on other CPUs, but those simulators don't hook nicely to code written for those other CPUs.

In contrast, C code aimed at one CPU can still be run on another CPU, even if it won't be quite as fast as code aimed at the other CPU. This saves programmer time. Obviously almost all code should be written in languages offering this type of portability; the slowdown is perfectly acceptable.

On the other hand, feeding a chunk of speed-critical C code to an automated ``optimizer'' often produces unacceptably slow results. It's then an expert programmer's job to write, say, three versions of the code: the original C code, a manually tuned Pentium version, and a manually tuned PowerPC version. Users aren't going to run the Pentium version on a PowerPC; the third type of portability has little benefit.

Let me summarize.

We need languages that are portable in the third sense. Most code isn't speed-critical; we need programming tools that make it easy to write this code once and have it run at tolerable speed on many CPUs. It's no problem if these tools use a limited set of CPU instructions.

We need languages that are _not_ portable in the second sense. Some speed-critical chunks of code are written separately for different CPUs; we need programming tools that don't tie the programmer's hands. It's no problem if the resulting code can't run on more than one CPU.

All languages should be portable in the first sense. To the extent that machine instruction syntax can be unified across CPUs, it should be.
If a CPU designer wants to make a list of super-logical compressed opcode names, that's fine, as long as I never have to see any of those names.
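
As a rough illustration of the second answer, here is a minimal C sketch of the two widening multiplications mentioned there. The names mul32x32_64, mul64x64_128_portable, mul64x64_128_ext, check_agreement, and the struct u128_parts are hypothetical, invented for this sketch. The last multiplication variant assumes a compiler that provides the unsigned __int128 extension (GCC and Clang do on 64-bit targets), detected through the __SIZEOF_INT128__ macro. Whether a given compiler turns either form into a single multiply instruction is not guaranteed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical pair of 64-bit halves standing in for a 128-bit result. */
typedef struct { uint64_t hi, lo; } u128_parts;

/* The 32*32->64 idiom: widen one operand so the product is 64 bits wide. */
uint64_t mul32x32_64(uint32_t a, uint32_t b)
{
    return (uint64_t) a * b;
}

/* 64*64->128 using only portable 64-bit operations:
   split each operand into 32-bit halves and combine four partial products. */
u128_parts mul64x64_128_portable(uint64_t a, uint64_t b)
{
    uint64_t a0 = (uint32_t) a, a1 = a >> 32;
    uint64_t b0 = (uint32_t) b, b1 = b >> 32;

    uint64_t p00 = a0 * b0;
    uint64_t p01 = a0 * b1;
    uint64_t p10 = a1 * b0;
    uint64_t p11 = a1 * b1;

    /* Sum of three values below 2^32, so it cannot overflow 64 bits. */
    uint64_t mid = (p00 >> 32) + (uint32_t) p01 + (uint32_t) p10;

    u128_parts r;
    r.lo = (mid << 32) | (uint32_t) p00;
    r.hi = p11 + (p01 >> 32) + (p10 >> 32) + (mid >> 32);
    return r;
}

#if defined(__SIZEOF_INT128__)
/* Assumption: the compiler supports the unsigned __int128 extension. */
u128_parts mul64x64_128_ext(uint64_t a, uint64_t b)
{
    unsigned __int128 p = (unsigned __int128) a * b;
    u128_parts r;
    r.hi = (uint64_t) (p >> 64);
    r.lo = (uint64_t) p;
    return r;
}
#endif

/* Checks that both versions agree; does nothing without __int128 support. */
void check_agreement(uint64_t a, uint64_t b)
{
#if defined(__SIZEOF_INT128__)
    u128_parts x = mul64x64_128_portable(a, b);
    u128_parts y = mul64x64_128_ext(a, b);
    assert(x.hi == y.hi && x.lo == y.lo);
#else
    (void) a;
    (void) b;
#endif
}
```

The portable version spends four multiplications and several additions on a result that AMD64 produces with one instruction and 64-bit PowerPC with two. That is the kind of slowdown the essay calls acceptable for almost all code, and the kind an expert would want to avoid in a speed-critical chunk.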