D. J. Bernstein
Computer hardware
Writing high-speed software

Some relevant literature

Knuth's famous 1974 paper ``Structured programming with go to statements'' advocates, among other things, ``program manipulation systems'':

For 15 years or so I have been trying to think of how to write a compiler that really produces top quality code. For example, most of the MIX programs in my books are considerably more efficient than any of today's most visionary compiling schemes would be able to produce. I've tried to study the various techniques that a hand-coder like myself uses, and to fit them into some systematic and automatic system. A few years ago, several students and I looked at a typical sample of FORTRAN programs [...], and we all tried hard to see how a machine could produce code that would compete with our best hand-optimized object programs. We found ourselves always running up against the same problem: the compiler needs to be in a dialog with the programmer; it needs to know properties of the data, and whether certain cases can arise, etc. And we couldn't think of a good language in which to have such a dialog.
For some reason we all (especially me) had a mental block about optimization, namely that we always regarded it as a behind-the-scenes activity, to be done in the machine language, which the programmer isn't supposed to know. This veil was first lifted from my eyes in the fall of 1973, when I ran across a remark by Hoare [...] that, ideally, a language should be designed so that an optimizing compiler can describe its optimizations in the source language. Of course! Why hadn't I ever thought of it?
Once we have a suitable language, we will be able to have what seems to be emerging as the programming system of the future: an interactive program-manipulation system, analogous to the many symbol-manipulation systems which are presently undergoing extensive development and experimentation. We are gradually learning about program transformations, which are more complicated than formula manipulations but really not very different. A program-manipulation system is obviously what we've been leading up to, and I wonder why I never thought of it before. Of course, the idea isn't original with me; when I told Hoare, he said, ``Exactly!'' and referred me to a recent work by Darlington and Burstall [...] which describes a system which removes some recursions from a LISP-like language (curiously, without introducing any go to's), and which also does some conversion of data structures (from sets to lists or bit strings) and some restructuring of a program by combining similar loops. I later discovered that program manipulation is just part of a much more ambitious project undertaken by Cheatham and Wegbreit [...]; another publication about source-code optimizations has also recently appeared [...]. Since LISP programs are easily manipulated as LISP data objects, there has also been a rather extensive development of similar ideas in this domain, notably by Warren Teitelman [...]. The time is clearly ripe for program-manipulation systems, and a great deal of further work suggests itself.
The programmer using such a system will write his beautifully-structured, but possibly inefficient, program P; then he will interactively specify transformations that make it efficient. Such a system will be much more powerful and reliable than a completely automatic one. We can also imagine the system manipulating measurement statistics concerning how much of the total running time is spent in each statement, since the programmer will want to know which parts of his program deserve to be optimized, and how much effect an optimization will really have. The original program P should be retained along with the transformation specifications, so that it can be properly understood and maintained as time passes. As I say, this idea certainly isn't my own; it is so exciting I hope that everyone soon becomes aware of its possibilities.

Much later in the article, Knuth writes ``Standard operating procedure nowadays is usually to hand code critical portions of a routine in assembly language. Let us hope such assemblers will die out, and we will see several levels of language instead: at the highest levels we will be able to write abstract programs, while at the lowest levels we will be able to control storage and register allocation, and to suppress subscript range checking, etc.''

The phrase ``user-guided tuning'' appears in the paper ``Improving Performance with Integrated Program Transformations'' by Qasem, Jin, and Mellor-Crummey. This paper describes tools to help a programmer perform various standard space-time transpositions that improve cache behavior. Older papers on the same topic: Fahringer and Gerndt; Pike and Hilfinger.

The phrases ``human-computer optimization'' and ``human-guided optimization'' appear in the paper ``Investigating human-computer optimization'' by Scott, Lesh, and Klau. This is a paper on transportation scheduling.