The No Operation Instruction

A NOP or no operation is an instruction that isn’t. It’s something that no one ever thinks about. People think less about NOPs than about where all those biros go when they die. And yet.

A no operation instruction is an instruction that performs no operation. This, itself, is a paradox rather like Russell’s barber paradox (If the barber shaves all those men who do not shave themselves, who shaves the barber?). So, what are NOPs and why are they used?

A NOP is a computer instruction that does nothing apart from advance the program counter. It’s a waste of time and of the space it occupies in memory. The standard definition of a NOP is that it should serve only to advance the program counter by one instruction ([PC] ← [PC] + 4 in a 32-bit environment with four-byte instructions). In the previous sentence, only means exactly what it says. No other element of the processor state should change. Of course, user-visible registers will not be modified, but the processor status and condition codes must not be modified either.
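The definition can be made concrete with a toy model. The following is a minimal sketch, not a model of any real processor: the names CpuState, execute_nop and INSTRUCTION_BYTES are invented for illustration, and a fixed four-byte instruction size is assumed.

```python
from dataclasses import dataclass, replace

INSTRUCTION_BYTES = 4  # assumption: fixed-width 32-bit instructions


@dataclass(frozen=True)
class CpuState:
    """Hypothetical processor state: program counter, registers, condition flags."""
    pc: int = 0
    regs: tuple = (0,) * 16
    flags: int = 0


def execute_nop(state: CpuState) -> CpuState:
    """Return the state after a NOP: only the program counter moves on."""
    return replace(state, pc=state.pc + INSTRUCTION_BYTES)


before = CpuState(pc=0x1000, regs=tuple(range(16)), flags=0b0100)
after = execute_nop(before)

# Everything except the PC must be identical before and after.
print(hex(after.pc), after.regs == before.regs, after.flags == before.flags)
```

The frozen dataclass makes the point of the definition explicit: the only way to produce the next state is to copy the old one and change the PC.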

When I first became interested in microprocessors in the late 1970s, it was the era of the 8-bit microcontroller and the NOP was an instruction that could be used in several ways. First, it was widely used to patch code. In those days memory was horribly expensive; millions of times more expensive than today. Flash memory did not exist and engineers were forced to use mask programmable ROM. Moreover, virtually none of today’s development tools existed because the PC and workstation had not yet appeared.

To help limit the effect of errors, programmers would often place a string of NOPs at strategic points in code. If they later found an error, they could either patch the code by replacing the NOPs or they could replace NOPs by a jump to a suitable memory location where the fix could be made. Today, you’d correct the error and recompile the code; something that was not always possible in those days. Few engineers admitted ever using this technique and I was once admonished for even suggesting it. NOPs could also be used to introduce delays in code for timing purposes because if a NOP takes n microseconds, then m NOPs take m·n microseconds.
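Both uses can be sketched in a few lines. This is an illustration only: the op-code values are taken to be those of the Motorola 6800 (0x01 for NOP and 0x7E for an extended-address JMP), the other bytes in the image are arbitrary placeholders, and the function names and the cycle count passed to delay_us are assumptions made for the example.

```python
NOP = 0x01      # assumed: Motorola 6800 NOP op-code
JMP_EXT = 0x7E  # assumed: Motorola 6800 JMP with a 16-bit extended address


def patch_with_jump(image: bytearray, offset: int, target: int) -> None:
    """Overwrite three reserved NOPs at `offset` with a jump to `target`."""
    if bytes(image[offset:offset + 3]) != bytes([NOP] * 3):
        raise ValueError("expected three NOPs at the patch point")
    # The 16-bit target address is stored high byte first.
    image[offset:offset + 3] = bytes([JMP_EXT, (target >> 8) & 0xFF, target & 0xFF])


def delay_us(nop_count: int, cycles_per_nop: int, clock_mhz: float) -> float:
    """Total delay of m NOPs, each taking n = cycles_per_nop / clock_mhz microseconds."""
    return nop_count * cycles_per_nop / clock_mhz


# Two placeholder bytes, four reserved NOPs, one placeholder byte.
rom = bytearray([0x86, 0x05, NOP, NOP, NOP, NOP, 0x39])
patch_with_jump(rom, offset=2, target=0xF000)
print(rom.hex())

# Ten NOPs at an assumed two cycles each on a 1 MHz clock.
print(delay_us(10, 2, 1.0))
```

The check before the overwrite matters in practice: a patch should only ever land on bytes that were reserved for it.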

The Motorola 6800 8-bit microprocessor had an official NOP instruction with its own op-code, an’ all. Some people argued that you didn’t need a formal NOP instruction because you could take an existing op-code that didn’t actually do anything and press it into service as a NOP. In those days, some microprocessors simply ignored binary patterns that had not been assigned as op-codes. Today, such a pattern would normally cause an exception and invoke the operating system to fix things. The reason why you do need a formal NOP is that it will always be a NOP in future revisions or extensions of the processor and its code will not be reassigned to a new instruction.

When the RISC pipelined processor became popular, NOPs could be used to introduce delays when necessary; for example, following a load operation. I once heard of the use of NOPs to solve a problem that caused anomalous behaviour of the processor when a certain sequence of instructions was executed. Inserting a NOP cured this fault. The principal uses of a NOP are:

To allow the future modification of code without rewriting or recompiling it

To add a known delay

To deal with hazards and sequencing problems in pipelines

To enforce the order of execution of sequences of instructions (to prevent out-of-order execution changing the meaning of code)

To align code/data on an appropriate boundary

To synchronize events
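Of the uses listed above, alignment is the easiest to express as arithmetic. The sketch below computes how many NOPs are needed to pad code up to the next boundary; the function name nop_padding and the default four-byte NOP size are assumptions for illustration.

```python
def nop_padding(address: int, alignment: int, nop_bytes: int = 4) -> int:
    """Number of NOPs needed to advance `address` to the next multiple of `alignment`."""
    if alignment % nop_bytes != 0 or address % nop_bytes != 0:
        raise ValueError("address and alignment must be multiples of the NOP size")
    gap = (-address) % alignment  # bytes remaining to the next boundary
    return gap // nop_bytes


# Code ends at 0x1004; the next routine should start on a 16-byte boundary.
print(nop_padding(0x1004, 16))

# Already aligned, so no padding is required.
print(nop_padding(0x1010, 16))
```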

Not all NOPs are equal, and it was this thought that drove me to write this note. The MIPS RISC has a NOP. However, on closer examination, the NOP is really a pseudo-operation. So, in order to do nothing, the MIPS processor does a different nothing to the same effect, since all nothings are the same. Or are they?

When you write NOP in MIPS code it is translated to sll $r0,$r0,0 which means shift register $r0 left zero places and put the result in $r0. Since $r0 is hardwired to zero, this instruction has no effect and can be treated as a no operation. Moreover, since the numeric value for this is 0x00000000 (all zeros), it means that non-initialized memory that happens to contain zeros will be treated as NOPs. However, such NOPs are accidental; they were not put there by a programmer but are being executed because an error has occurred and the processor is responding by merrily skipping through a field of NOPs.
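The all-zeros encoding follows directly from the MIPS R-type instruction layout (op-code, rs, rt, rd, shift amount and function fields, from the most significant bits down). The sketch below packs those fields; the helper name encode_r_type is invented for illustration, and the second instruction is included only for contrast.

```python
def encode_r_type(opcode: int, rs: int, rt: int, rd: int, shamt: int, funct: int) -> int:
    """Pack the six fields of a MIPS R-type instruction into a 32-bit word."""
    return (opcode << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct


SLL_FUNCT = 0  # function code for sll; the op-code field is 0 for R-type instructions

# NOP: shift register 0 left by zero places into register 0.
nop_word = encode_r_type(opcode=0, rs=0, rt=0, rd=0, shamt=0, funct=SLL_FUNCT)
print(f"0x{nop_word:08X}")

# For contrast, a shift that does something: register 9 shifted left 2 into register 8.
real_shift = encode_r_type(opcode=0, rs=0, rt=9, rd=8, shamt=2, funct=SLL_FUNCT)
print(f"0x{real_shift:08X}")
```

Every field of the NOP is zero, so the whole word is zero; any change to the registers or shift amount yields a different, non-zero word.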

ARM code can use ANDEQ r0,r0,r0, which, like the MIPS NOP, is encoded as all zeros, or MOV r0,r0, which is encoded as 0xE1A00000.
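Both encodings can be checked against the classic 32-bit ARM data-processing format with a register operand and no shift. The sketch below is a simplified packer under that assumption; the helper name encode_dp_reg and the constant names are invented for illustration.

```python
COND_EQ, COND_AL = 0x0, 0xE  # condition field: equal, always
OP_AND, OP_MOV = 0x0, 0xD    # data-processing op-codes


def encode_dp_reg(cond: int, opcode: int, rd: int, rn: int, rm: int,
                  set_flags: bool = False) -> int:
    """Pack a data-processing instruction whose second operand is an unshifted register."""
    return ((cond << 28) | (opcode << 21) | (int(set_flags) << 20)
            | (rn << 16) | (rd << 12) | rm)


andeq = encode_dp_reg(COND_EQ, OP_AND, rd=0, rn=0, rm=0)  # ANDEQ r0,r0,r0
mov = encode_dp_reg(COND_AL, OP_MOV, rd=0, rn=0, rm=0)    # MOV r0,r0

print(f"0x{andeq:08X}")
print(f"0x{mov:08X}")
```

ANDEQ is all zeros because EQ and AND both happen to have the field value zero; MOV r0,r0 is non-zero only because of its condition and op-code fields.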

I feel that a true NOP should have a unique op-code. Moreover, it should, ideally, have a parameter that is ignored by the processor but which can be used when either tracing the code or debugging the system.

The most intriguing NOP I ever encountered was the Itanium NOP, for two reasons. First, the Itanium is a VLIW processor and bundles of instructions are executed in parallel. The instructions of a bundle to be executed are selected by the programmer or assembler/compiler. In Itanium assembly language a suffix can be assigned to an operation to define which of the parallel units is going to handle it (e.g., memory, integer, floating-point). The NOP can be processed either by an integer unit or by a memory unit. For example, NOP.i would be executed by the integer unit and NOP.m would be executed by the memory unit. It’s really like deciding who gets to feed an invisible friend.

An even more fascinating concept arises in the case of predicated processors like the ARM and Itanium. Consider the ARM predicated form of the no operation, NOPNE. When this instruction is decoded and sent for execution, the predication marker, NE, signifies not equal; that is, the zero flag is clear. So, if the previous operation set the zero flag, the predicated instruction is not executed. If the zero flag is clear, the NOP is executed and nothing happens.
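The curiosity of a predicated NOP is easy to demonstrate with a toy model. This is a sketch only: the functions condition_passes and step_nop are invented for illustration, only the zero flag is modelled, and a four-byte instruction is assumed.

```python
def condition_passes(cond: str, z_flag: bool) -> bool:
    """Evaluate a condition suffix against the zero flag (toy subset of conditions)."""
    if cond == "EQ":
        return z_flag
    if cond == "NE":
        return not z_flag
    if cond == "AL":
        return True
    raise ValueError(f"unsupported condition: {cond}")


def step_nop(pc: int, z_flag: bool, cond: str = "AL"):
    """Run one predicated NOP; return the new PC, the flag, and whether it executed."""
    executed = condition_passes(cond, z_flag)
    # Executed or skipped, the visible outcome is the same: only the PC advances.
    return pc + 4, z_flag, executed


print(step_nop(0x100, z_flag=True, cond="NE"))   # zero flag set: NOPNE is skipped
print(step_nop(0x100, z_flag=False, cond="NE"))  # zero flag clear: NOPNE executes
```

In both cases the PC and the flag end up the same; the only difference is whether the nothing was done or not done.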

Finally, what is the perfect NOP? Apart from having its intended effect (incrementing the PC) and leaving the processor status unmodified, it should have a unique op-code that will never change in future generations of the processor and, ideally, a user-defined field that can be detected in memory dumps during debugging and tracing.