The T-State: Your Ultimate Performance Metric

In Z80 programming, the speed of your code isn’t measured in lines, but in T-states (clock cycles). Every instruction takes a documented number of T-states to execute, and a few conditional instructions have two possible counts depending on whether their condition is met. To write fast code, you must choose instructions that minimize this count.

Why T-States Matter: If your Z80 runs at 3.5 MHz (as in the ZX Spectrum), 3,500,000 T-states happen every second, which is roughly 70,000 per 50 Hz video frame. Saving even a few T-states per pass in a critical loop adds up over every pass in a frame, leading to smoother graphics or faster game logic.

Calculating Instruction Timing

Most Z80 instructions have a single timing. Conditional instructions are listed with two values, because their execution time depends on the outcome (whether the condition is met and the jump is taken, or not).

| Instruction | T-States | Notes |
|---|---|---|
| LD A, B | 4 | The minimum for any instruction (one byte, one machine cycle). Many simple instructions share this timing. |
| INC A | 4 | Very fast for simple increments. |
| JP NZ, label | 10 | Takes 10 cycles whether or not the jump is taken. The relative form, JR NZ, label, takes 12 cycles if taken and 7 if not. |
| DJNZ label | 13 / 8 | Takes 13 cycles if the jump is taken, 8 cycles when B reaches zero and the loop finishes. |
| CALL label | 17 | One of the slower common instructions; the matching RET costs another 10. |
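
As a worked example of adding up these timings, here is a minimal sketch of a delay loop. The label name and the loop count of 100 are made up for illustration, and the totals use the documented Z80 timings only; they ignore any wait states that particular hardware may add.

```asm
; Hypothetical delay loop: label and count are illustrative.
        LD   B, 100        ; 7 T-states, executed once
delay:
        NOP                ; 4 T-states per pass
        DJNZ delay         ; 13 T-states when it loops back, 8 on the final pass
        RET                ; 10 T-states (not included in the totals below)

; Loop:       99 passes x (4 + 13) + 1 pass x (4 + 8) = 1683 + 12 = 1695
; With setup: 7 + 1695 = 1702 T-states
; At 3.5 MHz: 1702 / 3,500,000 s, or about 0.49 ms
```

Note how the final pass is counted separately: it is the only one where DJNZ falls through and costs 8 instead of 13.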

Optimization Techniques for Speed

When writing assembly, always favor the most direct and efficient operations:

1. Avoid Slow Addressing:

  • LD A, (HL) (7 cycles) is much faster than LD A, (IX+d) (19 cycles). If speed is key, manage your addresses using HL or DE whenever possible.
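
To make the difference concrete, the sketch below sums three consecutive bytes of a hypothetical record, first through IX and then through HL. It assumes the register used already holds the address of the record's first byte.

```asm
; Version A: indexed access through IX
        LD   A, (IX+0)     ; 19
        ADD  A, (IX+1)     ; 19
        ADD  A, (IX+2)     ; 19
                           ; total: 57 T-states, IX is left unchanged

; Version B: the same three bytes walked with HL
        LD   A, (HL)       ; 7
        INC  HL            ; 6
        ADD  A, (HL)       ; 7
        INC  HL            ; 6
        ADD  A, (HL)       ; 7
                           ; total: 33 T-states, HL ends on the last byte
```

The HL version pays for its speed by moving the pointer, so it suits sequential access. IX remains convenient when fields are read in no particular order and the base address must stay put.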

2. Choose Fast Pointers:

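  • Use INC HL (6 cycles) to increment a 16-bit pointer. Emulating it with 8-bit increments and a carry check costs 15 to 16 cycles. The one exception is a pointer that is known never to cross a 256-byte page boundary, where INC L (4 cycles) alone is enough.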
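
For comparison, this is roughly what a general-purpose emulation with 8-bit increments looks like. The label name is illustrative.

```asm
; Direct 16-bit increment
        INC  HL            ; 6 T-states, 1 byte

; Emulation with 8-bit increments and a manual carry into H
        INC  L             ; 4, sets Z when L wraps from 0FFh to 00h
        JR   NZ, no_carry  ; 12 if taken (no wrap), 7 if not taken
        INC  H             ; 4, only executed when L wrapped
no_carry:
                           ; 16 T-states on the usual path, 15 when L wraps
```

The emulation is also four bytes instead of one, and it changes the flags, which INC HL does not.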

3. Use Dedicated Instructions:

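  • Use DJNZ for simple counted loops instead of a DEC B / JP NZ sequence. DJNZ takes 13 cycles per pass against 14 for DEC B / JP NZ (or 16 for DEC B / JR NZ), and it is two bytes shorter than the JP version. The counter must be in B, and the loop start must be within relative-jump range.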
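
The two loop forms can be compared side by side. This sketch sums 8 bytes starting at the address in HL; the labels and the count are illustrative, and HL is assumed to be set up by the caller.

```asm
; Version A: DEC B / JP NZ
        XOR  A             ; clear the running total
        LD   B, 8
loop_a:
        ADD  A, (HL)       ; 7
        INC  HL            ; 6
        DEC  B             ; 4
        JP   NZ, loop_a    ; 10, taken or not
                           ; loop overhead: 14 per pass, 8 x 14 = 112

; Version B: DJNZ
        XOR  A
        LD   B, 8
loop_b:
        ADD  A, (HL)       ; 7
        INC  HL            ; 6
        DJNZ loop_b        ; 13 when looping back, 8 on the final pass
                           ; loop overhead: 7 x 13 + 8 = 99
```

Over eight passes the DJNZ version spends 13 fewer T-states on loop control. It also leaves the flags untouched, whereas DEC B overwrites them.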

4. Swap vs. Memory:

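  • Avoid saving registers to the stack when you can use the alternate register set instead: EXX (4 cycles) swaps BC, DE, and HL with their alternates, and EX AF, AF' (4 cycles) does the same for AF. Saving and restoring a single register pair via PUSH/POP takes 21 cycles (11 for PUSH, 10 for POP). There is only one alternate set, so this works only if nothing else (an interrupt routine, for example) relies on it.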
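
A rough comparison of the two ways to protect BC, DE, and HL around some scratch work follows. The LD HL, 0 line is only a stand-in for that work, and the sketch assumes no interrupt routine or other code is using the alternate set at the same time.

```asm
; Version A: preserve BC, DE and HL on the stack
        PUSH BC            ; 11
        PUSH DE            ; 11
        PUSH HL            ; 11
        LD   HL, 0         ; stand-in for work that overwrites BC, DE, HL
        POP  HL            ; 10
        POP  DE            ; 10
        POP  BC            ; 10
                           ; save/restore overhead: 33 + 30 = 63 T-states

; Version B: park BC, DE and HL in the alternate set
        EXX                ; 4, main values are now held as BC', DE', HL'
        LD   HL, 0         ; stand-in work, free to overwrite BC, DE, HL
        EXX                ; 4, original values are back
                           ; save/restore overhead: 8 T-states
```

Unlike the stack, the alternate set cannot be nested: a second EXX simply swaps the original values back in.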

The Optimization Goal

The goal isn’t to shave a cycle off code that runs once, but to save cycles inside critical loops (like the main screen refresh or game logic). Saving just five cycles inside a loop that runs 10,000 times a second frees up 50,000 cycles per second, about 1.4% of a 3.5 MHz CPU, which can be used for more game logic or sound processing.