Z80 Assembly 100: Algorithm Showdown (Benchmarking Sort Speed)

The Performance Challenge

Throughout this series, we have learned three different sorting algorithms: Bubble Sort (Part 94), Quicksort (Part 95), and Insertion Sort (Part 96). The final question is: which one is truly the fastest on the Z80? The answer depends entirely on the size and initial state of the data.

The Benchmarking Strategy

To compare performance accurately, we measure the time taken in T-states (clock cycles) to sort three different datasets of 16-bit words. ...
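The measurement idea can be sketched outside the Z80 by counting comparisons instead of T-states, since a cheaper inner loop generally means fewer cycles. A minimal C sketch of such an instrumented sort (names like `count_comparisons` are illustrative, not from the series):

```c
static long cmps; /* comparison counter, a rough proxy for T-states */

/* Insertion sort with every key comparison counted. */
static void insertion_sort_counted(int *a, int n) {
    for (int i = 1; i < n; i++) {
        int v = a[i], j = i - 1;
        while (j >= 0 && (++cmps, a[j] > v)) { /* count each a[j] > v test */
            a[j + 1] = a[j];
            j--;
        }
        a[j + 1] = v;
    }
}

/* Sort the array and report how many comparisons it cost. */
long count_comparisons(int *a, int n) {
    cmps = 0;
    insertion_sort_counted(a, n);
    return cmps;
}
```

Running this on an already sorted dataset versus a reversed one shows the best-case/worst-case gap the benchmark is designed to expose.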

September 28, 2025

Z80 Assembly 99: Merge Sort (The Memory-Hungry Sort)

The Speed of Merge Sort

Merge Sort is another highly efficient sorting algorithm that runs in optimal $O(n \log n)$ time, regardless of the initial state of the data. It is highly valued for its predictability and its stability (it preserves the relative order of equal elements).

The Principle: Merge Sort is a recursive algorithm that repeatedly splits the array into two halves until it is left with many sub-arrays of size one. It then merges these small, sorted arrays back together in the correct order. ...
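The split-and-merge recursion can be sketched in a higher-level language. A minimal C version of the idea (the post itself works in Z80 assembly; names here are illustrative):

```c
#include <string.h>

/* Merge two sorted runs a[lo..mid) and a[mid..hi) via a scratch buffer. */
static void merge(int *a, int *tmp, int lo, int mid, int hi) {
    int i = lo, j = mid, k = lo;
    while (i < mid && j < hi)
        tmp[k++] = (a[i] <= a[j]) ? a[i++] : a[j++]; /* <= keeps it stable */
    while (i < mid) tmp[k++] = a[i++];
    while (j < hi)  tmp[k++] = a[j++];
    memcpy(a + lo, tmp + lo, (size_t)(hi - lo) * sizeof *a);
}

/* Recursively split down to runs of size one, then merge back up. */
void merge_sort(int *a, int *tmp, int lo, int hi) {
    if (hi - lo < 2) return;
    int mid = lo + (hi - lo) / 2;
    merge_sort(a, tmp, lo, mid);
    merge_sort(a, tmp, mid, hi);
    merge(a, tmp, lo, mid, hi);
}
```

The scratch buffer `tmp` is the "memory-hungry" part of the title: the merge step needs as much extra space as the run it merges.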

September 28, 2025

Z80 Assembly 98: Shell Sort (The Gap-Based Improvement)

The Drawback of Insertion Sort

Insertion Sort (Part 96) is very fast for nearly sorted data, but slow for completely unsorted data because it can only move an element one position at a time. An element near the end must be swapped many times to reach the beginning.

The Solution: Shell Sort

Shell Sort (invented by Donald Shell) is an enhancement of Insertion Sort. It allows large moves by sorting elements that are far apart, drastically reducing the total number of comparisons needed. ...
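The gap idea can be sketched in C: run an insertion sort over elements that are `gap` apart, then shrink the gap until it reaches one. This is an illustrative sketch with a simple halving gap sequence, not the post's Z80 routine:

```c
/* Shell sort: gapped insertion sort; the gap halves each pass. */
void shell_sort(int *a, int n) {
    for (int gap = n / 2; gap > 0; gap /= 2) {
        for (int i = gap; i < n; i++) {
            int v = a[i], j = i;
            /* Compare elements gap apart, allowing long-distance moves. */
            while (j >= gap && a[j - gap] > v) {
                a[j] = a[j - gap];
                j -= gap;
            }
            a[j] = v;
        }
    }
}
```

The final pass with `gap == 1` is a plain Insertion Sort, but by then the data is nearly sorted, which is exactly the case where Insertion Sort excels.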

September 28, 2025

Z80 Assembly 97: Selection Sort (Minimizing Memory Writes)

The Advantage of Selection Sort

Both Bubble Sort (Part 94) and Insertion Sort (Part 96) frequently swap adjacent elements, leading to many memory write operations. If you are sorting data that resides on a slow medium (like a disk cache) or requires complex I/O to access, minimizing these writes is critical. Selection Sort does exactly this.

The Principle: Selection Sort makes one pass over the unsorted portion of the array to select the smallest (or largest) element. It then performs only one swap to move that element into its final sorted position. ...
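The one-swap-per-pass behaviour is easy to see in a C sketch (illustrative only; the post's version is Z80 assembly):

```c
/* Selection sort: n-1 passes, at most one swap per pass. */
void selection_sort(int *a, int n) {
    for (int i = 0; i < n - 1; i++) {
        int min = i;
        for (int j = i + 1; j < n; j++)  /* scan unsorted tail for minimum */
            if (a[j] < a[min]) min = j;
        if (min != i) {                  /* the single write-heavy step */
            int t = a[i]; a[i] = a[min]; a[min] = t;
        }
    }
}
```

All the work happens in reads; for n elements there are at most n-1 swaps in total, versus up to $O(n^2)$ swaps for Bubble Sort.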

September 28, 2025

Z80 Assembly 96: Insertion Sort (The Efficient Simple Sort)

The Advantage of Insertion Sort

While Quicksort (Part 95) is fastest for large, random arrays, Insertion Sort is often faster in two specific scenarios:

- Small Arrays: Its low overhead makes it faster than Quicksort for arrays of under 20-30 elements.
- Nearly Sorted Arrays: It is extremely fast, running in nearly linear time ($O(n)$), if the array is already mostly sorted.

The Principle: The algorithm builds the final sorted array one item at a time. It iterates through the input and inserts each element into its correct position within the already sorted portion of the array. ...
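The insert-into-sorted-prefix principle reads naturally in C (an illustrative sketch, not the post's Z80 code):

```c
/* Insertion sort: grow a sorted prefix, inserting each new element into it. */
void insertion_sort(int *a, int n) {
    for (int i = 1; i < n; i++) {
        int v = a[i], j = i - 1;
        while (j >= 0 && a[j] > v) { /* shift larger elements one slot right */
            a[j + 1] = a[j];
            j--;
        }
        a[j + 1] = v; /* drop v into the gap */
    }
}
```

On already sorted input the inner loop exits immediately, which is why the algorithm approaches $O(n)$ for nearly sorted data.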

September 28, 2025

Z80 Assembly 95: Quicksort (The Divide-and-Conquer Sorting Algorithm)

The Need for Quicksort

While Bubble Sort (Part 94) is easy to implement, its slow $O(n^2)$ time complexity makes it unusable for sorting large arrays. Quicksort is one of the fastest widely used sorting algorithms, running in $O(n \log n)$ time on average, making it ideal for sorting large high-score tables or data structures.

The Principle: Quicksort uses a “divide-and-conquer” strategy: it picks an element (the pivot) and partitions the other elements into two sub-arrays according to whether they are less than or greater than the pivot. It then recursively sorts the sub-arrays. ...
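The pivot-and-partition step can be sketched in C using one common scheme (Lomuto partition, with the last element as pivot); the post's Z80 version may partition differently:

```c
/* Lomuto partition: last element as pivot; returns the pivot's final index. */
static int partition(int *a, int lo, int hi) {
    int pivot = a[hi], i = lo;
    for (int j = lo; j < hi; j++) {
        if (a[j] < pivot) {          /* move smaller elements left of i */
            int t = a[i]; a[i] = a[j]; a[j] = t;
            i++;
        }
    }
    int t = a[i]; a[i] = a[hi]; a[hi] = t; /* place pivot between the halves */
    return i;
}

/* Recursively sort the two partitions around the pivot. */
void quicksort(int *a, int lo, int hi) {
    if (lo >= hi) return;
    int p = partition(a, lo, hi);
    quicksort(a, lo, p - 1);
    quicksort(a, p + 1, hi);
}
```

After `partition` returns, the pivot is in its final position, so each recursive call works on a strictly smaller sub-array.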

September 28, 2025

Z80 Assembly 93: Binary Search (Dichotomic Search) on a Sorted Array

The Need for Fast Lookup

If you have a large list of data (e.g., all font character pointers, or a list of game variables) that is sorted alphabetically or numerically, checking every item sequentially (`memchr` in Part 89) is inefficient. Binary Search is the solution: it finds the target by repeatedly dividing the search space in half.

The Principle: After each comparison, the algorithm eliminates half of the remaining elements. This makes the search time logarithmic, drastically speeding up large lookups. ...
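The halving principle in a compact C sketch (illustrative; the post implements it in Z80 assembly):

```c
/* Binary search on a sorted array: index of key, or -1 if absent. */
int binary_search(const int *a, int n, int key) {
    int lo = 0, hi = n - 1;
    while (lo <= hi) {
        int mid = lo + (hi - lo) / 2;   /* midpoint without (lo+hi) overflow */
        if (a[mid] == key) return mid;
        if (a[mid] < key) lo = mid + 1; /* discard the lower half */
        else              hi = mid - 1; /* discard the upper half */
    }
    return -1;
}
```

Each iteration halves the candidate range, so even a 32,768-entry table is resolved in at most 15 comparisons.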

September 28, 2025

Z80 Assembly 91: Fast 16-bit Block Clear (memset to Zero)

The Need for a Fast Block Clear

Initializing memory to zero is one of the most common tasks in low-level programming (the equivalent of calling memset(ptr, 0, size) in C). Since Z80 programs frequently need to clear large buffers, a highly optimized routine is essential.

Goal: Fill a specified memory range (`START_ADDR` to `END_ADDR`) with the byte value `00H` as quickly as possible.

The Optimized Strategy: Using LDIR

The fastest method is a slight modification of the Block Fill routine (Part 85), exploiting the `LDIR` (Load, Increment, Repeat) instruction. ...
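The classic `LDIR` fill trick seeds the first byte and then copies the block onto itself one byte ahead, so the seed value propagates forward. A C sketch of that propagation (illustrative only; on the Z80 the forward byte-by-byte copy is a single `LDIR` instruction):

```c
/* Sketch of the LDIR zero-fill: seed one byte, then copy forward with
 * a one-byte overlap so the zero ripples through the whole buffer. */
void zero_fill(unsigned char *buf, int size) {
    if (size <= 0) return;
    buf[0] = 0;                     /* seed byte (LD (HL),0 on the Z80) */
    for (int i = 1; i < size; i++)  /* overlapping forward copy, as LDIR does */
        buf[i] = buf[i - 1];
}
```

Note that the overlap is the whole point: a copy routine that buffered the source first (like C's memmove) would not propagate the seed.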

September 28, 2025

Z80 Assembly 89: Block Search (memchr) and Compare (memcmp)

The Need for Block Search and Compare

In system programming, you need routines to:

- Search (memchr): Find the first occurrence of a specific character (e.g., a null terminator, space, or command byte) within a buffer.
- Compare (memcmp): Verify that two large blocks of memory (e.g., two versions of a game level, or a password against a stored hash) are identical.

Routine 1: Block Search (memchr)

Block Search scans a memory area for the first occurrence of a specific 8-bit value. ...
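The contracts of the two routines can be sketched in C (function names `find_byte` and `compare_blocks` are illustrative, standing in for the post's Z80 routines):

```c
#include <stddef.h>

/* memchr-style search: pointer to the first occurrence of c, or NULL. */
const unsigned char *find_byte(const unsigned char *buf, size_t n,
                               unsigned char c) {
    for (size_t i = 0; i < n; i++)
        if (buf[i] == c) return buf + i;
    return NULL;
}

/* memcmp-style compare: 0 if equal, else the sign of the first mismatch. */
int compare_blocks(const unsigned char *a, const unsigned char *b, size_t n) {
    for (size_t i = 0; i < n; i++)
        if (a[i] != b[i]) return (a[i] < b[i]) ? -1 : 1;
    return 0;
}
```

Both routines stop at the first hit or mismatch, which is also what makes the Z80's `CPIR`-style loops efficient on average.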

September 28, 2025

Z80 Assembly 88: Block Move (memcpy) and Swap (memswap)

The Need for Block Operations

In system programming and game development, you constantly need to move large chunks of data—transferring level data, shifting arrays, or buffering screen regions. Writing a loop for this is slow; the Z80 provides dedicated instructions for speed.

Routine 1: Block Move (memcpy)

`memcpy` (Memory Copy) copies a block of bytes from a source address to a destination address. This is the fastest way to achieve the operation. ...
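The two operations can be sketched in C (illustrative names; on the Z80 the copy loop collapses into a single `LDIR` instruction):

```c
#include <stddef.h>

/* memcpy-style move: forward copy of n bytes from src to dst. */
void block_move(unsigned char *dst, const unsigned char *src, size_t n) {
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i];
}

/* memswap: exchange two non-overlapping blocks byte by byte. */
void block_swap(unsigned char *a, unsigned char *b, size_t n) {
    for (size_t i = 0; i < n; i++) {
        unsigned char t = a[i];
        a[i] = b[i];
        b[i] = t;
    }
}
```

The swap has no single-instruction equivalent on the Z80, so it stays a loop: each iteration needs a temporary byte, exactly as the C version shows.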

September 28, 2025