foo::bar

Blogs and RSS are cooler than ever

6502 asm - Many ways to multiply

There is no instruction to multiply numbers in the MOS 6502/6510 8-bit CPU. One can add with ADC (ADd with Carry) and subtract with SBC (SuBtract with Carry). Because there are no instructions to add or subtract without carry, the programmer has to clear the carry flag (CLC) before performing an addition where no incoming carry is expected. Conversely, the carry flag must be set (SEC) before a subtraction where no borrow is expected.

Leaving decimal mode aside, an addition may look like this:

LDA #$02           ; Load 02h into the accumulator
CLC                ; Clear the carry flag
ADC #$03           ; Add 03h to the accumulator.
                   ; Now the accumulator contains 05h

A simple approach to multiplication would be to add a number to itself a certain number of times. For example, to multiply 3 by 4, you could add 3 to itself 4 times. But this would be inefficient.

This routine is a wicked fast and smart way to multiply two 8-bit numbers. It leverages the fact that:

(a + x)^2 = a^2 + 2ax + x^2

Solve for ax and it becomes

ax = (a + x)^2 / 2 - a^2 / 2 - x^2 / 2

Instead of a multiplication, now we have an exponent and a division. How does that help? Because n^2 / 2 can be calculated with lookup tables. This tricks simplifies the multiplication to a couple of lookups and subtractions.

Note: A trade-off is made between speed and size. Lookup tables contain precalculated values and will speed up a calculation, but will consume more memory.

The original routine used a side-program to generate the lookup tables, but the link is broken. However the ACME assembler comes with decent arithmetic functions, so it can be used it to generate the lookup tables from the source code, without the need for a side script. For instance:

.lo_256     !for .i, 0, 255 { !byte <(float(.i^2) / 2 + 0.5) }

(The + 0.5 is a rounding trick.)

Special cases

The above routine is fast, but in some cases there are faster options. For instance, multiplying by 2 can be done with a left bit shift (ASL). Multiplying by 8 can be done with three left bit shifts. To multiply by 10, the number can be multiplied by 4 with two left bit shifts, then the number can be added to the result, and finally the result can be left bit shifted once more as shown here.