Algorithmic Density: Turning 16 Bytes of x86 Assembly into Audio-Visual Art
In the demoscene, the pursuit of extreme constraints is more than a technical challenge; it is an art form. A recent production titled "wake up! 16b" demonstrates the pinnacle of this philosophy, achieving a complex audio-visual experience using only 16 bytes of x86 real-mode DOS assembly.
By treating text-mode video memory as a calculation space, the code generates an endlessly evolving Sierpinski pattern and simultaneously feeds the same bytes to the PC speaker as audio data. This is a masterclass in algorithmic density, where the same instructions and data serve both picture and sound.
The 16-Byte Engine
To understand how such a minimal footprint is possible, we must first look at the code itself:
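int 10h ; 2 bytes: set video mode (AX is assumed to be 0 at startup, selecting mode 0)
mov bh, 0xb8 ; 2 bytes: BX = 0xB800, assuming BL starts at 0
mov ds, bx ; 2 bytes: DS now points at the text-mode video segment
L:
lodsb ; 1 byte: AL = [DS:SI], then SI += 1
sub si, byte 57 ; 3 bytes: net step of -56 bytes per iteration
xor [si], al ; 2 bytes: fold the byte just read into the next cell
out 61h, al ; 2 bytes: send the same byte to the PC speaker port
jmp short L ; 2 bytes: loop forever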
The Canvas: Priming the Void
The routine begins with int 10h. The code never loads AX, so it relies on AX being zero when the program starts (typical for a DOS .COM file), which makes the call select Video Mode 0, a 40x25 text mode grid. Crucially, the BIOS does not clear the screen to absolute zero. Instead, it fills the memory with a uniform pattern: the ASCII space character (0x20) and a light gray color attribute (0x07).
This uniformity is essential. In a cellular automaton, random noise in the initial state can shatter the resulting pattern. By relying on the BIOS's predictable initialization, the author creates a stable foundation for the fractal to emerge.
The Logic: XOR and the Sierpinski Shift
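The core of the production is a feedback loop. The lodsb instruction reads the byte at DS:SI into the AL register and increments the source index (SI). The sub instruction then moves SI to a cell 56 bytes lower in memory, and xor [si], al folds the byte just read into that cell. On the next iteration, lodsb reads the freshly updated cell, so every cell ends up XORed with its already-updated predecessor on the walk.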
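The loop can be traced on paper, but a small emulation makes the data flow easier to follow. The Python sketch below is only a model under stated assumptions: the names run_loop, bios_like_memory and writable_limit are invented for illustration, SI is assumed to start at 0x100, the direction flag is assumed clear, and the upper 32 KB of the segment is treated as read-only zeros rather than whatever a real machine maps there.

```python
def bios_like_memory() -> bytearray:
    """64 KB segment: lower 32 KB holds space/attribute pairs, the rest is zero."""
    mem = bytearray(0x10000)
    for off in range(0, 0x8000, 2):
        mem[off] = 0x20      # ASCII space
        mem[off + 1] = 0x07  # light gray attribute
    return mem


def run_loop(mem, si, steps, writable_limit=0x8000):
    """Emulate lodsb / sub si,57 / xor [si],al / out 61h,al for `steps` iterations."""
    port_writes = []
    for _ in range(steps):
        al = mem[si]               # lodsb: AL = [DS:SI]
        si = (si + 1) & 0xFFFF     # lodsb also increments SI
        si = (si - 57) & 0xFFFF    # sub si, byte 57 with 16-bit wraparound
        if si < writable_limit:    # assumption: only the lower 32 KB is writable
            mem[si] ^= al          # xor [si], al
        port_writes.append(al)     # out 61h, al
    return si, port_writes


mem = bios_like_memory()
si, port_writes = run_loop(mem, si=0x100, steps=4)
print([hex(b) for b in port_writes], hex(si))
```

In this model the first sweep over a uniform screen alternately blanks and restores cells, which is the first row of the pattern the later sweeps build on.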
Mathematically, if the code used add, repeated sweeps would compute iterated prefix sums, whose entries are binomial coefficients. By using xor instead, the author discards the arithmetic carry, so each bit plane evolves independently and holds those coefficients modulo 2. The result corresponds to the pattern produced by the cellular automaton known as Wolfram's Rule 60. By Lucas's Theorem, a binomial coefficient C(n, k) is odd exactly when every set bit of k is also set in n, and that condition traces out the Sierpinski triangle. In the bit plane that matters for sound, Bit 1, each cell toggles between 0x00 and 0x02, creating the fractal's geometric structure.
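The link between repeated XOR sweeps and the Sierpinski triangle can be checked directly. The following Python sketch is a simplified model, not the intro itself: it uses a short, non-cyclic row of single bits (one bit plane), and the function names are made up for this example. Starting from a uniform row, cell i after sweep t holds C(t+i, i) mod 2, which is 1 exactly when t and i share no set bits.

```python
def prefix_xor_pass(cells):
    """One in-place sweep: each cell is XORed with its already-updated predecessor."""
    for i in range(1, len(cells)):
        cells[i] ^= cells[i - 1]


def sierpinski_rows(width, passes):
    """Return the row after 0, 1, ..., `passes` sweeps, starting from all ones."""
    cells = [1] * width            # uniform start, like one bit plane of a blank screen
    rows = [cells.copy()]
    for _ in range(passes):
        prefix_xor_pass(cells)
        rows.append(cells.copy())
    return rows


rows = sierpinski_rows(16, 15)
for row in rows:
    print("".join("#" if bit else "." for bit in row))

# Closed form implied by Lucas's Theorem: the cell is set iff (t AND i) == 0.
matches = all(
    rows[t][i] == (1 if (t & i) == 0 else 0)
    for t in range(16)
    for i in range(16)
)
print(matches)
```

Printing the rows shows the familiar nested triangles; the real program spreads the same recurrence across a wrapping 64 KB segment instead of a tidy 16-cell row.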
Translating Geometry to Sound
One of the most elegant aspects of the production is the instruction out 61h, al. Port 61h is the system control port that drives the internal PC speaker. Bit 1 is the speaker data bit, so toggling it in software moves the speaker cone, while Bit 0 gates the output of timer channel 2 into the speaker.
Because the algorithm isolates and toggles Bit 1 to create the fractal, the geometry of the Sierpinski triangle becomes a direct set of instructions for the speaker. The CPU's execution speed determines the sample rate, turning the mathematical structure into a series of square waves. High-frequency tones occur where the fractal is dense (alternating bits), while rhythmic pauses occur in the larger empty regions of the triangles.
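To make the mapping from bytes to sound concrete, here is a minimal Python sketch that pulls Bit 1 out of a stream of port writes and counts how often the speaker level flips. It is a deliberately crude model: the helper names are hypothetical, timing is ignored, and any interaction with Bit 0 and the timer is left out.

```python
SPEAKER_DATA_BIT = 0x02  # bit 1 of port 61h


def speaker_levels(port_bytes):
    """Reduce each byte written to the port to the state of bit 1 (0 or 1)."""
    return [(b & SPEAKER_DATA_BIT) >> 1 for b in port_bytes]


def count_toggles(levels):
    """Count level changes; more changes per unit time means a higher pitch."""
    return sum(1 for a, b in zip(levels, levels[1:]) if a != b)


dense = [0x02, 0x00] * 4   # alternating bits, as in a dense part of the fractal
sparse = [0x00] * 8        # an empty region of a triangle

print(count_toggles(speaker_levels(dense)))
print(count_toggles(speaker_levels(sparse)))
```

The dense stream flips the speaker on every write while the sparse one stays silent, which is the contrast between tone and pause described above.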
Spatial and Auditory Manipulation
The code does not step through memory linearly. The instruction sub si, byte 57, combined with the increment from lodsb, results in a net movement of -56 bytes per iteration. This specific offset serves two purposes:
1. The Visual Shear
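In mode 0 each row is 80 bytes wide (40 columns of 2 bytes each), so a -56 byte step is spatially equivalent to moving up one row and forward 24 bytes. Since each character cell is 2 bytes, this equals a 12-column shift. Because 56 and 80 share a common factor of 8, the walk only ever lands on ten byte offsets within a row, four columns apart. The resulting fractal is therefore not a contiguous image but is sheared diagonally into ten evenly spaced vertical pillars of ascending glyphs.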
2. The Auditory Octave Shift
While a 16-byte step would complete a segment sweep in 4,096 iterations, a 56-byte step requires 8,192 iterations (wrapping around the 64KB segment seven times) to return to the start. This doubles the macro-cycle length, which halves the fundamental frequency of the audio, dropping the tone by exactly one octave.
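The arithmetic behind both effects is short enough to verify mechanically. The Python sketch below recomputes the figures quoted above from the step size, the row width and the segment size; the constant and function names are chosen for this example only.

```python
from math import gcd

STEP = -56          # net SI change per iteration: +1 from lodsb, -57 from sub
ROW_BYTES = 80      # 40 columns x 2 bytes (character + attribute)
SEGMENT = 0x10000   # SI is 16 bits wide, so offsets wrap at 64 KB

# Visual shear: position within a row after one step.
forward_shift = STEP % ROW_BYTES                # 24 bytes
column_shift = forward_shift // 2               # 12 columns
pillars = ROW_BYTES // gcd(-STEP, ROW_BYTES)    # 10 reachable offsets per row


def cycle_length(step_bytes):
    """Iterations before SI returns to its starting offset."""
    return SEGMENT // gcd(step_bytes, SEGMENT)


cycle_56 = cycle_length(56)                     # 8192 iterations
wraps_56 = cycle_56 * 56 // SEGMENT             # 7 trips around the segment
cycle_16 = cycle_length(16)                     # 4096 iterations

print(forward_shift, column_shift, pillars)
print(cycle_56, wraps_56, cycle_16, cycle_56 // cycle_16)
```

The ratio of the two cycle lengths is 2, which is where the one-octave drop comes from.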
Hardware Sensitivity and the "Fingerprint"
Because the routine reads and XORs against existing memory, the output is highly sensitive to the environment. Different VGA BIOS implementations or emulators (like DOSBox or PCem) may leave different artifacts in RAM.
As a result, the visual characters and the timbre of the sound can vary between machines. While a setup routine could clear the memory to ensure uniformity, it would exceed the 16-byte limit. The author embraces this unpredictability, turning the hardware's natural state into a unique audiovisual fingerprint for each execution.
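A toy experiment shows why leftover bytes matter so much. The Python sketch below is an illustration under simplifying assumptions (a 64-cell ring instead of the real segment, and an invented xor_walk helper): it runs the same XOR sweep over a clean buffer and over one with a single stray bit, then counts how many cells differ.

```python
def xor_walk(cells, passes):
    """Sweep a ring of cells, XORing each with its predecessor, `passes` times."""
    cells = list(cells)
    for _ in range(passes):
        for i in range(len(cells)):
            cells[i] ^= cells[i - 1]   # index -1 wraps to the last cell
    return cells


clean = [0x20] * 64
dirty = list(clean)
dirty[10] ^= 0x01                      # one stray bit, standing in for a RAM artifact

after_clean = xor_walk(clean, 1)
after_dirty = xor_walk(dirty, 1)
differing = sum(1 for a, b in zip(after_clean, after_dirty) if a != b)
print(differing)
```

A single flipped bit contaminates every cell downstream of it within one sweep, so two machines that start with slightly different memory contents diverge almost immediately.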
"Remarkably, sending this entire mixed-data byte directly to system port 61h... does not disrupt the system. In standard DOS environments and modern emulators, pushing these extra bits to the port is effectively harmless."
This final coincidence allows the visual ASCII data to safely double as the audio signal, maximizing the efficiency of the 16-byte limit.