After the challenge is before the post-challenge… It’s snowing again, and I’m so immensely looking forward to the law-mandated Winterdienst that I decided to take another look at my Vintage Computing Christmas Challenge snowflake. Maybe I’ll get an optimization idea for the VC³ post-challenge?
Staring at the old code, I noticed a pattern in the coordinates:
dc.b 9*8,0*8
dc.b 4*8,1*8
dc.b 5*8,2*8
dc.b 1*8,7*8
dc.b 2*8,8*8
dc.b 6*8,5*8
dc.b 7*8,5*8
There are “chains” of coordinate pairs, provided we flip some values here and there (they get swapped in the drawing process anyway). Within a chain, the second value of each pair is the first value of the next pair:
- (4,1) (1,7) (7,5) (5,6)
- (5,2) (2,8)
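Why flipping a pair is harmless can be sketched in a few lines of Python. This is only a model, not Amiga code: it assumes the drawing routine plots every sign combination of a point and of its swapped counterpart, and the helper name `mirrored` is made up for illustration.

```python
def mirrored(x, y):
    """All positions reached by swapping and negating the two values (assumed symmetry)."""
    points = set()
    for a, b in ((x, y), (y, x)):      # original and swapped pair
        for sx in (1, -1):             # mirror the first value
            for sy in (1, -1):         # mirror the second value
                points.add((sx * a, sy * b))
    return points

# (5,6) and (6,5) produce exactly the same set of plotted positions.
print(mirrored(5, 6) == mirrored(6, 5))   # True
print(len(mirrored(4, 1)))                # 8
```

Under that assumption, the order of the two values inside a pair is free, which is what makes it possible to line the pairs up into chains.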
If we rearrange the order a bit and insert seemingly redundant extra pairs (like 5,6 and 6,5)…
dc.b 4*8,1*8
dc.b 1*8,7*8
dc.b 7*8,5*8
dc.b 5*8,6*8
dc.b 6*8,5*8
dc.b 5*8,2*8
dc.b 2*8,8*8
dc.b 8*8,0*8
dc.b 0*8,9*8
…we can get rid of almost half the bytes! Instead of reading two bytes each time and advancing the coordinate pointer by two bytes, we read two bytes and advance the pointer by only one.
| Data scheme | Coordinates ÷ 8 | Plot and mirror at |
|---|---|---|
| Old | 9,0, 4,1, 5,2, ... | (9,0) (4,1) (5,2) … |
| New | 4, 1, 7, 5, 6, ... | (4,1) (1,7) (7,5) (5,6) … |
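The overlapping scheme can be modelled as a sliding window of width two over a byte sequence. The Python sketch below is illustrative only: the helper names `pack` and `unpack` are invented here, the values are the coordinates divided by 8, and it walks the data forwards for readability.

```python
# Chained pairs from the rearranged table (coordinates divided by 8).
pairs = [(4, 1), (1, 7), (7, 5), (5, 6), (6, 5), (5, 2), (2, 8), (8, 0), (0, 9)]

def pack(chain):
    """Store a chain as the first value followed by every second value."""
    for (_, end), (start, _) in zip(chain, chain[1:]):
        assert end == start, "each pair must start where the previous one ended"
    return bytes([chain[0][0]] + [second for _, second in chain])

def unpack(data):
    """Read two bytes, advance by one: a sliding window of width 2."""
    return [(data[i], data[i + 1]) for i in range(len(data) - 1)]

packed = pack(pairs)
print(list(packed))                       # [4, 1, 7, 5, 6, 5, 2, 8, 0, 9]
print(unpack(packed) == pairs)            # True
print(len(pairs) * 2, "->", len(packed))  # 18 -> 10
```

Nine pairs stored separately would take 18 bytes. Stored as one chain they take 10, because every inner byte serves as the end of one pair and the start of the next.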
Cool, plenty of bytes saved! Together with two other small improvements:
- Avoid a useless register save (move.w d4,d7, 2 bytes)
- Shuffle registers around so the value we pass for the alert height ends up in the right register automatically (move.l d4,d1, 2 bytes)
…my entry is down from 84 bytes to *checks notes* 78 bytes.
move.l 368(a2),a6
lea .buf(pc),a0
clr.l -(a0)
lea .coords-1(pc),a1
moveq #64,d1
.loop subq.b #8,d1
move.b d1,d0
bge.b .star
move.b (a1),d0
move.b -(a1),d1
blt.b .done
.star move.w d0,d7
add.l (a2),d7
move.l #$562a0001,-(a0)
add.b d1,(a0)
move.w d7,-(a0)
neg.w d0
blt.b .star
neg.w d1
blt.b .star
exg d0,d1
neg.l d6
blt.b .star
clr.w d0
neg.l d5
blt.b .star
bra.b .loop
.done jmp -90(a6) ; 4eee ffa6
dc.b 4*8
dc.b 1*8
dc.b 7*8
dc.b 5*8
dc.b 6*8
dc.b 5*8
dc.b 2*8
dc.b 8*8
dc.b 0*8
dc.b 9*8
.coords
dx.b 8000 ; reserve extra space via header
.buf
Unfortunately, some neat optimizations don’t work anymore. The alert-message size exceeded the default stack size of 4k and caused crashes under Kickstart 2.0 and above. No more abusing the stack for our data! On the flip side, the code is more compatible than ever:
- Uses a dedicated and properly allocated memory area for output
  - This is done with the dx.b directive which reserves a memory region in the file header, but adds nothing to the file size
  - Cost: four bytes for the lea .buf(pc),a0 instruction
- Does not rely on zero-initialized memory
  - As we learned in 2024, that extra memory reserved in the file header is not cleared under Kickstart 1.x
  - Cost: two bytes for clr.l -(a0) (but we already had that)
- Cleanly exits to the system
  - The stack pointer a7 is not corrupted
  - After the DisplayAlert call, register d0 contains the value 1 – that doesn’t even cause a “snowflake.exe failed returncode x” message on exit!
  - Cost: free
Bonus: The exit condition, which triggers when a coordinate value is below zero, has changed. We now read backwards into the jmp -90(a6) instruction (4eee ffa6 in machine code), so the below-zero condition already occurs at the byte a6 rather than ff. That byte doubles as the alert height value, which is therefore smaller now (166 instead of 255). I think it looks better this way, showing more of the AmigaDOS window below.
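The terminator trick can be checked with a small Python model of the memory layout. It is a sketch under stated assumptions: it only mimics the two byte reads and the sign test from the listing, ignores the mirroring and the initial loop entirely, and the names `memory`, `signed8` and `read_backwards` are invented for illustration.

```python
import struct

# Opcode bytes of "jmp -90(a6)" as given in the listing comment,
# immediately followed by the coordinate table.
jmp = bytes.fromhex("4eeeffa6")
table = bytes(v * 8 for v in (4, 1, 7, 5, 6, 5, 2, 8, 0, 9))
memory = jmp + table

def signed8(b):
    """Interpret a byte the way a signed byte comparison would."""
    return b - 256 if b >= 128 else b

def read_backwards(mem):
    """Start at the last byte; read two bytes per step, move the pointer by one."""
    p = len(mem) - 1
    pairs = []
    while True:
        first = mem[p]               # models: move.b (a1),d0
        p -= 1
        second = mem[p]              # models: move.b -(a1),d1
        if signed8(second) < 0:      # models: blt.b .done
            return pairs, second
        pairs.append((first // 8, second // 8))

pairs, stop = read_backwards(memory)
print(pairs)                          # [(9, 0), (0, 8), (8, 2), (2, 5), (5, 6), (6, 5), (5, 7), (7, 1), (1, 4)]
print(stop, signed8(stop))            # 166 -90
print(struct.pack(">h", -90).hex())   # ffa6
```

The low byte of the 16-bit displacement -90 is a6, which is negative as a signed byte and 166 as an unsigned one. In this model the walk through the table ends as soon as it runs into the instruction in front of it, so no separate end marker is needed.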
