A.D.A. Amiga Demoscene Archive

My 4k intro startup code

 

Blueberry
Member
#1 - Posted: 30 Nov 2008 17:38
At GREP Green this year, I decided to extract the system close/restore and interrupt code from my 4k intro framework and put it into an easy-to-use form. The plan was to publish it here on ADA afterwards, which I never got around to until now.

This is meant primarily as an encouragement to other Amiga coders to make more 4k intros. But the code is probably useful for bigger productions as well. It is designed to be used together with ExeCruncher2 in MINI mode.

The code has the following features:
- Closes down the system and restores it again on exit. The particular technique used for this has been used for many productions by now and has proven itself quite solid, despite its simplicity.
- Works from any screen mode (native or RTG) and with non-zero vectorbase.
- Sets up a vblank interrupt which increments a vblank counter (for timing) and calls a custom vblank interrupt handler (for setting bitplane pointers and such).
- Automatically exits when the left mouse button is pressed, regardless of what the main code is doing.
- Optionally pauses the demo when the right mouse button is pressed. The interrupt handler is still called, but the vblank counter stands still, and the main effect code is halted.
- Steps a single vblank at a time when the left mouse button is pressed during such a pause. Useful for inspecting details and for measuring the frame rate of the effect. :)
- Includes a section pointer hack that permits a statically allocated chip memory section without needing any relocation entries.
- Optionally fetches the topaz font and makes a pointer to the font character data available.
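
For readers who have not written one before, here is a minimal sketch of what a vblank interrupt with a frame counter can look like. It is not the code from the archive: VBlankHandler and VBlankCount are made-up names, and the pause and exit logic is left out. It assumes the handler has been installed as the level 3 interrupt vector (relative to the vectorbase).

```asm
; Hypothetical level 3 interrupt handler sketch; VBlankHandler and
; VBlankCount are illustrative names, not labels from the startup code.
VBlankHandler:
        movem.l d0-d1/a0-a1/a3,-(sp)
        lea     $dff000,a3              ; custom chip register base
        btst    #5,$1f(a3)              ; INTREQR ($dff01e), bit 5 = VERTB
        beq.b   .done                   ; some other level 3 source
        lea     VBlankCount(pc),a0
        addq.l  #1,(a0)                 ; one tick per frame, used for timing
        ; per-frame work (bitplane pointers and such) would go here
        moveq   #$20,d0                 ; VERTB bit with SET/CLR clear
        move.w  d0,$9c(a3)              ; acknowledge the request in INTREQ
        move.w  d0,$9c(a3)              ; second write, commonly done for fast CPUs
.done:
        movem.l (sp)+,d0-d1/a0-a1/a3
        rte

VBlankCount:
        dc.l    0                       ; frames elapsed since startup
```

The main code can then time its effects by reading VBlankCount instead of counting its own frames.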

Get it at http://www.crinkler.net/BlueberryDemoStartup.lha.

Post questions, feedback, whatever, in this thread. Comments are more than welcome! :)
skan
Member
#2 - Posted: 1 Dec 2008 14:22
Great! :D
We have a startup code, three softsynths, an exe cruncher... what do I miss to build the perfect 4k?
Oh yeah, an ass-kicking 2d vector engine! ;P
d0DgE
Member
#3 - Posted: 1 Dec 2008 14:38
yay \o/ ... there will definitely be much to learn from this one.
Mostly I will take a look at all the vblank stuff ... THX for once again sharing great achievements
sp_
Member
#4 - Posted: 2 Dec 2008 08:12
I managed to shave off 22 bytes in demostartup.s :D

.
Save 2 bytes
Remove one move in SECTIONHACK.

move.l d0,a0
lea.l ChipPtr(pc),a1
move.l a0,(a1)

----->

lea.l ChipPtr(pc),a1
move.l d0,(a1)

....

Save 6 bytes:

remove move.w #$3fdf,$9a(a3) and add dc.l $9a3fdf in the copperlist
remove move.w #$0020,$1dc(a3) and add dc.l $1dc0020 in the copperlist
remove move.w #$007f,$96(a3) and add dc.l $096007f in the copperlist


Save 4*2 bytes:

replace lea $dff000,a3 with lea $dff096,a3 Correct all instructions with a3 access.

Save 2*2 bytes
in the interrupt handler
replace lea $dff000,a3 with lea $dff09c,a3 Correct all instructions with a3 access.

save 2 bytes
in memory add dc.w $0200 after the VBlank pointer
lea.l vblank(pc),a0
addq.l #1,(a0)+
...
move.w (a0),$9c(a3)
move.w (a0),$9c(a3)
Blueberry
Member
#5 - Posted: 2 Dec 2008 10:03
move.l d0,a0
lea.l ChipPtr(pc),a1
move.l a0,(a1)

----->

lea.l ChipPtr(pc),a1
move.l d0,(a1)


Yep, good catch. The section hack was adapted from a more general one which could loop through multiple sections. That's where the extra instruction came from.


remove move.w #$3fdf,$9a(a3) and add dc.l $9a3fdf in the copperlist
remove move.w #$0020,$1dc(a3) and add dc.l $1dc0020 in the copperlist
remove move.w #$007f,$96(a3) and add dc.l $096007f in the copperlist


This is only possible if there actually is a copperlist, which is not necessarily the case. I often don't use one. And it mixes critical system closedown code into the intro copperlist, which is also bad.

But most importantly, it is essential for the stability of the system closedown that these instructions are executed at these exact positions. Otherwise you might run into spurious race conditions and the startup may not always work properly.


Save 4*2 bytes:
replace lea $dff000,a3 with lea $dff096,a3 Correct all instructions with a3 access.

Save 2*2 bytes
in the interrupt handler
replace lea $dff000,a3 with lea $dff09c,a3 Correct all instructions with a3 access.


Making the two leas different will make them compress worse, so this is not necessarily better after compression. You could change them both to $dff096 and get some of the benefit, but this would still make the instruction words for the move instructions different, which will again hurt compression. The instruction move.w #x,y(a3) is very common, and the cruncher can take advantage of this. So I want to see a comparison of compressed sizes (for a full 4k) before I believe that this is a good optimization.
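
To make the trade-off concrete, here is a rough sketch of the encodings involved. The sizes and opcode words follow from the standard 68000 MOVE encoding, and the register values are the ones quoted above. Only the write that lands exactly on the new base loses its displacement word, and its opcode word changes in the process, which is the part that can hurt the cruncher.

```asm
; Base at $dff000: every write uses the d16(a3) form
        lea     $dff000,a3
        move.w  #$007f,$96(a3)          ; $377c $007f $0096   (6 bytes)
        move.w  #$3fdf,$9a(a3)          ; $377c $3fdf $009a   (6 bytes)

; Base at $dff096: the DMACON write shrinks, the INTENA write does not
        lea     $dff096,a3
        move.w  #$007f,(a3)             ; $36bc $007f         (4 bytes)
        move.w  #$3fdf,$9a-$96(a3)      ; $377c $3fdf $0004   (6 bytes)
```

Whether the two bytes saved before compression survive after it is exactly the open question, so this shows only the uncompressed side.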


in memory add dc.w $0200 after the VBlank pointer
lea.l vblank(pc),a0
addq.l #1,(a0)+
...
move.w (a0),$9c(a3)
move.w (a0),$9c(a3)


Again, this will not give any benefit after compression (it will most likely be worse), since the second move is identical to the first in any case. Also, it will not work if the custom interrupt routine changes a0. (As a side note, it is also practical that a0 points to the vblank counter when the routine is called.)


Well, then. 2 bytes. ;)
sp_
Member
#6 - Posted: 2 Dec 2008 13:43
2 bytes is good :D

Does this mean that "planet loonies" can become a 4094 byte intro down from 4096 bytes? :) I agree that the compressor might compress worse with the changes I suggested, but it needs to be tested. I will disassemble the decompression header of an intro to see if more bytes can be removed.

I'll be back ;)
sp_
Member
#7 - Posted: 2 Dec 2008 14:35
The decompressor seems to be coded very tightly and beautifully. But in Planet Loonies it's possible to save 2 bytes.

add.w d7,d7
move.b -$16(a0,d7.w),(a0)+
--->
move.b -$16(a0,d7.w*2),(a0)+

save 2 bytes
Blueberry
Member
#8 - Posted: 2 Dec 2008 20:23
Does this mean that "planet loonies" can become a 4094 byte intro down from 4096 bytes? :)

The section hack code in my intros is different from the published one, but actually the same optimization applies. I move the result to an address register, but it turns out I can just as well have it in a data register.

I tried it on Planet Loonies, and it saves 19 bits on the compressed size. The file size remained the same, though, as it is rounded up to a multiple of 4 bytes. Still a nice little optimization for the future. Thanks. ;)

(Actually the file size was 4084 bytes. Not sure what has changed since the release. Maybe I just tweaked the cruncher parameters better this time.)

add.w d7,d7
move.b -$16(a0,d7.w),(a0)+
--->
move.b -$16(a0,d7.w*2),(a0)+


The add.w d7,d7 is conditional on the branch just before it (used to distinguish between every-byte and only-even offsets), so this will not work.
Blueberry
Member
#9 - Posted: 6 Feb 2012 12:04
More oldskool 4k action! :-D

I updated my compact demo startup code to be compatible with OCS Amigas. The exact compatibility/size tradeoff is controlled by an option in the code, with 3 possibilities:

- Mode 2: Only compatible with 68010+ and Kickstart 3.0+. Can run from any display mode. Supports non-zero vectorbase. This corresponds to the previous version.
- Mode 1: Compatible with all CPUs and Kickstarts. Slightly larger because the VBR code is guarded by a processor check and graphics.library is opened in the old-fashioned way (KS 3.0+ has a more compact way of doing this).
- Mode 0: Only compatible with 68000 and Kickstart 1.3 and must run from a PAL screen. This version omits lots of compatibility code and is thus significantly smaller.

Included is a cache flush macro, which behaves differently depending on the compatibility mode. In mode 0, it does nothing. In mode 1, it calls the CacheClearU function if the Kickstart version is 2.04 or higher. In mode 2, it simply calls the CacheClearU function. This makes it easy to create different versions of an intro with different compatibility characteristics (such as an OCS-only 4k version and a compatible larger-than-4k version).
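
A minimal sketch of how such a mode-dependent macro could be written, not the macro from the archive. COMPATIBILITY is a hypothetical name for the mode option, and the sketch assumes an assembler with Devpac-style macros (`\@` for unique labels). The offsets are standard exec.library values: -636 is CacheClearU and 20 is lib_Version.

```asm
COMPATIBILITY   equ     1               ; hypothetical option symbol: 0, 1 or 2

CACHEFLUSH      macro
        ifeq    COMPATIBILITY-1
        move.l  4.w,a6                  ; ExecBase
        cmp.w   #37,20(a6)              ; lib_Version; Kickstart 2.04 is V37
        bcs.b   .nocache\@              ; older Kickstart: no CacheClearU to call
        jsr     -636(a6)                ; CacheClearU
.nocache\@:
        endc
        ifeq    COMPATIBILITY-2
        move.l  4.w,a6                  ; ExecBase
        jsr     -636(a6)                ; CacheClearU, always present on KS 3.0+
        endc
        endm                            ; mode 0: the macro expands to nothing
```

Call CACHEFLUSH after generating or copying code and before jumping into it. Note that this sketch clobbers a6 and, like any library call, d0-d1/a0-a1.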

The framework is designed to produce completely relocatable code, so that the MINI option of my cruncher can be used. To keep your code PC relative, use the SECTIONHACK feature and follow these rules:
- When referring to a label in the fast memory section, always use PC relative addressing (except for branches, which are implicitly PC relative). If the label is too far from the code for PC relative addressing, LEA a nearby label and then add the distance between the labels.
- When referring to a label in the chip memory section, read ChipPtr(pc) and then add the distance between ChipPtr and the desired label.
There are examples of these coding patterns in the supplied code.
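
For quick reference, the three patterns can be sketched like this. All labels except ChipPtr are made up for illustration, and ChipStart is assumed to be a label placed at the very start of the chip section, so that offsets into that section are plain label differences.

```asm
        section code,code

Example:
; 1) Label in the code section, within reach: plain PC relative
        lea     NearTable(pc),a0

; 2) Label too far away for a 16-bit PC relative displacement:
;    LEA a nearby label, then add the (constant) distance
        lea     NearTable(pc),a0
        add.l   #FarTable-NearTable,a0  ; difference of two labels in one section

; 3) Label in the chip section: start from the pointer set up by SECTIONHACK
        move.l  ChipPtr(pc),a1          ; base address of the chip section
        add.l   #Copper-ChipStart,a1    ; constant offset within that section
        rts

ChipPtr:        dc.l    0               ; filled in by the startup code
NearTable:      dc.w    1,2,3
FarTable:       dc.w    4,5,6           ; imagine this more than 32K away

        section chip,bss_c
ChipStart:
Copper:         ds.l    64
```

None of these forms refers to an absolute address, so the assembler has no reason to emit a relocation entry.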

The update also incorporates a few size optimizations, including the one suggested by sp_.

Happy coding! :-D
Blueberry
Member
#10 - Posted: 22 Aug 2013 20:13 - Edited
I realized (some time ago) that the old trick of reading GraphicsBase from offset 156 of ExecBase works on all Amigas. I have now updated the startup code accordingly, saving some bytes (especially in compatibility mode 1).
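
As a sketch of the difference (not the actual startup code): offset 156 lies inside the interrupt vector table of ExecBase and, as far as I can tell, is the data pointer of the blitter interrupt vector, which graphics.library owns. It is not a documented way of getting the library base. GfxName is a made-up label.

```asm
; Documented way: open the library by name
OpenGfx:
        move.l  4.w,a6                  ; ExecBase
        lea     GfxName(pc),a1
        moveq   #0,d0                   ; any version
        jsr     -552(a6)                ; OpenLibrary, GfxBase returned in d0
        rts

; The trick described above: read it straight out of ExecBase
PeekGfx:
        move.l  4.w,a6                  ; ExecBase
        move.l  156(a6),a6              ; GfxBase
        rts

GfxName:        dc.b    'graphics.library',0
                even
```

The second form needs neither the name string nor the library call, which is where the saved bytes come from.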
dalton
Member
#11 - Posted: 23 Aug 2013 08:28
Reply Quote
That's cool, are there also places where you can read dosbase and intuitionbase?
Blueberry
Member
#12 - Posted: 27 Aug 2013 20:12
I looked a bit into this. IntuitionBase seems to be available at offset 468 from GraphicsBase (gb_IData) on KS3.0 and KS3.1, but no such luck on KS1.3 or KS2.0. I couldn't find DosBase anywhere.
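
If someone wants to try this anyway, a cautious sketch would check the graphics.library version first and only trust the offset on the two versions mentioned (V39 for KS3.0, V40 for KS3.1). GetIntuitionBase is a made-up routine name, and the offset is taken from the observation above, nothing more.

```asm
; In:  a1 = GfxBase
; Out: d0 = IntuitionBase, or 0 if the graphics.library version is not V39/V40
GetIntuitionBase:
        moveq   #0,d0
        move.w  20(a1),d1               ; lib_Version
        sub.w   #39,d1                  ; 0 for V39, 1 for V40
        cmp.w   #1,d1
        bhi.b   .unknown                ; any other version: offset not verified
        move.l  468(a1),d0              ; gb_IData, as reported above
.unknown:
        rts
```

The version check costs a few bytes, so for a 4k it is probably only worthwhile if the fallback path is cheap.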
sp_
Member
#13 - Posted: 21 Dec 2013 05:38
4k's in 2014 are done in JavaScript, Blueberry :)
Blueberry
Member
#14 - Posted: 22 Dec 2013 01:05
Amiga 4ks are most certainly not.

JavaScript is for script kiddies. ;)
Blueberry
Member
#15 - Posted: 10 Mar 2015 00:16
I have made two small changes to my startup that I needed, and others might find them useful as well:

You can now specify -1 for SECTIONHACK to get no chip section pointer (e.g. if you prefer to allocate your chip memory manually).

The topaz font is now opened earlier during startup, so you can also access it during precalc.

New version uploaded. Enjoy. :)
LaBodilsen
Member
#16 - Posted: 16 Jan 2016 11:08
First off, thanks for this startup code.
I've played around with it, and will most likely use it for any release I might or might not make in the future.

One (newbie) question though: You have the copper list in "normal/any" memory, and then copy it to chip RAM.

Is there a reason to do this, instead of just having the copper list in chip RAM to begin with? I'm mostly just looking to program standard A500/A1200 releases, and it makes it a little easier to just have the copper list in chip RAM.

/Regards
Blueberry
Member
#17 - Posted: 17 Jan 2016 17:31 - Edited
There is a reason, though it is admittedly subtle...

My cruncher, Shrinkler, has (like ExeCruncher2 before it) a special mini option which saves some bytes (around 100 or so) by using a simpler decrunch header. The decompression code is (basically) the same, but it only supports one non-empty code or data section and does not support relocations.

If the copperlist is put directly into a chip-memory data section, the "only one code or data section" restriction is violated, and thus the mini mode cannot be used. The copperlist data is therefore placed in the (singular) code section and copied to an uninitialized chip memory section (which does not count against the restriction).

This is not enough, however. Under normal circumstances, referring to the chip memory section would incur a relocation, again prohibiting mini mode. This is where the SECTIONHACK feature of the startup comes in: when set to 1 (and run as an executable) this will dig into the in-memory section structure to retrieve the pointer to the next section (which is assumed to be the chip memory section). This pointer is written to the ChipPtr variable, which then can be used as a base address for computing the address of any label in the chip memory section without incurring relocations.

Avoiding relocations completely can be quite tedious. If you do not care about the mini mode (that is, if you are making anything other than a tightly size-optimized 4k intro), you can just disable the SECTIONHACK by setting it to -1 and then put your data wherever you please.
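
To illustrate the idea behind the section pointer trick, here is a minimal sketch. It is not the code from the startup itself. It assumes the standard AmigaDOS in-memory layout, where the longword just before the first byte of a loaded section holds a BCPL pointer to the next section, and it assumes CodeStart really is the first byte of the code section. ChipStart and ChipBuffer are made-up labels.

```asm
        section code,code

CodeStart:                              ; must be the very first byte of the section
        move.l  CodeStart-4(pc),d0      ; BCPL pointer to the next section's link field
        lsl.l   #2,d0                   ; BCPL pointer -> machine address
        addq.l  #4,d0                   ; skip the link field: first byte of the chip section
        lea     ChipPtr(pc),a1
        move.l  d0,(a1)                 ; keep it as the base for all chip addresses
        rts

ChipPtr:        dc.l    0

        section chip,bss_c              ; uninitialized chip memory, the next section
ChipStart:
ChipBuffer:     ds.b    1024
```

Since the pointer is read at run time, the code never mentions an absolute chip address and the assembler emits no relocation for it.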
LaBodilsen
Member
#18 - Posted: 17 Jan 2016 17:54
Thank you for the very detailed explanation.

I will keep it in mind if I decide to go for a 4k intro at some point. But to begin with, I will most likely just be happy to release anything. It's been far too long since I've done any release, so as not to set the bar too high for myself, I'll keep it simple to begin with.

birra
Member
#19 - Posted: 31 Oct 2016 00:35
Thanks Blueberry!

I'm using a slightly modified version of your startup code due to assembly errors with Devpac (spurious code....), but my little intro doesn't work on an A4000. I have to check it, perhaps setting compatibility mode helps :)

Thanks for all. Nice job.

Amiga rulez!

 
