A.D.A. Amiga Demoscene Archive

Amiga Demoscene Archive Forum / Coding / My Amiga Cruncher
Blueberry
Member
#1 - Posted: 20 Mar 2007 22:19
Hi ADA'ers!

I spontaneously released my executable cruncher as a reaction to the discussion in this pouet thread. I thought I might as well post a link here on ADA.

I should have released it long ago of course, but anyway, here it is. No documentation, but just try out various option values to see what compresses better. Give a ? to get a list of options. Use the MINI option for 4ks.

So, try it out, and if you have any questions or comments, I think this is the perfect place to discuss. And I would be surprised if the discussion doesn't turn into a more general cruncher discussion before long. ;)

Enjoy!
Blueberry
Member
#2 - Posted: 20 Mar 2007 22:28
Oh, and I almost forgot: Now that you have this cruncher, go make a 4k intro for Breakpoint! That's an order! :-)
rload
Member
#3 - Posted: 21 Mar 2007 01:26
aha... in addition to abusing Kalms' c2p, I can now abuse Blueberry's packer as well... I'm slowly becoming a 4k middleware empire!!
sp_
Member
#4 - Posted: 21 Mar 2007 05:35 - Edited
hey Loaderror, how about giving us your brilliant 4k music engine so we can compete with you. :D
Watching a 4k without music is worse than watching a dubbed movie in Thai language. :D

Blueberry, how do you think the performance of your cruncher will compare against an LZ + Huffman implementation, in size? I currently don't use adaptive Huffman, so 2 Huffman trees are stored in the file.

In the current unfinished algorithm I first crunch the file with LZ. After that I crunch the LZ lengths and the raw data with Huffman; the relative LZ offsets are not crunched.
The decrunch routine can be really small if I use a complete binary tree in memory. However, this requires more memory in precalc.

Precalc memory consumption, where n = the longest Huffman code in the crunch:

 n
---
\
/   2^k
---
k=0

Total: 2^(n+1) - 1 bytes
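As a quick check of that total (a sketch in Python; the function name is just illustrative):

```python
def complete_tree_bytes(n):
    # One byte per node of a complete binary tree whose deepest
    # level is n: sum of 2^k for k = 0..n, i.e. 2^(n+1) - 1 bytes.
    return sum(2 ** k for k in range(n + 1))

for n in (8, 12, 16):
    print(n, complete_tree_bytes(n))
```

Even n = 16 already needs about 128 KB of precalc, which is why codes much longer than 16 bits quickly become impractical.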

Worst-case Huffman tree: n = 255 for byte-crunched data, but normally codes are shorter than 16 bits.

/\
 /\
  /\
   /\
    /\
     ...


Here is my 32-byte decruncher using a complete Huffman tree in memory. Thanks to doom^Iris for helping me optimize.


HUFFMANDECRUNCH:
; a0 = compressed data (longwords), a1 = output buffer,
; a2 = complete Huffman tree (children of node k at 2k and 2k+1,
;      nonzero byte = leaf), d4 = number of bytes to decode

moveq #1,d5 ;bit-buffer sentinel: triggers a refill every 32 bits
moveq #0,d3 ;constant 0 for addx
.b moveq #1,d6 ;start each symbol at the root node (index 1)
.t ror.l #1,d5 ;time to refill the bit buffer?
bpl.b .o
move.l (a0)+,d0 ;fetch the next 32 compressed bits
.o add.l d6,d6 ;Traverse to childnode
add.l d0,d0 ;Carry set = rightnode else left node
addx.l d3,d6 ;Left or rightbit
move.b (a2,d6.l),d2 ;If there is data!=0 we have reached an end node.
beq.b .t
move.b d2,(a1)+ ;store the decoded byte
subq.w #1,d4
bne.b .b
rts
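For readers less fluent in 68k, here is my reading of the routine as a Python sketch (assumptions: bits arrive MSB-first, the tree is a flat array with node k's children at 2k and 2k+1, internal nodes hold 0 and leaves hold the nonzero output byte, so the byte value 0 cannot be a symbol):

```python
def huffman_decrunch(bits, tree, count):
    # bits: iterator of 0/1 values; tree: flat array as described above;
    # count: number of symbols to decode.
    out = []
    for _ in range(count):
        node = 1                          # start at the root (index 1)
        while True:
            node = 2 * node + next(bits)  # descend left (0) or right (1)
            if tree[node] != 0:           # nonzero byte = leaf reached
                out.append(tree[node])
                break
    return bytes(out)
```

With codes A=0, B=10, C=11 the tree array is [0,0,65,0,0,0,66,67] and the bit sequence 0,1,0,1,1 decodes to "ABC".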
sp_
Member
#5 - Posted: 21 Mar 2007 13:46 - Edited
The LZ77 + Huffman combination is not new or patented. The open .png image format uses this technique to compress. The LZ compression is flexible because you can adjust a lot of parameters, like how many bits to use for lengths and how many for offsets. The ideal LZ cruncher should try all combinations. LZ77 is limited, and that's why we have all sorts of LZ variants.

I've got another idea which I have to try out:
First I crunch everything with a Huffman tree. After this I crunch the buffer of Huffman-encoded bytes with an LZ variation (which will try all combinations of parameters (slow :)).
(Since the order of the bytes is preserved and each Huffman code maps to the same byte, the LZ offsets and lengths will typically be able to find longer matches, with a longer reach within the file. This will improve compression.)
The last pass is to Huffman-encode the lengths and offsets of the LZ. (The raw data is already Huffman-encoded, so no gain there.)
Probably it would be better to use arithmetic coding here. Arithmetic coding will not work in pass number one, since each binary code doesn't map to the same byte.

So the ideal order:

1. Huffman
2. LZ
3. Arithmetic

I found an open-source Huffman encoder in C which I am currently modifying.
Blueberry
Member
#6 - Posted: 21 Mar 2007 15:17
hey Loaderror, how about giving us your brilliant 4k music engine so we can compete with you. :D
Watching a 4k without music is worse than watching a dubbed movie in Thai language. :D


You can have a copy of my 4k-synth (the one used in Planet Loonies for the music and speech) if you want. It is not as cool-sounding as Loaderror's, but it should be fairly easy to use, and it even comes with a GUI for making the instruments. :)

Only condition is that you use it to make a 4k for Breakpoint. ;-)
sp_
Member
#7 - Posted: 21 Mar 2007 15:41 - Edited
hehe, I am working every day to finish something for Breakpoint, but I always end up doing non-scene stuff..

Right now I have the cruncher on my mind. I found a clever idea for improving Huffman encoding by using a frequency table of two sequential Huffman codes, sorted by least compression bits vs. least frequencies.
This is a way to get arithmetic-coding performance by using Huffman pairs (2 sequential Huffman codes).

Probably I am re-inventing the wheel here. But it gives my brain coder's satisfaction.

If you mail me the synth engine, I promise I will release a 4k.

wait for mouse,
move random colors into $dff180
exit on mouse click
a 109 minute prod.

he-he :D

Is the code Mc68000 compatible?

Send to:

sp.kvarteret.uib.no (replace the first dot with @)
Blueberry
Member
#8 - Posted: 21 Mar 2007 16:18
OK, I'll just have to do a bit of work to separate the code from my 4k demo system, then I'll send it off. And no, just scrambling the background doesn't count. Do something decent. Well, at least have decent sound. ;)

The code is far from 68000 compatible. FPU code all the way through.
dalton
Member
#9 - Posted: 21 Mar 2007 18:17
yes! i love this cruncher!
krabob
Member
#10 - Posted: 22 Mar 2007 00:08
mind you! all these algorithms are copyrighted by Timbaland!
Blueberry
Member
#11 - Posted: 22 Mar 2007 11:14
Unless decompression speed is critical, Huffman coding is not very practical compared to range coding.

One thing of course is that the code lengths are suboptimal, since they are restricted to integral bit sizes. This does not cause a significant loss unless you have some very frequent symbols (say, 3 bits or less).

More importantly, the overhead of storing the Huffman tree can be quite significant, especially for small amounts of data. There are various tricks that can be used to store it compactly, though. For instance, LZX (which is also the compression method of CAB) stores a Huffman tree for the symbols used to represent the Huffman tree for the data symbols. But all of this complexity of course adds to the size of the decompression code.

However, the main disadvantage of Huffman coding is that it is difficult to make adaptive. Adaptive schemes base the probabilities of symbols on the number of occurrences seen so far. This is usually superior to static schemes (such as storing a Huffman tree) since it can adapt to changes in the data: for instance, if you compress both code and data (which is typical), the probabilities adjust themselves to the new data along the way. Changing the Huffman tree to reflect changing probabilities is both complicated and slow, whereas this change is trivial for range coding, where the representation of the probabilities is just the probabilities themselves.
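The adaptive update really is trivial; a minimal sketch of such an occurrence-count model for one binary context (names are illustrative, not from any particular packer):

```python
class AdaptiveBitModel:
    # Predict P(bit = 1) from the occurrences seen so far, the way a
    # range coder's probability model would.
    def __init__(self):
        # Start both counts at 1 so no probability is ever zero.
        self.counts = [1, 1]  # counts[0] = zeros seen, counts[1] = ones seen

    def p_one(self):
        return self.counts[1] / (self.counts[0] + self.counts[1])

    def update(self, bit):
        # Adapting costs a single increment, versus restructuring
        # a Huffman tree after every symbol.
        self.counts[bit] += 1
```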

Range coding and decoding is a bit tricky to get right, though.

Regarding your three-step idea: I don't see what is to be gained by compressing the data beforehand. If the lz77 is going to index the Huffman-compressed symbols, it will need just as many bits as if it were indexing the uncompressed bytes. Arithmetic/range coding of the result is the way to go.

Adjusting the number of bits used for offsets gives a tradeoff between having small offsets (and thus getting better savings by using references) and being able to refer to a greater range (and thus having more opportunities for references). You can actually get the best of both worlds by using variable-length codes for offsets, such that small offsets have small codes and larger offsets have larger codes. The same goes for the length of references. Many LZ77-based compressors (Zip and LZX among them) use this technique. Adaptive probability modelling of the offsets and lengths can provide this variation to some degree.
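Elias gamma codes are one simple instance of such variable-length codes (shown for illustration; not necessarily the scheme any of the packers mentioned here use): a value v costs 2*floor(log2(v)) + 1 bits, so offset 1 takes a single bit while large offsets stay representable.

```python
def gamma_encode(v):
    # Elias gamma: floor(log2 v) zeros, then v in binary (v >= 1).
    assert v >= 1
    binary = bin(v)[2:]                     # always starts with '1'
    return "0" * (len(binary) - 1) + binary

def gamma_decode(bits):
    # Count the leading zeros, then read that many + 1 bits as the value.
    n = 0
    while bits[n] == "0":
        n += 1
    return int(bits[n:2 * n + 1], 2)
```

gamma_encode(1) is "1" and gamma_encode(5) is "00101", so close references cost far fewer bits than distant ones.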

Anyway, sp_, if you want to try and beat my packer (and I assure you you will have a hard time doing so ;), it can wait till after Breakpoint, right? You have more important things to do at the moment. ;-D
sp_
Member
#12 - Posted: 22 Mar 2007 11:42
I have searched through http://portal.acm.org/ to try to find out if there is a paper on my idea of dynamic sequential Huffman-code optimization, but after briefly reading through I can't find one.

Basically the algorithm uses up to 256 additional static Huffman trees which are all derived from the original tree.
This sounds like a lot of extra header data, but it really isn't.
The advantage of this technique is that it uses both the frequencies of the bytes and the byte order to generate an optimal code. Almost like a 2-dimensional Huffman tree.

Well, enough talk for now, time to code. he-he
Blueberry
Member
#13 - Posted: 22 Mar 2007 11:51
I still don't see why you want to use Huffman coding. Is this for realtime decompression of animation data or some such?
sp_
Member
#14 - Posted: 22 Mar 2007 12:01
Regarding your three-step idea: I don't see what is to be gained by compressing the data beforehand. If the lz77 is going to index the Huffman-compressed symbols, it will need just as many bits as if it were indexing the uncompressed bytes. Arithmetic/range coding of the result is the way to go.

Since the data is compressed, the LZ77 offset, which points to where in the file the data sequence previously occurred, can reach more bytes.
Example:
Let's say we reserve 12 bits for the offset and 4 bits for the length. The cruncher searches at most 4096 bytes back in the file to find a match of length 16 (maybe a bit more). If the raw data is first compressed with Huffman, the LZ algorithm is able to search more than 4096 bytes back and with more than 16 bytes of length. The drawback is that the LZ operates on byte boundaries, so this technique probably works best if we have long matching strings in the previous data.
sp_
Member
#15 - Posted: 22 Mar 2007 12:05
I still don't see why you want to use Huffman coding. Is this for realtime decompression of animation data or some such?

Hehe, this is not scene related. I just got an idea, and want to try it out. Damn I should be democoding now.. he-he
Blueberry
Member
#16 - Posted: 23 Mar 2007 23:34
Indexing Huffman-compressed data on byte boundaries is useless. You will only find about 1/8 of the matches, since the other ones will be at different bit positions. By doing the Huffman coding, you are essentially destroying the structure that the lz77 compression recognizes.

As said, the best way to extend the ranges for the offset and length is to use variable-length codes. Ideally, these codes should adapt to the values that actually show up in the compressed data. And vice versa. ;) This is one of the key tricks that my new (yet to be finished) cruncher uses to improve the compression.
sp_
Member
#17 - Posted: 25 Mar 2007 11:13 - Edited
I see your point with only 1/8 of the matches; using normal LZ77 will be useless here.

The LZ algorithm can be modified to work on bits instead of bytes, but it will require at least 3 more bits in the offset.
Probably the number of LZ offset bits should be dynamic, based on an adaptive probability algorithm as you mentioned.

I'll try to explain my "unfinished" multiple-Huffman-tree algorithm here.

I'll use the English language for the explanation.
Normally in the English language there are letter pairs that are never used (no words contain the following letter combinations).
Here is an example of such letter pairs.

KQ
KW
KT
KP
KS
KD
KF
KG
KH
KJ
KK
KZ
KX
KC
KB
KM

Kq
kw
...

So in this example, if K was the last letter the routine decrunched with Huffman, the next letter can be decompressed using a much smaller Huffman tree (a tree without Q, W, T, ... with around 32 letters removed).

For Mc680x0 executable files the same kind of elimination can be done. Graphics compression too..

Static solution with trees stored in the file:

The Huffman precalc tree is stored like this:
1 bit per node (since each node has either 0 or 2 children). The nodes are stored in breadth-first search order to simplify the recreation of the tree.
In our example above, if frequency analysis shows that we have a lot of K's in our file (the size of all the Huffman codes that follow a K must be summed up and compared to the newly created tree codes + precalc), the second Huffman tree is stored like this:

1. Sort the 32(++) letters by their frequencies (most frequent first).
2. Remove the sorted letters one by one from the Huffman tree and dynamically recode the next letter using the new Huffman tree.
3. Compress all the bytes (that follow a K in the file) with the new tree, and compare with the last result.
4. Repeat 2 and 3 as long as the file length keeps shrinking.

It's possible to calculate exactly how many bytes you will save by removing a node from the tree.

The same technique can be applied using a dynamic Huffman approach. But since we never know if the next byte combination is "valid", the algorithm will have to do a "guess" ---> and then we move into arithmetic encoding.
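The 1-bit-per-node, breadth-first tree storage described above can be sketched like this (representing a tree as nested 2-tuples with plain symbols as leaves; the representation is mine, for illustration only):

```python
from collections import deque

def tree_shape_bits(root):
    # Breadth-first, one bit per node: 1 = internal node (exactly two
    # children, since Huffman nodes have either 0 or 2), 0 = leaf.
    bits, queue = [], deque([root])
    while queue:
        node = queue.popleft()
        if isinstance(node, tuple):
            bits.append(1)
            queue.append(node[0])
            queue.append(node[1])
        else:
            bits.append(0)
    return bits
```

The leaf symbols then follow in the same breadth-first order, which is all a decoder needs to rebuild the tree.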
sp_
Member
#18 - Posted: 25 Mar 2007 17:19
I found a paper now, from 1999, which is similar to my idea. I have to do some more testing and reading..

IMPROVED HUFFMAN CODING USING RECURSIVE SPLITTING.

URL
d0DgE
Member
#19 - Posted: 26 Mar 2007 13:22
@Blueberry: ÜBERKUDOS to you... you may have saved some sorry butt by releasing this cruncher :D Gthx
Astounding compress rates even in the standard mode :)
Blueberry
Member
#20 - Posted: 26 Mar 2007 15:52
sp: What you describe is essentially first-order modelling: basing the probabilities on the preceding symbol. First- (and higher-) order modelling is particularly effective on text data, and to some degree on executable code. Static models for first- or higher-order modelling (such as storing the Huffman trees) are usually only effective on large, homogeneous pieces of data, since the model data takes some space, and the same model is used for all the data.

However, the structure recognized by such models is exactly the same as the one recognized by lz77, at least if the lz77 is able to reference length-2 sequences. Thus, when combined with lz77 compression, first-order modelling will give only a very minor gain over zeroth-order modelling.
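In code, the difference between zeroth- and first-order modelling is just the shape of the count table (a sketch; a real coder would turn these counts into code lengths or range-coder probabilities):

```python
from collections import Counter, defaultdict

def order0_counts(data):
    # Zeroth order: one distribution for the whole file.
    return Counter(data)

def order1_counts(data):
    # First order: one distribution per preceding byte,
    # i.e. counts for P(symbol | previous symbol).
    counts = defaultdict(Counter)
    for prev, cur in zip(data, data[1:]):
        counts[prev][cur] += 1
    return counts
```

On text, the table for context K only ever contains letters that actually follow K, which is exactly the never-used-pairs effect sp_ describes.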
sp_
Member
#21 - Posted: 27 Mar 2007 13:09 - Edited
@Blueberry:
I will test something on executable files (pure code): I create one Huffman tree for the even bytes, and one Huffman tree for the odd bytes.

When the LZ77 searches back in the file and reaches 2 equal symbols, the offset and length bits are normally longer than 16 bits, so the LZ algorithm won't pack these 2 bytes. When using an LZ-Huffman approach the Huffman codes should probably not be restricted to 1 byte in length, but 2 bytes or even 3, because the longer codes the LZ has already packed. In LZ78 there is always a match in the search tree, since the search buffer starts with 256 different bytes.

Well, got to continue on my Breakpoint 060 4k now.. I have 2-3 effects, and my 92-byte (uncrunched) c2p is soon to be unrolled and filled with FPU code.

1 lightbuffer, 1 texture buffer with shade-table access per pixel (Loaderror style). Slow, but HAM8 2x2 on 060... I have so many cycles...

Going from making 68000 code to 68060+fpu is really relaxing.

But I need music!!! :D If you don't want to give out the whole source, I just need the sample generator :D for drum and bass.
Blueberry
Member
#22 - Posted: 29 Mar 2007 16:09
@sp_:
Yeah, using separate probability distributions for even and odd bytes is quite effective for 680x0 code. I do that in my cruncher as well.

If you use variable-length codes for offset and length, 2-byte references can often pay off if they are close. This can give a great boost for 680x0 code, since just a single instruction that is identical to a recent one can become a reference.

Good to hear that your 4k is progressing. You will get the synth, don't worry. I will take a look at the code separation right now...
rload
Member
#23 - Posted: 30 Mar 2007 04:41
I've got a tune spare. If you could give me your email, I could send a preview. I have some real doubts that you will be able to finish though.. too much thai massage and not enough code motivation from Thor, the thunder goooooooooood!!!
Blueberry
Member
#24 - Posted: 30 Mar 2007 14:12
OK, synth is up. See separate thread.
sp_
Member
#25 - Posted: 1 Apr 2007 07:10
loadrr: You are right.. Thailand is too hot now, next week I will be going on a visa run to Cambodia, and it's difficult to finish a prod. 2x2 looks really ugly. I just don't want to release a prod in 2x2, so I have to optimize more.. my mail: sp.kvarteret.uib.no (replace the first . with @)
z5_
Member
#26 - Posted: 19 Nov 2008 11:42
I just tried out Execruncher2. I was amazed to see the difference in size. Never expected that... It would be cool to have a bit of explanation of the various options though, as they all seem Chinese to me.
sp_
Member
#27 - Posted: 21 Nov 2008 10:02
Another good thing about this cruncher is that the decruncher is Mc68000-compatible, so it can be used on the A500.
Blueberry
Member
#28 - Posted: 21 Nov 2008 20:31
Yeah, some of the options are rather arcane... The best approach is probably to just start with the default values (printed when you crunch something) and then tweak them and see what happens.

Anyway, here's a short explanation:

FROM/TO/QUIET: Obvious.
HUNKMERGE: Merge sections with identical memory type (CHIP/FAST/PUBLIC) into one. If you are abusing the hunk list pointers to find the address of other hunks (popular in 4k intros to avoid relocation entries) you should really know what you are doing before using this option.
MAXDIST: The maximum distance that the LZ compression can refer back when looking for matches. Decreasing it makes the references smaller but loses some opportunities for matching.
MAXLEN: The maximum length of LZ matches. Again, making it smaller decreases the size of the references but limits how long they can be.
OCCMIN/OCCADD: Parameters to the zero-order probability modelling of the range coding. Larger values for OCCMIN makes the predicted probabilities less extreme. Larger values for OCCADD makes the probabilities readjust more often, which is good if the file contains different kinds of data with different probability distributions.
RSW2/RSW3/RSW4: When the packer finds an LZ match, it compares the size of the match to the size of the literal bytes and chooses the smallest option. However, near the beginning of the file, it is often best to accept matches even if they are slightly larger than the literal data. Decreasing these values will make the packer more likely to pick matches near the beginning, for matches of length 2, 3 and 4, respectively.
MINI: Use a smaller decompression header which only supports one non-empty section and does not support relocation. The argument is the maximum distance at which matches can be found with both odd and even offsets. Beyond this distance, only matches with even offsets can be found (these are the most common when compressing code). When the MINI option is specified, there are extra limits on MAXDIST, and the OCC??? and RSW? options have no effect.

If this still seems Chinese to you, just revert to my original suggestion. :)
Blueberry
Member
#29 - Posted: 2 Dec 2008 20:05
Correction: The RSW? options do have an effect in MINI mode. Only the OCC??? options are ignored.

RSW2 900 is usually a good place to start when tweaking these options.
sp_
Member
#30 - Posted: 3 Dec 2008 13:36
I have a suggestion.

can you add an option for me?

Crunch my exe with all different options until it finds the smallest parameter combination to crunch my file. :D

The cruncher can keep crunching for days or weeks; it doesn't matter. All I want is the smallest size possible.
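Such an exhaustive sweep is easy to script around any cruncher; as a stand-in sketch, here is the same idea over zlib's parameters (the real option grid, MAXDIST, RSW2 and friends, would come from the cruncher itself):

```python
import itertools
import zlib

def best_params(data, levels=range(1, 10), wbits_options=(9, 12, 15)):
    # Exhaustively try every (level, window size) combination and keep
    # whichever produces the smallest output.
    best = None
    for level, wbits in itertools.product(levels, wbits_options):
        comp = zlib.compressobj(level, zlib.DEFLATED, wbits)
        size = len(comp.compress(data) + comp.flush())
        if best is None or size < best[0]:
            best = (size, level, wbits)
    return best
```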