Author |
Message |
sp_
Member |
Copper chunky limits
OCS: The copper can move 57 12-bit colors per scanline with 4 bpl or less.
AGA: The copper can move 57 12-bit colors per scanline with 8 bpl (fetchmode %11).
AGA:
2 scanlines give 114 colors; minus the colorbank switches (4 moves) that leaves a max resolution of 110x128, using a mask in the bitplanes to create the display.
MAX resolution:
220x256 (2x2)
330x256 (3x2)
OCS:
In 5 bpl a copper move uses 12 pixels while the screen is being displayed, and 8 pixels outside the screen.
(The screen width must be smaller than 200 pixels, and color 0 is used as a mask.)
2x2: 31 + 32*4/12 + 10*4/12 + 3*4/12 = around 45 pixels per scanline possible, 2x45 = 90
3x2: 31 + 32*6/12 + 16*6/12 + 8*6/12 + 4*6/12 + 2*6/12 > 57, so 57 pixels per scanline possible, 2x57 = 114
90x256(2x2)
200x256(3x2)
320x256(4x2)
*
I have searched for 2x2 copperchunky in AGA demos and failed to find any. For the A500, I don't know if 3x2 copperchunky effects have been made yet.
|
dodke
Member |
it is nice to see someone else being interested in copperchunky :)
as some of you know multicolor used a 80x100 resolution in 4x2 giving a 320x200 sized screen with pixels alternating between 2 scanlines. I'm not sure how many times this had been used before.
I quite like the amount of colours you can have with a copperchunky screen on an a500 even if it looks a little strange. The pattern could be improved still i guess, at least by using a proper interlaced mode for it. the fact that you can easily get 1x1 overlays for free is also nice.
personally i'm not so much into "narrow" chunky-modes and would prefer keeping things 320 pixels wide.
I'm thinking it might be possible to do an almost fullscreen 3x2 copperchunky with ocs when you use 4 bpls. change 15 colours outside of the screen and then start changing again after the first colour has been shown. the copper couldn't keep up of course but it would have a 14 colour lead and maybe people wouldn't notice if there were a few 4x2 wide pixels in there somewhere.
|
sp_
Member |
4 bpl with 3-pixel-wide pixels could give fullscreen with the correct setup. 16 colors are prefetched outside the screen. Each pixel on a scanline is 6 lowres pixels wide, with a black mask that removes 3 of them. Since 16 colors are already stored in the registers, the copper is able to load 12 new ones while these 16 colors are displayed. While the next 12 colors are displayed, the copper will load 9 new ones, etc.
16*6 / 8 = 12
12*6 / 8 = 9
...
16 + 12 + 9 + 6.75 + 5.06 + 3.8 + 2.85 + 2.14... gives around 57, which is full bandwidth.
Even pixels are generated on one scanline, and odd pixels on the next. This doubles the resolution. It is the same technique you used in your demo to get 4x2 res.
3*2 fullscreen should be possible even on the a500. I have to try it out :D
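The lead calculation above can be checked numerically. This is only a sketch in Python, assuming (as in the post) that every color is on screen for 6 lowres pixels and that one copper move costs 8 lowres pixels. The function name and its parameters are made up for illustration.

```python
def moves_until_cap(preloaded, pixel_width=6, move_cost=8, cap=57):
    """Add up the colors that can be shown on one scanline.

    While a batch of n colors is displayed, the copper loads
    n * pixel_width / move_cost new ones. Stop when the running total
    reaches `cap` copper moves (or the batches become negligible).
    Returns (total, rounds).
    """
    total = 0.0
    batch = float(preloaded)
    rounds = 0
    while total < cap and batch > 0.01:
        total += batch
        batch = batch * pixel_width / move_cost
        rounds += 1
    return min(total, cap), rounds


total, rounds = moves_until_cap(16)
print(total, rounds)
```

With 16 preloaded colors the running total passes 57 in the eighth round, so under these assumptions the copper bandwidth and not the lead is the limit.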
|
sp_
Member |
The display will look nicer with a mask. Color 0 should be reserved for a black mask.
Then the calculations will be
15*6 / 8 = 11.25
11.25*6 / 8 = 8.44
8.44*6 / 8 = 6.33
6.33*6 / 8 = 4.75
4.75*6 / 8 = 3.56
3.56*6 / 8 = 2.67
2.67*6 / 8 = 2.00
2.00*6 / 8 = 1.50
15 + 11 + 8 + 6 + 4 + 3 + 2 + 2 + 1 = 52
52 * 3 * 2 = 312
A 312 pixel wide screen should be possible in 3x2 with a black mask.
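The same numbers can be reproduced with a few lines of Python. This is a sketch under the post's assumptions (15 usable colors because color 0 is the mask, 6 lowres pixels per color, 8 per copper move). Like the post, it rounds every batch down and stops after nine batches; the `rounds` parameter and the function name are my own.

```python
import math


def masked_width(preloaded=15, pixel_width=6, move_cost=8,
                 block_width=3, rounds=9):
    """Return (batches, colors per scanline, screen width in lowres pixels)."""
    batches = []
    batch = float(preloaded)
    for _ in range(rounds):
        batches.append(math.floor(batch))  # only whole colors count
        batch = batch * pixel_width / move_cost
    colors = sum(batches)
    # Two interleaved scanlines, each color block_width pixels wide.
    return batches, colors, colors * block_width * 2


batches, colors, width = masked_width()
print(batches, colors, width)  # [15, 11, 8, 6, 4, 3, 2, 2, 1] 52 312
```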
|
sp_
Member |
I have implemented copperchunky displays now. For AGA, bank swapping is not needed since 32 colors are enough to cover the whole screen.
This is the maximum resolution I get:
Using 5 bitplane mask
OCS: 196x256 (2x2)
AGA: 226x256 (2x2)
Using 4 bitplane mask
OCS: 318x256 (3x2) (can get wider than 318)
AGA: 318x256 (3x2) (can get wider than 318)
|
Azure
Member |
In case you want to revive copperchunky for AGA, you should probably look into the trick used by Chaos in roots: he is not using a bitplane mask, but a sprite mask. By switching sprite DMA off, the mask can be displayed without hogging memory bandwidth. Since the number of colors in sprites is limited, this requires some nasty horizontal switching of the CLUT mask register.
|
coyote
Member |
Hey sp_,
I've written about this once in some other thread, but I don't think you noticed it.
On OCS/ECS Amigas you can use 5 bpls and still have copper moves use only 8 pixels all the time!
The trick is to set bplcon0 to 7 bitplanes (yep, this is something I found back in the 90s; I don't know if anyone else is aware of it).
This will open 5 bpl mode (or probably 6 bpl, I don't remember any more), but only 4 bitplanes are read from memory by DMA, while for the fifth bitplane the data in bpl5dat is used for free, without DMA. Simply fill the data register with the value you need. This is convenient if you only need some kind of pattern in the 5th bitplane, which you usually do when making copper chunky.
Of course, for this trick you need an actual A500, since this is still not implemented in WinUAE.
|
sp_
Member |
The 5bpl screen with 4bpl DMA sounds too good to be true, but it's worth testing out. I don't have an Amiga anymore, and in WinUAE setting 7bpl doesn't give any speedup.
.
Masking with sprites. Interesting. Have to check it out.
|
coyote
Member |
Well, I thought that masking with sprites was how everybody did 4x4 copper chunky... I know I did it in my demo "B2" back in 1993. Never dissected other demos to see how copper chunky was done.
If I remember it correctly I used non-dma 16 color sprites as masks for colors $11-$1f, and 4 bitplanes for the first 16 colors. Made a 53 wide copper chunky (4x4). Although I think that I missed a simple way to do 54. If you do dissect it please do not look at the awful zoom-rotator I programmed. It's a 3 frame piece of crap.
I remember that at the demo party where I released my demo I saw a 56 wide, whole-screen-tall 4x4 copper chunky in some demo. I was really depressed. Btw, does anyone know which demo this is? I would like to see it again...
Anyway, at that time I already knew the "7bpl" trick, but I wasn't sure if it would work on all ECS Amigas so I didn't use it. Later I made a short intro using this trick. In it I made 60 or 62 columns (don't remember), although it was 3x3 plus 1 black pixel around it. Since the copper can do 57 moves I couldn't change all colors in one scanline, so 3 lines were done with a copper loop which repeatedly changed 47 (or something) colors, and in the last line I changed all 62 of them.
Btw, it seems that the WinUAE 1.5.4 beta 3 (Nov 29) has ecs 7bpl trick implemented, but with some problems with my intro. I *really* must find my intro and transfer it to PC to see how it works...
|
sp_
Member |
I have implemented a 3x2 (318x256) copperchunky that works on an OCS Amiga using 4 bpl. The display is masked with black pixels between each pixel. It looks similar to the copperchunky in multicolor by unique, but mine is 3x2, not 4x2.
http://ada.untergrund.net/showdemo.php?demoid=687
With the AGA chipset using 64-bit fetchmode, a 5bpl screen takes the same amount of DMA as a 0 bitplane screen. 5 bpl is enough to cover the whole screen (3x2), or 224x256 (2x2), with the black mask technique.
.
On OCS, 2x2 copperchunky can be done at almost full bandwidth (57 moves per scanline) by using 4bpl and sprites. The sprites will use color registers 16-31. Probably the sprite DMA channel should be switched off while displaying the bitplanes. Switching off DMA will steal 2 copper moves per scanline (2x2, 55 colors per scanline possible?). On OCS I tested a 5bpl mask and managed to change 49 colors per scanline.
.
If the 7bpl trick really works it will speed up my 4bpl c2p quite a bit, since I use a mask in the 5th bpl, and scrollregister to perform a hardware c2p merge.
.
I managed to run your demo B2( I had to upgrade winuae :D) I liked the flag, and the globe routine. Time for a comeback? Start coding !! :D
|
coyote
Member |
For sprites you don't need dma at all. Simply fill the sprite data registers to get a pattern repeated through the whole height of the screen for free.
7bpl trick works, and I can confirm that it now even works in the above mentioned v1.5.4 beta 3 of WinUAE! I tried my "Lazy Bones 1st Anniversary" intro and it worked. Not every time though, but I suspect that I have a bug in it. However now my other demo "Move Any Mountain" doesn't work any more, and it worked (with some problems) in v1.5.0 beta 1. (move any mountain has a copperscreen (8x1) voxel routine)
I am glad that you liked at least some things in B2. ;-)
B2 btw really is a lame product... But it was what I could do in those days. Note that B2's zoom rotator still doesn't work perfectly in WinUAE - the right part of the rotator is corrupted. Some copper timing issue in WinUAE I'd say. And the flag also sometimes works ok, and sometimes the chequered part is corrupted.
Globe uses HAM mode with only 5 bitplanes. I used HAM to automatically fill horizontal lines, probably have put something in BPL6DAT, I don't remember. In short HAM5 is responsible for those strange yellow/green colors. Because of the HAM5 mode at least v1.5.0 of WinUAE is needed to run B2.
And about the comeback: I have been thinking about it, but to be honest I am intimidated by you guys who are doing some really amazing stuff with such old hardware! Not to mention that I was away from the Amiga for so long that when I looked at the sources today, while searching for my 7bpl intro, I was quite confused. Probably because I didn't comment my code much in those days... It seems to me that I would have to re-learn too much stuff to be able to make anything worthwhile.
|
Rebb
Member |
Newbie question: You can only change every 8th pixel, right? If i got the idea of copper chunky right? (Change the background color $180 to paint the screen).
Didn't find any good information about copper chunky, only few hits on google. So little tutorial pretty please.
|
sp_
Member |
Changing the background color $180 will work for an 8x1 display. One rasterline is 456 lowres pixels wide, but only around 320 are visible. If you fill more color registers ($180, $182, $184...) with the copper outside the visible display, and create a bitplane mask or sprite mask, you can get a better resolution than 8x1. The copper can change 57 colors per scanline. If the pixel size is doubled on the y axis, 57*2 colors are possible.
A bitplane mask can be: (byteprpixel)
scanline1: 111000222000333000444.. Every even pixel of the display is shown here.
scanline2: 00011100022200033300.. Every odd pixel of the display is shown here.
With this mask color #0 is set to black. This black mask will make the screen darker, but also more correct.
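A mask like the one above can be generated and split into bitplanes with a short script. This is just a sketch in Python to show the layout: the function names are made up, the digits are color register numbers, and a real routine would pack the bits into words in chip memory.

```python
def mask_line(registers, block=3, odd=False):
    """Color register number for every lowres pixel of one mask scanline.

    Each register is shown for `block` pixels, followed by `block` pixels
    of register 0 (black). The odd scanline starts with the black gap.
    """
    line = []
    for reg in range(1, registers + 1):
        shown = [reg] * block
        gap = [0] * block
        line.extend(gap + shown if odd else shown + gap)
    return line


def to_bitplanes(line, planes=4):
    """Split register numbers into one list of bits per bitplane."""
    return [[(reg >> p) & 1 for reg in line] for p in range(planes)]


even = mask_line(3)
odd = mask_line(3, odd=True)
print("".join(map(str, even)))  # 111000222000333000
print("".join(map(str, odd)))   # 000111000222000333
print(to_bitplanes(even, planes=2)[0])
```

The mask itself never changes. Only the color registers behind it are rewritten by the copper, which is why the 57 moves per scanline are the real limit.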
|
sp_
Member |
coyote,
I hope many of the people who used to code on the Amiga will come back. I remember I had a lot of fun in the nineties with the demoscene, but after I got a full-time job as a programmer the inspiration was gone. 2 years ago I quit my job and moved to Thailand. Warm climate, beautiful women and cheap beer. hehe :D Slowly my inspiration for coding is coming back.
I used money on fast cars and women, the rest I wasted.. (George Best)
.
|
coyote
Member |
hehe
Thailand - well that's cool. ;-)
|
Alexco
Member |
Hi sp_
could you please explain to me what you mean with
"and scrollregister to perform a hardware c2p merge. "?
And only for my understanding, 4x2 means one pixel is translated to a 4x2 block? So in your scanline example both lines refer to the same colours?
|
sp_
Member |
4x2 means a 4x2 block of lowres pixels represents one pixel: 4 lowres pixels on the x axis and 2 on the y axis, so 8 lowres pixels are used to display one pixel. In my example both scanlines represent one line. Even pixels are rendered on line one and odd pixels on line two. The chunky buffer is not linear.
scanline1: aaa000ccc000eee000ggg..
scanline2: 000bbb000ddd000fff000..
First pixel(3 lowrespixels with correct color a, and 3 black.):
aaa
000
next pixel(3 lowres pixels with correct color b, and 3 black.)
000
bbb
.
By scrolling the odd bitplanes one pixel and putting a mask (01010101) in the 5th bitplane, it's possible to get a c2p merge for free. Bitplane pointers 0 and 1 point to the same memory block,
and 2 and 3 to the same memory block. When using this technique you don't need to duplicate a pixel to achieve a 2x1 resolution. This means that the blitter only needs to convert
half the data needed in other 2x1 c2p's. This technique was called blitterscreen, and was used in many demos, e.g. Closer by CNCD from 1995.
The display is masked with black pixels in between.
For 2x2: only 2 of the pixels in the block have the right color. and 2 pixels are black.
a0b0c0d0....
0a0b0c0d....
In my final routine the masked pixels have 2-bit precision and the others have 4-bit precision. I use the two most significant bits to produce a 2-bit pixel in between the 4-bit pixels.
This makes the display brighter and smoother.
Big letters are 4-bit precision, small letters are 2-bit precision. The 2-bit precision pixels are adjusted in color registers 16-31.
AaBbCcDd....
0AaBbCcD....
.
You can find more information about my c2p for the A500 in this thread, with a link to the source code. You need to scroll down a bit to find the final source code. My first attempt was slow.
http://ada.untergrund.net/forum/index.php?action=vthread&forum=4&topic=217&page=0
|
sp_
Member |
The fastest way to plot to a copperchunky screen might be to use a linear word-per-pixel buffer and a blitter pass that copies the colors into the copperlist.
Example. A zoomrotate innerloop using smc,
Without a blitterpass
move.w xxxx(a0),xxxx(a1)
move.w xxxx(a0),xxxx(a1)
move.w xxxx(a0),xxxx(a1)
....
20 cycles pr pixel.
With blitterpass
move.w xxxx(a0),(a1)+
move.w xxxx(a0),(a1)+
move.w xxxx(a0),(a1)+
...
16 cycles pr pixel
..
When using my 4bpl C2p a similar innerloop goes down to 13.25 cycles pr pixel.
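To get a feeling for what these cycle counts mean, here is a rough budget in Python. It is only a sketch: the PAL clock and refresh values are my assumptions about a stock A500, and it ignores chip RAM contention, loop overhead and the cost of the blitter pass itself, so real numbers will be lower.

```python
PAL_CPU_HZ = 7_093_790  # assumption: 68000 clock of a PAL A500
FRAME_HZ = 50           # assumption: PAL refresh rate


def pixels_per_frame(cycles_per_pixel, frames=1):
    """Upper bound on the pixels the CPU can plot in `frames` frames."""
    cycles = PAL_CPU_HZ / FRAME_HZ * frames
    return int(cycles // cycles_per_pixel)


for label, cost in [("direct to copperlist", 20),
                    ("via linear buffer", 16),
                    ("4bpl c2p innerloop", 13.25)]:
    print(label, pixels_per_frame(cost), pixels_per_frame(cost, frames=2))
```

The second column is the budget at 25 fps (two frames per update).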
|
sp_
Member |
I have tested the 7bpl trick in the latest version of WinUAE (1.5.3, 2008.11.09).
When setting 7bpl I get a 4bpl screen. The 5th bitplane is not visible. Tested with OCS/ECS.
|
coyote
Member |
sp_,
You must download beta version not the latest release...
The first one in which 7bpl is implemented is 1.5.4 Beta 3.
So 1.5.4 Beta3, or 1.5.4 Beta4 should be used.
And then you must fill the bpl5dat register with the processor once, and it will be used as a pattern for the 5th bitplane. Of course you can fill bpl5dat with the copper if you need to change the pattern from one scanline to the next. Finally, you can also use bpl6dat if you can find a use for a half-brite pattern. If you don't need half-brite, you should probably fill bpl6dat with zero to avoid a possible residual pattern.
Anyway, here is the intro I wrote back in 1994 which uses 7bpl trick:
http://free-bj.t-com.hr/coyote/anniversary.exe
The proper way it should look is 61 columns of 3x3 pixels (4096 color) with horizontal and vertical black lines (1px) separating each element.
For this you need at least v1.5.4 Beta 3. In older versions you will see elements 8x3 pixels and only horizontal black lines will be seen.
Sometimes this intro does not start right (it starts totally corrupted), but exit it with the left mouse button and start it again, eventually it will show as it should. (I have never experienced this on an actual Amiga 500. There it would always start ok.)
More about 7bpl can be found here:
http://eab.abime.net/showthread.php?t=41091
More about v1.5.4 Beta releases here:
http://eab.abime.net/showthread.php?t=40738
WinUAE v1.5.4 Beta 4 here:
http://www.winuae.net/files/b/winuae_1540b4.zip
|
sp_
Member |
I downloaded the latest beta and it is working!!! 5bpl with 4bpl DMA.. Amazing!! All my c2p effects got faster.
I tested a 320x200 table effect and it got 13% faster. In 320x256 the gain is much more! Incredible!
I fill $dff118 (BplDat5) with %1010101010101010 and $dff11a (BplDat6) with 0, and set Bplcon0 to $7200.
This is magic
Thanks!!
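For anyone wondering what $7200 actually sets, here is a small decoder sketched in Python. It only looks at the OCS/ECS BPLCON0 bits that matter here (bitplane count in bits 14-12, HAM in bit 11, color enable in bit 9). The function name and the dictionary of register writes are just for illustration, not real code from the routine.

```python
def decode_bplcon0(value):
    """Split a BPLCON0 word into the fields relevant here (OCS/ECS layout)."""
    return {
        "hires": bool(value & 0x8000),           # bit 15
        "planes": (value >> 12) & 0x7,           # bits 14-12 (BPU2-BPU0)
        "ham": bool(value & 0x0800),             # bit 11
        "dual_playfield": bool(value & 0x0400),  # bit 10
        "color": bool(value & 0x0200),           # bit 9
    }


# Register writes described in the post (addresses as given there).
writes = {
    0xDFF118: 0b1010101010101010,  # BPL5DAT: static pattern for plane 5
    0xDFF11A: 0x0000,              # BPL6DAT: cleared
}

fields = decode_bplcon0(0x7200)
print(fields)
```

So $7200 is simply "7 bitplanes, lowres, color on": a plane count the hardware does not officially support, which is where the trick comes from.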
|
coyote
Member |
Amiga IS magic!
:)
Several months ago when I read your "A500 demo for BP 2008" thread I knew that my trick could be helpful. (I don't know if you noticed my post back then...)
Fortunately, Toni Wilen recently fully implemented it in WinUAE because of my Lazy Bones 1st Anniversary intro...
Anyway, I am curious if anyone else has ever found or used this behavior.
Btw, Toni listed several other undocumented hardware things here:
http://eab.abime.net/showthread.php?t=19676
Who knows, maybe you find something else that you can use... ;)
|
sp_
Member |
Maybe there is a similar 9bpl trick in the AGA chipset.
set bplcon0 to $1210 (9bpl) and fill BPLdat5-8($118,$11a,$11c,$11e) with masks.
If this uses 4bpl DMA it will mean a 12-bit chunky-to-planar revolution.
.
A 12-bit/24-bit c2p uses the last 2 bitplanes as a static mask in HAM mode. The mask represents which color component to change at each pixel (RGBB).
For a 1x1 resolution the screen must be set to superhires (4 pixels form one lowres pixel). The problem is that a 6bpl superhires screen takes a lot of DMA. A 4bpl superhires screen is as fast as a 4bpl lowres screen (with 64-bit fetch). If this trick works, fast 1x1 truecolor can be done!
.
I tested this in winuae, but it doesn't seem to work. If anyone could test this on a real AGA amiga it would be great.
|
sp_
Member |
I have done some more timing with a table effect (13.25 cycles per pixel) + blitter c2p. Audio DMA switched on. I change modulo and scroll register on each scanline, plus 16 colors in the copper for every 16th raster line. Blitter interrupt with 6 blits, and a vbl interrupt. When the effect has finished rendering I set the blitter nasty bit to give the blitter 100% DMA.
A 320x256(2*2) screen renders 17 % faster with the 7bpl trick.
A 320x226(2*2) screen renders 13 % faster
With 7bpl trick the biggest resolution that runs in 25fps is 320x228 (2x2)
With 5bpl the biggest resolution that runs in 25fps is 320x202 (2x2)
I get 12.8% more pixels with the same framerate.
Some data. Screen Height - number of rasterlines.
7bpl 256 - 718
7bpl 226 - 619
7bpl 228 - 625
5bpl 203 - 622
5bpl 206 - 634
5bpl 210 - 647
5bpl 226 - 700
5bpl 256 - 840
Timed using winuae 1.5.4 public beta 4
wb1.3
OCS cycle exact
512kchip 512kfast
Mc68000 match a500 speed.
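The percentages can be recomputed from the table above. A small sketch in Python, where the dictionary just restates the measured rasterline counts from this post and the helper name is made up:

```python
# Rasterlines needed to render one frame, keyed by (mode, screen height).
timings = {
    ("7bpl", 256): 718, ("7bpl", 226): 619, ("7bpl", 228): 625,
    ("5bpl", 203): 622, ("5bpl", 206): 634, ("5bpl", 210): 647,
    ("5bpl", 226): 700, ("5bpl", 256): 840,
}


def speedup_percent(height):
    """How much longer plain 5bpl takes than the 7bpl trick, in percent."""
    return (timings[("5bpl", height)] / timings[("7bpl", height)] - 1) * 100


print(round(speedup_percent(256)))       # 17
print(round(speedup_percent(226)))       # 13
print(round((228 / 202 - 1) * 100, 1))   # 12.9
```

The last line is the gain in screen height at 25 fps (228 lines against 202). It comes out at 12.9 when rounded, where the post truncates to 12.8.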
|
coyote
Member |
Can't you do on aga similar to what can be done on ocs/ecs?
In other words: turn on the HAM bit but turn on only 4 bitplanes. This will initiate the HAM mode in which 4 dma channels will be used for bpldat1-4 while bpldat5 and bpldat6 will be used as a pattern. (well, at least a500 behaves like this)
Simply fill them once and that's it.
|
sp_
Member |
It doesn't work for AGA. The creator of WinUAE has tested it. Setting 9-15 bitplanes gives 0 bitplanes, not 4bpl as on OCS/ECS.
|
coyote
Member |
No, no. You didn't understand me. (I've read the thread on eab.abime.net too...)
I'm talking of a different thing. I call it ham4 and ham5 modes.
On the a500 you do not need to activate all 6 bitplanes in bplcon0 to get the ham mode. You can set fewer bitplanes (for example 4 or 5) and still turn on the ham mode. Then you get the ham mode and Denise uses the data from all six bpldat registers to generate the screen, while Agnus fills the bpldat registers with DMA only up to what you set in bplcon0. (Yeah, it's somewhat similar to the 7bpl trick, at least if you look at the DMA accesses, but it's achieved differently.)
I'll be more clear: for ham4 on amiga500 you put 4 bpl in bplcon0, and turn on the ham bit, and fill the bpldat5 and bpldat6 with the pattern you need and that's it.
I used ham5 for my morphing globe routine in B2. This way I save some dma accesses since ham5 takes less dma from processor than ham6.
I just don't know if the same technique can be used on AGA. Someone should test it.
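The ham4/ham5 setup described above can be written down as a small model. This is a sketch in Python and not tested on hardware: it only builds the BPLCON0 value (HAM is bit 11, color enable bit 9, plane count in bits 14-12) and lists which BPLxDAT registers would be DMA-fed and which would keep a static pattern, following the description in this post. Both function names are made up.

```python
def bplcon0_ham(dma_planes):
    """BPLCON0 value with `dma_planes` bitplanes, HAM and COLOR bits set."""
    if not 1 <= dma_planes <= 6:
        raise ValueError("OCS/ECS lowres has at most 6 bitplanes")
    return (dma_planes << 12) | 0x0800 | 0x0200


def plane_sources(dma_planes):
    """Planes up to `dma_planes` are fetched by DMA; the remaining BPLxDAT
    registers keep whatever the CPU or copper wrote into them."""
    return {f"BPL{n}DAT": ("dma" if n <= dma_planes else "static")
            for n in range(1, 7)}


print(hex(bplcon0_ham(4)))  # 0x4a00
print(plane_sources(4))
```

For ham4 that means BPL5DAT and BPL6DAT are the two registers to fill by hand, and for ham5 only BPL6DAT.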
|
sp_
Member |
Brilliant idea. Again!!
I have tested this in the latest WinUAE beta and can confirm that the 4bpl HAM mode works with the OCS and ECS chipsets. With the AGA chipset it doesn't work (in the emulator), but it should be tested on a real Amiga.
|
coyote
Member |
Maybe we should ask Toni to try it on a real AGA machine... ;)
|
sp_
Member |
If this works, a 320x200 1x1 12-bit truecolor screen can be converted in around 337 raster lines (copy speed). I wonder what happens if I set 7bpl and the HAM bit. Maybe it's possible to do HAM8 with 7bpl DMA..
Fast enough to do something nice on fast amigas.
|