A.D.A. Amiga Demoscene Archive





AGA Scroll


#1 - Posted: 8 Mar 2017 13:23
The AGA scroll register (BPLCON1) has 8 bits of scroll value per playfield, expanded from the 4 bits available on OCS.
These 8 bits should give a 64-pixel lores scroll range (6 bits) plus 2 more bits for subpixel scroll.

https://www.ikod.se/references/amiga-aga-guide/new-features-for-aa/ <- search for bplcon1

However, I can't seem to use this new expanded scroll range. I seem to be limited to an 8-pixel scroll range (yes, not even 16, as would seem logical with 4 bits).

How is the AGA scroll register enabled?

I'm guessing this expansion was made because of the new 64-bit fetch mode, where bitplane pointers need to be 8-byte aligned in order to display correctly. One would need to scroll across a 64-pixel range using the scroll registers before a new valid bitplane pointer address is within reach.

Currently I scroll by setting bitplane pointers on byte boundaries and writing 3 bits into the scroll values. However, this only works in FS-UAE; on the real thing I get a black screen.
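
A minimal Python sketch of the coarse/fine split described above, assuming 64-bit fetch mode, lores pixels and one bit per pixel per plane. The function name is made up for illustration. The direction convention is also an assumption: on real hardware the scroll value delays the picture to the right, so the final pointer and scroll value may need adjusting for a given setup.

```python
def split_scroll(x_pixels, fetch_bits=64):
    """Split a horizontal position into (aligned byte offset, fine scroll in pixels).

    The byte offset is always a multiple of fetch_bits/8, so a bitplane
    pointer that starts out aligned stays aligned after adding it.
    """
    coarse_pixels = (x_pixels // fetch_bits) * fetch_bits
    byte_offset = coarse_pixels // 8          # 8 pixels per byte in one bitplane
    fine_pixels = x_pixels - coarse_pixels    # 0 .. fetch_bits-1, goes to BPLCON1
    return byte_offset, fine_pixels


if __name__ == "__main__":
    for x in (0, 63, 64, 130):
        print(x, split_scroll(x))
```

With fetch_bits=64 the fine part needs the full 0-63 range, which is why a 16-pixel scroll register would not be enough in this mode.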
#2 - Posted: 8 Mar 2017 13:37
And yes, I do attempt to shift the bits around so they fit the BPLCON1 spec, as below:

Bit  Name
15   PF2H7=0
14   PF2H6=0
13   PF2H1=0
12   PF2H0=0
11   PF1H7=0
10   PF1H6=0
09   PF1H1=0
08   PF1H0=0
07   PF2H5
06   PF2H4
05   PF2H3
04   PF2H2
03   PF1H5
02   PF1H4
01   PF1H3
00   PF1H2

PF2Hx = Playfield 2 horizontal scroll code, x = 0-7
PF1Hx = Playfield 1 horizontal scroll code, x = 0-7

Where PFyH0 = LSB = 35ns SHRES pixel. Bits have been renamed, old PFyH0 now PFyH2, etc.
Now that the scroll range has been quadrupled to allow for wider (32 or 64 bits) bitplanes.
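
For reference, the bit shuffling for this layout could look something like the following Python sketch. It assumes each playfield's scroll value is an 8-bit number whose bit n corresponds to PFyHn (so a value of 4 is one lores pixel); the function names are made up for illustration.

```python
def _scatter(value):
    """Spread one 8-bit scroll value over the playfield 1 bit positions."""
    value &= 0xFF
    sub = value & 0x03           # H0-H1: sub-pixel bits
    mid = (value >> 2) & 0x0F    # H2-H5: the original OCS scroll bits
    high = (value >> 6) & 0x03   # H6-H7: extra range for wide fetch modes
    return mid | (sub << 8) | (high << 10)


def pack_bplcon1(pf1_scroll, pf2_scroll):
    """Build the 16-bit BPLCON1 word from two 8-bit scroll values."""
    # Playfield 2 uses the same pattern, shifted up by 4 bits.
    return _scatter(pf1_scroll) | (_scatter(pf2_scroll) << 4)


if __name__ == "__main__":
    print(hex(pack_bplcon1(4, 4)))        # one lores pixel on both playfields
    print(hex(pack_bplcon1(0xFF, 0xFF)))  # every scroll bit set
```

Writing the three fields out separately makes mask and shift typos easier to spot than a single long expression.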
#3 - Posted: 8 Mar 2017 16:04
I did a quick test and it seemed that 0-63 indeed works only with the 64-bit fetchmode. 0-15 should work like on OCS.

Are you sure the register is kept the same and not modified by anything else? (OS running in the background?)
#4 - Posted: 9 Mar 2017 13:36
hmm if it works for you just like that using FMODE=0x000f and setting the value in the scrollreg then I need to check to see whether there is something else fishy going on.
#5 - Posted: 9 Apr 2017 13:59 - Edited
Seems there were multiple problems with my scrolling. Posting here for self-shame, and hopefully it can help others later.

- My scroll register setter function's shift amounts/masks had some typos (facepalm)

- DDFSTRT was not set to fetch earlier than for a non-scrolled display. I ended up using 0x38-0x20 for DDFSTRT. The hw ref manual suggests the value to be 0x38-0x8 = 0x30. But this applies to OCS bitplane fetch mode. I had set 64-bit fetch mode because I wanted to try the AGA sub-pixel scrolling. 0x20 is 0x8*0x4 and 16 bits*4=64 bits so the value of 0x20 makes sense.

- Modulos needed to be set to compensate for the extra words fetched after changing DDFSTRT. Ended up with modulo=bytes_until_next_line-8 instead of original value of modulo=bytes_until_next_line used when not scrolling.

- The scrolled image was not a multiple of 64 pixels wide, so after applying the modulo at the end of line 1, the resulting address for line 2 would not be aligned to 64 bits, causing corruption (only on a real Amiga).

There is still one more problem, which is only seen on an actual Amiga. I'm only scrolling half the bitplanes (0,2,4,6) while the rest stand still (1,3,5,7). It looks like 0,2,4,6 are x-offset by ~64 pixels relative to each other. I'm guessing that the individual bitplanes are not all aligned to 64 bits, although I can't see any rogue addresses in UAE.
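
Since UAE does not show the corruption, an offline check of the line start addresses may help. The Python sketch below assumes each line advances by the number of bytes the hardware fetches plus the modulo register; the function name and example numbers are hypothetical.

```python
def misaligned_lines(start, fetched_bytes, modulo, lines, align=8):
    """Return (line, address) pairs whose start address is not a multiple of align.

    start         : bitplane pointer for line 0
    fetched_bytes : bytes the hardware fetches per line for this plane
    modulo        : BPLxMOD value (may be negative)
    """
    bad = []
    addr = start
    for y in range(lines):
        if addr % align:
            bad.append((y, addr))
        addr += fetched_bytes + modulo
    return bad


if __name__ == "__main__":
    # 48 bytes fetched per line with modulo 0: every line stays 8-byte aligned.
    print(misaligned_lines(0x20000, 48, 0, 4))
    # A modulo of 4 breaks the alignment on every other line.
    print(misaligned_lines(0x20000, 48, 4, 4))
```

Running it once per bitplane pointer would show whether only some of the planes start on a bad address.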
#6 - Posted: 18 Apr 2017 16:21
Mff... I spent some weeks on AGA parallax some months ago...
You have to align bitmap memory to 4 or 8 bytes, depending on whether FMODE is X2 or X4. UAE doesn't care about the alignment, so watch out.
The data fetch and screen position are also affected, and of course the modulo should also be aligned to 4 or 8 bytes.
Since you have to start reading the bitmap very early at the left, it stupidly disables 7 of the 8 sprites, jsyk.
In X2 you have to align to 4 bytes and start the data fetch later, and you still have 5 sprites. (And ECS/OCS scroll disables 1 sprite.)
Finding correct data fetch values that give a correctly aligned modulo is tricky. I don't have the values here; I will post them.

I actually did a huge "copperlist compiler" - close to a video driver for my demo to check for that.

#7 - Posted: 19 Apr 2017 13:44
Thanks! Seeing your demo from Revision, I gather you are quite an expert at scrolling now heheh

There must be some kind of misalignment somewhere in my setup.
#8 - Posted: 19 Apr 2017 23:02
For anything AGA register related there is this:


Now, about setting up a copperlist with any FMODE, with or without scroll.
All of the following is about a 320x256 PAL screen, low res:

As my demo uses any of (what I call) X1, X2 or X4 burst mode, all my AGA copperlists start with "fmode,value", simply because everything can be changed each frame.
(Actually my fmode values are always or'ed with $00c0, because these bits are the 64-pixel sprite mode, and it doesn't affect the following.)

Why use fmode? Because with X4 the blitter is 2 times faster, somewhat faster in X2, and I suspect the CPU itself is faster as far as Chip RAM is concerned.

This is the classic window position; it does not change with fmode or scroll. (It would change for NTSC or overscan):

diwstart ($08e) $2c81
diwstop ($090) $2cc1

Then you specify the data fetch values (ddfstrt $092, ddfstop $094).
These differ according to fmode AND according to whether you use horizontal scroll, in which case the screen has to start reading the bitmap earlier at the left.
The "implied modulo", the number of bytes eaten by one plane line, also follows from these values.

Note the modulo you specify in bpl1mod,bpl2mod is *added* to the "implied modulo" to find the start
of the next scanline, and can be negative. (-32768,32767) Usually you set "your BM byte width - implied modulo" there.

An easy way to read the following: ddfstop changes according to fmode; ddfstrt and the implied modulo
change according to both scroll (or not) and fmode.
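
As a small illustration of the modulo rule above, here is a Python sketch. It assumes a non-interleaved bitmap where consecutive lines of a plane are bitmap_bytes_per_line apart; the function names are made up for this example.

```python
def bplmod_value(bitmap_bytes_per_line, implied_modulo):
    """Signed value for bpl1mod/bpl2mod: bitmap byte width minus implied modulo."""
    mod = bitmap_bytes_per_line - implied_modulo
    if not -32768 <= mod <= 32767:
        raise ValueError("modulo does not fit in a signed 16-bit register")
    return mod


def as_register_word(mod):
    """Encode a signed modulo as the 16-bit word written by the copper."""
    return mod & 0xFFFF


if __name__ == "__main__":
    # 40-byte wide bitmap, X1 with scroll (implied modulo 42): modulo is -2.
    m = bplmod_value(40, 42)
    print(m, hex(as_register_word(m)))
    # 80-byte wide bitmap, X4 with scroll (implied modulo 48): modulo is 32.
    print(bplmod_value(80, 48))
```

Note that the result also has to respect the alignment rule of the chosen fmode, which this sketch does not check.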

- - - - - - - - - - - - X1
Classic ECS/OCS and AGA with fmode=$0000:

bitmap start pointer and modulo aligned to 2 bytes.

For no horizontal scroll:
ddfstrt: $0038
ddfstop: $00d0
implied modulo: 40
Sprites available: 8

For horizontal scroll:
ddfstrt: $0038-$08
ddfstop: $00d0
implied modulo: 42
Sprites available: 7
Bits PF1H6/PF1H7 and PF2H6/PF2H7 of BPLCON1 are useless; it scrolls over a 16-pixel range.

- - - - - - - - - - - X2
the X2 mode:
fmode is either $0001 ("32 wide mode") or $0002 ("double CAS"). I couldn't find any difference in speed between these 2 modes :)

bitmap start pointer and modulo aligned to 4bytes (or die).

X2 no scroll
ddfstrt: $0038
ddfstop: $00c8 <- watch out !!
implied modulo: 40
sprites available: 8

X2 with scroll
ddfstrt: $0038-$10
ddfstop: $00c8 <- !!
implied modulo: 44
sprites available: 5
Bits PF1H6 and PF2H6 of BPLCON1 useful
(Bits PF1H7 and PF2H7 of BPLCON1 useless.) <- scroll values on 32 pixels

- - - - - - - - - - -X4
the X4 mode:
fmode is $0003 (both 32wide and double CAS)

bitmap start pointer and modulo aligned to 8bytes (or die).

X4 no scroll:
ddfstrt: $0038
ddfstop: $00a0 <- !!
implied modulo: 40
sprites available: 8

X4 with scroll:
ddfstrt: $0038-$20
ddfstop: $00a0 <- !!
implied modulo: 48
sprites available: 1
PF1H6/PF1H7,... PF2H7 all bits useful in BPLCON1, can scroll 64 pixels wide (+subpixels of course).

- - - - - - - -

That's all ! Now you understand why the last screen of catabasis doesn't scroll horizontally:
still X4 and all sprites enabled.
I will certainly release source of "catabasis" in some weeks anyway.
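
The values listed in this post can be collected into a small lookup table, for example in Python as below. The numbers are copied from the lists above (320x256 PAL lores); the table layout and names are only one possible way to organise them.

```python
from collections import namedtuple

Setup = namedtuple("Setup", "fmode ddfstrt ddfstop implied_modulo sprites align")

# Key: (burst mode, horizontal scroll enabled). X2 could also use fmode $0002.
DDF_TABLE = {
    ("X1", False): Setup(0x0000, 0x38,        0xD0, 40, 8, 2),
    ("X1", True):  Setup(0x0000, 0x38 - 0x08, 0xD0, 42, 7, 2),
    ("X2", False): Setup(0x0001, 0x38,        0xC8, 40, 8, 4),
    ("X2", True):  Setup(0x0001, 0x38 - 0x10, 0xC8, 44, 5, 4),
    ("X4", False): Setup(0x0003, 0x38,        0xA0, 40, 8, 8),
    ("X4", True):  Setup(0x0003, 0x38 - 0x20, 0xA0, 48, 1, 8),
}


def check_setup(mode, scroll, bitmap_start, modulo):
    """Return the table entry, raising if pointer or modulo break the alignment."""
    setup = DDF_TABLE[(mode, scroll)]
    if bitmap_start % setup.align or modulo % setup.align:
        raise ValueError("pointer and modulo must be multiples of %d" % setup.align)
    return setup


if __name__ == "__main__":
    s = check_setup("X4", True, 0x20000, 32)
    print(hex(s.ddfstrt), hex(s.ddfstop), s.implied_modulo, s.sprites)
```

A check like this is a much smaller cousin of the "copperlist compiler" idea: it only catches alignment mistakes before they reach real hardware.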
#9 - Posted: 20 Apr 2017 15:40
I never changed ddfstop! You might have posted the solution to my final issue there. Will give it a whirl on the weekend.
#10 - Posted: 25 Apr 2017 19:03
Bah.. changing ddfstop from 0xb8 to 0xa0 didn't change anything in my case.

I'm curious about the table for DDFSTRT and DDFSTOP in the aga guide.. What's Extra Wide, Wide, Normal, Narrow referring to?

DDFSTRT (Left edge of display data fetch)
Purpose H8 H7 H6 H5 H4
Extra Wide (Max) 0 0 1 0 1
Wide 0 0 1 1 0
Normal 0 0 1 1 1
Narrow 0 1 0 0 0

................ another thing that puzzles me
In the OCS/ECS hw ref manual it says DDFSTOP and DDFSTRT are related like this in lowres:

DDFSTRT = DDFSTOP-(8*(word count - 1))

Let's try it for FMODE = 0, no scroll:
If we plug in some values from Krabob's table of working stuff above: DDFSTOP=0xd0 and word count = 320/16 (assuming word size is 16-bit) it seems ok.

0xd0 - (8*(320/16-1)) = 0x38

For FMODE = 0 with scroll:
DDFSTOP=0xd0 and word count = 336/16.

0xd0 - (8*(336/16-1)) = 0x30

Both of them work very well!
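
The two checks above can be reproduced with a few lines of Python. This is only the lowres formula from the hw ref manual written out; the function name is made up.

```python
def ddfstrt_for(ddfstop, word_count):
    """OCS/ECS lowres relation: DDFSTRT = DDFSTOP - 8 * (word count - 1)."""
    return ddfstop - 8 * (word_count - 1)


if __name__ == "__main__":
    print(hex(ddfstrt_for(0xD0, 320 // 16)))  # FMODE = 0, no scroll
    print(hex(ddfstrt_for(0xD0, 336 // 16)))  # FMODE = 0, with scroll
```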

But then... moving on to X4 mode without scroll. What has this formula become? From stuff that works setting DDFSTOP = 0xa0 and DDFSTRT= 0x38. Then the question is what logical numbers should be in the middle to make it work? The 8* is related to color clocks for lowres so I think that won't change. So it must be wordcount-1 which changes.

0xa0 - (8*(wordcount-1)) = 0x38
-8*(wordcount-1) = 0x38-0xa0
(wordcount-1) = (0x38-0xa0)/-8
wordcount = 14

So what is a word count of 14 going to fetch us? For different word sizes 16,32,64:

14 * 16 = 224 bits of image data
14 * 32 = 448 bits of image data
14 * 64 = 896 bits of image data?

But we want 320 bits of image data I think..
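
The same rearrangement as a Python sketch, solving the lowres formula for the word count and listing the resulting bit counts. It deliberately keeps the assumption questioned above, namely that the factor 8 still applies in X4 mode; the function name is made up.

```python
def word_count_for(ddfstrt, ddfstop, cycles_per_word=8):
    """Invert DDFSTRT = DDFSTOP - cycles_per_word * (word count - 1)."""
    return (ddfstop - ddfstrt) // cycles_per_word + 1


if __name__ == "__main__":
    n = word_count_for(0x38, 0xA0)
    print("word count:", n)
    for word_bits in (16, 32, 64):
        print(word_bits, "bit words:", n * word_bits, "bits of image data")
```

None of the three word sizes gives 320 bits, which suggests the factor 8 itself has to change in X4 mode rather than the word size.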

#11 - Posted: 26 Apr 2017 09:12
The DMA hardware has a fixed 8-cycle sequence during which it does one fetch from each active bitplane. In X1-mode this sequence runs back-to-back, with start points at $18,$20,$28,$30,$38,... In X4 mode the fetch sequences have some idle time between them; fetches begin at $18,$38,$58,$78,..., with $18 cycles of idle time in between.

When you set DDFSTRT/DDFSTOP you specify which of these fetch sequences should be active. A straightforward way to set them correctly is to set DDFSTRT to the beginning of the first fetch sequence you want to have active (for example $38) and DDFSTOP to the end of the last fetch sequence you want to have active (for example $38 + 4*$20 + 8 = $c0) - so $38,$c0 will enable 5 X4-fetches, which will fetch a total of 5*64=320 pixels.
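
The arithmetic in the two paragraphs above can be written out as a short Python sketch. The spacing values are the ones stated here ($08 between X1 fetch starts, $20 between X4 fetch starts); the function names are made up.

```python
FETCH_CYCLES = 8                    # length of one fetch sequence in bus cycles
SPACING = {"X1": 0x08, "X4": 0x20}  # distance between fetch sequence starts


def fetch_starts(mode, count, first=0x18):
    """Bus cycles at which the first `count` fetch sequences begin."""
    return [first + i * SPACING[mode] for i in range(count)]


def ddf_window(ddfstrt, fetches, mode):
    """DDFSTRT at the start of the first fetch, DDFSTOP at the end of the last."""
    ddfstop = ddfstrt + (fetches - 1) * SPACING[mode] + FETCH_CYCLES
    return ddfstrt, ddfstop


if __name__ == "__main__":
    print([hex(c) for c in fetch_starts("X4", 4)])
    strt, stop = ddf_window(0x38, 5, "X4")
    print(hex(strt), hex(stop), "pixels:", 5 * 64)
```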

The data fetched by a fetch sequence that started at $38 is available to the display hardware at the end of the fetch sequence, so at bus cycle $40. This matches well with having a DIWSTRT value of horizontal pixel position $80 (or $81, not sure where the +1 comes from - perhaps an extra cycle latency in the display circuitry?)

If you use horizontal scroll then the fetched data is fed via a shift register right after; this delays the data, effectively shifting pixels toward the right.

All the above can be done with a zero modulo. If you run with modulo -8, you are fetching the same data twice in some cases and waste memory bandwidth unnecessarily.






