A.D.A. Amiga Demoscene Archive

Inner and outer loop, only one register

 

dalton
Member
#1 - Posted: 11 Nov 2016 12:39
Which of the following three solutions is preferable, in your opinion, if only d0 can be trashed by the loop logic? Or is there a better solution I haven't thought of?
Both inner and outer loop should run three times (3*3).

Option 1, use d1 too but preserve it on stack:
	move.l	d1,-(sp)	; save d1

	moveq	#3,d0		; outer counter
outer:
	moveq	#3,d1		; inner counter
inner:
	; some stuff

	subq.l	#1,d1
	bgt.b	inner

	; more stuff

	subq.l	#1,d0
	bgt.b	outer

	move.l	(sp)+,d1	; restore d1



Option 2, keep the outer loop counter on stack:
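	pea.w	3		; outer counter lives on the stack as a longword
outer:
	moveq	#3,d0		; inner counter
inner:
	; some stuff

	subq.l	#1,d0
	bgt.b	inner

	; more stuff

	subq.l	#1,(sp)		; decrement the outer counter in memory
	bgt.b	outer

	addq.l	#4,sp		; drop the counter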



Option 3, put both loop counters in d0:
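	move.w	#3*256,d0	; outer counter in bits 8-15, low byte cleared
outer:
	addq.w	#3,d0		; inner counter in bits 0-7 (low byte is 0 here)
inner:
	; some stuff

	subq.b	#1,d0		; only the low byte is affected
	bgt.b	inner

	; more stuff

	sub.w	#256,d0		; decrement the outer counter
	bgt.b	outer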
dalton
Member
#2 - Posted: 11 Nov 2016 12:40
I forgot to say that the target platform is 68060.
britelite
Member
#3 - Posted: 12 Nov 2016 20:13
Is unrolling the inner loop out of the question?
dodke
Member
#4 - Posted: 12 Nov 2016 23:17
Yeah, REPT is often a decent lazy option for short, static-length (inner) loops.
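
A minimal sketch of the unrolled variant, assuming an assembler that provides the REPT/ENDR directives (vasm's Motorola syntax and Devpac do) and an inner body that defines no labels, since those would be duplicated by the repetition. Only d0 is used, as the outer counter:

```asm
	moveq	#3,d0		; outer counter, the only register the loop logic touches
outer:
	REPT	3		; the assembler emits the inner body three times
	; some stuff
	ENDR

	; more stuff

	subq.l	#1,d0
	bgt.b	outer
```

The inner counter disappears entirely, at the cost of three copies of the inner body in the code.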

I think I once also compared the destination An to an 'end pointer' in a loop when I had no free data regs at all but plenty of address regs.
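
The end-pointer idea might look like the sketch below. The register choice (a0 as destination pointer, a1 as end pointer), the buffer label, the three-longword size and the clr.l stand-in for the real work are all illustrative assumptions, not taken from the post:

```asm
	lea	buffer(pc),a0	; a0 = destination pointer (hypothetical buffer)
	lea	3*4(a0),a1	; a1 = end pointer, three longwords past the start
inner:
	clr.l	(a0)+		; stand-in for the real work; advances a0 by 4
	cmpa.l	a1,a0		; compare a0 against the end pointer
	bne.b	inner		; loop until a0 reaches the end pointer
	rts

buffer:	ds.l	3		; hypothetical destination, three longwords
```

No data register is needed for counting. The bne test relies on a0 landing exactly on a1, so the stride has to divide the buffer size evenly.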
Angry Retired Bastard
Member
#5 - Posted: 13 Nov 2016 13:10
If I had to choose between those three based on instinct alone, I'd say option 3, because it feels cleaner.

But unrolling the inner(most) loop seems like the logical choice if things are really really perf-critical here. (And if that part feels "too big for unrolling" then you probably shouldn't care too much about the contribution of that loop counter either).
dalton
Member
#6 - Posted: 13 Nov 2016 13:12
I guess it makes sense to unroll if the code is not too big. And if the code is too big, then the cost of using the stack is probably negligible (for instance in a 4x4 matrix multiplication).
todi
Member
#7 - Posted: 3 Aug 2017 19:33
Late on the ball, but how about this solution:


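	move.w	#3-1,d0		; outer counter (dbra counts down to -1)
.outer
	swap	d0		; park the outer counter in the upper word
	move.w	#3-1,d0		; inner counter in the lower word
.inner

	; some stuff

	dbra	d0,.inner	; dbra only touches the lower word

	; more stuff

	swap	d0		; bring the outer counter back down
	dbra	d0,.outer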
dodke
Member
#8 - Posted: 9 Aug 2017 17:58
I wasn't aware dbra only handled words. Good to know :)
NorthWay
Member
#9 - Posted: 9 Aug 2017 20:00
Option 3. Immediates are fast on the 060.
h0ffman
Member
#10 - Posted: 19 Sep 2017 16:48
dodke wrote: "I wasn't aware dbra only handled words. Good to know :)"


If you're short on data registers, you can piggyback two counters in one register using swap.

 


 
