|
Author |
Message |
dalton
Member |
Now I know that you can't just count cycles to know how fast a routine is. You have to consider its size, the order of instructions, CPU type and so on. But it should be possible to compare different instructions to each other. For instance, how much slower is divs than divu? Could I save time with shifts and adds compared to a mulu with a constant? These little questions constantly pop up when I'm coding, and I haven't really found an answer anywhere.
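To make the mulu question concrete, here is a minimal sketch of the two alternatives for a multiply by 10. It assumes the unsigned 16-bit input sits in the low word of d0 and that d1 is free as a scratch register; both are assumptions for illustration only.

```
; Alternative 1: plain multiply, d0.l = d0.w * 10 (unsigned)
        mulu.w  #10,d0

; Alternative 2: same result with shifts and adds, using 10*x = 8*x + 2*x
        and.l   #$ffff,d0       ; zero-extend the word, since mulu.w only reads the low 16 bits
        add.l   d0,d0           ; d0 = 2*x
        move.l  d0,d1           ; keep 2*x in the scratch register
        lsl.l   #2,d0           ; d0 = 8*x
        add.l   d1,d0           ; d0 = 8*x + 2*x = 10*x
```

On a 68000 the multiply costs several tens of cycles, so the shift-and-add version is likely to win there. On later CPUs with much faster multipliers the gap narrows or reverses, so it is still worth timing both.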
|
Anonymous
Member |
Yes, it's becoming quite hard to actually "see" the speed from the code. But I think you're on the right track. If you compare two instructions that are equal in size and use the same registers, I think you can look at the clock cycles with confidence.
But I say, why not time your routines? It's one of the best ways to see which routine is faster "in reality". It can be a bit tricky to do proper tests, but at least you can get a good pointer out of it :)
|
Pezac
Member |
It was me who wrote the above message, forgot to log in :)
|
dalton
Member |
Here's another one: which is the faster of these two:
move.l 10(a0),d0
move.l (10,a0,d0*4),d0
I suppose the addressing mode in the second one couldn't be faster than the one in the first. But maybe they're equally fast?
|
TheDarkCoder
Member |
I think that on many CPUs they are equally fast.
I would bet on 040 and 060, maybe also on 020 and 030.
In general, you should also consider the context (i.e. the neighbouring instructions). For example, if the instructions before and after access external memory (not in cache), then two instructions with different speeds may execute in the same amount of time because of pipeline stalls.
Please also consider that, for instructions longer than one word, alignment in memory can play a role.
The only way to be sure is to measure the time spent, and different hardware configurations can give different results!
|
winden
Member |
Dalton, you should order the 68030, 68040 and 68060 books from the Motorola website; they are free (even shipping is free) and are great for coding assembler.
From memory... on 030, move.l 10(a0),d0 should be 2 cycles faster than move.l (10,a0,d0.l*4),d0 due to the additional internal ADD in address generation... yes, the first one is 4 cycles and the second one is 6 cycles, since the constant "10" fits in 8 bits and can use the (d8,an,xn*s) addressing mode.
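As a rough sketch of the displacement-size point, the three lines below are independent examples of the encodings involved, not a sequence meant to run in order. The displacement 1000 is only an illustrative value chosen because it does not fit in 8 bits.

```
        move.l  10(a0),d0               ; (d16,An): one 16-bit displacement word
        move.l  (10,a0,d0.l*4),d0       ; (d8,An,Xn.l*4): brief extension word, scale needs 020+
        move.l  (1000,a0,d0.l*4),d0     ; displacement outside -128..127: full extension format (bd,An,Xn), 020+
```

The third form is longer in memory than the second, so code size and alignment come into play in addition to the address calculation itself.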
|
Anonymous
Member |
I remember when Motorola sent those books, the Norwegian customs department called me in order to verify that they were free...
|
d0DgE
Member |
Oh I've got another one... consider the following:
I've got some memory at a0 and want to spread it across 4 data registers.
movem.w (a0)+,d0-d3
or 4 times move.w (an)+,dn
What's better when I'm dancing on a 68000 (or 020 ... not 060)?
|
Blueberry
Member |
On 68000:
move.w (an)+,dn ; 8 cycles each = 32 cycles
movem.w (a0)+,d0-d3 ; 12 + 4 cycles per reg = 28 cycles
On 68060 they are equally fast (one cycle per register with no overhead on movem), but separate moves allow for more parallel execution with other instructions, since each move can be paired with another one-cycle instruction that does no memory access.
I am not sure about 020/030, but as far as my tables here tell me, it is something like 4 or 5 cycles per move and 12+4n for the movem (for some reason, movem is slower from memory than to memory: only 2 cycles per reg the other way).
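Written out in full, the two variants look like this. This is only a sketch with the 68000 counts from above repeated in the comments. One thing to keep in mind when swapping one for the other: movem.w sign-extends each loaded word to the full 32-bit register, while move.w leaves the upper word untouched.

```
; Variant A: one movem (68000: 12 + 4 per register = 28 cycles)
        movem.w (a0)+,d0-d3     ; loads 4 words, each sign-extended to 32 bits

; Variant B: four separate moves (68000: 4 * 8 = 32 cycles)
        move.w  (a0)+,d0        ; only the low word of each register is written
        move.w  (a0)+,d1
        move.w  (a0)+,d2
        move.w  (a0)+,d3
```

If later code relies on the upper words of d0-d3, the two variants are not interchangeable without an extra ext.l or a prior clear.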
|
d0DgE
Member |
thanks again :D
|
doom
Member |
How about
move.l (a0)+,d0
move.l (a0)+,d2
move.w d0,d1
move.w d2,d3
swap d0
swap d2
Just for the fun of it. :)
|
Ralf
Member |
68000: 40 cycles (12+12+4+4+4+4)
|