
Question about ARM execution cycles

Started by Robert Willy February 6, 2014
Hi,

I want to know the instruction execution cycle counts on the ARM Cortex-A8. I know the TI C6000 VLIW DSP well. ARM, on the other hand, seems really weird to me. I know ARM is a superscalar architecture. Anyway, the ARM documentation raises more questions than it answers.

In Chapter 16 of the ARM A8 TRM, several tables give instruction execution cycles. On page 648/762, Example 16-2 gives:

SMLAL      R0, R1, R2, R3


R2 and R3 are required in E1. The results in R0 and R1 are available in E5. The TRM then says that the instruction takes a minimum of 3 cycles to execute.

Q: Does the 3 come from E5 - E1, i.e. the stages in between (E2, E3, E4)?

Then, from Table 16-4, there is:

Normal: MUL                 2 Rm:E1 Rs:E1 [Rd:E3] {Rn:E4}a Rd:E5

I cannot get the 2 cycles from the above assumption.

I suppose that these cycle numbers can be added up for a sequence of instructions, provided that the source and result availability requirements of all the instructions are satisfied.
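The mismatch can be spelled out with a small Python sketch. It only encodes the guess from the question (count the stages strictly between the source stage and the result stage) and compares it with the cycle counts quoted from the TRM. The names `stages_between` and `rows` are made up for this illustration and do not come from the TRM, and the sketch does not claim to show how the TRM actually derives its cycle column.

```python
def stages_between(first: int, last: int) -> int:
    """Number of pipeline stages strictly between E<first> and E<last>."""
    return last - first - 1


# (source stage, result stage, cycles listed in the TRM), as quoted above.
rows = {
    "SMLAL": (1, 5, 3),  # sources in E1, results in E5, listed as 3 cycles
    "MUL": (1, 5, 2),    # Rm/Rs in E1, Rd in E5, listed as 2 cycles
}

for name, (src, dst, listed) in rows.items():
    guess = stages_between(src, dst)
    print(f"{name}: guess={guess} listed={listed} match={guess == listed}")

# Prints:
# SMLAL: guess=3 listed=3 match=True
# MUL: guess=3 listed=2 match=False
```

Both rows have sources in E1 and the result in E5, so the guess gives 3 for both, yet the table lists 2 for MUL. The guess therefore cannot be what the cycle column means.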

Or, are there other uses for these execution cycle numbers?



Thanks a lot.







Below are the notes on the ARM instruction tables (from the same PDF file, the ARM A8 TRM):


Special considerations and caveats concerning the instruction tables include:
* Source requirements are always given for the first cycle in a multi-cycle
instruction.
* Destination available is always given with respect to the last cycle in a data
processing multi-cycle instruction. This rule does not apply to load/store multiple
instructions.
* Multiply instructions issue to pipeline 0 only.
* Flags from the CPSR Register are updated internally in the E2 stage.
* [Rd] as a source register indicates the destination register is required as a source
if the instruction is conditional.
* {} on a source register indicate the register is required only if the instruction
includes an accumulator operand.
* () on a destination register indicate the destination is required only if writeback is
enabled.
* [] on a load instruction destination register indicate that the destination register is
optional depending on the size of the data transferred.
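As a reading aid for the bracket notation in the list above, here is a minimal Python sketch that decodes a table row such as the MUL entry. It assumes the row has already been split by hand into a source part and a result part (the quoted text does not show the column boundaries, so that split is an assumption). `Operand`, `parse_operands`, `SOURCE_NOTES` and `RESULT_NOTES` are hypothetical names for this illustration only, and the footnote marker `a` after `{Rn:E4}` is simply ignored.

```python
import re
from typing import NamedTuple


class Operand(NamedTuple):
    reg: str    # register name as written in the table, e.g. "Rm"
    stage: int  # pipeline stage number, e.g. 1 for E1
    note: str   # meaning of the bracket, per the caveats quoted above


# Optional opening bracket, register, ":E<stage>", optional closing bracket.
TOKEN = re.compile(r"([\[{(]?)(R[a-z0-9]+):E(\d+)[\]})]?")

SOURCE_NOTES = {
    "": "always required",
    "[": "required only if the instruction is conditional",
    "{": "required only with an accumulator operand",
}
RESULT_NOTES = {
    "": "always produced",
    "(": "only if writeback is enabled",
    "[": "load only: optional, depends on transfer size",
}


def parse_operands(text: str, notes: dict) -> list:
    """Turn e.g. 'Rm:E1 [Rd:E3]' into a list of Operand tuples."""
    ops = []
    for match in TOKEN.finditer(text):
        bracket, reg, stage = match.groups()
        ops.append(Operand(reg, int(stage), notes.get(bracket, "unknown qualifier")))
    return ops


# The MUL row from Table 16-4, split by hand into sources and results.
for op in parse_operands("Rm:E1 Rs:E1 [Rd:E3] {Rn:E4}a", SOURCE_NOTES):
    print("source", op)
for op in parse_operands("Rd:E5", RESULT_NOTES):
    print("result", op)
```

Read this way, the MUL row says Rm and Rs are needed in E1, Rd is an extra source in E3 only for a conditional multiply, Rn is needed in E4 only for the accumulate forms, and the result Rd is available in E5.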
On Thursday, February 6, 2014 2:39:59 PM UTC-5, Robert Willy wrote:
A further question: the result is in Rd in both cases, so why does the first instruction take 1 cycle while the second takes 2 cycles?

Thanks,

Page 654/762, Table 16-9:

|Scaled register offset, LSL by 2|   cycle number: 1   Rn:E1 Rm:E1 [Rd:E2] -   Rd:E3 (Rn:E2)
|Scaled register offset, other|      cycle number: 2   Rn:E1 Rm:E1 [Rd:E2] -   Rd:E3 (Rn:E2)
                                     ____ 2nd iteration