EmbeddedRelated.com
Forums

Query code density About ARM7 (Newbie)

Started by Unknown June 7, 2005
> Isn't there at least one implementation where Thumb required an extra pipeline stage to translate the code into ARM instructions? Depending on how branchy the code is, that may actually slow things down.
To my knowledge this has never been the case.

In ARM7T implementations of Thumb, the Thumb instructions are translated into ARM instructions during the first half of the decode stage. The translated instructions are then decoded by the existing ARM decoder in the second half of the cycle. This does not introduce any extra stages or delays and is completely neutral from a pipeline throughput point of view.

ARM9T (and later) implementations incorporate two decoders, one for ARM and one for Thumb, so there is no need for the translation operation. This allows the decode stage to be shorter in terms of time and contributes to the higher potential clock speed of these implementations.

Chris (posting as an individual)
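To make the translation step concrete, here is a minimal Python sketch of how one Thumb instruction format maps onto an equivalent 32-bit ARM instruction. It covers only the Thumb "move/compare/add/subtract immediate" format (top bits 001) and uses the standard ARM data-processing immediate encoding. The function name `thumb_imm8_to_arm` is made up for illustration, and this shows the architectural equivalence only; it is not a model of how the ARM7T decompressor logic is actually built.

```python
# Sketch: expand a Thumb "MOV/CMP/ADD/SUB Rd, #imm8" halfword into the
# equivalent 32-bit ARM data-processing instruction (condition = always).
# Illustrative only; the real hardware does this with combinational logic.

ARM_COND_AL = 0xE << 28    # condition field: always execute
ARM_IMMEDIATE = 1 << 25    # operand 2 is an immediate
ARM_SET_FLAGS = 1 << 20    # S bit: these Thumb forms always set the flags

# Thumb op field (bits 12:11) -> (mnemonic, ARM data-processing opcode)
_OPS = {
    0b00: ("MOVS", 0xD),
    0b01: ("CMP", 0xA),
    0b10: ("ADDS", 0x4),
    0b11: ("SUBS", 0x2),
}


def thumb_imm8_to_arm(halfword):
    """Return (mnemonic, arm_word) for a Thumb immediate-format halfword."""
    if (halfword >> 13) != 0b001:
        raise ValueError("not a Thumb MOV/CMP/ADD/SUB immediate instruction")
    op = (halfword >> 11) & 0x3
    rd = (halfword >> 8) & 0x7     # Thumb can only name r0-r7 here
    imm8 = halfword & 0xFF
    name, opcode = _OPS[op]
    rn = 0 if name == "MOVS" else rd    # MOV has no first operand
    dest = 0 if name == "CMP" else rd   # CMP writes no register
    word = (ARM_COND_AL | ARM_IMMEDIATE | (opcode << 21) | ARM_SET_FLAGS
            | (rn << 16) | (dest << 12) | imm8)
    return name, word


if __name__ == "__main__":
    for hw in (0x2001, 0x3101, 0x2A05):
        name, word = thumb_imm8_to_arm(hw)
        print(f"Thumb 0x{hw:04X} -> ARM 0x{word:08X} ({name})")
```

The point of the sketch is that each field of the 16-bit encoding drops straight into a fixed position of the 32-bit encoding, which is why the expansion can fit in half a decode cycle without needing a pipeline stage of its own.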
<chris.shore@arm.nospam.com> wrote in message news:42A80942.229CD7EE@arm.nospam.com...
> > Isn't there at least one implementation where Thumb required an extra pipeline stage to translate the code into ARM instructions? Depending on how branchy the code is, that may actually slow things down.
>
> To my knowledge this has never been the case.
... in ARM's own designs. XScale is different.
> ARM9T (and later) implementations incorporate two decoders, one for ARM and one for Thumb so there is no need for the translation operation. This allows the decode stage to be shorter in terms of time and contributes to the higher potential clock speed of these implementations.
Yes, all modern ARMs do it this way. XScale is the only exception, as it uses an extra decode stage to deal with Thumb instructions. In some sense this is similar to what high-end ARMs do, as they split the ARM and Thumb decoders over 2 stages. You only pay for the extra pipeline stage on a branch mispredict, so it works fine if you have a decent branch predictor.

However, the 667 MHz Samsung ARM10 proves you can take an old ARM design and push it hard (it has a simple 6-stage pipe with a single-cycle ARM/Thumb decode stage).

Wilco
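As a rough back-of-the-envelope illustration of "you only pay on a mispredict", the average cost of an extra front-end stage can be estimated as (fraction of instructions that are branches) x (mispredict rate) x (extra stages). The sketch below uses a made-up helper, `extra_cpi`, and the numbers in the usage lines are purely illustrative assumptions, not measurements of XScale or any other core. The model also ignores everything except the longer mispredict penalty.

```python
# Sketch: average extra cycles per instruction caused by lengthening the
# pipeline front end, assuming the added stage only costs time when a
# branch is mispredicted. All inputs are hypothetical.

def extra_cpi(branch_fraction, mispredict_rate, extra_stages=1):
    """Extra cycles per instruction from a longer mispredict penalty."""
    return branch_fraction * mispredict_rate * extra_stages


if __name__ == "__main__":
    # Assumed workload: 20% branches; compare a good and a poor predictor.
    for rate in (0.05, 0.30):
        cost = extra_cpi(branch_fraction=0.20, mispredict_rate=rate)
        print(f"mispredict rate {rate:.0%}: +{cost:.3f} cycles/instruction")
```

With a decent predictor the added cost per instruction stays small, while with no useful prediction it grows in proportion to how branchy the code is, which matches the concern raised in the original question.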
"Wilco Dijkstra" <wilco-dot-dijkstra@ntlworld.com> wrote in message
news:6kUpe.6357$jS3.2937@newsfe2-win.ntli.net...
> However the 667 MHz Samsung ARM10 proves you can take an old ARM design and push it hard (it has a simple 6-stage pipe with single cycle ARM/Thumb decode stage).
I can't find that processor anywhere. Does it really exist, or is it a propaganda announcement like their 1.2GHz ARM9?

Andre
Andre wrote:
> "Wilco Dijkstra" <wilco-dot-dijkstra@ntlworld.com> wrote in message news:6kUpe.6357$jS3.2937@newsfe2-win.ntli.net...
>> However the 667 MHz Samsung ARM10 proves you can take an old ARM design and push it hard (it has a simple 6-stage pipe with single cycle ARM/Thumb decode stage).
>
> I can't find that processor anywhere. Does it really exist, or is it a propaganda announcement like their 1.2GHz ARM9?
Some years ago they announced "Halla", which was to be a 1.2GHz ARM1020E, but I don't think it has made silicon yet. More recently they announced the S3C2440, which is a 533MHz ARM920T, which they sampled, but I don't know if it is in production at that speed.

John
--
John Penton, posting as an individual unless specifically indicated otherwise.