Hi,

When I read the words below dot line, I don't understand why "R14 is adjusted to allow for the prefetch"

Could you explain it to me?

Thanks,

.......
Branch with Link (BL) writes the old PC into the link register (R14) of the current bank. The PC value written into R14 is adjusted to allow for the prefetch, and contains the address of the instruction following the branch and link instruction. Note that the CPSR is not saved with the PC and R14[1:0] are always cleared.
Could you explain ARM Branch with Link (BL) instruction considering prefetch?
Started by ●July 30, 2015
Reply by ●July 30, 2015
Robert Willy <rxjwg98@gmail.com> wrote:
> Hi,
>
> When I read the words below dot line, I don't understand why "R14 is
> adjusted to allow for the prefetch"
>
> Could you explain it to me?

Any time you move the PC into another register, for instance the link register R14, what you actually get is the address of the current instruction plus 8. The reason for this dates back to the ARM1, which had a 3 stage pipeline, fetch-decode-execute. When you executed the move, the instruction fetch stage was already two instructions further on.

A bit like the branch delay slot on MIPS, exposure of this microarchitectural artifact to the ISA has meant that all 32 bit ARMs use this current+8, even though they don't have 3 stage pipelines any more.

Theo
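As a rough illustration of the pipeline argument above, here is a minimal Python sketch. It is not taken from any ARM documentation; the names `pipeline_snapshot` and `pc_as_read` are made up for this example, and it assumes ARM state with fixed 4-byte instructions and an ARM1-style 3 stage pipeline.

```python
INSTRUCTION_SIZE = 4  # bytes per instruction in 32-bit ARM state (assumed)
PIPELINE_DEPTH = 3    # fetch, decode, execute, as described for the ARM1


def pipeline_snapshot(executing_addr):
    """Return the address held in each stage while executing_addr executes."""
    return {
        "execute": executing_addr,
        "decode": executing_addr + INSTRUCTION_SIZE,
        "fetch": executing_addr + (PIPELINE_DEPTH - 1) * INSTRUCTION_SIZE,
    }


def pc_as_read(executing_addr):
    """Value a 'MOV rN, pc' at executing_addr would read: the fetch address."""
    return pipeline_snapshot(executing_addr)["fetch"]


print(hex(pc_as_read(0x8000)))  # 0x8008
```

The point of the model is only that the PC tracks the fetch stage, so an instruction reading it sees its own address plus two instruction slots.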
Reply by ●July 30, 2015
On Thu, 30 Jul 2015 09:28:06 -0700, Robert Willy wrote:
> When I read the words below dot line, I don't understand why "R14 is
> adjusted to allow for the prefetch"
>
> Could you explain it to me?
> .......
> Branch with Link (BL) writes the old PC into the link register (R14) of
> the current bank. The PC value written into R14 is adjusted to allow for
> the prefetch, and contains the address of the instruction following the
> branch and link instruction. Note that the CPSR is not saved with the
> PC and R14[1:0] are always cleared.

Most likely the processor is fetching instructions in anticipation of executing them later, so if you store the PC, the value you get is the address of the instruction two places down from the one you're executing (i.e. a BL). That's not the instruction you expect to return to, so it's decremented before storing it in R14.

Embedded IBM S/360?
Reply by ●July 30, 2015
Theo Markettos <theom+news@chiark.greenend.org.uk> wrote:

(snip regarding ARM and PC values)

> Any time you move the PC into another register, for instance the link
> register R14, what you actually get is the address of the current
> instruction plus 8. The reason for this dates back to the ARM1, which had a
> 3 stage pipeline, fetch-decode-execute. When you executed the move, the
> instruction fetch stage was already two instructions further on.

The JSR instruction on the 6502 pushes one less than the address of the next instruction. RTS pops the address and adds one. Again, it seems related to the value of the register at the time.

> A bit like the branch delay slot on MIPS, exposure of this
> microarchitectural artifact to the ISA has meant that all 32 bit ARMs use
> this current+8, even though they don't have 3 stage pipelines any more.

Is there a compensating branch instruction?

-- glen
Reply by ●July 30, 2015
On 7/30/2015 12:58 PM, Mel Wilson wrote:
> On Thu, 30 Jul 2015 09:28:06 -0700, Robert Willy wrote:
>
>> When I read the words below dot line, I don't understand why "R14 is
>> adjusted to allow for the prefetch"
>>
>> Could you explain it to me?
>> .......
>> Branch with Link (BL) writes the old PC into the link register (R14) of
>> the current bank. The PC value written into R14 is adjusted to allow for
>> the prefetch, and contains the address of the instruction following the
>> branch and link instruction. Note that the CPSR is not saved with the
>> PC and R14[1:0] are always cleared.
>
> Most likely the processor is fetching instructions in anticipation of
> executing them later, so if you store the PC, the value you get is the
> address of the instruction two places down from the one you're executing
> (i.e. a BL). That's not the instruction you expect to return to, so it's
> decremented before storing it in R14. Embedded IBM S/360?

So they are not saying they are adjusting the address to allow for a prefetch using that address. Rather they are adjusting the prefetch address to get the correct next instruction address to return to?

-- Rick
Reply by ●July 30, 2015
On 7/30/15 12:28 PM, Robert Willy wrote:
> Hi,
>
> When I read the words below dot line, I don't understand why "R14 is adjusted
> to allow for the prefetch"
>
> Could you explain it to me?
>
>
> Thanks,
>
> .......
> Branch with Link (BL) writes the old PC into the link register (R14) of the
> current bank. The PC value written into R14 is adjusted to allow for the
> prefetch, and contains the address of the instruction following the branch
> and link instruction. Note that the CPSR is not saved with the PC and
> R14[1:0] are always cleared.
>

If the processor didn't do prefetching, but fetched an instruction (incrementing the PC), decoded it, then executed it, and when done, fetched the next (and so on), no adjustment would be needed.

Since the ARM has actually fetched more data (and incremented the PC) by the time the instruction is executed, simply copying the now-current value of the PC would give an address past the point you really want to return to. That would require either empty slots after every call or code that adjusts R14 before returning. So the processor automatically corrects the value: R14 gets what the PC held just after the BL itself was fetched, which undoes the effect of the prefetch that has happened since.
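The correction described above can be sketched in a few lines of Python. This is an illustrative model only: the function names are hypothetical, and it assumes ARM state (4-byte instructions, PC reading as the current instruction's address plus 8). The masking of the low two bits follows the manual text quoted above ("R14[1:0] are always cleared").

```python
INSTRUCTION_SIZE = 4  # bytes per instruction in ARM state (assumed)
PREFETCH_OFFSET = 8   # PC reads as current instruction + 8 (assumed ARM state)


def unadjusted_link_value(bl_addr):
    """What R14 would hold if BL simply copied the prefetched PC."""
    return bl_addr + PREFETCH_OFFSET


def bl_link_value(bl_addr):
    """Model of the adjusted value BL writes to R14."""
    pc_at_execute = bl_addr + PREFETCH_OFFSET
    link = pc_at_execute - INSTRUCTION_SIZE  # back out one instruction slot
    return link & ~0x3                       # R14[1:0] are always cleared


print(hex(unadjusted_link_value(0x8000)))  # 0x8008, would skip an instruction
print(hex(bl_link_value(0x8000)))          # 0x8004, the instruction after the BL
```

Without the adjustment, a return would land one instruction too far along, which is the "empty slot after every call" problem mentioned above.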
Reply by ●July 30, 2015
On 30.7.2015 г. 21:11, rickman wrote:
> On 7/30/2015 12:58 PM, Mel Wilson wrote:
>> On Thu, 30 Jul 2015 09:28:06 -0700, Robert Willy wrote:
>>
>>> When I read the words below dot line, I don't understand why "R14 is
>>> adjusted to allow for the prefetch"
>>>
>>> Could you explain it to me?
>>> .......
>>> Branch with Link (BL) writes the old PC into the link register (R14) of
>>> the current bank. The PC value written into R14 is adjusted to allow for
>>> the prefetch, and contains the address of the instruction following the
>>> branch and link instruction. Note that the CPSR is not saved with the
>>> PC and R14[1:0] are always cleared.
>>
>> Most likely the processor is fetching instructions in anticipation of
>> executing them later, so if you store the PC, the value you get is the
>> address of the instruction two places down from the one you're executing
>> (i.e. a BL). That's not the instruction you expect to return to, so it's
>> decremented before storing it in R14. Embedded IBM S/360?
>
> So they are not saying they are adjusting the address to allow for a
> prefetch using that address. Rather they are adjusting the prefetch
> address to get the correct next instruction address to return to?
>

They clearly say the address is the correct one to return to; why they go on talking about prefetch I don't know. Perhaps someone who put work into designing the core wrote the manual, too, and got carried away into details not needed in this context.

Dimiter

------------------------------------------------------
Dimiter Popoff, TGI http://www.tgi-sci.com
------------------------------------------------------
http://www.flickr.com/photos/didi_tgi/
Reply by ●July 30, 2015
On Thu, 30 Jul 2015 22:44:47 +0300, Dimiter_Popoff <dp@tgi-sci.com> wrote:
>On 30.7.2015 г. 21:11, rickman wrote:
>> On 7/30/2015 12:58 PM, Mel Wilson wrote:
>>> On Thu, 30 Jul 2015 09:28:06 -0700, Robert Willy wrote:
>>>
>>>> When I read the words below dot line, I don't understand why "R14 is
>>>> adjusted to allow for the prefetch"
>>>>
>>>> Could you explain it to me?
>>>> .......
>>>> Branch with Link (BL) writes the old PC into the link register (R14) of
>>>> the current bank. The PC value written into R14 is adjusted to allow for
>>>> the prefetch, and contains the address of the instruction following the
>>>> branch and link instruction. Note that the CPSR is not saved with the
>>>> PC and R14[1:0] are always cleared.
>>>
>>> Most likely the processor is fetching instructions in anticipation of
>>> executing them later, so if you store the PC, the value you get is the
>>> address of the instruction two places down from the one you're executing
>>> (i.e. a BL). That's not the instruction you expect to return to, so it's
>>> decremented before storing it in R14. Embedded IBM S/360?
>>
>> So they are not saying they are adjusting the address to allow for a
>> prefetch using that address. Rather they are adjusting the prefetch
>> address to get the correct next instruction address to return to?
>>
>
>They clearly say the address is the correct one to return to; why do
>they go on talking prefetch I don't know, perhaps someone who put
>work into designing the core wrote the manual, too and got carried
>away into details not needed in this context.

It's the way the ISA was designed - because of prefetching, PC is always pointing 8 bytes ahead of where you are. So if you're using a PC relative instruction, you have to compensate for that. In the case of a subroutine call, they're just telling you they've backed out that +8 in the saved return address.
Reply by ●July 30, 2015
On 30.7.2015 г. 23:32, Robert Wessel wrote:
> On Thu, 30 Jul 2015 22:44:47 +0300, Dimiter_Popoff <dp@tgi-sci.com>
> wrote:
>
>> On 30.7.2015 г. 21:11, rickman wrote:
>>> On 7/30/2015 12:58 PM, Mel Wilson wrote:
>>>> On Thu, 30 Jul 2015 09:28:06 -0700, Robert Willy wrote:
>>>>
>>>>> When I read the words below dot line, I don't understand why "R14 is
>>>>> adjusted to allow for the prefetch"
>>>>>
>>>>> Could you explain it to me?
>>>>> .......
>>>>> Branch with Link (BL) writes the old PC into the link register (R14) of
>>>>> the current bank. The PC value written into R14 is adjusted to allow for
>>>>> the prefetch, and contains the address of the instruction following
>>>>> the branch and link instruction. Note that the CPSR is not saved with the
>>>>> PC and R14[1:0] are always cleared.
>>>>
>>>> Most likely the processor is fetching instructions in anticipation of
>>>> executing them later, so if you store the PC, the value you get is the
>>>> address of the instruction two places down from the one you're executing
>>>> (i.e. a BL). That's not the instruction you expect to return to, so it's
>>>> decremented before storing it in R14. Embedded IBM S/360?
>>>
>>> So they are not saying they are adjusting the address to allow for a
>>> prefetch using that address. Rather they are adjusting the prefetch
>>> address to get the correct next instruction address to return to?
>>>
>>
>> They clearly say the address is the correct one to return to; why do
>> they go on talking prefetch I don't know, perhaps someone who put
>> work into designing the core wrote the manual, too and got carried
>> away into details not needed in this context.
>
>
> It's the way the ISA was designed - because of prefetching, PC is
> always pointing 8 bytes ahead of where you are. So if you're using a
> PC relative instruction, you have to compensate for that. In the case
> of a subroutine call, they're just telling you they've backed out that
> +8 in the saved return address.
>

I see, that +8 is quite common and they just remind you this is taken into account in this case. One must usually check how branch etc. PC relative offsets are to be calculated and it differs on various processors.

I can see the point of the guy who made the power architecture; no PC register is available, you have to do a linked call and use the value in the LR... :-). Not an issue as one can do it once and keep the absolute address in a GPR - doing so would be more painful on ARM with its fewer GPRs.

Dimiter

------------------------------------------------------
Dimiter Popoff, TGI http://www.tgi-sci.com
------------------------------------------------------
http://www.flickr.com/photos/didi_tgi/
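To make the "compensate for the +8" remark concrete, here is a small Python sketch of how a PC relative branch offset could be computed for the classic 32-bit ARM B/BL encoding, where the 24-bit signed field counts words relative to the branch address plus 8. The function names are invented for this example, and the sketch covers only the offset field, not the condition code or opcode bits.

```python
def encode_branch_offset(branch_addr, target_addr):
    """Return the 24-bit offset field for a B/BL at branch_addr to target_addr."""
    offset = target_addr - (branch_addr + 8)  # compensate for the prefetch
    if offset % 4 != 0:
        raise ValueError("target must be word aligned relative to the branch")
    words = offset // 4
    if not -(1 << 23) <= words < (1 << 23):
        raise ValueError("target out of range for a 24-bit word offset")
    return words & 0xFFFFFF  # two's complement, truncated to 24 bits


def decode_branch_target(branch_addr, imm24):
    """Inverse of encode_branch_offset: recover the target address."""
    if imm24 & 0x800000:      # sign-extend the 24-bit field
        imm24 -= 1 << 24
    return branch_addr + 8 + (imm24 << 2)


print(hex(encode_branch_offset(0x8000, 0x8000)))  # 0xfffffe, a branch to itself
print(hex(encode_branch_offset(0x8000, 0x8008)))  # 0x0
```

A branch to its own address encodes as -2 words rather than 0, which is the compensation in action; other processors define the base address differently, so the same calculation has to be rechecked per architecture.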
Reply by ●July 31, 2015
glen herrmannsfeldt <gah@ugcs.caltech.edu> wrote:
> Theo Markettos <theom+news@chiark.greenend.org.uk> wrote:
>
> (snip regarding ARM and PC values)
>
> > Any time you move the PC into another register, for instance the link
> > register R14, what you actually get is the address of the current
> > instruction plus 8. The reason for this dates back to the ARM1, which had a
> > 3 stage pipeline, fetch-decode-execute. When you executed the move, the
> > instruction fetch stage was already two instructions further on.
>
> The JSR instruction on the 6502 pushes one less than the address
> of the next instruction. RTS pops the address and adds one.
> Again, it seems related to the value of the register at the time.
>
> > A bit like the branch delay slot on MIPS, exposure of this
> > microarchitectural artifact to the ISA has meant that all 32 bit ARMs use
> > this current+8, even though they don't have 3 stage pipelines any more.
>
> Is there a compensating branch instruction?

AFAIR, it's 'MOV rN, pc' that's affected. If you do 'MOV pc, rN', you start executing at the instruction pointed to by rN, not one or two after. If you do 'BL label', r14 points to the instruction after the BL (the BL's address + 4), not two after (the BL's address + 8).

    label:
        <code>
        MOV pc,r14

and

    label:
        STMFD r13!,{r14}    ; push r14
        <code>
        LDMFD r13!,{pc}     ; pop old r14 and write to pc

are common subroutine paradigms, and they do what you'd expect.

Theo
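The two subroutine paradigms above can be mimicked with a toy Python model. Everything here is illustrative: `ToyCpu` and its methods are hypothetical names, a Python list stands in for the stack addressed by r13, and the addresses are arbitrary. It assumes BL stores the BL's address plus 4 in r14, as described in this thread.

```python
class ToyCpu:
    """Minimal stand-in for r14 and the r13 stack; not a real ARM model."""

    def __init__(self):
        self.lr = None   # plays the role of r14
        self.stack = []  # plays the role of the stack addressed by r13

    def bl(self, bl_addr):
        """BL at bl_addr: r14 gets the address of the next instruction."""
        self.lr = bl_addr + 4

    def push_lr(self):
        """STMFD r13!,{r14}"""
        self.stack.append(self.lr)

    def pop_pc(self):
        """LDMFD r13!,{pc}: returns the new pc."""
        return self.stack.pop()

    def mov_pc_lr(self):
        """MOV pc,r14: returns the new pc."""
        return self.lr


cpu = ToyCpu()
cpu.bl(0x8000)                  # caller at 0x8000 calls `outer`
cpu.push_lr()                   # `outer` saves its return address
cpu.bl(0x9000)                  # `outer` calls leaf `inner`, overwriting r14
inner_return = cpu.mov_pc_lr()  # leaf returns straight from r14
outer_return = cpu.pop_pc()     # `outer` returns via the saved value
print(hex(inner_return), hex(outer_return))  # 0x9004 0x8004
```

The first form is enough for a leaf routine; the second is needed once a routine makes its own BL, because that call overwrites r14.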







