
Direction of Stack Growth

Started by karthikbalaguru October 21, 2007
On Oct 22, 2:36 pm, Elcaro Nosille <Elcaro.Nosi...@mailinator.com>
wrote:
> Terje Mathisen schrieb:
>
> >> With downwards growing stacks you can address the "top" element of
> >> the stack at address sp + 0, and this often results in smaller opcodes
> >> for the machine instructions addressing that element.
> >> With upwards growing stacks you either would have to know the size
> >> of the top element when pushing another element onto the stack, or you
> >> would have to address the top element at sp - N, so you couldn't
> >> address it with an offset-less instruction.
>
> > Sure you could!
> > You just need a pre-increment stack pointer.
>
> Yes, but then a push onto the stack would have to know the size of
> the element pushed before; i.e. there would be two operands of this
> push operation - that's bulky and unnecessary.
This is logical. Karthik Balaguru
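To make that addressing difference concrete, here is a minimal C sketch (an editorial illustration, not code from the thread): with a pre-decrement, downward-growing software stack the top element is always at offset 0 from the stack pointer, while with a post-increment, upward-growing stack the top element sits one element below the pointer, so reaching it needs a size-dependent offset.

    #include <stdio.h>

    /* Illustration only: two software stacks built on plain C pointers.
     * Downward growth with pre-decrement keeps the top element at *sp;
     * upward growth with post-increment leaves the top at sp[-1], so you
     * must know the size of the last element to reach it. */

    #define STACK_WORDS 16

    int main(void)
    {
        int mem_down[STACK_WORDS], mem_up[STACK_WORDS];

        /* Downward-growing stack: sp starts past the end, pre-decrements. */
        int *sp_down = mem_down + STACK_WORDS;
        *--sp_down = 42;                          /* push */
        printf("top (down) = %d\n", *sp_down);    /* offset 0 */

        /* Upward-growing stack: sp starts at the base, post-increments. */
        int *sp_up = mem_up;
        *sp_up++ = 42;                            /* push */
        printf("top (up)   = %d\n", sp_up[-1]);   /* offset of one element */

        return 0;
    }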
On Oct 22, 11:32 am, Paul Keinanen <keina...@sci.fi> wrote:
> On 21 Oct 2007 21:25:48 GMT, n...@cus.cam.ac.uk (Nick Maclaren) wrote:
>
> >Actually, handling upwards-growing stacks on the System/360 was as
> >easy as downwards-growing ones, and you DON'T need a separate frame
> >pointer for most languages - but that doesn't deny your point.
>
> >I lost track of the number of people I tried to explain the major
> >methods of doing it to, generally unsuccessfully. There was
> >definitely a belief (i.e. myth) that upwards-growing ones were
> >inefficient or even impossible and, as always, pointing out counter-
> >examples and explaining the techniques didn't kill the myth.
>
> It depends on how the stack pointer (or other general register) is
> updated. With register pre-decrement and post-increment access, it is
> natural to let the stack grow downwards, since the register will point
> to the newly pushed value. If the stack grows upwards, the stack pointer
> would point to a useless (free) location of the stack, and to access the
> last value you would have to use an offset.
>
> On the PDP-11, a sequence with a downwards-growing software stack would be
>
>     MOV VAR1, -(R5)   ; Push Var1 to top of stack (TOS)
>     COM (R5)          ; Do some operations on the value on the TOS
>     ADD #5,(R5)       ; Some more operations on TOS
>     MOV (R5)+,VAR2    ; Pop value from stack
>
> With a stack growing upwards:
>
>     MOV VAR1, (R5)+   ; Push Var1 to top of stack
>                       ; R5 now points above the value pushed
>     COM -2(R5)        ; Do some operations on the value on the TOS
>     ADD #5,-2(R5)     ; Some more operations on TOS
>     MOV -(R5),VAR2    ; Pop value from stack
>
> This is 2 words (4 bytes) longer than the previous example.
>
> When other locations are accessed, the offset penalty applies to both
> methods, but with processors with several offset sizes the offset may
> be shorter if the (short) offset is handled as an unsigned value.
>
> In processors with pre-increment and post-decrement addressing modes,
> it would be natural to let the stack grow upwards.
Reading through this and thinking about the heap mechanism, I have a query: why do we need the stack at all, when the heap appears more flexible? Thanks in advance, Karthik Balaguru
On Oct 23, 3:05 am, glen herrmannsfeldt <g...@ugcs.caltech.edu> wrote:
> Nick Maclaren wrote:
>
> (snip)
>
> > Witness that System/360 stacks were normally upwards growing in the
> > 1960s and 1970s, and changed as the new generation of people with a
> > DEC background moved in during the 1980s.
>
> The S/360 that I used, running OS/360 and descendants, used
> linked lists for subroutine calling, register saving, and
> argument lists.
I am eager to understand more about this. Linked lists for subroutine calling - is that a good or a bad design method? Where can I get more information about this?
> I haven't looked at what Linux/390 does, though.
What does Linux/390 do? Where can I get more information about this? Thanks in advance, Karthik Balaguru
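As a rough, hypothetical C sketch of the linked-list (save-area chaining) calling style glen describes: each call links a new save area onto a chain with back and forward pointers instead of pushing a frame onto a contiguous stack. The struct layout and names below are invented for illustration; the real OS/360 convention uses an 18-word save area with chain words at fixed offsets.

    #include <stdio.h>

    /* Hypothetical sketch of chained save areas, not the actual S/360
     * layout.  Each callee links its own save area to the caller's, so
     * the "stack" is a doubly linked list rather than contiguous memory. */

    struct save_area {
        struct save_area *back;      /* caller's save area  */
        struct save_area *forward;   /* callee's save area  */
        long              regs[15];  /* caller's registers  */
    };

    static void callee(struct save_area *caller_sa)
    {
        struct save_area my_sa = { 0 };   /* could come from any allocator */
        my_sa.back = caller_sa;           /* chain back to the caller      */
        caller_sa->forward = &my_sa;      /* and forward to us             */

        /* ... body; a deeper call would repeat the chaining with &my_sa ... */

        caller_sa->forward = NULL;        /* unchain on return             */
    }

    int main(void)
    {
        struct save_area top = { 0 };
        callee(&top);
        printf("chain built and unwound\n");
        return 0;
    }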
glen herrmannsfeldt <gah@ugcs.caltech.edu> writes:

(snip)

> As I wrote in another post, the IBM 704 Fortran compiler stored
> arrays in decreasing order in memory. It might also be that
> programs started from the bottom, and memory for variables
> (statically addressed) from the top. Note also the Fortran requirement
> that all COMMON blocks except blank COMMON have the same size in
> all program units. That allows them to be positioned in memory,
> with the unknown-size blank COMMON at the end (one end or the other).
I'd forgotten about those machines when I said something about some processors only being able to use positive index offsets. Was the negative indexing the origin of the BES symbol?
Steve Underwood wrote:
> If anecdotes are your only exposure to that kind of programming, you've
> led a sheltered life. :-)
Please! I suggest you do a quick google of my name, or maybe browse some of my comp.arch posts over the last 10-15 years.

I've written code to work around the Pentium FDIV bug, which got included in pretty much every compiler on the market for a few years. I've also written asm code for one of the AES contenders, as well as doing optimization work on things like the first ever software DVD player. You'll also find my name in the original Quake manual.

On your end, I assume you're the Steve Underwood who's worked on various telephony open source projects?

> Much is still coded for specialist machines
> where there is no real program counter, and every instruction points to
> the next. It makes patches wonderfully arcane :-)
>
> Back in the good old days, when AMD was the king of the DSP chip makers,
> practically all DSP programming was done that way (and it took a lot of
> AMD chips to make one machine).
I once wrote, mostly in hex and without a computer, over an Easter break, a program to encode arbitrary binary data as an executable ascii file, using nothing but the 70+ chars blessed by the MIME standard as being transparent across all email gateways. This should be in the same ballpark as Mel's card-playing program, right?

It uses the minimum possible amount of self-modification (a single two-byte backwards branch instruction), it survives almost all forms of paragraph reformatting, and it uses Base64 encoding of the payload with the smallest decoder I have ever seen. (It is significantly better than the "best" virus code I've disassembled.)

Here are the two lines that contain the first two bootstrap stages; check them out in a 16-bit debugger:

ZRYPQIQDYLRQRQRRAQX,2,NPPa,R0Gc,.0Gd,PPu.F2,QX=0+r+E=0=tG0-Ju E= EE(-(-GNEEEEEEEEEEEEEEEF 5BBEEYQEEEE=DU.COM=======(c)TMathisen95

Terje

PS. I've won two international code optimization competitions, and ended up in second or third place a couple of times.

--
- <Terje.Mathisen@hda.hydro.com>
"almost all programming can be viewed as an exercise in caching"
karthikbalaguru wrote:
(snip)

> The 370 series (IBM System/370) of computers was a 32-bit big-endian
> style mainframe architecture,
Note that IBM was consistently big-endian for S/370. Though there are no bit addressing instructions, all the documentation numbers bits with the least significant bit as zero.

It does get interesting with the extension to the 64-bit z/Architecture: now the bits of a 32-bit register are numbered 32 to 63. (That is, 32-bit instructions use bits 32 to 63 of the 64-bit registers defined by the architecture.)

-- glen
Paul Keinanen wrote:
(snip)

> Little-endian addressing is a good idea if the bus width is less than
> the address size. You can perform the effective address calculation
> on the LSB (and generate the carry from that calculation) before the
> MSB is loaded.

> In big-endian systems with a narrow bus, you must first load both the
> MSB and LSB before you can start calculating the effective address;
> thus, the calculation is slower, or you need many more carry-lookahead
> gates to perform the effective address calculation swiftly.
This might reasonably be true for processors like the 8080 or 6502 that were limited by transistor count. For addresses that are part of the instruction, or for autoincrement addressing modes, you can do the add while incrementing the PC or index register. For addresses in memory, it isn't that much harder to load the higher byte first and continue down in memory. (Consider the subject of this thread.)

For any machine big enough to have a multiply instruction, even more so for a divide instruction, one needs to address memory (or registers) repeatedly, and most of the advantage is gone.

Note that S/360 was implemented on processors with an eight-bit ALU and eight-bit addressable memory (the 360/30), yet they went for big-endian. It is just a small change in the microcode when addressing memory. (Especially if you don't allow unaligned access.)

Those who think big tend to use big-endian; those who think small tend to use little-endian (for the reason you say).

-- glen
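A small C simulation of the byte-serial effective-address add Paul describes (illustrative only; the function and variable names are made up): when the low byte of the base address arrives first, the low-order add and its carry are ready before the high byte is fetched, whereas with big-endian storage both bytes must be loaded before the add can start.

    #include <stdint.h>
    #include <stdio.h>

    /* Simulate an 8-bit data path computing a 16-bit effective address.
     * With the base stored low byte first, the first bus cycle already
     * yields the low result byte and the carry for the second cycle. */

    static uint16_t ea_little_endian(const uint8_t base[2], uint8_t offset)
    {
        unsigned lo    = base[0] + offset;   /* cycle 1: low byte + carry out */
        unsigned carry = lo >> 8;
        unsigned hi    = base[1] + carry;    /* cycle 2: high byte + carry in */
        return (uint16_t)(((hi & 0xFF) << 8) | (lo & 0xFF));
    }

    int main(void)
    {
        uint8_t base[2] = { 0xF0, 0x12 };    /* 0x12F0 stored low byte first */
        printf("EA = 0x%04X\n", ea_little_endian(base, 0x20));  /* 0x1310 */
        return 0;
    }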
karthikbalaguru wrote:

(snip)

> Check this out.
> http://kstruct.com/2007/04/16/interview-questions-stack-direction/
(snip)

> 2) But I liked the 'C' program that helps in finding the direction of
> stack growth.
I would say that, technically, that only finds the direction of stack growth relative to that of array addressing. If one addressed arrays downward in memory, then the C conversion for pointer arithmetic, including that for pointer differences, would invert the sign. In addition, the C standard makes no guarantee on tests other than equality and inequality between pointers to different objects.

I think we really need a C compiler for the 7090 so we can check out the effects of sign-magnitude arithmetic and negatively indexed arrays. (The latter is not required, but Fortran did it.)

-- glen
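For reference, here is a commonly seen version of such a test (the exact program behind the link above is not reproduced), with glen's caveat kept in mind: the comparison is between pointers to distinct objects, so the C standard does not define its result, and it is only meaningful on typical flat-address implementations.

    #include <stdio.h>

    /* Compare the address of a local in the callee with the address of a
     * local in the caller.  Undefined by the C standard (distinct objects),
     * and inlining can defeat it, but it illustrates the usual idea. */

    static int grows_downward(char *caller_local)
    {
        char callee_local;
        return &callee_local < caller_local;  /* callee frame below caller? */
    }

    int main(void)
    {
        char caller_local;
        printf("stack appears to grow %s\n",
               grows_downward(&caller_local) ? "downward" : "upward");
        return 0;
    }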
karthikbalaguru wrote:

   ...

> Reading through this and thinking about the heap mechanism, I have a
> query: why do we need the stack at all, when the heap appears more
> flexible?
Manipulating a heap requires software. Some stacks are implemented entirely in hardware (PUSH, POP; CALL, RETURN). It would be very hard to create or maintain a heap on a machine with no stack.

Jerry
--
Engineering is the art of making what you want from things you can get.
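A small C contrast of the two (illustrative sketch only): an automatic array is carved out of the stack by a pointer adjustment the compiler or hardware performs on entry, while heap storage goes through the allocator's software bookkeeping, can fail, and must be released explicitly.

    #include <stdlib.h>
    #include <stdio.h>

    /* Stack allocation: reserved by adjusting the stack pointer on entry,
     * released automatically on return. */
    static long sum_stack(void)
    {
        long scratch[64];
        long s = 0;
        for (int i = 0; i < 64; i++) {
            scratch[i] = i;
            s += scratch[i];
        }
        return s;
    }

    /* Heap allocation: a library call with its own bookkeeping; it can
     * fail and must be freed explicitly. */
    static long sum_heap(void)
    {
        long *scratch = malloc(64 * sizeof *scratch);
        if (scratch == NULL)
            return -1;
        long s = 0;
        for (int i = 0; i < 64; i++) {
            scratch[i] = i;
            s += scratch[i];
        }
        free(scratch);
        return s;
    }

    int main(void)
    {
        printf("%ld %ld\n", sum_stack(), sum_heap());
        return 0;
    }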
On Oct 23, 3:39 pm, glen herrmannsfeldt <g...@ugcs.caltech.edu> wrote:
> karthikbalaguru wrote:
>
> (snip)
>
> > The 370 series (IBM System/370) of computers was a 32-bit big-endian
> > style mainframe architecture,
>
> Note that IBM was consistently big-endian for S/370. Though there are no
> bit addressing instructions, all the documentation numbers bits with the
> least significant bit as zero.
>
> It does get interesting with the extension to the 64-bit z/Architecture:
> now the bits of a 32-bit register are numbered 32 to 63. (That is, 32-bit
> instructions use bits 32 to 63 of the 64-bit registers defined by the
> architecture.)
Your second statement is correct; I assume the first is a typo. In IBM S/360 (and many other IBM-designed ISAs, like PPC), the *most* significant bit of a register or word is zero.
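To make the numbering convention concrete, here is a small C illustration (the macro name is invented): IBM documentation counts bits from the most significant end, so bit 0 of a 64-bit register is the MSB and bits 32 to 63 are its low-order word.

    #include <stdint.h>
    #include <stdio.h>

    /* Map an IBM-style bit number (0 = most significant) of a 64-bit
     * register to a conventional mask. */
    #define IBM_BIT64(n)  (UINT64_C(1) << (63 - (n)))

    int main(void)
    {
        uint64_t reg = 0;

        reg |= IBM_BIT64(0);    /* sets the most significant bit  */
        reg |= IBM_BIT64(63);   /* sets the least significant bit */

        printf("reg      = 0x%016llX\n",
               (unsigned long long)reg);                  /* 0x8000000000000001 */

        /* "Bits 32 to 63" of the register are simply its low-order 32 bits. */
        printf("low word = 0x%08X\n",
               (unsigned)(reg & 0xFFFFFFFFu));            /* 0x00000001 */
        return 0;
    }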