Optimizing stack access for a stack based VM - Compilers

This is a discussion on Optimizing stack access for a stack based VM - Compilers ; Hello everyone, I'm working on a compiler of a Pascal-like language for a specific stack-based virtual machine. To produce small code, I would like to optimize the storage of variables and temporaries akin to register allocation on a register-based machine. ...

+ Reply to Thread
Page 1 of 2 1 2 LastLast
Results 1 to 10 of 12

Optimizing stack access for a stack based VM

  1. Default Optimizing stack access for a stack based VM

    Hello everyone,

    I'm working on a compiler of a Pascal-like language for a specific
    stack-based virtual machine. To produce small code, I would like to
    optimize the storage of variables and temporaries akin to register
    allocation on a register-based machine.

    Local variables should occur as natural operands of instructions (on
    the stack top) as much as possible, removing the need to fetch them
    from the middle of the stack (and encode displacement values).

    (More detailed description below.)

    Do you know if anyone already tried something like this, can you point me
    to any algorithms/articles/software?


    Target machine description:

    In short: Imagine a classical Harvard-style RISC processor, only that there
    are no registers, instruction operands are on the stack.

    There are separate address space for (1) code, (2) data, (3) stack.

    Code - classical linear sequence of instructions with conditional jumps
    for branching

    Arithmetic instructions - for example "add" - pops two values off the stack,
    adds them and pushes the result onto the stack

    Data - "ld [<addr>]" loads a word from the given address and pushes it
    on the stack; "st [<addr>]" pops a word off the stack and stores it
    to the given address. "Static" variables are stored here.

    Stack - direct access to stack is possible, "stkld [<offs>]", "stkst [<offs>]"
    similar to ld/st, but <offs> is relative to the stack top. Auxiliary
    instructions: "dup" duplicate value on stack top, "trash" remove value from
    stack top, "swap" exchange two values on stack top. "Automatic" variables
    and temporaries are stored here.

    Obviously anytime we want to execute an arithmetic operation it would be
    beneficial to have its operands on the stack top removing the need for
    stkld/stkst (and coding <offs>).

    While we get this for free for temporaries (= temporary results in arithmetic
    expressions) just by producing stack-based code from the source code, I'd like
    to extend it to work for variables, too.

    I suppose an algorithm for an accumulator-based machine (which has
    1 accumulator register) could also be used (with less efficiency).

    TIA

    Regards
    Jiri

  2. Default Re: Optimizing stack access for a stack based VM

    Jiri Svoboda wrote:

    > Local variables should occur as natural operands of instructions (on
    > the stack top) as much as possible, removing the need to fetch them
    > from the middle of the stack (and encode displacement values).


    IMO it doesn't matter whether you encode local variables as offsets
    from the SP, or from another base pointer.


    > Obviously anytime we want to execute an arithmetic operation it would be
    > beneficial to have its operands on the stack top removing the need for
    > stkld/stkst (and coding <offs>).
    >
    > While we get this for free for temporaries (= temporary results in
    > arithmetic expressions) just by producing stack-based code from the
    > source code, I'd like to extend it to work for variables, too.


    Since access to local variables requires some selector (offset...)
    value, you'll have to duplicate all operations, for using a local
    variable as the second operand. Perhaps further sets are required, for
    locals as first or both operands, and operations with non-local
    variables would require further instructions. I'd stay with an RISC
    instruction set, and only push/pop variables by dedicated
    instructions.

    Did you have a look at the i80x87 FPU instruction set? Or at the Java
    VM, or at FORTH?

    DoDi

  3. Default Re: Optimizing stack access for a stack based VM

    On Sep 11, 10:13 am, Jiri Svoboda <jirik.svob...@seznam.cz> wrote:
    > I'm working on a compiler of a Pascal-like language for a specific
    > stack-based virtual machine. To produce small code, I would like to
    > optimize the storage of variables and temporaries akin to register
    > allocation on a register-based machine.
    >
    > Local variables should occur as natural operands of instructions (on
    > the stack top) as much as possible, removing the need to fetch them
    > from the middle of the stack (and encode displacement values).
    >
    > (More detailed description below.)
    >
    > Do you know if anyone already tried something like this, can you point me
    > to any algorithms/articles/software?


    Much of what you describe is accomplished by Forth.

    > Target machine description:
    >
    > In short: Imagine a classical Harvard-style RISC processor, only that there
    > are no registers, instruction operands are on the stack.
    >
    > There are separate address space for (1) code, (2) data, (3) stack.


    The separation between code and data isn't required in Forth (although
    there may be advantages in certain architectures that treat each space
    differently, or don't like mixing of code and data).

    > Code - classical linear sequence of instructions with conditional jumps
    > for branching


    Or token-threaded code; it doesn't need to be executable code.

    > Arithmetic instructions - for example "add" - pops two values off the stack,
    > adds them and pushes the result onto the stack
    >
    > Data - "ld [<addr>]" loads a word from the given address and pushes it
    > on the stack; "st [<addr>]" pops a word off the stack and stores it
    > to the given address. "Static" variables are stored here.


    Forth uses addresses on the stack, and a @ (fetch contents of address)
    and ! (store into address the value 2nd on the stack).

    variable x
    10 x ! \ set x to 10; x puts the address on the stack for the 2
    operand !

    > Stack - direct access to stack is possible, "stkld [<offs>]", "stkst [<offs>]"
    > similar to ld/st, but <offs> is relative to the stack top. Auxiliary
    > instructions: "dup" duplicate value on stack top, "trash" remove value from
    > stack top, "swap" exchange two values on stack top. "Automatic" variables
    > and temporaries are stored here.


    Most operations in a stack based VM only require access to the top 2
    stack entries. Forth provides your basic arithmetic instructions,
    along with stack juggling words; SWAP and DUP you mention, and "trash"
    is DROP. Also consider having an OVER ( a b -> a b a ).

    The stack offset word you describe is Forth's PICK; but it is
    infrequently used, and can be eliminated entirely for most code.
    Digging down on the stack is frowned on in Forth, because many of the
    machines on which it is implemented do not have an item addressable
    stack. In general, it's not required, and 99 times out of 100
    indicates a loss of stack control on the part of the programmer.

    > Obviously anytime we want to execute an arithmetic operation it
    > would be beneficial to have its operands on the stack top removing
    > the need for stkld/stkst (and coding <offs>).


    "Variables" in Forth are words that push their addresses on the stack,
    as in the @ and ! example given above. Forth also has a value, or a
    self fetching variable.

    > While we get this for free for temporaries (= temporary results in
    > arithmetic expressions) just by producing stack-based code from the
    > source code, I'd like to extend it to work for variables, too.


    > I suppose an algorithm for an accumulator-based machine (which has 1
    > accumulator register) could also be used (with less efficiency).


    It might be more efficient if stack accesses are more expensive than
    accumulator accesses. Just consider the accumulator as the top entry
    of the stack, and the stack holds entries below that.

    > TIA


    See comp.lang.forth,
    Anton Ertl's website at http://www.complang.tuwien.ac.at/anton/home.html,
    the comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html,
    and this thread on clf; http://groups.google.com/group/comp....a6b972/?hl=en#

    --
    Regards
    Alex McDonald

  4. Default Re: Optimizing stack access for a stack based VM

    Jiri Svoboda wrote:
    > I'm working on a compiler of a Pascal-like language for a specific
    > stack-based virtual machine. To produce small code, I would like to
    > optimize the storage of variables and temporaries akin to register
    > allocation on a register-based machine.


    I've seen something like this done in two places:

    One is Postscript, where programmers do this manually. There are no
    statements or expressions as such, just a stack and operators.

    To do the equivalent of "x := x + 1":

    dict /x get 1 add dict /x exch put

    You can also do optimize away one dictionary lookup with something like
    this:

    dict /x 2 copy get 1 add put

    Then there's the Unisys MCP Series and their Burroughs ancestors, all
    stack machines. I've seen those compilers optimize expressions, e.g.:

    x ** 5 would be something like

    load x
    dupl
    dupl
    mult
    dupl
    mult
    mult

    I'm not sure these compilers do everything you're looking for.

    I do know that the earlier machines in this series (Burroughs B6700)
    kept the top two words of the stack (and their double precision
    extensions) in registers and the rest of the stack in memory. Later
    machines cached the top n words of the stack, which made stack access
    faster.

    I'm not sure I understand all of what you're trying to do. Are you
    trying to reuse stack offsets when variables are no longer needed?
    Are you trying to avoid assigning a stack offset to a variable that
    can be bounced around the top of the stack until it can simply go
    away?

    Are you sure you'd be saving memory? More complex code means more
    memory (and more development time and more bugs).

    Louis

    lkrupp *at* pssw *dot* com


  5. Default Re: Optimizing stack access for a stack based VM

    Alex McDonald wrote:

    >>There are separate address space for (1) code, (2) data, (3) stack.

    >
    > The separation between code and data isn't required in Forth (although
    > there may be advantages in certain architectures that treat each space
    > differently, or don't like mixing of code and data).


    IMO Forth is not a direct solution for the OP, since...

    > "Variables" in Forth are words that push their addresses on the stack,
    > as in the @ and ! example given above. Forth also has a value, or a
    > self fetching variable.


    Forth has no local variables (for recursive calls), the implementation
    of e.g. loop counters requires two separate stacks, for data and
    return addresses. Calling subroutines (Forth: compiled words) for the
    retrieval of references to variables IMO also is overkill, in a VM
    other than the Forth machine itself.

    Nonetheless Forth is a source of inspiration, for everybody who wants
    to learn more about really uncommon virtual machines :-)

    DoDi

  6. Default Re: Optimizing stack access for a stack based VM

    Louis Krupp wrote:

    > I do know that the earlier machines in this series (Burroughs B6700)
    > kept the top two words of the stack (and their double precision
    > extensions) in registers and the rest of the stack in memory. Later
    > machines cached the top n words of the stack, which made stack access
    > faster.


    AFAIR a 16 bit TI microprocessor also used the top 16 entries on the
    stack as registers. Perhaps this architecture reflected the Burroughs
    machine?

    DoDi
    [That seems to be pretty typical. The HP3000 kept the top several elements
    of the stack in registers as well. -John]

  7. Default Re: Optimizing stack access for a stack based VM

    > Did you have a look at the i80x87 FPU instruction set? Or at the Java
    > VM, or at FORTH?


    In my biased opinion, the prototypical non-ancient stack-based ISA is that of
    the transputer.

    Jan

  8. Default Re: Optimizing stack access for a stack based VM

    On Sep 13, 9:31 pm, Hans-Peter Diettrich <DrDiettri...@aol.com> wrote:
    > Alex McDonald wrote:
    > >>There are separate address space for (1) code, (2) data, (3) stack.

    >
    > > The separation between code and data isn't required in Forth (although
    > > there may be advantages in certain architectures that treat each space
    > > differently, or don't like mixing of code and data).

    >
    > IMO Forth is not a direct solution for the OP, since...


    I suggested the OP look at Forth, not implement Forth. There are many
    good ideas in Forth, and he can avoid many of the problems others have
    solved by doing a bit of research. Generating stack-based IL from
    Pascal is not difficult; getting the right stack-based IL is the hard
    part.

    > > "Variables" in Forth are words that push their addresses on the stack,
    > > as in the @ and ! example given above. Forth also has a value, or a
    > > self fetching variable.

    >
    > Forth has no local variables (for recursive calls),


    Your assertion is not true; Forth has local variables (actually, self-
    fetching values) which can be used in recursion. Preferably, the stack
    can be used for recursion and avoid locals entirely.

    : fib ( n -- n' )
    dup 1 > if
    dup 1- recurse swap 2 - recurse +
    then ;

    > the implementation of e.g. loop counters requires two separate
    > stacks, for data and return addresses.


    Forth requires two stacks (data and return) as a matter of course, and
    not just for loop counters or variables. This is not a big problem
    even in architectures that don't directly support any stacks at all
    (as I'm sure you know, stacks are easy to implement without hardware
    support).

    And loop counters and local variables can be held on the return stack,
    but don't need to be. If recursion isn't used, the stack sizes
    required for the data and return stacks need not be large; certainly,
    tens of entries are more than adequate for most purposes, and rarely
    if ever a requirement for hundreds or thousands of entries.

    > Calling subroutines (Forth: compiled words) for the retrieval of
    > references to variables IMO also is overkill, in a VM other than the
    > Forth machine itself.


    Your assumption is that Forth is exclusively subroutine-threaded.
    Threaded code Forths are easy to implement, debug and port, which is
    why threaded implementations are often described in the literature.

    But modern Forths generate native code, rather than threaded code, and
    (for instance) would compile this primitive and simplistic variable
    increment;

    x @ 1 + x !

    into native code;

    add x, 1

    --
    Regards
    Alex McDonald


  9. Default Re: Optimizing stack access for a stack based VM

    DrDiettrich1@aol.com (Hans-Peter Diettrich) wrote:

    > Forth has no local variables (for recursive calls),


    Local variables are in the Forth Standard, optional word set IIRC.

    Ken Young

  10. Default Re: Optimizing stack access for a stack based VM

    Hans-Peter Diettrich wrote:
    > AFAIR a 16 bit TI microprocessor also used the top 16 entries on the
    > stack as registers. Perhaps this architecture reflected the Burroughs
    > machine?


    The TI9900 might be what you remember. It was a PDP-11 wannabe, with 16
    registers instead of 8. Some neat features and some weird features.

    jeff
    [The registers, which TI called the workspace, were in RAM, with a
    pointer register saying where they were. So to stack the registers on
    a subroutine call, you just adjusted the pointer register. Not really
    a stack architecture, more a predecessor to the SPARC's register
    windows. -John]

+ Reply to Thread
Page 1 of 2 1 2 LastLast

Similar Threads

  1. Why stack overflow with such a small stack?
    By Application Development in forum RUBY
    Replies: 8
    Last Post: 08-31-2007, 11:21 PM
  2. Thread stack (not call stack)
    By Application Development in forum DOTNET
    Replies: 5
    Last Post: 07-14-2006, 04:19 PM
  3. Re: Implementing a stack-based interpreter
    By Application Development in forum Compilers
    Replies: 1
    Last Post: 08-10-2005, 10:52 AM
  4. Implementing a stack-based interpreter
    By Application Development in forum Compilers
    Replies: 11
    Last Post: 08-07-2005, 03:21 PM
  5. trail stack, global stack
    By Application Development in forum PROLOG
    Replies: 3
    Last Post: 06-30-2005, 02:02 PM