separate stacks

This is a discussion on separate stacks within the Forth forums in Programming Languages category.

  #61  
Old 08-08-2008, 04:15 PM
jacko
Guest
 
Default Re: separate stacks

hi

> > > Running a block RAM at twice the speed of the rest of the CPU might
> > > not be sufficient. If you are pushing the speed of the processor, it
> > > is likely that nearly half of the cycle time will be for program
> > > memory output delay and decoding the instruction. It could be pretty
> > > hard to form an address based on the current instruction in the first
> > > half of a clock cycle and still meet the setup time of the RAM. The
> > > output delay seems to be a lot longer than the setup time however, so
> > > it might work out ok. But the only way to tell for sure is to
> > > implement a design. There are many, many details in a processor
> > > design that can greatly impact this sort of tradeoff.

>
> > I hadn't considered some of these drawbacks to double-clocking the
> > RAM. Maybe it's better that I went with a single-clock design. But it
> > works and I'm stuck with it. A single "simple dual port" block RAM is
> > used for all stacks and user variables. Actually, four RAM blocks are
> > used to make it byte-addressable.


http://nibz.googlecode.com went for unified memory. This has advantages
when multiple block RAMs are not available. The main address setup
time delay is due to the predecrement before a write. The block RAM itself
can perform the write in a later cycle, so it really is just a
registered event. A simple BUS_GNT_I signal will account for any
read delay. Early write, late read.

> > There were two cases of RAM access I had to work around:

>
> > 1. Simultaneous read and write to the same address. The RAM's output
> > is indeterminate so the data must be supplied by a register.

>
> I'm not clear on what situation requires this, but the block rams I
> have looked at provide several modes for dealing with this. They can
> do a write before read or a write after read (at least Xilinx can also
> do a read data hold through write). I used the write before read and
> use the output of the block ram as the Next On Stack (NOS) register.
> I implemented a register for TOS. So the block ram is a true stack.
> It is just behind the TOS register which is the top of stack to my
> CPU. So when a push operation is performed, the TOS is written to the
> ram stack and the data from the instruction is written into the TOS
> register. Since the ram always does a read on whatever address is
> used even if the cycle is a write, the output of the RAM now shows the
> data written and is ready for the next instruction cycle.
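
The arrangement quoted above (a TOS register in front of a write-before-read block RAM whose output serves as NOS) can be modelled roughly as follows. This is only a behavioural sketch in Python, not anyone's actual HDL; the class name `RamStack`, the depth of 16 and the wrap-around stack pointer are assumptions made for illustration.

```python
class RamStack:
    """Data stack: TOS in a register, the rest in a write-first block RAM."""

    def __init__(self, depth=16):
        self.ram = [0] * depth   # block RAM contents (hypothetical depth)
        self.sp = 0              # RAM address of the current NOS cell
        self.tos = 0             # top-of-stack register
        self.nos = 0             # RAM output, used as next-on-stack

    def push(self, value):
        # Write cycle: the old TOS goes into the next RAM cell. In
        # write-before-read mode the RAM output shows the word just written,
        # so NOS is valid for the following instruction.
        self.sp = (self.sp + 1) % len(self.ram)
        self.ram[self.sp] = self.tos
        self.nos = self.ram[self.sp]
        self.tos = value

    def pop(self):
        # Read cycle: TOS is refilled from the RAM output, and the RAM
        # then presents the cell below it.
        value = self.tos
        self.tos = self.nos
        self.sp = (self.sp - 1) % len(self.ram)
        self.nos = self.ram[self.sp]
        return value


stack = RamStack()
for n in (1, 2, 3):
    stack.push(n)
print(stack.tos, stack.nos)  # TOS and NOS after three pushes
```

Only one RAM address is touched per operation, which is why a single port is enough for this stack.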


I never have this problem, as there is no simultaneous memory access.
Only one address per cycle, by design.

> > 2. Read before write to the same address. This is because instructions
> > can have three steps: read, process and write. R@, for example,
> > requests a read, latches the read result into TOS, and starts a write
> > cycle to store the previous TOS. The instruction following R@ may
> > request a read before R@'s write in which case the read result must be
> > taken from what R@ will be writing which is usually T. It sounds a
> > little confusing and it is, but suffice it to say that random read/
> > write access to dual port RAM has muxes in the read path to work
> > around ambiguous conditions.
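
The "muxes in the read path" idea in case 2 can be sketched as a read bypass: if a read hits the address of a write that has been requested but not yet committed, the data comes from the write register instead of the RAM. The Python below is a hedged behavioural model only; `ForwardingRam`, its one-entry pending write and the method names are hypothetical, not taken from the design being discussed.

```python
class ForwardingRam:
    """Synchronous RAM model with one pending write and a read bypass."""

    def __init__(self, size=16):
        self.mem = [0] * size
        self.pending = None  # (address, data) of a write not yet committed

    def request_write(self, addr, data):
        # Any earlier write lands first; the new one is only registered
        # and reaches memory on a later commit (the next clock).
        self.commit()
        self.pending = (addr, data)

    def commit(self):
        if self.pending is not None:
            addr, data = self.pending
            self.mem[addr] = data
            self.pending = None

    def read(self, addr):
        # Read-path mux: on an address match, take the data from the
        # write register rather than the stale RAM cell.
        if self.pending is not None and self.pending[0] == addr:
            return self.pending[1]
        return self.mem[addr]


ram = ForwardingRam()
ram.request_write(3, 99)   # e.g. R@ storing the previous TOS
print(ram.read(3))         # the following instruction sees the new value
```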


The complexity of the control unit increases as multiple addressing is
considered. I have avoided this by design. The late read is the only
thing to worry about. This is solved by clocking the execution of the
instruction into the registers at the same time as the next instruction's
address is presented (i.e. fetch setup). This is what introduces the
branch delay: the updated program counter is not available at the
point where the fetch address is needed. Half speed or a branch delay slot is
the choice, and I went for the branch delay slot. Loading the program
counter from the instruction register avoids this, though, for certain
types of flow control instruction.
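
The effect of a branch delay slot can be shown with a toy fetch loop: because the next fetch address is presented while the current instruction is still executing, a jump only redirects the fetch after one more instruction has run. This is a generic Python sketch of a delay slot, not a model of the nibz core; the `run` function and the `("jump", target)` instruction tuples are invented for illustration.

```python
def run(program, max_steps=20):
    """Execute a toy program where a jump takes effect one instruction late."""
    pc = 0
    pending_target = None   # branch target waiting behind the delay slot
    trace = []              # addresses actually executed, in order
    for _ in range(max_steps):
        if pc >= len(program):
            break
        op, arg = program[pc]
        trace.append(pc)
        next_pc = pc + 1
        if pending_target is not None:
            # The instruction just executed was the delay slot.
            next_pc = pending_target
            pending_target = None
        if op == "jump":
            pending_target = arg
        elif op == "halt":
            break
        pc = next_pc
    return trace


program = [
    ("nop", None),   # 0
    ("jump", 4),     # 1
    ("nop", None),   # 2: delay slot, still executed
    ("nop", None),   # 3: skipped
    ("halt", None),  # 4
]
print(run(program))
```

The alternative mentioned above, running at half speed, would amount to waiting a cycle for the updated program counter before every fetch.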

> This sounds exactly like what I implemented. I'm not clear on what
> the initial read is, but I assume it is reading the return stack
> register or ram. Is the same ram used for the return stack? If so,
> you can use a separate port for each of the two stacks and they can
> operate independently in parallel.
>
> Also, the read/write interaction can help here. If you are concerned
> about the data written being read in the next cycle, it won't be a
> problem. Using the write before read mode will put the written data
> on the RAM output for the next cycle.


I only ever write from registers, never through logic calculation,
which keeps the data setup within bounds.

> > Maybe short, zero operand instructions aren't the best option for
> > getting high performance from Forth-friendly processors. They are if
> > your compiler isn't very sophisticated, but if you have an analytical
> > compiler that maps the Forth into registers then the object code
> > consists of longer but fewer instructions. I think if you can keep a
> > register machine fed with instructions, it will be faster than a zero-
> > operand (small instruction) machine. Having said that, I don't care
> > because if I need speed I can get it in hardware.


Always, as data locality makes register-cached copies available in
parallel, with no memory access needed. The question is how much register
data locality can be assigned in a small CPU when area or logic
resources are paramount.

> I can't say if one is faster than the other in general. Typically the
> more complex instruction sets can run faster because the code can
> access a larger number of operands and stack manipulation instructions
> are not needed. But then data has to be fetched and stored in memory
> which require separate instructions. So it depends on how your code
> is written. I am sure that a C compiler will never produce speedy
> code running on a stack machine. But then, if writing in Forth, I
> expect a stack machine has no penalty and is optimal for code size and
> processor complexity.


C could compile efficiently if the compiler made DTC (direct threaded
code). This would involve factoring RTL primaries, and compiling composite
instruction xts. This is also another reason I went for 1 memory indirection
per instruction, so that RTL factorization became much simpler.

> Rick


cheers.
jacko
  #62  
Old 08-08-2008, 04:41 PM
jacko
Guest
 
Default Re: separate stacks

hi

example algorithm:

Migrate reads later and writes earlier until data dependency
defines the logic order. This is why it is good to split read from write.

Define a small group of stack-easy ops. Use these as combinators to
place locally needed variables by transforming the variable name, i.e. y =
x SWAP. Make sure that loads and saves between stack and storage happen at
the right stack depth to prevent ROLL madness.

When the program is written in one variable, consider it variable factored.
Now factor common pairs of simplified primitives, based on the most common
pair, recursively. When all pair xts as singles (singletons) are equally
likely or have no duplication, consider the code factored.
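
One possible reading of the pair-factoring step, sketched in Python: repeatedly give the most common adjacent pair of words its own name until no pair occurs twice. This is an interpretation, not the poster's tool; `factor_pairs`, the generated names `w0`, `w1`, ... and the sample word list are all made up for illustration.

```python
from collections import Counter


def factor_pairs(words):
    """Repeatedly replace the most common adjacent pair with a new word."""
    words = list(words)
    definitions = {}            # new word name -> the pair it stands for
    while True:
        pairs = Counter(zip(words, words[1:]))
        if not pairs:
            break
        pair, count = pairs.most_common(1)[0]
        if count < 2:
            break               # no duplication left: consider it factored
        name = f"w{len(definitions)}"
        definitions[name] = pair
        out, i = [], 0
        while i < len(words):
            if i + 1 < len(words) and (words[i], words[i + 1]) == pair:
                out.append(name)    # substitute the new word for the pair
                i += 2
            else:
                out.append(words[i])
                i += 1
        words = out
    return words, definitions


code, defs = factor_pairs("DUP * SWAP DUP * +".split())
print(code)
print(defs)
```

Each pass shortens the word list, so the loop always terminates.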

cheers
jacko
  #63  
Old 08-08-2008, 08:43 PM
joelseph
Guest
 
Default Re: separate stacks

On Aug 2, 8:50 PM, DavidM <nos...@nowhere.com> wrote:
> I note that Forth has traditionally had 2 stacks - data and return.
>
> Then came the float stack.
>
> Then, there has been further separation with a locals stack, and even an
> 'object' stack for forths with OO capability.
>
> What do people think about this separation of stacks? Does it clutter the
> thinking to separate out locals and objects into their own stacks, or is
> it more of a help?


The way I see it, you split the stacks to separate context domains.
The return address context is not really (usually) de-synchronized
from the parameter context, but we split them in FORTH because it
makes for a sturdier execution environment without a lot of mechanical
optimization.

Splitting floats out was also primarily for convenience in building
compilers with simple optimizations. Splitting locals out is similar,
although it can help (when done with care), in terms of letting the
persistence of parameters become a bit sloppier.

Objects, well, objects should persist completely without reference to
the call context, which would indicate to me a non-stack, non-heap
list for maintaining objects. But if your objects are owned by other
objects or have some other nested persistence relationship, there
might be a reason for an object stack. I'm not sure whether that stack
would be appropriate for allocation to a cpu register.

But, then, I really don't know anything about objects, so my comments
on that are not necessarily meaningful.
  #64  
Old 08-09-2008, 08:09 AM
DavidM
Guest
 
Default Re: separate stacks

On Fri, 08 Aug 2008 17:43:47 -0700, joelseph wrote:

> On Aug 2, 8:50 PM, DavidM <nos...@nowhere.com> wrote:
>> I note that Forth has traditionally had 2 stacks - data and return.
>>
>> Then came the float stack.
>>
>> Then, there has been further separation with a locals stack, and even
>> an 'object' stack for forths with OO capability.
>>
>> What do people think about this separation of stacks? Does it clutter
>> the thinking to separate out locals and objects into their own stacks,
>> or is it more of a help?

>
> The way I see it, you split the stacks to separate context domains. The
> return address context is not really (usually) de-synchronized from the
> parameter context, but we split them in FORTH because it makes for a
> sturdier execution environment without a lot of mechanical optimization.
>
> Splitting floats out was also primarily for convenience in building
> compilers with simple optimizations. Splitting locals out is similar,
> although it can help (when done with care), in terms of letting the
> persistence of parameters become a bit sloppier.
>
> Objects, well, objects should persist completely without reference to
> the call context, which would indicate to me a non-stack, non-heap list
> for maintaining objects. But if your objects are owned by other objects
> or have some other nested persistence relationship, there might be a
> reason for an object stack. I'm not sure whether that stack would be
> appropriate for allocation to a cpu register.
>
> But, then, I really don't know anything about objects, so my comments on
> that are not necessarily meaningful.


Well, as it happens, I changed aumForth to use a separate objects stack.
But it ended up creating more problems than it solved, so I changed it
back.

Some things are best discovered by trial and error.

  #65  
Old 08-18-2008, 04:24 AM
David Thompson
Guest
 
Default Re: separate stacks

Just nitpicking:
On Wed, 6 Aug 2008 11:02:13 -0700 (PDT), rickman <gnuarm@gmail.com>
wrote:
<snip: DUP on empty stack>
> > I once (around) 1981 had FIGFORTH on my ELF-II and it consequently
> > responded with 42. It turned out that this was the ASCII value of
> > '.' ;-)

>

You must be remembering inexactly ...

> Are you sure that this particular Forth was not designed to provide
> "The Answer to Life, the Universe, and Everything"???
>
> Or maybe it does calculations in base 13? ;^)
>

and you're obscure and slightly off.

period is dec 62 hex 3E base13 4A base15 42
colon is dec 42 hex 2A

- formerly david.thompson1 || achar(64) || worldnet.att.net
  #66  
Old 08-18-2008, 11:50 AM
rickman
Guest
 
Default Re: separate stacks

On Aug 18, 4:24 am, David Thompson <dave.thomps...@verizon.net> wrote:
> Just nitpicking:
> On Wed, 6 Aug 2008 11:02:13 -0700 (PDT), rickman <gnu...@gmail.com>
> wrote:
> <snip: DUP on empty stack>
> > > I once (around) 1981 had FIGFORTH on my ELF-II and it consequently
> > > responded with 42. It turned out that this was the ASCII value of
> > > '.' ;-)

>
> You must be remembering inexactly ...
>
> > Are you sure that this particular Forth was not designed to provide
> > "The Answer to Life, the Universe, and Everything"???

>
> > Or maybe it does calculations in base 13? ;^)

>
> and you're obscure and slightly off.
>
> period is dec 62 hex 3E base13 4A base15 42
> colon is dec 42 hex 2A


Do you have trouble seeing the forests for the trees?

Rick
  #67  
Old 08-18-2008, 02:04 PM
Coos Haak
Guest
 
Default Re: separate stacks

David Thompson wrote:

> Just nitpicking:
> On Wed, 6 Aug 2008 11:02:13 -0700 (PDT), rickman <gnuarm@gmail.com>
> wrote:
> <snip: DUP on empty stack>
>> > I once (around) 1981 had FIGFORTH on my ELF-II and it
>> > consequently responded with 42. It turned out that this was the
>> > ASCII value of '.' ;-)

>>

> You must be remembering inexactly ...


This is what I wrote, not rickman.
And yes, CHAR . is equal to 46 decimal ;-)

--
Coos

  #68  
Old 08-22-2008, 07:58 PM
Jonah Thomas
Guest
 
Default Re: separate stacks

Coos Haak <chforth@hccnet.nl> wrote:
> David Thompson wrote:
> > Just nitpicking:
> > rickman <gnuarm@gmail.com> wrote:
> > <snip: DUP on empty stack>
> >> > I once (around) 1981 had FIGFORTH on my ELF-II and it
> >> > consequently responded with 42. It turned out that this was the
> >> > ASCII value of '.' ;-)
> >>

> > You must be remembering inexactly ...

>
> This is what I wrote, not rickman.
> And yes, CHAR . is equal to 46 decimal ;-)


Why were you using base 11?
  #69  
Old 08-23-2008, 04:01 PM
Coos Haak
Guest
 
Default Re: separate stacks

Jonah Thomas wrote:

> Coos Haak <chforth@hccnet.nl> wrote:
>> David Thompson wrote:
>> > Just nitpicking:
>> > rickman <gnuarm@gmail.com> wrote:
>> > <snip: DUP on empty stack>
>> >> > I once (around) 1981 had FIGFORTH on my ELF-II and it
>> >> > consequently responded with 42. It turned out that this was
>> >> > the ASCII value of '.' ;-)
>> >>
>> > You must be remembering inexactly ...

>>
>> This is what I wrote, not rickman.
>> And yes, CHAR . is equal to 46 decimal ;-)

>
> Why were you using base 11?

Because I'm way over 46 ?
--
Coos

  #70  
Old 09-01-2008, 02:53 AM
David Thompson
Guest
 
Default Re: separate stacks

On Mon, 18 Aug 2008 08:24:47 GMT, I wrote:

> Just nitpicking:
> On Wed, 6 Aug 2008 11:02:13 -0700 (PDT), rickman <gnuarm@gmail.com>
> wrote:
> <snip: DUP on empty stack>
> > > I once (around) 1981 had FIGFORTH on my ELF-II and it consequently
> > > responded with 42. It turned out that this was the ASCII value of
> > > '.' ;-)

> >

> You must be remembering inexactly ...


> > Or maybe it does calculations in base 13? ;^)
> >

> and you're obscure and slightly off.
>
> period is dec 62 hex 3E base13 4A base15 42
> colon is dec 42 hex 2A
>

Aargh! I swapped columns 2 & 3.
period dec 46 hex 2E b13 37 b11 42
asterisk dec 42 hex 2A
digit 6 dec 54 hex 36 b13 42
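
The corrected figures can be checked mechanically. A small Python sketch (the helper `to_base` is written here for the purpose, not a library call) that prints the ASCII code of each character in the bases mentioned:

```python
def to_base(n, base):
    """Render a non-negative integer in the given base (2 to 16)."""
    digits = "0123456789ABCDEF"
    if n == 0:
        return "0"
    out = ""
    while n > 0:
        out = digits[n % base] + out   # peel off the lowest digit
        n //= base
    return out


for name, ch in (("period", "."), ("asterisk", "*"), ("digit 6", "6")):
    code = ord(ch)
    print(name, "dec", code, "hex", to_base(code, 16),
          "b13", to_base(code, 13), "b11", to_base(code, 11))
```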

Sorry.
- formerly david.thompson1 || achar(64) || worldnet.att.net

All times are GMT -5.

