RfD - Local buffers, v4 (long)

#1
08-11-2008, 08:57 AM
Stephen Pelc

Don't forget to attend EuroForth 2008 and the Forth200x standards
meeting in gorgeous glorious Vienna from 25-28 September 2008.
http://www.euroforth.org

I have separated the enhanced local variable syntax from the
local buffer proposal. This proposal is about local buffers.

Stephen

RfD - Local buffers, v4
=======================
Stephen Pelc - 14 September 2007

20080811 Brought into line with LocalsExt4.txt.
20070914 Split local buffers to a separate proposal.

Problem
=======
When programming large applications, especially those interfacing
with a host operating system, there is a frequent need for temporary
buffers. This proposal is an extension to the extended locals
proposal, on which this proposal depends.

Current implementations show that creation and destruction of
local buffers are much faster than using ALLOCATE (14.6.1.0707)
and FREE (14.6.1.1605).

This proposal is derived from implementations that have existed for
more than 15 years.

Solution
========
The following syntax for local arguments and local values is
proposed elsewhere. The sequence:
{ ni1 ni2 ... | lv1 lv2 ... -- o1 o2 ... }
defines local arguments, local values, and outputs. The local
arguments are automatically initialised from the data stack on
entry, the rightmost being taken from the top of the data stack.
Local arguments and local values can be referenced by name within
the word during compilation. The output names are dummies to allow
a complete stack comment to be generated.
The items between { and | are local arguments.
The items between | and -- are local values.
The items between -- and } are outputs for formal comments only.

The outputs are provided in the notation so that complete stack
comments can be produced. However, all text between -- and } is
ignored. The facility is there to permit the notation to form a
complete stack comment. This eases documentation and current
users of the notation like this facility.

Local arguments and values return their values when referenced,
and must be preceded by TO to perform a store.

In the local value region, local buffers may be defined in the
form:
[ <expr> ] lbuff
At least one existing implementation uses the form:
lbuff[ <expr> ]
where any name ending in the '[' character indicates a local
buffer. To prevent conflict, we define names ending in '[' as
being an ambiguous condition.

Any name preceded by '[ <expr> ]' will be treated as a buffer whose
size is given by the result of interpreting the expression.
Local buffers return their base address; applying an operator such
as TO to a local buffer is an ambiguous condition.

In the example below, a and b are local arguments, a+b and a*b are
local values, and arr is a local buffer of 10 characters.

: foo { a b | a+b a*b [ 10 chars ] arr -- }
a b + to a+b
a b * to a*b
cr a+b . a*b .
arr 10 erase
s" Hello" arr swap cmove
arr 5 type
;

Forth 200x text
===============
Replace the text for 13.6.2.xxxx { as follows:

13.6.2.xxxx {
brace LOCAL EXT

Interpretation: Interpretation semantics for this word are undefined.

Compilation:
( "<spaces>arg1" ... "<spaces>argn" | "<spaces>lv1" ... "<spaces>lvn" -- )

Create up to eight local arguments by repeatedly skipping leading
spaces, parsing arg, and executing implementation defined actions.
The list of local arguments to be defined is terminated by "|", "--"
or "}". Append the run-time semantics for local arguments given below
to the current definition. If a space delimited '|' is encountered,
create up to eight local values or buffers by repeatedly skipping
leading spaces, parsing the "lv" token, and creating the local
element. The list of local values and buffers to be defined is
terminated by "--" or "}". Append the run-time semantics for local
values and local buffers given below to the current definition.
If "--" has been encountered, further text between "--" and } is
ignored.

Local buffers are declared in the form:
[ <expr> ] lbuff
The expression between the whitespace delimited '[' and the
closing ']' is parsed and passed to 7.6.1.1360 EVALUATE to obtain
the size of the storage in address units.
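
A minimal sketch of that size computation, assuming only the
standard words PARSE and EVALUATE. SIZE-EXPR is a hypothetical name
used for illustration and is not part of the proposal; the reference
implementation further down does the same job in LBSIZE, with the
extra state switching needed inside a definition.

```forth
\ Sketch only: SIZE-EXPR is a hypothetical name, not proposal text.
: SIZE-EXPR ( "expr<]>" -- u )
   [CHAR] ] PARSE      \ text up to, but not including, the next ']'
   EVALUATE ;          \ interpret that text; its result is the size

\ Interpreted use: prints the size of 10 characters in address units.
SIZE-EXPR 10 CHARS ] .
```

If the expression leaves anything other than a single cell, the
result is the ambiguous condition listed under b) below.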

Local argument run-time: ( x1 ... xn -- )
Local value run-time: ( -- )
Local buffer run-time: ( -- )

Initialise up to eight local arguments from the data stack.
Local argument arg1 is initialised with x1, arg2 with x2, and so
on up to argn with xn, which is on the top of the data stack. When
invoked, each local argument will return its value. The value
of a local argument may be changed using 13.6.1.2295 TO.

Initialise up to eight local values or local buffers. The
initial contents of local values and local buffers are undefined.
When invoked, each local value returns its value. The contents of
a local value may be changed using 13.6.1.2295 TO. The size of a
local value is a cell. When invoked, each local buffer will return
its address. The user may make no assumption about the order and
contiguity of separate local values and buffers in memory.

Ambiguous conditions:
a) The { ... } text extends over more than one line.
b) The expression for local buffer size does not return a single
cell.
c) { ... } is declared more than once in a word.
d) Parsing units '|', '[', ']', '--' and '}' are not whitespace
delimited.
e) A local argument, value or buffer name ends in the '[' character.

See: 3.4 The Forth text interpreter

Ambiguous conditions:
a local argument, value or buffer is executed while in
interpretation state.
TO is applied to a local buffer.


Reference implementation
=========================

0 [if]
This implementation supports the existing notation as well as
that of the proposed standard. This is done to prevent breaking
existing code.

BUILDLV c-addr u +n mode
When executed during compilation, BUILDLV passes a message to the
system identifying a new local argument whose definition name is
given by the string of characters identified by c-addr u. The size
of the data item is given by +n address units, and the mode
identifies the construction required as follows:
0 - finish construction of initialisation and data storage
allocation code. C-addr and u are ignored. +n is 0
(other values are reserved for future use).
1 - identify a local argument, +n = cell
2 - identify a local value, +n = cell
3 - identify a local buffer, +n = storage required.
4+ - reserved for future use
-ve - implementation specific values

The result of executing BUILDLV during compilation of a definition
is to create a set of named local arguments, values and/or
buffers, each of which is a definition name, that only have
execution semantics within the scope of that definition's source.
[then]

: BUILDLV \ c-addr u +n mode --
\ Dummy for testing
CR 2SWAP TYPE SPACE SWAP . .
;

: TOKEN \ -- caddr u
\ Get the next space delimited token from the input stream.
PARSE-NAME
;

: LTERM? \ caddr u -- flag
\ Return true if the string caddr/u is "--" or "}"
2DUP S" --" COMPARE 0= >R
S" }" COMPARE 0= R> OR
;

: LBSIZE \ -- +n
\ Parse up to the terminating ']' and EVALUATE the expression
\ not including the terminating ']'.
POSTPONE [ [CHAR] ] PARSE EVALUATE ]
;

: LB? \ caddr u -- flag
\ Return true if the last character of the string is '['.
+ 1 CHARS - C@ [CHAR] [ =
;

: LSEP? \ caddr u -- flag
\ Return true if the string caddr/u is the separator between
\ local arguments and local values or buffers.
2DUP S" |" COMPARE 0= >R
S" \" COMPARE 0= R> OR
;

: {
  1 >R
  BEGIN
    TOKEN 2DUP LTERM? 0=
  WHILE
    2DUP LSEP? IF
      2DROP R> DROP 0 >R
    ELSE
      R@ IF
        1 CELLS 1
      ELSE
        2DUP S" [" COMPARE 0= IF
          2DROP LBSIZE TOKEN ROT 3
        ELSE
          2DUP LB? IF
            1- LBSIZE 3
          ELSE
            1 CELLS 2
          THEN
        THEN
      THEN
      BUILDLV
    THEN
  REPEAT
  BEGIN
    S" }" COMPARE
  WHILE
    TOKEN
  REPEAT
  0 0 0 0 BUILDLV
  R> DROP ; IMMEDIATE

: TEST1 { a | b c[ 66] [ 77] d e -- f }
CR ." Hello1 " CR ;

TEST1

CR .( swapping c and d ) CR

: TEST2 { a | b [ 77] d c[ 66] e -- f }
CR ." Hello2 " CR ;

TEST2



--
Stephen Pelc, stephenXXX@mpeforth.com
MicroProcessor Engineering Ltd - More Real, Less Time
133 Hill Lane, Southampton SO15 5AF, England
tel: +44 (0)23 8063 1441, fax: +44 (0)23 8033 9691
web: http://www.mpeforth.com - free VFX Forth downloads

#2
08-11-2008, 12:30 PM
Andrew Haley

Stephen Pelc <stephenXXX@mpeforth.com> wrote:
> Don't forget to attend EuroForth 2008 and the Forth200x standards
> meeting in gorgeous glorious Vienna from 25-28 September 2008.
> http://www.euroforth.org


> I have separated the enhanced local variable syntax from the
> local buffer proposal. This proposal is about local buffers.


This is all very interesting, but why not just have a word LBUFFER
that allocates some (return) stack space? You could use it like this:

4 chars LBUFFER to buf

I suppose this is similar to the variable-length array declaration in
C versus alloca():

int p[N];

versus

int *const p = alloca(N * sizeof (int));

which are for almost all purposes equivalent.

I'm a bit mystified. There is an obvious simple syntax (LBUFFER,
above) and a complex special syntax. Given that Forth generally
eschews special syntactical forms, I can't see why you want this.

Andrew.

#3
08-11-2008, 12:50 PM
Bruce McFarling

On Aug 11, 12:30 pm, Andrew Haley <andre...@littlepinkcloud.invalid>
wrote:

> This is all very interesting, but why not just have a word LBUFFER
> that allocates some (return) stack space? You could use it like this:


> 4 chars LBUFFER to buf


Or?

4 CHARS LBUFFER { buf }

... since the basic { } locals syntax without all the extra syntax
gives the ability to declare a local value without necessitating a
dictionary entry ...

... would that be right?

I must say of all the proposed extended locals syntax, the local
buffer is the only bit I really want, and this looks like a reasonable
way to get it.



#4
08-11-2008, 01:11 PM
Andrew Haley

Bruce McFarling <agila61@netscape.net> wrote:
> On Aug 11, 12:30 pm, Andrew Haley <andre...@littlepinkcloud.invalid>
> wrote:


> > This is all very interesting, but why not just have a word LBUFFER
> > that allocates some (return) stack space? You could use it like this:


> > 4 chars LBUFFER to buf


> Or?


> 4 CHARS LBUFFER { buf }


> ... since the basic { } locals syntax without all the extra syntax
> gives the ability to declare a local value without necessitating a
> dictionary entry ...


> ... would that be right?


> I must say of all the proposed extended locals syntax, the local
> buffer is the only bit I really want, and this looks like a reasonable
> way to get it.


Yes.

Andrew.

#5
08-11-2008, 01:36 PM
Andrew Haley

Thomas Pornin <pornin@bolet.org> wrote:
> According to Andrew Haley <andrew29@littlepinkcloud.invalid>:
> > This is all very interesting, but why not just have a word LBUFFER
> > that allocates some (return) stack space? You could use it like this:
> >
> > 4 chars LBUFFER to buf


> I see some implementation problems:


> -- The LBUFFER word must be special, since it must allocate things
> on the return stack _as seen by the caller_, notwithstanding what
> the call procedure to LBUFFER itself pushed on that very same
> stack. This may be quite tricky in some implementations.


I don't believe it will be all that tricky. Certainly no more tricky
than local variables.

> -- Local variables are fast and easy to access as long as the return
> stack remains untouched. Basically, locals are meant to be accessed
> by using the return stack pointer, with a fixed offset computed at
> word compilation time (when "(local)" was invoked).


They aren't "meant" to be an offset from the return stack pointer.
The obvious implementation technique was always to use a register as a
base pointer to locals, or even to put the locals directly into
registers. The fact that you have to allow DO ... LOOPs (and the
traditional way DO...LOOP is implemented) suggests that you can't
simply base locals off the return stack pointer.
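
To illustrate the base-pointer idea, here is a minimal sketch in
standard Forth. It assumes a separate locals stack addressed through
a frame pointer; LSTACK, LP, FRAME, UNFRAME, L@ and L! are all
hypothetical names chosen for this example and are not taken from any
system mentioned in the thread.

```forth
\ Hypothetical separate locals stack with its own frame pointer.
CREATE LSTACK 64 CELLS ALLOT
VARIABLE LP   LSTACK 64 CELLS + LP !     \ grows downward from the end

: FRAME   ( n -- )    CELLS NEGATE LP +! ;   \ open a frame of n cells
: UNFRAME ( n -- )    CELLS LP +! ;          \ close it again
: L@      ( i -- x )  CELLS LP @ + @ ;       \ fetch local number i
: L!      ( x i -- )  CELLS LP @ + ! ;       \ store local number i

: DEMO ( x -- x )
   1 FRAME  0 L!        \ one local, initialised from the stack
   99 >R                \ return stack traffic does not move LP
   0 L@                 \ so the local is still at the same offset
   R> DROP
   1 UNFRAME ;

42 DEMO .   \ prints 42
```

Because locals are addressed relative to LP rather than the return
stack pointer, >R, R> and loop parameters cannot disturb them, and a
frame of any size computed at run time could be opened the same way.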

> This precludes dynamic allocation of space within the return stack:
> if the return stack is used directly (for instance with >R or R>),
> then the locals remain unavailable until the return stack has been
> restored to its initial state. Of course, nothing in the ANS
> standard _mandates_ that the return stack uses a pointer and that
> locals are accessed by offsetting that pointer; but many existing
> implementations implement "(LOCAL)" that way, and ANS is especially
> designed to allow such an implementation. Adapting them to dynamic
> allocation with a length computed at runtime may prove quite
> difficult.


If that's true, which I very much doubt, it applies equally to *all*
techniques that allocate local buffers on the return stack.

> Note that C manages to handle so-called variable-length arrays because
> it accesses locals through a frame pointer, initialized from the stack
> pointer upon function entry, and stored in a dedicated register. Forth
> is often a bit more register-starved than C, since it has an additional
> stack to maintain.


I think you're exaggerating here. "often a bit more register-starved
than C" ? Because of _one_ pointer? Really? :-)

Andrew.

#6
08-12-2008, 06:58 AM
Andrew Haley

Thomas Pornin <pornin@bolet.org> wrote:
> According to Andrew Haley <andrew29@littlepinkcloud.invalid>:
> > They aren't "meant" to be an offset from the return stack pointer.
> > The obvious implementation technique was always to use a register as a
> > base pointer to locals, or even to put the locals directly into
> > registers. The fact that you have to allow DO ... LOOPs (and the
> > traditional way DO...LOOP is implemented) suggests that you can't
> > simply base locals off the return stack pointer.


> If your compiler supports local variables, then the obvious way to
> implement DO...LOOP is to reserve an unnamed local variable for
> the loop counter.


Well, that's a way to do it: I strongly dispute that it's the obvious
way, especially on a small system, but let's move on.

> It equally applies to all techniques that allocate local buffers
> _with a size known only at runtime_.


Right, which is what the proposal is all about. The purpose of this
is a way to get variable-size local buffers.

> > I think you're exaggerating here. "often a bit more
> > register-starved than C" ? Because of _one_ pointer? Really?
> > :-)


> Because of at least two, not one. A basic ITC-based Forth system
> requires a data stack pointer, a return stack pointer and the extra
> register which receives the word header address.


Sure, but a basic ITC-based Forth system does not put local variables
in registers. When you take that into account, the need for registers
is small. The 8051 with 8 8-bit registers was tight, but that was
some time ago.

Andrew.

#7
08-12-2008, 10:59 AM
Stephen Pelc

On Mon, 11 Aug 2008 11:30:42 -0500, Andrew Haley
<andrew29@littlepinkcloud.invalid> wrote:

>This is all very interesting, but why not just have a word LBUFFER
>that allocates some (return) stack space? You could use it like this:
>
> 4 chars LBUFFER to buf


It's been done, but in this proposal release is automatic at EXIT
and friends. In your proposal you need an undo operation. You
still need a local for buf, so we end up with
: foo { a b | c d buf }
4 chars LBUFFER to buf
...
buf DISCARD
;
versus
: foo { a b | c d [ 4 chars ] buf }
...
;

Your notation suffers badly when there are multiple exits from
a word.

>I'm a bit mystified. There is an obvious simple syntax (LBUFFER,
>above) and a complex special syntax. Given that Forth generally
>eschews special syntactical forms, I can't see why you want this.


What Forth applications do you write these days? And which modern
Forth systems have you surveyed?

Stephen


#8
08-12-2008, 02:15 PM
Bruce McFarling

On Aug 12, 10:59 am, stephen...@mpeforth.com (Stephen Pelc) wrote:
> It's been done, but in this proposal release is automatic at EXIT
> and friends. In your proposal you need an undo operation.


I don't follow this ... why is automatic release harder for LBUFFER
than for { ... | ... [ ... ] ... }?

> You still need a local for buf, so we end up with


> : foo { a b | c d buf }
> 4 chars LBUFFER to buf
> ...
> buf DISCARD
> ;


This would be a proposal independent of the "|" proposal, so wouldn't
it be:

: foo
4 CHARS LBUFFER { buf a b }
...
;


#9
08-13-2008, 05:49 AM
Andrew Haley

Stephen Pelc <stephenXXX@mpeforth.com> wrote:
> On Mon, 11 Aug 2008 11:30:42 -0500, Andrew Haley
> <andrew29@littlepinkcloud.invalid> wrote:


> >This is all very interesting, but why not just have a word LBUFFER
> >that allocates some (return) stack space? You could use it like
> >this:
> >
> > 4 chars LBUFFER to buf


> It's been done, but in this proposal release is automatic at EXIT
> and friends. In your proposal you need an undo operation.


I was assuming that *any* release of local buffers is automatic at
EXIT. I was only querying the additional syntax. It seems that some
of these proposals add syntax where none is needed, that's all.
That's why I'm asking the question.

> You
> still need a local for buf, so we end up with
> : foo { a b | c d buf }
> 4 chars LBUFFER to buf
> ...
> buf DISCARD
> ;
> versus
> : foo { a b | c d [ 4 chars ] buf }
> ...
> ;


I was thinking more along the lines of

: foo { a b | c d buf }
4 chars LBUFFER to buf
...
;

> Your notation suffers badly when there are multiple exits from
> a word.


No, not at all. I wasn't proposing manual discard, just asking about
the additional syntax. Given that you have a way to do this that
avoids new syntax, and a way that adds it, why do you prefer adding
syntax? Is there *any* reason other than "this is the way we do it" ?
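
The two readings of LBUFFER discussed here, manual release and
automatic release, can be sketched in standard Forth. This is a toy
model under stated assumptions: a static pool stands in for return
stack space, LBUFFER and DISCARD are stand-ins for the words named in
the thread, and POOL, POOL-TOP, IN-USE, FRAMED, GREET and GREET2 are
hypothetical names. There is no overflow or alignment handling.

```forth
\ Toy pool standing in for return stack space (assumption).
CREATE POOL 256 ALLOT
VARIABLE POOL-TOP   POOL POOL-TOP !

: LBUFFER ( u -- addr )  POOL-TOP @ SWAP OVER + POOL-TOP ! ;
: DISCARD ( addr -- )    POOL-TOP ! ;   \ release addr and all after it
: IN-USE  ( -- u )       POOL-TOP @ POOL - ;

\ Manual release: every exit path needs its own DISCARD.
: GREET ( quiet? -- )
   8 CHARS LBUFFER >R
   IF  R> DISCARD EXIT  THEN        \ early exit
   S" Hello" R@ SWAP CMOVE
   R@ 5 TYPE
   R> DISCARD ;                     \ normal exit

\ Automatic release: a wrapper records the pool top and restores it
\ whichever way the wrapped word returns.
: FRAMED ( i*x xt -- j*x )
   POOL-TOP @ >R  EXECUTE  R> POOL-TOP ! ;

: GREET2 ( quiet? -- )
   8 CHARS LBUFFER >R
   IF  R> DROP EXIT  THEN           \ no DISCARD on this path
   S" Hello" R@ SWAP CMOVE
   R> 5 TYPE ;                      \ nor on this one

FALSE GREET               \ types Hello
TRUE ' GREET2 FRAMED      \ types nothing; the buffer is still released
```

In a system with built-in support the restore step would be part of
EXIT itself, so neither a DISCARD nor a wrapper would appear in the
source. FRAMED as written does not restore the pool if the wrapped
word leaves through THROW.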

> >I'm a bit mystified. There is an obvious simple syntax (LBUFFER,
> >above) and a complex special syntax. Given that Forth generally
> >eschews special syntactical forms, I can't see why you want this.


> What Forth applications do you write these days? And which modern
> Forth systems have you surveyed?


I haven't written Forth applications for quite a few years. I think
I'm still entitled to ask questions, though.

Are you trying to poison the well, Steve? Shame on you.

Andrew.

#10
08-13-2008, 08:13 AM
Stephen Pelc

On Wed, 13 Aug 2008 04:49:09 -0500, Andrew Haley
<andrew29@littlepinkcloud.invalid> wrote:

>No, not at all. I wasn't proposing manual discard, just asking about
>the additional syntax.


My misunderstanding.

>Given that you have a way to do this that
>avoids new syntax, and a way that adds it, why do you prefer adding
>syntax?


>> >I'm a bit mystified. There is an obvious simple syntax (LBUFFER,
>> >above) and a complex special syntax. Given that Forth generally
>> >eschews special syntactical forms, I can't see why you want this.


I've seen at least three mechanisms using LBUFFER style words:
a) local frame
b) sbrk style frame
c) frame from heap

There are also techniques and operating systems in which such buffers
have persistence beyond a single word. Consequently a separate release
word is appropriate.

>> What Forth applications do you write these days? And which modern
>> Forth systems have you surveyed?

>
>I haven't written Forth applications for quite a few years. I think
>I'm still entitled to ask questions, though.
>
>Are you trying to poison the well, Steve? Shame on you.


My apologies if you were offended. I was simply trying to find out
the perspective of Forth which leads to your remarks.

Stephen


All times are GMT -5.