copy or not? (style question)

This is a discussion on copy or not? (style question) within the lisp forums in Programming Languages category; Hi everyone, I have a question about programming style. Before Lisp, I used R, where almost all arguments to a function are deeply copied, eg > f <- function(a) { + a[2] <- 9 + a + } > a1 <- 1:3 > a1 [1] 1 2 3 > f(a1) [1] 1 9 3 > a1 [1] 1 2 3 Initially, I tried to imitate this in Lisp, and wrote my programs with the following "contract" in mind: a function that uses a non-atomic argument has to copy it. For example, think about a function that calculates a (univariate) spline ...

Go Back   Application Development Forum > Programming Languages > lisp

Object Mix

Register FAQ Calendar Search Today's Posts Mark Forums Read
  #1  
Old 08-24-2008, 01:09 PM
Tamas K Papp
Guest
 
Default copy or not? (style question)

Hi everyone,

I have a question about programming style. Before Lisp, I used R, where
almost all arguments to a function are deeply copied, eg

> f <- function(a) {

+ a[2] <- 9
+ a
+ }
> a1 <- 1:3
> a1

[1] 1 2 3
> f(a1)

[1] 1 9 3
> a1

[1] 1 2 3

Initially, I tried to imitate this in Lisp, and wrote my programs with
the following "contract" in mind: a function that uses a non-atomic
argument has to copy it.

For example, think about a function that calculates a (univariate) spline
at arbitrary real numbers, where this function is returned by another
function and the information about the spline (the polynomials) is
captured in an array, which is in the closure. Skeleton code:

(defun make-spline-function (knots polynomials)
;; [check knots and polynomials]
(let ((knots (array-copy knots))
(polynomials (array-copy polynomials)))
(lambda (x)
;; [calculate spline at x]
)))

The assumption here is that I need to copy, because the user may make
changes to knots and polynomials after make-spline-function is called,
which can lead to a mess. For example, knots may not be weakly
increasing anymore, so the function inside would need to check for that
each time.

Now I am drifting towards the opposite style, where I do not make copies:
the assumption is that whoever calls the function should not mess with
the arrays afterwards. If the user needs to change the structure, he
should make a copy and change that.

I think that both styles can be consistent. The latter seems elegant,
but less "secure". OTOH, when I write this code as a library, I guess I
can just document the fact that arrays are not copied and the user should
act accordingly.

Which style is more "natural" in Lisp? What do you use?

Do you worry about "consistency checks" because the data structures that
are not copied could have changed since when they were captured?

Thanks,

Tamas
Reply With Quote
  #2  
Old 08-24-2008, 02:28 PM
Pascal J. Bourguignon
Guest
 
Default Re: copy or not? (style question)

Tamas K Papp <tkpapp@gmail.com> writes:
> Now I am drifting towards the opposite style, where I do not make copies:
> the assumption is that whoever calls the function should not mess with
> the arrays afterwards. If the user needs to change the structure, he
> should make a copy and change that.
>
> I think that both styles can be consistent. The latter seems elegant,
> but less "secure". OTOH, when I write this code as a library, I guess I
> can just document the fact that arrays are not copied and the user should
> act accordingly.
>
> Which style is more "natural" in Lisp?


Both.


In CL, there are functions that are specified not to change the
arguments, and others that are allowed to change them.

For example:

(let ((a (list 1 2 3 2 4 2 5 2)))
(print (remove 2 a))
a)
prints:
(1 3 4 5)
--> (1 2 3 2 4 2 5 2)


On the other hand:

(let ((a (list 1 2 3 2 4 2 5 2)))
(print (delete 2 a))
a)
prints:
(1 3 4 5)
--> (1 3 4 5)


For a lot of functions there's a pair F / NF, F doesn't modify its
arguments, while NF must avoid consing, and therefore is allowed to
modify its arguments. ((substitute nsubstitute) (reverse nreverse) ...)



However, in a programming language without a garbage collector it's
rather hard to manage structure sharing and functions returning new
structures, so it is customary in these programming languages to make
a lot of copies. But in Lisp we have a garbage collector, so we can
on the contrary optimize out time and space, by sharing structures.
Therefore I would rather advise using pure functions, that won't
modify their arguments, and for which it is useless to copy them.


> What do you use?


Both. See above.



--
__Pascal Bourguignon__ http://www.informatimago.com/
Cats meow out of angst
"Thumbs! If only we had thumbs!
We could break so much!"
Reply With Quote
  #3  
Old 08-24-2008, 03:16 PM
Tamas K Papp
Guest
 
Default Re: copy or not? (style question)

On Sun, 24 Aug 2008 20:28:05 +0200, Pascal J. Bourguignon wrote:

> For a lot of functions there's a pair F / NF, F doesn't modify its
> arguments, while NF must avoid consing, and therefore is allowed to
> modify its arguments. ((substitute nsubstitute) (reverse nreverse) ...)
>
> However, in a programming language without a garbage collector it's
> rather hard to manage structure sharing and functions returning new
> structures, so it is customary in these programming languages to make a
> lot of copies. But in Lisp we have a garbage collector, so we can on
> the contrary optimize out time and space, by sharing structures.
> Therefore I would rather advise using pure functions, that won't modify
> their arguments, and for which it is useless to copy them.


For me, the issue is not really about optimization, but enforcing some
consistency on shared structures. Imagine that a closure uses shared
structure, which is required to satisfy some condition. One could

1) check the structure at the time of the closure's creation,
2) check the structure each time the closure is called,
3) both,
4) neither

1) catches obvious misunderstandings (eg the function expecting a vector
of weakly increasing numbers, but called with one that does not satisfy
this), and is reasonably fast. But the shared structure can be modified
in the meantime.

2) always catches inconsistencies, but it is slow and mostly
unnecessary. 3) ditto.

4) can be bad. Some inconsistencies are caught as an error (such as
division by 0). But with numerical computations, errors can creep in
unnoticed, and are hard to debug. The worst-case scenario is having an
inconsistency that influences computation but is not noticed at all.

I think that there is another way: make error checking dependent on
something, possibly the safety declaration, for example. If I recall
correctly, Eiffel compilers had a feature like this which you could turn
off after the code has been tested. How could I do this in CL? Suppose

(declaim (optimize (safety 3)))

was used. How can I make code dependent on the safety level?

But even this would be a compilation-time feature, not something I can
toggle at runtime - that's OK I guess.

I wonder if this is the way to go, opinions and suggestions are welcome.

Thanks,

Tamas
Reply With Quote
  #4  
Old 08-24-2008, 03:31 PM
Rainer Joswig
Guest
 
Default Re: copy or not? (style question)

In article <6hdqcnFkdhveU2@mid.individual.net>,
Tamas K Papp <tkpapp@gmail.com> wrote:

> On Sun, 24 Aug 2008 20:28:05 +0200, Pascal J. Bourguignon wrote:
>
> > For a lot of functions there's a pair F / NF, F doesn't modify its
> > arguments, while NF must avoid consing, and therefore is allowed to
> > modify its arguments. ((substitute nsubstitute) (reverse nreverse) ...)
> >
> > However, in a programming language without a garbage collector it's
> > rather hard to manage structure sharing and functions returning new
> > structures, so it is customary in these programming languages to make a
> > lot of copies. But in Lisp we have a garbage collector, so we can on
> > the contrary optimize out time and space, by sharing structures.
> > Therefore I would rather advise using pure functions, that won't modify
> > their arguments, and for which it is useless to copy them.

>
> For me, the issue is not really about optimization, but enforcing some
> consistency on shared structures. Imagine that a closure uses shared
> structure, which is required to satisfy some condition. One could
>
> 1) check the structure at the time of the closure's creation,
> 2) check the structure each time the closure is called,
> 3) both,
> 4) neither
>
> 1) catches obvious misunderstandings (eg the function expecting a vector
> of weakly increasing numbers, but called with one that does not satisfy
> this), and is reasonably fast. But the shared structure can be modified
> in the meantime.
>
> 2) always catches inconsistencies, but it is slow and mostly
> unnecessary. 3) ditto.
>
> 4) can be bad. Some inconsistencies are caught as an error (such as
> division by 0). But with numerical computations, errors can creep in
> unnoticed, and are hard to debug. The worst-case scenario is having an
> inconsistency that influences computation but is not noticed at all.
>
> I think that there is another way: make error checking dependent on
> something, possibly the safety declaration, for example. If I recall
> correctly, Eiffel compilers had a feature like this which you could turn
> off after the code has been tested. How could I do this in CL? Suppose
>
> (declaim (optimize (safety 3)))
>
> was used. How can I make code dependent on the safety level?
>
> But even this would be a compilation-time feature, not something I can
> toggle at runtime - that's OK I guess.
>
> I wonder if this is the way to go, opinions and suggestions are welcome.
>
> Thanks,
>
> Tamas



It is relatively simple. If you need to protect
data against modifications then do it.

* let routines hand out copies of the original data

* let routines hand out shallow objects (value objects)
with just the necessary data set

* let routines hand out data that only it knows how to
change. For example in CLOS you could use slot names
that are not global symbols. So slots could
be read with a reader function, but there would be
no writer and no way to set the SLOT-VALUE (without
knowing the private slotname).

* if there is some concurrent access the use locks
(or similar) to protect the data against concurrent
changes

* in some cases you might want to make sure that the
changing operations are atomic

* some routines might want to make copies of argument data first

If your software needs to check the integrity of some data,
the write assertions (see ASSERT and CHECK-TYPE).

If your needs are more demanding, then you might think
about some kind of transaction handling.

--
http://lispm.dyndns.org/
Reply With Quote
  #5  
Old 08-24-2008, 10:22 PM
Scott Burson
Guest
 
Default Re: copy or not? (style question)

On Aug 24, 10:09 am, Tamas K Papp <tkp...@gmail.com> wrote:
> Hi everyone,
>
> I have a question about programming style. Before Lisp, I used R, where
> almost all arguments to a function are deeply copied, eg
>
> > f <- function(a) {

>
> + a[2] <- 9
> + a
> + }> a1 <- 1:3
> > a1

> [1] 1 2 3
> > f(a1)

> [1] 1 9 3
> > a1

>
> [1] 1 2 3
>
> Initially, I tried to imitate this in Lisp, and wrote my programs with
> the following "contract" in mind: a function that uses a non-atomic
> argument has to copy it.


This is why I wrote my functional collections library FSet. See:

http://common-lisp.net/project/fset/

FSet gives you the functional semantics you want with the efficiency
you need.

* (defun f (a) (setf (@ a 1) 9) a)
F
* (setq a1 (seq 1 2 3))
#[1 2 3]
* (f a1)
#[1 9 3]
* a1
#[1 2 3]

To be honest, I don't think anyone is using FSet but me. I know, it
could use more and better documentation -- though what's there is
already well beyond what most open-source projects have. But I have
other things to work on, and it's just not clear how many users FSet
could attract even if I wrote reams more about it.

Besides, with no feedback it's hard to know what aspects the docs need
to explain better.

The lack of enthusiasm is certainly understandable -- Lisp already has
several kinds of collection primitives, and of them, lists can even be
used functionally (though it sometimes takes a little effort and isn't
efficient for large collections). It's hard to explain the benefits
of efficient functional collections against that background.

But the benefits of FSet go beyond the functional semantics. Where CL
just gives you the bare data structures and lets you use them as you
wish, FSet collections are semantically loaded -- sets and sequences,
for instance, are different types and support different operations.
This allows code using them to be written more abstractly, and thus
more elegantly.

Well, that's my opinion, anyway. I hope you'll give it a try. I'd
love to hear what you think about it, whether you wind up using it or
not.

-- Scott
Reply With Quote
  #6  
Old 08-24-2008, 11:10 PM
D Herring
Guest
 
Default Re: copy or not? (style question)

Scott Burson wrote:

> But the benefits of FSet go beyond the functional semantics. Where CL
> just gives you the bare data structures and lets you use them as you
> wish, FSet collections are semantically loaded -- sets and sequences,
> for instance, are different types and support different operations.
> This allows code using them to be written more abstractly, and thus
> more elegantly.


Is there a comparison between FSet and cl-containers?

Thanks,
Daniel
Reply With Quote
  #7  
Old 08-24-2008, 11:54 PM
Scott Burson
Guest
 
Default Re: copy or not? (style question)

On Aug 24, 8:10 pm, D Herring <dherr...@at.tentpost.dot.com> wrote:
> Scott Burson wrote:
> > But the benefits of FSet go beyond the functional semantics. Where CL
> > just gives you the bare data structures and lets you use them as you
> > wish, FSet collections are semantically loaded -- sets and sequences,
> > for instance, are different types and support different operations.
> > This allows code using them to be written more abstractly, and thus
> > more elegantly.

>
> Is there a comparison between FSet and cl-containers?


I don't have a detailed comparison, but I can see right off that the
cl-containers classes have imperative semantics. It appears
furthermore that the projects have rather different design goals. CL-
containers gives you ways to make certain kinds of imperative
algorithms more efficient, and (perhaps secondarily) easier to write.
FSet's goal, on the other hand, is to make it easier (and efficient
_enough_) to write programs in a more functional and more abstract
style, which will be more elegant and thus easier to write and
understand.

Also, it appears that the emphasis of cl-containers is to provide a
broad choice of data structures, where FSet is more about providing a
rich interface to each of four fundamental types: sets, bags, seqs
(sequences), and maps.

So don't think of FSet primarily as a way to make your code faster.
Think of it as enabling you to write in a functional style, at least
where collection operations are concerned.

-- Scott
Reply With Quote
  #8  
Old 08-25-2008, 02:35 AM
Pascal J. Bourguignon
Guest
 
Default Re: copy or not? (style question)

Scott Burson <FSet.SLB@gmail.com> writes:
> To be honest, I don't think anyone is using FSet but me. I know, it
> could use more and better documentation -- though what's there is
> already well beyond what most open-source projects have. But I have
> other things to work on, and it's just not clear how many users FSet
> could attract even if I wrote reams more about it.


(and remembering Kenny complaining about the lack of Cell followers).

So it would seem the problem of Lisp is not so much the lack of
libraries than the lack of library users...

--
__Pascal Bourguignon__ http://www.informatimago.com/

"Klingon function calls do not have "parameters" -- they have
"arguments" and they ALWAYS WIN THEM."
Reply With Quote
  #9  
Old 08-25-2008, 03:33 AM
Lennart Staflin
Guest
 
Default Re: copy or not? (style question)

Scott Burson <FSet.SLB@gmail.com> writes:

> To be honest, I don't think anyone is using FSet but me. I know, it
> could use more and better documentation -- though what's there is
> already well beyond what most open-source projects have. But I have
> other things to work on, and it's just not clear how many users FSet
> could attract even if I wrote reams more about it.


Well, I tried to use it. Found a couple of bugs, but no way of
reporting bugs. I gave up on using it.

Common-lisp.net kinda sucks. I got as far as to a trac page that
didn't even mention bugs. Much less how to create a bug report.

--
Lennart Staflin <lenst@lysator.liu.se>
Reply With Quote
  #10  
Old 08-25-2008, 07:43 AM
Tamas K Papp
Guest
 
Default Re: copy or not? (style question)

On Sun, 24 Aug 2008 19:22:09 -0700, Scott Burson wrote:

> But the benefits of FSet go beyond the functional semantics. Where CL
> just gives you the bare data structures and lets you use them as you
> wish, FSet collections are semantically loaded -- sets and sequences,
> for instance, are different types and support different operations. This
> allows code using them to be written more abstractly, and thus more
> elegantly.
>
> Well, that's my opinion, anyway. I hope you'll give it a try. I'd love
> to hear what you think about it, whether you wind up using it or not.


Hi Scott,

I looked at your library, and I find the idea nice, but I won't be using
it. Mostly because I use matrices and vectors, and don't want to deal
with the overhead and possible bugs. Also, when I call foreign
functions, I would have to convert, which would be a large PITA.

Best,

Tamas


Reply With Quote
Reply


Thread Tools
Display Modes


All times are GMT -5. The time now is 01:22 AM.


Powered by vBulletin® Version 3.7.2
Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.
Search Engine Optimization by vBSEO 3.2.0
vB Ad Management by =RedTyger=

In an effort to better serve ads to our visitors, cookies are used on objectmix.com. For more information, check out our Privacy Policy.