side-effects, I/O, and the lot

This is a discussion on side-effects, I/O, and the lot within the Functional forums in Programming Languages category; Hi, all. I'm a newcomer to purely functional programming, in the process of wrapping my brain around the central concepts. So, here goes: Consider a hypothetical purely functional language, with first-class functions. For convenience, let's say a function can return multiple values. Now suppose I have a function random() [pardon the Algolish syntax] that returns two values: a pseudorandom number and a new function that, upon invocation, returns the next pseudorandom number and again a new function, etc. This would appear to me to preserve referential transparency; a call to each function would always return the same value. Or, consider ...

Go Back   Application Development Forum > Programming Languages > Functional

Object Mix

Register FAQ Calendar Search Today's Posts Mark Forums Read
  #1  
Old 08-21-2008, 08:20 AM
Nyang A. Phra
Guest
 
Default side-effects, I/O, and the lot

Hi, all.

I'm a newcomer to purely functional programming, in the process of
wrapping my brain around the central concepts. So, here goes:

Consider a hypothetical purely functional language, with first-class
functions. For convenience, let's say a function can return multiple
values.

Now suppose I have a function random() [pardon the Algolish syntax]
that returns two values: a pseudorandom number and a new function
that, upon invocation, returns the next pseudorandom number and again
a new function, etc.

This would appear to me to preserve referential transparency; a call
to each function would always return the same value.

Or, consider rudimentary I/O in the same vein. A function
open(filename) that would return, say, two anonymous functions, for
reading and writing, such that each write would in turn return a read
function to read the altered version of the file.

You get the idea: a function that alters a state always returns a new
anonymous function to read that altered state.

I guess my question is, not fully grokking what referential
transparency is precisely good for, would this be a sufficient
approach (efficiency concerns aside) to I/O and side-effects in a
purely functional language? As I see it, this approach would preserve
referential transparency within each execution of a program, but not
between different executions at different times. As an example,
consider a time() function, returning the current time and a new
function to read the time again. Clearly, its semantics would be
different on each execution of a program containing calls to it. But
would it matter? I'm guessing yes it would, but I'm unable to figure
out why, or what I'm possibly confused about. Please educate.


Reply With Quote
  #2  
Old 08-21-2008, 10:40 AM
Dan Doel
Guest
 
Default Re: side-effects, I/O, and the lot

Nyang A. Phra wrote:

> This would appear to me to preserve referential transparency; a call
> to each function would always return the same value.
>
> Or, consider rudimentary I/O in the same vein. A function
> open(filename) that would return, say, two anonymous functions, for
> reading and writing, such that each write would in turn return a read
> function to read the altered version of the file.
>
> You get the idea: a function that alters a state always returns a new
> anonymous function to read that altered state.
>
> I guess my question is, not fully grokking what referential
> transparency is precisely good for, would this be a sufficient
> approach (efficiency concerns aside) to I/O and side-effects in a
> purely functional language? As I see it, this approach would preserve
> referential transparency within each execution of a program, but not
> between different executions at different times. As an example,
> consider a time() function, returning the current time and a new
> function to read the time again. Clearly, its semantics would be
> different on each execution of a program containing calls to it. But
> would it matter? I'm guessing yes it would, but I'm unable to figure
> out why, or what I'm possibly confused about. Please educate.


I'm not certain about what your random function would do in all cases, but I
don't think you can claim the others work. Consider, for instance...

(t, f) = time()
... wait a while ...
(t, f) = time()

In other words, what happens when you call it twice? Does it return the same
t and f as the first call? So time() returns the time of the first call to
time in the program, along with a function that returns the time at which
it's first called in the program and so on? If not, I'm not sure you can
credibly make the claim that it's referentially transparent (and even then,
it's pretty odd).

It's even weirder to play that game with I/O...

(read1, write1) = open("foo")
...
(read2, write2) = open("foo")

If I use write1 what happens if I call any of the others? And if I write
with write1 in the "...", is read2 then a different function from read1, in
that it produces different results? If so, then open is not referentially
transparent.

I think the closest you got was with the random function, but only if:

(r1,next1) = rand()
...
(r2,next2) = rand()

means that r1 = r2, and next1 = next2; in other words, beginning from rand()
always produces the same sequence of numbers. But then it's not very useful
for generating random numbers locally: you'd have to pass along the current
random generating function. And then, why even specify that it's a function?
Rather, you might as well pass along some abstract data type for the state
of the random generator, which is how the Haskell standard libraries do it:

random :: RandomGen -> (Int, RandomGen)

(That's not the real type, but it's a simplification). You can also play this
game with I/O and time, too:

open :: RealWorld -> (FileDescriptor, RealWorld)
read :: FileDescriptor -> RealWorld -> (Char, RealWorld)
write :: FileDescriptor -> Char -> RealWorld -> RealWorld
time :: RealWorld -> (Time, RealWorld)

where you pass along an abstract "state of the real world" through your
programs. Clean does this explicitly, and makes sure that you don't
duplicate the real world (because it doesn't really record everything so
that reading from an old RealWorld after writing will behave correctly) with
uniqueness typing. However, you can also ensure such linearity with monads,
and this is one way of seeing the IO monad in Haskell. A value of type IO a
can be thought of as 'RealWorld -> (a, RealWorld)', so the above become:

open :: IO FileDescriptor
read :: FileDescriptor -> IO Char
write :: FileDescriptor -> Char -> IO ()
time :: IO Time

In a way, you could argue that your original formulation of these functions
is the same, provided you made the semantics like I said. time() would
always return the time at which it's first called, and always return the same
function, which returns the time at which that function is first called, and
a function that .... That makes multiple calls to time() in a program
essentially useless, and forces the programmer to thread the new time
functions through their program, which is the same as the Clean/Haskell
situation, with a different enforcement mechanism. I/O would be hairier, as
the runtime system would have to record the contents of all opened files at
all times, so that multiple calls to open or the subsequent functions on the
same file would always yield the same results. That's probably not feasible
in practice.

Finally, the fact that time() and such have their value fixed only at
runtime, and differ between runs is perhaps a little messy. I don't know if
that violates some formal definition of referential transparency or not (I
think I've heard it argued both ways), but conceptually, an IO monad (or
other such abstraction), or passing around a RealWorld value (where your
program is provided with a new one one each run) is probably nicer. But,
you're free to disagree.

Anyhow, I hope this has helped a bit. Cheers.

-- Dan
Reply With Quote
  #3  
Old 08-21-2008, 11:40 AM
Nyang A. Phra
Guest
 
Default Re: side-effects, I/O, and the lot

On Aug 21, 5:40*pm, Dan Doel <dan.d...@gmail.com> wrote:
> Nyang A. Phra wrote:
> > This would appear to me to preserve referential transparency; a call
> > to each function would always return the same value.

>
> > Or, consider rudimentary I/O in the same vein. A function
> > open(filename) that would return, say, two anonymous functions, for
> > reading and writing, such that each write would in turn return a read
> > function to read the altered version of the file.

>
> > You get the idea: a function that alters a state always returns a new
> > anonymous function to read that altered state.

>
> > I guess my question is, not fully grokking what referential
> > transparency is precisely good for, would this be a sufficient
> > approach (efficiency concerns aside) to I/O and side-effects in a
> > purely functional language? As I see it, this approach would preserve
> > referential transparency within each execution of a program, but not
> > between different executions at different times. As an example,
> > consider a time() function, returning the current time and a new
> > function to read the time again. Clearly, its semantics would be
> > different on each execution of a program containing calls to it. But
> > would it matter? I'm guessing yes it would, but I'm unable to figure
> > out why, or what I'm possibly confused about. Please educate.

>
> I'm not certain about what your random function would do in all cases, but I
> don't think you can claim the others work. Consider, for instance...
>
> * * (t, f) = time()
> * * ... wait a while ...
> * * (t, f) = time()
>
> In other words, what happens when you call it twice? Does it return the same
> t and f as the first call? So time() returns the time of the first call to
> time in the program, along with a function that returns the time at which
> it's first called in the program and so on? If not, I'm not sure you can
> credibly make the claim that it's referentially transparent (and even then,
> it's pretty odd).


Yes, I mean exactly that kind of "odd" (granted behaviour, i.e.:

(t1, f1) = time()
(t2, f2) = time()
(t3, f3) = f1()

where t1 = t2 and f1 = f2, whereas t3 and f3 would be something else.

> It's even weirder to play that game with I/O...
>
> * * (read1, write1) = open("foo")
> * * ...
> * * (read2, write2) = open("foo")
>
> If I use write1 what happens if I call any of the others?


Yeah, would seem to get pretty hairy if several reads and writes would
be callable at the same time. There'd have to be several copies of the
accessed file, I think, depending on which writeN() one starts...

> And if I write
> with write1 in the "...", is read2 then a different function from read1, in
> that it produces different results? If so, then open is not referentially
> transparent.


Yes, that's true, referential transparency wouldn't hold with multiple
open()s with write()s in between. Unless, of course, close() returns a
new open(), or some other hack to that effect...

> I think the closest you got was with the random function, but only if:
>
> * * (r1,next1) = rand()
> * * ...
> * * (r2,next2) = rand()
>
> means that r1 = r2, and next1 = next2; in other words, beginning fromrand()
> always produces the same sequence of numbers. But then it's not very useful
> for generating random numbers locally: you'd have to pass along the current
> random generating function.


Yes, that's how I intended it.

> And then, why even specify that it's a function?
> Rather, you might as well pass along some abstract data type for the state
> of the random generator, which is how the Haskell standard libraries do it:
>
> * * random :: RandomGen -> (Int, RandomGen)
>
> (That's not the real type, but it's a simplification). You can also play this
> game with I/O and time, too:
>
> * * open :: RealWorld -> (FileDescriptor, RealWorld)
> * * read :: FileDescriptor -> RealWorld -> (Char, RealWorld)
> * * write :: FileDescriptor -> Char -> RealWorld -> RealWorld
> * * time :: RealWorld -> (Time, RealWorld)
>
> where you pass along an abstract "state of the real world" through your
> programs. Clean does this explicitly, and makes sure that you don't
> duplicate the real world (because it doesn't really record everything so
> that reading from an old RealWorld after writing will behave correctly) with
> uniqueness typing. However, you can also ensure such linearity with monads,
> and this is one way of seeing the IO monad in Haskell. A value of type IOa
> can be thought of as 'RealWorld -> (a, RealWorld)', so the above become:
>
> * * open :: IO FileDescriptor
> * * read :: FileDescriptor -> IO Char
> * * write :: FileDescriptor -> Char -> IO ()
> * * time :: IO Time
>
> In a way, you could argue that your original formulation of these functions
> is the same, provided you made the semantics like I said. time() would
> always return the time at which it's first called, and always return the same
> function, which returns the time at which that function is first called, and
> a function that .... That makes multiple calls to time() in a program
> essentially useless, and forces the programmer to thread the new time
> functions through their program, which is the same as the Clean/Haskell
> situation, with a different enforcement mechanism. I/O would be hairier, as
> the runtime system would have to record the contents of all opened files at
> all times, so that multiple calls to open or the subsequent functions on the
> same file would always yield the same results. That's probably not feasible
> in practice.


OK, yes I see your point.

> Finally, the fact that time() and such have their value fixed only at
> runtime, and differ between runs is perhaps a little messy. I don't know if
> that violates some formal definition of referential transparency or not (I
> think I've heard it argued both ways), but conceptually, an IO monad (or
> other such abstraction), or passing around a RealWorld value (where your
> program is provided with a new one one each run) is probably nicer. But,
> you're free to disagree.


I'm sure I'm free to disagree, but quite unqualified to do so

> Anyhow, I hope this has helped a bit. Cheers.
>
> -- Dan


Thanks, it did.
Reply With Quote
  #4  
Old 08-22-2008, 02:13 AM
Dirk Thierbach
Guest
 
Default Re: side-effects, I/O, and the lot

Nyang A. Phra <naphra@gmail.com> wrote:
> Now suppose I have a function random() [pardon the Algolish syntax]
> that returns two values: a pseudorandom number and a new function
> that, upon invocation, returns the next pseudorandom number and again
> a new function, etc.
>
> This would appear to me to preserve referential transparency; a call
> to each function would always return the same value.


Yes, and one can actually write such a pure function. However, it
probably wouldn't really do what you want: It would return the same
sequence of numbers for every run of the program. For a "random" number,
you'd expect it to be different for each run of the program.

> Or, consider rudimentary I/O in the same vein. A function
> open(filename) that would return, say, two anonymous functions, for
> reading and writing, such that each write would in turn return a read
> function to read the altered version of the file.
>
> You get the idea: a function that alters a state always returns a new
> anonymous function to read that altered state.


Same problem here: What the read function actually returns depends on
some entity external to the program, namely the file. The contents
of the file may be different for different runs of the program. So,
no matter if you use an opaque handle or a bunch of accessor function
as return value of "open", it's not referentially transparent.

> I guess my question is, not fully grokking what referential
> transparency is precisely good for,


The result of referentially transparent function depends only on it's
arguments. That means whenever you call a referentially transparent
function with known arguments, you could replace the whole call with
the result and have the same program. So the compiler can move around,
pre-evaluate, inline or otherwise optimize referentially transparent
functions completely freely. One couldn't do that with a print
statement, say: If you change the order of the print statements, or
remove one completely because it's return value is never used, then
you've changed the program.

Similarly, a theorem prover could prove assertions about a
referentially transperent program, or you could transform
referentially transparent program to a different, also referentially
transparent program which is more efficient (programs that do that
exist).

All this stuff doesn't work (or not as easily) if the code is not
referentially transparent, because then you must take care that you
don't mix up the side effects.

> would this be a sufficient approach (efficiency concerns aside) to
> I/O and side-effects in a purely functional language?


No. :-)

> As I see it, this approach would preserve referential transparency
> within each execution of a program, but not between different
> executions at different times.


Yes, but the point of referential transparency is exactly that the
program remains the same on *all* executions of the program. It only
depends on the arguments; the result will only change if the arguments
change. It shouldn't change if you run it again with the same arguments.

HTH,

- Dirk
Reply With Quote
  #5  
Old 08-22-2008, 05:08 AM
Nyang A. Phra
Guest
 
Default Re: side-effects, I/O, and the lot

On 22 elo, 09:13, Dirk Thierbach <dthierb...@usenet.arcornews.de>
wrote:
> Nyang A. Phra <nap...@gmail.com> wrote:
>
> > Now suppose I have a function random() [pardon the Algolish syntax]
> > that returns two values: a pseudorandom number and a new function
> > that, upon invocation, returns the next pseudorandom number and again
> > a new function, etc.

>
> > This would appear to me to preserve referential transparency; a call
> > to each function would always return the same value.

>
> Yes, and one can actually write such a pure function. However, it
> probably wouldn't really do what you want: It would return the same
> sequence of numbers for every run of the program. For a "random" number,
> you'd expect it to be different for each run of the program.


True, so I'd probably need a seed(Int) function to start the chain
with.

<snip>

> The result of referentially transparent function depends only on it's
> arguments. That means whenever you call a referentially transparent
> function with known arguments, you could replace the whole call with
> the result and have the same program. So the compiler can move around,
> pre-evaluate, inline or otherwise optimize referentially transparent
> functions completely freely. One couldn't do that with a print
> statement, say: If you change the order of the print statements, or
> remove one completely because it's return value is never used, then
> you've changed the program.
>
> Similarly, a theorem prover could prove assertions about a
> referentially transperent program, or you could transform
> referentially transparent program to a different, also referentially
> transparent program which is more efficient (programs that do that
> exist).
>
> All this stuff doesn't work (or not as easily) if the code is not
> referentially transparent, because then you must take care that you
> don't mix up the side effects.


Yeah, I kinda thought there might be compile-time advantages in having
truly referentially transparent semantics. Thanks for confirming that
intuition.

> > As I see it, this approach would preserve referential transparency
> > within each execution of a program, but not between different
> > executions at different times.

>
> Yes, but the point of referential transparency is exactly that the
> program remains the same on *all* executions of the program. It only
> depends on the arguments; the result will only change if the arguments
> change. It shouldn't change if you run it again with the same arguments.


Yes, figures now. So I suppose I could remedy this situation by
passing a World argument to main() each time I run a program, and then
use that World as the mother of all things state-dependent. And with
that, I guess I'm beginning to see the idea of uniqueness typing as
well..

> HTH,
>
> - Dirk


It did, thanks. Much food for thought.
Reply With Quote
  #6  
Old 08-22-2008, 05:53 AM
Torben Ægidius Mogensen
Guest
 
Default Re: side-effects, I/O, and the lot

Dirk Thierbach <dthierbach@usenet.arcornews.de> writes:

> Nyang A. Phra <naphra@gmail.com> wrote:
>> Now suppose I have a function random() [pardon the Algolish syntax]
>> that returns two values: a pseudorandom number and a new function
>> that, upon invocation, returns the next pseudorandom number and again
>> a new function, etc.
>>
>> This would appear to me to preserve referential transparency; a call
>> to each function would always return the same value.

>
> Yes, and one can actually write such a pure function. However, it
> probably wouldn't really do what you want: It would return the same
> sequence of numbers for every run of the program. For a "random" number,
> you'd expect it to be different for each run of the program.
>
>> Or, consider rudimentary I/O in the same vein. A function
>> open(filename) that would return, say, two anonymous functions, for
>> reading and writing, such that each write would in turn return a read
>> function to read the altered version of the file.
>>
>> You get the idea: a function that alters a state always returns a new
>> anonymous function to read that altered state.

>
> Same problem here: What the read function actually returns depends on
> some entity external to the program, namely the file. The contents
> of the file may be different for different runs of the program. So,
> no matter if you use an opaque handle or a bunch of accessor function
> as return value of "open", it's not referentially transparent.


Both of these issued can be "solved" by having the environment (file
system, time, etc.) be an argument to the program when it starts. You
must avoid using this "world" value twice, as you can't be sure it is
unchanging, so you must make it linearly typed (or use a monad to this
effect) so it can only be used once. Each operation on the "world"
would return a new, possibly modified, world as part of its result.

For example, if you ask for the time, you pass the world value to a
function and get back both the current time and a new world value.
The old world value can no longer be used (the linearity ensures
this), so you don't get the problems described above.

Torben
Reply With Quote
  #7  
Old 08-22-2008, 05:56 AM
Dirk Thierbach
Guest
 
Default Re: side-effects, I/O, and the lot

Nyang A. Phra <naphra@gmail.com> wrote:
> Yes, figures now. So I suppose I could remedy this situation by
> passing a World argument to main() each time I run a program, and then
> use that World as the mother of all things state-dependent. And with
> that, I guess I'm beginning to see the idea of uniqueness typing as
> well..


Exactly. You have to make sure the World is not duplicated, and one
particular instance of the World is never used twice to start a
side effect from. Because the real world behaves this way.

The alternative is to hide the World behind some abstract data type
and only expose accessor functions to that data type that make
sure that the World, which is threaded around inside this data type
for you automatically, is treated in that way. Then you have the IO monad.

- Dirk
Reply With Quote
  #8  
Old 08-22-2008, 08:31 AM
Nyang A. Phra
Guest
 
Default Re: side-effects, I/O, and the lot

On Aug 22, 12:56*pm, Dirk Thierbach <dthierb...@usenet.arcornews.de>
wrote:
> The alternative is to hide the World behind some abstract data type
> and only expose accessor functions to that data type that make
> sure that the World, which is threaded around inside this data type
> for you automatically, is treated in that way. Then you have the IO monad..


So that's what monads are. Cool. Thanks a lot.
Reply With Quote
  #9  
Old 08-22-2008, 09:14 AM
Dirk Thierbach
Guest
 
Default Re: side-effects, I/O, and the lot

Nyang A. Phra <naphra@gmail.com> wrote:
> On Aug 22, 12:56*pm, Dirk Thierbach <dthierb...@usenet.arcornews.de>
> wrote:


>> The alternative is to hide the World behind some abstract data type
>> and only expose accessor functions to that data type that make
>> sure that the World, which is threaded around inside this data type
>> for you automatically, is treated in that way. Then you have the IO monad.


> So that's what monads are.


At least with respect to IO (where in an actual implementation of course you
don't really pass the external world around as an variable, that's only
a convenient way to think about it). And since monads are an abstract data
type, you can use them for other stuff, too.

> Cool. Thanks a lot.


You're welcome.

- Dirk

Reply With Quote
  #10  
Old 08-22-2008, 10:26 AM
Ertugrul Söylemez
Guest
 
Default Re: side-effects, I/O, and the lot

"Nyang A. Phra" <naphra@gmail.com> wrote:

> On Aug 22, 12:56Â*pm, Dirk Thierbach <dthierb...@usenet.arcornews.de>
> wrote:
>
> > The alternative is to hide the World behind some abstract data type
> > and only expose accessor functions to that data type that make sure
> > that the World, which is threaded around inside this data type for
> > you automatically, is treated in that way. Then you have the IO
> > monad.

>
> So that's what monads are. Cool. Thanks a lot.


That's what the IO monad is, a very specific monad. This is how Haskell
implements impureness in a pure way. Monads are a very generic abstract
structure from category theory.

To a programmer, one way to view monads is a type of another type, and
the difference between monads is the way one 'binds' values of the inner
type from a computation to another computation. This can be used to
modify the meaning of 'value', or the meaning of 'passing a value'. For
example, the identity monad just takes the value from the source
computation and passes it verbatim to the consuming computation, hence
the idea of 'value' is exactly a 'value', and the idea of 'passing a
value' is exactly 'passing that specific value'.

IO is almost like the identity monad, but it implicitly passes along the
state of the universe, and it sequences computations by ensuring that
the source computation is finished, before the consuming computation
starts. So the IO monad leaves the meaning of 'value', but changes the
meaning of 'passing a value' to 'first executing the source computation,
then passing the value to the consuming computation together with the
state of the universe'. There are opaque computations, which are
capable of changing the state of the universe, like printing things,
reading files or fetching the system time.

The program is just a value of type IO (), where '()' really means
'nothing at all', hence it is a computation without a result, and the
nature of monads makes it impossible to go back in time, i.e. reusing
the state of the universe, because that state is never mentioned
explicitly.

Other examples of monads are:

* the list monad, where the idea of 'value' is modified to mean an
arbitrary number of values, and 'passing a value' becomes 'passing
all values and collecting all results', i.e. non-determinism,

* the 'maybe' monad, a less general idea of 'value', where a value may
not exist, and 'passing a value' becomes 'passing a value, if there
is a value, otherwise not issuing the consuming computation at all',
i.e. computations, which fail, if any sub-computation fails,

* the state monad, where the idea of 'value' is specialized to a
function of state, etc.

Of course, as monads are a very abstract thing, there are multiple ways
to interpret them.

I hope, that was helpful.


Greets,
Ertugrul.


--
nightmare = unsafePerformIO (getWrongWife >>= sex)

Reply With Quote
Reply


Thread Tools
Display Modes


All times are GMT -5. The time now is 11:48 PM.


Powered by vBulletin® Version 3.7.2
Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.
Search Engine Optimization by vBSEO 3.2.0
vB Ad Management by =RedTyger=

In an effort to better serve ads to our visitors, cookies are used on objectmix.com. For more information, check out our Privacy Policy.