looking for a technique to reduce stack usage - PROLOG
This is a discussion on looking for a technique to reduce stack usage - PROLOG ; Hello
Just ordered O'Keefe's book on the craft of Porlog. While I am waiting,
can somebody here suggest a technique to reduce stack usage in the
following program?
With SWI prolog I get the following statistics
------------- stack usage for ...
-
looking for a technique to reduce stack usage
Hello
Just ordered O'Keefe's book on the craft of Porlog. While I am waiting,
can somebody here suggest a technique to reduce stack usage in the
following program?
With SWI prolog I get the following statistics
------------- stack usage for size 14 -----------------
?- move_disks('A','B','C',14),statistics.
Heap : 2,083,664 Bytes
Local stack : 2,048,000 1,679,360 1,671,828 Bytes
Global stack : 4,096,000 102,400 100,192 Bytes
Trail stack : 4,096,000 36,864 33,436 Bytes
3 garbage collections gained 558,532 bytes in 0.02 seconds.
1 threads, 0 finished threads used 0.00 seconds.
----------------------------------------------------------------
For size 15 the local stack is overflown.
Surely this program can be rewritten in a way that doesn't consume this
much memory. I am looking for a technique that can be generalized to
deal with other not so trivial problems. To me the challenge always is
to restate the problem in a way that addresses resource consumption.
--- hanoi towers implementation -----
% towers of hanoi, 3 pegs
% source A, destination B, intermediate C, number N
% vars must be instantiated
move_disks(A, B, _, 1) :- write('move top from '),
write(A), write(' to '), writeln(B).
% move from A to B via C
move_disks(A, B, C, N) :- N>1, N1 is N-1,
move_disks(A,C,B,N1),
move_disks(A,B,_,1),
move_disks(C,B,A,N1).
-------- end -------------------------------------
-
Re: looking for a technique to reduce stack usage
Hi!
mneuhaber22@berlin.com wrote:
>
> For size 15 the local stack is overflown.
> Surely this program can be rewritten in a way that doesn't consume this
> much memory.
Make the predicate tail recursive to avoid using so much local stack space:
move_disks1([]).
move_disks1([m(A,B,C,N)|Rest]) :-
( N =:= 1 ->
format("move top from: ~w to ~w\n", [A,B]),
move_disks1(Rest)
;
N1 is N - 1,
move_disks1([m(A,C,B,N1),m(A,B,_,1),m(C,B,A,N1)|Rest])
).
doit1(N) :-
move_disks1([m(a,b,c,N)]),
statistics.
?- doit1(14).
Heap : 398,240 Bytes
Local stack : 2,048,000 49,152 748 Bytes
Global stack : 4,096,000 65,536 51,160 Bytes
Trail stack : 4,096,000 16,384 324 Bytes
All the best,
Markus.
-
Re: looking for a technique to reduce stack usage
Markus,
thank you for this enlightening rewrite. I tested it and it works
great. Is there any way to measure the max length of the list that
holds the solutions above. In other words, you are using constant
amount of stack but does it come at the cost of a large list allocated
on the heap? Where is the technique you used explained in more detail
(textbook, web link)?
-
Re: looking for a technique to reduce stack usage
Hi!
mneuhaber22@berlin.com wrote:
> thank you for this enlightening rewrite. I tested it and it works
> great. Is there any way to measure the max length of the list that
> holds the solutions above.
There is no such list holding the solutions - I only reified the
call-stack: Instead of implicit allocation of environment frames on the
local stack, I (more or less) explicitly allocate corresponding terms on
the global stack. Garbage collection then corresponds to deallocation of
frames.
> In other words, you are using constant
> amount of [local] stack but does it come at the cost of a large list allocated
> on the heap?
Yeah, the fine-print being that the hand-crafted list, albeit large,
apparently still consumes less memory (on the global stack) than the
environment frames would in its place (on the local stack).
> Where is the technique you used explained in more detail
> (textbook, web link)?
"Modern Compiler Implementation in ML" by Andrew W. Appel discusses it
in the context of garbage collection, where a depth-first search is
implemented using an explicit stack to allocate only N words instead of
N stack frames.
All the best,
Markus.
-
Re: looking for a technique to reduce stack usage
mneuhaber22@berlin.com wrote:
} Hello
} Just ordered O'Keefe's book on the craft of Porlog. While I am waiting,
} can somebody here suggest a technique to reduce stack usage in the
} following program?
} With SWI prolog I get the following statistics
} ------------- stack usage for size 14 -----------------
} ?- move_disks('A','B','C',14),statistics.
} Heap : 2,083,664 Bytes
} Local stack : 2,048,000 1,679,360 1,671,828 Bytes
} Global stack : 4,096,000 102,400 100,192 Bytes
} Trail stack : 4,096,000 36,864 33,436 Bytes
} 3 garbage collections gained 558,532 bytes in 0.02 seconds.
} 1 threads, 0 finished threads used 0.00 seconds.
} ----------------------------------------------------------------
} For size 15 the local stack is overflown.
} Surely this program can be rewritten in a way that doesn't consume this
} much memory. I am looking for a technique that can be generalized to
} deal with other not so trivial problems. To me the challenge always is
} to restate the problem in a way that addresses resource consumption.
.....
Sure, the TOH can be re-written to be purely iterative. It's a good
exercise. 
-
Re: looking for a technique to reduce stack usage
mneuhaber22@berlin.com wrote:
> Hello
>
> Just ordered O'Keefe's book on the craft of Porlog. While I am waiting,
> can somebody here suggest a technique to reduce stack usage in the
> following program?
>
> With SWI prolog I get the following statistics
> ------------- stack usage for size 14 -----------------
> ?- move_disks('A','B','C',14),statistics.
> Heap : 2,083,664 Bytes
> Local stack : 2,048,000 1,679,360 1,671,828 Bytes
> Global stack : 4,096,000 102,400 100,192 Bytes
> Trail stack : 4,096,000 36,864 33,436 Bytes
>
> 3 garbage collections gained 558,532 bytes in 0.02 seconds.
> 1 threads, 0 finished threads used 0.00 seconds.
> ----------------------------------------------------------------
>
> For size 15 the local stack is overflown.
> Surely this program can be rewritten in a way that doesn't consume this
> much memory. I am looking for a technique that can be generalized to
> deal with other not so trivial problems. To me the challenge always is
> to restate the problem in a way that addresses resource consumption.
>
> --- hanoi towers implementation -----
> % towers of hanoi, 3 pegs
>
> % source A, destination B, intermediate C, number N
> % vars must be instantiated
> move_disks(A, B, _, 1) :- write('move top from '),
> write(A), write(' to '), writeln(B).
> % move from A to B via C
> move_disks(A, B, C, N) :- N>1, N1 is N-1,
> move_disks(A,C,B,N1),
> move_disks(A,B,_,1),
> move_disks(C,B,A,N1).
> -------- end -------------------------------------
>
My guess is that because the Prolog inference engine has no way of
"knowing"
that these two rules are mutually exclusive, it has to assume they are not,
which causes it to allocate space for what I believe are called "choice
points" that
are in fact not necessary.
I am not sure exactly what the output from 'statistics/0' means, but
the following, revised version of your code seems reasonably
well-behaved to me
move_disks(A, B, _, 1) :-
!,
write('move top from '),
write(A), write(' to '), writeln(B).
% move from A to B via C
move_disks(A, B, C, N) :-
N>1,
N1 is N-1,
move_disks(A,C,B,N1),
move_disks(A,B,_,1),
!,
move_disks(C,B,A,N1).1.59 seconds cpu time for 305,311 inferences
given the goal
?- move_disks('A','B','C',15),statistics.
2,278 atoms, 1,456 functors, 1,560 predicates, 24 modules, 35,419 VM-codes
Limit Allocated In use
Heap : 503,888 Bytes
Local stack : 2,048,000 16,384 748 Bytes
Global stack : 4,096,000 212,992 197,668 Bytes
Trail stack : 4,096,000 81,920 65,944 Bytes
1 threads, 0 finished threads used 0.00 seconds.
billh
-
Re: looking for a technique to reduce stack usage
student wrote:
> My guess is that because the Prolog inference engine has no way of
> "knowing"
> that these two rules are mutually exclusive, it has to assume they are not,
> which causes it to allocate space for what I believe are called "choice
> points" that
> are in fact not necessary.
>
> I am not sure exactly what the output from 'statistics/0' means, but
> the following, revised version of your code seems reasonably
> well-behaved to me
>
> move_disks(A, B, _, 1) :-
> !,
> write('move top from '),
> write(A), write(' to '), writeln(B).
>
> % move from A to B via C
> move_disks(A, B, C, N) :-
> N>1,
> N1 is N-1,
> move_disks(A,C,B,N1),
> move_disks(A,B,_,1),
> !,
> move_disks(C,B,A,N1).1.59 seconds cpu time for 305,311 inferences
Ah, finally someone who knows where "the clapper hangs in the clock"
(flemmish expression).
The excessive stack space usage is indeed due to the non-recognised
determinism.
The extra cut in the first clause solves that problem. Transforming it
to if-then-else has the same effect and the non-tail-recusion removal
(as proposed in other contributions) has almost no effect: it is easy to
check that (with the cut in the first clause) the recursion depth at any
point does not exceed the N in the original query - so tail recursion
optimisation should be enough - Bill's statistics figures don't really
show that (I think) because it gives statistics AFTER all computation is
finished: what one would need is a high-water mark.
I would however argue strongly against the cut in the second clause: it
is totally unnecessary.
Cheers
Bart Demoen
-
Re: looking for a technique to reduce stack usage
Markus Triska wrote:
> Make the predicate tail recursive to avoid using so much local stack space:
>
> move_disks1([]).
> move_disks1([m(A,B,C,N)|Rest]) :-
> ( N =:= 1 ->
> format("move top from: ~w to ~w\n", [A,B]),
> move_disks1(Rest)
> ;
> N1 is N - 1,
> move_disks1([m(A,C,B,N1),m(A,B,_,1),m(C,B,A,N1)|Rest])
> ).
>
> doit1(N) :-
> move_disks1([m(a,b,c,N)]),
> statistics.
>
>
> ?- doit1(14).
>
> Heap : 398,240 Bytes
> Local stack : 2,048,000 49,152 748 Bytes
> Global stack : 4,096,000 65,536 51,160 Bytes
> Trail stack : 4,096,000 16,384 324 Bytes
You gained local stack by introducing if-then-else: the tail-recursion
doesn't help in this case.
Cheers
Bart Demoen
-
Re: looking for a technique to reduce stack usage
Markus Triska wrote:
>> Where is the technique you used explained in more detail
>> (textbook, web link)?
>
>
> "Modern Compiler Implementation in ML" by Andrew W. Appel discusses it
> in the context of garbage collection, where a depth-first search is
> implemented using an explicit stack to allocate only N words instead of
> N stack frames.
For turning any predicate into a tail recursive one, you can look in
that compiler book (and not find anything about Prolog) or consider the
quite general technique of binarizing the clauses. Here is an example:
Suppose you have a predicate p/2 with a clauses:
p(I,J).
p(X,Y) :- q(X,A), r(A,B), s(B,Y).
then rewrite this as:
p(P,Q) :- p(P,Q,true).
p(I,J,Cont) :- call(Cont).
p(X,Y,Cont) :- q(X,A, r(A,B, s(B,Y, Cont))).
of course, you should do a similar thing to the clauses for r and s.
You have now a program in which every clause has one call in the body,
and so it's certainly a tail call: no environment frames are needed
for its execution. Of course, choicepoints remain the same.
If you wish, you can specialize the call(Cont) goal, because you know
that the only Conts possible are of the form true/0, r/3 and s/3.
If there are builtins called and an "early" cut, you adapt the
transformation a bit - here is an example:
p(X,Y,Z) :- X > 17, !, functor(Y,Name,Arity), q(Name,Z), write(Z), foo(X,Y).
becomes:
p(X,Y,Z,Cont) :-
X > 17, !, functor(Y,Name,Arity),
q(Name,Z, mywrite(Z, foo(X,Y, Cont))).
mywrite(T,Cont) :- write(T), call(Cont).
If there are cuts in the body or perhaps some if-then-elses, then you
need to deal with them too.
Bin-Prolog binarizes clauses before compiling to a WAM like engine:
it doesn't have an environment stack. jProlog's compilation schema also
starts with binarizing.
Binarizing (+ some specialization stuff later on) can make programs
faster !
Look at http://www.binnetcorp.com/BinProlog/doc/advanced.html
for a bit more on binarization and some references.
Here is a different version of the TOH problem and its binarized
equivalent:
toh_orig(N,Moves,_From,_Over,_To,Tail) :-
N =:= 0, !,
Moves = Tail.
toh_orig(N,Moves,From,Over,To,Tail) :-
N > 0,
M is N - 1,
toh_orig(M,Moves,From,To,Over,T1),
T1 = [move(From,To)|T2],
toh_orig(M,T2,Over,From,To,Tail).
toh_bin(N,Moves,From,To,Over,Tail) :-
toh_bin(N,Moves,From,To,Over,Tail,true).
toh_bin(N,Moves,_From,_Over,_To,Tail,Cont) :-
N =:= 0, !,
Moves = Tail,
call(Cont).
toh_bin(N,Moves,From,Over,To,Tail,Cont) :-
N > 0,
M is N - 1,
T1 = [move(From,To)|T2],
toh_bin(M,Moves,From,To,Over,T1,
toh_bin(M,T2,Over,From,To,Tail,Cont)).
Cheers
Bart Demoen
-
Re: looking for a technique to reduce stack usage
Hi!
Bart Demoen wrote:
> the non-tail-recusion removal has almost no effect
How to best measure this? Here is what I thought I'm doing:
I allocate (on the global stack) 3 terms of arity 4 per call, makes
3*(4+1) = 15 memory cells, give or take a few for encompassing list term
etc. I looked at the definition of local frames in the SWI source and I
got the impression that environment frames are considerably larger than
this.
Thank you in advance,
Markus.
Similar Threads
-
By Application Development in forum Compilers
Replies: 0
Last Post: 04-05-2007, 11:04 PM
-
By Application Development in forum Compilers
Replies: 1
Last Post: 03-14-2007, 10:25 PM
-
By Application Development in forum Compilers
Replies: 9
Last Post: 10-15-2006, 07:34 AM
-
By Application Development in forum Compilers
Replies: 0
Last Post: 09-03-2004, 11:42 AM