# optimization question - lisp


1. ## optimization question

Hi,

I want to learn how to write faster numerical code in Lisp. The
function below is not a bottleneck in my current application, but
since it is simple, I thought I could learn some techniques.

The function looks like this:

(defun golden-section-combination (a b)
  "Return the convex combination (1-G)*a+G*b, where G is the
inverse of the golden ratio."
  (declare (double-float a b))
  (let ((Gright #.(/ (- 3d0 (sqrt 5d0)) 2d0))         ; 1-G, about 0.382
        (Gleft #.(- 1d0 (/ (- 3d0 (sqrt 5d0)) 2d0)))) ; G, about 0.618
    (the double-float (+ (the double-float (* Gleft a))
                         (the double-float (* Gright b))))))

My final goal is to achieve the equivalent of the C function

double golden_section_combination(double a, double b) {
    return (a*0.3819660112501051+b*0.6180339887498949);
}

i.e. with the constants compiled into the function, but disassemble
shows that there are quite a few other things in there (I am using
SBCL):

CL-NUMLIB> (disassemble #'golden-section-combination)
; 0B18A95D: 8B05F0A8180B MOV EAX, [#xB18A8F0] ; 0.6180339887498949d0
; no-arg-parsing entry point
; 63: DDD8 FSTPD FR0
; 65: DD4001 FLDD [EAX+1]
; 68: D8C9 FMULD FR1
; 6A: DDD3 FSTD FR3
; 6C: 8B05F4A8180B MOV EAX, [#xB18A8F4] ; 0.3819660112501051d0
; 72: DDD8 FSTPD FR0
; 74: DD4001 FLDD [EAX+1]
; 77: D8CA FMULD FR2
; 79: 9B WAIT
; 7A: D8C3 FADDD FR3
; 7C: 9B WAIT
; 7D: 64 FS-SEGMENT-PREFIX
; 7E: 800D4800000004 OR BYTE PTR [#x48], 4
; 85: BA10000000 MOV EDX, 16
; 8A: 64 FS-SEGMENT-PREFIX
; 8B: 031520000000 ADD EDX, [#x20]
; 91: 64 FS-SEGMENT-PREFIX
; 92: 3B1524000000 CMP EDX, [#x24]
; 98: 7607 JBE L0
; 9A: E85D8DEDFC CALL #x80636FC ; alloc_overflow_edx
; 9F: EB0A JMP L1
; A1: L0: 64 FS-SEGMENT-PREFIX
; A2: 891520000000 MOV [#x20], EDX
; A8: 83EA10 SUB EDX, 16
; AB: L1: C70216030000 MOV DWORD PTR [EDX], 790
; B1: 8D5207 LEA EDX, [EDX+7]
; B4: DD5201 FSTD [EDX+1]
; B7: 64 FS-SEGMENT-PREFIX
; B8: 80354800000004 XOR BYTE PTR [#x48], 4
; BF: 7402 JEQ L2
; C1: CC09 BREAK 9 ; pending interrupt trap
; C3: L2: 8D65F8 LEA ESP, [EBP-8]
; C6: F8 CLC
; C7: 8B6DFC MOV EBP, [EBP-4]
; CA: C20400 RET 4
; CD: 90 NOP
; CE: 90 NOP
; CF: 90 NOP
; D0: CC0A BREAK 10 ; error trap
; D2: 02 BYTE #X02
; D3: 18 BYTE #X18 ; INVALID-ARG-COUNT-ERROR
; D4: 4D BYTE #X4D ; ECX
; D5: CC0A BREAK 10 ; error trap
; D7: 02 BYTE #X02
; D8: 06 BYTE #X06 ; OBJECT-NOT-DOUBLE-FLOAT-ERROR
; D9: 8E BYTE #X8E ; EDX
; DA: CC0A BREAK 10 ; error trap
; DC: 04 BYTE #X04
; DD: 06 BYTE #X06 ; OBJECT-NOT-DOUBLE-FLOAT-ERROR
; DE: FECE01 BYTE #XFE, #XCE, #X01 ; EDI
;
NIL

Thanks,

Tamas

2. ## Re: optimization question

I'm not sure about SBCL, but on my old iBook G4 under Lispworks 5.02
(trial), I think I was able to optimize it:

(declaim (ftype (function (double-float double-float)
                          double-float)
                golden-section-combination)
         (inline golden-section-combination))

(defun golden-section-combination (a b)
  "Return the convex combination (1-G)*a+G*b, where G is the
inverse of the golden ratio."
  (declare (optimize (speed 3) (safety 0)
                     (compilation-speed 0)
                     (space 0) (debug 0)
                     #+lispworks (float 0)))
  (declare (double-float a b))
  (let ((Gright #.(/ (- 3d0 (sqrt 5d0)) 2d0))         ; 1-G, about 0.382
        (Gleft #.(- 1d0 (/ (- 3d0 (sqrt 5d0)) 2d0)))) ; G, about 0.618
    (the double-float (+ (the double-float (* Gleft a))
                         (the double-float (* Gright b))))))

(defun test (a b)
  (declare (optimize (speed 3) (safety 0)
                     (compilation-speed 0)
                     (space 0) (debug 0)
                     #+lispworks (float 0)))
  (declare (inline golden-section-combination))
  (declare (type double-float a b))
  (+ (golden-section-combination 12.34d0 56.78d0)
     (golden-section-combination 98.10d0 23.45d0)
     (golden-section-combination a b)))

The code in test gave me the assembly below. One downside is that it
calls SYSTEM::RAW-FAST-BOX-DOUBLE, but unless you make the function
return its value through one of the parameters, that can't be avoided:

CL-USER 12 > (disassemble 'test)
8: #xBFA10000 stmw fp,#x0(sp)
12: #x603D0000 ori fp,sp,#x0
16: #x83D70006 lwz const,#x6(func)
20: #xC8830005 lfd f4,#x5(res/arg0)
24: #xC8640005 lfd f3,#x5(arg1)
28: #x82DE002D lwz r22,#x2D(const) ; 0.6180339887498949D0
32: #xC8560005 lfd f2,#x5(r22)
36: #xFC820132 fmul f4,f2,f0-Ftmp1
40: #x82DE0031 lwz r22,#x31(const) ; 0.3819660112501051D0
44: #xC8560005 lfd f2,#x5(r22)
48: #xFC6200F2 fmul f3,f2,f0-Ftmp1
56: #x82DE0035 lwz r22,#x35(const) ; 98.90080577654365D0
60: #xC8360005 lfd f1,#x5(r22)
68: #x82FE0039 lwz func,#x39(const) ; SYSTEM::RAW-FAST-BOX-DOUBLE
72: #x38000001 li nargs,#x1
76: #xBBA10000 lmw fp,#x0(sp)
88: #x80F70002 lwz tmp1,#x2(func)
96: #x7CE903A6 mtctr tmp1
100: #x4E800420 bctr

(declaim (ftype (function (double-float double-float)
                          double-float)
                golden-section-combination)
         (inline golden-section-combination))

Also, you have to tell the compiler at the point of use that you want
the function inlined, as in test, for example:

(declare (inline golden-section-combination))

At least those are the rules I've found for LispWorks; I'm not sure about
SBCL.

3. ## Re: optimization question

And if you want to avoid any consing (assuming SBCL conses here as
LispWorks does), then instead of making the function return a value,
have it store the result in an array. It's ugly, but that's the way I'm
planning to do my vector3/4 and matrix33/44 library:

(defun test (a b c)
  (declare (optimize (speed 3) (safety 0)
                     (compilation-speed 0)
                     (space 0) (debug 0)
                     #+lispworks (float 0)))
  (declare (inline golden-section-combination))
  (declare (type double-float a b)
           (type (simple-array double-float (1)) c))
  (setf (aref c 0) (+ (golden-section-combination 12.34d0 56.78d0)
                      (golden-section-combination 98.10d0 23.45d0)
                      (golden-section-combination a b)))
  c)

Here c is an array of one double-float:

CL-USER 21 > (disassemble 'test)
8: #xBFA10000 stmw fp,#x0(sp)
12: #x603D0000 ori fp,sp,#x0
16: #x83D70006 lwz const,#x6(func)
20: #xC8830005 lfd f4,#x5(res/arg0)
24: #xC8640005 lfd f3,#x5(arg1)
28: #x82DE002D lwz r22,#x2D(const) ; 0.6180339887498949D0
32: #xC8560005 lfd f2,#x5(r22)
36: #xFC820132 fmul f4,f2,f0-Ftmp1
40: #x82DE0031 lwz r22,#x31(const) ; 0.3819660112501051D0
44: #xC8560005 lfd f2,#x5(r22)
48: #xFC6200F2 fmul f3,f2,f0-Ftmp1
56: #x82DE0035 lwz r22,#x35(const) ; 98.90080680013434D0
60: #xC8760005 lfd f3,#x5(r22)
68: #xD8850005 stfd f4,#x5(arg2)
72: #x60A30000 ori res/arg0,arg2,#x0
76: #x38000001 li nargs,#x1
80: #xBBA10000 lmw fp,#x0(sp)
92: #x4E800020 blr

No consing, but ugly.

This one is more flexible:

(defun test2 (a b &optional (c (make-array 1 :element-type 'double-float)))
  (declare (optimize (speed 3) (safety 0)
                     (compilation-speed 0)
                     (space 0) (debug 0)
                     #+lispworks (float 0)))
  (declare (inline golden-section-combination))
  (declare (type double-float a b)
           (type (simple-array double-float (1)) c))
  (setf (aref c 0) (+ (golden-section-combination 12.34d0 56.78d0)
                      (golden-section-combination 98.10d0 23.45d0)
                      (golden-section-combination a b)))
  c)

It lets the call site choose either to reuse memory or to cons fresh
memory, but it costs one additional branch at the beginning of the function:

CL-USER 22 > (disassemble 'test2)
8: #xBF610000 stmw r27-p,#x0(sp)
16: #x83D70006 lwz const,#x6(func)
20: #x60160000 ori r22,nargs,#x0
24: #x607B0000 ori r27-p,res/arg0,#x0
28: #x609C0000 ori r28-p,arg1,#x0
32: #x2C160002 cmpwi cr0,r22,#x2
36: #x40810050 ble 116
40: #x82DE0031 lwz r22,#x31(const) ; 0.6180339887498949D0
44: #xC8960005 lfd f4,#x5(r22)
48: #xC87B0005 lfd f3,#x5(r27-p)
52: #xFC8400F2 fmul f4,f4,f0-Ftmp1
56: #x82DE0035 lwz r22,#x35(const) ; 0.3819660112501051D0
60: #xC8760005 lfd f3,#x5(r22)
64: #xC85C0005 lfd f2,#x5(r28-p)
68: #xFC6300B2 fmul f3,f3,f0-Ftmp1
76: #x82DE0039 lwz r22,#x39(const) ; 98.90080680013434D0
80: #xC8760005 lfd f3,#x5(r22)
88: #xD8850005 stfd f4,#x5(arg2)
92: #x60A30000 ori res/arg0,arg2,#x0
96: #x38000001 li nargs,#x1
100: #xBB610000 lmw r27-p,#x0(sp)
112: #x4E800020 blr
116: #x82FE002D lwz func,#x2D(const) ; SYSTEM::ALLOC-I-VECTOR
120: #x38000003 li nargs,#x3
124: #x38600004 li res/arg0,#x4
128: #x38800100 li arg1,#x100
132: #x63050000 ori arg2,nil,#x0
136: #x80F70002 lwz tmp1,#x2(func)
144: #x7CE903A6 mtctr tmp1
148: #x4E800421 bctrl
152: #x82C3FFFD lwz r22,#x-3(res/arg0)
156: #x62D62000 ori r22,r22,#x2000
160: #x92C3FFFD stw r22,#x-3(res/arg0)
164: #x60650000 ori arg2,res/arg0,#x0
168: #x4BFFFF80 b 40

Again, this is LispWorks in 32-bit mode on Mac OS X (PowerPC G4); on
other platforms or word sizes, or in other Lisps such as SBCL, it could
be different when it comes to consing.
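
A minimal sketch of how a caller might use such an optional result buffer. The names `combination-into` and `sum-of-combinations` are hypothetical stand-ins, not code from this thread, and the weights are the two constants printed in the disassembly. The loop allocates one buffer up front and passes it on every call, so any allocation for the result happens once rather than per iteration. Whether a given implementation really avoids consing here has to be checked with its own tools.

```lisp
;; Simplified copy of the function under discussion, with literal weights.
(declaim (inline golden-section-combination))
(defun golden-section-combination (a b)
  (declare (type double-float a b))
  (+ (* 0.6180339887498949d0 a)
     (* 0.3819660112501051d0 b)))

;; Hypothetical stand-in for test2: store the result in OUT, allocating
;; a fresh one-element array only when the caller does not pass one.
(defun combination-into (a b &optional
                               (out (make-array 1 :element-type 'double-float
                                                  :initial-element 0d0)))
  (declare (type double-float a b)
           (type (simple-array double-float (1)) out))
  (setf (aref out 0) (golden-section-combination a b))
  out)

;; Hypothetical caller that reuses a single buffer for every iteration.
(defun sum-of-combinations (n)
  (declare (type fixnum n))
  (let ((buffer (make-array 1 :element-type 'double-float
                              :initial-element 0d0))
        (sum 0d0))
    (declare (type double-float sum))
    (dotimes (i n sum)
      (combination-into (float i 1d0) 1d0 buffer)
      (incf sum (aref buffer 0)))))
```

Calling `(combination-into a b)` without a third argument still works and conses a fresh array, which is the trade-off described above.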

4. ## Re: optimization question

>>>>> "Tamas" == Tamas Papp <tkpapp@gmail.com> writes:

Tamas> Hi,
Tamas> I want to learn how to write faster numerical code in Lisp. The
Tamas> function below is not a bottleneck in my current application, but
Tamas> since it is simple, I thought I could learn some techniques.

Tamas> The function looks like this:

Tamas> (defun golden-section-combination (a b)
Tamas> "Return the convex combination (1-G)*a+G*b, where G is the
Tamas> inverse of the golden ratio."
Tamas> (declare (double-float a b))
Tamas> (let ((Gright #.(/ (- 3d0 (sqrt 5d0)) 2d0)) ; 1-G, about 0.382
Tamas> (Gleft #.(- 1d0 (/ (- 3d0 (sqrt 5d0)) 2d0)))) ; G, about 0.618
Tamas> (the double-float (+ (the double-float (* Gleft a))
Tamas> (the double-float (* Gright b))))))

With sbcl I'm pretty sure the "the double-float" stuff isn't needed
because the compiler can figure out the types of the operations.

Tamas> My final goal is to achieve the equivalent of the C function

Tamas> double golden_section_combination(double a, double b) {
Tamas> return (a*0.3819660112501051+b*0.6180339887498949);
Tamas> }

Tamas> ie with the constants compiled into the function, but disassemble
Tamas> shows that there are quite a few other things in there (I am using
Tamas> SBCL):

CL-NUMLIB> (disassemble #'golden-section-combination)
Tamas> ; 0B18A95D: 8B05F0A8180B MOV EAX, [#xB18A8F0] ; 0.6180339887498949d0
Tamas> ; no-arg-parsing entry point
Tamas> ; 63: DDD8 FSTPD FR0
Tamas> ; 65: DD4001 FLDD [EAX+1]
Tamas> ; 68: D8C9 FMULD FR1
Tamas> ; 6A: DDD3 FSTD FR3
Tamas> ; 6C: 8B05F4A8180B MOV EAX, [#xB18A8F4] ; 0.3819660112501051d0
Tamas> ; 72: DDD8 FSTPD FR0
Tamas> ; 74: DD4001 FLDD [EAX+1]
Tamas> ; 77: D8CA FMULD FR2
Tamas> ; 79: 9B WAIT
Tamas> ; 7A: D8C3 FADDD FR3
Tamas> ; 7C: 9B WAIT
Tamas> ; 7D: 64 FS-SEGMENT-PREFIX
Tamas> ; 7E: 800D4800000004 OR BYTE PTR [#x48], 4
Tamas> ; 85: BA10000000 MOV EDX, 16
Tamas> ; 8A: 64 FS-SEGMENT-PREFIX
Tamas> ; 8B: 031520000000 ADD EDX, [#x20]
Tamas> ; 91: 64 FS-SEGMENT-PREFIX
Tamas> ; 92: 3B1524000000 CMP EDX, [#x24]
Tamas> ; 98: 7607 JBE L0

Inst at 5d and 65 loads the constant 0.618... into the FPU. Inst at 68
multiplies that with the first arg, already in FR1. Inst at 6c and 74
loads the second constant 0.3819... into the FPU. Inst at 77 multiplies
that with the second arg in FR2. Inst at 7a adds the two products
together. The rest of the stuff is for boxing up the result.

Without looking at the C code, I guess the only difference would be
that the C code might multiply directly from memory. Can't remember
if that's possible or not.

So the compiled lisp code is probably quite similar to the C code.

Ray

5. ## Re: optimization question

"Dimiter \"malkia\" Stanev" <malkia@mac.com> writes:

> And if you want to avoid any consing, if SBCL does, as Lispworks, then
> instead of making the function return a value, make it return in an
> array (ugly, but that's the way I'm planning to do my vector3/4 and
> matrix33/44 library):

Of course, such a solution is not thread-safe - it is not re-entrant.
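
The re-entrancy concern applies when one result array is shared between callers. A hedged sketch of one way around it, assuming each caller allocates its own buffers (the names `combination-into` and `bracket-width` are hypothetical, not from this thread). The `dynamic-extent` declaration is standard Common Lisp and only permits stack allocation; whether a compiler honours it is implementation-dependent.

```lisp
;; Hypothetical helper: store the weighted combination in OUT.
(defun combination-into (a b out)
  (declare (type double-float a b)
           (type (simple-array double-float (1)) out))
  (setf (aref out 0) (+ (* 0.6180339887498949d0 a)
                        (* 0.3819660112501051d0 b)))
  out)

;; Hypothetical caller: both buffers are private to this call, so two
;; threads running BRACKET-WIDTH at once never touch the same array.
(defun bracket-width (a b)
  "Distance between the two inner golden-section points of [a,b]."
  (declare (type double-float a b))
  (let ((m1 (make-array 1 :element-type 'double-float :initial-element 0d0))
        (m2 (make-array 1 :element-type 'double-float :initial-element 0d0)))
    ;; The arrays do not escape this function, so the declaration is safe.
    (declare (dynamic-extent m1 m2))
    (combination-into a b m1)
    (combination-into b a m2)
    (abs (- (aref m2 0) (aref m1 0)))))
```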

In Allegro CL, you can do it in a slightly different way (note that
I'm on mac/intel here; I much prefer its floating point to that of the
x87 style unit - note also that I removed all of the extraneous THE
forms - they just clutter up the source):

CL-USER(2): (shell "cat try.cl")
#+allegro
(eval-when (compile)
  (setf (get 'golden-section-combination 'sys::immed-args-call)
        '((double-float double-float) double-float))
  )

(defun golden-section-combination (a b)
  "Return the convex combination (1-G)*a+G*b, where G is the
inverse of the golden ratio."
  (declare (optimize speed (safety 0))
           (double-float a b))
  (let ((Gright #.(/ (- 3d0 (sqrt 5d0)) 2d0))         ; 1-G, about 0.382
        (Gleft #.(- 1d0 (/ (- 3d0 (sqrt 5d0)) 2d0)))) ; G, about 0.618
    (+ (* Gleft a) (* Gright b))))

0
CL-USER(3): :cf try
;;; Compiling file try.cl
;;; Writing fasl file try.fasl
;;; Fasl write complete
CL-USER(4): :ld try
CL-USER(5): (disassemble 'golden-section-combination)
;; disassembly of #<Function GOLDEN-SECTION-COMBINATION>
;; formals: A EXCL:F_PLACE-HOLDER B EXCL:F_PLACE-HOLDER
;; constant vector:
0: 0.3819660112501051d0
1: 0.6180339887498949d0

;; code start: #x10a51fb4:
0: ff a7 47 03 00 jmp *[edi+839] ; SYS::IMMED-ARG-HOOK
00
6: 8b 5e 12 movl ebx,[esi+18] ; 0.3819660112501051d0
9: f2 0f 10 6b f6 movsd xmm5,[ebx-10]
14: 8b 5e 16 movl ebx,[esi+22] ; 0.6180339887498949d0
17: f2 0f 10 63 f6 movsd xmm4,[ebx-10]
22: f2 0f 10 5c 24 movsd xmm3,[esp+4]
04
28: f2 0f 59 e3 mulsd xmm4,xmm3
32: f2 0f 10 5c 24 movsd xmm3,[esp+12]
0c
38: f2 0f 59 eb mulsd xmm5,xmm3
42: f2 0f 58 ec addsd xmm5,xmm4
46: f2 0f 10 c5 movsd xmm0,xmm5
50: 8b 75 fc movl esi,[ebp-4]
53: c3 ret
CL-USER(6):

If you call this function in straight lisp code, the immed-arg-hook
unboxes the arguments, calls this function back again, and boxes up
the result. But if you compile a call to this function with the
immed-args-call property still in effect, then that call sets up the
arguments directly in immediate form, calls the function in this
immediate way, and the return is used (still unboxed, and with no
consing at this level). It is up to the caller, now, as to what it
wants to do with the unboxed value it just got back.

We don't officially document this approach because it is not a very
Lispy approach - a lispy approach would allow redefinitions without
consequences. But since it is extremely useful we provide it and
always give out unofficial documentation to customers who ask for it.

--
Duane Rettig duane@franz.com Franz Inc. http://www.franz.com/
555 12th St., Suite 1450 http://www.555citycenter.com/
Oakland, Ca. 94607 Phone: (510) 452-2000; Fax: (510) 452-0182

6. ## SBCL float to pointer conversion, cost 13 (was Re: optimization question)

Raymond Toy <raymond.toy@ericsson.com> writes:

> With sbcl I'm pretty sure the "the double-float" stuff isn't needed
> because the compiler can figure out the types of the operations.

Thanks, I have removed it.

> Inst at 5d and 65 loads the constant 0.618... into the FPU. Inst at 68
> multiplies that with the first arg, already in FR1. Inst at 6c and 74
> loads the second
> constant 0.3819... into the FPU. Inst at 77 multiplies that with the
> second arg in FR2. Inst at 7a adds the two products together. The
> rest of the stuff is for boxing up the result.
>
> Without looking at the C code, I guess the only difference would be
> that the C code might multiply directly from memory. Can't remember
> if that's possible or not.
>
> So the compiled lisp code is probably quite similar to the C code

OK, I get it.

I have simplified the function and I think I am almost done, but SBCL
gives me notes about float to pointer conversion with a cost of 13.

The code is here (thanks to Dimiter and Duane too for suggestions):

(declaim (ftype (function (double-float double-float)
                          double-float)
                golden-section-combination)
         (inline golden-section-combination))

(defun golden-section-combination (a b)
  "Return the convex combination (1-G)*a+G*b, where G is the
inverse of the golden ratio."
  (declare (double-float a b))
  (declare (optimize (speed 3) (safety 0)
                     (compilation-speed 0)
                     (space 0) (debug 0)))
  (+ (* #.(- 1d0 (/ (- 3d0 (sqrt 5d0)) 2d0)) a)
     (* #.(/ (- 3d0 (sqrt 5d0)) 2d0) b)))

(defun golden-section-minimize (f a b tol)
  "Find a local minimum of f in the [a,b] interval. The
algorithm terminates when the minimum is bracketed in an interval
smaller than tol. Since the algorithm is slow, tol should not be
chosen smaller than necessary. The algorithm will also find the
local minimum at the endpoints, and if f is unimodal, it will
find the global minimum."
  (declare (double-float a b tol)
           (type (function (double-float) double-float) f)
           (optimize (speed 3) (safety 0)
                     (compilation-speed 0)
                     (space 0) (debug 0)))
  ;; reorder a and b if necessary
  (when (> a b)
    (rotatef a b))
  ;; start iteration with golden ratio inner points
  (let* ((m1 (golden-section-combination a b))
         (m2 (golden-section-combination b a))
         (f1 (funcall f m1))
         (f2 (funcall f m2)))
    (do ()
        ((< (abs (- b a)) tol)
         (if (< f1 f2)
             (values m1 f1)
             (values m2 f2)))
      ;;;; uncomment below for debugging
      ;; (format t "bracket is a=~a~%m1=f(~a)=~a~%m2=f(~a)=~a~%b=~a~%"
      ;; a m1 f1 m2 f2 b)
      (if (< f1 f2)
          (progn
            ;; new bracket is (a,m1,m2)
            (shiftf b m2 m1 (golden-section-combination m1 a))
            (shiftf f2 f1 (funcall f m1)))
          (progn
            ;; new bracket is (m1,m2,b)
            (shiftf a m1 m2 (golden-section-combination m2 b))
            (shiftf f1 f2 (funcall f m2)))))))

And the warnings I get are:

; in: LAMBDA NIL
; SB-INT:NAMED-LAMBDA
; ==>
; #'(SB-INT:NAMED-LAMBDA CL-NUMLIB::GOLDEN-SECTION-COMBINATION
; (CL-NUMLIB::A CL-NUMLIB::B)
; (DECLARE (DOUBLE-FLOAT CL-NUMLIB::A CL-NUMLIB::B))
; (DECLARE
; (OPTIMIZE (SPEED 3) (SAFETY 0) (COMPILATION-SPEED 0)
; (SPACE 0) (DEBUG 0)))
; (BLOCK CL-NUMLIB::GOLDEN-SECTION-COMBINATION
; (+ (* 0.6180339887498949d0 CL-NUMLIB::A)
; (* 0.3819660112501051d0 CL-NUMLIB::B))))
;
; note: doing float to pointer coercion (cost 13) to "<return value>"
;
; compilation unit finished
; printed 1 note
STYLE-WARNING: redefining GOLDEN-SECTION-COMBINATION in DEFUN
; in: LAMBDA NIL
; LET*
;
; note: doing float to pointer coercion (cost 13) to M1
;
; note: doing float to pointer coercion (cost 13) to M2

; SHIFTF
; --> LET MULTIPLE-VALUE-BIND LET MULTIPLE-VALUE-BIND LET
; --> MULTIPLE-VALUE-BIND LET MULTIPLE-VALUE-BIND LET
; ==>
; (SETQ CL-NUMLIB::M2 #:G13)
;
; note: doing float to pointer coercion (cost 13) to M2

; ==>
; (SETQ CL-NUMLIB::M1 #:G6)
;
; note: doing float to pointer coercion (cost 13) to M1
;
; compilation unit finished
; printed 4 notes

Thanks,

Tamas

7. ## Re: optimization question

You probably want to add (declare (optimize speed)) to the function,
or put (declaim (optimize speed)) at the top of the file.
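
The two forms mentioned above can be sketched as follows; the function name `scale` is hypothetical. A `declare` affects only the function it appears in, while a top-level `declaim` sets the policy for code compiled after it (how long it stays in effect once the file has been compiled is implementation-dependent).

```lisp
;; File-wide: applies to the definitions compiled after this form.
(declaim (optimize speed))

;; Per-function: applies only inside SCALE (a hypothetical example).
(defun scale (x)
  (declare (optimize speed)
           (type double-float x))
  (* 0.6180339887498949d0 x))
```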

8. ## Re: SBCL float to pointer conversion, cost 13 (was Re: optimization question)

Try declaring m1, m2, f1 and f2 as double-floats too. I'm not sure
whether that would help with SBCL, but I think it would with other
compilers.

> (let* ((m1 (golden-section-combination a b))
> (m2 (golden-section-combination b a))
> (f1 (funcall f m1))
> (f2 (funcall f m2)))

(declare (type double-float m1 m2 f1 f2))
> (do ()
> ((< (abs (- b a)) tol)
> (if (< f1 f2)
> (values m1 f1)
> (values m2 f2)))
> ;;;; uncomment below for debugging
> ;; (format t "bracket is a=~a~%m1=f(~a)=~a~%m2=f(~a)=~a~%b=~a~%"
> ;; a m1 f1 m2 f2 b)

9. ## Re: SBCL float to pointer conversion, cost 13 (was Re: optimization question)

"Dimiter \"malkia\" Stanev" <malkia@mac.com> writes:

> Try with making m1, m2, f1 & f2 also double-floats. Although I'm not
> sure whether that would help with SBCL, but it would I think with
> other compilers.
>
>> (let* ((m1 (golden-section-combination a b))
>> (m2 (golden-section-combination b a))
>> (f1 (funcall f m1))
>> (f2 (funcall f m2)))

> (declare (type double-float m1 m2 f1 f2))
>> (do ()
>> ((< (abs (- b a)) tol)
>> (if (< f1 f2)
>> (values m1 f1)
>> (values m2 f2)))
>> ;;;; uncomment below for debugging
>> ;; (format t "bracket is a=~a~%m1=f(~a)=~a~%m2=f(~a)=~a~%b=~a~%"
>> ;; a m1 f1 m2 f2 b)

I tried, but it doesn't help. Actually, the same warning crops up
when I compile golden-section-combination, so it must be something
there.

Tamas

10. ## Re: SBCL float to pointer conversion, cost 13 (was Re: optimization question)

Tamas Papp wrote:

> I tried, but it doesn't help. Actually, the same warning crops up
> when I compile golden-section-combination, so it must be something
> there.
>

For golden-section-combination, if you make the function local to all
its callers, then SBCL/CMUCL will probably know enough not to box the
float return value, at least IIRC. Of course, that isn't suitable for
functions that are part of your library's external-facing interface.

i.e. you could do
(labels
    ((golden-section-combination (a b)
       ...))

  (defun golden-section-minimize (f a b tol)
    ...<calls to golden-section-combination>... )
  ....)
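
Filled in, the suggestion above might look like the sketch below. It reuses the definitions posted earlier in the thread, but shortens the docstring, writes the two weights as literals, and drops `(safety 0)`. Whether a particular SBCL or CMUCL version then avoids boxing the local function's return value is an assumption to verify against the compiler notes.

```lisp
(labels ((golden-section-combination (a b)
           (declare (type double-float a b))
           (+ (* 0.6180339887498949d0 a)
              (* 0.3819660112501051d0 b))))
  (defun golden-section-minimize (f a b tol)
    "Find a local minimum of F in [A,B], bracketed to within TOL."
    (declare (type double-float a b tol)
             (type (function (double-float) double-float) f)
             (optimize (speed 3)))
    ;; reorder a and b if necessary
    (when (> a b)
      (rotatef a b))
    (let* ((m1 (golden-section-combination a b))
           (m2 (golden-section-combination b a))
           (f1 (funcall f m1))
           (f2 (funcall f m2)))
      (declare (type double-float m1 m2 f1 f2))
      (do ()
          ((< (abs (- b a)) tol)
           (if (< f1 f2)
               (values m1 f1)
               (values m2 f2)))
        (if (< f1 f2)
            ;; minimum lies in [a, m2]: shrink from the right
            (progn
              (shiftf b m2 m1 (golden-section-combination m1 a))
              (shiftf f2 f1 (funcall f m1)))
            ;; minimum lies in [m1, b]: shrink from the left
            (progn
              (shiftf a m1 m2 (golden-section-combination m2 b))
              (shiftf f1 f2 (funcall f m2))))))))
```

A call such as `(golden-section-minimize (lambda (x) (expt (- x 2d0) 2)) 0d0 5d0 1d-8)` should return a point close to 2 together with the function value there.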