optimization question - lisp

  1. optimization question

    Hi,

    I want to learn how to write faster numerical code in Lisp. The
    function below is not a bottleneck in my current application, but
    since it is simple, I thought I could learn some techniques.

    The function looks like this:

    (defun golden-section-combination (a b)
      "Return the convex combination (1-G)*a+G*b, where G is the
    inverse of the golden ratio."
      (declare (double-float a b))
      (let ((Gright #.(/ (- 3d0 (sqrt 5d0)) 2d0)) ; equals to G above
            (Gleft #.(- 1d0 (/ (- 3d0 (sqrt 5d0)) 2d0)))) ; 1-G
        (the double-float (+ (the double-float (* Gleft a))
                             (the double-float (* Gright b))))))

    My final goal is to achieve the equivalent of the C function

    double golden_section_combination(double a, double b) {
    return (a*0.3819660112501051+b*0.6180339887498949);
    }

    i.e., with the constants compiled into the function, but disassemble
    shows that there are quite a few other things in there (I am using
    SBCL):

    CL-NUMLIB> (disassemble #'golden-section-combination)
    ; 0B18A95D: 8B05F0A8180B MOV EAX, [#xB18A8F0] ; 0.6180339887498949d0
    ; no-arg-parsing entry point
    ; 63: DDD8 FSTPD FR0
    ; 65: DD4001 FLDD [EAX+1]
    ; 68: D8C9 FMULD FR1
    ; 6A: DDD3 FSTD FR3
    ; 6C: 8B05F4A8180B MOV EAX, [#xB18A8F4] ; 0.3819660112501051d0
    ; 72: DDD8 FSTPD FR0
    ; 74: DD4001 FLDD [EAX+1]
    ; 77: D8CA FMULD FR2
    ; 79: 9B WAIT
    ; 7A: D8C3 FADDD FR3
    ; 7C: 9B WAIT
    ; 7D: 64 FS-SEGMENT-PREFIX
    ; 7E: 800D4800000004 OR BYTE PTR [#x48], 4
    ; 85: BA10000000 MOV EDX, 16
    ; 8A: 64 FS-SEGMENT-PREFIX
    ; 8B: 031520000000 ADD EDX, [#x20]
    ; 91: 64 FS-SEGMENT-PREFIX
    ; 92: 3B1524000000 CMP EDX, [#x24]
    ; 98: 7607 JBE L0
    ; 9A: E85D8DEDFC CALL #x80636FC ; alloc_overflow_edx
    ; 9F: EB0A JMP L1
    ; A1: L0: 64 FS-SEGMENT-PREFIX
    ; A2: 891520000000 MOV [#x20], EDX
    ; A8: 83EA10 SUB EDX, 16
    ; AB: L1: C70216030000 MOV DWORD PTR [EDX], 790
    ; B1: 8D5207 LEA EDX, [EDX+7]
    ; B4: DD5201 FSTD [EDX+1]
    ; B7: 64 FS-SEGMENT-PREFIX
    ; B8: 80354800000004 XOR BYTE PTR [#x48], 4
    ; BF: 7402 JEQ L2
    ; C1: CC09 BREAK 9 ; pending interrupt trap
    ; C3: L2: 8D65F8 LEA ESP, [EBP-8]
    ; C6: F8 CLC
    ; C7: 8B6DFC MOV EBP, [EBP-4]
    ; CA: C20400 RET 4
    ; CD: 90 NOP
    ; CE: 90 NOP
    ; CF: 90 NOP
    ; D0: CC0A BREAK 10 ; error trap
    ; D2: 02 BYTE #X02
    ; D3: 18 BYTE #X18 ; INVALID-ARG-COUNT-ERROR
    ; D4: 4D BYTE #X4D ; ECX
    ; D5: CC0A BREAK 10 ; error trap
    ; D7: 02 BYTE #X02
    ; D8: 06 BYTE #X06 ; OBJECT-NOT-DOUBLE-FLOAT-ERROR
    ; D9: 8E BYTE #X8E ; EDX
    ; DA: CC0A BREAK 10 ; error trap
    ; DC: 04 BYTE #X04
    ; DD: 06 BYTE #X06 ; OBJECT-NOT-DOUBLE-FLOAT-ERROR
    ; DE: FECE01 BYTE #XFE, #XCE, #X01 ; EDI
    ;
    NIL

    Any advice would be appreciated.

    Thanks,

    Tamas
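
One way to see which literals end up compiled into the function is to evaluate the two #. forms on their own. The sketch below is only an illustration; golden-section-weights is a made-up name that does not appear in the thread. Note that in the Lisp code as posted, the first argument is multiplied by Gleft (about 0.618) and the second by Gright (about 0.382), whereas the C version has the two weights the other way round.

```lisp
;; Hypothetical helper, for illustration only: returns the two literals
;; that the reader substitutes for the #. forms in the posted function.
(defun golden-section-weights ()
  (values #.(- 1d0 (/ (- 3d0 (sqrt 5d0)) 2d0))   ; Gleft, multiplies a
          #.(/ (- 3d0 (sqrt 5d0)) 2d0)))         ; Gright, multiplies b

;; Print both weights so they can be compared with the constants
;; shown in the disassembly comments.
(multiple-value-bind (gleft gright) (golden-section-weights)
  (format t "Gleft = ~a~%Gright = ~a~%" gleft gright))
```

Because the forms are evaluated at read time, the compiled function only ever sees two double-float literals, which is why the disassembly shows them as constants.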

  2. Re: optimization question

    I'm not sure about SBCL, but on my old iBook G4 under Lispworks 5.02
    (trial), I think I was able to optimize it:

    (declaim (ftype (function (double-float double-float)
                              double-float)
                    golden-section-combination)
             (inline golden-section-combination))

    (defun golden-section-combination (a b)
      "Return the convex combination (1-G)*a+G*b, where G is the
    inverse of the golden ratio."
      (declare (optimize (speed 3) (safety 0)
                         (compilation-speed 0)
                         (space 0) (debug 0)
                         #+lispworks (float 0)))
      (declare (double-float a b))
      (let ((Gright #.(/ (- 3d0 (sqrt 5d0)) 2d0)) ; equals to G above
            (Gleft #.(- 1d0 (/ (- 3d0 (sqrt 5d0)) 2d0)))) ; 1-G
        (the double-float (+ (the double-float (* Gleft a))
                             (the double-float (* Gright b))))))

    (defun test (a b)
      (declare (optimize (speed 3) (safety 0)
                         (compilation-speed 0)
                         (space 0) (debug 0)
                         #+lispworks (float 0)))
      (declare (inline golden-section-combination))
      (declare (type double-float a b))
      (+ (golden-section-combination 12.34d0 56.78d0)
         (golden-section-combination 98.10d0 23.45d0)
         (golden-section-combination a b)))

    The code in test gave me this in assembly (one downside is that it calls
    SYSTEM::RAW-FAST-BOX-DOUBLE, but unless you make the function return
    the value in one of the parameters, it can't be avoided):

    CL-USER 12 > (disassemble 'test)
    0: #x7FE802A6 mflr link
    4: #x3021FFF4 addic sp,sp,#x-C
    8: #xBFA10000 stmw fp,#x0(sp)
    12: #x603D0000 ori fp,sp,#x0
    16: #x83D70006 lwz const,#x6(func)
    20: #xC8830005 lfd f4,#x5(res/arg0)
    24: #xC8640005 lfd f3,#x5(arg1)
    28: #x82DE002D lwz r22,#x2D(const) ; 0.6180339887498949D0
    32: #xC8560005 lfd f2,#x5(r22)
    36: #xFC820132 fmul f4,f2,f0-Ftmp1
    40: #x82DE0031 lwz r22,#x31(const) ; 0.3819660112501051D0
    44: #xC8560005 lfd f2,#x5(r22)
    48: #xFC6200F2 fmul f3,f2,f0-Ftmp1
    52: #xFC84182A fadd f4,f4,f3
    56: #x82DE0035 lwz r22,#x35(const) ; 98.90080577654365D0
    60: #xC8360005 lfd f1,#x5(r22)
    64: #xFC21202A fadd f1,f1,f4
    68: #x82FE0039 lwz func,#x39(const) ; SYSTEM::RAW-FAST-BOX-DOUBLE
    72: #x38000001 li nargs,#x1
    76: #xBBA10000 lmw fp,#x0(sp)
    80: #x3021000C addic sp,sp,#xC
    84: #x7FE803A6 mtlr link
    88: #x80F70002 lwz tmp1,#x2(func)
    92: #x30E70005 addic tmp1,tmp1,#x5
    96: #x7CE903A6 mtctr tmp1
    100: #x4E800420 bctr

    So I guess making your function inlinable could help:

    (declaim (ftype (function (double-float double-float)
                              double-float)
                    golden-section-combination)
             (inline golden-section-combination))

    You also have to tell the compiler that you want it inlined at the
    call site, in test for example:

    (declare (inline golden-section-combination))

    At least those are the rules I've found for Lispworks; I'm not sure
    about SBCL.



  3. Re: optimization question

    And if you want to avoid any consing, and SBCL behaves like Lispworks
    here, then instead of making the function return a value, have it store
    the result in an array (ugly, but that's the way I'm planning to do my
    vector3/4 and matrix33/44 library):

    (defun test (a b c)
      (declare (optimize (speed 3) (safety 0)
                         (compilation-speed 0)
                         (space 0) (debug 0)
                         #+lispworks (float 0)))
      (declare (inline golden-section-combination))
      (declare (type double-float a b)
               (type (simple-array double-float (1)) c))
      (setf (aref c 0) (+ (golden-section-combination 12.34d0 56.78d0)
                          (golden-section-combination 98.10d0 23.45d0)
                          (golden-section-combination a b)))
      c)

    Here c is an array of one double-float.

    CL-USER 21 > (disassemble 'test)
    0: #x7FE802A6 mflr link
    4: #x3021FFF4 addic sp,sp,#x-C
    8: #xBFA10000 stmw fp,#x0(sp)
    12: #x603D0000 ori fp,sp,#x0
    16: #x83D70006 lwz const,#x6(func)
    20: #xC8830005 lfd f4,#x5(res/arg0)
    24: #xC8640005 lfd f3,#x5(arg1)
    28: #x82DE002D lwz r22,#x2D(const) ; 0.6180339887498949D0
    32: #xC8560005 lfd f2,#x5(r22)
    36: #xFC820132 fmul f4,f2,f0-Ftmp1
    40: #x82DE0031 lwz r22,#x31(const) ; 0.3819660112501051D0
    44: #xC8560005 lfd f2,#x5(r22)
    48: #xFC6200F2 fmul f3,f2,f0-Ftmp1
    52: #xFC84182A fadd f4,f4,f3
    56: #x82DE0035 lwz r22,#x35(const) ; 98.90080680013434D0
    60: #xC8760005 lfd f3,#x5(r22)
    64: #xFC83202A fadd f4,f3,f4
    68: #xD8850005 stfd f4,#x5(arg2)
    72: #x60A30000 ori res/arg0,arg2,#x0
    76: #x38000001 li nargs,#x1
    80: #xBBA10000 lmw fp,#x0(sp)
    84: #x3021000C addic sp,sp,#xC
    88: #x7FE803A6 mtlr link
    92: #x4E800020 blr

    No consing, but ugly.

    This one is more flexible:

    (defun test2 (a b &optional (c (make-array 1 :element-type 'double-float)))
      (declare (optimize (speed 3) (safety 0)
                         (compilation-speed 0)
                         (space 0) (debug 0)
                         #+lispworks (float 0)))
      (declare (inline golden-section-combination))
      (declare (type double-float a b)
               (type (simple-array double-float (1)) c))
      (setf (aref c 0) (+ (golden-section-combination 12.34d0 56.78d0)
                          (golden-section-combination 98.10d0 23.45d0)
                          (golden-section-combination a b)))
      c)


    It lets the call site choose whether to reuse memory or cons a fresh
    array, but it costs one additional branch at the beginning of the function:

    CL-USER 22 > (disassemble 'test2)
    0: #x7FE802A6 mflr link
    4: #x3021FFEC addic sp,sp,#x-14
    8: #xBF610000 stmw r27-p,#x0(sp)
    12: #x33A10008 addic fp,sp,#x8
    16: #x83D70006 lwz const,#x6(func)
    20: #x60160000 ori r22,nargs,#x0
    24: #x607B0000 ori r27-p,res/arg0,#x0
    28: #x609C0000 ori r28-p,arg1,#x0
    32: #x2C160002 cmpwi cr0,r22,#x2
    36: #x40810050 ble 116
    40: #x82DE0031 lwz r22,#x31(const) ; 0.6180339887498949D0
    44: #xC8960005 lfd f4,#x5(r22)
    48: #xC87B0005 lfd f3,#x5(r27-p)
    52: #xFC8400F2 fmul f4,f4,f0-Ftmp1
    56: #x82DE0035 lwz r22,#x35(const) ; 0.3819660112501051D0
    60: #xC8760005 lfd f3,#x5(r22)
    64: #xC85C0005 lfd f2,#x5(r28-p)
    68: #xFC6300B2 fmul f3,f3,f0-Ftmp1
    72: #xFC84182A fadd f4,f4,f3
    76: #x82DE0039 lwz r22,#x39(const) ; 98.90080680013434D0
    80: #xC8760005 lfd f3,#x5(r22)
    84: #xFC83202A fadd f4,f3,f4
    88: #xD8850005 stfd f4,#x5(arg2)
    92: #x60A30000 ori res/arg0,arg2,#x0
    96: #x38000001 li nargs,#x1
    100: #xBB610000 lmw r27-p,#x0(sp)
    104: #x30210014 addic sp,sp,#x14
    108: #x7FE803A6 mtlr link
    112: #x4E800020 blr
    116: #x82FE002D lwz func,#x2D(const) ; SYSTEM::ALLOC-I-VECTOR
    120: #x38000003 li nargs,#x3
    124: #x38600004 li res/arg0,#x4
    128: #x38800100 li arg1,#x100
    132: #x63050000 ori arg2,nil,#x0
    136: #x80F70002 lwz tmp1,#x2(func)
    140: #x30E70005 addic tmp1,tmp1,#x5
    144: #x7CE903A6 mtctr tmp1
    148: #x4E800421 bctrl
    152: #x82C3FFFD lwz r22,#x-3(res/arg0)
    156: #x62D62000 ori r22,r22,#x2000
    160: #x92C3FFFD stw r22,#x-3(res/arg0)
    164: #x60650000 ori arg2,res/arg0,#x0
    168: #x4BFFFF80 b 40

    Again, this is under Lispworks in 32-bit mode on Mac OS X PowerPC G4.
    On other platforms or word sizes, or with other Lisps such as SBCL, it
    could be different when it comes to consing.
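
The caller-side half of this pattern can be sketched as below. The name combine-into is made up for the illustration and the body is just the arithmetic from golden-section-combination written out in place; whether a given implementation really avoids consing here still has to be checked with its own disassembler or allocation profiler.

```lisp
;; Hypothetical helper, for illustration only: store the result in a
;; caller-supplied one-element array instead of returning a boxed float.
(defun combine-into (a b out)
  "Store Gleft*a + Gright*b in (aref OUT 0) and return OUT."
  (declare (double-float a b)
           (type (simple-array double-float (1)) out)
           (optimize (speed 3)))
  (setf (aref out 0)
        (+ (* #.(- 1d0 (/ (- 3d0 (sqrt 5d0)) 2d0)) a)
           (* #.(/ (- 3d0 (sqrt 5d0)) 2d0) b)))
  out)

;; The caller allocates the buffer once and reuses it on every call.
(let ((buffer (make-array 1 :element-type 'double-float
                            :initial-element 0d0)))
  (dotimes (i 3)
    (combine-into (float i 1d0) 10d0 buffer)
    (format t "~a~%" (aref buffer 0))))
```

Each call overwrites the previous result, so the value has to be read out of the buffer before the next call.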


  4. Re: optimization question

    >>>>> "Tamas" == Tamas Papp <tkpapp@gmail.com> writes:

    Tamas> Hi,
    Tamas> I want to learn how to write faster numerical code in Lisp. The
    Tamas> function below is not a bottleneck in my current application, but
    Tamas> since it is simple, I thought I could learn some techniques.

    Tamas> The function looks like this:

    Tamas> (defun golden-section-combination (a b)
    Tamas> "Return the convex combination (1-G)*a+G*b, where G is the
    Tamas> inverse of the golden ratio."
    Tamas> (declare (double-float a b))
    Tamas> (let ((Gright #.(/ (- 3d0 (sqrt 5d0)) 2d0)) ; equals to G above
    Tamas> (Gleft #.(- 1d0 (/ (- 3d0 (sqrt 5d0)) 2d0)))) ; 1-G
    Tamas> (the double-float (+ (the double-float (* Gleft a))
    Tamas> (the double-float (* Gright b))))))

    With sbcl I'm pretty sure the "the double-float" stuff isn't needed
    because the compiler can figure out the types of the operations.

    Tamas> My final goal is to achieve the equivalent of the C function

    Tamas> double golden_section_combination(double a, double b) {
    Tamas> return (a*0.3819660112501051+b*0.6180339887498949);
    Tamas> }

    Tamas> i.e., with the constants compiled into the function, but disassemble
    Tamas> shows that there are quite a few other things in there (I am using
    Tamas> SBCL):

    CL-NUMLIB> (disassemble #'golden-section-combination)
    Tamas> ; 0B18A95D: 8B05F0A8180B MOV EAX, [#xB18A8F0] ; 0.6180339887498949d0
    Tamas> ; no-arg-parsing entry point
    Tamas> ; 63: DDD8 FSTPD FR0
    Tamas> ; 65: DD4001 FLDD [EAX+1]
    Tamas> ; 68: D8C9 FMULD FR1
    Tamas> ; 6A: DDD3 FSTD FR3
    Tamas> ; 6C: 8B05F4A8180B MOV EAX, [#xB18A8F4] ; 0.3819660112501051d0
    Tamas> ; 72: DDD8 FSTPD FR0
    Tamas> ; 74: DD4001 FLDD [EAX+1]
    Tamas> ; 77: D8CA FMULD FR2
    Tamas> ; 79: 9B WAIT
    Tamas> ; 7A: D8C3 FADDD FR3
    Tamas> ; 7C: 9B WAIT
    Tamas> ; 7D: 64 FS-SEGMENT-PREFIX
    Tamas> ; 7E: 800D4800000004 OR BYTE PTR [#x48], 4
    Tamas> ; 85: BA10000000 MOV EDX, 16
    Tamas> ; 8A: 64 FS-SEGMENT-PREFIX
    Tamas> ; 8B: 031520000000 ADD EDX, [#x20]
    Tamas> ; 91: 64 FS-SEGMENT-PREFIX
    Tamas> ; 92: 3B1524000000 CMP EDX, [#x24]
    Tamas> ; 98: 7607 JBE L0

    I hate reading x86 asm. The stuff at address 0b18a95d-65 is loading
    the constant 0.618... into the FPU. Inst at 68 multiplies that with
    the first arg, already in FR1. Insts at 6c and 74 load the second
    constant 0.3819... into the FPU. Inst at 77 multiplies that with the
    second arg in FR2. Inst at 7a adds the two products together. The
    rest of the stuff is for boxing up the result.

    Without looking at the C code, I guess the only difference would be
    that the C code might multiply directly from memory. Can't remember
    if that's possible or not.

    So the compiled lisp code is probably quite similar to the C code
    already.

    Ray


  5. Re: optimization question

    "Dimiter \"malkia\" Stanev" <malkia@mac.com> writes:

    > And if you want to avoid any consing, and SBCL behaves like Lispworks
    > here, then instead of making the function return a value, have it store
    > the result in an array (ugly, but that's the way I'm planning to do my
    > vector3/4 and matrix33/44 library):


    Of course, such a solution is not thread-safe - it is not re-entrant.

    In Allegro CL, you can do it in a slightly different way (note that
    I'm on mac/intel here; I much prefer its floating point to that of the
    x87 style unit - note also that I removed all of the extraneous THE
    forms - they just clutter up the source):

    CL-USER(2): (shell "cat try.cl")
    #+allegro
    (eval-when (compile)
      (setf (get 'golden-section-combination 'sys::immed-args-call)
            '((double-float double-float) double-float))
      )

    (defun golden-section-combination (a b)
      "Return the convex combination (1-G)*a+G*b, where G is the
    inverse of the golden ratio."
      (declare (optimize speed (safety 0))
               (double-float a b))
      (let ((Gright #.(/ (- 3d0 (sqrt 5d0)) 2d0)) ; equals to G above
            (Gleft #.(- 1d0 (/ (- 3d0 (sqrt 5d0)) 2d0)))) ; 1-G
        (+ (* Gleft a) (* Gright b))))



    0
    CL-USER(3): :cf try
    ;;; Compiling file try.cl
    ;;; Writing fasl file try.fasl
    ;;; Fasl write complete
    CL-USER(4): :ld try
    ; Fast loading /tmp_mnt/net/gemini/home/duane/try.fasl
    CL-USER(5): (disassemble 'golden-section-combination)
    ;; disassembly of #<Function GOLDEN-SECTION-COMBINATION>
    ;; formals: A EXCL:F_PLACE-HOLDER B EXCL:F_PLACE-HOLDER
    ;; constant vector:
    0: 0.3819660112501051d0
    1: 0.6180339887498949d0

    ;; code start: #x10a51fb4:
    0: ff a7 47 03 00 jmp *[edi+839] ; SYS::IMMED-ARG-HOOK
    00
    6: 8b 5e 12 movl ebx,[esi+18] ; 0.3819660112501051d0
    9: f2 0f 10 6b f6 movsd xmm5,[ebx-10]
    14: 8b 5e 16 movl ebx,[esi+22] ; 0.6180339887498949d0
    17: f2 0f 10 63 f6 movsd xmm4,[ebx-10]
    22: f2 0f 10 5c 24 movsd xmm3,[esp+4]
    04
    28: f2 0f 59 e3 mulsd xmm4,xmm3
    32: f2 0f 10 5c 24 movsd xmm3,[esp+12]
    0c
    38: f2 0f 59 eb mulsd xmm5,xmm3
    42: f2 0f 58 ec addsd xmm5,xmm4
    46: f2 0f 10 c5 movsd xmm0,xmm5
    50: 8b 75 fc movl esi,[ebp-4]
    53: c3 ret
    CL-USER(6):

    If you call this function in straight lisp code, the immed-arg-hook
    unboxes the arguments, calls this function back again, and boxes up
    the result. But if you compile a call to this function with the
    immed-args-call property still in effect, then that call sets up the
    arguments directly in immediate form, calls the function in this
    immediate way, and the return is used (still unboxed, and with no
    consing at this level). It is up to the caller, now, as to what it
    wants to do with the unboxed value it just got back.

    We don't officially document this approach because it is not a very
    Lispy approach - a lispy approach would allow redefinitions without
    consequences. But since it is extremely useful we provide it and
    always give out unofficial documentation to customers who ask for it.

    --
    Duane Rettig duane@franz.com Franz Inc. http://www.franz.com/
    555 12th St., Suite 1450 http://www.555citycenter.com/
    Oakland, Ca. 94607 Phone: (510) 452-2000; Fax: (510) 452-0182

  6. SBCL float to pointer conversion, cost 13 (was Re: optimization question)

    Raymond Toy <raymond.toy@ericsson.com> writes:

    > With sbcl I'm pretty sure the "the double-float" stuff isn't needed
    > because the compiler can figure out the types of the operations.


    Thanks, I have removed it.

    > I hate reading x86 asm. The stuff at address 0b18a95d-65 is loading
    > the constant 0.618... into the FPU. Inst at 68 multiplies that with
    > the first arg, already in FR1. Insts at 6c and 74 load the second
    > constant 0.3819... into the FPU. Inst at 77 multiplies that with the
    > second arg in FR2. Inst at 7a adds the two products together. The
    > rest of the stuff is for boxing up the result.
    >
    > Without looking at the C code, I guess the only difference would be
    > that the C code might multiply directly from memory. Can't remember
    > if that's possible or not.
    >
    > So the compiled lisp code is probably quite similar to the C code
    > already.


    OK, I get it.

    I have simplified the function and I think I am almost done, but SBCL
    gives me notes about float to pointer conversion with a cost of 13.
    Could you please tell me what to do about this?

    The code is here (thanks to Dimiter and Duane too for suggestions):

    (declaim (ftype (function (double-float double-float)
                              double-float)
                    golden-section-combination)
             (inline golden-section-combination))

    (defun golden-section-combination (a b)
      "Return the convex combination (1-G)*a+G*b, where G is the
    inverse of the golden ratio."
      (declare (double-float a b))
      (declare (optimize (speed 3) (safety 0)
                         (compilation-speed 0)
                         (space 0) (debug 0)))
      (+ (* #.(- 1d0 (/ (- 3d0 (sqrt 5d0)) 2d0)) a)
         (* #.(/ (- 3d0 (sqrt 5d0)) 2d0) b)))

    (defun golden-section-minimize (f a b tol)
      "Find a local minimum of f in the [a,b] interval. The
    algorithm terminates when the minimum is bracketed in an interval
    smaller than tol. Since the algorithm is slow, tol should not be
    chosen smaller than necessary. The algorithm will also find the
    local minimum at the endpoints, and if f is unimodal, it will
    find the global minimum."
      (declare (double-float a b tol)
               (type (function (double-float) double-float) f)
               (optimize (speed 3) (safety 0)
                         (compilation-speed 0)
                         (space 0) (debug 0)))
      ;; reorder a and b if necessary
      (when (> a b)
        (rotatef a b))
      ;; start iteration with golden ratio inner points
      (let* ((m1 (golden-section-combination a b))
             (m2 (golden-section-combination b a))
             (f1 (funcall f m1))
             (f2 (funcall f m2)))
        (do ()
            ((< (abs (- b a)) tol)
             (if (< f1 f2)
                 (values m1 f1)
                 (values m2 f2)))
          ;;;; uncomment below for debugging
          ;; (format t "bracket is a=~a~%m1=f(~a)=~a~%m2=f(~a)=~a~%b=~a~%"
          ;; a m1 f1 m2 f2 b)
          (if (< f1 f2)
              (progn
                ;; new bracket is (a,m1,m2)
                (shiftf b m2 m1 (golden-section-combination m1 a))
                (shiftf f2 f1 (funcall f m1)))
              (progn
                ;; new bracket is (m1,m2,b)
                (shiftf a m1 m2 (golden-section-combination m2 b))
                (shiftf f1 f2 (funcall f m2)))))))

    And the warnings I get are:

    ; in: LAMBDA NIL
    ; SB-INT:NAMED-LAMBDA
    ; ==>
    ; #'(SB-INT:NAMED-LAMBDA CL-NUMLIB::GOLDEN-SECTION-COMBINATION
    ; (CL-NUMLIB::A CL-NUMLIB::B)
    ; (DECLARE (DOUBLE-FLOAT CL-NUMLIB::A CL-NUMLIB::B))
    ; (DECLARE
    ; (OPTIMIZE (SPEED 3) (SAFETY 0) (COMPILATION-SPEED 0)
    ; (SPACE 0) (DEBUG 0)))
    ; (BLOCK CL-NUMLIB::GOLDEN-SECTION-COMBINATION
    ; (+ (* 0.6180339887498949d0 CL-NUMLIB::A)
    ; (* 0.3819660112501051d0 CL-NUMLIB::B))))
    ;
    ; note: doing float to pointer coercion (cost 13) to "<return value>"
    ;
    ; compilation unit finished
    ; printed 1 note
    STYLE-WARNING: redefining GOLDEN-SECTION-COMBINATION in DEFUN
    ; in: LAMBDA NIL
    ; LET*
    ;
    ; note: doing float to pointer coercion (cost 13) to M1
    ;
    ; note: doing float to pointer coercion (cost 13) to M2

    ; SHIFTF
    ; --> LET MULTIPLE-VALUE-BIND LET MULTIPLE-VALUE-BIND LET
    ; --> MULTIPLE-VALUE-BIND LET MULTIPLE-VALUE-BIND LET
    ; ==>
    ; (SETQ CL-NUMLIB::M2 #:G13)
    ;
    ; note: doing float to pointer coercion (cost 13) to M2

    ; ==>
    ; (SETQ CL-NUMLIB::M1 #:G6)
    ;
    ; note: doing float to pointer coercion (cost 13) to M1
    ;
    ; compilation unit finished
    ; printed 4 notes

    Thanks,

    Tamas

  7. Re: optimization question

    You probably want to add (declare (optimize speed)) to the function,
    or put (declaim (optimize speed)) at the top of the file.


  8. Re: SBCL float to pointer conversion, cost 13 (was Re: optimization question)

    Try declaring m1, m2, f1 and f2 as double-floats too. I'm not sure
    whether that would help with SBCL, but I think it would with other
    compilers.

    > (let* ((m1 (golden-section-combination a b))
    > (m2 (golden-section-combination b a))
    > (f1 (funcall f m1))
    > (f2 (funcall f m2)))

    (declare (type double-float m1 m2 f1 f2))
    > (do ()
    > ((< (abs (- b a)) tol)
    > (if (< f1 f2)
    > (values m1 f1)
    > (values m2 f2)))
    > ;;;; uncomment below for debugging
    > ;; (format t "bracket is a=~a~%m1=f(~a)=~a~%m2=f(~a)=~a~%b=~a~%"
    > ;; a m1 f1 m2 f2 b)


  9. Re: SBCL float to pointer conversion, cost 13 (was Re: optimization question)

    "Dimiter \"malkia\" Stanev" <malkia@mac.com> writes:

    > Try declaring m1, m2, f1 and f2 as double-floats too. I'm not sure
    > whether that would help with SBCL, but I think it would with other
    > compilers.
    >
    >> (let* ((m1 (golden-section-combination a b))
    >> (m2 (golden-section-combination b a))
    >> (f1 (funcall f m1))
    >> (f2 (funcall f m2)))

    > (declare (type double-float m1 m2 f1 f2))
    >> (do ()
    >> ((< (abs (- b a)) tol)
    >> (if (< f1 f2)
    >> (values m1 f1)
    >> (values m2 f2)))
    >> ;;;; uncomment below for debugging
    >> ;; (format t "bracket is a=~a~%m1=f(~a)=~a~%m2=f(~a)=~a~%b=~a~%"
    >> ;; a m1 f1 m2 f2 b)


    I tried, but it doesn't help. Actually, the same warning crops up
    when I compile golden-section-combination, so it must be something
    there.

    Tamas

  10. Re: SBCL float to pointer conversion, cost 13 (was Re: optimization question)

    Tamas Papp wrote:

    > I tried, but it doesn't help. Actually, the same warning crops up
    > when I compile golden-section-combination, so it must be something
    > there.
    >


    For golden-section-combination, if you make the function local to all
    its callers, then SBCL/CMUCL will probably know enough not to box the
    float return value, at least IIRC. Of course, that isn't suitable for
    functions that are part of your library's external-facing interface.


    i.e. you could do
    (labels
        ((golden-section-combination (a b)
           ...))

      (defun golden-section-minimize (f a b tol)
        ...<calls to golden-section-combination>... )
      ....)
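
Filled in with the function bodies posted earlier in the thread, the outline above might look roughly like this. It is only a sketch: the (safety 0) settings are dropped, the docstring is shortened, and nothing here has been checked against a particular SBCL or CMUCL version, so the compiler notes or the disassembly remain the only way to confirm that the helper's result is no longer boxed. The values returned by the funcall of f are still ordinary boxed Lisp values in any case.

```lisp
;; Sketch only: the helper is local to its single caller, so the
;; compiler is free to pass and return raw double-floats. Whether it
;; actually does so is implementation-dependent.
(labels ((golden-section-combination (a b)
           (declare (double-float a b))
           ;; Same read-time constants as in the thread.
           (+ (* #.(- 1d0 (/ (- 3d0 (sqrt 5d0)) 2d0)) a)
              (* #.(/ (- 3d0 (sqrt 5d0)) 2d0) b))))
  (defun golden-section-minimize (f a b tol)
    "Find a local minimum of F in [A,B], bracketed to within TOL."
    (declare (double-float a b tol)
             (type function f)
             (optimize (speed 3)))
    ;; reorder a and b if necessary
    (when (> a b)
      (rotatef a b))
    ;; start iteration with golden ratio inner points
    (let* ((m1 (golden-section-combination a b))
           (m2 (golden-section-combination b a))
           (f1 (funcall f m1))
           (f2 (funcall f m2)))
      (declare (double-float m1 m2 f1 f2))
      (do ()
          ((< (abs (- b a)) tol)
           (if (< f1 f2)
               (values m1 f1)
               (values m2 f2)))
        (if (< f1 f2)
            (progn
              ;; keep the left part of the bracket
              (shiftf b m2 m1 (golden-section-combination m1 a))
              (shiftf f2 f1 (funcall f m1)))
            (progn
              ;; keep the right part of the bracket
              (shiftf a m1 m2 (golden-section-combination m2 b))
              (shiftf f1 f2 (funcall f m2))))))))

;; Example call: minimize (x-2)^2 on [0,5]; the minimum is at x = 2.
(multiple-value-bind (x fx)
    (golden-section-minimize (lambda (x) (* (- x 2d0) (- x 2d0)))
                             0d0 5d0 1d-6)
  (format t "x = ~a, f(x) = ~a~%" x fx))
```

The trade-off is the one mentioned above: golden-section-combination is no longer callable from outside this form, so this only works for helpers that are not part of the library's public interface.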




