FOR loops removal

This is a discussion on FOR loops removal within the Idl-pvwave forums in Programming Languages category; Dear all, Some weeks ago I've noticed that IDL is rather slow with FOR loops the hard way...after reading about it and not believing it, offcourse. Last week I started removing some FOR loops in my code, apart from some embarasing ones like: sum = 0 FOR i=0,max_y-1 DO BEGIN FOR j=0,max-1 DO BEGIN sum_int = sum_int + data[i,j] ENDFOR ENDFOR which has a straightforward solution. However I have some other FOR loops which aren't that obvious at all. Like for instance this one: FOR l = 0, n*2 DO BEGIN temp = 0 FOR i =0,max_y-1 DO BEGIN FOR ...

Go Back   Application Development Forum > Programming Languages > Idl-pvwave

Object Mix

Register FAQ Calendar Search Today's Posts Mark Forums Read
  #1  
Old 08-19-2008, 08:38 AM
loebasboy
Guest
 
Default FOR loops removal

Dear all,

Some weeks ago I've noticed that IDL is rather slow with FOR loops the
hard way...after reading about it and not believing it, offcourse.
Last week I started removing some FOR loops in my code, apart from
some embarasing ones like:

sum = 0
FOR i=0,max_y-1 DO BEGIN
FOR j=0,max-1 DO BEGIN
sum_int = sum_int + data[i,j]
ENDFOR
ENDFOR

which has a straightforward solution. However I have some other FOR
loops which aren't that obvious at all. Like for instance this one:

FOR l = 0, n*2 DO BEGIN
temp = 0
FOR i =0,max_y-1 DO BEGIN
FOR j=0,max_x-1 DO BEGIN
jtemp = j + l
jtemp2 = j + n
temp = temp + (arr[i,jtemp] * arr [i,jtemp2])
ENDFOR
ENDFOR
output[l] = temp/(max_x*max_y)
ENDFOR

which I could alter, not that straigthforwardly into:

z = size(arr)
arr = reform(in_arr, z[1]*z[2], /overwrite)
endt = (max_y*max_x)-1
FOR l = 0, n*2 DO BEGIN
temp = 0
FOR i=0,endt DO temp = temp + (arr[i+l*max_y] * in_arr [i
+n*max_y])
output[l] = temp/(max_x*max_y)
ENDFOR
in_arr = reform(in_arr, z[1],z[2], /overwrite)

where 1 FOR loop is removed. However there is hardly any time profit
at all. It is even so that the following code is faster than both,
which is a very straightforward alteration of the first loop:

FOR l = 0, n*2 DO BEGIN
temp = 0
FOR i =0,max_y-1 DO FOR j=0,max_x-1 DO temp = temp + (arr[i,j
j + l] * arr [i, j + n])
output[l] = temp/(max_x*max_y)
ENDFOR.

With the following variables set and the for loops repeated with
another FOR loop of i= 0,10000 (to see a time difference, and in the
full program it is repeated about that many times too, but with even
larger arrays):

n = 8
max_x = 5
max_y = 5
output = fltarr(2*n+1)
arr = findgen(interval_y, region) +1

I have for the first for loop : 1.6279998 s
the second : 1.6060002 s
the third : 1.2749999 s

I measured the times with SYSTIME, /SECONDS command.

(the full program takes 22,5 h for a standard image, with the
alterations I have allready done, I've came up with 18.1 h, which is
still 17 h to go to make it workable, i've used the last loop in the
above example so far...)

Can anybody tell me why removing one loop doesn't help in this case or
what i'm doing wrong?

thnx
Stijn Van der Linden


Reply With Quote
  #2  
Old 08-19-2008, 09:43 AM
Wox
Guest
 
Default Re: FOR loops removal

On Tue, 19 Aug 2008 05:38:50 -0700 (PDT), loebasboy
<stijn.vdl@gmail.com> wrote:

> FOR l = 0, n*2 DO BEGIN
> temp = 0
> FOR i =0,max_y-1 DO BEGIN
> FOR j=0,max_x-1 DO BEGIN
> jtemp = j + l
> jtemp2 = j + n
> temp = temp + (arr[i,jtemp] * arr [i,jtemp2])
> ENDFOR
> ENDFOR
> output[l] = temp/(max_x*max_y)
> ENDFOR



The code below is a start. Does this processing have a name? It feels
familiar somehow. Btw, in IDL the first index of an array is the
column and the second is the row. So in your case y are the columns
and x are the rows. No problem with that off course, just check
whether this is how you intended it.

n = 8
max_x = 5
max_y = 5
output = fltarr(2*n+1)
arr = findgen(max_y, 2*n+max_x) +1

arr2=arr[0:max_y-1,n:max_x-1+n]
FOR l = 0, 2*n DO $
output[l] = total(arr[0:max_y-1,l:max_x-1+l]*arr2)
output/=max_x*max_y
Reply With Quote
  #3  
Old 08-19-2008, 09:51 AM
David Fanning
Guest
 
Default Re: FOR loops removal

Wox writes:

> The code below is a start. Does this processing have a name? It feels
> familiar somehow. Btw, in IDL the first index of an array is the
> column and the second is the row. So in your case y are the columns
> and x are the rows. No problem with that off course, just check
> whether this is how you intended it.


And, of course, it *will* matter (especially in loops) how
you access the data. Always keep in mind that internally
data is stored in memory in row order (column indices vary
faster than row indices). Loops that keep this in mind are
fast. Loops that don't keep this in mind are what IDL users
complain about. :-)

Cheers,

David
--
David Fanning, Ph.D.
Fanning Software Consulting, Inc.
Coyote's Guide to IDL Programming: http://www.dfanning.com/
Sepore ma de ni thui. ("Perhaps thou speakest truth.")
Reply With Quote
  #4  
Old 08-19-2008, 12:51 PM
Spon
Guest
 
Default Re: FOR loops removal

On Aug 19, 1:38 pm, loebasboy <stijn....@gmail.com> wrote:
> Like for instance this one:
>
> FOR l = 0, n*2 DO BEGIN
> temp = 0
> FOR i =0,max_y-1 DO BEGIN
> FOR j=0,max_x-1 DO BEGIN
> jtemp = j + l
> jtemp2 = j + n
> temp = temp + (arr[i,jtemp] * arr [i,jtemp2])
> ENDFOR
> ENDFOR
> output[l] = temp/(max_x*max_y)
> ENDFOR


Hi Stijn,

it's been a while since I've had my IDL hat on, but you may be able to
make use of IDL's nifty array multiplication here - if two arrays A
and B have the same number of elements, then C = A * B will also have
that number of elements, with each individual element the product of
that element in the initial two arrays. IDL does this VERY
efficiently.

Given this, I think the inner two FOR loops in the above example can
be replaced by:

Output[l] = TOTAL(arr[0:max_y-1, l:l+max_x-1] * $
arr[0:max_y-1, n:n+max_x-1]) / (max_x*max_y)

Like I say, I'm out of practice, so this might not work. But I think
it should.

Regards,
Chris

>
> thnx
> Stijn Van der Linden


Reply With Quote
  #5  
Old 08-19-2008, 04:04 PM
Chris
Guest
 
Default Re: FOR loops removal

> Can anybody tell me why removing one loop doesn't help in this case or
> what i'm doing wrong?


I think the main reason why your second code snippet isn't much faster
than the first is because it's not a very good vectorization (the term
to describe the elimination of loops in favor of array based
operations).

Not being a computer scientist, I don't actually 'understand' what IDL
does when it compiles and runs code. But the image in my mind is akin
this old man I saw in a post office one time. He couldn't really hear
that well, and kept (loudly) asking the post office clerk 'when the
hell those Rat-a-ville stamps are coming in' (after eavesdropping for
a while, I realized that he was actually sent by his wife to buy
stamps from the movie 'Ratatouille'). After the clerk (repeatedly)
told him that a) it was pronounced 'rat-uh-too-eee' and b) they would
get them next week, the old man was on his way. The impressive thing
was that this 90 year old man FLEW out of the post office when he was
done. He was fast - like Lolo Jones fast.

How does this connect? IDL for loops are slow because the part of IDL
that interprets your file a fast but crotchety old man who can't hear
you very well and may not even really be listening. Any time you tell
him to do something, it takes him a while to interpret what you just
said - much longer than other, less crotchety men. Once he figures out
what's going on, however, he's plenty fast (especially if you tell him
to do something that he was already designed to do, for which he has
been well optimized). Good vectorization, then, minimizes the number
of instructions (e.g. iterations in a loop) while maximizing the
amount of work to do with each instruction.

Your second loop doesn't have any fewer iterations than the first loop
- it just gets rid of one nested for loop and increases the size of
the previous loop. Un-nesting the loops helps a bit (looping the loop
is two layers of interpretation. IDL has no patience for such tasks.
He lived through the depression and fought the Germans), but you
really aren't following the principle of 'loop less with bigger
processing chunks in each step.'

Wox's code is the right way to vectorize your loop. It truly iterates
fewer times, and gives IDL more to chew with each line of instruction.
I wouldn't bother eliminating the L loop. As soon as you do some hefty
processing in each iteration, the looping penalty goes away, and you
don't need to worry about your vectorization creating huge temporary
arrays and paying penalties in memory allocation.

As long as you don't have any loops where, at each iteration, you are
simply accessing an element of an array, IDL should be pretty fast.

chris
Reply With Quote
  #6  
Old 08-19-2008, 04:14 PM
David Fanning
Guest
 
Default Re: FOR loops removal

Chris writes:

> IDL for loops are slow because the part of IDL
> that interprets your file a fast but crotchety old man who can't hear
> you very well and may not even really be listening. Any time you tell
> him to do something, it takes him a while to interpret what you just
> said - much longer than other, less crotchety men. Once he figures out
> what's going on, however, he's plenty fast (especially if you tell him
> to do something that he was already designed to do, for which he has
> been well optimized). Good vectorization, then, minimizes the number
> of instructions (e.g. iterations in a loop) while maximizing the
> amount of work to do with each instruction.


I think this is a transparent attempt to characterize certain
persons on this newsgroup and I deeply resent it. :-(

Cheers,

David

P.S. Let's just say I would type a smiley face, but my
feelings have been cut too close to the bone. :-)

--
David Fanning, Ph.D.
Fanning Software Consulting, Inc.
Coyote's Guide to IDL Programming: http://www.dfanning.com/
Sepore ma de ni thui. ("Perhaps thou speakest truth.")
Reply With Quote
  #7  
Old 08-20-2008, 03:24 AM
loebasboy
Guest
 
Default Re: FOR loops removal

On Aug 19, 3:43*pm, Wox <nom...@hotmail.com> wrote:
> On Tue, 19 Aug 2008 05:38:50 -0700 (PDT), loebasboy
>
> <stijn....@gmail.com> wrote:
> > * * *FOR l = 0, n*2 DO BEGIN
> > * * * *temp *= 0
> > * * * *FOR i =0,max_y-1 DO BEGIN
> > * * * * *FOR j=0,max_x-1 DO BEGIN
> > * * * * * *jtemp = j + l
> > * * * * * *jtemp2 = j + n
> > * * * * * *temp = temp + (arr[i,jtemp] * arr [i,jtemp2])
> > * * * * *ENDFOR
> > * * * *ENDFOR
> > * * * *output[l] = temp/(max_x*max_y)
> > * * *ENDFOR

>
> The code below is a start. Does this processing have a name? It feels
> familiar somehow. Btw, in IDL the first index of an array is the
> column and the second is the row. So in your case y are the columns
> and x are the rows. No problem with that off course, just check
> whether this is how you intended it.
>
> n = 8
> max_x = 5
> max_y = 5
> output = fltarr(2*n+1)
> arr = findgen(max_y, 2*n+max_x) +1
>
> arr2=arr[0:max_y-1,n:max_x-1+n]
> FOR l = 0, 2*n DO $
> * * * * output[l] = total(arr[0:max_y-1,l:max_x-1+l]*arr2)
> output/=max_x*max_y


Thank you for your code, it works rather well, maybe it seems familiar
because it's a kind of autocorrelation that I'm calculating... .

I think I still need some vectorisation training to get IDL much
faster, I've calculated a time profit of 14 h (that makes 8.5 h
instead of 22.5 h), so I still have some FOR loops I can train on .
Thanks for helping finding my way and the fast answers, I think I will
definitely post again when I'm really stuck again .

Reply With Quote
  #8  
Old 08-20-2008, 07:50 AM
Jeremy Bailin
Guest
 
Default Re: FOR loops removal

On Aug 19, 9:43*am, Wox <nom...@hotmail.com> wrote:
> On Tue, 19 Aug 2008 05:38:50 -0700 (PDT), loebasboy
>
> <stijn....@gmail.com> wrote:
> > * * *FOR l = 0, n*2 DO BEGIN
> > * * * *temp *= 0
> > * * * *FOR i =0,max_y-1 DO BEGIN
> > * * * * *FOR j=0,max_x-1 DO BEGIN
> > * * * * * *jtemp = j + l
> > * * * * * *jtemp2 = j + n
> > * * * * * *temp = temp + (arr[i,jtemp] * arr [i,jtemp2])
> > * * * * *ENDFOR
> > * * * *ENDFOR
> > * * * *output[l] = temp/(max_x*max_y)
> > * * *ENDFOR

>
> The code below is a start. Does this processing have a name? It feels
> familiar somehow. Btw, in IDL the first index of an array is the
> column and the second is the row. So in your case y are the columns
> and x are the rows. No problem with that off course, just check
> whether this is how you intended it.
>
> n = 8
> max_x = 5
> max_y = 5
> output = fltarr(2*n+1)
> arr = findgen(max_y, 2*n+max_x) +1
>
> arr2=arr[0:max_y-1,n:max_x-1+n]
> FOR l = 0, 2*n DO $
> * * * * output[l] = total(arr[0:max_y-1,l:max_x-1+l]*arr2)
> output/=max_x*max_y


Following on that last version, I think we can *completely* get rid of
the loop... though at the expense (as usual) of memory:

n = 8
max_x = 5
max_y = 5
arr = findgen(max_y, 2*n+max_x) +1

max_area = max_x*max_y
output = total( arr[rebin(lindgen(max_area),max_area,2*n+1) +
max_y*rebin(reform(lindgen(2*n+1),1,2*n+1),max_are a,2*n+1)] *
rebin( (arr[*,n:max_x-1+n])[*], max_area,2*n+1), 1) / max_area


Whether that's actually faster will depend on how big max_x, max_y and
n are, of course... it ends up internally storing a couple of
max_x*max_y*(2*n+1) arrays, so if that is going to take you into swap
then you're best off sticking with Wox's version. If that stays in
physical memory, though, I bet this will win.

-Jeremy.
Reply With Quote
  #9  
Old 08-20-2008, 08:36 AM
loebasboy
Guest
 
Default Re: FOR loops removal

On 20 aug, 13:50, Jeremy Bailin <astroco...@gmail.com> wrote:
> On Aug 19, 9:43*am, Wox <nom...@hotmail.com> wrote:
>
>
>
>
>
> > On Tue, 19 Aug 2008 05:38:50 -0700 (PDT), loebasboy

>
> > <stijn....@gmail.com> wrote:
> > > * * *FOR l = 0, n*2 DO BEGIN
> > > * * * *temp *= 0
> > > * * * *FOR i =0,max_y-1 DO BEGIN
> > > * * * * *FOR j=0,max_x-1 DO BEGIN
> > > * * * * * *jtemp = j + l
> > > * * * * * *jtemp2 = j + n
> > > * * * * * *temp = temp + (arr[i,jtemp] * arr [i,jtemp2])
> > > * * * * *ENDFOR
> > > * * * *ENDFOR
> > > * * * *output[l] = temp/(max_x*max_y)
> > > * * *ENDFOR

>
> > The code below is a start. Does this processing have a name? It feels
> > familiar somehow. Btw, in IDL the first index of an array is the
> > column and the second is the row. So in your case y are the columns
> > and x are the rows. No problem with that off course, just check
> > whether this is how you intended it.

>
> > n = 8
> > max_x = 5
> > max_y = 5
> > output = fltarr(2*n+1)
> > arr = findgen(max_y, 2*n+max_x) +1

>
> > arr2=arr[0:max_y-1,n:max_x-1+n]
> > FOR l = 0, 2*n DO $
> > * * * * output[l] = total(arr[0:max_y-1,l:max_x-1+l]*arr2)
> > output/=max_x*max_y

>
> Following on that last version, I think we can *completely* get rid of
> the loop... though at the expense (as usual) of memory:
>
> n = 8
> max_x = 5
> max_y = 5
> arr = findgen(max_y, 2*n+max_x) +1
>
> max_area = max_x*max_y
> output = total( arr[rebin(lindgen(max_area),max_area,2*n+1) +
> * max_y*rebin(reform(lindgen(2*n+1),1,2*n+1),max_are a,2*n+1)] *
> * rebin( (arr[*,n:max_x-1+n])[*], max_area,2*n+1), 1) / max_area
>
> Whether that's actually faster will depend on how big max_x, max_y and
> n are, of course... it ends up internally storing a couple of
> max_x*max_y*(2*n+1) arrays, so if that is going to take you into swap
> then you're best off sticking with Wox's version. If that stays in
> physical memory, though, I bet this will win.
>
> -Jeremy.- Tekst uit oorspronkelijk bericht niet weergeven -
>
> - Tekst uit oorspronkelijk bericht weergeven -


Great code, that is another hour of time profit again, and it works.
So vectorization comes down to instead of repeating an action per
element in matrix, putting all elements on the right spot in a matrix
and doing the action on the matrix, right? The only problem is then,
finding out when this is possible, or when it isn't .

thank you all, for all the great info allready!
Reply With Quote
  #10  
Old 08-20-2008, 10:07 AM
Jeremy Bailin
Guest
 
Default Re: FOR loops removal

> So vectorization comes down to instead of repeating an action per
> element in matrix, putting all elements on the right spot in a matrix
> and doing the action on the matrix, right?


Yup, that sums it up pretty well!

-Jeremy.
Reply With Quote
Reply


Thread Tools
Display Modes


All times are GMT -5. The time now is 10:08 PM.


Powered by vBulletin® Version 3.7.2
Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.
Search Engine Optimization by vBSEO 3.2.0
vB Ad Management by =RedTyger=

In an effort to better serve ads to our visitors, cookies are used on objectmix.com. For more information, check out our Privacy Policy.