Re: Sorting routine

This is a discussion on Re: Sorting routine within the Forth forums in Programming Languages category; Thomas Pornin <pornin @ bolet.org> writes Re: Sorting routine > Even if the FSL included a standardized sort, I strongly doubt that a > programmer daring to use his own sorting code would be fried to death by > a lightning bolt wielding vengeful deity. In my view, a generic function > would be generally handy and would fit no less the possible needs than > the absence of said function. And the FSL already contains a whole bunch > of words which handiness is no more general. In my experience (just a datapoint, not proof) it would hardly be ...

Go Back   Application Development Forum > Programming Languages > Forth

Object Mix

Register FAQ Calendar Search Today's Posts Mark Forums Read
  #1  
Old 08-30-2008, 02:36 AM
Marcel Hendrix
Guest
 
Default Re: Sorting routine

Thomas Pornin <pornin@bolet.org> writes Re: Sorting routine

> Even if the FSL included a standardized sort, I strongly doubt that a
> programmer daring to use his own sorting code would be fried to death by
> a lightning bolt wielding vengeful deity. In my view, a generic function
> would be generally handy and would fit no less the possible needs than
> the absence of said function. And the FSL already contains a whole bunch
> of words which handiness is no more general.


In my experience (just a datapoint, not proof) it would hardly be
ever used. Of course, if it were a standard word, I suspect that
people would 'bend' problems so that a sort could solve them.

Furthermore, when I *did* need a sort routine recently (to solve an
Euler problem) it specified:

"If we calculate rad(n) for 1 <= n <= 10,000, then sort them on rad(n),
and sorting on n if the radical values are equal, ... "

... this didn't directly fit (even after reasonable modding) the generic
Forth quicksorts I know of, and I wrote a custom version from the ground
up (it would have taken me longer to find out a way to let a standard
tool do this, which is undoubtedly possible).

IMHO no pressing need for a standard SORT word. OTOH, a centralized
library version in source would be nice, as its performance and
bugfree-ness would increase with each passing year.

-marcel

Reply With Quote
  #2  
Old 09-04-2008, 12:54 PM
dkelvey@hotmail.com
Guest
 
Default Re: Sorting routine

On Aug 29, 11:36 pm, m...@iae.nl (Marcel Hendrix) wrote:
> Thomas Pornin <por...@bolet.org> writes Re: Sorting routine
>
> > Even if the FSL included a standardized sort, I strongly doubt that a
> > programmer daring to use his own sorting code would be fried to death by
> > a lightning bolt wielding vengeful deity. In my view, a generic function
> > would be generally handy and would fit no less the possible needs than
> > the absence of said function. And the FSL already contains a whole bunch
> > of words which handiness is no more general.

>
> In my experience (just a datapoint, not proof) it would hardly be
> ever used. Of course, if it were a standard word, I suspect that
> people would 'bend' problems so that a sort could solve them.
>
> Furthermore, when I *did* need a sort routine recently (to solve an
> Euler problem) it specified:
>
> "If we calculate rad(n) for 1 <= n <= 10,000, then sort them on rad(n),
> and sorting on n if the radical values are equal, ... "
>
> .. this didn't directly fit (even after reasonable modding) the generic
> Forth quicksorts I know of, and I wrote a custom version from the ground
> up (it would have taken me longer to find out a way to let a standard
> tool do this, which is undoubtedly possible).
>
> IMHO no pressing need for a standard SORT word. OTOH, a centralized
> library version in source would be nice, as its performance and
> bugfree-ness would increase with each passing year.
>
> -marcel



Hi Marcel
My sort would have worked but you need to sort the n values
first ( they may have already been sorted since you'd most likely
create the result values by the order of n ). The sorts I describe
will not mess up the order of presorted values because the
routine doesn't do anything out of order. You'd only need to
sort the results and not worry about the values n since these
were already sorted.
Dwight

Reply With Quote
  #3  
Old 09-04-2008, 03:36 PM
Marcel Hendrix
Guest
 
Default Re: Sorting routine

"dkelvey@hotmail.com" <dkelvey@hotmail.com> writes Re: Sorting routine

> On Aug 29, 11:36 pm, m...@iae.nl (Marcel Hendrix) wrote:

[..]
>> Furthermore, when I *did* need a sort routine recently (to solve an
>> Euler problem) it specified:


>> "If we calculate rad(n) for 1 <= n <= 10,000, then sort them on rad(n),
>> and sorting on n if the radical values are equal, ... "


>> .. this didn't directly fit (even after reasonable modding) the generic
>> Forth quicksorts I know of, and I wrote a custom version from the ground
>> up (it would have taken me longer to find out a way to let a standard
>> tool do this, which is undoubtedly possible).

[..]
> My sort would have worked but you need to sort the n values
> first ( they may have already been sorted since you'd most likely
> create the result values by the order of n ). The sorts I describe
> will not mess up the order of presorted values because the
> routine doesn't do anything out of order. You'd only need to
> sort the results and not worry about the values n since these
> were already sorted.


Oeps. You are probably (I didn't test it) right. I was fooled by the
specification (*first* sort on rad(n) *then* on n). One only needs
a 'stable' sort to do it your way, and quite possibly Quicksort could
also have done it... Maybe I will retry the problem, it takes an
embarassing amount of time to finish now.

I looked at your divide&conquer algorithm many years ago, but never
implemented it because, IIRC, I couldn't find a way to generalize it
for arbitrary data. Maybe I will try again. Again IIRC, Knuth describes
it, but it may have gone over my head at the time (it is *very* long
ago :-)

-marcel

Reply With Quote
  #4  
Old 09-04-2008, 06:01 PM
dkelvey@hotmail.com
Guest
 
Default Re: Sorting routine

On Sep 4, 12:36 pm, m...@iae.nl (Marcel Hendrix) wrote:
> "dkel...@hotmail.com" <dkel...@hotmail.com> writes Re: Sorting routine
>
>
>
>
>
>
>
> > On Aug 29, 11:36 pm, m...@iae.nl (Marcel Hendrix) wrote:

> [..]
> >> Furthermore, when I *did* need a sort routine recently (to solve an
> >> Euler problem) it specified:
> >> "If we calculate rad(n) for 1 <= n <= 10,000, then sort them on rad(n),
> >> and sorting on n if the radical values are equal, ... "
> >> .. this didn't directly fit (even after reasonable modding) the generic
> >> Forth quicksorts I know of, and I wrote a custom version from the ground
> >> up (it would have taken me longer to find out a way to let a standard
> >> tool do this, which is undoubtedly possible).

> [..]
> > My sort would have worked but you need to sort the n values
> > first ( they may have already been sorted since you'd most likely
> > create the result values by the order of n ). The sorts I describe
> > will not mess up the order of presorted values because the
> > routine doesn't do anything out of order. You'd only need to
> > sort the results and not worry about the values n since these
> > were already sorted.

>
> Oeps. You are probably (I didn't test it) right. I was fooled by the
> specification (*first* sort on rad(n) *then* on n). One only needs
> a 'stable' sort to do it your way, and quite possibly Quicksort could
> also have done it... Maybe I will retry the problem, it takes an
> embarassing amount of time to finish now.
>
> I looked at your divide&conquer algorithm many years ago, but never
> implemented it because, IIRC, I couldn't find a way to generalize it
> for arbitrary data. Maybe I will try again. Again IIRC, Knuth describes
> it, but it may have gone over my head at the time (it is *very* long
> ago :-)
>
> -marcel



Hi
Ok, here is the way you could use the distribution sort.
You need to determine the radix you expect to use. Radix 256 works
nice
because it is bytes and these are generally easy to extract from the
data.
Make a TableForCounts ( 256 entries )
Make a TableForIndex ( 256 entries, One could reuse the first table
with care )
Start of top loop
Initialize TableForCounts to zeros
Start of tally loop
parse out the lowest order byte that has not yet been sorted
Use this byte as an index into TableForCounts and increment that
value
Repeat tally for each entry
place 0 in the first location of TableForIndex
for I = 0 to 255
TableForIndex[I+1]=TableForIndex[i]+TableForIndex[i]
loop
Repeat for each entry
parse out the same byte as above
use this to index into TableForIndex
Use this index as the location in the output buffer to place the
item
increment the value in the TableForIndex just used
loop
repeat from the start for each byte field

Do note that if the value is a signed two complement, you need
to split the last TableForIndex into two halves because you want
to sort the negative numbers first before the positive. This is best
done in a completely different step because it effect how the
TableForIndex is built.
Hope this makes sense
Dwight

Reply With Quote
  #5  
Old 09-05-2008, 06:19 PM
Marcel Hendrix
Guest
 
Default Re: Sorting routine

"dkelvey@hotmail.com" <dkelvey@hotmail.com> wrote Re: Sorting routine

> On Sep 4, 12:36 pm, m...@iae.nl (Marcel Hendrix) wrote:
>> "dkel...@hotmail.com" <dkel...@hotmail.com> writes Re: Sorting routine

[..]

> Ok, here is the way you could use the distribution sort.


> You need to determine the radix you expect to use. Radix 256 works nice
> because it is bytes and these are generally easy to extract from the data.

[..]
> Hope this makes sense


Like this?

-marcel

-- -----------------------------------------------------------------------
( *
* LANGUAGE : ANS Forth with extensions
* PROJECT : Forth Environments
* DESCRIPTION : RADIX / Distribution SORT
* CATEGORY : Utilities
* AUTHOR : Marcel Hendrix
* LAST CHANGE : September 5, 2008, Marcel Hendrix
* )


0 [IF]

A table COUNTS ( 256 entries )
A table INDEX ( 256 entries )

Initialize COUNTS to zeros
FOR n = 0 to numbytes ( largest entry has numbytes bytes )
BEGIN
parse byte n out of the next entry, the lowest order byte that has not yet been sorted
Use this byte as an index into COUNTS and increment that value
REPEAT ( for each entry )

place 0 in the first location of INDEX
FOR i = 0 to 254
INDEX[i+1] = INDEX[i] + COUNTS[i]
NEXT

BEGIN ( for each entry )
parse out byte n
use this to index into INDEX
use this new index as the location in the output buffer to place the item
increment the value in the INDEX just used
REPEAT
NEXT n

Negative values not supported.
[THEN]


CREATE COUNTS #256 CELLS ALLOT
CREATE INDEX #256 CELLS ALLOT

4 VALUE #bytes
#1000000 VALUE #entries

#entries CELLS ALLOCATE THROW CONSTANT input
#entries CELLS ALLOCATE THROW CONSTANT output

: FILL-input ( -- )
CR ." Creating " #entries . ." input values."
#entries 0 DO #31 2^x CHOOSE input I CELLS + ! LOOP ;

: NEXT-BYTE ( n -- )
8 * 0 0 LOCALS| 'ix ix n |
COUNTS #256 CELLS ERASE
#entries 0 ?DO input I CELLS + @ n RSHIFT $FF AND
COUNTS SWAP CELLS + 1 SWAP +!
LOOP
0 INDEX ! #255 0 DO INDEX I CELLS + @ COUNTS I CELLS + @ + INDEX I 1+ CELLS + ! LOOP
#entries 0 ?DO input I CELLS + @
DUP n RSHIFT $FF AND INDEX SWAP CELLS + DUP TO 'ix @ TO ix
( entry) output ix CELLS + ! 1 'ix +!
LOOP ;

: RADIX-SORT ( -- ) #bytes 0 ?DO I NEXT-BYTE LOOP ;

: TEST ( -- ) FILL-input CR ." Sorting .. " TIMER-RESET RADIX-SORT .ELAPSED ;

\ FORTH> test ( 3 GHz PIV )
\ Creating 1,000,000 input values.
\ Sorting .. 0.126 seconds elapsed. ok

Reply With Quote
  #6  
Old 09-05-2008, 07:51 PM
dkelvey@hotmail.com
Guest
 
Default Re: Sorting routine

On Sep 5, 3:19 pm, m...@iae.nl (Marcel Hendrix) wrote:
> "dkel...@hotmail.com" <dkel...@hotmail.com> wrote Re: Sorting routine
>
> > On Sep 4, 12:36 pm, m...@iae.nl (Marcel Hendrix) wrote:
> >> "dkel...@hotmail.com" <dkel...@hotmail.com> writes Re: Sorting routine

>
> [..]
>
>
>
> > Ok, here is the way you could use the distribution sort.
> > You need to determine the radix you expect to use. Radix 256 works nice
> > because it is bytes and these are generally easy to extract from the data.

> [..]
> > Hope this makes sense

>
> Like this?
>
> -marcel
>
> -- -----------------------------------------------------------------------
> ( *
> * LANGUAGE : ANS Forth with extensions
> * PROJECT : Forth Environments
> * DESCRIPTION : RADIX / Distribution SORT
> * CATEGORY : Utilities
> * AUTHOR : Marcel Hendrix
> * LAST CHANGE : September 5, 2008, Marcel Hendrix
> * )
>
> 0 [IF]
>
> A table COUNTS ( 256 entries )
> A table INDEX ( 256 entries )
>
> Initialize COUNTS to zeros
> FOR n = 0 to numbytes ( largest entry has numbytes bytes )
> BEGIN
> parse byte n out of the next entry, the lowest order byte that has not yet been sorted
> Use this byte as an index into COUNTS and increment that value
> REPEAT ( for each entry )
>
> place 0 in the first location of INDEX
> FOR i = 0 to 254
> INDEX[i+1] = INDEX[i] + COUNTS[i]
> NEXT
>
> BEGIN ( for each entry )
> parse out byte n
> use this to index into INDEX
> use this new index as the location in the output buffer to place the item
> increment the value in the INDEX just used
> REPEAT
> NEXT n
>
> Negative values not supported.
> [THEN]
>
> CREATE COUNTS #256 CELLS ALLOT
> CREATE INDEX #256 CELLS ALLOT
>
> 4 VALUE #bytes
> #1000000 VALUE #entries
>
> #entries CELLS ALLOCATE THROW CONSTANT input
> #entries CELLS ALLOCATE THROW CONSTANT output
>
> : FILL-input ( -- )
> CR ." Creating " #entries . ." input values."
> #entries 0 DO #31 2^x CHOOSE input I CELLS + ! LOOP ;
>
> : NEXT-BYTE ( n -- )
> 8 * 0 0 LOCALS| 'ix ix n |
> COUNTS #256 CELLS ERASE
> #entries 0 ?DO input I CELLS + @ n RSHIFT $FF AND
> COUNTS SWAP CELLS + 1 SWAP +!
> LOOP
> 0 INDEX ! #255 0 DO INDEX I CELLS + @ COUNTS I CELLS + @ + INDEX I 1+ CELLS + ! LOOP
> #entries 0 ?DO input I CELLS + @
> DUP n RSHIFT $FF AND INDEX SWAP CELLS + DUP TO 'ix @ TO ix
> ( entry) output ix CELLS + ! 1 'ix +!
> LOOP ;
>
> : RADIX-SORT ( -- ) #bytes 0 ?DO I NEXT-BYTE LOOP ;
>
> : TEST ( -- ) FILL-input CR ." Sorting .. " TIMER-RESET RADIX-SORT .ELAPSED ;
>
> \ FORTH> test ( 3 GHz PIV )
> \ Creating 1,000,000 input values.
> \ Sorting .. 0.126 seconds elapsed. ok



Hi Marcel
This looks right. There is room for improvements. Using RSHIFT may or
may not be the best way to parse out bytes but other methods would
be machine dependent. On a byte addressing machine it might be
more efficient to just fetch the byte at a time, rather than fetching
the entire value and then extracting the byte. On my NC4000 that
had no byte addressing, I made a hardware register at -1 that would
do a byte swap. The endian problem makes it less portable as well.
As I mentioned, the two arrays, index and counts could be combined.
If kept separate, for the last pass to sort the signed values, one
can
create the index array starting from the 128th entry and then
wrap back to 0 after 255, instead of starting from zero. One would
need to keep the two arrays in this method. This would only be done
as the last step when doing the byte with the sign for 2's complment.
I'm assuming that RSHIFT does a byte sized shift?
One can do a small optimization bye building the counts and index
tables for each byte at once. This save needing to refetch the entries
for each pass to do this calculations. One still needs to run the
last step for each byte separately in order.
If there are a lot of bytes in the values, one can setup such a loop
that one does the counting at the same time as one does the placing
in the output array. This would make the first pass and last pass
unique. This would be handy for sign sorting of 2's complement.
Have you rerun any of the other benches with the faster processor
and using a 1M entries?
Dwight
Reply With Quote
  #7  
Old 09-05-2008, 08:53 PM
Marcel Hendrix
Guest
 
Default Re: Sorting routine

"dkelvey@hotmail.com" <dkelvey@hotmail.com> writes Re: Sorting routine
> On Sep 5, 3:19 pm, m...@iae.nl (Marcel Hendrix) wrote:
>> "dkel...@hotmail.com" <dkel...@hotmail.com> wrote Re: Sorting routine


>> > On Sep 4, 12:36 pm, m...@iae.nl (Marcel Hendrix) wrote:
>> >> "dkel...@hotmail.com" <dkel...@hotmail.com> writes Re: Sorting routine


> [..]
>> \ FORTH> test ( 3 GHz PIV )
>> \ Creating 1,000,000 input values.
>> \ Sorting .. 0.126 seconds elapsed. ok



> Hi Marcel
> This looks right. There is room for improvements. Using RSHIFT may or
> may not be the best way to parse out bytes but other methods would
> be machine dependent. On a byte addressing machine it might be
> more efficient to just fetch the byte at a time, rather than fetching
> the entire value and then extracting the byte. On my NC4000 that
> had no byte addressing, I made a hardware register at -1 that would
> do a byte swap. The endian problem makes it less portable as well.


On a PC, the hardware fetches many 32-bit words at once.

> As I mentioned, the two arrays, index and counts could be combined.
> If kept separate, for the last pass to sort the signed values, one can
> create the index array starting from the 128th entry and then
> wrap back to 0 after 255, instead of starting from zero. One would
> need to keep the two arrays in this method. This would only be done
> as the last step when doing the byte with the sign for 2's complement.


I'll leave the complications for somebody else :-) I like the simplicity
of the current code.

> I'm assuming that RSHIFT does a byte sized shift?


n RSHIFT shifts n bits.

> One can do a small optimization bye building the counts and index
> tables for each byte at once. This save needing to refetch the entries
> for each pass to do this calculations. One still needs to run the
> last step for each byte separately in order.


I am not so sure this will do much (besides decreasing size).

> If there are a lot of bytes in the values, one can setup such a loop
> that one does the counting at the same time as one does the placing
> in the output array. This would make the first pass and last pass
> unique. This would be handy for sign sorting of 2's complement.


Again, I'll gladly leave the complications to other enthusiasts.

> Have you rerun any of the other benches with the faster processor
> and using a 1M entries?


In all cases random numbers were used to initialize the arrays.

FORTH> simple-bench1M ( 3 GHz PIV )
Sedge sort - 1000000 elements, 174 ms (needs 128889 passes)
Quicker sort - 1000000 elements, 191 ms (needs 518998 passes)
Quick sort - 1000000 elements, 195 ms (needs 771816 passes)
Comb sort - 1000000 elements, 317 ms (needs 54 passes)
Heap sort - 1000000 elements, 913 ms (needs 19549621 passes)

So Radix/Distribution sort is about 1.4 times faster than Sedge's Sort.
However, no serious optimization effort has been done yet for either sort.

-marcel

Reply With Quote
  #8  
Old 09-06-2008, 04:10 AM
Marcel Hendrix
Guest
 
Default Re: Sorting routine

Sorry, I forgot to swap input and output after each byte run.

Here is the modified iForth source. The speed has gone from
126 to 100 ms per million values sorted. This is definitely the
fastest integer sort I know.

-marcel

-- ------------------------------------------------
(*
* LANGUAGE : ANS Forth with extensions
* PROJECT : Forth Environments
* DESCRIPTION : RADIX / Distribution SORT
* CATEGORY : Utilities
* AUTHOR : Marcel Hendrix
* LAST CHANGE : September 5, 2008, Marcel Hendrix
*)



NEEDS -miscutil

REVISION -radixsort "--- Radix Sort Version 1.00 ---"

PRIVATES

DOC
(*
Make a table COUNTS ( 256 entries )
Make a table INDEX ( 256 entries )

Initialize COUNTS to zeros
FOR n = 0 to numwords ( largest entry has numwords 8-bit words )
BEGIN
parse word n out of the next entry, the lowest order word that has not yet been sorted
Use this word as an index into COUNTS and increment that value
REPEAT ( for each entry )

place 0 in the first location of INDEX
FOR i = 0 to 2^8 - 2
INDEX[i+1] = INDEX[i] + COUNTS[i]
NEXT

BEGIN ( for each entry )
parse out word n
use this to index into INDEX
use this new index as the location in the output buffer to place the item
increment the value in the INDEX just used
REPEAT

copy output to input
NEXT n

Do note that if the value is a signed two-complement, you need to split the last
INDEX into two halves because you want to sort the negative numbers first before
the positive. This is best done in a completely different step because it effects
how the INDEX is built.
*)
ENDDOC

8 =: #bits PRIVATE
#bits 2^x 1- =: mask PRIVATE
4 VALUE #bytes PRIVATE
#1000000 VALUE #entries PRIVATE

CREATE COUNTS #bits 2^x CELLS ALLOT PRIVATE
CREATE INDEX #bits 2^x CELLS ALLOT PRIVATE

#entries CELLS ALLOCATE ?ALLOCATE =: input PRIVATE
#entries CELLS ALLOCATE ?ALLOCATE =: output PRIVATE

: RESTORE ( xt -- ) DROP input FREE ?ALLOCATE output FREE ?ALLOCATE ; PRIVATE

: FILL-input ( -- )
CR ." Creating " #entries U>D (n,3) ." input values."
#entries 0 DO RANDOM ABS input I CELL[] ! LOOP ;

' RESTORE IS-FORGET FILL-input

: PRINT-input ( -- )
#entries #10 > ?EXIT
CR ." input : " #entries 0 DO input I CELL[] @ DEC. LOOP ;

: PRINT-output ( -- ) PRINT-input ;

: NEXT-BYTE ( n -- )
#bits * 0 COUNTS LOCALS| 'count ix n |
COUNTS #bits 2^x CELLS ERASE
input #entries 0 ?DO @+ n RSHIFT mask AND
COUNTS []CELL 1 SWAP +!
LOOP DROP
0 INDEX !
INDEX #bits 2^x 1- 0 DO @+ 'count @+ SWAP TO 'count + OVER !
LOOP DROP
input #entries 0 ?DO @+ ( addr entry)
DUP n RSHIFT mask AND INDEX []CELL DUP @ 1 ROT +!
( addr entry ix ) output []CELL !
LOOP DROP
output input #entries CELLS MOVE ; PRIVATE

: RADIX-SORT ( -- ) #bytes 0 ?DO I NEXT-BYTE LOOP ;

: TEST ( -- )
FILL-input PRINT-input
CR ." Sorting .. " TIMER-RESET RADIX-SORT .ELAPSED
PRINT-output ;

:ABOUT CR ." Try: TEST " ;

\ FORTH> test ( 3 GHz PIV )
\ Creating 1,000,000 input values.
\ Sorting .. 0.099 seconds elapsed. ok

.ABOUT -radixsort CR
DEPRIVE

(* End of Source *)

Reply With Quote
  #9  
Old 09-06-2008, 09:51 AM
Jos van de Ven
Guest
 
Default Re: Sorting routine


Marcel wrote:

> This is definitely the fastest integer sort I know.


Indeed, but it took me 10.000.000 items under Win32Forth to see it.

[...] LAST CHANGE : September 5, 2008, Marcel Hendrix [...]

The code for Win32Forth is:

(( *
Make a table COUNTS ( 256 entries )
Make a table INDEX ( 256 entries )

Initialize COUNTS to zeros
FOR n = 0 to numwords ( largest entry has numwords 8-bit words )
BEGIN
parse word n out of the next entry, the lowest order word that has not yet been sorted
Use this word as an index into COUNTS and increment that value
REPEAT ( for each entry )

place 0 in the first location of INDEX
FOR i = 0 to 2^8 - 2
INDEX[i+1] = INDEX[i] + COUNTS[i]
NEXT

BEGIN ( for each entry )
parse out word n
use this to index into INDEX
use this new index as the location in the output buffer to place the item
increment the value in the INDEX just used
REPEAT

copy output to input
NEXT n

Do note that if the value is a signed two-complement, you need to split the last
INDEX into two halves because you want to sort the negative numbers first before
the positive. This is best done in a completely different step because it effects
how the INDEX is built.
*)
ENDDOC ))

\ Added for Win32Forth:

: 2^x ( x -- 2^x ) 1 swap 0 ?do 1 lshift loop ;
: (u. ( - ) 0 (d.) type ;
: u.td ( n - ) base @ swap decimal (u. base ! ;
: @+ ( adr -- adr n ) s" dup cell+ swap @ " evaluate ; immediate
: []cell ( no addr - addr+offset ) s" swap cell * + " evaluate ; immediate

synonym =: constant
synonym private noop
synonym cell[] cells+
synonym dec. u.td

\ Only some minor changes:

8 =: #bits PRIVATE
#bits 2^x 1- =: mask PRIVATE
4 VALUE #bytes PRIVATE
#10000000 VALUE #entries PRIVATE

CREATE COUNTS #bits 2^x CELLS ALLOT PRIVATE
CREATE INDEX #bits 2^x CELLS ALLOT PRIVATE

#entries CELLS ALLOCATE THROW =: input PRIVATE
#entries CELLS ALLOCATE THROW =: output PRIVATE

\ : RESTORE ( xt -- ) DROP input FREE ?ALLOCATE output FREE ?ALLOCATE ; PRIVATE

: FILL-input ( -- )
CR ." Creating " #entries ( U>D (n,3) . ." input values."
#entries 0 DO [ 31 2^x 1- ] literal RANDOM ABS input I CELL[] ! LOOP ;

\ ' RESTORE IS-FORGET FILL-input

: PRINT-input ( -- )
#entries #10 > ?EXIT
CR ." input : " #entries 0 DO input I CELL[] @ DEC. space LOOP ;

: PRINT-output ( -- ) PRINT-input ;

: NEXT-BYTE ( n -- )
#bits * 0 COUNTS LOCALS| 'count ix n |
COUNTS #bits 2^x CELLS ERASE
input #entries 0 ?DO @+ n RSHIFT mask AND
COUNTS []CELL 1 SWAP +!
LOOP DROP
0 INDEX !
INDEX #bits 2^x 1- 0 DO @+ 'count @+ SWAP TO 'count + OVER !
LOOP DROP
input #entries 0 ?DO @+ ( addr entry)
DUP n RSHIFT mask AND INDEX []CELL DUP @ 1 ROT +!
( addr entry ix ) output []CELL !
LOOP DROP
output input #entries CELLS MOVE ; PRIVATE

: RADIX-SORT ( -- ) #bytes 0 ?DO I NEXT-BYTE LOOP ;

: TEST ( -- )
FILL-input PRINT-input
CR ." Sorting .. " TIMER-RESET RADIX-SORT .ELAPSED
PRINT-output ;

( CR ." Try: TEST " )

TEST

\s

> \ FORTH> test ( 3 GHz PIV )
> \ Creating 1,000,000 input values.
> \ Sorting .. 0.099 seconds elapsed. ok


\ Win32Forth P4 between 2.8 - 3.25 Ghz

Creating 10000000 input values.
Sorting .. Elapsed time: 00:00:20.419 ok

cell sort sedge-c RANDOM-FILL:
10,000,000 items UNsorted Elapsed time: 00:00:25.019 sorted

(* End of Source *)

==
4ePost: 4,095 bytes in mail. Elapsed time to buffer: .000112 sec.
Reply With Quote
  #10  
Old 09-06-2008, 11:03 AM
Marcel Hendrix
Guest
 
Default Re: Sorting routine

mhx@iae.nl (Marcel Hendrix) writes Re: Sorting routine

> Sorry, I forgot to swap input and output after each byte run.

[..]
> Here is the modified iForth source. The speed has gone from
> 126 to 100 ms per million values sorted. This is definitely the
> fastest integer sort I know.

[..]
> \ FORTH> test ( 3 GHz PIV iForth32 )
> \ Creating 1,000,000 input values.
> \ Sorting .. 0.099 seconds elapsed. ok


The MOVE can be changed into a simple exchange of the input and output buffer
pointers. And, of course, when the data has high bytes that are all zero, no
sorting pass is needed. With these two changes the results become:

( 3 GHz AMD X2, iForth64, 1,000,000 items )
FORTH> 8 test \ Sorting .. 0.021 seconds elapsed.
FORTH> 16 test \ Sorting .. 0.030 seconds elapsed.
FORTH> 24 test \ Sorting .. 0.050 seconds elapsed.
FORTH> 32 test \ Sorting .. 0.059 seconds elapsed.
FORTH> 48 test \ Sorting .. 0.089 seconds elapsed.
FORTH> 56 test \ Sorting .. 0.108 seconds elapsed.
FORTH> 63 test \ Sorting .. 0.118 seconds elapsed.

Here "8 test" means: sort 1,000,000 items of 8 bits each (i.e. only one
sorting pass will be done).

On my AMD X2, the fastest comparable sort, Sedge's, needs 130 ms for
1 million bytes.

-marcel

Reply With Quote
Reply


Thread Tools
Display Modes


All times are GMT -5. The time now is 02:53 AM.


Powered by vBulletin® Version 3.7.2
Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.
Search Engine Optimization by vBSEO 3.2.0
vB Ad Management by =RedTyger=

In an effort to better serve ads to our visitors, cookies are used on objectmix.com. For more information, check out our Privacy Policy.