primitives vs cleverness vs readability

  #1  
Old 08-03-2008, 06:45 PM
DavidM
Guest
 
Default primitives vs cleverness vs readability

As we all know, all VMs regardless of the language can impose huge run-
time performance penalties, compared to native coding in native compiled
languages.

An STC-based Forth with optimal hand-coded assembler primitives can evade
this cost to a large degree, so I won't be talking about that here. I'm
thinking more of DTC, ITC and TTC-based forths.

What I am interested in is: when is it better to sweat over the coding of
forth words, squeezing every last ounce of speed out of them at the very
likely cost of readability/maintainability, as opposed to just taking the
time critical stuff and coding it as C and/or assembler primitives?

Cheers
Dave
  #2  
Old 08-03-2008, 08:27 PM
roger.levy@gmail.com
Guest
 
Default Re: primitives vs cleverness vs readability

On Aug 3, 6:45 pm, DavidM <nos...@nowhere.com> wrote:
> As we all know, all VMs regardless of the language can impose huge run-
> time performance penalties, compared to native coding in native compiled
> languages.
>
> An STC-based Forth with optimal hand-coded assembler primitives can evade
> this cost to a large degree, so I won't be talking about that here. I'm
> thinking more of DTC, ITC and TTC-based forths.
>
> What I am interested in is: when is it better to sweat over the coding of
> forth words, squeezing every last ounce of speed out of them at the very
> likely cost of readability/maintainability, as opposed to just taking the
> time critical stuff and coding it as C and/or assembler primitives?
>
> Cheers
> Dave


I'd say ... you shouldn't sweat over Forth code at all. Do what's
necessary and no more. You want to finish your application (if you're
writing one), not torture yourself.

I'd choose readability and ease of coding over speed. At the same
time I like speed too so I use fast native external libraries and an
STC Forth.

Many times I've read about determining the most often executed
routines and optimizing those into assembly. I have a handful in my
project that are candidates for this, but I'm waiting until I need
more speed, not anticipating it. Or when I'm bored and just want to
code SOMETHING. Which I've done, but I usually feel an odd sense of
having wasted my time after.

So it goes, a lot of people say not to optimize prematurely. Oops.

On the bright side, algorithm redesign can speed things up. I've
reduced routines to 25% of their original size just through a redesign of
an algorithm that was meant to simplify things, and I wasn't even
looking for speed.

Hope those helped.

Roger
  #3  
Old 08-03-2008, 10:41 PM
Jonah Thomas
Guest
 
Default Re: primitives vs cleverness vs readability

DavidM <nospam@nowhere.com> wrote:

> What I am interested in is: when is it better to sweat over the coding
> of forth words, squeezing every last ounce of speed out of them at the
> very likely cost of readability/maintainability, as opposed to just
> taking the time critical stuff and coding it as C and/or assembler
> primitives?


If your code is already fast enough running on your particular Forth on
your particular hardware, then you don't need to do either one, you're
done.

So, you have your Forth code that works but it isn't fast enough. Step
back and notice whether you see some other method that would run
faster. You can try out new methods faster in Forth. See which one looks
like it's actually doing the least work. If you find a better algorithm
and it's fast enough, then you're done.

OK, so your best method is still too slow. Look at it carefully for ways
to speed it up with better Forth. Don't do things that make it
unreadable. They aren't worth it. Something will go wrong and it will be
extra trouble to fix it. But you might as well look in case you've
missed something that would speed it up. You can do some profiling to
see where the slow stuff is, don't spend much effort on the stuff that
can't help. If you see something that lets you speed up a critical inner
loop, maybe it will be fast enough. If so then you're done.

If it's still too slow then by this time you know where the slow spots
are. Look for something that it pays to do in C or assembler. If you're
already using a good Forth optimiser that produces native code, you
might only expect to speed it up 3-4 times. Maybe less, depending. If
you need more speedup than you have any right to hope for, now is the
time to either go back and look again for a better algorithm, or else
look for faster hardware. Or you could try doing it in C or assembly
just in case. If it's fast enough at this point then you're done.
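
A rough way to judge "more speedup than you have any right to hope for" is Amdahl's law, which is not named in the post but fits the reasoning. The sketch below assumes you already have a profile; the 60% figure and the function name `overall_speedup` are made up for illustration.

```python
def overall_speedup(hot_fraction, hot_speedup):
    """Amdahl's law: whole-program speedup when only the hot part gets faster.

    hot_fraction: share of total run time spent in the code being rewritten (0..1).
    hot_speedup:  how many times faster that code becomes after the rewrite.
    """
    if not 0.0 <= hot_fraction <= 1.0:
        raise ValueError("hot_fraction must be between 0 and 1")
    if hot_speedup <= 0:
        raise ValueError("hot_speedup must be positive")
    return 1.0 / ((1.0 - hot_fraction) + hot_fraction / hot_speedup)


if __name__ == "__main__":
    # Hypothetical profile: 60% of the run time is spent in one inner loop.
    for factor in (3, 4, 1000):
        gain = overall_speedup(0.6, factor)
        print(f"inner loop {factor}x faster -> whole program {gain:.2f}x faster")
```

With those assumed numbers, even an infinitely fast inner loop cannot make the whole program more than 2.5 times faster, so if you need 5 times, recoding that loop in assembler will not get you there.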

If you've already coded one bottleneck and it didn't help enough, you
probably have some idea how much speedup you can hope for from the
second bottleneck. Guess whether you can make it fast enough by assembly
coding. If it doesn't look plausible, your best choices are to find a
faster algorithm or get faster hardware. Look at the things that gobble
up the time and imagine ways to get your result without doing them.
Like, one time the slow part was to compute successive integer square
roots inside an inner loop. The solution was to not compute square
roots, but instead compute successive squares. The square root stays the
same for x iterations until you get to the new square. And if you know n
and n^2, then (n+1)^2 = n^2 + 2n + 1. Very fast.
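
A minimal sketch of that trick, in Python rather than Forth, assuming the inner loop walks x upward one step at a time. The name `isqrt_stream` is hypothetical; `math.isqrt` is used only to check the result.

```python
import math


def isqrt_stream(limit):
    """Yield (x, floor(sqrt(x))) for x = 0 .. limit-1 without taking any square root."""
    root = 0          # current integer square root n
    next_square = 1   # (n + 1)^2, the first x at which the root must step up
    for x in range(limit):
        if x == next_square:
            root += 1
            # next_square currently holds n^2 for the new n;
            # (n + 1)^2 = n^2 + 2n + 1, so the following square is one add away.
            next_square += 2 * root + 1
        yield x, root


if __name__ == "__main__":
    # Cross-check against the library routine on a small range.
    assert all(r == math.isqrt(x) for x, r in isqrt_stream(10_000))
    print(list(isqrt_stream(10)))
```

The loop body costs one comparison per iteration, plus a couple of additions only when x reaches the next perfect square.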

Making your code unreadable is a mug's game. You lose the advantages of
Forth for a moderate speed improvement.

Rewriting your code in C or assembler can be a mug's game. You lose the
advantages of Forth for a moderate speed improvement. Do it if you need
to, and if you think it will work well enough.

Forth is good for prototyping alternate methods. That's worth a try. The
problem with that approach is that you can't tell ahead of time what
you'll find, and there's a chance you'll put significant time into it and
not find anything. So alternate with other approaches. If you have a
manager who pays close attention to how you use your time, coding things
in C will look like a valid exercise. So when you spend half your time
doing that, at worst it will look like you're half as fast coding in C
as you really are. If you spend all your time looking for better
methods and you don't find them, then it might appear that you've just
been goofing off.

Writing code that's at the very edge of what your processor can do is
also a mug's game. You put a lot of effort into getting things barely
fast enough, and pretty soon the specs will change and demand more. The
first time you have to be real smart to get your code fast enough is a
big warning that you need a faster processor. They can pay you the big
bucks to do smarter and smarter tricks, and then they'll have to switch
to a faster processor anyway. All the time you spent writing optimised
assembly code for the old processor is wasted. (But the C code can be
salvaged provided the C compiler for the new chip is 100% compatible
with that for the old one.) The Forth code ought to run but the speed
tradeoffs may be different. If you spent a lot of effort writing just
exactly the sequence of Forth that was fastest, and now SWAP is seven
times as fast as it used to be but ROT is only twice as fast, any effort
you spent on fast stack juggling is wasted -- even if speed is still an
issue on the new processor.


[minor rant on] Forth is good for making code size small. You can put
some effort into byte code, you can compress source code and decompress
it a line at a time and interpret it, there are lots of ways to make
your code very small if you don't need it to be fast. It can be pretty
cheap to produce small code.

Forth is one of the best scripting languages for making code fast. Good
Forth programmers can produce relatively fast code cheaper than faster
code written in C.

Forth is one of the best choices for making code that's fast and small
both. But it isn't cheap to do that. This is not a good market niche for
Forth, even though it's a niche that Forth is good for. If somebody is
using an obsolescent processor that needs to do more than it can do, if
they want to cram more functionality in limited space and limited speed
than anybody can reasonably expect, maybe you can do it for them with
Forth. Then they bite the bullet and switch processors, and their costs
for getting you to do all that great stuff must be completely amortized
right then. Very likely they'll decide it wasn't worth it, they should
have switched earlier. Next time, or the time after, they do switch
earlier. You get bragging rights for doing this superhuman work but
repeat sales don't happen as often as you'd like.

I think it would be much better to develop the reputation for delivering
more than expected, if you can do that. If you can give a competitive
bid, and then meet the specs long before deadline, and ask "What else
would you like us to do?".... Getting the obsolete processor to deliver
a little bit longer is not quite beating a dead horse. Delivering code
that's small and reasonably fast and *correct*, before deadline, and
then offering something extra -- there ought to be a big market for
that. If you can deliver.
[rant end]
  #4  
Old 08-03-2008, 11:09 PM
Elizabeth D Rather
Guest
 
Default Re: primitives vs cleverness vs readability

DavidM wrote:
> As we all know, all VMs regardless of the language can impose huge run-
> time performance penalties, compared to native coding in native compiled
> languages.
>
> An STC-based Forth with optimal hand-coded assembler primitives can evade
> this cost to a large degree, so I won't be talking about that here. I'm
> thinking more of DTC, ITC and TTC-based forths.
>
> What I am interested in is: when is it better to sweat over the coding of
> forth words, squeezing every last ounce of speed out of them at the very
> likely cost of readability/maintainability, as opposed to just taking the
> time critical stuff and coding it as C and/or assembler primitives?
>
> Cheers
> Dave


A story I've told here before is applicable (sorry if you've already
heard it): FORTH, Inc. was asked to recode a baggage handling system
for American Airlines. The original program was all assembler, and too
expensive to maintain. We were required to reproduce the user interface
and basic bag handling procedures, but could do whatever else seemed
appropriate. Our program was written entirely in polyFORTH (ITC),
running native on an LSI-11 (yeah, it was quite a few years ago). When
it was sufficiently complete to run some timing tests, everyone was
astonished: our system could handle 25% more bags/minute than the
previous one. polyFORTH was obviously not faster than pure assembler;
the point was that our internal design was far more efficient than its
predecessor.

The overall design of an application is a much stronger determinant of
performance than language (any language). Sweating bullets over
language benchmarks is a largely meaningless exercise.

As others have said, the important thing is to get your program running
in the most straightforward way possible. In designing your program, be
cognizant of the potentially time-critical parts, and try to come up
with a clean design implemented in clean, readable code. When your
program is running correctly, you can do timing studies and it should be
clear what sections, if any, need some kind of optimization.

Modern Forths running on modern hardware are fast enough for the vast
majority of applications. It's not worth sweating until you have a
working program and can establish that you have a timing problem (and
where it is). Then you can focus on that bottleneck.

Cheers,
Elizabeth

--
==================================================
Elizabeth D. Rather (US & Canada) 800-55-FORTH
FORTH Inc. +1 310.999.6784
5959 West Century Blvd. Suite 700
Los Angeles, CA 90045
http://www.forth.com

"Forth-based products and Services for real-time
applications since 1973."
==================================================
  #5  
Old 08-03-2008, 11:28 PM
Jerry Avins
Guest
 
Default Re: primitives vs cleverness vs readability

Elizabeth D Rather wrote:

...

> A story I've told here before is applicable (sorry if you've already
> heard it): FORTH, Inc. was asked to recode a baggage handling system
> for American Airlines. The original program was all assembler, and too
> expensive to maintain. We were required to reproduce the user interface
> and basic bag handling procedures, but could do whatever else seemed
> appropriate. Our program was written entirely in polyFORTH (ITC),
> running native on an LSI-11 (yeah, it was quite a few years ago). When
> it was sufficiently complete to run some timing tests, everyone was
> astonished: our system could handle 25% more bags/minute than the
> previous one. polyFORTH was obviously not faster than pure assembler;
> the point was that our internal design was far more efficient than its
> predecessor.


Which code was running last Wednesday at JFK? What a mess!
http://news.yahoo.com/s/nm/20080730/ts_nm/amr_jfk_dc

Jerry
--
Engineering is the art of making what you want from things you can get.
  #6  
Old 08-04-2008, 03:11 AM
John Passaniti
Guest
 
Default Re: primitives vs cleverness vs readability

On Aug 3, 6:45 pm, DavidM <nos...@nowhere.com> wrote:
> As we all know, all VMs regardless of the language can impose huge run-
> time performance penalties, compared to native coding in native compiled
> languages.


The key word is "can." The slowest VM can still outperform the
fastest native code if the algorithms used are superior. What matters
is the performance of the system as a whole, not the VM. That's a
trap that you see endlessly here in comp.lang.forth-- a preoccupation
with speed. But not speed of some concrete real-world application,
but the speed of some small primitive in the system. The theory, I
guess, is that by focusing on speeding up all the primitives, the
performance of the overall system is improved. To which I say,
nonsense-- if I choose a superior algorithm, that is going to give me
a far better pay-off in terms of performance than if I manage to
reduce a routine I hardly ever call by a few cycles.

> What I am interested in is: when is it better to sweat over the coding of
> forth words, squeezing every last ounce of speed out of them at the very
> likely cost of readability/maintainability, as opposed to just taking the
> time critical stuff and coding it as C and/or assembler primitives?


The first step is to consider different algorithms and data
structures. And here, simpler isn't always better. For example,
depending on size of what you're searching, a simple linear search
will be slower (O(n)) than a more complex binary search (O(log n)).
And a more complex binary search is likely to be slower than an even
more complex hashing algorithm (typically O(1)). On the other hand,
if the size of what you're searching is small, then a linear search
may be fastest. It requires thought and sometimes experiment to know
how to choose algorithms and data structures.
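
The three searches can be sketched side by side so the experiment is easy to run. This is an illustration in Python, not anything from a Forth system; the function names and the data sizes are assumptions, and the timings will vary by machine, so none are quoted here.

```python
import bisect
import timeit


def linear_search(items, target):
    """O(n): scan every element; works on unsorted data."""
    for index, value in enumerate(items):
        if value == target:
            return index
    return -1


def binary_search(sorted_items, target):
    """O(log n): requires the input to be sorted."""
    index = bisect.bisect_left(sorted_items, target)
    if index < len(sorted_items) and sorted_items[index] == target:
        return index
    return -1


def build_hash_index(items):
    """One-time O(n) setup; each lookup afterwards is O(1) on average."""
    return {value: index for index, value in enumerate(items)}


if __name__ == "__main__":
    for size in (8, 100_000):            # a tiny table and a large one
        data = list(range(0, 2 * size, 2))
        table = build_hash_index(data)
        target = data[-1]                # worst case for the linear scan
        for label, lookup in (
            ("linear", lambda: linear_search(data, target)),
            ("binary", lambda: binary_search(data, target)),
            ("hash", lambda: table.get(target, -1)),
        ):
            seconds = timeit.timeit(lookup, number=200)
            print(f"n={size:>6} {label:>6}: {seconds:.6f} s for 200 lookups")
```

Running it on your own data sizes is the point: the crossover between "simple is fastest" and "complex is fastest" is something to measure, not guess.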

If you're talking to hardware that has strict real-time performance
requirements that can't be met in Forth, then that is one time when
coding in a more primitive language makes sense. If your processing
is not keeping up with input data, then coding core routines in a more
primitive language makes sense.

But before one starts down that road, they need to *measure*. That
may mean running a profiler to identify where to focus efforts. That
may mean getting out your oscilloscope or logic analyzer and
establishing a timing baseline. That may mean instrumenting the VM to
collect statistics on things like counts of certain instructions or
how much time some instructions take up. The point is that you need
to have some objective metric by which you can not only verify that
your efforts have paid off, but to understand the run-time behavior of
the system. You need that because your intuition can be wrong. You
need that because your experience can blind you to what is in front of
you.
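
As a sketch of what "instrumenting the VM" can look like, here is a toy token interpreter in Python that counts how often each primitive runs and how long it spends there. It is not modelled on any real Forth kernel; the token format, the primitive set, and the name `run` are all assumptions made for the example.

```python
from collections import Counter
import time


def run(program, counts, elapsed):
    """Run a list of tokens on a tiny stack VM, recording per-token statistics.

    program: list of tokens; ("lit", n) pushes n, plain strings name primitives.
    counts:  Counter updated in place with the number of executions per token name.
    elapsed: Counter updated in place with the seconds spent per token name.
    """
    stack = []
    primitives = {
        "dup": lambda s: s.append(s[-1]),
        "swap": lambda s: s.extend([s.pop(), s.pop()]),
        "+": lambda s: s.append(s.pop() + s.pop()),
        "*": lambda s: s.append(s.pop() * s.pop()),
    }
    for token in program:
        name = token[0] if isinstance(token, tuple) else token
        start = time.perf_counter()
        if name == "lit":
            stack.append(token[1])
        else:
            primitives[name](stack)
        elapsed[name] += time.perf_counter() - start
        counts[name] += 1
    return stack


if __name__ == "__main__":
    # Equivalent of the Forth phrase:  3 dup *  4 dup *  +
    program = [("lit", 3), "dup", "*", ("lit", 4), "dup", "*", "+"]
    counts, elapsed = Counter(), Counter()
    print("result:", run(program, counts, elapsed))
    for name, n in counts.most_common():
        print(f"{name:>4}: {n} executions, {elapsed[name]:.2e} s")
```

The counters give the objective baseline: before and after any optimisation you can see which tokens dominate, instead of trusting intuition about which ones ought to.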
  #7  
Old 08-04-2008, 08:59 AM
Elizabeth D Rather
Guest
 
Default Re: primitives vs cleverness vs readability

Jerry Avins wrote:
> Elizabeth D Rather wrote:
>
> ...
>
>> A story I've told here before is applicable (sorry if you've already
>> heard it): FORTH, Inc. was asked to recode a baggage handling system
>> for American Airlines. The original program was all assembler, and
>> too expensive to maintain. We were required to reproduce the user
>> interface and basic bag handling procedures, but could do whatever
>> else seemed appropriate. Our program was written entirely in
>> polyFORTH (ITC), running native on an LSI-11 (yeah, it was quite a few
>> years ago). When it was sufficiently complete to run some timing
>> tests, everyone was astonished: our system could handle 25% more
>> bags/minute than the previous one. polyFORTH was obviously not faster
>> than pure assembler; the point was that our internal design was far
>> more efficient than its predecessor.

>
> Which code was running last Wednesday at JFK? What a mess!
> http://news.yahoo.com/s/nm/20080730/ts_nm/amr_jfk_dc
>
> Jerry


Hah, thank you for that! No, our system was at LAX. It operated for
about 10 years before AA corporate decided to standardize on a turnkey
system provided by a company "specializing in baggage handling systems".

ISTR there was a similar snafu when the new Denver terminal opened.

Cheers,
Elizabeth

  #8  
Old 08-04-2008, 09:21 AM
Bernd Paysan
Guest
 
Default Re: primitives vs cleverness vs readability

John Passaniti wrote:
> The key word is "can." The slowest VM can still outperform the
> fastest native code if the algorithms used are superior. What matters
> is the performance of the system as a whole, not the VM. That's a
> trap that you see endlessly here in comp.lang.forth-- a preoccupation
> with speed. But not speed of some concrete real-world application,
> but the speed of some small primitive in the system.


You don't see the forest for the trees. When we argue here, we are quite
often concerned about speed. You then step in and say "there's no
requirement to be performant here". That's sloppy thinking - there's no
requirement for performance here and there, so you implement CPU hogs here
and there and then wonder why your application is dog slow.

You forget that a fast, low-memory solution is often straight-forward and
clean design, too. It's easy to maintain, as well. That's the goal. If your
performance improvements are a burden for a small, clean implementation,
forget it.

After you have done that - implement something that's already sane in terms
of space, performance, and lines of code, you can start measuring things.
You might be surprised that something that looked sane isn't, but in
general, it makes things much easier. After all, you have only to hunt
those bottlenecks that are still there despite the preparation, and the
design is lean and clean.

--
Bernd Paysan
"If you want it done right, you have to do it yourself"
http://www.jwdt.com/~paysan/
  #9  
Old 08-04-2008, 09:59 AM
Elizabeth D Rather
Guest
 
Default Re: primitives vs cleverness vs readability

Bernd Paysan wrote:
> John Passaniti wrote:
>> The key word is "can." The slowest VM can still outperform the
>> fastest native code if the algorithms used are superior. What matters
>> is the performance of the system as a whole, not the VM. That's a
>> trap that you see endlessly here in comp.lang.forth-- a preoccupation
>> with speed. But not speed of some concrete real-world application,
>> but the speed of some small primitive in the system.

>
> You don't see the forest for the trees. When we argue here, we are quite
> often concerned about speed. You then step in and say "there's no
> requirement to be performant here". That's sloppy thinking - there's no
> requirement for performance here and there, so you implement CPU hogs here
> and there and then wonder why your application is dog slow.
>
> You forget that a fast, low-memory solution is often straight-forward and
> clean design, too. It's easy to maintain, as well. That's the goal. If your
> performance improvements are a burden for a small, clean implementation,
> forget it.
>
> After you have done that - implement something that's already sane in terms
> of space, performance, and lines of code, you can start measuring things.
> You might be surprised that something that looked sane isn't, but in
> general, it makes things much easier. After all, you have only to hunt
> those bottlenecks that are still there despite the preparation, and the
> design is lean and clean.


Well, but the OP was asking where he should put his emphasis in
optimizing low-level code. John and I are both saying, in different
words, that the place to put the primary effort is in application
design, not low-level code optimization. We're not saying performance
doesn't matter, but pointing out where the main focus should be in
achieving it.

In my story about the baggage system, the polyFORTH that we used was
roughly 10x faster than FIGforths of that era, but what made the
difference was not that but the design of polyFORTH's multitasker and
the way we used it in the application (the old system was doing a lot of
polling and flag passing internally). I'm all for fast systems, but
that doesn't address the OP's issue.

Cheers,
Elizabeth

  #10  
Old 08-04-2008, 12:41 PM
Stephen Pelc
Guest
 
Default Re: primitives vs cleverness vs readability

On 4 Aug 2008 10:45:30 +1200, DavidM <nospam@nowhere.com> wrote:

>As we all know, all VMs regardless of the language can impose huge run-
>time performance penalties, compared to native coding in native compiled
>languages.
>
>An STC-based Forth with optimal hand-coded assembler primitives can evade
>this cost to a large degree, so I won't be talking about that here. I'm
>thinking more of DTC, ITC and TTC-based forths.
>
>What I am interested in is: when is it better to sweat over the coding of
>forth words, squeezing every last ounce of speed out of them at the very
>likely cost of readability/maintainability, as opposed to just taking the
>time critical stuff and coding it as C and/or assembler primitives?


It depends what the VM is for! When MPE and Forth Inc were working on
the OTA virtual machine, we found that if the high level portion of
the system (I/O, database ...) was sufficiently high-level, the
payment terminal applications spent most of their time in the high
level functions. I do not remember the numbers ... it was a long time
ago.

OTA was a token threaded 32 bit system. Underneath that, depending
on the CPU were DTC and STC Forth kernels for CPUs like 80186/V25,
8051 and 68000.

Stephen


--
Stephen Pelc, stephenXXX@mpeforth.com
MicroProcessor Engineering Ltd - More Real, Less Time
133 Hill Lane, Southampton SO15 5AF, England
tel: +44 (0)23 8063 1441, fax: +44 (0)23 8033 9691
web: http://www.mpeforth.com - free VFX Forth downloads