A stupid post about Intel's latest computer chip ( s) - Java
Re: A stupid post about Intel's latest computer chip ( s)
In sci.math, A Man Crying Alone In The Wilderness
<cpu16x1832@wmconnect.com>
wrote
on 23 Oct 2005 11:16:57 -0700
<1130091417.708926.74340@g49g2000cwa.googlegroups.com>:
> Come on now, you are less of an idiot to understand this,
>
>
> IBM/INTEL architecture,
>
> REGISTER_1 ( A storage location)
> REGISTER_2 ( A storage location)
> REGISTER_3 ( A storage location)
> ( etc. . . . )
> REGISTER_16 ( A storage location)
>
>
> V.S
> single stack enhanced architecture, ( dynamic frequency profiled)
>
> STACK_1 [ 1..8]
> STACK_2 [ 1..4]
> STACK_3 [ 1..4]
>
>
> Which one do you believe requires less chip internal hardware wires?
Depends on how one pushes and pops the stacks, perhaps. I'll admit
I don't see STACK_1 being deep enough. Also, is there a reason for
3 separate stacks? My hypothetical required only two: numeric
values and codepointers.
Did you anticipate using something along the lines of dual barrel
shift registers? That makes some sense, if it's fast enough;
however, there are a lot of issues regarding pipelining with a
stack register architecture; basically, the second instruction
can't execute until the first one's done playing with the stack.
At least in a register-based architecture where one has the code
sequence
SUB AX, BX
ADD CX, DX
one could conceivably be executing the SUB instruction and the ADD
instruction more or less simultaneously. Ideally, though, the
simpler architecture would run at a higher clockrate.
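
A minimal sketch of that dependency argument in Python. The register sets and the single "TOS" resource standing in for the top of the stack are assumptions made for illustration, and the model deliberately ignores the flags register, which both real instructions write.

```python
# Hypothetical model: an instruction is a pair (registers read, registers written).
def depends(first, second):
    """True if `second` must wait for `first` (RAW, WAR or WAW hazard)."""
    reads1, writes1 = first
    reads2, writes2 = second
    return bool(writes1 & reads2 or reads1 & writes2 or writes1 & writes2)

# SUB AX, BX reads AX and BX and writes AX; ADD CX, DX reads CX and DX and writes CX.
SUB_AX_BX = ({"AX", "BX"}, {"AX"})
ADD_CX_DX = ({"CX", "DX"}, {"CX"})

# On a stack machine every arithmetic op reads and writes the implicit top of
# stack, modelled here as one shared resource named "TOS" (an assumption).
STACK_SUB = ({"TOS"}, {"TOS"})
STACK_ADD = ({"TOS"}, {"TOS"})

print(depends(SUB_AX_BX, ADD_CX_DX))  # False: the two could overlap
print(depends(STACK_SUB, STACK_ADD))  # True: the second must wait
```

The register pair shares nothing, so nothing in this toy model stops the two from issuing together; the stack pair always collides on the top of stack.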
Perhaps if you were to clarify what you mean by "chip internal hardware
wiring"? For instance, does that mean:
[1] die size, given a certain transistor size?
[2] total wiring area?
[3] number of vias?
[4] a combination of the above?
Note also that buffer transistors -- those things that have to drive
the outside world pins -- are huge compared to the internal wiring.
And there are a lot of them. Try to optimize the internal wiring
too much and one might just waste space.
>
> ( and, thus, a higher efficiency of "Turing" machine language
> expression ( and code profile))
Turing machines don't do arithmetic all that well. If one postulates,
for example, a decimal number, followed by a blank, followed by
another decimal number, followed by an indefinite number of blanks,
one could do the following.
state 0, any char but blank; write that char, right, state 0
state 0, blank: write blank, back up, state 1
state 1, char '0': write '0', go to state 2-0
state 1, char '1': write '1', go to state 2-1
....
state 1, char '9': write '9', go to state 2-9
state 2-x, blank: write blank, right, go to state 3-x
state 3-x, any char but blank, write that char, right, stay in this state
state 3-x, blank: write blank, left, go to state 4-x
state 4-x, char 'y': write 'y', right, go to state 5-{x+y} or 6-{x+y-10}
I could go on but it gets pretty tedious. :-) And that's for
*addition*; I shudder to think what I would have to do for multiplication
or division.
If one postulates two binary numbers as opposed to two decimal
ones, the machine gets slightly simpler but it's still pretty
tedious.
Of course one could postulate a 2^32+1 character alphabet, and
an impossibly huge state matrix, if one wishes. That gets
slightly silly, though.
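
To give a feel for the tedium, here is a small Turing machine simulator in Python. It is not the decimal adder outlined above; it only adds 1 to a binary number, and the state names ("seek", "carry", "halt") and the rule table are made up for this sketch. Even that needs six rules.

```python
BLANK = " "

# (state, symbol read) -> (symbol to write, head move, next state)
# Hypothetical machine: add 1 to a binary number; head starts on the leftmost digit.
RULES = {
    ("seek", "0"): ("0", +1, "seek"),      # run right to the end of the number
    ("seek", "1"): ("1", +1, "seek"),
    ("seek", BLANK): (BLANK, -1, "carry"),  # back up onto the last digit
    ("carry", "1"): ("0", -1, "carry"),     # 1 + carry = 0, carry moves left
    ("carry", "0"): ("1", 0, "halt"),       # 0 + carry = 1, done
    ("carry", BLANK): ("1", 0, "halt"),     # ran off the left edge: new digit
}

def run(tape_text, state="seek"):
    """Run the machine on `tape_text` and return the tape contents, trimmed."""
    tape = dict(enumerate(tape_text))  # sparse tape: cell index -> symbol
    head = 0
    while state != "halt":
        symbol = tape.get(head, BLANK)
        write, move, state = RULES[(state, symbol)]
        tape[head] = write
        head += move
    cells = range(min(tape), max(tape) + 1)
    return "".join(tape.get(i, BLANK) for i in cells).strip()

print(run("1011"))  # 1100
```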
>
>
> ( HINT : Have you every read about minimal ANSI FORTH machines? )
Can't say I have. I know a little Forth; it's a strange language,
which can modify itself. Very interesting and efficient, but it
doesn't do files all that well; the traditional method involves
numbered screen loading, as I recall. Of course that was way
back then.
>
> MIMD Multiple Instruction Multiple Data
> VLIW Variable Length Instruction Word
> MPP Multiple Parallel Processors ( many SMPs linked together like an
> interconnecting LEGO(tm)-like block game to add more processing power )
Bit-slice architectures have been known for years, if not decades.
> SMP Symmetric Multiple Processor ( like, multiple cores on a single CPU
> chip)
> ( between sixteen and two with IBM/Intel, set at a constant factor of
> sixteen and derivative of super-scalable application dynamic frequency
> profile )
>
> I have been shouting news of the VLIW SMP MPP FORTH formula to
> Washington and has been published, since 1996, all around the St. Paul
> and Minneapolis Minnesota area.
>
> However, IBM/Intel continues to shout anti-news.
>
> ---
>
> A simple enumeration of basic primitives with a stack enhanced
> architecture yields an powerful micro processor core. ( For example
> ANSI FORTH machine implicit and explicit primitives, JUMP_IF_ZERO JUMP
> CALL RETURN LITERAL 0< AND XOR DROP OVER DUP @ ! 2* 2/ >R R> INVERT + )
>
>
> ---
>
> The Ghost In The Machine wrote:
> <SNIP>
>
>> Note that I'm not really specifying a word size, although
>> most contemporary architectures would be 32 or 64 bits.
>
> Maybe investigate a 16-bit 16-way SMP core dual-bus architecture with
> 16-bit instructions aligned every 64-bits for an optimum primitives
> profile.
Why so low? 64-bit is the way to go, if one can afford the die space.
The practical considerations are these:
[1] How many die per year can one fabricate? Note that this is a
function of wafer size, yield, transistor size, and process complexity;
the smaller the transistors the more sensitive they are to
process variations.
[2] How much does each die cost to make?
[3] How much can one sell each die for?
[4] How well does a die actually work in regards to contemporary
microprocessors?
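
Points [1] and [2] can be put into rough numbers. The sketch below uses the common textbook approximation for dies per wafer and the usual cost-per-good-die ratio; the function names and all the figures fed into it are hypothetical, not data for any real process.

```python
import math

def dies_per_wafer(wafer_diameter_mm, die_area_mm2):
    """Textbook approximation: wafer area over die area, minus an edge-loss term."""
    radius = wafer_diameter_mm / 2
    whole = math.pi * radius ** 2 / die_area_mm2
    edge_loss = math.pi * wafer_diameter_mm / math.sqrt(2 * die_area_mm2)
    return int(whole - edge_loss)

def cost_per_good_die(wafer_cost, wafer_diameter_mm, die_area_mm2, die_yield):
    """Wafer cost spread over the dies that actually work."""
    good_dies = dies_per_wafer(wafer_diameter_mm, die_area_mm2) * die_yield
    return wafer_cost / good_dies

# Hypothetical numbers: 300 mm wafer, 100 mm^2 die, 50% yield, wafer cost 6400.
print(dies_per_wafer(300, 100))
print(cost_per_good_die(6400, 300, 100, 0.5))
```

A smaller die helps twice over in this model: more dies fit on the wafer, and in practice a smaller die also tends to yield better.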
I don't see FORTH being limited to 16-bit.
>
> regards,
>
> maw
>
--
#191, ewill3@earthlink.net
It's still legal to go .sigless.
-
Re: A stupid post about Intel's latest computer chip ( s)
In sci.math, Mark Nudelman
<markn@greenwoodsoftware.com>
wrote
on Sun, 23 Oct 2005 13:35:30 -0700
<jcKdnVFD0K2PacbeRVn-ow@comcast.com>:
> On 10/23/2005 8:11 AM, A Man Crying Alone In The Wilderness wrote:
>> Come on now, you are less of an idiot to understand this,
>
> Perhaps if you could write in grammatical English, people could
> understand what you're trying to say.
>
>> IBM/INTEL architecture,
>>
>> REGISTER_1 ( A storage location)
>> REGISTER_2 ( A storage location)
>> REGISTER_3 ( A storage location)
>> ( etc. . . . )
>> REGISTER_16 ( A storage location)
>>
>>
>> V.S
>> single stack enhanced architecture, ( dynamic frequency profiled)
>>
>> STACK_1 [ 1..8]
>> STACK_2 [ 1..4]
>> STACK_3 [ 1..4]
>>
>>
>> Which one do you believe requires less chip internal hardware wires?
>>
>> ( and, thus, a higher efficiency of "Turing" machine language
>> expression ( and code profile))
>
> This is a meaningless question. A stack architecture could be
> implemented with fewer "wires" (by which I think you mean "gates")
The two are not unrelated. Cross polysilicon with diffusion, and
one has a transistor gate (FET) -- at least, for NMOS, PMOS, or
CMOS architectures. Of course in CMOS one has to have another
transistor somewhere else, of the opposing type in the transistor
wiring "graph": a 2-input NAND gate, in particular, has 2 N-types
in series and 2 P-types in parallel.
(With my luck all this is in a FAQ somewhere. :-) )
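
The NAND structure described above can be checked with a switch-level sketch in Python. This is only a logical model (the function name is made up, and nothing here captures timing or electrical behaviour): N-types conduct when their gate is 1, P-types when their gate is 0.

```python
def cmos_nand(a, b):
    """Switch-level sketch of a 2-input CMOS NAND gate; inputs and output are 0 or 1."""
    # Pull-down network: two N-type transistors in series to ground.
    # The path conducts only when both gates are 1.
    pull_down = (a == 1) and (b == 1)
    # Pull-up network: two P-type transistors in parallel to the supply.
    # The path conducts when either gate is 0.
    pull_up = (a == 0) or (b == 0)
    # In static CMOS exactly one of the two networks conducts for any input.
    assert pull_up != pull_down
    return 1 if pull_up else 0

for a in (0, 1):
    for b in (0, 1):
        print(a, b, cmos_nand(a, b))
```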
> if it
> stores the stack entirely in off-chip memory, but then it would be much
> slower than a register-based machine.
I for one think he's thinking internal stack. An 8-way
numeric stack, though, is rather small, considering that
modern micros have 2 MB or more internal memory cache.
I'd probably want to use two 1 Kword or 4 Kword barrel shift
registers. The ALU would connect to the top four slices
of the barrel. Ideally, the physical layout would in fact
look a bit like a barrel, to optimize propagation delay.
However, I'm far from expert in this stuff.
One possibility might be to replace the flipflops in
a traditional barrel shift register with a DRAM unit
(transistor + capacitor); the barrel shift register would
then *have* to shift (either forward or backward) every
clockpulse transition, perhaps.
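
A rough software analogue of such a barrel, sketched in Python. The class name, the depth of 8 and the `top` helper are assumptions for illustration; the real thing would be hardware, and this model does not track how many cells are actually in use.

```python
from collections import deque
from itertools import islice

class BarrelStack:
    """Fixed-depth stack modelled as a ring that rotates on every push and pop."""

    def __init__(self, depth):
        self.cells = deque([0] * depth, maxlen=depth)

    def push(self, value):
        # Rotate one slot forward, then overwrite the cell now at the top.
        # When the barrel is full, the deepest entry is lost.
        self.cells.rotate(1)
        self.cells[0] = value

    def pop(self):
        # Read the top cell, then rotate one slot backward.
        value = self.cells[0]
        self.cells.rotate(-1)
        return value

    def top(self, n=4):
        """The top n cells, i.e. the slices an ALU would be wired to."""
        return list(islice(self.cells, n))

stack = BarrelStack(8)
for value in (1, 2, 3):
    stack.push(value)
stack.push(stack.pop() + stack.pop())  # an ADD working on the top two slices
print(stack.top(2))  # [5, 1]
```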
> Most reasonable stack machines
> keep the top N stack entries in on-chip registers, which makes it look
> pretty similar to a register-based architecture from the point of view
> of chip resources. On the other hand, a register-based machine could
> keep its registers in off-chip memory in order to save gates, but this
> would be a pretty stupid design.
Actually, the 486 does exactly this, if one switches contexts.
Basically, the registers are shoved into a TSS structure
in memory.
I suspect more modern chips have similar capabilities.
>
> However, counting gates (or "wires") is not the way to determine the
> efficiency of a chip. In general, chips with more gates are MORE
> efficient, since they implement a lot of optimizations which are not
> possible in smaller chips.
I for one would think it depends on what one wants to optimize.
[1] Raw chip speed -- how fast can that sucker go?
[2] Chip power dissipation.
[3] Chip size.
[4] Number of transistors. (This is not quite the same as chip size,
since other variables include fanin or fanout per transistor.)
[5] Number of transistor flips during execution of a specific problem
(e.g., Eratosthenes' Sieve). Presumably, this is related
to [2].
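
For reference, the sieve mentioned in [5] is small enough to use as a fixed workload. A plain Python version (the function name and the limit of 30 are arbitrary choices for this sketch):

```python
def sieve(limit):
    """Return all primes <= limit using the Sieve of Eratosthenes."""
    if limit < 2:
        return []
    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False
    n = 2
    while n * n <= limit:
        if is_prime[n]:
            # Cross out multiples of n, starting at n*n; smaller multiples
            # were already crossed out by smaller primes.
            for multiple in range(n * n, limit + 1, n):
                is_prime[multiple] = False
        n += 1
    return [i for i, flag in enumerate(is_prime) if flag]

print(sieve(30))  # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```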
>
>
> But possibly I entirely misunderstood your point, because your posting
> is very unclear.
>
>
> Also, when people reply to you and you just repost your original post as
> a reply to them, it makes it look like you can't understand their
> replies (or that you're a bot). You should at least respond to the
> substance of posts that reply to you.
I'm not sure my reply was all that basic. :-) But it's clear he
didn't pursue the details thereof.
>
> --Mark
--
#191, ewill3@earthlink.net
It's still legal to go .sigless.
-
Re: A stupid post about Intel's latest computer chip ( s)
In article <fudt23-d6s.ln1@sirius.tg00suus7038.net>,
The Ghost In The Machine <ewill@sirius.tg00suus7038.net> wrote:
[...]
>I for one would think it depends on what one wants to optimize.
>
>[1] Raw chip speed -- how fast can that sucker go?
>[2] Chip power dissipation.
>[3] Chip size.
>[4] Number of transistors. (This is not quite the same as chip size,
> since other variables include fanin or fanout per transistor.)
>[5] Number of transistor flips during execution of a specific problem
> (e.g., Eratosthenes' Sieve). Presumably, this is related
> to [2].
[6] How fast it will go running something compiled with a C compiler a
mere mortal can design.
In a very pipelined machine, you can get more speed per transistor by
making it the compiler's job to make sure that two numbers aren't trying to
go down the same bus. If different instructions have all manner of
different timings, coming up with the optimum code can be very tricky.
--
--
kensmith@rahul.net forging knowledge
-
Re: A stupid post about Intel's latest computer chip ( s)
Ken Smith wrote:
> In article <fudt23-d6s.ln1@sirius.tg00suus7038.net>,
> The Ghost In The Machine <ewill@sirius.tg00suus7038.net> wrote:
> [...]
> >I for one would think it depends on what one wants to optimize.
> >
> >[1] Raw chip speed -- how fast can that sucker go?
> >[2] Chip power dissipation.
> >[3] Chip size.
> >[4] Number of transistors. (This is not quite the same as chip size,
> > since other variables include fanin or fanout per transistor.)
> >[5] Number of transistor flips during execution of a specific problem
> > (e.g., Eratosthenes' Sieve). Presumably, this is related
> > to [2].
>
> [6] How fast it will go running something compiled with a C compiler a
> mere mortal can design.
>
>
> In a very pipelined machine, you can get more speed per transistor by
> making it the compiler's job to make sure that two numbers aren't trying to
> go down the same bus. If different instructions have all manner of
> different timings, coming up with the optimum code can be very tricky.
>
> --
> --
> kensmith@rahul.net forging knowledge
As balanced for high microprocessor efficiency, an MPP SMP stack
machine architecture for FORTH, C, Scheme, Java,
you-name-it-computer-programming-language
It uses a simple stack to stack messaging, for both SMP multi core,
internally and MPP, CPU16-to-CPU16, externally, for simply solving MIMD
and a host of other SMP multi core chip design problems, ...
as you may read, a C compiler is almost an IBM/Intel no-brainer,
In general, microprocessor efficiency minimizes transistor count and
maximizes utilization of those transistors, however, externally,
traditional "bandwidth" benchmarking program suites, a 'raw' efficiency
will be displayed, even more so where a benchmark relies upon parallel
architectures, I guess ten to ONE HUNDRED times faster, for some
real-world practical parallel programming benchmark suites. (
hydrodynamic or thermodynamic modeling, etc. )
This model is the most efficient SMP MPP microprocessor model I have
reference, a hybrid of Mr. Moore's work and mine, and, as a final note,
I am having difficulty developing my chip model any further than this,
URL,
http://groups.google.com/group/comp....e=source&hl=en
Here is 16-bit VLIW protocol reference, ( from dynamic profiling)
URL,
http://groups.google.com/group/comp....e=source&hl=en
Regards,
maw
-
Re: A stupid post about Intel's latest computer chip ( s)
The Ghost In The Machine wrote:
> In sci.math, A Man Crying Alone In The Wilderness
> <cpu16x1832@wmconnect.com>
> wrote
> on 23 Oct 2005 11:16:57 -0700
> <1130091417.708926.74340@g49g2000cwa.googlegroups.com>:
> > Come on now, you are less of an idiot to understand this,
> >
> >
> > IBM/INTEL architecture,
> >
> > REGISTER_1 ( A storage location)
> > REGISTER_2 ( A storage location)
> > REGISTER_3 ( A storage location)
> > ( etc. . . . )
> > REGISTER_16 ( A storage location)
> >
> >
> > V.S
> > single stack enhanced architecture, ( dynamic frequency profiled)
> >
> > STACK_1 [ 1..8]
> > STACK_2 [ 1..4]
> > STACK_3 [ 1..4]
> >
> >
> > Which one do you believe requires less chip internal hardware wires?
>
> Depends on how one pushes and pops the stacks, perhaps. I'll admit
> I don't see STACK_1 being deep enough. Also, is there a reason for
> 3 separate stacks? My hypothetical required only two: numeric
> values and codepointers.
>
I currently use five stacks, for my "Holy Grail" almost all purpose
super scalable multi core architecture model,
COPIED FROM ANOTHER POST, URL,
http://groups.google.com/group/comp....e=source&hl=en
"
Example extended on-chip stack register map,
sixteen ( 16) return stack elements,
eight ( 8) parameter stack elements,
four ( 4) Supplementary stack elements ( X, Y),
thirty two ( 32) status /machine state logic/ stack elements
"
Regards,
maw
-
Re: A stupid post about Intel's latest computer chip ( s)
maghas@Ryugyong.Hotel wrote:
> <cpu16x1832@wmconnect.com> wrote in message
> news:1130088424.198227.277960@g44g2000cwa.googlegroups.com...
> Come on now, you are less of an idiot to understand this,
>
> FORTH never went anywhere for a good reason.
>
PostScript, Lego Mindstorms, Open Firmware...
> Totally un-maintainable.
No need to maintain if it does the job already
-
Re: A stupid post about Intel's latest computer chip ( s)
In sci.math, Tim Clacy
<nospamtcl@nospamphaseone.nospamdk>
wrote
on Mon, 24 Oct 2005 12:22:42 +0200
<435cb5fb$0$38653$edfadb0f@dread12.news.tele.dk>:
> maghas@Ryugyong.Hotel wrote:
>> <cpu16x1832@wmconnect.com> wrote in message
>> news:1130088424.198227.277960@g44g2000cwa.googlegroups.com...
>> Come on now, you are less of an idiot to understand this,
>>
>> FORTH never went anywhere for a good reason.
>>
> PostScript, Lego Mindstorms, Open Firmware...
>
>> Totally un-maintainable.
> No need to maintain if it does the job already
>
Three words: lots of comments. :-P Besides, Forth is generally
portrayed as a dictionary of source; if one needs to change
a word, it can be changed easily, although one might have
to reload the system for the change to take proper effect.
For example:
: GODOIT 1 . ;
: DOITAGAIN CR GODOIT GODOIT CR ;
: GODOIT 2 . ;
DOITAGAIN
would print '11' or ' 1 1', not '22' or ' 2 2'.
A quick install of 'gforth' confirms this.
http://www.complang.tuwien.ac.at/projects/forth.html
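
The reason is that a colon definition is bound to whatever the dictionary holds at the moment it is compiled. A toy Python model of that behaviour follows; the `define` helper and the dictionary layout are inventions for this sketch, not how gforth is implemented, and CR is left out.

```python
dictionary = {}  # word name -> most recent definition
output = []      # what the words have "printed" so far

def define(name, *body):
    """Compile a colon definition; names in `body` are looked up right now."""
    compiled = [dictionary[item] if isinstance(item, str) else item for item in body]

    def word():
        for step in compiled:
            step()

    dictionary[name] = word

def print_literal(n):
    """Stand-in for the Forth phrase `n .`."""
    return lambda: output.append(str(n))

define("GODOIT", print_literal(1))         # : GODOIT 1 . ;
define("DOITAGAIN", "GODOIT", "GODOIT")    # : DOITAGAIN GODOIT GODOIT ;
define("GODOIT", print_literal(2))         # : GODOIT 2 . ;
dictionary["DOITAGAIN"]()
print(" ".join(output))  # 1 1
```

DOITAGAIN keeps calling the first GODOIT because it captured that definition when it was compiled; only words defined after the third line see the new one.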
Comments use '()' or backslash; note that these must be
separated by a space as they are interpreted by the
gforth system:
( this is a comment; it can be multiline but cannot nest )
\ so is this; it extends to the end of the line
(The \ is new to me, but reasonably logical. The () I've seen before.)
A typical definition, for example, notes the stack effects in ():
: square ( n -- n^2 )
dup * ;
And of course it helps to use intuitive tokens as opposed to cryptic
one-character affairs.
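
The stack effect of `square` can be traced with a few lines of Python. This is a toy evaluator that knows only four words, all chosen for this sketch; it is nowhere near a real Forth.

```python
def forth_eval(source, stack=None):
    """Evaluate a whitespace-separated string of integers and a few Forth words."""
    stack = [] if stack is None else stack
    words = {
        "dup": lambda s: s.append(s[-1]),             # ( n -- n n )
        "drop": lambda s: s.pop(),                    # ( n -- )
        "*": lambda s: s.append(s.pop() * s.pop()),   # ( a b -- a*b )
        "+": lambda s: s.append(s.pop() + s.pop()),   # ( a b -- a+b )
    }
    for token in source.split():
        if token in words:
            words[token](stack)
        else:
            stack.append(int(token))  # anything else is treated as a number
    return stack

print(forth_eval("7 dup *"))  # [49]
```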
--
#191, ewill3@earthlink.net
It's still legal to go .sigless.
-
Re: A stupid post about Intel's latest computer chip ( s)
"A Man Crying Alone In The Wilderness" <cpu16x1832@wmconnect.com> writes:
>maghas@Ryugyong.Hotel wrote:
>> <cpu16x1832@wmconnect.com> wrote in message
>> news:1130088424.198227.277960@g44g2000cwa.googlegroups.com...
>> Come on now, you are less of an idiot to understand this,
>>
>> FORTH never went anywhere for a good reason.
>>
>> Totally un-maintainable.
>Presumably for the same reason you understand all machine code is
>un-maintainable. Maybe read some more to develop your knowledge of
>computer programming languages and their relationship to machine code.
And a Forth interpreter is shipped embedded in the hardware of all
Macs and Sun SPARCs. And that is a pretty good indication of the
application space.
Casper
-
Re: A stupid post about Intel's latest computer chip ( s)
In article <1130119928.958404.155180@g44g2000cwa.googlegroups.com>,
A Man Crying Alone In The Wilderness <cpu16x1832@wmconnect.com> wrote:
[....]
>traditional "bandwidth" benchmarking program suites, a 'raw' efficiency
>will be displayed, even more so where a benchmark relies upon parallel
>architectures, I guess ten to ONE HUNDRED times faster, for some
>real-world practical parallel programming benchmark suites. (
>hydrodynamic or thermodynamic modeling, etc. )
In both seismic data processing and the SETI project, parallel computing
can be done up to the limit of your budget for all practical purposes.
Some years back I was at a trade show and saw a computer with 32K
processors in it.
Cute sales lady: ... and this machine has over 32 thousand processors.
Me: Oh, what kind of processors are they?
CSL : ... um ... um ... little bitty ones
--
--
kensmith@rahul.net forging knowledge
-
Re: A stupid post about Intel's latest computer chip ( s)
"Casper H.S. Dik" <Casper.Dik@Sun.COM> wrote in message
news:435ce764$0$11069$e4fe514c@news.xs4all.nl...
> "A Man Crying Alone In The Wilderness" <cpu16x1832@wmconnect.com> writes:
>
>
>>maghas@Ryugyong.Hotel wrote:
>>> <cpu16x1832@wmconnect.com> wrote in message
>>> news:1130088424.198227.277960@g44g2000cwa.googlegroups.com...
>>> Come on now, you are less of an idiot to understand this,
>>>
>>> FORTH never went anywhere for a good reason.
>>>
>>> Totally un-maintainable.
>
>>Presumably for the same reason you understand all machine code is
>>un-maintainable. Maybe read some more to develop your knowledge of
>>computer programming languages and their relationship to machine code.
>
> And a Forth interpreter is shipped embedded in the hardware of all
> Macs and Sun SPARCs. And that is a pretty good indication of the
> application space.
yep. With NO applications either.