#91
In sci.physics Michael Moroney <moroney@world.std.spaamtrap.com> wrote:
> Randy Yates <yates@ieee.org> writes:
> >Randy Yates <yates@ieee.org> writes:
> >> I will NEVER go back to MS.
> >Oh, and did I mention that daily/hourly reboots are now a thing of the
> >past? This system was recently up for 30 days without
> >rebooting. Usually when I do have to reboot, it's because a
> >thunderstorm or some stupid thing I did.
> 30 days? That's nothing. I work with VMS systems, which frequently have
> uptimes measured in years, as long as the electric utility, thunderstorms
> and UPS's cooperate.

VMS?

You got TOPS-10 and RT-11 too?

--
Jim Pennino

Remove .spam.sux to reply.

#92
Walter Banks wrote:
>
> Androcles wrote:
>
>> : Could be you are both right. I prefer to compare C to other people's
>> : asm code. Might also be the quality of the C compiler
>>
>> As I worked as a test engineer in flight simulation it was my job
>> to correct the minor (and sometimes major) errors in other
>> people's code as well as look into hardware. As in all walks of life,
>> human beings are prone to error and some more than others.
>> That includes me, I'm not perfect, but I have no doubt that
>> statistically a compiler can perform better than some asm
>> programmers, of which the numbers are dwindling. However,
>> we are not comparing groups of people with a compiler,
>> but a theoretically perfect asm program with a theoretically perfect
>> compiled program that has been optimised for either speed or
>> for size or for accuracy.
>
> A more realistic comparison is between an asm programmer
> skilled in his craft compared to a HLL programmer skilled
> in their craft.
>
>> The asm programmer will be aware of which routines run the
>> most often and optimise for speed and accuracy, writing as much
>> code as necessary, leaving less often run modules to junior
>> programmers.
>
> This is a management decision on how resources are used. This
> happens in both environments.
>
>> The compiler cannot make that distinction.
>> I had no more than 30 milliseconds to paint a full scene for the
>> pilot or the image would flicker, the guy writing the instructor's
>> station could update it once a second. Big difference.
>
> No argument
>
>> When they speed up the processors and add more RAM
>> they want more detail painted, so the time constraint doesn't
>> ever go away.
>
> Again I agree.
>
> What the compiler does is evaluate the application and
> makes implementation decisions on every compile that
> realistically can't be done each time by the assembler
> programmer.
>
> In the link I posted before, we show that there is no
> advantage for asm over C. It effectively proves that
> anything that can be written in asm I can code in the
> same or less space in C and the same or less execution
> time. I agree that it is true only for well implemented C
> compilers supporting ISO/IEC 18037. But it can and

Since ISO/IEC 18037 provides for assembly code, that's a tautology.
What's the ISO/IEC 18037 instruction for MAC?

Jerry
--
Engineering is the art of making what you want from things you can get.
¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯

#93
In article <f8clov$qh$1@news.Stanford.EDU>,
"Luna Moon" <lunamoonmoon@gmail.com> wrote:

> "Hendrik van der Heijden" <hvdh@gmx.de> wrote in message
> news:46a99140$0$31634$9b4e6d93@newsspool3.arcor-online.net...
> > galathaea wrote:
> >
> >> you have to run benchmarks and profile code execution paths
> >> but just choosing the right compiler has increased code
> >> speed 40% in tight loops for me in the past
> >
> > I've seen a 500% boost just from adding a compiler option (-mfpmath).
> > With this option, gcc uses SSE1/2 units for floating point calculations
> > (SISD, not vector code) instead of traditional x87 code.
> >
> >
> > Hendrik vdH
>
> Great! This is exactly what I want... Thanks so much I will try it out!
> Which compiler?
>
> We have done everything at the algorithm level, now we just want to make
> sure our data structure, caches, and code organization don't do stupid
> things to slow down and we try various tricks to squeeze up speed further.
>
> Any more pointers? Hopefully there are some books/articles/resource
> somewhere on this planet talking about highly efficient C++ code and
> compiler, etc.

Have you profiled your current implementation?

--
Michael Press
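The -mfpmath remark and the profiling question above can be tried together on a small kernel. What follows is only a sketch, assuming GCC on x86; the function names `dot` and `time_dot` and the file name `dot.c` are made up for illustration and come from nobody in this thread. It times a scalar floating-point loop so the same source can be built once with x87 arithmetic and once with SSE scalar arithmetic.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Scalar floating-point kernel (hypothetical example, not from the thread).
   Possible builds to compare on 32-bit x86, assuming GCC:
     gcc -O2 -m32 -mfpmath=387 dot.c
     gcc -O2 -m32 -msse2 -mfpmath=sse dot.c                                  */
double dot(const double *x, const double *y, size_t n)
{
    assert(n == 0 || (x != NULL && y != NULL));
    double acc = 0.0;
    for (size_t i = 0; i < n; i++)
        acc += x[i] * y[i];
    return acc;
}

/* Runs dot() reps times, stores the last result in *result, and returns
   the CPU time used in seconds as reported by clock(). */
double time_dot(const double *x, const double *y, size_t n, int reps,
                double *result)
{
    volatile double sink = 0.0;   /* volatile so the calls are not optimised away */
    clock_t start = clock();
    for (int r = 0; r < reps; r++)
        sink = dot(x, y, n);
    clock_t stop = clock();
    *result = sink;
    return (double)(stop - start) / CLOCKS_PER_SEC;
}
```

On x86-64 targets GCC already defaults to SSE scalar math, so the flag should matter mainly for 32-bit builds. Any speedup depends heavily on the code and the CPU, and the 500% figure quoted above is one poster's experience, not something this sketch reproduces.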

#94
jimp@specsol.spam.sux.com wrote:

(snip)

>>30 days? That's nothing. I work with VMS systems, which frequently have
>>uptimes measured in years, as long as the electric utility, thunderstorms
>>and UPS's cooperate.

> You got TOPS-10 and RT-11 too?

Isn't TOPS-10 the one with the error message up-too-long?

-- glen

#95
Randy Yates <yates@ieee.org> writes:
> Walter Banks <walter@bytecraft.com> writes:
>
> > srp@microtec.net wrote:
> >
> >> Short of going to straight assembly language, nothing is faster.
> >
> > Many C compilers easily beat most asm code in my experience
> > what happens is asm programmers tend to use subsets of the
> > instruction set and rarely do a complete re-implementation of the
> > application with each change the way compilers do with each
> > compile.
>
> I normally see a factor of two speedup in hand-coded assembly
> over C and it's not uncommon to get a factor of 5.

I frequently see a theoretical maximum factor of 1.0.
I run the Freescale scheduler modeller over my compiled C code,
and often notice that it will take in, and churn out, 4 FPU operations
every 5 ticks. Asm can't beat that, as the processor (G4) can't do
any more than that.

Then again I tend to write C which maps 1-1 onto opcodes, so
effectively I'm writing asm. Except that I don't have to write
it for both PPC and Alpha, so I only need to do half of the work.

If your C is much slower than asm, then write better C.

Phil
--
"Home taping is killing big business profits. We left this side blank
so you can help." -- Dead Kennedys, written upon the B-side of tapes of
/In God We Trust, Inc./.

#96
"Luna Moon" <lunamoonmoon@gmail.com> writes:
> "Phil Carmody" <thefatphil_demunged@yahoo.co.uk> wrote in message
> news:87vec62tst.fsf@nonospaz.fatphil.org...
> > glen herrmannsfeldt <gah@ugcs.caltech.edu> writes:
> >> jimp@specsol.spam.sux.com wrote:
> >>
> >> (snip)
> >>
> >> > Sigh, Linux is free; all you have to do is download and install it.
> >>
> >> > Linux will be a LOT faster than Windows when doing number crunching.
> >>
> >> There is no reason Linux should be faster, in general, for number
> >> crunching.
> >
> > When I compiled, using the same level of gcc, my number crunching code
> > (OversEis, for finding prime numbers) for Opteron, using nothing
> > but the i386 registers and the FPU, no MMX/SSE/whatever, the code
> > running on linux was 30% faster than the code running on windows.
> > (250s per test vs. 330s per test) Also, in windows, running 1 such
> > program was slower than running 2 in parallel (330s for 2 tests,
> > vs. 480s for 1 test - linux was 250s for 2 tests, 250s for 1 test).
> >
> > Something ain't right in the state of 64-bit windows.
> >
> > Phil
>
>
> Why didn't you use MMX/SSE/whatever?

When _I_ used them, they (SSE2 was the only thing that would do
what I need) were slower than simple compiled C using the FPU.

E.g. this, from Dan Bernstein's DJBFFT:

inline void c4(register complex *a)
{
  register real t0, t1, t2, t3, t4, t5, t6, t7;
  register real t8, t9, t10, t11, t12, t13, t14, t15;

  t0 = a[0].re;
  t1 = a[2].re;
  t2 = a[1].im;
  t3 = a[3].im;
  t4 = t0 - t1;
  t5 = a[0].im;
  t6 = t0 + t1;
  t7 = a[2].im;
  t8 = t2 - t3;
  t9 = a[1].re;
  t10 = t2 + t3;
  t11 = a[3].re;
  t12 = t5 - t7;
  t13 = t5 + t7;
  t14 = t9 - t11;
  t15 = t9 + t11;
  t0 = t4 - t8;
  a[2].re = t0;
  t1 = t4 + t8;
  a[3].re = t1;
  t2 = t12 + t14;
  a[2].im = t2;
  t3 = t12 - t14;
  a[3].im = t3;
  t5 = t6 + t15;
  a[0].re = t5;
  t7 = t6 - t15;
  a[1].re = t7;
  t9 = t13 + t10;
  a[0].im = t9;
  t11 = t13 - t10;
  a[1].im = t11;
}

was faster than this - which I wrote thinking it would be faster:

#if 0
inline void c4(register complex *a)
{
  register __m128d p1=((__m128d*)a)[1];
  register __m128d p3=((__m128d*)a)[3];
  register __m128d p0=((__m128d*)a)[0];
  register __m128d p2=((__m128d*)a)[2];
  register __m128d p6=p1-p3; //x1-x3,y1-y3
  register __m128d p7=p1+p3; //x1+x3,y1+y3
  register __m128d p4=p0-p2; //x0-x2,y0-y2
  register __m128d p5=p0+p2; //x0+x2,y0+y2
  register __m128d p9;
  _mm_shuffle_pd(p9,p6,_MM_SHUFFLE2(0,1)); //y1-y3,x1-x3
  ((__m128d*)a)[0]=p5+p7;
  ((__m128d*)a)[2]=p5-p7;
  ((__m128d*)a)[1]=p4-p9;
  ((__m128d*)a)[3]=p4+p9;
}
#endif

Phil
--
"Home taping is killing big business profits. We left this side blank
so you can help." -- Dead Kennedys, written upon the B-side of tapes of
/In God We Trust, Inc./.
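Before timing two versions of a routine like c4 against each other, it helps to confirm that they compute the same thing. The sketch below is an illustration added here, not code from the post or from djbfft. The names `cpx`, `c4_compact`, `dft4_bitrev` and `c4_matches_dft` are hypothetical, and `real` is assumed to be `double`. It restates the scalar c4 arithmetic with fewer temporaries and checks it against a direct 4-point DFT, using the e^{+2*pi*i*n*k/4} sign convention with outputs in bit-reversed order (0, 2, 1, 3), which is what the scalar code above works out to.

```c
#include <assert.h>
#include <stddef.h>

typedef double real;                    /* assumption: real is double */
typedef struct { real re, im; } cpx;    /* hypothetical stand-in for `complex` */

/* Same arithmetic as the scalar c4 above, grouped by butterfly. */
void c4_compact(cpx *a)
{
    real sr = a[0].re + a[2].re, si = a[0].im + a[2].im;  /* a0 + a2 */
    real dr = a[0].re - a[2].re, di = a[0].im - a[2].im;  /* a0 - a2 */
    real tr = a[1].re + a[3].re, ti = a[1].im + a[3].im;  /* a1 + a3 */
    real ur = a[1].re - a[3].re, ui = a[1].im - a[3].im;  /* a1 - a3 */

    a[0].re = sr + tr;  a[0].im = si + ti;  /* (a0+a2) + (a1+a3)  */
    a[1].re = sr - tr;  a[1].im = si - ti;  /* (a0+a2) - (a1+a3)  */
    a[2].re = dr - ui;  a[2].im = di + ur;  /* (a0-a2) + i(a1-a3) */
    a[3].re = dr + ui;  a[3].im = di - ur;  /* (a0-a2) - i(a1-a3) */
}

/* Direct 4-point DFT, X[k] = sum over n of in[n] * i^(n*k),
   written to out[] in bit-reversed order k = 0, 2, 1, 3. */
void dft4_bitrev(const cpx *in, cpx *out)
{
    static const int bitrev[4] = {0, 2, 1, 3};
    static const real wr[4] = {1, 0, -1, 0};   /* real parts of i^0..i^3 */
    static const real wi[4] = {0, 1, 0, -1};   /* imaginary parts of i^0..i^3 */

    for (int slot = 0; slot < 4; slot++) {
        int k = bitrev[slot];
        real sr = 0, si = 0;
        for (int n = 0; n < 4; n++) {
            int m = (n * k) % 4;
            sr += in[n].re * wr[m] - in[n].im * wi[m];
            si += in[n].re * wi[m] + in[n].im * wr[m];
        }
        out[slot].re = sr;
        out[slot].im = si;
    }
}

/* Returns 1 if c4_compact agrees with the direct DFT on this input. */
int c4_matches_dft(const cpx *in)
{
    assert(in != NULL);
    cpx fast[4], ref[4];
    for (int i = 0; i < 4; i++)
        fast[i] = in[i];
    c4_compact(fast);
    dft4_bitrev(in, ref);
    for (int i = 0; i < 4; i++) {
        real er = fast[i].re - ref[i].re, ei = fast[i].im - ref[i].im;
        if (er < 0) er = -er;
        if (ei < 0) ei = -ei;
        if (er > 1e-12 || ei > 1e-12)
            return 0;
    }
    return 1;
}
```

The same kind of check could be pointed at an SSE2 variant before benchmarking it. One thing worth knowing there is that `_mm_shuffle_pd` returns the shuffled vector as its result and does not write into its first argument.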

#97
srp@microtec.net writes:
> On 26 Jul, 23:46, Gianni Mariani <gi3nos...@mariani.ws> wrote:
> > s...@microtec.net wrote:
> > > On 26 Jul, 17:19, lunamoonm...@gmail.com wrote:
> > ...
> > > Your best bet with C/C++ is to get a fast computer. Optimization
> > > is only nominal with these compilers.
> >
> > "nominal"? It's critical.
> >
> > I have no idea what you're alluding to but if you're trying to say that
> > the optimizer is not one of the critical parts of a C++ compiler when it
> > comes to performance, you're very mistaken.
>
> Of course, optimized C++ is way better than non optimized C.
>
> But in order of code optimization, from worst to best, you have C++,
> straight C, and Forth as best by far. C is simply no match for any
> flavor of Forth for low level code optimization.
>
> Simple fact of life.

Can Forth get a G4 to process more FPU instructions per tick
than the processor is capable of handling?

Given that I can get C to exactly match that bound, your Forth
can never be faster. Ever.

Your claim is hollow. You obviously don't write good enough C.

Phil
--
"Home taping is killing big business profits. We left this side blank
so you can help." -- Dead Kennedys, written upon the B-side of tapes of
/In God We Trust, Inc./.

#98
"Dr Ivan D. Reid" <Ivan.Reid@brunel.ac.uk> writes:
> On Fri, 27 Jul 2007 04:44:59 GMT, jimp@specsol.spam.sux.com
> <jimp@specsol.spam.sux.com>
> wrote in <1d9on4-6ta.ln1@mail.specsol.com>:
>
> > Linux will be a LOT faster than Windows when doing number crunching.
>
> Really? I have two "identical" CPUs running seti@home. The one
> running Linux currently has 860 Cobblestones of Recently Acquired Credit.
> The one running Win XP has a 16% higher RAC:

Were they compiled from the same source with the same compiler,
and same options?

Phil
--
"Home taping is killing big business profits. We left this side blank
so you can help." -- Dead Kennedys, written upon the B-side of tapes of
/In God We Trust, Inc./.

#99
Randy Yates <yates@ieee.org> writes:

> Actually, if I wasn't running a GUI on my machine, and I had a more
> robust UPS, and I was a more competent administrator, a reboot period
> of one year (or more) would probably be well-within reach.

A colleague of mine had bought a new computer and just before he
turned the old one off, we checked the uptime. It read 444 days.

Not up with the reliability of these guys:

http://www.stratus.com/uptime/ftserver.htm

where some machines have up-time measured in decades, but 444 days is
still pretty impressive for a Mac.

Ciao,

Peter K.

--
"And he sees the vision splendid
of the sunlit plains extended
And at night the wondrous glory
of the everlasting stars."

#100
On 28 Jul 2007 11:13:30 +0300, Phil Carmody
<thefatphil_demunged@yahoo.co.uk>
wrote in <87tzrp0wut.fsf@nonospaz.fatphil.org>:
> "Dr Ivan D. Reid" <Ivan.Reid@brunel.ac.uk> writes:
>> On Fri, 27 Jul 2007 04:44:59 GMT, jimp@specsol.spam.sux.com
>> <jimp@specsol.spam.sux.com>
>> wrote in <1d9on4-6ta.ln1@mail.specsol.com>:

>> > Linux will be a LOT faster than Windows when doing number crunching.

>> Really? I have two "identical" CPUs running seti@home. The one
>> running Linux currently has 860 Cobblestones of Recently Acquired Credit.
>> The one running Win XP has a 16% higher RAC:

> Were they compiled from the same source with the same compiler,
> and same options?

They are both "optimised" versions from lunatic.at; I believe they
both used the Intel compiler, but I'm not totally sure about the Windows
version.

I forgot to mention that the Windows machine spends 4% of its CPU
monitoring my home security system, as well as being the machine I log onto
the Internet with, play games on, etc. At the time I published the RAC
numbers it had also spent 9 core-cpu-hours in the previous day running
genetic-algorithm analyses. The Linux machine does nothing but crunch; I
ssh into it from time to time to tickle its result reporting.

--
Ivan Reid, School of Engineering & Design, _____________ CMS Collaboration,
Brunel University. Ivan.Reid@[brunel.ac.uk|cern.ch] Room 40-1-B12, CERN
KotPT -- "for stupidity above and beyond the call of duty".