|
#21
| |||
| |||
| Phlip wrote:
> nokturnal wrote:
>> Having global variables is faster than having to pass the variable round
>> from function to function.
>
> So, you have time-tested this? and you have compared it to the Singleton
> pattern with Get() methods?

Actually, not only are globals faster than variables passed, they are also faster than straight local variables. Retrieving a local variable requires an indexed addressing mode to a stack address, while retrieving a global variable can be done with a direct addressing mode. There are some issues of cache, but these are mostly outweighed by the overhead of bringing the value into the stack (whether by passing or derivation).

Jon
----
Learn to program using Linux assembly language
http://www.cafeshops.com/bartlettpublish.8640017 |
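For anyone who wants to check the addressing-mode claim on their own compiler, here is a minimal sketch. The names `g_total`, `sum_with_global`, `sum_with_local` and `check_sums_agree` are made up for illustration and do not come from the thread. Both functions compute the same sum; the point is to compare the assembly each one produces (for example with `g++ -O2 -S`), which will vary with compiler, flags and target.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical global accumulator, used only for this illustration.
std::uint64_t g_total = 0;

// Accumulates through the global. The result must end up in static storage,
// because other code is allowed to observe g_total afterwards.
std::uint64_t sum_with_global(std::uint64_t n) {
    g_total = 0;
    for (std::uint64_t i = 0; i < n; ++i) {
        g_total += i;
    }
    return g_total;
}

// Accumulates through a local. Nothing outside this function can see
// 'total', so an optimizing compiler is free to keep it in a register.
std::uint64_t sum_with_local(std::uint64_t n) {
    std::uint64_t total = 0;
    for (std::uint64_t i = 0; i < n; ++i) {
        total += i;
    }
    return total;
}

// Sanity check: the two variants must always agree on the result.
void check_sums_agree() {
    assert(sum_with_global(100) == sum_with_local(100));
}
```

This only shows how to set up the comparison. Which variant is faster on a given machine is exactly what the thread is arguing about, so treat any result as specific to your build.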
|
#22
| |||
| |||
| In article <1112676450.491826.18800@g14g2000cwa.googlegroups.com>, Phlip <phlip2005@gmail.com> wrote:
>> gcc, in particular, won't inline accessors found in a .cpp file.
>
> All of that is about whether inline produces in-line opcodes, not
> whether it actually makes things faster. It could make them slower.

"Could." But it is not guaranteed to. Therefore, inlining is not something to mindlessly avoid. C/C++ give you lots of tools to make shooting yourself in the foot very easy and very painful, but that's not a reason to get rid of the tools. It's a wake-up call that those using them need to get a clue as to how their tools work.

With regard to inlining, it is a necessary tool to help make sure that implementation-hiding does not become performance-hiding. Writing accessors is great. But, when

> For example, suppose small method Foo() calls small method Bar() twenty
> times. Now inline Bar(). (Really inline - not just write the keyword.)

If you're going to toss around hypotheticals, here are some counterexamples:

    #define Bar(v) (v) = 0

vs, in Bar.cpp

    void Bar(int &v) { v = 0; }

The first will be force-inlined no matter what. The second will never be inlined under gcc, and requires special compiler flags under MS Visual Studio .NET to get inlined.

On pretty much all architectures, it takes more opcodes (==bytes) and cycles to call a function, do something trivial, and return. Branches and function calls *kill* modern OOE processors, and information / implementation hiding encourages function calls out the wazoo. And there goes your performance.

On the PS2, a function call takes 4 bytes, 8 bytes if the compiler couldn't find an opcode to put into. I've seen *LOTS* of functions that were 4/8 bytes in length. Inlining them is then essentially for free-- same # of bytes, and fewer cycles blown jumping and returning.

gcc's "-finline-limit=n" commandline flag is a good start on achieving this. (See http://gcc.gnu.org/ for more info if you aren't familiar with this.) It lets *you*, the programmer, set a limit as to how much expansion to do automatically. That way, if Bar() is more complex, and called once, it could be inlined, and if it's called 20 times, it's not. And you don't need to manually re-profile and change your plans when your function changes-- the grunt work is handed off to the compiler.

Personally, I'd like to see compilers give *MORE* information back. At the end of compiling and/or linking, I want a big text file (optionally) kicked out, saying the following:

Functions that aren't inlined, but probably should be:
    void SomeClass::DoNothing(void) - 0 bytes, called 5 times
    float GameLoop::GetDT(void) - 4 bytes, called 1463 times
    void Bar::Baz(int &) - 4 bytes, called 840 times
    [...]

Functions that are flagged as inline, but probably shouldn't be:
    void Entity::DoSomethingObscure(int, float, void *, int, bool) - 2300 bytes, called 1 time
    [...]

This is the kind of gruntwork that computers are great at, and they should be doing more of. Reports like that would go a long way towards correcting mis-inlining of all types. Different platforms have different ideas as to what "free" is, but removing some of the "information hiding" blinders that have separated programmers from the decisions of their compilers is a good thing.

> So: Time them and see. The C++ thought leaders, such as James Kanze,
> order their troops to never ever inline unless profiling reveals a real
> need.

If you don't know what functions are likely to fall into the "free" category, then timing isn't going to give you a good idea of what's going on. Profilers are horrible at this, as they might be able to tell you that function X is hit a bit, but the 57 different accessors that X calls don't really show up on the profile because they're so short that the sampling never caught a meaningful number of them. Only by looking at the disassembly and realizing that functions are being called every 3-5 instructions would you realize what's really up.

I've heard that chip designers can tell C and C++ code apart by looking at the disassembly-- C has much longer 'runs' of code before branching/calling functions. C++ (especially bad C++) is so jump-heavy that you kill performance.

Further, profiling needs to be done simultaneously from both directions-- first a 'bottom-up' event-based sampler that identifies the functions that the CPU spent the most time in. Additionally, you need a top-down approach where you sum up the time spent in hierarchical parts of your app-- physics vs AI vs graphics, then start breaking those up into subcomponents. Both will only give you a hint as to where performance stinks. Then a human needs to come in-- is it using the cache too much? Too little? Is the CPU being misused here? Would changing algorithms help here? That's where the human insight is needed.

My rule of thumb: anything trivial should darn well be inlined, "leaders" can be ignored. Performance should not be hidden by abuse of C++ due to "good intentions." And, I'd like to get *much* more feedback from the compiler about mis-inlining.

Nathan Mates
--
<*> Nathan Mates - personal webpage http://www.visi.com/~nathan/
# Programmer at Pandemic Studios -- http://www.pandemicstudios.com/
# NOT speaking for Pandemic Studios. "Care not what the neighbors
# think. What are the facts, and to how many decimal places?" -R.A. Heinlein |
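The macro-versus-function comparison in the post can be put side by side in one file. This is a sketch under assumptions: the macro is renamed `BAR_MACRO` and parenthesized so it can coexist with the function, and `Foo` and `check_bar_forms_agree` are hypothetical stand-ins (Phlip's Foo() is only described in prose). Marking `Bar` as `inline` and defining it where callers can see the body is the usual way to make it a candidate for inlining; the compiler remains free to decline.

```cpp
#include <cassert>

// Macro form: the preprocessor always expands it at the point of use.
#define BAR_MACRO(v) ((v) = 0)

// Function form, defined inline as it would be in a header. Because the
// body is visible to every caller, the compiler is able (but not obliged)
// to expand it at the call site.
inline void Bar(int &v) { v = 0; }

// Hypothetical caller standing in for the Foo() of the discussion:
// it calls Bar() twenty times, once per array element.
inline int Foo() {
    int values[20];
    for (int &v : values) v = 1;     // start from non-zero values
    for (int &v : values) Bar(v);    // twenty calls to Bar()
    int sum = 0;
    for (int v : values) sum += v;
    return sum;                      // 0 if every call did its job
}

// Both forms must have the same observable effect on their argument.
inline void check_bar_forms_agree() {
    int a = 5;
    int b = 5;
    BAR_MACRO(a);
    Bar(b);
    assert(a == b);
}
```

Whether the twenty calls really disappear can only be confirmed by reading the generated assembly for your compiler and optimization level.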
|
#23
| |||
| |||
| "Phlip" <phlip2005@gmail.com> wrote in news:1112676450.491826.18800@g14g2000cwa.googlegroups.com:
> Foo() got bigger by twenty instances of Bar(). Maybe less, with
> optimization. But Foo() might now be big enough to overflow the
> CPU cache. The method thrashes, where before Foo() and Bar() fit
> right next to each other.

Which one? L1 cache, or opcode cache? And what penalty would it have?

Would you care to write a small test program... |
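No test program appears in the thread, so the following is only a starting point for one, not an answer to the cache question. It is a minimal timing harness built on `std::chrono::steady_clock`; `time_ns`, `g_sink` and `small_work` are hypothetical names introduced here.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Hypothetical helper: calls fn() 'iterations' times and returns the
// elapsed wall-clock time in nanoseconds.
template <typename Fn>
std::int64_t time_ns(Fn fn, int iterations) {
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i) {
        fn();
    }
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count();
}

// A volatile sink, so the optimizer cannot throw away the timed work.
volatile int g_sink = 0;

// Stand-in for "a small method"; replace with the code under test.
inline void small_work() { g_sink = g_sink + 1; }

// The harness must run the callable exactly 'iterations' times.
inline void check_harness_runs_every_iteration() {
    g_sink = 0;
    time_ns(small_work, 10);
    assert(g_sink == 10);
}
```

A call such as `time_ns(small_work, 1000000)` gives one number per variant. Run each variant several times and compare distributions rather than single readings, since timer resolution and system noise can easily exceed the cost of a function call.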
|
#24
| |||
| |||
| "Andy White" <brighamandrew@msn.com> wrote in news:V1m4e.1136$vV3.610@fe04.lga:
> I need advice on when to use global and when not. I've always
> been taught to avoid using globals as much as possible but
> recently since I've stepped into video game programming, I've
> heard that it's ok or "more accepted". I need a guideline for
> this matter, thanks.

Use them. Comment them, write their common values into the comments, and keep them in one place unless they are thread-global variables. (Those go into the Runnable.) Write some kawaii comment about it and prepare anti-ballistic armor. If you need to update them from multiple threads, make them volatile, and cache them to avoid an unnecessary performance hit. If you were using some smart language like Java, you would refer to them as, for example, "Global.someName" (actually you'd be forced to, if you avoided static imports). Then after a few months or years of experience with them you'd learn proper usage.

I really don't know why someone would disallow the use of global variables, especially if they are written in one file, sorted and commented. |
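A rough C++ rendering of the "one place, commented, accessed as Global.someName" advice might look like the sketch below. Every name in it (`Global`, `targetFrameRate`, `enemiesAlive`, `enemies_needing_update`, `check_clamp`) is invented for illustration. One deliberate substitution: the post suggests `volatile` for variables updated from several threads, which matches Java's meaning of the keyword, but in C++ `volatile` does not provide inter-thread synchronization, so the sketch uses `std::atomic` instead.

```cpp
#include <atomic>
#include <cassert>

// All globals live in one namespace, in one file, each with a comment.
namespace Global {
    // Set once at start-up, read everywhere. Common value: 60.
    int targetFrameRate = 60;

    // Updated from several threads. Common values: 0 to a few hundred.
    std::atomic<int> enemiesAlive{0};
}

// "Cache them": read the shared value once into a local and work on the
// copy, instead of touching the shared variable repeatedly.
inline int enemies_needing_update(int maxPerFrame) {
    const int alive = Global::enemiesAlive.load(std::memory_order_relaxed);
    return alive < maxPerFrame ? alive : maxPerFrame;
}

// The helper must never report more work than the per-frame budget.
inline void check_clamp() {
    Global::enemiesAlive.store(1000);
    assert(enemies_needing_update(10) == 10);
    Global::enemiesAlive.store(0);
}
```

In a multi-file project the definitions would sit in a single .cpp file with matching `extern` declarations in a header, which keeps the "written in one file" property the post asks for.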
|
#25
| |||
| |||
| Jonathan Bartlett wrote:
> Phlip wrote:
>> nokturnal wrote:
>>> Having global variables is faster than having to pass the variable round
>>> from function to function.
>>
>> So, you have time-tested this? and you have compared it to the Singleton
>> pattern with Get() methods?
>
> Actually, not only are globals faster than variables passed, they are
> also faster than straight local variables. Retrieving a local variable
> requires an indexed addressing mode to a stack address, while retrieving
> a global variable can be done with a direct addressing mode. There are
> some issues of cache, but these are mostly outweighed by the overhead of
> bringing the value into the stack (whether by passing or derivation).
>
> Jon

Certainly that's the conventional wisdom. However, it no longer applies. Cache locality will favor local variables, depending on how the program works. For example, if the program uses a lot of locals, instead of globals spread all over the program, it will increase cache locality and thus actually be faster.

In order to understand REAL speed, you need to know both how the compiler works, and how the machine (CPU) works.

The best methodology is to program it using best practices (like eschewing globals), then optimize frequently executed code, as determined by a profiler, using knowledge of CPU, cache and compiler characteristics. |
|
#26
| |||
| |||
| nathan@visi.com (Nathan Mates) writes:
> In article <1112676450.491826.18800@g14g2000cwa.googlegroups.com>,
> Phlip <phlip2005@gmail.com> wrote:
> >All of that is about whether inline produces in-line opcodes, not
> >whether it actually makes things faster. It could make them slower.
>
> "Could." But, is not guaranteed to. Therefore, inlining is not
> something to mindlessly avoid. C/C++ give you lots of tools to make
> shooting yourself in the foot very easy and very painful, but that's
> not a reason to get rid of the tools. It's a wake-up call that those
> using it need to get a clue as to how their tools work.

I am not a games programmer, so this is a question from curiosity: is the obsessing over inlining really worth the trouble in practice?

Given the current aggressive schedules and complexity of today's games, I would have expected that the biggest risk is plain old bugs. Thus design choices to do strong information hiding, "design firewalls", and things that improve compilation times would all be more important. That is, things are more robust if one minimizes inlining.

Another question: does not algorithmic efficiency far outweigh function call overheads in practice? I would have expected that in practice a few key spots in the game engine would be the bottlenecks, and that for the rest of the game it really does not matter.

The standard mantra I am aware of is: don't optimize for efficiency till later. The idea being you get the design and implementation "right" (including algorithm choices, of course) and only have careful profiling worry about the low level tricks to speed things up.

I am aware though that real time considerations can turn all that on its head.

--
Cheers,                                        The Rhythm is around me,
                                               The Rhythm has control.
Ray Blaak                                      The Rhythm is inside me,
rAYblaaK@STRIPCAPStelus.net                    The Rhythm has my soul. |
|
#27
| |||
| |||
| Nathan Mates wrote:
> Personally, I'd like to see compilers give *MORE* information back.
> At the end of compiling and/or linking, I want a big text file
> (optionally) kicked out, saying the following:
>
> Functions that aren't inlined, but probably should be:
>     void SomeClass::DoNothing(void) - 0 bytes, called 5 times
>     float GameLoop::GetDT(void) - 4 bytes, called 1463 times
>     void Bar::Baz(int &) - 4 bytes, called 840 times
>     [...]
>
> Functions that are flagged as inline, but probably shouldn't be:
>     void Entity::DoSomethingObscure(int, float, void *, int, bool) - 2300 bytes, called 1 time
>     [...]

not a bad idea, but not entirely relevant by itself. if you could also get *runtime* information such as the actual number of times each function was called, that would be neat. maybe even the time spent in the function vs. time spent in the function prologue. profiling will get you some of that info, but it's probably impractical to get any information for already inlined functions.

of course, that info alone isn't enough to tell you what can and can't be inlined anyway - i mean if the function's address is taken, it can't be inlined. still, it would be a helpful guideline.

otoh, if a compiler is collecting that kind of info, what's to stop it from using it to decide what to inline on its own? the inline keyword is just a suggestion - the compiler is free to inline or not inline as it sees fit.

indi
--
Mark A. Gibbs (aka. Indi)
Administrator #c++ on irc.Rizon.net
http://ca.geocities.com/indij@rogers.com/ (temporary website) |
|
#28
| |||
| |||
| In article <uhdik6ft0.fsf@STRIPCAPStelus.net>, Ray Blaak <rAYblaaK@STRIPCAPStelus.net> wrote:
> The standard mantra I am aware of is: don't optimize for efficiency
> till later. The idea being you get the design and implementation
> "right" (including algorithm choices, of course) and only have
> careful profiling worry about the low level tricks to speed things
> up.

You shouldn't spend time optimizing until later, but that doesn't mean you don't think about it. For example, you don't write BogoSort (see google) and expect to change it to quicksort later if necessary. Basically, don't do things that you know are stupid from day 1.

To me, not inlining trivial functions that are called a lot falls into that category. Any function body that's as short as or shorter than the overhead of the function call to reach it is my definition of 'trivial.' More complex functions should be considered for inlining later.

As you gain experience, you'll know what performance wins you can get "for free" while programming the first time. That's the best time to implement changes-- while writing them-- rather than coming back later at the end of the project when more things are locked down and everyone is tired. Studies show that the cost of fixing bugs increases the later they're tackled. Thus, avoiding "premature" optimization doesn't mean that you optimize last.

Nathan Mates
--
<*> Nathan Mates - personal webpage http://www.visi.com/~nathan/
# Programmer at Pandemic Studios -- http://www.pandemicstudios.com/
# NOT speaking for Pandemic Studios. "Care not what the neighbors
# think. What are the facts, and to how many decimal places?" -R.A. Heinlein |
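As an illustration of the 'trivial' category described above, the sketch below contrasts an accessor defined inside the class body with a member defined out of line. `GameLoop::GetDT` is borrowed from the hypothetical compiler report earlier in the thread; the `dt_` member, the constructor, `SetDT` and `check_accessor_round_trip` are assumptions added so the example compiles.

```cpp
#include <cassert>

class GameLoop {
public:
    explicit GameLoop(float dt) : dt_(dt) {}

    // Defined inside the class body, so it is implicitly inline. Every
    // caller that includes this definition can see the body, and the
    // compiler may replace the call with a single load.
    float GetDT() const { return dt_; }

    // Declared here, defined out of line below.
    void SetDT(float dt);

private:
    float dt_;
};

// In a real project this would live in GameLoop.cpp. Callers in other
// translation units then see only the declaration, so without link-time
// optimization they have to emit an actual call.
void GameLoop::SetDT(float dt) { dt_ = dt; }

// The accessor must hand back exactly what was stored.
inline void check_accessor_round_trip() {
    GameLoop loop(0.5f);
    loop.SetDT(0.25f);
    assert(loop.GetDT() == 0.25f);
}
```

Moving a one-line body like `GetDT` into the header costs nothing in readability and removes the dependence on the build's link-time settings, which is the "for free" kind of win the post has in mind.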
|
#29
| |||
| |||
| Raghar wrote:
> I really don't know why would someone disallow use of global
> variables, especially if they would be written in one file, sorted
> and commented.

Could you add "encapsulated" to your list?

> Which one? L1 cache, or opcode cache? And what penalty it would have?

Golly gee, do ya think that might be architecture-specific?

> Would you care to write a small test program...

Nope. Do you have an answer for the question, "Does 'inline', after it's shown to produce in-line opcodes, always make things faster?"

Ray Blaak wrote:
> I am not a games programmer, so this is a question from curiosity: is the
> obsessing over inlining really worth the trouble in practice?

My bad. I presumed that a common excuse for _naked_ globals might be the run-time cost of encapsulating them. Hence: Ensure their accessors are inlined, _and_ ensure the inlining is relevant to performance. Maybe most simple accessor inlining is.

Others have pointed out that better algorithms and local variables can often be safely assumed to have a greater effect.

> Given the current aggressive schedules and complexity of today's games, I
> would have expected that the biggest risk is plain old bugs.

Welcome to game programming. We have a word for that: "Crunch mode". You boil the architecture and art until the bug list gets super-long, then you work late beating the simple-bug count down.

But you didn't hear that from me - nobody here likes to be reminded of it.

Mark A. Gibbs wrote:
> otoh, if a compiler is collecting that kind of info, what's to stop it
> from using it to decide what to inline on its own?

I thought some of them did that. Specifically - some inline things that you didn't declare.

Scott Moore wrote:
> In order to understand REAL speed, you need to know both how the compiler
> works, and how the machine (CPU) works.
>
> The best methodology is to program it using best practices (like eschewing
> globals), then optimize frequently executed code, as determined by a
> profiler, using knowledge of CPU, cache and compiler characteristics.

So, is there a consensus that global variables aren't a wise path towards speed? Or did I cause trouble in vain?

--
Phlip
http://industrialxp.org/community/bi...UserInterfaces |
|
#30
| |||
| |||
| In article <LJSdnel1tMiBj87fRVn-gg@rogers.com>, Mark A. Gibbs <x_gibbsmark@rogers.com_x> wrote:
>> Personally, I'd like to see compilers give *MORE* information back.
>> At the end of compiling and/or linking, I want a big text file
>> (optionally) kicked out, saying the following:
>> Functions that aren't inlined, but probably should be:
>>     void SomeClass::DoNothing(void) - 0 bytes, called 5 times
>>     float GameLoop::GetDT(void) - 4 bytes, called 1463 times
>>     void Bar::Baz(int &) - 4 bytes, called 840 times
>>     [...]
>
> not a bad idea, but not entirely relevant by itself. if you could also
> get *runtime* information such as the actual number of times each
> function was called, that would be neat.

Runtime profiling like that will *kill* your performance. There's millions of functions called per second, each of which has to have code inserted to increment a counter (64 bit minimum, which is a big hit on a 32-bit processor). Your in-game profiler shouldn't have a huge hit on performance-- 5% is borderline "too much" in my book.

That's why I want the text printout of functions. Give that to a programmer, and they can quickly say "that's an init-time function, and it doesn't matter" or "wow-- that's called from all our inner loops."

> of course, that info alone isn't enough to tell you what can and can't
> be inlined anyway - i mean if the function's address is taken, it can't
> be inlined. still, it would be a helpful guideline.

Most compilers will duplicate the inlined code into the calling functions. That way, there's a copy that can be address-of'd, and those that use it inline both get a win.

> otoh, if a compiler is collecting that kind of info, what's to stop it
> from using it to decide what to inline on its own? the inline keyword is
> just a suggestion - the compiler is free to inline or not inline as it
> sees fit.

If they did a great job, there wouldn't be this discussion about misuse of it. Therefore, some handholding is necessary. And, things like accessors-in-cpp-files can't ever be considered as a candidate to inline by most compilers. [LTCG on VS.NET does that, but it's not on by default.] That's why I want the report to say "consider inlining these functions" so that the code can be rearranged to support that.

Nathan Mates
--
<*> Nathan Mates - personal webpage http://www.visi.com/~nathan/
# Programmer at Pandemic Studios -- http://www.pandemicstudios.com/
# NOT speaking for Pandemic Studios. "Care not what the neighbors
# think. What are the facts, and to how many decimal places?" -R.A. Heinlein |
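The address-of point can be shown in a few lines. In this sketch `Twice`, `IntFn`, `call_directly`, `call_through_pointer` and `check_both_paths_agree` are hypothetical names. The same inline function is used both ways: a direct call, which the compiler may expand in place, and a call through a pointer, which needs a real address and therefore an out-of-line copy of the body.

```cpp
#include <cassert>

// A trivial inline function whose address is also taken elsewhere.
inline int Twice(int x) { return x * 2; }

// Plain function-pointer type for the indirect path.
using IntFn = int (*)(int);

// Direct call: the body of Twice is visible here, so the compiler may
// expand it at this call site.
inline int call_directly(int x) { return Twice(x); }

// Indirect call: the callee is only known as an address at this point,
// so an out-of-line copy of whatever function is passed must exist.
inline int call_through_pointer(IntFn fn, int x) { return fn(x); }

// Both paths must produce the same answer for the same input.
inline void check_both_paths_agree() {
    assert(call_directly(8) == call_through_pointer(&Twice, 8));
}
```

Taking `&Twice` does not stop the direct call from being inlined; the two uses are handled independently. Whether a particular compiler actually inlines either one still depends on its heuristics and flags.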