| Register | FAQ | Calendar | Search | Today's Posts | Mark Forums Read |
|
#1
| |||
| |||
| I am playing around with some ideas for a compiled dynamically typed language i.e. no VM. What bugs me is that in a dynamically typed language without primitives, code that looks like this in C: int j; .... <code setting the value of j> int i = j * 5 + 3; Would usually compile to something like 2 dynamic calls, first on j and then on the result of j * 5. While one can discuss the extra overhead of dynamic dispatch, in this example the C code doesn't even turn up a single function call, and because of the dynamic dispatch there is no hope to inline the calls. The only idea that struck me as remotely feasible would be to inline these types manually. So my code i = j * 5 + 3; would compile to something like (ignoring conversions due to int overflow): Id temp = IS_INT(j) ? j * 5 : call(j, "mult", 5); Id i = IS_INT(temp) ? temp + 3 : call(temp, "add", 3); Where IS_INT is some macro that would do some bit-check to verify that the object can be interpreted as an int. Aside from being hideously ugly, I am not really sure that it is a workable solution. This obviously touches on the whole "how do you get a dynamic language to run as fast as C" discussion. So, does anyone have any ideas or pointers? |
|
#2
| |||
| |||
| > Would usually compile to something like 2 dynamic calls, first on j > and then on the result of j * 5. > > While one can discuss the extra overhead of dynamic dispatch, in this > example the C code doesn't even turn up a single function call, and > because of the dynamic dispatch there is no hope to inline the calls. > > The only idea that struck me as remotely feasible would be to inline > these types manually. > > So my code > > i = j * 5 + 3; > > would compile to something like (ignoring conversions due to int > overflow): > > Id temp = IS_INT(j) ? j * 5 : call(j, "mult", 5); > Id i = IS_INT(temp) ? temp + 3 : call(temp, "add", 3); Codegen for dynamic languages is "fun" and what you suggest (with the IS_INT type checks) is basically the way they are handled. I'm most familiar with javascript, which allows for dynamically set type conversion functions to handle toNumber for example, but the "functions" *, /, etc are immutable. But if you take an example like: var z = x * y; that is in principle var z = getPrimitiveValue(x) * getPrimitiveValue(y); Where getPrimitiveValue may end up calling the user defined function valueOf() (which can throw, or return a non-primitive value *sigh*). However to get around the immense cost of the series of virtual calls that would other wise be needed most JS interpreters do have some kind of fast logic to distinguish a few specific types, so you may end up with code along the lines of (effectively): double v1, v2; if (fastToNumber(x, v1) && fastToNumber(y, v2)) z = v1 * v2; else { slow path with virtual calls, exception checks, etc } Additionally, there are other things you can do, for example the basic arithmetic operators (*, /, -, bit operators) have guaranteed returns types ("Number", + may return Number or String), which allows a degree of static analysis in the expression tree. Finally, a number of the JS interpreters will modify the execution of a script as the script is being executed based on the types and functions they see actually being hit, to try to specialise the order of logic used to determine behaviour, etc. --Oliver |
|
#3
| |||
| |||
| On Aug 23, 4:38 pm, Christoffer Lernv <le...@dragonascendant.com> wrote: > I am playing around with some ideas for a compiled dynamically typed > language i.e. no VM. > > What bugs me is that in a dynamically typed language without > primitives, code that looks like this in C: > > int j; > > ... <code setting the value of j> > > int i = j * 5 + 3; > > Would usually compile to something like 2 dynamic calls, first on j > and then on the result of j * 5. > > While one can discuss the extra overhead of dynamic dispatch, in this > example the C code doesn't even turn up a single function call, and > because of the dynamic dispatch there is no hope to inline the calls. > > The only idea that struck me as remotely feasible would be to inline > these types manually. > > So my code > > i = j * 5 + 3; > > would compile to something like (ignoring conversions due to int > overflow): > > Id temp = IS_INT(j) ? j * 5 : call(j, "mult", 5); > Id i = IS_INT(temp) ? temp + 3 : call(temp, "add", 3); > > Where IS_INT is some macro that would do some bit-check to verify that > the object can be interpreted as an int. > > Aside from being hideously ugly, I am not really sure that it is a > workable solution. > > This obviously touches on the whole "how do you get a dynamic language > to run as fast as C" discussion. So, does anyone have any ideas or > pointers? Naive dynamic language implementations carry type tags in objects or pointers (or both) and spend a lot of time checking them before every operation. That's what you're proposing here. Fancy implementations use type inference and other rule-based schemes along with inlining to eliminate checks wherever possible, sometimes tags as well. There's a literature on type inference for various dialects of lisp. You might start by googling for "type inference lisp" and "type inference scheme", also the more general "lisp optimization". It's actually worse than what you show. Unboxed INTs (usually called fixnums) need at least one bit to identify them as such. Usually you exploit alignment restrictions: the low bits of pointers musts be zero. So represent unboxed fixnum j as e.g. (j << 1) | 1. This means that even for simple arithmetic, you need to shift the tag bit out to the right beforehand. One motivation for languages designed to support sound and complete type inference (ML and its progeny) is to get rid of _all_ the tags and checks while retaining most of the convenience of dynamic types. |
|
#4
| |||
| |||
| > Id temp = IS_INT(j) ? j * 5 : call(j, "mult", 5); > Id i = IS_INT(temp) ? temp + 3 : call(temp, "add", 3); > > Where IS_INT is some macro that would do some bit-check to verify that > the object can be interpreted as an int. > > Aside from being hideously ugly, I am not really sure that it is a > workable solution. > > This obviously touches on the whole "how do you get a dynamic language > to run as fast as C" discussion. So, does anyone have any ideas or > pointers? I wrote a script language (to write repetitive source code from a template) which is C like, has user defined variables and all type control is done at runtime. I wrote it in C++ & used normal virtual fns to dispatch the operators and perform type conversion. The parser creates a basic tree which is then interpreted & the function calls really aren't an issue for my performance. Essentially I coded the 'generic' case of what you suggest. A basic outline of what I did is below, of course you can splatter it with consts etc. to your hearts content... class Base { public: virtual Int* GetAsInt() {return NULL;} // return Int or NULL if conversion illegal virtual String* GetAsString(); virtual void BinaryPlus(Base* rhs); ... } class Int : public Base { public: Int* GetAsInt() {return this;} // No conversion String* GetAsString(); // makes new string & initialises it with formatted Int void BinaryPlus(Base* rhs) { Int* int_rhs=GetAsInt(rhs); // conversion if (!int_rhs) SemanticError(); // Error handling Val += int_rhs->Val; // do operator; } private: int Val; } //elsewhere... ExectueTreeNodeBinary { ... lhs->BinaryPlus( rhs ); // all done - in fact here I use a generic dispatch fn with a pointer to the Binary Operator, no switch or anything } Types only need to know about themselves & how to convert themselves to other types (if possible). It gives me the benefit that I can add integers to strings and get a string which is great for text formatting (the purpose of my language). 2 trivial vfn lookups for a binary operator really isn't a performance problem on modern processors. I prefer it to branches & switches. So for: int i = j * 5 + 3; I have the obvious tree based on C operator precedence. So instance j operator* is called with rhs constant(5) which is converted to j's type as the above example before the *. Then constant(3) is converted to result's type (i.e. j's type) for the binary+. Then the result is converted to int for the assignment. Any illegal semantics in the types are mopped up along the way + useful error message for the human overlord. In fact the common code for the operators is pulled out into a wrapper fn() & the binary operators (unary as well) are much simpler in my real types but I hope you get the gist. Performance is good but simplicity of implementing operators & types was my primary concern. I assume that your application & requirements are different to mine but I hope the info is helpful. Chris |
|
#5
| |||
| |||
| On Aug 24, 7:21 pm, Gene <gene.ress...@gmail.com> wrote: > On Aug 23, 4:38 pm, Christoffer Lernv <le...@dragonascendant.com> > wrote: > > I am playing around with some ideas for a compiled dynamically typed > > language i.e. no VM. > It's actually worse than what you show. Unboxed INTs (usually called > fixnums) need at least one bit to identify them as such. Usually you > exploit alignment restrictions: the low bits of pointers musts be > zero. So represent unboxed fixnum j as e.g. (j << 1) | 1. This means > that even for simple arithmetic, you need to shift the tag bit out to > the right beforehand. > > One motivation for languages designed to support sound and complete > type inference (ML and its progeny) is to get rid of _all_ the tags > and checks while retaining most of the convenience of dynamic types. I have a couple nits to pick here: 1. Unless you have hardware tagged arithmetic support (e.g. SPARC), tagging fixnums with 0 saves a couple shifts during arithmetic. The downside is that you have to do one more addition for pointer operations that don't already add a constant offset. That's not bad at all, especially if the architecture you're targeting offers load/store- with-(constant-)offset instructions. 2. Even disregarding tagged union, most ML derivatives still use tags: they need to distinguish pointers from unboxed immediate values in the GC (which is non-trivial due to polymorphism). It is possible to reconstruct pointerness information at runtime [1], but few implementations (none?) do. "... the implementation of the tag-free collector in a polymorphically typed language is difficult in ways that were not described in previous papers... we mainly describe how to perform type reconstruction during garbage collection and do not attempt to address practical issues in the garbage collection process." ([1] again) might be why. If one is ready to go for whole-program compilation, it's feasible, as in MLton demonstrates, to compile away all the polymorphism and only have monomorphic functions left at runtime. That greatly simplifies the task of reconstructing types from the backtrace. Paul Khuong 1. Polymorphic Type Reconstruction for Garbage Collection without Tags, Benjamin Goldberg, Michael Gloger, 1992 <http://www.cs.nyu.edu/ goldberg/pubs/gg92.pdf>. |
|
#6
| |||
| |||
| On Aug 24, 4:58 pm, "oliverh...@gmail.com" <oliverh...@gmail.com> wrote: > > Would usually compile to something like 2 dynamic calls, first on j > > and then on the result of j * 5. > > > While one can discuss the extra overhead of dynamic dispatch, in this > > example the C code doesn't even turn up a single function call, and > > because of the dynamic dispatch there is no hope to inline the calls. > > > The only idea that struck me as remotely feasible would be to inline > > these types manually. > > > So my code > > > i = j * 5 + 3; > > > would compile to something like (ignoring conversions due to int > > overflow): > > > Id temp = IS_INT(j) ? j * 5 : call(j, "mult", 5); > > Id i = IS_INT(temp) ? temp + 3 : call(temp, "add", 3); > > Codegen for dynamic languages is "fun" and what you suggest (with the > IS_INT type checks) is basically the way they are handled. I'm most > familiar with javascript, which allows for dynamically set type > conversion functions to handle toNumber for example, but the > "functions" *, /, etc are immutable. > > But if you take an example like: > > var z = x * y; > > that is in principle > > var z = getPrimitiveValue(x) * getPrimitiveValue(y); > > Where getPrimitiveValue may end up calling the user defined function > valueOf() (which can throw, or return a non-primitive value *sigh*). > > However to get around the immense cost of the series of virtual calls > that would other wise be needed most JS interpreters do have some kind > of fast logic to distinguish a few specific types, so you may end up > with code along the lines of (effectively): > > double v1, v2; > if (fastToNumber(x, v1) && fastToNumber(y, v2)) z = v1 * v2; > else { slow path with virtual calls, exception checks, etc } Theoretically, it's possible to write a type inferencer for a dynamic language. It obviously wouldn't be perfect, but even if it only gives you useful information 50% of the time, it still seems like the way to go. The only issue I see with it is when you're dealing with actual "dynamic" code, but again, it should e possible to work around that. Of course, neither of those tasks is trivial in practice, simply because it would be a nightmare in terms of complexity. It might be something you'd want to look into, though, as that was what I was originally planning on doing when writing the compiler for my (now defunct) dynamically typed language. (Granted, the language was basically designed around the fact that various inference passes would be run over the code.) |
|
#7
| |||
| |||
| On 24 Aug, 23:58, "oliverh...@gmail.com" <oliverh...@gmail.com> wrote: > > Would usually compile to something like 2 dynamic calls, first on j > > and then on the result of j * 5. > > > While one can discuss the extra overhead of dynamic dispatch, in this > > example the C code doesn't even turn up a single function call, and > > because of the dynamic dispatch there is no hope to inline the calls. > > > The only idea that struck me as remotely feasible would be to inline > > these types manually. > > > So my code > > > i = j * 5 + 3; > > > would compile to something like (ignoring conversions due to int > > overflow): > > > Id temp = IS_INT(j) ? j * 5 : call(j, "mult", 5); > > Id i = IS_INT(temp) ? temp + 3 : call(temp, "add", 3); > > Codegen for dynamic languages is "fun" and what you suggest (with the > IS_INT type checks) is basically the way they are handled. I'm most > familiar with javascript, which allows for dynamically set type > conversion functions to handle toNumber for example, but the > "functions" *, /, etc are immutable. > > But if you take an example like: > > var z = x * y; > > that is in principle > > var z = getPrimitiveValue(x) * getPrimitiveValue(y); > > Where getPrimitiveValue may end up calling the user defined function > valueOf() (which can throw, or return a non-primitive value *sigh*). > > However to get around the immense cost of the series of virtual calls > that would other wise be needed most JS interpreters do have some kind > of fast logic to distinguish a few specific types, so you may end up > with code along the lines of (effectively): > > double v1, v2; > if (fastToNumber(x, v1) && fastToNumber(y, v2)) z = v1 * v2; > else { slow path with virtual calls, exception checks, etc } Do you know of any papers detailing the fast algorithms to do this? > Additionally, there are other things you can do, for example the basic > arithmetic operators (*, /, -, bit operators) have guaranteed returns > types ("Number", + may return Number or String), which allows a degree > of static analysis in the expression tree. Unless you, as I was planning to, treat them as methods (like Ruby) at which point I suppose all bets are off except for "final" classes. I *will* have optional typing in place, but I don't if it will help optimizing anything, since you can send any message to any object regardless. > Finally, a number of the JS interpreters will modify the execution > of a script as the script is being executed based on the types and > functions they see actually being hit, to try to specialise the > order of logic used to determine behaviour, etc. That's the luxury of running a VM I suppose. |
|
#8
| |||
| |||
| On 25 Aug, 17:20, Paul Khuong <pkhu...@gmail.com> wrote: > 1. Unless you have hardware tagged arithmetic support (e.g. SPARC), > tagging fixnums with 0 saves a couple shifts during arithmetic. The > downside is that you have to do one more addition for pointer > operations that don't already add a constant offset. That's not bad at > all, especially if the architecture you're targeting offers load/store- > with-(constant-)offset instructions. Speaking of raw pointers, is that the way to go? I originally envisioned some sort of (compact) object table where what I passed around was simply the index into that table rather than the pointer itself. Obviously this would mean another point of indirection whenever accessing data, but I was thinking it would simplify things like tracking all objects and maybe enable optimizations later on. It's just an idea and I don't know if it has any real advantages. I note that most implementations of object systems in C instead rely on passing a pointer directly to the object struct. What are the virtues of that approach, except for it being more C-ish? /Christoffer |
|
#9
| |||
| |||
| "Christoffer Lernv" <lerno@dragonascendant.com> wrote in message > On 25 Aug, 17:20, Paul Khuong <pkhu...@gmail.com> wrote: >> 1. Unless you have hardware tagged arithmetic support (e.g. SPARC), >> tagging fixnums with 0 saves a couple shifts during arithmetic. The >> downside is that you have to do one more addition for pointer >> operations that don't already add a constant offset. That's not bad at >> all, especially if the architecture you're targeting offers load/store- >> with-(constant-)offset instructions. > > Speaking of raw pointers, is that the way to go? Raw Pointers (And Presumably Conservative Vs Precise Garbage Collection), Vs tagged values or other cases, do offer a few advantages: it is possible to mesh things much more nicely with C land; a good deal of internal complexity can be reduced; ..... However, With A Raw Pointer Scheme, Getting Good Performance Is A Lot Harder vs when using tagged values, especially if the compiler/interpreter does not have strong type inference. However, I Have Still Eventually Ended Up Largely Settling On Raw Pointers, or as I have often called them, "magic pointers" (well, along with a VM project that is, internally, rather overly complex). > I originally envisioned some sort of (compact) object table where what > I passed around was simply the index into that table rather than the > pointer itself. > this is a worthwhile approach, especially if combined with a tagged-value scheme (even if cheap, for example, representing fixnums or flonums as any kind of allocated objects can become very expensive). now, in my case, I have done something funky: I allocate big chunks of address space (the details depends on OS and arch, where on x86 this is typically the whole upper 1-2GB of the process image, which is otherwise unusable by the app), and use chunks of this space for things like fixnums and flonums. so, I make a design proclamation: not all of the address space pointed to by dynamic pointers is necessarily addressable (and even if it is, it is not gueranteed to point at memory necessarily representing the object being portrayed). now, on x86-64, I have a far more massive piece of address space to play with (56 bits usable for addresses? this leaves around 63.9935 bits free to do anything I want with, although I currently only use another 48 or 56 bits I think to represent my "special" space). > Obviously this would mean another point of indirection whenever > accessing data, but I was thinking it would simplify things like > tracking all objects and maybe enable optimizations later on. It's > just an idea and I don't know if it has any real advantages. > some advantages, some disadvantages, despends on what you are doing. as can be noted, when I was last using tagged values, they internally did use indexing of this sort, and yes, the other advantages of this approach (for example, you don't have to "look up" parts of the heap, ...) make up for the added indirection. > I note that most implementations of object systems in C instead rely > on passing a pointer directly to the object struct. What are the > virtues of that approach, except for it being more C-ish? > it is more C-ish mostly, and in C at least, tends to allow a much more "natural" interface, but at additional costs. actually, within my existing "magic pointer" system, this is often done as well, but this is also mixed with far more "opaque" types and interfaces (abstract API calls and similar...). this is a system that in my case has generally "evolved" over the past number of years (me at times using a much more primitive version of this system, at other times using tagged-reference systems, facing issues like one system being slower than hell, the other being a horrible PITA to use from plain C code, ...). it takes a long time to generally find something that works well... |
|
#10
| |||
| |||
| Christoffer Lernv wrote: > This obviously touches on the whole "how do you get a dynamic language > to run as fast as C" discussion. So, does anyone have any ideas or > pointers? Look at "polymorphic inline caches" and trace trees: http://research.sun.com/self/papers/pics.html (the Self papers in general are full of really interesting stuff for implementing dynamic languages, and they are very readable) http://andreasgal.com/2008/06/02/trace-trees-faq/ While these techniques appears to be mainly used with VM's, they can definitively be used without them. Polymorphic inline caches work by generating inline code for type checks and the methods to be called at the call site. If you allow dynamically replacing methods at runtime (like in Ruby for example) you'd need some mechanism for identifying and invalidating the caches. Trace trees work by instrumenting the code and tracing the control flow. When frequently occurring loops are identified the code is optimized (in their case by JIT'ing, but this optimization could just as well include creating a inlined and partially unrolled native code sequence with inline caches. You might also want to look into the ongoing work on the new version of the Lua JIT which make use of some of this. Another thing to look at is to what extend your language allows "lifting" type checks and then statically typing subtrees. That is, if your languages semantics allows you to show that inside f(), x will always refer to the same type, you can consider lifting the type check for x out of f() and optionally then compile (or inline) static versions of f() for the most likely types. Often you can make very good guesses - i.e. what looks like integer addition most likely is, so lifting type checks as far up as possible and compiling static versions that use integer additions with a fully dynamic fallback if the type check fails may be acceptable and give very good performance for the typical case (of course at the expense of larger code). Of course lifting type checks out of small inner loops wherever possible is the most obvious use case. Vidar |
![]() |
| Thread Tools | |
| Display Modes | |
In an effort to better serve ads to our visitors, cookies are used on objectmix.com. For more information, check out our Privacy Policy.