Machine language and assembler translators? - Compilers


  1. Machine language and assembler translators?

    Hi Sir

    My question is whether there are any tools available which convert the
    machine language of one architecture to another architecture. For
    example, say we have a compiler for language X and the target
    language of the compiler is ARM. Are there any tools available
    which convert or translate the code generated for ARM to some other
    architecture, say SH?

    So basically we will be compiling the source language to get the
    target language code for a particular architecture which will then be
    translated to other architecture code.

    Now my personal view is that all the above work can also be achieved
    by using a cross compiler, but for that we have to make changes in
    the compiler backend. Instead of doing that, can't we make a tool which
    converts the assembly language of one architecture to another
    architecture's machine language?

    Also kindly let me know if there are already any tools available which
    are capable of doing the above work (assembly-to-assembly translation).

    Kindly advise.

    Thanks and Best Regards
    Jatin Bhateja
    [Both machine language translators and assembler translators have been
    around approximately forever. It's straightforward except for
    self-modifying code, and incompatible byte order and data formats.
    -John]
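
    To make the byte-order point concrete, here is a minimal sketch (an
    illustration only, not any particular tool): the same four bytes of
    program data read as a 32-bit integer under each byte order. A
    translator would need to swap such words, but it generally cannot tell
    from the machine code alone which bytes are words and which are, say,
    character strings. The helper name swap32 is hypothetical.

```python
import struct

# Four bytes as they might sit in a data section of the source program.
raw = bytes([0x12, 0x34, 0x56, 0x78])

# The same bytes read as a 32-bit unsigned integer under each byte order.
as_big = struct.unpack(">I", raw)[0]     # big-endian view
as_little = struct.unpack("<I", raw)[0]  # little-endian view


def swap32(value):
    """Reverse the byte order of a 32-bit unsigned value (hypothetical helper)."""
    return struct.unpack("<I", struct.pack(">I", value))[0]


print(hex(as_big), hex(as_little), hex(swap32(as_big)))
```

    Swapping is only correct for data that really is a 32-bit word;
    applying it to a four-character string would scramble the text.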

  2. Re: Machine language and assembler translators?

    > Now my personal view is that all the above work can also be achieved
    > by using a cross compiler, but for that we have to make changes in
    > the compiler backend. Instead of doing that, can't we make a tool which
    > converts the assembly language of one architecture to another
    > architecture's machine language?


    I doubt it is a reasonable idea. If you just want to port the code so
    that it works on another platform, then it is theoretically
    possible (though it may take less effort to write an emulator). If
    you want reasonable performance, then avoiding the back end's
    work is impossible.

    Optimizing compilers do a lot of hi-tech work on allocating data on
    registers, selecting better instructions, scheduling them for better
    performance, etc. Suppose you convert from a platform with N
    registers to a platform with M<N registers. How will you deal with
    the missing registers? Spill them to memory? Onto the stack? Load and
    unload them from time to time? The work seems comparable to fighting
    in a can...
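
    The simplest answer to the M<N question can be sketched as follows,
    under stated assumptions: the register names (s0.., t0..), the base
    address 0x1000 and the 4-byte slot size are all hypothetical, and a
    real tool would choose which registers to keep based on usage rather
    than on position in the list.

```python
def map_registers(source_regs, target_regs, base_addr=0x1000, slot_size=4):
    """Assign each source register a target register or a memory slot.

    The first len(target_regs) source registers get real registers; the
    rest are "spilled" to fixed memory slots starting at base_addr.
    """
    mapping = {}
    for i, reg in enumerate(source_regs):
        if i < len(target_regs):
            mapping[reg] = ("reg", target_regs[i])
        else:
            slot = base_addr + (i - len(target_regs)) * slot_size
            mapping[reg] = ("mem", slot)
    return mapping


def operand(mapping, reg):
    """Render the target operand that stands in for a source register."""
    kind, where = mapping[reg]
    return where if kind == "reg" else f"[0x{where:04x}]"


# Four source registers squeezed onto a target with only two.
mapping = map_registers(["s0", "s1", "s2", "s3"], ["t0", "t1"])
print(operand(mapping, "s1"), operand(mapping, "s3"))
```

    Every use of a spilled register becomes a memory access, which is
    where much of the performance loss comes from.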

    Vit
    http://www.excelsior-usa.com
    [People have been translating assembler and machine code for 40 years.
    Since the target is usually a newer machine than the source,
    sufficient registers et al aren't usually a problem. Incompatible
    data formats and byte orders, though ... -John]

  3. Re: Machine language and assembler translators?

    "Jatin Bhateja, Noida" wrote:

    > Now my personal view is that all the above work can also be achieved
    > by using a cross compiler, but for that we have to make changes in
    > the compiler backend. Instead of doing that, can't we make a tool which
    > converts the assembly language of one architecture to another
    > architecture's machine language?


    Of course it's doable, a JIT compiler does such a conversion on the
    fly. But for good performance of the converted code a *virtual*
    machine should be chosen for the input code, that maps well to
    existing machines of every kind, instead of some *really* existing
    machine.

    > [Both machine language translators and assembler translators have
    > been around approximately forever. It's straightforward except for
    > self-modifying code, and incompatible byte order and data formats.
    > -John]


    sic!

    DoDi

  4. Re: Machine language and assembler translators?

    Jatin Bhateja, Noida wrote:


    > My question is whether there are any tools available which convert the
    > machine language of one architecture to another architecture. For
    > example, say we have a compiler for language X and the target
    > language of the compiler is ARM. Are there any tools available
    > which convert or translate the code generated for ARM to some other
    > architecture, say SH?


    (snip)

    > [Both machine language translators and assembler translators have been
    > around approximately forever. It's straightforward except for
    > self-modifying code, and incompatible byte order and data formats.
    > -John]


    And that it tends to generate less efficient code than recompiling
    from source.

    Just-in-time (JIT) compilers, made popular relatively recently by
    Java (and probably others), do just that.

    When Apple started selling PowerPC based Macs the new OS included the
    ability to run 680x0 code, using some type of run time translation.

    For a slightly different way of doing it, many IBM S/360 machines
    included microcode to execute the instruction set of previous
    generations of machines. This allowed purchasers to run existing
    programs while developing native S/360 versions.

    -- glen

  5. Re: Machine language and assembler translators?

    Vit wrote:
    (snip, and previous snip regarding translation of object code
    between architectures as an alternative to recompilation.)

    > I doubt it is a reasonable idea. If you just want to port the code so
    > that it works on another platform, then it is theoretically
    > possible (though it may take less effort to write an emulator). If
    > you want reasonable performance, then avoiding the back end's
    > work is impossible.


    For significantly different architectures there is likely a fairly
    large performance penalty.

    For subarchitectures, different branches of what is fundamentally the
    same architecture, though, it may be more useful.

    I have written similar discussions here before, but I believe it may
    make a lot of sense in the continuing development of RISC machines.

    I find it interesting that the S/360 architecture, though with some
    improvements along the way, is still alive and well.

    Now, consider that much of the advantage of RISC depends on the
    compilers generating optimal code for the specific processor. That is,
    not just at the architecture level but for what is sometimes called the
    subarchitecture. (I first knew subarchitecture from Sun/SPARC where it
    refers more to the MMU from the days of external MMU, but I believe the
    term is appropriate.)

    One of the early features of RISC was branch delay slots. The proper use
    of branch delay slot instructions, of optimizing for the branch
    prediction logic, and other branch related instruction scheduling tricks
    allow one to get the best performance out of a RISC machine. Yet all
    these tricks are dependent on the specific details of the machine, the
    subarchitecture, and likely result in code that performs poorly on
    different machines with the same architecture, maybe only a year or
    two later. It might be that some kind of object code translation
    could help in the RISC case.

    S/360 code likely runs fine, though maybe slightly less than optimal,
    on current z/Architecture machines. That is, assembly code that is
    now almost 40 years old!

    -- glen

  6. Re: Machine language and assembler translators?

    Sorry, but I do not quite catch what you mean

    > For subarchitectures, different branches of what is fundamentally the
    > same architecture, though, it may be more useful.


    Agreed, but it's too narrow an area, and I feel that the initial
    problem was posed in a wider context...

    > Now, consider that much of the advantage of RISC depends on the
    > compilers generating optimal code for the specific processor.


    Since I agree with the above, then my comment for the following:

    > not just at the architecture level but for what is sometimes called the
    > subarchitecture.


    is: It is easier to tune the compiler's back end for better
    performance at the subarchitecture level (and all its existing
    varieties) than to develop a side tool such as a target code converter
    and lose performance, even to the smallest extent... My point is
    that when both the high-level source code and the compiler's sources
    are available for a given piece of low-level code, it is better to
    tune the compiler and recompile. Creating a converter makes sense only
    if it is guaranteed to be used many times, that is, if there is a lot
    of hand-written assembly code or the compiler's sources have been
    lost. If there is just SOME code to be tuned to SOME slightly
    different architecture, well, then it is a one-off problem, and it's
    up to your budget whether to convert it manually or develop an
    automatic converter for a single run...

    Vit
    http://www.excelsior-usa.com

  7. Re: Machine language and assembler translators?

    I have built translators like this for machine/assembly language, with
    mixed results. Yes, it can be done. But it won't be efficient code.

    At the most primitive level, you can look at each instruction of the
    "source" processor, and see how best to make the CPU do _EXACTLY_ the
    same thing in the "target" processor. Once you have a translation of
    each instruction, it's a simple matter to write a text processor that
    will make the necessary substitutions. I've even been known to do it in
    the macro language of a good editor.
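
    A minimal sketch of such a substitution-based text processor, with
    loud assumptions: it uses an 8080-to-8086 register correspondence
    (A to AL, B to CH, C to CL, D to DH, E to DL, H to BH, L to BL, M to
    the byte at [BX]) chosen here for illustration, covers only a handful
    of mnemonics, and ignores labels, directives and flag differences. The
    table and function names are hypothetical and not taken from any tool
    mentioned in this thread.

```python
import re

# Assumed register correspondence (8080 -> 8086), for illustration only.
REG = {"A": "AL", "B": "CH", "C": "CL", "D": "DH",
       "E": "DL", "H": "BH", "L": "BL", "M": "BYTE PTR [BX]"}

# One template list per source mnemonic; {0}, {1} are the translated operands.
TEMPLATES = {
    "MOV": ["MOV {0}, {1}"],
    "ADD": ["ADD AL, {0}"],
    "SUB": ["SUB AL, {0}"],
    "INR": ["INC {0}"],
    "NOP": ["NOP"],
}


def translate_line(line):
    """Translate one source instruction into a list of target lines."""
    code = line.split(";", 1)[0].strip()   # drop comments and whitespace
    if not code:
        return []
    parts = re.split(r"[\s,]+", code)
    mnemonic, operands = parts[0].upper(), parts[1:]
    if mnemonic not in TEMPLATES:
        raise ValueError(f"no translation for {mnemonic}")
    ops = [REG.get(op.upper(), op) for op in operands]
    return [template.format(*ops) for template in TEMPLATES[mnemonic]]


def translate(source):
    """Translate a whole program, one instruction at a time."""
    out = []
    for line in source.splitlines():
        out.extend(translate_line(line))
    return out


program = """
    MOV A,B     ; copy B into the accumulator
    ADD C
    INR M
"""
print("\n".join(translate(program)))
```

    Because each template is a list, one source instruction can expand to
    several target instructions where no single equivalent exists.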

    As both John and Vit point out, your results will depend a lot on the
    similarity between the two processors. Translating, say, the assembly
    language of one RISC machine to another one is a piece of cake, and
    gives good results. Translating between families is a lot harder.

    Our first translator, back in the 8-bit days, involved translating 8080
    assembly language to 6800. That's probably the worst-case scenario for
    two reasons. First, the 6800 had fewer registers, so we had to implement
    registers equivalent to the 8080 registers, in RAM. Second, the Intel
    and Motorola lines had (still do) very different rules for the way the
    flags work. In the Intel line, flags are "sticky," and often remain
    unchanged by certain operations. In the Moto line, flags always reflect
    the contents of the "accumulator" (or register last used).

    Since our translator operated only on one instruction at a time, we had
    to take the worst-case assumption, that every flag had to be treated
    just as it would have been in the source processor. This is clearly a
    pessimistic assumption, and led to much of the inefficiency. The
    translated program ran, and generated correct results out of the gate.
    But the size of the translated code was about double that of the
    original source, and speed was about one third.

    If you had a _REALLY_ good back-end optimizer -- one that could tell if
    a given flag was going to be needed or not -- you might get a lot of the
    performance back.
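
    The kind of analysis meant here can be sketched as a backward pass
    over a straight-line block of instructions. This is an illustration
    only: the flag names and the read/write sets in the example are
    simplified assumptions, and a real optimizer would also have to follow
    branches between blocks.

```python
def needed_flags(block, live_out):
    """For each instruction, return the flags it sets that are later read.

    block: list of (text, writes, reads) tuples, where writes and reads
    are sets of flag names. live_out: flags assumed to be read after the
    block ends (pass every flag to be pessimistic).
    """
    live = set(live_out)
    result = []
    for text, writes, reads in reversed(block):
        result.append((text, writes & live))   # only these must be emulated
        live = (live - writes) | reads         # flags live before this point
    result.reverse()
    return result


block = [
    ("ADD B", {"Z", "S", "C"}, set()),   # flags overwritten before any use
    ("SUB C", {"Z", "S", "C"}, set()),   # only Z is read afterwards
    ("JZ done", set(), {"Z"}),
]

# Assumption for the example: nothing after the block reads the flags.
for text, keep in needed_flags(block, live_out=set()):
    print(f"{text:8} keep flags: {sorted(keep)}")
```

    In this example the translator would have to reproduce only the zero
    flag of the SUB; all the flag bookkeeping for the ADD can be dropped.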

    Others have suggested an emulator instead of a translator. That
    definitely works too, and would of course not cause a bloat in code size.

    Viewed at the most fundamental level, you can imagine each
    translation as a macro substitution. A translator substitutes each
    macro in line. An emulator basically has a subroutine for each macro,
    and uses a jump table to select and execute it. The macros are
    essentially the same either way.
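
    The contrast can be sketched with a toy accumulator machine. The
    three instructions (LOAD, ADD, NEG) are hypothetical, and Python
    source text stands in for target machine code. The translator pastes
    each macro in line; the emulator calls an equivalent subroutine
    through a table.

```python
# Each "macro" gives the semantics of one instruction of a toy accumulator
# machine, written as Python source text with {arg} as the operand.
MACROS = {
    "LOAD": "acc = {arg}",
    "ADD":  "acc = acc + {arg}",
    "NEG":  "acc = -acc",
}


def translate(program):
    """Translator: paste each macro in line, producing straight-line code."""
    lines = ["def translated():", "    acc = 0"]
    for op, arg in program:
        lines.append("    " + MACROS[op].format(arg=arg))
    lines.append("    return acc")
    namespace = {}
    exec("\n".join(lines), namespace)
    return namespace["translated"]


# Emulator: the same macros, but as subroutines selected by a jump table.
HANDLERS = {
    "LOAD": lambda acc, arg: arg,
    "ADD":  lambda acc, arg: acc + arg,
    "NEG":  lambda acc, arg: -acc,
}


def emulate(program):
    acc = 0
    for op, arg in program:
        acc = HANDLERS[op](acc, arg)   # dispatch through the table
    return acc


program = [("LOAD", 5), ("ADD", 7), ("NEG", None)]
print(translate(program)(), emulate(program))
```

    The translated function grows with the program, while the emulator
    stays the same size and pays a table lookup per instruction.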

    One last thought: Since I assume that, nowadays, you are writing the
    code in a high-order language like C, and assuming that you have a feel
    for what constructs are generated for various kinds of instructions, you
    might be able to anticipate things like flag usage much better. That is
    to say, an if-statement or for-statement is going to do its predicate
    tests in a very specific way. If you know what that way is, you can use
    it to create an equivalent construct for the target machine.

    Unfortunately, this approach is coming very close to being a decompiler,
    which is a whole 'nuther problem. Not impossible, but not easy either.

    Jack

    Jatin Bhateja, Noida wrote:

    > Hi Sir
    >
    > My question is whether there are any tools available which convert the
    > machine language of one architecture to another architecture. For
    > example, say we have a compiler for language X and the target
    > language of the compiler is ARM. Are there any tools available
    > which convert or translate the code generated for ARM to some other
    > architecture, say SH? ...


  8. Re: Machine language and assembler translators?

    Jack Crenshaw wrote:

    > I have built translators like this for machine/assembly language, with
    > mixed results. Yes, it can be done. But it won't be efficient code.


    (snip)

    > Our first translator, back in the 8-bit days, involved translating 8080
    > assembly language to 6800. That's probably the worst-case scenario for
    > two reasons.


    Recently there was a discussion in another newsgroup on 8080 to 8086
    translation.

    It seems that Intel designed the 8086 to make assembly source
    translation easy. Some 8086 instructions originally existed only for
    that reason. Though it isn't necessary that one 8080 instruction
    generate one 8086 instruction.

    One interesting case is the instruction for loading the 8080 A
    register into the flags register. The equivalent 8086 instruction
    turned out to be very useful when the 8087 was designed. The 8087,
    being a separate processor, needed a way to get the flags back to the
    8086. They are stored into memory, loaded into AH, and then into the
    flags register to be used for conditional tests.
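
    The path described above can be modelled in a few lines. This is a
    sketch under assumptions: the bit positions (condition codes C0, C2
    and C3 at bits 8, 10 and 14 of the status word, landing on the carry,
    parity and zero flags) are as I recall them from Intel's
    documentation and should be checked against the manuals; the function
    name is hypothetical.

```python
# Condition-code bits in the coprocessor status word (assumed positions).
C0, C2, C3 = 1 << 8, 1 << 10, 1 << 14


def flags_from_status_word(status_word):
    """Model: store the status word, load its high byte into AH,
    then copy AH into the low byte of the flags register."""
    ah = (status_word >> 8) & 0xFF
    return {
        "CF": bool(ah & 0x01),   # AH bit 0 <- C0
        "PF": bool(ah & 0x04),   # AH bit 2 <- C2
        "ZF": bool(ah & 0x40),   # AH bit 6 <- C3
    }


# A comparison result with only C3 set shows up as the zero flag.
print(flags_from_status_word(C3))
```

    With the flags set this way, ordinary conditional jumps can test the
    outcome of a floating-point comparison.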

    Features of the Pentium 4 can be traced back to the 8080, 30 years ago.

    Then again, much of the instruction set of IBM's current
    z/Architecture came from S/360 over 40 years ago.

    -- glen

  9. Re: Machine language and assembler translators?

    glen herrmannsfeldt wrote:
    > Jatin Bhateja, Noida wrote:
    >
    > > My question is whether there are any tools available which convert the
    > > machine language of one architecture to another architecture. For
    > > example, say we have a compiler for language X and the target
    > > language of the compiler is ARM. Are there any tools available
    > > which convert or translate the code generated for ARM to some other
    > > architecture, say SH?
    >
    > (snip)
    >
    > > [Both machine language translators and assembler translators have been
    > > around approximately forever. It's straightforward except for
    > > self-modifying code, and incompatible byte order and data formats.
    > > -John]
    >
    > ...
    >
    > When Apple started selling PowerPC based Macs the new OS included the
    > ability to run 680x0 code, using some type of run time translation.


    The first 68K emulator Apple shipped was an interpreter. The second
    generation of Power Macs introduced a dynamic recompiling emulator
    (much faster). This emulator is still in the Classic runtime of current
    versions of OS X (so you can transparently run *integer* 68K code on
    any Power Mac up to the present).

    Apple plans to use a PowerPC emulator (Rosetta) to help users make the
    transition from PowerPC to the x86-based product line. I'm surprised
    they would go to so much effort, since unlike the 68K-PPC transition,
    "porting" to the x86 architecture is largely just a recompile - the
    APIs already being platform agnostic.

    Much has been written about binary translation. For instance, there is
    an informative white paper about ARDI's Syn68k:
    http://www.ardi.com/syn68k.php There is a Yahoo! group,
    http://groups.yahoo.com/group/dynarec/ which is occasionally active
    and apparently working on a project. Past posts go into some detail on
    translation technicalities.

    --Toby

  10. Re: Machine language and assembler translators?

    Jack Crenshaw wrote:
    > I have built translators like this for machine/assembly language, with
    > mixed results. Yes, it can be done. But it won't be efficient code.
    >
    > At the most primitive level, you can look at each instruction of the
    > "source" processor, and see how best to make the CPU do _EXACTLY_ the
    > same thing in the "target" processor. Once you have a translation of
    > each instruction, it's a simple matter to write a text processor that
    > will make the necessary substitutions. I've even been known to do it in
    > the macro language of a good editor.


    The original 8086 came with a utility to convert 8085 code into 8086
    code. I believe it worked by simple substitution like you describe.

    > As both John and Vit point out, your results will depend a lot on the
    > similarity between the two processors. Translating, say, the assembly
    > language of one RISC machine to another one is a piece of cake, and
    > gives good results. Translating between families is a lot harder.


    It depends. If the target processor has many more registers than the
    source processor then it's not as difficult.

    When DEC supported Windows on the Alpha they provided a program called
    FX!32 that translated x86 Windows programs statically into Alpha
    programs. The Alpha has far more registers than an x86 does, which
    probably helped.

    FX!32 achieved reasonable performance by first running the program with
    an emulator. It would examine the trace of calls made by the emulator
    and in the background the binary translator would translate into Alpha
    code the important parts of the code. The translated code is then
    stored for future runs. I think it worked at about half native speed.
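
    The general emulate-first, translate-the-hot-parts scheme can be
    sketched as below. This is not how FX!32 itself was implemented: the
    class name, the two-instruction toy machine and the threshold are
    hypothetical, translation happens synchronously rather than in the
    background, and the cache lives in memory instead of being stored on
    disk for future runs.

```python
from collections import Counter

HOT_THRESHOLD = 3   # assumed: translate a block after this many interpreted runs


class HybridRunner:
    def __init__(self, blocks):
        self.blocks = blocks        # block name -> list of (op, arg)
        self.counts = Counter()     # execution profile gathered while emulating
        self.translated = {}        # cache of translated blocks

    def _interpret(self, ops, acc):
        """Slow path: emulate the block one instruction at a time."""
        for op, arg in ops:
            if op == "ADD":
                acc += arg
            elif op == "MUL":
                acc *= arg
            else:
                raise ValueError(op)
        return acc

    def _translate(self, ops):
        """Turn a block into one Python function (stand-in for native code)."""
        expr = "acc"
        for op, arg in ops:
            sym = {"ADD": "+", "MUL": "*"}[op]
            expr = f"({expr} {sym} {arg!r})"
        return eval(f"lambda acc: {expr}")

    def run(self, name, acc):
        if name in self.translated:             # fast path: already translated
            return self.translated[name](acc)
        self.counts[name] += 1
        if self.counts[name] >= HOT_THRESHOLD:  # block is hot: translate and cache
            self.translated[name] = self._translate(self.blocks[name])
        return self._interpret(self.blocks[name], acc)


runner = HybridRunner({"loop": [("ADD", 2), ("MUL", 3)]})
results = [runner.run("loop", 1) for _ in range(5)]
print(results, "loop" in runner.translated)
```

    Only blocks that run often pay the translation cost; rarely executed
    code stays in the emulator.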

    As well as registers there are other difficulties such as instructions
    that have no equivalent on the target architecture. For example 80-bit
    x87 instructions don't exist elsewhere. Also, the target architecture
    may have no flags, or different flags to the source architecture.

