Writing a compiler - Compilers
-
Re: compiling C++ to C, was writing a compiler
Marco van de Voort wrote:
> On 2008-11-03, Alex Colvin <alexc@TheWorld.com> wrote:
> > Runtime and linker features such as exceptions and template instantiation
> > would be very difficult to generate in C.
> Exceptions are harder. Sure, it is doable in C, but would the resulting,
> stack unwinding C be portable to a different compiler (IOW free of (ab)use of
> specific ABI knowledge )?
I've implemented Windows SEH-style exception handling (try / except /
finally) using C setjmp and longjmp, and I'm sure I'm not alone. C++ EH
can be built on top of that.
It's not particularly pretty, but it's portable insofar as setjmp and
longjmp are portable, provided you, e.g., discard register attributes
for locals.
-- Barry
--
http://barrkel.blogspot.com/
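A minimal sketch of the setjmp / longjmp approach described above. The post does not show its code, so every name here (eh_frame, eh_top, eh_throw, try_div, div_or_throw) is hypothetical. It assumes a single thread and integer exception codes. The volatile locals correspond to the remark about register attributes: a local that is changed between setjmp and longjmp has an indeterminate value afterwards unless it is volatile.

```c
#include <setjmp.h>
#include <stdlib.h>

/* All names are hypothetical; this is a sketch, not the poster's code. */
enum { EH_DIV_BY_ZERO = 1 };

typedef struct eh_frame {
    jmp_buf env;              /* where to resume when something is thrown */
    struct eh_frame *prev;    /* enclosing try block, if any */
} eh_frame;

eh_frame *eh_top = NULL;      /* innermost active try block */
int eh_code = 0;              /* code of the exception in flight */

void eh_throw(int code)
{
    eh_frame *frame = eh_top;
    if (frame == NULL)
        abort();              /* uncaught exception */
    eh_top = frame->prev;     /* pop first, so a handler can rethrow */
    eh_code = code;
    longjmp(frame->env, 1);
}

int div_or_throw(int a, int b)
{
    if (b == 0)
        eh_throw(EH_DIV_BY_ZERO);   /* unwinds out of this call */
    return a / b;
}

int try_div(int a, int b, int *finally_runs)
{
    eh_frame frame;
    /* volatile: written between setjmp and longjmp, read afterwards */
    volatile int result = 0;
    volatile int pending = 0;

    frame.prev = eh_top;
    eh_top = &frame;
    if (setjmp(frame.env) == 0) {
        result = div_or_throw(a, b);   /* try */
        eh_top = frame.prev;           /* normal exit: pop the frame */
    } else if (eh_code == EH_DIV_BY_ZERO) {
        result = -1;                   /* except: handled here */
    } else {
        pending = eh_code;             /* not ours: rethrow after finally */
    }
    ++*finally_runs;                   /* finally: runs on every path */
    if (pending != 0)
        eh_throw(pending);
    return result;
}
```

Calling try_div(1, 0, &n) takes the except path and still runs the finally part. A real translator would hide the frame bookkeeping behind macros or generate it directly.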
-
Re: compiling C++ to C, was writing a compiler
Marco van de Voort wrote:
> On 2008-11-03, Alex Colvin <alexc@TheWorld.com> wrote:
>>> Tony said, "From what I've read, it seems that CFront couldn't implement
>>> the whole language."
>>> I said I thought it could, at least theoretically, although modern
>>> compilers which implement the current standard all compile to machine code.
>> Runtime and linker features such as exceptions and template instantiation
>> would be very difficult to generate in C.
>
> Templates are no problem, the C++ compiler instantiates them, and writes
> them out in C code.
>
> Exceptions are harder. Sure, it is doable in C, but would the resulting,
> stack unwinding C be portable to a different compiler (IOW free of (ab)use of
> specific ABI knowledge )?
>
The Wikipedia entry on cfront has this to say about exceptions:
"Cfront 4.0 was abandoned after a failed attempt to add exception
support, and it appears it is no longer commercially available. The C++
language had grown beyond its capabilities, however a compiler with
similar approach became available later, namely Comeau C/C++, which is
considered highly standard compliant."
And on the Comeau website,
http://www.comeaucomputing.com/faqs/...html#ccompiler
says this:
"The C compiler is used merely and only for the sake of obtaining native
code generation. This means that Comeau C++ is tailored for use with
specific C compilers on each respective platform. Please note that it is
a requirement that tailoring must be done by Comeau. Otherwise, the
generated C code is meaningless as it is tied to a specific platform
(where platform includes at least the CPU, OS, and C compiler) and
furthermore, the generated C code is not standalone. Therefore, it
cannot be used by itself (note that this is both a technical and legal
requirement when using Comeau C++), and this is why there is not
normally an option to see the generated C code: it's almost always
unhelpful and the compile process, including its generation, should be
considered as internal phases of translation."
So it sounds like Tony and Marco are basically right: cfront as we knew
it can't do exceptions.
Templates are a challenge for any compiler, cfront or not. Back in 1994
and 1995, I was making a living porting C++ code to DEC C++ (which at
the time didn't support automatic template instantiation) and to Sun C++
(whose old cfront compiler did it well but whose brand new native
compiler did it badly). I eventually wrote something using nm and perl
which kept track of which templates were needed and compiled them (with
C++), iterating until there were no more uninstantiated templates left.
I think I still have a copy of that code somewhere.
Louis
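For the simpler half of the quoted claim (the C++ compiler instantiates templates and writes them out as C), here is a small sketch of what such output might look like. The function names and the naming scheme are assumptions for illustration only; they are not what cfront or Comeau actually emit.

```c
/* C++ source being translated (shown for reference only):
 *
 *   template <class T> T max_of(T a, T b) { return a < b ? b : a; }
 *
 * used with T = int and T = double.
 */

/* One plain C function per instantiation, with the type argument
 * encoded in the (hypothetical) name. */
int max_of_int(int a, int b)
{
    return a < b ? b : a;
}

double max_of_double(double a, double b)
{
    return a < b ? b : a;
}
```

The hard part described in the post is not emitting these functions but deciding which instantiations are needed across translation units, and emitting each one only once.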
-
Re: Writing a compiler
On 2008-11-05, George Neuner <gneuner2@comcast.net> wrote:
>>> I see no reason why cfront couldn't implement all of C++; theoretically,
>>> there's no difference between generating C or assembler code.
>>This is not true if e.g. all chars that can be used in C identifiers are
>>also valid chars in C++. In assembler, usually a lot more special chars ($,@
>>often) can be used to separate the parts in a mangled name.
>
> That doesn't matter. The standard does not specify how to mangle
> names and, in fact, nearly every compiler does it differently.
Indeed, I obviously forgot the escape-based solution.
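A minimal sketch of one escape-based scheme, to show why the lack of spare characters in C identifiers is not a real obstacle. The scheme is an assumption made up for this example (it is not any compiler's actual mangling): every underscore in the output starts a two-character escape, "_U" for a literal underscore and "_S" for a scope separator, and the result is prefixed with "M" so that it never starts with an underscore.

```c
#include <stddef.h>

/* Hypothetical mangler: joins the parts of a qualified name into one
 * C identifier. Returns the length written (without the NUL), or 0 if
 * the output buffer is too small. */
size_t mangle(const char *const *parts, size_t nparts,
              char *out, size_t outsize)
{
    size_t n = 0;
    size_t i;
    const char *p;

    if (outsize < 2)
        return 0;
    out[n++] = 'M';
    for (i = 0; i < nparts; i++) {
        if (i > 0) {                       /* scope separator */
            if (n + 2 >= outsize)
                return 0;
            out[n++] = '_';
            out[n++] = 'S';
        }
        for (p = parts[i]; *p != '\0'; p++) {
            if (*p == '_') {               /* escape a literal underscore */
                if (n + 2 >= outsize)
                    return 0;
                out[n++] = '_';
                out[n++] = 'U';
            } else {
                if (n + 1 >= outsize)
                    return 0;
                out[n++] = *p;
            }
        }
    }
    out[n] = '\0';
    return n;
}
```

With this scheme the parts { "my_ns", "Stack", "push" } and { "my", "ns_Stack", "push" } map to different identifiers, which a naive join with "_" would not guarantee.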
-
Re: using C as intermediate code, was Writing a compiler
On 30 Oct., 05:47, andresj <andresjriof...@gmail.com> wrote:
> On Oct 26, 4:57 am, Nick <Ibeam2...@gmail.com> wrote:
>
> > If I can make a suggestion, use C or C++ as target language. Here
> > you don't have to reinvent subroutine calling and the like, and you
> > maintain compatibility with other things on the OS. Not to mention
> > ease of moving around different OSes. And troubleshooting. Much
> > easier.
> > [Quite a reasonable idea unless your plan was to learn about code
> > generation. -John]
>
> That is a reasonable idea, thanks. :-) The only problem is that there
> would be a dependency on a complicated C or C++ compiler, which might
> make the language more difficult to manage.
That depends on how many compiler-specific (as opposed to
language-defined) features are used. If you restrict yourself to, e.g.,
C89, there are not many differences between C compilers.
I found the following differences:
- Empty structs are not allowed in some C compilers (e.g. in the
MSVC C compiler).
- The number of parenthesis nesting levels is limited (64 for older
and 256 for newer MSVC C compilers). Naturally, such limits are
very bad when C programs are generated.
- Floating point NaN and Infinity may be used, or instead an
exception may be raised or the whole program terminated.
- Handling of integer exceptions (POSIX uses the SIGFPE signal for
integer division by zero, while MSVC uses its structured
(unportable) exceptions).
- Automatic casting between integers (long) and pointers (most Unix
compilers, including GCC, and also MSVC issue warnings for such
casts but allow them. Borland's bcc32 issues errors and
forbids such implicit casts).
- The number of significant characters in an identifier may cause
problems. For internal identifiers, 31 characters are significant
according to the C89 standard. But for identifiers with
external linkage as few as 6 characters are significant, and
even case distinctions may be ignored (so far I have not
found a linker that enforces this 6 character limit).
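Two of the points above can be sidestepped in generated code, as the following sketch shows. It is only an illustration under assumed names (empty_record, div_status, safe_div); it is not taken from Seed7. The empty struct gets a dummy member, and division is checked before it is executed, so neither SIGFPE nor a structured exception is ever raised.

```c
#include <limits.h>

/* An empty record gets a dummy member, because some C compilers
 * reject a struct without members. */
struct empty_record {
    char unused;
};

typedef enum { DIV_OK, DIV_BY_ZERO, DIV_OVERFLOW } div_status;

/* Check the operands first instead of relying on how the platform
 * reports an integer exception. */
div_status safe_div(long a, long b, long *quotient)
{
    if (b == 0)
        return DIV_BY_ZERO;
    if (a == LONG_MIN && b == -1)
        return DIV_OVERFLOW;    /* result not representable in long */
    *quotient = a / b;
    return DIV_OK;
}
```

The price is an extra test per division, which a code generator can omit when the divisor is a nonzero constant.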
Additionally, there are other differences you may have to deal with:
- Little or big endian representation. Note that it is possible
to write C code which is independent of the endianness of the
machine.
- The representation of negative integers (ones' or two's
complement).
- Integer division is, according to the C89 standard, allowed to
truncate towards zero or towards minus infinity (all C compilers
that I know of truncate integer division towards zero).
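A sketch of how generated code can stay independent of two of these differences. The function names are made up for this example. The byte functions use only shifts and masks on values, so they never look at how the machine lays out an integer in memory. The division function relies on the C89 guarantee that (a/b)*b + a%b equals a, and corrects the quotient if the compiler rounded towards minus infinity. The caller is assumed to have excluded division by zero and overflow.

```c
/* Store and load a 32-bit value as four little-endian bytes,
 * whatever the byte order of the host is. */
void put_u32_le(unsigned char *buf, unsigned long v)
{
    buf[0] = (unsigned char) (v & 0xFF);
    buf[1] = (unsigned char) ((v >> 8) & 0xFF);
    buf[2] = (unsigned char) ((v >> 16) & 0xFF);
    buf[3] = (unsigned char) ((v >> 24) & 0xFF);
}

unsigned long get_u32_le(const unsigned char *buf)
{
    return (unsigned long) buf[0]
         | ((unsigned long) buf[1] << 8)
         | ((unsigned long) buf[2] << 16)
         | ((unsigned long) buf[3] << 24);
}

/* Division that truncates towards zero on every conforming compiler.
 * Precondition: b != 0 and the quotient is representable. */
long div_trunc(long a, long b)
{
    long q = a / b;
    long r = a % b;

    /* With truncation the remainder is zero or has the sign of a.
     * A remainder with the other sign means the compiler rounded
     * towards minus infinity, so the quotient is one too small. */
    if (r != 0 && ((r < 0) != (a < 0)))
        q += 1;
    return q;
}
```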
The differences between the libraries provided by different
compilers / operating systems are much bigger. Not every programming
language tries to deal with library / OS differences.
Some library / OS differences are:
- Unicode support with wide chars or with UTF-8 (this has an
influence on many library / OS interfaces).
- Reading directory contents with opendir() and readdir() or with
findfirst() and findnext().
- mkdir() with one or two parameters.
- Path delimiter / or \ .
- File permissions (Unix and Windows permissions are different).
- Console access (terminfo, termcap, curses, Windows console).
- File seeks for files larger than 2 (or 4) gigabytes.
- Winsockets or real Unix sockets.
- Graphics libraries (X11, gl, gdi, directx).
Usually C uses macros to handle compiler / library differences.
In the Seed7 interpreter, the Seed7 to C compiler and the Seed7
libraries, several defines and driver libraries are used to handle
C compiler and library differences. These defines are explained
in the file seed7/src/read_me.txt (in the Seed7 release).
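As an illustration of the macro approach (the macro and function names below are invented for this example and are not the defines used by Seed7), the mkdir() and path delimiter differences from the list above could be handled like this. It assumes that _WIN32 is predefined by Windows compilers and that _mkdir() from <direct.h> is available there.

```c
#include <string.h>

#if defined(_WIN32)
#  include <direct.h>
#  define PATH_DELIMITER '\\'
#  define MAKE_DIR(path) _mkdir(path)           /* one parameter */
#else
#  include <sys/types.h>
#  include <sys/stat.h>
#  define PATH_DELIMITER '/'
#  define MAKE_DIR(path) mkdir((path), 0777)    /* two parameters */
#endif

/* Join a directory and a file name with the delimiter of the platform.
 * Returns 0 on success and -1 if the buffer is too small. */
int join_path(char *out, size_t outsize, const char *dir, const char *name)
{
    size_t dlen = strlen(dir);
    size_t nlen = strlen(name);

    if (dlen + 1 + nlen + 1 > outsize)
        return -1;
    memcpy(out, dir, dlen);
    out[dlen] = PATH_DELIMITER;
    memcpy(out + dlen + 1, name, nlen + 1);     /* copies the NUL too */
    return 0;
}
```

The rest of the program then calls MAKE_DIR() and join_path() and never mentions the platform again.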
Greetings Thomas Mertes
Seed7 Homepage: http://seed7.sourceforge.net
Seed7 - The extensible programming language: User defined statements
and operators, abstract data types, templates without special
syntax, OO with interfaces and multiple dispatch, statically typed,
interpreted or compiled, portable, runs under linux/unix/windows.
-
Re: Writing a compiler
On 20 Oct, 23:20, andresj <andresjriof...@gmail.com> wrote:
> Hello, I am designing a programming language that will run natively.
> That is to say, it will run like C, without a need of an interpreter
> nor a standard library (well, there will be, but I want to be able to
> write an operating system, too.)
Unless someone else has pointed these out already you may want to take
a look at these other newsgroups:
alt.os.development - re. writing an operating system
comp.lang.misc - re. designing a language
--
James
-
Re: Writing a compiler
On 23 Oct, 13:25, torb...@pc-003.diku.dk (Torben Ægidius Mogensen) wrote:
....
> You can download my book "Basics of Compiler Design" for free
> from http://www.diku.dk/~torbenm/~Basics. It has chapters on type checking
> and code generation, which should help you to get all the way from
> syntax tree to symbolic assembly language. I suggest you use existing
> assemblers and linkers for generating binaries.
....
I think that link should be
http://www.diku.dk/~torbenm/Basics
--
James