Pre Delphi 2008-9 Unicode Do's and Don'ts

#1
07-21-2008, 12:40 PM
Lee Jenkins


Has anyone posted information concerning do's and don'ts for Unicode support in
upcoming Delphi versions?

In recent threads concerning Delphi/Unicode, I think the topic of being prepared
for Unicode has not been addressed much, at least as far as I can see.

On one side, we have applications that have already been written whose authors
are rightfully concerned about compatibility.

On the other side, we have applications which are yet to be written and do not
have much threat of being

In the middle, we have applications which are currently being written (raises
hand) and which could benefit from some suggestions on best practices, to give
them a chance of being ported more easily when D2008/9 is finally released.

--
Warm Regards,

Lee

#2
07-21-2008, 01:04 PM
Nick Hodges (Embarcadero)

Lee Jenkins wrote:

> Has anyone posted information concerning do's and don'ts for Unicode
> support in upcoming Delphi versions?


I'll be posting some articles on this very soon.

Short list:

Don't assume that the size of a Char is one.

Don't assume that the size of an array of Char is the same as the
Length of the string held in the array of Char.
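
To make the two points above concrete, here is a minimal sketch (not from Nick's articles). It assumes a console program and a hypothetical 16-element buffer named Buffer; the only claim is that the array's element count, its byte size, and the length of the string stored in it are three separate numbers.

```pascal
program CharSizeSketch;

{$APPTYPE CONSOLE}

uses
  SysUtils;

var
  // Hypothetical buffer for illustration: 16 characters, not necessarily 16 bytes.
  Buffer: array[0..15] of Char;
begin
  // Copy a short string into the buffer; StrPCopy appends the #0 terminator.
  StrPCopy(Buffer, 'Delphi');

  // Length of a static array is its element count.
  Writeln('Elements in buffer : ', Length(Buffer));
  // SizeOf is a byte count: element count times SizeOf(Char).
  Writeln('Bytes in buffer    : ', SizeOf(Buffer));
  // StrLen counts characters up to the terminator.
  Writeln('Chars in the string: ', StrLen(Buffer));

  // Routines such as FillChar take a byte count, so pass SizeOf, not Length.
  FillChar(Buffer, SizeOf(Buffer), 0);
end.
```

With a one-byte Char the first two numbers happen to match, which is why code that mixes them up works today and breaks once Char grows to two bytes.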


--
Nick Hodges
Delphi Product Manager - Embarcadero
http://blogs.codegear.com/nickhodges

#3
07-21-2008, 01:27 PM
John Herbster

>> Has anyone posted information concerning do's and don'ts for
>> Unicode support in upcoming Delphi versions?


"Nick Hodges (Embarcadero)" <nick.hodges@codegear.com> wrote
> I'll be posting some articles on this very soon.


Nick, Here are a few suggestions and clarifications.

> Short list:
> Don't assume that the size of a Char is one.


Please start with the compiler option to switch the definition of "Char".

> Don't assume that the SizeOf an array of Char is the same as the
> Length of the string held in the array of Char [less one].


Show us how to iterate through a string of characters with indexes.

Show us how to iterate through a string of characters with pointers.

Show us how to load and store a string from and to TStreams.

Show us how to replace a character.

Show us how to make literal constants and assign them to strings.

Show us how to pass strings to and from DLLs.

Regards, JohnH





#4
07-21-2008, 01:44 PM
Nick Hodges (Embarcadero)

John Herbster wrote:

>
> Show us how to iterate through a string of characters with indexes.


Exactly as before.

>
> Show us how to iterate through a string of characters with pointers.


Exactly as before -- but don't assume a character is of size 1.

> Show us how to load and store a string from and to TStreams.


Exactly as before, but you can't assume that the size of a string's Char
is 1 byte.

>
> Show us how to replace a character.


Exactly as before.

> Show us how to make literal constants and assign them to strings.


Exactly as before.

> Show us how to pass strings to and from DLLs.


Just as before, but again, don't assume that Char = 1 byte.
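
The TStream answer above can be sketched as follows. This is only an illustration under stated assumptions: WriteStringToStream and ReadStringFromStream are hypothetical helper names, and the length-prefixed layout is just one possible format. The point is that the stream calls take byte counts, so the character count is multiplied by SizeOf(Char).

```pascal
program StreamStringSketch;

{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

// Hypothetical helper: writes the character count, then the raw characters.
procedure WriteStringToStream(Stream: TStream; const S: string);
var
  Len: Integer;
begin
  Len := Length(S);                        // count in characters
  Stream.WriteBuffer(Len, SizeOf(Len));
  if Len > 0 then
    // WriteBuffer wants bytes: characters times bytes per character.
    Stream.WriteBuffer(Pointer(S)^, Len * SizeOf(Char));
end;

// Hypothetical helper: reads back what WriteStringToStream wrote.
function ReadStringFromStream(Stream: TStream): string;
var
  Len: Integer;
begin
  Stream.ReadBuffer(Len, SizeOf(Len));
  SetLength(Result, Len);                  // SetLength counts characters
  if Len > 0 then
    Stream.ReadBuffer(Pointer(Result)^, Len * SizeOf(Char));
end;

var
  Stream: TMemoryStream;
begin
  Stream := TMemoryStream.Create;
  try
    WriteStringToStream(Stream, 'Hello, Unicode');
    Stream.Position := 0;
    Writeln(ReadStringFromStream(Stream));
  finally
    Stream.Free;
  end;
end.
```

Note that data written by a build where Char is one byte will not read back correctly in a build where Char is two bytes. For files that must survive the switch, it is safer to pick an explicit string type for the on-disk format rather than plain string.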

--
Nick Hodges
Delphi Product Manager - Embarcadero
http://blogs.codegear.com/nickhodges

#5
07-21-2008, 02:24 PM
Serge Dosyukov (Dragon Soft)

1) A few functions expect PAnsiChar/PWideChar instead of
AnsiChar/WideChar (Windows API).
2) When working with the Windows API, be aware of what you are passing around
(Windows messages).
3) Use Length().
4) PChar strings are still #0-terminated, but instead of #00 you might see
#0000.

In Delphi 7:

var
  LC: Char;
  LC2: WideChar;
  LC3: AnsiChar;
begin
  // Show the size in bytes of each character type.
  ShowMessage(IntToStr(SizeOf(LC)) + ', ' + IntToStr(SizeOf(LC2)) + ', ' +
    IntToStr(SizeOf(LC3)));
end;

gives "1, 2, 1"

whereas now you may get

"2, 2, 1"

"John Herbster" <herb-sci1_AT_sbcglobal.net> wrote in message
news:4884c709$1@newsgroups.borland.com...
>> Has anyone posted information concerning do's and don'ts for
>> Unicode support in upcoming Delphi versions?




#6
07-21-2008, 02:27 PM
John Herbster


"Nick Hodges (Embarcadero)" <nick.hodges@codegear.com> wrote
> Exactly as before -- but don't assume a character is of size 1.


Thanks Nick!

>> Show us how to iterate through a string of characters with pointers.


> Exactly as before -- but don't assume a character is of size 1.


May I presume like this?
  p := @MyString[1];
  Inc(p);
where MyString: string; and p: PChar;

And how expensive are these operations during CPU execution?

TIA, JohnH

#7
07-21-2008, 02:32 PM
John Herbster


"Serge Dosyukov (Dragon Soft)" <pooh996.gmail.com> wrote
> 1) Few functions are expecting PAnsiChar/PWideChar instead of
> AnsiChar/WideChar (windows API)


What are the type names for Unicode strings and chars?
What is the SizeOf() for a Unicode char variable?
--JohnH

#8
07-21-2008, 02:45 PM
Nick Hodges (Embarcadero)

John Herbster wrote:

> May I presume like this?
>   p := @MyString[1];
>   Inc(p);
> where MyString: string; and p: PChar;


Yes -- just like before.

> And how expensive are these operations during CPU execution?


Minimal -- it's very efficient. It's pointer math, right? ;-)
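
A minimal sketch of the pointer walk discussed above, assuming a console program. CountChar is a hypothetical helper, not an RTL routine. It uses PChar(S) rather than @S[1] so that an empty string is also safe, and it relies on Inc(P) advancing by SizeOf(Char) bytes.

```pascal
program PCharWalkSketch;

{$APPTYPE CONSOLE}

uses
  SysUtils;

// Hypothetical helper: counts occurrences of Ch in S by walking a PChar.
function CountChar(const S: string; Ch: Char): Integer;
var
  P: PChar;
begin
  Result := 0;
  P := PChar(S);        // never nil; points at a #0 when S is empty
  while P^ <> #0 do     // stops at the first #0, including an embedded one
  begin
    if P^ = Ch then
      Inc(Result);
    Inc(P);             // typed pointer: moves one Char, not one byte
  end;
end;

begin
  Writeln(CountChar('banana', 'a'));
end.
```

The code that breaks is the kind that steps through the string as bytes, for example by casting the pointer to an integer and adding 1. As long as the pointer stays typed as PChar, the loop does not care how wide a Char is.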


--
Nick Hodges
Delphi Product Manager - Embarcadero
http://blogs.codegear.com/nickhodges

#9
07-21-2008, 02:47 PM
Nick Hodges (Embarcadero)

John Herbster wrote:

> What are the type names for Unicode strings and chars?



string aliases to UnicodeString
PChar aliases to PWideChar

> What is the SizeOf() for a Unicode char variable?


SizeOf(Char) is now 2.


--
Nick Hodges
Delphi Product Manager - Embarcadero
http://blogs.codegear.com/nickhodges

#10
07-21-2008, 02:58 PM
Serge Dosyukov (Dragon Soft)

http://blogs.codegear.com/nickhodges

1) string, char
2) We still sit on top of the Windows API, so "Wide strings consist of
16-bit Unicode characters". This could be different for 64-bit processors.

But "WideChar would suddenly grow in size"

http://en.wikipedia.org/wiki/Unicode
http://delphi.about.com/od/beginners/l/aa071800a.htm
http://www.codexterity.com/delphistrings.htm

As you can see from my code sample, I was getting widechar/widestring
representation: in case of char, it is a widechar, in case of the string it
is a widestring.

Rule of thumb: don't assume a specific size in bytes for a string's
representation; count its length in chars instead. Then, if you need the
exact size, multiply the length by the size of the char being stored.
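
The rule of thumb can be sketched like this, assuming a console program. StringByteSize is a hypothetical helper name used only for illustration. SetLength takes a character count while Move takes a byte count, so the conversion happens in exactly one place.

```pascal
program ByteSizeSketch;

{$APPTYPE CONSOLE}

uses
  SysUtils;

// Hypothetical helper: size of the character data in bytes (terminator excluded).
function StringByteSize(const S: string): Integer;
begin
  Result := Length(S) * SizeOf(Char);
end;

var
  Source, Target: string;
begin
  Source := 'Delphi';
  SetLength(Target, Length(Source));   // SetLength counts characters
  // Move counts bytes, so convert the length before copying.
  Move(Pointer(Source)^, Pointer(Target)^, StringByteSize(Source));
  Writeln(Target, ' occupies ', StringByteSize(Target), ' bytes');
end.
```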

"John Herbster" <herb-sci1_AT_sbcglobal.net> wrote in message
news:4884d62c@newsgroups.borland.com...

"Serge Dosyukov (Dragon Soft)" <pooh996.gmail.com> wrote
> 1) Few functions are expecting PAnsiChar/PWideChar instead of
> AnsiChar/WideChar (windows API)


What are the type names for Unicode strings and chars?
What is the SizeOf() for a Unicode char variable?
--JohnH

