TDS and character encoding

  #1  
Old 08-30-2007, 10:26 AM
raymond_b_jimenez@yahoo.com
Guest
 
Default TDS and character encoding

I've seen a dump of the TDS traffic going from my webserver to the SQL
Server database and it seems encoded in Unicode (it has two bytes per
char). Seems it would have a huge impact on performance if it
travelled in one byte. Why might this be?

rj

  #2  
Old 08-30-2007, 10:59 AM
Bob Barrows [MVP]
Guest
 
Default Re: TDS and character encoding

raymond_b_jimenez@yahoo.com wrote:
> I've seen a dump of the TDS traffic going from my webserver to the SQL
> Server database and it seems encoded in Unicode (it has two bytes per
> char). Seems it would have a huge impact on performance if it
> travelled in one byte. Why might this be?
>

This seems to have nothing at all to do with classic ADO. Please remove
this newsgroup from future crossposts.

--
Microsoft MVP -- ASP/ASP.NET
Please reply to the newsgroup. The email account listed in my From
header is my spam trap, so I don't check it very often. You will get a
quicker response by posting to the newsgroup.


  #3  
Old 08-30-2007, 05:31 PM
Erland Sommarskog
Guest
 
Default Re: TDS and character encoding

(raymond_b_jimenez@yahoo.com) writes:
> I've seen a dump of the TDS traffic going from my webserver to the SQL
> Server database and it seems encoded in Unicode (it has two bytes per
> char). Seems it would have a huge impact on performance if it
> travelled in one byte. Why might this be?


I have never eavesdropped on TDS, but Unicode is indeed the character
set of SQL Server. You are perfectly able to name your tables in
Cyrillic or Hindi characters if you feel like it. And of course character
strings may include all sorts of characters. So a batch of SQL statements
that is sent over the wire must be Unicode. That is beyond dispute.

However, you don't encode something in Unicode. Unicode is the character
set, and there are several encodings available, of which the most popular
are UTF-16 and UTF-8. In UTF-16 each character in the base plane takes up
2 bytes, and characters beyond that take up 4 bytes. (The base plane
covers the vast majority of living languages.) In UTF-8, ASCII characters
take up one byte, other characters in the Latin, Greek and Cyrillic
scripts take two bytes, and Chinese and Japanese characters take up three
bytes.
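
The per-character sizes are easy to check for yourself. Below is a minimal
Python sketch; the sample characters and the helper name encoded_sizes are
my own choices for illustration, and it measures only the encoded length of
each character, not anything TDS-specific.

```python
# Byte cost of a few sample characters in UTF-8 and UTF-16.
# "utf-16-le" is used so that no byte order mark is added to the count.
samples = {
    "ASCII 'A'": "A",
    "Latin 'é'": "é",
    "Cyrillic 'Ж'": "Ж",
    "Chinese '中'": "中",
    "Beyond the base plane '😀'": "😀",
}


def encoded_sizes(text):
    """Return (utf8_bytes, utf16_bytes) for the given string."""
    return len(text.encode("utf-8")), len(text.encode("utf-16-le"))


for label, ch in samples.items():
    utf8_len, utf16_len = encoded_sizes(ch)
    print(f"{label}: UTF-8 = {utf8_len} bytes, UTF-16 = {utf16_len} bytes")
```

Only the ASCII character is smaller in UTF-8; the Chinese character is
smaller in UTF-16, and the character outside the base plane costs four
bytes either way.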

SQL Server uses UTF-16 exclusively. It is true that for network traffic
in the western world, it would be more efficient if TDS used UTF-8, but
as you can see, that is not necessarily the case in the Far East. And had
TDS used UTF-8, both ends of the wire would have had to convert to and
from UTF-16, so any reduction in network traffic could be eaten up by
extra CPU time.
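
To put rough numbers on that trade-off, here is a small Python sketch that
compares the encoded size of two payloads. The SQL text, the table and
column names, and the Japanese string are hypothetical examples, and the
sketch ignores TDS packet headers and everything else on the wire.

```python
def wire_sizes(text):
    """Return the encoded size in bytes of text under UTF-8 and UTF-16."""
    return {
        "utf-8": len(text.encode("utf-8")),
        "utf-16": len(text.encode("utf-16-le")),  # little-endian, no BOM
    }


# Hypothetical payloads, chosen only for illustration.
western_batch = "SELECT name FROM customers WHERE city = 'London'"
far_east_value = "東京都新宿区"

for label, text in [("western batch", western_batch),
                    ("Far East value", far_east_value)]:
    sizes = wire_sizes(text)
    print(f"{label}: {len(text)} chars, "
          f"UTF-8 = {sizes['utf-8']} bytes, UTF-16 = {sizes['utf-16']} bytes")
```

The pure-ASCII batch doubles in size under UTF-16, while the six-character
Japanese string is 12 bytes in UTF-16 against 18 in UTF-8. The CPU cost of
converting at both ends is not measured here.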


--
Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pro...ads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinf...ons/books.mspx
  #5  
Old 08-30-2007, 06:16 PM
William Vaughn
Guest
 
Default Re: TDS and character encoding

Snooping into the TDS would be the very last place I would look when trying
to improve performance. It would be like polishing a clean mirror to remove
one's zits.

--
____________________________________
William (Bill) Vaughn
Author, Mentor, Consultant, Dad, Grandpa
Microsoft MVP
INETA Speaker
www.betav.com
www.betav.com/blog/billva
Please reply only to the newsgroup so that others can benefit.
This posting is provided "AS IS" with no warranties, and confers no rights.
__________________________________
Visit www.hitchhikerguides.net to get more information on my latest book:
Hitchhiker's Guide to Visual Studio and SQL Server (7th Edition)
and Hitchhiker's Guide to SQL Server 2005 Compact Edition (EBook)
-----------------------------------------------------------------------------------------------------------------------

"Erland Sommarskog" <esquel@sommarskog.se> wrote in message
news:Xns999CEFDAB2FB4Yazorman@127.0.0.1...
> (raymond_b_jimenez@yahoo.com) writes:
>> I've seen a dump of the TDS traffic going from my webserver to the SQL
>> Server database and it seems encoded in Unicode (it has two bytes per
>> char). Seems it would have a huge impact on performance if it
>> travelled in one byte. Why might this be?

>
> I have never eavesdropped on TDS, but Unicode is indeed the character
> set of SQL Server. You are perfectly able to name your tables in
> Cyrillic or Hindi characters if you feel like it. And of course character
> strings may include all sorts of characters. So a batch of SQL statements
> that is sent over the wire must be Unicode. That is beyond dispute.
>
> However, you don't encode something in Unicode. Unicode is the character
> set, and there are several encodings available, of which the most popular
> are UTF-16 and UTF-8. In UTF-16 each character in the base plane takes up
> 2 bytes, and characters beyond that take up 4 bytes. (The base plane
> covers the vast majority of living languages.) In UTF-8, ASCII characters
> take up one byte, other characters in the Latin, Greek and Cyrillic
> scripts take two bytes, and Chinese and Japanese characters take up three
> bytes.
>
> SQL Server uses UTF-16 exclusively. It is true that for network traffic
> in the western world, it would be more efficient if TDS used UTF-8, but
> as you can see, that is not necessarily the case in the Far East. And had
> TDS used UTF-8, both ends of the wire would have had to convert to and
> from UTF-16, so any reduction in network traffic could be eaten up by
> extra CPU time.
>
>
> --
> Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se
>
> Books Online for SQL Server 2005 at
> http://www.microsoft.com/technet/pro...ads/books.mspx
> Books Online for SQL Server 2000 at
> http://www.microsoft.com/sql/prodinf...ons/books.mspx


  #7  
Old 08-31-2007, 12:50 PM
raymond_b_jimenez@yahoo.com
Guest
 
Default Re: TDS and character encoding

Well William, that is clearly not the case when you have a REAL
database with REAL traffic. By REAL, I mean a 25Mbps stream
between the IIS servers and SQL Server... Getting rid of about
10Mbps of unneeded traffic does not seem like polishing to me...
I can guarantee you that this is having a serious impact on performance,
and when you really dig into it (things like TCP/IP slow starts...),
you really get to know why it has a huge impact on the
client, the DB server and performance.
rj

On 30 Aug, 23:16, "William Vaughn" <billvaNoS...@betav.com> wrote:
> Snooping into the TDS would be the very last place I would look when trying
> to improve performance. It would be like polishing a clean mirror to remove
> one's zits.
>
> --
> ____________________________________
> William (Bill) Vaughn
> Author, Mentor, Consultant, Dad, Grandpa
> Microsoft MVP
> INETA Speaker
> www.betav.com
> www.betav.com/blog/billva
> Please reply only to the newsgroup so that others can benefit.
> This posting is provided "AS IS" with no warranties, and confers no rights.
> __________________________________
>


  #9  
Old 08-31-2007, 01:54 PM
William (Bill) Vaughn
Guest
 
Default Re: TDS and character encoding

Given that SQL Server has the highest TPC-E benchmarks in the industry,
don't you think that the SQL Server team has made the TDS stream as
efficient as possible? IMHO, it's not the line protocol or the lowest layers
of the interface that should be the focus of performance tuning, but the
applications, database designs and query methodologies that should dominate
your attempts to improve throughput and scalability. If you really have to
move that much volume over the wire, reducing the traffic on the TDS channel
will go a long way toward improving performance.

SQL Server Holds Record for TPC-E Database Benchmark
by Brian Moran, brian@solidqualitylearning.com

SQL Server now holds every conceivable world record for the TPC-E database
benchmark. That news would be slightly more impressive if TPC-E scores
existed for any database besides SQL Server, but heck, winning a race with
just one runner doesn't mean that runner did a bad job. I first wrote about
TPC-E, the latest benchmark from the Transaction Processing Performance
Council, in my commentary "TPC's New Benchmark Strives for Realism," October
2006, InstantDoc ID 93955.

Microsoft became the first database vendor to have a published TPC-E result
when Unisys published a TPC-E score on July 12 using SQL Server 2005 on a
dual-core 16-processor ES7000. IBM followed suit with a dual-core
2-processor server two weeks later, and Dell posted a dual-core 4-processor
result on August 24. Both IBM's and Dell's results used SQL Server, so SQL
Server is currently the only database vendor listed, meaning SQL Server
currently holds all the top scores. Sane vendors don't post TPC-E scores
that make them look bad, but I suspect it's only a matter of time before IBM
and Oracle post TPC-E scores for their database products that leapfrog the
latest SQL Server scores, which will in turn be bested by Microsoft in the
never-ending game of benchmark leapfrog.
Read the full article at:
http://lists.sqlmag.com/t?ctl=642B5:...B50D3688BDE645



--
____________________________________
William (Bill) Vaughn
Author, Mentor, Consultant
Microsoft MVP
INETA Speaker
www.betav.com/blog/billva
www.betav.com
Please reply only to the newsgroup so that others can benefit.
This posting is provided "AS IS" with no warranties, and confers no rights.
__________________________________
Visit www.hitchhikerguides.net to get more information on my latest book:
Hitchhiker's Guide to Visual Studio and SQL Server (7th Edition)
and Hitchhiker's Guide to SQL Server 2005 Compact Edition (EBook)
-----------------------------------------------------------------------------------------------------------------------
<raymond_b_jimenez@yahoo.com> wrote in message
news:1188578980.032571.55290@g4g2000hsf.googlegroups.com...
> Well William, that is clearly not the case where you have a REAL
> database with REAL traffic. When I mean REAL, I mean a 25Mbps stream
> between the IIS servers and SQL Server... Getting away from about
> 10Mbps of unneeded traffic does not seem like polishing to me...
> I can guarantee you that this is having serious impact on performance,
> and when you're digging really into it (things like TCP/IP slow-
> starts...), you really get to know why it's huge impact for the
> client, the DB server and performance.
> rj
>
> On 30 Aug, 23:16, "William Vaughn" <billvaNoS...@betav.com> wrote:
>> Snooping into the TDS would be the very last place I would look when trying
>> to improve performance. It would be like polishing a clean mirror to remove
>> one's zits.
>>
>> --
>> ____________________________________
>> William (Bill) Vaughn
>> Author, Mentor, Consultant, Dad, Grandpa
>> Microsoft MVP
>> INETA Speaker
>> www.betav.com
>> www.betav.com/blog/billva
>> Please reply only to the newsgroup so that others can benefit.
>> This posting is provided "AS IS" with no warranties, and confers no rights.
>> __________________________________
