#1
Hi,
I have a J2EE application which connects to a DB2 database configured with code set IBM-850. The application works with encoding ISO-8859-1.
If I save characters outside the range supported by IBM-850 (e.g. the euro currency character EURO) then I read garbage...

I tried encoding conversions with InputStreamReader and OutputStreamWriter:

    ...
    BufferedReader reader = new BufferedReader(new InputStreamReader(source, "IBM850"));
    BufferedWriter writer = new BufferedWriter(new OutputStreamWriter(output, "ISO-8859-1"));
    ...

but that didn't work... My JVM's Charset.availableCharsets() includes IBM850.
What can I do?

Thanks in advance,
Andrea

#2
Andrea wrote:

> I have a J2EE application which connects to a DB2 configured with code
> set IBM-850. The application works with encoding ISO-8859-1.

In general, the JDBC driver is aware of the encoding the database is using and already does the conversion if you access the column with getString(columnName/index).

> I tried encoding conversions with InputStreamReader and
> OutputStreamWriter:
> ...
> BufferedReader reader = new BufferedReader(new
> InputStreamReader(source, "IBM850"));

What is source? How do you create it from the JDBC ResultSet?

> BufferedWriter writer = new BufferedWriter(new
> OutputStreamWriter(output, "ISO-8859-1"));

That looks OK.

Regards, Lothar
--
Lothar Kimmeringer E-Mail: spamfang@kimmeringer.de
PGP-encrypted mails preferred (Key-ID: 0x8BC3CD81)
Always remember: The answer is forty-two, there can only be wrong questions!

#3
Hi Lothar,

> > I have a J2EE application which connects to a DB2 configured with code
> > set IBM-850. The application works with encoding ISO-8859-1.
>
> In general the JDBC-driver is aware of the encoding, the database
> is using and is doing the conversion already if you access the
> column by getString(columnName/index).

Yes, I fetch the string with ResultSet.getString(index). I use the DB2 Universal Driver with a type 4 connection.

> > I tried encoding conversions with InputStreamReader and OutputStreamWriter:
> > ...
> > BufferedReader reader = new BufferedReader(new
> > InputStreamReader(source, "IBM850"));
>
> What is source? How do you create that from the JDBC-
> resultset?

I tried:

    InputStream source = new ByteArrayInputStream(stringFetchedFromDB.getBytes());

Thanks,
Andrea
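
A note on the snippet above: `getBytes()` with no argument encodes the string with the JVM's default charset, which is not necessarily IBM850, so reading those bytes back through an IBM850 reader can pair an encoder and a decoder that do not match. The following is only a minimal sketch of that effect. It uses UTF-8 and ISO-8859-1 as stand-ins because both are guaranteed to exist in every JVM; the class and method names are made up for illustration and are not from the thread.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class EncodingMismatchDemo {
    // Encode with one charset, then decode with another. The text only
    // survives when both charsets agree on the bytes involved.
    static String roundTrip(String text, Charset encodeWith, Charset decodeWith) {
        byte[] bytes = text.getBytes(encodeWith);
        return new String(bytes, decodeWith);
    }

    public static void main(String[] args) {
        String original = "caf\u00E9"; // "café"
        // Matching charsets: the text comes back unchanged.
        System.out.println(roundTrip(original, StandardCharsets.UTF_8, StandardCharsets.UTF_8));
        // Mismatched charsets: the two UTF-8 bytes of 'é' are read as two characters.
        System.out.println(roundTrip(original, StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1));
    }
}
```

Passing the charset explicitly on both sides, rather than relying on the platform default, makes this kind of mismatch easier to spot.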

#4
Andrea wrote:

> Hi,
> I have a J2EE application which connects to a DB2 configured with code
> set IBM-850. The application works with encoding ISO-8859-1.
> If I save characters outside the range supported by IBM-850 (i.e. the
> euro currency character EURO) then I read garbage...

Yes, the Euro symbol is not part of either encoding, so your database can't contain it. If you need it, you would have to change the database's encoding (ISO-8859-15 includes the Euro symbol).
Otherwise, you have to take care not to write unsupported characters into string/character fields.

One solution could be to parse all strings and replace the symbol with the shorthand "EUR", but that might not be acceptable to your client.
--
Sabine Dinis Blochberger
Op3racional
www.op3racional.eu
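
One way to "take care" before writing is to ask the target charset whether it can represent the text at all. This is a hedged sketch, not something proposed in the thread: the class name and the `fitsIn` helper are hypothetical, and it relies only on the standard `CharsetEncoder.canEncode` call. IBM850 is an optional charset in the JDK, so the sketch looks it up defensively instead of assuming it is installed.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class EncodableCheck {
    // True when every character of text has a representation in charset.
    static boolean fitsIn(String text, Charset charset) {
        return charset.newEncoder().canEncode(text);
    }

    public static void main(String[] args) {
        String price = "100 \u20AC"; // "100 €"
        System.out.println(fitsIn(price, StandardCharsets.ISO_8859_1)); // false
        System.out.println(fitsIn(price, StandardCharsets.UTF_8));      // true
        // IBM850 may be missing from a minimal runtime, so check first.
        if (Charset.isSupported("IBM850")) {
            System.out.println(fitsIn(price, Charset.forName("IBM850")));
        }
    }
}
```

A check like this could run before the INSERT or UPDATE, so that unsupported input is rejected or substituted (for example with "EUR") instead of being silently damaged.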

#5
> > ...
> > If I save characters outside the range supported by IBM-850 (i.e. the
> > euro currency character EURO) then I read garbage...
>
> Yes, the Euro symbol is not part of the encodings, so your database
> can't contain it.

I've found a strange thing: C and COBOL applications can write and read (using embedded SQL) characters outside the accepted range without problems... So the database can contain those characters without losing any information, but I can't understand how...

> If you need it, you would have to change the databases
> encoding (ISO-8859-15 includes the Euro symbol).
> Otherwise, you have to take care not to try to write unsupported
> character into string/character fields.
>
> One solution could be to parse all strings and replace the symbol with
> the shorthand "EUR", but it might not be acceptable to your client.

Actually the EURO character is just an example; I have more complex strings to handle (and I can't change the encoding of the database).
If my problem has no solution at all, then I'd like to understand why other languages don't have this problem...

Thanks,
Andrea

#6
Andrea wrote:

> > > ...
> > > If I save characters outside the range supported by IBM-850 (i.e. the
> > > euro currency character EURO) then I read garbage...
> >
> > Yes, the Euro symbol is not part of the encodings, so your database
> > can't contain it.
> I've found a strange thing: C and COBOL application can write and read
> (using embedded SQL) characters outside the accepted range without
> problems... So the database can contain those characters without
> loosing any information, but I can't understand how...
>

Yes, in theory you can store any value (0 - 255 in the case of one-byte strings) in a string, but how that value is interpreted (i.e. the encoding) is where it gets hairy. Also, multibyte characters would break the interpretation.

> > If you need it, you would have to change the databases
> > encoding (ISO-8859-15 includes the Euro symbol).
> > Otherwise, you have to take care not to try to write unsupported
> > character into string/character fields.
> >
> > One solution could be to parse all strings and replace the symbol with
> > the shorthand "EUR", but it might not be acceptable to your client.
> Actually the EURO character is just an example, I have more complex
> strings to handle (and I can't change the encoding of the database).
> If my problem has no solution at all then I'd like to understand why
> other languages don't have this problem...
>

Ah, there are always hacks around limitations. But they usually aren't pretty.

The problem is funnelling a string with these "unsupported" characters through the JDBC driver (both ways). You might get around it by using typeless fields (you can put any byte sequence there), like BLOBs maybe...

Or you write a parser that substitutes the impossible characters with acceptable replacements. Of course, this is most likely not feasible.

But the customer has to be aware that a database with encoding X can only hold strings encoded in X. If they need UTF-8 now, for example, they will eventually have to change their database. And it would be better to migrate to a suitable encoding than to hack around it and, in a few years, have to do it all over again (and then some) when they finally do want to change the database encoding.

As for other languages not having the problem: in C, you can treat a string just like an array of bytes and use those bytes for whatever you like; the compiler won't complain. Even interpreting them as memory addresses is possible, adding and subtracting etc...

> Thanks,
> Andrea

--
Sabine Dinis Blochberger
Op3racional
www.op3racional.eu

#7
Hi Sabine,
thank you for your explanation; the overall situation is now much clearer to me.

Thanks,
Andrea

#8
On Mon, 11 Feb 2008 04:03:47 -0800 (PST), Andrea <tol7481@iperbole.bologna.it> wrote, quoted or indirectly quoted someone who said :

>I have a J2EE application which connects to a DB2 configured with code
>set IBM-850. The application works with encoding ISO-8859-1.
>If I save characters outside the range supported by IBM-850 (i.e. the
>euro currency character EURO) then I read garbage...

First, make sure the data are truly encoded in IBM-850. See http://mindprod.com/applet/encodingrecogniser.html

If there are characters in that file outside the range of IBM-850, then by definition the file is not encoded in IBM-850 and you SHOULD expect garbage.

You can write your own translation program to handle the excess chars. See http://mindprod.com/jgloss/encoding.html

I don't know how to hook it in as an official encoding, but that is not necessary.
--
Roedy Green Canadian Mind Products
The Java Glossary
http://mindprod.com

#9
On Mon, 11 Feb 2008 04:03:47 -0800 (PST), Andrea <tol7481@iperbole.bologna.it> wrote, quoted or indirectly quoted someone who said :

>BufferedReader reader = new BufferedReader(new
>InputStreamReader(source, "IBM850"));
>BufferedWriter writer = new BufferedWriter(new
>OutputStreamWriter(output, "ISO-8859-1"));

Your first task is to find out just what you are being handed before you start fooling around with translations. Unicode, IBM850, ISO-8859-1, something else?
--
Roedy Green Canadian Mind Products
The Java Glossary
http://mindprod.com
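
A simple way to see what the driver actually hands over is to print the Unicode code points of the fetched string instead of trusting how a console or editor renders it. The helper below is a hypothetical sketch (the class and method names are not from the thread); in the real application its input would be the value returned by `ResultSet.getString(index)`.

```java
class CodePointDump {
    // Render each code point of s as U+XXXX, separated by spaces.
    static String dump(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
        });
        return sb.toString();
    }

    public static void main(String[] args) {
        // A correctly decoded euro sign shows up as U+20AC.
        System.out.println(dump("A\u20AC")); // prints "U+0041 U+20AC"
    }
}
```

If the dump shows U+20AC, the string is already correct in memory and the damage happens later, on output. Any other code point in that position means the conversion went wrong on the way in.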

#10
Hi Roedy,
the database (DB2) has this configuration:

    ...
    Database territory = US
    Database code page = 850
    Database code set = IBM-850
    ...

I've exported the content of a table with a CHAR(N) field containing the EURO currency character to a file, then I've opened the file with EncodingRecognizer: if I choose IBM850 I see a strange character (like a small X), if I choose ISO-8859-1 I see a square.

I tried a translation with:

    String problematicString = rs.getString(index);
    problematicString = new String(problematicString, "IBM850"); // Am I correct?

but I still get garbage :-(

Thanks,
Andrea
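
On the "Am I correct?" line: `java.lang.String` has no constructor that takes a `String` plus a charset name. The charset-aware constructors take a `byte[]`, because a `String` is already decoded text and only bytes can be decoded. The sketch below illustrates that distinction with hypothetical names. Whether the DB2 driver returns the unconverted column bytes through `ResultSet.getBytes(index)` for a CHAR column is an assumption that would need checking against the driver documentation.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class DecodeBytesDemo {
    // Decoding is an operation on bytes; the charset says how to read them.
    static String decode(byte[] raw, Charset charset) {
        return new String(raw, charset);
    }

    public static void main(String[] args) {
        byte[] raw = { (byte) 0xE9 }; // one byte whose meaning depends on the charset
        // In ISO-8859-1 this byte is U+00E9.
        System.out.println(decode(raw, StandardCharsets.ISO_8859_1));
        // IBM850 is optional in the JDK, so look it up defensively.
        if (Charset.isSupported("IBM850")) {
            System.out.println(decode(raw, Charset.forName("IBM850")));
        }
    }
}
```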