string conversion latin2 to ascii - Python

  1. string conversion latin2 to ascii

    Hi all,

    sorry for a newbie question. I have a unicode string (or, better said,
    a latin2-encoded string) containing non-ascii characters, e.g.

    s = "Ukázka_možnosti_využití_programu_OpenJUMP_v_SOA"

    I would like to convert this string to plain ascii (using some lookup
    table for latin2)

    to get

    -> Ukazka_moznosti_vyuziti_programu_OpenJUMP_v_SOA

    Thanks for any hints! Regards, Martin Landa

  2. Re: string conversion latin2 to ascii

    On Nov 27, 3:35 pm, Martin Landa <landa.mar...@gmail.com> wrote:
    > Hi all,
    >
    > sorry for a newbie question. I have unicode string (or better say
    > latin2 encoding) containing non-ascii characters, e.g.
    >
    > s = "Ukázka_možnosti_využití_programu_OpenJUMP_v_SOA"
    >
    > I would like to convert this string to plain ascii (using some lookup
    > table for latin2)
    >
    > to get
    >
    > -> Ukazka_moznosti_vyuziti_programu_OpenJUMP_v_SOA
    >
    > Thanks for any hints! Regards, Martin Landa


    With a little googling, I found this:

    http://www.peterbe.com/plog/unicode-to-ascii

    You might also find this article useful:

    http://www.reportlab.com/i18n/python..._tutorial.html

    Mike

  3. Re: string conversion latin2 to ascii

    > sorry for a newbie question. I have unicode string (or better say
    > latin2 encoding) containing non-ascii characters, e.g.
    >
    > s = "Ukázka_možnosti_využití_programu_OpenJUMP_v_SOA"


    That's not a Unicode string (at least in Python 2); it is
    a latin-2 encoded byte string; it has nothing to do with Unicode.
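
    To make the bytes-versus-text distinction concrete, here is a minimal
    sketch in Python 3 (an assumption; the thread itself is about Python 2),
    where the two types are separate. It only shows that the same text has
    different byte representations under ISO-8859-2 and Windows-1250:

```python
# Python 3 sketch: one text string, two different 8-bit encodings.
text = "možnosti"

latin2_bytes = text.encode("iso-8859-2")   # "ž" is byte 0xBE in latin-2
cp1250_bytes = text.encode("cp1250")       # "ž" is byte 0x9E in Windows-1250

print(latin2_bytes)
print(cp1250_bytes)

# Decoding each byte string with its own codec recovers the same text.
print(latin2_bytes.decode("iso-8859-2") == cp1250_bytes.decode("cp1250"))
```

    Which codec the bytes are really in has to be known before any
    accent-stripping table can be applied correctly.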

    > I would like to convert this string to plain ascii (using some lookup
    > table for latin2)
    >
    > to get
    >
    > -> Ukazka_moznosti_vyuziti_programu_OpenJUMP_v_SOA


    I recommend using string.translate. You need a translation
    table there, which is best generated with string.maketrans.

    import string
    # assumes s and this source file are both latin-2 encoded (Python 2)
    table = string.maketrans("áží", "azi")
    print s.translate(table)

    HTH,
    Martin
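
    For readers on Python 3, the same translation-table idea might look
    roughly like the sketch below. It assumes the input arrives as latin-2
    bytes; the names CZECH_TO_ASCII and latin2_to_ascii are made up for
    illustration, and the table covers Czech letters only, not all of latin-2:

```python
# Python 3 sketch of the translation-table approach (illustrative table only).
CZECH_TO_ASCII = str.maketrans(
    "áčďéěíňóřšťúůýžÁČĎÉĚÍŇÓŘŠŤÚŮÝŽ",
    "acdeeinorstuuyzACDEEINORSTUUYZ",
)

def latin2_to_ascii(raw: bytes) -> str:
    """Decode latin-2 bytes, then replace accented Czech letters."""
    text = raw.decode("iso-8859-2")
    return text.translate(CZECH_TO_ASCII)

# Build latin-2 input bytes for the example from the original question.
raw = "Ukázka_možnosti_využití_programu_OpenJUMP_v_SOA".encode("iso-8859-2")
print(latin2_to_ascii(raw))
```

    Characters missing from the table pass through unchanged, so the result
    is only guaranteed to be ASCII if the table covers everything in the input.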

  4. Re: string conversion latin2 to ascii

    On Nov 28, 8:45 am, kyoso...@gmail.com wrote:
    > On Nov 27, 3:35 pm, Martin Landa <landa.mar...@gmail.com> wrote:
    >
    > > Hi all,
    > >
    > > sorry for a newbie question. I have unicode string (or better say
    > > latin2 encoding) containing non-ascii characters, e.g.
    > >
    > > s = "Ukázka_možnosti_využití_programu_OpenJUMP_v_SOA"
    > >
    > > I would like to convert this string to plain ascii (using some lookup
    > > table for latin2)
    > >
    > > to get
    > >
    > > -> Ukazka_moznosti_vyuziti_programu_OpenJUMP_v_SOA
    > >
    > > Thanks for any hints! Regards, Martin Landa
    >
    > With a little googling, I found this:
    >
    > http://www.peterbe.com/plog/unicode-to-ascii


    and if the OP has the patience to read *ALL* the comments on that blog
    entry, he will find that comment[-2] points to

    http://effbot.python-hosting.com/fil...xt/unaccent.py

    and comment[-1] (from the blog owner) is "Brilliant! Thank you."

    The bottom line is that there is no universal easy solution; you need
    to handcraft a translation table suited to your particular purpose
    (e.g. do you want u-with-umlaut to become u or ue?). The
    unicodedata.normalize function is useful for off-line preparation of a
    set of candidate mappings for that table; it should not be applied
    either on-line or blindly.

    Cheers,
    John
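
    The off-line preparation step mentioned above could be sketched as
    follows, in Python 3. The helper name candidate_mappings is hypothetical;
    it only lists what unicodedata.normalize suggests for the upper half of
    ISO-8859-2, so that a person can review the list before building a table:

```python
import unicodedata

def candidate_mappings(encoding="iso-8859-2"):
    """Return {char: ascii_candidate} for bytes 0xA0-0xFF of an 8-bit codec."""
    candidates = {}
    for byte in range(0xA0, 0x100):
        ch = bytes([byte]).decode(encoding)
        # NFKD splits e.g. "ž" into "z" plus a combining caron.
        decomposed = unicodedata.normalize("NFKD", ch)
        stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
        if stripped and stripped != ch and stripped.isascii():
            candidates[ch] = stripped
    return candidates

candidates = candidate_mappings()
for ch, ascii_form in sorted(candidates.items()):
    print(f"{ch!r} -> {ascii_form!r}")
```

    The output is a starting point only: letters with no decomposition (such
    as the Polish L with stroke) do not appear at all, and u-with-umlaut is
    proposed as plain u, which may or may not be what you want.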

  5. Re: string conversion latin2 to ascii

    * Martin Landa <landa.martin@gmail.com>, 2007-11-27:
    > I have unicode string (or better say latin2 encoding) containing
    > non-ascii characters, e.g.
    >
    > s = "Ukázka_možnosti_využití_programu_OpenJUMP_v_SOA"
    >
    > I would like to convert this string to plain ascii (using some lookup
    > table for latin2)
    >
    > to get
    >
    > -> Ukazka_moznosti_vyuziti_programu_OpenJUMP_v_SOA


    You may try python-elinks
    <http://freshmeat.net/projects/python-elinks/>:


    >>> import elinks
    >>> print "Ukázka_mo\236nosti_vyu\236ití_programu_OpenJUMP_v_SOA".decode('Windows-1250').encode('ASCII', 'elinks')

    Ukazka_moznosti_vyuziti_programu_OpenJUMP_v_SOA


    --
    Jakub Wilk
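
    The second argument to encode() above is the name of a codec error
    handler. How python-elinks implements its handler is not shown in this
    thread; the Python 3 sketch below only illustrates the general mechanism
    with the standard codecs.register_error call, using a hypothetical
    handler named "unaccent":

```python
import codecs
import unicodedata

def unaccent_errors(exc):
    """Replace unencodable characters with their ASCII base letters, or '?'."""
    if not isinstance(exc, UnicodeEncodeError):
        raise exc
    bad = exc.object[exc.start:exc.end]
    decomposed = unicodedata.normalize("NFKD", bad)
    # Keep only the ASCII part of the decomposition (drops combining marks).
    replacement = "".join(c for c in decomposed if ord(c) < 128)
    return (replacement or "?", exc.end)

# "unaccent" is an illustrative name, not part of the standard library.
codecs.register_error("unaccent", unaccent_errors)

raw = b"Uk\xe1zka_mo\x9enosti_vyu\x9eit\xed_programu_OpenJUMP_v_SOA"
result = raw.decode("windows-1250").encode("ascii", "unaccent").decode("ascii")
print(result)
```

    This applies normalization blindly at run time, so the caveats from the
    previous post apply; a hand-reviewed table gives more control.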

  6. Re: string conversion latin2 to ascii

    On Nov 27, 5:08 pm, John Machin <sjmac...@lexicon.net> wrote:
    > On Nov 28, 8:45 am, kyoso...@gmail.com wrote:
    >
    > > On Nov 27, 3:35 pm, Martin Landa <landa.mar...@gmail.com> wrote:
    > >
    > > > Hi all,
    > > >
    > > > sorry for a newbie question. I have unicode string (or better say
    > > > latin2 encoding) containing non-ascii characters, e.g.
    > > >
    > > > s = "Ukázka_možnosti_využití_programu_OpenJUMP_v_SOA"
    > > >
    > > > I would like to convert this string to plain ascii (using some lookup
    > > > table for latin2)
    > > >
    > > > to get
    > > >
    > > > -> Ukazka_moznosti_vyuziti_programu_OpenJUMP_v_SOA
    > > >
    > > > Thanks for any hints! Regards, Martin Landa
    > >
    > > With a little googling, I found this:
    > >
    > > http://www.peterbe.com/plog/unicode-to-ascii
    >
    > and if the OP has the patience to read *ALL* the comments on that blog
    > entry, he will find that comment[-2] points to
    >
    > http://effbot.python-hosting.com/fil...xt/unaccent.py
    >
    > and comment[-1] (from the blog owner) is "Brilliant! Thank you."
    >
    > The bottom line is that there is no universal easy solution; you need
    > to handcraft a translation table suited to your particular purpose
    > (e.g. do you want u-with-umlaut to become u or ue?). The
    > unicodedata.normalize function is useful for off-line preparation of a
    > set of candidate mappings for that table; it should not be applied
    > either on-line or blindly.
    >
    > Cheers,
    > John


    Sorry...I didn't know about translation tables or I would have
    mentioned that instead. My bad.

    Mike
