ANNOUNCE: Text::CSV_XS 0.32 - Perl



ANNOUNCE: Text::CSV_XS 0.32

  1. ANNOUNCE: Text::CSV_XS 0.32

    The following report has been written by the PAUSE namespace indexer.
    Please contact modules@perl.org if there are any open questions.
    Id: mldistwatch 925 2007-09-16 15:41:11Z k

    User: HMBRAND (H.Merijn Brand)
    Distribution file: Text-CSV_XS-0.32.tgz
    Number of files: 28
    *.pm files: 1
    README: Text-CSV_XS-0.32/README
    META.yml: Text-CSV_XS-0.32/META.yml
    Timestamp of file: Wed Oct 24 11:26:57 2007 UTC
    Time of this run: Wed Oct 24 11:28:25 2007 UTC

    2007-10-24 0.32 - H.Merijn Brand <h.m.brand@xs4all.nl>

    * Added $csv->error_diag () to SYNOPSIS
    * Added need for diag when new () fails to TODO
    * Fixed a sneaked-in defined or in examples/csv2xls
    * Plugged a 32byte memory leak in the cache code (valgrind++)
    * Some perlcritic level1 changes

    2007-07-23 0.31 - H.Merijn Brand <h.m.brand@xs4all.nl>

    * Removed prototypes in examples/csv2xls
    * Improved usage for examples/csv2xls (GetOpt::Long now does
    --help/-?)
    * Extended examples/csv2xls to deal with Unicode (-u)
    * Serious bug in Text::CSV_XS::NV () type setting, causing the
    resulting field to be truncated to IV

    2007-06-21 0.30 - H.Merijn Brand <h.m.brand@xs4all.nl>

    * ,\rx, is definitely an error without binary (used to HANG!)
    * Fixed bug in attribute caching for undefined eol
    * Cleaned up some code after -W*** warnings
    * Added verbatim.
    * More tests to cover the really dark corners and edge cases
    * Even more typo fixes in the docs
    * Added error_diag ()
    * Added t/80_diag.t - Will not be mirrored by Text::CSV_PP
    * Added DIAGNOSTICS section to pod - Will grow
    * Small pod nit (abeltje)
    * Doc fix in TODO (Miller Hall)

  2. Re: ANNOUNCE: Text::CSV_XS 0.32

    H.Merijn Brand wrote:
    > The following report has been written by the PAUSE namespace indexer.
    > Please contact modules@perl.org if there are any open questions.
    > Id: mldistwatch 925 2007-09-16 15:41:11Z k
    >
    > User: HMBRAND (H.Merijn Brand)
    > Distribution file: Text-CSV_XS-0.32.tgz


    Well, I'm pleased to see you here :-)
    I tried to use your module Text::CSV_XS to store some data in a CSV file,
    but without success. The problem is national characters. When I tried
    $csv->combine('abc','áíá','def') I got "abc\n" only. Your module fails on
    the first field that contains anything greater than \x7f. But there is no
    error and no warning. Is this a bug or a feature?

    --

    Petr Vileta, Czech republic
    (My server rejects all messages from Yahoo and Hotmail. Send me your mail
    from another non-spammer site please.)



  3. Re: ANNOUNCE: Text::CSV_XS 0.32

    On 10/24/2007 11:59 PM, Petr Vileta wrote:
    > H.Merijn Brand wrote:
    >> The following report has been written by the PAUSE namespace indexer.
    >> Please contact modules@perl.org if there are any open questions.
    >> Id: mldistwatch 925 2007-09-16 15:41:11Z k
    >>
    >> User: HMBRAND (H.Merijn Brand)
    >> Distribution file: Text-CSV_XS-0.32.tgz

    >
    > Well, I'm pleased to see you here :-)
    > I tried to use your module Text::CSV_XS to store some data in a CSV
    > file, but without success. The problem is national characters. When I
    > tried $csv->combine('abc','áíá','def') I got "abc\n" only. Your module
    > fails on the first field that contains anything greater than \x7f. But
    > there is no error and no warning.
    > Is this a bug or a feature?
    >


    This sort-of works for me:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use encoding 'iso-8859-1';
    use Text::CSV_XS 0.32;

    print "Version = $Text::CSV_XS::VERSION\n";

    my $csv = Text::CSV_XS->new({binary => 1});
    $csv->combine('abc','áíá','def') or warn("problem: $!\n");
    print $csv->string(), "\n";

    __END__

    However, the output seems to be forced to UTF-8:

    Version = 0.32
    abc,áíá,def

    The above is properly interpreted in utf-8 as this:

    Version = 0.32
    abc,áíá,def

    So Text::CSV_XS seems to ignore both the script encoding and the locale.
    I had set LANG=en_US.ISO-8859-1 in Linux before running the script.

    And no error message is placed into $! upon error. I know, this is in
    the TODO section :-)


  4. Re: ANNOUNCE: Text::CSV_XS 0.32

    On Thu, 25 Oct 2007 06:59:01 +0200, Petr Vileta <stoupa@practisoft.cz>
    wrote:

    > H.Merijn Brand wrote:
    >> The following report has been written by the PAUSE namespace indexer.
    >> Please contact modules@perl.org if there are any open questions.
    >> Id: mldistwatch 925 2007-09-16 15:41:11Z k
    >>
    >> User: HMBRAND (H.Merijn Brand)
    >> Distribution file: Text-CSV_XS-0.32.tgz

    >
    > Well, I'm pleased to see you here :-)


    I've been here before, but I prefer private mail

    > I tried to use your module Text::CSV_XS to store some data in a CSV
    > file, but without success. The problem is national characters. When I
    > tried $csv->combine('abc','áíá','def') I got "abc\n" only.


    As both Mumia and the docs make (now) VERY clear, you need the binary
    flag. This version has made that even more clear. You *do* read the
    docs, right?
    --8<---
    Important Note: The default behavior is to only accept ascii
    characters. This means that fields can not contain newlines. If your
    data contains newlines embedded in fields, or characters above 0x7e
    (tilde), or binary data, you *must* set "binary => 1" in the call to
    "new ()". To cover the widest range of parsing options, you will
    always want to set binary.
    -->8---

    > Your module fails on the first field that contains anything greater
    > than \x7f.


    My module doesn't fail here. It is the default, documented, and correct
    behaviour.

    > But there is no error and no warning.
    > Is this a bug or a feature?


    Feature, or documented behaviour. Whatever you prefer.

    In the distribution, check out t/50_utf8.t to see how you should be
    dealing with non-ASCII characters. Maybe I can put that example in
    the documentation, as I keep referring to that file.

  5. Re: ANNOUNCE: Text::CSV_XS 0.32

    Mumia W. wrote:
    > On 10/24/2007 11:59 PM, Petr Vileta wrote:
    >> Well, I'm pleased to see you here :-)
    >> I tried to use your module Text::CSV_XS to store some data in a CSV
    >> file, but without success. The problem is national characters. When I
    >> tried $csv->combine('abc','áíá','def') I got "abc\n" only. Your
    >> module fails on the first field that contains anything greater than
    >> \x7f. But there is no error and no warning.
    >> Is this a bug or a feature?
    >>

    >
    > This sort-of works for me:
    >
    > #!/usr/bin/perl
    > use strict;
    > use warnings;
    > use encoding 'iso-8859-1';
    > use Text::CSV_XS 0.32;
    >
    > print "Version = $Text::CSV_XS::VERSION\n";
    >
    > my $csv = Text::CSV_XS->new({binary => 1});


    I suppose that binary is intended for "unprintable" characters.

    >
    > However, the output seems to be forced to UTF-8:
    >


    I can't use UTF-8; I have to use iso-8859-1 for some reason.

    > So Text::CSV_XS seems to ignore both the script encoding and the
    > locale. I had set LANG=en_US.ISO-8859-1 in Linux before running the
    > script.


    Hmm, ignored, but not thoroughly :-) I avoid the combine() function by
    using this sub:

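    sub mycombine
    {
        my @fields = @_;
        my $line = '';
        foreach (@fields)
        {
            s/"/""/g;                      # double any embedded quotes
            $line .= ',' if length $line;  # separate fields with commas
            $line .= '"' . $_ . '"';
        }
        $line .= chr(13) . chr(10);        # terminate the record with CRLF
        return $line;
    }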

    Yes, of course, this does not look at the field type (number or string), but
    for my purposes it is sufficient. It works with the program and locale
    settings. Maybe it would be good to add some functions to your module to set
    up input and output codepages. Something like
    $csv = $csv = Text::CSV_XS->new('input_charser' => 'utf-8', 'output_charset
    => 'iso-8859-1');
    But this is my idea only ;-)
    --

    Petr Vileta, Czech republic
    (My server rejects all messages from Yahoo and Hotmail. Send me your mail
    from another non-spammer site please.)



  6. Re: ANNOUNCE: Text::CSV_XS 0.32

    On 10/26/2007 11:12 PM, Petr Vileta wrote:
    > [...]
    > I avoid the combine() function by using
    > this sub:
    >
    > sub mycombine
    > {
    >     my @fields = @_;
    >     my $line = '';
    >     foreach (@fields)
    >     {
    >         s/"/""/g;                      # double any embedded quotes
    >         $line .= ',' if length $line;  # separate fields with commas
    >         $line .= '"' . $_ . '"';
    >     }
    >     $line .= chr(13) . chr(10);        # terminate the record with CRLF
    >     return $line;
    > }
    >
    > Yes, of course, this does not look at the field type (number or string),
    > but for my purposes it is sufficient. It works with the program and locale
    > settings. Maybe it would be good to add some functions to your module to
    > set up input and output codepages. Something like
    > $csv = $csv = Text::CSV_XS->new('input_charser' => 'utf-8',
    > 'output_charset => 'iso-8859-1');
    > But this is my idea only ;-)


    I've just discovered that this works perfectly for me:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use Text::CSV_XS 0.32;

    print "Version = $Text::CSV_XS::VERSION\n";

    my $csv = Text::CSV_XS->new({binary => 1});
    $csv->combine('abc','áíá','def') or warn("problem: $!\n");
    print $csv->string(), "\n";

    __END__

    The above code outputs latin1 characters as expected.

    For some reason, Text::CSV_XS doesn't like the encoding pragma:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use encoding 'latin1';
    use Text::CSV_XS 0.32;

    print "Version = $Text::CSV_XS::VERSION\n";

    my $csv = Text::CSV_XS->new({binary => 1});
    $csv->combine('abc','áíá','def') or warn("problem: $!\n");
    print $csv->string(), "\n";

    __END__

    The above outputs utf8-data; "áíá" is converted into "áíá"

    However, if "binmode(STDOUT, ':encoding(latin1)');" is placed before the
    print commands, the output is correct. I don't know if this is a bug in
    Text::CSV_XS or not.

    This is with Perl 5.8.4 and Text::CSV_XS 0.32. I had set
    LANG=en_US.ISO-8859-1 under Linux.
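
The difference between the two outputs above can be reproduced without Text::CSV_XS at all. The following is a minimal sketch using only the core Encode module; the variable names are illustrative and not taken from the thread. It encodes the same three characters both ways and shows why UTF-8 output looks garbled on a latin1 terminal.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# "áíá" as Perl characters, written with escapes so that the encoding
# of this script file does not matter.
my $chars = "\x{e1}\x{ed}\x{e1}";

my $latin1 = encode('iso-8859-1', $chars);   # one byte per character
my $utf8   = encode('UTF-8',      $chars);   # two bytes per character

printf "latin1 bytes: %s\n", unpack('H*', $latin1);   # e1ede1
printf "utf8 bytes:   %s\n", unpack('H*', $utf8);     # c3a1c3adc3a1

# A latin1 terminal treats each UTF-8 byte as a separate character,
# so three characters are displayed as six.
my $misread = decode('iso-8859-1', $utf8);
printf "characters seen on a latin1 terminal: %d\n", length $misread;   # 6
```

Under this reading, adding binmode(STDOUT, ':encoding(latin1)') simply makes the output layer perform the first encode() step, which would explain why it produces the expected bytes.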

  7. Re: ANNOUNCE: Text::CSV_XS 0.32

    On Sat, 27 Oct 2007 06:12:47 +0200, Petr Vileta <stoupa@practisoft.cz>
    wrote:

    > Mumia W. wrote:
    >> On 10/24/2007 11:59 PM, Petr Vileta wrote:
    >>> Well, I'm pleased to see you here :-)
    >>> I tried to use your module Text::CSV_XS to store some data in a CSV
    >>> file, but without success. The problem is national characters. When I
    >>> tried $csv->combine('abc','áíá','def') I got "abc\n" only. Your
    >>> module fails on the first field that contains anything greater than
    >>> \x7f. But there is no error and no warning.
    >>> Is this a bug or a feature?

    >>
    >> This sort-of works for me:
    >>
    >> #!/usr/bin/perl
    >> use strict;
    >> use warnings;
    >> use encoding 'iso-8859-1';
    >> use Text::CSV_XS 0.32;
    >>
    >> print "Version = $Text::CSV_XS::VERSION\n";
    >>
    >> my $csv = Text::CSV_XS->new({binary => 1});

    >
    > I suppose that binary is intended for "unprintable" characters.


    Depends. Do you think \x{d7} is unprintable? Or \x{20ac}?

    >> However, the output seems to be forced to UTF-8:


    Text::CSV_XS doesn't know anything about encoding.

    > [snip]


    > Maybe it would be good to add some functions to your module to set up
    > input and output codepages. Something like
    > $csv = $csv = Text::CSV_XS->new('input_charser' => 'utf-8',
    > 'output_charset => 'iso-8859-1');


    That would of course be

    my $csv = Text::CSV_XS->new ({
        input_charset  => "utf-8",
        output_charset => "iso-8859-1",
        });

    1: s/charser/charset/
    2: put in an anon-hash

    The idea sounds nice, but would severely slow down all
    scripts that use Text::CSV_XS in a transparent mode,
    without Encoding/Decoding.

    It is rather easy to do it right from the user point of view.
    Here's the snippet used in the test suite to check if encoding
    works (t/50_utf8.t):

    my $csv = Text::CSV_XS->new ({ binary => 1, always_quote => 1 });

    # Special characters to check:
    # 0A = \n  2C = ,  20 =    22 = "
    # 0D = \r  3B = ;
    foreach my $test (
        # Space-like characters
        [ "\x{0000A0}", "U+0000A0 NO-BREAK SPACE" ],
        [ "\x{00200B}", "U+00200B ZERO WIDTH SPACE" ],
        # Some characters with possible problems in the code point
        [ "\x{000122}", "U+000122 LATIN CAPITAL LETTER G WITH CEDILLA" ],
        [ "\x{002C22}", "U+002C22 GLAGOLITIC CAPITAL LETTER SPIDERY HA" ],
        [ "\x{000A2C}", "U+000A2C GURMUKHI LETTER BA" ],
        [ "\x{000E2C}", "U+000E2C THAI CHARACTER LO CHULA" ],
        [ "\x{010A2C}", "U+010A2C KHAROSHTHI LETTER VA" ],
        # Characters with possible problems in the encoded representation
        # Should not be possible. ASCII is coded in 000..127, all other
        # characters in 128..255
        ) {
        my ($u, $msg) = @$test;
        utf8::encode ($u);
        my @in  = ("", " ", $u, "");
        my $exp = join ",", map { qq{"$_"} } @in;

        ok ($csv->combine (@in), "combine $msg");

        my $str = $csv->string;
        is_binary ($str, $exp, "string $msg");

        ok ($csv->parse ($str), "parse $msg");
        my @out = $csv->fields;
        # Cannot use is_deeply (), because of the binary content
        is (scalar @in, scalar @out, "fields $msg");
        for (0 .. $#in) {
            is_binary ($in[$_], $out[$_], "field $_ $msg");
            }
        }

    > But this is my idea only ;-)


  8. Re: ANNOUNCE: Text::CSV_XS 0.32

    H.Merijn Brand wrote:
    > On Sat, 27 Oct 2007 06:12:47 +0200, Petr Vileta <stoupa@practisoft.cz>
    > wrote:


    [snip]

    >> I suppose that binary is intended for "unprintable" characters.

    >
    > depends. Do you think \x{d7} is unprintable? or \x{20ac}
    >


    Ehm, yes ;-) I meant unprintable in the \x00 to \xff code range, so all
    characters less than \x20 except \x0a, \x0d, \x09.

    [snip]

    > That would of course be
    >
    > my $csv = Text::CSV_XS->new ({
    > input_charset => "utf-8",
    > output_charset => "iso-8859-1",
    > });
    >
    > The idea sounds nice, but would severely slow down all
    > scripts that use Text::CSV_XS in a transparent mode,
    > without Encoding/Decoding.
    >


    But you could check in the ->new() part of the module whether the programmer
    set both charsets. If both are set, run in "translate" mode; if neither is
    set, run in "transparent" mode; and if only one is set, return an error.

    --

    Petr Vileta, Czech republic
    (My server rejects all messages from Yahoo and Hotmail. Send me your mail
    from another non-spammer site please.)
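
The three-way rule proposed above can also be applied in user code, outside the module. Below is a rough sketch under that assumption, using only the core Encode module. The make_recoder helper is hypothetical and not part of Text::CSV_XS; the option names input_charset and output_charset are borrowed from the suggestion in this thread, not from any real API.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper, not part of Text::CSV_XS: returns a code ref that
# recodes a list of byte strings according to the options given.
sub make_recoder {
    my (%opt) = @_;
    my ($in, $out) = @opt{qw(input_charset output_charset)};

    # Neither charset given: "transparent" mode, fields pass through as-is.
    return sub { @_ } unless defined $in or defined $out;

    # Only one charset given: refuse, as suggested in the post above.
    die "input_charset and output_charset must be set together\n"
        unless defined $in and defined $out;

    # Both given: "translate" mode, decode from one charset and encode
    # to the other.
    return sub { map { encode($out, decode($in, $_)) } @_ };
}

my $recode = make_recoder(
    input_charset  => 'UTF-8',
    output_charset => 'iso-8859-1',
);

# "áíá" as UTF-8 bytes comes back as three latin1 bytes.
my @fields = $recode->('abc', "\xc3\xa1\xc3\xad\xc3\xa1", 'def');
print unpack('H*', $fields[1]), "\n";   # e1ede1
```

The recoded list could then be handed to $csv->combine on an object created with binary => 1, which keeps the conversion cost out of the module for scripts that never need it.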


