Reading text file with correct text encoding mode - c++


  1. Reading text file with correct text encoding mode

    Hi,
    I can read from a text file using the code below, but I'm unable to read
    Turkish or other non-English characters. How can I read non-English
    characters and set the default encoding so that the text comes out as
    it does in Windows Notepad (which uses ANSI encoding by default)?

    // reading a text file
    #include <iostream>
    #include <fstream>
    #include <string>
    using namespace std;

    int main () {
        string line;
        ifstream myfile ("c:\\example.txt");
        if (myfile.is_open())
        {
            while (! myfile.eof() )
            {
                getline (myfile,line);
                cout << line << endl;
            }
            myfile.close();
        }

        else cout << "Unable to open file";
        getchar();
        return 0;
    }

  2. Re: Reading text file with correct text encoding mode

    kimiraikkonen wrote:
    > I can read from text file using this code but i'm unabled to read
    > Turkish or other non-English codes, how can i read non-English chars
    > and set the default encoder parameter as if it's working fine in
    > Windows Notepad's (uses ANSI by default encoding)?


    I'd suggest reading the recent thread "How to read UNICODE file with
    iostream".


    > while (! myfile.eof() )
    > {
    > getline (myfile,line);
    > cout << line << endl;
    > }


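    This is wrong; read the documentation for the eof() member function
    carefully. eof() only returns true after a read has already hit the end
    of the file, so the loop body can run once more with a failed getline().
    The correct code is this:

    while( getline( myfile, line))
        cout << line << endl;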

    Uli
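
    To make the difference concrete, here is a minimal sketch that runs both
    loop styles over the same in-memory text. The helper names
    lines_with_eof_loop and lines_with_getline_loop are made up for this
    illustration, and std::istringstream stands in for the file stream.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: collects lines using the eof()-controlled loop
// from the original question.
std::vector<std::string> lines_with_eof_loop(const std::string& text) {
    std::istringstream in(text);
    std::vector<std::string> lines;
    std::string line;
    while (!in.eof()) {
        std::getline(in, line);  // may fail, but the result is used anyway
        lines.push_back(line);
    }
    return lines;
}

// Hypothetical helper: collects lines using getline() as the loop condition,
// so the body only runs after a successful read.
std::vector<std::string> lines_with_getline_loop(const std::string& text) {
    std::istringstream in(text);
    std::vector<std::string> lines;
    std::string line;
    while (std::getline(in, line)) {
        lines.push_back(line);
    }
    return lines;
}
```

    For text that ends with a newline, such as "one\ntwo\n", the eof() version
    yields one extra empty entry, because the end of the input is only
    noticed by the read that fails.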


  3. Re: Reading text file with correct text encoding mode

    On Dec 25, 3:53 am, Ulrich Eckhardt <dooms...@knuut.de> wrote:
    > while( getline( myfile, line))
    > cout << line << endl;



    I added the two missing braces ("{" and "}") to the while block in your
    code:

    while( getline( myfile, line)){
        cout << line << endl;
    }

    I also looked at the thread you mentioned, but it didn't help; I couldn't
    find any solid information in it. Why is reading an ANSI
    (default)-encoded text file so much harder in C++ than in other
    languages, like so many other things in C++?


    > Uli



    Thank you.


  4. Re: Reading text file with correct text encoding mode

    kimiraikkonen wrote:
    > Inserted missing 2 "{" (parathesis) into while block to your code.
    > while( getline( myfile, line)){
    > cout << line << endl;
    > }


    Um, no? Whether you put those in or not is a matter of taste, IIRC. Anyway,
    the point is really that you were using eof() wrongly, which can make the
    loop process a line that was never actually read.

    > And looked at the thread you've mentioned but i couldn't get help,
    > there's no solid info which i could find out on that thread. Why is
    > reading an ANSI (default)-encoded text file is so hard than other
    > languages like being in other things in C++?


    There are various reasons that things like that are hard:
    1. C++ doesn't mandate any specific character set. It only requires a
    basic set of characters for the source files, but how that is stored is
    not defined; it could be ASCII or EBCDIC or anything else.
    2. There is no single standard for how to store e.g. Turkish characters
    in text files either. There are various extensions to ASCII which use the
    code points beyond 127 and map them to various characters. Internally,
    those will still be represented as 'char', but with different meanings.
    Alternatively, there are the UTF-x encodings, which are capable of
    encoding the whole Unicode range and which are then typically stored as
    wchar_t internally.
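
    As a rough illustration of point 2, the sketch below converts bytes from
    a Turkish "ANSI" file to UTF-8 by hand. It assumes the file uses Windows
    code page 1254, where six byte values carry Turkish letters instead of
    the Latin-1 characters found at the same positions. The function names
    are made up for this example, and bytes 0x80 to 0x9F (punctuation such as
    the euro sign) are simply replaced with '?' to keep it short.

```cpp
#include <string>

// Appends the UTF-8 encoding of a code point below U+0800,
// which is all this sketch needs.
void append_utf8(std::string& out, unsigned int cp) {
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Hypothetical helper: converts code page 1254 bytes to UTF-8.
std::string turkish_ansi_to_utf8(const std::string& bytes) {
    std::string out;
    for (std::string::size_type i = 0; i < bytes.size(); ++i) {
        unsigned int b = static_cast<unsigned char>(bytes[i]);
        unsigned int cp = b;  // ASCII and most of 0xA0-0xFF map directly
        switch (b) {
            case 0xD0: cp = 0x011E; break;  // capital G with breve
            case 0xDD: cp = 0x0130; break;  // capital I with dot above
            case 0xDE: cp = 0x015E; break;  // capital S with cedilla
            case 0xF0: cp = 0x011F; break;  // small g with breve
            case 0xFD: cp = 0x0131; break;  // small dotless i
            case 0xFE: cp = 0x015F; break;  // small s with cedilla
            default: break;
        }
        if (b >= 0x80 && b <= 0x9F) {
            cp = '?';  // simplification: this range is not mapped here
        }
        append_utf8(out, cp);
    }
    return out;
}
```

    The same byte 0xFE would be a different letter under Latin-1, which is
    why the program has to know (or be told) which code page the file uses.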

    My suggestion:
    1. Use wchar_t strings and streams.
    2. Use UTF-8 for your files.

    Uli
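
    If the files are UTF-8, one alternative to wchar_t (not what the post
    above recommends, just another option) is to keep the raw bytes in a
    std::string. getline() still splits lines correctly, because the newline
    byte never occurs inside a multi-byte UTF-8 sequence. The catch is that
    size() then counts bytes, not characters. A minimal sketch, assuming
    valid UTF-8 input and using a made-up function name:

```cpp
#include <cstddef>
#include <string>

// Counts Unicode code points in a UTF-8 string by skipping continuation
// bytes (those of the form 10xxxxxx). No validation is performed.
std::size_t count_code_points(const std::string& utf8) {
    std::size_t n = 0;
    for (std::string::size_type i = 0; i < utf8.size(); ++i) {
        unsigned char b = static_cast<unsigned char>(utf8[i]);
        if ((b & 0xC0) != 0x80) {
            ++n;
        }
    }
    return n;
}
```

    For example, a small s with cedilla followed by "a" is three bytes in
    UTF-8 but two code points.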


  5. Re: Reading text file with correct text encoding mode


    Hi Uli,
    I want to ask another question:

    How can I read a file from a user-specified location?

    I tried this without success:

    #include <iostream>
    #include <fstream>
    #include <string>
    using namespace std;

    int main () {
        string line;
        string path;
        cout<<"Please enter path of file:\n";
        cin>>path;
        ifstream myfile (path);
        if (myfile.is_open())
        {
            while( getline( myfile, line)){
                cout << line << endl;
            }
            myfile.close();
        }

        else cout << "Unable to open file";
        getchar();
        return 0;
    }

    I hope I can learn. Thanks!

  6. Re: Reading text file with correct text encoding mode

    kimiraikkonen wrote:

    > Hi Uli,
    > I want to ask another question:
    >
    > How can i read file from a user-specified file location?
    >
    > I tried this with no help:
    >
    > #include <iostream>
    > #include <fstream>
    > #include <string>
    > using namespace std;
    >
    > int main () {
    > string line;
    > string path;
    > cout<<"Please enter path of file:\n";
    > cin>>path;
    > ifstream myfile (path);


    This line should read:
    ifstream myfile (path.c_str());

    The string::c_str() function converts a string object into a C-style
    string (a nul-terminated array of characters), which the fstream
    objects can work with.
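
    Putting this together, a small helper along these lines would read all
    lines from a path held in a std::string. The name read_lines is invented
    for this sketch. Since C++11 the stream constructors also accept a
    std::string directly, so the c_str() call is only needed on older
    compilers, but it remains valid either way.

```cpp
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: returns all lines of the file at `path`,
// or an empty vector if the file cannot be opened.
std::vector<std::string> read_lines(const std::string& path) {
    std::vector<std::string> lines;
    std::ifstream file(path.c_str());  // c_str() is required before C++11
    std::string line;
    while (std::getline(file, line)) {  // fails at once if the open failed
        lines.push_back(line);
    }
    return lines;
}
```

    A caller that needs to tell "empty file" apart from "could not open"
    would have to check is_open() separately, as the original program does.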

    > if (myfile.is_open())
    > {
    > while( getline( myfile, line)){
    > cout << line << endl;
    > }
    > myfile.close();
    > }
    >
    > else cout << "Unable to open file";
    > getchar();
    > return 0;
    > }
    >
    > I hope I can learn. Thanks!


    Bart v Ingen Schenau
    --
    a.c.l.l.c-c++ FAQ: http://www.comeaucomputing.com/learn/faq
    c.l.c FAQ: http://c-faq.com/
    c.l.c++ FAQ: http://www.parashift.com/c++-faq-lite/

  7. Re: Reading text file with correct text encoding mode


    Hi,
    Thanks, that did it! However, I needed to put "getchar();" before
    "ifstream myfile (path.c_str());" to see the file content. Otherwise
    the console window flashes and closes immediately, because my IDE is
    Dev-C++.
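
    The likely reason is that "cin>>path" stops at the newline typed after
    the path and leaves it in the input buffer, so the final getchar()
    returns at once. The extra getchar() consumes that newline. One way to
    avoid the workaround, sketched here with made-up helper names, is to read
    the path with getline(), which consumes the newline and also keeps spaces
    in the path:

```cpp
#include <istream>
#include <sstream>
#include <string>

// Demonstrates the problem: after `in >> path`, the newline that ended
// the typed input is still waiting in the stream.
bool newline_left_pending(const std::string& typed) {
    std::istringstream in(typed);
    std::string path;
    in >> path;
    return in.peek() == '\n';
}

// Hypothetical helper: reads the whole line as the path, so spaces are
// preserved and the trailing newline is consumed.
std::string read_path(std::istream& in) {
    std::string path;
    std::getline(in, path);
    return path;
}
```

    In the program above, "cin>>path;" would become "path = read_path(cin);",
    and the single getchar() at the end would then wait for a key press.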
