Read binary data file - Java

  1. Re: Read binary data file

    Charles wrote:
    > Let's review what the OP stated
    >
    > A struct is given in C++
    >
    > Data needs to read from a file in Java.
    >
    > You have the following data types
    >
    > unsigned long
    > unsigned short
    >
    > As previously stated by other posters the Endianness of the operating
    > system should affect how the output file is encoded. I assume this to
    > be true but have not verified it to be true.
    >
    > We assume all unsigned longs and unsigned short will ALWAYS have the
    > same bytesize.
    >
    > The complete struct is given as
    >
    > unsigned long data1;
    > unsigned short data2;
    > unsigned short data3;
    > unsigned long data4;
    >
    > Can we also assume that the data will always be sequenced as described
    > in the STRUCT?
    > I don't see any argument why the data will be out of sequence as
    > defined in the STRUCT.


    But we do not know the padding, and the OP doesn't know what those sizes are,
    nor the endianness of their files. They don't even know in what format the
    floating-point values are stored: IEEE? We need all that information to craft
    a Java equivalent, and we don't have it. The OP doesn't have it, by their
    account.
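
    To make concrete what that missing information would pin down, here is a
    minimal sketch of a reader. Every layout detail in it is an assumption
    that nobody in this thread has confirmed: 32-bit unsigned long, 16-bit
    unsigned short, no padding between fields, and a byte order chosen by the
    caller. The class name StructRecord is made up for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical reader for the four-field struct; the layout is assumed, not known. */
final class StructRecord {
    // Assumed record size: 4 + 2 + 2 + 4 bytes, with no compiler padding.
    static final int SIZE = 12;

    final long data1; // C++ unsigned long (assumed 32-bit), widened to a Java long
    final int data2;  // C++ unsigned short (assumed 16-bit), widened to a Java int
    final int data3;
    final long data4;

    private StructRecord(long data1, int data2, int data3, long data4) {
        this.data1 = data1;
        this.data2 = data2;
        this.data3 = data3;
        this.data4 = data4;
    }

    /** Decodes one record from the first SIZE bytes of raw, using the given byte order. */
    static StructRecord parse(byte[] raw, ByteOrder order) {
        if (raw.length < SIZE) {
            throw new IllegalArgumentException("need " + SIZE + " bytes, got " + raw.length);
        }
        ByteBuffer buf = ByteBuffer.wrap(raw).order(order);
        // Java has no unsigned int/short, so mask into the next wider type.
        long d1 = buf.getInt() & 0xFFFFFFFFL;
        int d2 = buf.getShort() & 0xFFFF;
        int d3 = buf.getShort() & 0xFFFF;
        long d4 = buf.getInt() & 0xFFFFFFFFL;
        return new StructRecord(d1, d2, d3, d4);
    }
}
```

    If the real file uses 64-bit longs or has padding after any field, the
    offsets above are wrong, which is exactly why the sizes and padding have to
    be known first.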

    > Does the input file get modified when it is transported from one
    > operating system to another?
    > I assume NO. This is not verified.


    But if endianness and padding matter, the fact that it is not modified will
    make it unreadable on the second system.
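
    A small illustration of the endianness half of that point: the same four
    unmodified bytes decode to different numbers depending on which byte order
    the reader assumes. EndianDemo and readUnsigned32 are names invented for
    this sketch, and the 32-bit width is an assumption.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class EndianDemo {
    /** Reads four bytes as an unsigned 32-bit value in the given byte order. */
    static long readUnsigned32(byte[] fourBytes, ByteOrder order) {
        return ByteBuffer.wrap(fourBytes).order(order).getInt() & 0xFFFFFFFFL;
    }

    public static void main(String[] args) {
        byte[] raw = {0x01, 0x00, 0x00, 0x00};
        System.out.println(readUnsigned32(raw, ByteOrder.LITTLE_ENDIAN)); // prints 1
        System.out.println(readUnsigned32(raw, ByteOrder.BIG_ENDIAN));    // prints 16777216
    }
}
```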

    > Are there equivalents of unsigned long and unsigned short in Java?


    No.
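
    The usual workaround is to read the raw bits into a signed type and widen
    them into the next larger type. A sketch, assuming a 32-bit unsigned long
    and a 16-bit unsigned short, and using helper methods that were only added
    to the JDK in Java 8 (on older JVMs, masking with 0xFFFFFFFFL and 0xFFFF
    does the same job). The class and method names are made up.

```java
final class UnsignedWidening {
    /** Treats the 32 bits of a Java int as an unsigned value. */
    static long fromUnsigned32(int rawBits) {
        return Integer.toUnsignedLong(rawBits);
    }

    /** Treats the 16 bits of a Java short as an unsigned value. */
    static int fromUnsigned16(short rawBits) {
        return Short.toUnsignedInt(rawBits);
    }

    public static void main(String[] args) {
        System.out.println(fromUnsigned32(-1));         // prints 4294967295
        System.out.println(fromUnsigned16((short) -1)); // prints 65535
    }
}
```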

    > Are they the same byte size?


    We do not know. The OP hasn't given us enough information.

    > Do they encode the data the same?


    We do not know. The OP hasn't given us enough information.

    > Try to read in Java and verify with known data. If you don't know any
    > of the data values this becomes a harder task.


    It's already impossible based on the information given. How much harder can
    it get?

    --
    Lew

  2. Re: Read binary data file

    Lew wrote:
    >
    > It's already impossible based on the information given. How much harder
    > can it get?
    >

    If the OP *MUST* move binary data, at least do it in a platform and
    language-independent manner and use ASN.1 encoding.


    --
    martin@ | Martin Gregorie
    gregorie. | Es**** UK
    org |

  3. Re: Read binary data file

    I'm not sure if this is the same issue, but I'm trying to interpret
    numeric values out of a chunk of data as follows:

    int   toBinary   theValue
    124   1111100    3.8
    63    111111     4
    224   11100000   4.8
    63    111111     4
    63    111111     4
    224   11100000   4.8
    64    1000000    3.2
    63    111111     4
    244   11110100   5
    124   1111100    3.8

    I can read "int" out of my blob of data, and I ran toBinaryString on
    it just to visualize it. I manually typed "theValue" (that is what I
    KNOW the test data is). Can someone help me figure out what code to
    run in order to get "theValue"?

    --Dale--


  4. Re: Read binary data file

    Martin Gregorie <martin@see.sig.for.address> wrote:

    > If the OP *MUST* move binary data, at least do it in a platform and
    > language-independent manner and use ASN.1 encoding.


    I understand Hunter's comments, and while I don't know much about
    ASN.1 encoding, what I am pointing out is that binary files are usually
    *not* intended to be used across systems. Every binary data file I have
    ever worked with was intended to be used either by the program that wrote
    it, or separate applications that used the same utility libraries as the
    application which wrote the data. There is nothing wrong with simply writing
    the C structure to a file, and reading it in the same way. In this case
    the code, and not some specification, drives the format of the data - and there
    is *nothing* wrong with this. The lack of a need to share the data outside of
    the application is what often drives the decision to use binary data in the
    first place (why not take advantage of the efficiency binary files have to
    offer).

    Of course, every once in a while an outside user decides they want to use this
    data. Well, then they have a choice. Either generate it themselves, or
    spend a few hours writing something that can read it in - not a big price
    to pay.

    - Kurt

  5. Re: Read binary data file

    On Fri, 31 Aug 2007 09:15:55 -0700, "DRS.Usenet@sengsational.com"
    <DRS.Usenet@sengsational.com> wrote, quoted or indirectly quoted
    someone who said :

    >int   toBinary   theValue
    >124   1111100    3.8
    >63    111111     4
    >224   11100000   4.8
    >63    111111     4
    >63    111111     4
    >224   11100000   4.8
    >64    1000000    3.2
    >63    111111     4
    >244   11110100   5
    >124   1111100    3.8
    >
    >I can read "int" out of my blob of data, and I ran toBinaryString on
    >it just to visualize it. I manually typed "theValue" (that is what I
    >KNOW the test data is). Can someone help me figure out what code to
    >run in order to get "theValue"?


    If you get enough samples you can create a
    private static final double[] translate = new double[256];
    to do the translation for you.
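
    A minimal sketch of such a table, filled only with the five distinct
    pairs from the sample data quoted above. Every other entry is left as
    NaN so that an unknown byte is obvious rather than silently read as 0.0.
    The class name and the decode method are invented for illustration.

```java
import java.util.Arrays;

final class SampleTranslator {
    private static final double[] translate = new double[256];

    static {
        // Mark every byte value as "unknown" until a sample proves otherwise.
        Arrays.fill(translate, Double.NaN);
        // Known pairs taken from the posted sample data.
        translate[124] = 3.8;
        translate[63] = 4.0;
        translate[224] = 4.8;
        translate[64] = 3.2;
        translate[244] = 5.0;
    }

    /** Looks up the value for one raw byte (0..255); returns NaN if no sample covers it. */
    static double decode(int rawByte) {
        return translate[rawByte & 0xFF];
    }
}
```

    This only reproduces what has already been observed. It says nothing about
    the underlying encoding, so it is a stopgap until the format is identified.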

    In what context did you see this code? It looks like it might be some
    sort of sound encoding technique. You can read up the specs on the
    encoding.

    see http://mindprod.com/jgloss/sound.html to help get you started.

    It might also be some sort of Huffman encoding. See
    http://mindprod.com/jgloss/huffman.html
    --
    Roedy Green Canadian Mind Products
    The Java Glossary
    http://mindprod.com

  6. Re: Read binary data file

    ~kurt wrote:
    > I understand Hunter's comments, and while I don't know much about
    > ASN.1 encoding, what I am pointing out is that binary files are usually
    > *not* intended to be used across systems.


    Except for all the ones that are, e.g. protocol dumps; databases;
    interpretive pseudo-code (e.g. .class files), ...

    > Every binary data file I have
    > ever worked with was intended to be used either by the program that wrote
    > it, or separate applications that used the same utility libraries as the
    > application which wrote the data.


    Except for the ones that aren't: e.g. protocol dumps; databases;
    interpretive pseudo-code (e.g. .class files), ...

    > There is nothing wrong with simply writing
    > the C structure to a file, and reading it in the same way. In this case
    > the code, and not some specification, drives the format of the data - and there
    > is *nothing* wrong with this.


    There is plenty wrong with this. The format of binary data written
    directly from a struct in memory depends on at least the following:

    - the host hardware
    - the compiler
    - the compiler version
    - the surrounding #pragmas
    - the compiler options that were in effect when the binary that wrote
    the file was compiled

    This is too many dependencies, on too many things that can't be controlled.

    The only time writing a struct from memory to a file or a network can
    sanely be justified is when the target application is constructed with
    the same version of the same object file that wrote it. And this is not
    a guarantee that in general can be met.
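
    The alternative is to write each field explicitly, with a width and byte
    order fixed by the code rather than by the compiler. A sketch in Java,
    where DataOutputStream and DataInputStream always use big-endian order
    regardless of the host. The 32-bit and 16-bit field widths are assumptions
    and PortableRecordIo is a made-up name.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

final class PortableRecordIo {
    /** Writes the four fields one by one: 4 + 2 + 2 + 4 bytes, big-endian, no padding. */
    static byte[] write(long data1, int data2, int data3, long data4) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt((int) data1);  // low 32 bits of the unsigned value
            out.writeShort(data2);      // low 16 bits
            out.writeShort(data3);
            out.writeInt((int) data4);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Reads the fields back in the same order, widening to hold unsigned values. */
    static long[] read(byte[] raw) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(raw))) {
            long d1 = in.readInt() & 0xFFFFFFFFL;
            long d2 = in.readUnsignedShort();
            long d3 = in.readUnsignedShort();
            long d4 = in.readInt() & 0xFFFFFFFFL;
            return new long[] {d1, d2, d3, d4};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```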

  7. Re: Read binary data file

    ~kurt wrote:
    >
    > I understand Hunter's comments, and while I don't know much about
    > ASN.1 encoding, what I am pointing out is that binary files are usually
    > *not* intended to be used across systems.
    >

    I think its use is quite industry-dependent: I've never seen it used in
    financial messaging (that's more likely to use SWIFT formats, which are
    tagged text) but it's common in the telecommunications industry.

    Telcos (both fixed line and mobile) use a lot of binary data for control
    and accounting purposes, mainly because this minimizes message size and
    there's a LOT of stuff flying around controlling the network in real
    time and accounting for its use. Switches from large vendors, e.g.
    Ericsson, tend to use proprietary, flat message formats but if the data
    will be exchanged between different types of kit (e.g. roaming billing
    data) they tend to use ASN.1: CCITT likes it.

    ASN.1 has a lot in common with XML in that it's a tagged field protocol,
    allows nesting, and uses a tag dictionary to associate meanings with
    tags. Compared with XML it's a LOT more compact (tags are one byte, fixed
    length fields don't have terminators, variable length fields are
    preceded by a one or two byte length) and it has a number of predefined
    field types as well as arrays. If you have the dictionary it's easy to
    interpret on the fly though, like XML, you can also use the dictionary
    to generate code to encode and decode ASN.1 records.
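
    To show the tag/length/value idea in miniature, here is a deliberately
    simplified sketch: one-byte tag, one-byte length, then the value bytes.
    It is NOT real ASN.1 (the actual BER/DER rules also cover multi-byte tags,
    long-form lengths and nested constructed types), and TinyTlv is a made-up
    class, not part of any ASN.1 library.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

final class TinyTlv {
    /** Appends one field as tag byte, length byte, value bytes. */
    static void put(ByteArrayOutputStream out, int tag, byte[] value) {
        if (value.length > 127) {
            throw new IllegalArgumentException("this sketch only handles values up to 127 bytes");
        }
        out.write(tag & 0xFF);
        out.write(value.length);
        out.write(value, 0, value.length);
    }

    /** Walks the buffer field by field and returns tag -> value in encounter order. */
    static Map<Integer, byte[]> parse(byte[] encoded) {
        Map<Integer, byte[]> fields = new LinkedHashMap<>();
        int pos = 0;
        while (pos < encoded.length) {
            if (pos + 2 > encoded.length) {
                throw new IllegalArgumentException("truncated header at offset " + pos);
            }
            int tag = encoded[pos++] & 0xFF;
            int len = encoded[pos++] & 0xFF;
            if (pos + len > encoded.length) {
                throw new IllegalArgumentException("truncated value for tag " + tag);
            }
            fields.put(tag, Arrays.copyOfRange(encoded, pos, pos + len));
            pos += len;
        }
        return fields;
    }
}
```

    Because each field carries its own tag and length, a reader can skip
    fields it does not recognise, which a raw struct dump cannot offer.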

    > Every binary data file I have
    > ever worked with was intended to be used either by the program that wrote
    > it, or separate applications that used the same utility libraries as the
    > application which wrote the data.
    >

    There's also a lot of binary data in large commercial systems. Formerly
    it was in large serial files, then flat indexed files, now it's probably
    in a database. A really good reason for using an RDBMS is that it not
    only hides implementation details (like endian conventions) from the
    application, but the interfaces (SQL, JDBC, ODBC, etc) typically provide
    field conversion facilities.

    > There is nothing wrong with simply writing
    > the C structure to a file, and reading it in the same way.
    >

    I'd probably use a CSV format any place where a database would be
    obvious overkill, but ymmv.

    Using CSV rather than binary makes debugging easier and (said with his
    *NIX hat on) it allows the data to be handled by common scripted
    utilities like awk, perl and even shell scripts. Oh yeah, Java too :-)
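
    As a rough sketch of what that looks like for the four-field record
    discussed in this thread: one record per line, values written as plain
    decimal text. The field order and the CsvRecord name are assumptions, and
    since the fields are all numeric no quoting or escaping is handled.

```java
final class CsvRecord {
    /** Formats the four fields as one comma-separated line (no trailing newline). */
    static String toCsv(long data1, int data2, int data3, long data4) {
        return data1 + "," + data2 + "," + data3 + "," + data4;
    }

    /** Parses a line produced by toCsv back into its four numeric fields. */
    static long[] fromCsv(String line) {
        String[] parts = line.trim().split(",");
        if (parts.length != 4) {
            throw new IllegalArgumentException("expected 4 fields, got " + parts.length);
        }
        long[] values = new long[4];
        for (int i = 0; i < parts.length; i++) {
            values[i] = Long.parseLong(parts[i].trim());
        }
        return values;
    }
}
```

    Endianness, padding and field width all disappear as concerns, at the cost
    of larger files and a parse step.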


    --
    martin@ | Martin Gregorie
    gregorie. | Es**** UK
    org |

  8. Re: Read binary data file

    Esmond Pitt wrote:
    > ~kurt wrote:
    >> I understand Hunter's comments, and while I don't know much about
    >> ASN.1 encoding, what I am pointing out is that binary files are
    >> usually *not* intended to be used across systems.

    >
    > Except for all the ones that are, e.g. protocol dumps; databases;
    > interpretive pseudo-code (e.g. .class files), ...


    How often do database *files* get moved from one system to another? In my
    experience, they stay on the server where the DBMS engine is running.



  9. Re: Read binary data file

    Esmond Pitt <esmond.pitt@nospam.bigpond.com> wrote:

    > The only time writing a struct from memory to a file or a network can


    Who is talking about writing data to a network?

    > sanely be justified is when the target application is constructed with
    > the same version of the same object file that wrote it. And this is not
    > a guarantee that in general can be met.


    Uh, this is pretty much what I just said other than I see no need for
    the "guarantee" part - it is not necessary unless the *intent* is to
    distribute the data externally.

    As I said, my gripe is in calling the originator of the OP's data clueless.
    That statement is simply clueless itself. Yes, if the original program had
    been written in Java, then maybe that statement would be true. But this
    is a C++ program. The data files are most likely "private", only to be
    used internally. Sure, if you port the code to another platform, the
    binary files between the two versions may not be compatible, but so what -
    that usually isn't a problem. The new code will create binary files that
    are compatible with itself. Creating some external specification that this
    binary data must meet would be stupid because then, if you did port the
    code, now you may have to modify it to be compatible with the original
    specification, and this may require more processing of the data. Suddenly,
    some specification is driving internal data, and robbing some degree of
    performance from the application.

    Just because a bureaucrat comes along some time down the road and says
    "thou shalt write a Java program (not that Java is the best solution in
    this case, but because it is the 'in' thing to do) that will use Program X's
    internal data files" does not mean Program X was poorly designed.

    - Kurt

  10. Re: Read binary data file

    ~kurt wrote:
    > Esmond Pitt <esmond.pitt@nospam.bigpond.com> wrote:
    >
    >> The only time writing a struct from memory to a file or a network can

    >
    > Who is talking about writing data to a network?
    >
    >> sanely be justified is when the target application is constructed
    >> with the same version of the same object file that wrote it. And
    >> this is not a guarantee that in general can be met.

    >
    > Uh, this is pretty much what I just said other than I see no need for
    > the "guarantee" part - it is not necessary unless the *intent* is to
    > distribute the data externally.
    >
    > As I said, my gripe is in calling the originator of the OP's data
    > clueless. That statement is simply clueless itself. Yes, if the
    > original program had been written in Java, then maybe that statement
    > would be true. But this
    > is a C++ program. The data files are most likely "private", only to
    > be used internally. Sure, if you port the code to another platform,
    > the binary files between the two versions may not be compatible, but
    > so what - that usually isn't a problem. The new code will create
    > binary files that are compatible with itself. Creating some external
    > specification that this binary data must meet would be stupid because
    > then, if you did port the code, now you may have to modify it to be
    > compatible with the original specification, and this may require more
    > processing of the data. Suddenly, some specification is driving
    > internal data, and robbing some degree of performance from the
    > application.


    The danger is that a different compiler (or different version of the same
    compiler) would cause an incompatibility. The good news is that compiler
    vendors tend not to change struct layouts for that very reason. Still, this
    needs to be kept in mind and tested for whenever that sort of change is
    made.

    Another point, not yet mentioned (or if it has been, I missed that post):
    Any structured data that's saved persistently should contain a version
    number. If it never changes, you've added a small amount of overhead. When
    it does change, it's now straightforward to convert older versions and
    recognize new ones, which, without the explicit versioning, can be difficult
    or impossible.
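
    A sketch of how cheap that is in practice: a two-byte version number in
    front of the payload, and a reader that switches on it. The two formats
    shown (version 1 with a 16-bit payload, version 2 with a 32-bit payload)
    are invented purely for illustration, as is the VersionedRecord name.

```java
import java.nio.ByteBuffer;

final class VersionedRecord {
    static final short CURRENT_VERSION = 2;

    /** Writes the current format: 2-byte version followed by a 4-byte payload. */
    static byte[] encode(int payload) {
        return ByteBuffer.allocate(6).putShort(CURRENT_VERSION).putInt(payload).array();
    }

    /** Reads any known version and converts it to the current in-memory form. */
    static int decode(byte[] raw) {
        ByteBuffer buf = ByteBuffer.wrap(raw);
        int version = buf.getShort() & 0xFFFF;
        switch (version) {
            case 1:
                // Hypothetical older format: payload was an unsigned 16-bit value.
                return buf.getShort() & 0xFFFF;
            case 2:
                return buf.getInt();
            default:
                // A newer or corrupt file is recognised instead of being misread.
                throw new IllegalArgumentException("unsupported version " + version);
        }
    }
}
```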




