How to use standard Fortran on a "binary" input file?

  #1  
Old 08-22-2008, 01:10 AM
Terence
Guest
 
Default how to use standard fortran on a "binary"input file?

I was re-reading my huge Fortran 77/90/95 language specification
manual and noted that the open file option ACCESS='BINARY' was
accepted as an extension only for Windows up to NT, but not other
platforms.

I can solve some potential problems of upgrading Fortran programs to a
more portable standard, by using DIRECT UNFORMATTED access, but I am
left with the cases where programs read somewhat unknown input data
files of words of bits, and proceed to determine the coding structure
used (e.g. IBM 12 bit binary, IBM 16-bit binary, Qantum 12 bit card
code, common 10,12,16 and 36 character ascii code and so on) before
reopening the file in a more suitable way for reading that particular
structure.

To date I have used the BINARY option and read chunks of data to
search for clues (e.g. searching for CR, LF, CR-LF, LF-CR, and DEL
characters and the character interval counts between each, the
presence or absence of the top one or two bits in each character, and
whether any hex zero bytes occur).

What is the simple definition of the expected structure of files
declared as UNFORMATTED SEQUENTIAL? I had always thought these are
expected to contain no non-data markers.

This seems to be the only chance of achieving portability, since
FORMATTED is obviously out and direct access only applies to
fixed-length records, other than treating the file as blocks of unknown
data and working out what happened on the last "record", which I will
use if no other idea presents itself.


  #2  
Old 08-22-2008, 02:15 AM
Richard Maine
Guest
 
Default Re: how to use standard fortran on a "binary"input file?

Terence <tbwright@cantv.net> wrote:

> I was re-reading my huge Fortran 77/90/95 language specification
> manual and noted that the open file option ACCESS='BINARY' was
> accepted as an extension only for Windows up to NT, but not other
> platforms.


Presumably you are talking about some specific, but unmentioned
compiler, as the actual Fortran language specifications certainly don't
say anything about Windows versions. Anyway...

> What is the simple definition of the expected structure of files
> declared as UNFORMATTED SEQUENTIAL? I had always thought these are
> expected to contain no non-data markers.


The standard doesn't say. But containing no non-data markers is not a
realistic expectation (except for file systems where the record
structure is maintained out-of-band, but those aren't common these
days). The most common structure involves adding record-size fields
before (and usually after) the data of the record. The details *DO*
vary. This is essentially not an option for reading non-Fortran files;
it is even a bit problematic for reading Fortran files created by other
compilers.
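One way to see the record framing described above is to write a single sequential unformatted record and compare the file size with the payload size. This is only a minimal sketch, assuming a Fortran 2003 compiler that supports SIZE= in INQUIRE; the file name and unit number are arbitrary choices for illustration.

```fortran
program record_overhead
  implicit none
  character(len=*), parameter :: fname = 'overhead_demo.dat'  ! hypothetical name
  character(len=12), parameter :: payload = 'ABCDEFGHIJKL'
  integer :: nunits, ios

  ! Write one record using the compiler's own sequential framing.
  open(unit=12, file=fname, access='sequential', form='unformatted', &
       status='replace', action='write', iostat=ios)
  if (ios /= 0) stop 'cannot create demo file'
  write(12) payload
  close(12)

  ! SIZE= returns the file length in file storage units,
  ! or -1 if the processor cannot determine it.
  inquire(file=fname, size=nunits)
  print *, 'payload length (characters)   :', len(payload)
  print *, 'file size (file storage units):', nunits
end program record_overhead
```

Any excess of the file size over the payload is framing added by the compiler. On a compiler that puts a 4-byte length before and after each record, a 12-byte payload would give a 20-byte file, but the result is expected to differ between compilers, which is the point.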

This is an issue I've been very much working with for... well... about 2
and a half decades now, I guess. It is the large part of why I wrote the
initial proposal for access='stream' in f2003.

The problem you are looking at is exactly why access='stream' was added
in f2003. You can think of access='stream' as the standardized, and
slightly cleaned up version of the nonstandard 'binary' thing you were
using.

Your main options are

1. Use access='stream', which is supported by many current compilers,
although not yet all.

2. Continue using the nonstandard binary options. Pretty much all
compilers today support at least some variant of them. I have no idea
what manual this is you are looking at or exactly what it actually says,
but the options are widely available. Sometimes the spellings do vary
slightly, which is part of why standardization was needed.

3. Interface to C and do it there. That involves some portability issues
as well, but it is doable. It wouldn't be my first choice, but I know
some people have done it that way.

4. Use direct access unformatted. There are lots of complications, but
it is doable. I know, as I've done it. You have to do your own record
management in order to give Fortran the fixed-size chunks of data that
it wants. It would take quite a while to elaborate on all the various
gotchas. (Yes, the last block is one of them). The fact that it can get
pretty messy is why I pushed for access='stream', which makes it all so
much easier.
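A minimal sketch of option 1, assuming a compiler that implements the Fortran 2003 access='stream' feature. It first creates a small test file (the name and contents are made up for illustration) and then scans it byte by byte for the kind of clues mentioned in the original question: CR, LF, NUL bytes, and bytes with the top bit set.

```fortran
program stream_scan
  implicit none
  character(len=*), parameter :: fname = 'stream_demo.dat'  ! hypothetical name
  character(len=1) :: ch
  integer :: ios, n_total, n_cr, n_lf, n_nul, n_high

  ! Create a test file: two CR-LF terminated lines followed by a NUL byte.
  open(unit=10, file=fname, access='stream', form='unformatted', &
       status='replace', action='write')
  write(10) 'AB', achar(13), achar(10), 'CD', achar(13), achar(10), achar(0)
  close(10)

  ! Reopen as a plain byte stream; no record markers are expected or added.
  open(unit=10, file=fname, access='stream', form='unformatted', &
       status='old', action='read')
  n_total = 0
  n_cr = 0
  n_lf = 0
  n_nul = 0
  n_high = 0
  do
     read(10, iostat=ios) ch
     if (ios /= 0) exit              ! end of file (or a read error)
     n_total = n_total + 1
     select case (ichar(ch))
     case (13)
        n_cr = n_cr + 1
     case (10)
        n_lf = n_lf + 1
     case (0)
        n_nul = n_nul + 1
     end select
     if (ichar(ch) >= 128) n_high = n_high + 1
  end do
  close(10)

  print *, 'bytes:', n_total, ' CR:', n_cr, ' LF:', n_lf, &
           ' NUL:', n_nul, ' top bit set:', n_high
end program stream_scan
```

Assuming one-byte default characters, this should report 9 bytes, two CRs, two LFs, one NUL and no high-bit bytes. Reading one character per READ is slow for large files; reading into a longer character buffer is the usual refinement, at the cost of handling a short final read.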

--
Richard Maine | Good judgement comes from experience;
email: last name at domain . net | experience comes from bad judgement.
domain: summertriangle | -- Mark Twain
  #3  
Old 08-22-2008, 02:19 AM
Arjen Markus
Guest
 
Default Re: how to use standard fortran on a "binary"input file?

On 22 aug, 08:15, nos...@see.signature (Richard Maine) wrote:
>
> 4. Use direct access unformatted. There are lots of complications, but
> it is doable. I know, as I've done it. You have to do your own record
> management in order to give Fortran the fixed-size chunks of data that
> it wants. It would take quite a while to elaborate on all the various
> gotchas. (Yes, the last block is one of them). The fact that it can get
> pretty messy is why I pushed for access='stream', which makes it all so
> much easier.
>


Have a look at http://flibs.sf.net for that particular approach.
It is - unfortunately - not quite without problems, as Richard
also indicates, but it could be useful nonetheless.

Regards,

Arjen
  #4  
Old 08-22-2008, 02:25 AM
robert.corbett@sun.com
Guest
 
Default Re: how to use standard fortran on a "binary"input file?

On Aug 21, 10:10 pm, Terence <tbwri...@cantv.net> wrote:
> I was re-reading my huge Fortran 77/90/95 language specification
> manual and noted that the open file option ACCESS='BINARY' was
> accepted as an extension only for Windows up to NT, but not other
> platforms.
>
> I can solve some potential problems of upgrading Fortran programs to a
> more portable standard, by using DIRECT UNFORMATTED access, but I am
> left with the cases where programs read somewhat unknown input data
> files of words of bits, and proceed to determine the coding structure
> used (e.g. IBM 12 bit binary, IBM 16-bit binary, Qantum 12 bit card
> code, common 10,12,16 and 36 character ascii code and so on) before
> reopening the file in a more suitable way for reading that particular
> structure.
>
> To date I have used the BINARY option and read chunks of data to
> search for clues (e.g. searching for CR, LF, CR-LF, LF-CR, and DEL
> characters and the character interval counts between each, the
> presence or absence of the top one or two bits in each character, and
> whether any hex zero bytes occur).
>
> What is the simple definition of the expected structure of files
> declared as UNFORMATTED SEQUENTIAL? I had always thought these are
> expected to contain no non-data markers.


The file formats used by different Fortran implementations are
all over the map. There is no reason to expect direct-access
unformatted files to be more portable than sequential-access
unformatted files, even among implementations on similar
operating systems. I know of a Fortran implementation that
allocates an extra byte at the start of each direct-access record.
The byte is used, among other things, to indicate whether the
record has been written. I know of another Fortran
implementation that builds its support for direct-access
unformatted I/O on top of an indexed file system. It is nice
in that it is possible to write records whose record numbers
are huge without using much space. On the other hand, I/O
performance tends to be lower than with the more common
implementations.

The reason unformatted records are called unformatted records
is that at one time they were. Early computer systems tended
to use record-based I/O hardware. When all I/O is physically
record-based, the record format is simply assumed. When
reading or punching cards, there is no need to guess what the
record structure might be. When reading or writing unblocked
open reel tapes, the record structure is indicated by the
physical inter-record gaps. No data needs to be provided to
indicate the record structure.

Bob Corbett
  #5  
Old 08-22-2008, 02:59 AM
Richard Maine
Guest
 
Default Re: how to use standard fortran on a "binary"input file?

<robert.corbett@sun.com> wrote:

> There is no reason to expect direct-access
> unformatted files to be more portable than sequential-access
> unformatted files, even among implementations on similar
> operating systems.


While there may be no "reason", my observation shows it to be so.

By "so" I am taking your "more portable" literally in that "more
portable" is not the same thing as "100% portable." Yes, I also know of
exceptions. But they are distinctly exceptions. You can work with an
awful lot of compilers and never run into one of the exceptions. Some,
though not all, of the exceptions can be addressed by compiler switches
such as Lahey's /nohed. That's unlike sequential unformatted, where
you find differences every time you turn around.

Since I have actually used direct access unformatted for this myself on
a large variety of systems, I'm going to be pretty adamant about claiming
that it can be done.

I acknowledge that the portability is not 100%. I recall telling some
people for whom it might have been relevant at the time that if they
were going to try to port the code in question to a 60-bit CDC machine,
they were just going to be out of luck and I wasn't going to be willing
to try to support that. I still observe that its portability is
significantly better than that of sequential unformatted in practice.

  #6  
Old 08-22-2008, 02:06 PM
glen herrmannsfeldt
Guest
 
Default Re: how to use standard fortran on a "binary"input file?

robert.corbett@sun.com wrote:
(snip)

> The reason unformatted records are called unformatted records
> is that at one time they were. Early computer systems tended
> to use record-based I/O hardware. When all I/O is physically
> record-based, the record format is simply assumed. When
> reading or punching cards, there is no need to guess what the
> record structure might be. When reading or writing unblocked
> open reel tapes, the record structure is indicated by the
> physical inter-record gaps. No data needs to be provided to
> indicate the record structure.


This is still done for many tape systems. For disks, most now
use a fixed hardware block size and buffer it in memory to
give the impression of a uniform byte stream to the user.

IBM mainframes use a record oriented I/O system for disks.
For direct access files, each record maps to a physical
disk block allocated to the appropriate size. (It may
be remapped inside the disk/controller hardware, but
the record structure is visible to the OS.)

-- glen

  #7  
Old 08-23-2008, 04:50 AM
Terence
Guest
 
Default Re: how to use standard fortran on a "binary"input file?

Yes, thanks to all responding.

I should have said I worked on early Fortran (post Fortran II) for IBM
in 1960 on, and know what was then "the standard" with respect to the
two FORM formats and two ACCESS methods, right through 370 days.
However, I have found weird changes to what I thought was sacrosanct,
when working (for my next company for 28 years) all over the world on
mainframes and process control and message switching computers, and
finding count bytes stuck into UNFORMATTED SEQUENTIAL input files.

To respond to doubts as to what compiler I am referring to, I use
CVF/DVF 6.6c for Windows work, and MS F77 v3.31 for all DOS work; both of
which accept "BINARY" as a sequential access option. And Lahey accepts
"Transparent" for the same useful purpose.

Given the above comments, I am now certain I will stick with DIRECT
UNFORMATTED and process the data myself (as usual for variable length
records).

I do wish the standard default for RECL on this mode was still bytes
(and not 4-byte words) and not a compiler option. After all the
default (if not specified) in SEQUENTIAL, is bytes, not 4-byte words -
just the opposite!

I think I learned how to deal with the last, possibly incomplete,
record for all possibilities (and there are quite a few).
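For reference, the DIRECT UNFORMATTED approach can be sketched as below. This is an illustrative sketch only, not code from anyone in the thread: it uses one-character records so that no final partial block can occur, and it obtains RECL from INQUIRE(IOLENGTH=) instead of hard-coding a number. The file name and unit number are assumptions. On a compiler whose RECL unit is larger than one byte (the 4-byte default discussed in this thread), a one-byte record cannot be expressed this way, and either a byte-unit compiler switch or a larger block size with explicit handling of the last block is needed.

```fortran
program direct_bytes
  implicit none
  character(len=*), parameter :: fname = 'direct_demo.dat'  ! hypothetical name
  character(len=*), parameter :: text = 'HELLO'
  character(len=1) :: ch
  integer :: rl, i, ios, nrec

  ! Ask the processor how long a one-character record is in its own RECL units.
  inquire(iolength=rl) ch

  ! Create a small test file, one character per direct-access record.
  open(unit=11, file=fname, access='direct', form='unformatted', &
       recl=rl, status='replace', action='write')
  do i = 1, len(text)
     write(11, rec=i) text(i:i)
  end do
  close(11)

  ! Read records 1, 2, 3, ... until the read fails (no such record).
  open(unit=11, file=fname, access='direct', form='unformatted', &
       recl=rl, status='old', action='read')
  nrec = 0
  do
     read(11, rec=nrec + 1, iostat=ios) ch
     if (ios /= 0) exit
     nrec = nrec + 1
  end do
  close(11)

  print *, 'RECL used:', rl, ' records read:', nrec
end program direct_bytes
```

The design choice here is to trade speed for simplicity: tiny records avoid the incomplete-last-record cases entirely, whereas larger blocks are faster but need the extra bookkeeping described above.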
  #8  
Old 08-23-2008, 03:22 PM
Steve Lionel
Guest
 
Default Re: how to use standard fortran on a "binary"input file?

Terence wrote:

> I do wish the standard default for RECL on this mode was still bytes
> (and not 4-byte words) and not a compiler option. After all the
> default (if not specified) in SEQUENTIAL, is bytes, not 4-byte words -
> just the opposite!


Since you're using CVF, RECL= for UNFORMATTED files is measured in
4-byte units by default. This has been the behavior of DEC compilers for
more than 30 years and comes from interpreting the F77 standard's term
"storage units" as "numeric storage units" - that is, the
size of an INTEGER or REAL.

In Fortran 2003, the standard still allows this but recommends the use
of bytes (I forget the exact wording).
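A small sketch of how a program can discover which RECL unit its compiler uses, without relying on compiler documentation. It uses only the standard INQUIRE(IOLENGTH=) form; the variable names are illustrative.

```fortran
program recl_units
  implicit none
  integer :: k
  character(len=4) :: c4
  integer :: len_int, len_chr

  k = 0
  c4 = 'abcd'

  ! IOLENGTH returns the RECL value needed for an unformatted record
  ! holding the listed items, in the processor's own RECL units.
  inquire(iolength=len_int) k
  inquire(iolength=len_chr) c4

  print *, 'RECL for one default INTEGER :', len_int
  print *, 'RECL for CHARACTER(LEN=4)    :', len_chr
end program recl_units
```

With a 4-byte default INTEGER, a result of 4 would suggest byte units and a result of 1 would suggest 4-byte units. Passing the IOLENGTH result straight to RECL= avoids needing to know which applies.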

--

Steve Lionel
Developer Products Division
Intel Corporation
Nashua, NH

For email address, replace "invalid" with "com"

User communities for Intel Software Development Products
http://softwareforums.intel.com/
Intel Fortran Support
http://support.intel.com/support/per...etools/fortran
My Fortran blog
http://www.intel.com/software/drfortran
  #9  
Old 08-23-2008, 06:31 PM
Terence
Guest
 
Default Re: how to use standard fortran on a "binary"input file?



Steve Lionel wrote:
> Terence wrote:
>
> > I do wish the standard default for RECL on this mode was still bytes
> > (and not 4-byte words) and not a compiler option. After all the
> > default (if not specified) in SEQUENTIAL, is bytes, not 4-byte words -
> > just the opposite!

>
> Since you're using CVF, the default for UNFORMATTED access is 4-byte
> units of RECL=. This has been the mode of DEC compilers for more than
> 30 years and comes from the F77 standard's use of the term "storage
> units" being interpreted as "numerical storage units" - that is, the
> size of an INTEGER or REAL.
>
> In Fortran 2003, the standard still allows this but recommends the use
> of bytes (I forget the exact wording).


Sensible! But the default is the opposite of IBM's 704/7044 etc. and
Microsoft's AT default of one byte counts - which causes puzzlement to
a new non-DEC user of the CVF compiler. IBM MAY have had a different
default for the mainframe in the later /370 days because of disk
drives, but I only used these for the UCLA BMD Fortran statistical
packages.

And the above words almost duplicate Steve's original help to me with
precisely this problem when I installed and started to use the CVF 6.6
compiler "back when" this came out. A demonstration of why I would
NEVER imply his refusal to help, Richard!

(You have to set an override switch which is well hidden in Visual
Studio).

  #10  
Old 08-23-2008, 06:33 PM
Joe
Guest
 
Default Re: how to use standard fortran on a "binary"input file?

On Aug 23, 3:22 pm, Steve Lionel <steve.lio...@intel.invalid> wrote:
> Terence wrote:
> > I do wish the standard default for RECL on this mode was still bytes
> > (and not 4-byte words) and not a compiler option. After all the
> > default (if not specified) in SEQUENTIAL, is bytes, not 4-byte words -
> > just the opposite!

>
> Since you're using CVF, the default for UNFORMATTED access is 4-byte
> units of RECL=. This has been the mode of DEC compilers for more than
> 30 years and comes from the F77 standard's use of the term "storage
> units" being interpreted as "numerical storage units" - that is, the
> size of an INTEGER or REAL.
>
> In Fortran 2003, the standard still allows this but recommends the use
> of bytes (I forget the exact wording).
>


F2003 also has ISO_FORTRAN_ENV to supply the actual file storage unit
size in bits. Hopefully, this will be commonly available in the near
future.
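A minimal sketch of querying those values, assuming a compiler that already provides the F2003 intrinsic module ISO_FORTRAN_ENV. The three named constants are the ones defined by the standard; everything else is illustrative.

```fortran
program storage_units
  use, intrinsic :: iso_fortran_env, only: file_storage_size, &
       character_storage_size, numeric_storage_size
  implicit none

  ! All three constants are sizes in bits.
  print *, 'file storage unit (bits)      :', file_storage_size
  print *, 'character storage unit (bits) :', character_storage_size
  print *, 'numeric storage unit (bits)   :', numeric_storage_size
end program storage_units
```

A file storage unit of 8 bits means sizes and stream positions are counted in bytes; a larger value would indicate word-based counting on that compiler and option setting.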

P.S. My ISP recently cut off usenet access with little warning, and
without very good reason (they could just exclude binary groups). If
other people find themselves without usenet access, you can get c.l.f
and others via proxy web sites like "Google Groups".