GAWK: A fix for "missing file is a fatal error"


  #21  
Old 08-22-2008, 09:01 AM
Kenny McCormack
Guest
 
Default Re: GAWK: A fix for "missing file is a fatal error"

In article <48AEB0AC.30901@lsupcaemnt.com>,
Ed Morton <morton@lsupcaemnt.com> wrote:
>On 8/22/2008 7:08 AM, pk wrote:
>> On Friday 22 August 2008 13:34, Aharon Robbins wrote:
>>
>>
>>>Kenny: You are, of course, welcome to fork the gawk code base and create
>>>a language that works to your specifications. You have my blessings.

>>
>>
>> In this particular case, I think a command line switch to enable the
>> behavior could be enough.
>>

>
>Yeah, but I think it'd make sense to see the default behavior changed and just
>do the abort if a new switch or the existing "--compat/traditional" switch was
>being used.


I agree with 'pk' on this one. A switch to invoke the "non-traditional"
behavior is the way to go. While I *admire* the TAWK way, I tend to
agree that the "traditional" Unix/GAWK way is what most users expect.

>On the other hand, I've never actually encountered this problem in real use so
>it's just an opinion...


True. And that's what makes this whole thread rather, shall we say,
unique. It is hard to imagine a real world instance of this _other than_
when dealing with /proc...

Still, I think that the LD_PRELOAD method is good - obviously this
syntax functions as a "switch" - if I want this functionality, I use
LD_PRELOAD; if I don't, I don't. As I said, if I were really serious
about making this a permanent change, I'd fix it in the source, but it's
just not feasible for me to do that at the moment.

  #22  
Old 08-22-2008, 09:36 AM
Ed Morton
Guest
 
Default Re: GAWK: A fix for "missing file is a fatal error"

On 8/22/2008 8:01 AM, Kenny McCormack wrote:
> In article <48AEB0AC.30901@lsupcaemnt.com>,
> Ed Morton <morton@lsupcaemnt.com> wrote:
>
>>On 8/22/2008 7:08 AM, pk wrote:
>>
>>>On Friday 22 August 2008 13:34, Aharon Robbins wrote:
>>>
>>>
>>>
>>>>Kenny: You are, of course, welcome to fork the gawk code base and create
>>>>a language that works to your specifications. You have my blessings.
>>>
>>>
>>>In this particular case, I think a command line switch to enable the
>>>behavior could be enough.
>>>

>>
>>Yeah, but I think it'd make sense to see the default behavior changed and just
>>do the abort if a new switch or the existing "--compat/traditional" switch was
>>being used.

>
>
> I agree with 'pk' on this one. A switch to invoke the "non-traditional"
> behavior is the way to go. While I *admire* the TAWK way, I tend to
> agree that the "traditional" Unix/GAWK way is what most users expect.


While I usually would agree with that, in this case we're talking about
something that almost never happens, so I doubt anyone would add that switch
every time they invoke awk just in case it does. If we have a switch to
invoke the "new" behavior then it'll probably never get used, so those who would
fall over this problem still will. And there's an alternative workaround using
getline IF you need to deal with it, so it's just pointless to add a switch to
turn ON the new behavior.
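
For readers who have not seen the getline workaround mentioned above, here is a minimal sketch of one common form of it. It is not code from this thread: the file names f1, f2, f3 and the variable names are made up for the demo, and it assumes any POSIX awk. A BEGIN block tries to read one record from each command-line file and blanks out the ARGV entries that cannot be opened, so the main loop never reaches them.

```shell
# Throwaway demo files; f2 is deliberately never created.
tmp=$(mktemp -d)
printf 'f1, line 1\n' > "$tmp/f1"
printf 'f3, line 1\n' > "$tmp/f3"

out=$(awk '
BEGIN {
    for (i = 1; i < ARGC; i++) {
        # Leave var=value assignments and "-" (stdin) alone.
        if (ARGV[i] ~ /^[a-zA-Z_][a-zA-Z0-9_]*=/ || ARGV[i] == "-")
            continue
        if ((getline junk < ARGV[i]) < 0) {
            # Open failed: warn, then blank the entry so awk skips it.
            print "warning: cannot open " ARGV[i] > "/dev/stderr"
            ARGV[i] = ""
        } else
            close(ARGV[i])   # so the main loop rereads it from the top
    }
}
{ print }
' "$tmp/f1" "$tmp/f2" "$tmp/f3" </dev/null)

printf '%s\n' "$out"
rm -rf "$tmp"
```

The lines of f1 and f3 are printed and f2 only produces a warning on stderr. Note that if every file is blanked out, awk falls back to reading standard input, which is why the demo redirects it from /dev/null.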

On the other hand, making the new behavior the default would almost certainly
not cause anyone any problems, and if it does they can add the new switch.

Ed.


  #23  
Old 08-22-2008, 02:55 PM
John DuBois
Guest
 
Default Re: GAWK: A fix for "missing file is a fatal error"

In article <g8m88m$8bl$1@news4.netvision.net.il>,
Aharon Robbins <arnold@skeeve.com> wrote:
>
>It's historical practice. Unix awk has worked this way since forever. IF
>you don't need the filenames, you could always use
>
> cat /proc/*/cmdline 2>/dev/null | awk 'program text'
>
>In any case, it would not be a good idea to change gawk's default
>behavior in this case.


I agree quite strongly! Having existing awk programs - including all those
relied upon for normal system functioning - suddenly have the potential to fail
silently rather than verbosely in the case of a missing file would be a very
bad idea.
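
To make the trade-off concrete, here is a small sketch of the cat workaround quoted above, run on throwaway files instead of /proc (the names f1, f2, f3 are invented for the demo). It assumes a shell without an option such as pipefail turned on, which is the usual default.

```shell
tmp=$(mktemp -d)
printf 'f1, line 1\n' > "$tmp/f1"
printf 'f3, line 1\n' > "$tmp/f3"

# f2 is missing. cat reports it on stderr (discarded here) and exits non-zero,
# but a pipeline normally reports the status of its last command, awk.
status=0
out=$(cat "$tmp/f1" "$tmp/f2" "$tmp/f3" 2>/dev/null | awk '{ print NR ": " $0 }') || status=$?

printf '%s\n' "$out"
echo "exit status: $status"
rm -rf "$tmp"
```

awk sees one continuous stream, so FILENAME and FNR are no longer meaningful, and the caller gets a zero exit status even though one input was missing. That is exactly the silent behaviour being argued about.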

I have no objection to an option to enable alternate behavior, though I'm among
those who would have little use for it.

John
--
John DuBois spcecdt@armory.com KC6QKZ/AE http://www.armory.com/~spcecdt/
  #24  
Old 08-23-2008, 08:04 AM
Ed Morton
Guest
 
Default Re: GAWK: A fix for "missing file is a fatal error"

On 8/22/2008 1:55 PM, John DuBois wrote:
> In article <g8m88m$8bl$1@news4.netvision.net.il>,
> Aharon Robbins <arnold@skeeve.com> wrote:
>
>>It's historical practice. Unix awk has worked this way since forever. IF
>>you don't need the filenames, you could always use
>>
>> cat /proc/*/cmdline 2>/dev/null | awk 'program text'
>>
>>In any case, it would not be a good idea to change gawk's default
>>behavior in this case.

>
>
> I agree quite strongly! Having existing awk programs - including all those
> relied upon for normal system functioning - suddenly have the potential to fail
> silently rather than verbosely in the case of a missing file would be a very
> bad idea.


It doesn't have to be silent, there's no reason for it to be a catastrophic
failure like today, there's no real reason an application should want a
significant difference between trying to open a missing file vs trying to open
an unreadable file like today, and a missing file is handled inconsistently
today between being opened by getline vs being opened in the normal work loop so
handling of missing files could seriously be considered as broken right now and
this proposal is a fix.

> I have no objection to an option to enable alternate behavior, though I'm among
> those who would have little use for it.


Right, but then no-one would actually use it as I mentioned elsethread.

Ed.


  #25  
Old 08-23-2008, 10:54 AM
Andrew Schorr
Guest
 
Default Re: GAWK: A fix for "missing file is a fatal error"

FYI, it looks to me as if Arnold has already committed a patch
to the Savannah CVS tree that changes the fatal error to
a warning if the WHINY_USERS environment variable is set:

+++ ./io.c 2008-08-22 10:30:05.534799000 -0400
@@ -316,6 +316,11 @@ nextfile(int skipping)
 				if (isdir && do_traditional)
 					continue;
 #endif
+				if (whiny_users) {
+					warning(_("cannot open file `%s' for reading (%s)"),
+							fname, strerror(errno));
+					continue;
+				}
 				goto give_up;
 			}
 			curfile->flag |= IOP_NOFREE_OBJ;

I imagine that should satisfy the various constituencies.

Regards,
Andy
  #26  
Old 08-23-2008, 11:02 AM
Kenny McCormack
Guest
 
Default Re: GAWK: A fix for "missing file is a fatal error"

In article <58a97cc4-43e0-4863-a378-249fb2fc2f7b@b1g2000hsg.googlegroups.com>,
Andrew Schorr <aschorr@telemetry-investments.com> wrote:
>FYI, it looks to me as if Arnold has already committed a patch
>to the Savannah CVS tree that changes the fatal error to
>a warning if the WHINY_USERS environment variable is set:


Interesting. Looks like we may have to frequently mention here on the
newsgroup, for the benefit of the various newbies, the need to set
WHINY_USERS in order to get proper functionality of GAWK.

Note that I am sort-of, semi, half-kidding. I do strongly believe that
array sorting is just natural and should always be on (unless your
arrays are really, really, huge, or your machine was made during the Stone
Age, I can't see how it can cost). However, as my posts here have made
clear, I'm not all that certain that this "file not found" issue is in
need of an over-arching solution. I.e., I could see turning
WHINY_USERS on for the array sorting, but not necessarily
wanting/needing this other feature turned on.

I suppose I should search the current sources to see what, if any, other
effects may have been tied to WHINY_USERS.

  #27  
Old 08-23-2008, 12:42 PM
John DuBois
Guest
 
Default Re: GAWK: A fix for "missing file is a fatal error"

In article <48AFFCCA.40305@lsupcaemnt.com>,
Ed Morton <morton@lsupcaemnt.com> wrote:
>On 8/22/2008 1:55 PM, John DuBois wrote:
>> In article <g8m88m$8bl$1@news4.netvision.net.il>,
>> Aharon Robbins <arnold@skeeve.com> wrote:
>>
>>>It's historical practice. Unix awk has worked this way since forever. IF
>>>you don't need the filenames, you could always use
>>>
>>> cat /proc/*/cmdline 2>/dev/null | awk 'program text'
>>>
>>>In any case, it would not be a good idea to change gawk's default
>>>behavior in this case.

>>
>>
>> I agree quite strongly! Having existing awk programs - including all those
>> relied upon for normal system functioning - suddenly have the potential to fail
>> silently rather than verbosely in the case of a missing file would be a very
>> bad idea.

>
>It doesn't have to be silent, there's no reason for it to be a catastrophic
>failure like today,


This need is set by all of the existing awk code out there, most of which is
not run interactively, and approximately none of which does any sort of error
checking on availability of input files. I do *not* want that code to continue
to produce output, exit successfully, etc. if input files are not available.

>there's no real reason an application should want a
>significant difference between trying to open a missing file vs trying to open
>an unreadable file like today


What significant difference?

> and a missing file is handled inconsistently
>today between being opened by getline vs being opened in the normal work loop


This is exactly the difference that *should* exist. In a getline loop, there
is a failure indication intrinsically available to the code. If a file is
simply skipped, there isn't.
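
A minimal sketch of that failure indication, assuming any POSIX awk. The helper name probe and the file names are invented for the demo. getline returns a positive value when it reads a record, 0 at end of file, and a negative value when the file cannot be opened or read, so a program can tell an empty file from a missing one.

```shell
tmp=$(mktemp -d)
: > "$tmp/empty"
printf 'one line\n' > "$tmp/full"

out=$(awk -v dir="$tmp" '
# Classify a file by the return value of a single getline:
#   > 0  a record was read,  0  end of file,  < 0  the open (or read) failed.
function probe(file,    line, rc) {
    rc = (getline line < file)
    if (rc >= 0)
        close(file)
    return (rc < 0) ? "open failed" : ((rc == 0) ? "empty" : "has data")
}
BEGIN {
    print "full: " probe(dir "/full")
    print "empty: " probe(dir "/empty")
    print "missing: " probe(dir "/missing")
}' </dev/null)

printf '%s\n' "$out"
rm -rf "$tmp"
```

Files named on the command line never pass through a test like this, which is why a skipped file would leave the program with nothing to check.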

In fact, let me put it this way: If I was designing the language today, I would
make it behave (almost) exactly as it does. A file that couldn't be opened for
any reason would, by default, be a fatal error. What I might do differently:
a) provide a command-line option to make it a non-fatal error; and b) provide a
failure block which, if used, would make it an otherwise-silent non-event:
something like OPENFAIL { }.

John
--
John DuBois spcecdt@armory.com KC6QKZ/AE http://www.armory.com/~spcecdt/
  #28  
Old 08-23-2008, 01:41 PM
Grant
Guest
 
Default Re: GAWK: A fix for "missing file is a fatal error"

On Sat, 23 Aug 2008 11:42:25 -0500, spcecdt@armory.com (John DuBois) wrote:

>In article <48AFFCCA.40305@lsupcaemnt.com>,
>Ed Morton <morton@lsupcaemnt.com> wrote:
>>On 8/22/2008 1:55 PM, John DuBois wrote:
>>> In article <g8m88m$8bl$1@news4.netvision.net.il>,
>>> Aharon Robbins <arnold@skeeve.com> wrote:
>>>
>>>>It's historical practice. Unix awk has worked this way since forever. IF
>>>>you don't need the filenames, you could always use
>>>>
>>>> cat /proc/*/cmdline 2>/dev/null | awk 'program text'
>>>>
>>>>In any case, it would not be a good idea to change gawk's default
>>>>behavior in this case.
>>>
>>>
>>> I agree quite strongly! Having existing awk programs - including all those
>>> relied upon for normal system functioning - suddenly have the potential to fail
>>> silently rather than verbosely in the case of a missing file would be a very
>>> bad idea.

>>
>>It doesn't have to be silent, there's no reason for it to be a catastrophic
>>failure like today,

>
>This need is set by all of the existing awk code out there, most of which is
>not run interactively, and approximately none of which does any sort of error
>checking on availability of input files. I do *not* want that code to continue
>to produce output, exit successfully, etc. if input files are not available.
>
>>there's no real reason an application should want a
>>significant difference between trying to open a missing file vs trying to open
>>an unreadable file like today

>
>What significant difference?
>
>> and a missing file is handled inconsistently
>>today between being opened by getline vs being opened in the normal work loop

>
>This is exactly the difference that *should* exist. In a getline loop, there
>is a failure indication intrinsically available to the code. If a file is
>simply skipped, there isn't.
>
>In fact, let me put it this way: If I was designing the language today, I would
>make it behave (almost) exactly as it does. A file that couldn't be opened for
>any reason would, by default, be a fatal error. What I might do differently:
>a) provide a command-line option to make it a non-fatal error; and b) provide a
>failure block which, if used, would make it an otherwise-silent non-event:
>something like OPENFAIL { }.


Oh yes, and add SIGNAL { } too. Having to wrap gawk script in a shell wrapper
to catch signals -- well, it can be done, like the shell wrapper for open fail.
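
For the open-fail half of that, a shell wrapper might look like the sketch below. It is only an illustration: the function name awk_readable is made up, it does nothing about signals, and it assumes a POSIX sh.

```shell
# awk_readable PROGRAM FILE...   (hypothetical helper, not a real tool)
awk_readable() {
    prog=$1
    shift
    # Rebuild the positional parameters, keeping only readable files.
    for f do
        shift
        if [ -r "$f" ]; then
            set -- "$@" "$f"
        else
            echo "awk_readable: skipping $f" >&2
        fi
    done
    # With no files left awk would read stdin; treat that as a failure.
    [ "$#" -gt 0 ] || return 1
    awk "$prog" "$@"
}

tmp=$(mktemp -d)
printf 'f1, line 1\n' > "$tmp/f1"
printf 'f3, line 1\n' > "$tmp/f3"
out=$(awk_readable '{ print FILENAME ": " $0 }' "$tmp/f1" "$tmp/f2" "$tmp/f3")

printf '%s\n' "$out"
rm -rf "$tmp"
```

Unlike the cat pipeline, this keeps FILENAME intact. The check is racy, though: a file can vanish between the test and the open, which is precisely what happens with /proc entries, so the wrapper narrows the window rather than closing it.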

Grant.
--
http://bugsplatter.id.au/
  #29  
Old 08-23-2008, 08:13 PM
Ed Morton
Guest
 
Default Re: GAWK: A fix for "missing file is a fatal error"

On 8/23/2008 11:42 AM, John DuBois wrote:
> In article <48AFFCCA.40305@lsupcaemnt.com>,
> Ed Morton <morton@lsupcaemnt.com> wrote:
>
>>On 8/22/2008 1:55 PM, John DuBois wrote:
>>
>>>In article <g8m88m$8bl$1@news4.netvision.net.il>,
>>>Aharon Robbins <arnold@skeeve.com> wrote:
>>>
>>>
>>>>It's historical practice. Unix awk has worked this way since forever. IF
>>>>you don't need the filenames, you could always use
>>>>
>>>> cat /proc/*/cmdline 2>/dev/null | awk 'program text'
>>>>
>>>>In any case, it would not be a good idea to change gawk's default
>>>>behavior in this case.
>>>
>>>
>>>I agree quite strongly! Having existing awk programs - including all those
>>>relied upon for normal system functioning - suddenly have the potential to fail
>>>silently rather than verbosely in the case of a missing file would be a very
>>>bad idea.

>>
>>It doesn't have to be silent, there's no reason for it to be a catastrophic
>>failure like today,

>
>
> This need is set by all of the existing awk code out there, most of which is
> not run interactively, and approximately none of which does any sort of error
> checking on availability of input files. I do *not* want that code to continue
> to produce output, exit successfully, etc. if input files are not available.


It does that today if the input file is empty and *I* don't care if it's empty
or can't be opened or doesn't exist.

>
>>there's no real reason an application should want a
>>significant difference between trying to open a missing file vs trying to open
>>an unreadable file like today

>
>
> What significant difference?


There isn't one in general. I was looking at a difference that only exists on
cygwin:

$ ls -l f?
-rw-r--r-- 1 morton mkgroup-l-d 11 Aug 20 16:58 f1
---------- 1 morton mkgroup-l-d 0 Aug 23 18:49 f2
-rw-r--r-- 1 morton mkgroup-l-d 11 Aug 20 16:59 f3

$ gawk '1' f1 f2 f3
f1, line 1
f3, line 1

$ rm -f f2

$ gawk '1' f1 f2 f3
f1, line 1
gawk: (FILENAME=f1 FNR=1) fatal: cannot open file `f2' for reading (No such file or directory)

It looked like it was quietly skipping the unreadable file, but when I added
content to that file:

$ ls -l f?
-rw-r--r-- 1 morton mkgroup-l-d 11 Aug 20 16:58 f1
---------- 1 morton mkgroup-l-d 14 Aug 23 18:51 f2
-rw-r--r-- 1 morton mkgroup-l-d 11 Aug 20 16:59 f3

$ gawk '1' f1 f2 f3
f1, line 1
file2, line 1
f3, line 1

I see it's just that cygwin is ignoring the unreadable permission of f2.

>
>>and a missing file is handled inconsistently
>>today between being opened by getline vs being opened in the normal work loop

>
>
> This is exactly the difference that *should* exist. In a getline loop, there
> is a failure indication intrinsically available to the code. If a file is
> simply skipped, there isn't.


That's just a design choice. You could choose to set some standard variable and
have it available for anyone who cared to test, probably in the END section.
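
As a rough emulation of that idea in today's awk, here is a sketch under stated assumptions: NOPEN_FAILED is an invented name, not a real awk or gawk variable, and the counting is done by hand in a BEGIN block rather than by awk itself.

```shell
tmp=$(mktemp -d)
printf 'f1, line 1\n' > "$tmp/f1"
printf 'f3, line 1\n' > "$tmp/f3"

status=0
out=$(awk '
BEGIN {
    # NOPEN_FAILED is a made-up name standing in for the proposed variable.
    # (var=value arguments and "-" are not handled in this short sketch.)
    for (i = 1; i < ARGC; i++) {
        if ((getline junk < ARGV[i]) < 0) {
            NOPEN_FAILED++
            ARGV[i] = ""          # skipped, but remembered
        } else
            close(ARGV[i])
    }
}
{ print }
END {
    # Normal output was still produced; the exit status carries the bad news.
    if (NOPEN_FAILED > 0)
        exit 1
}' "$tmp/f1" "$tmp/f2" "$tmp/f3" </dev/null) || status=$?

printf '%s\n' "$out"
echo "exit status: $status"
rm -rf "$tmp"
```

The readable files are processed in full, and a caller that cares can still detect the missing one from the non-zero exit status.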

> In fact, let me put it this way: If I was designing the language today, I would
> make it behave (almost) exactly as it does. A file that couldn't be opened for
> any reason would, by default, be a fatal error.


Why? If you're going to do that, why not make an empty file a fatal error too?
If you care about it, why not test all the files up front and then not open any
of them rather than producing partial output? Those are rhetorical questions - I
don't really care what the answers are as what to do is just a matter of
opinion, BUT a fatal error tears the rug out from under you in terms of handling
various input conditions.

> What I might do differently: a) provide a command-line option to make it a
> non-fatal error; and b) provide a
> failure block which, if used, would make it an otherwise-silent non-event:
> something like OPENFAIL { }.


I agree with both of those, though obviously I'd switch the default behavior.

Ed.

  #30  
Old 08-24-2008, 04:37 AM
Kenny McCormack
Guest
 
Default Re: GAWK: A fix for "missing file is a fatal error"

In article <48B0A7AB.4030907@lsupcaemnt.com>,
Ed Morton <morton@lsupcaemnt.com> wrote (>) in response to someone else (>>):
....
>> What I might do differently: a) provide a command-line option to make
>> it a non-fatal error; and b) provide a failure block which, if used,
>> would make it an otherwise-silent non-event: something like OPENFAIL
>> { }.

>
>I agree with both of those, though obviously I'd switch the default behavior.
>
> Ed.
>


I think what most people are arguing is that you *can't* change the
default behavior, however much we wish it had been done right (IOHO) in
the beginning, because it *might* break existing code. The situation is
much the same as that which has Solaris keeping two very broken programs
around (and making them the default on the default PATH). I am
referring, of course, to their keeping /bin/awk (very broken) and
/bin/sh (original sh, warts and all, even as the world is moving towards
the so-called "POSIX" shell).

P.S. IOHO: In our humble opinion

All times are GMT -5. The time now is 02:42 AM.