GAWK: A fix for "missing file is a fatal error"

#1  Kenny McCormack  (08-20-2008, 08:48 AM)

In one of my scripts, I found that in GAWK, if a file is missing (can't
be opened), GAWK bombs with a fatal error. I have a vague memory of there
being some discussion of this issue in this group at some point in the
distant past, and the consensus was that there wasn't anything you could
do about it.

Note to standards jockeys: No, this isn't a bug in your precious GAWK in
the usual "standards" sense. So, don't even bother.

The fact is that, under certain conditions, it *is* a mis-feature, and it
would be nice to at least have the option of continuing. I note in
passing that TAWK handles this rather better - you get a warning about a
missing file, but the script continues. Ideal, of course, would be a
settable option, so you can select the behavior that you want.

Note: I am talking about files read in the "automatic input loop", not
via "getline".

Obviously one solution would be to hack (fix) the GAWK source code and
recompile, but that is inconvenient for me (due to some reasons beyond
the scope of this document). So, I elected to fix it via an "interposer".
See below.

This solution works for me under Linux - you may need to adjust
accordingly for your environment.

$ cat open_fix.c
/* A lib to fix the GAWK missing files problem */
/* Usage: export LD_PRELOAD=/path/to/this/lib */
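#define _GNU_SOURCE
#include <dlfcn.h>
#include <stdarg.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>

static int (*myopen64) (const char *, int, ...);

int open64(const char *path, int flags, ...) {
    mode_t mode = 0;
    int ret;

    /* A mode argument is only passed when the call may create a file */
    if (flags & O_CREAT) {
        va_list ap;
        va_start(ap, flags);
        mode = (mode_t) va_arg(ap, int);
        va_end(ap);
    }
    /* Look up the real open64() once, on first use */
    if (!myopen64)
        myopen64 = (int (*)(const char *, int, ...)) dlsym(RTLD_NEXT, "open64");
    ret = myopen64(path, flags, mode);
    /* If the open failed, hand back /dev/null so GAWK sees an empty file */
    return ret != -1 ? ret : myopen64("/dev/null", flags, mode);
}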
$ gcc -fPIC -W -Wall -Werror -c open_fix.c
$ ld -G -h libopen_fix.so.1 -ldl -o libopen_fix.so open_fix.o
$ LD_PRELOAD=./libopen_fix.so gawk '{print FILENAME,$0}' goodfile badfile goodfile1

Have fun!
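
The post mentions two behaviors the interposer above does not have: a TAWK-style warning about the missing file, and a settable option. The following is only a sketch of how the fallback logic might look with both, assuming it should apply solely to read-only opens that fail with ENOENT. The function name open_or_devnull and the environment variable OPEN_FIX_QUIET are hypothetical, chosen for illustration; neither is part of GAWK or of the library above. It is written as an ordinary function rather than an interposer so it can be tried without LD_PRELOAD.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical switch: when set, the warning is suppressed. */
#define QUIET_ENV "OPEN_FIX_QUIET"

/*
 * Open path with the given flags (which must not include O_CREAT).
 * If the file does not exist and the open was read-only, warn on
 * stderr and return a descriptor for /dev/null instead.  Any other
 * failure (permissions, write opens, ...) is passed through as -1.
 */
int open_or_devnull(const char *path, int flags)
{
    int fd;

    assert(path != NULL);
    fd = open(path, flags);
    if (fd != -1)
        return fd;
    if (errno != ENOENT || (flags & O_ACCMODE) != O_RDONLY)
        return -1;              /* not the case we want to mask */

    if (getenv(QUIET_ENV) == NULL)
        fprintf(stderr, "warning: cannot open %s: %s (using /dev/null)\n",
                path, strerror(ENOENT));
    return open("/dev/null", O_RDONLY);
}
```

Restricting the fallback to ENOENT on read-only opens is a design choice of this sketch, not of the original: it keeps permission errors and failed writes visible instead of silently redirecting them. The caller is still responsible for calling close() on the returned descriptor.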

#2  pk  (08-20-2008, 10:01 AM)

On Wednesday 20 August 2008 14:48, Kenny McCormack wrote:

> Obviously one solution would be to hack (fix) the GAWK source code and
> recompile, but that is inconvenient for me (due to some reasons beyond
> the scope of this document). So, I elected to fix it via an "interposer".


What's wrong with checking if the file exists like this:

awk '{print FILENAME,$0}' "$( [ -f file ]&&echo file||echo /dev/null )" etc.

(except the fact that one usually doesn't post that to a newsgroup)

#3  Janis Papanagnou  (08-20-2008, 10:15 AM)

pk wrote:
> On Wednesday 20 August 2008 14:48, Kenny McCormack wrote:
>
>
>>Obviously one solution would be to hack (fix) the GAWK source code and
>>recompile, but that is inconvenient for me (due to some reasons beyond
>>the scope of this document). So, I elected to fix it via an "interposer".

>
>
> What's wrong with checking if the file exists like this:
>
> awk '{print FILENAME,$0}' "$( [ -f file ]&&echo file||echo /dev/null )" etc.


Maybe because that would be impractical if you're processing many file
arguments...?

awk '{print FILENAME,$0}' prefix*.ext


Janis

>
> (except the fact that one usually doesn't post that to a newsgroup)
>

#4  Janis Papanagnou  (08-20-2008, 10:16 AM)

Janis Papanagnou wrote:
> pk wrote:
>
>> On Wednesday 20 August 2008 14:48, Kenny McCormack wrote:
>>
>>
>>> Obviously one solution would be to hack (fix) the GAWK source code and
>>> recompile, but that is inconvenient for me (due to some reasons beyond
>>> the scope of this document). So, I elected to fix it via an
>>> "interposer".

>>
>>
>>
>> What's wrong with checking if the file exists like this:
>>
>> awk '{print FILENAME,$0}' "$( [ -f file ]&&echo file||echo /dev/null
>> )" etc.

>
>
> Maybe because that would be impractical if you're processing many file
> arguments...?
>
> awk '{print FILENAME,$0}' prefix*.ext


Oops, that doesn't make much sense with wildcards.

> Janis
>
>>
>> (except the fact that one usually doesn't post that to a newsgroup)
>>

#5  Kenny McCormack  (08-20-2008, 10:17 AM)

In article <g8h7u1$4pq$1@aioe.org>, pk <pk@pk.invalid> wrote:
>On Wednesday 20 August 2008 14:48, Kenny McCormack wrote:
>
>> Obviously one solution would be to hack (fix) the GAWK source code and
>> recompile, but that is inconvenient for me (due to some reasons beyond
>> the scope of this document). So, I elected to fix it via an "interposer".

>
>What's wrong with checking if the file exists like this:
>
>awk '{print FILENAME,$0}' "$( [ -f file ]&&echo file||echo /dev/null )" etc.
>
>(except the fact that one usually doesn't post that to a newsgroup)
>


1) The above is ugly.
2) The above is ugly.
3) It doesn't scale. My issue is that I have many, many input files and
I'd hate to have to code that kludge in a loop.
4) The issue is that the files can appear or disappear at any time - so
there is a race condition going on - you can't really rely on the above
to work.
5) I was going to point out that the above is shell, so OT, but then again,
I suppose C is OT as well. But not quite so much OT as shell is.
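
For comparison, the existence check from the quoted one-liner can be applied to a whole list of file operands in a loop. The sketch below does that in C; substitute_missing and devnull_path are hypothetical names used only for illustration. It addresses point 3 but not point 4: the check can go stale between the call to access() and the moment awk actually opens the file.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Stand-in path handed to awk in place of an unreadable file. */
static char devnull_path[] = "/dev/null";

/*
 * Replace every entry of args that cannot be read right now with
 * /dev/null.  Returns the number of entries replaced.
 * The result is only a snapshot: a file may still vanish afterwards.
 */
size_t substitute_missing(char **args, size_t count)
{
    size_t i, replaced = 0;

    assert(args != NULL || count == 0);
    for (i = 0; i < count; i++) {
        if (access(args[i], R_OK) != 0) {
            args[i] = devnull_path;
            replaced++;
        }
    }
    return replaced;
}
```

A wrapper program could call this on the file operands only (not on the awk program text or options) and then hand the adjusted argument vector to execvp().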

Anyway, it works for me, and that's the important thing.
I have always thought that it is better to enhance (fix) the language
than to kludge around it.

#6  pk  (08-20-2008, 10:33 AM)

On Wednesday 20 August 2008 16:17, Kenny McCormack wrote:

> In article <g8h7u1$4pq$1@aioe.org>, pk <pk@pk.invalid> wrote:
>>On Wednesday 20 August 2008 14:48, Kenny McCormack wrote:
>>
>>> Obviously one solution would be to hack (fix) the GAWK source code and
>>> recompile, but that is inconvenient for me (due to some reasons beyond
>>> the scope of this document). So, I elected to fix it via an
>>> "interposer".

>>
>>What's wrong with checking if the file exists like this:
>>
>>awk '{print FILENAME,$0}' "$( [ -f file ]&&echo file||echo /dev/null )"
>>etc.
>>
>>(except the fact that one usually doesn't post that to a newsgroup)
>>

>
> 1) The above is ugly.
> 2) The above is ugly.


Very good technical reasons.

> 3) It doesn't scale. My issue is that I have many, many input files and
> I'd hate to have to code that kludge in a loop.


You still have to code *your* kludge.

> 4) The issue is that the files can appear or disappear at any time - so
> there is a race condition going on - you can't really rely on the
> above to work.


This is a good reason (which you didn't mention before).

> 5) I was going to point out that the above is shell, so OT, but then
> again, I suppose C is OT as well. But not quite so much OT as shell is.


Yes, of course you are the one who decides that; I had forgotten.

#7  pk  (08-20-2008, 10:41 AM)

On Wednesday 20 August 2008 16:16, Janis Papanagnou wrote:

>>> What's wrong with checking if the file exists like this:
>>>
>>> awk '{print FILENAME,$0}' "$( [ -f file ]&&echo file||echo /dev/null
>>> )" etc.

>>
>>
>> Maybe because that would be impractical if you're processing many file
>> arguments...?
>>
>> awk '{print FILENAME,$0}' prefix*.ext

>
> Oops, that doesn't make much sense with wildcards.


You're right, you need even more ugly kludges in that case, while the
interposer works fine because it only sees the filenames as expanded by the
shell.

#8  pk  (08-20-2008, 10:43 AM)

On Wednesday 20 August 2008 16:16, Janis Papanagnou wrote:

>> Maybe because that would be impractical if you're processing many file
>> arguments...?
>>
>> awk '{print FILENAME,$0}' prefix*.ext

>
> Oops, that doesn't make much sense with wildcards.


In that case the shell does all the work and the resulting file list
contains only files that actually exist. If we want to be picky, there's
still the race condition problem between the moment the shell expands the
list and awk tries to open each file.

#9  Kenny McCormack  (08-20-2008, 11:35 AM)

In article <g8h9qd$df1$1@aioe.org>, pk <pk@pk.invalid> wrote:
....
>> 1) The above is ugly.
>> 2) The above is ugly.

>
>Very good technical reasons.


Indeed.

>> 3) It doesn't scale. My issue is that I have many, many input files and
>> I'd hate to have to code that kludge in a loop.

>
>You still have to code *your* kludge.


One man's kludge is another man's thing of beauty.

>> 4) The issue is that the files can appear or disappear at any time - so
>> there is a race condition going on - you can't really rely on the
>> above to work.

>
>This is a good reason (which you didn't mention before).


Yes. In fact, that's the real problem - the race condition between when
the shell expands the filenames and when AWK gets around to reading them.

By the way, my input file specification is: /proc/*/cmdline
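
The race described here can be sketched in C: expand a pattern the way the shell would, then open each match, treating a failed open as "the file went away in the meantime" rather than as a fatal error. This is a minimal sketch, not GAWK's code; read_paths and read_matches are hypothetical names, and any open failure is counted as a skip for simplicity.

```c
#include <assert.h>
#include <glob.h>
#include <stdio.h>
#include <string.h>

/*
 * Read every file in paths and return the total number of bytes read.
 * A file that cannot be opened (for example because it vanished after
 * the list was built) is counted in *skipped instead of aborting.
 */
long read_paths(char **paths, size_t count, int *skipped)
{
    long total = 0;
    size_t i;

    assert(skipped != NULL);
    *skipped = 0;
    for (i = 0; i < count; i++) {
        char buf[4096];
        size_t n;
        FILE *fp = fopen(paths[i], "r");

        if (fp == NULL) {           /* lost the race: skip this file */
            (*skipped)++;
            continue;
        }
        while ((n = fread(buf, 1, sizeof buf, fp)) > 0)
            total += (long) n;
        fclose(fp);
    }
    return total;
}

/*
 * Expand pattern with glob(3), then read all matches.
 * Returns 0 if nothing matched and -1 if the expansion itself failed.
 */
long read_matches(const char *pattern, int *skipped)
{
    glob_t g;
    long total = -1;
    int rc;

    assert(pattern != NULL && skipped != NULL);
    *skipped = 0;
    memset(&g, 0, sizeof g);
    rc = glob(pattern, 0, NULL, &g);
    if (rc == 0)
        total = read_paths(g.gl_pathv, g.gl_pathc, skipped);
    else if (rc == GLOB_NOMATCH)
        total = 0;
    globfree(&g);
    return total;
}
```

With the pattern from this post, a call would look like read_matches("/proc/*/cmdline", &skipped); any process that exits between the expansion and the open simply shows up in the skipped count.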

>> 5) I was going to point out that the above is shell, so OT, but then
>> again, I suppose C is OT as well. But not quite so much OT as shell is.

>
>Yes, of course you are the one who decides that; I had forgotten.


Yes. I am the boss here. And don't nobody be forgettin' it!

#10  Janis Papanagnou  (08-20-2008, 01:57 PM)

pk wrote:
> On Wednesday 20 August 2008 16:16, Janis Papanagnou wrote:
>
>>>Maybe because that would be impractical if you're processing many file
>>>arguments...?
>>>
>>> awk '{print FILENAME,$0}' prefix*.ext

>>
>>Oops, that doesn't make much sense with wildcards.

>
> In that case the shell does all the work and the resulting file list
> contains only files that actually exist.


Yes, that's why I cancelled my original message and added this comment.
I still think it's a Good Thing to let an "invisible" layer handle that,
rather than using explicit workarounds for each of the given files, and
so avoid "non-scalable" (as Kenny called it) shell constructs. That was
my intention in introducing the wildcard example in the first place: to
show the problem that arises with many file arguments.

Janis