#1
In one of my scripts, I found that in GAWK, if a file is missing (can't be opened), GAWK bombs with a fatal error. I have a vague memory of there being some discussion of this issue in this group at some point in the distant past, and the consensus was that there wasn't anything you could do about it.

Note to standards jockeys: No, this isn't a bug in your precious GAWK in the usual "standards" sense. So, don't even bother. The fact is that, under certain conditions, it *is* a mis-feature, and it would be nice to at least have the option of continuing. I note in passing that TAWK handles this rather better - you get a warning about a missing file, but the script continues. Ideal, of course, would be a settable option, so you can select the behavior that you want.

Note: I am talking about files read in the "automatic input loop", not via "getline".

Obviously one solution would be to hack (fix) the GAWK source code and recompile, but that is inconvenient for me (due to some reasons beyond the scope of this document). So, I elected to fix it via an "interposer". See below. This solution works for me under Linux - you may need to adjust accordingly for your environment.

$ cat open_fix.c
/* A lib to fix the GAWK missing files problem */
/* Usage: export LD_PRELOAD=/path/to/this/lib */
#define _GNU_SOURCE
#include <dlfcn.h>
#include <stdarg.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>

static int (*myopen64)(const char *, int, ...);

int open64(const char *path, int flags, ...)
{
    int ret;
    mode_t mode = 0;

    /* The third (mode) argument is only supplied when O_CREAT is set. */
    if (flags & O_CREAT) {
        va_list ap;
        va_start(ap, flags);
        mode = (mode_t) va_arg(ap, int);
        va_end(ap);
    }
    /* Look up the real open64() the first time we are called. */
    if (!myopen64)
        myopen64 = (int (*)(const char *, int, ...)) dlsym(RTLD_NEXT, "open64");
    ret = myopen64(path, flags, mode);
    /* If the open failed, hand back /dev/null instead of an error. */
    return ret != -1 ? ret : myopen64("/dev/null", flags, mode);
}

$ gcc -fPIC -W -Wall -Werror -c open_fix.c
$ ld -G -h libopen_fix.so.1 -ldl -o libopen_fix.so open_fix.o
$ LD_PRELOAD=./libopen_fix.so gawk '{print FILENAME,$0}' goodfile badfile goodfile1

Have fun!

#2
On Wednesday 20 August 2008 14:48, Kenny McCormack wrote:

> Obviously one solution would be to hack (fix) the GAWK source code and
> recompile, but that is inconvenient for me (due to some reasons beyond
> the scope of this document). So, I elected to fix it via an "interposer".

What's wrong with checking if the file exists like this:

awk '{print FILENAME,$0}' "$( [ -f file ]&&echo file||echo /dev/null )" etc.

(except the fact that one usually doesn't post that to a newsgroup)

#3
pk wrote:
> On Wednesday 20 August 2008 14:48, Kenny McCormack wrote:
>
>> Obviously one solution would be to hack (fix) the GAWK source code and
>> recompile, but that is inconvenient for me (due to some reasons beyond
>> the scope of this document). So, I elected to fix it via an "interposer".
>
> What's wrong with checking if the file exists like this:
>
> awk '{print FILENAME,$0}' "$( [ -f file ]&&echo file||echo /dev/null )" etc.

Maybe because that would be impractical if you're processing many file
arguments...?

  awk '{print FILENAME,$0}' prefix*.ext

Janis

> (except the fact that one usually doesn't post that to a newsgroup)

#4
Janis Papanagnou wrote:
> pk wrote:
>
>> On Wednesday 20 August 2008 14:48, Kenny McCormack wrote:
>>
>>> Obviously one solution would be to hack (fix) the GAWK source code and
>>> recompile, but that is inconvenient for me (due to some reasons beyond
>>> the scope of this document). So, I elected to fix it via an
>>> "interposer".
>>
>> What's wrong with checking if the file exists like this:
>>
>> awk '{print FILENAME,$0}' "$( [ -f file ]&&echo file||echo /dev/null )" etc.
>
> Maybe because that would be impractical if you're processing many file
> arguments...?
>
>   awk '{print FILENAME,$0}' prefix*.ext

Oops, makes not much sense with wildcards.

> Janis
>
>> (except the fact that one usually doesn't post that to a newsgroup)

#5
In article <g8h7u1$4pq$1@aioe.org>, pk <pk@pk.invalid> wrote:
>On Wednesday 20 August 2008 14:48, Kenny McCormack wrote:
>
>> Obviously one solution would be to hack (fix) the GAWK source code and
>> recompile, but that is inconvenient for me (due to some reasons beyond
>> the scope of this document). So, I elected to fix it via an "interposer".
>
>What's wrong with checking if the file exists like this:
>
>awk '{print FILENAME,$0}' "$( [ -f file ]&&echo file||echo /dev/null )" etc.
>
>(except the fact that one usually doesn't post that to a newsgroup)
>

1) The above is ugly.
2) The above is ugly.
3) It doesn't scale. My issue is that I have many, many input files and
   I'd hate to have to code that kludge in a loop.
4) The issue is that the files can appear or disappear at any time - so
   there is a race condition going on - you can't really rely on the
   above to work.
5) I was going to point out that the above is shell, so OT, but then
   again, I suppose C is OT as well. But not quite so much OT as shell is.

Anyway, it works for me, and that's the important thing. I have always thought that it is better to enhance (fix) the language than to kludge around it.

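For reference, pk's one-file test can be wrapped in a loop along the following lines. This is only a sketch: the function name awk_skip_missing is made up here, and the check still happens before awk opens anything, so it does nothing about the race described in point 4.

```shell
#!/bin/sh
# Hypothetical helper (name invented for this sketch): run awk over the
# given files, substituting /dev/null for any argument that is not a
# readable regular file at the moment of the check.
# Usage: awk_skip_missing 'awk program' file...
awk_skip_missing() {
    prog=$1
    shift
    n=$#
    # Rotate through the original arguments exactly once, appending
    # either the file itself or /dev/null to the end of the list.
    while [ "$n" -gt 0 ]; do
        f=$1
        shift
        if [ -f "$f" ] && [ -r "$f" ]; then
            set -- "$@" "$f"
        else
            set -- "$@" /dev/null
        fi
        n=$((n - 1))
    done
    awk "$prog" "$@"
}
```

Called as awk_skip_missing '{print FILENAME,$0}' goodfile badfile goodfile1, it passes goodfile, /dev/null and goodfile1 to awk, provided badfile is missing when the check runs.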
#6
On Wednesday 20 August 2008 16:17, Kenny McCormack wrote:

> In article <g8h7u1$4pq$1@aioe.org>, pk <pk@pk.invalid> wrote:
>>On Wednesday 20 August 2008 14:48, Kenny McCormack wrote:
>>
>>> Obviously one solution would be to hack (fix) the GAWK source code and
>>> recompile, but that is inconvenient for me (due to some reasons beyond
>>> the scope of this document). So, I elected to fix it via an
>>> "interposer".
>>
>>What's wrong with checking if the file exists like this:
>>
>>awk '{print FILENAME,$0}' "$( [ -f file ]&&echo file||echo /dev/null )"
>>etc.
>>
>>(except the fact that one usually doesn't post that to a newsgroup)
>>
>
> 1) The above is ugly.
> 2) The above is ugly.

Very good technical reasons.

> 3) It doesn't scale. My issue is that I have many, many input files and
>    I'd hate to have to code that kludge in a loop.

You still have to code *your* kludge.

> 4) The issue is that the files can appear or disappear at any time - so
>    there is a race condition going on - you can't really rely on the
>    above to work.

This is a good reason (which you didn't mention before).

> 5) I was going to point out that the above is shell, so OT, but then
>    again, I suppose C is OT as well. But not quite so much OT as shell is.

Yes, of course you are the one who decides that, I had forgotten it.

#7
On Wednesday 20 August 2008 16:16, Janis Papanagnou wrote:

>>> What's wrong with checking if the file exists like this:
>>>
>>> awk '{print FILENAME,$0}' "$( [ -f file ]&&echo file||echo /dev/null )" etc.
>>
>> Maybe because that would be impractical if you're processing many file
>> arguments...?
>>
>>   awk '{print FILENAME,$0}' prefix*.ext
>
> Oops, makes not much sense with wildcards.

You're right, you need even more ugly kludges in that case, while the interposer works fine because it only sees the filenames as expanded by the shell.

#8
On Wednesday 20 August 2008 16:16, Janis Papanagnou wrote:

>> Maybe because that would be impractical if you're processing many file
>> arguments...?
>>
>>   awk '{print FILENAME,$0}' prefix*.ext
>
> Oops, makes not much sense with wildcards.

In that case the shell does all the work and the resulting file list contains only files that actually exist. If we want to be picky, there's still the race condition problem between the moment the shell expands the list and awk tries to open each file.

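The window between wildcard expansion and the open can be reproduced on purpose. The sketch below is an illustration only: a scratch directory stands in for a volatile location, and all file names are invented. What awk does with the vanished file (abort or warn and continue) depends on the awk implementation and version, so no output is claimed for that step.

```shell
#!/bin/sh
# Illustration only: a scratch directory plays the role of a place where
# files come and go; every name below is made up for the demo.
scratch=$(mktemp -d)
printf 'one\n' > "$scratch/a"
printf 'two\n' > "$scratch/b"

set -- "$scratch"/*     # the shell expands the wildcard now: a and b
rm "$scratch/b"         # b disappears before awk gets to open it

echo "arguments handed to awk: $#"
for f in "$@"; do
    [ -e "$f" ] || echo "already gone: ${f##*/}"
done

# awk still receives both names; how it reacts to the missing one is
# implementation dependent, so only the exit status is reported here.
status=0
awk '{ print FILENAME, $0 }' "$@" || status=$?
echo "awk exit status: $status"
```

The existence test in the loop and the later open inside awk are two separate steps, which is exactly why a pre-check in the shell cannot close the gap.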
#9
In article <g8h9qd$df1$1@aioe.org>, pk <pk@pk.invalid> wrote:
....
>> 1) The above is ugly.
>> 2) The above is ugly.
>
>Very good technical reasons.

Indeed.

>> 3) It doesn't scale. My issue is that I have many, many input files and
>>    I'd hate to have to code that kludge in a loop.
>
>You still have to code *your* kludge.

One man's kludge is another man's thing of beauty.

>> 4) The issue is that the files can appear or disappear at any time - so
>>    there is a race condition going on - you can't really rely on the
>>    above to work.
>
>This is a good reason (which you didn't mention before).

Yes. In fact, that's the real problem - the race condition between when the shell expands the filenames and when AWK gets around to reading them. By the way, my input file specification is: /proc/*/cmdline

>> 5) I was going to point out that the above is shell, so OT, but then
>>    again, I suppose C is OT as well. But not quite so much OT as shell is.
>
>Yes, of course you are the one who decides that, I had forgotten it.

Yes. I am the boss here. And don't nobody be forgettin' it!

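The first post separates the automatic input loop from "getline", and that distinction suggests another route around the race: read each file with getline inside BEGIN, where a failed open is reported as a return value of -1 instead of an error. The sketch below assumes the per-line work can be moved into the loop; the function name print_readable is hypothetical, and FILENAME is not set when reading this way, so the loop variable is printed instead.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): print "file line" for every
# line of every argument, silently skipping files that cannot be opened.
print_readable() {
    awk '
    BEGIN {
        for (i = 1; i < ARGC; i++) {
            file = ARGV[i]
            # getline returns 1 per record read, 0 at end of file and
            # -1 if the file cannot be opened; it sets $0 and NF.
            while ((getline < file) > 0)
                print file, $0
            close(file)
        }
    }' "$@"
}
```

Because the program consists of a BEGIN rule only, awk never enters the automatic input loop, so a file that vanished after wildcard expansion is simply skipped. How a given awk treats the NUL-separated contents of /proc/*/cmdline is a separate question that this sketch does not address.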
#10
pk wrote:
> On Wednesday 20 August 2008 16:16, Janis Papanagnou wrote:
>
>>> Maybe because that would be impractical if you're processing many file
>>> arguments...?
>>>
>>>   awk '{print FILENAME,$0}' prefix*.ext
>>
>> Oops, makes not much sense with wildcards.
>
> In that case the shell does all the work and the resulting file list
> contains only files that actually exist.

Yes, that's why I cancelled my original message and added this comment.

I think it's still a Good Thing to let an "invisible" layer handle that, instead of using explicit workarounds for each of the given files, and to avoid "non-scalable" (as Kenny called it) shell constructs. That was my intention in introducing the wildcard example in the first place: to show the problem that arises with many file arguments.

Janis