awk file size limit?

#1  07-23-2008, 12:32 PM  claudegps

Hi all,
I'm trying to learn awk and use it for file manipulation.
I'm trying a simple:
awk '{gsub("\t","    "); print >FILENAME}' *.py

This should give me all the Python scripts with tabs substituted by 4
spaces.
It works, but the output files are always truncated: on Ubuntu to
4k, on Cygwin to 80k.

Any suggestions?
Thanks in advance

Claudio
#2  07-23-2008, 12:50 PM  Kenny McCormack

Short summary: It's not what you think.

You can't write back to the same file you're reading from (simultaneously).

You either have to arrange to do the usual "write to temp file, delete
original file, rename temp file to original file" dance, *or*, if you're
careful, in AWK, you can build up the new file in an array, then after
the original file is fully read and has been closed, then write out the
array to the original filename.
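
The truncation is consistent with `print >FILENAME` emptying the file on the first write while awk has only read its first input buffer, which would presumably explain why the cut-off differs between Ubuntu and Cygwin. A minimal sketch of the temp-file dance for several files follows. It is an illustration only, not something posted in the thread: the function name `detab_files` and the `.tmp.$$` suffix are assumptions, and because `mv` replaces the original with a new file, the original's permissions and hard links are not preserved.

```shell
# detab_files: hypothetical helper, not from the thread.
# Replaces each tab with 4 spaces in every file named as an argument.
detab_files() {
    for f in "$@"; do
        [ -f "$f" ] || continue        # skip unmatched globs and non-files
        tmp="$f.tmp.$$"                # temp file next to the original
        # Write the converted text to the temp file; replace the original
        # only if awk succeeded.
        if awk '{ gsub("\t", "    "); print }' "$f" > "$tmp"; then
            mv "$tmp" "$f"
        else
            rm -f "$tmp"
        fi
    done
}
```

It would be called as `detab_files *.py`. The original is never open for writing while awk is still reading it, so nothing is lost if awk fails partway through.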

#3  07-23-2008, 01:05 PM  claudegps

On 23 Jul, 18:50, gaze...@xmission.xmission.com (Kenny McCormack)
wrote:
> Short summary: It's not what you think.
>
> You can't write back to the same file you're reading from (simultaneously).


I see...

> You either have to arrange to do the usual "write to temp file, delete
> original file, rename temp file to original file" dance, *or*, if you're


Sure this works!

> careful, in AWK, you can build up the new file in an array, then after
> the original file is fully read and has been closed, then write out the
> array to the original filename.


OK, I'll study this and give it a try.
I think I should forget the "single line does all the work I need" idea.
Thanks for your help!

Claudio
#4  07-23-2008, 11:38 PM  Ed Morton



On 7/23/2008 12:05 PM, claudegps wrote:
> On 23 Jul, 18:50, gaze...@xmission.xmission.com (Kenny McCormack)
> wrote:
>>You either have to arrange to do the usual "write to temp file, delete
>>original file, rename temp file to original file" dance, *or*, if you're

>
>
> Sure this works!
>
>
>>careful, in AWK, you can build up the new file in an array, then after
>>the original file is fully read and has been closed, then write out the
>>array to the original filename.

>
>
> OK, I'll study this and give it a try.


OK, but then forget it again and never, ever do it :-). Seriously - it's one of
those things you CAN do for an exercise, e.g. if you just have one file:

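awk '
# Buffer each output line in an array instead of printing it.
function printout(_str) { _out[++_nr] = _str }
# After all input is read, write the buffered lines back to the file.
function flushout(    _i) {
    close(FILENAME)
    for (_i = 1; _i <= _nr; _i++)
        print _out[_i] > FILENAME
}
{ gsub("\t", "    "); printout($0) }
END { flushout() }' file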

but it's just obscure and complicated compared to a simple tmp file, e.g. in UNIX:

awk '{gsub("\t","    ")}1' file > tmp && mv tmp file

> I think I should forget the "single line does all the work I need" idea.


You can do a LOT with a single line. Of course, it somewhat depends on how long
a line you think is reasonable.

Ed.

#5  07-24-2008, 07:39 AM  Kenny McCormack

In article <4887F937.1080402@lsupcaemnt.com>,
Ed Morton <morton@lsupcaemnt.com> wrote:
....
>... compared to a simple tmp file,
>e.g. in UNIX:
>
>awk '{gsub("\t","    ")}1' file > tmp && mv tmp file


The problem is that this doesn't easily generalize to handling multiple
files (in a single AWK program). You end up doing a shell loop, and
that is, of course, 1) OT here and 2) Not generalizable to other
(non-Unix) OSs.

OB CYA: Yes, of course there are ways to do it, but the point is that it
doesn't flow naturally in AWK. This is an area where Perl/sed's -i
option is actually a nice piece of syntactic sugar.
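
One of those "ways to do it" inside a single AWK program might look like the sketch below, which extends the array approach from earlier in the thread to several files by writing each file back once the next one has started. The names `detab_all` and `writeback` are hypothetical, not from the thread, and the same caveats apply: each file is held in memory in full, and an interruption during the write-back loses data.

```shell
# detab_all: hypothetical helper, not from the thread.
# Replaces tabs with 4 spaces in all named files using one awk process.
detab_all() {
    awk '
    # Write the buffered lines of a fully read file back to it.
    function writeback(name,    i) {
        for (i = 1; i <= n; i++)
            print buf[i] > name
        close(name)
        n = 0
    }
    FNR == 1 && prev != "" { writeback(prev) }   # a new input file has started
    { prev = FILENAME; gsub("\t", "    "); buf[++n] = $0 }
    END { if (prev != "") writeback(prev) }
    ' "$@"
}
```

It would be called as `detab_all *.py`. Empty input files produce no records and are left untouched.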
