Awk script to separate spaces and tabs
Hi,
I have a bunch of files like the "sample.txt" shown below, each with
multiple lines. The columns on each line are separated by either a TAB
or a blank space. I am trying to write a shell script that gives me
col 3 and the corresponding col 4, so that I can generate a report for
each line in each of the text files with the size and number of files
in each folder. I also need to copy the data from the corresponding
folder on each line to a new folder. Say the first line is
test.com emp Announcements /ms/0B/03/81512
I need foldername=Announcements
corresponding dirpath=/ms/0B/03/81512
Report:
Folder: Announcements
# files= `ls /ms/0B/03/81512`
Size: `du -sk /ms/0B/03/81512`
Because of the tabs and white space between the columns, and the white
space inside col 3, it is difficult to separate the fields with cat and
awk. Any help is greatly appreciated.
for i in `cat /tmp/sample.txt`
do
echo $i
dirpath=`echo $i | awk -F"\t" '{print $4}'`
# echo $dirpath
# foldername=`echo $i | awk -F"\t" '{print $3}'`
#/bin/echo "size = \c"
# du -s $dirpath
#/bin/echo "largest = \c"
# ls -ltsr $dirpath | sort -n | tail -1
#/bin/echo "num msgs = \c"
# ls $dirpath | wc -l
done
sample.txt
test.com emp Announcements /ms/0B/03/81512
test.com emp BOD /ms/76/0F/81513
test.com emp CGC /ms2/76/0C/81517
test.com emp Drafts /ms/21/03/81526
test.com emp INBOX /ms2/6B/0E/81511
test.com emp "Junk E-mail" /ms/62/1C/81569
test.com emp Personal /ms/15/00/81578
test.com emp "some sap" /ms2/11/02/81579
test.com emp Sent /ms2/22/1A/81580
test.com emp Templates /ms2/07/17/81581
test.com emp Trash /ms2/49/16/81582
test.com emp "Work Request" /ms2/01/0F/81583
test.com emp admins /ms2/6C/0A/81584
test.com emp dumpster /ms2/77/10/81585
test.com emp ec /ms2/19/07/81586
test.com emp "emp personal" /ms2/50/
awk's field separator (FS) can't be a RE,
so you have to "split" your records.
Try:
awk '{split($0,a,"([ \t])+");print a[3];}'
awk '{split($0,a,"([ \t])+");print a[4];}'
man awk
hth,
Thomas
Thomas J. wrote:
> awk's fieldseparator (FS) cant be a RE,
> so you have to "split" your records.
>
> try
>
> awk '{split($0,a,"([ \t])+");print a[3];}')
> awk '{split($0,a,"([ \t])+");print a[4];}')
>
> man awk
>
> hth,
>
> Thomas
Or you can pass the field separator on the command line like this:
awk -F'\t' '{ gsub(/"/, "", $3); print $3 }'
awk -F'\t' '{ print $4 }'
Thomas J. wrote:
> awk's fieldseparator (FS) cant be a RE,
GNU awks can.
> so you have to "split" your records.
>
> try
>
> awk '{split($0,a,"([ \t])+");print a[3];}')
> awk '{split($0,a,"([ \t])+");print a[4];}')
The trailing semicolons aren't necessary, and chains of white space are
the default FS so, unless there are some white space characters you're
specifically trying to exclude, the above is equivalent to:
awk '{print $3}'
awk '{print $4}'
> man awk
Indeed ;-).
Ed.
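For reference, a minimal sketch of a regular expression used directly as the field separator, which POSIX awk accepts for any FS longer than one character. The helper name third_field is made up for illustration. Note that it still breaks quoted names such as "Junk E-mail" at the blank, so on its own it does not solve the original problem.

```shell
# Hypothetical helper: print the third field, treating any run of
# blanks and tabs as a single separator.
third_field() {
    awk -F'[ \t]+' '{ print $3 }'
}

# Mixed blanks and tabs between the columns.
printf 'test.com \t emp\tAnnouncements /ms/0B/03/81512\n' | third_field
```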
Vassilis wrote:
> Thomas J. wrote:
>
>>awk's fieldseparator (FS) cant be a RE,
>>so you have to "split" your records.
>>
>>try
>>
>>awk '{split($0,a,"([ \t])+");print a[3];}')
>>awk '{split($0,a,"([ \t])+");print a[4];}')
>>
>>man awk
>>
>>hth,
>>
>>Thomas
>
>
> Or you can pass the field separator in cmdline like this:
>
> awk -F'\t' '{ gsub(/"/, "", $3); print $3 }'
> awk -F'\t' '{ print $4 }'
>
In the part that was snipped the OP said:
> Becasue of tabs and white sapces betwen the colums and white spaces in
> col 3...
so you can't just use a single tab as the field separator.
Ed.
explor wrote:
> Hi,
> I have bunch of file as shown below "sample.txt" with multiple lines.
> Each line is separated either by TAB or BLANK space. I am trying to
> write shell script which gives me col 3 and corrosponding col 4. so
> that i can see generate a report for each line in each of the text
> file with size and # no files in each folder. Also need to copy the
> data from corrponding foler in each line to new folder say the first
> line is
>
> test.com emp Announcements /ms/0B/03/81512
>
> I need foldername= Announcements
> corrosponding dirpath=/ms/0B/03/81512
>
> Report:
> Folder: Announcements
> # files= `ls /ms/0B/03/81512`
> Size: `du -sk /ms/0B/03/81512`
>
> Becasue of tabs and white sapces betwen the colums and white spaces in
> col 3 its difficult to separate with cat and awk . Any help is greatly
> apprciated.
>
> for i in `cat /tmp/sample.txt`
> do
> echo $i
> dirpath=`echo $i | awk -F"\t" '{print $4}'`
> # echo $dirpath
> # foldername=`echo $i | awk -F"\t" '{print $3}'`
> #/bin/echo "size = \c"
> # du -s $dirpath
> #/bin/echo "largest = \c"
> # ls -ltsr $dirpath | sort -n | tail -1
> #/bin/echo "num msgs = \c"
> # ls $dirpath | wc -l
> done
>
>
> sample.txt
> test.com emp Announcements /ms/0B/03/81512
> test.com emp BOD /ms/76/0F/81513
> test.com emp CGC /ms2/76/0C/81517
> test.com emp Drafts /ms/21/03/81526
> test.com emp INBOX /ms2/6B/0E/81511
> test.com emp "Junk E-mail" /ms/62/1C/81569
> test.com emp Personal /ms/15/00/81578
> test.com emp "some sap" /ms2/11/02/81579
> test.com emp Sent /ms2/22/1A/81580
> test.com emp Templates /ms2/07/17/81581
> test.com emp Trash /ms2/49/16/81582
> test.com emp "Work Request" /ms2/01/0F/81583
> test.com emp admins /ms2/6C/0A/81584
> test.com emp dumpster /ms2/77/10/81585
> test.com emp ec /ms2/19/07/81586
> test.com emp "emp personal" /ms2/50/
>
This (untested) should give you the 2 fields you want, assuming that
your directory names never contain white space:
awk '{
    dirpath=$NF
    sub(/[^[:space:]]+[[:space:]]+[^[:space:]]+[[:space:]]+/,"")
    sub(/[[:space:]]+[^[:space:]]+$/,"")
    gsub(/"/,"")
    foldername=$0
    printf "dirpath=\"%s\" foldername=\"%s\"\n",dirpath,foldername
}' sample.txt
The first sub deletes the first 2 fields and subsequent white space. The
second sub deletes the last field and preceding white space. The end
result is just the third field.
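As a worked example, here is a small sketch of the same two-sub approach applied to one line from sample.txt. The wrapper name split_line is hypothetical, and [ \t] is used in place of [[:space:]] as an assumption, so that it also runs on older awks that lack POSIX character classes.

```shell
# Hypothetical wrapper around the two-sub approach described above.
split_line() {
    awk '{
        dirpath = $NF                            # save the last field first
        sub(/[^ \t]+[ \t]+[^ \t]+[ \t]+/, "")    # drop fields 1 and 2
        sub(/[ \t]+[^ \t]+$/, "")                # drop the last field
        gsub(/"/, "")                            # strip the double quotes
        printf "dirpath=\"%s\" foldername=\"%s\"\n", dirpath, $0
    }'
}

printf 'test.com emp "Junk E-mail" /ms/62/1C/81569\n' | split_line
# prints: dirpath="/ms/62/1C/81569" foldername="Junk E-mail"
```

The path has to be saved before the first sub, because modifying $0 makes awk re-split the record.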
You can use "eval" if you want to directly create shell variables of the
same names as used in the awk output (ask in comp.unix.shell about that
if you're unsure).
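A rough sketch of how eval could tie this into the report the OP described. The function name report and the report layout are assumptions for illustration, [ \t] stands in for [[:space:]], and eval is only reasonable here if the input file is trusted, since a folder name containing $ or backticks would be expanded by the shell.

```shell
# Hypothetical report loop: $1 is a file in the sample.txt layout.
report() {
    while IFS= read -r line; do
        [ -n "$line" ] || continue               # skip empty lines
        # awk prints dirpath="..." foldername="..."; eval turns that
        # output into two shell variables (trusted input only).
        eval "$(printf '%s\n' "$line" | awk '{
            dirpath = $NF
            sub(/[^ \t]+[ \t]+[^ \t]+[ \t]+/, "")
            sub(/[ \t]+[^ \t]+$/, "")
            gsub(/"/, "")
            printf "dirpath=\"%s\" foldername=\"%s\"\n", dirpath, $0
        }')"
        printf 'Folder: %s\n' "$foldername"
        # $(( ... )) strips the padding some wc implementations add.
        printf '# files = %s\n' "$(( $(ls "$dirpath" | wc -l) ))"
        printf 'Size: %s\n' "$(du -sk "$dirpath" | cut -f1)"
    done < "$1"
}

# Usage (path taken from the original post):
#   report /tmp/sample.txt
```

Reading the file with while/read keeps each line intact, whereas a for loop over the output of cat splits it into individual words.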
Regards,
Ed.
On 2 Mar., 14:11, Ed Morton <mor...@lsupcaemnt.com> wrote:
> Thomas J. wrote:
> > awk's fieldseparator (FS) cant be a RE,
>
> GNU awks can.
The OP uses an unspecified awk...
>
> > so you have to "split" your records.
>
> > try
>
> > awk '{split($0,a,"([ \t])+");print a[3];}')
> > awk '{split($0,a,"([ \t])+");print a[4];}')
>
> The trailing semicolons aren't necessary and chains of white space are
> the default FS so, unless there's some white space characters you're
> specicially trying to exclude, the above is equivalent to:
>
> awk '{print $3}'
> awk '{print $4}'
I don't agree.
My hint definitely does not do the same as
awk '{print $3}'
but you are right:
my hint doesn't meet the requirements of the OP.
>
> > man awk
>
> Indeed ;-).
>
or read Ed Morton's posts
Thank you,
Thomas
Thomas J. wrote:
> On 2 Mar., 14:11, Ed Morton <mor...@lsupcaemnt.com> wrote:
>
> or read Ed Mortons posts
That's the key exactly
In article <D9CdnaTu6tcav3XYnZ2dnUVZ_vyunZ2d@comcast.com>,
Ed Morton <morton@lsupcaemnt.com> wrote:
>Thomas J. wrote:
>
>> awk's fieldseparator (FS) cant be a RE,
>
>GNU awks can.
As does TAWK. Which covers 100% of my use of AWK (i.e., GAWK & TAWK).
And, as the commercial says, anything else ... would be ... uncivilized.
Thomas J. wrote:
> On 2 Mar., 14:11, Ed Morton <mor...@lsupcaemnt.com> wrote:
>
>>Thomas J. wrote:
<snip>
>>>awk '{split($0,a,"([ \t])+");print a[3];}')
<snip>
>>unless there's some white space characters you're
>>specicially trying to exclude, the above is equivalent to:
>>
>>awk '{print $3}'
<snip>
> I dont agree.
> My hint does defnitly not the same as
> awk '{print $3}'
Could you give an example of how it'd be different?
Ed.
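One case where the two do differ, sketched here with made-up helper names, is a line that starts with white space. Default field splitting ignores leading blanks, while split() with an explicit regular expression treats them as a separator, so a[1] comes out empty and every later element shifts by one.

```shell
# Default FS: leading blanks are ignored, so $3 is the third word.
by_default_fs() {
    awk '{ print $3 }'
}

# split() with an explicit regex: the leading blanks count as a
# separator, a[1] is empty, and a[3] is the second word.
by_split() {
    awk '{ split($0, a, "[ \t]+"); print a[3] }'
}

printf '  one two three\n' | by_default_fs   # prints: three
printf '  one two three\n' | by_split        # prints: two
```

On lines without leading white space, such as those in sample.txt, both print the same thing.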