How to extract element from multiple files from different parent folders in awk


#1
08-25-2008, 04:53 AM
huajie.lee@gmail.com

Hi,

I have 10 sets of files located in 10 different parent directories,
and each of these files has 50 lines with different values.

I would like to process the 50 lines of these 10 files one line at a
time and average the 3rd field ($3) of each line across the 10 files.
The result will be written to an output file.

Instead of using join to generate a whole bunch of redundant files and
then computing the average, I'm looking for a better way to do this
directly.

E.g.

bin1/apple.txt
tool1 2.00 4 30.20
tool2 3.00 5 40.22
tool3 2.00 6 45.32
.....
tool50 ...........


bin2/apple.txt
tool1 2.00 1 30.20
tool2 3.00 5 40.22
tool3 2.00 6 45.32
.....
tool50 ...........


bin3/apple.txt
tool1 2.00 2 30.20
tool2 3.00 5 40.22
tool3 2.00 6 45.32
.....
tool50 ...........


........


bin10/apple.txt
tool1 2.00 4 30.20
tool2 3.00 5 40.22
tool3 2.00 5 45.32
.....
tool50 ...........


Output desired:-
tool1 (4+1+2+....+4)/10
tool2 (5+5+5+....+5)/10
tool3 (6+6+6+....+5)/10

This is only for one file, "apple.txt". Assume I have the remaining 9
sets of files, "foo.txt", "bar.txt", etc., in the same parent
folders.

I have a rough idea of using "getline" in awk, but handling different
files in different parent directories is more of a bash job.
I end up drafting a mixture of bash and awk, which is terribly
confusing.

Pseudocode:-

filename={apple.txt,orange.txt,bar.txt,...}
directory={bin1,bin2,bin3....bin10}

for file in filename;do

for dir in directory;do

while(getline<dir"/" "file"); do

val+=$3;
count++;

done
done

tada=val/count;

print $1"," tada>$fileoutput.txt

done

Please advise. Appreciate it.
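
The getline idea in the pseudocode above can be kept inside awk alone. The following is only a minimal sketch, not a tested solution for the real data: the temporary directory, the two sample directories, the single file name and the variable names (root, dirs, files, fld) are all assumptions made for illustration. Unlike the pseudocode, it keeps one sum per line number, so each tool gets its own average.

```shell
# Hypothetical sample tree: two directories instead of ten, one file name.
work_gl=$(mktemp -d)
mkdir "$work_gl/bin1" "$work_gl/bin2"
printf 'tool1 2.00 4 30.20\ntool2 3.00 5 40.22\n' > "$work_gl/bin1/apple.txt"
printf 'tool1 2.00 1 30.20\ntool2 3.00 5 40.22\n' > "$work_gl/bin2/apple.txt"

getline_avg=$(awk -v root="$work_gl" 'BEGIN {
    # Directory and file lists are hard-coded here purely for the example.
    ndirs  = split("bin1 bin2", dirs, " ")
    nfiles = split("apple.txt", files, " ")
    for (f = 1; f <= nfiles; f++) {
        nrows = 0
        for (d = 1; d <= ndirs; d++) {
            path = root "/" dirs[d] "/" files[f]
            row = 0
            # Read the file line by line; getline returns 0 at EOF, -1 on error.
            while ((getline line < path) > 0) {
                row++
                split(line, fld, " ")
                name[f, row] = fld[1]      # row label, e.g. tool1
                sum[f, row] += fld[3]      # running sum of the 3rd field
            }
            close(path)
            if (row > nrows) nrows = row
        }
        # One average per line number, divided by the number of directories.
        for (row = 1; row <= nrows; row++)
            print name[f, row] "," sum[f, row] / ndirs
    }
}')
printf '%s\n' "$getline_avg"
```

With the sample values, tool1 averages (4+1)/2 and tool2 averages (5+5)/2. Dividing by the number of directories assumes every directory holds a copy of every file.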
#2
08-25-2008, 08:52 AM
Ed Morton

Untested:

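awk '
# First line of each file: strip the directory part, count files per base name.
FNR==1 {fname=FILENAME; sub(/.*\//,"",fname); fcnt[fname]++}
# Remember each row label and sum the 3rd field per (base name, line number).
{rname[fname,FNR]=$1; sum[fname,FNR]+=$3; rcnt[fname]=FNR}
END{
for (fname in fcnt)
for (i=1;i<=rcnt[fname];i++)
print rname[fname,i],sum[fname,i]/fcnt[fname] > ("/outputdir/" fname)
}' bin1/* bin2/* ... bin10/*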

Regards,

Ed.
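
To see how this approach behaves, here is a small self-contained check. It is a sketch under assumptions, not part of the original reply: it uses three sample directories built from the values in the question, a temporary working directory, and a hypothetical output directory passed in as outdir. The awk logic is the same idea as above, with the divisor taken from fcnt and each output file closed once it is written.

```shell
# Hypothetical sample tree in a temp dir: three directories instead of ten.
work=$(mktemp -d)
mkdir "$work/bin1" "$work/bin2" "$work/bin3" "$work/out"
printf 'tool1 2.00 4 30.20\ntool2 3.00 5 40.22\ntool3 2.00 6 45.32\n' > "$work/bin1/apple.txt"
printf 'tool1 2.00 1 30.20\ntool2 3.00 5 40.22\ntool3 2.00 6 45.32\n' > "$work/bin2/apple.txt"
printf 'tool1 2.00 2 30.20\ntool2 3.00 5 40.22\ntool3 2.00 6 45.32\n' > "$work/bin3/apple.txt"

awk -v outdir="$work/out" '
# New file: reduce FILENAME to its base name and count files per base name.
FNR == 1 { fname = FILENAME; sub(/.*\//, "", fname); fcnt[fname]++ }
# Every line: keep the row label and add the 3rd field to that row sum.
{ rname[fname, FNR] = $1; sum[fname, FNR] += $3; rcnt[fname] = FNR }
END {
    for (fname in fcnt) {
        outfile = outdir "/" fname
        for (i = 1; i <= rcnt[fname]; i++)
            print rname[fname, i], sum[fname, i] / fcnt[fname] > outfile
        close(outfile)   # avoid running out of open files with many names
    }
}' "$work"/bin*/*.txt

result=$(cat "$work/out/apple.txt")
printf '%s\n' "$result"
```

For the sample data, tool1 comes out as (4+1+2)/3, and tool2 and tool3 as 5 and 6. The sketch assumes every copy of a file lists the same tools in the same order, since rows are matched by line number rather than by name.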

#3
08-29-2008, 06:11 PM
Claudio

On 2008-08-25, huajie.lee@gmail.com <huajie.lee@gmail.com> wrote:
> Hi,


hello!

>
> Pseudocode:-
>
> filename={apple.txt,orange.txt,bar.txt,...}
> directory={bin1,bin2,bin3....bin10}
>
> for file in filename;do
>
> for dir in directory;do
>

cat "$dir/$file"
done
done | awk '{s+=$3; nlines++} END {print s/nlines}'



For clearer code:

(for ...
for ...
cat $dir/$file
done
done
) | awk ...


The script inside () just concatenates all the files.


If you prefer:

cat {dir1,dir2,dir3}/{file1,file2,file3} | awk ...

Or just:

awk '...' {dir1,dir2,dir3}/{file1,file2,file3}

... but I don't know whether this runs in your Unix flavour (brace expansion is a feature of shells such as bash, ksh and zsh, not of plain POSIX sh).

Best regards,
Claudio.
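
Note that concatenating everything and dividing one sum by one line count gives a single overall average, not one value per tool. A minimal sketch of a per-tool variant of the same pipeline follows. It assumes the first field (tool1, tool2, ...) identifies the row and appears once per file; the temporary directory and sample values are made up for illustration. A plain glob is used instead of brace expansion so that it also works in a POSIX sh.

```shell
# Hypothetical sample tree: two directories, one file name.
work_pt=$(mktemp -d)
mkdir "$work_pt/bin1" "$work_pt/bin2"
printf 'tool1 2.00 4 30.20\ntool2 3.00 5 40.22\n' > "$work_pt/bin1/apple.txt"
printf 'tool1 2.00 2 30.20\ntool2 3.00 7 40.22\n' > "$work_pt/bin2/apple.txt"

# Sum and count the 3rd field per tool name, then print one average per tool.
# "for (t in sum)" visits keys in no guaranteed order, hence the final sort.
per_tool=$(cat "$work_pt"/bin*/apple.txt |
    awk '{ sum[$1] += $3; n[$1]++ } END { for (t in sum) print t, sum[t] / n[t] }' |
    sort)
printf '%s\n' "$per_tool"
```

With the sample values this yields (4+2)/2 for tool1 and (5+7)/2 for tool2. To handle several file names, the pipeline would be run once per name, for example inside a for loop over the names.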
