#1
Hi,

I have 10 sets of files located in 10 different parent directories, and each of these 10 files has 50 lines with different values.

I would like to process the 50 lines of these 10 files one line at a time and average the 3rd field ($3) across the 10 files. The result will be written to an output file.

Instead of using join to generate a whole bunch of redundant files and then computing the average, I'm looking for a better way to do this directly.

E.g.

bin1/apple.txt
tool1 2.00 4 30.20
tool2 3.00 5 40.22
tool3 2.00 6 45.32
.....
tool50 ...........

bin2/apple.txt
tool1 2.00 1 30.20
tool2 3.00 5 40.22
tool3 2.00 6 45.32
.....
tool50 ...........

bin3/apple.txt
tool1 2.00 2 30.20
tool2 3.00 5 40.22
tool3 2.00 6 45.32
.....
tool50 ...........

........

bin10/apple.txt
tool1 2.00 4 30.20
tool2 3.00 5 40.22
tool3 2.00 5 45.32
.....
tool50 ...........

Output desired:-
tool1 (4+1+2+....+4)/10
tool2 (5+5+5+....+5)/10
tool3 (6+6+6+....+5)/10

This is only for one file, "apple.txt". Assume I have 9 other sets of files, such as "foo.txt", "bar.txt", etc., in similar parent folders.

I have a rough idea of using "getline" in awk, but handling different files in different parent directories is more of a bash job. I end up drafting a mixture of bash and awk, which is terribly confusing.

Pseudocode:-

filename={apple.txt,orange.txt,bar.txt............ ....}
directory={bin1,bin2,bin3....bin10}

for file in filename;do

  for dir in directory;do

    while(getline<dir"/" "file"); do

      val+=$3;
      count++;

    done
  done

  tada=val/count;

  print $1"," tada>$fileoutput.txt

done

Please advise. Appreciate it.
|
#2
On 8/25/2008 3:53 AM, huajie.lee@gmail.com wrote:
> [...]
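Untested:

awk '
# First line of each input file: strip the directory part and count files per base name.
FNR==1 {fname=FILENAME; sub(/.*\//,"",fname); fcnt[fname]++}
# Remember the row label, accumulate $3 per (file name, line number), track the row count.
{rname[fname,FNR]=$1; sum[fname,FNR]+=$3; rcnt[fname]=FNR}
END{
   for (fname in fcnt)
      for (i=1;i<=rcnt[fname];i++)
         print rname[fname,i],sum[fname,i]/fcnt[fname] > ("/outputdir/" fname)
}' /bin1/* /bin2/* ... /bin10/*

Regards,
Ed.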
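For anyone who wants to try the idea on a small case first, here is a minimal sketch rather than a definitive version. Everything about the setup is an assumption made for the demo: a temporary directory from mktemp, three directories instead of ten, two lines per file, an `outdir` variable passed in with `-v`, and `%.2f` output formatting. Like the script above, it pairs rows by line number, so it assumes every copy of a file lists the tools in the same order.

```shell
#!/bin/sh
# Hypothetical demo data: three directories, one file name, two lines each.
work=$(mktemp -d)
mkdir -p "$work/bin1" "$work/bin2" "$work/bin3" "$work/out"
printf 'tool1 2.00 4 30.20\ntool2 3.00 5 40.22\n' > "$work/bin1/apple.txt"
printf 'tool1 2.00 1 30.20\ntool2 3.00 5 40.22\n' > "$work/bin2/apple.txt"
printf 'tool1 2.00 2 30.20\ntool2 3.00 5 40.22\n' > "$work/bin3/apple.txt"

awk -v outdir="$work/out" '
    # New input file: keep only the base name and count how many copies we saw.
    FNR == 1 { fname = FILENAME; sub(/.*\//, "", fname); fcnt[fname]++ }
    {
        rname[fname, FNR] = $1      # row label, e.g. tool1
        sum[fname, FNR] += $3       # running total of field 3 for this row
        rcnt[fname] = FNR           # number of rows seen in this file
    }
    END {
        for (f in fcnt) {
            out = outdir "/" f
            for (i = 1; i <= rcnt[f]; i++)
                printf "%s %.2f\n", rname[f, i], sum[f, i] / fcnt[f] > out
            close(out)              # flush and release the file before moving on
        }
    }
' "$work"/bin*/apple.txt

cat "$work/out/apple.txt"
# prints:
# tool1 2.33
# tool2 5.00
```

Building the output name in a variable and calling close() keeps the redirection unambiguous and avoids holding many files open when there are a lot of distinct file names.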
|
#3
On 2008-08-25, huajie.lee@gmail.com <huajie.lee@gmail.com> wrote:
> Hi,

hello!

> Pseudocode:-
>
> filename={apple.txt,orange.txt,bar.txt............ ....}
> directory={bin1,bin2,bin3....bin10}
>
> for file in filename;do
>
> for dir in directory;do
>

cat $dir/$file
done
done | awk '{s+=$3; nlines++} END {print s/nlines}'

For clearer code:

(for ...
for ...
cat $dir/$file
done
done ) | awk ...

The script inside () just concatenates all the files.

If you prefer:

cat {dir1,dir2,dir3}/{file1,file2,file3} | awk ...

Or just:

awk '...' {dir1,dir2,dir3}/{file1,file2,file3}

.... but I don't know if this runs in your unix flavour.

Best regards,
Claudio.
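The concatenate-then-awk approach above accumulates a single total over every line it reads, so it yields one overall figure rather than one average per tool. A possible variation, offered only as a sketch, keeps the shell loop over file names but lets awk key its sums on the first field. The temporary directory, the two demo directories, the sample values, and the `.avg` output suffix are all assumptions made for illustration; the comma separator follows the original pseudocode.

```shell
#!/bin/sh
# Hypothetical demo data: two directories, two file names.
work=$(mktemp -d)
for d in bin1 bin2; do mkdir -p "$work/$d"; done
printf 'tool1 2.00 4 30.20\ntool2 3.00 5 40.22\n' > "$work/bin1/apple.txt"
printf 'tool1 2.00 1 30.20\ntool2 3.00 6 40.22\n' > "$work/bin2/apple.txt"
printf 'tool1 1.00 10 9.00\n' > "$work/bin1/foo.txt"
printf 'tool1 1.00 20 9.00\n' > "$work/bin2/foo.txt"

# One awk run per file name; the glob supplies that file from every directory.
for file in apple.txt foo.txt; do
    awk '
        !($1 in n) { order[++k] = $1 }   # remember tools in first-seen order
        { s[$1] += $3; n[$1]++ }         # per-tool sum and count of field 3
        END {
            for (i = 1; i <= k; i++)
                printf "%s,%.2f\n", order[i], s[order[i]] / n[order[i]]
        }
    ' "$work"/bin*/"$file" > "$work/$file.avg"
done

cat "$work/apple.txt.avg" "$work/foo.txt.avg"
# prints:
# tool1,2.50
# tool2,5.50
# tool1,15.00
```

Keying on the tool name instead of the line number means the files do not have to list the tools in the same order, and a plain glob such as bin*/ works in any POSIX shell, whereas brace expansion is available only in shells such as bash, ksh, and zsh.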