Awk script to separate spaces and tabs - awk


  1. Awk script to separate spaces and tabs

    Hi,
    I have a bunch of files like "sample.txt" shown below, each with
    multiple lines. The columns on each line are separated by either a TAB
    or a BLANK space. I am trying to write a shell script which gives me
    col 3 and the corresponding col 4, so that I can generate a report for
    each line in each of the text files with the size and number of files
    in each folder. I also need to copy the data from the corresponding
    folder in each line to a new folder. Say the first line is

    test.com emp Announcements /ms/0B/03/81512

    I need foldername=Announcements
    corresponding dirpath=/ms/0B/03/81512

    Report:
    Folder: Announcements
    # files= `ls /ms/0B/03/81512`
    Size: `du -sk /ms/0B/03/81512`

    Because of tabs and white spaces between the columns and white spaces in
    col 3 it's difficult to separate with cat and awk. Any help is greatly
    appreciated.

    for i in `cat /tmp/sample.txt`
    do
    echo $i
    dirpath=`echo $i | awk -F"\t" '{print $4}'`
    # echo $dirpath
    # foldername=`echo $i | awk -F"\t" '{print $3}'`
    #/bin/echo "size = \c"
    # du -s $dirpath
    #/bin/echo "largest = \c"
    # ls -ltsr $dirpath | sort -n | tail -1
    #/bin/echo "num msgs = \c"
    # ls $dirpath | wc -l
    done


    sample.txt
    test.com emp Announcements /ms/0B/03/81512
    test.com emp BOD /ms/76/0F/81513
    test.com emp CGC /ms2/76/0C/81517
    test.com emp Drafts /ms/21/03/81526
    test.com emp INBOX /ms2/6B/0E/81511
    test.com emp "Junk E-mail" /ms/62/1C/81569
    test.com emp Personal /ms/15/00/81578
    test.com emp "some sap" /ms2/11/02/81579
    test.com emp Sent /ms2/22/1A/81580
    test.com emp Templates /ms2/07/17/81581
    test.com emp Trash /ms2/49/16/81582
    test.com emp "Work Request" /ms2/01/0F/81583
    test.com emp admins /ms2/6C/0A/81584
    test.com emp dumpster /ms2/77/10/81585
    test.com emp ec /ms2/19/07/81586
    test.com emp "emp personal" /ms2/50/
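
A side note on the loop in this post: `for i in ...` over the output of cat iterates over whitespace-separated words, not lines, so each $i holds a single word and awk never sees a whole record. The following is a minimal sketch of the difference, assuming a POSIX shell; the temporary file and its two lines are only an illustrative stand-in for /tmp/sample.txt.

```shell
# Hypothetical stand-in for /tmp/sample.txt (two lines taken from the post).
sample=$(mktemp)
printf '%s\n' 'test.com emp Announcements /ms/0B/03/81512' \
              'test.com emp "Junk E-mail" /ms/62/1C/81569' > "$sample"

# for-in over the output of cat splits on every blank: this counts words.
words=0
for i in $(cat "$sample"); do
    words=$((words + 1))
done

# while-read hands over one complete line per iteration.
lines=0
while IFS= read -r line; do
    lines=$((lines + 1))
done < "$sample"

echo "for-loop iterations: $words, while-read iterations: $lines"
rm -f "$sample"
```

With the two sample lines this reports 9 iterations for the for loop and 2 for the while loop, which is why the awk call inside the original loop never finds a fourth column.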


  2. Re: Awk script to separate spaces and tabs

    awk's field separator (FS) can't be a RE,
    so you have to "split" your records.

    try

    awk '{split($0,a,"([ \t])+");print a[3];}'
    awk '{split($0,a,"([ \t])+");print a[4];}'

    man awk

    hth,

    Thomas



  3. Re: Awk script to separate spaces and tabs


    Thomas J. wrote:
    > awk's field separator (FS) can't be a RE,
    > so you have to "split" your records.
    >
    > try
    >
    > awk '{split($0,a,"([ \t])+");print a[3];}'
    > awk '{split($0,a,"([ \t])+");print a[4];}'
    >
    > man awk
    >
    > hth,
    >
    > Thomas


    Or you can pass the field separator on the command line like this:

    awk -F'\t' '{ gsub(/"/, "", $3); print $3 }'
    awk -F'\t' '{ print $4 }'


  4. Re: Awk script to separate spaces and tabs

    Thomas J. wrote:

    > awk's field separator (FS) can't be a RE,


    GNU awks can.

    > so you have to "split" your records.
    >
    > try
    >
    > awk '{split($0,a,"([ \t])+");print a[3];}'
    > awk '{split($0,a,"([ \t])+");print a[4];}'


    The trailing semicolons aren't necessary and chains of white space are
    the default FS so, unless there are some white space characters you're
    specifically trying to exclude, the above is equivalent to:

    awk '{print $3}'
    awk '{print $4}'
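
To illustrate the point about the default FS, here is a small sketch. The sample lines come from the original post; the tab and extra blanks in the first one are added by hand, as an assumption about what the real file looks like. It also tries a regular-expression FS (POSIX specifies that an FS longer than one character is treated as an extended regular expression), and shows where $3 and $4 stop being enough: a quoted folder name containing a blank is split into two fields.

```shell
# Mixed tab and multiple blanks between columns: the default FS copes.
plain=$(printf 'test.com\temp   Announcements \t /ms/0B/03/81512\n' |
        awk '{ print $3 "|" $4 }')

# The same line with FS given as a regular expression on the command line.
regex=$(printf 'test.com\temp   Announcements \t /ms/0B/03/81512\n' |
        awk -F'[ \t]+' '{ print $3 "|" $4 }')

# A folder name with an embedded blank becomes two fields (NF is 5, not 4).
quoted=$(printf 'test.com emp "Junk E-mail" /ms/62/1C/81569\n' |
         awk '{ print $3 "|" $4 "|" NF }')

echo "$plain"
echo "$regex"
echo "$quoted"
```

The first two commands both print Announcements|/ms/0B/03/81512, while the third prints "Junk|E-mail"|5, so the quoted names need different handling.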

    > man awk


    Indeed ;-).

    Ed.

  5. Re: Awk script to separate spaces and tabs

    Vassilis wrote:

    > Thomas J. wrote:
    >
    >>awk's field separator (FS) can't be a RE,
    >>so you have to "split" your records.
    >>
    >>try
    >>
    >>awk '{split($0,a,"([ \t])+");print a[3];}'
    >>awk '{split($0,a,"([ \t])+");print a[4];}'
    >>
    >>man awk
    >>
    >>hth,
    >>
    >>Thomas

    >
    >
    > Or you can pass the field separator on the command line like this:
    >
    > awk -F'\t' '{ gsub(/"/, "", $3); print $3 }'
    > awk -F'\t' '{ print $4 }'
    >


    In the part that was snipped the OP said:

    > Because of tabs and white spaces between the columns and white spaces in
    > col 3...


    so you can't just use a single tab as the field separator.
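
A short sketch of what goes wrong with a single tab as FS, using lines from the original post. Which separators the real file uses where is an assumption here; the second line is built by hand with tabs between the first columns and a blank before the path.

```shell
# Columns separated by blanks only: with a tab FS the whole line is one field.
blank_line='test.com emp Announcements /ms/0B/03/81512'
nf_blank=$(printf '%s\n' "$blank_line" | awk -F'\t' '{ print NF }')
f4_blank=$(printf '%s\n' "$blank_line" | awk -F'\t' '{ print $4 }')

# Mixed separators: the blank before the path is not a tab, so the third
# field swallows the path as well.
f3_mixed=$(printf 'test.com\temp\t"Junk E-mail" /ms/62/1C/81569\n' |
           awk -F'\t' '{ print $3 }')

echo "NF=$nf_blank, fourth field='$f4_blank'"
echo "third field=$f3_mixed"
```

In the first case NF is 1 and the fourth field is empty; in the second the third field comes back as "Junk E-mail" /ms/62/1C/81569, folder name and path together.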

    Ed.

  6. Re: Awk script to separate spaces and tabs

    explor wrote:

    <snip>


    This (untested) should give you the 2 fields you want, assuming that
    your directory names never contain white space:

    awk '{
        dirpath=$NF    # last field, saved before $0 is modified
        # delete the first 2 fields and the white space after them
        sub(/[^[:space:]]+[[:space:]]+[^[:space:]]+[[:space:]]+/,"")
        # delete the last field and the white space before it
        sub(/[[:space:]]+[^[:space:]]+$/,"")
        gsub(/"/,"")   # strip the quotes around the folder name
        foldername=$0
        printf "dirpath=\"%s\" foldername=\"%s\"\n",dirpath,foldername
    }' sample.txt

    The first sub deletes the first 2 fields and subsequent white space. The
    second sub deletes the last field and preceding white space. The end
    result is just the third field.

    You can use "eval" if you want to directly create shell variables of the
    same names as used in the awk output (ask in comp.unix.shell about that
    if you're unsure).
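
As an alternative to eval, the two values can be handed to the shell through a pipe. The sketch below is one possible way to do that, not the approach of this post: it has awk print the path and the folder name separated by a tab and reads them back with `read`. It keeps the assumption that directory paths contain no white space, uses [ \t] in place of [[:space:]] (a choice made for this sketch), and the temporary file is a hypothetical stand-in for sample.txt.

```shell
# Hypothetical stand-in for sample.txt (two lines from the original post).
sample=$(mktemp)
printf '%s\n' 'test.com emp Announcements /ms/0B/03/81512' \
              'test.com emp "Work Request" /ms2/01/0F/81583' > "$sample"

tab=$(printf '\t')
report=$(
    awk '{
        dirpath = $NF                            # save the last field first
        sub(/[^ \t]+[ \t]+[^ \t]+[ \t]+/, "")    # drop the first two fields
        sub(/[ \t]+[^ \t]+$/, "")                # drop the last field
        gsub(/"/, "")                            # strip the quotes
        print dirpath "\t" $0                    # path, tab, folder name
    }' "$sample" |
    while IFS="$tab" read -r dirpath foldername; do
        # A real report could run du -sk "$dirpath" and ls "$dirpath" here.
        printf 'Folder: %s (%s)\n' "$foldername" "$dirpath"
    done
)
echo "$report"
rm -f "$sample"
```

Because IFS is set to a tab only for the `read`, the blank inside "Work Request" stays part of the folder name.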

    Regards,

    Ed.

  7. Re: Awk script to separate spaces and tabs

    On 2 Mar., 14:11, Ed Morton <mor...@lsupcaemnt.com> wrote:
    > Thomas J. wrote:
    > > awk's field separator (FS) can't be a RE,

    >
    > GNU awks can.


    the OP uses an unspecified awk...

    >
    > > so you have to "split" your records.

    >
    > > try

    >
    > > awk '{split($0,a,"([ \t])+");print a[3];}'
    > > awk '{split($0,a,"([ \t])+");print a[4];}'

    >
    > The trailing semicolons aren't necessary and chains of white space are
    > the default FS so, unless there are some white space characters you're
    > specifically trying to exclude, the above is equivalent to:
    >
    > awk '{print $3}'
    > awk '{print $4}'


    I don't agree.
    My hint definitely does not do the same as
    awk '{print $3}'

    but you are right:
    my hint doesn't meet the requirements of the OP.

    >
    > > man awk

    >
    > Indeed ;-).
    >


    or read Ed Morton's posts

    Thank you,

    Thomas



  8. Re: Awk script to separate spaces and tabs


    Thomas J. wrote:
    > On 2 Mar., 14:11, Ed Morton <mor...@lsupcaemnt.com> wrote:
    >
    > or read Ed Morton's posts


    That's the key exactly


  9. Re: Awk script to separate spaces and tabs

    In article <D9CdnaTu6tcav3XYnZ2dnUVZ_vyunZ2d@comcast.com>,
    Ed Morton <morton@lsupcaemnt.com> wrote:
    >Thomas J. wrote:
    >
    >> awk's field separator (FS) can't be a RE,

    >
    >GNU awks can.


    As does TAWK, which covers 100% of my use of AWK (i.e., GAWK & TAWK).

    And, as the commercial says, anything else ... would be ... uncivilized.


  10. Re: Awk script to separate spaces and tabs

    Thomas J. wrote:
    > On 2 Mar., 14:11, Ed Morton <mor...@lsupcaemnt.com> wrote:
    >
    >>Thomas J. wrote:

    <snip>
    >>>awk '{split($0,a,"([ \t])+");print a[3];}'

    <snip>
    >>unless there are some white space characters you're
    >>specifically trying to exclude, the above is equivalent to:
    >>
    >>awk '{print $3}'

    <snip>
    > I don't agree.
    > My hint definitely does not do the same as
    > awk '{print $3}'


    Could you give an example of how it'd be different?

    Ed.
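
One case where the two commands can differ, offered as an illustration and not necessarily the one Thomas had in mind: a line with leading white space. Default field splitting ignores leading blanks, whereas split() with an explicit regular expression produces an empty first element, shifting every index by one. The leading blanks on the sample line are an assumption added for this sketch.

```shell
# Sample line from the thread, with two leading blanks added by hand.
line='  test.com emp Announcements /ms/0B/03/81512'

# split() with an explicit regexp: a[1] is empty, so a[3] is the 2nd word.
by_split=$(printf '%s\n' "$line" |
           awk '{ split($0, a, "([ \t])+"); print a[3] }')

# Default field splitting skips the leading blanks: $3 is the 3rd word.
by_field=$(printf '%s\n' "$line" | awk '{ print $3 }')

echo "split: $by_split, default FS: $by_field"
```

Here the split() version prints emp, while the plain $3 version prints Announcements.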
