Thread: Parsing Values

  1. #1
    Join Date
    Feb 2004
    Posts
    37

    Unanswered: Parsing Values

    From the following file format, I need to count the occurrences of “BHAH”, “CHAH” and “AHAH” (in both upper and lower case), grouped by ABC, 123 and BBB.

    bhah ABC set

    BHAH ABC SET

    ahah from ABC where

    AHAH FROM ABC WHERE

    CHAH INTO ABC VALUES

    chah into ABC values


    CHAH INTO 123 VALUES

    chah into 123 values

    AHAH FROM 123 WHERE

    ahah from 123 where

    BHAH 123 SET

    bhah 123 set



    bhah BBB set

    BHAH BBB SET

    CHAH INTO BBB VALUES

    chah into BBB values

    ahah from BBB

    AHAH FROM BBB


    I am looking for the following output

    BHAH ABC 2
    AHAH ABC 2
    CHAH ABC 2
    BHAH 123 2
    AHAH 123 2
    CHAH 123 2
    BHAH BBB 2
    AHAH BBB 2
    CHAH BBB 2

    Any help is appreciated

  2. #2
    Join Date
    Jan 2004
    Location
    Bordeaux, France
    Posts
    320
    The following script does the job.
    It selects the lines whose second word is FROM or INTO, and the lines whose third word is SET.

    Code:
    awk '
    { $0 = toupper($0) }
    $2 == "FROM" || $2 == "INTO" { count[$1,$3]++ }
    $3 == "SET" { count[$1,$2]++ }
    END {
       for (indx in count) {
          split(indx, i, SUBSEP)
          print i[1],i[2],count[indx]
       }
    }
    ' input_file
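One thing to keep in mind with the script above: awk does not specify the order in which `for (indx in count)` visits the array, so the result lines can come out in any order. Below is a minimal sketch of one way to get a stable order by piping the output through `sort`. The sample input is a hypothetical subset of the data from the first post, and the sort keys (target name first, then keyword) are only an assumption about the order wanted.

```shell
# Hypothetical sample: a few lines taken from the original question.
result=$(printf '%s\n' \
    'bhah ABC set' \
    'BHAH ABC SET' \
    'ahah from ABC where' \
    'CHAH INTO 123 VALUES' \
    'chah into 123 values' |
awk '
{ $0 = toupper($0) }                               # make matching case-insensitive
$2 == "FROM" || $2 == "INTO" { count[$1,$3]++ }    # keyword, then FROM/INTO, then target
$3 == "SET" { count[$1,$2]++ }                     # keyword, target, then SET
END {
   for (indx in count) {                           # visiting order is unspecified
      split(indx, i, SUBSEP)
      print i[1], i[2], count[indx]
   }
}' | LC_ALL=C sort -k2,2 -k1,1)                     # sort by target, then by keyword
printf '%s\n' "$result"
```

With this sample the sorted output is `CHAH 123 2`, `AHAH ABC 1`, `BHAH ABC 2`.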
    Jean-Pierre.

  3. #3
    Join Date
    Feb 2004
    Posts
    37

    Re: Parsing Values

    Thank you

    Please help me understand how that just happened..

    :-)

  4. #4
    Join Date
    Jan 2004
    Location
    Bordeaux, France
    Posts
    320
    { $0 = toupper($0) }
    Converts the record to uppercase, so that the comparisons below match both cases.

    $2 == "FROM" || $2 == "INTO"
    Selects the record if field2 is equal to FROM or INTO.
    In that case the two words to count are field1 and field3.

    { count[$1,$3]++ }
    The 'count' array holds the number of occurrences of each pair of words.
    The entry for the two words is incremented by 1.

    $3 == "SET" { count[$1,$2]++ }
    Selects the record if field3 is equal to SET.
    In that case the two words to count are field1 and field2.

    { count[$1,$2]++ }
    The entry for the two words is incremented by 1.

    END { . . . }
    The code is executed once all records of the input file have been read.

    for (indx in count) { . . . }
    The code is executed for every index of the 'count' array.
    The index is formed from the multiple subscripts, joined by the character SUBSEP.
    For example, the element count["aaa","bbb"] is equivalent to count["aaa" SUBSEP "bbb"].
    The loop variable 'indx' is assigned this second, joined form of the index.

    split(indx, i, SUBSEP)
    The index is split into the 'i' array, using the character SUBSEP as the delimiter within 'indx'.
    indx="aaaSUBSEPbbb" => i[1]="aaa" i[2]="bbb"

    print i[1],i[2],count[indx]
    Prints the two words and their number of occurrences.
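The handling of the two-part index can be tried on its own. The sketch below is only an illustration, not part of the posted script: it stores one element under a two-part subscript in a `BEGIN` block (so no input file is needed), then recovers both parts with `split`. The values "aaa", "bbb" and 5 are made up for the demonstration.

```shell
# Sketch: show how a two-part subscript is stored and then recovered.
result=$(awk 'BEGIN {
    count["aaa", "bbb"] = 5            # stored under the key "aaa" SUBSEP "bbb"
    for (indx in count) {
        n = split(indx, i, SUBSEP)     # n is the number of parts found (2 here)
        print n, i[1], i[2], count[indx]
    }
    # The explicitly joined key addresses the same element.
    if (("aaa" SUBSEP "bbb") in count) print "same element"
}')
printf '%s\n' "$result"
```

This prints `2 aaa bbb 5` followed by `same element`.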


    Hope this helps.
    Jean-Pierre.

  5. #5
    Join Date
    Feb 2004
    Posts
    37

    Re: Parsing Values

    When I run the following code I am getting the errors listed below. Please help me if I am missing anything

    Thanks

    #!/bin/ksh
    awk '
    { $0 = toupper($0) }
    $2 == "FROM" || $2 == "INTO" { count[$1,$3]++ }
    $3 == "SET" { count[$1,$2]++ }
    END
    {
    for (indx in count)
    {
    split(indx, i, SUBSEP)
    print i[1],i[2],count[indx]
    }
    } ' /home/dba/test.dat.load


    awk: syntax error near line 3
    awk: illegal statement near line 3
    awk: syntax error near line 4
    awk: illegal statement near line 4
    awk: syntax error near line 6
    awk: bailing out near line 6

  6. #6
    Join Date
    Jan 2004
    Location
    Bordeaux, France
    Posts
    320
    Try replacing 'awk' with 'nawk'.
    Jean-Pierre.

  7. #7
    Join Date
    Feb 2004
    Posts
    37

    Re: Parsing Values

    It works. Thanks for your help. Also, how would I get the values if the "SET" keyword is on the next line?

    Thanks

    EX:

    bhah ABC
    set

    BHAH ABC
    SET

    BHAH 123
    SET

    bhah 123
    set



    bhah BBB
    set

    BHAH BBB
    SET

  8. #8
    Join Date
    Feb 2004
    Posts
    37

    Re: Parsing Values

    Also, I wanted to know whether I could put spaces before and after the keywords in the tests:

    " INTO "
    " FROM "
    " SET "

    Thanks for your help!!!

  9. #9
    Join Date
    Jan 2004
    Location
    Bordeaux, France
    Posts
    320
    Try this new version of the script:

    Code:
    awk '
    { $0 = toupper($0) }
    $1 == "SET" { count[w1,w2]++ ; next }
    $2 == "FROM" || $2 == "INTO" { count[$1,$3]++ ; next }
    $3 == "SET" { count[$1,$2]++ ;next }
    { w1 = $1; w2 = $2 }
    END {
       for (indx in count) {
          split(indx, i, SUBSEP)
          print i[1],i[2],count[indx]
       }
    }
    ' input_file
    $1 == "SET" { count[w1,w2]++ ; next }
    Selects the record if field1 is equal to SET.
    The entry for the two words is incremented by 1.
    The two words were memorized in w1 and w2 from the previous record.
    The 'next' statement stops the processing of the current input record and proceeds with the next input record.

    { w1 = $1; w2 = $2 }
    Memorizes the field1 and field2 values in the variables w1 and w2.
    The values will be used for the next record if field1 of that record is equal to SET.
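To see the memorized words in action, here is a small run of the same rules on a made-up sample in which SET sits on its own line for two records and on the same line for a third. The sample lines are hypothetical, and the final `sort` is an addition used only to make the output order predictable.

```shell
# Hypothetical sample: SET on the following line twice, on the same line once.
result=$(printf '%s\n' \
    'bhah ABC' \
    'set' \
    'BHAH ABC' \
    'SET' \
    'bhah 123 set' \
    'chah into 123 values' |
awk '
{ $0 = toupper($0) }
$1 == "SET" { count[w1,w2]++ ; next }                     # SET alone: use the remembered words
$2 == "FROM" || $2 == "INTO" { count[$1,$3]++ ; next }
$3 == "SET" { count[$1,$2]++ ; next }
{ w1 = $1; w2 = $2 }                                      # remember words for a following SET
END {
   for (indx in count) {
      split(indx, i, SUBSEP)
      print i[1], i[2], count[indx]
   }
}' | LC_ALL=C sort)
printf '%s\n' "$result"
```

The two split records are counted together as `BHAH ABC 2`, next to `BHAH 123 1` and `CHAH 123 1`.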


    Don't put spaces in the tested strings.
    $2 == " FROM " will never be true, because whitespace is the field separator and awk removes it when it splits the record into fields.
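A quick way to check this is to compare a field against both forms of the string. The sketch below uses a made-up input line with extra spaces around FROM and assumes the default field separator; the variable names `padded` and `plain` are just for illustration.

```shell
# Sketch: fields never carry the surrounding blanks, even if the input has several.
result=$(printf '%s\n' 'AHAH   FROM   ABC' |
awk '{
    padded = ($2 == " FROM ") ? "match" : "no match"   # string with spaces
    plain  = ($2 == "FROM")   ? "match" : "no match"   # string without spaces
    print "[" $2 "]", "padded:", padded, "plain:", plain
}')
printf '%s\n' "$result"
```

The field prints as `[FROM]` with no blanks inside the brackets, so only the comparison without spaces matches.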
    Jean-Pierre.

  10. #10
    Join Date
    Feb 2004
    Posts
    37
    You are awesome!!

    Thanks and I appreciate all your help!
