  #1 (permalink)  
Old 03-11-04, 18:21
gomes009 gomes009 is offline
Registered User
 
Join Date: Feb 2004
Posts: 37
Parsing Values

From the following file format I need to count the number of occurrences of "BHAH", "CHAH" and "AHAH" (in both upper and lower case), grouped by ABC, 123 and BBB.

bhah ABC set

BHAH ABC SET

ahah from ABC where

AHAH FROM ABC WHERE

CHAH INTO ABC VALUES

chah into ABC values


CHAH INTO 123 VALUES

chah into 123 values

AHAH FROM 123 WHERE

ahah from 123 where

BHAH 123 SET

bhah 123 set



bhah BBB set

BHAH BBB SET

CHAH INTO BBB VALUES

chah into BBB values

ahah from BBB

AHAH FROM BBB


I am looking for the following output

BHAH ABC 2
AHAH ABC 2
CHAH ABC 2
BHAH 123 2
AHAH 123 2
CHAH 123 2
BHAH BBB 2
AHAH BBB 2
CHAH BBB 2

Any help is appreciated
  #2 (permalink)  
Old 03-12-04, 03:21
aigles aigles is offline
Registered User
 
Join Date: Jan 2004
Location: Bordeaux, France
Posts: 319
The following script does the job.
It selects the lines whose second word is FROM or INTO, and the lines whose third word is SET.

Code:
awk '
{ $0 = toupper($0) }
$2 == "FROM" || $2 == "INTO" { count[$1,$3]++ }
$3 == "SET" { count[$1,$2]++ }
END {
   for (indx in count) {
      split(indx, i, SUBSEP)
      print i[1],i[2],count[indx]
   }
}
' input_file
__________________
Jean-Pierre.
  #3 (permalink)  
Old 03-12-04, 08:56
gomes009 gomes009 is offline
Registered User
 
Join Date: Feb 2004
Posts: 37
Re: Parsing Values

Thank you

Please help me understand how that just happened..

:-)
  #4 (permalink)  
Old 03-12-04, 09:34
aigles aigles is offline
Registered User
 
Join Date: Jan 2004
Location: Bordeaux, France
Posts: 319
{ $0 = toupper($0) }
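Converts the whole record to uppercase, so that lowercase and uppercase keywords are counted together. Assigning to $0 also re-splits the record into fields.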
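
A small sketch of that first rule on its own, using two sample lines taken from the question. The variable name 'out' is only there for illustration.

```shell
# Both spellings of the keyword end up as the same uppercase word in $1.
out=$(printf 'bhah ABC set\nBHAH ABC SET\n' |
awk '{ $0 = toupper($0); print $1 }')
printf '%s\n' "$out"
```

Both lines print BHAH, which is why a single array entry can collect both of them.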

$2 == "FROM" || $2 == "INTO"
Selects the record if field2 is equal to FROM or INTO.
In that case the two words to keep are field1 and field3.

{ count[$1,$3]++ }
The 'count' array holds the number of occurrences of each pair of words.
The entry for the current pair is incremented by 1.

$3 == "SET" { count[$1,$2]++ }
Selects the record if field3 is equal to SET.
In that case the two words to keep are field1 and field2.

{ count[$1,$2]++ }
The entry for the two words is incremented by 1.

END { . . . }
The code is executed when all records of the input file are read.

for (indx in count) { . . . }
The code is executed for every index of the 'count' array.
The index is formed from the multiple subscripts, joined by the character SUBSEP.
For example, count["aaa","bbb"] is equivalent to count["aaa" SUBSEP "bbb"].
The loop variable 'indx' is assigned this second, joined form of the index.

split(indx, i, SUBSEP)
The index is split into the 'i' array, using the character SUBSEP as the delimiter within 'indx'.
indx = "aaa" SUBSEP "bbb" => i[1]="aaa" i[2]="bbb"
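
To see the joined index and the split in isolation, here is a minimal sketch. The keys "aaa" and "bbb" and the value 5 are made-up test values, not data from the thread.

```shell
out=$(awk 'BEGIN {
    count["aaa", "bbb"] = 5            # stored under the key "aaa" SUBSEP "bbb"
    for (indx in count) {
        n = split(indx, i, SUBSEP)     # n is the number of pieces found
        print n, i[1], i[2], count[indx]
    }
    # The explicit concatenation addresses the same element.
    print count["aaa" SUBSEP "bbb"]
}')
printf '%s\n' "$out"
```

This prints "2 aaa bbb 5" followed by "5", showing that both forms of the subscript reach the same entry.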

print i[1],i[2],count[indx]
Prints the two words and the number of occurrences.
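
One thing to keep in mind: awk does not promise any particular order for a "for (indx in count)" loop, so the lines may not come out in the order shown in the question. A possible sketch, assuming a sorted listing is acceptable, is to pipe the result through sort. The four input lines are a shortened sample and the sort keys are just one choice.

```shell
out=$(printf '%s\n' 'bhah ABC set' 'BHAH ABC SET' 'ahah from ABC where' 'CHAH INTO 123 VALUES' |
awk '
{ $0 = toupper($0) }
$2 == "FROM" || $2 == "INTO" { count[$1,$3]++ }
$3 == "SET" { count[$1,$2]++ }
END {
   for (indx in count) {
      split(indx, i, SUBSEP)
      print i[1], i[2], count[indx]
   }
}' | LC_ALL=C sort -k2,2 -k1,1)   # order by second word, then by first word
printf '%s\n' "$out"
```

With this sample the listing is "CHAH 123 1", "AHAH ABC 1", "BHAH ABC 2", and it stays the same whichever awk is used.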


Hope this helps.
__________________
Jean-Pierre.
  #5 (permalink)  
Old 03-12-04, 09:50
gomes009 gomes009 is offline
Registered User
 
Join Date: Feb 2004
Posts: 37
Re: Parsing Values

When I run the following code I am getting the errors listed below. Please help me if I am missing anything

Thanks

#!/bin/ksh
awk '
{ $0 = toupper($0) }
$2 == "FROM" || $2 == "INTO" { count[$1,$3]++ }
$3 == "SET" { count[$1,$2]++ }
END
{
for (indx in count)
{
split(indx, i, SUBSEP)
print i[1],i[2],count[indx]
}
} ' /home/dba/test.dat.load


awk: syntax error near line 3
awk: illegal statement near line 3
awk: syntax error near line 4
awk: illegal statement near line 4
awk: syntax error near line 6
awk: bailing out near line 6
  #6 (permalink)  
Old 03-12-04, 10:02
aigles aigles is offline
Registered User
 
Join Date: Jan 2004
Location: Bordeaux, France
Posts: 319
Try replacing 'awk' with 'nawk'.
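
If the same script has to run on several machines, one possible approach is to let it choose the interpreter itself. This is only a sketch: it assumes the plain 'awk' on the failing machine is an older implementation that lacks features the script uses, and the preference order (nawk, then gawk, then awk) and the variable names are my own choice.

```shell
# Fall back to plain awk when neither nawk nor gawk is on the PATH.
AWK=awk
for candidate in nawk gawk; do
    if command -v "$candidate" >/dev/null 2>&1; then
        AWK=$candidate
        break
    fi
done

# Quick check that the chosen interpreter understands toupper().
out=$(printf 'bhah ABC set\n' | "$AWK" '{ $0 = toupper($0); print $1, $2 }')
printf '%s\n' "$out"
```

The check prints "BHAH ABC" when the chosen interpreter supports the function.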
__________________
Jean-Pierre.
  #7 (permalink)  
Old 03-12-04, 11:44
gomes009 gomes009 is offline
Registered User
 
Join Date: Feb 2004
Posts: 37
Re: Parsing Values

It works. Thanks for your help. Also, how would I get the values if the "SET" keyword is on the next line?

Thanks

EX:

bhah ABC
set

BHAH ABC
SET

BHAH 123
SET

bhah 123
set



bhah BBB
set

BHAH BBB
SET
  #8 (permalink)  
Old 03-12-04, 11:54
gomes009 gomes009 is offline
Registered User
 
Join Date: Feb 2004
Posts: 37
Re: Parsing Values

Also I wanted to know if I could put spaces before and after the

" INTO "
" FROM "
" SET "

Thanks for your help!!!
  #9 (permalink)  
Old 03-12-04, 13:07
aigles aigles is offline
Registered User
 
Join Date: Jan 2004
Location: Bordeaux, France
Posts: 319
Try this new version of the script:

Code:
awk '
{ $0 = toupper($0) }
$1 == "SET" { count[w1,w2]++ ; next }
$2 == "FROM" || $2 == "INTO" { count[$1,$3]++ ; next }
$3 == "SET" { count[$1,$2]++ ;next }
{ w1 = $1; w2 = $2 }
END {
   for (indx in count) {
      split(indx, i, SUBSEP)
      print i[1],i[2],count[indx]
   }
}
' input_file
$1 == "SET" { count[w1,w2]++ ; next }
Selects the record if field1 is equal to SET.
The entry for the two words is incremented by 1.
The two words were saved in w1 and w2 when the previous record was read.
The 'next' statement stops processing the current input record and moves on to the next one.

{ w1 = $1; w2 = $2 }
Saves the field1 and field2 values in the variables w1 and w2. Because of the 'next' statements, only records that matched none of the rules above reach this rule.
The values are used for the next record if field1 of that record is equal to SET.
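
Here is a short run of the new version on a sample where SET sits on its own line, as a sketch of how w1 and w2 carry over. The five input lines are a reduced sample, and the final sort is only there to make the order predictable.

```shell
out=$(printf '%s\n' 'bhah ABC' 'set' 'BHAH ABC' 'SET' 'chah into ABC values' |
awk '
{ $0 = toupper($0) }
$1 == "SET" { count[w1,w2]++ ; next }      # uses the words saved from the previous line
$2 == "FROM" || $2 == "INTO" { count[$1,$3]++ ; next }
$3 == "SET" { count[$1,$2]++ ; next }
{ w1 = $1; w2 = $2 }                       # remember this line for a following SET
END {
   for (indx in count) {
      split(indx, i, SUBSEP)
      print i[1], i[2], count[indx]
   }
}' | LC_ALL=C sort)
printf '%s\n' "$out"
```

The result is "BHAH ABC 2" and "CHAH ABC 1". Note that this relies on SET being on the line immediately after the two words: an empty line in between would reach the last rule and overwrite w1 and w2 with empty strings.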


Don't put spaces in the tested strings.
$1 == " FROM " can never be true, because blanks are the field separator and awk strips them when it splits the record into fields.
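
A quick way to convince yourself, as a minimal sketch with a made-up input line that has extra blanks around the keyword:

```shell
out=$(printf 'chah   INTO   ABC\n' | awk '{
    printf "[%s]\n", $2          # the brackets show the field has no blanks around it
    if ($2 == "INTO")
        print "plain: match"
    else
        print "plain: no match"
    if ($2 == " INTO ")
        print "padded: match"
    else
        print "padded: no match"
}')
printf '%s\n' "$out"
```

It prints "[INTO]", then "plain: match", then "padded: no match", so extra blanks in the input file do no harm, and the tested strings should stay without them.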
__________________
Jean-Pierre.
  #10 (permalink)  
Old 03-12-04, 13:23
gomes009 gomes009 is offline
Registered User
 
Join Date: Feb 2004
Posts: 37
You are awesome!!

Thanks and I appreciate all your help!