#1
03-16-09, 03:31
smarvin
Registered User
Join Date: Feb 2009
Posts: 8
Can't get awk to print a column

I am enclosing the script, the input file and the output. The problem is that the END action of the script won't work for $5: it gives me a value of 0. If I use $3 or $4 in line 7 of the script, it works fine. Can anyone explain why it's not recognizing $5? Also, there is an annoying '0' appearing in $6 on line 2 of the output that I can't get rid of, and I can't find its source (this is only a 5-column file). Any ideas?

Thanks! (PS - I know it looks ugly... apparently formatting doesn't appear to carry over as written).

command: awk -f report.awk filename

Input file:

Code:
First Name      Last Name       Rate    Hours   Total Pay
  George          White         18.00     23
   Mark            Red          18.10     20
   Mary           Blue          10.89     25
    Dan           Black         12.00      0
   Susan          Green         18.00     40
    Nora          Brown         17.20     46
   Bruce          Purple        12.20     52
   John           Gray          11.00     39
    Bob           Gold          15.00     45
   Steve          Silver        14.67     25
Script:

Code:
BEGIN {print "          Weekly Report"}
      {OFS = "\t"}
      {actual = $3 * $4}
      {base = $3 * 40}
      {ovt = ($4 - 40) * 1.5 * $3}
      {total = 0}
      {totPay += $5}

{
 if ($4 > "40")
  print $0, base + ovt

 else
  print $0, actual
}

END   {print "Total Payroll",":", totPay}
Output:

Code:
                Weekly Report
First Name      Last Name       Rate    Hours   Total Pay     0
  George          White         18.00     23    414
   Mark            Red          18.10     20    362
   Mary           Blue          10.89     25    272.25
    Dan           Black         12.00      0    0
   Susan          Green         18.00     40    720
    Nora          Brown         17.20     46    842.8
   Bruce          Purple        12.20     52    707.6
   John           Gray          11.00     39    429
    Bob           Gold          15.00     45    712.5
   Steve          Silver        14.67     25    366.75
Total Payroll   :       0

Last edited by Pat Phelan; 03-16-09 at 13:09. Reason: Added code blocks to improve formatting
#2
03-16-09, 06:31
mike_bike_kite
vaguely human
Join Date: Jun 2007
Location: London
Posts: 2,517
Quote:
Can anyone explain why it's not recognizing $5?
These variables ($1 ... $5) refer to the input fields, so $1 is the value in the first field, i.e. George in the first data record. If you count the input fields you'll see that the data lines have only 4 fields, so $5 refers to nothing at all. Incidentally, $0 refers to the whole line, and you can change the field delimiter to something other than whitespace.
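A quick way to check the field count is to have awk print NF, the number of fields it found on each line. This is a minimal sketch, assuming the default whitespace field separator; `sample.txt` is a made-up file name holding the header and the first data row from the input above.

```shell
# Write a two-line sample of the input file (header plus one data row).
cat > sample.txt <<'EOF'
First Name      Last Name       Rate    Hours   Total Pay
  George          White         18.00     23
EOF

# NF is the number of fields on the current line.
# $5 prints as an empty string on any line that has fewer than 5 fields.
awk '{ print NF ": $5=[" $5 "]" }' sample.txt
```

With whitespace splitting, the header line breaks into 8 fields (each word counts separately), while the data row has 4, so $5 is empty on every data row.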
Quote:
Also, there is an annoying '0' this is appearing in line 2 of $6 that I can't get rid of and can't find the source (this is only a 5 column file)...any ideas?
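The header line is the only line that actually contains a value for $5, but that isn't where the 0 comes from. Your script runs the pay calculation on every line, header included. The header's fields aren't numbers, so awk treats them as 0 in the arithmetic and prints the result after $0.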
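The 0 on the header line can be reproduced in isolation. A minimal sketch, assuming the default whitespace field separator: a field that doesn't look like a number counts as 0 in arithmetic, so multiplying two words from the header gives 0.

```shell
# $3 is "Last" and $4 is "Name" on the header line; neither is numeric,
# so awk uses 0 for both and the product is 0.
echo "First Name      Last Name       Rate    Hours   Total Pay" |
awk '{ print $3 * $4 }'
```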
Quote:
I know it looks ugly... apparently formatting doesn't appear to carry over as written
If you highlight all your code and then hit the # button in the edit window then it will keep your original formatting.
#3
03-16-09, 21:53
smarvin
Registered User
Join Date: Feb 2009
Posts: 8
Maybe you can help me with this then...My fifth input column has the heading 'Total Pay'. Since my BEGIN statement and body are set to calculate the 'Total Pay', why doesn't the END statement recognize the totals that wind up in column 5?
#4
03-16-09, 21:56
smarvin
Registered User
Join Date: Feb 2009
Posts: 8
oh..and thanks for the formatting solution.
#5
03-17-09, 06:23
mike_bike_kite
vaguely human
Join Date: Jun 2007
Location: London
Posts: 2,517
Code:
BEGIN {print "          Weekly Report"}
      {OFS = "\t"}
      {actual = $3 * $4}
      {base = $3 * 40}
      {ovt = ($4 - 40) * 1.5 * $3}
      {total = 0}
      {totPay += $5}

{
 if ($4 > "40")
  print $0, base + ovt

 else
  print $0, actual
}

END   {print "Total Payroll",":", totPay}
A few points:
  • The indenting around your BEGIN statement implies you have 7 statements in your BEGIN block; however, the bracketing means only the print statement is part of BEGIN. The other statements are run for every input line.
  • You wrap each statement in its own curly brackets, which achieves nothing. The brackets are used to group a bunch of statements so they run together.
  • There is a test where you compare a number field to "40". Do you really want to see if the field is alphabetically greater than "40"? Compare against 40 without the quotes.
  • You are still computing your total on the 5th column ($5), which doesn't exist in your input file. Printing an extra value after $0 doesn't add a field to the record, so accumulate the pay you calculate instead (note that total is reset to 0 on every line, so it can't hold a running sum as written).
  • You are currently calculating actual values for all lines including the header line. There is a built-in variable called NR which holds the number of the line you are processing. You could use this to skip the first line.
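Putting the points above together, here is one possible rewrite of the script. It is a sketch rather than the only way to do it. The file names `payroll.txt` and `report_fixed.awk` and the variables `rate`, `hours` and `pay` are made up for illustration; the data is the input file from the first post, and the default whitespace field separator is assumed.

```shell
# Recreate the input file from the first post.
cat > payroll.txt <<'EOF'
First Name      Last Name       Rate    Hours   Total Pay
  George          White         18.00     23
   Mark            Red          18.10     20
   Mary           Blue          10.89     25
    Dan           Black         12.00      0
   Susan          Green         18.00     40
    Nora          Brown         17.20     46
   Bruce          Purple        12.20     52
   John           Gray          11.00     39
    Bob           Gold          15.00     45
   Steve          Silver        14.67     25
EOF

cat > report_fixed.awk <<'EOF'
# Runs once, before any input is read.
BEGIN {
    OFS = "\t"
    print "          Weekly Report"
}

# Header line: print it unchanged and skip the pay calculation.
NR == 1 { print; next }

# Every other line: work out the pay, print it, add it to the total.
{
    rate  = $3 + 0                  # "+ 0" forces a numeric value
    hours = $4 + 0
    if (hours > 40)                 # numeric comparison: 40, not "40"
        pay = rate * 40 + (hours - 40) * 1.5 * rate
    else
        pay = rate * hours
    print $0, pay
    totPay += pay                   # sum the computed pay, not $5
}

# Runs once, after the last line.
END { print "Total Payroll", ":", totPay }
EOF

awk -f report_fixed.awk payroll.txt
```

With this data the header line is printed without the stray 0, and the last line reports a total of 4826.9.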
This site is quite good for showing how to use awk effectively.

Mike