If this is your first visit, be sure to check out the FAQ by clicking the link above. You may have to register before you can post: click the register link above to proceed. To start viewing messages, select the forum that you want to visit from the selection below.

 
Go Back  dBforums > Data Access, Manipulation & Batch Languages > Unix Shell Scripts > Join two files using AWK

Reply
 
LinkBack Thread Tools Search this Thread Display Modes
  #1 (permalink)  
Old 06-06-06, 23:14
madhunk madhunk is offline
Registered User
 
Join Date: Jun 2006
Posts: 4
Join two files using AWK

Two files need to be joined on a second column. File 1 is standard and it has only two columns. However, File 2 will be changing on the number of columns after 2nd column. One file can have 30 fields and other file has 80 fields. File 1 always will have two columns.

File 1

PHP Code:
FRO     FRONTIER 
MWC    MIDWEST 
ORB     ORBITZ 
ATR     AIRTRAN 
BWI     BESTWESTERN 
File 2

PHP Code:
M       AIRTRAN 8       TRIPS_YTD       A 
M       BESTWESTERN 8       TRIPS_YTD       B       20      TRIPS_TOTAL     A       17 
M       FRONTIER 8 
M       CAESAR 8       TRIPS_YTD       D       10 
M       ORBITZ 8       TRIPS 
Desired Join Output:

PHP Code:
ATR       AIRTRAN 8       TRIPS_YTD       A 
BWI       BESTWESTERN 8       TRIPS_YTD       B       20      TRIPS_TOTAL     A       17 
FRO       FRONTIER 8 
ORB       ORBITZ 8       TRIPS 
Please let me know if there is a way to do this using NAWK...
Reply With Quote
  #2 (permalink)  
Old 06-07-06, 01:06
Pat Phelan Pat Phelan is online now
Resident Curmudgeon
 
Join Date: Feb 2004
Location: In front of the computer
Posts: 12,614
This smells like homework to me...

Yes, it can certainly be done using NAWK. Read the first file into a hash (a NAWK array) in the BEGIN block, then do a hash lookup to replace the first field of each line in the second file with the hashed lookup from the original file.

-PatP
Reply With Quote
  #3 (permalink)  
Old 06-07-06, 10:14
madhunk madhunk is offline
Registered User
 
Join Date: Jun 2006
Posts: 4
Thank you Pat...

This is definitely not a homework problem....I work for a bank and we have to do something like this....Since, I am not an expert in awk, I was trying to achieve it through something like this...

Code:
#!/usr/bin/ksh

DIR=$1
CODES_FILE=$2
METADATA_FILE=$3

cd ${DIR}

sort -k 2,2 ${CODES_FILE} > f1.dat
sort -k 2,2 ${METADATA_FILE} > f2.dat

awk '
BEGIN { OFS = "\t" } { print $2 , $1, $2 }
' f1.dat > temp_f1.dat

awk '
BEGIN { OFS = "\t" } { print $2 , $2, $3 ,$4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24, $25, $26, $27  }
' f2.dat > temp_f2.dat

join  -j1 1 -j2 1 -o 1.2 1.3 2.3 2.4 2.5 2.6 2.7 2.8 2.9 2.10 2.11 2.12 2.13 2.14 2.15 2.16 2.17 2.18 2.19 2.20 2.21 2.22 2.23 2.24 2.25 2.26 2.27 temp_f1.da
t temp_f2.dat | tr -s ' ' '\t' > f3.dat
The reason why I didn't like this is because File2 has dynamically changing columns. The current code takes only 27 fields and if I have 80 fields, then it would be a problem.

File 1 is always constant and has only 2 columns.

Please advice
Reply With Quote
  #4 (permalink)  
Old 06-07-06, 10:41
madhunk madhunk is offline
Registered User
 
Join Date: Jun 2006
Posts: 4
This is what I did....

Code:
nawk '
  NR==FNR{
    a[$2]=$1;
    next
 }a[$2]&&sub($1,a[$2])' ${CODES_FILE} ${METADATA_FILE} > f3.dat
Reply With Quote
Reply

Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is Off
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On