  #1 (permalink)  
Old 07-07-04, 11:52
skd skd is offline
Registered User
 
Join Date: Sep 2003
Posts: 71
file manipulation

I believe this is very easy to do in awk. I need some help.

Here is my requirement:

#input data (prepared with the cut command from a different source file)
440 708 BAINBRIDGE TWP PD
440 708 CHARDON PD
440 708 CHESTER TWP PD
440 708 GEAUGA
440 708 MIDDLEFIELD PD
==================
#input data format
==================
col1=3 char
col2=3 char
col3=32 char with space padding, multiple words allowed, separated by spaces
==================
#output data format
==================
col1=3 char
col2=3 char
col3=comma-separated list of the 3rd-column values from the input file, with the padding trimmed
==================
#desired output
==================
440 708 BAINBRIDGE_TWP_PD,CHARDON_PD,CHESTER_TWP_PD,GEAUGA,MIDDLEFIELD_PD
  #2 (permalink)  
Old 07-07-04, 14:13
fla5do fla5do is offline
Registered User
 
Join Date: Oct 2003
Location: Germany
Posts: 138
Hi SKD,
your requirement is not very difficult.

- but -

I think that when col1 or col2 changes value, you want a new row in your desired output.

Is that right?
__________________
Greetings from germany
Peter F.
  #3 (permalink)  
Old 07-07-04, 14:28
skd skd is offline
Registered User
 
Join Date: Sep 2003
Posts: 71
yes, you got that right.
  #4 (permalink)  
Old 07-07-04, 14:31
LKBrwn_DBA LKBrwn_DBA is offline
Registered User
 
Join Date: Jun 2003
Location: West Palm Beach, FL
Posts: 2,456

Try this:

Code:
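#!/bin/ksh
# Assumes MyFile.txt is already sorted on the first two columns.
nawk 'BEGIN {lin=""; s=","}
{key2=substr($0,1,8); txt=substr($0,9); gsub(" ","_",txt);
 if (NR == 1)          {key=key2; lin=txt;}   # first record: nothing to print yet
 else if (key == key2) {lin = lin s txt;}     # same key: append to the list
 else {print key lin; key=key2; lin=txt;}     # key changed: print the finished row
} END {if (NR > 0) print key lin;}' MyFile.txt >NewFile.txt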
__________________
The person who says it can't be done should not interrupt the person doing it. -- Chinese proverb

Last edited by LKBrwn_DBA; 07-07-04 at 14:34.
  #5 (permalink)  
Old 07-07-04, 14:38
vgersh99 vgersh99 is offline
Registered User
 
Join Date: Apr 2004
Location: Boston, MA
Posts: 325
or:

Code:
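nawk '
{
  idx = $1 " " $2;                  # group key: col1 and col2
  gsub("^" $1 " " $2 " ", "");      # strip the key from the front of the record
  gsub(/ /, "_");                   # remaining spaces become underscores
  a[idx] = ( idx in a ) ? a[idx] "," $0 : $0;   # append to the list for this key
}

END {
   for (i in a)
     printf("%s %s\n", i, a[i]);
}' file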
  #6 (permalink)  
Old 07-07-04, 15:22
fla5do fla5do is offline
Registered User
 
Join Date: Oct 2003
Location: Germany
Posts: 138
Hi LKBrwn_DBA,
I am brimming over with enthusiasm for your solution.
I tested it on SCO Unix and it works fine. I can use this for another problem I have. Thanks a lot.

@ skd
I hope col1 and col2 are sorted. Otherwise it does not work.
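
One way to check that precondition before running the script is sketched below. It only tests that equal col1/col2 keys sit on adjacent lines, which is what the key-change logic relies on. The function name `check_grouped` and the message text are invented for this example:

```shell
# Hypothetical check: exit status 0 if every "col1 col2" key forms one contiguous run.
check_grouped() {
  awk '
  BEGIN { bad = 0 }
  { key = $1 " " $2 }
  key != prev {
    # A key that starts a new run but was seen earlier means the input is not grouped.
    if (key in seen) { print "key not contiguous: " key " (line " NR ")"; bad = 1 }
    seen[key] = 1
    prev = key
  }
  END { exit bad }' "$@"
}
```

Usage could look like `check_grouped MyFile.txt || echo "sort the file first"`.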
__________________
Greetings from germany
Peter F.

Last edited by fla5do; 07-07-04 at 15:28.
  #7 (permalink)  
Old 07-07-04, 18:11
LKBrwn_DBA LKBrwn_DBA is offline
Registered User
 
Join Date: Jun 2003
Location: West Palm Beach, FL
Posts: 2,456

Quote:
Originally Posted by fla5do
@ skd
I hope col1 and col2 are sorted. Otherwise it does not work.
True, but you could do this:
Code:
sort MyFile.txt |\
nawk 'BEGIN {lin=""; s=","}
{key2=substr($0,1,8); txt=substr($0,9); gsub(" ","_",txt);
 if (NR == 1)          {key=key2; lin=txt;}   # first record: nothing to print yet
 else if (key == key2) {lin = lin s txt;}     # same key: append to the list
 else {print key lin; key=key2; lin=txt;}     # key changed: print the finished row
} END {if (NR > 0) print key lin;}' >NewFile.txt
__________________
The person who says it can't be done should not interrupt the person doing it. -- Chinese proverb
  #8 (permalink)  
Old 07-08-04, 11:09
skd skd is offline
Registered User
 
Join Date: Sep 2003
Posts: 71
First of all, many thanks to the three of you for your enthusiasm and your solutions.

You guys seem to be gurus in awk.

Thanks a lot.

How do I trim the _ characters at the end of each entry?
The desired output is
===
216 440 BAINBRIDGE_TWP_PD,CHARDON_PD,CHESTER_TWP_PD,GEAUGA_COUNTY
Note: in the input, the 3rd field is 32 chars of left-justified text padded with blanks.
==
instead of
216 440 BAINBRIDGE_TWP_PD__________,CHARDON_PD______________________,CHESTER_TWP_PD_____________,GEAUGA_COUNTY___________________


Thanks again.
  #9 (permalink)  
Old 07-08-04, 11:22
vgersh99 vgersh99 is offline
Registered User
 
Join Date: Apr 2004
Location: Boston, MA
Posts: 325
what's the INDIVIDUAL width of fields separated by commas?
Meaning:
Quote:
BAINBRIDGE_TWP_PD__________,
what IS the total width of this field?
__________________
vlad
+-----------------------+
| #include <disclaimer.h> |
+-----------------------+
  #10 (permalink)  
Old 07-08-04, 11:29
vgersh99 vgersh99 is offline
Registered User
 
Join Date: Apr 2004
Location: Boston, MA
Posts: 325
For example, to make each entry a 20-character field, left-justified and padded with spaces:

Code:
{
  idx= $1 " " $2;
  gsub("^" $1 " " $2 " ", "");
  sub(/ +$/, ""); gsub(/ /, "_");   # strip ALL trailing blanks, then join words with "_"
  $0=sprintf("%-20.20s", $0);
  a[idx] =  ( idx in a ) ? a[idx] "," $0 : $0;
}

END {
   for (i in a)
     printf("%s %s\n", i, a[i]);
}
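
To see what the `%-20.20s` format in the block above does: `-` left-justifies, the first `20` is the minimum width (shorter values are padded with spaces), and `.20` is the precision (longer values are cut to 20 characters). A small sketch follows. The function name `pad20` and the brackets around each value are only there to make the padding visible:

```shell
# Hypothetical demo: show each input line as a 20-character, left-justified field.
pad20() {
  awk '{ printf("[%-20.20s]\n", $0) }' "$@"
}
```

For example, `printf 'CHARDON_PD\n' | pad20` prints `[CHARDON_PD          ]`, while a 22-character value such as `BAINBRIDGE_TOWNSHIP_PD` is cut to `[BAINBRIDGE_TOWNSHIP_]`.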
__________________
vlad
+-----------------------+
| #include <disclaimer.h> |
+-----------------------+
  #11 (permalink)  
Old 07-08-04, 13:54
skd skd is offline
Registered User
 
Join Date: Sep 2003
Posts: 71
The individual width of each comma-separated entry in the output is variable.

This is how I am getting the input:

cat aciw.oh|\
awk ' { print substr($0,2,3) " " substr($0,5,3) " " substr($0,94,32) } '|\
grep -v "^[HT]L"|sort -u > input.txt

cat input.txt
=========
216 440 BAINBRIDGE TOWNSHIP PD
216 440 CHARDON PD
216 440 CHESTER TOWNSHIP PD
216 440 GEAUGA COUNTY

My desired output
==============
216 440 BAINBRIDGE_TOWNSHIP_PD,CHARDON_PD,CHESTER_TOWNSHIP_PD,GEAUGA_COUNTY
  #12 (permalink)  
Old 07-08-04, 14:22
vgersh99 vgersh99 is offline
Registered User
 
Join Date: Apr 2004
Location: Boston, MA
Posts: 325
Code:
{
  idx= $1 " " $2;
  gsub("^" $1 " " $2 " ", "");
  sub(/ +$/, ""); gsub(/ /, "_");   # strip ALL trailing blanks, then join words with "_"
  a[idx] =  ( idx in a ) ? a[idx] "," $0 : $0;
}

END {
   for (i in a)
     printf("%s %s\n", i, a[i]);
}
__________________
vlad
+-----------------------+
| #include <disclaimer.h> |
+-----------------------+