  #1 (permalink)  
Old 06-21-04, 15:31
vootkur vootkur is offline
Registered User
 
Join Date: Jun 2004
Posts: 11
working with files

Hi,

How do I convert a fixed-width file to a comma-delimited file?
Example:
state Desc
----- ----
TX Texas
MI Michigan

I need to remove the header and convert the file to:
TX,Texas
MI,Michigan
  #2 (permalink)  
Old 06-21-04, 15:41
vgersh99 vgersh99 is offline
Registered User
 
Join Date: Apr 2004
Location: Boston, MA
Posts: 325
Code:
nawk -v OFS=, 'FNR>2 && $1=$1' file
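
A note on why this works, as a minimal sketch rather than a definitive reading: assigning a field to itself makes awk rebuild $0 with OFS between the fields, and FNR>2 skips the two header lines. The version below spells the action out with an explicit print. It uses plain `awk` instead of `nawk`, which is an assumption about what is installed; substitute `nawk` on Solaris. The pattern-only form in the post also drops any line whose first field is empty or 0, which is why blank lines disappear there.

```shell
# Sample input from the first post, piped in on stdin.
result=$(printf '%s\n' 'state Desc' '----- ----' 'TX Texas' 'MI Michigan' |
  awk -v OFS=, 'FNR > 2 { $1 = $1; print }')   # $1 = $1 rebuilds $0 using OFS
echo "$result"
```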
  #3 (permalink)  
Old 06-22-04, 14:37
vootkur
Hi vgersh,

Thank you very much for your help.
Could you help me write a script that is generalized, so that whatever fixed-width file I give it as input, the output file is comma delimited with no header or footer? It should work for any number of columns. Can you also suggest what should be done when a column has no data? For now, if a column has no data, the next column's value ends up right after the comma.

Thank you
vootkur
  #4 (permalink)  
Old 06-22-04, 14:41
vgersh99
I'll help if you're willing to help yourself.

What have you tried so far?
  #5 (permalink)  
Old 06-22-04, 14:45
vootkur
Hi vgersh,

Since I am new to scripting, I am running into a lot of problems, and on top of that I don't have much time to learn and get things done. That's the problem.

Sorry if I am troubling you guys.
Thanks
vootkur
  #6 (permalink)  
Old 06-22-04, 14:52
vgersh99
Have you tried what I posted?
You never mentioned 'footers', either in your original posting or in the sample file.

Please provide a complete sample file.

I cannot provide a complete solution, but will try to give enough to get you started - you'll have to do your share.
  #7 (permalink)  
Old 06-22-04, 14:59
vootkur
Thank you, vgersh.

Yes, sure. Just help me get started and I will take it from there.
Here is the sample file.

Num State Desc
----- ----- ----
111 PA PENNSYLVANIA

222 NJ NEW JERSEY

333 NJ NEW JERSEY

444 NJ NEW JERSEY

555 NJ NEW JERSEY



5record(s) selected.
This is the kind of sample file that I will be getting,
but I need to write a generalized script that can handle any number of columns.
Just give me some suggestions and I will start from there.

vootkur
  #8 (permalink)  
Old 06-22-04, 15:18
vgersh99
do you have gawk available?
__________________
vlad
+-----------------------+
| #include <disclaimer.h> |
+-----------------------+
  #9 (permalink)  
Old 06-22-04, 15:21
vootkur
Quote:
do you have gawk available?
__________________
+----------------------------+
| Vlad Geshkovich |
| #include <disclaimer.h> |
+----------------------------+

No. It says there is no manual entry for gawk.
  #10 (permalink)  
Old 06-22-04, 16:14
vgersh99
It's going to be hard to write a 'generic' script that handles any number of columns, because you have embedded spaces in your columns ('NEW JERSEY'):
Quote:
222 NJ NEW JERSEY
It would be easier to make your DB extraction use a different column separator and put quotes around field values, more or less like CSV.
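
To make the problem concrete, here is a small sketch. The column widths (6, 6, and the rest of the line) are assumptions chosen for illustration, since the real widths are not visible in the pasted sample, and plain `awk` stands in for `nawk`. Whitespace splitting cuts 'NEW JERSEY' in two and silently loses an empty State column, while cutting by position keeps both.

```shell
# Two hypothetical fixed-width lines; the second has an empty State column.
sample=$(printf '%-6s%-6s%s\n' 222 NJ 'NEW JERSEY' 333 '' 'NEW JERSEY')

# Splitting on whitespace: wrong field count in both lines.
by_space=$(printf '%s\n' "$sample" | awk -v OFS=, '{ $1 = $1; print }')

# Cutting by character position: the description stays whole and the
# empty column survives as an empty field between two commas.
by_pos=$(printf '%s\n' "$sample" | awk -v OFS=, '{
  num = substr($0, 1, 6); state = substr($0, 7, 6); desc = substr($0, 13)
  sub(/ +$/, "", num); sub(/ +$/, "", state)   # strip the padding
  print num, state, desc
}')

echo "$by_space"
echo "$by_pos"
```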

Here's an attempt to deal with your sample file with the embedded spaces.
The variable 'FIELDWIDTHS' contains the widths of your three fields.

If this logic works for you, you can make the variable 'FIELDWIDTHS' a parameter to the script, specifying the field widths that you want to parse.
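
One way to pass the widths in from the command line, as a rough standalone sketch: the wrapper name `fw_to_csv` and the awk variable `widths` are made up for this example, and the widths "6 6 15" are assumed. The variable is deliberately not called FIELDWIDTHS here, because gawk gives that name a special meaning and would switch to its own fixed-width splitting.

```shell
# Hypothetical wrapper: fw_to_csv "WIDTH1 WIDTH2 ..." [file ...]
fw_to_csv() {
  widths=$1; shift
  awk -v widths="$widths" '
    BEGIN { n = split(widths, w, " ") }          # one width per column
    FNR > 2 && !/^[ \t]*$/ && !/record\(s\) selected/ {
      start = 1; out = ""
      for (i = 1; i <= n; i++) {
        f = substr($0, start, w[i]); start += w[i]
        gsub(/^ +| +$/, "", f)                   # trim the padding
        out = out (i > 1 ? "," : "") f
      }
      print out
    }' "$@"
}

# Usage with an assumed 6/6/15 layout:
sample=$(
  printf '%-6s%-6s%s\n' Num State Desc
  printf '%s\n' '----- ----- ----'
  printf '%-6s%-6s%s\n' 111 PA PENNSYLVANIA 222 NJ 'NEW JERSEY'
)
csv=$(printf '%s\n' "$sample" | fw_to_csv "6 6 15")
echo "$csv"
```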

To run the code below (saved as voot.awk):

nawk -f voot.awk file2format

Code:
function setFieldsByWidth(   i,n,FWS,start,copyd0) {
# Licensed under GPL Peter S Tillier, 2003
# NB corrupts $0
  copyd0 = $0                             # make copy of $0 to work on
  if (length(FIELDWIDTHS) == 0) {
    print "You need to set the width of the fields that you require" > "/dev/stderr"
    print "in the variable FIELDWIDTHS (NB: Upper case!)" > "/dev/stderr"
    exit(1)
  }

  if (!match(FIELDWIDTHS,/^[0-9 ]+$/)) {
    print "The variable FIELDWIDTHS must contain digits, separated" > "/dev/stderr"
    print "by spaces." > "/dev/stderr"
    exit(1)
  }

  n = split(FIELDWIDTHS,FWS)

  if (n == 1) {
    print "Warning: FIELDWIDTHS contains only one field width." > "/dev/stderr"
    print "Attempting to continue." > "/dev/stderr"
  }

  start = 1
  for (i=1; i <= n; i++) {
    $i = trim(substr(copyd0,start,FWS[i]))
    start = start + FWS[i]
  }
  return n;
}

# Note that the "/dev/stderr" entries in some lines have wrapped.
#
# I then call setFieldsByWidth() in my main awk code as follows:

function trim(str)
{
  sub("^[ ]*", "", str);
  sub("[ ]*$", "", str);
  return str;
}

BEGIN {
  #FIELDWIDTHS="7 6 5 4 3 2 1" # for example
  FIELDWIDTHS="3 3 15"
  OFS=","
}
!/^[ \t]*$/ && FNR > 2 && !/record\(s\) selected\./ {
  saveDollarZero = $0 # if you want it later
  numFields = setFieldsByWidth()
  # now we can manipulate $0, NF and $1 .. $NF as we wish
  for(i=1; i <= numFields; i++)
     printf("%s%s", $i, (i != numFields) ? OFS : ORS);
}
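
For the "any number of columns" part of the question, one possible direction is to read the column layout from the dashed line under the header instead of hard-coding the widths. This is only a sketch under assumptions: that line 2 of the real file is a ruler whose dash runs span each column's full width, that the last column may run to the end of the line, and that plain `awk` is available. The function name `to_csv` is made up for illustration.

```shell
# Hypothetical helper: derive column positions from the dashed ruler on line 2.
to_csv() {
  awk '
    FNR == 2 {                          # e.g. "----- ----- ------------"
      n = 0; pos = 1; rest = $0
      while (match(rest, /-+/)) {       # each run of dashes is one column
        n++
        start[n] = pos + RSTART - 1     # position within the full line
        len[n] = RLENGTH
        pos += RSTART + RLENGTH - 1
        rest = substr(rest, RSTART + RLENGTH)
      }
      next
    }
    FNR < 2 { next }                    # column names
    /^[ \t]*$/ { next }                 # blank lines
    /record\(s\) selected/ { next }     # footer
    {
      out = ""
      for (i = 1; i <= n; i++) {
        # the last column takes everything up to the end of the line
        f = (i < n) ? substr($0, start[i], len[i]) : substr($0, start[i])
        gsub(/^[ \t]+|[ \t]+$/, "", f)
        out = out (i > 1 ? "," : "") f
      }
      print out
    }' "$@"
}

# Usage on made-up data laid out like the sample (second row has no State):
sample=$(
  printf '%-6s%-6s%s\n' Num State Desc
  printf '%s\n' '----- ----- ------------'
  printf '%-6s%-6s%s\n' 111 PA PENNSYLVANIA
  printf '\n'
  printf '%-6s%-6s%s\n' 222 '' 'NEW JERSEY'
  printf '\n%s\n' '  2 record(s) selected.'
)
csv=$(printf '%s\n' "$sample" | to_csv)
echo "$csv"
```

If the ruler in the real extract does not line up with the data, this approach will not work and an explicit width list is the safer choice.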