  1. #1
    Join Date
    Jun 2004
    Posts
    11

    Unanswered: working with files

    Hi,

    How do I convert a fixed-width file to a comma-delimited file?
    Example:
    state Desc
    ----- ----
    TX Texas
    MI Michigan

    I need to remove the header and convert the file to:
    TX,Texas
    MI,Michigan

  2. #2
    Join Date
    Apr 2004
    Location
    Boston, MA
    Posts
    325
    Code:
    nawk -v OFS=, 'FNR>2 && $1=$1' file
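
    For readers new to the idiom, here is a hedged walk-through of the one-liner. `FNR>2` skips the two header lines, and assigning `$1=$1` makes awk rebuild the record with `OFS` between the fields. The wrapper name `strip_header_csv` is hypothetical, the parentheses are added only for readability, and plain `awk` stands in for `nawk` on the assumption that nawk is not installed everywhere.

```shell
# strip_header_csv is a hypothetical wrapper name, not from the thread.
# It reads the text on stdin, or from the files named as arguments.
strip_header_csv() {
  # FNR > 2   : skip the title line and the dashed rule under it
  # $1 = $1   : touching a field makes awk rejoin the fields with OFS;
  #             the assignment is false for blank lines, so they are dropped
  awk -v OFS=, 'FNR > 2 && ($1 = $1)' "$@"
}

printf 'state Desc\n----- ----\nTX Texas\nMI Michigan\n' | strip_header_csv
```

    On the sample from post #1 this prints TX,Texas and MI,Michigan. Two caveats: fields are split on whitespace, so a value containing a space becomes two fields, and a line whose first field is 0 is also dropped because the assignment then evaluates to false.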

  3. #3
    Join Date
    Jun 2004
    Posts
    11
    Hi vgersh,

    Thank you very much for your help.
    Could you help me write a script that is generalised, so that whatever fixed-width file I give as input, the output file is comma delimited with no header or footer? It should work for any number of columns. Can you also suggest what should be done if there is no data in a column? For now, if there is no data, the next column's value sits right next to the comma.

    Thank you,
    vootkur

  4. #4
    Join Date
    Apr 2004
    Location
    Boston, MA
    Posts
    325
    I'll help if you're willing to help yourself.

    What have you tried so far?

  5. #5
    Join Date
    Jun 2004
    Posts
    11
    Hi vgersh,

    Since I am new to scripting I am facing a lot of problems, and on top of that I don't have much time to learn and do things. That's the problem.

    Sorry if I am troubling you guys.
    Thanks,
    vootkur

  6. #6
    Join Date
    Apr 2004
    Location
    Boston, MA
    Posts
    325
    Have you tried what I've posted?
    It seems you never mentioned the 'footers', neither in your original posting nor in the sample file.

    Please provide a complete sample file.

    I cannot provide a complete solution, but I will try to give you enough to get started. You'll have to do your share.

  7. #7
    Join Date
    Jun 2004
    Posts
    11
    Thank you, vgersh.

    Yes, sure. Just help me get started and I will take it from there.
    Here is the sample file.

    Num State Desc
    ----- ----- ----
    111 PA PENNSYLVANIA

    222 NJ NEW JERSEY

    333 NJ NEW JERSEY

    444 NJ NEW JERSEY

    555 NJ NEW JERSEY



    5record(s) selected.
    This is the kind of sample file I would be getting,
    but I need to write a generalised script that can handle any number of columns.
    Just give me some suggestions and I will start from there.

    vootkur

  8. #8
    Join Date
    Apr 2004
    Location
    Boston, MA
    Posts
    325
    do you have gawk available?
    vlad
    +-----------------------+
    | #include <disclaimer.h> |
    +-----------------------+

  9. #9
    Join Date
    Jun 2004
    Posts
    11
    do you have gawk available?

    No. It says there is no manual entry for gawk.

  10. #10
    Join Date
    Apr 2004
    Location
    Boston, MA
    Posts
    325
    It's going to be hard to write a 'generic' script to handle any number of columns, as you have embedded spaces in your columns, e.g. 'NEW JERSEY':
    222 NJ NEW JERSEY
    It would be easier to make your DB extraction use a different column separator and put quotes around field values, more or less like CSV.

    Here's an attempt to deal with your sample file with the embedded spaces.
    The variable 'FIELDWIDTHS' contains the widths of your three fields.

    If this logic works for you, you can make the variable 'FIELDWIDTHS' a parameter to the script, specifying the different field widths that you want to parse.
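
    One way the widths could be passed in as a parameter, sketched under assumptions: the wrapper name `fixed2csv` and the awk variable `widths` are hypothetical (a name other than `FIELDWIDTHS` is used because gawk treats that name specially), and plain `awk` stands in for `nawk`. This is a compact illustration of the idea, not a replacement for the script in this post.

```shell
# fixed2csv is a hypothetical wrapper; usage: fixed2csv "3 3 15" [file...]
fixed2csv() {
  widths=$1
  shift
  awk -v widths="$widths" '
    BEGIN { n = split(widths, w, " ") }
    # skip the two header lines, blank lines and the "record(s) selected." footer
    FNR > 2 && !/^[ \t]*$/ && !/record\(s\) selected\./ {
      line = ""
      start = 1
      for (i = 1; i <= n; i++) {
        field = substr($0, start, w[i])
        gsub(/^[ \t]+|[ \t]+$/, "", field)   # trim padding
        sep = (i > 1) ? "," : ""
        line = line sep field
        start += w[i]
      }
      print line
    }
  ' "$@"
}
```

    With the sample file from post #7, `fixed2csv "3 3 15" file2format` should keep 'NEW JERSEY' together as one field, because fields are cut by position rather than split on spaces.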

    Save the voot.awk code below and run it with:

    nawk -f voot.awk file2format

    Code:
    function setFieldsByWidth(   i,n,FWS,start,copyd0) {
    # Licensed under GPL Peter S Tillier, 2003
    # NB corrupts $0
      copyd0 = $0                             # make copy of $0 to work on
      if (length(FIELDWIDTHS) == 0) {
        print "You need to set the width of the fields that you require" > "/dev/stderr"
        print "in the variable FIELDWIDTHS (NB: Upper case!)" > "/dev/stderr"
        exit(1)
      }
    
      if (!match(FIELDWIDTHS,/^[0-9 ]+$/)) {
        print "The variable FIELDWIDTHS must contain digits, separated" > "/dev/stderr"
        print "by spaces." > "/dev/stderr"
        exit(1)
      }
    
      n = split(FIELDWIDTHS,FWS)
    
      if (n == 1) {
        print "Warning: FIELDWIDTHS contains only one field width." > "/dev/stderr"
        print "Attempting to continue." > "/dev/stderr"
      }
    
      start = 1
      for (i=1; i <= n; i++) {
        $i = trim(substr(copyd0,start,FWS[i]))
        start = start + FWS[i]
      }
      return n;
    }
    
    # trim() strips leading and trailing blanks (spaces or tabs);
    # setFieldsByWidth() applies it to every field it extracts.
    function trim(str)
    {
      sub(/^[ \t]+/, "", str);
      sub(/[ \t]+$/, "", str);
      return str;
    }

    # The main code calls setFieldsByWidth() as follows:
    BEGIN {
      #FIELDWIDTHS="7 6 5 4 3 2 1" # for example
      FIELDWIDTHS="3 3 15"
      OFS=","
    }
    # Skip blank lines, the two header lines and the "record(s) selected." footer.
    !/^[ \t]*$/ && FNR > 2 && !/record\(s\) selected\./ {
      saveDollarZero = $0 # if you want it later
      numFields = setFieldsByWidth()
      # now we can manipulate $0, NF and $1 .. $NF as we wish
      for(i=1; i <= numFields; i++)
         printf("%s%s", $i, (i != numFields) ? OFS : ORS);
    }
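
    For the 'any number of columns' request, one possible alternative to hard-coding the widths is to read them from the dashed rule on line 2. This is a sketch and not the method posted above. It assumes the real file keeps its column alignment (the sample as pasted in the thread has lost it) and that the rule has one run of dashes per column, separated by spaces. The name `ruled2csv` is hypothetical.

```shell
# ruled2csv is a hypothetical name. Assumption: line 2 of the input is a rule
# made of one run of dashes per column, aligned with the data beneath it.
ruled2csv() {
  awk '
    FNR == 2 {
      # record the start position of every run of dashes
      n = 0
      rule = $0
      pos = 1
      while (match(rule, /-+/)) {
        start[++n] = pos + RSTART - 1
        pos += RSTART + RLENGTH - 1
        rule = substr(rule, RSTART + RLENGTH)
      }
      next
    }
    FNR > 2 && !/^[ \t]*$/ && !/record\(s\) selected\./ {
      line = ""
      for (i = 1; i <= n; i++) {
        # a column runs up to the start of the next one; the last runs to end of line
        if (i < n) field = substr($0, start[i], start[i + 1] - start[i])
        else field = substr($0, start[i])
        gsub(/^[ \t]+|[ \t]+$/, "", field)
        sep = (i > 1) ? "," : ""
        line = line sep field
      }
      print line
    }
  ' "$@"
}
```

    Because fields are cut by position, a column with no data comes out as an empty field between two commas instead of letting the next value slide into its place.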
