  1. #1
    Join Date
    Feb 2005
    Posts
    4

    Question Unanswered: Converting Text File into XML using Unix Shell Scripts

    Hi everyone,

    If someone out there could help me with this problem, I would really appreciate it.

    I am trying to convert a file into xml format using Unix shell scripts.

    The file has fixed-width fields: each field occupies a certain number of bytes, and the fields are not delimited by anything (e.g. whitespace). I need to get those fields into some sort of data structure so that I can generate the XML with a simple for loop. The following is an example of what I am looking at:

    Jack Johnson 90980288Harv 9090998
    Joe Joie 8989 Sed 99488


    I can't use whitespace as a delimiter.


    Thank you guys very much,
    I really appreciate it.

  2. #2
    Join Date
    Jun 2003
    Location
    West Palm Beach, FL
    Posts
    2,713

    Quote Originally Posted by Laud12345
    ...
    The file has fields with each field having a certain number of bytes, ...
    Does this mean that each field has the same length in every record?
    For example: field1=length(30), field2=length(10), field3=length(24), etc.
    The person who says it can't be done should not interrupt the person doing it. -- Chinese proverb

  3. #3
    Join Date
    Feb 2005
    Posts
    4

    Re

    Yes. That is correct.

  4. #4
    Join Date
    Jun 2003
    Location
    West Palm Beach, FL
    Posts
    2,713

    Try something like this:
    Code:
    #!/bin/ksh
    cat - <<! >data.txt
    Jack Johnson        2005-02-11Harv      9090998
    Joe Joie            2005-01-23Sed       0099488
    !
    # Fixed-width layout: NAME 1-20, TRN_DATE 21-30, TEXT 31-40, ID 41-47.
    awk ' BEGIN {print "<?xml version=\"1.0\"?>";print "<ROWSET>"}
    {print "<ROW>"
     print "<NAME>" substr($0,1,20) "</NAME>"
     print "<TRN_DATE>" substr($0,21,10) "</TRN_DATE>"
     print "<TEXT>" substr($0,31,10) "</TEXT>"
     print "<ID>" substr($0,41,7) "</ID>"
     print "</ROW>";}
     END { print "</ROWSET>";}' data.txt >data.xml
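
    A possible variation on the script above, for the "data structure plus simple for loop" part of the question: keep the field names and widths in one table and walk it for every record. This is only a sketch. The spec variable and its NAME:WIDTH format are made up for illustration, and the widths (20, 10, 10, 7) are the ones from the sample data.

```shell
#!/bin/sh
# Sample fixed-width input in the layout used in this thread:
# NAME 20 bytes, TRN_DATE 10, TEXT 10, ID 7.
printf '%-20s%-10s%-10s%-7s\n' \
    'Jack Johnson' '2005-02-11' 'Harv' '9090998' \
    'Joe Joie'     '2005-01-23' 'Sed'  '0099488' >data.txt

# spec is a hypothetical variable: comma-separated NAME:WIDTH pairs.
awk -v spec='NAME:20,TRN_DATE:10,TEXT:10,ID:7' '
BEGIN {
    # Split the layout into parallel arrays of tag names and widths.
    nf = split(spec, pair, ",")
    for (i = 1; i <= nf; i++) {
        split(pair[i], kv, ":")
        tag[i] = kv[1]
        width[i] = kv[2] + 0
    }
    print "<?xml version=\"1.0\"?>"
    print "<ROWSET>"
}
{
    print "<ROW>"
    pos = 1
    for (i = 1; i <= nf; i++) {
        # Cut field i out of the record, then advance to the next field.
        print "<" tag[i] ">" substr($0, pos, width[i]) "</" tag[i] ">"
        pos += width[i]
    }
    print "</ROW>"
}
END { print "</ROWSET>" }' data.txt >data.xml
```

    With this layout, adding or resizing a field should only require editing the spec string.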

  5. #5
    Join Date
    Feb 2005
    Posts
    4
    Thanks very much LKBrwn. It worked.

    Now, how do I strip the leading and trailing whitespace from each string before printing it?

    I was thinking of using the split function to store it into an array and then printing those elements as one string, but that seems to be overkill.


    Any help would be appreciated.

    Thanks,
    Laud

  6. #6
    Join Date
    Apr 2004
    Location
    Boston, MA
    Posts
    325
    As posted somewhere else:
    Code:
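    # Return str without leading and trailing spaces.
    # str is passed by value, so the original string is not modified.
    function trim(str)
    {
        sub("^[ ]*", "", str);
        sub("[ ]*$", "", str);
        return str;
    }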
    vlad
    +-----------------------+
    | #include <disclaimer.h> |
    +-----------------------+
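
    To show where trim() would be called, here is a minimal sketch that wraps each substr() call from the earlier fixed-width script in it. It assumes an awk with user-defined functions (nawk, gawk, mawk or any POSIX awk; very old awk binaries lack them). The output file name trimmed.xml is just a placeholder.

```shell
#!/bin/sh
# One sample record in the layout from this thread
# (NAME 20 bytes, TRN_DATE 10, TEXT 10, ID 7).
printf '%-20s%-10s%-10s%-7s\n' 'Joe Joie' '2005-01-23' 'Sed' '0099488' >data.txt

awk '
# Strip leading and trailing spaces. str is a local copy,
# so the value held by the caller is left unchanged.
function trim(str) {
    sub("^[ ]*", "", str)
    sub("[ ]*$", "", str)
    return str
}
{
    print "<ROW>"
    print "<NAME>" trim(substr($0, 1, 20)) "</NAME>"
    print "<TRN_DATE>" trim(substr($0, 21, 10)) "</TRN_DATE>"
    print "<TEXT>" trim(substr($0, 31, 10)) "</TEXT>"
    print "<ID>" trim(substr($0, 41, 7)) "</ID>"
    print "</ROW>"
}' data.txt >trimmed.xml
```

    Only the padding at the ends of each field is removed; spaces inside a value, such as the one in "Joe Joie", are kept.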

  7. #7
    Join Date
    Feb 2005
    Posts
    4
    Oh yes!
