  #1 (permalink)  
Old 02-15-05, 10:58
Laud12345 Laud12345 is offline
Registered User
 
Join Date: Feb 2005
Posts: 4
Question Converting Text File into XML using Unix Shell Scripts

Hi everyone,

If someone out there could help me with this problem, I would really appreciate it.

I am trying to convert a file into xml format using Unix shell scripts.

The file has fixed-width fields: each field occupies a certain number of bytes, but the fields are not delimited by anything (e.g. whitespace). I need to get those fields into some sort of data structure so that I can use them to generate the XML using a simple for loop. The following is an example of what I am looking at:

Jack Johnson 90980288Harv 9090998
Joe Joie 8989 Sed 99488


I can't use whitespace as a delimiter.


Thank you guys very much,
I really appreciate it.
  #2 (permalink)  
Old 02-15-05, 13:42
LKBrwn_DBA LKBrwn_DBA is offline
Registered User
 
Join Date: Jun 2003
Location: West Palm Beach, FL
Posts: 2,456

Quote:
Originally Posted by Laud12345
...
The file has fields with each field having a certain number of bytes, ...
Does this mean that each field has the same length?
For example: field1=length(30), field2=length(10), field3=length(24), etc.
__________________
The person who says it can't be done should not interrupt the person doing it. -- Chinese proverb
  #3 (permalink)  
Old 02-15-05, 17:22
Laud12345 Laud12345 is offline
Registered User
 
Join Date: Feb 2005
Posts: 4
Re

Yes. That is correct.
  #4 (permalink)  
Old 02-16-05, 09:36
LKBrwn_DBA LKBrwn_DBA is offline
Registered User
 
Join Date: Jun 2003
Location: West Palm Beach, FL
Posts: 2,456

Try something like this:
Code:
#!/bin/ksh
# Create sample fixed-width data: name(20) date(10) text(10) id(7)
cat - <<! >data.txt
Jack Johnson        2005-02-11Harv      9090998
Joe Joie            2005-01-23Sed       0099488
!
# substr(s, start, len) cuts each field out of the record by position
awk ' BEGIN {print "<?xml version=\"1.0\"?>"; print "<ROWSET>"}
{print "<ROW>"
 print "<NAME>" substr($0,1,20) "</NAME>"
 print "<TRN_DATE>" substr($0,21,10) "</TRN_DATE>"
 print "<TEXT>" substr($0,31,10) "</TEXT>"
 print "<ID>" substr($0,41,7) "</ID>"
 print "</ROW>"}
 END { print "</ROWSET>"}' data.txt >data.xml
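
Since the original question asked for a data structure and a simple for loop, here is a minimal sketch of a table-driven variant. The field widths (20, 10, 10, 7) and the tag names are copied from the script above; the `widths` and `tags` variable names are only assumptions for illustration, and the real file layout may differ.

```shell
#!/bin/sh
# Sample fixed-width input: name(20) date(10) text(10) id(7).
cat > data.txt <<'EOF'
Jack Johnson        2005-02-11Harv      9090998
Joe Joie            2005-01-23Sed       0099488
EOF

# widths and tags are hypothetical names; adjust both lists to the real layout.
awk -v widths="20 10 10 7" -v tags="NAME TRN_DATE TEXT ID" '
BEGIN {
    # Split the two space-separated lists into parallel arrays.
    n = split(widths, w, " ")
    split(tags, t, " ")
    print "<?xml version=\"1.0\"?>"
    print "<ROWSET>"
}
{
    print "<ROW>"
    pos = 1
    for (i = 1; i <= n; i++) {
        # Cut field i out of the record by start position and width.
        print "<" t[i] ">" substr($0, pos, w[i]) "</" t[i] ">"
        pos += w[i]
    }
    print "</ROW>"
}
END { print "</ROWSET>" }' data.txt > data.xml
```

Adding or resizing a field then only means editing the two lists, not the print statements. Note that the padding blanks are still inside each element at this point.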
  #5 (permalink)  
Old 02-16-05, 11:08
Laud12345 Laud12345 is offline
Registered User
 
Join Date: Feb 2005
Posts: 4
Thanks very much LKBrwn. It worked.

Now, how do I strip the leading and trailing whitespace from each string before printing it?

I was thinking of using the split function to store it into an array and then printing those elements as one string, but that seems to be overkill.


Any help would be appreciated.

Thanks,
Laud
  #6 (permalink)  
Old 02-16-05, 11:30
vgersh99 vgersh99 is offline
Registered User
 
Join Date: Apr 2004
Location: Boston, MA
Posts: 325
as posted somewhere else....
Code:
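# Strip leading and trailing blanks (spaces and tabs) from str.
# str is passed by value, so the caller's copy is unchanged.
function trim(str)
{
    sub(/^[ \t]+/, "", str);
    sub(/[ \t]+$/, "", str);
    return str;
}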
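
To show where the function would go, here is a rough sketch that wraps each substr() call from the earlier script in trim(). The column positions and tag names are the ones used earlier in the thread; the output file name `trimmed.xml` is just an assumption so the example does not overwrite anything.

```shell
#!/bin/sh
# Sample fixed-width input: name(20) date(10) text(10) id(7).
cat > data.txt <<'EOF'
Jack Johnson        2005-02-11Harv      9090998
Joe Joie            2005-01-23Sed       0099488
EOF

awk '
# Remove leading and trailing spaces and tabs from a copy of str.
function trim(str) {
    sub(/^[ \t]+/, "", str)
    sub(/[ \t]+$/, "", str)
    return str
}
BEGIN { print "<?xml version=\"1.0\"?>"; print "<ROWSET>" }
{
    print "<ROW>"
    # Cut each field by position, then trim the padding before printing.
    print "<NAME>" trim(substr($0, 1, 20)) "</NAME>"
    print "<TRN_DATE>" trim(substr($0, 21, 10)) "</TRN_DATE>"
    print "<TEXT>" trim(substr($0, 31, 10)) "</TEXT>"
    print "<ID>" trim(substr($0, 41, 7)) "</ID>"
    print "</ROW>"
}
END { print "</ROWSET>" }' data.txt > trimmed.xml
```

In awk, function definitions sit at the top level of the program, outside any pattern-action block, so trim() goes next to BEGIN rather than inside it.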
__________________
vlad
+-----------------------+
| #include <disclaimer.h> |
+-----------------------+
  #7 (permalink)  
Old 02-16-05, 13:21
Laud12345 Laud12345 is offline
Registered User
 
Join Date: Feb 2005
Posts: 4
Oh yes!