
02-15-05, 10:58
|
|
Registered User
|
|
Join Date: Feb 2005
Posts: 4
|
|
Converting Text File into XML using Unix Shell Scripts
|
|
Hi everyone,
If someone out there could help me with this problem, I would really appreciate it.
I am trying to convert a file into XML format using Unix shell scripts.
The file has fixed-width fields: each field occupies a certain number of bytes, but the fields are not delimited by anything (e.g. whitespace). I need to get those fields into some sort of data structure so that I can use them to generate the XML using a simple for loop. The following is an example of what I am looking at:
Jack Johnson 90980288Harv 9090998
Joe Joie 8989 Sed 99488
I can't use whitespace as a delimiter.
Thank you all very much,
I really appreciate it.
|
|

02-15-05, 13:42
|
|
Registered User
|
|
Join Date: Jun 2003
Location: West Palm Beach, FL
Posts: 2,456
|
|
Quote:
|
Originally Posted by Laud12345
...
The file has fields with each field having a certain number of bytes, ...
|
Does this mean that each field has the same length in every record?
For example: field1=length(30), field2=length(10), field3=length(24), etc.

__________________
The person who says it can't be done should not interrupt the person doing it. -- Chinese proverb
|
|

02-15-05, 17:22
|
|
Registered User
|
|
Join Date: Feb 2005
Posts: 4
|
|
|
Re
|

02-16-05, 09:36
|
|
Registered User
|
|
Join Date: Jun 2003
Location: West Palm Beach, FL
Posts: 2,456
|
|
Try something like this:
Code:
#!/bin/ksh
# Sample data. Padding assumed from the substr() offsets below:
# NAME = 20 chars, TRN_DATE = 10, TEXT = 10, ID = 7.
cat - <<! >data.txt
Jack Johnson        2005-02-11Harv      9090998
Joe Joie            2005-01-23Sed       0099488
!
# Slice each record by column position and wrap each slice in a tag.
awk 'BEGIN { print "<?xml version=\"1.0\"?>"; print "<ROWSET>" }
{
    print "<ROW>"
    print "<NAME>" substr($0, 1, 20) "</NAME>"
    print "<TRN_DATE>" substr($0, 21, 10) "</TRN_DATE>"
    print "<TEXT>" substr($0, 31, 10) "</TEXT>"
    print "<ID>" substr($0, 41, 7) "</ID>"
    print "</ROW>"
}
END { print "</ROWSET>" }' data.txt >data.xml
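
If you would rather drive this from a list of fields and a simple for loop, as the original question asked, here is a minimal sketch. The function name fixed2xml is made up for illustration, and the tag names and widths (20, 10, 10, 7) are simply the ones assumed in the script above; adjust both lists to your real record layout.

```shell
#!/bin/sh
# fixed2xml (hypothetical name): read fixed-width records on stdin,
# write one <ROW> per record to stdout.
fixed2xml() {
    awk '
    BEGIN {
        # Parallel lists: tag names and field widths (assumed layout).
        n = split("NAME TRN_DATE TEXT ID", name, " ")
        split("20 10 10 7", width, " ")
        print "<?xml version=\"1.0\"?>"
        print "<ROWSET>"
    }
    {
        print "<ROW>"
        pos = 1                      # substr() positions start at 1
        for (i = 1; i <= n; i++) {
            print "<" name[i] ">" substr($0, pos, width[i]) "</" name[i] ">"
            pos += width[i]          # advance to the next field
        }
        print "</ROW>"
    }
    END { print "</ROWSET>" }'
}

# Example: build one padded record with printf and convert it.
printf '%-20s%-10s%-10s%-7s\n' "Jack Johnson" "2005-02-11" "Harv" "9090998" | fixed2xml
```

Adding or resizing a field then only means editing the two lists in BEGIN. Note that the values still carry their padding at this point.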

|
|

02-16-05, 11:08
|
|
Registered User
|
|
Join Date: Feb 2005
Posts: 4
|
|
Thanks very much LKBrwn. It worked.
Now, how do I strip the leading and trailing whitespace from each string before printing it?
I was thinking of using the split function to store it into an array and then printing those elements as one string, but that seems to be overkill.
Any help would be appreciated.
Thanks,
Laud
|
|

02-16-05, 11:30
|
|
Registered User
|
|
Join Date: Apr 2004
Location: Boston, MA
Posts: 325
|
|
As posted somewhere else...
Code:
function trim(str)
{
    sub("^[ ]*", "", str);
    sub("[ ]*$", "", str);
    return str;
}
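
To show where the function plugs in, here is a small sketch that wraps two of the substr() slices in trim(). The wrapper name trim_fields is hypothetical, the column positions are the ones from the earlier awk script, and this variant also strips tabs, which is an assumption about what counts as padding.

```shell
#!/bin/sh
# trim_fields (hypothetical name): print the NAME and TEXT slices of each
# fixed-width record with leading and trailing padding removed.
trim_fields() {
    awk '
    function trim(str) {
        sub(/^[ \t]+/, "", str)      # drop leading blanks and tabs
        sub(/[ \t]+$/, "", str)      # drop trailing blanks and tabs
        return str
    }
    {
        print "<NAME>" trim(substr($0, 1, 20)) "</NAME>"
        print "<TEXT>" trim(substr($0, 31, 10)) "</TEXT>"
    }'
}

# Example: one padded record in, two trimmed elements out.
printf '%-20s%-10s%-10s%-7s\n' "Jack Johnson" "2005-02-11" "Harv" "9090998" | trim_fields
```

Only the ends are trimmed, so the space inside "Jack Johnson" is kept.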
__________________
vlad
+-----------------------+
| #include <disclaimer.h> |
+-----------------------+
|
|

02-16-05, 13:21
|
|
Registered User
|
|
Join Date: Feb 2005
Posts: 4
|
|
Oh yes! 