06-24-04, 11:09
Registered User
Join Date: Jun 2004
Posts: 8
File Conversion
I have a data file with n rows and m columns, all of fixed width.
I also have a data specification file that gives the length of each column.
The file has a header and a trailer.
I have to remove the header and trailer, and I need the output to be comma-delimited.
Any help would be greatly appreciated.
Data file, for example:
a b c
--- ----- ----
111 TX Texas
222 CA California
2 rows selected
File specification:
col1(1,3)
col2(4,6)
col3(8,25)
The specification file can also take this form:
col1(3)
col2(2)
col3(20)
I need to write a generalised script for each of these two specification formats separately.
Whichever format is used, the output should be
111,tx,texas
222,ca,california
Thanks in advance.

06-24-04, 12:10
Resident Curmudgeon
Join Date: Feb 2004
Location: In front of the computer
Posts: 12,613
Perl or AWK could do this nicely. Did the teacher give you any insight into which they'd prefer for this assignment?
-PatP

06-24-04, 12:14
Registered User
Join Date: Jun 2004
Posts: 8
File conversion
Hi,
I am supposed to do this in awk.
I am not very good at awk programming, though; I only know the basic Unix commands.

06-24-04, 14:49
Registered User
Join Date: Oct 2003
Location: Germany
Posts: 138
Quote:
Originally Posted by Sachin9
I have a datafile with 'n' rows and m columns which are of fixed width. [...]
Hi,
I think I can help you with awk, but I have some questions.
1. Is the number of columns always the same (col1 to col3)?
2. Is there always a blank between col1, col2 and col3?
3. Can a blank appear inside a column value?
__________________
Greetings from Germany
Peter F.

06-24-04, 15:18
Registered User
Join Date: Jun 2004
Posts: 8
Quote:
Originally Posted by fla5do
I think I can help you in awk. But I have some questions. [...]
Thank you very much.
No, the number of columns will not always be the same; there could be more columns.
Yes, there is always a blank between the columns, but the columns are fixed width (the length of the column size).
For example:
Col1 col2 col3
col1 can be 1 character, col2 can be 10 characters, col3 can be 20 characters,
but there will be blanks between the columns.
Yes, there can be spaces in the column values; in that case we need to leave a blank (i.e. the length of the column).
For example, if the length of col10 is 20 and it is blank, the file should look like
col9,20 spaces,col11
Once again, thanks in advance.

06-24-04, 17:17
Registered User
Join Date: Oct 2003
Location: Germany
Posts: 138
Quote:
Originally Posted by Sachin9
Thank you very much. [...]
No idea for this specification :-) ???
Bye
__________________
Greetings from Germany
Peter F.

06-24-04, 18:00
Registered User
Join Date: Apr 2004
Location: Boston, MA
Posts: 325
I thought I'd already posted on a similar thread. What is this, more than one person taking the same class?
# columns with widths of 3, 2 and 20 chars
# change the widths of your columns on the invocation line
nawk -v FIELDWIDTHS='3 2 20' -f sachin19.awk file2convert
Here's the content of 'sachin19.awk':
Code:
function setFieldsByWidth(   i, n, FWS, start, copyd0) {
    # Licensed under GPL Peter S Tillier, 2003
    # NB corrupts $0
    copyd0 = $0                      # make copy of $0 to work on
    if (length(FIELDWIDTHS) == 0) {
        print "You need to set the width of the fields that you require" > "/dev/stderr"
        print "in the variable FIELDWIDTHS (NB: Upper case!)" > "/dev/stderr"
        exit(1)
    }
    if (!match(FIELDWIDTHS, /^[0-9 ]+$/)) {
        print "The variable FIELDWIDTHS must contain digits, separated" > "/dev/stderr"
        print "by spaces." > "/dev/stderr"
        exit(1)
    }
    n = split(FIELDWIDTHS, FWS)
    if (n == 1) {
        print "Warning: FIELDWIDTHS contains only one field width." > "/dev/stderr"
        print "Attempting to continue." > "/dev/stderr"
    }
    start = 1
    for (i = 1; i <= n; i++) {
        $i = trim(substr(copyd0, start, FWS[i]))
        start = start + FWS[i] + 1   # the +1 skips the single blank between columns
    }
    return n;
}

# I then call setFieldsByWidth() in my main awk code as follows:

# Strip leading and trailing blanks from str.
function trim(str)
{
    sub("^[ ]*", "", str);
    sub("[ ]*$", "", str);
    return str;
}

BEGIN {
    OFS = ","
}

# Skip blank lines, the two header lines and the trailer line
# ("2 rows selected" in the sample, or "N record(s) selected.").
!/^[ ]*$/ && FNR > 2 && !/(rows|record\(s\)) selected/ {
    saveDollarZero = $0              # if you want it later
    numFields = setFieldsByWidth()
    # now we can manipulate $0, NF and $1 .. $NF as we wish
    for (i = 1; i <= numFields; i++)
        printf("%s%s", $i, (i != numFields) ? OFS : ORS);
}
The code for reading the widths from the specification file is left as an exercise for the person taking the class.
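One possible starting point for that exercise, as a rough sketch and not necessarily what the class expects: read the specification file first, record a start position and length for each column, then cut the data lines accordingly. The file names (`spec.txt`, `data.txt`, `spec2csv.awk`) are made up for illustration. The interpretation of the two specification forms is an assumption too: `colN(start,end)` is taken as inclusive character positions, and `colN(len)` as a width followed by one blank.

```sh
#!/bin/sh
# Sketch only: take the column layout from a specification file.
# Assumptions (not confirmed in the thread):
#   name(start,end) gives inclusive character positions;
#   name(len) gives a width, with one blank between columns.
dir=$(mktemp -d)

cat > "$dir/spec.txt" <<'EOF'
col1(1,3)
col2(4,6)
col3(8,25)
EOF

cat > "$dir/data.txt" <<'EOF'
a   b  c
--- -- ------------------
111 TX Texas
222 CA California

2 rows selected
EOF

cat > "$dir/spec2csv.awk" <<'EOF'
function trim(s) { gsub(/^ +| +$/, "", s); return s }

# First file: the specification. Store start position and length per column.
NR == FNR {
    n = split($0, part, /[(),]/)     # "col2(4,6)" -> "col2", "4", "6", ""
    if (n < 3) next                  # not a specification line
    ncol++
    if (part[3] != "") {             # (start,end) form
        pos[ncol] = part[2]
        len[ncol] = part[3] - part[2] + 1
    } else {                         # (len) form: columns follow one another
        pos[ncol] = nextpos ? nextpos : 1
        len[ncol] = part[2]
    }
    nextpos = pos[ncol] + len[ncol] + 1
    next
}

# Second file: the data. Skip two header lines, blank lines and the trailer.
FNR > 2 && !/^ *$/ && !/rows selected$/ {
    line = ""
    for (i = 1; i <= ncol; i++)
        line = line (i > 1 ? "," : "") trim(substr($0, pos[i], len[i]))
    print line
}
EOF

out=$(awk -f "$dir/spec2csv.awk" "$dir/spec.txt" "$dir/data.txt")
printf '%s\n' "$out"
```

Under the same one-blank assumption, a specification file containing col1(3), col2(2), col3(20) should give the same output. The `NR == FNR` test relies on the specification file being non-empty and listed first on the command line.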
__________________
vlad
+-----------------------+
| #include <disclaimer.h> |
+-----------------------+

06-25-04, 14:36
Registered User
Join Date: Jun 2004
Posts: 8
Hi,
What does /dev/stderr mean?
I appreciate your help.

06-25-04, 14:49
Registered User
Join Date: Apr 2004
Location: Boston, MA
Posts: 325
"/dev/stderr" - standard error
You can substitute
> "/dev/stderr"
with
| "cat 1>&2"
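A small sketch of the difference, with made-up messages: the awk program below writes one line to standard output and one line to standard error through the `cat 1>&2` pipe, and the shell captures the two streams separately. The variable names `out` and `err` are only for illustration.

```sh
#!/bin/sh
# Sketch: print to standard error from awk without relying on /dev/stderr.
err=$(mktemp)

# Only standard output ends up in $out; standard error goes to the temp file.
out=$(awk 'BEGIN {
    stderr = "cat 1>&2"              # command that copies its input to stderr
    print "normal output"            # written to standard output
    print "error message" | stderr   # piped to cat, which writes it to stderr
    close(stderr)                    # flush the pipe before awk exits
}' 2> "$err")

echo "stdout: $out"
echo "stderr: $(cat "$err")"
```

Note that the `|` replaces the `>`; the two are not combined.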
__________________
vlad
+-----------------------+
| #include <disclaimer.h> |
+-----------------------+

06-25-04, 15:32
Registered User
Join Date: Jun 2004
Posts: 8
awk -v FIELDWIDTHS='2 20' -f vg.awk state
syntax error The source line is 9. The function is setFieldsByWidth.
The error context is
print "You need to set the width of the fields that you require" > >>> | <<< "cat 1>&2"
awk: The statement cannot be correctly parsed.
The source line is 9. The function is setFieldsByWidth.
syntax error The source line is 10. The function is setFieldsByWidth.
This is the error I get when I try to run your program.
Thanks in advance.

06-25-04, 15:41
Registered User
Join Date: Apr 2004
Location: Boston, MA
Posts: 325
I've modified the code incorrectly.
Here's the modified version:
Code:
function setFieldsByWidth(   i, n, FWS, start, copyd0) {
    # Licensed under GPL Peter S Tillier, 2003
    # NB corrupts $0
    copyd0 = $0                      # make copy of $0 to work on
    if (length(FIELDWIDTHS) == 0) {
        print "You need to set the width of the fields that you require" | stderr
        print "in the variable FIELDWIDTHS (NB: Upper case!)" | stderr
        exit(1)
    }
    if (!match(FIELDWIDTHS, /^[0-9 ]+$/)) {
        print "The variable FIELDWIDTHS must contain digits, separated" | stderr
        print "by spaces." | stderr
        exit(1)
    }
    n = split(FIELDWIDTHS, FWS)
    if (n == 1) {
        print "Warning: FIELDWIDTHS contains only one field width." | stderr
        print "Attempting to continue." | stderr
    }
    start = 1
    for (i = 1; i <= n; i++) {
        $i = trim(substr(copyd0, start, FWS[i]))
        start = start + FWS[i] + 1   # the +1 skips the single blank between columns
    }
    return n;
}

# I then call setFieldsByWidth() in my main awk code as follows:

# Strip leading and trailing blanks from str.
function trim(str)
{
    sub("^[ ]*", "", str);
    sub("[ ]*$", "", str);
    return str;
}

BEGIN {
    OFS = ","
    stderr = "cat 1>&2"              # pipe target used instead of "/dev/stderr"
}

# Skip blank lines, the two header lines and the trailer line
# ("2 rows selected" in the sample, or "N record(s) selected.").
!/^[ ]*$/ && FNR > 2 && !/(rows|record\(s\)) selected/ {
    saveDollarZero = $0              # if you want it later
    numFields = setFieldsByWidth()
    # now we can manipulate $0, NF and $1 .. $NF as we wish
    for (i = 1; i <= numFields; i++)
        printf("%s%s", $i, (i != numFields) ? OFS : ORS);
}
__________________
vlad
+-----------------------+
| #include <disclaimer.h> |
+-----------------------+

06-25-04, 15:56
Registered User
Join Date: Jun 2004
Posts: 8
Cheers vgersh,
the code is perfect.
Thanks a million for your help.

06-28-04, 09:52
Registered User
Join Date: Jun 2004
Posts: 8
Hi vgersh,
Does your code work for only 3 columns, or is it a general solution?

06-28-04, 10:51
Registered User
Join Date: Apr 2004
Location: Boston, MA
Posts: 325
what do you think?
try it.
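For anyone who wants a quick way to try: the loop runs once per width listed, so nothing in it is tied to three columns. Below is a condensed sketch of the same idea, not the script posted earlier. The helper name `split_fixed`, the variable name `widths` and the four-column sample row are invented for illustration, and one blank between columns is assumed.

```sh
#!/bin/sh
# Condensed sketch of the fixed-width split: one output field per width given.
# The widths are passed in an ordinary variable named "widths" here.
split_fixed() {
    awk -v widths="$1" '
    {
        n = split(widths, w, " ")
        start = 1
        line = ""
        for (i = 1; i <= n; i++) {
            field = substr($0, start, w[i])
            gsub(/^ +| +$/, "", field)       # trim the padding
            line = line (i > 1 ? "," : "") field
            start += w[i] + 1                # skip the blank between columns
        }
        print line
    }'
}

# printf pads each value to its column width, so the rows are truly fixed width.
three=$(printf '%-3s %-2s %-20s\n' 111 TX Texas | split_fixed '3 2 20')
four=$(printf '%-1s %-10s %-20s %-3s\n' 7 Austin Texas USA | split_fixed '1 10 20 3')
echo "$three"
echo "$four"
```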
__________________
vlad
+-----------------------+
| #include <disclaimer.h> |
+-----------------------+