| |
|
If this is your first visit, be sure to check out the FAQ by clicking the link above.
You may have to register before you can post: click the register link above to proceed.
To start viewing messages, select the forum that you want to visit from the selection below.
|
 |

04-18-06, 15:24
|
|
Registered User
|
|
Join Date: Jan 2004
Posts: 70
|
|
|
Shell script that parses a CSV file, builds sql insert calls, then executes them
|
|
I am given a CSV file that looks like this
Code:
Key col1 col2
---------------------
1 asdf hgsd
2 fasd gsdh
3 fdsa dsfg
4 rtw sdgf
I want to take that data and put each row into an array, say row[@]
Code:
i = 0
cat myfile.csv | while read fileline
do
row[i] = $fileline
i = i + 1
done
I have to join that data with the Key and get another column
So for each I have to do
select col3 from InputTable where KeyCol = Key;
then for each parse row[@] into a col[@]
Code:
insert into OutputTable
(
col1,
col2,
col3
)
values
(
col[1],
col[2],
col[3]
)
So something like this:
Code:
n = split(row[0], col, ",")
db2 +p -t <<EOF >> $LOG_FILE
connect to $DB_INSERT user $user using $pw;
values current timestamp;
SET SCHEMA $DB_INSERT_SCHEMA;
insert into $OutputTable
(
col1,
col2,
col3
)
values
(
$col[1],
$col[2],
select col3 from $InputTable where KeyCol = $col[0]
)
and do that for all row[@]
I just dont understand how to loop through it all. Are there any examples anyone could point me to?
|
Last edited by JamesAvery22; 04-18-06 at 15:30.
|

04-18-06, 16:50
|
|
Registered User
|
|
Join Date: Jan 2004
Posts: 70
|
|
maybe something like this?
Code:
rowcount = 0
cat $CSV_DIR/$1 | while read fileline
do
rowarray[i] = $fileline
rowcount = $rowcount + 1
done
i = 0
query = ""
while [ "$i" -le "$rowcount" ]
do
n = split(${rowarray[$i]}, colarray, ",")
if [ "$n" -ge "2" ]
then
query = $query + "\n" +
"insert into OutputTable
(
SeqKey,
ID,
Col1,
Col2
) values (
nextval for " + $DB_INSERT_SCHEMA + ".OutputTable,
select ID from InputTable where Key = " + ${colarray[0]} + "," +
${colarray[1]} + "," +
${colarray[2]} +
");"
fi
i = $i + 1
done
db2 +p -t <<EOF >> $LOG_FILE
connect to $DB_INSERT user $user using $pw;
values current timestamp;
SET SCHEMA $DB_INSERT_SCHEMA;
$query
commit
;
values current timestamp;
terminate;
|
|

04-19-06, 02:45
|
|
Registered User
|
|
Join Date: May 2005
Location: South Africa
Posts: 1,268
|
|
|
|
I don’t use db2 and the split command you use is unfamiliar to me. I only know the split command to split a file into pieces
The csv file sample you show does not seem coma delimited so you can read the data directly into variables
i.e. while read kecol col1 col2 junk
yet the split command you use n = split(row[0], col, ",") suggest it is, assuming I interpret it correct
I will not use an array. I’ll build the insert statements into a file then take the file as input. The code below is untested.
Code:
#!/usr/bin/ksh
#…seting your variables…
cat <EOF >insert.sql
connect to $DB_INSERT user $user using $pw;
values current timestamp;
SET SCHEMA $DB_INSERT_SCHEMA;
EOF
cat myfile.csv | while read fileline
do
keycol=$(echo $fileline | cut –d’,’ –f1)
col1=$(echo $fileline | cut –d’,’ –f2)
col2=$(echo $fileline | cut –d’,’ –f3)
cat <<EOF >>insert.sql
insert into $OutputTable
(
col1,
col2,
col3
)
values
(
$col1,
$col2,
select col3 from $InputTable where KeyCol = $keycol
);
EOF
Done
db2 +p -t <insert.sql >$LOG_FILE
|
|

04-19-06, 09:58
|
|
Registered User
|
|
Join Date: Jan 2004
Posts: 70
|
|
I was wondering about the split command. I read about it somewhere but then couldnt find the page again :shrug: I was just reading on how to do the cut command.
Sorry the example CSV file was just a dummy. The real CSV file is comma seperated.
Will the way I guessed at, by trying to execute a bunch of insert statements that are in a string variable not work?
|
|

04-19-06, 12:41
|
|
Registered User
|
|
Join Date: Jan 2004
Posts: 70
|
|
I tested out some of that code.
Did this
Code:
cat >insert.sql << "EOF"
connect to $DB_INSERT user $user using $pw;
values current timestamp;
SET SCHEMA $DB_INSERT_SCHEMA;
EOF
cat $CSV_DIR/$1 | while read fileline
do
colarray[0]=$(echo $fileline | cut -d',' -f1)
colarray[1]=$(echo $fileline | cut -d',' -f2)
colarray[2]=$(echo $fileline | cut -d',' -f3)
cat >>insert.sql << "EOF"
insert into OutputTable
(col1,col2,col3,col4)
values
(nextval for $DB_INSERT_SCHEMA.VAE_MF_SUBLDGR_SEQ , select col2 from InputTable where Key = ${colarray[0]} , ${colarray[1]} , ${colarray[2]} );
EOF
done
It works but the problem is "${colarray[1]}" is written into insert.sql instead of its value. How can I get the value?
edit------------------
need to remove the quotes around EOF...
http://www.webservertalk.com/archive...-3-952249.html
Looks like I should also use printf if I want it to be faster.
|
Last edited by JamesAvery22; 04-19-06 at 14:44.
|

04-19-06, 15:40
|
|
Registered User
|
|
Join Date: Jan 2004
Posts: 70
|
|
my tested code:
Code:
printf "%s\n%s\n%s\n" "connect to $DB_INSERT user $user using $pw;" "values current timestamp;" "SET SCHEMA $DB_INSERT_SCHEMA;" >insert.sql
cat $CSV_DIR/$1 | while read fileline
do
colarray[0]=$(echo $fileline | cut -d',' -f1)
colarray[1]=$(echo $fileline | cut -d',' -f2)
printf "%s\n" "insert into OutputTable (col1,col2,col3) values (nextval for $DB_INSERT_SCHEMA.OUTPUTTABLE_SEQ , (select ID from InputTable where Key = '${colarray[0]}') ,'${colarray[1]}' );" >>insert.sql
done
Thanks for everyones help
|
|

05-02-06, 09:53
|
|
Registered User
|
|
Join Date: Jan 2004
Posts: 70
|
|
I've been using this script and it works fine with an input file that has <200 lines. Takes about 20 seconds to create the insert.sql. Im trying it on a file with 13k+ lines and it is taking forever. Creating the insert.sql is taking longer than executing it. Is there anyway to do this faster?
|
|

05-03-06, 07:12
|
|
Registered User
|
|
Join Date: May 2005
Location: South Africa
Posts: 1,268
|
|
Try using sed
Code:
sed "s/\(.*\),\(.*\),\(.*\)/insert into OutputTable (col1,col2,col3) values (nextval for $DB_INSERT_SCHEMA.OUTPUTTABLE_SEQ , (select ID from InputTable where Key = \1) ,'\2' );/" $CSV_DIR/$1 >insert.sql
|
|

03-15-12, 03:43
|
|
Registered User
|
|
Join Date: Nov 2011
Posts: 4
|
|
|
sed to parses a csv file
Suppose we have more than 9 columns in the csv file, then how to creat the insert stmt
As after 9th column, for the 10th column it is picking the 1st columns values and concates the 0 at the end, similarly for 11th column 1st columns values and concates the 1 at the end.
Please suggest what did's to be done to solve these problem.
|
|
| Thread Tools |
Search this Thread |
|
|
|
| Display Modes |
Linear Mode
|
Posting Rules
|
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts
HTML code is Off
|
|
|
|
|