  1. #1
    Join Date
    Jan 2004
    Posts
    70

    Unanswered: Shell script that parses a CSV file, builds SQL insert statements, then executes them

    I am given a CSV file that looks like this

    Code:
    Key	col1	col2
    ---------------------
    1	asdf	hgsd
    2	fasd	gsdh
    3	fdsa	dsfg
    4	rtw	sdgf
    I want to take that data and put each row into an array, say row[@]

    Code:
    i=0
    cat myfile.csv | while read fileline
    do
    	row[$i]=$fileline
    	i=$((i + 1))
    done
    I have to join that data on the Key to pull in another column.

    So for each row I have to do

    select col3 from InputTable where KeyCol = Key;

    then parse each row[@] into a col[@]

    Code:
    insert into OutputTable 
    (
    	col1,
    	col2,
    	col3
    )
    values
    (
    	col[1],
    	col[2],
    	col[3]
    )
    So something like this:

    Code:
    n = split(row[0], col, ",")
    
    db2 +p -t <<EOF >>  $LOG_FILE
    
    connect to $DB_INSERT user $user using $pw;
    
    values current timestamp;
    
    SET SCHEMA $DB_INSERT_SCHEMA;
    
    insert into $OutputTable 
    (
    	col1,
    	col2,
    	col3
    )
    values
    (
    	${col[1]},
    	${col[2]},
    	(select col3 from $InputTable where KeyCol = ${col[0]})
    );
    EOF
    and do that for all row[@]


    I just don't understand how to loop through it all. Are there any examples anyone could point me to?
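
    Edit: to make the goal concrete, here is a rough sketch of the overall loop I am after (untested; the table names and $-variables are placeholders for my real ones):

    Code:
    #!/usr/bin/ksh
    # Untested sketch: read each CSV line, split it on commas,
    # and append one INSERT per line to a growing SQL string.
    query=""
    while IFS=, read keycol col1 col2
    do
    	query="$query
    insert into $OutputTable (col1, col2, col3)
    values ('$col1', '$col2',
    	(select col3 from $InputTable where KeyCol = '$keycol'));"
    done < myfile.csv
    
    # Then run the whole batch through db2 in one session.
    db2 +p -t <<EOF >> $LOG_FILE
    connect to $DB_INSERT user $user using $pw;
    SET SCHEMA $DB_INSERT_SCHEMA;
    $query
    commit;
    terminate;
    EOF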
    Last edited by JamesAvery22; 04-18-06 at 16:30.

  2. #2
    Join Date
    Jan 2004
    Posts
    70
    maybe something like this?

    Code:
    rowcount=0
    cat $CSV_DIR/$1 | while read fileline
    do
    	rowarray[$rowcount]=$fileline
    	rowcount=$((rowcount + 1))
    done
    
    i=0
    query=""
    while [ "$i" -lt "$rowcount" ]
    do
    	# pseudocode -- not sure how to split a row into colarray here
    	n = split(${rowarray[$i]}, colarray, ",")
    	if [ "$n" -ge "2" ]
    	then
    		query="$query
    insert into OutputTable
    (
    	SeqKey,
    	ID,
    	Col1,
    	Col2
    ) values (
    	nextval for $DB_INSERT_SCHEMA.OutputTable,
    	(select ID from InputTable where Key = ${colarray[0]}),
    	${colarray[1]},
    	${colarray[2]}
    );"
    	fi
    	i=$((i + 1))
    done
    
    db2 +p -t <<EOF >> $LOG_FILE
    
    connect to $DB_INSERT user $user using $pw;
    
    values current timestamp;
    
    SET SCHEMA $DB_INSERT_SCHEMA;
    
    $query
    
    commit;
    
    values current timestamp;
    
    terminate;
    EOF

  3. #3
    Join Date
    May 2005
    Location
    South Africa
    Posts
    1,365
    Provided Answers: 1
    I don't use db2, and the split command you use is unfamiliar to me. The only split command I know splits a file into pieces.

    The CSV file sample you show does not seem comma-delimited, so you could read the data directly into variables,
    i.e. while read keycol col1 col2 junk
    Yet the split command you use, n = split(row[0], col, ","), suggests it is, assuming I interpret it correctly.


    I would not use an array. I'd build the insert statements into a file, then feed that file to db2 as input. The code below is untested.

    Code:
    #!/usr/bin/ksh
    # ...set your variables here...
    cat <<EOF >insert.sql
    connect to $DB_INSERT user $user using $pw;
    values current timestamp;
    SET SCHEMA $DB_INSERT_SCHEMA;
    EOF
    cat myfile.csv | while read fileline
    do
      keycol=$(echo $fileline | cut -d',' -f1)
      col1=$(echo $fileline | cut -d',' -f2)
      col2=$(echo $fileline | cut -d',' -f3)
    cat <<EOF >>insert.sql
    insert into $OutputTable
    (
    	col1,
    	col2,
    	col3
    )
    values
    (
    	'$col1',
    	'$col2',
    	(select col3 from $InputTable where KeyCol = $keycol)
    );
    EOF
    done
    db2 +p -t <insert.sql >$LOG_FILE
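
    If the file does turn out to be comma-separated, you can also read the fields directly and skip the three cut subshells per line; something like this (also untested):

    Code:
    # Read comma-separated fields straight into variables;
    # "junk" soaks up any extra trailing fields.
    while IFS=, read keycol col1 col2 junk
    do
      echo "keycol=$keycol col1=$col1 col2=$col2"
    done < myfile.csv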

  4. #4
    Join Date
    Jan 2004
    Posts
    70
    I was wondering about the split command. I read about it somewhere but then couldn't find the page again :shrug: I was just reading up on how to use the cut command.

    Sorry, the example CSV file was just a dummy. The real CSV file is comma separated.

    Will the approach I guessed at, executing a bunch of insert statements accumulated in a string variable, not work?

  5. #5
    Join Date
    Jan 2004
    Posts
    70
    I tested out some of that code.

    Did this

    Code:
    cat >insert.sql << "EOF"
    connect to $DB_INSERT user $user using $pw;
    values current timestamp;
    SET SCHEMA $DB_INSERT_SCHEMA;
    EOF
    
    
    cat $CSV_DIR/$1 | while read fileline  
    do
    	colarray[0]=$(echo $fileline | cut -d',' -f1)
    	colarray[1]=$(echo $fileline | cut -d',' -f2)
    	colarray[2]=$(echo $fileline | cut -d',' -f3)
    cat >>insert.sql << "EOF"
    insert into OutputTable
    (col1,col2,col3,col4) 
    values 
    (nextval for $DB_INSERT_SCHEMA.VAE_MF_SUBLDGR_SEQ , select col2 from InputTable where Key = ${colarray[0]} , ${colarray[1]} , ${colarray[2]} );
    EOF
    
    done
    It works, but the problem is that "${colarray[1]}" is written into insert.sql literally instead of its value. How can I get the value?


    edit------------------

    need to remove the quotes around EOF...

    http://www.webservertalk.com/archive...-3-952249.html

    Looks like I should also use printf if I want it to be faster.
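
    For reference, the difference (as I understand it) is that quoting the heredoc delimiter turns off variable expansion in the body, so dropping the quotes is the fix. A quick illustration:

    Code:
    name=world
    # Quoted delimiter: body is taken literally, no expansion
    cat << "EOF"
    hello $name
    EOF
    # Unquoted delimiter: variables are expanded
    cat << EOF
    hello $name
    EOF
    # prints "hello $name" then "hello world"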
    Last edited by JamesAvery22; 04-19-06 at 15:44.

  6. #6
    Join Date
    Jan 2004
    Posts
    70
    my tested code:

    Code:
    printf "%s\n%s\n%s\n" "connect to $DB_INSERT user $user using $pw;" "values current timestamp;" "SET SCHEMA $DB_INSERT_SCHEMA;" >insert.sql
    
    cat $CSV_DIR/$1 | while read fileline  
    do
    	colarray[0]=$(echo $fileline | cut -d',' -f1)
    	colarray[1]=$(echo $fileline | cut -d',' -f2)
    	printf "%s\n" "insert into OutputTable (col1,col2,col3) values (nextval for $DB_INSERT_SCHEMA.OUTPUTTABLE_SEQ , (select ID from InputTable where Key = '${colarray[0]}') ,'${colarray[1]}' );" >>insert.sql
    done
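
    And then, as in the earlier post, the generated file is executed with:

    Code:
    db2 +p -t <insert.sql >$LOG_FILE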
    Thanks for everyone's help.

  7. #7
    Join Date
    Jan 2004
    Posts
    70
    I've been using this script and it works fine with an input file that has <200 lines; it takes about 20 seconds to create insert.sql. I'm trying it on a file with 13k+ lines and it is taking forever. Creating insert.sql is taking longer than executing it. Is there any way to do this faster?

  8. #8
    Join Date
    May 2005
    Location
    South Africa
    Posts
    1,365
    Provided Answers: 1
    Try using sed
    Code:
    sed "s/\(.*\),\(.*\),\(.*\)/insert into OutputTable (col1,col2,col3) values (nextval for $DB_INSERT_SCHEMA.OUTPUTTABLE_SEQ , (select ID from InputTable where Key = \1) ,'\2' );/" $CSV_DIR/$1  >insert.sql

  9. #9
    Join Date
    Nov 2011
    Posts
    4

    sed to parse a csv file

    Suppose we have more than 9 columns in the csv file; how do we create the insert statement then?
    After the 9th column, the 10th column picks up the 1st column's value with a 0 concatenated at the end, and similarly the 11th column gives the 1st column's value with a 1 concatenated at the end.

    Please suggest what needs to be done to solve this problem.
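
    From what I have read since posting, sed backreferences only go up to \9; \10 is parsed as \1 followed by a literal 0, which matches exactly what I am seeing. One workaround might be awk, where fields are $1, $2, ... $10 and beyond with no such limit. An untested sketch along the lines of the sed above (picking the 1st and 10th fields; adjust to your real columns):

    Code:
    # -F, splits each line on commas; q carries a single-quote character
    # so the generated SQL string literals come out quoted.
    awk -F, -v schema="$DB_INSERT_SCHEMA" -v q="'" '{
    	printf "insert into OutputTable (col1,col2,col3) values (nextval for %s.OUTPUTTABLE_SEQ , (select ID from InputTable where Key = %s) ,%s%s%s );\n", schema, $1, q, $10, q
    }' $CSV_DIR/$1 > insert.sql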
