Results 1 to 8 of 8
  1. #1
    Join Date
    Sep 2008
    Posts
    11

    Unanswered: Need to update columns with values from other tables

    Here is the situation:
    There is a process that receives a file from a vendor , formats the data and loads it to EMPL table.
    EMPL table contains among others Zip_cd and Zip_loc columns and Dept_cd and Dept_name columns.
    Zip_cd and Zip_loc columns are related and only value of Zip_cd is supplied on the file. There is separate LOCATION table that contains a City for every zip code.
    Same is for Dept_cd and Dept_name. Only Dept_cd is on the input file and there is a separate DEPARTMENT table that contains department name for each department code.

    So my assigment is to populate the Zip_loc and and Dept_name columns with values from associated LOCATION and DEPARTMENT tables.

    EMPL

    Empl_num
    Empl_name
    ......
    ......
    Dept_cd (value is supplied)
    Dept_name (value has to be taken from DEPARTMENT table)
    .......
    Zip_cd (value is supplied)
    Zip_loc (value has to be taken from LOCATION table)


    DEPARTMENT

    Dept_cd
    Dept_name

    ZIP

    ZIP_cd
    ZIP_loc


    What is the best way to approach this?

    1. Should I try and create an intermediate file that will be populated with the values from those 2 tables and after that it will be sorted and the EMPL table will be loaded?

    2. Or should I let the existing process alone (let the EMPL table to be created) and after that use a program to update the table and populate the columns?

    The thing is that the rows in EMPL table are in Employer number sequence and all there is no indexes on ZIP_CD or Dept_cd columns. So the update will have to do a table scan anyway so the process won't benefit much from using SQL versus one record at a time processing if I went with #1 and tried to update each record before loading.

    But let's assume I went with #2.

    3. I know how to update one column at a time:

    update EMPL
    set zip_loc = (select zip_loc from LOCATION b
    where b.zip_cd = EMPL.zip_cd)
    where b.zip_cd in (select EMPL.zip_cd from EMPL);

    But each row has two columns to be updated: ZIP_loc and Dept_name.
    Is there a way to do it with one SQL statement?


    4.
    In case there is no associated rows in the LOCATION or DEPARTMENT table , I have to pupulate the zip_loc and dept_name columns with the values of zip_cd and dept_cd.

    Is there a neat way to put this default into SQL? Something similar to COALESCE function?

    Thanks in advance.

  2. #2
    Join Date
    Sep 2004
    Location
    Belgium
    Posts
    1,126
    Quote Originally Posted by Bambuk
    Zip_cd and Zip_loc columns are related and only value of Zip_cd is supplied on the file. There is separate LOCATION table that contains a City for every zip code.
    In this situation, it would be more logical (and more typical) to use a normalised design, viz., remove the Zip_loc column from the table. And make the Zip_cd column a foreign key pointing to the LOCATION table.
    --_Peter Vanroose,
    __IBM Certified Database Administrator, DB2 9 for z/OS
    __IBM Certified Application Developer
    __ABIS Training and Consulting
    __http://www.abis.be/

  3. #3
    Join Date
    Sep 2004
    Location
    Belgium
    Posts
    1,126
    Quote Originally Posted by Bambuk
    1. Should I try and create an intermediate file that will be populated with the values from those 2 tables and after that it will be sorted and the EMPL table will be loaded?
    The optimizer should do the sorting for you, if necessary; so I would definitely not go for (1) but use an SQL-only approach; use EXPLAIN to verify whether the optimizer chose a sufficiently efficient access path; if not, try to provide relevant table and index statistics (and indexes) until the access path looks acceptable.
    Quote Originally Posted by Bambuk
    2. Or should I let the existing process alone (let the EMPL table to be created) and after that use a program to update the table and populate the columns?
    Again, I would use a single SQL statement, not a programmed cursor, to do this massive update. The optimizer should decide on its own which update sequence is the best one.
    Quote Originally Posted by Bambuk
    3. each row has two columns to be updated: ZIP_loc and Dept_name.
    Is there a way to do it with one SQL statement?
    Of course:
    Code:
    UPDATE empl SET
      zip_loc = (SELECT zip_loc FROM location WHERE zip_cd = empl.zip_cd),
      dept_name = (SELECT dept_name FROM department WHERE dept_cd = empl.dept_cd)
    Unfortunately there is no way to say
    Code:
    UPDATE empl SET
      (zip_loc, dept_name) = (SELECT zip_loc, dept_name FROM ....JOIN...)
    but the above "double" SET should be equally efficient, and identical to a manual, programmed cursor update. But check this first (through EXPLAIN).

    Quote Originally Posted by Bambuk
    4. In case there is no associated rows in the LOCATION or DEPARTMENT table , I have to pupulate the zip_loc and dept_name columns with the values of zip_cd and dept_cd.
    Is there a neat way to put this default into SQL? Something similar to COALESCE function?
    Sure:
    Code:
    UPDATE empl SET
      zip_loc = (SELECT COALESCE(zip_loc, empl.zip_cd) FROM location
                 WHERE zip_cd = empl.zip_cd),
      dept_name = (SELECT COALESCE(dept_name, empl.dept_cd) FROM department
                 WHERE dept_cd = empl.dept_cd)
    --_Peter Vanroose,
    __IBM Certified Database Administrator, DB2 9 for z/OS
    __IBM Certified Application Developer
    __ABIS Training and Consulting
    __http://www.abis.be/

  4. #4
    Join Date
    Sep 2004
    Location
    Belgium
    Posts
    1,126
    Do you have, by any chance, the option to normalise the physical table (under a different name), and put a view on it (with the old table name) that shows exactly what you would see in the old design?
    Something like
    Code:
    RENAME TABLE empl TO base_empl ;
    CREATE VIEW empl AS
    (SELECT e.Empl_num, e.Empl_name, ...,
            e.Dept_cd, COALESCE(d.Dept_name, e.Dept_cd) AS Dept_name,
            e.Zip_cd, COALESCE(z.Zip_loc, e.Zip_cd) AS Zip_loc
     FROM base_empl e LEFT OUTER JOIN department d ON e.Dept_cd = d.Dept_cd
                      LEFT OUTER JOIN zip z ON e.Zip_cd = z.Zip_cd
    )
    After that, the now unused columns base_empl.Dept_name and base_empl.Zip_loc should ideally be emptied or even removed.
    The latter requires a DROP/CREATE of the table. For the former, set its datatype to VARCHAR(255), put an empty string in it, and finally optionally make the columns IMPLCITLY HIDDEN (if this is possible in your version of DB2).
    Last edited by Peter.Vanroose; 12-28-08 at 10:17.
    --_Peter Vanroose,
    __IBM Certified Database Administrator, DB2 9 for z/OS
    __IBM Certified Application Developer
    __ABIS Training and Consulting
    __http://www.abis.be/

  5. #5
    Join Date
    Sep 2008
    Posts
    11
    Quote Originally Posted by Peter.Vanroose
    In this situation, it would be more logical (and more typical) to use a normalised design, viz., remove the Zip_loc column from the table. And make the Zip_cd column a foreign key pointing to the LOCATION table.

    Peter,
    thank you very much for the code.
    I know the normalization would make sense, but as it usually happen other groups don't want to change the code to access different tables. We used to receive the description values along with the codes on the vendor file, but they could be different. The same Dept_cd could have different descriptions in different rows. So they decided to have one to one relationship thru new tables, but for political reasons we cannot force other groups to change the code.

    One more question:
    do I have to write a program or is there a way to put an SQL in a JCL? Is there any pros and cons?

    Thanks.

  6. #6
    Join Date
    Sep 2004
    Location
    Belgium
    Posts
    1,126
    Quote Originally Posted by Bambuk
    do I have to write a program or is there a way to put an SQL in a JCL?
    That's indeed possible: put the SQL in the SYSIN DD of DSNTEP4 (or DSNTEP2):
    Code:
    //DSNTEP4  EXEC  PGM=IKJEFT01,DYNAMNBR=20
    //SYSTSPRT DD SYSOUT=*
    //SYSTSIN DD *
     DSN SYSTEM(DSN)
     RUN PROGRAM(DSNTEP4) PLAN(DSNTEP481)
    /*
    //SYSPRINT DD SYSOUT=*
    //SYSUDUMP DD SYSOUT=*
    //SYSIN DD *
     UPDATE empl SET zip_loc =
      (SELECT COALESCE(zip_loc, empl.zip_cd) FROM location
       WHERE  zip_cd = empl.zip_cd),
      dept_name =
      (SELECT COALESCE(dept_name, empl.dept_cd) FROM department
       WHERE dept_cd = empl.dept_cd) ;
    /*
    //
    You may e.g. plug in this EXEC directly after the LOAD, in the same JCL.
    The SQL is executed as dynamic SQL; if you prefer static SQL, you'll have to write a program.
    --_Peter Vanroose,
    __IBM Certified Database Administrator, DB2 9 for z/OS
    __IBM Certified Application Developer
    __ABIS Training and Consulting
    __http://www.abis.be/

  7. #7
    Join Date
    Dec 2007
    Location
    Richmond, VA
    Posts
    1,328
    Provided Answers: 5
    I would not go back and do the update that's being discussed here, you should be getting those values during the insert process. Ideally, you would normalize as Peter suggested, he even gave you a way that not much code would have to change by using a view. You could do an insert from select, that way you do not have to have a separate update function.
    INSERT INTO MYTAB SELECT :HV1,:HV2,A.DEPT_NAME, B.ZIP_LOC FROM ....
    Dave

  8. #8
    Join Date
    Sep 2008
    Posts
    11
    OK.
    Now that I am actually working on this it turns out that that I have to update not the file coming from a vendor, but an existing table.
    The current process unloads the table and in subsequient steps populates the fields by taking values from a vendor file and loads the updated file back to the table.
    So now we are going to take the values from the two tables.
    What is the best way to update the table?

    So far I see two ways:
    A)
    1. Unload the table (to create a backup).
    2. Create an program that will update the table by using the SQL above.
    It looks that in our shop it is not allowed to update tables through JCLs.

    B)
    1. Unload the table.
    2. Create a program that will read unloaded file, populate the fields by taking values from other tables and create updated sequential file.
    3. Load the updated file into the table.

    So far,
    it looks like A is a better choise. Is that so?

    In this case I will have to commite changes every 1000 rows or so? But the SQL above deals with the whole table, so how can I code the logic to commit the changes every 1000 rows?

    Also is creating a backup by unloading the table is a normal practice? It sure is when dealing with VSAM files, but maybe there is something different with DB2.

    Any suggestions welcome,
    Thanks.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •