Results 1 to 7 of 7
  1. #1
    Join Date
    Jan 2003
    Posts
    1,629
    Provided Answers: 1

    Unanswered: How to transfer tables between different operating systems and code pages?

    Hi,
    in our company we have DB2 v8.2 on Windows 2003 with database code set 1250 with 500 tables (source database).
    I should move this tables to DB2 v11 on Linux/Intel with code set UTF-8 (target database).

    I thought to create an exports for all tables like:
    Code:
    export to file.definitions.ixf of ixf select from schema.tab1 where 0 = 1;
    to get table definitions. And another export to get data:
    Code:
    export to file.data.ixf of ixf select from schema.tab1;
    Then import definitions:
    Code:
    import from file.definitions.ixf of ixf create into schema.tab1;
    and load data:
    Code:
    load from file.data.ixf of ixf load into schema.tab1;
    This would be pretty simple way to migrate, but because code set of source and target databases are different I can't do above, because 1250 is single-byte size character database and UTF-8 is multi-byte size character database.

    For example source database table:
    Code:
    create table admin.testchar (col char(1));
    insert into admin.testchar values ('č');
    works fine, because database is single-byte character and 'č' gets single byte to store.

    Same table gets recreated in target database with char(1) column definition (the same as in source database):
    Code:
    create table admin.testchar (col char(1));
    When importing data:
    Code:
    -- works fine
    insert into admin.testchar values ('a');
    -- does not work
    insert into admin.testchar values ('č');
    Above command reports error:
    SQL0433N Value "č" is too long.
    In DB2 UTF-8 code set databases, CHAR data type field is single-byte data field and character like "č" in UTF-8 gets two-bytes to store, so can't be saved in CHAR(1) data fields. Note: There are all other sort of problems with CHAR fields like substr(col, 1, 1) returns garbage, because DB2 functions can't really work with multi-byte characters when CHAR data field is used in UTF-8 created database.

    Work-around in UTF-8 DB2 database is to stop using CHAR data fields and instead of it create GRAPHIC data type fields:
    Code:
    create table admin.testgrap (graphic char(1));
    insert into admin.testchar values ('č');
    above insert works fine.

    Now to the problem. I can't just export definitions from source database (because export will produce char data fields) and import them in target database, because I need to change CHAR data fields to GRAPHIC data fields.

    P.S. The same problem is with VARCHAR data field and solution is VARGRAPHIC data field.

    Any idea how to solve this table migration problem from Windows/1250 to Linux/UTF-8?
    Regards
    Last edited by grofaty; 05-15-17 at 07:25.

  2. #2
    Join Date
    Jul 2016
    Location
    Moscow
    Posts
    259
    Provided Answers: 39
    Hi,

    If you don't want to change the definition of your tables, then one possible solution is to set the STRING_UNITS db parameter to CODEUNITS32. Every (VAR)CHAR(X) will be converted to (VAR)CHAR(X CODEUNITS32) implicitly for any CREATE/ALTER TABLE statement afterwards. Every string manipulations (substr, length, "č" -> varchar(1), etc) will work as expected in your old single-byte database.
    The side effect is that the max length of a field with CODEUNITS32 is 4x shorter comparing to the field of the same type with OCTETS (bytes).
    Regards,
    Mark.

  3. #3
    Join Date
    Apr 2012
    Posts
    1,135
    Provided Answers: 25
    If your application (and SQL) is tested to work with fixed-width encodings, AND your business does not wish to change that, then why create the Linux database with utf-8 encoding ?

  4. #4
    Join Date
    Jan 2003
    Posts
    1,629
    Provided Answers: 1
    @mark.bb, I have been testing your idea whole day. Changing CODEUNTIS32 database parameter like:
    db2 update db cfg for <database> using STRING_UNITS CODEUNITS32
    solves the problem.

    The side effect is less characters can be written on the same tablespace. But I have fixed the problem with creating new bufferpool with 8K pagesize and tablespace with 8K pagesize and problem is solved.


    But now I have figure it out that there got to be some bug or similar if this parameter STRING_UNITS is set to CODEUNITS32. Some of the select statements when executed it returns the "SQL30081N A communication error..." message, and db2 client loses the connection to db2 server. I have stripped down my SQL and got to the following select that performs this problem:
    Code:
    SELECT REPLACE(IBMREQD, CHR(10), '') FROM SYSIBM.SYSDUMMY1
    Setting STRING_UNITS back to SYSTEM and above selects runs just fine. It looks like a bug to me. If something in select is not permitted it should return syntax error, not closing connection any idea what is wrong?



    @db2mor, reasons:
    1. DB2 v8.2 is not supported for years, we must install officially supported product DB2 v11.
    2. Windows is not desired operating system in our company anymore, we would like to run DB2 on Linux/Intel.
    3. Currently application is only using single-byte characters which is fine for Latin languages, but now we got a requirement to also support Cyrillic letters, and having Latin and Cyrillic letters in the same database the only option in my knowledge is UTF-8 database (other option would be having two single-byte databases, but this complicates things even further, application change would be also required etc).

  5. #5
    Join Date
    Jul 2016
    Location
    Moscow
    Posts
    259
    Provided Answers: 39
    Quote Originally Posted by grofaty View Post
    But now I have figure it out that there got to be some bug or similar if this parameter STRING_UNITS is set to CODEUNITS32. Some of the select statements when executed it returns the "SQL30081N A communication error..." message, and db2 client loses the connection to db2 server. I have stripped down my SQL and got to the following select that performs this problem:
    Code:
    SELECT REPLACE(IBMREQD, CHR(10), '') FROM SYSIBM.SYSDUMMY1
    Setting STRING_UNITS back to SYSTEM and above selects runs just fine. It looks like a bug to me. If something in select is not permitted it should return syntax error, not closing connection any idea what is wrong?
    My DB2 10.5.8 on Windows with STRING_UNITS=CODEUNITS32 even traps on this query.
    This is not normal and you should open PMR with IBM support.
    Regards,
    Mark.

  6. #6
    Join Date
    Apr 2012
    Posts
    1,135
    Provided Answers: 25
    The data-migration is not your only issue then, if your application must be tested with variable width encoding, having been written for a fixed width encoding - so testing will be costly.
    I would Use db2tool to extract the DDL from V8.x on Windows, alter the resulting DDL according to the new design requirements for v11.x, and then deploy that revised DDL on Linux.
    When exporting data from v8.x, and import/load on v11.x, take care with the codepage translation issues (e.g. on windows you might want to ensure DB2CODEPAGE variables is correctly set for the export), while on Linux I would ensure either the $LANG variable is correct for utf-8 and the desired territory, or would use an appropriate modifier on the import/load .
    For columns that will contain variable-width encoded characters, consider increasing their length but preserving their data type, as that may be less costly than changing the data types.

  7. #7
    Join Date
    Jan 2003
    Posts
    1,629
    Provided Answers: 1
    @mark.bb, now I see the problem is actually in REPLACE function, I thought it is in CHR executing using special characters, but obviously not. So the following select has the same problem:
    Code:
    SELECT REPLACE(IBMREQD, 'Y', '') FROM SYSIBM.SYSDUMMY1
    When above SQL is executed db2 client loses connection to DB2 server. On DB2 server connection is still there and if looked with "db2 list applications show detail" in Status column connection is in "Compiling" status. But it never finishes. It also can't be forced off the database with "force application (id)". I also tried "force application all", but all of the application successfully forced except applications executed from above query. Interesting the only work-around to get rid of this orphan connections is to stop the instance with "db2stop force". Pretty weird behavior, looks like a bug.

    @db2mor, thanks for additional info. We have already tried and application works fine with multi-byte database.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •