OS: Windows XP PRO, SP2.
DB2 Express-C
UTF-8 COLLATING sequence: IDENTITY
**** UPDATE ****
I deleted the database and created a new one like this:
CREATE DATABASE parents ON 'C:' USING CODESET UTF-8
TERRITORY NO COLLATE USING UCA400_NO
then I did this:
db2 get db cfg for database_name
and see that new COLLATING sequence are now: UCA400_NO
I ran my queries again.... ( I inserted the records from scratch )
select PAG_NAME from pager where PAG_ID>=3000 order by ICU.SORTKEY(PAG_NAME,'LNB')
then the records are still ordered like this:
Aø
Aæ
Aå
The correct order should be:
Aæ
Aø
Aå
Urk....Same result as with database COLLATING sequence set to IDENTITY.
I did some more testing.....
and from this link:
http://www-128.ibm.com/developerwork...ole/index.html
If you scroll down to first "order by" example then I inserted the records .. Alice, Celine, Cindy etc and tested if I
got the same result as in this page:
Using this for English:
select PAG_NAME from pager where PAG_ID>=7000 order by ICU.SORTKEY(PAG_NAME,'LEN')
I get same sort order as in the link above.
Using this for Slovak
select PAG_NAME from pager where PAG_ID>=7000 order by ICU.SORTKEY(PAG_NAME,'LSK')
Then I get the same result as in the link above.
But using LNB for Norwegian Bokmål, places æøå in wrong order ..... strange.
Do I have to specify charset to use against DB2 in PHP?
Can that have anything to do with it.
I belive this does not have anything with sorting of the data.
When I insert records containing æøå letters today, they are displayed correctly when I select them laiter though.
When I work with UTF-8 data against MySQL I use this after connecting in PHP
@mysql_query("SET NAMES 'utf8'");
For ORACLE I do this after connecting:
putenv("NLS_LANG=.AL32UTF8");
putenv("NLS_COMP=LINGUISTIC");
putenv("NLS_SORT=BINARY_CI");
For IBASE
if ( ! $this->db = @ibase_pconnect($dbpath.$dbname, $dbuser, $dbpass, 'UTF8')){ return false; }
**** UPDATE END ***
I then thought I could store let say both french, german, norwegian, thai etc...
And let say a Norwegian client, which stores all his records in Norwegian want to sort his data in norwegian sort order, then I thought it was possible for him to specify the collation for his language (well, actually I do it for him on the fly when ordering the data (at least I hoped I could)....
I have installed the ICU windows binaries (DLL's) in the FUNCTION directory and created a UDF in order to be able to use a collation from the ICU in my queries.
So for a Norwegian client I would sort like this:
select CUS_NAME from customer where lower(CUS_NAME) like 'a%' order by ICU.SORTKEY(CUS_NAME, 'LNB')
For a german I would just change the collation from LNB to LDE like this:
select CUS_NAME from customer where lower(CUS_NAME) like 'a%' order by ICU.SORTKEY(CUS_NAME, 'LDE')
I have gotten the collation names from this page:
http://www-128.ibm.com/developerwork...sidefile1.html
The strange thing is this: When I use the default collation for "Norwegian Bokmål" which is LNB the characters æøå is not sorted correctly.
I also know that there is some attributes one can use with the LNB .. like
LNB_AS, LNB_AS_CX .... there is a lot more.
Have also tried full collation (for norwegian) with all attributes like this:
LNB_AN_CX_EX_FX_HX_NX_S3
But when I go to the IBM online page where I can test this specific collation I see that all the attributes are the default settings.
If I enter my test words with æøå in this page and hit Sort button then the words are sorted correctly.
Is it not possible to store data from several languages in a UTF-8 database, and then laiter on using queries that uses a specific collation for a specific language (one language at a time though)?
Could it be that there is something wrong with the ruleset for Norwegian Bokmål collation (LNB) in the ICU dll's ?
Anyway... thanks for taking the time to look at this.